Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 9 of 9 total
Thread Issue with RANGE in a subquery
Sun, Dec 12 2010 8:57 PMPermanent Link

Chris B

Hi guys,
Using EDB 2.04 B4

I'm having problems with the below query

SELECT
ordering_field,
Product.ProductCode AS ProductCode
FROM Product  
INNER JOIN (SELECT
     SUM(SaleQuantity) AS ordering_field,
     Product.ProductCode AS ProductCode
     FROM Product
     INNER JOIN ProductSaleHistory ON Product.ProductCode = ProductSaleHistory.ProductCode
     GROUP BY ProductCode
     ORDER BY ordering_field DESC
     RANGE 1 TO 10) AS TopProductSaleHistory
        ON Product.ProductCode = TopProductSaleHistory.ProductCode
INNER JOIN ProductSaleHistory ON Product.ProductCode = ProductSaleHistory.ProductCode
INNER JOIN Business ON ProductSaleHistory.BusinessCode = Business.BusinessCode
Product.ProductCode = ProductSaleHistory.ProductCode
GROUP BY ProductCode
ORDER BY ordering_field DESC

The result should contain a maximum of 10 rows. However, Elevate sometimes returns incorrect results (see attachment) - excution plan sometimes changes/re-orders the driver table to 'Business' so the join is missing a condition. Instead of 10 rows I get a rowcount which is the distinct number of ProductSaleHistory.ProductCode records.



Execution plan
================================================================================
SQL Query (Executed by ElevateDB 2.04 Build 4)

Note: The SQL  shown here is generated by ElevateDB and may not be exactly the
same as the SQL that was originally entered.  However, none of the differences
alter the execution results in any way.
================================================================================

SELECT ALL
"ordering_field" AS "ordering_field",
"Product"."ProductCode" AS "ProductCode"
FROM "Product" INNER JOIN (SELECT ALL SUM("SaleQuantity") AS "ordering_field",
"Product"."ProductCode" AS "ProductCode" FROM "ProductSaleHistory" INNER JOIN
"Product" ON "Product"."ProductCode" = "ProductSaleHistory"."ProductCode" GROUP
BY "Product"."ProductCode" ORDER BY SUM("SaleQuantity") DESC RANGE 1 TO 10) AS
"TopProductSaleHistory" ON "Product"."ProductCode" =
"TopProductSaleHistory"."ProductCode",
INNER JOIN "ProductSaleHistory" ON "Product"."ProductCode" =
"ProductSaleHistory"."ProductCode"
INNER JOIN "Business" ON "ProductSaleHistory"."BusinessCode" =
"Business"."BusinessCode"
GROUP BY "Product"."ProductCode"
ORDER BY "ordering_field" DESC

Source Tables
-------------

Product: 41711 rows
TopProductSaleHistory1 (TopProductSaleHistory): 10 rows
ProductSaleHistory: 9215 rows
Business: 1 rows

Result Set
----------

The result set was insensitive and read-only
The result set consisted of zero or more rows

Joins
-----

The driver table was the Product table

The Product table was joined to the TopProductSaleHistory1
(TopProductSaleHistory) table with the inner join expression:

"Product"."ProductCode" = "TopProductSaleHistory"."ProductCode"

The Product table was joined to the ProductSaleHistory table with the inner join
expression:

"Product"."ProductCode" = "ProductSaleHistory"."ProductCode"

The ProductSaleHistory table was joined to the Business table with the inner
join expression:

"ProductSaleHistory"."BusinessCode" = "Business"."BusinessCode"

The optimizer attempted to re-order the joins to a more optimal order

The optimizer successfully re-ordered the joins into this more optimal order:

The driver table was the Business table

The Business table was joined to the ProductSaleHistory table with the inner
join expression:

"ProductSaleHistory"."BusinessCode" = "Business"."BusinessCode"

The ProductSaleHistory table was joined to the Product table with the inner join
expression:

"Product"."ProductCode" = "ProductSaleHistory"."ProductCode"

The Product table was joined to the TopProductSaleHistory1
(TopProductSaleHistory) table with the inner join expression:

"TopProductSaleHistory"."ProductCode" = "Product"."ProductCode"

The following join condition was applied to the ProductSaleHistory table:

"ProductSaleHistory"."BusinessCode" = "Business"."BusinessCode"

Index scan (ProductSaleHistory.UQ_ProductSaleHistory)

The following join condition was applied to the Product table:

"Product"."ProductCode" = "ProductSaleHistory"."ProductCode"

Index scan (Product.ProductCode_idx)

The following join condition was applied to the TopProductSaleHistory1
(TopProductSaleHistory) table:

"TopProductSaleHistory"."ProductCode" = "Product"."ProductCode"

Index scan (TopProductSaleHistory1.__GroupBy)


Result set I/O statistics
-------------------------
Total rows visited: 9215
Row buffer manager
Max buffer size: 1048560 Buffer size: 111960
Hits: 22047   Misses: 0   Hit ratio: 1
Reads: 0   Bytes read: 0
Writes: 0   Bytes written: 0
Index Page buffer manager
Max buffer size: 2097152 Buffer size: 110592
Hits: 28787   Misses: 0   Hit ratio: 1
Reads: 0   Bytes read: 0
Writes: 0   Bytes written: 0
================================================================================
2799 row(s) returned in 6.172 secs
================================================================================


FYI, this is not our production query. We have WHERE clauses attached to both the inner and outer query and more columns in the SELECT. However, I've managed to reproduce the issue with the above query.

Please help. Let me know if you have any more questions.



Attachments: range_query.JPG
Wed, Dec 15 2010 4:12 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Chris,


<< The result should contain a maximum of 10 rows. However, Elevate
sometimes returns incorrect results (see attachment) - excution plan
sometimes changes/re-orders the driver table to 'Business' so the join is
missing a condition. Instead of 10 rows I get a rowcount which is the
distinct number of ProductSaleHistory.ProductCode records. >>

I'm a little confused here.  Your image shows two different queries, one
with the additional condition in the INNER JOIN on the Business table.  Are
you saying that either of the queries returns different results with
multiple executions (and different join orderings caused by EDB), or that
they return different results due to the inclusion/exclusion of the extra
condition ?

Also, I can't really test this here without the database, so I'll need you
to send me the database catalog and table files in order to execute the
query here and see what's going on.

--
Tim Young
Elevate Software
www.elevatesoft.com
Wed, Dec 15 2010 4:25 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Chris,

Never mind about the confusion - I see that the execution plan only has the
one condition.  However, I still need the data in order to test this.

--
Tim Young
Elevate Software
www.elevatesoft.com
Wed, Dec 15 2010 4:38 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Chris,

Scratch the data request also - I was able to "manufacture" some product
sales history from the product table.

I'm looking into the issue now.

--
Tim Young
Elevate Software
www.elevatesoft.com
Thu, Dec 16 2010 3:17 PMPermanent Link

Chris B

Thanks Tim.

Sorry for the confusion. Both the queries should return an equivalent recordset - with a maximum of 10 records. The second query is only there to show the correct resultset.

The first query, depending on a WHERE clause (WHERE clause is dynamic and not in the example given) is sometimes returning incorrect results. The problem appears to be driven by which table is selected by the optimiser as the driver. If Business table is the driver, the returned recordset contains way too many records. If any other table is the driver, then I see a maximum of 10 records.

However, the second query always returns a maximum of 10 rows - regardless of any additional WHERE filters or which table is the driver.

Cheers,
Chris
Thu, Dec 16 2010 3:22 PMPermanent Link

Chris B

<< Are you saying that either of the queries returns different results with
multiple executions (and different join orderings caused by EDB), or that
they return different results due to the inclusion/exclusion of the extra
condition ? >>

Multiple executions with the same WHERE clause don't change the results. However, the first query sometimes returns more than 10 rows.
Thu, Dec 16 2010 3:32 PMPermanent Link

Chris B

Chris B wrote:
<<However, the first query sometimes returns more than 10 rows.>>

I meant depending on the WHERE clause . Either query should never return more than 10 rows and depending on the WHERE, the first one does.
Fri, Dec 17 2010 2:05 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Chris,

<< The first query, depending on a WHERE clause (WHERE clause is dynamic and
not in the example given) is sometimes returning incorrect results. The
problem appears to be driven by which table is selected by the optimiser as
the driver. If Business table is the driver, the returned recordset contains
way too many records. If any other table is the driver, then I see a maximum
of 10 records. >>

Got it - it was an issue similar to the one with the expression filter on
the RANGE query, namely the ordering of the filters and how they are ANDed
together for the final result.  Whenever the joins were flipped and the
derived table was the target of the join (instead of the driver), it would
result in an issue where the RANGE was effectively cancelled out.  EDB now
properly handles ranges with respect to derived tables, and even will allow
for correct sensitive result sets with queries like this (provided proper
indexes exist):

SELECT *
FROM
  (SELECT * FROM Customer ORDER BY Company RANGE 1 TO 10) Test
ORDER BY CustNo
RANGE 2 TO 4

which is pretty cool considering that you can edit the result set and the
ranges will automatically update accordingly.

--
Tim Young
Elevate Software
www.elevatesoft.com
Sun, Dec 19 2010 3:42 PMPermanent Link

Chris B

Thanks.
That's great news Tim.
Image