Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 17 total
Thread dbisam and elevatedb slow query in a select with Left Outer Join with several fields joining
Fri, Nov 14 2014 5:52 AMPermanent Link

macc2010

Hello,

I am evaluating dbisam and elevatedb and after success migrating my application, i have a problem.

I have the following select :

Select a.Field1, a.Field2, a.Field3, a.Field4
From TABLEA a
 Left Join TABLEB b
 On ( a.Field1 = b.Field1 and a.Field2 = b.Field2 and a.Field4 = b.Field4 and a.Field3 = b.Field3 )
Where b.Field1 is Null

TABLEA has 7000 records and TABLEB 127000 records. It is supposed that all records in TABLEA must be in TABLEB, and mi intention with this select is to check what records of TABLEA are not in TABLEB. For a more comprension of the tables, there are records in TABLEB that does not exist in TABLEA.

TABLEB has a primary index composed of the fields : FIELD1,FIELD2,FIELD4,FIELD3.

The execution time of this kind of query running in bde paradox ( database desktop ) is very fast, it only need 4 o 5 seconds. In dbisam and elevatedb, it takes about 250 seconds. I think that when in a dbisam query, the select has in the LEFT JOIN ON clause more than one fields, the query optimizer is not able to relation the left table with the right table through an existing index. Is as if the master/detail relation is not using primary index.

I have migrated my tables to elevatedb and there is the same problem and the query is very slow compared to paradox query.

I have attached the files to test this problem.

For the test, i have used the last dbisam database system utility, version 4.40 build 1.

I am pending to take the decission of migration due to this problem. If this problem is solved in dbisam i will migrate to this.

Thank you and best regards.



Attachments: tables_a_and_b.zip
Fri, Nov 14 2014 9:15 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

macc2010


By simply adding individual indices the time goes down to c30 seconds for DBISAM and c45 seconds for ElevateDB.

Looking at the execution plan for ElevateDB that then means an index scan is being carried out for the join conditions but a row scan is still required for the b.field1 is null

Using a bit of lateral thinking I came up with

SELECT * INTO Memory\b FROM TableB WHERE Field1 IS NULL;
SELECT * INTO Memory\a FROM TableA Where Field1 IS NULL;

Select a.Field1, a.Field2, a.Field3, a.Field4
From Memory\a a
 Left Join Memory\b b
 On ( a.Field2 = b.Field2 and a.Field4 = b.Field4 and a.Field3 = b.Field3 )

Which executes VERY fast.

You'll need to drop the memory tables when you're finished with the information.

Something similar could be done with ElevateDB


================================================================================
SQL statement (Executed with 4.25 Build 5)
================================================================================

SELECT * INTO Memory\b FROM TableB WHERE Field1 IS NULL

Tables Involved
---------------

TableB (TableB) table opened shared, has 126540 rows

Result Set Generation
---------------------

Result set will be canned

Result set will consist of one or more rows

WHERE Clause Execution
----------------------

The expression:

Field1 IS NULL

is OPTIMIZED, covers 0 rows or index keys, costs 0 bytes, and will be applied
to the TableB table (TableB) before any joins

================================================================================
>>>>> 0 rows affected in 0 seconds
================================================================================

================================================================================
SQL statement (Executed with 4.25 Build 5)
================================================================================

SELECT * INTO Memory\a FROM TableA Where Field1 IS NULL

Tables Involved
---------------

TableA (TableA) table opened shared, has 7350 rows

Result Set Generation
---------------------

Result set will be canned

Result set will consist of one or more rows

WHERE Clause Execution
----------------------

The expression:

Field1 IS NULL

is OPTIMIZED, covers 0 rows or index keys, costs 0 bytes, and will be applied
to the TableA table (TableA) before any joins

================================================================================
>>>>> 0 rows affected in 0 seconds
================================================================================

================================================================================
SQL statement (Executed with 4.25 Build 5)
================================================================================

Select a.Field1, a.Field2, a.Field3, a.Field4
From Memory\a a
 Left Join Memory\b b
 On ( a.Field1 = b.Field1 and a.Field2 = b.Field2 and a.Field4 = b.Field4 and
a.Field3 = b.Field3 )

Tables Involved
---------------

a (a) table opened shared, has 0 rows
b (b) table opened shared, has 0 rows

Result Set Generation
---------------------

Result set will be canned

Result set will consist of one or more rows

Join Ordering
-------------

The driver table is the a table (a)

The a table (a) is joined to the b table (b) with the LEFT OUTER JOIN
expression:

a.Field1 = b.Field1 and a.Field2 = b.Field2 and a.Field4 = b.Field4 and
a.Field3 = b.Field3

Optimizer will attempt to re-order the joins to a more optimal order
Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the
optimizer to leave the joins in their declared order

The joins are already in optimal order and cannot be optimized any further

Join Execution
--------------

Costs ARE NOT being taken into account when executing this join
Use the JOINOPTIMIZECOSTS clause at the end of the SQL statement to force the
optimizer to consider costs when optimizing this join

================================================================================
>>>>> 0 rows affected in 0 seconds
================================================================================




Roy Lambert

Fri, Nov 14 2014 5:37 PMPermanent Link

macc2010

Hello Roy,

I have attached to this post new data in the tables for that you can test the query again.

The query :

Select a.Field1, a.Field2, a.Field3, a.Field4
From TABLEA a
Left Join TABLEB b
On ( a.Field1 = b.Field1 and a.Field2 = b.Field2 and a.Field4 = b.Field4 and a.Field3 = b.Field3 )
Where b.Field1 is Null

With the new data gives a result of one record :

FIELD1  FIELD2 FIELD3 FIELD4
4           334      1           E

And if i apply your solution, the select gives me a result of 0 records, because there are no records in TABLEB containing null in FIELD1 field.

I have seen in dbisam documentation that when you relation through JOIN clause more than one field, the query optimizer does not use the index for right join table. I have a perfect index on TABLEB to relation TABLEA and TABLEB through FIELD1 FIELD2 FIELD4 and FIELD3, but the query does not use this index, because of that, the query is very slow.

The same query in paradox takes only 4 seconds, but in dbisam and elevatedb takes 250 seconds approximately.

Is that a dbisam limitation?

The same problem occurs with elevatedb, is this also an elevatedb limitation?.

Thank you and best regards.



Attachments: tables_a_and_b_test2.zip
Sat, Nov 15 2014 4:26 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

macc2010

>The query :
>
>Select a.Field1, a.Field2, a.Field3, a.Field4
>From TABLEA a
> Left Join TABLEB b
> On ( a.Field1 = b.Field1 and a.Field2 = b.Field2 and a.Field4 = b.Field4 and a.Field3 = b.Field3 )
>Where b.Field1 is Null
>
>With the new data gives a result of one record :
>
>FIELD1 FIELD2 FIELD3 FIELD4
>4 334 1 E
>
>And if i apply your solution, the select gives me a result of 0 records, because there are no records in TABLEB containing null in FIELD1 field.

My solution MUST be correct and your's is in error, either that or I am totally misinterpreting your query and logic.

The first condition of your your join is

a.Field1 = b.Field1

you then qualify the result with a filter

Where b.Field1 is Null

therefore the query should only return a result where a.Field1 = b.Field1 = null. What you are showing is a.Field1 = 4 ie it cannot be right.

Neither Paradox nor DBISAM handle NULL in accordance with the current SQL standards, and I don't think they handle it in the same way as each other. I switched to ElevateDB a long time ago but I seem to have memories of some unanticipated behaviour when a composite primary key had null as the first element.

Roy Lambert
Sat, Nov 15 2014 4:28 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

macc2010


Another thought - this is a user supported forum. You might be best served by addressing this pre-sales question direct to ElevateDB


Roy Lambert
Sat, Nov 15 2014 11:21 AMPermanent Link

Raul

Team Elevate Team Elevate

On 11/15/2014 4:26 AM, Roy Lambert wrote:
> The first condition of your your join is
>
>   a.Field1 = b.Field1
>
> you then qualify the result with a filter
>
> Where b.Field1 is Null


He is doing a left join though (not inner) so it's possible to have
records in the left table (a) that do not exist on the right (b) so just
checking for null in the table is not the same as checking b.Field1 in
null (i.e. record in table a only)

Raul
Sat, Nov 15 2014 1:03 PMPermanent Link

Raul

Team Elevate Team Elevate

On 11/14/2014 5:52 AM, macc2010 wrote:
> Select a.Field1, a.Field2, a.Field3, a.Field4
>  From TABLEA a
>    Left Join TABLEB b
>    On ( a.Field1 = b.Field1 and a.Field2 = b.Field2 and a.Field4 = b.Field4 and a.Field3 = b.Field3 )
> Where b.Field1 is Null

Something like this runs reasonably fast for me (approx 2.2 sec).

Select a.Field1, a.Field2, a.Field3, a.Field4
From TABLEA a WHERE
CAST(a.Field1 as char(10)) + '-' + CAST(a.Field2 as char(10)) + '-' +
CAST(a.Field3 as char(10))+ '-' + CAST(a.Field4 as char(10)) NOT IN
( select CAST(b.Field1 as char(10)) + '-' + CAST(b.Field2 as char(10)) +
'-' + CAST(b.Field3 as char(10))+ '-' + CAST(b.Field4 as char(10)) as AA
From TABLEB b)


Should be comparable to your query - basically find entries in A that do
not exist in b based on 4 columns.

It's not pretty but lot better than join condition on 4 fields.


Raul

Sun, Nov 16 2014 3:21 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Raul


>He is doing a left join though (not inner) so it's possible to have
>records in the left table (a) that do not exist on the right (b) so just
>checking for null in the table is not the same as checking b.Field1 in
>null (i.e. record in table a only)

Whilst walking the dog this morning the question of what a LEFT JOIN did in DBISAM came to my mind - even though the title mentioned outer I'd been assuming inner. I shall now go and shoot myself.

Roy Lambert
Sun, Nov 16 2014 3:03 PMPermanent Link

Jeff Cook

Aspect Systems Ltd

Avatar

On 16/11/2014 9:21 p.m., Roy Lambert wrote:
>
> Whilst walking the dog this morning the question of what a LEFT JOIN did in DBISAM came to my mind - even though the title mentioned outer I'd been assuming inner. I shall now go and shoot myself.
>

Please don't - the dog at least will miss you (or maybe just the morning
walks)  Wink

Cheers

Jeff
Mon, Nov 17 2014 3:34 AMPermanent Link

macc2010

Roy Lambert wrote:

Raul


>He is doing a left join though (not inner) so it's possible to have
>records in the left table (a) that do not exist on the right (b) so just
>checking for null in the table is not the same as checking b.Field1 in
>null (i.e. record in table a only)

Whilst walking the dog this morning the question of what a LEFT JOIN did in DBISAM came to my mind - even though the title mentioned outer I'd been assuming inner. I shall now go and shoot myself.

Roy Lambert

So, i am not wrong, my query :

Select a.Field1, a.Field2, a.Field3, a.Field4
 From TABLEA a
   Left Join TABLEB b
   On ( a.Field1 = b.Field1 and a.Field2 = b.Field2 and a.Field4 = b.Field4 and a.Field3 = b.Field3 )
Where b.Field1 is Null

is valid and i want to know if dbisam delays in give me the results having an index that satisfies the join.

And i confirm you that in paradox ( database desktop ) it give me the results in 4 seconds and paradox handles null ok.

I have a lot of selects in my program with the same structure and i do not know how to proceed.

Is this a dbisam limitation?.

Will you correct this limitation in a near future?.

Thank you and best regards.
Page 1 of 2Next Page »
Jump to Page:  1 2
Image