Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 10 total
Thread Join optimization
Wed, Apr 30 2008 1:08 PMPermanent Link

Rodrigo Pires
The Field "par.Parte",  on the third join, is not indexed.

And still, the query runs well (acceptable :D), but when the query returns more then 1100
rows (aprox), it crashes.

Example:

'2007-04-15' AND '2007-06-14' - 1102 rows 17,025 secs

'2007-04-15' AND '2007-06-16'  - crashed (more than 10min without any sign of life) -  and
there´s just 36 rows between day 14 and 16

I just don´t get it. I already make a workaround, but wanna know, why this happens?

  SELECT                                                                    
         par.protocolo              
   
  FROM            "Par_Tit.dat"   PAR                                      
  INNER JOIN      "Capa_Tit.dat"  CAP on Upper(cap.protocolo) = Upper(par.protocolo)     
-->   INNER JOIN      "Cod_Par.dat"    VIN on vin.codigo = par.parte
              
  WHERE cap.Data_Cadastro BETWEEN '2007-04-15' AND '2007-06-14'


DBISAM v 4.25 b4
Wed, Apr 30 2008 1:46 PMPermanent Link

"Jose Eduardo Helminsky"
Rodrigo

<<
The Field "par.Parte",  on the third join, is not indexed.
>>

You have already answered the question with the above afirmation.

DBISAM will scan using brute force when there is no index available to the
condition and this is your case. Try creating the index to optimize the
field "par.Parte" and see what happens.

Eduardo

Wed, Apr 30 2008 1:59 PMPermanent Link

Rodrigo Pires
<<
DBISAM will scan using brute force when there is no index available to the
condition and this is your case. Try creating the index to optimize the
field "par.Parte" and see what happens.
>>

I know that. And already fixed it (cannot add an index there, i´ve modified the query).

My question is why it crashes with more than 1102 rows??? What´s the difference between a
full table scan resulting in 1102 or 1138 rows?

Must be some kind of buffer, i don´t know.
Wed, Apr 30 2008 2:13 PMPermanent Link

"Jose Eduardo Helminsky"
Rodrigo

When you say "crashes" what really happens ? Is there any error message ?

Could you put the query plan here ?

Eduardo

Wed, Apr 30 2008 2:26 PMPermanent Link

Rodrigo Pires
No error messages, no QueryProgress, and DbSys (or my application) just stop working
(maybe its working, but don´t give any response) - had to ctrl+alt+del them.

I tested again, and the second query (with two more days) still crashes.

The plan of the query that returns 1102 rows:

================================================================================
SQL statement (Executed with 4.25 Build 4)
================================================================================

  SELECT                                                                    
                    par.protocolo                            as Protocolo
                                                                                   
                                                                                        
              FROM            "c:\Ansata\Rtd\Dat\Titulos\Par_Tit.dat"   PAR             
    
                  
              INNER JOIN      "c:\Ansata\Rtd\Dat\Titulos\Capa_Tit.dat"  CAP on
Upper(cap.protocolo) = Upper(par.protocolo)     
              INNER JOIN      "c:\Ansata\Rtd\Dat\Config\Cod_Par.dat"    VIN on
Upper(vin.codigo)  = Upper(par.parte)           
           
                                                                                        
                             
              WHERE cap.Data_Cadastro BETWEEN '2007-04-15' AND '2007-06-14'

Tables Involved
---------------

Par_Tit (PAR) table opened shared, has 17019 rows
Capa_Tit (CAP) table opened shared, has 5724 rows
Cod_Par (VIN) table opened shared, has 369 rows

Result Set Generation
---------------------

Result set will be canned

Result set will consist of one or more rows

WHERE Clause Execution
----------------------

The expression:

cap.Data_Cadastro BETWEEN '2007-04-15' AND '2007-06-14'

is OPTIMIZED, covers 635 rows or index keys, costs 18986 bytes, and will be
applied to the Capa_Tit table (CAP) before any joins

Join Ordering
-------------

The driver table is the Par_Tit table (PAR)

The Par_Tit table (PAR) is joined to the Capa_Tit table (CAP) with the INNER
JOIN expression:

Upper(par.protocolo) = Upper(cap.protocolo)

The Par_Tit table (PAR) is joined to the Cod_Par table (VIN) with the INNER
JOIN expression:

Upper(par.parte) = Upper(vin.codigo)

Optimizer will attempt to re-order the joins to a more optimal order
Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the
optimizer to leave the joins in their declared order

Optimized Join Ordering
-----------------------

The driver table is the Capa_Tit table (CAP)

The Capa_Tit table (CAP) is joined to the Par_Tit table (PAR) with the INNER
JOIN expression:

Upper(cap.protocolo) = Upper(par.protocolo)

The Par_Tit table (PAR) is joined to the Cod_Par table (VIN) with the INNER
JOIN expression:

Upper(par.parte) = Upper(vin.codigo)

Join Execution
--------------

Costs ARE NOT being taken into account when executing this join
Use the JOINOPTIMIZECOSTS clause at the end of the SQL statement to force the
optimizer to consider costs when optimizing this join

The expression:

Upper(cap.protocolo) = Upper(par.protocolo)

is OPTIMIZED

The expression:

Upper(par.parte) = Upper(vin.codigo)

is UN-OPTIMIZED

================================================================================
>>>>> 1102 rows affected in 16,87 seconds
================================================================================
Wed, Apr 30 2008 2:34 PMPermanent Link

Rodrigo Pires
i changed the driver table PAR with the Join table CAP, to avoid the join reordering, but
the query keep crashing.

Seems like a infinite loop.
Wed, Apr 30 2008 2:46 PMPermanent Link

Rodrigo Pires
i repaired, verified, and optimized (in that order) all the tables and indexes involved,
tested in another database, and still get the same result
Wed, Apr 30 2008 2:56 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Rodrigo,

<< i repaired, verified, and optimized (in that order) all the tables and
indexes involved, tested in another database, and still get the same result
>>

Send me the database tables that you're using via email along with the exact
query SQL, and I will tell you what is going on.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, Apr 30 2008 3:39 PMPermanent Link

Rodrigo Pires
they´re going.

thanks!
Fri, May 2 2008 3:38 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Rodrigo,

The reason that the first version of the query ('2007-06-14') is running
fast, while the second ('2007-06-16') is not, is that the first is able to
re-order the joins so that the join ordering is optimal and can take
advantage of indexes.  The second version of the query has a query plan that
looks like this:

================================================================================
SQL statement (Executed with 4.26 Build 2)
================================================================================

/**/

              SELECT
                    par.protocolo                            as Protocolo,
                    par.parte                                as Vinculo


              FROM            "Par_Tit.dat"   PAR
              INNER JOIN      "Capa_Tit.dat"  CAP on Upper(cap.protocolo) =
Upper(par.protocolo)
              INNER JOIN      "Cod_Par.dat"    VIN on Upper(vin.codigo)  =
Upper(par.parte)



              WHERE cap.Data_Cadastro BETWEEN '2007-04-15' AND '2007-06-16'

Tables Involved
---------------

Par_Tit (PAR) table opened shared, has 17019 rows
Capa_Tit (CAP) table opened shared, has 5724 rows
Cod_Par (VIN) table opened shared, has 369 rows

Result Set Generation
---------------------

Result set will be canned

Result set will consist of one or more rows

WHERE Clause Execution
----------------------

The expression:

cap.Data_Cadastro BETWEEN '2007-04-15' AND '2007-06-16'

is OPTIMIZED, covers 635 rows or index keys, costs 18986 bytes, and will be
applied to the Capa_Tit table (CAP) before any joins

Join Ordering
-------------

The driver table is the Par_Tit table (PAR)

The Par_Tit table (PAR) is joined to the Capa_Tit table (CAP) with the INNER
JOIN expression:

Upper(par.protocolo) = Upper(cap.protocolo)

The Par_Tit table (PAR) is joined to the Cod_Par table (VIN) with the INNER
JOIN expression:

Upper(par.parte) = Upper(vin.codigo)

Optimizer will attempt to re-order the joins to a more optimal order
Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the
optimizer to leave the joins in their declared order

Optimized Join Ordering
-----------------------

The driver table is the Cod_Par table (VIN)

The Cod_Par table (VIN) is joined to the Par_Tit table (PAR) with the INNER
JOIN expression:

Upper(vin.codigo) = Upper(par.parte)

The Par_Tit table (PAR) is joined to the Capa_Tit table (CAP) with the INNER
JOIN expression:

Upper(par.protocolo) = Upper(cap.protocolo)

Join Execution
--------------

Costs ARE NOT being taken into account when executing this join
Use the JOINOPTIMIZECOSTS clause at the end of the SQL statement to force
the
optimizer to consider costs when optimizing this join

The expression:

Upper(vin.codigo) = Upper(par.parte)

is UN-OPTIMIZED

The expression:

Upper(par.protocolo) = Upper(cap.protocolo)

is UN-OPTIMIZED

================================================================================
>>>>> 1126 rows affected in 205.859 seconds
================================================================================

You'll notice that both join conditions are un-optimized due to missing
indexes, specifically the CAPA_TIT table does not have a case-insensitive
index on the protocolo field, and the PART_TIT table does not have an index
at all on the parte field.

The second version does not perform join re-ordering because it can't find
an ideal driver table, so it defaults to the one specified in the original
query.  You can see this in action by including the NOJOINOPTIMIZE clause on
the first version of the query - it will then run as slow as the second
version of the query.

Furthermore, your tables are encrypted, so the result is that DBISAM has to
read 17019 * 5724 * 369 rows from an encrypted table in order to process
those joins.

The solution is to hand re-order the joins and include the NOJOINOPTIMIZE
clause, or to make sure that you add the missing indexes.

--
Tim Young
Elevate Software
www.elevatesoft.com

Image