Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 7 of 7 total
Thread Huge difference between almost the SQL statements
Wed, Mar 21 2007 1:05 PMPermanent Link

"Jose Eduardo Helminsky"
Tim (or others)

I would like some explanation about the following situation:

Table: HNOT
===========
Number of records: 366393
Main fields:
 EMP - Integer
 SER - String
 NUM - Integer
 SEQ - Integer
 EMI - DateTime

 Primary key: Emp + Ser + Num + Seq
 Secondary indexes: SER
                    NUM
                    EMI

Table: HINO
===========
Number of records: 835497
Main fields:
  EMP - Integer
  SER - String
  NUM - Integer
  EMI - DateTime

  Primary key: Emp + Ser + Num
  Secondary indexes: SER
                     NUM
                     EMI

Situation I - Query Plan:
-------------------------
SQL statement (Executed with 4.22 Build 4)
================================================================================

select * from hino
inner join hnot on (hino.num=hnot.num and hino.emp=hnot.emp and
hino.ser=hnot.ser)
where emp=1 and emi between '2006-01-01' and '2006-01-31'

Tables Involved
---------------
hino (hino) table opened shared, has 835497 rows
hnot (hnot) table opened shared, has 366393 rows
Result Set Generation
---------------------
Result set will be canned
Result set will consist of one or more rows
WHERE Clause Execution
----------------------
The expression:
emp = 1 and emi between '2006-01-01' AND '2006-01-31'
has been rewritten and is OPTIMIZED, covers 216826 rows or index keys, costs
15764406 bytes, and will be applied to the hino table (hino) before any
joins
Join Ordering
-------------
The driver table is the hino table (hino)
The hino table (hino) is joined to the hnot table (hnot) with the INNER JOIN
expression:
hino.num = hnot.num and hino.emp = hnot.emp and hino.ser = hnot.ser
Optimizer will attempt to re-order the joins to a more optimal order
Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the
optimizer to leave the joins in their declared order
The joins are already in optimal order and cannot be optimized any further
Join Execution
--------------
Costs ARE NOT being taken into account when executing this join
Use the JOINOPTIMIZECOSTS clause at the end of the SQL statement to force
the
optimizer to consider costs when optimizing this join
The expression:
hino.num = hnot.num and hino.emp = hnot.emp and hino.ser = hnot.ser
has been rewritten and is OPTIMIZED
================================================================================
>>>>> 5900 rows affected in 1007,015 seconds
================================================================================

Situation II - Query Plan
================================================================================
SQL statement (Executed with 4.22 Build 4)
================================================================================

select * from hino
inner join hnot on (hino.num=hnot.num)
where emp=1 and hnot.emp=1 and ser='' and hnot.ser='' and
emi between '2006-01-01' and '2006-01-31'

Tables Involved
---------------
hino (hino) table opened shared, has 835497 rows
hnot (hnot) table opened shared, has 366393 rows
Result Set Generation
---------------------
Result set will be canned
Result set will consist of one or more rows
WHERE Clause Execution
----------------------
The expression:
emp = 1
is OPTIMIZED, covers 432821 rows or index keys, costs 10690678 bytes, and
will
be applied to the hino table (hino) before any joins
The expression:
hnot.emp = 1
is OPTIMIZED, covers 171199 rows or index keys, costs 3115821 bytes, and
will
be applied to the hnot table (hnot) before any joins
The expression:
ser = ''
is OPTIMIZED, covers 812759 rows or index keys, costs 13735627 bytes, and
will
be applied to the hino table (hino) before any joins
The expression:
hnot.ser = ''
is OPTIMIZED, covers 357498 rows or index keys, costs 3717979 bytes, and
will
be applied to the hnot table (hnot) before any joins
The expression:
emi between '2006-01-01' AND '2006-01-31'
is OPTIMIZED, covers 216826 rows or index keys, costs 5073728 bytes, and
will
be applied to the hino table (hino) before any joins
Join Ordering
-------------
The driver table is the hino table (hino)
The hino table (hino) is joined to the hnot table (hnot) with the INNER JOIN
expression:
hino.num = hnot.num
Optimizer will attempt to re-order the joins to a more optimal order
Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the
optimizer to leave the joins in their declared order
The joins are already in optimal order and cannot be optimized any further
Join Execution
--------------
Costs ARE NOT being taken into account when executing this join
Use the JOINOPTIMIZECOSTS clause at the end of the SQL statement to force
the
optimizer to consider costs when optimizing this join
The expression:
hino.num = hnot.num
is OPTIMIZED

================================================================================
>>>>> 5900 rows affected in 13,453 seconds
================================================================================

Wed, Mar 21 2007 4:10 PMPermanent Link

"Donat Hebert \(Worldsoft\)"
The first query makes no mention of Ser being null (or blank)  Also check
second query as nickname missing for ser (hino)

hth  Donat.

======================

select * from hino
inner join hnot on (hino.num=hnot.num and hino.emp=hnot.emp and
hino.ser=hnot.ser)
where emp=1 and emi between '2006-01-01' and '2006-01-31'


select * from hino
inner join hnot on (hino.num=hnot.num)
where emp=1 and hnot.emp=1 and ser='' and hnot.ser='' and
emi between '2006-01-01' and '2006-01-31'


Wed, Mar 21 2007 4:23 PMPermanent Link

"Jose Eduardo Helminsky"
Donat

I forgot to type it but it doesn´t matter in the issue because almost ALL
records contains NULL in SER field.
BTW, thanks for help.

Eduardo

Wed, Mar 21 2007 4:46 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Eduardo,

<<  I would like some explanation about the following situation: >>

The two queries aren't the same at all.  The faster one has a much more
selective WHERE clause, which is why it is faster.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, Mar 21 2007 4:55 PMPermanent Link

"Jose Eduardo Helminsky"
Tim

Just to complete my question:

Situation I:
-----------
select * from hino
inner join hnot on (hino.num=hnot.num and hino.emp=hnot.emp and
hino.ser=hnot.ser)
where emp=1 and emi between '2006-01-01' and '2006-01-31'

Situation II:
-----------
select * from hino
inner join hnot on (hino.num=hnot.num)
where emp=1 and hnot.emp=1 and ser='' and hnot.ser='' and
emi between '2006-01-01' and '2006-01-31'

Ok. I agree that second SQL is more selective but even if I put exactly
where clause on the first SQL it will
produce the same results but very very slow. Can I follow a simple rule that
avoid more than one comparation on join and put all conditions at where
clause ?

Eduardo


"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> escreveu na
mensagem news:E28901EE-5426-4CF0-9706-4BEB6B32EAB2@news.elevatesoft.com...
> Eduardo,
>
> <<  I would like some explanation about the following situation: >>
>
> The two queries aren't the same at all.  The faster one has a much more
> selective WHERE clause, which is why it is faster.
>
> --
> Tim Young
> Elevate Software
> www.elevatesoft.com
>

Thu, Mar 22 2007 10:58 AMPermanent Link

"Donat Hebert \(Worldsoft\)"
Eduardo, there is more overhead associated with joins so yes, the more you
can
restrict processing in the Where statement, the better.  The issue is
whether you need
those joins to get the correct result. (Business decision)

As a side note, you'll find that data distribution within a dataset will
impact how much benefit
or cost you'l l get by introducing a key.  Through testing (ie DbSys
portions) for the high priority processing, we
have been able to optimize our calculation routines.  hth.  Donat.

> .... Can I follow a simple rule that avoid more than one comparation on
> join and put all conditions at where clause ?
>
> Eduardo
>

Thu, Mar 22 2007 5:22 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Eduardo,

<< Ok. I agree that second SQL is more selective but even if I put exactly
where clause on the first SQL it will
> produce the same results but very very slow. >>

DBISAM has to perform two extra index scans per each row in the driver
table.  This is going to cause some extra processing.

<< Can I follow a simple rule that avoid more than one comparation on join
and put all conditions at where clause ? >>

You should always put in whatever joins are necessary to get the proper
result based upon the table structures.  That's the only requirement that
matters.

--
Tim Young
Elevate Software
www.elevatesoft.com

Image