Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 6 of 6 total
Thread Performance problems with this SQL
Thu, Nov 1 2007 8:50 PMPermanent Link

"Adam H."
Hi,

I have the following SQL, and with the data I currently have, it takes
around 3 seconds to execute:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
select 0 as TotalHrsClaimed, 0 as totalclaims, cd."Max SCH", a.Code,
a.Surname, a.Firstname, ac.PartA, av.AppCourseID, sum(hrs) as assessed, 0 as
credman, 0 as NYC, sum(HrsClaimable) as TotalHrsClaimable
from apprenticevisits av
inner join apprenticecourse ac on (ac.ID = av.AppCourseID)
left outer join apprentice a on (a.ID = ac.appID)
left outer join coursedet cd on (cd.courseID = ac.CourseID) and (cd.item =
ac.CourseItem)
where
/*1*/ (ac.ID = ':ID')
/*2*/ and (AV.credited <> TRUE) and (AV.manual <> true)
Group by cd."Max SCH", a.Code, a.Surname, a.Firstname, ac.PartA,
av.AppCourseID
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

The real gotcha for me is, if I remove line /*2*/ out of the SQL, it
executes almost immediately, as line /*1*/ reduces the number of records
down to next to none.

My understanding is that the query should execute line /*1*/ first, and then
execute line /*2*/ after, which would make it nearly instant. Instead, for
some reason it does not.

Can someone please inform me as to why line /*2*/ is causing so many
overheads, or what I can do about it to increase performance? (I have posted
the 'Plan' below).

Thanks & Regards

Adam.




================================================================================
SQL statement (Executed with 4.25 Build 3)
================================================================================

select 0 as TotalHrsClaimed, 0 as totalclaims, cd."Max SCH", a.Code,
a.Surname,
a.Firstname, ac.PartA, av.AppCourseID, sum(hrs) as assessed, 0 as credman, 0
as
NYC, sum(HrsClaimable) as TotalHrsClaimable

from apprenticevisits av

inner join apprenticecourse ac on (ac.ID = av.AppCourseID)

left outer join apprentice a on (a.ID = ac.appID)

left outer join coursedet cd on (cd.courseID = ac.CourseID) and (cd.item =
ac.CourseItem)

where

/**/ (ac.ID = ':ID')

/**/ and (AV.credited <> TRUE) and (AV.manual <> true)

Group by cd."Max SCH", a.Code, a.Surname, a.Firstname, ac.PartA,
av.AppCourseID

Tables Involved
---------------

apprenticevisits (av) table opened shared, has 87312 rows
apprenticecourse (ac) table opened shared, has 11089 rows
apprentice (a) table opened shared, has 10049 rows
coursedet (cd) table opened shared, has 649 rows

Result Set Generation
---------------------

Result set will be canned

Result set will consist of one or more grouped rows

Result set will be grouped by the following column(s) using a temporary
index:

Max SCH
Code
Surname
Firstname
PartA
AppCourseID

Result set will be ordered by the temporary index created for the grouping

WHERE Clause Execution
----------------------

The expression:

ac.ID = ':ID'

is OPTIMIZED, covers 0 rows or index keys, costs 0 bytes, and will be
applied
to the apprenticecourse table (ac) before any joins

The expression:

AV.credited <> TRUE

is OPTIMIZED, covers 87183 rows or index keys, costs 854393 bytes, and will
be
applied to the apprenticevisits table (av) before any joins

The expression:

AV.manual <> true

is OPTIMIZED, covers 87309 rows or index keys, costs 855628 bytes, and will
be
applied to the apprenticevisits table (av) before any joins

Join Ordering
-------------

The driver table is the apprenticevisits table (av)



The apprenticevisits table (av) is joined to the apprenticecourse table (ac)
with the INNER JOIN expression:



av.AppCourseID = ac.ID



The apprenticecourse table (ac) is joined to the apprentice table (a) with
the
LEFT OUTER JOIN expression:



ac.appID = a.ID



The apprenticecourse table (ac) is joined to the coursedet table (cd) with
the
LEFT OUTER JOIN expression:



ac.CourseID = cd.courseID and ac.CourseItem = cd.item

Optimizer will attempt to re-order the joins to a more optimal order
Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the
optimizer to leave the joins in their declared order

Optimized Join Ordering
-----------------------

The driver table is the apprenticecourse table (ac)



The apprenticecourse table (ac) is joined to the apprenticevisits table (av)
with the INNER JOIN expression:



ac.ID = av.AppCourseID



The apprenticecourse table (ac) is joined to the apprentice table (a) with
the
LEFT OUTER JOIN expression:



ac.appID = a.ID



The apprenticecourse table (ac) is joined to the coursedet table (cd) with
the
LEFT OUTER JOIN expression:



ac.CourseID = cd.courseID and ac.CourseItem = cd.item

Join Execution
--------------

Costs ARE NOT being taken into account when executing this join
Use the JOINOPTIMIZECOSTS clause at the end of the SQL statement to force
the
optimizer to consider costs when optimizing this join

================================================================================
>>>>> 0 rows affected in 3.047 seconds
================================================================================

Sat, Nov 3 2007 6:15 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Adam,

<< The real gotcha for me is, if I remove line /*2*/ out of the SQL, it
executes almost immediately, as line /*1*/ reduces the number of records
down to next to none. >>

My understanding is that the query should execute line /*1*/ first, and
then execute line /*2*/ after, which would make it nearly instant. Instead,
for some reason it does not.

Can someone please inform me as to why line /*2*/ is causing so many
overheads, or what I can do about it to increase performance? (I have posted
the 'Plan' below). >>

Without seeing the actual tables, my guess is that it's all down to the two
index scans on the AV table that cover 87,000+ keys.  That's what is taking
so long, and the two index scans are probably cheaper than scanning the
entire 87,000 rows for both conditions.

What I think you're expecting the optimizer to do is to take into account
the INNER JOIN as part of the criteria by which it determines how to handle
the AV table's conditions.  However, DBISAM does not do this, and that is
because projecting the number of rows that satisfy a join condition is very
difficult to do accurately.

--
Tim Young
Elevate Software
www.elevatesoft.com

Sun, Nov 4 2007 8:05 PMPermanent Link

"Adam H."
Hi Tim,

Thanks for your reply.

> What I think you're expecting the optimizer to do is to take into account
> the INNER JOIN as part of the criteria by which it determines how to
> handle the AV table's conditions.  However, DBISAM does not do this, and
> that is because projecting the number of rows that satisfy a join
> condition is very difficult to do accurately.

Yes - that's exactily what I was thinking DBISam should have been doing, but
now I know better - thanks. Smiley

FYI, I have managed to get this running faster now that I know the inner
join won't work the way I thought. Instead with a bit of playing around, I
have found that the following SQL works much faster:

select 0 as TotalHrsClaimed, 0 as totalclaims, cd."Max SCH", a.Code,
a.Surname, a.Firstname, ac.PartA, av.AppCourseID, sum(hrs) as assessed, 0 as
credman, 0 as NYC, sum(HrsClaimable) as TotalHrsClaimable
from apprenticecourse ac
left outer join apprenticevisits av on (AV.AppCourseID = AC.ID)
left outer join apprentice a on (a.ID = ac.appID)
left outer join coursedet cd on (cd.courseID = ac.CourseID) and (cd.item =
ac.CourseItem)
/*1*/ where ac.ID = ':ID'
and (AV.credited <> TRUE) and (AV.manual <> true)/* and (AV.NYC <> True)
NYC hours ARE claimable! */
Group by cd."Max SCH", a.Code, a.Surname, a.Firstname, ac.PartA,
av.AppCourseID


Since this still does what I'm after, and works much faster - I'll stick
with this.

Out of curiosity, does EDB work with inner joins the way that I would
expect, or is it the same as DBISam?

Thanks & Regards

Adam.
Mon, Nov 5 2007 1:28 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Adam,

I'm glad that you've got a workaround.

<< Out of curiosity, does EDB work with inner joins the way that I would
expect, or is it the same as DBISam? >>

It works the same as DBISAM in this respect.

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Nov 6 2007 4:26 AMPermanent Link

"John Hay"
Adam,

> Hi,
>
> I have the following SQL, and with the data I currently have, it takes
> around 3 seconds to execute:
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> select 0 as TotalHrsClaimed, 0 as totalclaims, cd."Max SCH", a.Code,
> a.Surname, a.Firstname, ac.PartA, av.AppCourseID, sum(hrs) as assessed, 0
as
> credman, 0 as NYC, sum(HrsClaimable) as TotalHrsClaimable
> from apprenticevisits av
> inner join apprenticecourse ac on (ac.ID = av.AppCourseID)
> left outer join apprentice a on (a.ID = ac.appID)
> left outer join coursedet cd on (cd.courseID = ac.CourseID) and (cd.item =
> ac.CourseItem)
> where
> /*1*/ (ac.ID = ':ID')
> /*2*/ and (AV.credited <> TRUE) and (AV.manual <> true)
> Group by cd."Max SCH", a.Code, a.Surname, a.Firstname, ac.PartA,
> av.AppCourseID
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

I know you have solved your problem but just as a matter of interest if you
change (ac.ID=':ID') to (av.AppCourseID=':ID') does the performance improve?

John

Wed, Nov 7 2007 4:23 PMPermanent Link

"Adam H."
Hi John,

Good Question. I'm currently away from that site now, but will look into it,
and get back to you.

Best Regards

Adam.

> I know you have solved your problem but just as a matter of interest if
> you
> change (ac.ID=':ID') to (av.AppCourseID=':ID') does the performance
> improve?
Image