Query Speed issues on casting a join |
Mon, Jan 21 2013 12:48 AM
Adam H. | Hi,
I'm looking for suggestions on the best way to handle a challenge I face. I have a query in DBISAM that looks like the following:

Select M.DeltaStudentID, Max(AC.Item) Item into SVTImportMaxItemM
From SVTImport M
inner join ApprenticeCourse AC on (AC.DeltaStudentNo = M.DeltaStudentID)
group by DeltaStudentID;

The initial problem I faced is that DeltaStudentNo / DeltaStudentID are string fields, and in one of the tables the values may contain leading zeros, while in the other they may not. As this comes from an import routine, I don't really have the option of forcing people to use integers, so I match the fields up by casting them to large integer values as follows:

Select M.DeltaStudentID, Max(AC.Item) Item into SVTImportMaxItemM
From SVTImport M
inner join ApprenticeCourse AC on (cast(AC.DeltaStudentNo as LargeInt) = cast(M.DeltaStudentID as LargeInt))
group by DeltaStudentID;

This works fine in DBSys, running quite fast, as I would expect (just over 1 second). However, when I run the same SQL through my application in Delphi, it takes a significant amount of time (seems like more than 10 minutes).

One thing I notice is that if I run it through the IDE I get a significant number of error messages (such as exception class EConvertError with message ''sdash zbn csvvvvvvv' is not a valid integer value'). I'm suspicious that maybe there are different compiler options turned on (debug?) that might be making the try/except block at line 4572 in dbisamlb.pas take longer to execute - but I'm not sure, nor am I sure whether there's a better way to get around this problem.

Does anyone have any thoughts?

Best Regards

Adam.
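The mismatch Adam describes can be illustrated with a short sketch. This is a purely illustrative Python analogue with made-up sample IDs, not DBISAM code: a plain string comparison fails when one side carries leading zeros, while casting both sides to integers matches them up, which is what the SQL cast achieves.

```python
# Illustrative sketch: why the string join finds nothing when one table
# pads IDs with leading zeros, and why the integer cast fixes it.

svt_import = ["123", "456"]              # IDs without leading zeros
apprentice_course = ["00123", "000456"]  # same IDs, zero-padded

# Direct string comparison finds no matches:
string_matches = [s for s in svt_import if s in apprentice_course]

# Casting both sides to integers matches them up, like the LargeInt cast:
course_as_int = {int(c) for c in apprentice_course}
cast_matches = [s for s in svt_import if int(s) in course_as_int]

print(string_matches)  # []
print(cast_matches)    # ['123', '456']
```

The trade-off, as the thread goes on to show, is that a cast inside the join condition prevents the join from using an index.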
Mon, Jan 21 2013 5:16 AM
Roy Lambert NLH Associates Team Elevate | Adam
1) DBSys is "just" a Delphi application to which you have the source code. You can look inside and see how it's compiled, what switches are used, etc.

2) You don't mention which version of Delphi you're using. I think DBSys is compiled in D7, so a later version of Delphi could have an impact, but I seriously doubt it would be of the magnitude you're seeing.

3) Are there any visual controls connected? That's always a good way to slow things down.

4) Even though they are string fields, can you not format them on import, or pre- or post-process them to ensure compatibility?

5) If you can't reformat the fields, can you add an extra field which is an integer?

That's all the general stuff. As for this:

<<Now one thing I notice is that if I run it through the IDE I get a significant amount of error messages (such as exception class EConvertError with message 'sdash zbn csvvvvvvv' is not a valid integer value).>>

I'm not sure exactly what it means. My first guess would be that you have some data that is not formed from digits. My second is that, even with a LargeInt, you have some numbers that are too big to convert. My third is that you have a corrupt table / index. A quick and dirty test would be:

SELECT cast(DeltaStudentNo as LargeInt), DeltaStudentNo
FROM ApprenticeCourse
WHERE cast(DeltaStudentNo as LargeInt) IS NULL

and then the same for the other table.

I'd also be interested in seeing the execution plan, both for running in DBSys and in your app.

Roy Lambert [Team Elevate]
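Roy's points 4 and 5 (normalise the IDs at import time rather than casting inside the join) can be sketched as a small helper. This is an illustrative Python sketch with a hypothetical function name, not part of Adam's import routine; it strips leading zeros and also flags non-digit values, the kind that would raise EConvertError under a LargeInt cast.

```python
def normalize_student_id(raw: str):
    """Return the ID with leading zeros stripped, so both tables store a
    comparable form; return None for values that are not purely numeric
    (these would fail a LargeInt cast and need manual attention)."""
    s = raw.strip()
    if not s.isdigit():
        return None
    # lstrip can consume the whole string for "000", so keep a single "0".
    return s.lstrip("0") or "0"

print(normalize_student_id("00123"))              # '123'
print(normalize_student_id("sdash zbn csvvvvvvv"))  # None
```

Normalising once on import lets the join compare plain strings (or a dedicated integer column, per point 5), keeping it index-optimizable.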
Mon, Jan 21 2013 10:37 AM
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Adam,
<< This works fine in DBSys, running quite fast as I would expect. (Just over 1 second). However, when I run it through my application in Delphi (the same SQL) it takes a significant amount of time. (Seems like more than 10 minutes). >>

I would expect DBSYS to take a long time as well, because the join is not optimizable with the cast in there. Are you sure that you're using the same query?

Thanks,

Tim Young
Elevate Software
www.elevatesoft.com
Mon, Jan 21 2013 11:07 AM
Roy Lambert NLH Associates Team Elevate | Adam
Tim got here before I could add that, but I will add: are you looking at the same data?

Roy Lambert [Team Elevate]
Mon, Jan 21 2013 4:22 PM
Adam H. | Hi Roy,
Thanks for your reply. I'm using Delphi 2007 at present on 64-bit Windows, currently running DBISAM 4.34 b5, but I will rebuild with the latest version to see if it makes a difference.

I do have a progress bar that updates, but even when I disable it (i.e. just run the query normally) it is still very slow.

I will look at creating additional tables in memory, casting the values before doing any joins, to see if that helps, although the thing that has me beat is why DBSys runs so fast while my application runs so slow.

I ran the process last night, and will reply to Tim's post with a copy of the query plans. Last night's run took more than 10 minutes (in fact, it took 8 hours to complete!). I haven't known it to take that long before.

Cheers

Adam.
Mon, Jan 21 2013 4:27 PM
Adam H. | Hi Tim,
Thanks for your reply. Indeed, I would have thought that DBSys would have taken a little longer, but it runs very quickly and efficiently. Last night I ran the two queries, one in DBSys and the other in my application, and generated plans for both of them, which I've posted below.

I can confirm that we're running on the same data (the query plans show the same record counts for each table). Last night, my application took 8 hours to run the same query that DBSys ran in just over a second. (And my application ran after DBSys, so if there was any disk caching involved, I would have thought it should favour my application.)

In all fairness, it has never taken this long before in my application, so I don't know what I've done to make it so different from DBSys. The query component doesn't have a dataset attached, and in fact within my application it is called using ExecSQL as opposed to Open.

Best Regards

Adam.

The plan through DBSys looks like the following:
------------------------------------------------

================================================================================
SQL statement (Executed with 4.34 Build 5)
================================================================================

Select M.DeltaStudentID, Max(AC.Item) Item into SVTImportMaxItemM
From SVTImport M
inner join ApprenticeCourse AC on
(cast(AC.DeltaStudentNo as LargeInt) = cast(M.DeltaStudentID as LargeInt))
group by DeltaStudentID

Tables Involved
---------------
SVTImport (M) table opened shared, has 11 rows
ApprenticeCourse (AC) table opened shared, has 21228 rows

Result Set Generation
---------------------
Result set will be canned
Result set will consist of one or more grouped rows
Result set will be grouped by the following column(s) using a temporary index:
DeltaStudentID
Result set will be ordered by the temporary index created for the grouping

Join Ordering
-------------
The driver table is the SVTImport table (M)
The SVTImport table (M) is joined to the ApprenticeCourse table (AC) with
the INNER JOIN expression:
cast(M.DeltaStudentID,LARGEINT) = cast(AC.DeltaStudentNo,LARGEINT)
Optimizer will attempt to re-order the joins to a more optimal order
Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the
optimizer to leave the joins in their declared order
The joins are already in optimal order and cannot be optimized any further

Join Execution
--------------
Costs ARE NOT being taken into account when executing this join
Use the JOINOPTIMIZECOSTS clause at the end of the SQL statement to force the
optimizer to consider costs when optimizing this join
The expression:
cast(M.DeltaStudentID,LARGEINT) = cast(AC.DeltaStudentNo,LARGEINT)
is UN-OPTIMIZED

================================================================================
>>>>> 3 rows affected in 1.219 seconds
================================================================================

The plan through my application is identical, apart from the final execution time:
-------------------------------------------

================================================================================
>>>>> 3 rows affected in 29161.594 seconds
================================================================================
Mon, Jan 21 2013 5:51 PM
Adam H. | OK, I think I'm onto something here...
I have created a brand new application with just the Engine, Session, Database, and Query components to run the SQL.

If I open the query at design time, it opens very quickly (same as DBSys). However, if I run the application and open it at runtime, it takes a significant amount of time to open. (I haven't allowed it to run long enough yet to get an exact plan or length of time.)

I've then fiddled around with the compiler options, and found that if I turn off DEBUGGING INFORMATION I can get the query to run within a couple of seconds!

I haven't had this issue before, but I'm guessing it has to do with the number of errors that are raised in the try/except section.

Interestingly, when I look at the help for Debug Information, it says that it increases the size of the unit file and takes up additional memory when compiling programs that use the unit, but that it does not affect the size or speed of the executable program.

I'll keep digging...
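DBISAM's internals aside, the pattern Adam is describing (an exception raised per bad row inside a try/except, made slower still when debug hooks are active) has a milder analogue in most languages: raising and catching exceptions in a hot loop costs far more than validating first. A purely illustrative Python timing sketch, using made-up sample data including the kind of junk value from Adam's EConvertError messages:

```python
import timeit

# Hypothetical sample: mostly numeric IDs plus a non-numeric junk value.
rows = ["123", "456", "00789", "sdash zbn csvvvvvvv"] * 500

def convert_with_exceptions():
    # Mirrors a cast wrapped in try/except: every bad row raises.
    out = []
    for r in rows:
        try:
            out.append(int(r))
        except ValueError:
            out.append(None)
    return out

def convert_with_precheck():
    # Validates first, so no exceptions are ever raised.
    return [int(r) if r.strip().isdigit() else None for r in rows]

t_exc = timeit.timeit(convert_with_exceptions, number=20)
t_chk = timeit.timeit(convert_with_precheck, number=20)
print(f"try/except: {t_exc:.4f}s  pre-check: {t_chk:.4f}s")
```

Both functions produce identical results; only the cost of the error path differs. It is the same idea behind cleaning the data before the cast rather than letting the engine swallow an exception per row.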
Tue, Jan 22 2013 5:28 AM
Roy Lambert NLH Associates Team Elevate | Adam
>I've then fiddled around with the compiler options. I've found that if I
>turn off DEBUGGING INFORMATION - I can get the query to run within a
>couple of seconds!!!!
>
>I haven't had this issue before, but I'm guessing it has to do with the
>number of errors that are raised in the try/except section.
>
>Interesting, because when I look at the help for Debug Information it
>says that it increases the size of unit file and takes up additional
>memory when compiling programs that use the unit, but it does not affect
>the size or speed of the executable program.

This makes me wonder. Are you using FastMM4 in debug mode to trace memory leaks? I had a problem with this in ElevateDB where queries were running faster in EDBManager than in my app. What was happening was that FastMM was keeping hold of things so that it could report on memory leaks, and thus slowed everything down. I don't remember it being as extreme as your case, though.

My "fix" was to make FastMM carry out its checks only when running in the IDE. Add the following code into the dpr, just after the begin:

if DebugHook <> 0 then
begin
  ReportMemoryLeaksOnShutdown := True;
  SetMMLogFileName(PChar('C:\zap\Leaks.txt'));
end;

Roy Lambert [Team Elevate]
Tue, Jan 22 2013 9:11 PM
Adam H. | Hi Roy,
Thanks. I do have FastMM referenced in the uses clause, but I am not using its memory-leak reporting feature. However, I am using EurekaLog, and maybe that has something to do with it.

I was really blown away by the difference in speed between DBSys and my app, so I'll continue to look around in that particular area. In the meantime, I have added {$D-} and {$I-} to some of the DBISAM units to force debugging to be turned off for those.

Cheers

Adam.