Login ProductsSalesSupportDownloadsAbout |
Home » Technical Support » DBISAM Technical Support » Support Forums » DBISAM SQL » View Thread |
Messages 1 to 10 of 26 total |
Optimising "not" clauses in SQL |
Thu, Nov 12 2009 6:25 PM | Permanent Link |
"Adam H." | Hi all,
I've noticed in DBISam, that if I have a query such as: Select something from mytable where myfield = 'C' The query runs quite fast. (All these examples assume that myfield is indexed). However, if I run something like: Select something from mytable where myfield <> 'C' The query runs substantially slower. (Can be 10 times slower). But, if I run something like Select something from mytable where (myfield = 'A') or myfield = ('B') or (myfield = 'D') or (myfield = 'E').... and so on - the query runs relatively fast again. The question I have is in order to optimise my query with a 'not' statement - is there some trick that I'm missing with the indexing of the field, or am I better to have select statements for the field for all the possible matches except for the field I don't want (such as my 3rd example) as far as speed and optimization is concerned? I'm running DBISam 4.29 on Delphi2007 in Windows 7. Cheers Adam. |
Fri, Nov 13 2009 1:56 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Adam
How about Select something from mytable where NOT myfield = 'C' Roy Lambert |
Fri, Nov 13 2009 12:42 PM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Adam,
<< However, if I run something like: Select something from mytable where myfield <> 'C' The query runs substantially slower. (Can be 10 times slower). >> I depends upon the number of records returned, but such queries can involve a lot of index scanning, and in some cases, can get converted to record scans because the number of records returned is very high. How many records are in the source table that don't match 'C' ? -- Tim Young Elevate Software www.elevatesoft.com |
Sun, Nov 15 2009 7:36 PM | Permanent Link |
"Adam H." | Good Morning Tim,
Thanks for your reply mate, > I depends upon the number of records returned, but such queries can involve > a lot of index scanning, and in some cases, can get converted to record > scans because the number of records returned is very high. > > How many records are in the source table that don't match 'C' ? Can I give 'LOTS' as an answer? At present there are around 56000 records that don't match 'C'. The table has a record size of 1856 bytes. The field is indexed, but I'm guessing that this may not make a difference the way NOT seems to work? I'm just trying to find a more optimized solution to run my query. (Hopefully not something too difficult or complex considering the simplicity of the original SQL). Cheers mate Adam. |
Sun, Nov 15 2009 7:36 PM | Permanent Link |
"Adam H." | Hi Roy,
> Select something > from mytable > where NOT myfield = 'C' Thanks for the idea, but unfortunately that seems to take the same amount of time as using the <> expression. Cheers mate Adam. |
Mon, Nov 16 2009 3:35 PM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Adam,
<< Can I give 'LOTS' as an answer? >> Yes, that is valid answer. Can you do me a favor and post the execution plan for the query ? That may shed some light on what the optimizer is doing. <<The field is indexed, but I'm guessing that this may not make a difference the way NOT seems to work? >> No, it won't make any difference. -- Tim Young Elevate Software www.elevatesoft.com |
Mon, Nov 16 2009 10:27 PM | Permanent Link |
"Adam H." | Hi Tim,
Thanks again for getting back to me. > Can you do me a favor and post the execution plan for the query ? That may > shed some light on what the optimizer is doing. Absolutely. The plan is: ================================================================================ SQL statement (Executed with 4.29 Build 2) ================================================================================ select T.Ticket, T.ProductID From Tickets T where (T.Type <> 'C') Tables Involved --------------- Tickets (T) table opened shared, has 65638 rows Result Set Generation --------------------- Result set will be canned Result set will consist of one or more rows WHERE Clause Execution ---------------------- The expression: T.Type <> 'C' is OPTIMIZED, covers 716 rows or index keys, costs 9021 bytes, and will be applied to the Tickets table (T) before any joins ================================================================================ >>>>> 56284 rows affected in 12.547 seconds ================================================================================ as opposed to: ================================================================================ SQL statement (Executed with 4.29 Build 2) ================================================================================ select T.Ticket, ProductID From Tickets T where (T.Type = 'A') or (T.Type = 'B') or (T.Type = 'D') or (T.Type = 'E') or (T.Type = 'F') or (T.Type = 'G') or (T.Type = 'H') or (T.Type = 'I') or (T.Type = 'J') or (T.Type = 'K') or (T.Type = 'L') or (T.Type = 'M') or (T.Type = 'N') or (T.Type = 'O') or (T.Type = 'P') or (T.Type = 'Q') or (T.Type = 'R') or (T.Type = 'S') or (T.Type = 'T') or (T.Type = 'U') or (T.Type = 'V') or (T.Type = 'W') or (T.Type = 'X') or (T.Type = 'Y') or (T.Type = 'Z') OR (T.Type is null) Tables Involved --------------- Tickets (T) table opened shared, has 65638 rows Result Set Generation --------------------- Result set will be canned Result set will consist of one or more rows WHERE Clause Execution ---------------------- The expression: T.Type = 'A' or T.Type = 'B' or T.Type = 'D' or T.Type = 'E' or T.Type = 'F' or T.Type = 'G' or T.Type = 'H' or T.Type = 'I' or T.Type = 'J' or T.Type = 'K' or T.Type = 'L' or T.Type = 'M' or T.Type = 'N' or T.Type = 'O' or T.Type = 'P' or T.Type = 'Q' or T.Type = 'R' or T.Type = 'S' or T.Type = 'T' or T.Type = 'U' or T.Type = 'V' or T.Type = 'W' or T.Type = 'X' or T.Type = 'Y' or T.Type = 'Z' OR T.Type is null is OPTIMIZED, covers 56284 rows or index keys, costs 709177 bytes, and will be applied to the Tickets table (T) before any joins ================================================================================ >>>>> 56284 rows affected in 2.312 seconds ================================================================================ Cheers mate, Adam. |
Tue, Nov 17 2009 5:59 PM | Permanent Link |
"Adam H." | Hi Tim,
With further investigation, I've found something else... When I run the Query with (T.Type <> 'C') is for the first time, I get the results as reported, however if I run it again it works much faster the second time. (Roughly the same amount of time as the query with the expression (T.Type = 'A') or (T.Type = 'B').... etc I don't know if this has anything to do with the issue at hand or not. Cheers Adam. |
Tue, Nov 17 2009 6:18 PM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Adam,
<< Absolutely. The plan is: >> Hmm, the execution plan for this needs some further work. The stats reported for the WHERE clause don't match the number of resulting records, which is okay since it is actually reporting the number of matching records for 'C'. However, what it isn't reporting is a further record scan that has to occur. What the DBISAM optimizer does in these cases is this: 1) Scan all index keys where Type='C', and populate a bitmap with the corresponding records. 2) Flip the bitmap. 3) Scan all records that match the set bits in the bitmap and make sure that they aren't deleted. It is 3) that is most likely taking up the majority of the time, and this is what the execution plan is leaving out. -- Tim Young Elevate Software www.elevatesoft.com |
Wed, Nov 18 2009 7:51 PM | Permanent Link |
"Adam H." | Hi Tim,
Thanks for your reply... > Hmm, the execution plan for this needs some further work. The stats > reported for the WHERE clause don't match the number of resulting records, > which is okay since it is actually reporting the number of matching records > for 'C'. However, what it isn't reporting is a further record scan that has > to occur. > > What the DBISAM optimizer does in these cases is this: > > 1) Scan all index keys where Type='C', and populate a bitmap with the > corresponding records. > 2) Flip the bitmap. > 3) Scan all records that match the set bits in the bitmap and make sure that > they aren't deleted. > > It is 3) that is most likely taking up the majority of the time, and this is > what the execution plan is leaving out. Thanks for the explanation. However, (just out of curiosity) if we were to do a Query where Type='C' as opposed to "NOT C", wouldn't step 3 still need to occur in this instance too? (As it's my impression that deleted records still hold all the data, they're just marked as 'deleted' until that record is overwritten with another one). Cheers mate Adam. |
Page 1 of 3 | Next Page » | |
Jump to Page: 1 2 3 |
This web page was last updated on Saturday, May 4, 2024 at 12:54 AM | Privacy PolicySite Map © 2024 Elevate Software, Inc. All Rights Reserved Questions or comments ? E-mail us at info@elevatesoft.com |