Login ProductsSalesSupportDownloadsAbout |
Home » Technical Support » ElevateDB Technical Support » Support Forums » ElevateDB SQL » View Thread |
Messages 1 to 10 of 17 total |
Worse SQL speed for LIKE |
Sat, Mar 17 2012 12:29 PM | Permanent Link |
Rolf Frei eicom GmbH | I have now moved some of my larger tables to EDB2 from DBISAM3 and did some
speed tests agianst it. I'm right shocked at the moment, as EDB2 is extremly slower than DBISAM3. The following simple SQL takes 2.8 seconds in DBSIAM3, but in EDB it needs 10.8 (!!!) seconds. Select * from parts where LOWER(info) like '%akku%' or Lower(Description) like '%akku%' order by description What is going on here? I tought EDB should be faster? In EDB2 I have created an unicode DB, may that make that huge difference? All indexes are the same as in DBISAM3. If this is normal in EDB2, this will be a show stopper for me, as this like query is right much used in the Web for database product searching. Another question: Is there a simple way to convert a EDB Unicode DB to an ANSI DB? Regards Rolf |
Sat, Mar 17 2012 1:25 PM | Permanent Link |
Rolf Frei eicom GmbH | Some other strange timings:
Select * from parts where Lower(Description) like '%akku%' or Lower(Description) like '%asus%' or Lower(Description) like '%ati%' order by description DBSIAM3: 0.5s EDB2: 1.2s (2.5x slower) I did now create a Textindex on the Infofield and get this results: Select * from parts where Info contains 'akku' or Info contains 'asus' or Info contains 'ati' order by description DBSIAM3 not tested EDB2 0.15s (these seems to be acceptable) But now it gets very strange on combining this two queries: Select * from parts where Info contains 'akku' or Info contains 'asus' or Info contains 'ati' or Lower(Description) like '%akku%' or Lower(Description) like '%asus%' or Lower(Description) like '%ati%' order by description Here I expect to get something about 1.0-1.5 sec but this SQL needs 14.2 s(!!!). What the hell is going on here? Regards Rolf |
Sun, Mar 18 2012 5:09 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Rolf
First you don't need the Lower() any more to make things case insensitive. This is now achieved via collations. Second is that collations can really mess you up if you don't "appreciate" them. Is the index you think you're using the same collation as the column you're testing? eg the column may be ANSI_CI and the index ANSI. If that's the case it won't be used. There have been many posts which seem to indicate that DBISAM V3 was the fastest. I know Tim has explained a number of times what was changed (memory says something about buffers). Roy Lambert |
Mon, Mar 19 2012 4:01 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Rolf
A post over in the DBISAM section jogged my memory for another issue - are your tables encrypted. If so that will be part of the answer. I don't know how big a part though. DBISAM V3 used an XOR encryption method simple, fast and easy to break. ElevateDB uses a much more secure method but that will come at a bit of a performance penalty. Nowhere as big as you're quoting which is much more likely to be an index / collation problem. Roy Lambert [Team Elevate] |
Mon, Mar 19 2012 11:48 AM | Permanent Link |
Rolf Frei eicom GmbH | Roy
In DBSIAM 3 both fields (Descripton VARCHAR(100) and Info CBLOB) doesn't have an index at all. It's a simple sequential rowscan what is done. The first test in EDB was with no indexes as well. So it did a rowscan also, but much, much slower. As of the Lower(): I need it as the collation isn't and should not be CI. This query returns worse result: Select * from parts where LOWER(info) like '%akku%' or LOWER(info) like '%asus%' or LOWER(info) like '%ati%' LOWER(description) like '%akku%' or LOWER(description) like '%asus%' or LOWER(description) like '%ati%' order by description EDB and DBISAM have no indexes on this fields. The Collation is DES (Swiss german with no CI). There are a total of 11'000 records and this query returns 3600 records of them. DBISAM: 4.5 Second EDB2: 23.5 Seconds I was able to reduce the time by adding a textindex and using "contains", but so I will not anymore get the same results back. In german there are many concatinated words like "Buchtitel" which are two words on english "book title". As such, any searches for "Buch" will not anymore find what I expect. As of this problem I don't want to use Textindexes. In my case the search with "like" returns 3600 records, where "conatins" only returns 600. I realy wonder what is that different to DBISAM that it takes about 5x more time to get the same results in EDB?! At the moment I'm absolutly frustrated, as this will make EDB2 ununsable for me. I can live with about 10% speed degration but by far not with 520% as in my sample. There must be something wrong in the engine für LIKE conditions. It looks to me as in EDB every single LIKE makes a complete table rowscan. So in this case it scans the complete table 6 times. This isn't needed at all and if it is realy the case, there is a big probem in EDB than, which Tim should looking at. But I cant realy believe that it is realy done this way. Regards Rolf "Roy Lambert" schrieb im Newsbeitrag news:2B842CBC-8BAE-4B23-AA21-D3061C28C68E@news.elevatesoft.com... Rolf First you don't need the Lower() any more to make things case insensitive. This is now achieved via collations. Second is that collations can really mess you up if you don't "appreciate" them. Is the index you think you're using the same collation as the column you're testing? eg the column may be ANSI_CI and the index ANSI. If that's the case it won't be used. There have been many posts which seem to indicate that DBISAM V3 was the fastest. I know Tim has explained a number of times what was changed (memory says something about buffers). Roy Lambert |
Mon, Mar 19 2012 11:52 AM | Permanent Link |
Rolf Frei eicom GmbH | Roy
The table isn't encrypted, not in DBISAM and not in EDB2. So this can not be the reason. I have no also tried to run the queries against an ANSI EDB2 table, but the result is near the same. The ANSI EDB" Version is a little bit faster then den Unicode Version of EDB. Regards Rolf "Roy Lambert" schrieb im Newsbeitrag news:8F489494-30E3-4FB8-A7A8-52956854158E@news.elevatesoft.com... Rolf A post over in the DBISAM section jogged my memory for another issue - are your tables encrypted. If so that will be part of the answer. I don't know how big a part though. DBISAM V3 used an XOR encryption method simple, fast and easy to break. ElevateDB uses a much more secure method but that will come at a bit of a performance penalty. Nowhere as big as you're quoting which is much more likely to be an index / collation problem. Roy Lambert [Team Elevate] |
Mon, Mar 19 2012 12:23 PM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Rolf
>As of the Lower(): I need it as the collation isn't and should not be CI. You still don't need it The approach is to use collate as part of the query eg description COLLATE DES_CI LIKE '%asus% I now know you're using ORDER BY against an unindexed column. This will also slow things down, and it will (I think) force the production of an insensitive (ie canned) result set. If DBISAM V3 was producing a Sensitive (live) result set that could go a long way towards explaining the timing differences. LOWER() will work as well but using COLLATE will give you added flexibility. >I was able to reduce the time by adding a textindex and using "contains", >but so I will not anymore get the same results back. In german there are >many concatinated words like "Buchtitel" which are two words on english >"book title". As such, any searches for "Buch" will not anymore find what I >expect. As of this problem I don't want to use Textindexes. In my case the >search with "like" returns 3600 records, where "conatins" only returns 600. what happens if you try CONTAINS 'buch*' >I realy wonder what is that different to DBISAM that it takes about 5x more >time to get the same results in EDB?! At the moment I'm absolutly >frustrated, as this will make EDB2 ununsable for me. I can live with about >10% speed degration but by far not with 520% as in my sample. There must be >something wrong in the engine für LIKE conditions. > >It looks to me as in EDB every single LIKE makes a complete table rowscan. >So in this case it scans the complete table 6 times. This isn't needed at >all and if it is realy the case, there is a big probem in EDB than, which >Tim should looking at. But I cant realy believe that it is realy done this >way. This I can't comment on. My suggestion at this point is to email Tim direct - he'll probably want a copy of the table(s) and associated config file. Roy Lambert [Team Elevate] |
Mon, Mar 19 2012 12:58 PM | Permanent Link |
Rolf Frei eicom GmbH | Roy
>> description COLLATE DES_CI LIKE '%asus% Changed to this type of SQL, but still the same timings. >>I now know you're using ORDER BY against an unindexed column. This will >>also slow things down, and it will (I think) force the production of an >>insensitive (ie canned) result set. If DBISAM V3 was producing a Sensitive >>(live) result set that could go a long way towards explaining the timing >>differences. Removing the Order by reduces the time used from 23.5 to 22.8 seconds. As there is no index on description, in DBSIAM3 this query retruns a canned result set as in EDB2. DBISAM needs 4.0 instead 4.5 with the order by. >> what happens if you try CONTAINS 'buch*' This will only return words beginning with "buch", but the search word should also find words ending with it or inside another longer word. >> This I can't comment on. My suggestion at this point is to email Tim >> direct - he'll probably want a copy of the table(s) and associated config >> file. I have just sent an email. Thanks anyway for your efforts! Regards Rolf |
Wed, Mar 21 2012 2:54 PM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Rolf,
<< In DBSIAM 3 both fields (Descripton VARCHAR(100) and Info CBLOB) doesn't have an index at all. It's a simple sequential rowscan what is done. The first test in EDB was with no indexes as well. So it did a rowscan also, but much, much slower. >> As indicated via email, the issue is with the number of string comparisons in the LIKE functionality. EDB has to make a Windows API call to compare characters in a collation-sensitive way, and that is accounting for the majority of the execution time. The Windows API call is made around 83 million times in the process of executing that query, as many of the rows have several KB of text that must be compared character-by-character. Also, as I said in my email, there really is no way around this at this time, but I'll keep digging to see if the number of calls can be reduced by modifying the logic of the LIKE functionality. In general, though, there are some distinct differences in the way that EDB's LIKE functionality works when compared to DBISAM. Most notably, EDB does some extra comparisons to account for certain collations like German that may match on more than one character. This can cause additional overhead, and in the case of your query, it accounts for 15 additional seconds of overhead (out of 35 seconds total execution time). << As of the Lower(): I need it as the collation isn't and should not be CI. >> What you're doing is essentially a case-insensitive operation so, no, you do *not* need the LOWER() call and can use a simple COLLATE specification to achieve the same thing. << It looks to me as in EDB every single LIKE makes a complete table rowscan. So in this case it scans the complete table 6 times. >> This is not correct. It makes one scan through the whole table, and evaluates each row for all of the conditions specified. The query execution plan will tell you this information. -- Tim Young Elevate Software www.elevatesoft.com |
Wed, Mar 21 2012 4:01 PM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Rolf
Just thinking about this again having read Tim's answer. Have you tried Select * from parts where Info contains '*akku*' or Info contains '*asus*' or Info contains '*ati*' order by description I think ElevateDB will do that these days. Roy Lambert [Team Elevate] |
Page 1 of 2 | Next Page » | |
Jump to Page: 1 2 |
This web page was last updated on Sunday, May 19, 2024 at 08:46 AM | Privacy PolicySite Map © 2024 Elevate Software, Inc. All Rights Reserved Questions or comments ? E-mail us at info@elevatesoft.com |