Icon View Thread

The following is the text of the current message along with any replies.
Messages 11 to 17 of 17 total
Thread Worse SQL speed for LIKE
Thu, Mar 22 2012 12:46 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Roy,

<< Just thinking about this again having read Tim's answer. Have you tried
>>

There is an issue with using the text-indexing due to the way that some of
the text is formatted in his CLOB column.  The word-breaking causes breaks
in ways that don't allow it to work for his needs.

Normally, you are 100% correct - optimized CONTAINS first, and then weed
down the results with a LIKE (CONTAINS AND LIKE).

--
Tim Young
Elevate Software
www.elevatesoft.com
Tue, Mar 27 2012 11:34 AMPermanent Link

Rolf Frei

eicom GmbH

Roy

This is something I did try already. This makes no real difference to the
LIKE SQL. The Text Index doesn't help anything noticable for this type of
SQL searches. I would try it with Info CONATAINS 'akku*' to get better
results, but sadly this is still much to slow. In my case the sample 11'000
records table is by far not the biggest and if it takes that long with it,
it would take forever on one table with 80'000 parts, which one of my
customers will have soon. Contains and Text Indexes only work fast in EDB if
the exact word is searched like Info CONTAINS 'akku', but that is returrning
to less records and is not usabel in practice this way.

The bad thing is, that DBISAM is lightning fast with text indexes and
returns the 'akku*'records in 0.01 seconds from a 35'000 records Parts
table. EDB needs for the same a few seconds! As of this problems EDB2 will
not be usable and I wonder how other guys are unsing it in any acceptable
speed for searching large tables.

As the search functionality is fundemental in my application, which is a
online web shop, at the moment it looks realy bad for me, as I will not be
able to use EDB 2. So I will only have 3 options:

1. Stay with DBISAM 3 forever
2. Move to DBSIAM 4 to get some new functionality (custom functions,
asc/desc indexes)
3. Move to another DBMS as MS SQL, MySQL, Firebird or whatever.

I was a realy big Elevate Software fan, but this problems with EDB2, forcing
me to move away from them. I'm very unhappy and disappointed about this and
it gives me sleepless nights.  Frown

Regards
Rolf


"Roy Lambert"  schrieb im Newsbeitrag
news:9BCD424C-5A71-46C0-9216-81BFB1D97BFD@news.elevatesoft.com...

Rolf


Just thinking about this again having read Tim's answer. Have you tried


Select * from parts
where
 Info contains '*akku*' or
 Info contains '*asus*' or
 Info contains '*ati*'
order by description

I think ElevateDB will do that these days.

Roy Lambert [Team Elevate]
Tue, Mar 27 2012 12:59 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Rolf,

As answered via email:

<< I use this type of LIKE queries since years and it worked corretly all
the time. I can't see what should be wrong with it?! This must be a very
rare corner case in any other languages than in german, as I never had a
problem with it. But what  multiple characters do collate as one character?
I can't think of any in german, and does it at all existing in any
ANSI-charset
(ASCII 0-255) languages at all? Isn't it a Unicode problem? This should be
optional if absolutly needed. >>

The problem is this:

1) Partial-searches, whether they are in a LIKE or in any other type of text
searching construct, can be subject to multiple characters collating to a
single character, or different comparison results depending upon which
character is being compared:

http://www.elevatesoft.com/incident?action=viewrep&category=edb&release=2.03&incident=3249

2) EDB uses the built-in Windows collation facilities.  Therefore, it is not
possible for me to determine exactly which languages may have the above
properties, so I cannot "code around" the issue when it comes to
non-ANSI/Unicode collations (built-in EDB collations).

3) DBISAM didn't have this issue because it didn't even pay attention to it.
It simply returned weird results for the above situations.

Finally, text-indexed searches in EDB that don't use wildcards are always
instantaneous, so this has nothing to do with my abilities as a developer or
improper coding.  All of the overhead is due to the above requirements,
combined with the fact that millions of calls to the Windows API for
properly comparing characters takes a lot of time, and there's nothing I can
do about that.

I'm looking into making sure that the built-in (in EDB, not
Windows-provided) ANSI/Unicode collations aren't subject to the above issues
for the next build.  I think the LIKE is still subject to it, but not the
CONTAINS.  However, this will require that you switch you index collations,
or add new indexes with ANSI collations, in order to take advantage of the
better performance.

--
Tim Young
Elevate Software
www.elevatesoft.com
Tue, Mar 27 2012 1:46 PMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Rolf


One last experiment for you - have a look in the extensions newsgroup

From: Kick
Subject: lightningfast partial search with custom word generator

To quote from Kick's post "On a 1.2 milion record database searching for part of a word in a 70 char string is almost instant."

What he (or possibly she) has done is write a custom word generator so that any search is essentially for a complete word. I don't know if it can be adopted to your requirements but possibly worth a look.

Roy Lambert [Team Elevate]
Tue, Mar 27 2012 1:59 PMPermanent Link

Rolf Frei

eicom GmbH

Roy

As far I can understand what he does, this would generate extremly large
index files and would take forever to generate it. He uses a 70 char field,
but in my case I have 3 char fields with total of 300 chars and in addition,
and here will be the big problem, 2 CBLOB fields with much of product
informations in it (1-2K of text or more on every record). Thank you anyway
to pointing me to it.

Regards
Rolf



"Roy Lambert"  schrieb im Newsbeitrag
news:28BC7A44-DDA3-4DB5-A005-B136962C071E@news.elevatesoft.com...

Rolf


One last experiment for you - have a look in the extensions newsgroup

From: Kick
Subject: lightningfast partial search with custom word generator

To quote from Kick's post "On a 1.2 milion record database searching for
part of a word in a 70 char string is almost instant."

What he (or possibly she) has done is write a custom word generator so that
any search is essentially for a complete word. I don't know if it can be
adopted to your requirements but possibly worth a look.

Roy Lambert [Team Elevate]
Tue, Mar 27 2012 2:37 PMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Rolf


>As far I can understand what he does, this would generate extremly large
>index files

Oh yes Smileybut is that a problem?

and would take forever to generate it. He uses a 70 char field,
>but in my case I have 3 char fields with total of 300 chars and in addition,
>and here will be the big problem, 2 CBLOB fields with much of product
>informations in it (1-2K of text or more on every record). Thank you anyway
>to pointing me to it.

I'd be willing to have a go and see how long, how big if you want to email me a sample database.

Roy Lambert [Team Elevate]
Thu, Apr 5 2012 12:51 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Rolf,

<< As far I can understand what he does, this would generate extremly large
index files and would take forever to generate it. He uses a 70 char field,
but in my case I have 3 char fields with total of 300 chars and in addition,
and here will be the big problem, 2 CBLOB fields with much of product
informations in it (1-2K of text or more on every record). Thank you anyway
to pointing me to it. >>

The only issue is how quick inserts, updates, and deletes are (after the
text index(es) are built).  If the performance is acceptable, and it should
most definitely be fine, then you shouldn't have any other issues.  This is,
of course, all dependent upon whether his word generation technique would
work for your needs.

--
Tim Young
Elevate Software
www.elevatesoft.com


« Previous PagePage 2 of 2
Jump to Page:  1 2
Image