Login ProductsSalesSupportDownloadsAbout |
Home » Technical Support » ElevateDB Technical Support » Support Forums » ElevateDB SQL » View Thread |
Messages 1 to 10 of 22 total |
CONTAINS (Full Text Index) vs LIKE ... |
Wed, Aug 12 2009 7:39 PM | Permanent Link |
Charalampos Michael | Hello,
What's your opinion for VARCHAR/CHAR fields ? Thank you -- Charalampos Michael - [Creation Power] - http://www.creationpower.gr |
Thu, Aug 13 2009 2:57 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Charalampos
My take is even with the fact that ElevateDB no longer right trims varchars on storing since it does ignore them on comparison I'd go for them UNLESS I have a fixed length fixed format field that required all characters filling. With VARCHARS you could even find that an = works if the LIKE is just to accommodate blank spaces. CONTAINS vs LIKE. Well CONTAINS is a lot faster, but requires more disk space, more insert, update & delete time and can't handle phrases. Also a pet peeve unlike TEXTSEARCH it ONLY works on indexed fields. I'm using full text indexing a lot. Often for things that could go into a x-ref table but its easier to store everything in the one table and it allows filters to work. For columns with a full text index finding a phrase I generally use a mix of CONTAINS and LIKE. If you do go down the CONTAINS route then unless you write your own text filter and word generator you could find the default one has filtered out some of the words in your field. The same words will be filtered out of any search text but it can seem to give wrong results. But the big difference is word vs phrase searching. Roy Lambert |
Thu, Aug 13 2009 6:49 AM | Permanent Link |
Charalampos Michael | Dear Roy,
> My take is even with the fact that ElevateDB no longer right trims varchars on storing > since it does ignore them on comparison I'd go for them UNLESS I have a fixed length fixed > format field that required all characters filling. With VARCHARS you could even find that > an = works if the LIKE is just to accommodate blank spaces. > > CONTAINS vs LIKE. Well CONTAINS is a lot faster, but requires more disk space, more insert, > update & delete time and can't handle phrases. Also a pet peeve unlike TEXTSEARCH it > ONLY works on indexed fields. I'm using full text indexing a lot. Often for things > that could go into a x-ref table but its easier to store everything in the one table and > it allows filters to work. For columns with a full text index finding a phrase I > generally use a mix of CONTAINS and LIKE. > > If you do go down the CONTAINS route then unless you write your own text filter and word generator > you could find the default one has filtered out some of the words in your field. The same words > will be filtered out of any search text but it can seem to give wrong results. But the big > difference is word vs phrase searching. Very interesting stuff! Thank you Roy! Actually i want to do a "Keyword" searching and maybe use it for filtering. I don't care about disk space, i care about speed and as you said CONTAINS is a lot of faster than LIKE. Since I'm intent to use them on fixed length VARCHAR fields what bothers me is that I'm afraid that CONTAINS may miss keywords where's LIKE will not ... or I'm wrong ? What about the mix of CONTAINS & LIKE ? Is it faster and safer ? Thank you very much -- Charalampos Michael - [Creation Power] - http://www.creationpower.gr |
Thu, Aug 13 2009 10:09 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Charalampos
>Actually i want to do a "Keyword" searching and maybe use it for >filtering. I don't care about disk space, i care about speed and >as you said CONTAINS is a lot of faster than LIKE. > >Since I'm intent to use them on fixed length VARCHAR fields What is a fixed length VARCHAR field when its at home? >what >bothers me is that I'm afraid that CONTAINS may miss keywords >where's LIKE will not ... or I'm wrong ? Not unless they have been filtered out (eg its a stop word or is longer than the max length allowed) and that's down to you setting up the index and the text filter and word generator (hint have a look in the extensions ng for mine) >What about the mix of CONTAINS & LIKE ? Is it faster and safer ? Lets give you some examples. My money making work was recruitment (hopefully it will be again in the near future when the economy picks up) so my examples are slanted towards that. I'm going to pick 2 fields from the Contacts table: Notes: memo free form text Skillsets: memo managed text NOTES is just what it says a note field into which the user can type anything they like eg Been on holiday, likes skiing wife's name is Vicky has 3 children, and he's very good with Microsoft Word SKILLSETS is a list of the skills the contact has eg Delphi, Microsoft Word, C++. These are managed because 1) the field is populated via lookup from a table and 2) the spaces are replaced with a hard space (#160). The reason for the substitution is that Microsoft#160Word is treated as one word whilst Microsoft#32Word would be two words BUT both look the same when printed! So a key difference is the way I have the two set up. Skillsets CONTAINS 'Microsoft#160Word' would only return hits with that "entire word" whereas Notes CONTAINS 'Microsoft Word' would return hits for any row with the words Microsoft and Word anywhere in the text so to get just the MSWord users you'd have to do Notes CONTAINS 'Microsoft Word' AND Notes LIKE '%Microsoft Word%' This will be faster than just Notes LIKE '%Microsoft Word%', but the speed difference will depend on the number of hits that CONTAINS returns to be further processed by LIKE If you want any more can you give me a few examples of what you're looking to do. Roy Lambert [Team Elevate] |
Thu, Aug 13 2009 11:21 AM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Michael,
<< What about the mix of CONTAINS & LIKE ? Is it faster and safer ? >> Using a mix is the best option, and will allow you to have the proper combination of performance and accuracy. -- Tim Young Elevate Software www.elevatesoft.com |
Fri, Aug 14 2009 5:42 AM | Permanent Link |
Charalampos Michael | Dear Roy,
>> Since I'm intent to use them on fixed length VARCHAR fields > > What is a fixed length VARCHAR field when its at home? "Name" VARCHAR(50) COLLATE "UNI", >> what >> bothers me is that I'm afraid that CONTAINS may miss keywords >> where's LIKE will not ... or I'm wrong ? > > Not unless they have been filtered out (eg its a stop word or is longer than the max length allowed) and that's down to you setting up the index and the text filter and word generator (hint have a look in the extensions ng for mine) I don't have any filters .... >> What about the mix of CONTAINS & LIKE ? Is it faster and safer ? > > Lets give you some examples. My money making work was recruitment (hopefully > it will be again in the near future when the economy picks up) so my examples > are slanted towards that. > > I'm going to pick 2 fields from the Contacts table: > > Notes: memo free form text > Skillsets: memo managed text You're talking for CLOB fields while i'm talking for plain VARCHAR fields Although thanks for the information about the filtering because i'll need it! -- Charalampos Michael - [Creation Power] - http://www.creationpower.gr |
Fri, Aug 14 2009 5:43 AM | Permanent Link |
Charalampos Michael | Dear Tim,
> << What about the mix of CONTAINS & LIKE ? Is it faster and safer ? >> > > Using a mix is the best option, and will allow you to have the proper > combination of performance and accuracy. Thanks, so i must have 2 index fields for doing this fast right ? (One Index and One Text Index for the same field) -- Charalampos Michael - [Creation Power] - http://www.creationpower.gr |
Fri, Aug 14 2009 7:07 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Charalampos
>> What is a fixed length VARCHAR field when its at home? > >"Name" VARCHAR(50) COLLATE "UNI", Ahhh you were talking about normal ElevateDB/DBISAM VARCHARs vs PICK like variable length fields? >I don't have any filters .... The text filter is the little dll that you write, or the standard one that Tim provides (which, if it exists, I think just passes the text unchanged) that takes the raw text string and strips out stuff that you don't want like numbers or html or rtf formatting >You're talking for CLOB fields while i'm talking for plain VARCHAR >fields Not a lot of difference really apart from being able to set a normal index on a VARCHAR which you can't do on a CLOB Roy Lambert |
Fri, Aug 14 2009 8:16 PM | Permanent Link |
Charalampos Michael | Dear Roy,
>>> What is a fixed length VARCHAR field when its at home? >> "Name" VARCHAR(50) COLLATE "UNI", > > Ahhh you were talking about normal ElevateDB/DBISAM VARCHARs vs PICK like variable length fields? ? Simple and clean "String" fields ... Rec1: Name = 'Charalampos' Rec2: Name = 'Nick' Rec3: Name = 'Peter' etc >> I don't have any filters .... > > The text filter is the little dll that you write, or the standard one that Tim > provides (which, if it exists, I think just passes the text unchanged) > that takes the raw text string and strips out stuff that you don't want like numbers > or html or rtf formatting That will gonna help me on CLOB fields which i put the RTF/HTML >> You're talking for CLOB fields while i'm talking for plain VARCHAR >> fields > > Not a lot of difference really apart from being able to set a normal index > on a VARCHAR which you can't do on a CLOB You're forgetting that VARCHAR has different file format (limited/fixed) than CLOB. -- Charalampos Michael - [Creation Power] - http://www.creationpower.gr |
Sat, Aug 15 2009 2:52 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Charalampos
>> >> Not a lot of difference really apart from being able to set a normal index > > on a VARCHAR which you can't do on a CLOB > >You're forgetting that VARCHAR has different file format (limited/fixed) >than CLOB. But that makes no difference to full text indexing (apart from the number of words in the field) Roy Lambert |
Page 1 of 3 | Next Page » | |
Jump to Page: 1 2 3 |
This web page was last updated on Sunday, May 19, 2024 at 08:46 AM | Privacy PolicySite Map © 2024 Elevate Software, Inc. All Rights Reserved Questions or comments ? E-mail us at info@elevatesoft.com |