The character "_" should not be treated as word breaker in the default word generator |
Tue, May 24 2011 11:48 AM
Xiannong Chen | The character "_" should not be treated as a word breaker in the default word generator. It is part of a word.
Tue, May 24 2011 12:51 PM
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Xiannong,
<< The character "_" should not be treated as a word breaker in the default word generator. It is part of a word. >>

While I agree with you, I'm not in a position to change the default text indexing without breaking the indexes of existing installations.

Your other option is to use your own word generator. You can easily create one using the ElevateDB Word Generator Module template in the Object Repository in the Delphi IDE. The template comes pre-configured to reproduce exactly the same results as the default word generator, so you only need to tweak it by following the comments in the code.

-- Tim Young Elevate Software www.elevatesoft.com
Tue, May 24 2011 1:13 PM
Roy Lambert NLH Associates Team Elevate | Xiannong / Tim
I disagree, at least as far as the English language is concerned (not sure about American). Roy Lambert |
Tue, May 24 2011 1:16 PM
Xiannong Chen | Tim, I migrated my application from DBISAM. In DBISAM, "_" is not treated as a word breaker. I think you should change the default setting. It may not cause a problem for existing installations because users do not expect "_" to be a word breaker. If you do not change it, more installations will be distributed with the wrong word breaker. I bought ElevateDB CS with source. Could you tell me where I can change this myself? Thanks, Xiannong
Tue, May 31 2011 3:39 PM
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Xiannong,
<< Tim, I migrated my application from DBISAM. In DBISAM, "_" is not treated as a word breaker. I think you should change the default setting. It may not cause a problem for existing installations because users do not expect "_" to be a word breaker. If you do not change it, more installations will be distributed with the wrong word breaker. >>

You're not understanding me: it will *break their text indexes* if I change it. In other words, it will require a REPAIR TABLE run on every table with text indexes if I change it; otherwise searching/updating will not work correctly.

<< I bought ElevateDB CS with source. Could you tell me where I can change this myself? >>

The relevant code is in edbstring.pas:

    function GetNextTextIndexWord(Collation: Integer; const Value: TEDBString;
                                  var Position: Integer;
                                  AllowWildCard: Boolean=False): TEDBString;
    var
       TempWord: TEDBString;
       TempWildcards: Boolean;
    begin
       Result:='';
       TempWord:='';
       TempWildcards:=False;
       while (Result='') and (Position <= Length(Value)) do
          begin
          while (Result='') and (Position <= Length(Value)) do
             begin
             if SpaceChars.CharInSet(Value[Position]) and
                (not (AllowWildcard and (Value[Position]=WILDCARD))) then
                begin
                Inc(Position);
                Break;
                end
             else if IncludeChars.CharInSet(Value[Position]) then
                TempWord:=TempWord+Value[Position]
             else if (AllowWildcard and (Value[Position]=WILDCARD)) then
                begin
                TempWord:=TempWord+Value[Position];
                TempWildcards:=True;
                end;
             Inc(Position);
             end;
          if AllowWildcard and TempWildcards then
             Result:=TempWord
          else if (not AllowWildcard) or (AllowWildcard and (not TempWildcards)) then
             begin
             if (Length(TempWord) >= MIN_WORD_SIZE) and
                (Length(TempWord) <= MAX_WORD_SIZE) and
                (not IsStopWord(Collation,TempWord)) then
                Result:=TempWord
             else
                TempWord:='';
             end
          else
             TempWord:='';
          end;
    end;

However, all you need to change is the initialization of the SpaceChars object in the initialization section:

    SpaceChars:=TEDBCharSet.Create;
    with SpaceChars do
       begin
       AddCharRange(TEDBChar(0),TEDBChar(47));
       AddCharRange(TEDBChar(58),TEDBChar(64));
       AddCharRange(TEDBChar(91),TEDBChar(96));
       AddCharRange(TEDBChar(123),TEDBChar(130));
       AddCharRange(TEDBChar(132),TEDBChar(137));
       AddChar(TEDBChar(139));
       AddChar(TEDBChar(141));
       AddCharRange(TEDBChar(143),TEDBChar(153));
       AddChar(TEDBChar(155));
       AddChar(TEDBChar(157));
       AddCharRange(TEDBChar(160),TEDBChar(191));
       AddChar(TEDBChar(215));
       AddChar(TEDBChar(247));
       end;

Note that the 91-96 range already covers "_" (character 95), which is why it is currently treated as a word breaker; adding it again with AddChar(TEDBChar('_')) would change nothing. To stop it from breaking words, split that range so that it skips 95:

    AddCharRange(TEDBChar(91),TEDBChar(94));
    AddChar(TEDBChar(96));

Also make sure that "_" is present in IncludeChars (its initialization is not shown here). Per the loop above, a character that is in neither set is silently skipped, so the text on either side of it would be joined into one word without the underscore.

-- Tim Young Elevate Software www.elevatesoft.com
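
Editorial sketch: the effect of the 91-96 range on "_" can be checked with a small model of the loop above. This is written in Python so that it runs standalone, and it is not ElevateDB code. It assumes ASCII-only input, ignores wildcards and stop words, and uses hypothetical word-size limits; the contents of IncludeChars (digits and letters here) are also an assumption, since the post does not show them. The names `split_words`, `DEFAULT_SPACE`, `PATCHED_SPACE` and `BASE_INCLUDE` are made up for this illustration.

```python
def ranges(*pairs):
    """Build a set of characters from inclusive (low, high) code ranges."""
    return {chr(code) for low, high in pairs for code in range(low, high + 1)}

# ASCII portion of the SpaceChars ranges quoted in the post.
DEFAULT_SPACE = ranges((0, 47), (58, 64), (91, 96), (123, 127))
# Same set, with 91..96 split so that it skips 95 ('_').
PATCHED_SPACE = ranges((0, 47), (58, 64), (91, 94), (96, 96), (123, 127))
# Assumption: IncludeChars is not shown in the post; digits and letters only.
BASE_INCLUDE = ranges((48, 57), (65, 90), (97, 122))

def split_words(text, space_chars, include_chars, min_size=1, max_size=30):
    """Split text the way the quoted loop does, minus wildcards and stop words.

    min_size and max_size are hypothetical stand-ins for MIN_WORD_SIZE and
    MAX_WORD_SIZE, whose real values are not given in the post.
    """
    words, current = [], ""
    for ch in text:
        if ch in space_chars:
            # A space character ends the current word.
            if min_size <= len(current) <= max_size:
                words.append(current)
            current = ""
        elif ch in include_chars:
            current += ch
        # Characters in neither set are skipped without ending the word.
    if min_size <= len(current) <= max_size:
        words.append(current)
    return words

# '_' is code 95, so it is already in the default space set.
print("_" in DEFAULT_SPACE)
print(split_words("user_name and id", DEFAULT_SPACE, BASE_INCLUDE))
print(split_words("user_name and id", PATCHED_SPACE, BASE_INCLUDE | {"_"}))
print(split_words("user_name", PATCHED_SPACE, BASE_INCLUDE))
```

Under these assumptions the default set splits "user_name" into "user" and "name", the patched set with "_" added to the include set keeps "user_name" whole, and the patched set alone yields "username" because the underscore is skipped. Any such change to the real SpaceChars would still need the REPAIR TABLE run Tim describes on every table with text indexes.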