Thread Text Index, ANSI Code Page 1250 character set and leading asterisk searches
Fri, Jun 2 2017 1:15 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Roy,

<< I get the same problem with my word generator using the default ANSI_CI collation for all text columns. Are you trying it with UNICODE? >>

What should be Unicode? The database, the word generator, the compiler, etc.?

Tim Young
Elevate Software
www.elevatesoft.com
Sat, Jun 3 2017 2:21 AM

Roy Lambert

NLH Associates

Team Elevate

Tim

><< I get the same problem with my word generator using the default ANSI_CI collation for all text columns. Are you trying it with UNICODE? >>
>
>What should be Unicode? The database, the word generator, the compiler, etc.?

I was just wondering if it was a unicode vs ansi issue

Nothing in my setup is Unicode. Everything is ANSI: D2006, with the collation set to ANSI_CI for the database. Since the word generator is written using D2006 and I don't specifically set the collation, I assume that's ANSI as well.


Roy Lambert
Mon, Jun 5 2017 3:47 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Roy,

<< I was just wondering if it was a unicode vs ansi issue >>

There *can* be issues with Unicode vs. ANSI, if you're talking about a non-English language.  That's why one needs to create a custom word generator when using non-English languages in order to support the proper word separators, etc. for one's particular target language, as well as take into account variations in the language rules when it comes to mapping code pages (ANSI) to Unicode characters.  Some characters map one-to-one, while others do not.

<< Nothing in my setup is Unicode. Everything is ANSI: D2006, with the collation set to ANSI_CI for the database. Since the word generator is written using D2006 and I don't specifically set the collation, I assume that's ANSI as well. >>

If you want to send me something that replicates it, I'll be happy to take a look.

Tim Young
Elevate Software
www.elevatesoft.com
Tue, Jun 6 2017 3:30 AM

Roy Lambert

NLH Associates

Team Elevate

Tim

><< I was just wondering if it was a unicode vs ansi issue >>
>
>There *can* be issues with Unicode vs. ANSI, if you're talking about a non-English language. That's why one needs to create a custom word generator when using non-English languages in order to support the proper word separators, etc. for one's particular target language, as well as take into account variations in the language rules when it comes to mapping code pages (ANSI) to Unicode characters. Some characters map one-to-one, while others do not.
>
><< Nothing in my setup is Unicode. Everything is ANSI: D2006, with the collation set to ANSI_CI for the database. Since the word generator is written using D2006 and I don't specifically set the collation, I assume that's ANSI as well. >>
>
>If you want to send me something that replicates it, I'll be happy to take a look.

My word generator / text filter code is in the extensions newsgroup. All I did was a select along the lines of

SELECT * FROM EMails WHERE _Message CONTAINS 'information ' - 1665 rows returned

SELECT * FROM EMails WHERE _Message CONTAINS 'informat*' - 1668 rows returned

SELECT * FROM EMails WHERE _Message CONTAINS '*nformation' - 0 rows returned

SELECT * FROM EMails WHERE _Message CONTAINS '*nformatio*' - 0 rows returned

I used EDBManager - nothing complex. I also tried something similar on a database using the default word generator / text filter and that was fine, so I'd suspect something in my code and Janusz's. Would we have needed to change anything in the word generator after you introduced the front stemming (or whatever the term is)?

It's not something I've bothered with: my query generator doesn't allow it, and generally where I'd want *test* it's for a phrase and I use POS.

If you need me to, I can create a small test database and application to check out - just let me know.


Roy Lambert
Tue, Jun 6 2017 5:55 AM

Janusz Cyran

Tim, Roy,
Sorry, my mistake. I started with Roy's word generator, noticed the problem, then created my own test word generator, but during the test I was still mistakenly using Roy's. Now that I've fixed that, everything seems to work as expected.

Janusz
Tue, Jun 6 2017 6:57 AM

Roy Lambert

NLH Associates

Team Elevate

Janusz


That's good news. It identifies the problem as my word generator, and it was a VERY simple fix - new version will be uploaded to the extensions newsgroup soon

Roy Lambert
Wed, Jun 7 2017 1:26 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Roy,

<< That's good news. It identifies the problem as my word generator, and it was a VERY simple fix - new version will be uploaded to the extensions newsgroup soon >>

So, what was the fix? :-)

Tim Young
Elevate Software
www.elevatesoft.com
Thu, Jun 8 2017 4:09 AM

Roy Lambert

NLH Associates

Team Elevate

Tim

><< That's good news. It identifies the problem as my word generator, and it was a VERY simple fix - new version will be uploaded to the extensions newsgroup soon >>
>
>So, what was the fix? :-)

  if Text[Position] = '*' then begin
   Word := Word + '*';
   inc(Position);
   // Changed from an unconditional Exit: a lone leading '*' no longer ends the word
   if Word <> '*' then Exit;
  end else begin
   // #39 is the apostrophe character
   if (Text[Position] <> #39) or AllowQM then begin
    Word := Word + Text[Position];
    if Text[Position] in Alphas then AlphasFound := True
    else if Text[Position] in Numbers then NumbersFound := True;
   end;

Roy

ps new version is in extensions
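
For anyone reading the thread later, the effect of that one-line change can be tried in a small stand-alone sketch. This is not the word generator from the extensions newsgroup: NextWord, ShowWords and the FixApplied flag are hypothetical names invented for illustration, and separator handling is simplified to "anything that is not a letter, digit or asterisk". It only assumes that the search text is split into words in the way the fragment above suggests.

```pascal
program WildcardWordSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  Alphas  = ['A'..'Z', 'a'..'z'];
  Numbers = ['0'..'9'];

// Hypothetical helper: reads one word from Source, starting at Position.
// FixApplied = False reproduces the old unconditional Exit after '*';
// FixApplied = True only exits when the word is more than a lone '*'.
function NextWord(const Source: string; var Position: Integer;
  FixApplied: Boolean): string;
begin
  Result := '';
  // Skip anything that cannot start a word (simplified separator rule)
  while (Position <= Length(Source)) and
        not (Source[Position] in Alphas + Numbers + ['*']) do
    Inc(Position);
  while Position <= Length(Source) do
  begin
    if Source[Position] = '*' then
    begin
      Result := Result + '*';
      Inc(Position);
      // Old code: always Exit. New code: a leading '*' keeps reading.
      if (not FixApplied) or (Result <> '*') then Exit;
    end
    else if Source[Position] in Alphas + Numbers then
    begin
      Result := Result + Source[Position];
      Inc(Position);
    end
    else
      Exit; // separator ends the word
  end;
end;

// Hypothetical helper: prints every word found in Source in brackets.
procedure ShowWords(const Source: string; FixApplied: Boolean);
var
  Position: Integer;
  W: string;
begin
  Position := 1;
  Write(Source, ' -> ');
  while Position <= Length(Source) do
  begin
    W := NextWord(Source, Position, FixApplied);
    if W <> '' then Write('[', W, '] ');
  end;
  WriteLn;
end;

begin
  ShowWords('*nformation', False); // old behaviour
  ShowWords('*nformation', True);  // with the fix
  ShowWords('informat*', True);
  ShowWords('*nformatio*', True);
end.
```

In this sketch the old behaviour splits '*nformation' into a lone '*' followed by 'nformation', which would be consistent with the leading-asterisk searches above returning 0 rows. With the conditional Exit the leading asterisk stays attached to the word, and a trailing asterisk still ends it.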