Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 20 total
Thread Full text indexing problem
Thu, May 8 2008 11:20 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim

I think you have a either a bug with full text indexing or the behaviour has changed. I'm holding user IDs (initials) in a clob field in my emails table. I have a text filter which basically passes the contents of the field straight through (FilteredText := TextToFilter) kits type is set to ANSI.

I've created a computed field which is computed as ANSI and set that as the filtertypecolumn for the text filter. I then have a word generator which spits out the words simply based on a crlf pair

I've just tried a test record

RL
RL3
RLx4
RLxx5

and only the last entry is found when I issue

select * from emails where _readby contains 'rlxx5'

ie if I replace the text to match with rl..rlx4 nada

It looks to me as though somewhere in your code there's a minimum word length being set

OTOH it could be me Smiley

Roy Lambert
Thu, May 8 2008 11:36 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

As a ps

I added dialogs and a showmessage(word) to the word generator to see what was being returned and its spitting out all of the userids not just the 5 character ones.


Roy Lambert
Thu, May 8 2008 12:30 PMPermanent Link

"keith crusius"
If we are talking about the same thing, see my thread on 3/28 regarding the
minimum word length being 3.
Thu, May 8 2008 1:31 PMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

keith

>If we are talking about the same thing, see my thread on 3/28 regarding the
>minimum word length being 3.

Similar but different SmileyI seem to have a minimum word length of 5 characters. I remembered the thread but my memory (wrongly) said that it was only the case for the default word generator, not a global setting.

I suppose in some ways I'm to blame cos I suggested a min and max word length to Tim, but I wanted it configurable.

This is going to cause me a significant problem since I'm "mis"using full text indexing to achieve what a lot of people would use additional tables for eg contact skills where if its in IT can even be a single letter. I really don't want to have to have the additional tables and use JOINS where I can now use a single table and a filter and deleting a record gets rid of everything.

Roy Lambert
Thu, May 8 2008 2:47 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Roy,

<< Similar but different SmileyI seem to have a minimum word length of 5
characters. I remembered the thread but my memory (wrongly) said that it was
only the case for the default word generator, not a global setting. >>

There is no min or max word length when you're using your own word
generator.  Please send me the code for the word generator that you're
using - more than likely there's an issue in there.

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, May 8 2008 3:30 PMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


I'll be fascinated to know what I screwed up in this Smiley

Roy Lambert

procedure TEDBwgHS.GenerateWord(Collation: Integer; const Text: string; var Position: Integer; var Word: string; SearchWords:
Boolean = False);
var
Chop: integer;
begin
Word := '';
if (Text = '') or (Position >= Length(Text)) then Exit;
Chop := PosEx(#13#10, Text, Position);
if Chop = 0 then begin
if Position = 0 then Word := Text else Word := Copy(Text, Position, MaxInt);
Position := MaxInt;
end else begin
Word := Copy(Text, Position, Chop - (Position - 1));
Position := Chop + 2;
end;
end;

end.

Fri, May 9 2008 2:14 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


With the clue that your code was okI found MY fault. Do you think if I took this week back I could get a refund?

Roy Lambert
Fri, May 9 2008 5:49 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


I'm making some progress with the fti but I'm finding some anomalies.

The table I'm using is EMails which has 10609 rows

select * from emails where _message contains 'default' returns 47 rows

select * from emails where _message does not contain 'default' returns 10427 rows

Ah I thought the good old null but

select * from emails where _message contains 'default' or _message is null returns 180 rows (and takes an extortionate 157 secs to return) so I'm still 2 rows out

That's on the _Message column which uses a word generator based on the column type (mainly html or plain text)

Also on the _ReadBy column (this is the one where I cocked up with the wordgenerator but its working correctly now - trust me I'm a programmer)

select * from emails where _readby contains 'abcdef' returns 1 row
select * from emails where _readby does not contain 'abcdef' returns 10609 rows (ie everything)

No point in adding or is null here because this column was populated as part of the transfer and all rows are filled.

Suggestions, comments, help etc

Table size is 586Mb so awkward to send

Final question I added a showmessage back into the wordgenerator for the _readby column - why is is called 2 or 3 times when running the query?

Roy Lambert
Fri, May 9 2008 12:27 PMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


OK its the NULL handling - I think

Rows    Statement
80        select * from emails where _message is null

10427    select * from emails where _message does not contain 'default'
46          select * from emails where _message contains 'default'
10474

10429    select * from emails where _message does not contain 'default' and _message is not null (157 secs)
180        select * from emails where _message contains 'default' or _message is null
10609

more correctly
10509    select * from emails where _message does not contain 'default' or _message is null (183 sec)
46         select * from emails where _message contains 'default'
10556

There's a couple of integer fields in the table so I zeroed them and did

update emails set _fkcompanies = 1 where _message does not contain 'default' or _message is null (156 secs)
followed by
update emails set _fkcontacts = 1 where _message contains 'default' (0 sec)

and then

select count(_msgNo), sum(_fkcompanies), sum(_fkcontacts) from emails

the result was 10609, 10509, 46

The total row count should be 10609

select * from emails where _fkcompanies = 0 and _fkcontacts = 0 gives me 54 rows

select * from emails where _fkcompanies = 1 and _fkcontacts = 1 gives me 0 rows

My logic registers have now suffered total breakdown

Roy Lambert
Fri, May 9 2008 1:52 PMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


I just did a bit more work on the 54 rows where neither

update emails set _fkcompanies = 1 where _message does not contain 'default' or _message is null
nor
update emails set _fkcontacts = 1 where _message contains 'default'

altered the column.

All 54 rows have (html) text in the _Message column, it does contain default in _Message (eg <a href="http://woodstock.advertserve.com/advertpro/servlet/click/zone?zid=2&cid=36&mid=136&pid=0&default=false&random=87) but my filter strips out all the tags.

I then wrote a bit of code to go through the 54 and write the filtered text into another column just in case I'd made another cock up. Unfortunately I hadn't and none of them had default left in. (ok some had things like defaultStatus but that's it)

Roy Lambert
Page 1 of 2Next Page »
Jump to Page:  1 2
Image