Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 8 of 8 total
Thread Any impact from TEXT INDEX
Wed, Apr 13 2016 7:33 AMPermanent Link

Adam Brett

Orixa Systems

I have started to add TEXT INDEXES to my EDB tables, and I am just astonished how much faster results come back from larger tables with CLOB fields. Wow.

I want to roll out TEXT INDEXES on more of my tables. But I just thought I should check with other users about impact on other aspects of performance.

I am guessing that data is slower to POST ... as the index has to be updated.

How much impact does adding a TEXT INDEX have on speed of posting?
Is it slower as the index / table gets bigger?
Wed, Apr 13 2016 7:46 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Adam,

<< I am guessing that data is slower to POST ... as the index has to be updated. >>

Correct.

<< How much impact does adding a TEXT INDEX have on speed of posting? >>

It's slightly worse than a normal index and, of course, is dependent upon the size of the CLOB that needs to be indexed.

<< Is it slower as the index / table gets bigger? >>

No, it's only proportional to the size of the CLOB/VARCHAR/CHAR columns involved in the current insert, update, or delete.

Tim Young
Elevate Software
www.elevatesoft.com
Wed, Apr 13 2016 9:07 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Adam


I'm a heavy user of this feature - it was the main incentive for me to upgrade from DBISAM - and I can agree with Tim. If you're into high volume transactions don't even think about using it as standard, but for normal volumes its fine. Depending on the size of the CLOBs users may see a slight hiccough but that's it.

What you may want to do is invest time into creating specialist text filters and word generators.

Roy Lambert
Thu, Apr 14 2016 5:47 AMPermanent Link

Adam Brett

Orixa Systems

Thanks for this.

I am finding behaviour I wasn't expecting with the text index.

I have a column which often contains product codes like "sg03" as well as general text.

With the Text Index a SQL statement:

WHERE "Memo" CONTAINS 'sg03'

Does not return any records, while

WHERE "Memo" LIKE '%sg03%' does, but is much slower.

I guess this is because the product codes are not normal English "words" presumably I could add all the product-codes to a custom word generator?

Any pointers on how to do this? Ideally I would need to add new codes to the word-generator list as new product codes were added to the database.
Thu, Apr 14 2016 7:36 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Adam

The extensions newsgroup is your friend - there are my text filters and word generators and (I think) one from someone who wanted something very like your requirement here.

>I am finding behaviour I wasn't expecting with the text index.
>
>I have a column which often contains product codes like "sg03" as well as general text.
>
>With the Text Index a SQL statement:
>
>WHERE "Memo" CONTAINS 'sg03'
>
>Does not return any records, while

Your guess is not quite right - its because the standard text filter / word generator does not index numbers - have a look at the text indexing section of the OLH especially space characters and include characters.
>
>WHERE "Memo" LIKE '%sg03%' does, but is much slower.

How slow this gets is very dependent on the size of the CLOB record. Because Tim's implementation is very complete he uses a sliding window approach to test for character matches. Once you get a decent sized chunk of text it reaches the "drinking treacle through a straw" state. I wrote an external function "FldHasStr" which basically uses pos. Search the extensions ng for "New external function: FldHasStr"

>Any pointers on how to do this? Ideally I would need to add new codes to the word-generator list as new product codes were added to the database.

Yoou could probably get away with making sure numbers are in the include characters list and not in the space characters list. Don't know how to do that since I've been customising the full text indexing from DBISAM days.

A better option (although it would be a bit slower) might be to create your own text filter / word generator which allows words with numbers only if they are on a product table.

Roy Lambert
Thu, Apr 14 2016 7:55 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Roy,

<< I'm a heavy user of this feature - it was the main incentive for me to upgrade from DBISAM - and I can agree with Tim. If you're into high volume transactions don't even think about using it as standard, but for normal volumes its fine. Depending on the size of the CLOBs users may see a slight hiccough but that's it. >>

Typical CLOBs (free-form text notes/descriptions) will be fine.

You're indexing emails that can be quite large, correct ?

Tim Young
Elevate Software
www.elevatesoft.com
Thu, Apr 14 2016 9:29 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


><< I'm a heavy user of this feature - it was the main incentive for me to upgrade from DBISAM - and I can agree with Tim. If you're into high volume transactions don't even think about using it as standard, but for normal volumes its fine. Depending on the size of the CLOBs users may see a slight hiccough but that's it. >>
>
>Typical CLOBs (free-form text notes/descriptions) will be fine.
>
>You're indexing emails that can be quite large, correct ?

Is 200k large?

The largest message portion I have is 201793 bytes I have no idea if there are also attachments or graphics as well.

I wouldn't have said that some of my emails have a slight hiccough when posted - its more like a lay down and snooze SmileyAs well as some of them being large there's also the decode the email, extract the message (in some cases decode from base64) and strip out the html & css junk. Add in the fact that there are 4 full text indices and 5 normal indices. This is why this all gets done in my background email subsystem so its all done and dusted when it gets to me.



Roy
Fri, Apr 15 2016 7:44 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Roy,

<< Is 200k large? >>

Yes, for plain, free-form text (not RTF, XML, HTML, etc.) it is.

Tim Young
Elevate Software
www.elevatesoft.com
Image