Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 23 total
Thread TEXTSEARCH speed concerns
Sat, Nov 25 2006 1:42 PMPermanent Link

"Jerry Clancy"
Hopefully I am doing something wrong that one of you can point out. Text
searching is one of if not the most important function in our new app but I
am seeing horrendous and unacceptable search times.

I have a BillText database with these specs:

 Rec. size: 120
 Blob size: 512
 3 indexes: primary, bill and filerec -- all small fields
   (file is opened, index set to bill)
 Blob (and search) field: Plaintext, type Memo, set as searchtext index
field
 Current file size:
   .dat    1,589 KB
   .idx   77,393 KB
   .blb  260,733 KB

In a test ISAPI app, mimicing the real app (and running in localmode on the
same system and drive as the databases themselves), we set the search string
programmatically:

 S := 'cable television';   { for testing }

 sFilter := 'TEXTSEARCH('+QuotedStr(S) + ' IN PLAINTEXT)';
 with UserSession.tblBillText do begin
   Filtered := False;
   Filter := '';    { Clear it }
   IndexName := 'bill';
   Filter := sFilter;
   Filtered := True;
   First;
 ...

This search takes approx. 1 minute, 20 secs. Why? This, at least in my
opinion, is expected to be more or less instantaneous.

76 records out of the some 14,000 are hits. The app then simply walks
through the file to EOF and displays three small fields, none of which is
the blob field, in an non-data-aware grid. This process takes another 2
minutes, 36 seconds. Again, why so long?

I can't determine if dbisam is doing a brute-force search or not (it
shouldn't be since the blob field is the textsearch index field), or it is a
funcion of record movement through the 260 MB blob field, even though it is
never referenced in the display phase.

I'm using local DBISAM 4.22 Build 3 with IntraWeb on a system running Win2K
with some 512 MB of RAM.

Would this be faster running with the primary (Record ID) index?
Would this be faster running with the C/S version of DBISAM?

What am I doing wrong?

Jerry


Sun, Nov 26 2006 5:15 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Jerry


I've just tried an sql version on a 128k records returning c31k records which took about 16 secs and a second search using "mimicing" to return 1 result - I couldn't count seconds quick enough.

One thought occurs -
   .dat    1,589 KB
   .idx   77,393 KB
   .blb  260,733 KB

Ubless you're doing custom word filtering the .idx just doesn't look big enough to have indexed the .blb. Are you sure you've set up a full text index?

Roy Lambert
Sun, Nov 26 2006 10:37 PMPermanent Link

"Jerry Clancy"
Roy,

Thanks for the response. No custom filtering. In fact, I had just copied the
two-memo-fields database (one a source HTML field and the second an ASCII
version of the first stripped of all HTML) to a new database and then
deleted the HTML memo field, a process that took 4 hours or so. It
definitely is text-indexed on the ASCII memo field, "PLAINTEXT". I cut out
the HTML field to reduce the .blb by more than half and also to see if that
made a difference in the search and display times (it didn't). This leads me
to believe that a brute force search is being done, but I don't know why.

The PLAINTEXT field itself is only used in the full text index, not any
other (don't believe you can otherwise index a memo field).

Very discouraging.

Jerry


"Roy Lambert" <roy.lambert@skynet.co.uk> wrote in message
news:B16F4D49-CDF8-4038-A418-2450E2B3DADC@news.elevatesoft.com...
| Jerry
|
|
| I've just tried an sql version on a 128k records returning c31k records
which took about 16 secs and a second search using "mimicing" to return 1
result - I couldn't count seconds quick enough.
|
| One thought occurs -
| .dat 1,589 KB
| .idx 77,393 KB
| .blb 260,733 KB
|
| Ubless you're doing custom word filtering the .idx just doesn't look big
enough to have indexed the .blb. Are you sure you've set up a full text
index?
|
| Roy Lambert
|

Sun, Nov 26 2006 10:39 PMPermanent Link

"Jerry Clancy"
Roy,

I'm going to do an overnight rebuild with dbsys and see what we get
tomorrow.

Jerry

Mon, Nov 27 2006 4:21 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Jerry


Since there's no custom filtering use DBSys and an SQL query with the gererate plan option - that should tell you what's going on.

Roy Lambert

"Jerry Clancy" <jclancy@billtrak.com> wrote on Sun, 26 Nov 2006 22:33:51 -0500

>Roy,
>
>I'm going to do an overnight rebuild with dbsys and see what we get
>tomorrow.
>
>Jerry
>
>
Mon, Nov 27 2006 1:17 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Jerry,

<< Hopefully I am doing something wrong that one of you can point out. Text
searching is one of if not the most important function in our new app but I
am seeing horrendous and unacceptable search times. >>

More than likely the cause is this:

 IndexName := 'bill';

If the records that fit the filter are small and exist closer to the end of
the current index order, then it may take a bit of time to navigate to the
first record that fits the filter.  However, I wouldn't expect 1 minute and
20 seconds on just ~14,000 records.  Could you possibly send me the table so
that I can try it here ?

--
Tim Young
Elevate Software
www.elevatesoft.com

Mon, Nov 27 2006 8:38 PMPermanent Link

"Jerry Clancy"
Roy,

Thanks for this. I first did a text search using DBiDesktop and then, after
doing some homework on SQL (don't use it), I constructed and ran this query
in dbsys:

SELECT Bill, Ver INTO Hits FROM BText WHERE TEXTSEARCH('CABLE TELEVISION' IN
PLAINTEXT)

In both cases the search occured within a second or two, which is the
performance I would expect. I also ran the (former) search with the BILL
index active and got a bit slower but otherwise comparable performance. So,
it has to be something I am doing otherwise in the code, but I'll be darned
if I know what. Time to disect the test program code again.

Jerry

"Roy Lambert" <roy.lambert@skynet.co.uk> wrote in message
news:0B330DF2-879A-4173-9E48-BA41315FE6C0@news.elevatesoft.com...
| Jerry
|
|
| Since there's no custom filtering use DBSys and an SQL query with the
gererate plan option - that should tell you what's going on.
|
| Roy Lambert
|

Mon, Nov 27 2006 8:53 PMPermanent Link

"Jerry Clancy"
Tim,

See my last response to Roy. Both DBiDesktop and dbsys give me the
performance I expect so I have to look more closely at my code.

I'd have to mail a CD to you if it becomes necessary (need an address). The
file is huge (the original .blb is 600MB and contains every version of every
bill in the current session of the U.S. Congress). The test version
eliminates the source versions and retains only the ASCII versions.

Let me check the code again first.

Jerry

"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> wrote in message
news:166B33B9-64FF-4586-B683-09E99C4BCD9A@news.elevatesoft.com...
| Jerry,
|
| << Hopefully I am doing something wrong that one of you can point out.
Text
| searching is one of if not the most important function in our new app but
I
| am seeing horrendous and unacceptable search times. >>
|
| More than likely the cause is this:
|
|   IndexName := 'bill';
|
| If the records that fit the filter are small and exist closer to the end
of
| the current index order, then it may take a bit of time to navigate to the
| first record that fits the filter.  However, I wouldn't expect 1 minute
and
| 20 seconds on just ~14,000 records.  Could you possibly send me the table
so
| that I can try it here ?
|
| --
| Tim Young
| Elevate Software
| www.elevatesoft.com
|
|

Tue, Nov 28 2006 1:39 AMPermanent Link

"Jerry Clancy"
Tim,

I've burned a CD with the the database and Delphi Win32 program I created
tonight to test in this mode rather than the Internet/IntraWeb environment.
The results are the same: very slow.

The CD will be in the mail first thing Tuesday morning.

Jerry

"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> wrote in message
news:166B33B9-64FF-4586-B683-09E99C4BCD9A@news.elevatesoft.com...
| Jerry,
| More than likely the cause is this:
|
|   IndexName := 'bill';
|
| If the records that fit the filter are small and exist closer to the end
of
| the current index order, then it may take a bit of time to navigate to the
| first record that fits the filter.  However, I wouldn't expect 1 minute
and
| 20 seconds on just ~14,000 records.  Could you possibly send me the table
so
| that I can try it here ?

Tue, Nov 28 2006 4:10 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Jerry


Are you sure your problem isn't simply latency across the web?

Have you tried using DBSys on the remote data rather than using your app. If it returns fast its got to be you Smiley

Nice that you're learning sql, but why not simply put it into the filter?

Roy Lambert
Page 1 of 3Next Page »
Jump to Page:  1 2 3
Image