Login ProductsSalesSupportDownloadsAbout |
Home Technical Support DBISAM Technical Support Support Forums DBISAM General View Thread |
Messages 1 to 3 of 3 total |
Full Text Indexing performance |
Mon, Dec 31 2012 6:54 PM | Permanent Link |
Jim Gallagher | We have a Delphi7 program in production that processes a text file containing an ID and a Name field. It contains about 10 million records, and the process ran about 20-30 minutes. It builds a full text index on the name field.
We lost the source code to this program (bad disk, bad backups), which I originally wrote, and the replacement code, which I did not write, was running 3-4 days(!). I tried to reverse engineer the old program (postponing the index builds until after the text file was processed). This reduced the time to about 13 hours. Still too long. The old version was written in 2007, with whatever version of DBISAM was current then. My re-write (also in Delphi 7) is with 4.34 Build 5. My code is quite simple. I create the table without the indexes. The program processes the 10 million records in 2+ minutes. The build of the full text index is taking over 10 hours. This is the (throwaway) create table code: with dbisamtable1 do try if Active then Close; TableName:= 'FV00020621'; Exclusive:= True; try DeleteTable; except end; try with dbisamtable1 do begin DatabaseName:= eDB.Text; TableName:='fv00020621'; Exclusive:=True; if (not Exists) then begin with FieldDefs do begin Clear; Add('KEY',ftString,30); Add('TITLE',ftString,40); end; with IndexDefs do begin Clear; Add('','KEY',[ixPrimary,ixUnique],'',icFull); end; CreateTable(0,1,0,False,'','',4096,512,0) end; end; finally end; except end; After the text file is processed and the table contains 10 million rows, I build the indexes. I added the single letter stopwords to avoid indexing initials in names (John Q Public), which helped reduce the time. This is the process that is taking 10+ hours: dbisamtable1.Close; dbisamtable1.Exclusive := True; with dbisamtable1 do begin StopWords:=TStringList.Create; try with StopWords do // eliminate initals from the full text search begin Add('A'); Add('B'); Add('C'); Add('D'); Add('E'); Add('F'); Add('G'); Add('H'); Add('I'); Add('J'); Add('K'); Add('L'); Add('M'); Add('N'); Add('O'); Add('P'); Add('Q'); Add('R'); Add('S'); Add('T'); Add('U'); Add('V'); Add('W'); Add('X'); Add('Y'); Add('Z'); end; AlterTable(0,1,0,False,'','',4096,512,0,'TITLE',StopWords, #1+#2+#3+#4+#5+#6+#7+#8+#9+#10+#11+#12+#13 +#14+#15+#16+#17+#18+#19+#20+#21+#22+#23 +#24+#25+#26+#27+#28+#29+#30+#31+#32+'*+'+ ',-./:;<=>\`', '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_ab'+ 'cdefghijklmnopqrstuvwxyz'+ ''+ ''+ '',True); finally StopWords.Free; end; end; Any ideas as to what I might be doing wrong? I was guessing somewhat on the run time build of a full text index, since I could not find any sample code. I also tried sql: // dbisamquery1.SQL.Clear; // dbisamquery1.SQL.Add('alter table FV00020621 TEXT INDEX (TITLE);'); // dbisamquery1.ExecSQL; But that was no faster. Is it possible that the unicode stuff that might have been added since that version of DBISAM is causing me this grief? Thank you for any ideas, suggestions. -Jim |
Tue, Jan 1 2013 4:58 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Jim
>But that was no faster. Is it possible that the unicode stuff that might have been added since that version of DBISAM is causing me this grief? I switched to ElevateDB some while back so I'm not up to date with DBISAM but I thought it was still non-unicode. D7 certainly is so I doubt that it "unicode stuff" that is causing the problem. Looking at your code it couldn't get much simpler, and the only suggestions I could make are to reduce the NewTextIndexIncludeChars to a-z, A-Z, and the NewTextIndexSpaceChars to #32,'.' and ',' - unless you're likely to encounter others in a name. I don't think that will give you the speed up you want though. You could also try playing with the buffer sizes, that may gain you some speed. Also, if you remember it, and are on a support contract, you could try downloading and installing the earlier version of DBISAM, compile your utility and then reinstall the latest version. Finally is there a possibility that you'd written your own engine function for full text indexing? Roy Lambert [Team Elevate] |
Tue, Jan 8 2013 12:04 PM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Jim,
<< The build of the full text index is taking over 10 hours. This is the (throwaway) create table code: >> Did you try creating the table with the text index(es) first, and then populating the table (opened exclusively) ? Tim Young Elevate Software www.elevatesoft.com |
This web page was last updated on Friday, September 20, 2024 at 05:39 PM | Privacy PolicySite Map © 2024 Elevate Software, Inc. All Rights Reserved Questions or comments ? E-mail us at info@elevatesoft.com |