Thread: Appends take 2.5x longer to complete if transactions are used
Tue, Aug 7 2007 12:21 PM

Roy Lambert

NLH Associates

Team Elevate

Dave

>You're certainly not talking about Vista are you? <rofl>

Strangely enough ........

Roy Lambert
Tue, Aug 7 2007 4:37 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< Local drive. I tried using exclusive access to the table and it didn't
make much speed difference. I took Tim's advice and ran Commit(false) to
delay the writes and it sped things up quite a bit. It is almost the same
speed as non-transactional, which is quite an improvement from before. >>

It should still be much faster than non-transactional, literally night and
day.  There must be something else that is causing the bottleneck.
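
For reference, the pattern being discussed looks roughly like this. It is
only a sketch (not Dave's actual code), using the standard StartTransaction /
Commit(Flush) calls with placeholder component names and the field
assignments left out:

  // Sketch: append inside a transaction and commit every few thousand rows
  // with Flush = False so the commit does not force a hard flush to disk.
  procedure AppendWithBatchedCommits(DB: TDBISAMDatabase; Table: TDBISAMTable;
    TotalRows: Integer);
  const
    RowsPerCommit = 5000;            // batch size discussed in this thread
  var
    I: Integer;
  begin
    DB.StartTransaction;
    try
      for I := 1 to TotalRows do
      begin
        Table.Append;
        // ... assign field values for the new row here ...
        Table.Post;
        if I mod RowsPerCommit = 0 then
        begin
          DB.Commit(False);          // False = delayed flush, as above
          DB.StartTransaction;       // start the next batch
        end;
      end;
      DB.Commit(False);              // commit any remaining rows
    except
      if DB.InTransaction then
        DB.Rollback;
      raise;
    end;
  end;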

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Aug 7 2007 4:39 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< Here's what I'm doing. I'm importing a large CSV file into a memory
table, 5k rows at a time. >>

Are you using the transaction on the in-memory table, or the disk-based
table ?

<< I do processing on these rows and I synchronize them with a DBISAM table
(check each field against the memory table). If there are any changes then
the physical table gets updated with the row from the memory table. A range
is performed on the physical table so it corresponds to the 15k rows in the
memory table before the comparison is started. >>

So, you're not necessarily writing out 1k or 5k rows with every commit,
correct?  IOW, you could be writing out only one row modification during
a commit because only one row needed updating.

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Aug 7 2007 5:21 PM

Dave Harrison
Tim Young [Elevate Software] wrote:

> Dave,
>
> << Here's what I'm doing. I'm importing a large CSV file into a memory
> table, 5k rows at a time. >>
>
> Are you using the transaction on the in-memory table, or the disk-based
> table ?

It should just be the disk based table. I'll double check to make sure.

>
> << I do processing on these rows and I synchronize them with a DBISAM table
> (check each field against the memory table). If there are any changes then
> the physical table gets updated with the row from the memory table. A range
> is performed on the physical table so it corresponds to the 15k rows in the
> memory table before the comparison is started. >>
>
> So, you're not necessarily writing out 1k or 5k rows with every commit,
> correct?  IOW, you could be writing out only one row modification during
> a commit because only one row needed updating.
>

Correct. I count the number of updated rows and commit after 1k-10k rows
have been updated. When the disk table is empty, then of course it adds
all the rows from the memory table to the disk-based table. BTW,
the delayed flush on the commit did speed things up quite a bit, so it's
around the same speed as not using transactions.
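
Roughly, the loop looks like this. This is a simplified sketch with made-up
field names, comparing a single field and using Locate instead of the actual
range-and-compare code (it assumes the Variants and DBISAM units are in the
uses clause):

  // Sketch: walk the in-memory batch, post an update when a field differs,
  // and commit every N updates with a delayed flush.
  procedure SyncBatch(DB: TDBISAMDatabase; MemTable, DiskTable: TDBISAMTable);
  const
    UpdatesPerCommit = 5000;
  var
    Updates: Integer;
  begin
    Updates := 0;
    DB.StartTransaction;
    try
      MemTable.First;
      while not MemTable.Eof do
      begin
        if DiskTable.Locate('KeyStr;KeyDate',
             VarArrayOf([MemTable.FieldByName('KeyStr').Value,
                         MemTable.FieldByName('KeyDate').Value]), []) then
        begin
          if DiskTable.FieldByName('Val1').AsFloat <>
             MemTable.FieldByName('Val1').AsFloat then
          begin
            DiskTable.Edit;
            DiskTable.FieldByName('Val1').AsFloat :=
              MemTable.FieldByName('Val1').AsFloat;
            DiskTable.Post;
            Inc(Updates);
          end;
        end;
        // (appending rows missing from the disk table is left out here)
        if Updates >= UpdatesPerCommit then
        begin
          DB.Commit(False);          // delayed flush
          DB.StartTransaction;
          Updates := 0;
        end;
        MemTable.Next;
      end;
      DB.Commit(False);
    except
      if DB.InTransaction then
        DB.Rollback;
      raise;
    end;
  end;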

Dave
Tue, Aug 7 2007 7:26 PM

"Rob Frye"
Dave

>> Were you using exclusive access when you ran without transactions?

You missed the above question.

Rob

Wed, Aug 8 2007 3:18 PM

Dave Harrison
Rob Frye wrote:
> Dave
>
>
>>>Were you using exclusive access when you ran without transactions?
>
>
> You missed the above question.
>
> Rob
>
>

Speed-wise, it doesn't matter whether exclusive access is set or not. I have
a control panel in the program where I can turn transactions, exclusivity,
memory tables (for temp tables), etc. on and off.

One of the things that is slow (and has nothing to do with transactions) is
appending records to a memory table (TDBISAMTable). I have GetTickCount
calls around the Post command and accumulate the time when rows are added
to the memory table.  The table has 2 indexes and no blobs. I'm getting
appends of only 6,264 rcds/sec, and for another table appends run at 6,900
rcds/sec and 6,080 rcds/sec. I decided to turn off all resident programs
(AV etc.) and that added maybe a couple hundred rows/sec to the results.
The program is, of course, run outside of the IDE as a standalone
executable. There is over 1.2 GB of free physical memory and no other
process is using more than 1% of the other CPU core.
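
The timing is accumulated along these lines (a simplified sketch with dummy
field names; it assumes the Windows and DBISAM units are in the uses clause):

  // Sketch: time only the Post calls while filling the in-memory table
  // and return a rows-per-second figure.
  function TimeMemAppends(MemTable: TDBISAMTable; RowCount: Integer): Double;
  var
    I: Integer;
    StartTick, PostTicks: Cardinal;
  begin
    PostTicks := 0;
    for I := 1 to RowCount do
    begin
      MemTable.Append;
      MemTable.FieldByName('KeyStr').AsString := 'SAMPLE';   // dummy data
      MemTable.FieldByName('Val1').AsFloat := I * 0.01;
      StartTick := GetTickCount;
      MemTable.Post;                                         // the timed call
      // note: GetTickCount resolution is roughly 10-16 ms, so individual
      // deltas are coarse; the accumulated total is only a rough figure
      Inc(PostTicks, GetTickCount - StartTick);
    end;
    if PostTicks > 0 then
      Result := RowCount * 1000 / PostTicks                  // rows per second
    else
      Result := 0;
  end;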

The record structure for one of the tables is:
AutoInc, String(17), Date, String(1), Float, Float, Float, Float, Float,
TimeStamp.

The other table is:
AutoInc, String(17), Date, 24 Float fields, TimeStamp.

The indexes are String(17);String(1);Date and Date;String(17), with
similar indexes for the other table. The memory table only gets around
15k rows at a time before it is emptied and another batch is imported.
This means the index depth is quite small. (The String(17) is the same
for all 15k rows.)
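
For anyone who wants to try reproducing this, here is a sketch of creating
an in-memory table with that first structure and the two indexes. The field
names are made up since only the types matter, and it assumes the table
component is already pointed at DBISAM's in-memory database:

  // Sketch: build the first record structure above as a DBISAM table.
  procedure CreateMemTable(MemTable: TDBISAMTable);
  begin
    MemTable.Close;
    with MemTable.FieldDefs do
    begin
      Clear;
      Add('ID', ftAutoInc, 0, False);
      Add('KeyStr', ftString, 17, False);
      Add('KeyDate', ftDate, 0, False);
      Add('Flag', ftString, 1, False);
      Add('Val1', ftFloat, 0, False);
      Add('Val2', ftFloat, 0, False);
      Add('Val3', ftFloat, 0, False);
      Add('Val4', ftFloat, 0, False);
      Add('Val5', ftFloat, 0, False);
      Add('Stamp', ftTimeStamp, 0, False);
    end;
    with MemTable.IndexDefs do
    begin
      Clear;
      Add('ByKeyFlagDate', 'KeyStr;Flag;KeyDate', []);
      Add('ByDateKey', 'KeyDate;KeyStr', []);
    end;
    MemTable.CreateTable;
    MemTable.Open;
  end;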

So it appears I can't get memory tables to append much faster than 6k
records/second, which I think is pretty slow (transactions are turned
off). I'm running this on a dual-core 4400 with 2 GB of memory.

I'll probably yank out the memory tables and just add rows to a TList,
which should be considerably faster.
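
Something along these lines, where the record layout and names are only
illustrative:

  // Sketch: accumulate rows as heap-allocated records in a TList instead
  // of posting them to an in-memory table. TRow roughly mirrors the first
  // record structure above.
  type
    PRow = ^TRow;
    TRow = record
      KeyStr: string[17];
      KeyDate: TDateTime;
      Flag: string[1];
      Val1, Val2, Val3, Val4, Val5: Double;
      Stamp: TDateTime;
    end;

  procedure AddRow(Rows: TList; const AKey: string; ADate: TDateTime);
  var
    P: PRow;
  begin
    New(P);
    FillChar(P^, SizeOf(TRow), 0);
    P^.KeyStr := AKey;        // truncated to 17 chars by the ShortString type
    P^.KeyDate := ADate;
    // ... fill in the remaining fields ...
    Rows.Add(P);              // no index maintenance or dataset Post overhead
  end;
  // (each entry needs Dispose(PRow(Rows[I])) when the batch is finished)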

Has anyone else noticed that memory tables are slower than they should be?

Dave
Wed, Aug 8 2007 3:25 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< Correct. I count the number of updated rows and commit after 1k-10k rows
have been updated. When the disk table is empty, then of course it adds
all the rows from the memory table to the disk-based table. BTW, the delayed
flush on the commit did speed things up quite a bit, so it's around the same
speed as not using transactions. >>

That's the catch - it should be much, much faster than doing the same
process without transactions if you're only committing when you've reached
the 1k-10k mark.   IOW, the lower the number of rows committed during each
commit, the closer you will get to the same performance as without any
transaction at all.

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, Aug 9 2007 5:22 AM

"Rob Frye"
Dave

Here are a number of items to consider, based on what I have observed with
our applications.  I understand that you have already experimented with some
of these, and certainly you will get different results with your
program/data/environment, but hopefully something will help.

1) for best performance writing to physical tables, use Exclusive = True and
no transactions
       (can be about 3-4 times faster than when Exclusive = False)

2) if you can't have exclusive access to the tables, then use transactions
with optimum commits
       (with a bit of work, can be fairly close to the speed of Exclusive = True)

3) for in-memory tables, use Exclusive = True
       (can be about 2-3 times faster than when Exclusive = False)

4) check that the session has ForceBufferFlush = False
       (can make a big difference when using transactions and/or Exclusive = False)

5) if you aren't already using it, consider FastMM
       (can improve memory-related performance by about 5%)
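
As a minimal sketch, applying items 1, 3 and 4 before a bulk load would look
something like this (component names are placeholders, and Exclusive has to
be set while the table is closed):

  procedure ApplyBulkLoadSettings(Session: TDBISAMSession;
    DiskTable, MemTable: TDBISAMTable);
  begin
    Session.ForceBufferFlush := False;   // item 4

    DiskTable.Close;
    DiskTable.Exclusive := True;         // item 1: exclusive physical table
    DiskTable.Open;

    MemTable.Close;
    MemTable.Exclusive := True;          // item 3: exclusive in-memory table
    MemTable.Open;
  end;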

Rob

Mon, Aug 13 2007 3:04 PM

Dave Harrison
Tim Young [Elevate Software] wrote:
> Dave,
>
> << I'm adding a few million rows to an empty table, and I thought I could
> speed things up if I used a transaction for every 5k rows.
>
>  Well, it slowed it down considerably, 2.5x slower. So instead of taking 101
> minutes to complete, it now takes 250 minutes with transactions.
>
>  I don't really need transactions because I'm only updating a couple of
> tables. I'm more concerned with speed.
>
>  Does anyone have recommendations on how to speed it up? >>
>
> You're most likely I/O bound with the hard drive due to the hard commits.
> When doing the Commit call, pass False as the Flush parameter and that
> should help things a bit.
>

Tim,
    I did some more testing and wrote an app that puts dummy records
into a table with a file format similar to what I'm using. The clickable
options were: Transactions, Memory Table, Exclusive, etc. After "fine
tuning" it, I discovered that the transaction size I was using, 5k-10k,
wasn't large enough. After a lot of testing I found the optimum setting to
be 17,500 rows per transaction, which came close to the speed of a memory
table.

   It seems that DBISAM is very finicky when it comes to choosing the
size of a transaction. If I used 100, it crawled along; 1k and 5k were a
bit faster but still much slower than not using transactions. It wasn't
until I exceeded 10k that things started to speed up. It just didn't
occur to me before to make the transactions that large.

  I think there needs to be a utility program (3rd party?) that will test
different transaction sizes for a table and automatically find the sweet
spot. So it would test 10k, 5k, and 20k rows/transaction and keep bisecting
the ranges until it zeroes in on the sweet spot. This is necessary because
I suspect a lot of people may be running with the wrong transaction size,
and this will greatly hamper performance. Just my 2 cents. :-)
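
Something shaped like this would do as a starting point. It is only a
sketch: it does a fixed sweep rather than true bisection, LoadRows is a
placeholder for whatever bulk-load routine the app uses, and it assumes
DBISAM's EmptyTable can be used to reset the table between runs:

  // Sketch: time the same load at several rows-per-commit values and
  // return the fastest one of those tested.
  function FindBestCommitSize(DB: TDBISAMDatabase; Table: TDBISAMTable;
    TotalRows: Integer): Integer;
  const
    Candidates: array[0..4] of Integer = (1000, 5000, 10000, 17500, 25000);
  var
    I: Integer;
    StartTick, Elapsed, BestTime: Cardinal;
  begin
    Result := Candidates[0];
    BestTime := High(Cardinal);
    for I := Low(Candidates) to High(Candidates) do
    begin
      Table.EmptyTable;                              // start each run empty
      StartTick := GetTickCount;
      LoadRows(DB, Table, TotalRows, Candidates[I]); // placeholder routine
      Elapsed := GetTickCount - StartTick;
      if Elapsed < BestTime then
      begin
        BestTime := Elapsed;
        Result := Candidates[I];                     // fastest size so far
      end;
    end;
  end;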

Dave
Mon, Aug 13 2007 4:15 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< I think there needs to be a utility program (3rd party?) that will test
different transaction sizes for a table and automatically find the sweet
spot. So it would test 10k, 5k, and 20k rows/transaction and keep bisecting
the ranges until it zeroes in on the sweet spot. This is necessary because
I suspect a lot of people may be running with the wrong transaction size,
and this will greatly hamper performance. Just my 2 cents. :-) >>

The issue is whether the transaction is just pumping in rows via inserts, or
whether it is doing other reads and searches in-between, and such a utility
couldn't account for the latter, which I believe is your case also.

--
Tim Young
Elevate Software
www.elevatesoft.com
