Thread: Appends take 2.5x longer to complete if transactions are used
Tue, Aug 7 2007 12:21 PM

Roy Lambert

NLH Associates

Team Elevate

Dave

>You're certainly not talking about Vista are you? <rofl>

Strangely enough ........

Roy Lambert
Tue, Aug 7 2007 4:37 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< Local drive. I tried using exclusive access to the table and it didn't
make much speed difference. I took Tim's advice and ran Commit(false) to
delay the writes and it sped things up quite a bit. It is almost the same
speed as non-transactional, which is quite an improvement from before. >>

It should still be much faster than non-transactional, literally night and
day.  There must be something else that is causing the bottleneck.
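
For reference, the pattern being discussed looks roughly like this. It is
only a sketch (not Dave's actual code), using the standard StartTransaction /
Commit(Flush) calls with placeholder component names and the field
assignments left out:

  // Sketch: append inside a transaction and commit every few thousand rows
  // with Flush = False so the commit does not force a hard flush to disk.
  procedure AppendWithBatchedCommits(DB: TDBISAMDatabase; Table: TDBISAMTable;
    TotalRows: Integer);
  const
    RowsPerCommit = 5000;            // batch size discussed in this thread
  var
    I: Integer;
  begin
    DB.StartTransaction;
    try
      for I := 1 to TotalRows do
      begin
        Table.Append;
        // ... assign field values for the new row here ...
        Table.Post;
        if I mod RowsPerCommit = 0 then
        begin
          DB.Commit(False);          // False = delayed flush, as above
          DB.StartTransaction;       // start the next batch
        end;
      end;
      DB.Commit(False);              // commit any remaining rows
    except
      if DB.InTransaction then
        DB.Rollback;
      raise;
    end;
  end;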

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Aug 7 2007 4:39 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< Here's what I'm doing. I'm importing a large CSV file into a memory
table, 5k rows at a time. >>

Are you using the transaction on the in-memory table, or the disk-based
table ?

<< I do processing on these rows and I synchronize them with a DBISAM table
(check each field against the memory table). If there are any changes then
the physical table gets updated with the row from the memory table. A range
is performed on the physical table so it corresponds to the 15k rows in the
memory table before the comparison is started. >>

So, you're not necessarily writing out 1k or 5k rows with every commit,
correct?  IOW, you could be writing out only one row modification during
a commit because only one row needed updating.

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Aug 7 2007 5:21 PM

Dave Harrison
Tim Young [Elevate Software] wrote:

> Dave,
>
> << Here's what I'm doing. I'm importing a large CSV file into a memory
> table, 5k rows at a time. >>
>
> Are you using the transaction on the in-memory table, or the disk-based
> table ?

It should just be the disk based table. I'll double check to make sure.

>
> << I do processing on these rows and I synchronize them with a DBISAM table
> (check each field against the memory table). If there are any changes then
> the physical table gets updated with the row from the memory table. A range
> is performed on the physical table so it corresponds to the 15k rows in the
> memory table before the comparison is started. >>
>
> So, you're not necessarily writing out 1k or 5k rows with every commit,
> correct?  IOW, you could be writing out only one row modification during
> a commit because only one row needed updating.
>

Correct. I count the number of updated rows and commit after 1k-10k rows
have been updated. When the disk table is empty, then of course it adds
all the rows from the memory table to the disk-based table. BTW,
the delayed flush on the commit did speed things up quite a bit, so it's
around the same speed as not using transactions.
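
Roughly, the loop looks like this. This is a simplified sketch with made-up
field names, comparing a single field and using Locate instead of the actual
range-and-compare code (it assumes the Variants and DBISAM units are in the
uses clause):

  // Sketch: walk the in-memory batch, post an update when a field differs,
  // and commit every N updates with a delayed flush.
  procedure SyncBatch(DB: TDBISAMDatabase; MemTable, DiskTable: TDBISAMTable);
  const
    UpdatesPerCommit = 5000;
  var
    Updates: Integer;
  begin
    Updates := 0;
    DB.StartTransaction;
    try
      MemTable.First;
      while not MemTable.Eof do
      begin
        if DiskTable.Locate('KeyStr;KeyDate',
             VarArrayOf([MemTable.FieldByName('KeyStr').Value,
                         MemTable.FieldByName('KeyDate').Value]), []) then
        begin
          if DiskTable.FieldByName('Val1').AsFloat <>
             MemTable.FieldByName('Val1').AsFloat then
          begin
            DiskTable.Edit;
            DiskTable.FieldByName('Val1').AsFloat :=
              MemTable.FieldByName('Val1').AsFloat;
            DiskTable.Post;
            Inc(Updates);
          end;
        end;
        // (appending rows missing from the disk table is left out here)
        if Updates >= UpdatesPerCommit then
        begin
          DB.Commit(False);          // delayed flush
          DB.StartTransaction;
          Updates := 0;
        end;
        MemTable.Next;
      end;
      DB.Commit(False);
    except
      if DB.InTransaction then
        DB.Rollback;
      raise;
    end;
  end;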

Dave
Tue, Aug 7 2007 7:26 PM

"Rob Frye"
Dave

>> Were you using exclusive access when you ran without transactions?

You missed the above question.

Rob

Wed, Aug 8 2007 3:18 PM

Dave Harrison
Rob Frye wrote:
> Dave
>
>
>>>Were you using exclusive access when you ran without transactions?
>
>
> You missed the above question.
>
> Rob
>
>

Speed-wise, it doesn't matter whether exclusive access is set or not. I have
a control panel in the program where I can turn transactions, exclusivity,
memory tables (for temp tables), etc. on and off.

One of the things that is slow (and has nothing to do with transactions) is
appending records to a memory table (TDBISAMTable). I have GetTickCount
calls around the Post command and accumulate the time when rows are added
to the memory table.  The table has 2 indexes and no blobs. I'm getting
appends of only 6,264 rcds/sec, and for another table appends run at 6,900
rcds/sec and 6,080 rcds/sec. I decided to turn off all resident programs
(AV etc.) and that added maybe a couple hundred rows/sec to the results.
The program is, of course, run outside of the IDE as a standalone
executable. There is over 1.2 GB of free physical memory and no other
process is using more than 1% of the other CPU core.
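
The timing is accumulated along these lines (a simplified sketch with dummy
field names; it assumes the Windows and DBISAM units are in the uses clause):

  // Sketch: time only the Post calls while filling the in-memory table
  // and return a rows-per-second figure.
  function TimeMemAppends(MemTable: TDBISAMTable; RowCount: Integer): Double;
  var
    I: Integer;
    StartTick, PostTicks: Cardinal;
  begin
    PostTicks := 0;
    for I := 1 to RowCount do
    begin
      MemTable.Append;
      MemTable.FieldByName('KeyStr').AsString := 'SAMPLE';   // dummy data
      MemTable.FieldByName('Val1').AsFloat := I * 0.01;
      StartTick := GetTickCount;
      MemTable.Post;                                         // the timed call
      // note: GetTickCount resolution is roughly 10-16 ms, so individual
      // deltas are coarse; the accumulated total is only a rough figure
      Inc(PostTicks, GetTickCount - StartTick);
    end;
    if PostTicks > 0 then
      Result := RowCount * 1000 / PostTicks                  // rows per second
    else
      Result := 0;
  end;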

The record structure for one of the tables is:
AutoInc, String(17), Date, String(1), Float, Float, Float, Float, Float,
TimeStamp.

The other table is:
AutoInc, String(17), Date, 24 Float fields, TimeStamp.

The indexes are String(17);String(1);Date and Date;String(17), with
similar indexes for the other table. The memory table only gets around
15k rows at a time before it is emptied and another batch is imported.
This means the index depth is quite small. (The String(17) is the same
for all 15k rows.)
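
For anyone who wants to try reproducing this, here is a sketch of creating
an in-memory table with that first structure and the two indexes. The field
names are made up since only the types matter, and it assumes the table
component is already pointed at DBISAM's in-memory database:

  // Sketch: build the first record structure above as a DBISAM table.
  procedure CreateMemTable(MemTable: TDBISAMTable);
  begin
    MemTable.Close;
    with MemTable.FieldDefs do
    begin
      Clear;
      Add('ID', ftAutoInc, 0, False);
      Add('KeyStr', ftString, 17, False);
      Add('KeyDate', ftDate, 0, False);
      Add('Flag', ftString, 1, False);
      Add('Val1', ftFloat, 0, False);
      Add('Val2', ftFloat, 0, False);
      Add('Val3', ftFloat, 0, False);
      Add('Val4', ftFloat, 0, False);
      Add('Val5', ftFloat, 0, False);
      Add('Stamp', ftTimeStamp, 0, False);
    end;
    with MemTable.IndexDefs do
    begin
      Clear;
      Add('ByKeyFlagDate', 'KeyStr;Flag;KeyDate', []);
      Add('ByDateKey', 'KeyDate;KeyStr', []);
    end;
    MemTable.CreateTable;
    MemTable.Open;
  end;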

So it appears I can't get memory tables to append much faster than 6k
records/second, which I think is pretty slow (transactions are turned
off). I'm running this on a dual-core 4400 with 2 GB of memory.

I'll probably yank out the memory tables and just add rows to a TList,
which should be considerably faster.
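
Something along these lines, where the record layout and names are only
illustrative:

  // Sketch: accumulate rows as heap-allocated records in a TList instead
  // of posting them to an in-memory table. TRow roughly mirrors the first
  // record structure above.
  type
    PRow = ^TRow;
    TRow = record
      KeyStr: string[17];
      KeyDate: TDateTime;
      Flag: string[1];
      Val1, Val2, Val3, Val4, Val5: Double;
      Stamp: TDateTime;
    end;

  procedure AddRow(Rows: TList; const AKey: string; ADate: TDateTime);
  var
    P: PRow;
  begin
    New(P);
    FillChar(P^, SizeOf(TRow), 0);
    P^.KeyStr := AKey;        // truncated to 17 chars by the ShortString type
    P^.KeyDate := ADate;
    // ... fill in the remaining fields ...
    Rows.Add(P);              // no index maintenance or dataset Post overhead
  end;
  // (each entry needs Dispose(PRow(Rows[I])) when the batch is finished)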

Has anyone else noticed that memory tables are slower than they should be?

Dave
Wed, Aug 8 2007 3:25 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< Correct. I count the number of updated rows and commit after 1k-10k rows
have been updated. When the disk table is empty, then of course it adds
all the rows from the memory table to the disk-based table. BTW, the delayed
flush on the commit did speed things up quite a bit, so it's around the same
speed as not using transactions. >>

That's the catch - it should be much, much faster than doing the same
process without transactions if you're only committing when you've reached
the 1k-10k mark.   IOW, the lower the number of rows committed during each
commit, the closer you will get to the same performance as without any
transaction at all.

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, Aug 9 2007 5:22 AM

"Rob Frye"
Dave

Here are a number of items to consider, based on what I have observed with
our applications.  I understand that you have already experimented with some
of these, and certainly you will get different results with your
program/data/environment, but hopefully something will help.

1) for best performance writing to physical tables, use Exclusive = True and
no transactions
       (can be about 3-4 times faster than when Exclusive = False)

2) if you can't have exclusive access to the tables, then use transactions
with optimum commits
       (with a bit of work, can be fairly close to the speed of Exclusive = True)

3) for in-memory tables, use Exclusive = True
       (can be about 2-3 times faster than when Exclusive = False)

4) check that the session has ForceBufferFlush = False
       (can make a big difference when using transactions and/or Exclusive = False)

5) if you aren't already using it, consider FastMM
       (can improve memory-related performance by about 5%)
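
As a minimal sketch, applying items 1, 3 and 4 before a bulk load would look
something like this (component names are placeholders, and Exclusive has to
be set while the table is closed):

  procedure ApplyBulkLoadSettings(Session: TDBISAMSession;
    DiskTable, MemTable: TDBISAMTable);
  begin
    Session.ForceBufferFlush := False;   // item 4

    DiskTable.Close;
    DiskTable.Exclusive := True;         // item 1: exclusive physical table
    DiskTable.Open;

    MemTable.Close;
    MemTable.Exclusive := True;          // item 3: exclusive in-memory table
    MemTable.Open;
  end;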

Rob

Mon, Aug 13 2007 3:04 PM

Dave Harrison
Tim Young [Elevate Software] wrote:
> Dave,
>
> << I'm adding a few million rows to an empty table, and I thought I could
> speed things up if I used a transaction for every 5k rows.
>
>  Well, it slowed it down considerably, 2.5x slower. So instead of taking 101
> minutes to complete, it now takes 250 minutes with transactions.
>
>  I don't really need transactions because I'm only updating a couple of
> tables. I'm more concerned with speed.
>
>  Does anyone have recommendations on how to speed it up? >>
>
> You're most likely I/O bound with the hard drive due to the hard commits.
> When doing the Commit call, pass False as the Flush parameter and that
> should help things a bit.
>

Tim,
    I did some more testing and wrote an app that puts dummy records
into a table with a file format similar to what I'm using. The clickable
options were: Transactions, Memory Table, Exclusive, etc. After "fine
tuning" it, I discovered that the transaction size I was using, 5k-10k,
wasn't large enough. After a lot of testing I found the optimum setting to
be 17,500 rows per transaction, which came close to the speed of a memory
table.

   It seems that DBISAM is very finicky when it comes to choosing the
size of a transaction. If I used 100, it crawled along; 1k and 5k were a
bit faster but still much slower than not using transactions. It wasn't
until I exceeded 10k that things started to speed up. It just didn't
occur to me before to make the transactions that large.

  I think there needs to be a utility program (3rd party?) that will test
different transaction sizes for a table and automatically find the sweet
spot. So it would test 10k, 5k, and 20k rows/transaction and keep bisecting
the ranges until it zeroes in on the sweet spot. This is necessary because
I suspect a lot of people may be running with the wrong transaction size,
and this will greatly hamper performance. Just my 2 cents. :-)
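
Something shaped like this would do as a starting point. It is only a
sketch: it does a fixed sweep rather than true bisection, LoadRows is a
placeholder for whatever bulk-load routine the app uses, and it assumes
DBISAM's EmptyTable can be used to reset the table between runs:

  // Sketch: time the same load at several rows-per-commit values and
  // return the fastest one of those tested.
  function FindBestCommitSize(DB: TDBISAMDatabase; Table: TDBISAMTable;
    TotalRows: Integer): Integer;
  const
    Candidates: array[0..4] of Integer = (1000, 5000, 10000, 17500, 25000);
  var
    I: Integer;
    StartTick, Elapsed, BestTime: Cardinal;
  begin
    Result := Candidates[0];
    BestTime := High(Cardinal);
    for I := Low(Candidates) to High(Candidates) do
    begin
      Table.EmptyTable;                              // start each run empty
      StartTick := GetTickCount;
      LoadRows(DB, Table, TotalRows, Candidates[I]); // placeholder routine
      Elapsed := GetTickCount - StartTick;
      if Elapsed < BestTime then
      begin
        BestTime := Elapsed;
        Result := Candidates[I];                     // fastest size so far
      end;
    end;
  end;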

Dave
Mon, Aug 13 2007 4:15 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< I think there needs to be a utility program (3rd party?) that will test
different transaction sizes for a table and automatically find the sweet
spot. So it would test 10k, 5k, and 20k rows/transaction and keep bisecting
the ranges until it zeroes in on the sweet spot. This is necessary because
I suspect a lot of people may be running with the wrong transaction size,
and this will greatly hamper performance. Just my 2 cents. :-) >>

The issue is whether the transaction is just pumping in rows via inserts, or
whether it is doing other reads and searches in-between, and such a utility
couldn't account for the latter, which I believe is your case also.

--
Tim Young
Elevate Software
www.elevatesoft.com
