Thread: Appends take 2.5x longer to complete if transactions are used

Fri, Aug 3 2007 5:22 PM

Dave Harrison
I'm adding a few million rows to an empty table, and I thought I could
speed things up if I used a transaction for every 5k rows.

Well, it slowed it down considerably, 2.5x slower. So instead of taking
101 minutes to complete, it now takes 250 minutes with transactions.

I don't really need transactions because I'm only updating a couple of
tables. I'm more concerned with speed.

Does anyone have recommendations on how to speed it up?

TIA
Dave
Sat, Aug 4 2007 2:29 PM

"Rob Frye"
Hi Dave

> Does anyone have recommendations on how to speed it up?
>

Try a smaller number of rows per transaction (eg. 1000).

Rob

Mon, Aug 6 2007 12:12 AM

Dave Harrison
Rob Frye wrote:
> Hi Dave
>
>
>>Does anyone have recommendations on how to speed it up?
>>
>
>
> Try a smaller number of rows per transaction (eg. 1000).
>
> Rob
>
>
Rob,
   I did as you suggested and tried different transaction sizes (100,
500, 1000, 5000, 10000, etc.), and in all cases the time is approximately
double what it is without transactions. Has anyone else noticed this?

Dave
Mon, Aug 6 2007 4:17 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< I'm adding a few million rows to an empty table, and I thought I could
speed things up if I used a transaction for every 5k rows.

Well, it slowed it down considerably, 2.5x slower. So instead of taking 101
minutes to complete, it now takes 250 minutes with transactions.

I don't really need transactions because I'm only updating a couple of
tables. I'm more concerned with speed.

Does anyone have recommendations on how to speed it up? >>

You're most likely I/O bound with the hard drive due to the hard commits.
When doing the Commit call, pass False as the Flush parameter and that
should help things a bit.

--
Tim Young
Elevate Software
www.elevatesoft.com
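
For reference, a minimal Delphi sketch of the batching pattern Tim
describes. Database, Table, and TotalRows are placeholder names, not
anything from Dave's code; the 5,000-row batch size and the Commit
Flush parameter are the ones discussed in this thread:

procedure AppendInBatches(Database: TDBISAMDatabase; Table: TDBISAMTable;
  TotalRows: Integer);
var
  i, RowsInBatch: Integer;
begin
  RowsInBatch := 0;
  Database.StartTransaction;
  try
    for i := 1 to TotalRows do
    begin
      Table.Append;
      // ... assign field values via the actual TField objects here ...
      Table.Post;
      Inc(RowsInBatch);
      if RowsInBatch = 5000 then
      begin
        // Flush=False skips the hard flush to disk on commit, which is
        // what Tim suggests to avoid being I/O bound
        Database.Commit(False);
        Database.StartTransaction;
        RowsInBatch := 0;
      end;
    end;
    Database.Commit(False);  // commit the final partial batch
  except
    if Database.InTransaction then
      Database.Rollback;
    raise;
  end;
end;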

Mon, Aug 6 2007 8:38 PM

"Rob Frye"
Hi Dave

Are you accessing a local drive, a network drive, or using client-server?

Do you have exclusive access to the table?

Rob

Tue, Aug 7 2007 12:27 AM

Dave Harrison
Rob Frye wrote:
> Hi Dave
>
> Are you accessing a local drive, a network drive, or using client-server?
>
> Do you have exclusive access to the table?
>
> Rob
>
>
Rob,
   Local drive. I tried using exclusive access to the table and it
didn't make much speed difference. I took Tim's advice and called
Commit(False) to delay the writes, and it sped things up quite a bit.
It is now almost the same speed as non-transactional, which is quite an
improvement over before.

Dave
Tue, Aug 7 2007 5:09 AM

"Rob Frye"
Hi Dave

>    Local drive. I tried using exclusive access to the table and it didn't
> make much speed difference.

Were you using exclusive access when you ran without transactions?  If so, I
don't think transactions have a chance of making things run any faster.

101 minutes to add a few million records still seems rather slow.  To get a
very rough ballpark, I just timed one of our apps writing out a million
records (exclusive access without transaction) and it took less than 3
minutes on an old laptop.  In this case the records were only 112 bytes,
there are 2 indexes, and there aren't any blob fields.  Are your records
quite a bit larger?  Do you have many indexes?  Are there any blob fields?
Are you using the default DBISAM buffer settings (eg.
MaxTableDataBufferSize)?  Could your anti-virus software be interfering?

Rob
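
For anyone who wants to experiment with the buffer setting Rob mentions,
a sketch follows. Only MaxTableDataBufferSize is named in this thread;
the dbisamtb unit name, the global Engine object, and the 8 MB figure
are assumptions to check against your DBISAM version's documentation:

uses
  dbisamtb;  // assumption: the unit exposing the global Engine object

procedure ConfigureEngine;
begin
  // Engine settings generally must be changed before the engine is
  // active, i.e. before any session or table is opened; benchmark any
  // value before keeping it
  Engine.MaxTableDataBufferSize := 8 * 1024 * 1024;
end;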

Tue, Aug 7 2007 5:41 AM

Roy Lambert

NLH Associates

Team Elevate

Dave


Something you haven't said yet - which OS? I've not played much but have seen a few weird results using the M$ latest and "greatest".

Roy Lambert
Tue, Aug 7 2007 10:29 AM

Dave Harrison
Roy Lambert wrote:
> Dave
>
>
> Something you haven't said yet - which OS? I've not played much but have seen a few weird results using the M$ latest and "greatest".
>
> Roy Lambert
>

Roy,
   XP Pro with 2 GB of RAM.

::few weird results using the M$ latest and "greatest"
You're certainly not talking about Vista, are you? <rofl>

Dave
Tue, Aug 7 2007 10:46 AM

Dave Harrison
Rob Frye wrote:

> Hi Dave
>
>
>>   Local drive. I tried using exclusive access to the table and it didn't
>>make much speed difference.
>
>
> Were you using exclusive access when you ran without transactions?  If so, I
> don't think transactions have a chance of making things run any faster.
>
> 101 minutes to add a few million records still seems rather slow.  To get a
> very rough ballpark, I just timed one of our apps writing out a million
> records (exclusive access without transaction) and it took less than 3
> minutes on an old laptop.
> In this case the records were only 112 bytes,
> there are 2 indexes, and there aren't any blob fields.  

I'm actually adding rows to 2 tables. The first table's records are 112
bytes and the second's are 272 bytes, mostly floats in both, with no
blob fields. There are 3 indexes per table, including the RecordId.

> Are your records
> quite a bit larger?  Do you have many indexes?  Are there any blob fields?
> Are you using the default DBISAM buffer settings (eg.
> MaxTableDataBufferSize)?  Could your anti-virus software be interfering?

I'm using Nod32 AV and I excluded the data directories just now and
there is no speed difference.

Here's what I'm doing. I'm importing a large CSV file into a memory
table, 5k rows at a time. I do processing on these rows and synchronize
them with a DBISAM table (checking each field against the memory table).
If there are any changes, the physical table gets updated with the row
from the memory table. Before the comparison starts, a range is set on
the physical table so it corresponds to the 15k rows in the memory
table.
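
A rough Delphi sketch of that range-and-compare step (the key bounds and
the compare/copy helpers are hypothetical placeholders, since the thread
doesn't show the real code):

// Narrow the physical table to the key range covered by the current batch
Table.SetRange([FirstKeyInBatch], [LastKeyInBatch]);
try
  Table.First;
  while not Table.Eof do
  begin
    if RowDiffers(MemTable, Table) then   // hypothetical field-by-field compare
    begin
      Table.Edit;
      CopyRow(MemTable, Table);           // hypothetical: copy the changed values
      Table.Post;
    end;
    Table.Next;
  end;
finally
  Table.CancelRange;                      // drop the range once the batch is done
end;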

Now some of the extra time is actually used by adding rows to the memory
table. From past experience I've found that referencing fields in a
memory table (any memory table) will eat up valuable milliseconds, and
if you do enough of them it can turn into a lot of minutes. I'm seriously
considering dropping the use of memory tables (since the data in the CSV
file is sorted anyway) and just loading the data into a TList. That way I
can reference the native Float, String, and Date values without going
through Field1.AsFloat etc. Of course I don't use FieldByName(), but the
actual TField instead.

Well, there you have it in a nutshell.

Dave
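
The TList-of-records approach Dave is weighing might look something like
the sketch below. The row layout and values are hypothetical; the point
is that each field is read as a plain native value, with no TField
conversion on access:

uses
  Classes, SysUtils;

type
  PCsvRow = ^TCsvRow;
  TCsvRow = record
    Key: Integer;        // hypothetical columns standing in for the CSV layout
    Price: Double;
    TradeDate: TDateTime;
  end;

procedure LoadAndProcess;
var
  Rows: TList;
  Row: PCsvRow;
  i: Integer;
  Total: Double;
begin
  Rows := TList.Create;
  try
    // Load phase: one heap-allocated record per CSV row, no dataset overhead
    New(Row);
    Row^.Key := 1;
    Row^.Price := 1.25;
    Row^.TradeDate := Now;
    Rows.Add(Row);

    // Access phase: direct native reads instead of Field1.AsFloat
    Total := 0;
    for i := 0 to Rows.Count - 1 do
      Total := Total + PCsvRow(Rows[i])^.Price;
  finally
    for i := 0 to Rows.Count - 1 do
      Dispose(PCsvRow(Rows[i]));
    Rows.Free;
  end;
end;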