Thread: Turtle slow insert
Wed, Jun 17 2020 4:55 AM

Teco

TECHNOLOG Systems GmbH

Hi to all,
I have a local database with around 38,600 records.
Adding a record inside a transaction takes around 9 seconds.

With around 5,000 records in the table, adding takes below 1 second.

The handling is prepared for a later multi-user environment.

The records are added automatically. Each record has a manual sequence ID. This ID is created in ascending order, not randomized, and the sequence ID has an index.

Sequence ID is like:
a00000101a
a00000101b
....
a00000101f
a000001020

Any idea what could make it so slow?


Following some information about the table and the insert:

-------------------------------------------------------------------
Table (stripped down in some way - the original table has 15 columns):

CREATE TABLE "Table1"
       (
       "ID" INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 0, INCREMENT BY 1) NOT NULL,
       "code" VARCHAR(12) COLLATE "UNI" NOT NULL,
       "Value1" INTEGER NOT NULL,
       "History" INTEGER NOT NULL,
       "Remarks" VARCHAR(1023) COLLATE "UNI",
       "Signature" VARCHAR(160) COLLATE "UNI"
       )
       VERSION 1.00
       READWRITE
       ENCRYPTED
       INDEX PAGE SIZE 8192
       BLOB BLOCK SIZE 1024
       PUBLISH BLOCK SIZE 1024
       PUBLISH COMPRESSION 0
       MAX ROW BUFFER SIZE 32768
       MAX INDEX BUFFER SIZE 65536
       MAX BLOB BUFFER SIZE 32768
       MAX PUBLISH BUFFER SIZE 32768

       CREATE INDEX "Index1"
       ON "Table1"
       ("code" COLLATE "UNI" ASC)
       NO BACKUP FILES           

-------------------------------------------------------------------

Insert (the record is written, read back to get the record ID, and updated with an MD5 signature):


Step 1 (Check if Record has been created prior)
select count(id) from "Table1" where code = 'abcdefg'
-------------------------------------------------------------------
Step 2 (Insert Record)

Transaction start

insert into "Table1"
(code, Value, History, Remarks, Signature)
values
('aabbccdd', 123456, 0, 'temp')

Transaction commit
-------------------------------------------------------------------
Step 3 (Read Record to catch latest version)
select * from "Table1" where code = 'abcdefg' and history = 0
-------------------------------------------------------------------
Step 4 (Update signature)

Transaction start

update "table" set signature = '***place calculated MD5 here***' where ID=38500

Transaction commit
-------------------------------------------------------------------
Wed, Jun 17 2020 5:53 AM

Teco

TECHNOLOG Systems GmbH

ADDITIONAL INFORMATION:

Update is the slowest part so far.

After Memory Check it looks like that the whole database is copied to the RAM at startup, depending on memory consumption. Start of the application takes some seconds until the RAM consumption is somewhat over the size of the database.
Wed, Jun 17 2020 8:22 AM

Roy Lambert

NLH Associates

Team Elevate

Teco

First a suggestion - add some instrumentation and get some timings for each part of the operation - split it into chunks as discrete as possible (eg calculate the MD5 hash outside the update statement) so the timings are as fine grained as possible.


I'm a bit confused - why do you have two unique columns with a non-unique index on the second and no primary key?

Also

<<insert into "Table1"
(code, Value, History, Remarks, Signature)
values
('aabbccdd', 123456, 0, 'temp')>>

specifying 5 columns but only supplying 4 values - I assume ElevateDB will supply a null for the fifth, but it's bad code (in my opinion). You could also set 0 as a default for History, and then it's taken care of automatically.
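A sketch of what the table definition might look like with such defaults (assuming ElevateDB accepts a standard DEFAULT clause in the column definition - check the manual for your version):

```sql
-- Sketch only: column defaults so the INSERT no longer needs to supply them.
-- Assumes a standard DEFAULT clause is accepted here by ElevateDB.
CREATE TABLE "Table1"
       (
       "ID" INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 0, INCREMENT BY 1) NOT NULL,
       "code" VARCHAR(12) COLLATE "UNI" NOT NULL,
       "Value1" INTEGER NOT NULL,
       "History" INTEGER DEFAULT 0 NOT NULL,                  -- new records start at history 0
       "Remarks" VARCHAR(1023) COLLATE "UNI",
       "Signature" VARCHAR(160) COLLATE "UNI" DEFAULT 'temp'  -- placeholder until the MD5 is written
       )
```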

I'm guessing that the need for the additional update is because you're using the ID as part of the MD5 hash, which isn't available prior to the insert, but I don't understand why it's needed.

If you make Code the primary key then

select count(id) from "Table1" where code = 'abcdefg'

will not be needed - a quick but not large speed saving - just use a try..except block and cancel in the except.

I think you're overusing transactions (unless there's a lot going on that you haven't shown) - the insert is atomic as is the update, and unless you move the cursor the same data will be under it. The only possible problem would be someone else updating the record in the small gaps between each atomic operation.


Roy Lambert
Wed, Jun 17 2020 8:53 AM

Teco

TECHNOLOG Systems GmbH

Roy Lambert wrote:

Dear Roy,

Insert Statement:   There was a typo. Correct is
<<insert into "Table1"
(code, Value, History, Remarks, Signature)
values
('aabbccdd', 123456, 0,'Remark for the record', 'temp')>>

The update is needed because in the final application another process can read the record and alter some of the information. To get the latest version, I read the record out completely and write it back.

Using the Code as primary key is not possible.
One key can be in the database many times. The history counter shows how many times a record has been changed. Every change will be stored, except changes for adding the signature. This is the reason why I read out the record with the ID. If another process has made a change, the history counter will be set from 0 to the correct history value.

The ID is included to prevent a valid hash if somebody deletes a record and renumbers the IDs.

The largest time consumption is the select statement from Step 3 and the update statement in Step 4. Everything else is really fast.

Because I insert the records in ascending order, could it be that the index becomes slow? Maybe because of permanent reorganization to an AVL Tree or whatever reason?
Wed, Jun 17 2020 9:37 AM

Roy Lambert

NLH Associates

Team Elevate

Teco

>Insert Statement: There was a typo. Correct is
><<insert into "Table1"
>(code, Value, History, Remarks, Signature)
>values
>('aabbccdd', 123456, 0,'Remark for the record', 'temp')>>

I'd add defaults for History and Signature - 0 and 'temp' - then

>The update is needed because in the final application another process can read the record and alter some of the information. To get the latest records, I readout complete and write back

The chances of that happening, while not zero, are pretty small (we're talking milliseconds at most) unless there's a wadge of code you haven't shown.

I'm not sure at what point the generated ID becomes available in a transaction, but it would be worth seeing if it's there before the commit. If it is, move the whole operation into the transaction block and there's no need for the Step 3 select at all.
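A rough sketch of that shape in SQL (assuming ElevateDB's START TRANSACTION/COMMIT statements, and assuming the identity value is already visible inside the open transaction - that assumption is exactly the thing to test):

```sql
-- Sketch only: collapse Steps 2-4 into one transaction.
-- Assumes the generated ID can be read back before COMMIT - verify first.
START TRANSACTION;

INSERT INTO "Table1"
(code, Value, History, Remarks, Signature)
VALUES
('aabbccdd', 123456, 0, 'Remark for the record', 'temp');

-- read back the new row (and its generated ID) while still in the transaction
SELECT * FROM "Table1" WHERE code = 'aabbccdd' AND History = 0;

-- write the calculated MD5 using that ID
UPDATE "Table1" SET Signature = '***place calculated MD5 here***'
WHERE code = 'aabbccdd' AND History = 0;

COMMIT;
```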


>Using the Code as primary key is not possible.

In that case make the ID the primary key.
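For example (a sketch - "PK_Table1" is a made-up constraint name, and the syntax assumes ElevateDB supports the usual ALTER TABLE ... ADD CONSTRAINT form; check the manual):

```sql
-- Sketch only: promote the generated ID to the primary key.
-- "PK_Table1" is a hypothetical name.
ALTER TABLE "Table1"
ADD CONSTRAINT "PK_Table1" PRIMARY KEY ("ID")
```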

>The ID is included to prevent a valid hash if somebody deletes a record and renumber the IDs.

Fair enough

>the largest time consumption is the Select statement from Step 3 and the update statement in Step 4. Everything else is really fast.

Whilst I'd still like to see the times I can make a suggestion - add an index for History.
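Since the Step 3 query filters on both code and History, a composite index covering both columns may serve that WHERE clause even better than an index on History alone. A sketch, mirroring the style of the existing Index1 ("Index2" is a made-up name):

```sql
-- Sketch only: composite index so WHERE code = ... AND History = ...
-- can be resolved from the index without the extra row scan.
CREATE INDEX "Index2"
ON "Table1"
("code" COLLATE "UNI" ASC, "History" ASC)
```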

In EDBManager there is the facility to request an Execution Plan. If you break your operation down into discrete parts you can run each in EDBManager and look at the Execution Plan, and it will show you where it might be possible to improve performance. Try your

select * from "Table1" where code = 'abcdefg' and history = 0

and post the execution plan here.

I'd also like to see how long calculating the MD5 hash takes (and for idle curiosity how you're doing it)

>Because I insert the records in ascending order, could it be that the index becomes slow? Maybe because of permanent reorganization to an AVL Tree or whatever reason?

Nah.

9 seconds for a select, insert, select, update is appalling and there has to be something going on that accounts for it. As yet you haven't given me anything I can say causes it. Someone else may have a better insight.

Roy
Wed, Jun 17 2020 10:11 AM

Teco

TECHNOLOG Systems GmbH

Hi Roy,

The only code I have not shown is the additional columns and real records. These are internal items from our production.

I have also tried without Transactions. Timings are unchanged.

Times have been measured as below (Lazarus 2.0.6, MilliSecondsBetween from the DateUtils unit):
-----------------------------------------------------------------
time_start := Now;
// ... SQL execution ...
time_end := Now;
time_needed := MilliSecondsBetween(time_end, time_start);
-----------------------------------------------------------------
Records: approx. 40000

First insert: approx 80 - 110 ms
Select (Step 3): approx. 5800 - 6100 ms
Update (Step 4): approx. 5600 - 6000 ms

For the MD5 I concatenate all columns into a large string and pass it to an MD5 function; the result is the MD5 hash.
The MD5 is calculated with a library from Lazarus and takes below 1 ms - the time counter shows 0 as the time needed.


The execution plan for the select statement is below (some internal items are removed)
-----------------------------------------------------------------

================================================================================
SQL Query (Executed by ElevateDB 2.31 Build 13)

Note: The SQL  shown here is generated by ElevateDB and may not be exactly the
same as the SQL that was originally entered.  However, none of the differences
alter the execution results in any way.
================================================================================

SELECT ALL
"table1"."ID" AS "ID",
"table1"."code" AS "code",
"table1"."History" AS "History",
"table1"."Remark" AS "Remark",
"table1"."Signature" AS "Signature"
FROM "table1"
WHERE "code" = 'abcdefg' AND "History" = 0

Source Tables
-------------

table1: 40640 rows

Result Set
----------

The result set was insensitive and read-only
The result set consisted of zero or more rows

Filtering
---------

The following filter condition was applied to the table1 table:

"code" = 'abcdefg'

Index scan (table1.Index1): 1 keys, 8KB estimated cost

AND

"History" = 0

Row scan (table1): 1 rows, 4,68KB estimated cost

================================================================================
1 row(s) returned in 0 secs
================================================================================
Wed, Jun 17 2020 10:39 AM

Roy Lambert

NLH Associates

Team Elevate

Teco


You've added one more critical piece of information - Lazarus. It looks as though the interaction between ElevateDB and Lazarus may be the problem, and for that you'll need to contact Tim directly.

The only other thing I can offer to do is, if you can supply me the necessary data, to try the same operation in Delphi and see what happens. On the basis of the select operation in EDBManager (a Delphi application) I'm guessing you'll see a marked speed improvement.

Roy Lambert
Wed, Jun 17 2020 10:43 AM

Teco

TECHNOLOG Systems GmbH

Dear Roy,

Thank you for the information.

Is Lazarus so new for ElevateDB?

I can give you the application only when you are in the EU and sign an NDA Agreement.
Wed, Jun 17 2020 2:54 PM

Raul

Team Elevate

On 6/17/2020 10:43 AM, Teco wrote:

> Is Lazarus so new for ElevateDB?

No, but most of us use Delphi, so we can easily try to duplicate the issue there - though as Roy said, it's not entirely apples-to-apples.

> I can give you the application only when you are in the EU and sign an NDA Agreement.

Can you post the schema (no data)? I would personally never sign an NDA, but I would be happy to generate random data into the schema and see how the Delphi version behaves.

Raul
Thu, Jun 18 2020 2:24 AM

Roy Lambert

NLH Associates

Team Elevate

Teco


As Raul says - no to an NDA - however, all you need is a small test app showing the problem: only the tables that are involved and some simple code to reproduce it. One thing I've often found is that producing a test app shows me where my problem was :-)

Roy Lambert