Thread: IMPORT TABLE performance
Tue, Apr 14 2009 7:56 AM

Dan
I'm using "IMPORT TABLE" to feed data into a table from a file. It's a simple 3-field
table: INTEGER, FLOAT, FLOAT. I do the import before I add any indexes.

The speed I'm getting on my system is around 11,000 records/sec, which is three times
slower than I expected for such a simple import operation with no indexes involved.

Could you please look into the code for IMPORT TABLE and see if there are any
bottlenecks there? Are all the inserts done in a single transaction? Any improvement
would be greatly appreciated, as this single step accounts for 80% of my execution time.
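
For reference, the statements involved look roughly like this (the table, store, and
file names are illustrative, and the exact IMPORT TABLE clauses should be checked
against the manual for your EDB version):

-- Plain three-column table, created before any extra indexes are added
CREATE TABLE MyData
(
ID INTEGER,
Value1 FLOAT,
Value2 FLOAT
)

-- Bulk-load the CSV file from a configured store
IMPORT TABLE MyData
FROM "mydata.csv" IN STORE "Data"
DELIMITER CHAR ','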

Thanks in advance!
Tue, Apr 14 2009 8:06 AM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

<< I'm using "IMPORT TABLE" to feed data into a table from a file. It's a simple
3-field table: INTEGER, FLOAT, FLOAT. I do the import before I add any indexes.

The speed I'm getting on my system is around 11,000 records/sec, which is three
times slower than I expected for such a simple import operation with no indexes
involved. >>

Actually, EDB uses an internal row ID index as the primary key if you don't define
one for the table.
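
So there is still one index being maintained during the import even without a declared
primary key. If you want control over which key that is, you can declare it explicitly
when creating the table; a sketch with illustrative names (check the CREATE TABLE topic
in the manual for the exact constraint syntax):

CREATE TABLE MyData
(
ID INTEGER,
Value1 FLOAT,
Value2 FLOAT,
CONSTRAINT PKey PRIMARY KEY (ID)
)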

How many total rows are you importing, and how often does the import occur?

11,000 rows/sec is actually pretty good for an import into a shared table.
What is the basis for your expectation of 33,000 rows/sec on an import?  If
you're comparing it to a TList population or something similar, then that is
not a valid comparison.

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Apr 14 2009 1:46 PM

Dan
"Tim Young [Elevate Software]" wrote:

<< How many total rows are you importing, and how often does the import occur? >>

The import is part of a function in an EDB remote SQL server app; that's why I need to
optimize this process, as it is frequently used.
There are 30,000 to 100,000 rows per call, and the problem arises because this is not a
one-time call but a commonly used function.
The import is done locally on the SQL server.

<< 11,000 rows/sec is actually pretty good for an import into a shared table.
What is the basis for your expectation of 33,000 rows/sec on an import? >>

I estimated the speed based both on the speed of the process that generates the data
being imported (a full-text search engine) and on the performance of SQLite for the
same operation.

I wonder if the IMPORT can be improved, or if you have any other ideas for getting the
CSV file data into EDB at maximum speed.

Thanks.
Tue, Apr 14 2009 2:04 PM

Dan
I forgot to mention that for the import the user has exclusive rights; there is no
shared access during that step.
Thu, Apr 16 2009 5:08 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

<< I estimated the speed based both on the speed of the process that generates the
data being imported (a full-text search engine) and on the performance of SQLite for
the same operation. >>

Comparisons to SQLite won't be very helpful, since it doesn't even support multi-user
access.

BTW, I forgot to ask: which version of EDB are you testing, and are you wrapping the
import in a transaction at all?

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, Apr 16 2009 5:11 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

<< I forgot to mention that for the import the user has exclusive rights; there is no
shared access during that step. >>

Yes, but IMPORT TABLE always opens the table in a shared manner.

Per my previous message about the version being used, if you are trying this
with a build prior to 2.02 Build 11, try it with Build 11 and see if that
helps.  If you're already using Build 11, then that is about as fast as
things can get.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, Apr 22 2009 4:22 AM

Dan
"Tim Young [Elevate Software]" wrote:

Dan,

<< Yes, but IMPORT TABLE always opens the table in a shared manner.

Per my previous message about the version being used, if you are trying this
with a build prior to 2.02 Build 11, try it with Build 11 and see if that
helps.  If you're already using Build 11, then that is about as fast as
things can get. >>

I did the upgrade and things improved a bit, but I'm still looking for a way to speed
up the import.

I start from the fact that for the import the user has exclusive rights, with no shared
access during that step, and that writing the data out in CSV format is very fast. Can
you suggest how to achieve the fastest possible speed for that operation? It is clearly
the bottleneck for the whole process: the FTS takes 10% of the total execution time,
the CSV write 5%, the import 80%, and the query the remaining 10%. A task as mundane as
the data import taking 80% of the time is not OK for me.

Maybe I could skip both the CSV format and IMPORT, and write the data directly to the
file in ElevateDB table format? I have exclusive access, and it is a very simple
structure: INTEGER, FLOAT, FLOAT. Do you have any other suggestion for getting that
data into the ElevateDB environment?

Thanks again.
Wed, Apr 22 2009 7:49 AM

Roy Lambert

NLH Associates

Team Elevate

Dan


Out of interest, why can't you do the query on the SQL server?

Roy Lambert
Wed, Apr 22 2009 11:25 AM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

<< I start from the fact that for the import the user has exclusive rights, with no
shared access during that step, and that writing the data out in CSV format is very
fast. Can you suggest how to achieve the fastest possible speed for that operation?
It is clearly the bottleneck for the whole process: the FTS takes 10% of the total
execution time, the CSV write 5%, the import 80%, and the query the remaining 10%. A
task as mundane as the data import taking 80% of the time is not OK for me. >>

I'm sorry, but that is about as fast as ElevateDB will get with an import
(B11).  What you're expecting is performance that is equivalent to an
in-memory collection or some other basic type of data structure, and that is
not how a database engine works.  It may appear mundane to you, but there is
a lot more going on with an import.

<< Maybe I could skip both the CSV format and IMPORT, and write the data directly to
the file in ElevateDB table format? >>

We don't recommend doing that.  Any change to the way that ElevateDB handles the table
files could cause your code to break, and you would still have to re-index the table
after you're done.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, Apr 22 2009 11:29 AM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

One thing that you may try is bumping up the buffer sizes for the table,
specifically the MAX ROW BUFFER SIZE:

http://www.elevatesoft.com/manual?action=mantopic&id=edb2sql&category=10&topic=149

http://www.elevatesoft.com/edb_faqt_3.htm

Something like this:

ALTER TABLE MyTable
MAX ROW BUFFER SIZE 1048576

That will permit ElevateDB to use more memory for buffering rows during the
import.
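
For example, the whole sequence might look like this (the store and file names are
illustrative, and the IMPORT TABLE clauses should be checked against the manual for
your version):

-- Allow up to 1 MB (1048576 bytes) of memory for buffering rows
ALTER TABLE MyTable
MAX ROW BUFFER SIZE 1048576

-- Then run the bulk load against the enlarged buffer
IMPORT TABLE MyTable
FROM "mydata.csv" IN STORE "Data"
DELIMITER CHAR ','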

--
Tim Young
Elevate Software
www.elevatesoft.com
