Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 5 of 5 total
Thread Indexes
Tue, Jul 3 2007 10:08 AMPermanent Link

I have an app that does a lot of intense data analysis in memory, and then
stores the data in a DBISAM database. The database has a number of tables,
each with a fair number of indexes as needed to make the queries optimal.
The analysis storage is done by a number of threads and I'm pondering
their efficiency because they all become very IO bound. The writing is
taking more than the analysis at the moment, with 4 processors running at
about 20-30% use.

I am wondering if it wouldn't be better to suspend the indexes for a while
(either dropping them or something) while the data is stored, and then
having a final cleanup re-create and build them. I presume this would save
a load of reading and writing of pages.

I'd like to experiment with this, but to do so a way to turn off the
indexes and build them again later would be good. DBISAM 4 is in use.

Thoughts anyone?

/Matthew Jones/
Tue, Jul 3 2007 11:05 AMPermanent Link

Dave Harrison
mattjones@cix.co.uk (Matthew Jones) wrote:

> I have an app that does a lot of intense data analysis in memory, and then
> stores the data in a DBISAM database. The database has a number of tables,
> each with a fair number of indexes as needed to make the queries optimal.
> The analysis storage is done by a number of threads and I'm pondering
> their efficiency because they all become very IO bound. The writing is
> taking more than the analysis at the moment, with 4 processors running at
> about 20-30% use.
>
> I am wondering if it wouldn't be better to suspend the indexes for a while
> (either dropping them or something) while the data is stored, and then
> having a final cleanup re-create and build them. I presume this would save
> a load of reading and writing of pages.
>
> I'd like to experiment with this, but to do so a way to turn off the
> indexes and build them again later would be good. DBISAM 4 is in use.
>
> Thoughts anyone?
>
> /Matthew Jones/

Matthew,
   With any database that I've used, it is always faster loading data into
an empty table without indexes, and then build the indexes when the data
has finished loading. If you already have a lot of data in the table and
are loading data, I don't think you will gain much speed because to drop
the index, load the data, and re-build the index will require the data
to be loaded twice (3 times?). It would be nice if DBISAM had the
ability to disable all non-unique/primary indexes (like MySQL can) and
then re-enable them when you're finished loading data, which rebuilds
these indexes.

Here are some general speed tips that you may find useful:
http://www.softwarewithbrains.com/dbisamtips.htm

Dave
Wed, Jul 4 2007 4:22 AMPermanent Link

> If you already have a lot of data in the table and
> are loading data,

I specifically start by emptying the relevant tables. I may do some
experimentation. Thanks for the tips.

/Matthew Jones/
Wed, Jul 4 2007 9:45 AMPermanent Link

"Robert"

"Matthew Jones" <mattjones@cix.co.uk> wrote in message
news:memo.20070704091908.4572H@nothanks.nothanks.co.uk...
>> If you already have a lot of data in the table and
>> are loading data,
>
> I specifically start by emptying the relevant tables. I may do some
> experimentation. Thanks for the tips.
>

If you start with empty tables, then it's a snap.

1. Go to Dbsys and reverse engineer your tables into SQL.
2. Separate the generated SQL into two files: one that drops the existing
table and creates the new one, and the other the part that builds the
indexes.
3. Execute the first script before adding the data, and the second after you
add the data.

Voila.

Robert

> /Matthew Jones/

Wed, Jul 4 2007 12:02 PMPermanent Link

I can use that principle, but due to the variability of the data I'd have
to generate the SQL at runtime. Dropping the table does look sensible
though.

/Matthew Jones/
Image