Thread: DBISAM 4 Tuning
Messages 11 to 16 of 16 total
Tue, Jul 24 2007 10:27 AM

Dave Harrison
Tim Young [Elevate Software] wrote:
> Dave,
>
> << I had set the EDB table buffer to several MB (I tried several settings)
> and I didn't notice any difference at all. Have you or anyone else actually
> noticed any improvement if you allocate more buffers to a large table? I'm
> doing Ranges and it never gets faster than 150 ranges/second regardless of
> the buffer size. I'm going to have to look into this further but thought I'd
> ask if there are any improvements to using more buffers. >>
>
> There are so many variables involved with such a test that it is next to
> impossible for me to make any comment that would cover everything.  The
> bottom line is that the default buffering should be adequate for a range
> test, and any increase in the buffers probably won't help improve the
> performance because the amount of buffering required is beyond what EDB can
> realistically use without getting into other issues with buffer management,
> etc.
>

I think there is a problem then. In my range tests EDB scored 150
ranges/second (32MB buffers), MySQL did 1,500 queries/sec, and another
database did approximately 15,000 ranges/sec (same data). These benchmarks
were run overnight to see the effects of the caching.

The other databases had excellent caching that really helped to increase
the speed. Sure, they used more RAM, but if someone has 3GB-4GB of RAM
installed, it seems kind of silly to allocate only 32MB for buffers.

In my tests, it didn't matter whether EDB had 32KB of buffers or 32MB
(32MB was the max I could allocate); it always ran at 150 ranges/second.
Because of this 32MB restriction, buffers won't help on large tables.
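A ranges/second figure like this comes down to timing a tight loop. A minimal Python sketch of such a measurement (the `do_range` callable is a hypothetical stand-in for one SetRange-plus-scan against the table; none of this is EDB API code):

```python
import time

def measure_ops_per_sec(do_range, n_ops=1000):
    """Time n_ops calls to do_range() and return throughput in ops/second.

    do_range is a placeholder for one range operation
    (e.g. a SetRange plus a scan of the matching rows).
    """
    start = time.perf_counter()
    for _ in range(n_ops):
        do_range()
    elapsed = time.perf_counter() - start
    return n_ops / elapsed
```

Running the loop for hours, as in the overnight tests above, also shows whether a cache warms up over time: throughput should climb if caching is effective and stay flat if it is not.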

Are you absolutely sure the buffers are being allocated? I only ask this
because my program allocated only 1.7MB of physical memory even though
the table was supposed to have 32MB of buffers (embedded).

Caching is absolutely critical for web servers that are running 24/7. If
there is a problem with EDB's caching going beyond 32MB, then I
recommend the caching algorithm be corrected.

Dave
Tue, Jul 24 2007 4:09 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dave,

<< Caching is absolutely critical for web servers that are running 24/7. If
there is a problem with EDB's caching going beyond 32MB, then I recommend
the caching algorithm be corrected. >>

I know what our products can and cannot do, and why.  I am trying to explain
to you why it is the way it is, but you don't seem to want to hear what I'm
saying.   I've been through this already once with you, and what I'm trying
to convey to you is that the way the buffering is designed is *intentional*.
EDB and DBISAM are not designed to buffer large amounts of data *instead of*
the OS doing so (IOW, using raw I/O at the OS level and bypassing the OS
buffering).   The OS already caches the files to a large extent and caching
them further in DBISAM and EDB is simply a waste of memory and defeats the
purpose of both products being lightweight in terms of memory consumption.

The enterprise level server that I have planned for EDB will have a large
memory model and it will be able to buffer large amounts of data because
it will be designed to do so.

If you want to send me your benchmark application that you're using, then I
can take a look and tell you what I can find.   More than likely the issue
is simply due to the fact that the table is large enough and the ranges
random enough to cause the process to become I/O bound where EDB is
constantly buffering data that is immediately ejected from the cache anyways
because the space is needed for newer data, rinse and repeat.
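This rinse-and-repeat effect is easy to reproduce with a toy model - a plain LRU cache in Python, with made-up slot and row counts rather than anything from EDB's internals. Under uniformly random access, the steady-state hit ratio is roughly cache size divided by table size:

```python
import random
from collections import OrderedDict

def hit_ratio(cache_slots, table_rows, n_reads, seed=1):
    """Simulate random row reads through an LRU cache and
    return the fraction of reads served from the cache."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(n_reads):
        row = rng.randrange(table_rows)
        if row in cache:
            hits += 1
            cache.move_to_end(row)         # mark most recently used
        else:
            cache[row] = True              # "read" the row from disk
            if len(cache) > cache_slots:
                cache.popitem(last=False)  # eject least recently used
    return hits / n_reads
```

With, say, `hit_ratio(1_000, 100_000, 50_000)` the result hovers near 0.01: a cache covering 1% of the rows serves about 1% of the reads, and doubling it only doubles that. This is why going from 32KB to 32MB of buffers can leave a large, randomly accessed table I/O bound either way.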

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Jul 24 2007 4:42 PM

Dave Harrison
Tim Young [Elevate Software] wrote:

> Dave,
>
> << Caching is absolutely critical for web servers that are running 24/7. If
> there is a problem with EDB's caching going beyond 32MB, then I recommend
> the caching algorithm be corrected. >>
>
> I know what our products can and cannot do, and why.  I am trying to explain
> to you why it is the way it is, but you don't seem to want to hear what I'm
> saying.   I've been through this already once with you, and what I'm trying
> to convey to you is that the way the buffering is designed is *intentional*.
> EDB and DBISAM are not designed to buffer large amounts of data *instead of*
> the OS doing so (IOW, using raw I/O at the OS level and bypassing the OS
> buffering).   The OS already caches the files to a large extent and caching
> them further in DBISAM and EDB is simply a waste of memory and defeats the
> purpose of both products being lightweight in terms of memory consumption.

Actually, the OS caching in my tests has no effect whatsoever. I'm not
even sure the OS is caching anything, at least not like it did with
DBISAM. There is barely any drop in memory after 10 hours of running and
the application speed is the same after 10 hours as it is in the first 5
seconds.

> The enterprise level server that I have planned for EDB will have a large
> memory model and it will be able to buffer large amounts of data because
> it will be designed to do so.

Ok, great. I didn't know you were working on an Ent version. (I'm
juggling 4 databases at a time here so maybe I plain forgot.)

> More than likely the issue
> is simply due to the fact that the table is large enough and the ranges
> random enough to cause the process to become I/O bound where EDB is
> constantly buffering data that is immediately ejected from the cache anyways
> because the space is needed for newer data, rinse and repeat.

Correct. Because the cache isn't large enough to hold enough records to
make a difference. Old records keep getting swapped out.

I'll put EDB on the shelf for now and wait for the Ent version. Thanks.

Dave
Wed, Jul 25 2007 3:37 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Dave,

<< Actually, the OS caching in my tests has no effect whatsoever. I'm not
even sure the OS is caching anything, at least not like it did with DBISAM.
There is barely any drop in memory after 10 hours of running and the
application speed is the same after 10 hours as it is in the first 5
seconds. >>

What are you using to measure the OS memory usage for caching?  The task
manager should reflect some change, especially with random access over a
large table.

<< Ok, great. I didn't know you were working on an Ent version. (I'm
juggling 4 databases at a time here so maybe I
plain forgot.) >>

This is just one of those issues that is architectural in nature, and no
amount of manipulation of the buffering settings will change that.

<< Correct. Because the cache isn't large enough to hold enough records to
make a difference. Old records keep getting swapped out. >>

Yes, but this is not an issue that is completely solvable.  Given random
enough input, it is possible to defeat even a very large cache when you're
dealing with a very large table, especially when you factor in read-ahead.
IOW, even a 256MB or 512MB cache could be defeated if the table is several
GBs in size.  It may not be defeated completely, but it could be defeated
enough to cause thrashing and performance degradation.  Also, DBISAM and EDB
use private caches per session, whereas most database servers use a shared
cache architecture.  With such a scenario, a single-user test of performance
is not reflective of what will be experienced when multiple users are
hitting the database server and causing rows (that a particular session
might need) to be ejected from the cache prematurely.  This doesn't occur
with DBISAM or EDB - the I/O patterns seen in a single-user test are
replicable in a multi-user test as well.
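The shared-versus-private trade-off can be sketched with the same kind of toy model (hypothetical sizes and access patterns, not DBISAM/EDB internals): session A re-reads a small hot set while session B scans randomly over a large table, and A's hit ratio collapses only when the two share one cache.

```python
import random
from collections import OrderedDict

class LRUCache:
    def __init__(self, slots):
        self.slots, self.data = slots, OrderedDict()

    def access(self, key):
        """Return True on a hit; fault the key in (evicting the LRU entry) on a miss."""
        if key in self.data:
            self.data.move_to_end(key)
            return True
        self.data[key] = True
        if len(self.data) > self.slots:
            self.data.popitem(last=False)
        return False

def session_a_hit_ratio(cache_a, cache_b, n=20_000, seed=2):
    """Session A re-reads a 50-row hot set; session B makes five random
    reads over 1,000,000 rows per step. Returns A's hit ratio."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        hits += cache_a.access(rng.randrange(50))      # session A: hot rows
        for _ in range(5):
            cache_b.access(rng.randrange(1_000_000))   # session B: random scan
    return hits / n

shared_ratio = (lambda c: session_a_hit_ratio(c, c))(LRUCache(100))
private_ratio = session_a_hit_ratio(LRUCache(100), LRUCache(100))
```

With private caches, A's 50-row working set always fits in its 100 slots and nearly every read hits; with one shared cache, B's stream of misses keeps ejecting A's hot rows prematurely - the effect described above.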

<< I'll put EDB on the shelf for now and wait for the Ent version. Thanks.
>>

Well, at the very least you should send me what you're using so I can at
least verify that what you state is actually the reason for the
performance that you're seeing.   I will confirm here publicly what I find.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, Jul 25 2007 4:07 PM

Dave Harrison
Tim Young [Elevate Software] wrote:
>
> Yes, but this is not an issue that is completely solvable.  Given random
> enough input, it is possible to defeat even a very large cache when you're
> dealing with a very large table, especially when you factor in read-ahead.

Yes, I'm testing the worst case scenario.

> Also, DBISAM and EDB
> use private caches per session, whereas most database servers use a shared
> cache architecture.  

So you mean the table Buffers are not shared? If I have Buffers set to
8MB, then you're saying each session creates its own 8MB buffer for the
table? Uh-oh. Most large C/S databases these days use shared buffers
because it's more memory efficient. Assigning a 500MB cache to all users
makes more sense to me than assigning 10MB caches to 50 sessions, or a
1MB cache to 500 sessions. That way if one user fetches a row, it is
available in the cache for all users. Will your Ent version use a
shared cache?

> With such a scenario, a single-user test of performance
> is not reflective of what will be experienced when multiple users are
> hitting the database server and causing rows (that a particular session
> might need) to be ejected from the cache prematurely.  This doesn't occur
> with DBISAM or EDB - the I/O patterns seen in a single-user test are
> replicable in a multi-user test as well.
>
> << I'll put EDB on the shelf for now and wait for the Ent version. Thanks.
>  >>
>
> Well, at the very least you should send me what you're using so I can at
> least verify that what you state is actually the reason for the
> performance that you're seeing.   I will confirm here publicly what I find.
>

I'd like to, but the data and table structure can't be distributed
because of my NDA. I'll have to find time to cobble something together
and generate data for it.

Dave
Wed, Jul 25 2007 5:02 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Dave,

<< So you mean the table Buffers are not shared? If I have Buffers set to
8MB, then you're saying each session creates its own 8MB buffer for the
table? Uh-oh. Most large C/S databases these days use shared buffers because
it's more memory efficient. >>

Sure, but it is also subject to the performance issues that I mentioned,
which is why most, like Sybase for example, offer a way to designate private
cache areas for sessions in order to avoid these issues.  EDB and DBISAM
don't suffer from these issues because they basically use a hybrid - the OS
file caching as the shared cache and the local buffering for local caches.

<< Assigning 500MB cache to all users makes more sense to me rather than
assigning 10MB caches to 50 sessions,
or 1MB cache to 500 sessions. That way if one user fetches a row, then it
is available in the cache for all users. >>

You're looking at the good side of the situation.   The bad side is that the
fetch might have just ejected a row from the cache that will be needed by
another session.  With local or private caches, this scenario does not take
place.

<< Will your ENT version use a shared cache? >>

Yes, but it will still also use local caches and retain the hybrid cache.
The shared cache will simply replace what the OS is currently doing now.

<< I'd like to, but the data and table structure can't be distributed
because of my NDA. I'll have to find time to cobble something together and
generate data for it. >>

Well, if you could at least just give me the code that you're using, I can
match it up with a table structure of my own making that should do what I
want for profiling purposes.

BTW, I did find an issue with the buffering settings in B6 and earlier - the
buffering settings are being saved in the current session, but not actually
saved in the catalog itself.  Therefore, if you close down the application
and restart it, the buffering settings for the tables will revert back to
their defaults.  Apparently this has been in there for the duration - I
checked every source code backup going back to the initial release, and none
of them are saving the buffering settings properly.
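The failure mode described here - a setting that takes effect in the live session but is never written to the on-disk catalog, so it silently reverts on restart - looks like this in miniature (a hypothetical Python sketch, not EDB code):

```python
import json
import os
import tempfile

class Catalog:
    """Toy catalog: session-level settings plus an on-disk store."""
    DEFAULTS = {"buffer_size": 32 * 1024}   # default 32KB

    def __init__(self, path):
        self.path = path
        self.session = dict(self.DEFAULTS)

    def set_buffer_size(self, nbytes, persist):
        self.session["buffer_size"] = nbytes  # takes effect immediately
        if persist:                           # the step the buggy builds skip
            with open(self.path, "w") as f:
                json.dump(self.session, f)

    def reopen(self):
        """Simulate shutting down and restarting the application."""
        if os.path.exists(self.path):
            with open(self.path) as f:
                self.session = json.load(f)
        else:
            self.session = dict(self.DEFAULTS)

path = os.path.join(tempfile.mkdtemp(), "catalog.json")
cat = Catalog(path)
cat.set_buffer_size(32 * 1024 * 1024, persist=False)  # session-only change
cat.reopen()
reverted = cat.session["buffer_size"]                 # back to the default
```

In this toy, passing `persist=True` would write the setting through to disk so it survives the reopen - the step that, per the above, the B6 and earlier builds were missing for table buffering settings.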

--
Tim Young
Elevate Software
www.elevatesoft.com
