Support Forums - View Thread

View Thread

The following is the text of the current message along with any replies.

Messages 11 to 20 of 21 total

Wed, Oct 10 2007 9:06 AM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Dave, << I've uploaded the benchmark program to the binaries. EDB did better when I decreased the transaction size to 100 rows instead of using 1500 or 15000. It was about twice as fast as DBISAM when tuned correctly. >> Is it just me, or is the ElevateDB index building portion of the code missing ? -- Tim Young Elevate Software www.elevatesoft.com
Wed, Oct 10 2007 9:30 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	Tim That would explain the consistency of performance Roy Lambert
Wed, Oct 10 2007 5:15 PM	Permanent Link
Dave Harrison	Tim Young [Elevate Software] wrote: > Dave, > > << I've uploaded the benchmark program to the binaries. EDB did better when > I decreased the transaction size to 100 rows instead of using 1500 or 15000. > It was about twice as fast as DBISAM when tuned correctly. >> > > Is it just me, or is the ElevateDB index building portion of the code > missing ? > Oops. No, it was me. I've fixed it and re-uploaded it. This now compares with my real-world tests where adding rows to an EDB table is a bit slower than DBISAM. I had hoped it would be considerably faster. Maybe there is some magical tweaking that can be done by the wizard himself. Unfortuantely I my magic wand is broken and I've got another one on order from http://www.alivans.com/choose-your-wand.htm and it won't be here for a few weeks. Dave
Thu, Oct 11 2007 1:02 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Dave, << I've fixed it and re-uploaded it. This now compares with my real-world tests where adding rows to an EDB table is a bit slower than DBISAM. >> Well, for starters the EDB index format isn't going to be able to compress near-unique values as well, and your tests are generating completely random data that is near-unique in most cases. That will be a disadvantage to EDB's index compression and not DBISAM's. Apart from that, I'll have to do some more testing to find out what the whole situation is. I've run EDB a few times, but not DBISAM yet. -- Tim Young Elevate Software www.elevatesoft.com
Thu, Oct 11 2007 3:24 PM	Permanent Link
Dave Harrison	Tim Young [Elevate Software] wrote: > Dave, > > << I've fixed it and re-uploaded it. This now compares with my real-world > tests where adding rows to an EDB table is a bit slower than DBISAM. >> > > Well, for starters the EDB index format isn't going to be able to compress > near-unique values as well, and your tests are generating completely random > data that is near-unique in most cases. The key is name up of the fields Name1,Code1,Date1. Name1 may appear to be random, but each Name1 value will occur at least 4500 times, and up to 38,000 times when the table has 1 million rows. Code1 has values only of "A","B","C". Date1 ranges over a 27 year period. The combination of all 3 will make a nearly unique key, which would be the case in a real world application. > That will be a disadvantage to > EDB's index compression and not DBISAM's. Apart from that, I'll have to do > some more testing to find out what the whole situation is. I've run EDB a > few times, but not DBISAM yet. > Ok. I thought I'd run it by you in case I was doing something wrong. Wouldn't be the first time. I was a little surprised to see DBISAM was faster at loading the data. Dave
Thu, Oct 11 2007 5:20 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Dave, << The key is name up of the fields Name1,Code1,Date1. Name1 may appear to be random, but each Name1 value will occur at least 4500 times, and up to 38,000 times when the table has 1 million rows. Code1 has values only of "A","B","C". Date1 ranges over a 27 year period. The combination of all 3 will make a nearly unique key, which would be the case in a real world application. >> Yes, that's exactly what I meant. However, discard what I said - I assumed that the DBISAM test was using compression for the indexes, and it isn't, so what I said doesn't matter nearly as much. << Ok. I thought I'd run it by you in case I was doing something wrong. Wouldn't be the first time. I was a little surprised to see DBISAM was faster at loading the data. >> Well, as far as I can see, it isn't on the average: DBISAM Elapsed: 1,130.1 sec, #Rcds:1,000,000 885 Rcds/Sec Trans Flush: No Trans Size=1000 Build Index Before DBISAM Elapsed: 978.5 sec, #Rcds:1,000,000 1,022 Rcds/Sec Trans Flush: No Trans Size=5000 Build Index Before DBISAM Elapsed: 1,190.0 sec, #Rcds:1,000,000 840 Rcds/Sec Trans Flush: No Trans Size=10000 Build Index Before DBISAM Elapsed: 440.1 sec, #Rcds:1,000,000 2,272 Rcds/Sec Trans Flush: No Trans Size=15000 Build Index Before DBISAM Elapsed: 376.5 sec, #Rcds:1,000,000 2,656 Rcds/Sec Trans Flush: No Trans Size=30000 Build Index Before ElevateDb Elapsed: 652.8 sec, #Rcds:1,000,000 1,532 Rcds/Sec Trans Size=1000 Build Index Before ElevateDb Elapsed: 637.7 sec, #Rcds:1,000,000 1,568 Rcds/Sec Trans Size=5000 Build Index Before ElevateDb Elapsed: 473.6 sec, #Rcds:1,000,000 2,112 Rcds/Sec Trans Size=10000 Build Index Before ElevateDb Elapsed: 446.8 sec, #Rcds:1,000,000 2,238 Rcds/Sec Trans Size=15000 Build Index Before ElevateDb Elapsed: 464.9 sec, #Rcds:1,000,000 2,151 Rcds/Sec Trans Size=30000 Build Index Before There are a couple of instances that are reproducible in which DBISAM becomes completely disk-bound during the process. The two that stick out are the 1000 row and 10000 row tests that I did. EDB, on the other hand, does not have this problem since it doesn't touch nearly as much index data during an average index update due to not having the statistics in the indexes. This is what I was referring to when I spoke about the index updates being faster. -- Tim Young Elevate Software www.elevatesoft.com
Fri, Oct 12 2007 12:57 AM	Permanent Link
Dave Harrison	Tim Young [Elevate Software] wrote: > Dave, > > << The key is name up of the fields Name1,Code1,Date1. Name1 may appear to > be random, but each Name1 value will occur at least 4500 times, and up to > 38,000 times when the table has 1 million rows. Code1 has values only of > "A","B","C". Date1 ranges over a 27 year period. The combination of all 3 > will make a nearly unique key, which would be the case in a real world > application. >> > > Yes, that's exactly what I meant. However, discard what I said - I assumed > that the DBISAM test was using compression for the indexes, and it isn't, so > what I said doesn't matter nearly as much. > > << Ok. I thought I'd run it by you in case I was doing something wrong. > Wouldn't be the first time. I was a little surprised to see DBISAM was > faster at loading the data. >> > > > There are a couple of instances that are reproducible in which DBISAM > becomes completely disk-bound during the process. The two that stick out > are the 1000 row and 10000 row tests that I did. EDB, on the other hand, > does not have this problem since it doesn't touch nearly as much index data > during an average index update due to not having the statistics in the > indexes. This is what I was referring to when I spoke about the index > updates being faster. Each engine has to be optimized for the best transaction size. The best DBISAM speed I found was: DBISAM Elapsed: 137.0 sec, #Rcds:1,000,000 7,299 Rcds/Sec Trans Flush: No Trans Size=7500 Build Index Before beats the best EDB speed: ElevateDb Elapsed: 193.0 sec, #Rcds:1,000,000 5,182 Rcds/Sec Trans Size=6000 Build Index Before by about 40%. So if EDB doesn't touch the index as much as DBISAM, then EDB is spending a lot more time some place else. BTW, I'm not sure why you're getting such lousy times: Yours: > DBISAM Elapsed: 1,190.0 sec, #Rcds:1,000,000 840 Rcds/Sec > Trans Flush: No Trans Size=10000 Build Index Before Mine is over 8x faster for same transaction size: DBISAM Elapsed: 144.1 sec, #Rcds:1,000,000 6,941 Rcds/Sec Trans Flush: No Trans Size=10000 Build Index Before Yours: > ElevateDb Elapsed: 652.8 sec, #Rcds:1,000,000 1,532 Rcds/Sec Trans > Size=1000 Build Index Before Mine is 3x faster: ElevateDb Elapsed: 213.5 sec, #Rcds:1,000,000 4,685 Rcds/Sec Trans Size=1000 Build Index Before Are you running yours on a laptop? Dave
Fri, Oct 12 2007 9:46 AM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Dave, << Each engine has to be optimized for the best transaction size. >> Yes, I understand. However, my point is that EDB has the best overall performance with any transaction size, not just selective ones. << The best DBISAM speed I found was: DBISAM Elapsed: 137.0 sec, #Rcds:1,000,000 7,299 Rcds/Sec Trans Flush: No Trans Size=7500 Build Index Before beats the best EDB speed: ElevateDb Elapsed: 193.0 sec, #Rcds:1,000,000 5,182 Rcds/Sec Trans Size=6000 Build Index Before by about 40%. So if EDB doesn't touch the index as much as DBISAM, then EDB is spending a lot more time some place else. >> I can't account for what happens on your particular machine without being able to profile it here. All I can go by is what I see here when I profile the application on my machine (Intel Pentium D 2.8GHz desktop with 1GB of RAM with SpeedStep disabled). From my tests, DBISAM is becoming I/O bound on certain tests while EDB provides fairly-consistent results across the board. As far as EDB spending a lot more time elsewhere, there really isn't any where else that would account for a difference that is as pronounced as that. I think we'll need to get some other machines in this mix in order to rule out any particular issue with your machine or my machine with regard to this test. << BTW, I'm not sure why you're getting such lousy times: >> Well, we're using different machines, for starters. I don't use particularly fast machines because they tend to give database engine developers a false sense of performance for the average machine. I'm also using D7 instead of D2007 for the compilation, but I don't think that there is that much difference between the two compilers. I'm doing some tests right now to rule out any differences with FastMM vs. the native Delphi memory manager. -- Tim Young Elevate Software www.elevatesoft.com
Fri, Oct 12 2007 10:41 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Dave, Okay, after testing with FastMM the differences become more pronounced in DBISAM's favor. Further testing reveals that the issue is the Name1 column in the two indexes. EDB has to do quite a bit more moving around of data for strings since it uses long strings (AnsiStrings and WideStrings) for comparing, etc. This goes back to the original design requirements for not using direct pointer access for .NET managed code portability. String accesses require some allocation and move-ing of memory, whereas DBISAM directly addresses the column in the row buffer. If you remove the Name1 column from both indexes, you will see this phenomenon in action. For example, this is the result that I get without the Name1 column in the indexes and using the FastMM memory manager: ElevateDb Elapsed: 169.9 sec, #Rcds:1,000,000 5,886 Rcds/Sec Trans Size=15000 Build Index Before DBISAM Elapsed: 223.0 sec, #Rcds:1,000,000 4,484 Rcds/Sec Trans Flush: No Trans Size=15000 Build Index Before DBISAM starts off fast, but degrades quickly, whereas EDB degrades fairly gracefully as it does with the Name1 column included. I'm going to have to investigate some caching techniques to work around this issue, but for now the main thing to take away from this is to avoid using string columns that are ~20 characters or more in the indexes if you're interested in the maximum amount of performance. In your test, the randomness of the Name1 input also exacerbates the issue since it causes many more column comparisons in the index keys than a less random input would cause. BTW, here are my original FastMM results with the Name1 column included: DBISAM Elapsed: 275.9 sec, #Rcds:1,000,000 3,625 Rcds/Sec Trans Flush: No Trans Size=15000 Build Index Before ElevateDb Elapsed: 336.3 sec, #Rcds:1,000,000 2,973 Rcds/Sec Trans Size=15000 Build Index Before Tim Young Elevate Software www.elevatesoft.com
Fri, Oct 12 2007 11:41 PM	Permanent Link
Dave Harrison	Tim Young [Elevate Software] wrote: > Dave, > > Okay, after testing with FastMM the differences become more pronounced in > DBISAM's favor. Further testing reveals that the issue is the Name1 column > in the two indexes. EDB has to do quite a bit more moving around of data > for strings since it uses long strings (AnsiStrings and WideStrings) for > comparing, etc. This goes back to the original design requirements for not > using direct pointer access for .NET managed code portability. String > accesses require some allocation and move-ing of memory, whereas DBISAM > directly addresses the column in the row buffer. If you remove the Name1 > column from both indexes, you will see this phenomenon in action. For > example, this is the result that I get without the Name1 column in the > indexes and using the FastMM memory manager: > > ElevateDb Elapsed: 169.9 sec, #Rcds:1,000,000 5,886 Rcds/Sec Trans > Size=15000 Build Index Before > > DBISAM Elapsed: 223.0 sec, #Rcds:1,000,000 4,484 Rcds/Sec Trans Flush: > No Trans Size=15000 Build Index Before > > DBISAM starts off fast, but degrades quickly, whereas EDB degrades fairly > gracefully as it does with the Name1 column included. That's what I discovered too. EDB speed doesn't diminish as quickly as other databases. > > I'm going to have to investigate some caching techniques to work around this > issue, but for now the main thing to take away from this is to avoid using > string columns that are ~20 characters or more in the indexes if you're > interested in the maximum amount of performance. In your test, the > randomness of the Name1 input also exacerbates the issue since it causes > many more column comparisons in the index keys than a less random input > would cause. > > BTW, here are my original FastMM results with the Name1 column included: > > DBISAM Elapsed: 275.9 sec, #Rcds:1,000,000 3,625 Rcds/Sec Trans Flush: > No Trans Size=15000 Build Index Before > > ElevateDb Elapsed: 336.3 sec, #Rcds:1,000,000 2,973 Rcds/Sec Trans > Size=15000 Build Index Before So without the String it is 50% to 60% faster. A lot of databases will index faster without string fields in the index. The only solution I see is to convert the string into a unique integer that is maintained in another table. It could work but would be a bit difficult to sort it because new string values could be added every month would mean reordering the integers. I think this is still feasible though. Thanks for your feedback. Dave

« Previous Page	Page 2 of 3	Next Page »
Jump to Page: 1 2 3