![]() | Products |
| Home » Technical Support » ElevateDB Technical Support » Support Forums » ElevateDB General » View Thread |
| Messages 1 to 8 of 8 total |
| Tue, Jul 30 2024 11:44 PM | Permanent Link |
Shane Sturgeon | Hi all
I have an application which has a couple of fields that I'm not sure how to store most efficiently - i.e. to minimize the space needed on disk. It's not that disk space is scarce - it's just that I like to be efficient. 1) Field #1: It's an enumerated type in the application. I can store it as either an integer value as a Small Int (which I understand to be TWO bytes on disk) or as a VARCHAR(1). I don't know how many bytes that will take, but I seem to recall that Unicode is something like 4 bytes per character, so I presume the 16bit SMALLINT is the more efficient of the two. Is this correct? 2) Field #1: There is also the option of a Byte(1) field which presumably is only ONE byte, however it does seem to be a tricky type to work with based on what I've managed to glean from these forums, and I did see Tim chime in at one point and my take away was that it's probably not worth it. It wasn't the same use case though, so I'm interested in peoples thoughts on that as an option. 3) Field #2: This is a field for notes, and I've currently used VarChar(1024). It's hooked up to a TMemo in the app. My expectation is that this field will rarely be used, so most of the time this field will be '' (empty String) in the table. I don't know how exactly VarChars work in the table storage (there is mention of not being padded in the docs which I'm assuming means "when returned via SQL", and hoping it also means on disk), but if it's storing 1024x 4? bytes for each row, and that's empty space, that seems like it could be more efficient to create a separate child table just for those rows that need to include a note. Is there a better way? Does a NULL use less space than an empty string? 4) This is another question along the same lines that I may as well add in here. GUID fields as I understand them are fixed length VARCHAR (38 according to the docs). Are they converted back to the record type when being stored on disk, or are they taking up 38x4? bytes of diskspace for each key? Is it more efficient to use a BigInt for a key where you don't want to run the risk of overflowing a standard INTEGER Key? Looking forward to being enlightened Thanks Shane |
| Wed, Jul 31 2024 2:45 AM | Permanent Link |
Roy Lambert NLH Associates | Shane
I think you may be using the wrong definition of efficient Taking the second field first I'd go along with your idea of a separate table, the increased disk space per row should, from your description, be more than compensated by the reduction in number of rows. I'd look at the efficiency in processing before considering the disk space saving eg what extras are involved - maybe an SQL join or a master-detail relationship. Field 1 is a bit more complex. I've tended to use lists rather than proper enumerated types and store either the text value or the position in the list (or even both in a few cases). Again I'd look at the processing in your application rather than the amount of disk space used. The reason I generally go for lists is that they are easier to maintain so if the enumerated type is not fixed for all time a list can be stored in a table and changed at will. To overflow an integer used as a key would mean a rather large table (Delphi's MaxInt value is (2^32)-1 = 2,147,483,647) Sorry I can't be more helpful Roy Lambert |
| Wed, Jul 31 2024 3:30 AM | Permanent Link |
Shane Sturgeon | Thanks Roy - that was helpful. Always eager to get a different perspective.
I'm using Aurelius so the extra table for the notes would be done as a lazy load which would actually work well in a lot of situations (as in those where the notes are not required). I am storing the labels for the enumerated type in a table along with the enum (and some custom enum space) so I'll consider your list idea. Cheers Shane Roy Lambert wrote: Shane I think you may be using the wrong definition of efficient Taking the second field first I'd go along with your idea of a separate table, the increased disk space per row should, from your description, be more than compensated by the reduction in number of rows. I'd look at the efficiency in processing before considering the disk space saving eg what extras are involved - maybe an SQL join or a master-detail relationship. Field 1 is a bit more complex. I've tended to use lists rather than proper enumerated types and store either the text value or the position in the list (or even both in a few cases). Again I'd look at the processing in your application rather than the amount of disk space used. The reason I generally go for lists is that they are easier to maintain so if the enumerated type is not fixed for all time a list can be stored in a table and changed at will. To overflow an integer used as a key would mean a rather large table (Delphi's MaxInt value is (2^32)-1 = 2,147,483,647) Sorry I can't be more helpful Roy Lambert |
| Wed, Jul 31 2024 2:45 PM | Permanent Link |
Fernando Dias | Hello ,
Before answering your question, I would like to have a notion about the expected size of your tables. Is it hundreds of rows or millions of rows? Not yet knowing the details, for now just consider this: VARCHAR columns in EDB use a fixed amount of space - so if using Unicode, a VARCHAR(1024) will use 2048 bytes for each row, even when NULL or empty. I would use a CLOB column instead because it saves space and it's way easier to deal with than using 2 tables with the overhead of joins, etc. -- Fernando Dias [Team Elevate] |
| Wed, Jul 31 2024 8:40 PM | Permanent Link |
Shane Sturgeon | Hi Fernando
My questions are a bit more generic than a specific application - I have several on the go, and have wondered about these issues in past apps as well. However, I can address your query re sizes for a couple to the app suite I am working on at the moment. Field#1: I use this a lot, in a lot of different tables. In the current app suite, there is one table where I expect that the row count could easily enter the 7 to 8 figure range. Most tables with enumerated types are however only in the hundreds of rows category. Field#2: This is generally in the hundreds of rows category. Thanks for the info re CLOB - I haven't used those before. Is there any trick to storing strings in them, or is it the same as with VARCHAR? I also currently use VARCHAR(80) for captions (100's of row tables) - Given most captions are not that long, should I be using CLOB for those as well? Fernando Dias wrote: Hello , Before answering your question, I would like to have a notion about the expected size of your tables. Is it hundreds of rows or millions of rows? Not yet knowing the details, for now just consider this: VARCHAR columns in EDB use a fixed amount of space - so if using Unicode, a VARCHAR(1024) will use 2048 bytes for each row, even when NULL or empty. I would use a CLOB column instead because it saves space and it's way easier to deal with than using 2 tables with the overhead of joins, etc. -- Fernando Dias [Team Elevate] |
| Thu, Aug 1 2024 3:41 AM | Permanent Link |
Roy Lambert NLH Associates | Shane
<<Thanks for the info re CLOB - I haven't used those before. Is there any trick to storing strings in them, or is it the same as with VARCHAR? I also currently use VARCHAR(80) for captions (100's of row tables) - Given most captions are not that long, should I be using CLOB for those as well?>> This is back to the wrong sort of efficiency (or that's my big headed opinion) CLOBS have an entry in the host table (.EDBTbl) (I think its 10 bytes but I'm going on aging memory) and then allocates blocks in the blob table (.EDBBlb). I know how big they are has been mentioned but I can't remember - I actually wrote a program to analyse blob files for DBISAM to see if any optimisation was worth it (generally it wasn't). You also have compression to consider - disk size vs how long it takes to compress & decompress. CLOBs are great - from the manuals Min/Max BLOB Block Size for a Table The minimum BLOB block size is 64 bytes for ANSI databases and 128 bytes for Unicode databases. The maximum BLOB block size is 2GB. Table BLOBs (.EDBBlb) Used to store a fixed-length header for BLOB statistics along with the fixed-length BLOB blocks. The BLOB block size is variable and can be set between 64 bytes and 64 kilobytes on a per-table basis. All BLOBs for all BLOB columns defined for the table are stored in this file. From a post of mine back in 2012 (its great searching the newsgroups) BLOBS and CLOBS are stored in the .EDBBlb file that makes up a table along with the .EDBTbl and .EDBIdx files. A pointer to an offset where the specific item begins is stored in the .EDBTbl file. Within the .EDBBlb file each BLOB/CLOB is stored in one or more blocks. The size of these blocks can be set when the table is created/altered using the BLOB BLOCK SIZE clause. I'm guessing now because I don't know the exact format of the blocks but I'd expect it to be "from pointer" "data" "to pointer" and possibly some sort of hash to help with repair/integrity checking. As you read the data ElevateDB will read each block in turn to get the entire BLOB, larger blocks mean less reads and hence better performance but come at a cost of increased slack (waste space). The same applies to writing. Depending on the format of the videos you may want to turn compression on. There's a trade off between amount of data to be read and decompression time but you'll have to experiment to find out what's best for you. You'll also certainly need to alter the default BLOB BLOCK SIZE. What to is a difficult question to answer. I'd suggest trying around 16K and see what happens. I don't know what the maximum you can set in ElevateDB is but to give an example (assuming its possible) if you set it to 60Mb then for the 15MB video you'd have 45Mb slack. Basically to answer your question DO NOT USE CLOBs FOR SMALL CHARACTER COLUMNS. By small I'd mean anything under 250 and probably under 1024 Roy Lambert |
| Thu, Aug 1 2024 5:45 AM | Permanent Link |
Shane Sturgeon | Thanks Roy
I do take your point re efficiency I have searched the forums which is how I found out about not being particularly practical to use the Byte option in some cases, but I had never consider CLOBS - Always figured those were for storing things like actual files, and figured there was probably a better way than to store large files in a database table Thanks for your advice re using them as strings - seems like VARCHAR is the way to go there. Roy Lambert wrote: <<Basically to answer your question DO NOT USE CLOBs FOR SMALL CHARACTER COLUMNS. By small I'd mean anything under 250 and probably under 1024>> |
| Thu, Aug 1 2024 7:02 AM | Permanent Link |
Roy Lambert NLH Associates | Shane
I do know the mindset - its why I wrote my compression optimisation checker for DBISAM - I was most disappointed to find out it was a waste of time Padding & VARCHAR comes from the fact that ElevateDB is NOT a variable length field database (for that you need something like PICK or one of the "modern" ones) so if you make provision for a character field to store up to 20 characters (ie VARCHAR(20)) it will have space on disk for 20 characters even if you only put 1 in there. One of the things I use CLOBs for is to store "sub tables". With DBISAM and in-memory tables I would stream a DBISAM in-memory table in there, with ElevateDB, as part of my hatred of Tstringgrids I wrote nlhStrTable which (for small tables) uses linked Tstringlists to hold data which is then displayed in my enhanced DBGrid. Those can then be written / read (csv, json, xml, native) in a CLOB. Roy Lambert |
This web page was last updated on Thursday, March 12, 2026 at 07:53 AM | Privacy Policy © 2026 Elevate Software, Inc. All Rights Reserved Questions or comments ? |

