
Thread: Bulk data copy
Wed, Nov 14 2007 10:07 AM

Paul Endersby
The product that I code for has the ability to archive the current database to a ZIP file and then, at a later date, reimport that data. Because the database structure may have changed in the meantime, when importing each row it iterates over each column to see whether it still exists in the current structure, and if it does, it copies the data across.

For tables with lots of columns and many rows this is turning out to be a very expensive operation, and I suspect it is not the best way to do this. So, the question is: does DBISAM have some sort of built-in functionality that might do this more quickly, or is there a third-party component that might do the job far more efficiently?

Cheers,

Paul.
Wed, Nov 14 2007 10:20 AM

"Robert"
When you import the data, do you add/replace rows in the existing tables, or delete the existing data and replace it with the archive? IOW, if you archived records
1 and 3, and you have added record 2 to the current database, after the restore
do you expect to have records 1 and 3, or 1, 2 and 3?

What version of DBISAM?

Robert

"Paul Endersby" <paul.endersby@trilogycomms.com> wrote in message
news:05A14A92-090B-4A4C-9C9D-3FF8338C26BD@news.elevatesoft.com...
> The product that I code for has the ability to archive the current
> database to a ZIP file then at a later date reimport that data. Because
> the database structure may
> have changed meantime, when importing each row it iterates over each
> column to see if it still exists in the current structure, if it does then
> it copies the data
> across.
>
> For tables with lots of columns and many rows this is turning out to be a
> very expensive operation and I suspect not the best way to do this. So,
> question is does
> DBISAM have some sort of inbuilt functionality that may do this quicker or
> a 3rd party component that may do the job far more efficiently?
>
> Cheers,
>
> Paul.
>

Wed, Nov 14 2007 10:24 AM

Paul Endersby
My apologies. The current data is completely purged and the new data is copied in. This is using the latest 4.25 build.

Cheers,

Paul.

"Robert" <ngsemail2005withoutthis@yahoo.com.ar> wrote:

When you import the data, do you add/replace rows on existing tables, or
delete existing data and replace with archive? IOW, if you archived records
1 and 3, and you have added record 2 to the current database, after the restore
do you expect to have records 1 and 3, or 1,2 and 3?

What version of DBISAM?

Robert

"Paul Endersby" <paul.endersby@trilogycomms.com> wrote in message
news:05A14A92-090B-4A4C-9C9D-3FF8338C26BD@news.elevatesoft.com...
> The product that I code for has the ability to archive the current
> database to a ZIP file then at a later date reimport that data. Because
> the database structure may
> have changed meantime, when importing each row it iterates over each
> column to see if it still exists in the current structure, if it does then
> it copies the data
> across.
>
> For tables with lots of columns and many rows this is turning out to be a
> very expensive operation and I suspect not the best way to do this. So,
> question is does
> DBISAM have some sort of inbuilt functionality that may do this quicker or
> a 3rd party component that may do the job far more efficiently?
>
> Cheers,
>
> Paul.
>

Wed, Nov 14 2007 11:00 AM

"Robert"

"Paul Endersby" <paul.endersby@trilogycomms.com> wrote in message
news:8213748C-6E54-4768-B508-72DC535C8D08@news.elevatesoft.com...
> My apologies. The current data is completely purged and the new data is
> copied in. This is using the latest 4.25 build.

What's wrong then with using DBISAM backup and restore? If you use a sequential number in one of the tables' user major version to indicate the
table structure "sequence number", and you store that number somewhere in the
backup description, you can test before you start the restore and take
appropriate action if the backup's tables version is not equal to your
current version. Something like:

  // Store the current structure version in the backup description
  i := MyTable.UserMajorVersion;
  TablesToBackup := TStringList.Create;
  BackupDesc := 'Database Backup ' + DateTimeToStr(Now) +
     ' Tables Version [' + IntToStr(i) + ']';
  wDB := GetDBname;
  wBU := GetBackupname;
  dm.Session.GetTableNames(wDB, TablesToBackup);
  if not dm.db.Backup(wBU, BackupDesc, 6, TablesToBackup)
     then MessageDlg('Backup Failed!', mtError, [mbOK], 0);

Then before you do a restore, you look for the number following [ in the
backup description and compare it to the current MyTable.UserMajorVersion.
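
A minimal sketch of that check; the helper name is illustrative, and the description string is assumed to have already been read back from the backup:

  function BackupMatchesCurrentVersion(const BackupDesc: String;
     CurrentVersion: Integer): Boolean;
  var
     p: Integer;
     VersionStr: String;
  begin
     // Pull out the number between '[' and ']' in the stored description
     Result := False;
     p := Pos('[', BackupDesc);
     if p > 0 then
        begin
        VersionStr := Copy(BackupDesc, p + 1, MaxInt);
        if Pos(']', VersionStr) > 0 then
           VersionStr := Copy(VersionStr, 1, Pos(']', VersionStr) - 1);
        Result := (StrToIntDef(VersionStr, -1) = CurrentVersion);
        end;
  end;

  // Before the restore:
  //   if BackupMatchesCurrentVersion(BackupDesc, MyTable.UserMajorVersion)
  //      then proceed with the restore as-is
  //      else take appropriate action (e.g. run the upgrade SQL below after restoring)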

I do the table upgrades using SQL, which makes it very easy:

ALTER TABLE "MYTABLE.DAT"
USER MAJOR VERSION 202;

ALTER TABLE SOMETABLE ADD SOMEFIELD INTEGER DEFAULT 0;

Robert


Wed, Nov 14 2007 11:02 AM

"Robert"

"Robert" <ngsemail2005withoutthis@yahoo.com.ar> wrote in message
news:B88C950F-A746-41F3-A88E-FD95E62BAAA5@news.elevatesoft.com...
>

In fact, this is a great enhancement; I'll add it to the next version of my
software.

Robert

Wed, Nov 14 2007 11:34 AM

Paul Endersby
And herein lies my problem. We (the company I work for) do not keep an upgrade script from version to version (odd, I know). The table structure is simply altered
within the current release cycle and shipped with the current software. A user can then come along with an older backup of the data, where the structure differed,
and import it into the live system. The code copies in the data where the structure matches and then runs upgrade code on the data to tidy up and fill any
holes or gaps that now exist.

This is why, at present, each column of each row is checked to see whether it exists in the current structure. To me this is an utterly baffling way to do it, but it's what
I'm stuck with, and I've been given the task of trying to speed it up. Unfortunately I don't have the luxury of changing this mechanism, as we have users out there
with old backups of their data that we do not wish to invalidate.

"Robert" <ngsemail2005withoutthis@yahoo.com.ar> wrote:


"Robert" <ngsemail2005withoutthis@yahoo.com.ar> wrote in message
news:B88C950F-A746-41F3-A88E-FD95E62BAAA5@news.elevatesoft.com...
>

In fact, this is a great enhancement; I'll add it to the next version of my
software.

Robert

Wed, Nov 14 2007 11:56 AM

Roy Lambert

NLH Associates

Team Elevate

Paul


1. Restore the backup to a different directory
2. Empty the existing tables
3. Build an SQL script INSERT INTO ..... by reading the field definitions for the restored table and the current structure
4. ExecSQL

Should be faster since you only do the check to see if a field exists once and the SQL "should" be faster for inserting.
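
A rough sketch of step 3, assuming the backup has been restored to RestoreDir, that RestoredTable and LiveTable are TDBISAMTable components pointed at the two copies, and that MyQuery is a TDBISAMQuery on the live database (those names are illustrative, and quoting the full path is simply how I'd reference a table outside the current database directory):

  var
    I: Integer;
    CommonFields, SQL: String;
  begin
    // Work out which columns exist in both the restored and the live structure
    RestoredTable.FieldDefs.Update;
    LiveTable.FieldDefs.Update;
    CommonFields := '';
    for I := 0 to RestoredTable.FieldDefs.Count - 1 do
      if LiveTable.FieldDefs.IndexOf(RestoredTable.FieldDefs[I].Name) <> -1 then
        begin
        if CommonFields <> '' then
          CommonFields := CommonFields + ', ';
        CommonFields := CommonFields + RestoredTable.FieldDefs[I].Name;
        end;

    // One INSERT INTO ... SELECT per table, so the structure check happens once
    SQL := 'INSERT INTO "' + LiveTable.TableName + '" (' + CommonFields + ') ' +
           'SELECT ' + CommonFields + ' FROM "' + RestoreDir + '\' +
           RestoredTable.TableName + '"';
    MyQuery.SQL.Text := SQL;
    MyQuery.ExecSQL;
  end;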

The only problem is if you have new fields or relationships that depend on the data - but you're stuffed either way on that since you'd need a custom restore.

Roy Lambert
Wed, Nov 14 2007 11:57 AM

"Jose Eduardo Helminsky"
Paul

Here in my company we have the same situation, but it does not hurt us,
because each time we ship a new application a special flag is sent along
with it that tells the application to compare the structure of the tables on
disk with the structure defined by an internal dictionary. If they are
different, the application changes the structure, and this process is
transparent both to the users and to us.

We just add fields/indexes and let this mechanism do the rest of the work.
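
A minimal sketch of that kind of check; the ExpectedFields "dictionary" and the helper are illustrative, and it relies on the same ALTER TABLE ... ADD form Robert showed above:

  type
    TExpectedField = record
      Name: String;
      SQLType: String;  // e.g. 'INTEGER DEFAULT 0'
    end;

  const
    // Illustrative "dictionary" of the structure the application expects
    ExpectedFields: array[0..1] of TExpectedField =
      ((Name: 'SOMEFIELD'; SQLType: 'INTEGER DEFAULT 0'),
       (Name: 'OTHERFIELD'; SQLType: 'VARCHAR(30)'));

  procedure UpgradeTableStructure(Table: TDBISAMTable; Query: TDBISAMQuery);
  var
    I: Integer;
  begin
    Table.FieldDefs.Update;  // read the structure that is actually on disk
    for I := Low(ExpectedFields) to High(ExpectedFields) do
      if Table.FieldDefs.IndexOf(ExpectedFields[I].Name) = -1 then
        begin
        // The dictionary says this field should exist, so add it
        Query.SQL.Text := 'ALTER TABLE "' + Table.TableName + '" ADD ' +
          ExpectedFields[I].Name + ' ' + ExpectedFields[I].SQLType;
        Query.ExecSQL;
        end;
  end;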

Eduardo

Wed, Nov 14 2007 12:21 PM

Tim Young [Elevate Software]

Elevate Software, Inc.


Email timyoung@elevatesoft.com

Paul,

<< The product that I code for has the ability to archive the current
database to a ZIP file then at a later date reimport that data. Because the
database structure may have changed meantime, when importing each row it
iterates over each column to see if it still exists in the current
structure, if it does then it copies the data across. >>

I would move the column checking out of the main insertion loop, and instead
use the TField.Tag property to set up a reference to the "old" column in
the "old" table, like this:

// It is assumed that both the new and old tables are open

with MyNewTable do
  begin
  for I:=0 to FieldCount-1 do
     // FindField returns nil when the column no longer exists in the old
     // table, which leaves Tag = 0 for columns with no match
     Fields[I].Tag:=Integer(MyOldTable.FindField(Fields[I].FieldName));
  end;

Then, when performing the import into the new structure, just do this:

with MyNewTable do
  begin
  Append;
  for I:=0 to FieldCount-1 do
     begin
     with Fields[I] do
        begin
        if (Tag <> 0) then
           Assign(TField(Tag));
        end;
     end;
  Post;
  end;
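
For completeness, the surrounding per-row loop would look something like the sketch below (the outer loop over the old table is implied rather than shown above):

// Walk the old table once, copying each row into the new structure
MyOldTable.First;
while not MyOldTable.Eof do
   begin
   MyNewTable.Append;
   for I:=0 to MyNewTable.FieldCount-1 do
      with MyNewTable.Fields[I] do
         if (Tag <> 0) then
            Assign(TField(Tag));
   MyNewTable.Post;
   MyOldTable.Next;
   end;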

It will still be expensive, but not nearly as expensive as before.

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, Nov 22 2007 7:16 AM

Paul Endersby
Sorry for the delay in replying. This method worked an absolute treat and sped things up immensely. However, my application can work in local mode (accessing the files directly on the
machine, not across a file share) and in remote mode, where we have coded our own server using the DBISAM components.

In local mode the import process flies along and takes about two minutes to process the data. However, the same data in remote mode takes in excess of 25 minutes! I have traced
this to the Post call. In local mode each Post takes less than 1 ms to complete, but in remote mode it takes between 10 and 15 ms. Not a lot on its own, but when you
multiply that across 100,000 rows it becomes an issue.

I came across a post on here that mentioned the SaveToStream/LoadFromStream mechanisms, and now, if my schemas are the same version (new code!), I have changed my code so
that it runs the following instead:

var
  msData: TMemoryStream;
begin
  msData := TMemoryStream.Create;
  try
    sTbl.SaveToStream(msData);   // source table
    dTbl.LoadFromStream(msData); // destination table
  finally
    FreeAndNil(msData);
  end;
end;

This works beautifully with largish datasets, and I get very similar times in local and remote modes, but on my largest dataset (89,660 rows x 44 columns) it always throws up an
EPrivilege exception on the LoadFromStream call. I know the docs say not to use this method on large datasets because you will run out of memory, but I'm checking my memory while
this code is running and it's not running out at all. Any insights into this issue? I'm guessing that I'm going to have to break large tables up into smaller bite-sized chunks, but I'm
unsure how to proceed on that front.

Cheers,

Paul.