Thread: Replication
Fri, Oct 31 2014 5:14 AM

Richard Beeston

Lablogic Systems Ltd

Hello,
I need a little help with replication... I followed the sales quote database example and have set up a couple of databases, Main and Branch, but the replication doesn't seem to work the way I want/expect. I'm not using jobs/scripts; this is all within Delphi code. I need to be able to add extra branches at any point in time, not just define them once at the start. I also need to be able to alter the database structure (add tables, columns, etc.), though that's for the future, as I haven't got the basic replication working yet.

Problem.
If I make a change at both the main site and the branch site to the same row/column combination and then replicate, then after a few iterations the updates stop flowing between the two sites, but the values at main/branch aren't the same.

e.g.
Main and Branch starting values: 0, 0
Change Branch to 8 and leave Main alone.
Sync a few times to allow all the updates to fly about the system.
Main and Branch both end up as 8, 8.
Change Branch to 6 and Main to 4 at the same time, i.e. with no syncing in between.
Now sync a few times.
End result is Main=6, Branch=4
i.e. the values have swapped places?! I want them to end up at the same value, i.e. the database should be consistent everywhere. I don't mind that it's the last write which persists, as long as the values are the same.

What's going wrong?

Notes.
I use RESET PUBLISHED TABLES on restore at the branch site during setup. I've tried with this on and off and it seems to have zero effect?! I have tried PUBLISH at the branch site and that also has zero effect. I can't find the Information.Tables entries to check the manifests; where are they hiding? I see only the ElevateDB tables, not the ones I've made.

Quick description of the program/operation...

Initial setup is
Create main site, create stores, create tables, publish the entire database, insert standing data into tables.

Prepare to make a branch site (can be done at any time to add a new branch):
At Main: synchronise everything, remove old outgoing updates, create branch stores, back up the database for the branch, create standing data for the branch, synchronise to create the first outgoing sync data for the new branch.

At the branch site, initial setup...
Create stores, retrieve main site backup, restore backup, synchronise first changes.
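
Roughly, in EDB SQL terms, the branch setup is along these lines (typed from memory, with placeholder store/backup/database names, so the exact clauses may not match the manual word for word; in my case the statements are issued from the Delphi code):

-- Stores for backups and for incoming/outgoing update files (paths are just examples)
CREATE STORE "Backups" AS LOCAL PATH 'C:\Branch\Backups';
CREATE STORE "Incoming" AS LOCAL PATH 'C:\Branch\Incoming';
CREATE STORE "Outgoing" AS LOCAL PATH 'C:\Branch\Outgoing';

-- Restore the backup taken at the main site; this is where I use RESET PUBLISHED TABLES
RESTORE DATABASE "Main" FROM "MainBackup" IN STORE "Backups" INCLUDE CATALOG RESET PUBLISHED TABLES;

-- Then pull in the first outgoing sync data created for this branch at the main site
LOAD UPDATES FOR DATABASE "Main" FROM STORE "Incoming";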

Then everything is set up for synching, which is:

Main site...
Loop through each site, look at all incoming site transactions in date order, load them into the main database.
Create main site outgoing updates, copy these into branch stores ready for their next connection.

Branch site...
Save outgoing transactions, send to main site,
look at all incoming transactions in date order, load them into the branch database.
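
At the SQL level each leg of that cycle is just SAVE UPDATES / LOAD UPDATES plus a file copy, roughly like this (placeholder store and file names; the looping over sites, the date ordering and the actual file transfer all live in the Delphi code):

-- Branch: write the local changes to an update file in its outgoing store
SAVE UPDATES FOR DATABASE "Main" TO STORE "Outgoing";

-- Main: apply the update files that have arrived from a branch
LOAD UPDATES FOR DATABASE "Main" FROM STORE "BranchA_In";

-- Main: save its own changes and drop a copy into each branch's store for its next connection
SAVE UPDATES FOR DATABASE "Main" TO STORE "MainOut";
COPY FILE "Main-0001.EDBUpd" IN STORE "MainOut" TO "Main-0001.EDBUpd" IN STORE "BranchA_Out";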

Hope someone can help; this is driving me bonkers, aaaargh.

Richard
Tue, Nov 4 2014 9:49 AM

Richard Beeston

Lablogic Systems Ltd

Having looked through the source code, I have concluded that the replication doesn't work, and won't work, where two different sites update the same row/column combination at the same time. It will always end up with inconsistent data. So BEWARE! Ensure that your sites can only update their "own" data, even if they can see all data. This is probably by design, since conflict resolution, detection, etc. are rather hard to implement.

Back to the drawing board for me for this particular project... aaaargh
Tue, Nov 4 2014 11:57 AM

Adam Brett

Orixa Systems

Richard

EDB's replication is decent, though it is a bit long-winded to manage, needing a lot of scripting, stores and jobs.

I have it successfully working for about 5 customers, but all of them have fairly "segmented" data (i.e. Office 1 enters different rows from Office 2). However customers do regularly update the same row (not at the same time!) and I do not see conflicts when this happens.

If you are really, really careful about having a central site which acts as a replication "hub" (I use a cloud db-server) and pass ALL updates through it, so that all are applied in strict time-added order, then EDB _does_ cope well with repeated and multiple updates to the same row, as updates are all applied in the correct order.

There are still issues of overwriting data, but IMHO these can and should be resolved by restructuring the database to produce structures where "new" additions result in new rows rather than edits to existing rows.
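
As a made-up illustration (the table and column names are only for the example): instead of every site editing the same row, e.g.

UPDATE StockLevels SET Quantity = Quantity + 5 WHERE ItemID = 42;

each site only ever inserts its own rows and the current figure is derived when needed:

-- Each site adds movement rows; replicated inserts from different sites never hit the same row
INSERT INTO StockMovements (ItemID, SiteID, MovedOn, Quantity)
VALUES (42, 'BranchA', CURRENT_TIMESTAMP, 5);

-- The "current" value is just a query over the movements
SELECT SUM(Quantity) AS CurrentLevel FROM StockMovements WHERE ItemID = 42;

That way the order in which updates are applied matters far less.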

--

The issue of updating and changing the database structure is _much_ more of a problem though!!!

Any significant change to DB structure has to be managed with the _whole_ db system "stopped" (no user access) and with all pending updates applied before any changes are made, i.e. with replication "off".

Then a change script can be run on all instances and the replication process can be restarted.

If you have any kind of 24/7 installation, this isn't really workable.

--

I wouldn't throw out the idea of replication immediately; it is really powerful.

Adam
Wed, Nov 5 2014 4:42 AM

Richard Beeston

Lablogic Systems Ltd

Hi Adam,
Thanks for the reply

> I have it successfully working for about 5 customers, but all of them have fairly "segmented" data (i.e. Office 1 enters different rows from Office 2). However customers do regularly update the same row (not at the same time!) and I do not see conflicts when this happens.

I don't get this; how often are you syncing/replicating? I've basically made a little test system and all I do is update one integer value in a table. If I do the changes at one site and let the replication run through completely at all sites, then everything is fine. If I do an update and then another elsewhere at any point in the replication process, then I end up with mismatched data at the sites.

I've attached a diagram showing the process... which explains the scenario (I hope).

> If you are really, really careful about having a central site which acts as a replication "hub" (I use a cloud db-server) and pass ALL updates through this, so all are applied in strict time-added-order then EDB _does_ cope well with repeated and multiple updates to the same row, as updates are all applied in the correct order.

I don't get how you do this. I process the update files in creation date order, per site. Do you put them all in one store and do ALL sites at once in creation order? But that won't work if, for example, Site B goes away for a week of work and then comes back and replicates. Those updates won't be "in order".

> The issue of updating and changing the database structure is _much_ more of a problem though!!!
> Any significant change to DB structure has to be managed with the _whole_ db system "stopped" (no user access) and with all pending updates applied before any changes are made, i.e. with replication "off".

Haven't got that far in my testing... that was my next hurdle once I had a replication system up.

For my current project I might be able to use things as they are, since the data at the various sites is very separated. It would be very unusual for someone to want to update another site's data. But the next project wouldn't be like that, so I'd like to get this sorted out, or at a minimum be able to detect the conflict and inform the user.

Richard



Attachments: replication.png
Fri, Nov 7 2014 5:35 AM

Adam Brett

Orixa Systems

Richard Beeston wrote:

>>I've attached a diagram showing the process... which explains the scenario (i hope).

I think your problem here may be a simple one of the order in which you are saving / loading updates.

Line 2 of your PNG.
Branch A makes a change at t0.
Branch B makes a change at t1.

You show the update BA -> Main Site, but not BB -> Main Site (row 3 of the PNG).

Basically the Main Site should not sync back (your row 4) until it has received updates from BA and BB.

If that is done then BB's change at t1 follows through correctly, and all sites end up with BB's value "2".

--

>>I don't get how you do this. I process the update
>>files in creation date order, per site do you put
>>them all in one store and do ALL sites at once
>>in creation order?

On the central cloud site (which manages all the updates) all update files land in one store and are loaded in order.

The central site then saves one update file into one store, and copies this update file into the stores for ALL branches.

Each branch downloads all the update files in _its_ store. That means all Branches always get their files in order.
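
Schematically the hub's pass looks something like this (store names and the file name are placeholders; in my case it is all driven by scripts and jobs):

-- Every branch's files land in the one incoming store and are applied in order
LOAD UPDATES FOR DATABASE "Main" FROM STORE "HubIncoming";

-- One consolidated update file is then saved and copied into every branch's own store
SAVE UPDATES FOR DATABASE "Main" TO STORE "HubOutgoing";
COPY FILE "Main-0001.EDBUpd" IN STORE "HubOutgoing" TO "Main-0001.EDBUpd" IN STORE "BranchA";
COPY FILE "Main-0001.EDBUpd" IN STORE "HubOutgoing" TO "Main-0001.EDBUpd" IN STORE "BranchB";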


>>But that won't work if, for example, Site B goes away
>>for a week working then comes back and replicates.
>>Those updates won't be "in order".

Site B will get its updates "in order"; however, it may post outdated edits to some records, i.e. its update file will cover a long period. It is important to ensure that before they go away on holiday Site B sends all its updates. That way you don't have ancient updates coming back around the system.

And, as soon as they log back on after a break, Site B should run updates to get up-to-date before they do anything else, so they are then on the most current version of the system.

My sites tend not to close down in the way you describe, but I do have "roving sales people" who connect rarely. With these rarely connected users I do take care that the records they are editing and updating are well separated in the database, so other users are very unlikely to touch them.

--

I have made a couple of requests to Tim for future versions to make it easier to query and update / edit UPDATE files. I am not sure where he is with these requests, but once they appear in the product it will be easier to handle more complex replication situations.

It is important to remember that replication can't do impossible stuff. If one user changes a record and then another user makes a change, the system has to apply that change, even if newer data is overridden with older data.

The key issue you don't want to see is de-synchronisation. With EDB it is still too easy to get de-synchronisation (as your example demonstrates) if updates are saved and loaded in even a slightly out-of-sync way.

This is partly because replication was created to solve several problems; for example, you can use replication in place of backup for really large databases which are very slow to back up, as replication allows you to copy over only the changes, which are usually smaller files.

I would emphasize reviewing your data architecture once you understand what is possible / easy / difficult with replication, and perhaps adapting it a bit to make it more robust to the changes that go with a replication setup. This will probably make it more powerful in other ways too.
Fri, Nov 7 2014 10:30 AM

Richard Beeston

Lablogic Systems Ltd

Hi Adam,

Thanks for the reply. Unfortunately I can't control the branch site synchronisation times. They will do things as and when they feel like it (salesmen!), so I can't control when things appear from them. This means I can't pause main site processing to wait for updates to be synchronised, as you suggest. I will have a think about not sending updates until I see branch transactions, but that will need more thought. Since I can't control incoming syncs, my example was correct: A syncs, Main syncs, B syncs, A syncs... etc. It would be nice to be able to say it's always going to be A, B, M, A, B, M, but I can't think of a way to do that with salesmen on the road and without network access. I plan on having the main site run periodically (haven't decided on the period yet) but the branch sites will be ad hoc.

The spanner in the works is basically the first ever updates... these aren't "in sequence" with everything else, but it's also the only way of doing this. I thought about saving the branch updates, then restoring the database back to pre-update, and then processing all the transactions (branches and main) in order, which would have worked if Elevate didn't ignore the changes it has already seen, i.e. the local transactions.

I have fiddled about and got a working setup by tweaking how Elevate processes its replies at the branch site. This works in all my scenarios and always ends up with consistent data. Basically, the last writer wins any of these conflicting changes.

The "process all sites at once" suggestion was basically the trick I needed!

Main site... (just like the sales example, except it does all the sites at once, not site by site)
1. Saves local updates to a file.
2. Places this file in each branch site's outgoing store, ready for the next time they connect.
3. Copies all incoming transactions from all branch sites into one incoming store.
4. Loads everything from the incoming store into the main database, in CreatedOn order.
5. Saves the incoming updates to a file.
6. Sends this file to each branch site (as per step 2).

Branch sites...
1. Saves local transactions to a file
2. Sends to main site
3. Copies all incoming transactions from main site to local store.
4. Loads all incoming transactions into database, in CreatedOn order *

* - the only differences here from standard Elevate are...
a) Incoming update rows which would previously have been skipped, because they carry the publish ID of this local branch site, are re-processed. This puts back any data overwritten by sequence anomalies, and results in consistent data system-wide instead of maybe the odd site having one different value.
b) When processing incoming updates, no outgoing updates are created, so I don't send these changes out again and cause an infinite loop.
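
To put the main site pass (steps 1-6 above) into EDB SQL terms: for a hypothetical two-branch setup it is roughly the following (store and file names are placeholders, and in reality my Delphi code loops over whatever branches exist and feeds the files in CreatedOn order).

-- 1. Save the main site's local updates
SAVE UPDATES FOR DATABASE "Main" TO STORE "MainOutgoing";

-- 2. Copy that file into each branch site's outgoing store
COPY FILE "Main-0001.EDBUpd" IN STORE "MainOutgoing" TO "Main-0001.EDBUpd" IN STORE "ToBranchA";
COPY FILE "Main-0001.EDBUpd" IN STORE "MainOutgoing" TO "Main-0001.EDBUpd" IN STORE "ToBranchB";

-- 3. Gather all the branches' incoming files into the one incoming store
COPY FILE "BranchA-0001.EDBUpd" IN STORE "FromBranchA" TO "BranchA-0001.EDBUpd" IN STORE "AllIncoming";
COPY FILE "BranchB-0001.EDBUpd" IN STORE "FromBranchB" TO "BranchB-0001.EDBUpd" IN STORE "AllIncoming";

-- 4. Load the lot into the main database
LOAD UPDATES FOR DATABASE "Main" FROM STORE "AllIncoming";

-- 5/6. Save the updates that result and distribute them as per step 2
SAVE UPDATES FOR DATABASE "Main" TO STORE "MainOutgoing";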

The only "oddity" i have found with this is the main site transactions may be overwritten by "older" data. This is basically because any entries written directly into the main site database (not coming from a branch site) are going to be "out of sequence". Just like the original spanner which causes all this to occur. I might go like you and have a central hub which no one accesses except via synchronisation. And a "main" site which acts like a branch, even though its hosted centrally and never roams. Will have to ponder on that one.

The other thing I have to be aware of is that triggers fired on updates wouldn't cause outgoing transactions. But I have no triggers at present; I'll jump that hurdle if I ever reach it!

I use local databases at the branch sites, i.e. no DB server running locally, so I can have the tweaked Elevate code just in my EXEs. The main site can run the standard Elevate server.

I did think of tweaking Elevate to add a new option to the LOAD UPDATES command, so I could control the behaviour instead of forcing it:

LOAD UPDATES ...
REPROCESS LOCAL UPDATES
NO REPLIES

But that would be harder to do... I haven't delved into the Elevate SQL parsing code.

I'm now about to start fighting with DDL statements and how to propagate these. A whole new can of worms, eh!

Cheerio

Richard