data replication

Date: 2005-06-11 11:48 pm (UTC)
if a bunch of processes need access to the same data, and it's relatively static data, you might want to look into AFS. it's a bitch to configure (and requires an adequate kerberos infrastructure, for starters), but:

  • it's very cross-platform, with ways to hook it into Windows and many flavors of unix, including MacOS

  • it's designed to handle the replication issues natively -- that is, your various client machines just know that they are fetching data from your AFS cell, but in fact, the same data could be spread across a large number of servers to minimize the load. dunno how well this is supposed to work for more dynamic data, though.

  • it has neat features like automatic backups (snapshots of your data taken at specific points) which can be mounted read-only for recovery while the live system is still running.

  • it's expandable without having to allocate a new filesystem on a single server; for example, you (or, ahem, your administrator) can tell the system "this chunk of data is now going to be stored over here on this new disk on this new machine", and none of the clients need to be reconfigured or anything.
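
in AFS terms, the "chunk of data" in that last point is a volume, and relocating it is a volume move. a rough sketch of building that command from python, assuming the OpenAFS `vos move` command-line tool is what the administrator uses. the function name and the volume, server, and partition names are all made up for illustration, and this is untested against a live cell:

```python
def vos_move_command(volume, from_server, from_partition, to_server, to_partition):
    """Build the argv for an OpenAFS `vos move` call (nothing is executed here)."""
    return [
        "vos", "move",
        "-id", volume,                     # volume name or numeric id
        "-fromserver", from_server,        # fileserver currently holding the volume
        "-frompartition", from_partition,  # partition on the source server
        "-toserver", to_server,            # the new machine
        "-topartition", to_partition,      # the new disk/partition on that machine
    ]


# hypothetical names, for illustration only
cmd = vos_move_command("proj.static", "fs1.example.org", "a", "fs2.example.org", "b")
print(" ".join(cmd))
```

clients keep using the same path under /afs before and after the move, which is the "none of the clients need to be reconfigured" part.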


That said, i've still never managed to get an AFS cell up and running in full, and i've taken a few cracks at it.

but from all the docs i've read, it seems to do (mostly) what i want. if only it encrypted all the traffic more robustly, i'd be happy.

Alternately (more of a hack, less of a principled solution), if you can easily split out your data into static and dynamic sections, you could manually replicate your static sections across a set of N different NFS fileservers (using an hourly rsync cron job or something). then configure each client machine to mount the static export from just one of the N fileservers.
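
one way to do the "mount from just one of the N fileservers" part without hand-assigning clients is to hash the client's hostname. a minimal sketch, assuming the static export lives at the same path on every server; the hostnames, export path, and mount point below are hypothetical:

```python
import hashlib


def pick_fileserver(client_hostname, fileservers):
    """Deterministically map a client to one of the N fileservers."""
    digest = hashlib.sha256(client_hostname.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(fileservers)
    return fileservers[index]


def fstab_line(client_hostname, fileservers,
               export="/export/static", mountpoint="/mnt/static"):
    """Produce a read-only NFS fstab entry for this client (paths are assumptions)."""
    server = pick_fileserver(client_hostname, fileservers)
    return f"{server}:{export} {mountpoint} nfs ro 0 0"


servers = ["nfs1.example.org", "nfs2.example.org", "nfs3.example.org"]
print(fstab_line("client42.example.org", servers))
```

the same client always lands on the same server, and the load spreads roughly evenly as the number of clients grows. the catch is that adding or removing a server reshuffles most of the assignments.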

and if you have data like this that's really static, you could just replicate it (via rsync?) to the local filesystem of each of the client machines whenever the authoritative data source gets modified. that would save you on the network crunch as well as taking a lot of load off the fileserver.
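
the push-to-every-client version could be sketched like this, assuming rsync over ssh is available on all the machines. the paths and hostnames are hypothetical, and it defaults to a dry run that only builds the commands:

```python
import subprocess


def rsync_command(source_dir, host, dest_dir):
    """Build an rsync argv that mirrors source_dir onto host:dest_dir."""
    # trailing slash: copy the directory's contents, not the directory itself
    src = source_dir.rstrip("/") + "/"
    # -a preserves permissions/times; --delete removes files gone from the source
    return ["rsync", "-a", "--delete", src, f"{host}:{dest_dir}"]


def push_to_clients(source_dir, hosts, dest_dir, dry_run=True):
    """Build (and, unless dry_run, execute) one rsync per client machine."""
    commands = [rsync_command(source_dir, h, dest_dir) for h in hosts]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if any rsync fails
    return commands


for cmd in push_to_clients("/srv/static", ["client1", "client2"], "/var/static"):
    print(" ".join(cmd))
```

you'd run this from whatever job updates the authoritative copy, so the clients only get touched when something actually changed.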

i'm sure you've considered solutions like this in some form or another, though. i'd be interested to hear what you come up with. how well can you segregate your static data from your dynamic data?