UCS Fault Tolerant File sharing

vbrice.adminsys · August 24, 2016, 1:29pm

Hi everybody,

I’m working on a fault-tolerant UCS setup with 1 master, 2 backup and 2 site systems, and I’ve run into a tricky case.

I’d like to make my file sharing system INSIDE UCS fault tolerant. If my master fails, LDAP, Kerberos and DNS are replicated and handed over to the backup systems.
But I’d like to have the same behaviour for file sharing and printer sharing.

I’ve thought about using a second network interface to provide a VIP that dynamically points to the running server, but I’m not able to add a generic sharing server with the reverse-lookup name of that VIP.

Am I heading in the right direction? And if so, what am I supposed to add to the LDAP tree so that my VIP is seen as a real additional UCS server?

Regards,

vbrice

Moritz_Bunkus · September 7, 2016, 8:01am

Hey,

well, having files synchronized across multiple servers isn’t a trivial task. There are several layers at which you can implement a solution:

- At the storage layer: use a centralized SAN or NAS system that’s mounted by all servers, e.g. a Synology DiskStation (as a cheap solution, e.g. using iSCSI for the transport) or something like a Dell PowerVault MD3200 as a professional and performant one. That way the data is stored centrally on a single device.
- At the block layer: you can use DRBD (distributed replicated block device) for implementing data synchronization across servers. This means that the data is stored on each of the servers. The synchronized data appears as a regular block device on each server, and you can use any kind of file system (or the logical volume manager) on top of that device. One thing to note about this approach is that you must only have the file system mounted on one node at a time. Therefore one usually uses a cluster manager like Heartbeat or Pacemaker in tandem with DRBD: Pacemaker checks the servers for reachability, and if it determines that the currently active node isn’t reachable any longer, it’ll activate its configured resources (e.g. the aforementioned DRBD) on a remaining node. This is far outside of the domain of UCS, though.
- At the file system layer: there are several distributed file systems available, e.g. Ceph or GlusterFS. The purpose of these file systems is that they can be active (i.e. mounted) on each of the participating nodes simultaneously. Modifications can occur on any of the nodes and will be replicated to all the other participating nodes. This is far outside of what UCS usually provides, too.
- At the application layer: you could set up synchronization software like Seafile or AeroFS. They usually require a central server component as well as a client on each of the participating servers.
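
To make the failover idea from the block-layer option a bit more concrete, here is a toy sketch in Python of the decision a cluster manager has to make. It is not how Pacemaker is implemented (real cluster managers also deal with quorum and fencing); the `Node` class, the `reachable` flag and `choose_active` are made-up names for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    name: str
    reachable: bool  # assumption: a real cluster derives this from heartbeats


def choose_active(nodes: Sequence[Node], current: Optional[str]) -> Optional[str]:
    """Return the one node that should hold the resources (e.g. the mounted file system)."""
    by_name = {node.name: node for node in nodes}
    # Stay on the current node while it is reachable, to avoid needless failovers.
    if current in by_name and by_name[current].reachable:
        return current
    # Otherwise move to the first reachable node, in configured priority order.
    for node in nodes:
        if node.reachable:
            return node.name
    return None  # nobody is reachable: the resources stay down


nodes = [Node("master", True), Node("backup1", True), Node("backup2", True)]
active = choose_active(nodes, None)      # "master"
nodes[0].reachable = False               # the master goes down
active = choose_active(nodes, active)    # "backup1"
```

The important property is that the function only ever returns a single node, which mirrors the rule that the file system may be mounted on one node only.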

None of these solutions are trivial to implement, and that’s because the whole topic is a complex one. None of those solutions are first class citizens in UCS either, meaning that UCS doesn’t provide any tools for managing such setups beyond what those projects themselves offer.

Kind regards,
mosu

drzzzzz · October 19, 2016, 4:36pm

What about using something like csync2 to rsync the files? It’s great for syncing config files between servers and I’ve actually used it (quite some time ago!) to sync user home directories across a WAN! I would think it would be easy to deploy on the UCS nodes (although I haven’t tried…)

Regards
drzzzzz

Moritz_Bunkus · October 20, 2016, 7:31am

Hey,

csync2’s main focus is one-to-many synchronization of files, not two-way synchronization. It is well-suited for deploying a (large) number of files from one location to multiple other locations. But it falls woefully short for bidirectional synchronization (keeping the same set of files identical when you’ve got changes on both sides).

I’ve used csync2 in a bidirectional synchronization with several custom scripts in the past, but you still had to take care of conflicts (if the same file has been modified on both ends between two synchronization points) by human intervention. Human intervention means the use of the csync2 command-line application on the servers to resolve the conflicts, which only an administrator can do. It just doesn’t scale for such a task.
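
The conflict case can be illustrated with a small Python sketch. This is a generic three-way comparison against the state recorded at the last synchronization point, not csync2’s actual algorithm; the snapshot format (path mapped to a content hash) and the name `classify` are assumptions made up for the example.

```python
from typing import Dict, Optional

# Assumption: a snapshot maps a path to a content hash; a missing path means "no such file".
Snapshot = Dict[str, Optional[str]]


def classify(base: Snapshot, left: Snapshot, right: Snapshot) -> Dict[str, str]:
    """Decide per file what a two-way sync would have to do since the last sync point."""
    result = {}
    for path in set(base) | set(left) | set(right):
        b, l, r = base.get(path), left.get(path), right.get(path)
        if l == r:
            result[path] = "in_sync"             # unchanged, or changed identically
        elif l == b:
            result[path] = "copy_right_to_left"  # only the right side changed
        elif r == b:
            result[path] = "copy_left_to_right"  # only the left side changed
        else:
            result[path] = "conflict"            # both changed differently: a human decides
    return result


base = {"/home/a/notes.txt": "h1", "/share/plan.odt": "h1"}
left = {"/home/a/notes.txt": "h2", "/share/plan.odt": "h3"}
right = {"/home/a/notes.txt": "h1", "/share/plan.odt": "h4"}
actions = classify(base, left, right)
# The home directory file changed on one side only; the shared document is a conflict.
```

Everything except the `conflict` branch can be automated. That branch is the one that needs an administrator in the setup described above.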

csync2 is a tool that works at what I’ve called the “application layer” earlier. It shares these problems with certain other solutions on that layer (e.g. unison), which is why I didn’t mention it earlier. Other applications on that layer (e.g. ownCloud, AeroFS) make the human intervention easier by providing both copies of the same file and integrating conflict-resolution tools into their desktop client applications. That way normal users can resolve the conflicts and not just tech-savvy admins.

Edit 1: You’ve said it works fine for home directories. That’s actually one of the use cases where you can get good mileage out of csync2 because the sets of files modified on each end are usually disjoint: user A logged on to machine B will only write to /home/a whereas user C on machine D will only write to /home/c at the same time. There won’t be any conflicts. But this just isn’t the same as a shared document folder used by different people on different machines at the same time.

Don’t get me wrong, I like csync2, it does what it’s been designed to do very well. But it’s not the right tool for this job. Don’t use a screwdriver to drive in a nail etc.

Edit 2: After re-reading the original post: tools like rsync, unison or csync2 might be a good solution for what was actually asked for, namely a read-only replica of the files that’s only activated once the source server fails. If this is the case (if it’s guaranteed that there will be no modifications on the mirror site), then those tools can be used to build mirrors with multi-minute update granularity.
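
For illustration, the one-way mirror semantics can be sketched in a few lines of Python. This is only a minimal sketch of the idea, with a made-up `mirror` function; it ignores symlinks, ownership and ACLs, and in practice rsync or csync2 run periodically would be the better choice.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source: Path, replica: Path) -> None:
    """Make replica an exact copy of source; changes made on the replica are discarded."""
    replica.mkdir(parents=True, exist_ok=True)
    source_names = {entry.name for entry in source.iterdir()}
    # Remove everything on the replica that no longer exists on the source.
    for entry in replica.iterdir():
        if entry.name not in source_names:
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    for entry in source.iterdir():
        target = replica / entry.name
        if entry.is_dir():
            if target.exists() and not target.is_dir():
                target.unlink()  # a file is in the way of a directory
            mirror(entry, target)
        else:
            if target.is_dir():
                shutil.rmtree(target)  # a directory is in the way of a file
            # Copy only new files or files whose content differs.
            if not target.exists() or not filecmp.cmp(entry, target, shallow=False):
                shutil.copy2(entry, target)
```

Run from a scheduler every few minutes, something like this gives exactly the multi-minute granularity mentioned above: whatever was written on the source after the last run is lost if the source fails.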

Kind regards,
mosu