Please note that this tutorial is valid for the ArangoDB 3.3 milestone 1 version of DC to DC replication!
This milestone release contains data-center to data-center replication as an enterprise feature. This is a preview of the upcoming 3.3 release and is not considered production-ready.
To prepare for a major disaster, you can set up a backup data center that takes over operations if the primary data center goes down. For the failure of a single server, the resilience features of ArangoDB can be used. Data-center to data-center replication handles the failure of a complete data center.
Data is transported between data centers using a message queue. The current implementation uses Apache Kafka as the message queue. Apache Kafka is a commonly used open source message queue that is capable of handling multiple data centers. However, ArangoDB replication is not tied to Apache Kafka. We plan to support different message queue systems in the future.
The following contains a high-level description of how to set up data-center to data-center replication. Detailed instructions for specific operating systems will follow shortly.
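To make the idea of queue-based transport more concrete, here is a minimal sketch in Python. It does not use Kafka or any ArangoDB API: the `DataCenter` class, the change tuples, and the in-memory `queue.Queue` are all assumptions made for illustration, standing in for the real clusters and the real message queue.

```python
import queue
import threading


class DataCenter:
    """Hypothetical stand-in for a cluster: just a key-value store."""

    def __init__(self, name):
        self.name = name
        self.documents = {}

    def apply(self, change):
        # A change is an (operation, key, value) tuple; this format is an
        # assumption of the sketch, not ArangoDB's wire format.
        op, key, value = change
        if op == "put":
            self.documents[key] = value
        elif op == "delete":
            self.documents.pop(key, None)


def publish(changes, channel):
    """Source side: push every change onto the queue, then a stop marker."""
    for change in changes:
        channel.put(change)
    channel.put(None)  # sentinel telling the consumer to stop


def consume(channel, target):
    """Target side: apply changes in the order they arrive."""
    while True:
        change = channel.get()
        if change is None:
            break
        target.apply(change)


primary = DataCenter("primary")
backup = DataCenter("backup")
channel = queue.Queue()  # in-memory substitute for the message queue

changes = [("put", "a", 1), ("put", "b", 2), ("delete", "a", None)]
for change in changes:
    primary.apply(change)

# The backup consumes in the background while the primary keeps running.
consumer = threading.Thread(target=consume, args=(channel, backup))
consumer.start()
publish(changes, channel)
consumer.join()

print(backup.documents == primary.documents)
```

Because the two sides only share the queue, the transport can in principle be swapped for another message queue system without touching the apply logic, which is the decoupling the paragraph above describes.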
comes with asynchronous master-slave replication. The new replication feature should make it much easier to create a backup from a running ArangoDB server. For example, a second ArangoDB instance can now be used as a slave by cloning all data from the master. The slave will be populated in the background while the master is running and accepting requests – not disrupting the master operations.
UPDATE: ArangoDB 2 introduced sharding! 🙂
Original blog post:
In ArangoDB’s Google group there was recently an interesting discussion on what ArangoDB should offer in terms of replication and sharding. For those of you who do not follow the posts in the group, I have copied Frank Celler’s answer into this post:
We will start with a master-slave, asynchronous replication for 1.4. This has at least the following advantages:
- It is a good fit for most use cases.
- It will allow us to implement backup as “slave”.
- It easily gives you redundancy by setting up multiple instances.
- It gives you read-scaling.
There are also drawbacks. For example, you need to manually select and switch masters in case of fail-over. However, restricting ourselves to a simple solution (which is still hard enough to implement) should allow us to release V1.4 this summer. If you think about MySQL, you will see that in most cases master-slave replication is sufficient.
The next step will be master-master replication. This, however, requires more complex protocols like Paxos to elect a master, and at least three nodes. We have to decide whether this will be in version 1.5 or maybe already 2.0. We have to see how much has to be changed.
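As a side note on the "at least three nodes" requirement: majority-based protocols such as Paxos need more than half of the nodes to agree. The short sketch below only shows that arithmetic. The function names are made up for illustration, and it is not an implementation of Paxos or of anything in ArangoDB.

```python
def majority(nodes):
    """Smallest number of nodes that is more than half of the cluster."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes may fail while a majority can still be formed."""
    return nodes - majority(nodes)


for n in (2, 3, 5):
    print(n, "nodes: majority", majority(n), "tolerates", tolerated_failures(n), "failure(s)")
```

With two nodes the majority is both of them, so no failure can be tolerated. Three is the smallest cluster size that survives the loss of one node.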