Using arangodump and arangorestore with a multi-database installation
When you want to manage full backups, migrate your data between different instances, or make breaking changes to how your cluster data is structured, you will likely find yourself using our arangodump and arangorestore tools. Typically these tools dump or restore a single database at a time, and default to the _system database.
To dump or restore a different database, each tool accepts the --server.database option. For instance, if you have a database foo, you can dump it as follows.
arangodump --server.database foo
Now, what if you have several databases? In releases prior to ArangoDB 3.5, you would have to run the command once for each database, specifying the name of each database. You could script this, fetching the list of database names and dumping each database to its own directory; similarly you could scan for each dump directory and restore each individually. We decided this was too much work, so we added a simple flag to automate the behavior for you.
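For comparison, the manual approach described above might look something like the following sketch. The function names are made up for illustration, and the database names are passed in by hand rather than fetched from the server. It assumes arangodump and arangorestore are on your PATH and that connection options such as --server.endpoint are added as needed.

```shell
#!/bin/sh
# Sketch of the manual, per-database approach. The function names are
# illustrative and not part of ArangoDB.

# Dump every database named on the command line into <dump_root>/<name>.
dump_each_database() {
    dump_root=$1
    shift
    for db in "$@"; do
        arangodump --server.database "$db" \
            --output-directory "$dump_root/$db" || return 1
    done
}

# Restore every subdirectory of <dump_root> into a database of the same name.
restore_each_database() {
    dump_root=$1
    for dir in "$dump_root"/*/; do
        [ -d "$dir" ] || continue   # the glob matched nothing
        dir=${dir%/}                # strip the trailing slash
        db=${dir##*/}               # directory name doubles as database name
        # --create-database asks arangorestore to create a missing target.
        arangorestore --server.database "$db" \
            --input-directory "$dir" \
            --create-database true || return 1
    done
}
```

With these defined, `dump_each_database dump _system foo` followed by `restore_each_database dump` would process the two databases one after the other.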
Both utilities now take the --all-databases option. In order to dump all databases serially and automatically, use a command like the following (adding your own options like --server.endpoint if necessary).
arangodump --all-databases true
This will dump each database into its own subfolder in the dump directory, so that it can automatically be restored with the following command (again with any necessary additional options like --server.endpoint).
arangorestore --all-databases true
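As a slightly fuller sketch, here are the same two commands with an explicit endpoint and dump directory. The endpoint value, the directory name, and the wrapper function names are placeholder assumptions; substitute whatever matches your installation.

```shell
#!/bin/sh
# Placeholder values (assumptions): point these at your own server and path.
ENDPOINT="tcp://127.0.0.1:8529"
DUMP_DIR="dump"

# Dump all databases, one subfolder per database, under $DUMP_DIR.
dump_all_databases() {
    arangodump --all-databases true \
        --server.endpoint "$ENDPOINT" \
        --output-directory "$DUMP_DIR"
}

# Restore all databases found under $DUMP_DIR.
restore_all_databases() {
    arangorestore --all-databases true \
        --server.endpoint "$ENDPOINT" \
        --input-directory "$DUMP_DIR"
}
```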
Limitations
This approach will not guarantee a consistent snapshot in time across databases. It is functionally equivalent to retrieving the list of databases yourself and executing the dump or restore command for each database in sequence.
Accordingly, this will not result in any speedup, as it operates on the list serially, not in parallel.
The --threads option may parallelize operations across collections within a single database at a time. This option is compatible with the --all-databases flag.
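Combining the two options might look like the sketch below. The wrapper function and its default of 4 threads are arbitrary choices for illustration, not a recommendation.

```shell
#!/bin/sh
# Dump all databases serially, using several threads within each database.
# The default thread count here is an arbitrary example value.
dump_all_with_threads() {
    threads=${1:-4}
    arangodump --all-databases true --threads "$threads"
}
```

Calling `dump_all_with_threads 8` would request eight threads instead of the example default.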
In the case of a cluster, we have a script, under the Fast Cluster Restore section of our arangorestore documentation, which can distribute the restoration of each collection within a database across multiple coordinators. Using the --all-databases flag will break this script.