Additional Features of ArangoDB Enterprise
The major goal of the ArangoDB 3 development phase is solid scalability with all supported data models. Version 3.0 introduced a completely overhauled cluster architecture and our agency, which together ensure high availability and no single point of failure in a cluster environment for all users. Together with our binary storage format VelocyPack, this created a basis for upcoming innovations.
The Enterprise Edition of ArangoDB focuses on solving enterprise-scale problems and secure work with data. In version 3.1, we introduced SmartGraphs to bring fast traversal response times to sharded datasets in a cluster. With 3.2, we bring you our latest innovation called SatelliteCollections.
- SatelliteCollections: Faster join operations when working with sharded datasets. Avoid expensive network hops by replicating collections to each machine so that joins run locally
- SmartGraphs: Scale with graphs into a cluster and stay performant. This unique feature enables you to explore entirely new spheres in graph usage and provides graph traversal performance close to that of a single-instance setup
- Enhanced Encryption: Choose your level of SSL encryption and use AES encryption for your data on disk
- Enhanced Authentication with LDAP: use an external server to manage your users
- Auditing: Keep a detailed log of everything that was written or read in ArangoDB
SatelliteCollections enable faster join operations when working with sharded datasets, in particular if you need a join operation between a very large collection (sharded across your cluster) and a small one.
SatelliteCollections are non-sharded collections that are replicated to each machine in your cluster. The ArangoDB query optimizer knows where each shard is located and sends requests to the DBServers involved, which then execute the query locally. The DBServers then send their partial results back to the Coordinator, which puts together the final result. With this approach, network hops during join operations on sharded collections can be avoided, so query performance increases and network traffic is reduced.
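The effect of replicating the small collection can be illustrated with a toy model. The sketch below is not ArangoDB code: the function name, the modulo sharding rule and the sample data are assumptions made purely for illustration. It counts how many join lookups have to leave the server that holds a given document of the large collection.

```python
def count_remote_lookups(orders, num_servers, small_collection_home=None):
    """Count join lookups that must leave the server holding the order.

    orders: list of (order_id, product_id) pairs, sharded by order_id.
    small_collection_home: index of the single server that holds the small
        collection, or None if it is replicated to every server.
    """
    remote = 0
    for order_id, _product_id in orders:
        server = order_id % num_servers  # toy sharding rule (assumption)
        # A lookup is remote only if the small collection lives elsewhere.
        if small_collection_home is not None and server != small_collection_home:
            remote += 1
    return remote


# Hypothetical data: 12 orders spread over 3 servers.
orders = [(order_id, order_id % 5) for order_id in range(12)]

single_home = count_remote_lookups(orders, num_servers=3, small_collection_home=0)
replicated = count_remote_lookups(orders, num_servers=3)

print(single_home)  # 8 lookups cross the network
print(replicated)   # 0 lookups cross the network
```

In this model, every lookup becomes local once the small collection is present on each server, which is the situation SatelliteCollections are meant to create.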
For more details, see our blog post: The new Satellite Collections Feature of ArangoDB
When the data-set for a graph exceeds the limits of what you can host on a single instance of ArangoDB, you need to scale. However, sharding a graph through a cluster introduces new issues. When using standard graphs, traversals can involve many network hops between database servers. As edges carry the traversal onto different machines, performance worsens.
SmartGraphs solve this issue by optimizing the distribution of data between the shards, reducing the number of edges that require network hops to other servers.
Scaling with Graphs
The Community Edition of ArangoDB can handle large data-sets on a single instance, allowing you to scale vertically without issue. It can also handle scaling horizontally to a cluster with all three data models. However, you may begin to encounter performance issues when, in scaling horizontally, you shard a graph through the cluster.
Picture a graph that handles a large data-set, such as you might find in, say, an IoT, finance, communications, healthcare or genomics application. The natural distribution of data involves a series of highly interconnected communities with many edges running between these communities.
Traversing graphs on this scale can take you through billions or even trillions of vertices. That amount of data is far too much to fit on a single machine and whenever an edge takes you from one machine to another, performance bottlenecks on the network connection. If an edge on the second machine takes you back to the first or out to a third, it grows worse still. The more network hops the traversal requires, the greater the network latency, which can grow very expensive compared to in-memory computations. Eventually, performance degrades to a point where it’s no longer suitable for your given use case.
Scaling with SmartGraphs
Performance issues when traversing sharded graphs relate to network latency. The more network hops your traversal requires, the less benefit you get from horizontal scaling. With ArangoDB Enterprise Edition you benefit from SmartGraphs, solving the network latency issues of traversals by using the smartness of your application layer.
A graph knows nothing about its own structure, but your application knows a lot about it. In many data-sets there are highly interconnected communities, but few connections between these communities. Whatever logic you apply to organize your graph at the application layer, such as customers or regions, can in turn be used to shard the graph through the cluster.
SmartGraphs use the smartness of your application layer to optimize how they shard data through the cluster, for instance by customer ID, region or any other attribute that fits your main queries. With this knowledge, you can place highly connected communities within your graph on specific instances.
By optimizing the distribution of data, SmartGraphs reduce the number of network hops traversals require. Internal tests show a 40-120x performance gain when traversing sharded graphs.
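To make the idea of community-aware sharding concrete, here is a minimal sketch under stated assumptions. It is a toy model, not how ArangoDB places data internally: the region names, vertex keys, round-robin placement and server count are all hypothetical. It compares how many edges cross server boundaries when vertices are spread without regard to community and when each community is kept together.

```python
from itertools import combinations

NUM_SERVERS = 3

# Hypothetical graph: three communities of four vertices each.
communities = {
    region: [f"{region}-{i}" for i in range(4)]
    for region in ("eu", "us", "asia")
}

# Dense edges inside each community, only two edges between communities.
edges = [
    pair
    for members in communities.values()
    for pair in combinations(members, 2)
]
edges += [("eu-0", "us-0"), ("us-1", "asia-1")]

all_vertices = [v for members in communities.values() for v in members]

# Placement 1: spread vertices round-robin, ignoring communities
# (a stand-in for sharding by an arbitrary key).
round_robin = {v: i % NUM_SERVERS for i, v in enumerate(all_vertices)}

# Placement 2: keep every community on one server.
by_community = {
    v: index % NUM_SERVERS
    for index, members in enumerate(communities.values())
    for v in members
}


def cross_server_edges(edge_list, placement):
    """Count edges whose two endpoints live on different servers."""
    return sum(1 for a, b in edge_list if placement[a] != placement[b])


naive_hops = cross_server_edges(edges, round_robin)
smart_hops = cross_server_edges(edges, by_community)

print(naive_hops, "of", len(edges))  # 17 of 20
print(smart_hops, "of", len(edges))  # 2 of 20
```

In this small example only the two edges that connect different communities still require a network hop once each community is placed on its own server.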
Enterprise Edition users can now work on completely new use cases or further optimize current graph-based applications. If you’d like to know more about how you can set up such a high-performance cluster with two clicks, contact us.
ArangoDB Community Edition supports the use of SSL/TLS to encrypt communications with the database and between database instances in a cluster.
Enterprise Edition users have the option of taking this a step further with Enhanced Encryption. This allows you to configure ArangoDB to use only TLS 1.2, ensuring that your databases always use the highest level of security standards in your production environments.
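The server-side setting has a natural counterpart on the client. The following sketch uses only Python's standard `ssl` module and shows one way a client application could refuse anything other than TLS 1.2 when connecting to a database endpoint. The function name is hypothetical, and the sketch says nothing about how ArangoDB itself is configured.

```python
import ssl


def tls12_only_context():
    """Build a client-side TLS context that accepts TLS 1.2 and nothing else."""
    context = ssl.create_default_context()
    # Pin both ends of the allowed protocol range to TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.maximum_version = ssl.TLSVersion.TLSv1_2
    return context


context = tls12_only_context()
```

Such a context can be passed to any standard-library call that takes an `ssl.SSLContext`, for example `urllib.request.urlopen(url, context=context)`.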
Encryption at Rest
Protecting sensitive data in your database under all circumstances requires encryption at the transport layer, but also encryption of the data on disk. Using the RocksDB storage engine, you can encrypt the data stored on disk in ArangoDB using a highly secure AES algorithm. Even if someone steals one of your disks, they won’t be able to access the data.
With this feature, ArangoDB takes another big step towards HIPAA compliance.
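On-disk encryption depends on a secret key that the server reads at startup. As a hedged sketch, the code below generates a random key file with Python's standard library. The 32-byte length (a 256-bit AES key), the file name and the helper function are assumptions made for illustration. The ArangoDB documentation for your version describes the exact key format and the startup option that points the server at the key file (to the best of our knowledge, `--rocksdb.encryption-keyfile`).

```python
import os
import secrets
import tempfile

KEY_LENGTH_BYTES = 32  # assumption: a 256-bit AES key


def write_key_file(path):
    """Write a fresh random key to `path`, accessible to the owner only."""
    key = secrets.token_bytes(KEY_LENGTH_BYTES)
    # O_EXCL makes the call fail instead of overwriting an existing key file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)
    return path


# Hypothetical location; in production the key belongs on separate, protected storage.
key_path = write_key_file(os.path.join(tempfile.mkdtemp(), "arangodb-secret-key"))
```

Keeping the key file away from the data disks matters here: the protection against a stolen disk only holds if the key is not stored on that same disk.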
Normally, users are defined and managed in ArangoDB itself. Starting with the Enterprise Edition 3.2, you can use an external server to manage your users with LDAP. We have implemented a common schema which can be extended.
In terms of both compliance and forensic analysis of data breaches, auditing is an important tool. ArangoDB audit logs provide an irrefutable record of actions taken, whether they are generated by a database, directory, or operating system.
The ArangoDB audit log records the following actions:
- Database creation and deletion
- Collection creation and deletion
- Index creation and deletion
- Read access to documents
- Query alterations
Want to know more? Let us show you the power of ArangoDB Enterprise Edition and how we can contribute to your project with our 20+ years of database experience. Request a demo or an introduction call via the form below.