How We Wronged Neo4j & PostgreSQL: Update of ArangoDB Benchmark 2018

Recently, we published the latest findings of our Performance Benchmark 2018 including Neo4j, PostgreSQL, MongoDB, OrientDB and, of course, ArangoDB. For all databases, we tested bread-and-butter tasks such as single reads, single writes and aggregation in a client/server setup, as well as shortest path queries, which are a speciality of graph databases. Our goal was and is to demonstrate that a native multi-model database like ArangoDB can at least compete with the leading single-model databases on their home turf.

Traditionally, we have been transparent with our benchmarks and have learned plenty from community feedback, and we want to keep it that way. Unfortunately, we did something wrong in our latest version; this update explains what happened and how we fixed it.

NoSQL Performance Benchmark 2018 – MongoDB, PostgreSQL, OrientDB, Neo4j and ArangoDB

ArangoDB, as a native multi-model database, competes with many single-model storage technologies. When we started the ArangoDB project, one of the key design goals was, and still is, to be at least competitive with the leading single-model vendors on their home turf. Only then does a native multi-model database make sense. To prove that we are meeting our goals and are competitive, we occasionally run and publish an update to the benchmark series. This time we included MongoDB, PostgreSQL (tabular & JSONB), OrientDB and Neo4j.

ArangoDB | Thank You for Your Interest in ArangoDB!

“By developers for developers” has been our internal motto since the first lines of ArangoDB code. Building an open-source project at such a level of complexity and to a market-competitive standard undoubtedly puts on a lot of pressure, and it relies almost solely on the support and trust of the community.

Every victory counts, be it small appreciation or big success: it’s what gives us inspiration and keeps us going forward. A while ago we had one of those rainy gray days here in Cologne. Receiving over 10 stars put a smile on the faces of our whole team, motivating us to hack harder, brainstorm, bug fix, build, release…

Today we have reached the 4,000-star mark on GitHub!

Webinar: The native multi-model approach and its benefits for developers, architects and DevOps

Tuesday, May 16th (6PM CEST/12PM ET/ 9AM PT) – Join the webinar here.

This webinar gives a general overview of the multi-model database movement. In particular, we will discuss its main advantages and technological benefits from an architect and DevOps perspective.

Since the first relational databases were invented, the needs of companies and the technological possibilities have changed completely. Luca Olivari (recently announced President of ArangoDB) will dive deep into current trends in the database world, explain how native multi-model databases help companies of all sizes, and walk you through use cases where ArangoDB is beneficial. He will share decades of experience in the field and his views on the ever-changing needs of developers, companies and customers.

Using GraphQL with ArangoDB: A NoSQL Database Solution

GraphQL is a query language created by Facebook for modern web and mobile applications as an alternative to REST APIs. Following the original announcement alongside Relay, Facebook has published an official specification and reference implementation in JavaScript. Recently projects outside Facebook like Meteor have also begun to embrace GraphQL.

Users have been asking us how they can try out GraphQL with ArangoDB. While working on the 2.8 release of our NoSQL database we experimented with GraphQL and published an ArangoDB-compatible wrapper for GraphQL.js. With the general availability of ArangoDB 2.8 you can now use GraphQL in ArangoDB using Foxx services (JavaScript in the database).

A GraphQL primer

GraphQL is a query language that bears some superficial similarities with JSON. Generally GraphQL APIs consist of three parts:

The GraphQL schema is implemented on the server using a library like graphql-sync and defines the types supported by the API, the names of fields that can be queried and the types of queries that can be made. Additionally it defines how the fields are resolved to values using a backend (which can be anything from a simple function call to a remote web service or a database collection).

The client sends queries to the GraphQL API using the GraphQL query language. For web applications and JavaScript mobile apps you can use either GraphQL.js or graphql-sync to make it easier to generate these queries by escaping parameters.

The server exposes the GraphQL API (e.g. using an HTTP endpoint) and passes the schema and query to the GraphQL implementation, which validates and executes the query, later returning the output as JSON.
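
As a rough illustration of the client side, the sketch below builds the payload for a GraphQL request sent over HTTP. It assumes the common convention of a JSON body with `query` and `variables` keys, which a given server may or may not follow; `buildGraphQLRequest` is a hypothetical helper and not part of GraphQL.js or graphql-sync. Passing values as variables avoids splicing them into the query text by hand.

```javascript
// Hypothetical helper: packages a query and its variables into the
// JSON body that many GraphQL-over-HTTP servers accept (an assumption;
// check what your endpoint expects).
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

// The user ID travels as a variable instead of being pasted into the query.
const request = buildGraphQLRequest(
  "query ($id: String!) { user(id: $id) { name } }",
  { id: "1234" }
);
console.log(request.body);
```

The returned object has the shape of the options argument of `fetch`, so it could be handed to an HTTP client as is.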

GraphQL vs REST

Whereas in REST APIs each endpoint represents a single resource or collection of resources, GraphQL is agnostic of the underlying protocols. When used via HTTP it only needs a single endpoint that handles all queries.

The API developer still needs to decide what information should be exposed to the client or what access controls should apply to the data but instead of implementing them at each API endpoint, GraphQL allows centralising them in the GraphQL schema. Instead of querying multiple endpoints, the client can pick and choose from the schema when defining the query and filter the response to only contain the fields it actually needs.

For example, the following GraphQL query:

query {
 user(id: "1234") {
   name
   friends {
     name
   }
 }
}

could return a response like this:

{
 "data": {
   "user": {
     "name": "Bob",
     "friends": [
       {
         "name": "Alice"
       },
       {
         "name": "Carol"
       }
     ]
   }
 }
}

whereas in a traditional REST API accessing the names of the friends would likely require additional API calls and filtering the responses to certain fields would either require proprietary extensions or additional endpoints.
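
To make the field filtering concrete, here is a minimal sketch that projects a record onto a nested selection, much like the response above contains only `name` and `friends.name`. This is an illustration only, not how GraphQL.js is implemented; `project`, the selection format and the sample record (including its `email` fields) are hypothetical.

```javascript
// Illustrative only: projects a plain object onto a nested selection,
// mimicking how a GraphQL response contains just the requested fields.
// `selection` maps field names to `true` (leaf) or a nested selection.
function project(value, selection) {
  if (Array.isArray(value)) {
    return value.map((item) => project(item, selection));
  }
  const result = {};
  for (const [field, sub] of Object.entries(selection)) {
    result[field] = sub === true ? value[field] : project(value[field], sub);
  }
  return result;
}

// Hypothetical record with more fields than the client asked for.
const user = {
  id: "1234",
  name: "Bob",
  email: "bob@example.com",
  friends: [
    { id: "1", name: "Alice", email: "alice@example.com" },
    { id: "2", name: "Carol", email: "carol@example.com" },
  ],
};

const response = project(user, { name: true, friends: { name: true } });
console.log(JSON.stringify(response));
```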

GraphQL Demo Service

If you are running ArangoDB 2.8 you can install the Foxx service demo-graphql from the Store. The service provides a single HTTP POST endpoint /graphql that accepts well-formed GraphQL queries against the Star Wars data set used by GraphQL.js.

It supports three queries:

  • hero(episode) returns the human or droid that was the hero of the given episode or the hero of the Star Wars saga if no episode is specified. The valid IDs of the episodes are "NewHope", "Empire", "Jedi" and "Awakens" corresponding to episodes 4, 5, 6 and 7.
  • human(id) returns the human with the given ID (a string value in the range of "1000" to "1007"). Humans have an id, name and optionally a homePlanet.
  • droid(id) does the same for droids (with IDs "2000", "2001" and "2002"). Droids don't have a homePlanet but may have a primaryFunction.

Both droids and humans have friends (which again can be humans or droids) and a field appearsIn mapping them to episodes (which have an id, title and description).

For example, the following query:

{
 human(id: "1007") {
   name
   friends {
     name
   }
   appearsIn {
     title
   }
 }
}

returns the following JSON:

{
 "data": {
   "human": {
     "name": "Wilhuff Tarkin",
     "friends": [
       {
         "name": "Darth Vader"
       }
     ],
     "appearsIn": [
       {
         "title": "A New Hope"
       }
     ]
   }
 }
}

It's also possible to do deeply nested lookups like "what episodes have the friends of friends of Luke Skywalker appeared in" (but note that mutual friendships will result in some duplication in the output):

{
 human(id: "1000") {
   friends {
     friends {
       appearsIn {
         title
       }
     }
   }
 }
}
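
If the duplication is unwanted, the client can remove it after the fact. The following sketch assumes the response has the shape implied by the query above; `uniqueEpisodeTitles` is a hypothetical helper and the sample response is made up and shortened for illustration.

```javascript
// Collects the distinct episode titles from the nested
// friends-of-friends response; a Set drops the duplicates that
// mutual friendships introduce.
function uniqueEpisodeTitles(response) {
  const titles = new Set();
  for (const friend of response.data.human.friends) {
    for (const friendOfFriend of friend.friends) {
      for (const episode of friendOfFriend.appearsIn) {
        titles.add(episode.title);
      }
    }
  }
  return [...titles];
}

// Hypothetical, shortened response in the shape the query above returns.
const sampleResponse = {
  data: { human: { friends: [
    { friends: [
      { appearsIn: [{ title: "A New Hope" }, { title: "The Empire Strikes Back" }] },
      { appearsIn: [{ title: "A New Hope" }] },
    ] },
    { friends: [
      { appearsIn: [{ title: "The Empire Strikes Back" }] },
    ] },
  ] } },
};

console.log(uniqueEpisodeTitles(sampleResponse));
```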

Additionally it's possible to make queries about the API itself using __schema and __type. For example, the following tells us the "droid" query returns something of a type called "Droid":

{
 __schema {
   queryType {
     fields {
       name
       type {
         name
       }
     }
   }
 }
}

And the next query tells us what fields droids have (so we know what fields we can request when querying droids):

{
 __type(name: "Droid") {
   fields {
     name
   }
 }
}
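
A client could use the result of the first introspection query to build a lookup table of queries and their return types. The sketch below assumes the response mirrors the shape of that query; `queryReturnTypes` is a hypothetical helper, and of the sample entries only the "droid" to "Droid" pair is taken from the text above.

```javascript
// Turns the result of the __schema introspection query into a
// lookup table from query name to the name of its return type.
function queryReturnTypes(introspection) {
  const table = {};
  for (const field of introspection.data.__schema.queryType.fields) {
    table[field.name] = field.type.name;
  }
  return table;
}

// Hypothetical response; the "human" entry is assumed for illustration.
const introspectionResult = {
  data: { __schema: { queryType: { fields: [
    { name: "human", type: { name: "Human" } },
    { name: "droid", type: { name: "Droid" } },
  ] } } },
};

console.log(queryReturnTypes(introspectionResult));
```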

GraphQL: The Good

GraphQL shifts the burden of specifying which subset of information should be returned onto the client. Unlike in traditional REST-based solutions, this is built into the language from the start: clients only see information they explicitly request and don't have to know about anything they're not already interested in.

At the same time a single GraphQL schema can be written to represent the entire global state graph of an application domain without having to hard-code any assumptions about how that data will be presented to the user. By making the schema declarative, GraphQL avoids the duplication and the potential for subtle bugs that building equally exhaustive HTTP APIs would involve.

GraphQL also provides mechanisms for introspection, allowing developers to explore GraphQL APIs without external documentation.

GraphQL is also protocol agnostic. While REST directly builds on the semantics of the underlying HTTP protocol, GraphQL brings its own semantics, making it easy to re-use GraphQL APIs for non-HTTP communication (such as Web Sockets) with minimal effort.

GraphQL: The Bad

The main drawback of GraphQL as implemented in GraphQL.js is that each object has to be retrieved from the data source before it can be queried further. For example, in order to retrieve the friends of a person, the schema has to first retrieve the person and then retrieve the person's friends using a second query.
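
The cost of this object-by-object retrieval can be sketched with an in-memory stand-in for the data source. Everything here is hypothetical: the `people` map, the `lookup` function and the counter exist only to show how the number of retrievals grows. In this sketch each friend is fetched individually; a real resolver might fetch all friends in a single second query instead.

```javascript
// In-memory stand-in for a data source; every call to `lookup`
// represents one round trip to the database.
const people = new Map([
  ["1", { name: "Bob", friendIds: ["2", "3"] }],
  ["2", { name: "Alice", friendIds: ["1"] }],
  ["3", { name: "Carol", friendIds: ["1"] }],
]);
let lookups = 0;
function lookup(id) {
  lookups += 1;
  return people.get(id);
}

// Resolves `user(id) { name friends { name } }` one object at a time:
// first the person, then each of the person's friends.
function resolveUserWithFriends(id) {
  const person = lookup(id);
  const friends = person.friendIds.map((friendId) => lookup(friendId));
  return {
    name: person.name,
    friends: friends.map((friend) => ({ name: friend.name })),
  };
}

const resolved = resolveUserWithFriends("1");
console.log(resolved, "lookups:", lookups);
```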

Currently all existing demonstrations of GraphQL use external databases with ORMs or ODMs, so complex GraphQL queries cause multiple consecutive network requests to an external database. The added cost of network latency, transport overhead, serialization and deserialization makes using GraphQL slow and inefficient compared to an equivalent API using hand-optimized database queries.

This can be mitigated by inspecting the GraphQL Abstract Syntax Tree to determine what fields will be accessed on the retrieved document. However, it doesn't seem feasible to generate efficient database queries ad hoc, which means forgoing a lot of the optimizations otherwise possible with handwritten database queries.
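
As a minimal sketch of that inspection, the function below lists the fields selected directly on a field node, which is the information needed to limit what is read from a retrieved document. The node shape (`name.value`, `selectionSet.selections`) follows the GraphQL.js AST, but the nodes here are built by hand with a hypothetical `field` helper rather than produced by a parser, and fragments are simply ignored.

```javascript
// Lists the field names selected directly on a field node.
// Selections that are not plain fields (e.g. fragments) are skipped.
function selectedFieldNames(fieldNode) {
  if (!fieldNode.selectionSet) return [];
  return fieldNode.selectionSet.selections
    .filter((selection) => selection.kind === "Field")
    .map((selection) => selection.name.value);
}

// Hypothetical helper that builds AST-like field nodes by hand.
const field = (name, selections) => ({
  kind: "Field",
  name: { kind: "Name", value: name },
  selectionSet: selections
    ? { kind: "SelectionSet", selections }
    : undefined,
});

// Hand-built node for: human { name friends { name } }
const humanNode = field("human", [
  field("name"),
  field("friends", [field("name")]),
]);

console.log(selectedFieldNames(humanNode));
```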

Conclusion

Although there doesn't seem to be any feasible way to translate GraphQL requests into database-specific queries (such as AQL), the impact of having a single GraphQL request result in a potentially large number of database requests is much less significant when implementing the GraphQL backend directly inside the database.

While RESTful HTTP APIs are certainly here to stay and GraphQL like any technology has its own trade-offs, the advantages of having a standardized yet flexible interface for accessing and manipulating an application's global state graph are undeniable.

GraphQL is a promising fit for schema-free databases and dynamically typed languages. Instead of having to spread validation and authorization logic across different HTTP endpoints and native database format restrictions, a GraphQL schema can describe these concerns in one place, guaranteeing that sensitive fields are not accidentally exposed and that data formats remain consistent across different queries.

We're excited to see what the future will hold for GraphQL and encourage you to try out GraphQL in the database with ArangoDB 2.8 and Foxx today. Have a look at the demo-graphql from the Store. If you have built or are planning to build applications using GraphQL and ArangoDB, let us know in the comments.

Multi-Model Benchmark: Assessing ArangoDB’s Versatility

Claudius Weinberger, CEO ArangoDB

TL;DR Native multi-model databases combine different data models like documents or graphs in one tool and even allow mixing them in a single query. How can this concept compete with a pure document store like MongoDB or a graph database like Neo4j? A lot of folks in the community and I have asked that question.

So here are some benchmark results: 100k reads → competitive; 100k writes → competitive; friends-of-friends → superior; shortest-path → superior; aggregation → superior.

Feel free to comment, join the discussion on HN and contribute – it’s all on Github.

The latest edition of the NoSQL Performance Benchmark (2018) has been released.

Is Multi-Model the Future of NoSQL? ArangoDB Insights

Here is a slideshare and recording of my talk about multi-model databases, presented in Santa Clara earlier this month.

Abstract: Recently a new breed of “multi-model” databases has emerged. They are a document store, a graph database and a key/value store combined in one program. Therefore they are able to cover a lot of use cases which otherwise would need multiple different database systems. This approach promises a boost to the idea of “polyglot persistence”, which has become very popular in recent years although it creates some friction in the form of data conversion and synchronisation between different systems. The reason is that with a multi-model database one can enjoy the benefits of polyglot persistence without the disadvantages.

In this talk I will explain the motivation behind the multi-model approach, discuss its advantages and limitations, and then risk making some predictions about the NoSQL database market in five years' time.

ArangoDB at NoSQL Matters Paris: Insights & Innovations

If you are interested in NoSQL and come from France, the NoSQL matters conference in Paris is your place to go. ArangoDB contributes with a workshop and a talk and is a silver sponsor of the conference as well. You can meet our team at the exhibition space and ask your ArangoDB questions in person.

Tickets are available for both days, starting at €299 for the conference pass.

Come and meet us at the historic Tapis Rouge venue in the heart of Paris!

Building Single Page Applications with Angular.JS and Foxx

Workshop on March 26th

Angular.JS is Google’s open-source JavaScript framework optimized to build awesome single page applications. Its ease of use has convinced many developers to switch. With MVC in the browser, all you need from your backend is an easy way to define an API for your data. That’s where Foxx excels.

In this training session we will build a simple single page application, showing you how to use Angular.JS and a good way to define your model using Foxx.

Polyglot Persistence & Multi-Model NoSQL Databases

Talk on March 27th (10 am)

In many modern applications the database side is realized using polyglot persistence: each data format (graphs, documents, etc.) is stored in an appropriate separate database. This approach yields several benefits, as each database is optimized for its specific duty, but there are also drawbacks:

  • keep all databases in sync
  • queries might require data from several databases
  • experts needed for all used systems

A multi-model database is not restricted to one data format, but can cope with several of them. In this talk I will present how a multi-model database can be used in a polyglot persistence setup and how it drastically reduces the effort.

Data Modeling: MongoDB vs ArangoDB | ArangoDB Blog

MongoDB is a document DB whereas ArangoDB is a multi-model DB supporting documents, graphs and key/values within a single database. When it comes to data modeling and data querying, they pursue somewhat different approaches.


In a Nutshell: In MongoDB, data modeling is “aggregate-oriented”, avoiding relations and joins. On the other hand, everybody has probably used relational databases, which organize the data in tables with relations and try to avoid as much redundancy as possible. Both approaches have their pros and cons. ArangoDB is somewhat in-between: you can model and query your data both in a “relational way” and in an “aggregate-oriented way”, depending on your use case. ArangoDB offers joins, nesting of sub-documents and multi-collection graphs.

CAP, Google Spanner, & Survival: Eventual Consistency | ArangoDB

In “Next gen NoSQL: The demise of eventual consistency”, a recent post on Gigaom, FoundationDB founder Dave Rosenthal proclaims the demise of eventual consistency. He argues that Google Spanner “demonstrates the falsity of a trade-off between strong consistency and high availability”. In this article I show that Google Spanner does not disprove CAP, but rather chooses one of many possible compromises between total consistency and total availability. For organizations with a less potent infrastructure than Google's, other compromises might be more suitable, and therefore eventual consistency is still a very good idea, even for future generations of NoSQL databases.
