Document Update with arangoimp: ArangoDB Data Management

Inspired by the feature request in Github issue #1298, we added update and replace support for ArangoDB’s import facilities.

This extends both ArangoDB’s HTTP REST API for importing documents and the arangoimp binary, so they can not only insert new documents but also update existing ones.

Inserts and updates can also be mixed in a single import run. What exactly happens is configurable via arangoimp’s new command-line option --on-duplicate.

By default, an error will be reported if a document with the same key already exists. This behavior can be changed by setting --on-duplicate to update, replace or ignore. Here is an example result of an import with duplicated keys:

> arangoimp --file data.json --collection users --on-duplicate update

created:          1
warnings/errors:  0
updated/replaced: 2
ignored:          0
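
The numbers above could result from an input file like the following sketch (hypothetical data, one JSON document per line; assume the users collection already contains documents with the keys "john" and "jane"):

{ "_key": "john", "status": "active", "logins": 10 }
{ "_key": "jane", "status": "inactive" }
{ "_key": "mike", "status": "active" }

With --on-duplicate update, the two existing documents are partially updated with the new attribute values, while the document with the previously unknown key "mike" is created.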

So, if you want to aggregate data from several data files, you can try the new import command-line option --on-duplicate.

In a blog post, Jan provides a few usage examples.

ArangoDB 2.5.2 Release: Enhanced Features & Stability

A maintenance release of ArangoDB 2.5 is available for download. The latest v2.5.2 comes with cluster speedups, some fixes in cluster mode and improved graph queries.

You can start an ArangoDB cluster on Digital Ocean with a single command; other cloud services will follow in the coming days. Please try the new release, perhaps on Digital Ocean with the new cluster setup script, and give us your feedback.

Which cloud provider should we support next?

ArangoDB DigitalOcean Cluster: Scalable and Efficient Deployment

It is often difficult and time-consuming to set up a cluster environment for development or production purposes. For this reason, we decided to make the initial setup as easy as possible for you.

Today we’re introducing the first part of our new deployment tool for cloud computing platforms (Edit: now also available for Amazon Web Services and Google Compute Engine):

Part 1: Digital Ocean

We’ve released our first prototype, which deploys an ArangoDB Cluster on Digital Ocean. Just download a single bash script, export your Digital Ocean API Token and watch the tool take care of the rest for you.

wget https://raw.githubusercontent.com/ArangoDB/deployment/publish/DigitalOcean_ArangoDB_Cluster.sh
chmod 755 DigitalOcean_ArangoDB_Cluster.sh
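
A run could then look like the following sketch (the exact environment variable the script expects for the API token is an assumption; check the script header or the repository README for the name it actually reads):

export TOKEN="your-digitalocean-api-token"   # hypothetical variable name and placeholder value
./DigitalOcean_ArangoDB_Cluster.sh           # provisions the droplets and starts the cluster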


ArangoDB Release Candidate: Testing New Features & Stability

How often have you typed

var db = require("internal").db;

in the arangod console?

If you are familiar with the arangosh JavaScript shell, then you probably use a custom .arangosh.rc startup script in your home directory that defines the extra variables and functions you need often.

Now we’ve also added support for a file .arangod.rc that will be executed on server start. For example, you could put the following into the .arangod.rc file:

internal = require("internal");
fs = require("fs");
db = internal.db;
time = internal.time;
// measure the wall-clock time (in seconds) it takes to execute cb
timed = function (cb) {
  var s = time();
  cb();
  return time() - s;
};
print = internal.print;
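
With these definitions loaded at server start, you can, for example, time an arbitrary snippet directly in the arangod console:

timed(function () { db._collections(); });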

You’ll never have to go through the history again to re-add your favourite functions. (Available in the devel branch, coming to the next release soon.)

More Efficient Data Exports with new Export API

ArangoDB 2.6 provides a specialized export API for exporting all documents from a collection and shipping them to a client application. It is rather limited but faster than the general-purpose AQL cursor API and can store its snapshots using less memory.

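A rough sketch of how a client might call it over HTTP (endpoint and options as in the 2.6 HTTP export API; host, database and collection name are placeholders):

curl -X POST "http://localhost:8529/_db/_system/_api/export?collection=users" \
     --data '{ "flush": true, "batchSize": 1000 }'

The response contains the first batch of documents together with a cursor id that can be used to fetch the remaining batches.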

A side effect of the speedup is that the first results will arrive much earlier in the client application. This helps reduce client connection timeouts in cases where clients enforce them on temporarily non-responding connections.

Improved Cursor API: ArangoDB Query Efficiency Boost

This week we pushed some modifications for ArangoDB’s cursor API into the devel branch. The change will result in less copying of AQL query results between the AQL and the HTTP layers. As a positive side effect, this will reduce the amount of garbage collection the built-in V8 has to do.

These modifications should improve the cursor API performance significantly for many cases, while at the same time keeping its REST API stable. Client programs do not need to be adjusted to reap the benefits. In a blog post, Jan shows some first unscientific performance tests comparing the old cursor API with its new, improved implementation.

Enhancements for Data Modification Queries: ArangoDB Updates

Data-modification queries were enhanced in ArangoDB 2.4 so they can also return the inserted, updated or removed documents. For example, the following statement inserts a few documents and also returns them with all their attributes:

FOR i IN 1..10
  INSERT { value: i } IN test
  LET inserted = NEW
  RETURN inserted

The syntax for returning documents from data-modification queries only supported exactly the format above: a LET clause was required, and the RETURN clause was limited to returning the variable introduced by the LET. These syntax restrictions have been lifted in the devel branch, which will eventually become release 2.6.

The changes make returning values from data-modification statements easier and also more flexible.
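
For illustration, here is a sketch of what the relaxed syntax permits: the LET can be dropped and an arbitrary expression, such as a single attribute of each inserted document, can be returned:

FOR i IN 1..10
  INSERT { value: i } IN test
  RETURN NEW.value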

Upsert Operations in ArangoDB: Efficient Data Management

This week saw the completion of the AQL UPSERT command. This command will be very helpful in a lot of use cases, including the following:

  • ensure that a document exists
  • update a document if it exists, otherwise create it
  • replace a document if it exists, otherwise create it

The UPSERT command is executed on the server side and thus spares client applications from issuing a fetch command followed by a separate, conditional UPDATE or INSERT command.

The general format of an UPSERT statement is:

UPSERT search-document
INSERT insert-expression
UPDATE update-expression
IN collection-name
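
For illustration, a typical counter-style invocation might look like this (pageViews and its attributes are hypothetical names):

UPSERT { page: "index.html" }
INSERT { page: "index.html", views: 1 }
UPDATE { views: OLD.views + 1 }
IN pageViews

If a document with page "index.html" already exists, its views counter is incremented; otherwise a new document with views set to 1 is created.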

Jan collected a few example invocations of UPSERT in his blog.

Is Multi-Model the Future of NoSQL? ArangoDB Insights

Here is a slideshare and recording of my talk about multi-model databases, presented in Santa Clara earlier this month.

Abstract: Recently a new breed of “multi-model” databases has emerged. They combine a document store, a graph database and a key/value store in one program and are therefore able to cover many use cases that would otherwise require multiple different database systems. This approach promises a boost to the idea of “polyglot persistence”, which has become very popular in recent years although it creates some friction in the form of data conversion and synchronisation between different systems. With a multi-model database, one can enjoy the benefits of polyglot persistence without these disadvantages.

In this talk I explain the motivation behind the multi-model approach, discuss its advantages and limitations, and then venture some predictions about the NoSQL database market in five years’ time.

Graphs in data modeling 

Max wrote an inspiring article about graphs in data modeling on Medium, packed with his own thoughts – “to sort out some things in my brain” (Max).

He asks and answers the question: Are graphs and graph databases useful in data modeling, and if so, for what and under which circumstances?

In his article, he goes all the way from the theoretical question of what a graph is, to storing a graph in different storage models (RDBMS, document store and graph database), to querying a graph, and finally to his personal conclusion.
