Talk: Using mruby with ArangoDB (@Cologne Ruby User Group)

Frank gave a short lightning talk on using mruby in ArangoDB (at that time still called AvocadoDB) at the Cologne Ruby user group.
(more…)

Preliminary Performance Tests for mruby and V8

Note: We changed the name of the database in May 2012. AvocadoDB is now called ArangoDB.

I’m still investigating the possibility of using mruby as an embedded language for AvocadoDB; see my last post. I managed to create an interactive shell to play with mruby. Now I am trying to run some performance tests.
(more…)

Towards an interactive mruby shell

In my last post I investigated the possibility of using mruby as an embedded language for AvocadoDB. As the first results look quite promising, I decided to write a small interactive mruby shell. There is no better way to explore a new toy than to play with it.
(more…)

Using Minimalistic Ruby as alternative to server-side JavaScript

Introduction

One of the design goals of AvocadoDB is:

Use AvocadoDB as an application server and fuse your application and database together for maximal throughput (more…)

RFC – The ArangoDB/AvocadoDB query language

The REST API for AvocadoDB is already available and stable, and people are writing APIs using it. Awesome. As AvocadoDB offers more complex data structures like graphs and lists, REST is not enough. Some time ago we implemented a first version of a query language that is very similar to SQL and UNQL.
(more…)

Tutorial for ArangoDB’s PHP API published

Hey there, short notice: Jan has written a tutorial on how to use the PHP API for AvocadoDB (note: this is “just” the API for the REST interface; it does not yet cover the upcoming query language). You can find both the PHP API and the tutorial on GitHub. We are making progress with other languages as well… but that’s something for another blog post (cliffhanger 😉 ).

Is UNQL Dead?

UNQL started with quite some hype last year. However, after a burst of activity the project came to a halt. So it seems that, at least as a project, UNQL has been a failure. IMHO one of the major issues with the current UNQL is that it tries to cover everything in NoSQL, from key-value stores to document stores to graph databases. Basically you end up with the greatest common divisor, namely key-value access. But with graph structures and document structures you really want to support joins, paths or some sort of sub-structures.

Apart from all the technical and theoretical benefits of SQL and the advantages its underlying theory has to offer, the major plus from a user’s point of view is that it is readable. You can simply look at an SQL statement, whether it is embedded in C, Java or Ruby, and understand what is going on. It is declarative, not imperative. With imperative solutions, such as a fluent interface or map-reduce, you need to understand the underlying syntax or language. With SQL you only need to understand English, at least most of the time.
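To make the readability argument concrete, here is a minimal sketch in Python that answers the same question twice: once imperatively with a loop, once declaratively with an SQL statement run through the standard-library `sqlite3` module. The `users` rows, the table layout and both function names are made up for illustration only; they are not taken from AvocadoDB or UNQL.

```python
import sqlite3

# Hypothetical sample data: (name, age, city)
users = [("alice", 31, "Cologne"), ("bob", 25, "Berlin"), ("carol", 42, "Cologne")]


def names_imperative(rows):
    """Imperative style: spell out *how* to find the matching names."""
    result = []
    for name, age, city in rows:
        if city == "Cologne" and age > 30:
            result.append(name)
    result.sort()
    return result


def names_declarative(rows):
    """Declarative style: state *what* is wanted and let the engine do the rest."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (name TEXT, age INTEGER, city TEXT)")
    con.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    cursor = con.execute(
        "SELECT name FROM users WHERE city = 'Cologne' AND age > 30 ORDER BY name"
    )
    names = [row[0] for row in cursor]
    con.close()
    return names
```

Both functions return the same list, but the SQL string can be read almost like an English sentence, regardless of the host language it is embedded in.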

And here I think is where UNQL is totally right. We need something similar for the NoSQL world. But it should not try to be a “fits all situations” solution. It should be a fit for 80% of the problems. For simple key-value stores a fluent interface is indeed enough. For very complex graph traversals a traversal program must be written. For very complex map-reduces you might need to write a program, but check out Google’s talk (www.nosql-matters.org/program) about NoNoSQL. There they describe why they are developing an SQL-like interface for Map/Reduce.

In my experience, most of the time you have a set of collections holding different “types” of documents with some relations between them. One of the biggest advantages of document stores and graph databases is that you can have lists and sub-objects. The problem with SQL is that it has no good way to deal with these structures. So I believe UNQL would be quite successful if it concentrated on these strong advantages of NoSQL instead of trying to unify everything, especially after hearing Jan’s talk about a document query language at the NoSQL Cologne UG (an English version is also available).

Martin on skip list indices and why we use them in ArangoDB

Last week AvocadoDB got mentioned in “nosql weekly” and the project attracted a huge amount of public interest, especially from Japan. Awesome! 🙂

In this context Mr. Fiber asked on Twitter what the use of skip list indices in AvocadoDB is. Here’s a short video reply by chief architect Martin Schoenert. Got an opinion? We’d love to hear your thoughts in the comments.

skip list index from NoSQL matters on Vimeo or on Youtube

Using Hilbert curves and Polyhedrons for Geo-Indexing

Cambridge mathematician Richard R. Parker presents a novel algorithm he has developed using a Hilbert curve and Polyhedrons to efficiently implement geo-indexing.

Under a microscope: how ArangoDB stores data in RAM and still keeps it consistent in case of a server crash

AvocadoDB uses append-only memory-mapped files with frequent fsync. Derived data (indices, etc.) is stored in main memory only. This article explains why that particular combination leads to high performance and consistent data at the same time, even in case of a system failure.
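The general idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not AvocadoDB’s actual datafile format: the class name `AppendOnlyLog`, the record layout (two length fields followed by key and value) and the fixed `capacity` are all hypothetical, and `mmap.flush()` stands in for the sync step. Records are only ever appended, and the index is derived data that lives in memory and is rebuilt by scanning the file on startup.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record header: key length and value length, 4 bytes each.
HEADER = struct.Struct("<II")


class AppendOnlyLog:
    """Toy append-only datafile on top of a memory-mapped file."""

    def __init__(self, path, capacity=4096):
        if not os.path.exists(path):
            # Preallocate a zero-filled file so it can be mapped.
            with open(path, "wb") as f:
                f.truncate(capacity)
        self.file = open(path, "r+b")
        self.map = mmap.mmap(self.file.fileno(), 0)  # map the whole file
        self.index = {}  # derived data: kept in main memory only
        self.end = 0     # offset where the next record will be written
        self._rebuild_index()

    def _rebuild_index(self):
        """Scan the file from the start; later records override earlier ones."""
        pos = 0
        while pos + HEADER.size <= len(self.map):
            key_len, value_len = HEADER.unpack(self.map[pos:pos + HEADER.size])
            if key_len == 0:  # zero-filled space marks the end of the data
                break
            key_start = pos + HEADER.size
            key = self.map[key_start:key_start + key_len]
            self.index[key] = (key_start + key_len, value_len)
            pos = key_start + key_len + value_len
        self.end = pos

    def append(self, key, value):
        """Write a new record at the end; existing bytes are never modified."""
        if not key:
            raise ValueError("key must not be empty")
        record = HEADER.pack(len(key), len(value)) + key + value
        if self.end + len(record) > len(self.map):
            raise ValueError("datafile is full")
        self.map[self.end:self.end + len(record)] = record
        self.map.flush()  # push the dirty pages to disk
        self.index[key] = (self.end + HEADER.size + len(key), len(value))
        self.end += len(record)

    def get(self, key):
        offset, length = self.index[key]
        return self.map[offset:offset + length]

    def close(self):
        self.map.close()
        self.file.close()


path = os.path.join(tempfile.mkdtemp(), "demo.db")
log = AppendOnlyLog(path)
log.append(b"user:1", b"v1")
log.append(b"user:1", b"v2")  # an update is just another appended record
log.close()                    # the in-memory index is gone now

reopened = AppendOnlyLog(path)      # rebuilds the index by scanning the file
latest = reopened.get(b"user:1")    # the later record wins
reopened.close()
```

The sketch flushes after every append for simplicity and ignores torn writes; a real system would batch the sync calls and validate records (for example with checksums) while rebuilding the index.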

Classical database systems – a bulk of data and insufficient main memory

Put simply, there are three possible settings regarding databases:

  • Setting 1: All data fits into the main memory.
  • Setting 2: The complete data pool does not fit into the main memory all at once, but the main memory is large enough to store all the data accessed in an average time span.
  • Setting 3: Even the sub-set of data accessed in an average time span is too large for the main memory.

Classical database systems had to cope with setting 3 because main memory was too expensive to store the majority of data.

Basically, classical database systems had to manage main memory themselves. To handle data sets that exceeded the capacity of main memory, they needed sufficiently intelligent algorithms that the system software couldn’t provide (for example, to stream the data through main memory for full table scans).

(more…)
