
Community Notebook Challenge

Calling all Community Members! 🥑

Today we are excited to announce our Community Notebook Challenge.

What is our Notebook Challenge, you ask? This blog post will bring you up to speed and get you excited to participate for a chance to win the grand prize: a pair of custom Apple AirPods Pro.


July 2021: What’s the Latest with ArangoDB?

Estimated reading time: 6 minutes

Hello Community,

Welcome to the seventh ArangoDB newsletter of 2021! We hope you are enjoying summer as safely as you can.

In this edition, we are excited to share: 


Introducing ArangoDB 3.8 – Graph Analytics at Scale

Estimated reading time: 5 minutes

We are proud to announce the GA release of ArangoDB 3.8!

With this release, we improve many analytics use cases we have been seeing – both from our customers and open-source users – with the addition of new features such as AQL window operations, graph and Geo analytics, as well as new ArangoSearch functionality.


If you want to get your hands on ArangoDB 3.8, you can either download the Community or Enterprise Edition, pull our Docker images, or start a free trial of our managed service ArangoGraph.

As with any release, ArangoDB 3.8 comes with many improvements, bug fixes, and features. Feel free to browse through the complete feature list in the release notes to appreciate all the work which has gone into this release.

In this blog post, we want to focus on some of the highlights, including AQL Window Operations, Weighted Graph Traversals, the Pipeline Analyzer, and Geo Support in ArangoSearch.

AQL Window Operations

The WINDOW keyword can be used for aggregations over related rows, usually the preceding and/or following rows.

The WINDOW operation performs a COLLECT AGGREGATE-like operation on a set of query rows. However, whereas a COLLECT operation groups multiple query rows into a single result group, a WINDOW operation produces a result for each query row:

  • The row for which function evaluation occurs is called the current row
  • The query rows related to the current row over which function evaluation occurs comprise the window frame for the current row

There are two syntax variants for WINDOW operations:

  • Row-based (evaluated across adjacent documents)
  • Range-based (evaluated across value or duration range)
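
To make the two variants concrete, here is a minimal Python sketch of the frame semantics. It is a plain in-memory model, not AQL and not how ArangoDB implements WINDOW; the function names row_window and range_window and the sample readings are made up for illustration.

```python
def row_window(rows, preceding, following, value_key):
    """Row-based frame: aggregate over adjacent rows of an already sorted list."""
    out = []
    for i, row in enumerate(rows):
        frame = rows[max(0, i - preceding): i + following + 1]
        out.append({**row, "rollingSum": sum(r[value_key] for r in frame)})
    return out

def range_window(rows, preceding, following, range_key, value_key):
    """Range-based frame: aggregate over rows whose range_key lies within
    [current - preceding, current + following]."""
    out = []
    for row in rows:
        lo, hi = row[range_key] - preceding, row[range_key] + following
        frame = [r for r in rows if lo <= r[range_key] <= hi]
        out.append({**row, "rollingSum": sum(r[value_key] for r in frame)})
    return out

# Hypothetical sensor readings with a gap between time 1 and time 5
readings = [
    {"time": 0, "val": 10},
    {"time": 1, "val": 20},
    {"time": 5, "val": 30},
    {"time": 6, "val": 40},
]

# Each row plus the one row before it, regardless of the time gap
by_row = row_window(readings, preceding=1, following=0, value_key="val")

# Each row plus all rows at most 2 time units earlier
by_range = range_window(readings, preceding=2, following=0,
                        range_key="time", value_key="val")
```

Both functions return one result per input row, which is the difference from a COLLECT-style grouping. The two variants diverge at the reading with time 5: the row-based frame still includes the previous row, while the range-based frame does not, because that row is more than 2 time units away.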


Weighted Graph Traversals

Graph traversals in ArangoDB 3.8 support a new traversal type, "weighted", which enumerates paths by increasing weights.

The cost of an edge is read from the edge attribute named by the weightAttribute option.

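// x: vertex, v: edge, p: path of each enumerated traversal step
FOR x, v, p IN 0..10 OUTBOUND "places/York" GRAPH "kShortestPathsGraph"
  OPTIONS {
    order: "weighted",
    weightAttribute: "travelTime",
    uniqueVertices: "path"
  }
  // return the vertex keys and per-edge travel times of each path
  RETURN { path: p.vertices[*]._key, travelTimes: p.edges[*].travelTime }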

The previous traversal option bfs is deprecated; the preferred way to start a breadth-first search is now order: "bfs". Depth-first search remains the default when no order is specified, and it can also be requested explicitly with order: "dfs".
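
As a rough illustration of what "enumerating paths by increasing weights" means, here is a small Python sketch that uses a priority queue. It is only a model of the idea, not ArangoDB's traversal code; the function name weighted_paths, the graph layout, and the travel times are hypothetical, and it assumes non-negative edge weights.

```python
import heapq
from itertools import count

def weighted_paths(graph, start, min_depth=0, max_depth=10):
    """Yield (total_weight, path) in order of non-decreasing total weight.

    graph maps a vertex to a list of (neighbour, weight) pairs.
    """
    tie = count()  # tie-breaker so the heap never has to compare paths
    heap = [(0, next(tie), [start])]
    while heap:
        weight, _, path = heapq.heappop(heap)
        depth = len(path) - 1
        if depth >= min_depth:
            yield weight, path
        if depth < max_depth:
            for nxt, w in graph.get(path[-1], []):
                if nxt not in path:  # mirrors uniqueVertices: "path"
                    heapq.heappush(heap, (weight + w, next(tie), path + [nxt]))

# Hypothetical travel times, not taken from the kShortestPathsGraph data
travel = {
    "York": [("London", 3), ("Leeds", 1)],
    "Leeds": [("London", 1)],
    "London": [],
}
paths = list(weighted_paths(travel, "York"))
```

In this toy graph the detour via Leeds (total weight 2) is emitted before the direct York to London edge (weight 3), which a breadth-first or depth-first order would not guarantee.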

ArangoSearch Pipeline & AQL Analyzers


ArangoSearch added a new Analyzer type, "pipeline", for chaining the effects of multiple Analyzers into one. This makes it possible, for example, to combine text normalization for case-insensitive search with n-gram tokenization, or to split text at multiple delimiting characters and then apply stemming.
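
The chaining idea can be sketched in a few lines of Python. This is a conceptual model only, not the ArangoSearch implementation or its configuration format; the helper names normalize, ngrams, and pipeline are made up for this example.

```python
def normalize(tokens):
    """Lower-case every token (stand-in for a case-insensitive normalization step)."""
    return [t.lower() for t in tokens]

def ngrams(tokens, n=3):
    """Split every token into overlapping n-grams of length n."""
    return [t[i:i + n] for t in tokens for i in range(len(t) - n + 1)]

def pipeline(*steps):
    """Chain steps so that the output of one step feeds the next."""
    def run(text):
        tokens = [text]
        for step in steps:
            tokens = step(tokens)
        return tokens
    return run

# Normalization first, then trigram tokenization
analyze = pipeline(normalize, ngrams)
tokens = analyze("Quick")
```

Because normalization runs before tokenization, "Quick" and "QUICK" produce the same trigrams, which is what makes the combined search case-insensitive.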

Furthermore, the new Analyzer type "aql" is capable of running an AQL query (with some restrictions) to perform data manipulation/filtering. For example, a user can define a soundex analyzer for phonetically similar term search:

arangosh> var analyzers = require("@arangodb/analyzers");
arangosh> var a = analyzers.save("soundex", "aql", { queryString: "RETURN SOUNDEX(@param)" }, ["frequency", "norm", "position"]);

Note that the query must not access the storage engine. This means no FOR loops over collections or Views, no use of the DOCUMENT() function and no graph traversals.

Enhanced Geo support in ArangoSearch

While AQL has supported Geo indexing and functions for a long time, ArangoDB 3.8 also adds Geo support to ArangoSearch, with the GeoJSON and GeoPoint Analyzers and the corresponding ArangoSearch Geo functions:

  • GEO_CONTAINS()
  • GEO_DISTANCE()
  • GEO_IN_RANGE()
  • GEO_INTERSECTS()
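
For readers new to Geo queries, the following Python sketch shows the kind of computation that a distance check and a range check perform. It is an approximation on a perfect sphere using the haversine formula, written for illustration; it is not ArangoDB's implementation, and the helper names geo_distance and geo_in_range are only modelled loosely on the functions listed above. Points use GeoJSON order, [longitude, latitude].

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical approximation)

def geo_distance(a, b):
    """Great-circle distance in metres between two [longitude, latitude] points."""
    lon1, lat1, lon2, lat2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))

def geo_in_range(a, b, low, high):
    """True if the distance between a and b lies within [low, high] metres."""
    return low <= geo_distance(a, b) <= high

origin = [0.0, 0.0]
one_degree_east = [1.0, 0.0]
d = geo_distance(origin, one_degree_east)  # one degree along the equator
```

One degree of longitude along the equator comes out at roughly 111 km here, so geo_in_range(origin, one_degree_east, 100_000, 120_000) is true.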


NB: Check out the community ArangoBnB project to learn more about Geo capabilities in ArangoSearch.

Improved Replication Protocol

For collections created with ArangoDB 3.8, a new internal data format is used that allows for a very fast synchronization of differences between the leader and a follower that is trying to reconnect.

The new format used in 3.8 is based on Merkle trees, which make it more efficient to pinpoint the data differences between the leader and a reconnecting follower.

The algorithmic complexity of the new protocol is determined by the number of differences between the leader and follower shard data, so if there are few or no differences, the getting-in-sync protocol runs very fast. In previous versions of ArangoDB, the complexity of the protocol was determined by the number of documents in the shard, and the protocol required a scan over all documents in the shard on both the leader and the follower to find the differences.
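
Why a Merkle tree makes the work proportional to the differences can be shown with a toy example in Python. This is a sketch under simplifying assumptions (a binary tree over a power-of-two number of leaves, SHA-256 hashes, one document per leaf) and does not reflect ArangoDB's actual on-disk format; the names h, build, and diff are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build(leaves):
    """Return the tree levels, from leaf hashes up to a single root hash.

    Assumes len(leaves) is a power of two.
    """
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def diff(a, b):
    """Return the indices of differing leaves and the number of hash comparisons."""
    comparisons = 0
    differing = []
    stack = [(len(a) - 1, 0)]  # (level, index), starting at the root
    while stack:
        level, idx = stack.pop()
        comparisons += 1
        if a[level][idx] == b[level][idx]:
            continue  # identical subtree: skip everything below it
        if level == 0:
            differing.append(idx)
        else:
            # push the right child first so the left child is visited first
            stack.extend([(level - 1, 2 * idx + 1), (level - 1, 2 * idx)])
    return differing, comparisons

leader = [f"doc{i}:rev1".encode() for i in range(8)]
follower = list(leader)
follower[5] = b"doc5:rev0"  # one stale document on the follower

in_sync = diff(build(leader), build(leader))
one_off = diff(build(leader), build(follower))
```

When both sides are identical, a single comparison of the root hashes is enough. With one stale document among eight, the comparison descends only into the subtrees that differ and touches 7 hashes instead of scanning every document on both sides.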

The new protocol is used automatically for all collections/shards created with ArangoDB 3.8. Collections/shards created with earlier versions will use the old protocol, which is still fully supported. Note that such “old” collections will only benefit from the new protocol if the collections are logically dumped and recreated/restored using arangodump and arangorestore.

Other notable features

Upgrade

Upgrading to ArangoDB 3.8 can be performed with zero downtime following the upgrade instructions for your respective deployment option. Please note our recent update advisory and update either to a newer 3.6/3.7 version or 3.8 if you are running an affected version.

ArangoGraph

The easiest way to give ArangoDB 3.8 a spin is ArangoGraph, ArangoDB’s managed service in the cloud.

Feedback

Feel free to provide any feedback either via our Slack channel or mailing list.

Special Edition Lunch Session

Join Simran Spiller on August 4th for a special Graph and Beyond Lunch Session #15.5 - Aggregating Time-Series Data with AQL.

The new WINDOW operation added to AQL in ArangoDB 3.8 allows you to compute running totals, rolling averages, and other statistical properties of your sensor, log, and other data. You can aggregate adjacent documents (or rows, if you will), as well as documents in value or duration ranges with a sliding window.

In this lunch and learn session, we will take a look at the two syntax variants of the WINDOW operation and go over a few example queries with visual explanations.

Hear More from the Author

Graph Analytics with ArangoDB

ArangoML

Continue Reading

Introducing Developer Deployments on ArangoDB ArangoGraph

ArangoBnB: Data Preparation Case Study

C++ Memory Model: Migrating from X86 to ARM


Entity Resolution in ArangoDB

Estimated reading time: 8 minutes

This post will dive into the world of Entity Resolution in ArangoDB. This is a companion piece for our Lunch and Learn session, Graph & Beyond Lunch Break #15: Entity Resolution.

In this article we will:

  • give a brief background in Entity Resolution (ER)
  • discuss some use-cases for ER
  • discuss some techniques for performing ER in ArangoDB

Inside the Avocado Grove: From Canada to Germany and the Digital Marketing of Avocados

Estimated reading time: 8 minutes

My name is Laura, and I am responsible for digital marketing here at ArangoDB. 

In the following post, I will dive into my own experience working at ArangoDB and how I went from Northern Ontario, Canada, to working in Germany at a native multi-model graph database company. Are you interested in learning more about working abroad, working remotely, or diving into a new industry? This post covers all of the above topics.


Word Embeddings in ArangoDB

Estimated reading time: 12 minutes

This post will dive into the world of Natural Language Processing by using word embeddings to search movie descriptions in ArangoDB.

In this post we:

  • Discuss the background of word embeddings
  • Introduce the current state-of-the-art models for embedding text
  • Apply a model to produce embeddings of movie descriptions in an IMDb dataset
  • Perform similarity search in ArangoDB using these embeddings
  • Show you how to query the movie description embeddings in ArangoDB with custom search terms

June 2021: What’s the Latest with ArangoDB?

Estimated reading time: 5 minutes

Hello Community,

Welcome to the sixth ArangoDB newsletter of 2021! Hard to believe we are already halfway through 2021 🤯

In this edition, we are excited to share: 


Introducing Developer Deployments on ArangoDB ArangoGraph

Estimated reading time: 4 minutes

Today we’re announcing the introduction of Developer deployments as a beta feature on the Oasis platform.

In this blog post, we’ll tell you what Developer deployments are, what you can do with them, what you should not do with them, and how to get started.

What are Developer deployments?

Since we launched Oasis, a deployment on Oasis has always been a highly available ArangoDB cluster. That is great for high availability, and it allows you to scale your deployment to incredibly large sets of data.

Many customers have told us that they would like something a bit smaller that can be used by an individual developer, or better yet, let every developer have their own deployment.

That request is exactly what we are answering with the introduction of Developer deployments.

A Developer deployment consists of only a single server on a single node.

With that configuration, there is obviously no high availability, and scaling is limited to vertical scaling of that single server.

For those reasons, Developer deployments are not suitable for any kind of production environment, and because of that, Developer deployments are excluded from audit logs.

Support is given for Developer deployments on a best-effort basis. It is not possible to buy an additional support plan with a Developer deployment.

All other features, like backups, full encryption, and Foxx, are fully available.

What can you do with a Developer deployment?

Developer deployments are ideal when you want to experiment with ArangoDB, or are just learning its features.

They are also ideal for (small scale) analytics experiments. You can quickly load your data into them, configure your graphs and perform your analysis.

The lack of high availability is usually not a problem for such experiments.

Of course, you can also give a single Developer deployment to all of your developers to develop an application against. When you do so, you have to keep in mind that there are small differences between a single instance of ArangoDB and a cluster.

Within Oasis, we have reduced these differences by requiring the WITH statement in exactly the same way that a cluster deployment does (in ArangoDB version 3.7.12 and higher).

How are Developer deployments priced?

A Developer deployment launched in AWS Ohio is available for as little as $0.058 per hour or $42.2 per month. The same deployment with General CPU is available for $0.068 per hour.

Similar to our OneShard and Sharded deployments, network traffic and backup storage are charged separately.

For more details on pricing, please log in to the Oasis dashboard and visit the Pricing page.

What not to do with Developer deployments

As mentioned before, Developer deployments are not highly available and can be restarted at any time. For that reason, you should not use them for any kind of production environment.

Since we strongly believe that a staging environment should reflect the production environment as much as possible, you should also not use a Developer deployment for a staging environment.

How to get started with Developer deployments

To create a Developer deployment on Oasis, log in to your ArangoDB Oasis account on cloud.arangodb.com.
Then go to your project and click on New Deployment.


Fill in the usual fields, such as the name of your deployment, and select a cloud provider and region.
Then click on the Developer (beta) button to choose the Developer model.


Select the size of your deployment and click on Create.


Your Developer deployment will now bootstrap.

Once that has finished, you’ll receive an email and can start to use your deployment.

Hear More from the Author

OCB: Challenges in Building Multi-Cloud-Provider Platform With Kubernetes

Getting Started with ArangoDB Oasis

Continue Reading

ArangoML Part 1: Where Graphs and Machine Learning Meet

A story of a memory leak in GO: How to properly use time.After()

ArangoDB 3.7 – A Big Step Forward for Multi-Model


ArangoBnB Data Preparation Case Study: Optimizing for Efficiency

Estimated reading time: 18 minutes

This case study covers a data exploration and analysis scenario about modeling data when migrating to ArangoDB. The topics covered in this case study include:

  • Importing data into ArangoDB
  • Developing Application Requirements before modeling
  • Data Analysis and Exploration with AQL

We hope this case study can serve as a guide: it gives step-by-step instructions and discusses the motivations for exploring and transforming data in preparation for a real-world application.
The information in this case study is derived from the development of the ArangoBnB project, a community project written in JavaScript that is always open to new contributors. The project is an Airbnb clone with a Vue frontend and a React frontend being developed in parallel by the community. You do not need to download the project or be familiar with JavaScript to follow this guide. To see how we are using the data in a real-world project, check out the repository.


ArangoDB Newsletter #133: May Updates and Insights

Estimated reading time: 4 minutes

Hello Community,

Welcome to the fifth ArangoDB newsletter of 2021!

In this edition, we are excited to share: 

We hope you enjoy it!

