Neo4j Fabric: Scaling out is not only distributing data
Neo4j, Inc. is the well-known vendor of the Neo4j Graph Database, which solely supports the property graph model with graphs of previously limited size (single server, replicated).
In early 2020, Neo4j finally released version 4.0, which promises “unlimited scalability” through the new Neo4j Fabric feature. While the marketing claim of “scalability” is true from a very simplistic perspective, developers and their teams should keep a few things in mind. Most importantly: true horizontal scalability with graph data is not achieved merely by allowing data to be distributed across different machines.
ArangoML Pipeline Cloud – Managed Machine Learning Metadata Service
We all know how crucial training data is for data scientists building quality machine learning models. But when productionizing machine learning, metadata is equally important.
Consider for example:
- Capture of Lineage Information (e.g., Which dataset influences which Model?)
- Capture of Audit Information (e.g., A given model was trained two months ago with the following training/validation performance)
- Reproducible Model Training
- Model Serving Policy (e.g., Which model should be deployed in production based on training statistics)
If you would like to see a live demo of ArangoML Pipeline Cloud, join our Head of Engineering and Machine Learning, Jörg Schad, on February 13, 2020 – 10am PT/ 1pm ET/ 7pm CET for a live webinar.
Efficient Massive Inserts into ArangoDB with Node.js
Nothing performs faster than arangoimport and arangorestore for bulk loading or massive inserts into ArangoDB. However, if you need to do additional processing on each row inserted, this blog will help with that type of functionality.
If the data source is a streaming solution (such as Kafka, Spark, Flink, etc.), where there is a need to transform data before inserting into ArangoDB, this solution will provide insight into that scenario as well.
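The per-row processing scenario described above can be sketched roughly as follows. This is a minimal illustration, not the approach from the full post: `chunk`, `insertInBatches`, and the `insertBatch` callback are hypothetical names introduced here, and `insertBatch` stands in for whatever bulk-write call your driver provides. The batch size of 1000 is an arbitrary assumption.

```typescript
// Split an array into chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Transform every row, then hand each batch to `insertBatch`.
// `insertBatch` is a hypothetical placeholder for the driver's bulk-write call.
async function insertInBatches<In, Out>(
  rows: In[],
  transform: (row: In) => Out,
  insertBatch: (batch: Out[]) => Promise<void>,
  batchSize: number = 1000, // assumed default, tune for your workload
): Promise<number> {
  let inserted = 0;
  for (const batch of chunk(rows, batchSize)) {
    const docs = batch.map(transform);
    await insertBatch(docs); // one round trip per batch instead of per row
    inserted += docs.length;
  }
  return inserted;
}
```

Sending one request per batch rather than per row keeps the number of round trips low while still leaving room for a transformation step on each row.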
What’s new in ArangoDB 3.6: OneShard Deployments and Performance Improvements
Welcome 2020! To kick off this new year, we are pleased to announce the next version of our native multi-model database. So here is ArangoDB 3.6, a release that focuses heavily on improving overall performance and adds a powerful new feature that combines the performance characteristics of a single server with the fault tolerance of clusters.
If you would like to learn more about the released features in a live demo, join our Product Manager, Ingo Friepoertner, on January 22, 2020 – 10am PT/ 1pm ET/ 7pm CET for a webinar on “What’s new in ArangoDB 3.6?”.
Release Candidate 2 of the ArangoDB 3.6 available for testing
We are working on the release of ArangoDB 3.6 and today, just in time for the holiday season, we reached the milestone of RC2. You can download RC2 and take it for a spin: Community Edition and Enterprise Edition.
ArangoDB and the Cloud-Native Ecosystem: Integration Insights
ArangoDB is joining CNCF to continue its focus on providing a scalable native multi-model database, supporting Graph, Document, and Key-Value data models in the Cloud Native ecosystem.
ArangoDB
ArangoDB is a scalable multi-model database. What does that mean?
You might have already encountered different NoSQL databases specialized for different data models, e.g., graph or document databases. However, most real-life use cases, such as Single View of Everything, Machine Learning, or even Case Management projects, to name but a few, actually require a combination of different data models.
In such scenarios, single-data-model databases typically require merging data from different databases and often even reimplementing some database logic in the application layer, on top of the effort of operating multiple databases in a production environment.
Say Hi To ArangoDB ArangoGraph: A Fully-Managed Multi-Model Database Service
After two years of planning, preparation and a few lines of code, you can now enjoy an even more comfortable developers’ life with ArangoDB.
Today, we are happy to announce the launch of ArangoDB’s managed service ArangoGraph – a fully-managed graph database, document, and key-value store, as well as a full-text search engine – in one place.
Building Our Managed Service on Kubernetes: ArangoDB Insights
Running distributed databases on-prem or in the cloud is always a challenge. Over the past years, we have invested a lot to make cluster deployments as simple as possible, both on traditional (virtual) machines (using the ArangoDB Starter) and on modern orchestration systems such as Kubernetes (using Kube-ArangoDB).
However, as long as teams have to run databases themselves, the burden of deploying, securing, monitoring, maintaining & upgrading can only be reduced to a certain extent but not avoided.
For this reason, we built ArangoDB ArangoGraph.
ArangoDB Hot Backup: Creating Consistent Cluster-Wide Snapshots
Introduction
“Better to have, and not need, than to need, and not have.”
Franz Kafka
Franz Kafka’s talents wouldn’t have been wasted as a DBA. Well, reasonable people might disagree.
With this article, we are shouting out a new enterprise feature for ArangoDB: consistent online single server or cluster-wide “hot backups.”
ArangoML Pipeline: Simplifying Machine Learning Workflows
Over the past two years, many of our customers have productionized their machine learning pipelines. Most pipeline components create some kind of metadata which is important to learn from.
This metadata is often unstructured (e.g., TensorFlow’s training metadata is different from PyTorch’s), which fits nicely into the flexibility of JSON. But what creates the highest value for DataOps and Data Scientists is when the connections between this metadata are brought into context using graph technology. So we had this idea, and made the result open source.
We are excited to share ArangoML Pipeline with everybody today – A common and extensible metadata layer for ML pipelines which allows Data Scientists and DataOps to manage all information related to their ML pipelines in one place.