Diffing Two Documents in AQL: ArangoDB Data Comparison
I just stumbled upon a comment in the ArangoDB blog asking how to create a diff of two documents with AQL.
Though there is no built-in AQL function to diff two documents, it is easy to build your own, as in the following query.
Read more on how to diff two documents in AQL.
2 Comments
I wonder if a custom AQL function written in JS for document diffing would be slower than your pure AQL query… BTW: there’s a json-patch format https://tools.ietf.org/html/rfc6902 and a diff format used here: https://github.com/benjamine/jsondiffpatch
I was aware of JSON-patch, but I wasn’t sure what the intention of the to-be-generated diffs was. For simply detecting whether or not two documents differ and to what extent, a simple solution as demonstrated may already be sufficient. If the goal however is to create patches that can be sent somewhere via HTTP PATCH, JSON-patch will be the natural choice.
I wasn’t aware of jsondiffpatch yet, and after looking into it, I still prefer JSON-patch.
But which way to go really depends on what is to be achieved with the generated diffs.
Performance-wise, a custom AQL function may be faster if the documents are small. For example, the following custom function was about twice as fast as the AQL-only solution for the two example documents I used in the post:
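// Register a user-defined AQL function named my::diff.
require("org/arangodb/aql/functions").register("my::diff", function (doc1, doc2) {
  var result = {
    missing: { },
    changed: { },
    added: { }
  };
  // First pass over doc1: attributes that are missing from or changed in doc2.
  Object.keys(doc1).forEach(function (key) {
    if (! doc2.hasOwnProperty(key)) {
      result.missing[key] = doc1[key];
    }
    // Values are compared in serialized form, so the attribute order
    // inside nested objects matters.
    else if (JSON.stringify(doc1[key]) !== JSON.stringify(doc2[key])) {
      result.changed[key] = { old: doc1[key], 'new': doc2[key] };
    }
  });
  // Second pass over doc2: attributes that exist only in doc2.
  Object.keys(doc2).forEach(function (key) {
    if (! doc1.hasOwnProperty(key)) {
      result.added[key] = doc2[key];
    }
  });
  return result;
});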
The JavaScript solution can save one iteration over all attributes because it can use if/then/else, which AQL does not provide.
The results may look different for other types of documents (especially bigger ones) and if new types of AQL optimizations are added.
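If the goal is indeed to send patches via HTTP PATCH, the missing/changed/added result of the custom function above could be turned into JSON-patch (RFC 6902) operations. The following is only a rough sketch: the helper names escapePointerToken and diffToJsonPatch and the sample diff are made up for illustration, and it handles top-level attributes only, replacing a changed attribute as a whole.

```javascript
// Escape an attribute name for use in a JSON Pointer (RFC 6901):
// "~" becomes "~0" first, then "/" becomes "~1".
function escapePointerToken(key) {
  return String(key).replace(/~/g, "~0").replace(/\//g, "~1");
}

// Hypothetical helper: convert a { missing, changed, added } diff into a
// list of RFC 6902 operations that would turn doc1 into doc2.
function diffToJsonPatch(diff) {
  var ops = [];
  // Attributes only present in doc1 have to be removed.
  Object.keys(diff.missing).forEach(function (key) {
    ops.push({ op: "remove", path: "/" + escapePointerToken(key) });
  });
  // Changed attributes are replaced with their new value.
  Object.keys(diff.changed).forEach(function (key) {
    ops.push({
      op: "replace",
      path: "/" + escapePointerToken(key),
      value: diff.changed[key]["new"]
    });
  });
  // Attributes only present in doc2 have to be added.
  Object.keys(diff.added).forEach(function (key) {
    ops.push({ op: "add", path: "/" + escapePointerToken(key), value: diff.added[key] });
  });
  return ops;
}

// Made-up diff result, shaped like the return value of the custom function.
var sampleDiff = {
  missing: { city: "Cologne" },
  changed: { age: { old: 31, "new": 32 } },
  added: { "a/b": true }
};
var patch = diffToJsonPatch(sampleDiff);
```

For the sample diff this produces a remove operation for /city, a replace operation for /age with the value 32, and an add operation for /a~1b. A real implementation would probably want to descend into nested objects instead of replacing whole attribute values.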