ArangoDB 1.1 Feature Preview: Batch Request API | ArangoDB 2012
Clients normally send individual operations to ArangoDB in individual HTTP requests. This is straightforward and simple, but has the disadvantage that the network overhead can be significant if many small requests are issued in a row.
To mitigate this problem, ArangoDB 1.1 offers a batch request API that clients can use to send multiple operations in one batch to ArangoDB. This method is especially useful when the client has to send many HTTP requests with a small body/payload and the individual request results do not depend on each other.
Clients can use ArangoDB’s batch API by issuing a multipart HTTP POST request to the /_api/batch handler. The handler will accept the request if the Content-Type is multipart/form-data and a boundary string is specified. ArangoDB will then decompose the batch request into its individual parts using this boundary. This also means that the boundary string itself must not be contained in any of the parts. Once ArangoDB has split the multipart request into its individual parts, it will process the parts sequentially, each as if it were a standalone request. When all parts are processed, ArangoDB will generate a multipart HTTP response that contains one part for each part operation result. For example, if you send a multipart request with 5 parts, ArangoDB will send back a multipart response with 5 parts as well.
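Since the boundary must not occur inside any part, a client may want to check this before sending. The following is a minimal sketch of such a check, not something ArangoDB provides: the helper name pick_boundary and the idea of appending random hex digits to the boundary are assumptions made for illustration.

```python
import uuid

def pick_boundary(parts, prefix="XXXpartXXX"):
    """Return a boundary string that occurs in none of the given parts.

    Hypothetical client-side helper: starts with `prefix` and, if any part
    happens to contain it, appends random hex digits until it is unique.
    """
    boundary = prefix
    while any(boundary in part for part in parts):
        boundary = prefix + uuid.uuid4().hex
    return boundary

# The second part contains the default boundary, so a longer one is chosen.
parts = ['{"a":1,"b":2,"c":3}', 'a payload that mentions XXXpartXXX']
boundary = pick_boundary(parts)
```

The chosen boundary is then used both in the Content-Type header and as the delimiter between the parts.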
The server expects each message part to start with exactly the following literal: Content-Type: application/x-arango-batchpart, followed by two Windows line breaks (i.e. \r\n\r\n). Any deviation will lead to the part being rejected or incorrectly interpreted. The part request payload, formatted as a regular HTTP request, must follow this literal directly.
Note that the literal Content-Type: application/x-arango-batchpart technically is the header of the MIME part, and the HTTP request (including its headers) is the body of the MIME part.
An actual part request should start with the HTTP method, the called URL, and the HTTP protocol version as usual, followed by arbitrary HTTP headers. Its body should follow after the usual \r\n\r\n literal. Part requests are therefore regular HTTP requests, only embedded inside a multipart message. This might sound complicated at first, but it has the advantage that any HTTP request can transparently be embedded as a part request inside a multipart message.
The following example will send a batch with 3 individual document creation operations. The boundary used in this example is XXXpartXXX. The complete request is:
    curl -X POST \
         --data-binary @- \
         --header "Content-Type: multipart/form-data; boundary=XXXpartXXX" \
         http://localhost:8529/_api/batch
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    POST /_api/document?collection=xyz&createCollection=true HTTP/1.1

    {"a":1,"b":2,"c":3}
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    POST /_api/document?collection=xyz HTTP/1.1

    {"a":1,"b":2,"c":3,"d":4}
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    POST /_api/document?collection=xyz HTTP/1.1

    {"a":1,"b":2,"c":3,"d":4,"e":5}
    --XXXpartXXX--
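The request body shown above can also be assembled programmatically. The following is a minimal sketch under a few assumptions: the function name build_batch_body and the (method, url, body) tuple layout are made up for illustration, and \r\n is used for every line break, which goes slightly beyond what the text above requires (it only mandates \r\n\r\n after the MIME part header and before the request body).

```python
CRLF = "\r\n"
PART_HEADER = "Content-Type: application/x-arango-batchpart"

def build_batch_body(boundary, requests):
    """Assemble a multipart batch body from (method, url, body) tuples.

    Hypothetical helper: each tuple becomes one MIME part whose body is a
    regular HTTP request (request line, blank line, payload).
    """
    chunks = []
    for method, url, body in requests:
        if boundary in url or boundary in body:
            raise ValueError("boundary must not occur inside a part")
        chunks.append("--" + boundary + CRLF)
        # MIME part header, followed by the mandatory empty line.
        chunks.append(PART_HEADER + CRLF + CRLF)
        # Embedded HTTP request: request line, empty line, payload.
        chunks.append(f"{method} {url} HTTP/1.1" + CRLF + CRLF)
        chunks.append(body + CRLF)
    # The closing delimiter carries two extra dashes.
    chunks.append("--" + boundary + "--" + CRLF)
    return "".join(chunks)

body = build_batch_body("XXXpartXXX", [
    ("POST", "/_api/document?collection=xyz&createCollection=true", '{"a":1,"b":2,"c":3}'),
    ("POST", "/_api/document?collection=xyz", '{"a":1,"b":2,"c":3,"d":4}'),
    ("POST", "/_api/document?collection=xyz", '{"a":1,"b":2,"c":3,"d":4,"e":5}'),
])
```

The resulting string can be sent with any HTTP client as the body of a POST request to /_api/batch, together with the header Content-Type: multipart/form-data; boundary=XXXpartXXX.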
The server will then respond with one multipart message, containing the overall status and the individual results for the part operations. The overall status should be 200 except in case there was an error while inspecting and processing the multipart message. The overall status therefore does not indicate the success of each part operation, but only indicates whether the multipart message could be handled successfully.
Each part operation will return its own status value. As the part operation results are regular HTTP responses (just included in one multipart response), the part operation status is returned as an HTTP status code. The status codes of the part operations are exactly the same as if you had called the individual operations standalone. Each part operation might also return arbitrary HTTP headers and a body/payload:
    HTTP/1.1 200 OK
    connection: Keep-Alive
    content-type: multipart/form-data; boundary=XXXpartXXX
    content-length: 1055

    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    HTTP/1.1 202 Accepted
    location: /_api/document/101059/9514299
    content-type: application/json; charset=utf-8
    etag: "9514299"
    content-length: 53

    {"error":false,"_id":"101059/9514299","_rev":9514299}
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    HTTP/1.1 202 Accepted
    location: /_api/document/101059/9579835
    content-type: application/json; charset=utf-8
    etag: "9579835"
    content-length: 53

    {"error":false,"_id":"101059/9579835","_rev":9579835}
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    HTTP/1.1 202 Accepted
    location: /_api/document/101059/9645371
    content-type: application/json; charset=utf-8
    etag: "9645371"
    content-length: 53

    {"error":false,"_id":"101059/9645371","_rev":9645371}
    --XXXpartXXX--
In the above example, the server returned an overall status code of 200, and each part response contains its own status value (202 in the example).
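On the client side, the multipart response body has to be split back into the individual part responses. The following is a minimal sketch of how that might look, assuming \r\n line breaks throughout the response and a boundary that does not occur in any payload. The function name parse_batch_response is hypothetical, and the sample is a shortened version of the response shown above.

```python
def parse_batch_response(body, boundary):
    """Split a multipart batch response body into (status, headers, payload) tuples."""
    results = []
    for chunk in body.split("--" + boundary):
        chunk = chunk.strip("\r\n")
        if not chunk or chunk == "--":
            continue  # skip the empty preamble and the closing "--" marker
        # The MIME part header comes first; the rest is an embedded HTTP response.
        _mime_header, _, http_response = chunk.partition("\r\n\r\n")
        head, _, payload = http_response.partition("\r\n\r\n")
        status_line, *header_lines = head.split("\r\n")
        status = int(status_line.split(" ", 2)[1])
        headers = {}
        for line in header_lines:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        results.append((status, headers, payload))
    return results

sample = (
    "--XXXpartXXX\r\n"
    "Content-Type: application/x-arango-batchpart\r\n\r\n"
    "HTTP/1.1 202 Accepted\r\n"
    "location: /_api/document/101059/9514299\r\n"
    'etag: "9514299"\r\n\r\n'
    '{"error":false,"_id":"101059/9514299","_rev":9514299}\r\n'
    "--XXXpartXXX\r\n"
    "Content-Type: application/x-arango-batchpart\r\n\r\n"
    "HTTP/1.1 202 Accepted\r\n"
    "location: /_api/document/101059/9579835\r\n"
    'etag: "9579835"\r\n\r\n'
    '{"error":false,"_id":"101059/9579835","_rev":9579835}\r\n'
    "--XXXpartXXX--"
)
results = parse_batch_response(sample, "XXXpartXXX")
statuses = [status for status, _headers, _payload in results]
```

Each tuple can then be handled just like the result of a standalone request, using its own status code, headers and payload.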
When constructing the multipart HTTP response, the server will use the same boundary that the client supplied. If any of the part responses has a status code of 400 or greater, the server will also return an HTTP header x-arango-errors containing the overall number of part requests that produced errors.
Here’s a batch request that will produce an error:
    curl -X POST \
         --data-binary @- \
         --header "Content-Type: multipart/form-data; boundary=XXXpartXXX" \
         http://localhost:8529/_api/batch
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    POST /_api/document?collection=nonexisting HTTP/1.1

    {"a":1,"b":2,"c":3}
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    POST /_api/document?collection=xyz HTTP/1.1

    {"a":1,"b":2,"c":3,"d":4}
    --XXXpartXXX--
In this example, the overall response code is 200, but because one of the part requests failed (with status code 404), the x-arango-errors header of the overall response is 1:
    HTTP/1.1 200 OK
    x-arango-errors: 1
    content-type: multipart/form-data; boundary=XXXpartXXX
    content-length: 711

    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    HTTP/1.1 404 Not Found
    content-type: application/json; charset=utf-8
    content-length: 111

    {"error":true,"code":404,"errorNum":1203,"errorMessage":"collection \/_api\/collection\/nonexisting not found"}
    --XXXpartXXX
    Content-Type: application/x-arango-batchpart

    HTTP/1.1 202 Accepted
    location: /_api/document/101059/9841979
    content-type: application/json; charset=utf-8
    etag: "9841979"
    content-length: 53

    {"error":false,"_id":"101059/9841979","_rev":9841979}
    --XXXpartXXX--
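A client can use the x-arango-errors header as a quick check before inspecting the individual parts. The sketch below uses the values from the response above. Both helper names are hypothetical, and treating a missing header as "no errors" is an assumption derived from the description above, which says the header is sent only when at least one part has a status code of 400 or greater.

```python
def reported_errors(headers):
    """Return the value of x-arango-errors, or 0 if the header is absent."""
    for name, value in headers.items():
        if name.lower() == "x-arango-errors":
            return int(value)
    return 0

def failed_parts(part_statuses):
    """Count part responses with a status code of 400 or greater."""
    return sum(1 for status in part_statuses if status >= 400)

# Overall response headers and part status codes from the example above.
overall_headers = {"x-arango-errors": "1", "content-length": "711"}
part_statuses = [404, 202]

errors = reported_errors(overall_headers)
failures = failed_parts(part_statuses)
```

If the reported number is 0, the client can skip the per-part error handling. Otherwise it still has to look at each part's status code to find out which operations failed.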
Please note that the feature is available in ArangoDB version 1.1, which is still in development. If you want to, you can give it a try already, but it should not be used in production until 1.1 is officially released.