Tutorial: ArangoDB with Python
For Python developers there are several drivers available, allowing you to operate on and administer ArangoDB servers and databases from within your applications.
This tutorial is based on the pyArango driver by Tariq Daouda. You need to install and start ArangoDB on your host. Then install pyArango from the Python Package Index.
$ pip install pyarango --user
Once pip finishes the installation process, you can begin developing ArangoDB applications in Python.
ArangoDB with Python Usage
In order to operate on ArangoDB servers and databases from within your application, you need to establish a connection to the server, then use it to open or create a database on that server.
PyArango manages server connections through the conveniently named Connection class.
>>> from pyArango.connection import *
>>> conn = Connection(username="root", password="root_passwd")
When this code executes, it initializes the server connection on the conn variable. By default, pyArango attempts to establish a connection to http://127.0.0.1:8529, that is, to an ArangoDB server running on your local host on port 8529. If you are hosting ArangoDB on a different server or use a different port, you need to set these options when you instantiate the Connection class.
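As a minimal sketch of pointing pyArango at a non-default server, the helper below assembles the server URL and hands it to Connection. It assumes the Connection class accepts the address through an arangoURL keyword argument; the function names and the example host are hypothetical.

```python
def build_arango_url(host="127.0.0.1", port=8529, scheme="http"):
    """Return the base URL of an ArangoDB server, e.g. http://127.0.0.1:8529."""
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port out of range: %r" % (port,))
    return "%s://%s:%d" % (scheme, host, port)


def connect(host, port, username, password):
    """Open a pyArango connection to a server other than the default.

    Assumption: Connection takes the server address as `arangoURL`.
    The import is local so the URL helper works without pyArango installed.
    """
    from pyArango.connection import Connection
    return Connection(arangoURL=build_arango_url(host, port),
                      username=username, password=password)
```

Calling build_arango_url() with no arguments yields the same default address that pyArango uses, so the helper only changes behavior when you pass a different host or port.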
Creating and Opening Databases
With a connection to the ArangoDB server, you can open or create a database on the server and begin to operate on it. The createDatabase() method on the server connection creates a new database and returns a Database instance.
>>> db = conn.createDatabase(name="school")
When the school database does not exist, pyArango creates it on the server. When it already exists, the server rejects the request and pyArango raises a CreationError instead of opening it. To open an existing database, use its name as a key on the server connection. For instance,
>>> db = conn["school"]
>>> db
ArangoDB database: school
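If you want a single call that opens the database when it exists and creates it otherwise, a small helper can check first. The sketch below assumes the connection offers a hasDatabase() method alongside createDatabase() and key lookup; FakeConnection is a hypothetical in-memory stand-in so the example runs without a server.

```python
class FakeConnection:
    """Tiny in-memory stand-in for a pyArango Connection (illustration only)."""

    def __init__(self):
        self._dbs = {}

    def hasDatabase(self, name):
        return name in self._dbs

    def createDatabase(self, name):
        if name in self._dbs:
            # pyArango reports this case with its own CreationError.
            raise RuntimeError("duplicate database name: %s" % name)
        self._dbs[name] = "Database(%s)" % name
        return self._dbs[name]

    def __getitem__(self, name):
        return self._dbs[name]


def open_or_create(conn, name):
    """Return the named database, creating it only when it is missing."""
    if conn.hasDatabase(name):
        return conn[name]
    return conn.createDatabase(name=name)


conn = FakeConnection()
first = open_or_create(conn, "school")   # creates the database
second = open_or_create(conn, "school")  # opens the existing one
```

With a real connection you would pass the conn object from the earlier step instead of the stand-in.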
Creating Collections
ArangoDB groups documents and edges into collections. This is similar to the concept of tables in relational databases, but with the key difference that collections are schema-less.
In pyArango, you can create a collection by calling the createCollection() method on a given database. For instance, in the above section you created a school database. You might want a collection on that database for students.
>>> studentsCollection = db.createCollection(name="Students")
>>> db["Students"]
ArangoDB Collection name: Students, id: 202, type: document, status loaded
Creating Documents
With the database and collection set up, you can begin adding data to them. Continuing the comparison to relational databases, where a collection is a table, a document is a row in that table. Unlike rows, however, documents are schema-less. You can include any arrangement of values you need for your application.
For instance, add a student to the collection:
>>> doc1 = studentsCollection.createDocument()
>>> doc1["name"] = "John Smith"
>>> doc1
ArangoDoc 'None': {'name': 'John Smith'}
>>> doc2 = studentsCollection.createDocument()
>>> doc2["firstname"] = "Emily"
>>> doc2["lastname"] = "Bronte"
>>> doc2
ArangoDoc 'None': {'firstname': 'Emily', 'lastname': 'Bronte'}
The document shows its _id as “None” because you haven’t yet saved it to ArangoDB. This means the variable exists in your Python code, but not in the database. ArangoDB constructs _id values by pairing the collection name with the _key value. It also handles the assignment for you; you just need to set the key and save the document.
>>> doc1._key = "johnsmith"
>>> doc1.save()
>>> doc1
ArangoDoc 'Students/johnsmith': {'name': 'John Smith'}
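The pairing of collection name and key can be mirrored in plain Python, which is handy when you need to predict or take apart an _id without a round trip to the server. The two function names below are hypothetical helpers, not part of pyArango.

```python
def make_document_id(collection_name, key):
    """Compose an _id the way ArangoDB does: '<collection>/<_key>'."""
    return "%s/%s" % (collection_name, key)


def split_document_id(doc_id):
    """Split an _id back into (collection name, _key)."""
    collection_name, key = doc_id.split("/", 1)
    return collection_name, key
```

For the document saved above, make_document_id("Students", "johnsmith") gives the same 'Students/johnsmith' value that the session shows.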
Rather than entering and saving the data for each student by hand, you can add the documents in a loop. For instance,
>>> students = [('Oscar', 'Wilde', 3.5), ('Thomas', 'Hobbes', 3.2),
... ('Mark', 'Twain', 3.0), ('Kate', 'Chopin', 3.8), ('Fyodor', 'Dostoevsky', 3.1),
... ('Jane', 'Austen',3.4), ('Mary', 'Wollstonecraft', 3.7), ('Percy', 'Shelley', 3.5),
... ('William', 'Faulkner', 3.8), ('Charlotte', 'Bronte', 3.0)]
>>> for (first, last, gpa) in students:
...     doc = studentsCollection.createDocument()
...     doc['name'] = "%s %s" % (first, last)
...     doc['gpa'] = gpa
...     doc['year'] = 2017
...     doc._key = ''.join([first, last]).lower()
...     doc.save()
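The loop above builds each key by joining and lowercasing the names, which works for this list but would carry spaces, hyphens or apostrophes into the key for other names. A slightly more defensive sketch is shown below; student_key and build_student are hypothetical helpers, and keeping only ASCII letters and digits is a deliberately conservative choice rather than the full set of characters ArangoDB permits in keys.

```python
import re


def student_key(first, last):
    """Derive a _key such as 'oscarwilde' from a first and last name."""
    key = re.sub(r"[^a-z0-9]", "", (first + last).lower())
    if not key:
        raise ValueError("cannot derive a key from %r %r" % (first, last))
    return key


def build_student(first, last, gpa, year=2017):
    """Return the _key and the attribute dict for one student."""
    attributes = {
        "name": "%s %s" % (first, last),
        "gpa": gpa,
        "year": year,
    }
    return student_key(first, last), attributes


key, attrs = build_student("Mary", "Wollstonecraft", 3.7)
```

Inside the loop you would then copy attrs into the new document, assign key to doc._key, and call doc.save() as before.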
Reading Documents
Eventually, you’ll need to access documents in the database. The easiest way to do this is with the _key value.
For instance, the school database now has several students. Imagine it is part of a larger application that holds more data on each student, and you want to check the GPA of a particular student:
>>> def report_gpa(document):
...     print("Student: %s" % document['name'])
...     print("GPA: %s" % document['gpa'])
>>> kate = studentsCollection['katechopin']
>>> report_gpa(kate)
Student: Kate Chopin
GPA: 3.8
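Looking up a key that does not exist fails, so application code usually wants a fallback. The sketch below assumes the lookup raises KeyError for an unknown key, as the session output later in this tutorial shows; other pyArango versions may raise a different exception type, in which case the except clause would need adjusting. lookup_gpa is a hypothetical helper, and a plain dict stands in for the collection so the example runs on its own.

```python
def lookup_gpa(collection, key, default=None):
    """Return the 'gpa' of the document stored under key, or default."""
    try:
        document = collection[key]
    except KeyError:
        # Assumption: an unknown _key surfaces as KeyError.
        return default
    return document["gpa"]


# A plain dict stands in for the Students collection here.
students = {"katechopin": {"name": "Kate Chopin", "gpa": 3.8}}
found = lookup_gpa(students, "katechopin")
missing = lookup_gpa(students, "nobody", default=0.0)
```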
Updating Documents
When you read a document from ArangoDB into your application, you create a local copy of the document. You can then operate on the document, making whatever changes you like to it, then push the results to the database using the save() method.
For instance, each semester as the final grades come in from their classes, you need to update the students’ grade point averages on the database. Given that this happens frequently, you might want to create a specific function to handle the update:
>>> def update_gpa(key, new_gpa):
...     doc = studentsCollection[key]
...     doc['gpa'] = new_gpa
...     doc.save()
Listing Documents
Occasionally, you may want to operate on all documents in a given collection. Using the fetchAll() method, you can retrieve and iterate over a list of documents. For instance, say it’s the end of the semester and you want to know which students have a grade point average of 3.5 or higher:
>>> def top_scores(col, gpa):
...     print("Top Scoring Students:")
...     for student in col.fetchAll():
...         if student['gpa'] >= gpa:
...             print("- %s" % student['name'])
>>> top_scores(studentsCollection, 3.5)
Top Scoring Students:
- Mary Wollstonecraft
- Kate Chopin
- Percy Shelley
- William Faulkner
- Oscar Wilde
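A variant that returns the names instead of printing them is easier to reuse and to test, and sorting gives a predictable order. This is only a sketch: ranked_top_scores is a hypothetical name, and it works on any iterable of dict-like documents, so plain dicts are used here in place of the result of fetchAll().

```python
def ranked_top_scores(documents, min_gpa):
    """Return names of students with gpa >= min_gpa, highest GPA first."""
    qualifying = [d for d in documents if d["gpa"] >= min_gpa]
    # Sort by descending GPA, then by name to break ties deterministically.
    qualifying.sort(key=lambda d: (-d["gpa"], d["name"]))
    return [d["name"] for d in qualifying]


sample = [
    {"name": "Oscar Wilde", "gpa": 3.5},
    {"name": "Mark Twain", "gpa": 3.0},
    {"name": "Kate Chopin", "gpa": 3.8},
    {"name": "William Faulkner", "gpa": 3.8},
]
ranking = ranked_top_scores(sample, 3.5)
```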
Removing Documents
Eventually, you may want to remove documents from the database. You can do this with the delete() method. For instance, say that the student Thomas Hobbes has decided to move to another city:
>>> tom = studentsCollection["thomashobbes"]
>>> tom.delete()
>>> studentsCollection["thomashobbes"]
KeyError: (
    'Unable to find document with _key: thomashobbes', {
        'code': 404,
        'errorNum': 1202,
        'errorMessage': 'document Students/thomashobbes not found',
        'error': True
})
AQL Usage
In addition to the Python methods shown above, ArangoDB also provides a query language, called AQL, for retrieving and modifying documents in the database. In pyArango, you can issue these queries using the AQLQuery() method.
For instance, say you want to retrieve the keys for all documents in the Students collection:
>>> aql = "FOR x IN Students RETURN x._key"
>>> queryResult = db.AQLQuery(aql, rawResults=True, batchSize=100)
>>> for key in queryResult:
...     print(key)
marywollstonecraft
katechopin
percyshelley
fyodordostoevsky
marktwain
...
In the above example, the AQLQuery() method takes the AQL query as an argument, with two additional options:
- rawResults: Defines whether you want the raw values returned by the query rather than pyArango document objects.
- batchSize: When the query returns more results than the given value, the pyArango driver automatically asks for new batches.
Bear in mind, the order of the documents isn’t guaranteed. If you need the results in a particular order, add a SORT clause to the AQL query.
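As a rough illustration of adding a SORT clause, the helper below builds the key-listing query with an explicit ordering. keys_sorted_by is a hypothetical name, and the isidentifier() check is just a conservative guard for this sketch, since the attribute name is placed directly into the query text.

```python
def keys_sorted_by(attribute, descending=False):
    """Build an AQL query returning Students keys ordered by one attribute."""
    if not attribute.isidentifier():
        raise ValueError("unexpected attribute name: %r" % (attribute,))
    direction = "DESC" if descending else "ASC"
    return "FOR x IN Students SORT x.%s %s RETURN x._key" % (attribute, direction)


aql = keys_sorted_by("gpa", descending=True)
# With a live database:
# queryResult = db.AQLQuery(aql, rawResults=True, batchSize=100)
```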
Inserting Documents with AQL
Similar to document creation above, you can also insert documents into ArangoDB using AQL. This is done with an INSERT statement using the bindVars option for the AQLQuery() method.
>>> doc = {'_key': 'denisdiderot', 'name': 'Denis Diderot', 'gpa': 3.7}
>>> bind = {"doc": doc}
>>> aql = "INSERT @doc INTO Students LET newDoc = NEW RETURN newDoc"
>>> queryResult = db.AQLQuery(aql, bindVars=bind)
The RETURN newDoc clause sets the newly inserted document as the return value of the query, so you can see the result by calling:
>>> queryResult[0]
ArangoDoc 'Students/denisdiderot': {'name': 'Denis Diderot', 'gpa': 3.7}
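The same bindVars mechanism can carry a whole list, so several documents can be inserted with one query by looping over the list inside AQL. The sketch below only builds the query text and bind variables; bulk_insert_query is a hypothetical helper and the second student is made-up sample data.

```python
def bulk_insert_query(documents):
    """Return (aql, bindVars) that insert every dict in documents at once."""
    docs = [dict(d) for d in documents]  # copy so callers keep their originals
    aql = "FOR d IN @docs INSERT d INTO Students RETURN NEW._key"
    return aql, {"docs": docs}


new_students = [
    {"_key": "denisdiderot", "name": "Denis Diderot", "gpa": 3.7},
    {"_key": "georgeeliot", "name": "George Eliot", "gpa": 3.9},  # sample data
]
aql, bind = bulk_insert_query(new_students)
# With a live database:
# queryResult = db.AQLQuery(aql, bindVars=bind, rawResults=True)
```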
Updating Documents with AQL
In cases where a document already exists in your database and you would like to modify data in that document, you can use the UPDATE statement. For instance, say that you receive the students’ updated grade point averages in a CSV file.
First, check the GPA of one of the students to see the old value:
>>> db["Students"]["katechopin"]
ArangoDoc 'Students/katechopin': {'name': 'Kate Chopin', 'gpa': 3.6}
Then loop through the file, updating the GPA of each student:
>>> import csv
>>> with open("grades.csv", "r", newline="") as f:
...     # Assumes each row holds a student key and a GPA, e.g. katechopin,4.0
...     grades = {row[0]: row[1] for row in csv.reader(f) if row}
>>> for key, gpa in grades.items():
...     doc = {"gpa": float(gpa)}
...     bind = {"doc": doc, "key": key}
...     aql = "UPDATE @key WITH @doc IN Students LET updated = NEW RETURN updated"
...     db.AQLQuery(aql, bindVars=bind)
Lastly, check the student’s GPA again.
>>> db["Students"]["katechopin"]
ArangoDoc 'Students/katechopin': {'name': 'Kate Chopin', 'gpa': 4.0}
Though it’s possible for a student to have the same GPA between semesters, in this case Kate’s GPA went up by a few points.
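Before sending updates it can help to parse and validate the file contents in one place. The sketch below assumes each row holds a student key followed by a GPA on a 0.0 to 4.0 scale; both the row layout and the scale are assumptions, since the tutorial does not describe the file. parse_grades is a hypothetical helper that takes the CSV text directly so it can be tried without a file.

```python
import csv
import io


def parse_grades(csv_text):
    """Parse 'key,gpa' rows into a {key: gpa} dict, skipping blank rows."""
    grades = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        key, gpa = row[0].strip(), float(row[1])
        # Assumption: GPAs are on a 0.0 to 4.0 scale.
        if not 0.0 <= gpa <= 4.0:
            raise ValueError("GPA out of range for %s: %s" % (key, gpa))
        grades[key] = gpa
    return grades


sample = "katechopin,4.0\nmarktwain,3.2\n"
grades = parse_grades(sample)
```

Each key and value in the resulting dict can then be bound to the @key and @doc parameters of the UPDATE query.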
Removing Documents with AQL
Lastly, you can also remove documents from ArangoDB using REMOVE statements. For instance, imagine that this year’s students have graduated and you want to remove them from the database. You can use the year property to differentiate between classes of students, which allows you to use a FILTER clause to keep some and remove others.
>>> bind = {"@collection": "Students"}
>>> aql = """
... FOR x IN @@collection
...     FILTER x.year == 2017
...     REMOVE x IN @@collection
...     LET removed = OLD RETURN removed
... """
>>> queryResult = db.AQLQuery(aql, bindVars=bind)
The FOR loop iterates over the documents in the collection, and the FILTER clause keeps only those that match the condition. The statement REMOVE x IN @@collection deletes the matching documents. The @@collection placeholder (note the @@) is the bind parameter for the name of the collection; its value is supplied under the "@collection" key in bindVars.
The return value given by AQLQuery() preserves the old documents. So, if you query it directly, it’ll show you the data.
>>> queryResult[0]
ArangoDoc 'Students/williamfaulkner': {'name': 'William Faulkner', 'gpa': 3.8, 'year': 2017}
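If instead you attempt to retrieve one of the removed documents from the database, the lookup fails with an error.
>>> db["Students"]["katechopin"]
KeyError: (
    'Unable to find document with _key: katechopin', {
        'code': 404,
        'errorNum': 1202,
        'errorMessage': 'document Students/katechopin not found',
        'error': True
})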
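The removal query can be wrapped in a small builder so that both the collection name and the year arrive as bind parameters instead of being written into the query text. This is a sketch; remove_by_year_query is a hypothetical helper that only prepares the arguments for AQLQuery().

```python
def remove_by_year_query(year, collection="Students"):
    """Return (aql, bindVars) removing all documents of one class year."""
    aql = (
        "FOR x IN @@collection "
        "FILTER x.year == @year "
        "REMOVE x IN @@collection "
        "RETURN OLD._key"
    )
    # The collection bind parameter @@collection is supplied as "@collection".
    bind = {"@collection": collection, "year": int(year)}
    return aql, bind


aql, bind = remove_by_year_query(2017)
# With a live database:
# removed_keys = db.AQLQuery(aql, bindVars=bind, rawResults=True)
```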
Learn more
Now you know how to work with ArangoDB.
- We also have a Tutorials page with even more How-to’s.
- Look at AQL to learn more about our query language.
- Do you want to know more about Databases? Click here!
- Read more about Collections.
- Explore Documents in our documentation.
- For more examples you can explore the ArangoDB cookbook.