
Overview of ArangoDB and Neo4j Graph Databases

Introduction

Graph databases are widely used in a variety of industries, including:

  • Finance: Graph databases are used in the finance industry to store and analyze financial data, such as stock prices and trades, and to detect fraudulent activity.
  • Healthcare: In the healthcare industry, graph databases are used to store and analyze patient data, including medical records, diagnoses, and treatment plans. They are also used to track the spread of diseases and to optimize the allocation of healthcare resources.
  • Social media: Graph databases are used by social media platforms to store and analyze user data, including connections between users and their interactions.
  • Supply chain management: In the supply chain industry, graph databases are used to optimize the flow of goods and materials and to track the movement of products through the supply chain.
  • Telecommunications: Graph databases are used in the telecommunications industry to store and analyze network data, including the connections between devices and the flow of data through the network.

Overall, graph databases are well-suited for applications that involve complex relationships and connections in the data, such as social networks, recommendation systems, and fraud detection.

ArangoDB and Neo4j are both popular databases that have been widely used in various applications and industries. ArangoDB was first released in 2014, while Neo4j had its first release in 2007. Both databases have their roots in open source, but as you will see later, that now comes with an asterisk next to Neo4j.

In this article, we will compare the two databases and discuss their main characteristics and differences. We will also provide examples of their query languages and discuss the factors to consider when choosing the right database for your project.

ArangoDB

ArangoDB is a native multi-model database that is written in C++ and supports document, graph, and key-value data models.

ArangoDB is fully open source with code available on GitHub, so the code and commit history can be reviewed at any time.

It has a powerful query language called AQL (ArangoDB Query Language), which is similar to SQL and can be used to perform various operations on the data stored in the database.

One of the advantages of ArangoDB is its good integration with cloud platforms. It offers native support for running on cloud infrastructure and provides tools for deploying and managing ArangoDB clusters on cloud platforms such as AWS, Azure, and Google Cloud.

Neo4j

Neo4j is a graph database that is written in Java and is designed to store and query data represented in the form of a graph. It uses a query language called Cypher, which is a declarative language specifically designed for querying and modifying graph data.

It is worth noting here that while Neo4j Community Edition remains fully open source, this is no longer true for Neo4j Enterprise Edition. The Neo4j Enterprise Edition source code has not been available on GitHub since version 3.5, and consequently the code and commits of subsequent and current releases cannot be reviewed.

One of the main advantages of Neo4j is its ability to handle complex relationships and connections in the data, which makes it well-suited for applications that involve networks or interconnected data. It is also a highly scalable database, with the ability to handle large amounts of data and a high number of read and write operations.

One potential disadvantage of Neo4j is that it is written in Java, which can incur additional overhead due to the need to run the Java Virtual Machine (JVM). This can impact the performance of the database, especially in comparison to databases written in languages such as C++, which do not have the same overhead. However, Neo4j has been optimized to minimize the impact of the JVM and can still provide good performance in many scenarios.

Comparison of AQL and Cypher

To help you make a decision, here are some examples of AQL (ArangoDB Query Language) and Cypher (Neo4j’s query language):

AQL

Find all users with a specific attribute:

FOR u IN users
FILTER u.attribute == "value"
RETURN u

Insert a new document into a collection:

INSERT { _key: "key", attribute: "value" } INTO collection

Update a document in a collection:

UPDATE { _key: "key" } WITH { attribute: "new value" } IN collection
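
The AQL examples above cover document operations only. Since this article is about graph workloads, here is a minimal sketch of an AQL graph traversal that uses bind parameters, held in Python so that the query text and its parameters stay together. The `users` and `follows` collection names, the `users/alice` start vertex, and the `traversal_query` helper are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AqlQuery:
    """An AQL query string paired with its bind parameters."""
    text: str
    bind_vars: dict = field(default_factory=dict)


def traversal_query(start_vertex: str, edge_collection: str, max_depth: int = 2) -> AqlQuery:
    """Build a query that follows outbound edges from a single start vertex.

    @start is a value bind parameter. @@edges is a collection bind
    parameter, so its key in bind_vars carries one leading "@".
    """
    # The depth is written into the query text, so accept only a positive integer.
    if not isinstance(max_depth, int) or isinstance(max_depth, bool) or max_depth < 1:
        raise ValueError("max_depth must be a positive integer")
    text = (
        f"FOR v IN 1..{max_depth} OUTBOUND @start @@edges\n"
        "  RETURN DISTINCT v._key"
    )
    return AqlQuery(text, {"start": start_vertex, "@edges": edge_collection})


# Hypothetical data model: "users" vertices linked by "follows" edges.
query = traversal_query("users/alice", "follows")
print(query.text)
print(query.bind_vars)
```

With the python-arango driver, a query like this would typically be run as `db.aql.execute(query.text, bind_vars=query.bind_vars)`. Treat that call as an assumption to verify against the driver version you use.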

Cypher

Find all nodes with a specific label and property:

MATCH (n:Label {property: "value"})
RETURN n

Create a new node with a label and set its properties:

CREATE (n:Label {property: "value"})
RETURN n

Create a relationship between two nodes:

MATCH (a:Label1), (b:Label2)
WHERE a.property = "value" AND b.property = "value"
CREATE (a)-[r:RELATIONSHIP_TYPE]->(b)
RETURN r
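
Cypher examples like these use literal values, but in application code values are usually passed as parameters. The following minimal sketch pairs a variable-length pattern match with a `$name` parameter, held in Python next to its parameter map. The `User` label, the `FOLLOWS` relationship type, and the `follows_query` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CypherQuery:
    """A Cypher query string paired with its parameters."""
    text: str
    parameters: dict = field(default_factory=dict)


def follows_query(name: str, max_hops: int = 2) -> CypherQuery:
    """Build a query for users reachable within max_hops FOLLOWS relationships.

    $name is passed as a parameter; the hop bound is inlined into the
    pattern after validation.
    """
    if not isinstance(max_hops, int) or isinstance(max_hops, bool) or max_hops < 1:
        raise ValueError("max_hops must be a positive integer")
    text = (
        "MATCH (a:User {name: $name})"
        f"-[:FOLLOWS*1..{max_hops}]->(b:User)\n"
        "RETURN DISTINCT b.name AS name"
    )
    return CypherQuery(text, {"name": name})


# Hypothetical data model: User nodes connected by FOLLOWS relationships.
query = follows_query("Alice")
print(query.text)
print(query.parameters)
```

With the official Neo4j Python driver, a query like this would typically be run inside a session as `session.run(query.text, query.parameters)`. Treat that call as an assumption to verify against the driver version you use.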

Choosing the Right Database


While Neo4j remains the most popular graph database, it faces increasing competition from multi-model databases with graph capabilities. Of the top five most popular databases capable of graph, four are multi-model, and ArangoDB is among them. That said, popularity is not a good metric on which to base your evaluation. It can offer a starting point, but the suitability of a database must be considered for your specific use case.

So, out of ArangoDB and Neo4j, which database is the better choice for your current and potential future projects? This ultimately depends on your foreseeable needs and requirements.

If you are looking for a database that has a query language similar to SQL, good cloud integration, and the flexibility to handle not only graph but also document and key-value data, then ArangoDB might be a better fit.

On the other hand, if your application is committed to graph and needs a graph-only database that is optimized for handling complex relationships in the data, Neo4j might be a better choice.