Memgraph

Revision #19 from 01/03/2023 12:30 p.m.

Memgraph is a native, fully distributed, in-memory graph database built for real-time use cases. It provides ACID-compliant transactions with strong consistency and uses the Raft algorithm for replication. It supports the standardized openCypher query language and ships with graph algorithms such as breadth-first search and weighted shortest path. Memgraph is available on-premises or in the cloud, in either a single-node or a distributed version.

History

Memgraph is a graph database start-up that was founded in 2016 by Dominik Tomicevic and Marko Budiselic.

Checkpoints

Consistent

Snapshots are taken periodically during the runtime of Memgraph. When a snapshot is triggered, the whole data storage is written to disk. Database recovery is done on startup from the most recent snapshot file.
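The snapshot-and-recover cycle described above can be sketched in a few lines of Python. This is a minimal illustration and not Memgraph's implementation: the JSON file format, the `snapshot_<sequence>.json` naming scheme, and the function names `write_snapshot` and `recover_latest` are all assumptions made for the example.

```python
import json
import os
import tempfile


def write_snapshot(directory, graph, sequence):
    """Write the whole in-memory store to a new snapshot file (hypothetical format)."""
    path = os.path.join(directory, f"snapshot_{sequence:08d}.json")
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(graph, f)
    # Rename only after the file is complete, so a crash never leaves
    # a half-written file under the final snapshot name.
    os.replace(tmp_path, path)
    return path


def recover_latest(directory):
    """Load the most recent snapshot, or return an empty store if none exists."""
    names = sorted(
        n for n in os.listdir(directory)
        if n.startswith("snapshot_") and n.endswith(".json")
    )
    if not names:
        return {"nodes": {}, "edges": []}
    with open(os.path.join(directory, names[-1])) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as d:
    graph = {"nodes": {"1": {"name": "Ana"}}, "edges": []}
    write_snapshot(d, graph, 1)            # first periodic snapshot
    graph["nodes"]["2"] = {"name": "Ivan"}
    write_snapshot(d, graph, 2)            # a later snapshot
    recovered = recover_latest(d)          # what a restart would load
```

Zero-padding the sequence number lets a plain lexicographic sort find the newest file.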

Concurrency Control

Multi-version Concurrency Control (MVCC)

Memgraph supports strongly-consistent ACID transactions and uses multi-version concurrency control.

Data Model

Graph

Memgraph's underlying data model is the property graph model. A property graph consists of nodes and relationships (or vertices and edges). Nodes can hold any number of properties (key-value pairs) and can be assigned labels that represent their roles within a particular domain. For example, a node with a label "person" could have properties such as "name" or "age." Relationships are directed connections between two nodes and contain properties (just like nodes) as well as a single edge type. For example, a relationship in a graph of people can have the edge type "friend of," and a property "since."
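As a rough illustration of the property graph model described above, the following Python sketch models nodes and relationships as dataclasses. The class and field names are assumptions chosen for the example; they do not reflect Memgraph's internal representation.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    id: int
    labels: set[str] = field(default_factory=set)               # zero or more labels
    properties: dict[str, Any] = field(default_factory=dict)    # key-value pairs


@dataclass
class Relationship:
    start: Node          # relationships are directed: start -> end
    end: Node
    edge_type: str       # exactly one edge type per relationship
    properties: dict[str, Any] = field(default_factory=dict)


ana = Node(1, {"person"}, {"name": "Ana", "age": 34})
ivan = Node(2, {"person"}, {"name": "Ivan", "age": 29})
friendship = Relationship(ana, ivan, "friend of", {"since": 2015})
```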

Foreign Keys

Not Supported

In graph data models, relationships are first-class elements stored alongside the data itself, so foreign keys are not needed to infer connections between entities.

Indexes

Skip List

The central part of Memgraph's index data structure is a highly-concurrent skip list.
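For readers unfamiliar with the data structure, here is a minimal single-threaded skip list in Python. It only shows the layered linked-list idea; the concurrency that the text attributes to Memgraph's index is deliberately left out, and the class name, `MAX_LEVEL`, and the 0.5 promotion probability are assumptions of this sketch.

```python
import random


class _Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level   # forward[i] is the next node on level i


class SkipList:
    MAX_LEVEL = 16

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, self.MAX_LEVEL)   # sentinel, holds no key
        self._level = 1                            # number of levels in use

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key):
        """Insert key; return False if it was already present."""
        update = [self._head] * self.MAX_LEVEL     # last node visited per level
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        nxt = node.forward[0]
        if nxt is not None and nxt.key == key:
            return False
        level = self._random_level()
        self._level = max(self._level, level)
        new_node = _Node(key, level)
        for i in range(level):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node
        return True

    def __contains__(self, key):
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def __iter__(self):
        node = self._head.forward[0]               # level 0 holds every key in order
        while node is not None:
            yield node.key
            node = node.forward[0]


index = SkipList(seed=42)
inserted = [index.insert(k) for k in (30, 10, 20, 10)]
ordered_keys = list(index)
```

Upper levels act as express lanes that let a search skip over many keys, which gives expected logarithmic lookup and insert cost.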

Isolation Levels

Snapshot Isolation

Memgraph’s implementation of MVCC provides the snapshot isolation level.
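To make the idea concrete, the toy store below shows how keeping multiple versions lets each transaction read from the state as of its start, and how a first-committer-wins check rejects conflicting writes. It is a sketch under simplifying assumptions (single-threaded, integer timestamps, hypothetical class names `VersionedStore` and `Transaction`) and says nothing about how Memgraph implements MVCC internally.

```python
import itertools


class VersionedStore:
    """Toy multi-version store: each key maps to a list of (commit_ts, value)."""

    def __init__(self):
        self._versions = {}
        self._clock = itertools.count(1)
        self._last_commit = 0

    def begin(self):
        # A transaction's snapshot is everything committed so far.
        return Transaction(self, self._last_commit)

    def _read(self, key, snapshot_ts):
        # Newest version committed at or before the reader's snapshot.
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value
        return None

    def _commit(self, snapshot_ts, writes):
        # First committer wins: abort if another transaction committed
        # a newer version of any key we are about to write.
        for key in writes:
            versions = self._versions.get(key, [])
            if versions and versions[-1][0] > snapshot_ts:
                raise RuntimeError(f"write-write conflict on {key!r}")
        ts = next(self._clock)
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts


class Transaction:
    def __init__(self, store, snapshot_ts):
        self._store = store
        self._snapshot_ts = snapshot_ts
        self._writes = {}

    def read(self, key):
        if key in self._writes:            # a transaction sees its own writes
            return self._writes[key]
        return self._store._read(key, self._snapshot_ts)

    def write(self, key, value):
        self._writes[key] = value

    def commit(self):
        return self._store._commit(self._snapshot_ts, self._writes)


store = VersionedStore()
setup = store.begin()
setup.write("age", 30)
setup.commit()

reader = store.begin()                     # snapshot taken here
writer = store.begin()
writer.write("age", 31)
writer.commit()

seen_by_reader = reader.read("age")            # still the old version
seen_by_newcomer = store.begin().read("age")   # sees the committed update

reader.write("age", 99)
try:
    reader.commit()
    conflict_detected = False
except RuntimeError:
    conflict_detected = True
```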

Logging

Physiological Logging

Memgraph uses write-ahead logging.
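The core rule of write-ahead logging is that a change reaches durable storage in the log before it is applied to the live data. The sketch below illustrates that rule with a toy key-value store. Note that it logs logical key-value updates as JSON lines, which is a simplification and not the physiological record format named above; the class name `WalStore` and the file layout are assumptions.

```python
import json
import os
import tempfile


class WalStore:
    """Toy in-memory store that logs every update before applying it."""

    def __init__(self, log_path):
        self._log_path = log_path
        self.data = {}
        self._replay()                       # rebuild state from the log
        self._log = open(log_path, "a")

    def _replay(self):
        if not os.path.exists(self._log_path):
            return
        with open(self._log_path) as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def set(self, key, value):
        # 1. Append the record and force it to disk ...
        self._log.write(json.dumps({"key": key, "value": value}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        # 2. ... and only then change the in-memory state.
        self.data[key] = value

    def close(self):
        self._log.close()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "wal.log")
    store = WalStore(path)
    store.set("name", "Ana")
    store.set("age", 34)
    store.close()                            # simulate a shutdown

    restarted = WalStore(path)               # state is rebuilt from the log
    recovered_data = dict(restarted.data)
    restarted.close()
```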

Parallel Execution

Intra-Operator (Horizontal)

Memgraph features a distributed query execution engine. Each query plan is split into two parts: one that runs on the machine that received the query and one that runs on the other machines. The machines can exchange data during execution.

Query Compilation

Not Supported

Query Execution

Tuple-at-a-Time Model

The execution of a query proceeds iteratively, generating one entry of the result set at a time.
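Python generators give a compact way to picture this iterative, pull-based style of execution. In the sketch below, each operator yields one row at a time and only does work when its parent asks for the next row. The operator names (`scan`, `select`, `project`) and the sample data are assumptions for illustration, not Memgraph's operators.

```python
pulled = []   # records which rows the scan has actually produced so far


def scan(rows):
    for row in rows:
        pulled.append(row["name"])
        yield row


def select(child, predicate):
    for row in child:
        if predicate(row):
            yield row


def project(child, columns):
    for row in child:
        yield {c: row[c] for c in columns}


people = [
    {"name": "Ana", "age": 34},
    {"name": "Ivan", "age": 29},
    {"name": "Maja", "age": 41},
]

plan = project(select(scan(people), lambda r: r["age"] > 30), ["name"])

first = next(plan)                 # pulls exactly one result through the plan
pulled_after_first = list(pulled)  # the scan has only advanced as far as needed
rest = list(plan)                  # drain the remaining results
```

Because nothing is materialized between operators, a client that stops after the first result never pays for scanning the remaining rows.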

Query Interface

Cypher

Memgraph supports openCypher, a declarative language for querying graph databases. openCypher is an open-source project, initiated by Neo4j, to standardize Cypher as the query language for graph processing.

Storage Architecture

In-Memory Hybrid

Although Memgraph is an in-memory database by default, it offers an option to store selected properties on disk. The list of properties to keep on disk is passed in via the command line. This feature is recommended for large, cold properties, i.e., properties that are rarely accessed. The storage location of a property cannot be changed while Memgraph is running. The user can, however, reload the database from a snapshot and provide a different list of properties to store on disk.

Additionally, snapshots and the write-ahead log are stored on disk.
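The on-disk property option can be pictured as a store that routes each property either to memory or to disk, based on a list fixed at startup. The following sketch is purely illustrative: the class name `HybridPropertyStore`, the one-file-per-property layout, and the constructor argument are assumptions, and the actual command-line flag is not shown because the text does not name it.

```python
import json
import os
import tempfile


class HybridPropertyStore:
    """Toy property store: listed property names live on disk, the rest in memory."""

    def __init__(self, disk_dir, on_disk_properties):
        self._disk_dir = disk_dir
        # Fixed for the lifetime of the store, mirroring the rule that the
        # storage location cannot change while the database is running.
        self._on_disk = frozenset(on_disk_properties)
        self._memory = {}

    def _path(self, node_id, name):
        return os.path.join(self._disk_dir, f"{node_id}.{name}.json")

    def set(self, node_id, name, value):
        if name in self._on_disk:
            with open(self._path(node_id, name), "w") as f:
                json.dump(value, f)
        else:
            self._memory[(node_id, name)] = value

    def get(self, node_id, name):
        if name in self._on_disk:
            try:
                with open(self._path(node_id, name)) as f:
                    return json.load(f)
            except FileNotFoundError:
                return None
        return self._memory.get((node_id, name))


with tempfile.TemporaryDirectory() as d:
    store = HybridPropertyStore(d, on_disk_properties=["biography"])
    store.set(1, "name", "Ana")                               # hot, stays in memory
    store.set(1, "biography", "a long, rarely read text")     # cold, goes to disk
    files_on_disk = sorted(os.listdir(d))
    name = store.get(1, "name")
    biography = store.get(1, "biography")
```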

Storage Model

Custom

Since Memgraph is a graph DBMS, data is stored in the form of graph elements: nodes and edges. Each graph element can contain various types of data.

Stored Procedures

Not Supported

Although the openCypher initiative stems from Neo4j's Cypher query language, Memgraph's openCypher implementation does not support certain Cypher constructs, such as stored procedures.