GrapheneDB

View Current Viewing Revision #9 from 11/24/2019 5:51 p.m.

GrapheneDB is a cloud-based Database-as-a-Service provider for graph databases based on Neo4j. It stores data as graphs, the most generic of data structures, and is especially suited to datasets where the relationships between records matter.

History

In 2012, the founder of GrapheneDB, Alberto Perdomo, wanted to use Neo4j for a project. Setting up and monitoring servers and finding a place to host the database proved challenging. He therefore created GrapheneDB so that clients could focus on learning Cypher and graph modeling in Neo4j and on developing their applications.

Query Interface

Cypher

Cypher is Neo4j's graph query language, which lets users store and retrieve data from the graph database. Because GrapheneDB is built on Neo4j, it uses Cypher as well. Cypher was inspired by SQL and offers similar functionality. It is declarative: a query selects, inserts, updates, or deletes data by describing what is wanted, without spelling out how to do it.
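To make the declarative style concrete, here is a minimal sketch of Cypher statements held as Python strings. The `Person` label, the `KNOWS` relationship type, and the property names are hypothetical, and sending these statements to a database would need a driver and connection details that are not shown here.

```python
# Hypothetical schema: Person nodes connected by KNOWS relationships.
# Each statement says what to match or change, not how to walk the graph.

CREATE_PERSON = "CREATE (p:Person {name: $name}) RETURN p"

FIND_FRIENDS = (
    "MATCH (a:Person {name: $name})-[:KNOWS]->(b:Person) "
    "RETURN b.name AS friend"
)

UPDATE_AGE = "MATCH (p:Person {name: $name}) SET p.age = $age"

DELETE_PERSON = "MATCH (p:Person {name: $name}) DETACH DELETE p"

# Values travel separately as parameters instead of being spliced into the text.
params = {"name": "Alice", "age": 30}
```

Keeping values in a parameter map rather than in the query text is a common choice because the same statement text can be reused with different inputs.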

Indexes

B+Tree, Inverted Index (Full Text)

GrapheneDB uses B+ trees as its native index. The key size is limited, however, so a transaction that tries to index a key larger than the maximum key size will fail. A user can use a Lucene index instead if the key size limit of the B+ tree index is a problem. Full-text search is also supported, with the index kept up to date automatically whenever new data is written.

Stored Procedures

Supported

A user can invoke a procedure from Cypher with the CALL clause. The arguments when calling the procedure can either come from the query or from the associated parameter set.

Compression

Bit Packing / Mostly Encoding

Neo4j uses "bit shaving" to reduce the number of bits needed to store primitive types in arrays. For strings, it defines character classes, each with a maximum length; for example, the "Numerical, Date, and Hex" class has a 54-character limit. Whether a string fits one of these classes determines whether it can be inlined, which compresses the data and reduces the disk space required. GrapheneDB uses Neo4j as its graph database and therefore inherits these compression choices when storing data.
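The general idea of storing an array with only as many bits per element as its largest value needs can be sketched as follows. This is an illustration of bit packing in general, not Neo4j's actual on-disk format; the function names are hypothetical and the sketch assumes non-negative integers.

```python
def bits_needed(values):
    """Smallest bit width that can hold every value (non-negative ints only)."""
    if not values:
        raise ValueError("values must not be empty")
    return max(1, max(values).bit_length())


def pack(values):
    """Pack the values into one integer, using the same width for each element."""
    width = bits_needed(values)
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * width)  # element i occupies bits [i*width, (i+1)*width)
    return width, len(values), packed


def unpack(width, count, packed):
    """Reverse of pack(): slice the integer back into `count` elements."""
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


width, count, packed = pack([3, 7, 1, 5])
restored = unpack(width, count, packed)
```

Here four small values need 3 bits each, so the array fits in 12 bits instead of four full-width integers.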

Joins

Hash Join

Neo4j uses hash joins, with the left operator as the build input and the right operator as the probe input. NodeHashJoin joins on node ids, whereas ValueHashJoin joins on the values of two specified columns. Because GrapheneDB is a Database-as-a-Service built on Neo4j, it uses the same join implementation.
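The build and probe phases of a hash join can be sketched in a few lines. This is a generic illustration, not Neo4j's operator code; rows are modeled as plain dictionaries and the `key` parameter is an assumption made for the example.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Join two row sets on `key` using a build phase and a probe phase."""
    # Build phase: hash every row of the left (build) input by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)

    # Probe phase: look up each row of the right (probe) input in the table.
    for row in probe_rows:
        for match in table.get(row[key], []):
            yield {**match, **row}


left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
right = [{"id": 2, "city": "x"}, {"id": 3, "city": "y"}]
joined = list(hash_join(left, right, "id"))
```

Only the build side is held in memory as a hash table, which is why the smaller input is usually the better candidate for it.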

Foreign Keys

Supported

In relational databases, foreign keys are the references the database must look up when a query performs a JOIN. In graph databases, relationships are as important as the data itself, so each node is stored together with its relationships and links to adjacent nodes. As a result, following relationships between records, the graph equivalent of a JOIN, is fast because most of the information needed is already in the stored data.

Logging

Logical Logging

Neo4j uses logical logging to recover the database after an unclean shutdown and to support incremental backups. The recommended retention period for logs is about 7 days.
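Recovery from a logical log amounts to replaying recorded operations on top of the last known good state. The sketch below assumes a simplified key-value state and a hypothetical `(operation, key, value)` entry format; it does not reflect Neo4j's real transaction log layout.

```python
def apply(state, entry):
    """Apply one logical log entry to the in-memory state."""
    op, key, value = entry
    if op == "set":
        state[key] = value
    elif op == "delete":
        state.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")
    return state


def recover(snapshot, log):
    """Rebuild the current state from a snapshot plus the entries logged after it."""
    state = dict(snapshot)  # copy, so the snapshot itself stays untouched
    for entry in log:
        apply(state, entry)
    return state


snapshot = {"a": 1}
log = [("set", "b", 2), ("delete", "a", None), ("set", "b", 3)]
recovered = recover(snapshot, log)
```

The same replay step is what makes incremental backups possible: a full backup plays the role of the snapshot, and only the log entries written since then need to be shipped.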

Query Execution

Tuple-at-a-Time Model

GrapheneDB uses Neo4j, which follows the tuple-at-a-time model: each operator in the plan tree consumes one input row at a time, evaluates it, and passes its output up to its parent operator as that operator's input.
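Python generators give a compact way to picture this model, since each operator pulls one row at a time from its child. The operator names (`scan`, `filter_op`, `project`) and the sample rows are hypothetical and chosen only for illustration.

```python
def scan(rows):
    """Leaf operator: emit stored rows one at a time."""
    for row in rows:
        yield row


def filter_op(child, predicate):
    """Pass along only the rows for which the predicate holds."""
    for row in child:
        if predicate(row):
            yield row


def project(child, columns):
    """Keep only the requested columns of each row."""
    for row in child:
        yield {c: row[c] for c in columns}


rows = [
    {"name": "Ann", "age": 41},
    {"name": "Bob", "age": 25},
    {"name": "Cy", "age": 33},
]

# The plan is a tree of operators; nothing runs until the root is asked for rows.
plan = project(filter_op(scan(rows), lambda r: r["age"] > 30), ["name"])
result = list(plan)
```

Each row flows through the whole pipeline before the next one is read, so no operator needs to materialize its full input.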

Storage Architecture

Disk-oriented

GrapheneDB stores its database on disk. It optimizes performance by storing information in a cache when possible.

Storage Model

Custom

GrapheneDB uses Neo4j, which uses a custom storage model. Because Neo4j does not have a schema, nodes, relationships, and key-value properties are kept in store files, and each record is stored at a particular offset in its file.
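Storing records at fixed offsets means a record can be located by arithmetic on its id rather than by searching. The sketch below uses a hypothetical 9-byte node record (an in-use flag plus two 4-byte ids); the field layout is an assumption for illustration and should not be read as Neo4j's exact format.

```python
import struct

# Hypothetical record: in-use flag, first relationship id, first property id.
# "<" selects little-endian with no padding, so the record is 1 + 4 + 4 = 9 bytes.
RECORD = struct.Struct("<?II")


def write_record(store, record_id, in_use, first_rel, first_prop):
    """Write a record at offset record_id * RECORD.size, growing the store if needed."""
    offset = record_id * RECORD.size
    end = offset + RECORD.size
    if len(store) < end:
        store.extend(bytes(end - len(store)))  # zero-fill the gap
    RECORD.pack_into(store, offset, in_use, first_rel, first_prop)


def read_record(store, record_id):
    """Read a record directly from its computed offset."""
    return RECORD.unpack_from(store, record_id * RECORD.size)


store = bytearray()  # stands in for a store file
write_record(store, 2, True, 7, 11)
record = read_record(store, 2)
```

Because every record has the same size, looking up record 2 costs the same as looking up record 2,000,000.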

Data Model

Graph

GrapheneDB is a Database-as-a-Service that uses Neo4j as the underlying graph database. Graph databases are effective for highly interconnected data because they store data as nodes and the relationships between those nodes. Querying such data is fast since each node's relationships are stored with the node.
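A minimal sketch of this idea keeps each node's relationships on the node itself, so finding neighbors is a direct lookup rather than a search across a separate table. The `Node` class, the `neighbors` helper, and the relationship types are hypothetical.

```python
class Node:
    """A node that carries its own outgoing relationships."""

    def __init__(self, name):
        self.name = name
        self.outgoing = []  # list of (relationship_type, target Node)

    def relate(self, rel_type, other):
        self.outgoing.append((rel_type, other))


def neighbors(node, rel_type):
    """Names of nodes reachable from `node` over one relationship of the given type."""
    return [target.name for kind, target in node.outgoing if kind == rel_type]


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.relate("KNOWS", bob)
alice.relate("WORKS_WITH", carol)
bob.relate("KNOWS", carol)

# Two-hop traversal: follow KNOWS from alice, then KNOWS again from each friend.
friends_of_friends = [
    name
    for kind, friend in alice.outgoing
    if kind == "KNOWS"
    for name in neighbors(friend, "KNOWS")
]
```

The cost of each hop depends on how many relationships the current node has, not on the total size of the graph.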

Checkpoints

Non-Blocking

GrapheneDB is a Database-as-a-Service provider that uses Neo4j. Neo4j uses non-blocking checkpoints, so it can be backed up while it serves user traffic. Neo4j can take daily or weekly full backups, each of which produces a database image on disk. It also provides incremental backups that can be taken hourly or daily. Combining incremental and full backups balances safety and efficiency. GrapheneDB provides similar backups: daily, weekly, monthly, or on demand. Each backup captures a current snapshot of the database for recovery (if needed) and does not require downtime.

Concurrency Control

Two-Phase Locking (Deadlock Detection)

GrapheneDB uses Neo4j as its graph database. Neo4j performs deadlock detection: the transaction that causes a deadlock is rolled back so that the other transactions can continue. The user can retry the rolled-back transaction either by asking for it to be attempted a certain number of times or through a retry loop.
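A bounded retry loop of the kind described above might look like the following sketch. `DeadlockDetected` is a hypothetical stand-in for whatever error a real driver raises when a transaction is rolled back, and `run_with_retry` is likewise an illustrative helper rather than part of any driver API.

```python
class DeadlockDetected(Exception):
    """Hypothetical stand-in for the error raised when a transaction is rolled back."""


def run_with_retry(work, max_attempts=3):
    """Call `work` until it succeeds or the attempt budget is used up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller


calls = {"n": 0}


def flaky_transaction():
    """Simulated unit of work that deadlocks twice before committing."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockDetected()
    return "committed"


outcome = run_with_retry(flaky_transaction, max_attempts=5)
```

Capping the number of attempts keeps a persistently conflicting transaction from retrying forever; a real client would typically also wait briefly between attempts.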

Revision #9 | Updated 11/24/2019 5:51 p.m.

Website

https://www.graphenedb.com/

Tech Docs

http://www.graphenedb.com/docs

Developer

GrapheneDB Labs, S.L.

Country of Origin

Start Year

2012