Geode

NoSQL OLTP

Apache Geode is an open-source, in-memory distributed database designed to support transactional applications with the demand for low latency and high concurrency.

History

Geode was first developed by GemStone Systems in 2002 under the name "GemFire" and used in financial services providers including major Wall Street banks. In 2010, GemStone Systems was acquired by VMWare. In 2013, Pivotal was spun out of VMware and took GemFire with it, and in 2015 they outsourced the project to Apache Incubator as "Geode". Finally, in November 2016, the Apache Software Foundation (ASF) announced that Apache Geode has graduated as a Top-Level Project (TLP).

Compression

Bit Packing / Mostly Encoding

By default, Geode supports Snappy compressor, and users are allowed to use their customized compressor by region. When compression is enabled in a region, all the values in memory will be compressed, and they will be decompressed when they are read from the cache.

Concurrency Control

Timestamp Ordering

For a partitioned region, all the updates on a specific key will be routed to the primary Geode member serializable.

For a replicated region, as long as a member hosts the region, it can update the key and propagate it to other members. The consistency is retained by storing the version stamp and the Geode member ID which last updated the entry in each entry of a region. The version stamp will get incremented each time an update for that entry appears. When an update message is received, the Geode member/client will compare the version stamp of the update message to the one in the local cache. If the former one is larger, the member/client will perform the update and update the version stamp information. When the update message has the same version stamp as the local cache, the one with the higher member ID will be kept. The concurrency control of non-replicated regions and client cache is similar to the way in replicated regions.

Data Model

Key/Value

In a Geode distributed, system, caches are defined as an abstraction that describes the node of the in-memory storage for the data. Each cache contains regions, which data is stored in the form of key-value pairs. So caches are similar to the construct of databases and regions are similar to tables in a relational database.

Foreign Keys

Not Supported

Although Geode doesn't support foreign keys, it provides a similar function called data colocation to store related data entries that have the same ID from different data regions into one single member. For example, the Geode system contains one customer records region and one customer orders region and they are related to each other through the customer.

By using colocation, users can maintain all records and orders information for a customer in a cache of a single member, which will be used by all operations regarding this customer only.

Isolation Levels

Read Uncommitted Repeatable Read

Geode supports repeatable read isolation. By default, the transactions are isolated at the thread level. So other threads can't see the changes made by one thread of a transaction until the commit begins. But after the commit begins, in the cache the partial results of the transaction are now visible to threads who visit the changing data, which could lead to dirty reads.

Users can prevent dirty reads and acquire a strict isolation model by setting "-Dgemfire.detectReadConflicts=true".

Query Interface

Custom API

Geode supports Object Query Language (OQL) to query region data. Although OQL and SQL share many syntactical similarities, they differ a lot, such as OQL doesn't support aggregation functions, it supports querying on complex object graphs, and by default, OQL queries on the value of the region rather than the key, etc.

Geode also supports query index hints to filter on the specified index by adding "" at the beginning of the select query.

Storage Architecture

In-Memory

Geode follows the in-memory data grid (IMDG) outline. But it also has a disk store module to deal with data overflow and persistence. With the disk store, users can export data to disk when memory usage becomes too high or to persist data as a backup copy.

System Architecture

Shared-Nothing

Geode is designed in the shared-nothing architecture in order to have low latency and high concurrency performance.

People Also Viewed