Apache Geode is an open-source, in-memory distributed database designed for transactional applications that demand low latency and high concurrency.
Geode was first developed by GemStone Systems in 2002 under the name "GemFire" and was used by financial services providers, including major Wall Street banks. In 2010, GemStone Systems was acquired by VMware. In 2013, Pivotal was spun out of VMware and took GemFire with it, and in 2015 Pivotal contributed the project to the Apache Incubator as "Geode". Finally, in November 2016, the Apache Software Foundation (ASF) announced that Apache Geode had graduated to a Top-Level Project (TLP).
By default, Geode provides a Snappy compressor, and users can plug in their own compressor on a per-region basis. When compression is enabled for a region, all of its values are kept compressed in memory and are decompressed when they are read from the cache.
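The compress-on-write, decompress-on-read behaviour can be sketched with a small custom compressor. This is only an illustration: `ValueCompressor` is a hypothetical stand-in interface defined here (it is not Geode's own type), and DEFLATE from `java.util.zip` is used simply because it ships with the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical stand-in for a per-region compressor contract (not Geode's own type).
interface ValueCompressor {
    byte[] compress(byte[] input);
    byte[] decompress(byte[] input);
}

// Illustrative compressor built on the JDK's DEFLATE implementation.
class DeflateValueCompressor implements ValueCompressor {
    @Override
    public byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    @Override
    public byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and no end marker means the input is incomplete.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid compressed value");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed value", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}

class CompressorDemo {
    public static void main(String[] args) {
        ValueCompressor compressor = new DeflateValueCompressor();
        byte[] value = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".getBytes(StandardCharsets.UTF_8);
        byte[] stored = compressor.compress(value);   // form that would be held in memory
        byte[] read = compressor.decompress(stored);  // form handed back to a reader
        System.out.println("round trip ok: " + Arrays.equals(value, read));
    }
}
```

Whether a given compressor pays off depends on the data: values that are small or already compressed may not shrink at all.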
For a partitioned region, all updates to a specific key are routed to the Geode member that holds the primary copy of that key, so updates to the key are serialized through that member.
For a replicated region, any member that hosts the region can update a key and propagate the update to the other members. Consistency is maintained by storing, in each region entry, a version stamp and the ID of the Geode member that last updated the entry. The version stamp is incremented each time the entry is updated. When an update message arrives, the receiving member or client compares the message's version stamp with the one in its local cache. If the message's stamp is larger, the receiver applies the update and records the new version information. If the two stamps are equal, the update from the member with the higher member ID is kept. Concurrency control for non-replicated regions and client caches works in a similar way.
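The comparison rule described above can be written down as a small function. This is a minimal sketch under stated assumptions: `VersionedEntry` and `VersionCheck` are hypothetical names, and member IDs are modelled as plain integers purely so they can be compared.

```java
// Hypothetical model of the per-entry metadata described above.
final class VersionedEntry {
    final String value;
    final long version;  // version stamp, incremented on each update
    final int memberId;  // assumed integer ID of the member that last updated the entry

    VersionedEntry(String value, long version, int memberId) {
        this.value = value;
        this.version = version;
        this.memberId = memberId;
    }
}

class VersionCheck {
    // Returns the entry a receiver should keep after an update message arrives.
    static VersionedEntry resolve(VersionedEntry local, VersionedEntry incoming) {
        if (incoming.version > local.version) {
            return incoming;  // newer version stamp: apply the update
        }
        if (incoming.version == local.version && incoming.memberId > local.memberId) {
            return incoming;  // same stamp: the higher member ID wins
        }
        return local;         // stale or losing update: keep the local entry
    }

    public static void main(String[] args) {
        VersionedEntry local = new VersionedEntry("v-from-member-1", 4, 1);
        VersionedEntry tie = new VersionedEntry("v-from-member-2", 4, 2);
        VersionedEntry stale = new VersionedEntry("old", 3, 9);
        System.out.println(resolve(local, tie).value);
        System.out.println(resolve(local, stale).value);
    }
}
```

Because every receiver applies the same deterministic rule, members that see the same two concurrent updates in different orders still converge on the same entry.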
In a Geode distributed system, a cache is an abstraction that describes a node of the in-memory data store. Each cache contains regions, in which data is stored as key-value pairs. Caches are therefore analogous to databases, and regions to tables, in a relational database.
Although Geode doesn't support foreign keys, it provides a similar feature called data colocation, which stores related data entries that share the same ID but belong to different regions on a single member. For example, suppose a Geode system contains a customer records region and a customer orders region, and the two are related through the customer.
With colocation, all record and order information for a given customer is kept in the cache of a single member, so operations that concern only this customer need to access only that member.
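The idea behind colocation can be sketched as routing both regions on the customer ID instead of on each region's full key. The code below is a simplified model, not Geode's implementation: `OrderKey`, `ColocationSketch`, the bucket count, the member count, and the bucket-to-member assignment are all assumptions made for illustration. In Geode itself this role is played by a partition resolver together with the region's colocation setting.

```java
// Hypothetical composite key for the orders region: it carries the customer ID
// so that an order can be routed the same way as its customer record.
final class OrderKey {
    final String orderId;
    final String customerId;

    OrderKey(String orderId, String customerId) {
        this.orderId = orderId;
        this.customerId = customerId;
    }
}

class ColocationSketch {
    static final int BUCKETS = 113; // assumed bucket count for this sketch
    static final int MEMBERS = 3;   // assumed cluster size

    // Both regions derive the bucket from the customer ID only.
    static int bucketFor(String customerId) {
        return Math.floorMod(customerId.hashCode(), BUCKETS);
    }

    // Simplified, fixed bucket-to-member assignment shared by both regions.
    static int memberFor(int bucket) {
        return bucket % MEMBERS;
    }

    static int memberForCustomer(String customerId) {
        return memberFor(bucketFor(customerId));
    }

    static int memberForOrder(OrderKey key) {
        // The order ID is deliberately ignored for routing.
        return memberFor(bucketFor(key.customerId));
    }

    public static void main(String[] args) {
        String customerId = "customer-42";
        OrderKey order = new OrderKey("order-1001", customerId);
        System.out.println("customer record on member " + memberForCustomer(customerId));
        System.out.println("customer order on member " + memberForOrder(order));
    }
}
```

Since the routing value is identical for a customer and all of that customer's orders, both regions place the related entries on the same member by construction.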
Geode uses hashCode() on the key to map a data entry to a specific position in the region: the key is hashed into one of a fixed number of buckets.
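A minimal sketch of hash-based bucket selection follows. The exact formula Geode applies internally is not given here, so the modulo step below is an assumption, as are the class name `BucketHashing` and the bucket count of 113.

```java
class BucketHashing {
    // Maps a key to a bucket index in [0, totalBuckets) using its hashCode().
    // floorMod keeps the result non-negative even for negative hash codes.
    static int bucketFor(Object key, int totalBuckets) {
        return Math.floorMod(key.hashCode(), totalBuckets);
    }

    public static void main(String[] args) {
        int totalBuckets = 113; // assumed fixed bucket count
        Object[] keys = {"customer-1", "customer-2", 42, -1};
        for (Object key : keys) {
            System.out.println(key + " -> bucket " + bucketFor(key, totalBuckets));
        }
    }
}
```

One consequence of this scheme is that key classes need a hashCode() that is stable across JVMs and consistent with equals(); otherwise the same logical key could be looked up in the wrong bucket.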
Geode supports repeatable read isolation. By default, transactions are isolated at the thread level, so other threads cannot see the changes made by a transaction's thread until the commit begins. Once the commit begins, however, the transaction's partial results become visible in the cache to threads that access the data being changed, which can lead to dirty reads.
Users can prevent dirty reads and obtain a stricter isolation model by setting the JVM system property "-Dgemfire.detectReadConflicts=true".
Geode supports Object Query Language (OQL) for querying region data. Although OQL and SQL share much of their syntax, they differ in important ways: for example, OQL doesn't support aggregation functions, it can query complex object graphs, and by default it queries the values of a region rather than the keys.
Geode also supports query index hints to filter on the specified index by adding "
Geode follows the in-memory data grid (IMDG) model, but it also has a disk store module to handle data overflow and persistence. With a disk store, users can overflow data to disk when memory usage becomes too high, or persist data to disk as a backup copy.