etcd

etcd is a distributed key-value store that is highly available, strongly consistent, and watchable for changes. The name "etcd" combines "etc", the Unix configuration directory, with "d" for distributed system. etcd has been adopted by cloud-native systems such as Kubernetes, Cloud Foundry Diego, and Project Calico. Major use cases include metadata storage and distributed coordination.

History

etcd was originally started for two use cases: reboot coordination and application configuration. CoreOS used etcd to coordinate reboots across a CoreOS cluster so that the nodes did not all reboot at the same time. etcd was also used to store application configuration, so that whenever a server starts or restarts, or whenever the application configuration is updated, the server receives the application configuration from etcd.

Stored Procedures

Not Supported

Storage Model

N-ary Storage Model (Row/Record)

etcd physically stores data as key-value pairs. The key consists of a 3-tuple: major, sub, and type. Major contains the revision (a counter that is incremented whenever a data modification is requested). Sub identifies a key within a revision, because a single transaction may produce one revision that covers multiple keys. Type is optional; one use case is marking a tombstone. The value contains a delta from the previous version.
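To make the 3-tuple key concrete, the following is a minimal sketch of how such a key could be serialized so that byte order matches revision order. The class and function names are hypothetical, and the byte layout (two big-endian 64-bit integers joined by an underscore, with an optional trailing marker byte for a tombstone) is an assumption made for illustration, not a statement of etcd's on-disk format.

```python
import struct
from typing import NamedTuple


class RevisionKey(NamedTuple):
    """Hypothetical 3-tuple key: major revision, sub revision, tombstone flag."""
    major: int
    sub: int
    tombstone: bool = False


def encode(key: RevisionKey) -> bytes:
    # Assumed layout: 8-byte big-endian major, "_", 8-byte big-endian sub,
    # plus an optional "t" byte when the key marks a tombstone.
    raw = struct.pack(">Q", key.major) + b"_" + struct.pack(">Q", key.sub)
    if key.tombstone:
        raw += b"t"
    return raw


def decode(raw: bytes) -> RevisionKey:
    if len(raw) not in (17, 18):
        raise ValueError("unexpected key length")
    major = struct.unpack(">Q", raw[0:8])[0]
    sub = struct.unpack(">Q", raw[9:17])[0]
    return RevisionKey(major, sub, len(raw) == 18)
```

Because the integers are big-endian and fixed-width, sorting the encoded bytes gives the same order as sorting by (major, sub), which is what a store sorted by key needs.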

Indexes

B+Tree

etcd creates an in-memory B-tree index over keys and uses it to provide range operations.
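The idea of an ordered in-memory index that answers range queries can be sketched as follows. This is not a B-tree: a sorted list with binary search stands in for the tree, and the names KeyIndex, put, and range are hypothetical. The index maps each user key to the revisions at which it was written.

```python
import bisect


class KeyIndex:
    """Sketch of an ordered in-memory index (a sorted list stands in for a B-tree)."""

    def __init__(self):
        self._keys = []   # user keys, kept in sorted order
        self._revs = {}   # user key -> list of (major, sub) revisions

    def put(self, key: bytes, rev: tuple) -> None:
        if key not in self._revs:
            bisect.insort(self._keys, key)
            self._revs[key] = []
        self._revs[key].append(rev)

    def range(self, start: bytes, end: bytes) -> list:
        # Return (key, latest revision) for every key in the half-open
        # interval [start, end).
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._revs[k][-1]) for k in self._keys[lo:hi]]
```

A lookup first finds the revision in the index and would then fetch the value stored under that revision; the second step is left out of this sketch.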

Query Compilation

Not Supported

Logging

Command Logging

The documentation mentions that etcd uses a write-ahead log (WAL) to recover from failures, but it does not say what kind of information is logged. According to the source code, it seems to log gRPC requests in the WAL.
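Command logging in general can be illustrated with a small sketch: each request is appended to a log before it is applied, and the state is rebuilt after a failure by replaying the log from the start. The JSON-lines record format and the function names here are hypothetical and say nothing about the actual contents or encoding of etcd's WAL.

```python
import io
import json


def append(wal, op, key, value=None):
    # Record the request itself (the "command"), not the resulting data pages.
    wal.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")
    wal.flush()  # a real log would also need to sync to stable storage


def replay(wal):
    # Rebuild the in-memory state by re-applying every logged command in order.
    state = {}
    wal.seek(0)
    for line in wal:
        rec = json.loads(line)
        if rec["op"] == "put":
            state[rec["key"]] = rec["value"]
        elif rec["op"] == "delete":
            state.pop(rec["key"], None)
    return state
```

Any text file-like object works as the log, for example io.StringIO() for experiments or a file opened in "a+" mode.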

Storage Architecture

Disk-oriented

etcd stores key-value pairs on persistent disk in a B+ tree structure sorted by key.

Data Model

Key/Value

etcd stores data as multiversion key-value pairs. Each mutating operation (e.g. PUT) creates a new version and does not change older versions. Previous versions remain accessible until they are compacted.
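The multiversion behavior described above can be sketched with a toy in-memory store. The class MVCCStore and its methods are hypothetical, and the sketch simplifies several things: it keeps full values rather than deltas, and a read at a compacted revision simply returns None.

```python
class MVCCStore:
    """Toy multiversion store: every put adds a version, none is overwritten."""

    def __init__(self):
        self.revision = 0
        self._history = {}  # key -> list of (revision, value), ascending

    def put(self, key, value):
        self.revision += 1
        self._history.setdefault(key, []).append((self.revision, value))
        return self.revision

    def get(self, key, at_revision=None):
        rev = self.revision if at_revision is None else at_revision
        # Newest version whose revision is not later than the requested one.
        for r, v in reversed(self._history.get(key, [])):
            if r <= rev:
                return v
        return None

    def compact(self, revision):
        # For each key, drop versions superseded at or before `revision`.
        for key, versions in list(self._history.items()):
            older = [i for i, (r, _) in enumerate(versions) if r <= revision]
            if older:
                self._history[key] = versions[older[-1]:]
```

After two puts to the same key, a read at the first revision still returns the first value; once the store is compacted at the second revision, only the newer value remains readable.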

System Architecture

Shared-Nothing

etcd does not require shared resources; each node works with its own independent resources. A typical deployment model is to use a cloud service such as AWS and set up the cluster across multiple virtual machines.