Scylla

Scylla is an open-source distributed NoSQL database. It is compatible with Apache Cassandra, using the same protocols (Cassandra Query Language and Thrift) and the same file format (SSTable). It is optimized for workloads that require low latency and high throughput, in addition to Apache Cassandra's high availability, scalability, and fault-tolerance guarantees. Scylla uses a shared-nothing model and a shard-per-core architecture, in which each thread runs on its own CPU core with its own memory and its own queue on a multi-queue network interface controller. Cross-core communication is carried out by explicit message passing, using Seastar, an event-driven asynchronous programming framework.

History

The Scylla project was started in 2014 by the Israeli startup Cloudius Systems (later rebranded as ScyllaDB Inc.), led by Avi Kivity and Dor Laor. It is a C++ reimplementation of Apache Cassandra, which is written in Java, and was released as open source in 2015. The current (late 2018) version is 3.0.

Views

Materialized Views

Scylla 2.0 supports materialized views as an experimental feature. Whenever the base table is updated, the materialized view table is updated automatically. Materialized view tables are distributed in the same way as normal tables and scale equally well. However, the current experimental release still has limitations, including but not limited to the lack of local locking and of a local batch log.
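The automatic propagation from base table to view can be illustrated with a minimal in-memory sketch. It is not Scylla's implementation: the class `BaseWithView` and its methods are hypothetical names, and the sketch ignores distribution, locking, and batch logging entirely. It only shows that every base-table write also removes the stale view row and inserts the new one.

```python
class BaseWithView:
    """Toy base table with one view keyed by a chosen column (hypothetical API)."""

    def __init__(self, view_key):
        self.view_key = view_key
        self.base = {}  # primary key -> row
        self.view = {}  # (view column value, primary key) -> row

    def upsert(self, pk, row):
        old = self.base.get(pk)
        if old is not None:
            # Remove the view row derived from the previous version of the base row.
            self.view.pop((old[self.view_key], pk), None)
        self.base[pk] = dict(row)
        self.view[(row[self.view_key], pk)] = dict(row)

    def delete(self, pk):
        old = self.base.pop(pk, None)
        if old is not None:
            self.view.pop((old[self.view_key], pk), None)

    def by_view_key(self, value):
        """Look rows up through the view instead of the primary key."""
        return [row for (v, _), row in self.view.items() if v == value]


users = BaseWithView(view_key="city")
users.upsert(1, {"name": "Ada", "city": "London"})
users.upsert(2, {"name": "Bob", "city": "Paris"})
users.upsert(1, {"name": "Ada", "city": "Paris"})  # the view follows the change
```

After the last call, a lookup of `"London"` through the view finds nothing, while `"Paris"` returns both rows.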

Storage Organization

Log-structured

Storage Architecture

Disk-oriented

Logging

Logical Logging

Scylla writes a commitlog entry for each incoming write request. Before a mutation is applied to the MemTable, the commitlog entry containing the mutation's data is written to disk to guarantee durability.
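The log-before-apply ordering can be sketched in a few lines. This is a simplified illustration, not Scylla's commitlog format: `TinyStore`, the JSON-lines file layout, and the per-write `fsync` are all assumptions made for the example.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy store that logs each mutation before applying it in memory."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}

    def write(self, key, value):
        record = json.dumps({"key": key, "value": value})
        # Durability first: append the mutation and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Only then apply the mutation to the in-memory table.
        self.memtable[key] = value

    @classmethod
    def recover(cls, log_path):
        """Rebuild the in-memory table by replaying the log in order."""
        store = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path, encoding="utf-8") as log:
                for line in log:
                    entry = json.loads(line)
                    store.memtable[entry["key"]] = entry["value"]
        return store


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "commitlog.jsonl")
        store = TinyStore(path)
        store.write("user:1", "alice")
        store.write("user:1", "alicia")
        store.write("user:2", "bob")
        # Simulate a crash: discard the in-memory state and replay the log.
        return TinyStore.recover(path).memtable
```

Because the log is replayed in write order, the recovered table holds the latest value for each key.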

Data Model

Column Family / Wide-Column Key/Value

Scylla uses the same data model as Apache Cassandra. Data is represented as key-value pairs grouped into rows (like rows in an RDBMS), and a collection of rows forms a column family (like a table in an RDBMS). One or more column families are contained in a keyspace (like a database in an RDBMS). Each application is encouraged to use a single keyspace.
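The nesting of keyspace, column family, and row can be pictured as nested maps. The sketch below is only a mental model with hypothetical helper names (`upsert`, `get`); it says nothing about how Scylla stores data on disk.

```python
from collections import defaultdict

# keyspace name -> column family name -> row key -> {column name: value}
keyspaces = defaultdict(lambda: defaultdict(dict))


def upsert(keyspace, column_family, row_key, columns):
    """Create the row if needed, then set or overwrite the given columns."""
    row = keyspaces[keyspace][column_family].setdefault(row_key, {})
    row.update(columns)
    return row


def get(keyspace, column_family, row_key):
    """Return the row's columns, or None if the row does not exist."""
    return keyspaces[keyspace][column_family].get(row_key)


upsert("shop", "users", "u1", {"name": "Ada"})
upsert("shop", "users", "u1", {"email": "ada@example.com"})  # rows may gain columns later
```

Rows in the same column family need not carry the same set of columns, which is what the second `upsert` call illustrates.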

Concurrency Control

Multi-version Concurrency Control (MVCC)

Scylla does not support ACID transactions in the RDBMS sense. However, CQL has a BATCH statement that allows multiple update statements belonging to a given partition key to be applied in isolation (note that batches are not a full analogue of SQL transactions). In addition, within UPDATE, INSERT, and DELETE statements, modifications belonging to the same partition key are performed atomically and in isolation. Scylla implements Multi-version Concurrency Control (MVCC) for partition mutations.

Query Interface

Custom API

Scylla uses the Cassandra Query Language (CQL) as its query interface. Drivers are provided for the following languages: C++, C#, Go, Java, Node.js, PHP, Python, Ruby, and Rust.

Checkpoints

Non-Blocking

Scylla supports non-blocking checkpoints through per-node backup procedures, which include full backups (snapshots) and incremental backups. Snapshots are taken with the snapshot operation of the nodetool utility, while incremental backup is enabled in the configuration file. Automatic cleanup of backups that are no longer needed is not implemented.

System Architecture

Shared-Nothing

Scylla uses a shared-nothing model. Nodes in the cluster are organized in a decentralized consistent-hashing ring, and data is partitioned by key into shards across all nodes. Within a node, Scylla uses a shard-per-core architecture: the thread for each shard runs on its own CPU core with its own memory and its own queue on a multi-queue network interface controller. Cross-core communication is carried out by explicit message passing. Scylla also replicates data for fault tolerance.
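The two levels of partitioning described above, nodes on a ring and shards within a node, can be sketched as follows. This is an illustration under stated assumptions rather than Scylla's algorithm: the MD5-based `token` function, the `vnodes` count, and the modulo rule in `shard_for` are all chosen for simplicity, and the names `Ring`, `replicas`, and `shard_for` are hypothetical.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a 64-bit integer token (MD5 is used only for illustration)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Consistent-hashing ring in which each node owns several points."""

    def __init__(self, nodes, vnodes=8):
        self._points = sorted(
            (token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [t for t, _ in self._points]

    def replicas(self, key, replication_factor=1):
        """Walk clockwise from the key's token and collect distinct nodes."""
        start = bisect.bisect_right(self._tokens, token(key))
        owners = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == replication_factor:
                break
        return owners


def shard_for(key, cores):
    """Pick the core (shard) inside a node that owns the key (assumed modulo rule)."""
    return token(key) % cores


ring = Ring(["node-a", "node-b", "node-c"])
owners = ring.replicas("user:42", replication_factor=2)
core = shard_for("user:42", cores=4)
```

Any client or node can compute `owners` and `core` from the key alone, which is why no central coordinator is needed to route a request.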

Compression

Dictionary Encoding

Scylla uses Apache Cassandra's chunked compression for SSTable files. Three dictionary-based compression algorithms are provided: LZ4 (the default), Snappy, and DEFLATE. Data must be decompressed before it can be processed during query execution.
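The benefit of compressing in chunks is that a read only has to decompress the chunks that overlap the requested byte range. The sketch below shows the idea with Python's `zlib` module (DEFLATE with a zlib wrapper); the 4096-byte chunk size and the function names are assumptions for the example, not the SSTable format.

```python
import zlib

CHUNK_SIZE = 4096  # bytes of uncompressed data per chunk (assumed value)


def compress_chunks(data, chunk_size=CHUNK_SIZE):
    """Compress each fixed-size chunk independently of the others."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_range(chunks, offset, length, chunk_size=CHUNK_SIZE):
    """Return bytes [offset, offset + length), decompressing only the overlapping chunks."""
    if length <= 0:
        return b""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    raw = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return raw[start:start + length]


data = bytes(range(256)) * 100  # 25,600 bytes of sample data
chunks = compress_chunks(data)
piece = read_range(chunks, 5000, 3000)  # touches only the second chunk
```

A smaller chunk size reduces the amount of data decompressed per read, at the cost of a lower compression ratio.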

Stored Procedures

Not Supported

Joins

Not Supported

Each SELECT statement applies to a single table.

Isolation Levels

Serializable

Scylla does not support ACID transactions in the RDBMS sense. However, CQL has a BATCH statement that allows multiple update statements belonging to a given partition key to be applied in isolation (note that batches are not a full analogue of SQL transactions). In addition, within UPDATE, INSERT, and DELETE statements, modifications belonging to the same partition key are performed atomically and in isolation. Support for CQL Light-Weight Transactions (LWT) is on Scylla's roadmap for 3.x.

Website

http://www.scylladb.com/

Source Code

https://github.com/scylladb/scylla

Tech Docs

https://docs.scylladb.com/

Developer

ScyllaDB Inc.

Country of Origin

IL

Start Year

2014

Project Type

Commercial, Open Source

Written in

C++

Supported languages

C#, C++, Go, Java, JavaScript, PHP, Python, Ruby, Rust

Derived From

Cassandra

Compatible With

Cassandra

Operating Systems

Linux

Licenses

AGPL v3

Wikipedia

https://en.wikipedia.org/wiki/Scylla_(database)