SingleStore

SingleStore is a distributed, cloud-native database that can handle transactional and analytical workloads with a unified engine. It is a modern SQL DBMS and cloud service that supports multiple data models, including structured data, semi-structured data based on JSON, time-series, full text, spatial, and vector data.

History

SingleStore was founded in 2011 as MemSQL, a Y Combinator graduate company.

The company renamed the system to SingleStore in October 2020.

Checkpoints

Non-Blocking, Consistent

SingleStore takes a form of checkpoint called a snapshot. Snapshots are performed periodically and they contain a copy of all in-memory rowstore data. To recover the database after a crash or restart, the latest snapshot is read and deserialized into memory, and then the log files are played back from the start time of snapshot creation to the current time.
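
The recovery sequence described above can be sketched in a few lines. This is a minimal illustration, not SingleStore's implementation: the `LogRecord` type, its `lsn` field, and the dictionary-based table are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    lsn: int       # log sequence number (hypothetical field name)
    key: str
    value: object  # None marks a delete in this sketch


def take_snapshot(table: dict, lsn: int) -> dict:
    """Copy the in-memory rows and remember the log position they reflect."""
    return {"lsn": lsn, "rows": dict(table)}


def recover(snapshot: dict, log: list) -> dict:
    """Load the snapshot, then replay only the log records written after it."""
    table = dict(snapshot["rows"])
    for rec in log:
        if rec.lsn <= snapshot["lsn"]:
            continue  # already reflected in the snapshot
        if rec.value is None:
            table.pop(rec.key, None)
        else:
            table[rec.key] = rec.value
    return table


log = [
    LogRecord(1, "a", 10),
    LogRecord(2, "b", 20),
    LogRecord(3, "a", 11),
    LogRecord(4, "b", None),
]
# The snapshot captures the state after the first two log records.
snapshot = take_snapshot({"a": 10, "b": 20}, lsn=2)
recovered = recover(snapshot, log)
print(recovered)  # {'a': 11}
```

Replaying only the records after the snapshot position is what keeps recovery time bounded by the snapshot interval rather than by the full log length.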

Concurrency Control

Multi-version Concurrency Control (MVCC)

SingleStore uses multi-version concurrency control (MVCC) and lock-free data structures. Read operations are not blocked, and write operations acquire row-level locks. Row locks are acquired as rows are written to and are held until the transaction that acquired them commits or rolls back, using 2-phase locking to ensure serializability. The distributed query optimizer evenly distributes the processing workload to maximize the efficiency of CPU usage, and query plans are compiled to machine code and cached to expedite subsequent executions.
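
To illustrate why readers are not blocked under MVCC, the sketch below keeps a list of committed versions per row and lets each reader pick the newest version visible at its snapshot timestamp. The `VersionedRow` class and the timestamp scheme are assumptions made for the example; for brevity the row lock is held only for the duration of a single write rather than until commit or rollback.

```python
import threading


class VersionedRow:
    """One row with a chain of committed versions (illustrative only)."""

    def __init__(self):
        self.versions = []            # (commit_ts, value), ascending by commit_ts
        self.lock = threading.Lock()  # row-level lock taken by writers only

    def read(self, snapshot_ts):
        # Readers never take the lock: they scan for the newest version
        # committed at or before their snapshot timestamp.
        for commit_ts, value in reversed(self.versions):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def write(self, commit_ts, value):
        # Writers serialize on the row lock and append a new version
        # instead of overwriting the old one.
        with self.lock:
            self.versions.append((commit_ts, value))


row = VersionedRow()
row.write(10, "v1")
row.write(20, "v2")
print(row.read(15))  # v1
print(row.read(25))  # v2
```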

Data Model

Relational, Key/Value, Document / XML, Object-Oriented, Multi-Value, Vector

SingleStore is a multi-model database system. Its primary data model is relational. It also supports semistructured (JSON), vector, time series, geospatial, key-value and full-text models. It supports a SQL interface and also a NoSQL interface called Kai that is largely compatible with the Mongo™ API. Semistructured data access can be done through SQL (using the JSON type) or Kai.

Foreign Keys

Not Supported

SingleStore currently supports foreign keys to assist sharding, but referential integrity is not enforced.

Indexes

Skip List, Hash Table

SingleStore supports shard key, primary key, columnstore sort key, rowstore key, skiplist, hash, full-text, geospatial and vector indexes.

Isolation Levels

Read Committed

SingleStore supports the Read Committed isolation level.

Joins

Nested Loop Join, Hash Join, Sort-Merge Join, Broadcast Join

SingleStore supports nested loop join, index nested loop join, merge join, and hash join. For distributed join queries, if the two tables are joined on identical shard keys, the join is performed locally on each node; otherwise the data is broadcast or reshuffled to other nodes over the network.
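
The choice between a local join and data movement can be sketched as a small decision function. This is a simplified illustration: the function name, its parameters, and the row-count threshold for preferring a broadcast are assumptions for the example, not SingleStore's actual optimizer rules.

```python
def choose_join_strategy(left_shard_key, right_shard_key,
                         left_join_cols, right_join_cols,
                         smaller_table_rows, broadcast_limit=10_000):
    """Pick how a distributed equi-join moves data (illustrative only)."""
    # Both tables are sharded on exactly the columns being joined, so
    # matching rows already live in the same partition.
    if (tuple(left_shard_key) == tuple(left_join_cols)
            and tuple(right_shard_key) == tuple(right_join_cols)):
        return "local"
    # Hypothetical cost rule: copy a small table to every node...
    if smaller_table_rows <= broadcast_limit:
        return "broadcast"
    # ...otherwise repartition rows on the join columns.
    return "reshuffle"


print(choose_join_strategy(["user_id"], ["user_id"],
                           ["user_id"], ["user_id"], 5_000_000))  # local
print(choose_join_strategy(["user_id"], ["order_id"],
                           ["user_id"], ["user_id"], 500))        # broadcast
```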

Logging

Physical Logging

SingleStore implements transactions via write-ahead logging. Each partition has its own log file.

Query Compilation

Code Generation, JIT Compilation

Instead of the traditional interpreter-based execution model, SingleStore uses a code generation architecture that compiles a SQL query to machine code through LLVM. When the SingleStore server receives a SQL query, it parses the SQL into an AST and extracts the parameters from the query. The AST is then transformed into a SingleStore-specific intermediate representation, the SingleStore Plan Language (MPL; the acronym dates from the MemSQL name). SingleStore then flattens the MPL AST into a more compact format, SingleStore Bytecode (MBC). Plans in MBC format are transformed into LLVM bitcode, from which LLVM generates machine code. This architecture enables many low-level optimizations and avoids much of the unnecessary work done by interpreter-based execution. Compiled plans are also cached on disk for future use.
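
Extracting parameters before compilation is what makes plan caching effective: queries that differ only in their literal values share one compiled plan. The sketch below illustrates that idea with a regular expression and an in-memory dictionary. It is a toy model under stated assumptions: `parameterize`, `PlanCache`, and the placeholder string standing in for compiled code are all hypothetical, and a real system would parameterize from the parsed AST rather than with a regex.

```python
import re

# Matches single-quoted string literals or standalone numeric literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace literals with '?' and return (skeleton, extracted params)."""
    params = []

    def repl(match):
        params.append(match.group(0))
        return "?"

    return _LITERAL.sub(repl, sql), params


class PlanCache:
    """Caches one 'compiled plan' per query skeleton (illustrative only)."""

    def __init__(self):
        self.plans = {}
        self.compilations = 0

    def get_plan(self, sql):
        skeleton, params = parameterize(sql)
        if skeleton not in self.plans:
            self.compilations += 1
            # Stand-in for the expensive code generation step.
            self.plans[skeleton] = f"plan<{skeleton}>"
        return self.plans[skeleton], params


cache = PlanCache()
cache.get_plan("SELECT * FROM t WHERE id = 42 AND name = 'bob'")
cache.get_plan("SELECT * FROM t WHERE id = 7 AND name = 'alice'")
print(cache.compilations)  # 1
```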

Query Execution

Tuple-at-a-Time Model, Vectorized Model

SingleStore uses a tuple-at-a-time model for rowstore query execution and a vectorized model for columnstore query execution. Plans are compiled to machine code (see Query Compilation).
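
The difference between the two execution models can be shown with the same filter-and-sum query written both ways. This is a conceptual sketch in plain Python, with hypothetical column names (`qty`, `amount`) and a tiny batch size; it does not reflect how SingleStore's compiled plans are structured.

```python
def scan_rows(rows):
    # Tuple-at-a-time: each operator hands one row to the next.
    for row in rows:
        yield row


def filter_rows(rows, predicate):
    for row in rows:
        if predicate(row):
            yield row


def sum_tuple_at_a_time(rows):
    matching = filter_rows(scan_rows(rows), lambda r: r["qty"] > 1)
    return sum(r["amount"] for r in matching)


def sum_vectorized(columns, batch_size=2):
    # Vectorized: each operator processes a batch of column values at once.
    total = 0
    n = len(columns["qty"])
    for start in range(0, n, batch_size):
        qty = columns["qty"][start:start + batch_size]
        amount = columns["amount"][start:start + batch_size]
        mask = [q > 1 for q in qty]  # filter yields a selection mask
        total += sum(a for a, keep in zip(amount, mask) if keep)
    return total


rows = [{"qty": 1, "amount": 5}, {"qty": 2, "amount": 7}, {"qty": 3, "amount": 9}]
columns = {"qty": [1, 2, 3], "amount": [5, 7, 9]}
print(sum_tuple_at_a_time(rows), sum_vectorized(columns))  # 16 16
```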

Query Interface

SQL, PL/SQL, HTTP / REST

SingleStore supports a subset of MySQL syntax, plus extensions for distributed SQL, vector, geospatial, and JSON queries. The MySQL wire protocol is supported. SingleStore also features the Kai interface, which has a high level of compatibility with the Mongo API.

Storage Architecture

Disk-oriented, In-Memory, Hybrid

SingleStore features Universal Storage which is an evolution of the columnstore, accommodating transactional workloads that would have traditionally been managed by the rowstore. Universal Storage combines rowstore and columnstore to support both Online Transaction Processing (OLTP) and Hybrid Transactional and Analytical Processing (HTAP) workloads at lower total cost of ownership (TCO). Designed to enhance both parallelism and fault tolerance, databases in SingleStore are divided into partitions, also referred to as shards, that are evenly distributed among the available leaf nodes. Each partition holds a subset of data based on the SHARD KEY defined in the CREATE TABLE statement.
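
The mapping from shard key to partition, and from partition to leaf node, can be sketched as follows. This is a minimal illustration: the CRC32 hash, the round-robin placement, and the function names are assumptions for the example, since the text above does not specify SingleStore's actual hash function or placement policy.

```python
import zlib


def partition_for(shard_key_values, num_partitions):
    """Hash the shard key columns of a row to a partition number."""
    encoded = "|".join(str(v) for v in shard_key_values).encode("utf-8")
    # CRC32 is a stand-in for whatever hash the engine really uses.
    return zlib.crc32(encoded) % num_partitions


def assign_partitions(num_partitions, leaves):
    """Spread partitions evenly across leaf nodes (round-robin)."""
    return {p: leaves[p % len(leaves)] for p in range(num_partitions)}


placement = assign_partitions(8, ["leaf-1", "leaf-2"])
p = partition_for(("customer-17",), 8)
print(f"row with shard key 'customer-17' -> partition {p} on {placement[p]}")
```

Because the partition depends only on the shard key, all rows with the same key value land in the same partition, which is what allows joins on the shard key to run locally.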

Storage Model

Decomposition Storage Model (Columnar), N-ary Storage Model (Row/Record), Hybrid

In SingleStore, tables are broken into million-row chunks called segments. Row segments in the rowstore are stored in memory. Column segments in the columnstore are stored on disk and in external object storage. Each columnstore partition has an in-memory rowstore segment that holds recently inserted or updated data. Columnstore tables are kept sorted by a sort key, and several types of compression are applied to columnstore data, including value encoding, run-length encoding (RLE), and dictionary encoding.
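
Two of the compression schemes named above are easy to demonstrate on a single column. The functions below are a generic sketch of run-length and dictionary encoding, not SingleStore's on-disk format; the function names are chosen for the example.

```python
def rle_encode(values):
    """Run-length encoding: collapse repeated neighbours into (value, count)."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]


def dict_encode(values):
    """Dictionary encoding: store each distinct value once, plus small codes."""
    dictionary, index, codes = [], {}, []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


column = ["US", "US", "US", "DE", "DE", "US"]
print(rle_encode(column))   # [('US', 3), ('DE', 2), ('US', 1)]
print(dict_encode(column))  # (['US', 'DE'], [0, 0, 0, 1, 1, 0])
```

Keeping a columnstore table sorted by its sort key tends to produce long runs of equal values, which is the case where run-length encoding compresses best.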

Stored Procedures

Supported

SingleStore supports stored procedures (SPs), user-defined scalar-valued functions (UDFs), user-defined table-valued functions (TVFs), and user-defined aggregate functions (UDAFs). Wasm-based UDFs, TVFs, and UDAFs are also supported, as are external UDFs and TVFs.

System Architecture

Shared-Disk

SingleStore has a two-tier, clustered architecture. The nodes in the upper tier are aggregators, which are cluster-aware query routers. One special node called the Master Aggregator is responsible for cluster monitoring. The nodes in the lower tier are leaves, which store and process partitions (shards). The aggregator sends extended SQL to leaves to perform distributed query execution.
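
The division of labor between the two tiers can be sketched with a distributed average: each leaf computes a partial result over its partitions, and the aggregator merges the partials. The function names and the in-process "leaves" are hypothetical simplifications; in a real cluster the aggregator sends SQL to the leaves over the network.

```python
def leaf_partial(rows):
    """Work done on a leaf: a partial (sum, count) over one partition."""
    return sum(rows), len(rows)


def aggregator_avg(partitions):
    """Work done on an aggregator: fan out, then merge the partial results."""
    partials = [leaf_partial(rows) for rows in partitions]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Three partitions spread across the leaves; one happens to be empty.
partitions = [[1, 2, 3], [4, 5], []]
print(aggregator_avg(partitions))  # 3.0
```

Shipping (sum, count) pairs instead of per-partition averages is what keeps the merged result correct when partitions hold different numbers of rows.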

Views

Virtual Views

SingleStore supports creating and querying views. Views in SingleStore are not materialized and cannot be written to.

Revision #12 | Updated 03/14/2024 11:51 a.m.
