C-Store is a column-oriented DBMS designed for read-optimized OLAP workloads. It adopts a column-store architecture, explores various DSM compression schemes and the corresponding query optimization strategies, stores data in overlapping collections of projections for both performance and availability, and employs other optimizations specific to column stores.
C-Store is an academic project led by Michael Stonebraker and Daniel Abadi, involving researchers from Brown University, Brandeis University, MIT, and the University of Massachusetts Boston. It was later commercialized as Vertica.
Two-Phase Locking (Deadlock Detection)
The system maintains a distributed lock table. Deadlocks are resolved via timeouts: one of the deadlocked transactions is aborted.
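The timeout-based resolution described above can be illustrated with a minimal, single-process sketch. The `LockTable` class, its method names, and the exclusive-only lock mode are assumptions made for illustration; they are not taken from C-Store, whose lock table is distributed.

```python
import threading
import time


class LockTable:
    """Toy exclusive-lock table (hypothetical); tracks the owning transaction per key."""

    def __init__(self):
        self._owners = {}                      # key -> id of the transaction holding it
        self._cond = threading.Condition()

    def acquire(self, txn_id, key, timeout):
        """Return True once txn_id holds key, or False if the wait timed out."""
        deadline = time.monotonic() + timeout
        with self._cond:
            # Wait while some *other* transaction owns the key.
            while self._owners.get(key, txn_id) != txn_id:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False               # caller is expected to abort
                self._cond.wait(remaining)
            self._owners[key] = txn_id
            return True

    def release_all(self, txn_id):
        """Drop every lock held by txn_id (on commit or abort)."""
        with self._cond:
            held = [k for k, owner in self._owners.items() if owner == txn_id]
            for key in held:
                del self._owners[key]
            self._cond.notify_all()
```

A transaction whose `acquire` call returns `False` would call `release_all` and restart. No waits-for graph is built in this sketch: a timeout is simply treated as evidence of a possible deadlock.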
Logically, C-Store supports the standard relational data model, where a database contains a collection of tables and a table contains a collection of attributes.
Regardless of how a column is encoded (e.g., RLE, bitmap encoding, or block-oriented delta encoding), it is indexed with a B-tree. The system also stores join indices to stitch together all records of a table from the columns held in its different projections. Because a column that is ordered by another column in the same projection and contains few distinct values is encoded using bitmap encoding plus RLE, the paper also mentions extensive use of bitmap indexes.
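To make the encodings named above concrete, here is a minimal sketch of run-length and bitmap encoding over an in-memory column. The function names and the `(value, first_position, run_length)` triple layout are assumptions chosen for illustration, not C-Store's actual on-disk format.

```python
def rle_encode(column):
    """Encode a column as (value, first_position, run_length) triples."""
    runs = []
    for pos, value in enumerate(column):
        if runs and runs[-1][0] == value:
            v, first, length = runs[-1]
            runs[-1] = (v, first, length + 1)   # extend the current run
        else:
            runs.append((value, pos, 1))        # start a new run
    return runs


def rle_decode(runs):
    """Expand the triples back into the original column."""
    out = []
    for value, _first, length in runs:
        out.extend([value] * length)
    return out


def bitmap_encode(column):
    """Map each distinct value to a 0/1 list marking the positions where it occurs."""
    bitmaps = {}
    for pos, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[pos] = 1
    return bitmaps
```

RLE pays off when the column is sorted, so equal values form long runs. When a column has few distinct values but is not sorted on itself, each per-value bitmap is mostly zeros, and running `rle_encode` over a bitmap is one way to picture "bitmap encoding plus RLE".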
The paper discusses support for snapshot isolation in detail but does not mention whether other isolation levels are supported.
Nested Loop Join, Hash Join, Sort-Merge Join
These join algorithms are found in the C-Store source code.
"We use logical logging (as in ARIES), since physical logging would result in many log records, due to the nature of the data structures in WS."
Logically, users interact with C-Store in SQL, with standard SQL semantics.
Each column is stored as a separate file containing a list of 64K blocks, each packing as many values as possible.
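The block layout described above can be sketched as follows. This is only an illustration under stated assumptions: "64K" is read as 65,536 bytes, values are fixed-width little-endian 32-bit integers, and no compression or block header is modeled. The names `pack_column` and `read_value` are hypothetical.

```python
import struct

BLOCK_SIZE = 64 * 1024                       # assumed: "64K" means 65,536 bytes
VALUE_FORMAT = "<i"                          # assumed: little-endian int32 values
VALUE_SIZE = struct.calcsize(VALUE_FORMAT)   # 4 bytes
VALUES_PER_BLOCK = BLOCK_SIZE // VALUE_SIZE  # as many values as fit in one block


def pack_column(values):
    """Split a column into byte blocks of at most BLOCK_SIZE bytes each."""
    blocks = []
    for start in range(0, len(values), VALUES_PER_BLOCK):
        chunk = values[start:start + VALUES_PER_BLOCK]
        blocks.append(struct.pack(f"<{len(chunk)}i", *chunk))
    return blocks


def read_value(blocks, position):
    """Fetch the value at a column position by locating its block and offset."""
    block_no, offset = divmod(position, VALUES_PER_BLOCK)
    return struct.unpack_from(VALUE_FORMAT, blocks[block_no], offset * VALUE_SIZE)[0]
```

With fixed-width values, a position maps to a block number and byte offset by simple arithmetic. Compressed columns would need a per-block index instead.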
Decomposition Storage Model (Columnar)
As the name suggests, C-Store is a column store. Notably, both the read-optimized store and the update/insert-oriented writable store adopt the column-store architecture.
The architecture was designed for an anticipated environment of grid computers, containing a large number of nodes, each with private disk and memory. The data is horizontally partitioned across the disks of the nodes.
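As an illustration of horizontal partitioning, the sketch below assigns rows to nodes by key range. The text above does not say which partitioning scheme is used, so range partitioning, the `partition_rows` helper, and the dictionary row format are all assumptions made for the example.

```python
from bisect import bisect_right


def partition_rows(rows, key, boundaries):
    """Assign each row to a node by comparing its key against sorted boundaries.

    With boundaries [b0, b1, ...], node 0 receives keys < b0, node 1 receives
    b0 <= key < b1, and the last node receives everything >= the last boundary.
    """
    nodes = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        nodes[bisect_right(boundaries, row[key])].append(row)
    return nodes
```

For example, `partition_rows(rows, "id", [10, 20])` spreads rows over three nodes. Each node then stores and scans only its own key range on its private disk.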
Found in the source code as 'write store materialized view'.
Massachusetts Institute of Technology, Brown University