Actian Vector is a commercial main-memory RDBMS that targets analytical workloads and decision-support applications. It adopts a columnar storage model and a vectorized processing model. To speed up analytical queries, Vector combines several techniques, including x86 SIMD execution, in-cache execution, parallel execution, data compression, and storage indexes.
Actian Vector is available on Windows, Linux, Hadoop, AWS, and Microsoft Azure platforms.
Vector originated in the MonetDB project at CWI (Centrum Wiskunde & Informatica), a research institute in the Netherlands. MonetDB is a columnar in-memory RDBMS that adopts a column-at-a-time processing model and targets analytical workloads. The MonetDB/X100 project, started in 2005, extended MonetDB with a vectorized processing model and other techniques. In 2008, the X100 project was spun out of CWI to serve as the foundation of a commercial RDBMS product at the company Vectorwise BV. Vectorwise BV worked with Ingres Corporation (renamed Actian Corporation in 2011) to integrate X100 with the Ingres front end. The first version of the Vectorwise DBMS was released in June 2010. In 2011, Actian Corporation acquired the Vectorwise technology. In 2014, Vectorwise was rebranded as Actian Vector.
Checkpoints can be taken either online or offline. For online checkpoints, Vector uses blocking consistent checkpoints. The system waits for all running transactions to finish before taking a new checkpoint. Any new transaction is blocked until the checkpoint is completed.
Dictionary Encoding, Delta Encoding, Run-Length Encoding, Naïve (Page-Level), Bit Packing / Mostly Encoding
Vector compresses each column on a per-page basis for better compression performance. Thus different pages of the same column may be compressed using different algorithms.
For the page-level naïve compression, Vector uses LZ4 to compress string values.
Vector adopts a late decompression approach: compressed columns held in the in-memory buffer for disk pages are decompressed only when query processing needs them.
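The per-page choice of compression scheme described above can be illustrated with a small sketch. It is not Vector's actual implementation: the function names, the bit-count cost model, and the restriction to integer pages are all assumptions made for illustration. Each page is encoded with run-length, delta, and dictionary encoding, and the scheme with the smallest estimated size wins, so different pages of the same column can end up with different schemes.

```python
from itertools import groupby


def bit_width(nums):
    """Bits for the widest value, plus one sign bit if any value is negative."""
    magnitude = max((abs(n).bit_length() for n in nums), default=0)
    sign = 1 if any(n < 0 for n in nums) else 0
    return max(1, magnitude) + sign


def rle_encode(page):
    """Run-length encoding: a list of (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(page)]


def delta_encode(page):
    """Delta encoding: the first value, then differences between neighbours."""
    return page[:1] + [b - a for a, b in zip(page, page[1:])]


def dict_encode(page):
    """Dictionary encoding: sorted distinct values plus one code per row."""
    dictionary = sorted(set(page))
    index = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in page]


def estimated_bits(page):
    """Hypothetical cost model: fixed-width bit packing of each encoded form."""
    runs = rle_encode(page)
    deltas = delta_encode(page)
    dictionary, codes = dict_encode(page)
    return {
        "raw": len(page) * bit_width(page),
        "rle": len(runs) * (bit_width([v for v, _ in runs])
                            + bit_width([n for _, n in runs])),
        "delta": bit_width(deltas[:1]) + len(deltas[1:]) * bit_width(deltas[1:]),
        "dict": len(dictionary) * bit_width(dictionary)
                + len(codes) * bit_width(codes),
    }


def choose_scheme(page):
    """Pick the scheme with the smallest estimated size for this page."""
    costs = estimated_bits(page)
    return min(costs, key=costs.get)


# Three pages of one hypothetical integer column.
pages = [[7] * 100, list(range(1000, 1100)), [1_000_000, 5] * 50]
print([choose_scheme(p) for p in pages])  # ['rle', 'delta', 'dict']
```

A constant page favours run-length encoding, a steadily increasing page favours delta encoding, and a page with few distinct but large values favours a dictionary.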
Multi-version Concurrency Control (MVCC), Optimistic Concurrency Control (OCC)
Vector uses multi-version optimistic concurrency control to support snapshot isolation. With optimistic concurrency control, records are not locked for reads/writes, and conflicts are detected at transaction commit time. Optimistic concurrency control improves the performance of Vector because record updates and read-only analytical transactions do not block each other.
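As a rough illustration of multi-version optimistic concurrency control with snapshot isolation, the following single-threaded sketch keeps a version history per key. Readers see the newest version committed at or before their start timestamp, writes are buffered, and conflicts are checked only at commit. The class names and the first-committer-wins rule for write-write conflicts are assumptions for illustration; the source does not describe Vector's exact validation rule.

```python
class Conflict(Exception):
    """Raised at commit when another transaction already changed a written key."""


class Store:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # timestamp of the most recent commit

    def begin(self):
        # A transaction's snapshot is the database state as of its start timestamp.
        return Txn(self, self.clock)


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered writes, invisible to others until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # No locks: scan back to the newest version inside the snapshot.
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.start_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        if not self.writes:
            return self.start_ts  # read-only transactions never conflict
        # Validation happens only here, at commit time.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.start_ts:
                raise Conflict(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))
        return self.store.clock


store = Store()
setup = store.begin()
setup.write("x", 1)
setup.commit()

reader = store.begin()   # long-running analytical reader
writer = store.begin()
writer.write("x", 2)
writer.commit()          # does not wait for the reader
print(reader.read("x"))  # 1, the reader still sees its snapshot
```

The reader and the writer never block each other; only two transactions that write the same key can fail, and that failure surfaces when the second one commits.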
Vector supports foreign keys. A user can specify foreign keys for a relation once its primary key has been defined.
Vector supports clustered and secondary indices.
A CREATE INDEX statement creates a clustered index, in which data records are sorted by the indexed key. Each table can have only one clustered index.
A CREATE INDEX statement with the option WITH STRUCTURE=VECTORWISE_SI creates a secondary index on one or more attributes. A table can have one or more secondary indices. Vector keeps secondary indices only in memory, so all of them are rebuilt at startup.
Vector supports only snapshot isolation. Under snapshot isolation, a transaction observes a consistent state of the database as of its start timestamp.
Vector supports hash joins and sort-merge joins. A Bloom filter is used to accelerate hash joins: before a key is looked up in the hash table, it is checked against the Bloom filter, and the hash table lookup is performed only if the key passes. When the key is not in the hash table, the Bloom filter improves performance because checking it is cheaper than probing the hash table.
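The Bloom-filter check in front of the hash table can be sketched as follows. This is a minimal illustration, not Vector's implementation: the `BloomFilter` class, its sizes, the SHA-256-based hashing, and the tuple row format are all assumptions. The build side populates both the hash table and the filter; the probe side consults the filter first and skips the hash table whenever the filter says the key is definitely absent.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def hash_join(build_rows, probe_rows, build_key=0, probe_key=0):
    """Join tuples on one key column; also return how many probes were skipped."""
    table = defaultdict(list)
    bloom = BloomFilter()
    for row in build_rows:
        table[row[build_key]].append(row)
        bloom.add(row[build_key])

    joined, skipped = [], 0
    for row in probe_rows:
        key = row[probe_key]
        if not bloom.might_contain(key):
            skipped += 1  # cheap rejection, no hash table lookup
            continue
        # A false positive is harmless: the hash table gives the final answer.
        for match in table.get(key, ()):
            joined.append(match + row)
    return joined, skipped


build = [(1, "a"), (2, "b")]
probe = [(1, "x"), (3, "y"), (2, "z"), (1, "w")]
joined, skipped = hash_join(build, probe)
print(joined)
```

Because a Bloom filter can report false positives but never false negatives, the join result is the same with or without it; only the number of hash table lookups changes.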
Vector executes relational algebra operators using a vectorized model and pre-compiled primitive functions. Vector implements thousands of primitive functions that operate on vectors of input data. These vectorized primitives are pre-compiled and invoked at runtime, so the overhead of each function call is amortized over all the values in the input vector. Vectorized primitives also lend themselves to compiler optimization and the use of SIMD instructions.
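The idea of vectorized primitives can be sketched with a toy pipeline that computes the sum of price * quantity over rows whose quantity is below a limit. The primitive names, the vector size of 1024, and the use of a selection vector are assumptions for illustration, and Python lists stand in for the tight, SIMD-friendly loops of a compiled engine. What the sketch does show is that the number of primitive calls grows with the number of vectors, not with the number of rows.

```python
VECTOR_SIZE = 1024  # hypothetical number of values per vector


def select_lt_int_col_int_val(col, val):
    """Return a selection vector: positions whose value is less than val."""
    return [i for i, x in enumerate(col) if x < val]


def map_mul_int_col_int_col(a, b, sel):
    """Multiply two columns element-wise, only at the selected positions."""
    return [a[i] * b[i] for i in sel]


def aggr_sum_int_col(col):
    """Sum one vector of integers."""
    return sum(col)


# Primitives are looked up once per vector, never once per row.
PRIMITIVES = {
    "select_lt": select_lt_int_col_int_val,
    "map_mul": map_mul_int_col_int_col,
    "aggr_sum": aggr_sum_int_col,
}


def vectors(column, size=VECTOR_SIZE):
    """Split a column into consecutive vectors of at most `size` values."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def sum_revenue(price, qty, max_qty):
    """Return (SUM(price * qty) WHERE qty < max_qty, number of primitive calls)."""
    total, calls = 0, 0

    def call(name, *args):
        nonlocal calls
        calls += 1
        return PRIMITIVES[name](*args)

    for p_vec, q_vec in zip(vectors(price), vectors(qty)):
        sel = call("select_lt", q_vec, max_qty)
        products = call("map_mul", p_vec, q_vec, sel)
        total += call("aggr_sum", products)
    return total, calls


price = list(range(5000))
qty = [i % 10 for i in range(5000)]
total, calls = sum_revenue(price, qty, 5)
print(calls)  # 15: three primitives for each of the five vectors
```

A tuple-at-a-time interpreter would make at least one call per row for each operator (thousands of calls here), whereas the vectorized pipeline makes 15.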
Vector supports industry-standard ANSI SQL:2003.
Vector can process data that resides in memory or on disk. It has two memory pools: execution memory and the column buffer manager (CBM). Execution memory is used for query execution and holds intermediate results, secondary indices, and similar structures. The CBM holds data records. Data that does not fit in memory is compressed and stored on disk. Users can allow intermediate results to spill to disk by setting enable_aggregation_disk_spilling = true.
Vector uses an adaptive storage model based on PAX partitions. By default, Vector stores tables in a columnar layout. A nullable attribute has an accompanying Boolean column that is stored together with the attribute column. If an index key spans multiple attributes, those columns are grouped together and stored in the same block. For small tables, users can instruct Vector to store data in an n-ary layout to reduce wasted space.
Heaps, Indexed Sequential Access Method (ISAM)
Vector stores tables in heaps or ISAM.
Vector is a three-tiered shared-everything DBMS. In the first tier, client applications connect to the DBMS Server. The second tier comprises the Vector X100 execution engine and the Ingres tables and catalogs. The third tier is the storage layer, which contains the databases and Vector's write-ahead log.
C, C#, C++, COBOL, Java, Perl, PHP, Python, SQL, Visual Basic