TileDB is a storage engine for storing and accessing both dense and sparse multi-dimensional arrays. Its key idea is to store array elements in collections called fragments, each of which is either dense or sparse. Each fragment stores its data in data tiles. In a dense fragment, every data tile covers a fixed-size chunk of the index space. In a sparse fragment, every data tile holds a fixed number of non-empty elements. TileDB also supports parallel I/O and is fully multi-threaded.
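The two tiling rules can be illustrated with a minimal sketch. This is a conceptual model in plain Python, not TileDB's implementation: the function names `dense_tile_id` and `sparse_data_tiles` are hypothetical, and a row-major tile order is assumed.

```python
from typing import List, Sequence, Tuple


def dense_tile_id(
    coord: Sequence[int],
    domain_lo: Sequence[int],
    tile_extent: Sequence[int],
    tiles_per_dim: Sequence[int],
) -> int:
    """Return the row-major id of the fixed-size chunk that contains `coord`.

    Assumption: tiles are numbered in row-major order.
    """
    # Position of the tile along each dimension.
    tile_coords = [(c - lo) // e for c, lo, e in zip(coord, domain_lo, tile_extent)]
    # Flatten the per-dimension tile position into a single id.
    tile_id = 0
    for t, n in zip(tile_coords, tiles_per_dim):
        tile_id = tile_id * n + t
    return tile_id


def sparse_data_tiles(
    sorted_cells: Sequence[Tuple[int, ...]], capacity: int
) -> List[List[Tuple[int, ...]]]:
    """Group already-ordered non-empty cells into tiles of at most `capacity` cells."""
    return [
        list(sorted_cells[i : i + capacity])
        for i in range(0, len(sorted_cells), capacity)
    ]


# A 4x4 domain with coordinates 1..4 per dimension, split into 2x2 chunks.
chunk = dense_tile_id((3, 2), domain_lo=(1, 1), tile_extent=(2, 2), tiles_per_dim=(2, 2))

# Five non-empty cells packed two per data tile; the last tile is partially filled.
tiles = sparse_data_tiles([(1, 1), (1, 4), (2, 3), (3, 1), (4, 4)], capacity=2)
```

In the dense case the tile boundaries depend only on the index space, while in the sparse case they depend on how many non-empty cells exist, which keeps sparse tiles roughly equal in size regardless of how the data is distributed.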
TileDB is designed to store many different types of data, such as genomic data, machine learning model parameters, imaging data, and LiDAR data.
TileDB was invented at the Intel Science and Technology Center for Big Data. The research center was a collaboration between Intel Labs and MIT. The research project was published in a VLDB 2017 paper. TileDB, Inc. was founded in February 2017 to further develop and maintain the DBMS.
TileDB provides APIs for the following languages: SQL, C, C++, Python, Java, R, and Go. The TileDB Python API is still under development and subject to change.
TileDB does not support views in the current version.
TileDB provides no transaction support in the current version. It only guarantees atomic reads and writes. Users can build a transaction manager on top of TileDB for concurrency control.
TileDB does not support logging at the current stage.
TileDB does not support join operations at the current stage.
Decomposition Storage Model (Columnar)
TileDB supports columnar format for different attributes stored in arrays.
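A minimal sketch of what a columnar (decomposition) layout means for array cells with several attributes. This is an illustration in plain Python rather than TileDB's storage code; the helper name `to_columnar` is hypothetical, and every cell is assumed to carry the same set of attributes.

```python
from typing import Any, Dict, List


def to_columnar(cells: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Split a list of cells into one value list per attribute.

    Assumption: all cells have the same attribute names.
    """
    columns: Dict[str, List[Any]] = {}
    for cell in cells:
        for name, value in cell.items():
            columns.setdefault(name, []).append(value)
    return columns


cells = [
    {"a1": 1, "a2": "x"},
    {"a1": 2, "a2": "y"},
    {"a1": 3, "a2": "z"},
]
columns = to_columnar(cells)

# Reading a single attribute only touches that attribute's values.
a1_values = columns["a1"]
```

Keeping each attribute's values together lets a query that needs only one attribute skip the others, and tends to help compression because adjacent values share a type.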
TileDB provides no transaction support in the current version, so no isolation level is guaranteed.
TileDB does not support stored procedures at the current stage.
TileDB uses a disk-oriented storage format that can store dense and sparse array data and support fast updates.
TileDB does not support checkpoints at the current stage.
TileDB does not support foreign keys at the current stage.
TileDB uses a multi-dimensional array format that handles both sparse and dense data. An array is composed of fragments, where each fragment is a snapshot of the array containing only the cells written by a single write operation. Fragments are categorised as either sparse or dense. Sparse fragments store their elements in a global order. Dense fragments store their elements in regular chunks of the index space.
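The fragment model can be sketched as an append-only list of per-write snapshots. This is a simplified conceptual model in plain Python, not TileDB's implementation: `write_fragment` and `read_cell` are hypothetical names, and the rule that the most recent fragment wins for a given cell is stated here as an assumption of the sketch.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, ...]
Fragment = Dict[Coord, int]


def write_fragment(fragments: List[Fragment], cells: Fragment) -> None:
    """Record one write operation as a new fragment; earlier fragments are untouched."""
    fragments.append(dict(cells))


def read_cell(fragments: List[Fragment], coord: Coord) -> Optional[int]:
    """Return the value of `coord` from the most recent fragment that contains it."""
    for fragment in reversed(fragments):
        if coord in fragment:
            return fragment[coord]
    return None  # The cell was never written.


fragments: List[Fragment] = []
write_fragment(fragments, {(1, 1): 10, (1, 2): 20})  # initial write
write_fragment(fragments, {(1, 2): 99})              # update of a single cell

latest = read_cell(fragments, (1, 2))
untouched = read_cell(fragments, (1, 1))
```

Because an update only appends a new fragment instead of rewriting existing data, writes stay cheap; the cost moves to reads, which have to consult several fragments.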
TileDB is an embeddable storage library.
Dictionary Encoding, Run-Length Encoding
TileDB supports the following compressors: GZIP, Zstandard, LZ4, RLE, Bzip2, and Double-delta. Double-delta is a compressor created for TileDB, and is similar to Facebook's Gorilla system.
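The idea behind double-delta encoding can be shown with a short sketch. It covers only the arithmetic (storing the change between consecutive deltas) and leaves out the bit-packing that a real compressor would apply afterwards; it is not TileDB's or Gorilla's actual code, and the function names are hypothetical.

```python
from typing import List


def double_delta_encode(values: List[int]) -> List[int]:
    """Encode as [first value, first delta, delta-of-deltas...]."""
    if len(values) < 2:
        return list(values)
    prev_delta = values[1] - values[0]
    encoded = [values[0], prev_delta]
    for i in range(2, len(values)):
        delta = values[i] - values[i - 1]
        encoded.append(delta - prev_delta)  # change in the step size
        prev_delta = delta
    return encoded


def double_delta_decode(encoded: List[int]) -> List[int]:
    """Invert double_delta_encode."""
    if len(encoded) < 2:
        return list(encoded)
    delta = encoded[1]
    values = [encoded[0], encoded[0] + delta]
    for delta_of_delta in encoded[2:]:
        delta += delta_of_delta
        values.append(values[-1] + delta)
    return values


# Nearly evenly spaced values, such as timestamps, encode to mostly zeros.
timestamps = [100, 110, 120, 130, 141]
encoded = double_delta_encode(timestamps)
decoded = double_delta_decode(encoded)
```

When the input grows at a nearly constant rate, most encoded entries are zero or close to it, so they can be represented in very few bits.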
TileDB Inc, Intel Labs
C, C#, C++, Go, Java, Python, R