SenseiDB is a distributed database that powers the backend of the LinkedIn homepage and LinkedIn Signal. Data is protected by replication, and eventual consistency is guaranteed. SenseiDB is also a search engine for lookups on structured metadata and unstructured content.
SenseiDB was initially developed and deployed by a team at LinkedIn in 2009. Engineers from both LinkedIn and Xiaomi worked on the project. It later became an open-source project with many individual contributors. After three releases, SenseiDB has not been updated or used since 2013.
SenseiDB supports Browsing Query Language (BQL), whose syntax is similar to SQL. In addition to the clauses it shares with SQL, BQL introduces the "BROWSE BY" clause to support faceted search and navigation, which is its most distinctive feature.
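To illustrate what faceted search means, here is a minimal Python sketch of the idea behind a "BROWSE BY" clause: filter the documents, then count how many matches fall under each value of the requested facets. The data set, the field names, and the `browse_by` helper are all hypothetical; this is a conceptual model, not SenseiDB's implementation or BQL syntax.

```python
from collections import Counter

# Hypothetical documents; the field names are illustrative only.
CARS = [
    {"color": "red", "year": 2010, "price": 9000},
    {"color": "red", "year": 2012, "price": 15000},
    {"color": "blue", "year": 2012, "price": 12000},
    {"color": "green", "year": 2010, "price": 20000},
]


def browse_by(docs, predicate, facets):
    """Return the matching docs and, for each facet, value counts among the matches."""
    hits = [doc for doc in docs if predicate(doc)]
    counts = {facet: Counter(doc[facet] for doc in hits) for facet in facets}
    return hits, counts


# Roughly: "cars priced above 10000, with facet counts for color and year".
hits, counts = browse_by(CARS, lambda d: d["price"] > 10000, ["color", "year"])
```

The facet counts are what let a user interface offer navigation links such as "2012 (2)" next to the result list.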
SenseiDB does not support concurrent operations; only a single thread can run at a time. Transactions are not supported either.
Joins are not supported by SenseiDB.
N-ary Storage Model (Row/Record)
A SenseiDB instance is a table of data organized into columns. The attributes of a table are stored as metadata, and each column has one of the supported types: string, int, long, short, float, double, char, date, text.
Transactions are not supported by SenseiDB.
Queries are executed by a query engine called Bobo, which uses the tuple-at-a-time model.
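The tuple-at-a-time model can be sketched with Python generators: each operator pulls one row from its child, processes it, and passes it on before asking for the next. The operator names (`scan`, `select`, `project`) and the sample rows are assumptions made for illustration and do not reflect Bobo's actual API.

```python
def scan(rows):
    """Leaf operator: yield one row at a time from the source."""
    for row in rows:
        yield row


def select(child, predicate):
    """Pass through only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def project(child, columns):
    """Keep only the requested columns of each row."""
    for row in child:
        yield tuple(row[c] for c in columns)


# Hypothetical table contents.
ROWS = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]

# Build the operator tree; nothing runs until a row is requested.
plan = project(select(scan(ROWS), lambda r: r["color"] == "red"), ["id"])
result = list(plan)
```

Because every operator is lazy, a row flows through the whole plan before the next one is read, so no intermediate result set is materialized.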
SenseiDB uses an indexing manager called Zoie, an independent search and indexing engine built on Apache Lucene that uses an inverted index to retrieve data efficiently. Zoie's most notable feature is its support for real-time searches and updates.
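As a rough illustration of the inverted-index idea, the sketch below maps each term to the set of document IDs that contain it, so a lookup touches only the postings for the query terms rather than every document. The `InvertedIndex` class is a hypothetical toy; it is not Zoie's or Lucene's API and omits analysis, scoring, and persistence.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy in-memory inverted index: term -> set of document IDs."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document by splitting its text on whitespace."""
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return the IDs of documents containing every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


index = InvertedIndex()
index.add(1, "distributed search engine")
index.add(2, "distributed database")
both = index.search("distributed")
only_db = index.search("distributed database")

# In this toy, a newly added document is searchable right away.
index.add(3, "realtime database")
after_update = index.search("database")
```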
SenseiDB uses a disk-oriented distributed storage architecture.
SenseiDB is a relational database organized into tables. Users can index and retrieve data with Browsing Query Language (BQL).
The entire database is partitioned into a number of shards. Each shard is replicated across N nodes, so a single node may hold more than one shard. There is no master node; each node is independent. Incoming requests go through a separate load balancer, which decides which nodes to send them to.
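The layout described above can be sketched as follows: each shard is assigned to N distinct nodes, a node may end up holding several shards, and a load balancer picks one replica per request. The round-robin placement, the random replica choice, and all function and node names are assumptions made for illustration; the source does not specify how SenseiDB places shards or balances load.

```python
import random


def place_replicas(num_shards, nodes, replication_factor):
    """Assign each shard to `replication_factor` distinct nodes, round-robin."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    return {
        shard: [nodes[(shard + i) % len(nodes)] for i in range(replication_factor)]
        for shard in range(num_shards)
    }


def shards_on(node, placement):
    """List the shards that a given node holds a replica of."""
    return sorted(s for s, replicas in placement.items() if node in replicas)


def route(shard, placement, rng=random):
    """Load balancer stand-in: pick any replica, since there is no master."""
    return rng.choice(placement[shard])


# Hypothetical cluster: 4 shards, 3 nodes, N = 2 replicas per shard.
placement = place_replicas(4, ["n1", "n2", "n3"], 2)
held_by_n1 = shards_on("n1", placement)
target = route(1, placement, random.Random(0))
```

With four shards on three nodes, at least one node necessarily holds more than one shard, which matches the description above.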