Sqrrl is a graph database management system (DBMS). It is a secure, adaptable NoSQL system built on Apache Accumulo. Founded by former employees of the National Security Agency (NSA), Sqrrl focuses on cybersecurity.
Sqrrl's main product is Sqrrl Enterprise, an enterprise solution built on top of Accumulo that can handle massive structured and/or unstructured data sets. Sqrrl is aimed at users who want to analyze data for security purposes. A Sqrrl whitepaper describes Sqrrl Enterprise as a threat hunting tool that applies linked data analysis to large amounts of data to guide users through the 'hunting loop'. In particular, Sqrrl manages data and can present it to users either raw or as visualizations for analytics, for finding threat patterns, or for further investigation.
Sqrrl has native Hadoop integration and supports multi-structured data.
Sqrrl development began in 2011. The company was founded by Ely Kahn and Oren J. Falkowitz, who worked in cybersecurity before creating Sqrrl. Sqrrl was built on Apache Accumulo. Its customers came from both the private and public sectors.
In January 2018, Sqrrl was acquired by Amazon Web Services (AWS). Existing users of Sqrrl were able to continue using the software without any changes. In late November 2018, Sqrrl co-founder Ely Kahn (who is now a Security Strategist at AWS) commented on the release of AWS Security Hub, which includes visualizations and security remediation recommendations. However, no explicit statements were made about Sqrrl's integration with AWS.
Sqrrl Enterprise offers users a proprietary query language called SqrrlQL that is integrated with the cell-level security concept. Users are able to execute SQL-like queries (key-value), full-text queries, or graph searches.
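As a rough illustration of how cell-level security can interact with a query, the sketch below filters cells by the reader's authorizations before returning them. It does not use SqrrlQL or any Sqrrl API; the `Cell` class, the `visible_cells` function, and the sample labels are all hypothetical. Accumulo visibility expressions can combine labels with both AND and OR; this sketch assumes the simpler case in which a reader must hold every label attached to a cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored cell: a key, a value, and the labels needed to read it (hypothetical model)."""
    key: str
    value: str
    required_labels: frozenset


def visible_cells(cells, authorizations):
    """Return only the cells whose required labels are all held by the reader."""
    auths = frozenset(authorizations)
    return [c for c in cells if c.required_labels <= auths]


# Hypothetical sample data: three cells of one entity with different labels.
cells = [
    Cell("host:10.0.0.5/ip", "10.0.0.5", frozenset()),
    Cell("host:10.0.0.5/owner", "alice", frozenset({"pii"})),
    Cell("host:10.0.0.5/alert", "beaconing", frozenset({"soc", "secret"})),
]

# A reader holding only the "pii" label sees the unlabeled cell and the owner cell.
analyst_view = visible_cells(cells, {"pii"})
```

Because the filter is applied per cell, two readers running the same query over the same entity can receive different subsets of its fields.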
Multi-version Concurrency Control (MVCC)
The details of how Sqrrl implements concurrency control are not publicly known. Sqrrl supports concurrency control by performing atomic updates on data entities.
Based on the Accumulo implementation, checkpoints are organized with the Hadoop Distributed File System (HDFS) in order to create a system of redundant files. The replication framework creates two instances of Accumulo that are consistent with each other.
After data is ingested, the Sqrrl Enterprise product details each piece at a cell level. Data can be a key/value pair or a field in a JSON document. Sqrrl then uses secondary indexing techniques to store the data in Apache Accumulo in its native form. This flat architecture allows Sqrrl to handle data of different structures.
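The idea of breaking a JSON document into individual cells can be sketched as follows. This is only an illustration of the flattening step, not Sqrrl's ingest code: the `flatten` and `ingest` functions and the `doc_id/field.path` key layout are assumptions made for the example.

```python
import json


def flatten(doc_id, node, path=()):
    """Yield one (key, value) pair per leaf field of a parsed JSON document."""
    if isinstance(node, dict):
        for name, child in node.items():
            yield from flatten(doc_id, child, path + (name,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(doc_id, child, path + (str(index),))
    else:
        # Leaf value: the key combines the document id with the field path.
        yield (f"{doc_id}/{'.'.join(path)}", node)


def ingest(doc_id, raw_json):
    """Parse a JSON string and return its cells sorted by key."""
    return sorted(flatten(doc_id, json.loads(raw_json)), key=lambda kv: kv[0])


cells = ingest("doc1", '{"user": {"name": "bob", "ports": [22, 443]}, "ip": "10.0.0.5"}')
```

Since every leaf becomes its own key/value pair, documents with different shapes can sit side by side in the same flat store without a shared schema.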
In 2016, a presentation by Sqrrl stated that the version of Accumulo used by Sqrrl Enterprise stores sorted key-value pairs. Sqrrl then combines the data into graph storage form, using a linked data model.
Sqrrl is therefore considered a graph DBMS because it both represents data to users in a graphical format and is schema-free. Custom iterators for Accumulo developed by Sqrrl allow users to complete graph analysis on the data.
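One way to see how sorted key-value pairs can serve as graph storage is to encode each edge as a key and rely on the sort order to keep a node's outgoing edges contiguous. The sketch below assumes a hypothetical `source + separator + destination` key layout and hypothetical helper names; it is not Sqrrl's actual data model or iterator code.

```python
from bisect import bisect_left
from collections import deque

SEP = "\x00"  # hypothetical separator between source and destination


def edge_key(src, dst):
    """Encode a directed edge as a single sortable key."""
    return f"{src}{SEP}{dst}"


def neighbors(sorted_keys, node):
    """Range scan: all keys starting with node + SEP are adjacent in sorted order."""
    prefix = node + SEP
    i = bisect_left(sorted_keys, prefix)
    found = []
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        found.append(sorted_keys[i][len(prefix):])
        i += 1
    return found


def reachable(sorted_keys, start, max_hops):
    """Breadth-first search limited to max_hops edges from the start node."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nxt in neighbors(sorted_keys, node):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


# Hypothetical linked data: a laptop, its user, an IP address, and a domain.
edges = sorted(edge_key(s, d) for s, d in [
    ("laptop-7", "10.0.0.5"),
    ("laptop-7", "alice"),
    ("10.0.0.5", "evil.example"),
])
two_hops = reachable(edges, "laptop-7", 2)
```

Each neighbor lookup is a binary search followed by a short sequential read, which is the access pattern a sorted key-value store handles well.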
Sqrrl Enterprise ingests data through its native Hadoop integration. The data is encrypted and labeled, then indexed and stored in Sqrrl's version of Apache Accumulo within the Hortonworks Data Platform (HDP). Together, Sqrrl and HDP store multi-structured data in its native format, which allows users of Sqrrl to run many different kinds of queries. Sqrrl is considered a distributed database.
Sqrrl converts SQL queries into Accumulo iterators.
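How Sqrrl performs this translation is not described here, so the following is only a conceptual sketch. Accumulo iterators are stackable components that process key-value pairs during a scan; the sketch mimics that stacking with Python generators for a query of the form `SELECT * FROM events WHERE bytes > 1000 LIMIT 2`. The function names, the table contents, and the query itself are hypothetical.

```python
from itertools import islice


def scan(table):
    """Base of the stack: emit key-value pairs in sorted key order."""
    yield from sorted(table.items())


def where_iterator(source, predicate):
    """Filtering stage: pass through only the pairs that satisfy the WHERE clause."""
    for key, value in source:
        if predicate(key, value):
            yield key, value


def limit_iterator(source, count):
    """Limiting stage: stop after the first `count` pairs."""
    yield from islice(source, count)


# Hypothetical table mapping an event id to a byte count.
events = {"evt3": 5000, "evt1": 200, "evt2": 1500, "evt4": 9000}

# SELECT * FROM events WHERE bytes > 1000 LIMIT 2, expressed as stacked stages.
result = list(limit_iterator(where_iterator(scan(events), lambda k, v: v > 1000), 2))
```

Each clause becomes one stage that consumes the stage below it, so rows that fail the filter are dropped before they reach the caller.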
Delta Encoding Bit Packing / Mostly Encoding
Based on the implementation of Accumulo, Sqrrl likely uses several types of compression. GZip and LZO are two compression algorithms used by Accumulo. In addition, Accumulo uses a form of delta encoding that avoids repeating information when it is the same as the information stored immediately before it. Delta encoding is also used when storing sequential information in the table rows.
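The two ideas above can be illustrated with a small sketch: sorted keys are stored as the length of the prefix shared with the previous key plus the remaining suffix, and sequential numbers are stored as differences. This is a simplified illustration with hypothetical function names, not Accumulo's actual on-disk format.

```python
import os


def encode_keys(sorted_keys):
    """Store each key as (shared prefix length with previous key, remaining suffix)."""
    previous = ""
    encoded = []
    for key in sorted_keys:
        shared = len(os.path.commonprefix([previous, key]))
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def decode_keys(encoded):
    """Rebuild the full keys from the (shared length, suffix) pairs."""
    previous = ""
    keys = []
    for shared, suffix in encoded:
        previous = previous[:shared] + suffix
        keys.append(previous)
    return keys


def delta_encode(values):
    """Store each number as its difference from the previous one."""
    previous = 0
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Rebuild the original numbers by accumulating the differences."""
    total = 0
    values = []
    for delta in deltas:
        total += delta
        values.append(total)
    return values


keys = ["host:10.0.0.5/ip", "host:10.0.0.5/owner", "host:10.0.0.9/ip"]
packed = encode_keys(keys)
timestamps = delta_encode([1000, 1001, 1003, 1010])
```

Sorted keys tend to share long prefixes and sequential values tend to have small differences, so both encodings leave less data for a general-purpose compressor such as GZip to handle.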
Sqrrl Data, Inc.