Derby is a lightweight relational database implemented entirely in Java. It can be embedded in any Java application.
In 1997, Cloudscape Inc., a start-up in Oakland, California, developed a database engine called JBMS, which was later renamed Cloudscape. Between 1999 and 2001, Cloudscape was acquired first by Informix Software and then by IBM, and the product was renamed IBM Cloudscape. In 2004, IBM contributed the code to the Apache Software Foundation as Derby. The Apache DB project, supported by the Apache Software Foundation, aims to create and maintain open-source, high-quality databases. In 2005, Derby exited the incubator and became an Apache DB subproject.
Derby uses row-level and table-level locks to ensure data consistency. It also implements deadlock detection: when a deadlock is found, the transaction holding the fewest locks is aborted. However, Derby has no deadlock prevention mechanism, so avoiding deadlocks depends on correct application logic.
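The victim-selection rule described above can be sketched as follows. This is a minimal illustration, not Derby's actual code: the class name `DeadlockVictimSketch`, the method `chooseVictim`, and the map of transaction IDs to lock counts are all hypothetical.

```java
import java.util.Map;

// Illustrative sketch only; names and data layout are assumptions, not Derby internals.
final class DeadlockVictimSketch {
    // Given the transactions in a wait-for cycle and how many locks each holds,
    // return the ID of the transaction holding the fewest locks.
    static String chooseVictim(Map<String, Integer> locksHeldByTxn) {
        return locksHeldByTxn.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalArgumentException("empty cycle"));
    }
}
```

Aborting the transaction with the fewest locks is one plausible way to limit the amount of work that has to be rolled back.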
Derby combines physical and logical logging. For actions confined to a single page, it uses physical logging. For B-tree operations, which may affect several pages, it uses physical redo and logical undo.
Derby provides two join strategies: nested loop join and hash join. Nested loop join is preferable in most cases. Hash join is preferred when the inner table values are unique and the outer table has many qualifying rows. When the system estimates that a hash join would need more memory than is available, it uses nested loop join instead.
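To make the difference between the two strategies concrete, here is a small in-memory sketch of both. It is not Derby's executor: the class `JoinSketch` and the row layout (`{joinKey, payload}` as an `int[]`) are assumptions chosen for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; each row is a hypothetical {joinKey, payload} pair.
final class JoinSketch {
    // Nested loop join: compare every outer row with every inner row.
    static List<int[]> nestedLoopJoin(List<int[]> outer, List<int[]> inner) {
        List<int[]> out = new ArrayList<>();
        for (int[] o : outer) {
            for (int[] i : inner) {
                if (o[0] == i[0]) {
                    out.add(new int[] {o[0], o[1], i[1]});
                }
            }
        }
        return out;
    }

    // Hash join: build a hash table on the inner rows once, then probe it
    // with each outer row. The whole inner table must fit in memory.
    static List<int[]> hashJoin(List<int[]> outer, List<int[]> inner) {
        Map<Integer, List<int[]>> table = new HashMap<>();
        for (int[] i : inner) {
            table.computeIfAbsent(i[0], k -> new ArrayList<>()).add(i);
        }
        List<int[]> out = new ArrayList<>();
        for (int[] o : outer) {
            for (int[] i : table.getOrDefault(o[0], List.of())) {
                out.add(new int[] {o[0], o[1], i[1]});
            }
        }
        return out;
    }
}
```

The hash table built over the inner rows is what drives the memory estimate mentioned above: if it would not fit, the nested loop variant needs no extra memory beyond the current pair of rows.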
N-ary Storage Model (Row/Record)
Derby implements a row-based storage model. Rows correspond to records in data pages.
Read Uncommitted, Read Committed, Serializable, Repeatable Read
Derby supports all four isolation levels. The levels differ only for SELECT statements; other operations behave the same at every level.
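From an application, the isolation level is normally chosen through the standard JDBC API. The sketch below maps the four level names to the `java.sql.Connection` constants and applies one to a connection; the class `IsolationSketch` and its method names are hypothetical helpers, while `setTransactionIsolation` and the `TRANSACTION_*` constants are part of JDBC.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Illustrative helper; the class and method names are assumptions.
final class IsolationSketch {
    // Map a level name to the corresponding standard JDBC constant.
    static int toJdbcLevel(String name) {
        switch (name) {
            case "READ UNCOMMITTED":
                return Connection.TRANSACTION_READ_UNCOMMITTED;
            case "READ COMMITTED":
                return Connection.TRANSACTION_READ_COMMITTED;
            case "REPEATABLE READ":
                return Connection.TRANSACTION_REPEATABLE_READ;
            case "SERIALIZABLE":
                return Connection.TRANSACTION_SERIALIZABLE;
            default:
                throw new IllegalArgumentException("unknown isolation level: " + name);
        }
    }

    // Apply the named level to an open connection.
    static void apply(Connection conn, String name) throws SQLException {
        conn.setTransactionIsolation(toJdbcLevel(name));
    }
}
```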
Subqueries can be materialized only if they are not correlated with outer queries and return a single row. For subqueries that cannot be flattened (such as those involving DISTINCT), Derby can still optimize the subquery, for example by using a hash join.
Derby supports Java stored procedures.
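A Java stored procedure is an ordinary static method that is registered with a `CREATE PROCEDURE` statement. The following is a minimal sketch: the class `SalesProcedures`, the method `addTax`, and the procedure name `ADD_TAX` are hypothetical, and the SQL in the comment shows how such a method would typically be registered and called.

```java
// Illustrative sketch only. Derby requires the class and the method to be public;
// in a real deployment, declare this class `public` in its own source file.
//
// Registration (run once as SQL):
//   CREATE PROCEDURE ADD_TAX(IN AMOUNT INT, IN RATE INT, OUT TOTAL INT)
//   PARAMETER STYLE JAVA LANGUAGE JAVA NO SQL
//   EXTERNAL NAME 'SalesProcedures.addTax'
//
// Invocation (for example via a JDBC CallableStatement):
//   CALL ADD_TAX(1000, 8, ?)
final class SalesProcedures {
    // An OUT INTEGER parameter is passed as a one-element int array;
    // the procedure writes its result into element 0.
    public static void addTax(int amountCents, int ratePercent, int[] totalCents) {
        totalCents[0] = amountCents + amountCents * ratePercent / 100;
    }
}
```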
Derby implements the standard B+ tree algorithms. It stores keys in leaf pages only. The B+ tree supports page-level latching, and Derby's implementation uses only exclusive latches, not shared latches.
Derby primarily supports on-disk databases. It also provides in-memory databases for testing and developing applications.
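The two modes are selected through the JDBC connection URL. The helper below is a small sketch: the class `DerbyUrls` and its methods are hypothetical, while the `jdbc:derby:` prefix, the `memory:` subprotocol, and the `create=true` attribute are Derby URL conventions.

```java
// Illustrative helper; the class and method names are assumptions.
final class DerbyUrls {
    // URL for an on-disk database, created on first connection if it does not exist.
    static String onDisk(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    // URL for an in-memory database, which is lost when the JVM exits.
    static String inMemory(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }
}
```

With the Derby embedded driver on the classpath, either URL can be passed to `java.sql.DriverManager.getConnection`.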
Derby implements ARIES with slight variations. It allows fuzzy checkpointing. However, unlike a standard ARIES implementation, it does not store a dirty page list in checkpoint records. Instead, during a checkpoint, Derby flushes all database pages to disk. The checkpoint control file contains two low-water marks, "undoLWM" and "redoLWM". "undoLWM" is set to the starting LSN of the oldest active transaction when the checkpoint starts, and "redoLWM" is the current LSN of the checkpoint.
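The two low-water marks can be sketched as a small calculation. This is an illustration of the rule stated above, not Derby's code: the class `CheckpointSketch`, the method `computeLwms`, and the use of plain `long` values for LSNs are assumptions, as is the choice to fall back to the checkpoint LSN when no transaction is active.

```java
// Illustrative sketch only; names and the LSN representation are assumptions.
final class CheckpointSketch {
    // Returns {undoLWM, redoLWM}.
    // checkpointLsn: the current LSN when the checkpoint starts.
    // activeTxnFirstLsns: the first LSN of each transaction active at that moment.
    static long[] computeLwms(long checkpointLsn, long[] activeTxnFirstLsns) {
        // Assumption: with no active transactions, undo has nothing older to reach.
        long undoLwm = checkpointLsn;
        for (long firstLsn : activeTxnFirstLsns) {
            // The oldest active transaction has the smallest starting LSN.
            undoLwm = Math.min(undoLwm, firstLsn);
        }
        long redoLwm = checkpointLsn;
        return new long[] {undoLwm, redoLwm};
    }
}
```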
A foreign key is implemented as one of the CONSTRAINT clauses. There are two levels of constraints: column level and table level. A column-level foreign key constraint enforces that the values in the column correspond to values in the referenced column, which must be marked as a primary key or unique key. A table-level constraint works similarly, but applies to multiple columns.
Insert, update, or delete statements are rejected with a statement exception if they violate the foreign key constraint. The constraint is checked either at statement execution or at commit, depending on the constraint mode (IMMEDIATE or DEFERRED).
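The difference between a column-level and a table-level foreign key is easiest to see in the DDL. The sketch below holds example statements as Java strings and runs them through JDBC; the class `ForeignKeySketch` and all table, column, and constraint names are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Illustrative sketch only; every table, column, and constraint name is made up.
final class ForeignKeySketch {
    static final String CUSTOMERS_DDL =
            "CREATE TABLE customers (id INT NOT NULL PRIMARY KEY)";

    // Column-level constraint: the REFERENCES clause is attached to one column.
    static final String ORDERS_DDL =
            "CREATE TABLE orders (id INT NOT NULL PRIMARY KEY, "
            + "customer_id INT REFERENCES customers(id))";

    static final String ORDER_LINES_DDL =
            "CREATE TABLE order_lines (order_id INT NOT NULL, line_no INT NOT NULL, "
            + "PRIMARY KEY (order_id, line_no))";

    // Table-level constraint: one FOREIGN KEY clause covers several columns.
    static final String SHIPMENTS_DDL =
            "CREATE TABLE shipments (order_id INT, line_no INT, "
            + "CONSTRAINT fk_shipment_line FOREIGN KEY (order_id, line_no) "
            + "REFERENCES order_lines(order_id, line_no))";

    // Create the referenced tables before the tables that reference them.
    static void createSchema(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(CUSTOMERS_DDL);
            stmt.executeUpdate(ORDERS_DDL);
            stmt.executeUpdate(ORDER_LINES_DDL);
            stmt.executeUpdate(SHIPMENTS_DDL);
        }
    }
}
```

After `createSchema` runs, inserting an `orders` row whose `customer_id` has no match in `customers` would be rejected with an `SQLException`.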
Code Generation, JIT Compilation
Derby parses the prepared statement using JavaCC and generates Java bytecode directly. Because the JVM's JIT compiler applies to this generated code, after several executions it is compiled to native code, which improves performance.
JBMS, Cloudscape, Java DB