B+Tree

B+Tree #

A B+Tree is a self balanced tree data structure that keeps data sorted and allows searches, sequential access, insertion, and deletions in O(log(n)). It is optimized for disk-oriented DBMSs that read/write large blocks of data.

Almost every modern DBMS that supports order-preserving indexes uses a B+Tree. There is a specific data structure called a B-Tree, but people also use the term to generally refer to a class of data structures. Modern B+Tree implementations combine features from other B-Tree variants, such as the sibling pointers used in the Blink-Tree.

Formally, a B+Tree is an M-way search tree with the following properties:

  1. It is perfectly balanced (i.e., every leaf node is at the same depth).
  2. Every inner node other than the root is at least half full (M/2 − 1 <= num of keys <= M − 1).
  3. Every inner node with k keys has k+1 non-null children.

Every node in a B+Tree contains an array of key/value pairs:

  1. Arrays at every node are (almost) sorted by the keys.
  2. The value array for inner nodes will contain pointers to other nodes.
  3. Two approaches for leaf node values
    1. Record IDs: A pointer to the location of the tuple
    2. Tuple Data: The actual contents of the tuple is stored in the leaf node

B-Tree FAMILY #

There is a specific data structure called a B-Tree, people also use the term to generally refer to a class of balanced tree data structures:

  • B-Tree(1971)
  • B+Tree(1973)
  • B*Tree(1977?)
  • Blink-Tree (1981)