The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases
Leis et. al., ICDE 2013 [paper]
Tries are an unloved third data structure for building key-value stores and indexes, after search trees (like B-trees and red-black trees) and hash tables. Yet they have a number of very appealing properties that make them worthy of consideration - for example, the height of a trie is independent of the number of keys it contains, and a trie requires no rebalancing when updated. Weighing against those advantages is the heavy memory cost that vanilla radix tries can incur, because each node contains a pointer for every possible value of the ‘next’ character in the key. With ASCII as an example, that’s 256 pointers for every node in the tree.
But the astute reader will feel in their bones that this is naive - there must be more efficient ways to store a set of pointers, indexed by a fixed size set of keys (the trie’s alphabet). Indeed, there are - several of them, in fact, distinguished by the number of children the node actually has, not just how many it might potentially have.
This is where the Adaptive Radix Tree (ART) comes in. In this breezy, easy-to-read paper, the authors show how to reduce the memory cost of a regular radix trie by adapting the data structure used for each node to the number of children that it needs to store. In doing so they show, perhaps surprisingly, that the amount of space consumed by a single key can be bounded no matter how long the key is.
[Read More]