The Index Leaf Nodes


The primary purpose of an index is to provide an ordered representation of the indexed data. It is, however, not possible to store the data sequentially because an insert statement would need to move the following entries to make room for the new one. Moving large amounts of data is very time-consuming so the insert statement would be very slow. The solution to the problem is to establish a logical order that is independent of physical order in memory.

About our book “SQL Performance Explained”
It is well written and is not too heavy on the guts of databases
Mr. B on Amazon.co.uk (5 stars)

The logical order is established via a doubly linked list. Every node has links to two neighboring entries, very much like a chain. New nodes are inserted between two existing nodes by updating their links to refer to the new node. The physical location of the new node doesn’t matter because the doubly linked list maintains the logical order.

The data structure is called a doubly linked list because each node refers to the preceding and the following node. It enables the database to read the index forwards or backwards as needed. It is thus possible to insert new entries without moving large amounts of data—it just needs to change some pointers.

Doubly linked lists are also used for collections (containers) in many programming languages.

Programming LanguageName
Javajava.util.LinkedList
.NET FrameworkSystem.Collections.Generic.LinkedList
C++std::list

Databases use doubly linked lists to connect the so-called index leaf nodes. Each leaf node is stored in a database block or page; that is, the database’s smallest storage unit. All index blocks are of the same size—typically a few kilobytes. The database uses the space in each block to the extent possible and stores as many index entries as possible in each block. That means that the index order is maintained on two different levels: the index entries within each leaf node, and the leaf nodes among each other using a doubly linked list.

Figure 1.1. Index Leaf Nodes and Corresponding Table Data


Figure 1.1 illustrates the index leaf nodes and their connection to the table data. Each index entry consists of the indexed columns (the key, column 2) and refers to the corresponding table row (via ROWID or RID). Unlike the index, the table data is stored in a heap structure and is not sorted at all. There is neither a relationship between the rows stored in the same table block nor is there any connection between the blocks.

If you like my way of explaining things, you’ll love my book.

About the Author

Photo of Markus Winand
Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at http://winand.at/

?Recent questions at
Ask.Use-The-Index-Luke.com

0
votes
0
answers
576
views

Join with inequalities only

Dec 16 at 12:06 Markus Winand ♦♦ 936
inequality join
0
votes
1
answer
375
views

PostgreSQL Scripts: Performance Testing and Scalability problem and question

Nov 12 at 14:53 Markus Winand ♦♦ 936
testing postgresql scalability
0
votes
1
answer
1.1k
views

PostgreSQL Bitmap Heap Scan on index is very slow but Index Only Scan is fast

Oct 31 at 11:31 Markus Winand ♦♦ 936
index postgresql postgres sql