Anatomy of an SQL Index


“An index makes the query fast” is the most basic explanation of an index I have ever heard of. Although it describes the most important aspect of an index very well, it is—unfortunately—not sufficient for this book. This chapter describes the index structure in a less superficial way, but doesn’t dive too deep into details. It provides just enough insight to understand the SQL performance aspects discussed throughout the book.

An index is a distinct structure in the database, which is built by the create index statement. It requires its own disk space and holds a copy of the indexed table data. That means that an index is pure redundancy. Creating an index does not change the table data; it just creates a new data structure that refers to the table. A database index is, after all, very much like the index at the end of a book: it occupies its own space; it is highly redundant; and it refers to the actual information, which is stored at a different place.

Clustered Indexes (SQL Server, MySQL/InnoDB)

SQL Server and MySQL (using InnoDB) take a broader view of what “index” means. They refer to tables that consist of the index structure only, as clustered index. These tables are called Index-Organized Table (IOT) in the Oracle database.

Chapter 5, “Clustering Data: The Second Power of Indexing describes them in more detail and explains their advantages and disadvantages.

Searching in a database indexes is like searching in a printed telephone directory. The key concept is that all entries are lined-up in a well-defined order. Finding data in an ordered data set is fast and easy because the sort-order determines each entries position.

Coaching by the Author
Faster, easier and more memorable than reading.

A database index is, however, more complex than a printed directory because it undergoes permanent change. Updating a printed directory for every change is impossible. Just alone for the reason that there is no space between existing entries to add new ones. A printed directory bypasses this problem, because it covers the accumulated updates with the next print only. An SQL database cannot wait for that. It must process insert, delete and update statements online, keeping the index order without moving large amounts of data.

The database combines two data structures to meet the challenge: a doubly linked list and a search tree. These two structures explain most of the database’s performance behaviour.

Recent Questions at Ask.Use-The-Index-Luke.com

0
votes
1
answer
229
views

query regd the CBO decision

Apr 17 at 10:27 Hulda(suspended)
index-choice optimizer
0
votes
3
answers
2.0k
views

Examples for Function Based Indexes?

Mar 25 at 15:52 Castorp 1
function-based
0
votes
1
answer
610
views

Updating multiple rows using a subquery in SQL

Jan 08 at 09:52 Jan 26
subquery update sql