Sorting and Grouping


An index provides an ordered representation of the indexed data. This principle was already described in Chapter 1. One could also say an index stores the data in a presorted fashion. The index is, in fact, sorted like using the index definition in an order by clause. It is therefore no surprise that we can use indexes to avoid explicit sorting for the order by clause.

Sorting is very resource intensive. First of all, because it needs a fair amount of CPU time. But the main problem is the temporary materialization of the ordered result. Sorting must, after all, read the complete input before it can produce the first output. It cannot be executed pipelined. This is a problem for large data sets.

Faster, easier and more memorable than reading.

But an INDEX RANGE SCAN becomes inefficient for large data sets as well, especially when followed by a table access. That can nullify the savings from using the index to avoid the sort operation. The optimizer might therefore prefer a FULL TABLE SCAN with subsequent sort even if an index could prevent the sort operation.

Indexing the order by clause can bring enormous benefits nevertheless, because it returns the first row immediately after the tree traversal. Further results are delivered in a pipelined manner. Chapter 7, “Partial Results” explains how to exploit the pipelined execution when you need the first rows from a result only.

Using an index for the order by clause saves not only sorting efforts, but also enables further optimization methods that require pipelined execution. That makes the pipelined order by so important that I refer to it as the third power of indexing.

Note

The B-Tree traversal is the first power of indexing.

Clustering is the second power of indexing.

Pipelined order by is the third power of indexing.

This chapter explains how to use an index for a pipelined order by execution. The interaction with the where clause requires special attention, but you must also consider ASC and DESC modifiers in the order by clause. The chapter concludes by applying these techniques to group by clauses.

Recent Questions at Ask.Use-The-Index-Luke.com

0
votes
1
answer
277
views

Updating multiple rows using a subquery in SQL

Jan 08 at 09:52 Jan 26
subquery update sql
1
vote
1
answer
210
views

Should 'id' (the primary key) be included in an index

Jan 03 at 15:24 Jan 26
index include
0
votes
1
answer
227
views

Best index for a multiple join-tables and filter

Jan 03 at 14:31 Markus Winand ♦♦ 216
index join where