Sorting and Grouping


Sorting is a very resource intensive operation. It needs a fair amount of CPU time, but the main problem is that the database must temporarily buffer the results. After all, a sort operation must read the complete input before it can produce the first output. Sort operations cannot be executed in a pipelined manner—this can become a problem for large data sets.

An index provides an ordered representation of the indexed data: this principle was already described in Chapter 1. We could also say that an index stores the data in a presorted fashion. The index is, in fact, sorted just like when using the index definition in an order by clause. It is therefore no surprise that we can use indexes to avoid the sort operation to satisfy an order by clause.

Ironically, an INDEX RANGE SCAN also becomes inefficient for large data sets—especially when followed by a table access. This can nullify the savings from avoiding the sort operation. A FULL TABLE SCAN with an explicit sort operation might be even faster in this case. Again, it is the optimizer’s job to evaluate the different execution plans and select the best one.

About our book “SQL Performance Explained”
This book is definitively worth having in the company library.
” — Joe Celko

An indexed order by execution not only saves the sorting effort, however; it is also able to return the first results without processing all input data. The order by is thus executed in a pipelined manner. Chapter 7, “Partial Results” , explains how to exploit the pipelined execution to implement efficient pagination queries. This makes the pipelined order by so important that I refer to it as the third power of indexing.

Note

The B-Tree traversal is the first power of indexing.

Clustering is the second power of indexing.

Pipelined order by is the third power of indexing.

This chapter explains how to use an index for a pipelined order by execution. To this end we have to pay special attention to the interactions with the where clause and also to ASC and DESC modifiers. The chapter concludes by applying these techniques to group by clauses as well.

If you like my way of explaining things, you’ll love my book.

About the Author

Photo of Markus Winand
Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at http://winand.at/

?Recent questions at
Ask.Use-The-Index-Luke.com

0
votes
1
answer
164
views

PostgreSQL Scripts: Performance Testing and Scalability problem and question

Nov 12 at 14:53 Markus Winand ♦♦ 936
testing postgresql scalability
0
votes
1
answer
562
views

PostgreSQL Bitmap Heap Scan on index is very slow but Index Only Scan is fast

Oct 31 at 11:31 Markus Winand ♦♦ 936
index postgresql postgres sql
3
votes
2
answers
588
views

pagination with nulls

Oct 29 at 22:39 Rocky 46
pagination