by Markus Winand.

Indexing Order By


SQL queries with an order by clause do not need to sort the result explicitly if the relevant index already delivers the rows in the required order. That means the same index that is used for the where clause must also cover the order by clause.

As an example, consider the following query that selects yesterday’s sales ordered by sale date and product ID:

SELECT sale_date, product_id, quantity
  FROM sales
 WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY
 ORDER BY sale_date, product_id

There is already an index on SALE_DATE that can be used for the where clause. The database must, however, perform an explicit sort operation to satisfy the order by clause:

An INDEX RANGE SCAN delivers the result in index order anyway. To take advantage of this fact, we just have to extend the index definition so it corresponds to the order by clause:

  DROP INDEX sales_date
CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)
SELECT sale_date, product_id, quantity
  FROM sales
 WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY
 ORDER BY sale_date, product_id

The sort operation SORT ORDER BY disappeared from the execution plan even though the query still has an order by clause. The database exploits the index order and skips the explicit sort operation.

Important

If the index order corresponds to the order by clause, the database can omit the explicit sort operation.

Even though the new execution plan has fewer operations, the cost value has increased considerably because the clustering factor of the new index is worse (see Automatically Optimized Clustering Factor). At this point, it should just be noted that the cost value is not always a good indicator of the execution effort.

Automatically Optimized Clustering Factor

The Oracle database keeps the clustering factor at a minimum by considering the ROWID for the index order. Whenever two index entries have the same key values, the ROWID decides upon their final order. The index is therefore also ordered according to the table order and thus has the smallest possible clustering factor because the ROWID represents the physical address of table row.

By adding another column to an index, you insert a new sort criterion before the ROWID. The database has less freedom in aligning the index entries according to the table order so the index clustering factor can only get worse.

Regardless, it is still possible that the index order roughly corresponds to the table order. The sales of a day are probably still clustered together in the table as well as in the index—even though their sequence is not exactly the same anymore. The database has to read the table blocks multiple times when using the SALE_DT_PR index—but these are just the same table blocks as before. Due to the caching of frequently accessed data, the performance impact could be considerably lower than indicated by the cost values.

For this optimization, it is sufficient that the scanned index range is sorted according to the order by clause. Thus the optimization also works for this particular example when sorting by PRODUCT_ID only:

SELECT sale_date, product_id, quantity
  FROM sales
 WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY
 ORDER BY product_id

In Figure 6.1 we can see that the PRODUCT_ID is the only relevant sort criterion in the scanned index range. Hence the index order corresponds to the order by clause in this index range so that the database can omit the sort operation.

Figure 6.1 Sort Order in the Relevant Index Range

This optimization can cause unexpected behavior when extending the scanned index range:

SELECT sale_date, product_id, quantity
  FROM sales
 WHERE sale_date >= TRUNC(sysdate) - INTERVAL '1' DAY
 ORDER BY product_id

This query does not retrieve yesterday’s sales but all sales since yesterday. That means it covers several days and scans an index range that is not exclusively sorted by the PRODUCT_ID. If we look at Figure 6.1 again and extend the scanned index range to the bottom, we can see that there are again smaller PRODUCT_ID values. The database must therefore use an explicit sort operation to satisfy the order by clause.

If the database uses a sort operation even though you expected a pipelined execution, it can have two reasons: (1) the execution plan with the explicit sort operation has a better cost value; (2) the index order in the scanned index range does not correspond to the order by clause.

Support My Work

I offer SQL training, tuning and consulting. Buying my book “SQL Performance Explained” (from €9.95) also supports my work on this website.

A simple way to tell the two cases apart is to use the full index definition in the order by clause—that means adjusting the query to the index in order to eliminate the second cause. If the database still uses an explicit sort operation, the optimizer prefers this plan due to its cost value; otherwise the database cannot use the index for the original order by clause.

Tip

Use the full index definition in the order by clause to find the reason for an explicit sort operation.

In both cases, you might wonder if and how you could possibly reach a pipelined order by execution. For this you can execute the query with the full index definition in the order by clause and inspect the result. You will often realize that you have a false perception of the index and that the index order is indeed not as required by the original order by clause so the database cannot use the index to avoid a sort operation.

If the optimizer prefers an explicit sort operation for its cost value, it is usually because the optimizer takes the best execution plan for the full execution of the query. In other words, the optimizer opts for the execution plan which is the fastest to get the last record. If the database detects that the application fetches only the first few rows, it might in turn prefer an indexed order by. Chapter 7, “Partial Results, explains the corresponding optimization methods.

Previous pageNext page

You can’t learn everything in one day. Subscribe the newsletter via E-Mail, Twitter or RSS to gradually catch up. Have a look at modern-⁠sql.com as well.

About the Author

Photo of Markus Winand

Markus Winand provides insights into SQL and shows how different systems support it at modern-sql.com. Previously he made use-the-index-luke.com, which is still actively maintained. Markus can be hired as trainer, speaker and consultant via winand.at.

Buy the Book

Cover of “SQL Performance Explained”: Squirrel running on grass

The essence of SQL tuning in 200 pages

Buy now!
(paperback and/or PDF)

Paperback also available at Amazon.com.

Hire Markus

Markus offers SQL training and consulting for developers working at companies of all sizes.
Learn more »

Connect with Markus Winand

Markus Winand on LinkedInMarkus Winand on XINGMarkus Winand on Twitter
“Use The Index, Luke!” by Markus Winand is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.
Legal | Contact | NO WARRANTY | Trademarks | Privacy and GDPR | CC-BY-NC-ND 3.0 license