Don’t ask for what you don’t need


select * has a bad reputation. Most people think it is because changes in the table definition cause hard-to-detect bugs. Listing all columns in the desired order solves that problem, right? Yes, but that doesn’t solve the second problem of select *.

The other problem is that asking for more columns causes more work for the database. The surprise is that the performance impact can be huge.

Up to 100x slower when preventing an Index-Only Scan

Broadly speaking, the less columns you ask for, the less data must be loaded from disk when processing your query. However, this relationship is non-linear.

Quite often, selecting from a table involves two steps: (1) use an index to find the address where the selected rows are stored; (2) load the selected rows from the table. Now imagine that you are just selecting columns that are present in the index. Why should the database still perform the second step? In fact, most databases don’t. They can process your query just with the information stored in the index—hence index-only scan.

But why should an index-only scan be 100 times faster? Simple: an ideal index stores the selected rows next to each other. It’s not uncommon that each index page holds about 100 rows—a ballpark figure; it depends on the size of the indexed columns. Nonetheless, it means that one IO operation might fetch 100 rows. The table data, on the other hand, is not organized like that (exceptions). Here it is quite common that a page just contains one of the selected rows—along with many other rows that are of no interest for the particular query. So, the reason an Index-Only Scan can be 100 times faster is that an index access can easily deliver 100 rows per IO while the table access typically just fetches a few rows per IO.

If you select a single column that’s not in the index, the database cannot do an index-only scan. If you select all columns, … , well I guess you know the answer.

Further, some databases store large objects in a separate place (e.g., LOBs in Oracle). Accessing those causes an extra IO too.

Up to 5x slower when bloating server memory footprint

Although databases avoid storing the result in the server’s main memory—instead they deliver each row after loading and forget about it again—it is sometimes inevitable. Sorting, for example, needs to keep all rows—and all selected columns—in memory to do the job. Once again, the more columns you select, the more memory the database needs. In the worst case, the database might even need to do an external sort on disk.

However, most database are extremely well tuned for this kind of workload. Although I’ve seen a sorting speed-up of factor two quite often—just by removing a few unused columns—I cannot remember having got more than factor five. However, it’s not just sorting, hash joins are rather sensitive to memory bloat too. Don’t know what that is? Please read this article.

These are just the two top issues from database perspective. Remember that the network suffers too and that the client needs to process the data as well—which might put a considerable pressure on garbage collection.

Many people think the star is the bad thing. Consequently they believe they are not having any problem if their ORM lists all columns by name anyway. In fact, the crime is to select all columns without thinking about it—and most ORMs readily commit this crime on behalf of their users.

When we say “select * is bad” we use the star as a symbol for “selecting everything without thinking about it”. This is the bad thing. If you need a more catch phrase to remember that, take this:

It’s not about the star, stupid!

If you like my way to explain things, you’ll love my book SQL Performance Explained.

About the Author

Photo of Markus Winand

Markus Winand provides insights into SQL and shows how different systems support it at modern-sql.com. Previously he made use-the-index-luke.com, which is still actively maintained. Markus can be hired as trainer, speaker and consultant via winand.at.

Buy his Book

Cover of “SQL Performance Explained”: Squirrel running on grass

The essence of SQL tuning in 200 pages

Buy now!
(paperback and/or PDF)

Paperback also available at Amazon.com.

Hire Markus

Markus offers SQL training and consulting for developers working at companies of any size.
Learn more »

Do not use offset for pagination

Learn why

Visit my sibling!A lot changed since SQL-92!

The Use The Index, Luke! mug

Stickers, coasters, books and coffee mugs. All you need for learning.

Shop now

Connect with Markus Winand

Markus Winand on LinkedInMarkus Winand on XINGMarkus Winand on Twitter
“Use The Index, Luke!” by Markus Winand is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.
Legal | Contact | NO WARRANTY | Trademarks | Privacy and GDPR | CC-BY-NC-ND 3.0 license