Use The Index, Luke Blog - Latest News About SQL Performance

The two top performance problems caused by ORM tools

ORMs are not entirely useless…” I just tweeted in response to a not exactly constructive message that we should fire developers who want to use ORMs. But than I continued in a not exactly constructive tone myself when I wrote "…the problem is that the ORM authors don’t know anything about database performance”.

Well, I don’t think they “don’t know anything” but I wonder why they don’t provide decent solutions (including docs) for the two most striking performance problems caused by ORMs? Here they are:

The infamous N+1 selects problem

This problem is actually well-known and I believe there are working solutions in all recent ORMs. The problem that remains is about documentation and culture: although there are solutions, many developers are not aware of them or still live in the “joins are slow—let’s avoid them” universe. The N+1 selects problem seems to be on the decline but I still think ORMs’ documentation should put more emphasis on joins.

For me, it looks very similar to the SQL injection topic: each and every database access layer provides bind parameters but the documentation and books just show examples using literal values. The result: SQL injection was the most dangerous weakness in the CWE/SANS Top 25 list. Not because the tools don’t provide proper ways to close that hole, but because the examples don’t use them consistently.

The hardly-known Index-Only Scan

Although the Index-Only Scan is one of the most powerful SQL tuning techniques, it seems to be hardly known by developers (related SO question I was involved in recently). However, I’ll try to make the story short.

Whenever a database uses an index to find the requested data quickly, it can also use the information in the index to deliver the queried data itself—if the queried data is available in the index. Example:

     ( last_name  VARCHAR(255)
     , first_name VARCHAR(255)
     -- more columns and constraints

    ON demo (last_name, first_name);

SELECT last_name, first_name
  FROM demo
 WHERE last_name = ?;

If the database uses the IOS_DEMO index to find the rows in question, it can directly use the first name that is stored along with the last name in the index and deliver the queries’ result right away without accessing the actual table. That saves a lot of IO—especially when you are selecting more than a few rows. This technique is even more useful (important) for databases that use clustered indexes like SQL Server or MySQL+InnoDB because they have an extra level of indirection between the index and the table data.

Did you see the crucial prerequisite to make an Index-Only Scan happen? Or asking the other way around: what’s a good way to make sure you’ll never get an Index-Only Scan? Yes, selecting all columns is the most easy yet effective Index-Only-Scan-Preventer. And now guess who is selecting all the columns all the time on behalf of you? Your good old friend the ORM-tool.

This is where the tool support is really getting sparse. Selecting partial objects is hardly supported by ORMs. If it is supported then often in a very inconvenient way that doesn’t give runtime control over the columns you’d like to select. And for gods sake, let’s forget about the documentation for a moment. To say it straight: using Index-Only Scans is a pain in the a** with most ORMs.

Besides Index-Only Scans, not selecting everything can also improve sorting, grouping and join performance because the database can save memory that way.

What we would actually need to get decent database performance is a way to declare which information we are going to need in the upcoming unit of work. Malicious tongues might now say “that’s SQL!” And it’s true (in my humble opinion). However, I do acknowledge that ORMs reduce boilerplate code. Luckily they often offer an easier language than SQL: their whatever-QL (like HQL). Although the syntactic difference between SQL and whatever-QL is often small, the semantic difference is huge because they don’t work on tables and columns but on objects and classes. That avoids a lot of typing and feels more natural to developers from the object world. Of course, the whatever-QL needs to support everything we need—also partial objects like in this Doctrine example.

After all, I think ORM documentation should be more like this: first introduce whatever-QL as simplified SQL dialect that is the default way to query the database. Those methods that are currently mentioned first (e.g., .find or .byId) should be explained as a shortcut if you really need only one row from a single table.

If you like my way to explain things, you’ll love SQL Performance Explained.

I need your help!


Would you please spare me a few minutes and help me a little bit?

As you might know—or maybe not—I’m making my living as an independent trainer and consultant. Up till now I’ve only delivered on-site training at the clients’ site, but I though it makes sense to offer open trainings as well so that singe participants can also join. For that I’d need to know how many people would like to join such a training, where they are physically located, and which database they are using. So, I’ve set up a short survey:

It’s just one form and won’t take much time. Naturally, you are under no obligation if you fill out the form, and as a way of saying thank you for your time, I’ll be drawing three participants to receive a DON’T PANIC towel—the perfect preparation for the upcoming Towel Day on 25th May. So take a minute to complete the survey now because the deadline is 26th of April!



Party time

Good news everybody. I’m in a good mood today ;) I’ll explain. First of all, the French edition of SQL Performance Explained is almost done and will ship at the end of the month (pre-order now ;). It has a different name (SQL : Au cœur des performances) but is otherwise the same. So, that triggered some kind of "project done" feeling.

Further, I’ve given a few good trainings recently and looking forward for some more in the next weeks. It’s just awesome when the audience comes up with great questions, but it’s even more awesome if they come up with great conclusions that proof their correct understanding. That triggers a "missing accomplished" feeling.

So, as the pre-sale of the French edition is running on a 10% discount, I though why not giving a general 10% discount to celebrate the French edition? And here it is: ’aucoeur’. It’s only valid for order placed directly at (not valid on Amazon) and expires on March 28 (the release date of the French edition).

Further, I’ll repeat the ’retweet to win’ campaign every week during March. It goes like this: I’ll tweet something like ’Retweet this for a chance to win’. After a while, I’ll select one of the retweeters as winner. You can’t win twice (sorry @d_gustafsson). So, watch out, the first giveaway tweet comes soon.

Better Performance with PostgreSQL

Dalibo, the company where Guillaume Lelarge works (he does the French translation of SQL Performance Explained) invited me to give a talk at their (mostly French) for-free conference “Better Performance with PostgreSQL” on 28th March in Paris. Seats are limited: if you’d like to go there, register now!

And if you don’t speak French, there are other good news for you. SQL Performance Explained is currently on sale at for £25.00 (you save £1.99, -7%). Please note that this is the free-delivery threshold for these countries: Belgium, Denmark, Luxembourg, Netherlands, Andorra, Finland, Gibraltar, Greece, Iceland, Ireland, Italy, Liechtenstein, Norway, Portugal, San Marino, Spain, Sweden, Vatican city and Poland. They will also ship for free to UK of course.

FOSDEM Impressions

Last weekend I’ve visited the Free and Open source Software Developers’ European Meeting (FOSDEM) in Brussels, Belgium. It is a huge and free conference covering countless open source topics such as Open/Libre Office, Microkernels, MySQL, NoSQL, PostgreSQL, OWASP, and much more.

On the day before FOSDEM, Friday, I’ve been at the PgDay giving my talk “Pagination Done the PostgreSQL Way” [slides]. I think it was well received but I’m still curious to get the feedback from the feedback forms. So, if you have been there, please fill in the feedback form!

Here you find a short list of notable talks. I’ve actually attended more, I’m just mentioning the most impressive ones.

The GNU/Hurd architecture, nifty features, and latest news

Hurd is still being released in two years. This has been funny 15 years ago, but is quite sad today. However, they aim to get a fully supported Debian distro based on Hurd by that time. Good luck. It seems that current virtualization developments (e.g. Linux containers) are a very good fit for the Hurd architecture, effectively not needing any changes to support containers natively.

Lightning talk: Spoiling and Counter-spoiling

An attempt to make a coding contest that honours maintainable code. The problem: how to measure it? The solution: make three contests: (1) one for writing maintainable code; (2) one for incorporating difficult bugs into that code; (3) and a last one to fix those bugs again. Add some statistics to that and you know who’s code had the best maintainability by measuring how long it took to find and fix the bugs. Funny idea, but not yet executed.

Practical Security for developers, using OWASP ZAP

Promising tool for security checks (brute force). Video available at the homepage.

Lightning talk: The C2 programming language

Another successor of C. It’s a minimalistic approach to just change what really hurts in C. That are a few syntax issues and the preprocessor (#include). At a very early stage. See

How to Share a Trademark

About legal issues (e.g., Trademark vs. Registered Trademark). Some example cases from the open source scene.

Lightning talk: BIND 10: DNS by Cooperating Processes

Have I ever mentioned that I used to be the DNS admin of an Austrian ISP? It’s more than 10 years ago, but I still hate BIND ;) Privately I’m only using djbdns since then. One of the fundamental design principles of djbdns is that there is a separation of concerns (e.g., serving a zone authoritatively is different from acting as caching resolver). Guess what’s gonna be introduced in BIND 10 ;)

PostGIS 2.0 and beyond

Really interesting talk where PostGIS is coming from (1.5 releases) and how 2.0 differs from it. One of the nice things about PostGIS presentations is that they always have nice visualizations (by definition). There was also a PostGIS 3D talk at PgDay.

When and How to Take Advantage of New Optimizer Features in MySQL 5.6

I’ve been to another MySQL talks as well and I must say they don’t feel like FOSS talks. It’s just Oracle presenting. However, the topics presented are no surprise (e.g. Index Condition Pushdown, a.k.a. Index-Filterpredicates). The scary thing about this talk was that the examples almost always needed hits to work. Sure you could say that these are just made up examples, but I think it just a proof of the optimizers weakness if it needs all the time hints to do the “right thing”.

A real-time architecture using Hadoop & Storm

Presenting the “classical” batch/online mixture to handle BigData: Running Hadoop/MapReduce in Batches and processing the new data online (in memory). To get the full picture, you must query both of them.

Top Tweets January 2013

This is a short collection of the best tweets on @SQLPerfTips and/or @MarkusWinand for those who missed them and those who don’t use twitter at all.

As I’m late posting the January tweets, I also include an awesome picture taken at FOSDEM (February, technically).

I’m speaking at FOSDEM PGDay 2013

My talk “Pagination done the PostgreSQL way” was accepted for the one-day PostgreSQL conference PGDay on 1st Feb 2013 just before FOSDEM in Brussels.

As this is the first talk I’ll deliver on a conference, I’ve not yet prepared anything. However, I’ve three weeks left to prepare an awesome talk. Luckily, I do have some ideas to make this talk awesome, but casting a few good ideas into a well-delivered presentation is a non-trivial task. However, I’ll try hard. Promised.

Btw, Guillaume LeLarge, who is currently doing the French translation of SQL Performance Explained, is also there.

On the following weekend, there will be the Free and Open source Software Developers’ European Meeting (FOSDEM) in Brussels. From the website: “It is renowned for being highly developer-oriented and brings together 5000+ geeks from all over the world. No registration necessary.” At FOSDEM you can find all kinds of topics: from Ada and Free Java to Python, from MySQL and PostgreSQL to NoSQL. Just have a look at the schedule to get an idea.

Top Tweets December 2012

This is a short collection of the best tweets on @SQLPerfTips and/or @MarkusWinand for those who missed them and those who don’t use twitter at all.

And I want to take this opportunity to wish you a happy new year :)

French anybody?

Good news everybody—well, at least if you speak French. Use The Index, Luke and SQL Performance Explained will be translated. You can already start reading the first chapters here.

How did that happen? We all owe Guillaume Lelarge (blog|twitter) a big thank you for that! He contacted me a while ago about the book and suggested to translate it into French. My first reaction was very skeptical—not because I don’t believe there is a need for a French translations, but because I know translating a book is very hard work. However, Guillaume has already translated a book (Bases de données PostgreSQL) and the PostgreSQL documentation (major parts of it). And the best thing is: these are all database topics. I’m quite sure that he is the right one :)

We will publish the translated chapters in a step-by-step manner here on Use The Index, Luke! The print and PDF editions will be available once the translation is done (no release date yet). In the meanwhile I invite you to read the first parts and report errors if you find them—there is an error reporting form on the bottom of each page. Besides that you should also send Guillaume some kudos and read his post about the translation.

Instant What?

You might guess it, but I’m not making my living from this site nor are the book sales paying all the bills—although the books pays some bills, thank you! However, to pay the remaining bills I’m doing coaching, training and consulting. You can find all the details on my professional site but I’d like to highlight one service here: Instant Coaching.

After googling and trying many ways to solve a problem, have you ever got that feeling of resignation where you just want to ask somebody who should know the answer? Well, that’s Instant Coaching—at least if your problem is about performance and/or databases.

Instant Coaching is remarkably easy and efficient: we just arrange an online meeting where you present your problem or question to me. We’ll use a desktop sharing tool (TeamViewer) so that I can see your screen and—if you like—show you something on your computer (control your mouse/keyboard). Instant Coaching needs virtually no preparation: it often starts just a few minutes after you faced the problem—hence “Instant”!

Another critical success factor for Instant Coaching is that we can put our knowledge and experiences together: you bring the know-how about your application and your data while I’m adding a little database and performance know-how to the mix. Instead of guessing about things I don’t know about your application or you don’t know about databases, we can just ask each other! I’m usually getting asked a lot: you don’t just get the “magic” solution, but you learn about the rational behind it so that can solve similar problems on your own the next time—hence “Coaching”!

If you have read this marketing post up to here, you’ll even get a special present: just tell me about this blog post and I’ll give you twenty percent off on any Instant Coaching session. And to make that even better, I’ll give up to two hours free Instant Coaching to free open source projects (FOSS). Both offers valid till July 30, 2014 and until I’m booked out.

So, reserve your seat by filling in the Instant Coaching registration now!

About the Author

Photo of Markus Winand
Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at