Use The Index, Luke Blog - Latest News About SQL Performance


Joe Celko’s review of “SQL Performance Explained”

Joe Celko’s review of my book SQL Performance Explained was just published. The bottom line: “This book is definitively worth having in the company library.”

PG.Conf EU 2012 — Summary

With a little delay of just three weeks, I’m writing something about this years PostgreSQL conference Europe.

Including the training day, it was four days full of database stuff—in other terms: an awesome time, if you love databases :) But let’s start from the beginning.

I booked the full day training with Joe Celko on the first day. And I must say that was the best part of the conference—mostly because Joe is great. If you never meet Joe Celko, or don’t even know who the heck he is, you should definitely look for an opportunity to meet him.

The training he gave (“A day of SQL with Celko”) was a mixture of SQL related topics like keys, naming (e.g., plurals for tables), metadata, rollup (group by grouping sets, group by cube) and trees. Due to my day to day work, I’ve already seen some of the “rarely known features” he mentioned. It was still very valuable because he presented good examples for them. The “nested sets model” is particularly noteworthy. I hope to write about some of these topics soon.

Outside of the training, it was also very nice to talk to Joe. He even signed a copy of my book for me—kidding that he should write “I did not write this book.” On the next day, when I was skipping one session, I had a chance to chat with him. So, I challenged him about his famous “80-95% of the work in SQL is done in the DDL, not the DML” statement (e.g., recently cited here and here) and asked if he considers create index as part of DDL in this context. Well, there is a good reason create index is not part of the SQL standard: because indexing is an implementation detail. Still, I argued that it is required to make the final system working and I believe Joe agreed after a while. I hope he still agrees after a few weeks :)

Day 2, after Joe’s opening keynote, I attended to Bruce Momjian’s “Programming the SQL Way with Common Table Expressions.” Well, I did not like this talk because it was not focused on CTEs in general, but mostly about the "writable CTE" extenstion by PostgreSQL. Sure it is OK to focus on specifics on a PostgreSQL conference, but I don’t like this particular feature—even though I like some other proprietary features of PostgreSQL.

In the afternoon, I listened to Guillaume Lelarge’s “Understanding EXPLAIN’s output”. He covered almost all explain plan operations in 50 minutes, but it was still possible to follow him. I learned about EXPLAIN BUFFERS (how could I miss that so far?).

The talk about range types (“Your Life Will Never Be The Same”) did not exactly hold up to my expectations. It introduced range types in a, well, superficial way so that I neither had questions during the talk, nor did I feel enlightened. The only thing that remained was the feeling that I should have a closer look into this.

Josh Berkus’s talk “Elephants and Windmills” was also a little bit disappointing. It was basically about a data warehouse, and as far as I could see, there was not particularly noteworthy innovation in their solution. On the contrary: in the age of NoSQL, sharding and cloud computing, it sounds strange that they just dedicated separate hardware to handle the load of separate clients because it was just too much for a single box. However, I might have missed a critical point.

I liked Bruce Momjians second talk “Inside PostgreSQL Shared Memory”. It was just what I expected: some insight into internals you probably don’t need to know about :)

The last talk on Thursday “Index support for regular expression search” by Alexander Korotkov was very interesting. He is basically transforming regular expressions to logical expressions made up of trigraphs (3-letter groups) like this: /[ab]cde/ => (acd OR bcd) AND cde (taken from his slides). After that, just use the existing PostgreSQL trigraph index, check each trigraph and apply the logic. Finally, the full regular expression must be applied to filter false positives (the logical expression can match too many rows). After all, it is very promising research that might give considerable speed improvements in some cases.

Friday morning: “Beyond Query Logging”. It basically explained new ways to find slow queries—which is, an absolute essential trouble shooting tool. It seems that PostgreSQL is coming considerably closer to the tools other databases, like Oracle, support.

Gianni Ciolli’s "Debugging complex SQL queries with writable CTEs" was the last session I attended. Well, writeable CTEs. I had to attend that just to give writeable CTEs a second chance. I liked the presentation (Gianni does it in a very funny way) and I didn’t dislike the use of writable CTEs in this context because it is not intended for production use. The trick is as following: Just convert all sub-queries into CTEs, and capture the content of each CTE with another writable CTE that copies the content into a debug table. In this way, you can capture the intermediate results of an SQL statement and review it later to see where you have an error in the statement. Without this approach, we have to run each part of the SQL statement after another. It is a nice gimmick presented in a very funny way, which doesn’t change my opinion about writable CTEs.

There were also two social events (Wednesday and Thursday evening, sponsored by EnterpriseDB and Heroku) and it was very nice to chat about the talks, about Prague, and about SQL Performance Explained. Writing a book is a very lonesome job—it was great to meet some of my readers in person.

Stuff you missed without Twitter

If you are not following my SQL Performance Tips (@SQLPerfTips) feed on twitter, you are missing a lot. However, two things you should probably know, even without using twitter:

All You Base Conf — Database conference for web developers

I’m in no way affiliated with those guys, but they were so nice to offer a discount code for my readers (followers?). The conference takes place in Oxford, UK on 23rd Nov 2012. Have a look at http://allyourbaseconf.com/ for more details. Use the promo code “SQLPerf” to get £35 off the standard ticket price.

Celebrating 10k followers

My @SQLPerfTips account has crossed the 10k followers mark recently. So I though I’ll celebrate this with another discount code (10kflwrs) that is good for -10% for orders of “SQL Performance Explained” when placed directly at http://sql-performance-explained.com/. This code expires on Fri, 12th Oct 20:00 Vienna local time (CEST). That’s 11am PDT, for example.

My PGConf.EU 2012 Schedule

As mentioned before, I’ll be at this year PostgreSQL Conference in Prague. Still two weeks to go, but I just had a few minutes "free" time and prepared my personal schedule.

As usual, I plan to attend as many sessions as possible. As usual, I’ll skip some in favour of a nice chat and/or change my mind in the last moment.

Tuesday, October 23

I’ll spend the whole day with Joe Celko’s session “A day of SQL with Celko.” Thought it might be a good idear to listen to him—probably less time consuming than reading his books ;)

Wednesday, October 24

11:10 - 12:00 — Room: Seine

“Programming the SQL Way with Common Table Expressions.” CTE’s are not exactly new to me, still I’m hardly using them. Hope to see some good examples how they can help.

12:10 - 13:00 — Room: Thames

“MultiMaster Replication: Applications, Comparison, Implementation.” What caught my attention was “ various replication implementations in CouchDB, MongoDB, SQLServer and Oracle” — whow, that sounds interesting. NoSQL/SQL comparisons are difficult, let’s see how this one works out.

14:00 - 14:50 — Room: Seine

“Understanding EXPLAIN’s output.” I guess everybody knows: I love execution plans :)

15:20 - 16:10 — Room: Thames

“Range Types in PostgreSQL 9.2 - Your Life Will Never Be The Same.” We will see...

16:20 - 17:10 — Room: Thames

“High availability in Postgres-XC, the symmetric PostgreSQL cluster.” Again, something in the domains of NoSQL...

Thursday, October 25

09:30 - 10:20 — Room: Vltava

“Marketing PostgreSQL.” When read “Your Life Will Never Be The Same” above I immediately though that must be a marketing guy. Confirmed. However, I need to do a lot of marketing as well.

10:50 - 11:40 — Room: Thames

“Universal Data Access with SQL/MED.”

11:50 - 12:40 — Room: Vltava

“Elephants and Windmills.”

13:40 - 14:30 — Room: Seine

“Inside PostgreSQL Shared Memory.”

14:40 - 15:30 — Room: Vltava

“Index support for regular expression search.” No comment :)

Friday, October 26

09:30 - 10:20 — Room: Seine

“Beyond Query Logging.”

10:50 - 11:40 — Room: Thames

“Migrating Oracle queries to PostgreSQL.” ... subtle (often undocumented) differences are highlighted.

11:50 - 12:40 — ??

Debugging complex SQL queries with writable CTEs” —OR— “PG-Strom - GPU Accelerated Asynchronous Query Execution Module” Here I’ve a real problem, both sound very interesting.

13:40 - 14:30 — Room: Seine

”Large Scale MySQL Migration to PostgreSQL.”

PGConf.EU 2012

Oh, one more thing that expires on Friday: the early bird rate for this years PostgreSQL Conference Europe in Prague. I’ll be there. Joe Celko also.

Convinced

After publishing the English edition of “SQL Performance Explained” there was a huge demand for an ebook edition. After a few days answering these queries, I though it’s probably easier to offer an e-book than to explain why there isn’t one. Well, I must admit that I don’t read ebooks very much. My wife has an e-ink device, but is hardly using it (probably in lack of German content). I’ve uploaded the ePUB pre-release from 2011 and must say it works, but isn’t nice. However, many people told me they would be fine with a PDF, because they are anyway using tablets or smartphones to read their ebooks. Fast forward a few weeks: since Monday I can offer a PDF download for just EUR 9.95 (English, German [update] or French, you choose).

Afterwards there was a huge demand—again—for a bundle discount when purchasing both, the PDF and the hard copy. That caught me by surprise. Not that people want a discount, but that people would like to buy both editions. So I asked, and listened, and learned something about my own business.

Sure there were usual suspected like “I’d just like the PDF for searching” “to have it with me when being on the road" and so forth. But the interesting point was “I’d like the hard copy, but would also like to start reading immediately.” Well, I didn’t think of that (shame on me). Although I’m offering free shipping to 150+ countries, it takes its time because I’m shipping from Europe only once a week. Instead of express delivery, I just need to offer a bundle.

It took another day to tweak my accounting software because there are different VAT rates applied to real books and PDF downloads—but the accounting software could do only one VAT rate for each invoice. After all I’m just a geek like most of you, so I hacked a day and made it work. The bundle is on sale for EUR 34.95 since yesterday evening—that’s a EUR 4.95 discount.

And now, I’ll be your “What-If” Machine:

What if I already bought the print edition and would like to get the PDF and the bundle discount?

No problem, you’ll get the discount! I don’t see any reason I should treat the “early adopters” any worse than people who didn’t buy the print ed yet.

UPDATE 2013-07-08: It currently works like this: On the invoice for the print edition, you’ll find a discount code to get the PDF edition for the reduced price. Same is true vice versa.

OUTDATED: Please fill in the order form and select the PDF edition. The important part is that you need to mention that you already bought the print ed in the “Notes” section. Please add any references you have for your purchase. You should have received an invoice from me, even when you bought at Amazon. Mention the invoice or offer-no. If you bought it at your local book dealer, tell me when and where you bought it. I’ll check your original purchase and grant you the discount. That is, however, a manual procedure and may take a while.

What if I bought the PDF edition and would like to get the print ed and the bundle discount?

No problem, you’ll get the discount! Same procedure as above.

No matter in which order you buy, you’ll get the discount if you can prove your previous purchase.

Finally, I have to tell you that I’ll be abroad in the next few week and I cannot dispatch international orders during this time. That means, the last international dispatch from my office will take place on Friday. So, place your orders now. Regardless of that, there are more than enough copies on stock at Amazon. They will take care of your order at any time but might charge shipping fees.

Oh, I forgot one more thing. Some readers from the United States have reported that they didn’t have to pay any custom duties or other fees after receiving the hard copy. So, if you are in the US, it seems that you don’t need to worry about customs when ordering the hard copy.

Second Anniversary

“Use The Index, Luke!” celebrates its second anniversary today. Unfortunately, I was not able to prepare anything for this anniversary like I did last year when I presented the 3-minute test. The test is still very popular, by they way, it is taken about 20 times a day. However, all I have for this years anniversary is a promise.

Do you know http://sqlfiddle.com/? You should ;) It’s an online tool to test SQL queries with five different databases (SQL Server, MySQL, Oracle, PostgreSQL and SQLite). It is not a cloud database that you could use for building applications. SQLFiddle is a tool for collaborative online work on SQL problems. It was built to improve asking and solving SQL related questions on Q/A sites like Stackoverflow.

For me, SQL Fiddle is interesting because of the "View Execution Plan" Button. As said before, it is not a cloud DB that we could use for performance tests, but we can have a look at executions plans. I believe it would be very convenient to have prepared fiddles for the examples from „Use The Index, Luke!“ so that you can easily test them online without needing your own database.

And this is exactly what will come—although I don’t have a time line yet. Stay tuned.

Customs and, well, errata...

I’ve finally managed to put all the errors that were reported by readers so far online. I’m not exactly proud of this, but there are already more than 30 known errata for the German edition. There are still some German copies of "SQL Performance Explained" on stock, but it seems like I have to arrange a second printing soon. I will surly take up all these errors for the second printing. The English edition has just two errata at the moment, but I doubt it will stay that low. You are more than welcome to report any mistakes you find: I do follow them up :)

And now, something different: customs. I’ve received some inquires if I can indicate how much customs you might have to pay if you order the book from outside the EU. Well, I cannot tell you the exact amount, but help you Google it ;) The important information you need to find the rate in your country is the so-called HS number and the origin of the goods. I’ve to put this information on the CN22 customs declaration on the outside of the package, so authorities know what’s inside. The HS number for books is 490199 and this particular book is printed in Austria. That’s what I write on the customs declaration form. I’ve also been told that some countries might charge other fees besides customs (e.g., VAT). Sorry, don’t know any more details.

However, I’m asking all international buyers to report if and how much fees they had to pay. Until now, I’ve not received any feedback—but I’ve done only one international shipping so far. Next one is due on Tuesday.

Finally: Almost World-Wide Free Shipping

The English edition of “SQL Performance Explained” is finally available. It has its own website http://sql-performance-explained.com/ where you can find a PDF preview and order information. The cover price is EUR 29.95 or GBP 26.99.

If you order it from outside the EU, you must also consider that you might be required to pay import duties in the destination country.

Knowing that I have readers all over the world, I’ve spent a fair amount of time to make world-wide free shipping possible. I almost succeeded. If you use the direct order form I’ll be able to offer free shipping to 150+ countries. However, this is only possible when using a volume discount offered by the Austrian postal service. Therefore, I’ll dispatch international orders only once a week.

Fine print: There are some countries that do not accept dutiable letter post. These are: Afghanistan, Albania, Azerbaijan, Bangladesh, Belarus, Benin, Burkina Faso, Cambodia, Chile, Colombia, Côte d’Ivoire (Rep.), Cuba, Dem. People’s Rep. of Korea, Djibouti, El Salvador, Kazakhstan, Latvia, Mali, Mauritania, Moldova, Nepal, Peru, Russian Federation, Turkmenistan, Ukraine, Uzbekistan and Venezuela (see Article X on page 79 of the Universal Postal Convention).

If you live in one of those countries, please feel free to use the direct order form anyway. There is no obligation in filling in the form. If I know that you live in one of these countries and would like to get the print edition, I can check how I could send it to you anyway.

Done! Or What?

During the last few weeks, it has become a little silent here at Use The Index, Luke! Today, I’d like to tell you the news that caused the silence and give a short outlook about the future of Use The Index, Luke!

The most important news is that I became father. It’s very important to me, at least. My son, Marcel, was born on 25th March 2012. So, we already had two month to settle down. Mommy is quite good at the night shift—I can sleep almost normally and have enough time and energy to work during daytime.

Most of that time I’ve spent to complete the book. Yes, it is “done”. As done as an online IT book can get. That means, the originally planned content is—after two years—finalized. The German translation was completed at about the same time. Use The Index, Luke! is now fully available in two languages.

During these two years of writing, I have always strived for an exciting text, because I know database tuning is not a popular topic for developers. I designed the structure so that one topic leads to the next in the hope it is read from cover to cover in one shot. The problem with the website is, however, that most people just see one or two pages. Those pages are often are often out of context and don’t make sense on their own. To cut a long story short: I believe the text only works when read entirely. So, I decided to publish the book as a real book.

I took the title I already used for the e-Pub edition last year: SQL Performance Explained. The cover photo is a quickly running squirrel because the book is about fast SQL. So it comes that I sometimes introduce the book as “Squirrel Performance Explained”.

At the moment, there is only a German edition available (to my own surprise). That is because I found some confusingly written paragraphs during translation. Definitely not good enough for printing. Those parts were not just translated, but rewritten from scratch. In two languages—at least at the beginning. Due to time constraints, I had to stop editing the English text after a while. Just writing in my native language was, of course, faster.

You can buy the German edition directly via http://sql-performance-explained.de/ or—private consumers in Germany—also in my amazon.de-Shop. Delivery time is the same in both cases because I’m always using amazon’s delivery network for Germany—no matter which way you order.

Finally, I’ll give you a small outlook what’s happening next. I’m currently completing the English text, of course. I’m looking for a lecturer, btw; A native speaker, preferably with German and IT knowledge (contact). Once the text is finalized, printing is no big issue. After that, I need to take care to earn some money again. From business perspective, this year was a disaster until now. Not only because I spent quite some time on the book, but also to prepare the arrival of my son. I finally took drivers education, for example. One would not believe how much time that takes.

Earning money is for me mostly about doing workshops and Instant-Coaching. I’m constantly getting inquires for the workshop. The problem is, however, the travel time and expenses. That makes it too expensive very often. As workaround, I try to get three (or more) clients in the same area—in that case, the travel time and expenses is no problem anymore. So, if you would like to hire me for a workshop, just leave me a short note. For the planning, I just need a postal code. The book is part of the course material, of course. At the moment, I’m only doing on-site workshops in the D-A-CH region. I’m, however, getting inquires from all over the world (thank you!). But that is even harder to manage. For those of you living outside Austria, Germany or Switzerland, please consider using the Instant-Coaching model. It is a virtual workshop, using your system. I can explain the principles based on examples from your own application, if you like.

If I find some time in between, I’d like to process my blog backlog. I’m having topics for about 20 articles in the pipeline—some of them for the miserably neglected myth directory. I can also imagine writing some opinion articles. I’ve always aimed to stay objective when writing for the book. Now, I might write possibly some…well…more emotional articles.

I did not yet make up my mind regarding long-term planning for Use The Index, Luke! I will, of course, maintain and update the material. Let me know if you have any topic you would like to see on Use The Index, Luke! Your wish might come true.

So long,

-markus

About the Author

Photo of Markus Winand
Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at http://winand.at/