What I learned about SQLite…at a PostgreSQL conference
So, I’ve been to PgCon 2014 in Ottawa to give a short version of my SQL performance training. However, I think I ended up learning more about SQLite than about PostgreSQL there. Here is how that happened and what I actually learned.
Richard Hipp, creator of SQLite was the keynote speaker at this years PgCon. In his keynote (slides, video) he has put the focus on three topics: how PostgreSQL influenced SQLite development (“SQLite was originally written from PostgreSQL 6.5 documentation” and the “What Would PostgreSQL Do?” (WWPD) way of finding out what the SQL standard tries to tell us). The second main topic was that SQLite should be seen as an application file format—an alternative to inventing own file formats or using ZIPped XMLs. The statement “SQLite is not a replacement for PostgreSQL. SQLite is a replacement for fopen()” nails that (slide 21). Finally, Richard put a lot of emphasis on that fact that SQLite takes care of your data (crash safe, ACID)—unlike many of the so-called NoSQL systems, which Richard refers to as “Postmodern Databases: absence of objective truth; Queries return opinions rather than facts”. In his keynote, Richard has also shown that SQLite is pretty relaxed when it comes to data types. As a matter of fact, SQLite accepts strings like “Hello” for
INT fields. Note that it still stores “Hello”—no data is lost. I think he mentioned that it is possible to enforce the types via
Here I have to mention that I’ve had an e-mail conversation with Richard about covering SQLite on Use The Index, Luke last year. When our ways crossed at the PgCon hallway I just wanted to let him know that this has not been forgotten (despite the minimal progress on my side). Guess what—he was still remembering our mail exchange and immediately suggested to go through those topics once more in PgCon’s hacker lounge later. Said. Done. Here comes what I learned about SQLite.
First of all, SQLite uses the clustered index concept as known from SQL Server or MySQL/InnoDB. However, before version 3.8.2 (released 2013-12-06) you always had to use an
INTEGER column as clustering key. If the primary key happened to be a non-integer, SQLite was creating an integer primary key implicitly and used this as the main clustering key. Querying by the real primary key (e.g.
TEXT) required a secondary index lookup. This limitation was recently lifted by introducing WITHOUT ROWID tables.
When it comes to execution plans, SQLite has two different variants: the regular explain prefix returns the byte code that is actually executed. To get an execution plan similar to what we are used to, use explain query plan. SQLite has some hints (unlike PostgreSQL) to affect the query plan: indexed by and unlikely, likelihood.
We also discussed whether or not SQLite can cope with search conditions like
(col1, col2) < (val1, val2)—it can’t. Here Richard was questioning the motivation to have that, so I gave him an elevator-pitch version of my “Pagination Done The PostgreSQL Way” talk. I think he got “excited” about this concept to avoid offset at all and I’m curious to see if he can make it work in SQLite faster than I’m adding SQLite content to Use The Index, Luke :)
SQLite supports limit and offset (*sigh*) but the optimizer does currently not consider limit for optimization. Offering the SQL:2008 first fetch ... rows only syntax is not planned and might actually turn out to be hard because the parser is close to run out of 8-bit token codes (when I understood that right).
Joins are basically always executed as nested loops joins but SQLite might create an “automatic transient index” for the duration of the query. We also figured out that there seems to be a oddity with they way
CROSS JOIN works in SQLite: it turn the optimizers table reordering off. Non-cross joins, on the other hand, don’t enforce the presence of
USING so that you can still build a Cartesian product using the optimizers smartness to reorder the tables.
Last but not least I’d like to mention that SQLite supports partial indexes in the way it should be: just a
WHERE clause at index creation. Partial indexes are available since version 3.8.0 (released 2013-08-26).
On the personal side, I’d describe Richard as very pragmatic and approachable. He joined twitter a few month ago and all he did there is helping people who asked questions about SQLite. Really, look at his time line. That says a lot.
I’m really happy that I had the opportunity to meet Richard at a PostgreSQL conference—as happy as I was to meet Joe “Smartie” Celko years ago…at a another PostgreSQL conference. Their excellent keynote speaker selection is just one more reason to recommend PostgreSQL conferences in general. Here are the upcoming ones just in case you got curious.
ps.: There was also a unconference where I discussed the topic of Bitmap Index Only Scan (slides & minutes).