Did you know pagination with
offset is very troublesome but easy to avoid?
offset instructs the databases skip the first N results of a query. However, the database must still fetch these rows from the disk and bring them in order before it can send the following ones.
This is not an implementation problem, it's the way
offset is designed:
…the rows are first sorted according to the <order by clause> and then limited by dropping the number of rows specified in the <result offset clause> from the beginning…
SQL:2016, Part 2, §4.15.3 Derived tables
But the trouble with
offset doesn't stop here: think about what happens if a new row is inserted between fetching two pages?
offset➌ to skip the previously fetched entries❶, you'll get duplicates in case there were new rows inserted between fetching two pages➋. There are other anomalies possible too, this is just the most common one.
This is not even a database problem, it is the way frameworks implement pagination: they just say which page number to fetch or how many rows to skip. With this information alone, no database can do any better.
Life Without OFFSET
Now imagine a world without these problems. As it turns out, living without
offset is quite simple: just use a
where clause that selects only data you haven't seen yet.
For that, we exploit the fact that we work on an ordered set—you do have an
order by clause, ain't you? Once there is a definite sort order, we can use a simple filter to only select what follows the entry we have see last:
SELECT ... FROM ... WHERE ... AND id < ?last_seen_id ORDER BY id DESC FETCH FIRST 10 ROWS ONLY
This approach—called seek method or keyset pagination—solves the problem of drifting results as illustrated above and is even faster than
offset. If you'd like to know what happens inside the database when using
offset or keyset pagination, have a look at these slides (benchmarks, benchmarks!):
On slide 43 you can also see that keyset pagination has some limitations: most notably that you cannot directly navigate to arbitrary pages. However, this is not a problem when using infinite scrolling. Showing page number to click on is a poor navigation interface anyway—IMHO.
If you want to read more about how to properly implement keyset pagination in SQL, please read this article. Even if you are not involved with SQL, it's worth reading that article before starting to implement anything.
But the Frameworks…
The main reason to prefer
offset over keyset pagination is the lack of tool support. Most tools offer pagination based on
offset, but don't offer any convenient way to use keyset pagination.
The hall of fame of frameworks that do support keyset pagination is rather short:
Django (Python) chunkator
blaze-persistence — a rich Criteria API for JPA providers (started in July 2014)
This is where I need your help. If you are maintaining a framework that is somehow involved with pagination, I ask you, I urge you, I beg you, to build in native support for keyset pagination too. If you have any questions about the details, I'm happy to help (forum, contact form, Twitter)!
Even if you are just using software that should support keyset pagination such as a content management system or webshop, let the maintainers know about it. You might just file a feature request (link to this page) or, if possible, supply a patch. Again, I'm happy to help out getting the details right.
Take WordPress as an example.
Spread the Word
The problem with key-set pagination is not a technical one. The problem is just that it is hardly known in the field and has no tool support. If you like the idea of
offset-less pagination, please help spreading the word. Tweet it, share it, mail it, you can even re-blog this post (CC-BY-NC-ND). Translations are also welcome, just contact me beforehand—I'll include a link to the translation on this page too!
Oh, and if you are blogging, you could also add a banner on your blog to make your readers aware of it. I've prepared a NoOffset banner gallery with some common banner formats. Just pick what suits you best.