Response Time, Throughput and Horizontal Scalability


Bigger hardware is not always faster—but it can usually take more load. It is more like a highway with more lanes—it will not make you travel faster. That’s the reason suspicious SQL response times will not automatically improve on bigger hardware.

We are not in the 1990s anymore. CPU clock rates were increasing rapidly at that time. Most response time issues disappeared on newer hardware—just because of the higher clock rate. That was like making the car go faster. However, clock rates hit the wall during the first few years of the 21st century. There is almost no improvement on that axis since then. Multi-core is the new strategy to build more powerful CPUs. Even though it allows multiple tasks to run concurrently, it doesn’t improve performance if there is only one task running at all. It’s a different axis—like having more lanes on the highway.

Faster, easier and more memorable than reading.

Scaling horizontally (adding more servers) has a similar limitation. Although more servers can process more requests, they won’t improve response times automatically. Response time reductions are typically achieved with a (balanced) search tree—even in non-relational systems like CouchDB or MongoDB.

Important

Proper indexing is the best way to reduce query response time.

That’s true for relational SQL databases as well as many non-relational systems.

Proper indexing aims to exploit the logarithmic scalability of the B-Tree index to its full extent. However, most response time problems are caused by sloppy indexing—that is, suboptimal index usage. The following chart, taken from Section 1, shows the difference. It plots the response time to a simple SQL query on growing data volume. Once in red, when the query uses an suboptimal index. And once more in green, using a refined index. The previous section explains the setup in detail.

Figure 3.5. Response Time by Data Volume

Response Time by Data Volume


The response time difference is stunning. And it is hardly possible to improve it by scaling horizontally. Even if it would be easy to cut the response time by adding more servers, it’s still questionable if that is the best response to sloppy indexing.

The horizontal performance gains of the so-called NoSQL systems are mostly on the write side—often reached with the eventual consistency model. Simply put, they allow temporary inconsistencies that will eventually become consistent. That’s not done for fun. It’s a limitation implied by Brewer’s CAP Theorem. SQL, on the other hand, enforces very rigid consistency. That increases response times for write operations but does not necessarily imply bad throughput.

Eventual Consistency and Brewer’s CAP Theorem

Maintaining strict consistency in a distributed system—e.g., like scaling horizontally with multiple servers—requires the members to coordinate all changes in a synchronous manner. That has two unpleasant side effects: (1) synchronous communication adds latencies and increases response times; (2) it reduces the overall availability because each change requires multiple members taking part.

A distributed SQL database is often confused with something like a shared storage cluster—multiple database servers that access the same storage—or master-slave replication. A distributed database is more like a web shop and an ERP system—often two different products from different vendors. The consistency between both systems is still a desirable goal—often achieved with the two-phase commit (2PC) protocol. 2PC allows each system to start a global transaction that modifies data in both systems. The global transaction maintains the consistency. All or nothing—just like we know transactions. It can, however, not succeed if one system is unavailable—the overall system availability is reduced. Even if both systems are available, 2PC increases the overall response time due to the additional coordination effort.

The more nodes take part in a distributed system, the more troublesome strict consistency gets. Strict consistency makes it practically impossible to scale to more than a few nodes. Dropping strict consistency, on the other hand, solves the availability problem as well as the increased response time. The basic idea is to reestablish the global consistency after the write operation completed. The downside is that there is no way to prevent conflicts if two nodes accept conflicting changes. Consistency is eventually reached by handling conflicts, not by preventing them. In that context, consistency means that all nodes will consistently take one of the conflicting versions—that’s not necessarily the best available version.

Brewer’s CAP Theorem describes the dependencies between Consistency, Availability and Partition tolerance in general.

More hardware will typically not improve response times. In fact, it might even make the system slower because latencies can make up most of the response time. Network latencies won’t be a problem if the application and database run on the same computer. But that setup is rather uncommon in production environments where the database is typically running on dedicated hardware. Security policies might even require a firewall between the application server and the database—often doubling the network latency. The more complex the infrastructure, the more latencies accumulate, the slower the response. That’s often leading to the counterintuitive observation that the production hardware, that is supposed to be very powerful, responds slower than the cheap desktop PC that was used during development.

Another very important latency is the disk seek time. Especially spinning hard disk drives (HDD) need a rather long time to move the mechanical parts so that the requested data can be read—typically in the milliseconds range. That means that a B-Tree traversal with four levels needs four times as long—typically a few dozen milliseconds. Although that’s half an eternity for computers, it is way below human perception threshold—if it’s done only once. It is, however, very easy to write SQL statements that cause many hundred disk seeks just by joining a few tables. Although caching reduces the problem drastically and new technologies like SSD might improve the seek time by an order of magnitude, joins are often causing response time problems. The optimizer puts a huge effort into finding the best join path, but it is still limited by the available indexes on the underlying tables. The next chapter will therefore explain how to index for efficient table joins.

Solid State Disks (SSD) and Caching

Solid State Disks (SSD) are a new mass storage technology that doesn’t use any moving parts. The typical response time of SSDs is by an order of magnitude faster than the seek time of HDDs. SSDs became available for server computers and enterprise storage systems around 2010. As of 2011, SSD storage is very expensive and has a limited life span. It’s not commonly used in databases.

Databases do, however, cache frequently accessed data in the main memory. That’s especially true for the index root node, which is read every time the index is used. A frequently accessed index might be cached entirely so that the tree traversal doesn’t need a single disk seek. The explanations in this book neglect caching for simplicity—although caching improves the response times dramatically in reality.

Factbox

  • Performance has two dimensions: response time and throughput.

  • More hardware will typically not improve query response time.

  • Proper indexing is the best way to improve query response time.

Recent Questions at Ask.Use-The-Index-Luke.com

0
votes
1
answer
278
views

Updating multiple rows using a subquery in SQL

Jan 08 at 09:52 Jan 26
subquery update sql
1
vote
1
answer
210
views

Should 'id' (the primary key) be included in an index

Jan 03 at 15:24 Jan 26
index include
0
votes
1
answer
227
views

Best index for a multiple join-tables and filter

Jan 03 at 14:31 Markus Winand ♦♦ 216
index join where