- August 12–22: Online Training (EU shift)
- Enrollment
- FOSDEM Impressions
- June 8–18: Online Training (US shift)
- Non-monetary micro sponsoring
- November 11-12 in Frankfurt am Main
- Online-Training in July and August
- Oracle + PostgreSQL
- Party time
- PostgreSQL Performance Event
- SQL Server Performance Kurs in Stuttgart
- SQL Server performance training in London
- Shipping Terms
- The two top performance problems caused by ORM tools
- Top Tweets January 2013
- Training Survey
- Training and Conference Dates
- Use The Index, Luke
- Ask
- Consulting
2011-09-07Clustering Data: The Second Power of Indexing
The term cluster is used in various fields. A star cluster, for example, is a group of stars. A computer cluster, on the other hand, is a group of computers that work closely together—either to solve a complex problem (high-performance computing cluster) or to increase availability (failover cluster). Generally speaking, clusters are related things that appear together.
In the field of computing there is one more type of cluster—one that is often misunderstood: the data cluster. Clustering data means to store consecutively accessed data closely together so that accessing it requires fewer IO operations. Data clusters are very important in terms of database tuning. Computer clusters, on the other hand, are also very common in a database context—thus making the term cluster very ambiguous. The sentence “Let’s use a cluster to improve database performance” is just one example; it might refer to a computer cluster but could also mean a data cluster. In this chapter, cluster generally refers to data clusters.
Show some Love!
Buying the PDF supports our work on this site.
And you'll get a beautiful e-book.
And you'll get a beautiful e-book.
The simplest data cluster in an SQL database is the row. Databases store all columns of a row in the same database block if possible. Exceptions apply if a row doesn’t fit into a single block—e.g., when LOB types are involved.
Indexes allow one to cluster data. The basis for this was already explained in Chapter 1, “Anatomy of an SQL Index”: the index leaf nodes store the indexed columns in an ordered fashion so that similar values are stored next to each other. That means that indexes build clusters of rows with similar values. This capability to cluster data is so important that I refer to it as the second power of indexing.
The following sections explain how to use indexes to cluster data and improve query performance.
Stay connected:
RSS Feed
Like on Facebook
Follow me on Twitter
Share at Google+
RSS FeedFlattr this! Follow me on TwitterShare at Google+Like on Facebook