The Join Operation


An SQL query walks into a bar and sees two tables.
He walks up to them and asks, ‘Can I join you?’

— Source: Unknown

The join operation transforms data from a normalized model into a denormalized form that suits a specific processing purpose. Joining is particularly sensitive to disk seek latencies because it combines scattered data fragments. Proper indexing is again the best solution to reduce response times. The correct index, however, depends on which of the three common join algorithms is used for the query.

There is, however, one thing that is common to all join algorithms: they process only two tables at a time. A SQL query with more tables requires multiple steps: first building an intermediate result set by joining two tables, then joining the result with the next table and so forth.
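The two-tables-at-a-time principle can be sketched in a few lines of Python. This is only an illustration, not how any particular database implements joins: the table names, column names and the naive `join_pair` helper are all made up for this example, and a real database would pick one of its join algorithms instead of comparing every row pair.

```python
def join_pair(left_rows, right_rows, key):
    """Join exactly two row sets on a shared column (naive comparison
    of every row pair, for illustration only)."""
    return [
        {**left, **right}
        for left in left_rows
        for right in right_rows
        if left[key] == right[key]
    ]

# Hypothetical sample tables, represented as lists of dicts.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": 1},
]
items = [
    {"item_id": 100, "order_id": 10, "sku": "A"},
    {"item_id": 101, "order_id": 12, "sku": "B"},
]

# Step 1: join two tables into an intermediate result.
intermediate = join_pair(customers, orders, "customer_id")

# Step 2: join the intermediate result with the next table.
result = join_pair(intermediate, items, "order_id")
```

A three-table query thus becomes two join steps, and a query with n tables needs n-1 of them.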

Pipelining Intermediate Results

Although intermediate results explain the algorithm very well, that does not mean the database has to materialize them. Materializing would mean storing the complete intermediate result of the first join before starting the next one. Instead, databases use pipelining to reduce memory usage: each row from the intermediate result is immediately passed on to the next join operation, which avoids the need to store the intermediate result set.
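Python generators are a reasonable way to picture pipelining. The sketch below is an analogy only: the sample data, the `scan` and `join_stream` helpers and the `consumed` list (added purely to observe how much input was read) are assumptions made for this example.

```python
consumed = []  # records which customer rows have been read so far

def scan(rows):
    """Yield rows one at a time, noting each row that is read."""
    for row in rows:
        consumed.append(row["customer_id"])
        yield row

def join_stream(left_stream, right_rows, key):
    """Join a stream of rows against a row set, yielding each match
    immediately instead of collecting an intermediate result first."""
    for left in left_stream:
        for right in right_rows:
            if left[key] == right[key]:
                yield {**left, **right}

# Hypothetical sample tables.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": 1},
]
items = [
    {"item_id": 100, "order_id": 10, "sku": "A"},
    {"item_id": 101, "order_id": 12, "sku": "B"},
]

# The first join feeds the second one row by row.
pipeline = join_stream(
    join_stream(scan(customers), orders, "customer_id"),
    items,
    "order_id",
)

first_row = next(pipeline)          # the first final row is available early
read_before_first = len(consumed)   # customer rows read up to this point
remaining_rows = list(pipeline)     # drain the rest of the pipeline
```

The first final row is produced after reading a single customer row. The result of the first join never exists as a complete list.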

Even though the join order has no impact on the final result, it still affects performance. The optimizer will therefore evaluate all possible join order permutations and select the best one. That means that just optimizing a complex statement might become a performance problem. The more tables to join, the more execution plan variants to evaluate; mathematically speaking, the number grows factorially (n! for n tables). This overhead matters far less when using bind parameters, because the database can then reuse a cached execution plan instead of optimizing the statement again for every execution.
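To get a feeling for factorial growth, the following sketch enumerates the join orders for three tables and prints n! for a few larger values. The table names are placeholders. The code only counts permutations and says nothing about how a specific optimizer searches them.

```python
from itertools import permutations
from math import factorial

def join_orders(tables):
    """Return every possible order in which the given tables could be joined."""
    return list(permutations(tables))

# Hypothetical table names, used only for counting.
three_table_orders = join_orders(["sales", "employees", "subsidiaries"])

for order in three_table_orders:
    print(" -> ".join(order))

# The number of join orders for n tables is n!
for n in (2, 4, 8, 12):
    print(f"{n:2d} tables: {factorial(n):,} join orders")
```

Three tables give 6 possible orders, eight tables already give 40,320, and twelve tables give 479,001,600.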

Important

The more complex the statement, the more important using bind parameters becomes.

Not using bind parameters is like recompiling a program every time.
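As a minimal sketch of what using bind parameters looks like in application code, the example below uses Python's built-in sqlite3 module as a convenient stand-in. The `employees` table and the `find_last_name` helper are invented for illustration. Whether and how an execution plan is cached depends on the database and driver in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "WINAND"), (2, "SMITH")],
)

# The SQL text is constant; only the bound value changes between calls.
# A database can therefore recognize the statement and reuse earlier work.
QUERY = "SELECT last_name FROM employees WHERE employee_id = ?"

def find_last_name(employee_id):
    """Look up one employee, passing the value as a bind parameter."""
    row = conn.execute(QUERY, (employee_id,)).fetchone()
    return row[0] if row else None

# Anti-pattern for comparison: building the SQL text per value, for example
# f"... WHERE employee_id = {employee_id}", yields a different statement
# for every value, so each one looks new to the database.
```

Calling `find_last_name(1)` and `find_last_name(2)` sends the same statement text both times. That is the precondition for reusing an already optimized statement.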

About the Author

Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at http://winand.at/