Primary Keys


We start with the simplest yet most common where clause: the primary key lookup. For the examples throughout this chapter we use the EMPLOYEES table defined as follows:

CREATE TABLE employees (
   employee_id   NUMBER         NOT NULL,
   first_name    VARCHAR2(1000) NOT NULL,
   last_name     VARCHAR2(1000) NOT NULL,
   date_of_birth DATE           NOT NULL,
   phone_number  VARCHAR2(1000) NOT NULL,
   CONSTRAINT employees_pk PRIMARY KEY (employee_id)
);

The database automatically creates an index for the primary key. That means there is an index on the EMPLOYEE_ID column, even though there is no create index statement.

Tip

Appendix C, “Example Schema” contains scripts to populate the EMPLOYEES table with sample data. You can use it to test the examples in your own environment.

To follow the text, it is enough to know that the table contains 1000 rows.

The following query uses the primary key to retrieve an employee’s name:

SELECT first_name, last_name 
  FROM employees 
 WHERE employee_id = 123

The where clause cannot match multiple rows because the primary key constraint ensures uniqueness of the EMPLOYEE_ID values. The database does not need to follow the index leaf nodes—it is enough to traverse the index tree. We can use the so-called execution plan for verification:

MySQL
Try online at SQL Fiddle+----+-----------+-------+---------+---------+------+-------+
| id | table     | type  | key     | key_len | rows | Extra |
+----+-----------+-------+---------+---------+------+-------+
|  1 | employees | const | PRIMARY | 5       |    1 |       |
+----+-----------+-------+---------+---------+------+-------+

Type const is MySQL’s equivalent of Oracle’s INDEX UNIQUE SCAN.

Oracle
Try online at SQL Fiddle---------------------------------------------------------------
|Id |Operation                   | Name         | Rows | Cost |
---------------------------------------------------------------
| 0 |SELECT STATEMENT            |              |    1 |    2 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES    |    1 |    2 |
|*2 |  INDEX UNIQUE SCAN         | EMPLOYEES_PK |    1 |    1 |
---------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EMPLOYEE_ID"=123)
PostgreSQL
Try online at SQL Fiddle                QUERY PLAN
-------------------------------------------
 Index Scan using employees_pk on employees 
   (cost=0.00..8.27 rows=1 width=14)
   Index Cond: (employee_id = 123::numeric)

The PostgreSQL operation Index Scan combines the INDEX [UNIQUE/RANGE] SCAN and TABLE ACCES BY INDEX ROWID operations from the Oracle Database. It is not visible from the execution plan if the index access might potentially return more than one row.

SQL Server
Try online at SQL Fiddle
|--Nested Loops(Inner Join)
   |--Index Seek(OBJECT:employees_pk,
   |               SEEK:employees.employee_id=@1
   |            ORDERED FORWARD)
   |--RID Lookup(OBJECT:employees,
                   SEEK:Bmk1000=Bmk1000
                 LOOKUP ORDERED FORWARD)

The SQL Server operation INDEX SEEK and RID Lookup correspond to Oracle’s INDEX RANGE SCAN and TABLE ACCESS BY ROWID respectively. Unlike the Oracle Database, SQL Server explicitly shows the Nested Loops join to combine the index and table data.

The Oracle execution plan shows an INDEX UNIQUE SCAN—the operation that only traverses the index tree. It fully utilizes the logarithmic scalability of the index to find the entry very quickly—almost independent of the table size.

Tip

The execution plan (sometimes explain plan or query plan) shows the steps the database takes to execute an SQL statement. Appendix A explains how to retrieve and read execution plans with other databases.

After accessing the index, the database must do one more step to fetch the queried data (FIRST_NAME, LAST_NAME) from the table storage: the TABLE ACCESS BY INDEX ROWID operation. This operation can become a performance bottleneck—as explained in “Slow Indexes, Part I”—but there is no such risk in connection with an INDEX UNIQUE SCAN. This operation cannot deliver more than one entry so it cannot trigger more than one table access. That means that the ingredients of a slow query are not present with an INDEX UNIQUE SCAN.

If you like my way of explaining things, you’ll love my book.

Primary Keys without Unique Index

A primary key does not necessarily need a unique index—you can use a non-unique index as well. In that case the Oracle database does not use an INDEX UNIQUE SCAN but instead the INDEX RANGE SCAN operation. Nonetheless, the constraint still maintains the uniqueness of keys so that the index lookup delivers at most one entry.

One of the reasons for using non-unique indexes for a primary keys are deferrable constraints. As opposed to regular constraints, which are validated during statement execution, the database postpones the validation of deferrable constraints until the transaction is committed. Deferred constraints are required for inserting data into tables with circular dependencies.

About our book “SQL Performance Explained”
An indispensable manual for anyone, DBA, developer or system administrator
Luigi Zambetti on Amazon.co.uk (5 stars)

About the Author

Photo of Markus Winand
Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at http://winand.at/

Recent Questions at Ask.Use-The-Index-Luke.com

0
votes
1
answer
54
views

Database design suggestions for a data scraping/warehouse application?

Aug 27 at 09:29 Markus Winand ♦♦ 656
mysql optimization database
1
vote
1
answer
215
views

How to query for "previous page" with keyset pagination?

Aug 22 at 04:21 alextsg 16
pagination postgresql
0
votes
0
answers
259
views