by Markus Winand.

DB2 Example Scripts for “3-Minute Test”


This section contains the create, insert and select statements for the “Test your SQL Know-How in 3 Minutes” test. You may want to test yourself before reading this page.

The execution plans shown are abbreviated for better readability.

Table Setup

CREATE TABLE tbl (
  id          NUMERIC NOT NULL,
  date_column DATE,
  a           NUMERIC,
  b           NUMERIC,
  text        VARCHAR(255),
  state       CHAR(1),
  PRIMARY KEY (id)
);
--#SET TERMINATOR ;;
CREATE FUNCTION random_string(minlen NUMERIC, maxlen NUMERIC)
RETURNS VARCHAR(1000)
LANGUAGE SQL
NOT DETERMINISTIC
NO EXTERNAL ACTION
READS SQL DATA
BEGIN
  DECLARE rv  VARCHAR(1000) DEFAULT '';
  DECLARE i   NUMERIC       DEFAULT 0;
  DECLARE len NUMERIC       DEFAULT 0;

  IF maxlen < 1 OR minlen < 1 OR maxlen < minlen THEN
    RETURN NULL;
  END IF;

  SET i = floor(rand()*(maxlen-minlen)) + minlen;
  WHILE (i > 0)  DO
    SET rv = rv || chr(97+CAST(rand() * 25 AS INTEGER));
    SET i  =  i - 1;
  END WHILE;
  RETURN rv;
END
;;
--#SET TERMINATOR ;
INSERT INTO tbl (id, date_column, a, b, text, state)
WITH generate_series (n) AS ( 
     SELECT 1 FROM sysibm.sysdummy1 
  UNION ALL 
     SELECT n+1 
       FROM generate_series 
      WHERE n < 50000 
) 
SELECT generate_series.n 
     , CURRENT_DATE - generate_series.n DAYS
     , MOD(generate_series.n,1234) 
     , CAST(RAND() * 10 AS NUMERIC), random_string(20, 20) 
     , CASE MOD(generate_series.n,5) WHEN 0 THEN 'X' ELSE 'A' END 
FROM generate_series;

Question 1 — DATE Anit-Pattern

CREATE INDEX tbl_idx ON tbl (date_column);
EXPLAIN PLAN FOR
 SELECT text, date_column
   FROM tbl
  WHERE TO_CHAR(date_column, 'YYYY') = '2015';
EXPLAIN PLAN FOR
 SELECT text, date_column
   FROM tbl
  WHERE date_column >= TO_DATE('2015-01-01', 'YYYY-MM-DD')
    AND date_column <  TO_DATE('2016-01-01', 'YYYY-MM-DD');
DROP INDEX tbl_idx;

The first execution plan performs a full table scan (TBSCAN). The second execution plan, on the other hand, performs an index range scan (IXSCAN).

Explain Plan
--------------------------------------------------------------------------
ID | Operation   |                    Rows |             ActualRows | Cost
 1 | RETURN      |                         |  285 of 2000 ( 14.25%) |  621
 2 |  TBSCAN TBL | 2000 of 50000 (  4.00%) | 285 of 50000 (   .57%) |  621

Predicate Information
 2 - SARG ( TO_CHAR(TIMESTAMP(Q1.DATE_COLUMN, 0), 'YYYY') = '2015')
Explain Plan
------------------------------------------------------------------------------
ID | Operation        |                   Rows |             ActualRows | Cost
 1 | RETURN           |                        |   285 of 285 (100.00%) |   40
 2 |  FETCH TBL       |   285 of 285 (100.00%) |   285 of 285 (100.00%) |   40
 3 |   IXSCAN TBL_IDX | 285 of 50000 (   .57%) | 285 of 50000 (   .57%) |   15

Predicate Information
 3 - START ('2015-01-01-00.00.00.000000' <= Q1.DATE_COLUMN)
      STOP (Q1.DATE_COLUMN < '2016-01-01-00.00.00.000000')

Question 2 — Indexed Top-N

CREATE INDEX tbl_idx ON tbl (a, date_column);
SELECT id, date_column
  FROM tbl
 WHERE a = 123
 ORDER BY date_column DESC
 FETCH FIRST 1 ROW ONLY;
DROP INDEX tbl_idx;

The query uses the index (IXSCAN) and fetches in reverse order (REVERSE). Note that there is no sort operation.

Explain Plan
-------------------------------------------------------------------------------------
ID | Operation                  |                  Rows |           ActualRows | Cost
 1 | RETURN                     |                       |     1 of 1 (100.00%) |   20
 2 |  FETCH TBL                 |     1 of 40 (  2.50%) |    1 of 40 (  2.50%) |  183
 3 |   IXSCAN (REVERSE) TBL_IDX | 40 of 50000 (   .08%) | 1 of 50000 (   .00%) |   13

Predicate Information
 3 - START (Q1.A = +00123.)
      STOP (Q1.A = +00123.)

Question 3 — Column Order

CREATE INDEX tbl_idx ON tbl (a, b);
SELECT id, a, b
  FROM tbl
 WHERE a = 123
   AND b = 1;
SELECT id, a, b
  FROM tbl
 WHERE b = 1;
DROP INDEX tbl_idx;

The second query reads the entire table (TBSCAN).

Explain Plan
--------------------------------------------------------------------------
ID | Operation        |                 Rows |           ActualRows | Cost
 1 | RETURN           |                      |     3 of 4 ( 75.00%) |   13
 2 |  FETCH TBL       |     4 of 4 (100.00%) |     3 of 4 ( 75.00%) |   13
 3 |   IXSCAN TBL_IDX | 4 of 50000 (   .01%) | 3 of 50000 (   .01%) |    6

Predicate Information 
 3 - START (Q1.A = +00123.)
     START (Q1.B = +00001.)
      STOP (Q1.A = +00123.)
      STOP (Q1.B = +00001.)
Explain Plan
---------------------------------------------------------------------------
ID | Operation   |                    Rows |              ActualRows | Cost
 1 | RETURN      |                         |  4965 of 4575 (108.52%) |  611
 2 |  TBSCAN TBL | 4575 of 50000 (  9.15%) | 4965 of 50000 (  9.93%) |  611

Predicate Information
 2 - SARG (Q1.B = +00001.)

Changing the column order in the index allows both queries to use a IXSCAN:

Explain Plan
--------------------------------------------------------------------------
ID | Operation        |                 Rows |           ActualRows | Cost
 1 | RETURN           |                      |     3 of 4 ( 75.00%) |   13
 2 |  FETCH TBL       |     4 of 4 (100.00%) |     3 of 4 ( 75.00%) |   13
 3 |   IXSCAN TBL_IDX | 4 of 50000 (   .01%) | 3 of 50000 (   .01%) |    6

Predicate Information 
 3 - START (Q1.A = +00123.)
     START (Q1.B = +00001.)
      STOP (Q1.A = +00123.)
      STOP (Q1.B = +00001.)
Explain Plan
----------------------------------------------------------------------------------
ID | Operation          |                    Rows |              ActualRows | Cost
 1 | RETURN             |                         |  4965 of 4965 (100.00%) |  519
 2 |  FETCH TBL         |  4965 of 4965 (100.00%) |  4965 of 4965 (100.00%) |  519
 3 |   RIDSCN           |  4965 of 4965 (100.00%) |  4965 of 4965 (100.00%) |   52
 4 |    SORT (UNIQUE)   |  4965 of 4965 (100.00%) |  4965 of 4965 (100.00%) |   52
 5 |     IXSCAN TBL_IDX | 4965 of 50000 (  9.93%) | 4965 of 50000 (  9.93%) |   50

Predicate Information
 2 - SARG (Q1.B = +00001.)
 5 - START (Q1.B = +00001.)
      STOP (Q1.B = +00001.)

Question 4 — LIKE

CREATE INDEX tbl_idx ON tbl (text);
SELECT id, text
  FROM tbl
 WHERE text LIKE '%TERM%';
DROP INDEX tbl_idx;

The query reads the entire table (TBSCAN).

Explain Plan
------------------------------------------------------------------------
ID | Operation   |                    Rows |           ActualRows | Cost
 1 | RETURN      |                         |  0 of 1629 (   .00%) |  615
 2 |  TBSCAN TBL | 1629 of 50000 (  3.26%) | 0 of 50000 (   .00%) |  615

Predicate Information
 2 - SARG (Q1.TEXT LIKE '%TERM%')

Question 5 — Range Condition

The difference is hard to see in the execution plan—on the first sight, they are both fine:

CREATE INDEX tbl_idx ON tbl (date_column, state);
SELECT id, date_column, state
  FROM tbl
 WHERE date_column >= current_date -  5 YEARS
   AND state = 'X';
DROP INDEX tbl_idx;
Explain Plan
--------------------------------------------------------------------------------
ID | Operation          |                   Rows |             ActualRows | Cost
 1 | RETURN             |                        |   365 of 371 ( 98.38%) |   97
 2 |  FETCH TBL         |   371 of 371 (100.00%) |   365 of 371 ( 98.38%) |   97
 3 |   RIDSCN           |   371 of 371 (100.00%) |   365 of 371 ( 98.38%) |   68
 4 |    SORT (UNIQUE)   |   371 of 371 (100.00%) |   365 of 371 ( 98.38%) |   68
 5 |     IXSCAN TBL_IDX | 371 of 50000 (   .74%) | 365 of 50000 (   .73%) |   68

Predicate Information
 2 - SARG (Q1.STATE = 'X')
     SARG ((CURRENT DATE - 5 YEARS) /* SPREG_EXPR VALUE='10/13/2010'*/ <= Q1.DATE_COLUMN)
 5 - START ((CURRENT DATE - 5 YEARS) /* SPREG_EXPR VALUE='10/13/2010'*/ <= Q1.DATE_COLUMN)
     START (Q1.STATE = 'X')
      SARG (Q1.STATE = 'X')

The important difference is that the first execution plan has filter predicates on operation 5 (SARG on IXSCAN) while the next execution plan only has START and STOP conditions on the IXSCAN operation:

CREATE INDEX tbl_idx ON tbl (state, date_column);
SELECT id, date_column, state
  FROM tbl
 WHERE date_column >= current_date -  5 YEARS
   AND state = 'X';
DROP INDEX tbl_idx;
Explain Plan
--------------------------------------------------------------------------------
ID | Operation          |                   Rows |             ActualRows | Cost
 1 | RETURN             |                        |   365 of 360 (101.39%) |   47
 2 |  FETCH TBL         |   360 of 360 (100.00%) |   365 of 360 (101.39%) |   47
 3 |   RIDSCN           |   360 of 360 (100.00%) |   365 of 360 (101.39%) |   19
 4 |    SORT (UNIQUE)   |   360 of 360 (100.00%) |   365 of 360 (101.39%) |   19
 5 |     IXSCAN TBL_IDX | 360 of 50000 (   .72%) | 365 of 50000 (   .73%) |   19

Predicate Information
 2 - SARG (Q1.STATE = 'X')
     SARG ((CURRENT DATE - 5 YEARS) /* SPREG_EXPR VALUE='10/13/2010'*/ <= Q1.DATE_COLUMN)
 5 - START (Q1.STATE = 'X')
     START ((CURRENT DATE - 5 YEARS) /* SPREG_EXPR VALUE='10/13/2010'*/ <= Q1.DATE_COLUMN)
      STOP (Q1.STATE = 'X')

Question 5 — Index Only Scan

CREATE INDEX tbl_idx ON tbl (a, date_column);
EXPLAIN PLAN FOR
 SELECT date_column, count(*)
   FROM tbl
  WHERE a = 123
  GROUP BY date_column;
EXPLAIN PLAN FOR
 SELECT date_column, count(*)
   FROM tbl
  WHERE a = 123
    AND b = 1
  GROUP BY date_column;
DROP INDEX tbl_idx;

Both indexes are used, of course. The difference is that the first query doesn't access the table, so the first query is much faster.

Explain Plan
------------------------------------------------------
ID | Operation          |                  Rows | Cost
 1 | RETURN             |                       |   13
 2 |  GRPBY (COMPLETE)  |    40 of 40 (100.00%) |   13
 3 |   IXSCAN TBL_IDX   | 40 of 50000 (   .08%) |   13

Predicate Information
 3 - START (Q1.A = +00123.)
      STOP (Q1.A = +00123.)
Explain Plan
---------------------------------------------------------
ID | Operation             |                  Rows | Cost
 1 | RETURN                |                       |   20
 2 |  GRPBY (COMPLETE)     |      4 of 4 (100.00%) |   20
 3 |   TBSCAN              |      4 of 4 (100.00%) |   20
 4 |    SORT               |      4 of 4 (100.00%) |   20
 5 |     FETCH TBL         |     4 of 40 ( 10.00%) |   20
 6 |      RIDSCN           |    40 of 40 (100.00%) |   13
 7 |       SORT (UNIQUE)   |    40 of 40 (100.00%) |   13
 8 |        IXSCAN TBL_IDX | 40 of 50000 (   .08%) |   13

Predicate Information
 5 - SARG (Q1.B = +00001.)
     SARG (Q1.A = +00123.)
 8 - START (Q1.A = +00123.)
      STOP (Q1.A = +00123.)

The second query must be considerably slower because every row needs a table access — also for those that are filtered by the new condition. Even if the index has a low clustering factor, it is still about twice as many blocks to read.

About the Author

Photo of Markus Winand

Markus Winand teaches efficient SQL—inhouse and online. He minimizes the development time using modern SQL and optimizes the runtime with smart indexing. His book entitled SQL Performance Explained has become standard reading.

Buy his Book on Amazon

Cover of “SQL Performance Explained”: Squirrel running on grass

The essence of SQL tuning in 200 pages

Buy on Amazon
(paperback only)

Paperback and PDF also available at Markus’ store.

Hire Markus

…to answer your current SQL questions.

The quick and easy way to benefit from his extensive knowledge and experience.
Learn more »

“Use The Index, Luke!” by Markus Winand is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.
Legal | Contact | NO WARRANTY | Trademarks | Privacy | CC-BY-NC-ND 3.0 license