- August 12–22: Online Training (EU shift)
- Enrollment
- FOSDEM Impressions
- June 8–18: Online Training (US shift)
- Non-monetary micro sponsoring
- November 11-12 in Frankfurt am Main
- Online-Training in July and August
- Oracle + PostgreSQL
- Party time
- PostgreSQL Performance Event
- SQL Server Performance Kurs in Stuttgart
- SQL Server performance training in London
- Shipping Terms
- The two top performance problems caused by ORM tools
- Top Tweets January 2013
- Training Survey
- Training and Conference Dates
- Use The Index, Luke
- Ask
- Consulting
2011-08-13PostgreSQL Example Scripts for “3-Minuten Test”
This section contains the create, insert and select statements for the “Test your SQL Know-How in 3 Minutes” test. You may want to test yourself before reading this page.
The execution plans shown are abbreviated for better readability.
Table Setup
CREATE TABLE tbl ( id NUMERIC NOT NULL, date_column DATE, a NUMERIC, b NUMERIC, text VARCHAR(255), state CHAR(1), PRIMARY KEY (id) ); CREATE OR REPLACE FUNCTION random_string(len INTEGER) RETURNS VARCHAR(1000) AS $$ DECLARE rv VARCHAR(1000) := ''; i INTEGER := 0; BEGIN IF len < 1 THEN RETURN rv; END IF; FOR i IN 1..len LOOP rv := rv || chr(97+(random() * 25)::int); END LOOP; RETURN rv; END; $$ LANGUAGE plpgsql; INSERT INTO tbl SELECT generate_series , CURRENT_DATE - generate_series , MOD(generate_series,1234) , (RANDOM() * 10)::INT, random_string(20) , CASE MOD(generate_series,5) WHEN 0 THEN 'X' ELSE 'A' END FROM generate_series(1, 50000); ANALYZE tbl;
Question 1 — DATE Anit-Pattern
CREATE INDEX tbl_idx ON tbl (date_column);
EXPLAIN
SELECT text, date_column
FROM tbl
WHERE TO_CHAR(date_column, 'YYYY') = '2011';
EXPLAIN
SELECT text, date_column
FROM tbl
WHERE date_column >= TO_DATE('2011-01-01', 'YYYY-MM-DD')
AND date_column < TO_DATE('2012-01-01', 'YYYY-MM-DD');
DROP INDEX tbl_idx;
The first execution plan performs a full table scan (Seq Scan). The second execution plan, on the other hand, performs an index range scan (Index Scan).
QUERY PLAN --------------------------------------------------------------- Seq Scan on tbl (cost=0.00..1386.00 rows=250) Filter: (to_char((date_column), 'YYYY'::text) = '2011'::text) QUERY PLAN --------------------------------------------------------------- Index Scan using tbl_idx on tbl (cost=0.01..14.36 rows=205) Index Cond: (( date_column >= to_date('2011-01-01'::text, 'YYYY-MM-DD'::text) ) AND ( date_column < to_date('2012-01-01'::text, 'YYYY-MM-DD'::text) ))
Question 2 — Indexed Top-N
CREATE INDEX tbl_idx ON tbl (a, date_column); PREPARE STMT(int) AS SELECT id, date_column FROM tbl WHERE a = $1 ORDER BY date_column DESC LIMIT 1; EXPLAIN EXECUTE STMT(123); DEALLOCATE STMT; DROP INDEX tbl_idx;
The query uses the index (Index Scan) and fetches in reverse order (Backward). Note that there is no sort operation.
QUERY PLAN
----------------------------------------------
Limit (cost=0.00..4.03 rows=1 width=12)
-> Index Scan Backward using tbl_idx on tbl
(cost=0.00..164.96 rows=41)
Index Cond: (a = ($1)::numeric)
Question 3 — Column Order
CREATE INDEX tbl_idx ON tbl (a, b);
PREPARE STMT(int, int) AS
SELECT id, a, b
FROM tbl
WHERE a = $1
AND b = $2;
EXPLAIN EXECUTE STMT(123, 1);
DEALLOCATE STMT;
PREPARE STMT(int) AS
SELECT id, a, b
FROM tbl
WHERE b = $1;
EXPLAIN EXECUTE STMT(1);
DEALLOCATE STMT;
DROP INDEX tbl_idx;
The second query reads the entire table (Seq Scan). Changing the column order in the index allows both queries to use a (Bitmap) Index Scan.
QUERY PLAN ------------------------------------------------------------- Bitmap Heap Scan on tbl (cost=4.30..19.32 rows=4) Recheck Cond: ((a = ($1)::numeric) AND (b = ($2)::numeric)) -> Bitmap Index Scan on tbl_idx (cost=0.00..4.30 rows=4) Index Cond: ((a = ($1)::numeric) AND (b = ($2)::numeric)) QUERY PLAN ---------------------------------------------------------- Seq Scan on tbl (cost=0.00..1261.00 rows=4545 width=20) Filter: (b = ($1)::numeric)
Question 4 — LIKE
CREATE INDEX tbl_idx ON tbl (text varchar_pattern_ops); EXPLAIN SELECT id, text FROM tbl WHERE text LIKE '%TERM%'; DROP INDEX tbl_idx;
The query reads the entire table (Seq Scan).
QUERY PLAN
-------------------------------------------------------
Seq Scan on tbl (cost=0.00..1136.00 rows=5 width=29)
Filter: ((text)::text ~~ '%TERM%'::text)
Question 5 — Range Condition
This sample is a little bit harder to proof, since the PostgreSQL’s cost estimation doesn’t reflect the performance difference correctly. So, we compare actual execution performance using EXPLAIN ANALYZE.
CREATE INDEX tbl_idx ON tbl (date_column, state);
CLUSTER tbl_idx ON tbl;
ANALYZE tbl;
EXPLAIN ANALYZE
SELECT id, date_column, state
FROM tbl
WHERE date_column >= now()::date - INTERVAL '5' YEAR
AND state = 'X';
DROP INDEX tbl_idx;
CREATE INDEX tbl_idx ON tbl (state, date_column);
CLUSTER tbl_idx ON tbl;
ANALYZE tbl;
EXPLAIN ANALYZE
SELECT id, date_column, state
FROM tbl
WHERE date_column >= now()::date - INTERVAL '5' YEAR
AND state = 'X';
DROP INDEX tbl_idx;
Both indexes are used, of course. The difference is only the execution time. The performance difference becomes bigger, if selecting a longer time period.
QUERY PLAN ------------------------------------------------------------ Index Scan using tbl_idx (actual time=0.071..0.764 rows=365) Index Cond: ( (date_column >= ((now())::date - '5 years'::interval year)) AND (state = 'X'::bpchar)) Total runtime: 0.842 ms QUERY PLAN ------------------------------------------------------------- Index Scan using tbl_idx (actual time=0.092..0.419 rows=365) Index Cond: ( (state = 'X'::bpchar) AND (date_column >= ((now())::date - '5 years'::interval year))) Total runtime: 0.520 ms
Stay connected:
RSS Feed
Like on Facebook
Follow me on Twitter
Share at Google+
RSS FeedFlattr this! Follow me on TwitterShare at Google+Like on Facebook