- August 12–22: Online Training (EU shift)
- Enrollment
- FOSDEM Impressions
- June 8–18: Online Training (US shift)
- Non-monetary micro sponsoring
- November 11-12 in Frankfurt am Main
- Online-Training in July and August
- Oracle + PostgreSQL
- Party time
- PostgreSQL Performance Event
- SQL Server Performance Kurs in Stuttgart
- SQL Server performance training in London
- Shipping Terms
- The two top performance problems caused by ORM tools
- Top Tweets January 2013
- Training Survey
- Training and Conference Dates
- Use The Index, Luke
- Ask
- Consulting
2011-11-21PostgreSQL Example Scripts for “Sorting and Grouping”
This section contains the code and execution plans for Chapter 6 in a PostgreSQL database.
Indexed Order By
DROP INDEX sales_date; CREATE INDEX sales_dt_pr ON sales (sale_date, product_id); EXPLAIN SELECT sale_date, product_id, quantity FROM sales WHERE sale_date = now() - INTERVAL '1' DAY ORDER BY sale_date, product_id;
The execution does not perform a sort operation:
QUERY PLAN ----------------------------------------------------------- Index Scan using sales_dt_pr (cost=0.01..680.86 rows=376) Index Cond: (sale_date = (now() - '1 day'::interval day))
PostgreSQL uses the same execution plan, when sorting by PRODUCT_ID only.
EXPLAIN SELECT sale_date, product_id, quantity FROM sales WHERE sale_date = now() - INTERVAL '1' DAY ORDER BY product_id;
Using an greater or equals condition requires an Sort operation:
EXPLAIN SELECT sale_date, product_id, quantity FROM sales WHERE sale_date >= now() - INTERVAL '1' DAY ORDER BY product_id;
Although the row estimate got lower, causing the cost also to be lower:
QUERY PLAN
--------------------------------------------------------------
Sort (cost=8.50..8.50 rows=1 width=32)
Sort Key: product_id
-> Index Scan using sales_dt_pr (cost=0.00..8.49 rows=1)
Index Cond: (sale_date >= (now() - '1 day'::interval day))
Indexing ASC, DESC and NULLS FIRST/LAST
Scanning an index backwards:
EXPLAIN SELECT sale_date, product_id, quantity FROM sales WHERE sale_date >= now() - INTERVAL '1' DAY ORDER BY sale_date DESC, product_id DESC;
QUERY PLAN
-------------------------------------------------------------
Index Scan Backward using sales_dt_pr (cost=0.00..8.49)
Index Cond: (sale_date >= (now() - '1 day'::interval day))
Mixing ASC and DESC causes an explicit sort:
EXPLAIN SELECT sale_date, product_id, quantity FROM sales WHERE sale_date >= now() - INTERVAL '1' DAY ORDER BY sale_date ASC, product_id DESC;
QUERY PLAN --------------------------------------------------------------- Sort (cost=8.50..8.50 rows=1 width=32) Sort Key: sale_date, quantity -> Index Scan using sales_dt_pr on sales (cost=0.00..8.49) Index Cond: (sale_date >= (now() - '1 day'::interval day))
Ordering the index with mixed ASC/DESC modifiers:
DROP INDEX sales_dt_pr;
CREATE INDEX sales_dt_pr
ON sales (sale_date ASC, product_id DESC);
EXPLAIN
SELECT sale_date, product_id, quantity
FROM sales
WHERE sale_date >= now() - INTERVAL '1' DAY
ORDER BY sale_date ASC, product_id DESC;
QUERY PLAN ------------------------------------------------------------ Index Scan using sales_dt_pr on sales (cost=0.00..8.49) Index Cond: (sale_date >= (now() - '1 day'::interval day))
PostgreSQL orders NULLS LAST per default. However, the DESC modifier puts them to the front, so that DESC NULLS LAST must sort explicitly:
EXPLAIN SELECT sale_date, product_id, quantity FROM sales WHERE sale_date >= now() - INTERVAL '1' DAY ORDER BY sale_date ASC, product_id DESC NULLS LAST;
QUERY PLAN
---------------------------------------------------------------
Sort (cost=8.50..8.50 rows=1 width=32)
Sort Key: sale_date, product_id
-> Index Scan using sales_dt_pr on sales (cost=0.00..8.49)
Index Cond: (sale_date >= (now() - '1 day'::interval day))
PostgreSQL allows explicit NULLS LAST indexing as well, so it becomes a pipelined order by again:
DROP INDEX sales_dt_pr;
CREATE INDEX sales_dt_pr
ON sales (sale_date ASC, product_id DESC NULLS LAST);
EXPLAIN
SELECT sale_date, product_id, quantity
FROM sales
WHERE sale_date >= now() - INTERVAL '1' DAY
ORDER BY sale_date ASC, product_id DESC NULLS LAST;
QUERY PLAN ------------------------------------------------------------- Index Scan using sales_dt_pr on sales (cost=0.00..8.49) Index Cond: (sale_date >= (now() - '1 day'::interval day))
Indexed Group By
The PostgreSQL (9.0.1) database seems to have a little glitch, that makes the following not work as piplined order by, when the index has NULLS LAST, as created above:
EXPLAIN SELECT product_id, SUM(eur_value) FROM sales WHERE sale_date = now() - INTERVAL '1' DAY GROUP BY product_id;
QUERY PLAN
---------------------------------------------------------------
HashAggregate (cost=682.74..687.44 rows=376 width=20)
-> Index Scan using sales_dt_pr on sales (cost=0.00..680.86)
Index Cond: (sale_date = (now() - '1 days'::interval day))
Removing the NULLS LAST clause from the index reveals something interesting:
DROP INDEX sales_dt_pr;
CREATE INDEX sales_dt_pr
ON sales (sale_date ASC, product_id DESC);
EXPLAIN
SELECT product_id, SUM(eur_value)
FROM sales
WHERE sale_date = now() - INTERVAL '1' DAY
GROUP BY product_id;
QUERY PLAN
--------------------------------------------------------------
GroupAggregate (cost=0.00..687.44 rows=376 width=20)
-> Index Scan Backward using sales_dt_pr on sales
(cost=0.00..680.86 rows=376 width=20)
Index Cond: (sale_date = (now() - '1 day'::interval day))
The index is read backwards, although the SQL statement does not require it. It seems like PostgreSQL uses an GROUP BY PRODUCT_ID internally, to fetch the pre-sorted result. The next case adds an order by clause in opposite order.
EXPLAIN SELECT product_id, SUM(eur_value) FROM sales WHERE sale_date = now() - INTERVAL '1' DAY GROUP BY product_id ORDER BY product_id DESC;
QUERY PLAN
---------------------------------------------------------------
GroupAggregate (cost=0.00..687.44 rows=376 width=20)
-> Index Scan using sales_dt_pr on sales (cost=0.00..680.86)
Index Cond: (sale_date = (now() - '1 day'::interval day))
So, it seems that the implicit order by is not hardcoded. On final test reveals the last peace to describe PostgreSQL’s behaviour:
DROP INDEX sales_dt_pr;
CREATE INDEX sales_dt_pr
ON sales (sale_date ASC, product_id DESC NULLS LAST);
EXPLAIN
SELECT product_id, SUM(eur_value)
FROM sales
WHERE sale_date = now() - INTERVAL '1' DAY
GROUP BY product_id
ORDER BY product_id DESC NULLS LAST;
QUERY PLAN
---------------------------------------------------------------
GroupAggregate (cost=0.00..687.44 rows=376 width=20)
-> Index Scan using sales_dt_pr on sales (cost=0.00..680.86)
Index Cond: (sale_date = (now() - '1 day'::interval day))
It does a pipelined group by, even when the index is defined NULLS LAST when the order by clause explicitly sorts the same way. Otherwise, it uses an internal order by clause that disregards the indexes NULLS specification.
Stay connected:
RSS Feed
Like on Facebook
Follow me on Twitter
Share at Google+
RSS FeedFlattr this! Follow me on TwitterShare at Google+Like on Facebook