PostgreSQL Skripts für „Sortieren und Gruppieren“

Dieser Abschnitt enthält die PostgreSQL Skripts und Ausführungspläne für Kapitel 6, „Sortieren und Gruppieren“.

Indexed Order By

  DROP INDEX sales_date;
CREATE INDEX sales_dt_pr ON sales (sale_date, product_id);

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  ORDER BY sale_date, product_id;

Die Ausführung erfolgt ohne Sort:

                         QUERY PLAN
-----------------------------------------------------------
Index Scan using sales_dt_pr (cost=0.01..680.86 rows=376)
  Index Cond: (sale_date = (now() - '1 day'::interval day))

Derselbe Ausführungsplan wird verwendet, wenn nur nach PRODUCT_ID sortiert wird:

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  ORDER BY product_id;

Wenn man den größer-gleich Operator verwendet, muss eine Sort Operation durchgeführt werden:

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date >= now() - INTERVAL '1' DAY
  ORDER BY product_id;

Die rows Schätzung ändert sich jedoch auf eine Zeile, und damit ist auch der Cost-Wert kleiner.

                         QUERY PLAN
--------------------------------------------------------------
Sort  (cost=8.50..8.50 rows=1 width=32)
 Sort Key: product_id
 -> Index Scan using sales_dt_pr (cost=0.00..8.49 rows=1)
    Index Cond: (sale_date >= (now() - '1 day'::interval day))

Order By ASC/DESC und NULLS FIRST/LAST

Der Index kann auch rückwärts gelesen werden:

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date >= now() - INTERVAL '1' DAY
  ORDER BY sale_date DESC, product_id DESC;

Gemischte ASC/DESC Angeben erzwingen jedoch eine explizite Sortierung:

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date >= now() - INTERVAL '1' DAY
  ORDER BY sale_date ASC, product_id DESC;

PostgreSQL kann jedoch auch ASC/DESC indizieren, sodass die Abfrage wieder als pipelined order by durchgeführt werden kann.

  DROP INDEX sales_dt_pr;

CREATE INDEX sales_dt_pr
    ON sales (sale_date ASC, product_id DESC);

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date >= now() - INTERVAL '1' DAY
  ORDER BY sale_date ASC, product_id DESC;

PostgreSQL sortiert standardmäßig mit NULLS LAST. Die Kombination mit DESC führt jedoch dazu, dass die NULL Einträge am Anfang landen:

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date >= now() - INTERVAL '1' DAY
  ORDER BY sale_date ASC, product_id DESC NULLS LAST;

PostgreSQL kann jedoch auch NULLS FIRST/NULLS LAST Spezifikationen bei der Indizierung berücksichtigen, sodass auch dafür ein pipelined order by bewirkt werden kann:

  DROP INDEX sales_dt_pr;

CREATE INDEX sales_dt_pr
    ON sales (sale_date ASC, product_id DESC NULLS LAST);

EXPLAIN
 SELECT sale_date, product_id, quantity
   FROM sales
  WHERE sale_date >= now() - INTERVAL '1' DAY
  ORDER BY sale_date ASC, product_id DESC NULLS LAST;

Indexing Group By

Es scheint als hätte die PostgreSQL (zumindest 9.0-13) Datenbank ein kleines Problem mit pipelined order by, wenn der Index eine NULLS Sortierung entgegen der natürlichen NULLS-Sortierung benutzt. Im oberen Beispiel war das der Fall:

EXPLAIN
 SELECT product_id, SUM(eur_value)
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  GROUP BY product_id;

                                     QUERY PLAN
------------------------------------------------------------------------------------
 HashAggregate  (cost=574.21..574.53 rows=26 width=40)
   Group Key: product_id
   ->  Index Scan using sales_dt_pr on sales  (cost=0.43..572.62 rows=318 width=14)
         Index Cond: (sale_date = (now() - '1 day'::interval day))

Ohne NULLS LAST, wird ein pipelined order by Ausgeführt:

  DROP INDEX sales_dt_pr;

CREATE INDEX sales_dt_pr
    ON sales (sale_date ASC, product_id DESC);

EXPLAIN
 SELECT product_id, SUM(eur_value)
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  GROUP BY product_id;

                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.43..574.53 rows=26 width=40)
   Group Key: product_id
   ->  Index Scan Backward using sales_dt_pr on sales  (cost=0.43..572.62 rows=318 width=14)
         Index Cond: (sale_date = (now() - '1 day'::interval day))

Auffällig ist jedoch, dass der Index rückwärts gelesen wird. Das legt die Vermutung nahe, dass PostgreSQL intern eine ORDER BY PRODUCT_ID durchführt. Dann könnte man mit einen gegenläufigen, expliziten order by das pipelined order by unterbinden:

EXPLAIN
 SELECT product_id, SUM(eur_value)
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  GROUP BY product_id
  ORDER BY product_id DESC;

                                     QUERY PLAN
------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.43..574.53 rows=26 width=40)
   Group Key: product_id
   ->  Index Scan using sales_dt_pr on sales  (cost=0.43..572.62 rows=318 width=14)
         Index Cond: (sale_date = (now() - '1 day'::interval day))

Da das Ergebnis dieser These entgegenspricht, führe ich noch einen Test mit einem NULLS LAST Index durch:

  DROP INDEX sales_dt_pr;

CREATE INDEX sales_dt_pr
    ON sales (sale_date ASC, product_id DESC NULLS LAST);

EXPLAIN
 SELECT product_id, SUM(eur_value)
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  GROUP BY product_id
  ORDER BY product_id DESC NULLS LAST;

                                     QUERY PLAN
------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.43..574.53 rows=26 width=40)
   Group Key: product_id
   ->  Index Scan using sales_dt_pr on sales  (cost=0.43..572.62 rows=318 width=14)
         Index Cond: (sale_date = (now() - '1 day'::interval day))

Letztendlich scheint es also so, als würde PostgreSQL eine implizite order by-Klausel einfügen, dabei aber die NULLS LAST Spezifiktion des Indexes ignorieren. Wenn man jedoch eine explizite order by-Klausel angibt, die der Indexdefinition entspricht, kann PostgreSQL ein pipelined group by durchführen.

Durch weitere Tests stellt sich heraus, das dieses Problem auch für NULLS FIRST-Indizes besteht, wenn die entsprechende Spalte eine ASC-Sortierung hat:

  DROP INDEX sales_dt_pr;

CREATE INDEX sales_dt_pr
    ON sales (sale_date ASC, product_id ASC NULLS FIRST);

EXPLAIN
 SELECT product_id, SUM(eur_value)
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  GROUP BY product_id
  ORDER BY product_id ASC NULLS FIRST;

                                     QUERY PLAN
------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.43..574.53 rows=26 width=40)
   Group Key: product_id
   ->  Index Scan using sales_dt_pr on sales  (cost=0.43..572.62 rows=318 width=14)
         Index Cond: (sale_date = (now() - '1 day'::interval day))

Lässt man die Order by-Klausel weg:

EXPLAIN
 SELECT product_id, SUM(eur_value)
   FROM sales
  WHERE sale_date = now() - INTERVAL '1' DAY
  GROUP BY product_id;

                                     QUERY PLAN
------------------------------------------------------------------------------------
 HashAggregate  (cost=574.21..574.53 rows=26 width=40)
   Group Key: product_id
   ->  Index Scan using sales_dt_pr on sales  (cost=0.43..572.62 rows=318 width=14)
         Index Cond: (sale_date = (now() - '1 day'::interval day))

Vorherige Seite Nächste Seite

Du kannst nicht alles an einem Tag lernen. Abonniere den Newsletter via E-Mail, Bluesky oder RSS um sukzessive aufzuholen. Und sieh dir auch modern-sql.com an.

Über den Autor

Markus Winand gibt auf modern-sql.com Einblick in SQL und zeigt, wie es von verschiedenen Systemen unterstützt wird. Zuvor machte er use-the-index-luke.com, was er noch immer wartet. Markus kann als Trainer, Sprecher und Berater auf winand.at engagiert werden.

PostgreSQL Skripts für „Sortieren und Gruppieren“

Indexed Order By

Order By ASC/DESC und NULLS FIRST/LAST

Indexing Group By

Über den Autor

Der 3-Minuten Test

Inhaltsverzeichnis

Aktuelle Blog Posts

Geek-Ware

Neuigkeiten Abonnieren

PostgreSQL Skripts für „Sortieren und Gruppieren“

Indexed Order By

Order By ASC/DESC und NULLS FIRST/LAST

Indexing Group By

Über den Autor

Sein Buch kaufen

Sein Training

Der 3-Minuten Test

Inhaltsverzeichnis

Aktuelle Blog Posts

Geek-Ware

Neuigkeiten Abonnieren