Filter Predicates in PostgreSQL


The PostgreSQL database uses three different methods to apply where clauses (predicates):

Access Predicate (“Index Cond”)

The access predicates express the start and stop conditions of the leaf node traversal.

Index Filter Predicate (“Index Cond”)

Index filter predicates are applied during the leaf node traversal only. They do not contribute to the start and stop conditions and do not narrow the scanned range.

Table level filter predicate (“Filter”)

Predicates on columns that are not part of the index are evaluated on the table level. For that to happen, the database must load the row from the heap table first.

Note

Index filter predicates give a false sense of safety; even though an index is used, the performance degrades rapidly as the data volume or system load grows.

PostgreSQL execution plans do not show index access and filter predicates separately—both show up as “Index Cond”. That means the execution plan must be compared to the index definition to differentiate access predicates from index filter predicates.
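One way to make that comparison is to look up the index definition in the pg_indexes system view and check the column order against the conditions listed under “Index Cond”. The following is a minimal sketch, assuming the scale_data table used in the example further down. As a rule of thumb, a condition can only serve as an access predicate if every index column before it is restricted by an equality condition; any other condition on an indexed column acts as an index filter predicate.

```sql
-- Show the definition of every index on the table.
-- The column order in indexdef decides which conditions
-- can act as access predicates.
SELECT indexname, indexdef
  FROM pg_indexes
 WHERE tablename = 'scale_data';
```

In psql, the \d scale_data meta-command lists the same index definitions.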

Note

The PostgreSQL explain plan does not provide enough information for finding index filter predicates.

The predicates shown as “Filter” are always table level filter predicates—even when shown for an Index Scan operation.

Consider the following example, which originally appeared in the “Performance and Scalability” chapter (create & insert script):

CREATE TABLE scale_data (
   section NUMERIC NOT NULL,
   id1     NUMERIC NOT NULL,
   id2     NUMERIC NOT NULL
);
CREATE INDEX scale_data_key ON scale_data(section, id1);
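
The original insert script is not reproduced here. To try the example locally, the table could be filled along the following lines. This is only a sketch: the row counts and the value distribution are arbitrary assumptions, so costs and row estimates will differ from the plans shown in this article.

```sql
-- Hypothetical test data (not the original script):
-- 300 sections with 1000 rows each, ID2 cycling through 0..99.
INSERT INTO scale_data (section, id1, id2)
SELECT s, i, i % 100
  FROM generate_series(1, 300)  AS s,
       generate_series(1, 1000) AS i;

-- Refresh the optimizer statistics after the bulk load.
ANALYZE scale_data;
```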

The following select filters on the ID2 column, which is not included in the index:

PREPARE stmt(int) AS SELECT count(*) 
                       FROM scale_data
                      WHERE section = 1
                        AND id2 = $1;
EXPLAIN EXECUTE stmt(1);
                      QUERY PLAN
-----------------------------------------------------
Aggregate  (cost=529346.31..529346.32 rows=1 width=0)
  Output: count(*)
  -> Index Scan using scale_data_key on scale_data
     (cost=0.00..529338.83 rows=2989 width=0)
     Index Cond: (scale_data.section = 1::numeric)
     Filter: (scale_data.id2 = ($1)::numeric)

The ID2 predicate shows up as "Filter" below the Index Scan operation. This is because PostgreSQL performs the table access as part of the Index Scan operation. In other words, the TABLE ACCESS BY INDEX ROWID operation of the Oracle database is hidden within PostgreSQL’s Index Scan operation. It is therefore possible that an Index Scan filters on columns that are not included in the index.
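
To get a feeling for how much work the table level filter causes, the statement can be executed with EXPLAIN ANALYZE. The sketch below reuses the prepared statement from above. Note that ANALYZE actually runs the query. PostgreSQL releases that support it (9.2 and later, as far as I know) add a “Rows Removed by Filter” line to the Index Scan node, which counts the rows that were fetched from the heap only to be discarded by the filter.

```sql
-- Executes the statement and reports actual row counts and timings.
-- BUFFERS additionally shows how many pages were read.
EXPLAIN (ANALYZE, BUFFERS) EXECUTE stmt(1);
```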

Important

The PostgreSQL Filter predicates are table level filter predicates—even when shown for an Index Scan.

When we add the index from the “Performance and Scalability” chapter, we can see that all columns show up as “Index Cond”—regardless of whether they are access or filter predicates.

CREATE INDEX scale_slow ON scale_data (section, id1, id2);

The execution plan with the new index does not show any filter conditions:

                      QUERY PLAN
------------------------------------------------------
Aggregate  (cost=14215.98..14215.99 rows=1 width=0)
  Output: count(*)
  -> Index Scan using scale_slow on scale_data 
     (cost=0.00..14208.51 rows=2989 width=0)
     Index Cond: (section = 1::numeric AND id2 = ($1)::numeric)

Please note that the condition on ID2 cannot narrow the leaf node traversal because the index has the ID1 column before ID2. That means the Index Scan will scan the entire range for the condition SECTION=1::numeric and apply the filter ID2=($1)::numeric on each row that fulfills the clause on SECTION.
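
For comparison, an index that lists ID2 directly after SECTION allows both equality conditions to act as access predicates. The following is a sketch only: the index name scale_fast is an assumption made for illustration, and whether the planner picks this index depends on the data and statistics.

```sql
-- Hypothetical index: ID2 now directly follows SECTION,
-- so both conditions can define the scanned index range.
CREATE INDEX scale_fast ON scale_data (section, id2, id1);

-- Re-check the plan for the prepared statement.
EXPLAIN EXECUTE stmt(1);
```

The plan would still show both conditions as “Index Cond”, which is exactly why the index definition is needed to tell the two cases apart.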

About the Author

Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at http://winand.at/