Function-based indexing is a very generic approach. Besides functions like UPPER
you can also index expressions like A + B
and even use user-defined functions in the index definition.
There is one important exception. It is, for example, not possible to refer to the current time in an index definition, neither directly nor indirectly, as in the following example.
CREATE FUNCTION get_age(date_of_birth DATE)
RETURN NUMBER
AS
BEGIN
RETURN
TRUNC(MONTHS_BETWEEN(SYSDATE, date_of_birth)/12);
END
The function GET_AGE
uses the current date (SYSDATE
) to calculate the age based on the supplied date of birth. You can use this function in all parts of an SQL query, for example in select
and the where
clauses:
SELECT first_name, last_name, get_age(date_of_birth)
FROM employees
WHERE get_age(date_of_birth) = 42
The query lists all 42-year-old employees. Using a function-based index is an obvious idea for optimizing this query, but you cannot use the function GET_AGE
in an index definition because it is not deterministic. That means the result of the function call is not fully determined by its parameters. Only functions that always return the same result for the same parameters—functions that are deterministic—can be indexed.
Support My Work
I offer SQL training, tuning and consulting. Buying my book “SQL Performance Explained” (from €9.95) also supports my work on this website.
The reason behind this limitation is simple. When inserting a new row, the database calls the function and stores the result in the index and there it stays, unchanged. There is no periodic process that updates the index. The database updates the indexed age only when the date of birth is changed by an update
statement. After the next birthday, the age that is stored in the index will be wrong.
Besides being deterministic, PostgreSQL and the Oracle database require functions to be declared to be deterministic when used in an index so you have to use the keyword DETERMINISTIC
(Oracle) or IMMUTABLE
(PostgreSQL).
Caution
PostgreSQL and the Oracle database trust the DETERMINISTIC
or IMMUTABLE
declarations—that means they trust the developer.
You can declare the GET_AGE
function to be deterministic and use it in an index definition. Regardless of the declaration, it will not work as intended because the age stored in the index will not increase as the years pass; the employees will not get older—at least not in the index.
Other examples for functions that cannot be “indexed” are random number generators and functions that depend on environment variables.
Note
DB2 cannot use user-defined functions in indexes (not even if they are deterministic).
Tip
How can you still use an index to optimize a query for all 42-year-old employees?