Function-based indexing is a very generic approach. Besides functions like
UPPER you can also index expressions like
A + B and even use user-defined functions in the index definition.
There is one important exception. It is, for example, not possible to refer to the current time in an index definition, neither directly nor indirectly, as in the following example.
CREATE FUNCTION get_age(date_of_birth DATE) RETURN NUMBER AS BEGIN RETURN TRUNC(MONTHS_BETWEEN(SYSDATE, date_of_birth)/12); END
GET_AGE uses the current date (
SYSDATE) to calculate the age based on the supplied date of birth. You can use this function in all parts of an SQL query, for example in
select and the
SELECT first_name, last_name, get_age(date_of_birth) FROM employees WHERE get_age(date_of_birth) = 42
The query lists all 42-year-old employees. Using a function-based index is an obvious idea for optimizing this query, but you cannot use the function
GET_AGE in an index definition because it is not deterministic. That means the result of the function call is not fully determined by its parameters. Only functions that always return the same result for the same parameters—functions that are deterministic—can be indexed.
The reason behind this limitation is simple. When inserting a new row, the database calls the function and stores the result in the index and there it stays, unchanged. There is no periodic process that updates the index. The database updates the indexed age only when the date of birth is changed by an
update statement. After the next birthday, the age that is stored in the index will be wrong.
Besides being deterministic, PostgreSQL and the Oracle database require functions to be declared to be deterministic when used in an index so you have to use the keyword
DETERMINISTIC (Oracle) or
PostgreSQL and the Oracle database trust the
IMMUTABLE declarations—that means they trust the developer.
You can declare the
GET_AGE function to be deterministic and use it in an index definition. Regardless of the declaration, it will not work as intended because the age stored in the index will not increase as the years pass; the employees will not get older—at least not in the index.
Other examples for functions that cannot be “indexed” are random number generators and functions that depend on environment variables.
DB2 cannot use user-defined functions in indexes (not even if they are deterministic).
How can you still use an index to optimize a query for all 42-year-old employees?