User-Defined Functions


Applies to
MySQLNo
OracleYes
PostgreSQLYes
SQL ServerYes

Function-based indexing is a very generic approach. Besides functions like UPPER you can also index expressions like A + B and even use user-defined functions in the index definition.

There is one important exception. It is, for example, not possible to refer to the current time in an index definition, neither directly nor indirectly, as in the following example.

CREATE FUNCTION get_age(date_of_birth DATE) 
RETURN NUMBER
AS
BEGIN
  RETURN 
    TRUNC(MONTHS_BETWEEN(SYSDATE, date_of_birth)/12);
END;
/

The function GET_AGE uses the current date (SYSDATE) to calculate the age based on the supplied date of birth. You can use this function in all parts of an SQL query, for example in select and the where clauses:

SELECT first_name, last_name, get_age(date_of_birth)
  FROM employees
 WHERE get_age(date_of_birth) = 42;

The query lists all 42-year-old employees. Using a function-based index is an obvious idea for optimizing this query, but you cannot use the function GET_AGE in an index definition because it is not deterministic. That means the result of the function call is not fully determined by its parameters. Only functions that always return the same result for the same parameters—functions that are deterministic—can be indexed.

About our book “SQL Performance Explained”
An indispensable manual for anyone, DBA, developer or system administrator
Luigi Zambetti on Amazon.co.uk (5 stars)

The reason behind this limitation is simple. When inserting a new row, the database calls the function and stores the result in the index and there it stays, unchanged. There is no periodic process that updates the index. The database updates the indexed age only when the date of birth is changed by an update statement. After the next birthday, the age that is stored in the index will be wrong.

Besides being deterministic, PostgreSQL and the Oracle database require functions to be declared to be deterministic when used in an index so you have to use the keyword DETERMINISTIC (Oracle) or IMMUTABLE (PostgreSQL).

Caution

PostgreSQL and the Oracle database trust the DETERMINISTIC or IMMUTABLE declarations—that means they trust the developer.

You can declare the GET_AGE function to be deterministic and use it in an index definition. Regardless of the declaration, it will not work as intended because the age stored in the index will not increase as the years pass; the employees will not get older—at least not in the index.

Other examples for functions that cannot be “indexed” are random number generators and functions that depend on environment variables.

Note

DB2 cannot use user-defined functions in indexes (not even if they are deterministic).

Think About It

How can you still use an index to optimize a query for all 42-year-old employees?

If you like my way of explaining things, you’ll love my book.

About the Author

Photo of Markus Winand
Markus Winand tunes developers for high SQL performance. He also published the book SQL Performance Explained and offers in-house training as well as remote coaching at http://winand.at/

?Recent questions at
Ask.Use-The-Index-Luke.com

3
votes
2
answers
294
views

pagination with nulls

2 days ago Markus Winand ♦♦ 771
pagination
2
votes
1
answer
1.9k
views
0
votes
2
answers
1.1k
views

different execution plans after failing over from primary to standby server

Sep 17 at 11:46 Markus Winand ♦♦ 771
oracle index update