Combining Columns


This section is about a popular obfuscation that affects concatenated indexes.

The first example is again about date and time types but the other way around. The following MySQL query combines a data and a time column to apply a range filter on both of them.

SELECT ...
  FROM ...
 WHERE ADDTIME(date_column, time_column)
     > DATE_ADD(now(), INTERVAL -1 DAY)

It selects all records from the last 24 hours. The query cannot use a concatenated index on (DATE_COLUMN, TIME_COLUMN) properly because the search is not done on the indexed columns but on derived data.

About our book “SQL Performance Explained”
Probably the best book on SQL performance I've read
Guillaume Lelarge on Amazon.co.uk (5 stars)

You can avoid this problem by using a data type that has both a date and time component (e.g., MySQL DATETIME). You can then use this column without a function call:

SELECT ...
  FROM ...
 WHERE datetime_column
     > DATE_ADD(now(), INTERVAL -1 DAY)

Unfortunately it is often not possible to change the table when facing this problem.

The next option is a function-based index if the database supports it—although this has all the drawbacks discussed before. When using MySQL, function-based indexes are not an option anyway.

It is still possible to write the query so that the database can use a concatenated index on DATE_COLUMN, TIME_COLUMN with an access predicate—at least partially. For that, we add an extra condition on the DATE_COLUMN.

 WHERE ADDTIME(date_column, time_column)
     > DATE_ADD(now(), INTERVAL -1 DAY)
   AND date_column
    >= DATE(DATE_ADD(now(), INTERVAL -1 DAY))

The new condition is absolutely redundant but it is a straight filter on DATE_COLUMN that can be used as access predicate. Even though this technique is not perfect, it is usually a good enough approximation.

Tweet this tip

Tip

Use a redundant condition on the most significant column when a range condition combines multiple columns.

For PostgreSQL, it’s preferable to use the row values syntax.

You can also use this technique when storing date and time in text columns, but you have to use date and time formats that yields a chronological order when sorted lexically—e.g., as suggested by ISO 8601 (YYYY-MM-DD HH:MM:SS). The following example uses the Oracle database’s TO_CHAR function for that purpose:

SELECT ...
  FROM ...
 WHERE date_string || time_string
     > TO_CHAR(sysdate - 1, 'YYYY-MM-DD HH24:MI:SS')
   AND date_string
    >= TO_CHAR(sysdate - 1, 'YYYY-MM-DD')

We will face the problem of applying a range condition over multiple columns again in the section entitled “Paging Through Results”. We’ll also use the same approximation method to mitigate it.

Sometimes we have the reverse case and might want to obfuscate a condition intentionally so it cannot be used anymore as access predicate. We already looked at that problem when discussing the effects of bind parameters on LIKE conditions. Consider the following example:

SELECT last_name, first_name, employee_id
  FROM employees
 WHERE subsidiary_id = ?
   AND last_name LIKE ?

Assuming there is an index on SUBSIDIARY_ID and another one on LAST_NAME, which one is better for this query?

Without knowing the wildcard’s position in the search term, it is impossible to give a qualified answer. The optimizer has no other choice than to “guess”. If you know that there is always a leading wildcard, you can obfuscate the LIKE condition intentionally so that the optimizer can no longer consider the index on LAST_NAME.

SELECT last_name, first_name, employee_id
  FROM employees
 WHERE subsidiary_id = ?
   AND last_name || '' LIKE ?

It is enough to append an empty string to the LAST_NAME column. This is, however, an option of last resort. Only do it when absolutely necessary.

If you like my way of explaining things, you’ll love my book.

About the Author

As an author, trainer, and coach Markus Winand specializes in helping developers cope with SQL performance issues. He also published the book SQL Performance Explained and tweets his best performance tips via @SQLPerfTips.http://winand.at/

Recent Questions at Ask.Use-The-Index-Luke.com

0
votes
1
answer
140
views

We want to buy the book but I can't

Jul 18 at 21:36 Markus Winand ♦♦ 541
book
0
votes
2
answers
158
views
0
votes
0
answers
820
views

Performance very bad in Postgresql 9.3

Jul 08 at 11:54 Markus Winand ♦♦ 541
performance issue