I’ve been experimenting with the performance of SELECT queries on a table with about a million records, consisting of these columns(among others..):
id, lastName, firstName,
Wehn i perform this query:
SELECT * FROM `user` WHERE lastName = 'someLastName' AND firstName = 'someFirstName'
I get more or less the same performance(in terms of miliseconds), whether i have two separate indexes on those fields, or have a compound index(lastName,firstName).
I came across this thread:difference between creating index with two columns and creating separate index on two columns
The guy explains the case where a compound index might fall short, bu he didn’t give any example where it can actually be beneficial, over setting up separate indexes.
Can someone provide an example/simple explanation of a scenario where a compound index is a good choice?
If you know that you frequently query on a combination of fields, a combined index means “you only have to search one B-tree.” Otherwise, the server would have to search n individual B-trees and then combine the results.
(N.B.: A “B-tree” is a data structure commonly used by database servers to create indexes …)
(You can use the
EXPLAIN command to quickly see how the server plans to do this … and to see that it does have to do this.)
I think MySQL has a good explanation of how composite indexes are used.
If your queries only take a few milliseconds, then indexes are probably not needed. One use of indexes is to reduce the number of data pages that need to be read. If you have only one data page for a table, there is not much utility there.
So, I think the answer to your question is that you need many thousands of rows in your table to see a big improvement in performance when using indexes for filtering. Your data is probably not big enough. Yet.
The ultimate answer here probably depends on whatever
EXPLAIN is telling you for this query, for the two index strategies (separate vs. compound). In general, if you have an index only on one of the name fields in the
WHERE clause, MySQL won’t use either of the indices. It might be possible for it to use both indices, but that would be unlikely. On the other hand, an index on
(lastName, firstName) at the very least does cover the
WHERE clause (though not the
SELECT clause), and so should be a candidate for MySQL to use. If MySQL is really not using the compound index, then you should check
EXPLAIN to find out why.
One possible explanation why the compound index would not be used is if your table size be fairly small. In that case, MySQL might choose to just do a full table scan and not use any index at all. Another explanation for why the compound index would not be used is the cardinality of your data. Specifically, if the first and last names repeat very often, and you are searching for a common pair, then adding an index might not add much benefit over just doing a full table scan.
The simple answer is perhaps the analogy with a phone book (remember those?).
A phone book for a city is typically indexed on lastname, firstname. Imagine a phone book for a city of 1 million people.
If you’re looking for “John Smith”, you go to the “Smith” section; this reduces the number of possible hits from 1 million to perhaps a few thousand. You can then jump to the “Smith, John” section; that reduces the few thousand to a handful. The biggest saving is by jumping from “1 million names” to “a few thousand Smiths”. The additional saving of searching for “John” is much smaller.
So, the compound index (lastname, firstname) is probably faster than “lastname” alone, but on modern hardware, searching a few thousand “Smiths” is indeed milliseconds (if you can measure it at all).
If you repeated your query with something less selective, but still affected by the index, you will probably see a bigger difference.
SELECT * FROM `user` WHERE lastName like 's*' AND firstName = 'j*'