Question :
I am using the php framework Laravel for my application, together with mysql 5.6, in order to have full-text search with innoBD.
But I am not sure if I have too many indexes.
My application allows users to upload products for sale. So the table that will be queried the most, Select / inserts is my products table.
The Table looks like this, (Removed a few rows in this example).
Product
p_id - (int) Primary key
Used to filter products, new/old – asc/desc
state_id - (int) index key & foreign key
Links to primary key in state Table in order to get the state name, also used for filtering when searching products
city_id - (int) index key & foreign key
Links to primary key in city Table in order to get the city name, also used for filtering when searching products
cat_id - (int) index key & foreign key
Links to primary key in category Table in order to get the category name, also used for filtering when searching products
subcat_id - (int) index key & foreign key
Links to primary key in subcategory Table in order to get the subcategory name, also used for filtering when searching products
title - (varchar) FULLTEXT search
Used for text search, when searching/filtering products
seller_id - (int) index key & foreign key
Links to primary key in users table in order to get the sellers username
product_type - (tinyint) index key
Used when searching/filtering products, if a product is an ad or auction
price - (int) index key
Used when searching/filtering products
end_time - timestamp
is_active - (tinyint) index key
Used when searching/filtering products
So when doing a search for products there will be a total of 4 inner joins
, I am not displaying the sellers name in search results. Only inner joining, state / city / category / subcategory.
As for now I have a total of 8 index keys and 5 foreign keys.
My first question: Do I need to put a index key on all columns that has a foreign key?
I am also thinking of adding a few composite index.
composite index 1: (title, state_id, city_id, cat_id, subcat_id, product_type , is_active)
composite index 2: (title, state_id, city_id, product_type, is_active)
As for now it looks like most users only do a text search. So the most common query is a fulltext search where is_active = 1
Answer :
Rather than try to guess, you should download Percona Tools.
You should use the tool pt-duplicate-key-checker. It will compare all the indexes for you and give your the ALTER TABLE commands to drop the ones you do not need and still maintain fully compliance with all your index search needs. Here is the code to do it:
ETL_INDX_SCRIPT=/tmp/remove_indexes.sql
MYSQL_USER=root
MYSQL_PASS=rootpassword
MYSQL_CONN="-u${MYSQL_USER} -p${MYSQL_PASS}"
DB=mydb
pt-duplicate-key-checker ${MYSQL_CONN} -d ${DB} | grep "^ALTER TABLE " >> ${ETL_INDX_SCRIPT}
mysql ${MYSQL_CONN} -AN 2>/dev/null < ${ETL_INDX_SCRIPT}
If you not sure, look at the content of the index removal SQL script first
ETL_INDX_SCRIPT=/tmp/remove_indexes.sql
MYSQL_USER=root
MYSQL_PASS=rootpassword
MYSQL_CONN="-u${MYSQL_USER} -p${MYSQL_PASS}"
DB=mydb
pt-duplicate-key-checker ${MYSQL_CONN} -d ${DB} | grep "^ALTER TABLE " >> ${ETL_INDX_SCRIPT}
vi -R ${ETL_INDX_SCRIPT}
If the resulting script has no ALTER TABLE commands, you don’t need anything done.
If there are ALTER TABLE commands and if you are ready, run the script with this
mysql ${MYSQL_CONN} -AN 2>/dev/null < ${ETL_INDX_SCRIPT}
GIVE IT A TRY !!!
More indexes –> slower INSERTs. (However, this may not be a critical issue.)
More indexes –> slower UPDATEs when you modify an indexed column.
Indexing a flag (or other low-cardinality field) –> almost never will the optimizer use that index, so it is a wasted index. (I’m guessing is_active
is such.) However, a compound index that includes that flag is possibly good.
Compound index with more than, say, 4 columns –> Probably not worth it.
Your two compound indexes –> just INDEX(title, state_id, city_id)
will probably be better than those two lengthy ones.
MySQL rarely uses more than one index in one SELECT.
A FULLTEXT index is used in preference to other indexes. Hence, your FT+is_active is (1) find records by FT, then (2) filter any that are not is_active.
In most cases of INDEX(a), INDEX(a,b), you should toss the shorter one. (title
does not count since it is FULLTEXT.)
There are limits on the length of indexed columns. (I’m guessing that title
is too big to be in any non-FULLTEXT index.)