Which column to index for this query?

Posted on

Question :

select DISTINCT id_request,username_request
from darkhast_follower
where darkhast_follower.id_request != '9762952594' 
  AND darkhast_follower.id_request != 'null' 
  AND darkhast_follower.status !=1 
  AND NOT exists ( select *
                   from log_follow 
                   where log_follow.other_id = darkhast_follower.id_request 
                   AND log_follow.id= '9762952594') 
LIMIT 5

Which column should be indexed? Or what is the best query to replace this?

Answer :

No index would be helpful in terms of inequality.

If you have the table of N rows and key=val yields the k rows of result, then key!=val yields N-k rows. Depending on the key’s selectivity N-k tends to N. So search on inequality is equivalent to the full table scan for all cases and can’t be improved by index.

Do not say

!= 'null' 

Instead, say

IS NOT NULL

That is, unless you are actually storing the 4 characters “null” in a VARCHAR?

Indexes:

log_follow:  INDEX(other_id, id)   -- for the subquery
darkhast_follower:   no index is useful for this query

The query, as you have it, will scan darkhast_follower until it finds 5 rows matching the all the WHERE clauses. In doing so, it will check log_follow — This task will be a table scan (slow) if you don’t have a suitable index. Else it will be fast.

Leave a Reply

Your email address will not be published. Required fields are marked *