Will moving the MATCH…AGAINST part in different part of the query (e.g. before evaluation of JOIN) will improve the performance?

Posted on

Question :

Basically, I have the following query:

select `t`.*
from `tracking` `t` 
where exists 
(   
    select `ti`.`id` from `tracking_items` `ti`
    inner join `cars` `c` on (`c`.`car_id` = `ti`.`tracking_object_id`)
    where `t`.`id` = `ti`.`tracking_id` and match(`c`.`car_text`) against('+bulgaria +turkey' in boolean mode) 
    limit 1
) and ... limit 5

The problem I have?

There’s FULLTEXT index for c.car_text column, but however the query is running a bit slowly (especially for records that rarely occurs, e.g. searching for non-existing keyword).

I haven’t published the result of EXPLAIN, as I’d like to know the theory behind my question, e.g. for different possible cases.

I use MySQL 5.6 with InnoDB storage engine, also it’s worth to mention that cars table have like ~160m records.

Answer :

I am glad you came here. I have always suggested using FULLTEXT indexes to collect keys first and then join it to other tables.

In your particular case, I would suggest rewriting your query as an all-out JOIN:

SELECT t.* FROM tracking t
INNER JOIN tracking_items ti ON t.id = ti.tracking_id
INNER JOIN (SELECT car_id,car_text FROM cars WHERE MATCH(car_text)
AGAINST ('+bulgaria +turkey' in boolean mode)) c
ON ti.tracking_object_id = c.car_id LIMIT 5;

or maybe

SELECT t.* FROM
(SELECT car_id,car_text FROM cars WHERE MATCH(car_text)
AGAINST ('+bulgaria +turkey' in boolean mode)) c
INNER JOIN tracking_items ti ON c.car_id = ti.tracking_object_id
INNER JOIN tracking t ON ti.tracking_id = t.id  LIMIT 5;

If these queries give you an idea, please incorporate it if you still want to use WHERE EXISTS.

Please run the explain plan on these queries and your and see the differences

EXPLAIN select `t`.*
from `tracking` `t` 
where exists 
(   
    select `ti`.`id` from `tracking_items` `ti`
    inner join `cars` `c` on (`c`.`car_id` = `ti`.`tracking_object_id`)
    where `t`.`id` = `ti`.`tracking_id` and match(`c`.`car_text`)
    against('+bulgaria +turkey' in boolean mode) 
    limit 1
) and ... limit 5;

EXPLAIN SELECT t.* FROM
(SELECT car_id,car_text FROM cars WHERE MATCH(car_text)
AGAINST ('+bulgaria +turkey' in boolean mode)) c
INNER JOIN tracking_items ti ON c.car_id = ti.tracking_object_id
INNER JOIN tracking t ON ti.tracking_id = t.id LIMIT 5;

EXPLAIN SELECT t.* FROM tracking t
INNER JOIN tracking_items ti ON t.id = ti.tracking_id
INNER JOIN (SELECT car_id,car_text FROM cars WHERE MATCH(car_text)
AGAINST ('+bulgaria +turkey' in boolean mode)) c
ON ti.tracking_object_id = c.car_id LIMIT 5;

Give it a try, and let us know what happens !!!

That query is rather confusing. You want one tracking_item, then fetch 5 trackings? And you don’t care which tracking_item (there is no ORDER BY). And you apparently don’t care which 5 trackings.

I suggest using JOIN instead of EXISTS. The goal is to let MATCH be the first thing that happens.

SELECT t.*
    FROM (
        SELECT ti.tracking_id
            FROM cars AS c
            JOIN tracking_item AS ti  ON ti.tracking_object_id = c.car_id
            WHERE MATCH(c.car_text) AGAINST('...' IN BOOLEAN MODE)
            ORDER BY ???
            LIMIT 1
         ) AS x
    JOIN tracking AS t  ON t.id = ti.tracking_id
    ORDER BY ???
    LIMIT 5

Now it reads in the same order that the Optimizer will (I expect) decide to do things.

  1. Do the fulltext query on cars, delivering only a few, if any, rows;
  2. Pick the first match (no ORDER BY) or fetch all that match, sort, and deliver first.
  3. join to tracking_item to get the tracking_id(s)
  4. Pick one tracking_id
  5. Fetch 5 rows (if no ORDER BY) or all rows matching on tracking_id and sort them.
  6. Deliver all columns for up to 5 rows.

Indexes needed:

tracking_item:  INDEX(tracking_object_id, tracking_id)  -- in this order
tracking:       INDEX(id)  -- (not needed if it is already the PRIMARY KEY)

As for the question in the title — It is much too broad to give a yes/no answer. There could be many different extenuating circumstances.

Leave a Reply

Your email address will not be published.