Question :
I have the following tables:
CREATE TABLE `tx_in` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`txid` char(64) DEFAULT NULL,
`hashPrevOut` char(64) DEFAULT NULL,
`address` char(34) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `hashPrevIndex` (`hashPrevOut`(8))
) ENGINE=InnoDB AUTO_INCREMENT=641391046 DEFAULT CHARSET=latin1
CREATE TABLE `tx_out` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`txid` char(64) DEFAULT NULL,
`btc_value` double DEFAULT NULL,
`address` char(34) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `txidIndex` (`txid`(8))
) ENGINE=InnoDB AUTO_INCREMENT=695391886 DEFAULT CHARSET=latin1
And I am running the following query:
UPDATE tx_in LEFT JOIN tx_out ON tx_in.hashPrevOut = tx_out.txid SET
tx_in.address = tx_out.address;
I modified it into a SELECT
statement so that I could run EXPLAIN
on it.
EXPLAIN SELECT * FROM tx_in LEFT JOIN tx_out ON tx_in.hashPrevOut = tx_out.txid;
And got this output
+----+-------------+--------+------------+------+---------------+-----------+---------+------------------------------+-----------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+------+---------------+-----------+---------+------------------------------+-----------+----------+-------------+
| 1 | SIMPLE | tx_in | NULL | ALL | NULL | NULL | NULL | NULL | 635293613 | 100.00 | NULL |
| 1 | SIMPLE | tx_out | NULL | ref | txidIndex | txidIndex | 9 | blockchain.tx_in.hashPrevOut | 2 | 100.00 | Using where |
+----+-------------+--------+------------+------+---------------+-----------+---------+------------------------------+-----------+----------+-------------+
But the execution plan doesn’t have any reference to the index on hashPrevOut
. Should I be concerned about this? It seems like it should just be using the two indexes to do the join. As you can see the tables are quite large so this could make a big difference in execution time.
Answer :
I think this is due to the partial index. i.e. the index has only 8 char of hashPrevOut
It estimated that the number of matching first 8 char will lead to a second read from the table to compare the full value from first table with the full value in the other table, so the optimizer decided to skip the index, and deal with the table directly.
Consider this suggestion,
A) in the query, tx_in.hashPrevOut = tx_out.txid
refers to two columns that are NOT indexed.
B) VARCHAR vs CHAR would conserve storage in the file (and HDD or SSD)
C) add INDEXES for the complete column content
for a predictable 1 to 1 match.
EXPLAIN SELECT original query to confirm your ndx’s are working and ROWS should be much lower as well as time to complete.