Question :
I have a Table which has 2 Million Records and No primary or Unique Keys.
I added Two Indexes and I’m expecting an Index to be used for my Select Query.
When I saw the Explain Plan , Optimizer is not using any of the Indexes I created.
I am quering a broader range of data which I cannot reduce as per my exisitng Use case and I want Optimizer to Pick Indexes.
I don’t want to Use FORCE INDEX Clause, I wanted to know how i can make Optimizer to use INDEXS
Below is the Table SCHEMA
CREATE TABLE EVENT_DETAILS
(
ORG_ID CHAR(32) NOT NULL,
CONFIG_KEY CHAR(32) NOT NULL,
COMPONENT_NAME VARCHAR(512) NOT NULL,
REQUEST_ID CHAR(32) NOT NULL,
UPDATE_DATE INT NOT NULL,
UPDATE_TIME INT NOT NULL,
CLIENT CHAR(16) NOT NULL,
STATUS INT NOT NULL,
EVENT_DATA VARCHAR(1024),
MESSAGE VARCHAR(512),
INSERTED_TIME TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
);
CREATE TABLE DOMAIN_KEYS (
DOMAIN_KEY char(32) COLLATE utf8_unicode_ci NOT NULL,
DOMAIN_IDENTITY varchar(256) COLLATE utf8_unicode_ci NOT NULL,
DOMAIN_DISPLAY_NAME varchar(256) COLLATE utf8_unicode_ci NOT NULL,
ORG_ID char(32) COLLATE utf8_unicode_ci NOT NULL,
LOCATION_KEY int(11) DEFAULT '0',
DESCRIPTION varchar(4000) COLLATE utf8_unicode_ci DEFAULT NULL,
CATEGORY int(11) NOT NULL DEFAULT '1',
ENVIRONMENT_TYPE int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (DOMAIN_KEY)
);
CREATE TABLE ORGANISATIONS (
ORG_ID char(32) NOT NULL,
ORG_IDENTITY varchar(256) NOT NULL,
ORG_NAME varchar(256) NOT NULL,
ORG_DESC varchar(256) DEFAULT NULL,
PARENT_ORG_ID char(32) NOT NULL,
PRIMARY KEY (ORG_ID)
);
Below are the Indexes created
CREATE INDEX MYINDEX1 ON EVENT_DETAILS (ORG_ID,CONFIG_KEY,COMPONENT_NAME,UPDATE_DATE);
CREATE INDEX MYINDEX2 ON EVENT_DETAILS (UPDATE_DATE,COMPONENT_NAME);
Below is the Query
SELECT ED.ORG_ID,
ORG.ORG_IDENTITY,
ORG.ORG_NAME,
ED.CONFIG_KEY,
REPLACE(REPLACE(ED.COMPONENT_NAME,'-CURRENT',''),'-HISTORICAL','') AS COMP_NAME,
DO.DOMAIN_IDENTITY,
DO.DOMAIN_DISPLAY_NAME,
GROUP_CONCAT(ED.STATUS ORDER BY ED.INSERTED_TIME DESC) AS ALL_STATUS
FROM EVENT_DETAILS ED ,
ORGANISATIONS ORG ,
DOMAIN_KEYS DO
WHERE UPDATE_DATE > '20191205'
AND ORG.ORG_ID = ED.ORG_ID
AND DO.DOMAIN_KEY=ED.CONFIG_KEY
AND DO.LOCATION_KEY <> 0
GROUP BY ED.ORG_ID,
ORG.ORG_IDENTITY,
ORG.ORG_NAME,
ED.CONFIG_KEY,
DO.DOMAIN_IDENTITY,
DO.DOMAIN_DISPLAY_NAME,
COMP_NAME
HAVING ALL_STATUS LIKE '0,0,0%';
I wanted to Reduce the Temp Tables created as well , because I observed Free Local Storage of my DB Machine going down when the Query is executing with high volumes of data
Some one please suggest the required Schema Changes and if Query can be Optimized.
Query Execution Break Up
Answer :
- Use InnoDB.
- Have a
PRIMARY KEY
on every table - Don’t blindly use
VARCHAR(256)
find a reasonable limit. - Use the
JOIN .. ON
syntax, not the commalist syntax. - Use
DATETIME
instead of a pair ofINTs
CHAR(32)
smells like some kind of hash — such is very inefficient for indexing andJOINing
.- If those are hex strings, use
CHARACTER SET ascii
, not utf8. - Hashes can be made smaller with
HEX
andUNHEX
and usingBINARY(16)
. - If more than about 20% of ED satisfies
UPDATE_DATE > '20191205'
, the Optimizer will eschew an index starting with that column for a table scan.
MySQL will only use one index per table per query, on a composite index it will use the columns from left to right, the where clause needs to have the columns specified in the same order.
I’d experiment with creating an index on ED that covers ORG_ID and UPDATE_DATE, alter the order of the columns so the filter that is most selective comes first.