I have a table that looks like:
CREATE TABLE `connections` ( `connection_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `service_id` smallint(5) unsigned DEFAULT NULL, `parent_id` bigint(20) unsigned DEFAULT NULL, `child_id` bigint(20) unsigned DEFAULT NULL, PRIMARY KEY (`connection_id`), UNIQUE KEY `primary_child` (`parent_id`,`child_id`), UNIQUE KEY `child_primary` (`child_id`,`parent_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I need to do lookups on child_id and get information about parent_id and vice versa.
SELECT COUNT(`parent_id`) FROM `connections` WHERE `child_id` = 'x'
SELECT COUNT(`child_id`) FORM `connections` WHERE `parent_id` = 'x'
And some more advanced queries like:
SELECT DISTINCT `parent_id` FROM `connections` `c1` INNER JOIN `connections` `c2` ON `c1`.`parent_id` = `c2`.`parent_id` WHERE `c1`.`child_id` = 'x' AND `c2`.`child_id` = 'y'
I’m looking to get a better idea of the most efficient index structure to get quick responses from all the above query examples.
It should be noted that the child/parent pairs will be unique.
EDIT: This table currently holds 40M rows, likely to be much larger in the near future.
You already have indices on both colums with the unique constraint. No need for new ones.
I hope you have a good reason avoiding a foreign key. I hope you pay attention to keep your data consistent…
btw: parent_id in your join query is ambiguous.
primary_child (parent_id,child_id)is defined to be unique, then
KEY child_primary (child_id,parent_id)IS unique, so you only have to define it as
UNIQUE KEY. [This is more helpful while inserting though]
- Having lots of
NULLvalues is not recommended. It is not good for performance. If possible, add
NOT NULLinstead. However, this is highly dependent on your requirements. If there are lots of (parent_id, child_id) = (NULL, NULL), then create a new table with these two fields only, and link it to this table.