Question :
I found a unique index on (RetailerID, ID)
in our database. The ID
column is the primary key so it’s already uniquely indexed. I can’t see how an index on (RetailerID, ID)
could help when looking up an individual record by RetailerID
and ID
(or just ID
).
I suspect it might be helpful when querying all the ID
s under a particular RetailerID
, since it covers such a query. Is that true?
But then why not just make the index on RetailerID
include ID
? Perhaps the unique index on (RetailerID, ID)
covers the query in the same way but performs better as a result of being a unique index.
As I said, there’s a unique primary clustering key on ID
alone. There’s also a unique index on (RetailerID, ID)
. Since ID
is already unique, querying where RetailerID = x and ID = y
shouldn’t be any more efficient than simply querying where ID = y
. (ID)
is unique and (RetailerID, ID)
is unique, so ID
is associated with at most one RetailerID
. Therefore, if a predicate has ID,
then including RetailerID
in the predicate is redundant. The clustered index would always be faster.
An index on (RetailerID, ID)
could be helpful for a query where the predicate is just on RetailerID
and the select is on just ID
, because it would cover the query, but for queries where ID
is in the predicate, RetailerID
in the predicate is redundant, and since it has no other included columns, it would be less-efficient than the clustered index when selecting any column other than ID
. That’s why I’m not seeing the point of this index, given the predicates used (which always involve ID
).
Answer :
I can think about one scenario where such index is needed. If you want another table to have a Foreign Key constraint that refers (RetailerID, ID)
, you must create a unique constraint (index). This pattern makes sense in some cases, for instance if you want to have a common parent table which holds common attributes, and multiple detail tables, and at the same time enforce that details are saved in the proper table.
Update. Example to illustrate an idea :
vehicle (vehicle_id, vehicle_type_id, year, model, ...
PK (vehicle_id),
Unique (vehicle_type_id,car_id),
CHECK (vehicle_type_id IN ('car','truck'))
);
car (vehicle_id, vehicle_type_id default 'car',
num_passengers, car_type, ....
PK,FK(vehicle_id, vehicle_type_id) ref vehicle,
CHECK (vehicle_type_id ='car')
);
truck (vehicle_id, vehicle_type_id default 'truck', load,
dimensions, ... ,
PK,FK(vehicle_id, vehicle_type_id) ref vehicle,
CHECK (vehicle_type_id ='truck')
);
Thus you enforce that truck’s attributes are always stored in truck table, car’s – in car, and at the same time refer to vehicle
if vehicle type is not important, say vehicle_owner
can look like vehicle_id, person_id, date_from ,date_thru, FK (vehicle_Id) ref vehicle(vehicle_id))
.
Clustered index columns are automatically added to non clustered index. Once non clustered key is located if other column values are required (by your query, most likely due select columns) to be retrieved a row locator is used to located the data row. In your case that is column ID. If it is a heap a unique pointer is added, which points to the pageid, rowid.
You can add columns to avoid going to data rows in your non clustered index as included column. Details here.
Nonclustered indexes have a structure separate from the data rows. A
nonclustered index contains the nonclustered index key values and each
key value entry has a pointer to the data row that contains the
key value.The pointer from an index row in a nonclustered index to a data row is
called a row locator. The structure of the row locator depends on
whether the data pages are stored in a heap or a clustered table. For
a heap, a row locator is a pointer to the row. For a clustered table,
the row locator is the clustered index key.
Here is a demo of this by Brent Ozar.