Storing hex values as strings or integers?

Posted on

Question :

I’ve inherited a database that uses sets of 32-byte hex strings to attempt to uniquely identify objects. Each string is called a fingerprint and each object can have up to 20 fingerprints.

I’m migrating the database across servers and I’ve noticed that re-populating the Fingerprint table takes hours. It seems that each fingerprint is given an integer id and is stored as a varchar, there’s an index on the fingerprint column and hence the slow inserts.

My question is: is it worth my while to refactor this design and remove the Fingerprint table altogether? I would guess that indexing on numbers rather than strings would be faster, is this correct?

If the design looks like this: (hopefully this make sense)

 id                      object_id                          id
                         fingerprint_id                     fingerprint

Would it be better to use:

 id                     object_id

I’m using a MySQL server with INNODb tables and I have roughly 1.4 million objects.

Answer :

I would recommend using the BINARY type; strings waste a lot of space, and integers require conversion.

For the other part: why do you need the associative table? If objects and fingerprints are 1 to 1, you could either store them directly in the object table, or just have single table with the id, fingerprint, and object id FK.

Leave a Reply

Your email address will not be published. Required fields are marked *