Database design for cancer mouse models

Posted on

Question :

This is a follow up question from my previous post:
Does this simple database design work even if it’s cyclic?

I am a graduate student working on breast cancer. My mice are genetically engineered to develop cancer, and I want to keep track of the their health status.

Each day, animal care staff check each cage. If they make at least one clinical observation, they report it to the vet. The report looks something like:

enter image description here

The animal care staff will color the boxes, and the vet will confirm (check-mark = “yes”, x = “no). Notice that several observations can be reported at once. Keeping track of the number of cages is also important because several cages (with the same protocol and researcher) may have the same observations, and the animal care staff will only make one report.

Once something is found, a clinical case is opened (the report becomes the case sheet). Subsequently, if additional observations are made with cages that already have a case number, the vet simply writes them in the
“notes” sections of the first report. Hence, several reports can be grouped into one case. When I enter a new report/case, I want the database to verify if a case already exist. If yes, append the new report. If not, create a new case. When the mice are healthy again or euthanized, the case is closed.

Finally, I need to keep all the “false positive” observations made by the animal care staff. Here is my tentative database model:

enter image description here

I have two transition tables, I would greatly appreciate the community’s feedback. Thank you very much!



enter image description here

Answer :

As I said in my previous comment, you should have two main tables.

  • Observation (the origin of all cases)

  • Clinical_case (linked to an observation – note that “case” is an SQL keyword).

It appears to me that you are overcomplicating things with so many tables. I think that any report made by a health professional should have the same status – be it a health care worker or a vet – you can keep track of this with staff_id. Have a closed field to indicate the mouse is getting well or (gulp…) being euthanized.

What database are you using to track these experiements/observations? I would recommend PostgreSQL if you haven’t yet made a choice. What is your development environment?

Go with the simple system system first – remember Occam’s razor – “Entities should not be multiplied beyond necessity“. But, should you really need to, you can make it more complex – as Einstein said, “Everything should be made as simple as possible, but not simpler.” :-). Better to start simple rather than over-engineering.

[Edit – in response to OP’s question about why PostgreSQL is better than MySQL]

  • more datatypes (IP address…)

  • CHECK CONSTRAINTS MySQL has workarounds (Triggers…) but these are clumsy and error prone,

  • CTEs – common table expressions – also known as the WITH clause or subquery refactoring,

  • Analytic functions,

  • a richer procedural language (or should I say language(S)),

  • a CREATE DOMAIN command – very useful and supported by Firebird, Oracle as well as PostgreSQL. Not by MySQL, but maybe by other RDBMSs.

  • generally (opinion now) – more of a focus on data integrity and correctness. MySQL’s fame is built on the fact that it was quick (and nasty) – just perfect for the first Internet boom (1995 – 2000). It also was easy to set up on Windows (unlike PostgreSQL until version 8 (.3?)). It was also trendy with the GPL rather than the *BSD licence.

Leave a Reply

Your email address will not be published. Required fields are marked *