Question :
I want to create simple conversation module on my site. So I need a list of users who had conversation with current user. I need to show amount of new messages in each conversation.
I think about next tables
User
id
...
Conversation
id
user_id1
user_id2
cnt_new_msg
last_msg_id
Message
id
conversation_id
sender_id
type
is_read
created_at
For selecting data for list of conversation I can use next query:
SELECT *
FROM conversation c
LEFT JOIN message m ON m.id = c.last_msg_id l.last_ms
WHERE user_id1=[current_user_id] OR user_id2=[current_user_id]
When user send new message I will increment Conversation.cnt_new_msg
and when user open conversation I will set Message.is_read
to 1 and recalculate Conversation.cnt_new_msg
again. Also to sync this value I can create cronjob.
But this approach has problems. I don’t like user_id1=[current_user_id] OR user_id2=[current_user_id]
for each query. My tables will be very denormalized but I think it will be faster.
Please advise best practices to design conversations tables.
Answer :
Don’t store functionally dependent data. If you should need that for performance, you can always add a materialize view later (see below). But it’s often better (and sometimes even faster) to calculate derived values in queries on the fly.
In particular, you would create many dead rows, table bloat, more work for VACUUM
and nasty concurrency issues in conversation
if you UPDATE
that row with every new message and with every status change.
I suggest a relational model like this:
CREATE TABLE users (user_id serial PRIMARY KEY, ...)
CREATE TABLE conversation (
conversation_ id serial PRIMARY KEY
, user_id1 int NOT NULL REFERENCES users
, user_id2 int NOT NULL REFERENCES users
-- cnt_new_msg -- redundant
-- last_msg_id
, created_at timestamptz NOT NULL DEFAULT now()
);
CREATE TABLE message (
message_id serial PRIMARY KEY
conversation_id int NOT NULL REFERENCES conversation
sender "char" NOT NULL CHECK (sender IN ('1', '2')
message text NOT NULL
type ???
unread boolean NOT NULL DEFAULT TRUE
created_at timestamptz NOT NULL DEFAULT now()
);
Details for data types in the manual.
You can create a VIEW
or MATERIALIZED VIEW
to present conversation
data like you had in mind:
CREATE VIEW conversation_plus_ct AS
SELECT c.*
, count(m.unread OR NULL) AS cnt_new_msg
, max(m.created_at) AS last_msg_id
FROM conversation c
JOIN message m USING (conversation_id);
And your query could look like this (not using the VIEW, but it could ..)
SELECT *
, count(m.unread OR NULL) AS cnt_new_msg
, max(m.created_at) AS last_msg_id
FROM conversation c
LEFT JOIN message m USING (conversation_id)
WHERE [current_user_id] IN (user_id1, user_id2);