Suppose we have this database design
Users table: - id - name Friends table: - id - user_id - friend_user_id
USER 1 befriended
USER 2, this is what my application logic will execute:
INSERT INTO friends (user_id, friend_user_id) VALUES (1, 2); INSERT INTO friends (user_id, friend_user_id) VALUES (2, 1);
For the retrieval of the
USER 1 friends, we simply query this:
SELECT * FROM friends WHERE user_id = 1
Is it a bad idea to make redundant rows when friending users?
What I am trying to avoid here is not to store the friendship in a single row and keep guessing in every query which column the user is in:
SELECT * FROM friends WHERE user_id = 1 OR friend_user_id = 1
The latter made my application logic very complicated in some cases.
There are two ways of doing this:
- Only allowing the friendship to be established or removed via a procedure – no direct inserts or deletes allowed1
- Use a single row with an additional index
For purposes of this answer, let’s define our table as such:
CREATE TABLE Friendship ( First_User_Id INT NOT NULL ,Second_User_Id INT NOT NULL ,CONSTRAINT FK_First_User_Is_User FOREIGN KEY (First_User_Id) REFERENCES User (User_Id) ,CONSTRAINT FK_Second_User_Is_User FOREIN KEY (Second_User_Id) REFERENCES User (Used_Id) ,CONSTRAINT PK_Friend PRIMARY KEY (First_User_Id, Second_User_Id) ,CONSTRAINT CK_First_User_NE_Second_User CHECK (First_User_Id <> Second_User_Id) )
This differs from your question in the sense there is no
id column – it is unnecessary (it is not a key) and wasteful (a second index would need to be created to maintain uniqueness – which is essentially creating a copy of the entire table).
For this two work, we need two rows for each friendship
(B,A). However, to make sure both rows are created (or removed) at the same time, we need to use procedural logic.
How to achieve this will vary based on the DBMS used. The basic outline would be something like this:
/* Adding a friendship */ CREATE PROCEDURE Friendship_Create @First_User_Id INT ,@Second_User_Id INT AS BEGIN INSERT INTO Friendship VALUES (@First_User_Id, @Second_User_Id) ,(@Second_User_Id, @First_User_Id) END /* Removing a friendship */ CREATE PROCEDURE Friendship_Delete @First_User_Id INT ,@Second_User_Id INT AS BEGIN DELETE FROM FriendShip WHERE (First_User_Id = @First_User_Id AND Second_User_Id = @Second_User_Id) OR (First_User_Id = @Second_User_Id AND Second_User_Id = @First_User_Id) END
Depending on the DBMS, you may need to do additional checks before committing the transaction or set the isolation level of each transaction appropriately.
You also need to configure permissions in a way that no user (except root/sa) would have direct Insert/Update/Delete permissions.
Getting the friendships for
User 1 is very direct:
SELECT Second_Friend_Id FROM Friendship WHERE First_User_Id = 1
Using a single row
If the relationship is bi-directional (mutual), then you only need to store one row.
(A,B) contains the same amount of information as
However, this still leaves two issues unresolved:
- An efficient manner to query using either member of the friendship
- A way to prevent duplicate relationships from being established
For the first, we can simply create a second unique constraint2 on the table that inverts the order of the B-Tree. Physically this isn’t much different from inserting both a
(B,A) row in the table, except we are guaranteed both pairs will exist without relying on transactional logic.
For the second, we can impose the restriction that
User A Id <
User B Id. So of the pairs
(B,A), only one satisfies that condition.
Putting everything together:
CREATE TABLE Friendship ( First_User_Id INT NOT NULL ,Second_User_Id INT NOT NULL ,CONSTRAINT FK_First_User_Is_User FOREIGN KEY (First_User_Id) REFERENCES User (User_Id) ,CONSTRAINT FK_Second_User_Is_User FOREIN KEY (Second_User_Id) REFERENCES User (Used_Id) ,CONSTRAINT PK_Friend PRIMARY KEY (First_User_Id, Second_User_Id) ,CONSTRAINT AK_Friend UNIQUE (Second_User_Id, First_User_Id) ,CONSTRAINT CK_First_User_Id_LT_Second_User_Id CHECK (First_User_Id < Second_User_Id) )
Getting the friendships of
User 1 is slightly different but will return the same results:
SELECT CASE WHEN First_User_Id = 1 THEN Second_User_Id ELSE First_User_Id END AS Friend_User_Id FROM Friendship WHERE First_User_Id = 1 OR Second_User_Id = 1
You do not need a graph database to support graph structures, the relational model has done that since its inception (and databases have supported since recursion was implemented). What graph support can do is provide syntax that may simplify your queries somewhat – although the physical implementation may not be as lightweight as a standard implementation.
1 IMO procedures should be the default method of altering the data.
2 Technically you could just do a regular index, a
UNIQUE constraint does basically the same thing.