23
Dec

In my previous post, I used a key style that is open to debate and has been for many years amongst DB folks. The idea of every table having a surrogate key, regardless of the purpose of the table. This says, that for any record in the table I have a single column that acts as the primary key. Given a many-to-many relationship, using a surrogate key on the linking table allows me to describe the relationship in terms of objects and how they’re represented. As shown in the below diagram – each user may have many user_role instances, which are tied to a single role instance. This makes the lives of ORMs much easier since you can create objects for the linking table, which has a simple key to reference.

The ORM then has a User, UserRole, and Role object to use in accessing these tables and adding / removing relationships with ease, since it only needs to worry about the single surrogate ‘id’ key on each table.  In the linking table (as a design concern), one should place a Unique Index on the user_id/role_id column combination.

The other option is using a composite candidate key.  I may have the specific terminology wrong, but the idea is that instead of the single surrogate key to identify a unique record in the linking table, you use a design like the diagram below, which combines the columns that are foreign keys to their respective tables to create the primary key.  The combination of the columns creates a unique identifying key.  The difficulty emerges with ORMs attempting to create objects out this design, and attempting to correctly generate the SQL required to make updates / deletes, etc, using each member of the composite key.

Personally speaking, I’m a fan of the surrogate key approach, but I’ve worked with both.  I won’t discuss the performance impacts of either design, since I don’t have nearly the research base to accurately describe it.  But, using simple integer based keys, the difference should be low.

2 Responses to “Surrogates”

  1. Jon Tucker Says:

    I prefer the Candidate Key combination myself. Reason being is that the Surrogate Key method doesn’t allow the two columns (in the above example: user_id and role_id) to be maintained with referential integrity in the database. If a parent table has the user_role_id as a child relationship, one can easily modify the user_id and role_id in the user_role table without impacting the parent that is using the user_role table. The business rule just got wacked and the database stood by helpless.

    Now, if you want to code and maintain the database trigger to enforce this relationship, then I maybe willing to go along with it. : )

  2. dave Says:

    I would contend (given this example) that tables should not be referencing the user_role table, since that table merely links the user to the role, establishing that the relationship exists. The parent table should rather link to the role table, and the business logic states that the user is a member of that role.

    For instance, if driving a permission from the role, I would code the application to the role, not the user_role. Though, it is a very valid point in a different situation, and I’d not thought of it that way.

Leave a Reply