A fellow developer and I considering a basic data modeling scenario and
would like additional perspective. For the sake of this question, please
consider one of the classic many-to-many relationships we find in
introductory database design articles/books: Students and Classes entities.
Of course we'd expect 3 tables to implement the many-to-many relationship
between Students and Classes (e.g., Students, StudentsXClasses, and Classes;
I hope you don't think you need DDL for this question :-)
Now we all know that the StudentsXClasses table would "tell us" which
students are enrolled in which classes. No problem there. But what about the
situation in which a student is enrolled in zero classes... or the similar
situation in which a Class has zero students. In both of those cases we
would expect the StudentsXClasses table to have zero rows in it for the
particular Student/Class combinations (where zero are enrolled).
To my thinking, that's the "end of the line" on the implementation. If we
want to find any class with zero students, then we query to find the classes
with zero rows in StudentsXClasses, for example... same with students
enrolled in zero classes.
But my fellow developer wants to have a "class" (row in the Classes table)
named "none" or something like that... and when a student is enrolled in
zero "real classes", then the StudentsXClasses table gets one row that
points to the student and to the "none class." The argument in favor of
doing this is that we have an explicit way for users to indicate - "yes,
this student is in fact enrolled in zero classes" (i.e., by creating that
entry)... as opposed to simply inferring "zero classes enrolled" from a lack
of expected row(s) in StudentsXClasses.
If we go with his desired implementation (i.e., having a pseudo class,
zero classes), then I'm thinking we'd have to create a table constraint or
trigger that prevents the "none class" from being entered into the
StudentsXClasses table for a given student along with other "real classes" -
as that would be an invalid combination of rows.
While that implementation seems unnecessarily complicated to me (having the
invalid row combinations), my fellow developer is telling me that the
business currently spends a lot of time trying to answer to the question;
some explicit "flag" that is there to answer that question would be valuable
to the business... therefore we have a "good reason" to do something we
otherwise would not want to do.
Thoughts? Perspective? Opinions?