Ten Common Database Design Mistakes
Louis Davidson at simple-talk has an article by the same name. I’ve pulled the following section from it, because I am currently faced with EXACTLY this problem. One of our Architecture Astronauts thought it would be good to store differing types of data in one table.
One of the hardest things in our industry is to get a team of people to work along the same line. Some have compared managing programmers to herding cats. In my case many of the cats think they know best…..
So here is Louis:
One table to hold all domain values
“One Ring to rule them all and in the darkness bind them“
This is all well and good for fantasy lore, but it’s not so good when applied to database design, in the form of a “ruling” domain table. Relational databases are based on the fundamental idea that every object represents one and only one thing. There should never be any doubt as to what a piece of data refers to. By tracing through the relationships, from column name, to table name, to primary key, it should be easy to examine the relationships and know exactly what a piece of data means.
The big myth perpetrated by architects who don’t really understand relational database architecture (me included early in my career) is that the more tables there are, the more complex the design will be. So, conversely, shouldn’t condensing multiple tables into a single “catch-all” table simplify the design? It does sound like a good idea, but at one time giving Pauly Shore the lead in a movie sounded like a good idea too.
For example, consider the following model snippet where I needed domain values for:
- Customer CreditStatus
- Customer Type
- Invoice Status
- Invoice Line Item BackOrder Status
- Invoice Line Item Ship Via Carrier
On the face of it that would be five domain tables…but why not just use one generic domain table, like this?

This may seem a very clean and natural way to design a table for all but the problem is that it is just not very natural to work with in SQL. Say we just want the domain values for the Customer table:
SELECT *FROM Customer
JOIN GenericDomain as CustomerType
ON Customer.CustomerTypeId = CustomerType.GenericDomainId
and CustomerType.RelatedToTable = ‘Customer’
JOIN GenericDomain as CreditStatus
ON Customer.CreditStatusId = CreditStatus.GenericDomainId
and CreditStatus.RelatedToTable = ‘Customer’ and CreditStatus.RelatedToColumn = ‘ CreditStatusId’
As you can see, this is far from being a natural join. It comes down to the problem of mixing apples with oranges. At first glance, domain tables are just an abstract concept of a container that holds text. And from an implementation centric standpoint, this is quite true, but it is not the correct way to build a database. In a database, the process of normalization, as a means of breaking down and isolating data, takes every table to the point where one row represents one thing. And each domain of values is a distinctly different thing from all of the other domains (unless it is not, in which case the one table will suffice.). So what you do, in essence, is normalize the data on each usage, spreading the work out over time, rather than doing the task once and getting it over with.

Looks harder to do, right? Well, it is initially. Frankly it took me longer to flesh out the example tables. But, there are quite a few tremendous gains to be had:
- Using the data in a query is much easier:
SELECT *
FROM Customer
JOIN CustomerType
ON Customer.CustomerTypeId = CustomerType.CustomerTypeId
JOIN CreditStatus
ON Customer.CreditStatusId = CreditStatus.CreditStatusId
- Data can be validated using foreign key constraints very naturally, something not feasible for the other solution unless you implement ranges of keys for every table – a terrible mess to maintain.
- If it turns out that you need to keep more information about a ShipViaCarrier than just the code, ‘UPS’, and description, ‘United Parcel Service’, then it is as simple as adding a column or two. You could even expand the table to be a full blown representation of the businesses that are carriers for the item.
- All of the smaller domain tables will fit on a single page of disk. This ensures a single read (and likely a single page in cache). If the other case, you might have your domain table spread across many pages, unless you cluster on the referring table name, which then could cause it to be more costly to use a non-clustered index if you have many values.
- You can still have one editor for all rows, as most domain tables will likely have the same base structure/usage. And while you would lose the ability to query all domain values in one query easily, why would you want to? (A union query could easily be created of the tables easily if needed, but this would seem an unlikely need.)
I should probably rebut the thought that might be in your mind. “What if I need to add a new column to all domain tables?” For example, you forgot that the customer wants to be able to do custom sorting on domain values and didn’t put anything in the tables to allow this. This is a fair question, especially if you have 1000 of these tables in a very large database. First, this rarely happens, and when it does it is going to be a major change to your database in either way.
Second, even if this became a task that was required, SQL has a complete set of commands that you can use to add columns to tables, and using the system tables it is a pretty straightforward task to build a script to add the same column to hundreds of tables all at once. That will not be as easy of a change, but it will not be so much more difficult to outweigh the large benefits.
The point of this tip is simply that it is better to do the work upfront, making structures solid and maintainable, rather than trying to attempt to do the least amount of work to start out a project. By keeping tables down to representing one “thing” it means that most changes will only affect one table, after which it follows that there will be less rework for you down the road.
Write a comment