Database: conception

Draw a picture

I like to start with a little picture of my database design, and I have to admit I used Microsoft Access' relationships view as my inspiration.

Access has this cool relationships tool where you can add all of the tables in your database and plot their relationships. More than that, actually: you can enforce referential integrity.

Don't let that scare you: referential integrity just means that you set up ways your database enforces relationships between tables. So, for example, if you store your friends' names in one table and their phone numbers in a related table, your referential integrity rules will make sure that when you delete your brother from your friends table, his phone numbers will go away, too. Don't ask me why you would want to delete your brother. It's just an example.

In my drawing, I have three tables: friends, phones and phonetypes. Notice each of these tables has a plural name, each describes what the table holds and none have spaces or underscores.

We want to take full advantage of the relational database structure, so the phones are in their own table, connected to our friends by the key field in the friends table, friendid.

Why not put phone numbers in the same table as your friends? Let's explore that question.

What if you allowed two columns for phone numbers? You called one of them home and the other cell. You are entering a new friend, but this one has two cell phones and no home phone. What do you do?

Some people do a work-around and enter the second cell phone in the home field, adding a notation of some sort, maybe 'cell2'. But what if your friend has two business cell phones and one personal cell phone? Because you only designed two columns for phone numbers, you are now considering the addition of a couple more columns for phone numbers. You have by now designed a printable phone listing that only has two phone number fields, so you have to change not only your table design but report design as well. And you have built in a lot of overhead, because you have empty fields for those friends who don't have all of those phone numbers.

Relational databases provide flexibility: you can enter as few or as many phone numbers for your friends as they have just by putting them in a separate, related, table. Simple!

You have just learned the First rule of Normalization: a table shall not contain repeating groups.

There is an additional table I haven't talked about and I'm sure you noticed, called phonetypes. I included this table as a pick list when entering phone numbers. I want all mobile phone numbers described as cell, not mobile. I don't want any descriptions capitalized. I want to make sure that no one enters something that is non-descriptive, like 'phone'. Limiting entries to values you have thought through in advance is a simple way to keep entries and reports looking clean and consistent. The more you consider output in advance, the better off you will be.

Key fields

Before getting into key fields, let's talk a little about records and columns.

You may be more accustomed to dealing with spreadsheets than databases, but spreadsheets are a lot like databases. When you open a table in your database, it looks very much like a spreadsheet. Each row of the spreadsheet is a record in the database. Each column of the spreadsheet is a field in your record.

Each record in your database must have a field or combination of fields that make it unique. Going back to our friends table as an example, we could have hundreds of records - hundreds of friends - and at some point forget whether we have added Tom Jones. We want the database to prevent duplication wherever possible.

Let's think about our phones table for a minute. It is possible that two of our friends could have the same phone number: they work in the same office. We want to allow that to happen, so we cannot make the phone number the key field. By itself, that does not distinguish one record from another. We want to allow the same phone number in multiple records, but we do not want the same phone number listed twice for the same person. The combination of friendid and phone can distinguish one record from another.

We also want to allow the addition of records only where there is enough information to make sense. If we do not know a phone number but we know our friend has a cell phone, does it make sense to add a record with no phone number and the description, 'cell'? No. It wouldn't make sense to create a record that has no value, and it follows that Nulls are not allowed in key fields.

Null means that no value is present at all. It is the absence of a value.
It is not the same as 0 or a single blank character.

Making the connection

Now that we have decided which tables we need and what our key fields are, we can better define relationships between our tables.