| |
Database 101
In this Bit Literate Guide you'll learn some basic database terminology and you'll learn what it means for a database to be a "relational database."
But we're going to keep this simple. There are vast amounts of information on this topic available on the Internet,. If you are looking for more detail you can easily find it there. Our intent is to provide a "primer" for adult learners who are using a database in the context of some course.
The first thing to clear up is that the word database has several different meanings. The term can refer to the database management software being used: Oracle, MySQL, and Microsoft Access are examples of database management software. And the term can also refer to the collection of tables used to store some set of data. When someone refers to "the database" they may be referring to the software, or they may be referring to the data. But they may also be referring to both things at once, in an imprecise sort of way.
These days almost every database (both software and the data storage) are based on what is called the relational database model. That sound complicated, but all it really means is that all the data is stored in tables and that tables can be linked using a common attribute.
So for example, imagine that you need a database to store customer records for your business. Using relational database management software (an RDBMS) you would start by creating a table called customers. Each data item that you want to store about a customer will require a separate field in this table. Each customer will end up as a row in the table.
So you would have fields for:
first name
last name
nickname
phone number
status (some customers are special)
street
street two (in case they have extra info like an apt. number)
city
state
zip code
So you've set up this very simple database with just the customers table, but after using it for a while you realize that you actually have two addresses for many of your customers. In some cases you have a billing address associated with the customer's credit card and separate shipping address. Of course you need to store both addresses and your initial thought is to add a second set of address fields to your customer table. Maybe something like:
shipping street
shipping street two
shipping city
shipping state
shipping zip code
Doing so will violate the prime directive of relational database design. That is, never provide two ways to store the same kind of thing.
For many of your customers the billing address and the shipping address will be the same address. Only some customers will have separate addresses for billing and shipping. Adding a second set of address fields to your customer table will inevitably lead to confusion and errors. For example, you could say that if there's no shipping address, use the billing address for shipping. But what happens when a customer has a shipping address and wants an order shipped to the billing address. Or what happens when both addresses are stored in the table, but they are really the same address with some slight variation (like St. vs. Street)
Violating the never provide two ways to store the same kind of thing rule is what leads to statements like, "The computer sent your order to the wrong address".
The relational database approach is to create a separate address table that can be linked to the customer table. This link is called a relation, and yes, that's why we call it a relational database.
Using this approach means that you have only a single place where an address can be stored.
To continue with our example, we would remove the address fields from the customer table and create a new table specifically to store addresses. The fields in the new address table will be:
street
street two
city
state
zip code
address type
customer id
There are two important things to note here. First is the address type field. You'll use this field to store some code that designates whether the address is a billing address or a shipping address. This will allow you to store more than one address for a customer and be able to tell them apart. Since you can now store as many addresses as you need for each customer, you could add additional addresses as you need them. In addition to the "billing" and "shipping" address types you could easy handle new address types like "gift" or "onetime".
The second new field to note in the address table is the customer id field. Remember, we need some way to associate each address with the proper customer. We'll make sure that each customer gets a unique ID number assigned when the customer is added to the database. Then we'll store that id along with each of the address records for that customer in the address table. This special field is called a key.
Now when we add a new record using a form, or we run a report, or we ship an order, the database software can use the customer ID to find the address(es) in the address table. Pulling the tables together using the key field is called joining the tables.
This simple set of concepts underlies the many databases you encounter on a daily basis (at the bank, at the motor vehicle dept., in your employer's payroll system, when you pass through an EZpass toll booth). Of course, these systems usually contain many tables, sometimes hundreds. But the underlying principles are the same. And as you've just seen, really not that hard to understand.
|