Database Normalization
Guidelines for a well-designed database
The Database design process should comply with certain broad guidelines. A well designed database stores information in the smallest amount of space possible and responds to queries speedily. Our sample database stores data of books and customers of a bookstore. Consider a scenario where customers should be mapped to their book category preferences for new release mailing lists:
The above table stores the word Fiction four times and the words Non-Fiction, Poetry, and Mythology three times. If the customer base grew to, say, 10000 people, the same words would appear many thousands of times. This table contains redundant (unnecessarily repeated) data. How would a good design eliminate this problem and streamline the database? Take a look at the following tables.
The CustomerCategories have now been split into two tables. Each customer now has a unique ID. This ID is called a Primary Key. Assigning such unique IDs or primary keys makes a database compliant to the 1st Normal form. The lookup becomes easier now; all we have to do is search for 'Poetry' in the second table to get a list of customers who have a preference for this category. However, it still looks untidy and numbers still get repeated a lot. Also, if the last customer subscribed to a particular category is removed from the database, the category will no longer be listed in the database!
Look at the above set of tables. A new Category table has been created with unique IDs for each category. Now, the new CustomerCategories table contains less repetitions and is much clearer. We have separated the old CustomerCategories table because it depended on two sets of keys - customerID and categoryID - not a good practice. By doing this, we have achieved the 2nd normal form. To get the design compliant to the 3rd normal form, all the data within a table should pertain to a single subject. For example, let us say that the bookstore expands to many cities. Rather than adding branch name and location columns to the customers table, we would create a Branch table with a unique primary key and use this primary key to indicate a customer's primary branch. This concept is called a foreign key. Using such keys makes the database comply with the 3rd Normal Form. Making a database comply with these normal forms will ensure that it is efficient.
A variant of the SQL query language known as T-SQL is the standard way of talking to a SQL Server database. SQL as a language is declarative and not procedural. A procedural language uses a number of steps or a procedure to get a specified result. A declarative language, on the other hand, simply makes a statement - a single declaration that is sent to the DBMS. The DBMS then executes internal programs and returns an answer. For example, the SQL statement 'CREATE TABLE' with a list of parameters creates a table on the database.
A SQL Server table may contain many columns with varying data types. The data type is the sort of value (number, string, date etc.) that the column contains. A table is made up of rows and columns. The columns define the subject of the data (or domain) each row contains. Table names should be unique within a database. The structure of a table consists of its columns and their data types. The order in which rows appear when a table is opened has nothing to do with the way they are stored.
We will learn T-SQL and use T-SQL to create structured programs with a flow of control over the next three chapters. We will use a sample bookstore database for our exercises. Let us first work on creating an appropriate database and populating tables with sample data. Most commands may be executed by clicking on the 'New Query' icon on the top left hand side window and entering a query or sets of queries in the window that pops up. Highlighting a portion of the queries or the entire set and clicking the 'Execute' '!'' exclamation point icon will run the highlighted queries. We will predominantly use the SQL query window panel to test our queries. However, we will also look at the GUI way of accomplishing each task.
Our database will be called 'booksdb'. Right click on the 'databases' folder on the tree on the left hand window of Management Studio and select 'New Database'. Enter 'booksdb' in the empty field next to 'Database Name:' and click the OK button. The new database will be created with folders for all associated objects. Now, let us learn how to create a table using the Management Studio GUI before running a T-SQL script to create all the booksdb database tables. We have already created an author table using the GUI in the last chapter.



