Network vs. Hierarchical Databases
The difference between network databases and hierarchical databases.
Fig 1c: Two Models
The large flat files in the databases of the 1960s were very hard to search. Because the existing systems were so inflexible, a lot of research went into data models: ways of arranging and looking at data that would make searching easier. A “hierarchy” is a one-way connection between nodes, like the organizational chart of a company. The hierarchical model is one of the oldest ways of organizing data. Consider a shopping catalogue. Let us say that our catalogue has three main sections: a men's section, a women's section, and a kid's section.
Main Section    | Subsection
Men's section   | Belts & Wallets
                | Shoes
                | Tools
Women's section | Jewelry
                | Accessories
Kid's section   | Toys
A database with a hierarchical data model would link each main section with all its subsections. However, once an item is classified as a subsection, it sits below the main section in the hierarchy, and the links run only downward: a user could not directly look up a main section from a subsection. IBM's Information Management System (IMS), introduced in the late 1960s, was designed in this way. Another well-known IBM system of that era was SABRE, a reservation system built for American Airlines.
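The one-way nature of these links can be illustrated with a minimal Python sketch. This is not how IMS stored data; the `catalogue` dictionary and both function names are assumptions made up for this illustration, using the sample catalogue described earlier.

```python
# Hypothetical catalogue: each main section points down to its subsections.
# There is no stored link from a subsection back up to its main section.
catalogue = {
    "Men's section": ["Belts & Wallets", "Shoes", "Tools"],
    "Women's section": ["Jewelry", "Accessories"],
    "Kid's section": ["Toys"],
}


def subsections(main_section):
    """Follow the downward link: one direct lookup."""
    return catalogue[main_section]


def find_main_section(subsection):
    """No upward link exists, so every main section has to be scanned."""
    for main_section, subs in catalogue.items():
        if subsection in subs:
            return main_section
    return None
```

Going down the hierarchy is a single lookup, while going up requires walking the whole structure, which is the limitation the paragraph above describes.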
A “network”, on the other hand, is a two-way connection between nodes, like phone lines. The network model consists of groups of related data called sets. In a database with a network data model, each main section and all its subsections form a set. All the records, or members of the set, are linked, and the subsections are ordered. A database operator would find it equally easy to look up a main section given a subsection and the subsections given a main section. The best-known specification of this model came from CODASYL (the Conference on Data Systems Languages), and a number of database products were built to it.
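A set with links in both directions might be sketched as follows. The `Record` class and `add_member` helper are hypothetical names chosen for this illustration; they are not part of the CODASYL specification, which defined its own data description and manipulation languages.

```python
class Record:
    """A record that can own a set of members and belong to an owner."""

    def __init__(self, name):
        self.name = name
        self.owner = None    # upward link: member -> owner of its set
        self.members = []    # downward links: owner -> ordered members


def add_member(owner, member):
    """Link the two records in both directions, keeping insertion order."""
    owner.members.append(member)
    member.owner = owner


mens = Record("Men's section")
for name in ["Belts & Wallets", "Shoes", "Tools"]:
    add_member(mens, Record(name))

shoes = mens.members[1]
```

Here `shoes.owner` reaches the main section in one step, and `mens.members` lists the subsections in order, so neither direction requires a scan.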
Standalone data is useless unless it is placed in context, that is, unless the way it relates to other data is clearly defined. The first formal data model was the relational model. A landmark 1970 paper on this model by E. F. Codd made the theory behind it famous. Put simply, a relational database consists of tables. Each table describes the relation of the main subject of the table to important information. The columns of the table contain the types of information stored about the main subject. For example, the main subject may be EMPLOYEE and the columns may contain EMPLOYEE_ID, EMPLOYEE_DEPARTMENT, and EMPLOYEE_SALARY. Each column has a range of permissible values, called a domain. For example, EMPLOYEE_SALARY should be a number greater than zero. The rows of the table contain information on different employees.
Searching for an employee becomes easy: a user can match the employee's ID to find the entry in the table instead of running through the entire file looking for links or sets. How the relational database is represented on the computer's hard drive (physical storage) becomes unimportant compared to how users and database administrators view the table (the logical view). Many DBMS were designed around the relational model; one of these was INGRES.
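The EMPLOYEE table, its salary domain, and a lookup by ID can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the sample rows and the helper names `find_employee` and `try_insert` are assumptions for illustration, and SQLite stands in for whatever relational DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE employee (
        employee_id         INTEGER PRIMARY KEY,
        employee_department TEXT NOT NULL,
        -- The CHECK clause enforces the domain: salary must be > 0.
        employee_salary     REAL NOT NULL CHECK (employee_salary > 0)
    )
    """
)
# Made-up sample rows, one per employee.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Sales", 42000.0), (2, "Shipping", 38000.0)],
)


def find_employee(employee_id):
    """Match on the ID column; returns one row as a tuple, or None."""
    return conn.execute(
        "SELECT employee_id, employee_department, employee_salary "
        "FROM employee WHERE employee_id = ?",
        (employee_id,),
    ).fetchone()


def try_insert(row):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

The caller never says how the rows are laid out on disk; it only names the table and the column to match, which is the logical view at work.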
The ER, or Entity-Relationship, model arose in the mid-seventies. In many ways, it is simply a refinement of the relational model. The EMPLOYEE, according to this model, would form an entity. Other entities, such as SKILLS, SHIFT, and HOURS_WORKED, would be connected to EMPLOYEE by a series of well-defined relationships. For example, the relationship between EMPLOYEE and SKILLS would be many-to-many. That is, a given employee may have many skills, and a given skill may be possessed by many employees. However, the relationship between EMPLOYEE and SHIFT would be many-to-one; many employees may work in one shift, but a single employee cannot work many shifts (unless that employee is really strong or really poor). The ER model remains a standard today. Another standard that came to us from the 70s is the Structured Query Language, or SQL, a query language that is based on the relational model and can be used to efficiently retrieve and alter data in a relational database.
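One common way to realize these two kinds of relationship in SQL is a foreign key for many-to-one and a linking table for many-to-many. The sketch below uses Python's `sqlite3` module; the table layout, the sample names, and the helper functions are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shift (shift_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        -- Many-to-one: each employee row carries exactly one shift.
        shift_id    INTEGER REFERENCES shift(shift_id)
    );
    CREATE TABLE skill (skill_id INTEGER PRIMARY KEY, name TEXT);
    -- Many-to-many: one row per (employee, skill) pairing.
    CREATE TABLE employee_skill (
        employee_id INTEGER REFERENCES employee(employee_id),
        skill_id    INTEGER REFERENCES skill(skill_id),
        PRIMARY KEY (employee_id, skill_id)
    );
    INSERT INTO shift VALUES (1, 'Day'), (2, 'Night');
    INSERT INTO employee VALUES (1, 'Ann', 1), (2, 'Bob', 1);
    INSERT INTO skill VALUES (1, 'Welding'), (2, 'Forklift');
    INSERT INTO employee_skill VALUES (1, 1), (1, 2), (2, 2);
    """
)


def skills_of(employee_name):
    """All skills held by one employee, in alphabetical order."""
    rows = conn.execute(
        "SELECT s.name FROM skill s "
        "JOIN employee_skill es ON es.skill_id = s.skill_id "
        "JOIN employee e ON e.employee_id = es.employee_id "
        "WHERE e.name = ? ORDER BY s.name",
        (employee_name,),
    )
    return [name for (name,) in rows]


def employees_with(skill_name):
    """All employees holding one skill, in alphabetical order."""
    rows = conn.execute(
        "SELECT e.name FROM employee e "
        "JOIN employee_skill es ON es.employee_id = e.employee_id "
        "JOIN skill s ON s.skill_id = es.skill_id "
        "WHERE s.name = ? ORDER BY e.name",
        (skill_name,),
    )
    return [name for (name,) in rows]
```

In this made-up data, Ann holds two skills and the forklift skill is held by two employees, while both employees share the single day shift.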
Fig 1d: The ER Model
Database products became more and more complex and expensive in the 1990s. The rise of the internet and the World Wide Web moved DBMS one step forward. Everyone wanted their data on the web, and many products that did just this, such as database-web connectivity engines and associated software, became de rigueur. Databases (like eBay's, for example) could be viewed from the net in a way that made sense to users, and altered. Object-oriented DBMS also came into wider use in the 1990s. The graphical user interface that became famous through the Microsoft Windows platform made it easy for programmers as well as users to look at a database as a collection of objects that were parts of larger objects.
Fig 1e: An XML File
Today, we do not have to deal with hard-to-use commands or data files; databases are intuitively designed and easy to use thanks to innovative graphical user interfaces that allow the user to perform operations by simply pointing and clicking or following instructions from a wizard. The widespread use of XML (Extensible Markup Language) furthers the causes of object orientation and platform independence. XML allows data to be stored the same way regardless of DBMS, hardware, or operating system. An XML file looks the same on the Windows and Macintosh platforms. XML is touted to be the next revolution in DBMS. Amidst all these changes, storing and examining data remain important endeavors that will persist into the hazy future.
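To make the platform-independence point concrete, here is a small sketch that stores the sample shopping catalogue as XML text and reads it back with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`catalogue`, `section`, `subsection`, `name`) are invented for this example and do not come from any particular DBMS.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout for the catalogue; any platform with an XML
# parser can read the same text.
CATALOGUE_XML = """
<catalogue>
  <section name="Men's section">
    <subsection>Belts &amp; Wallets</subsection>
    <subsection>Shoes</subsection>
    <subsection>Tools</subsection>
  </section>
  <section name="Women's section">
    <subsection>Jewelry</subsection>
    <subsection>Accessories</subsection>
  </section>
  <section name="Kid's section">
    <subsection>Toys</subsection>
  </section>
</catalogue>
"""


def load_catalogue(xml_text):
    """Parse the XML and return {main section: [subsections, in order]}."""
    root = ET.fromstring(xml_text)
    return {
        section.get("name"): [sub.text for sub in section.findall("subsection")]
        for section in root.findall("section")
    }


catalogue = load_catalogue(CATALOGUE_XML)
```

The nesting of `subsection` inside `section` also shows how XML naturally expresses objects that are parts of larger objects.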