Table 1: An example of relation
Name, position and location are attributes under which there are
attribute values. A row of attribute values is called a tuple,
therefore, a relation is a set of tuples. And a relation is an ordered
set of attributes, i.e. . A relation is sometimes
represented as
. Since relations have very simple
expressions, mathematical discussion of which is also simple.
Even though relations are kind of tables, all tables are not
recognized as relations. The following two conditions must be met for
a table to be a relation.
How can we represent the real world as relations. First, collect attributes of the world and make a table like table1. This flat table representation is a data structure of the relational data model. However, this relation has redundancy, for example in combination of position and location. Furthermore, if you change a combination of name and position, information of position and location might be meaningless. Thus, we normalize the original table(first normal form) taking into account dependencies.
A functional dependency can be defined when a collection of attribute
values in relation
uniquely
decides another collection of attribute values
. In this case,
is functionally
dependent on
and expressed as
. Suppose
is an attribute value of
in a tuple r, the functional
dependency means as follows for arbitrary tuples
and
.
In table1, there are two functional dependencies, that are
nameposition and position
location.
If Yokota changes his position, the location of which have to be also
changed. Then the information that position O is in Minato ward will
get lost. That is why we devide table1 into table2.
Generally, we should devide a table into tables in which functional
dependencies are all on their keys. This type of relation is in third
normal form. The decomposition based on functional dependencies includes
second normal form which is different from third normal form in terms
of kind of functional dependencies. Third normal form not only
prevents above mentioned loss of information but also makes tables
simpler excluding repetitive pairs of position and location as you can
see in table2. In other words data redundancy decreases, so
you will have to update only one tuple if the location of position O
changes.
Table 2: Decomposition by functional dependencies
Normalization of relations are done not only by functional dependencies but by multivalued dependencies. Let's look at table 3. This relation consists of three attributes of name, hobby and club. It has no functional dependency.
Table 3: Relation including multivalued dependencies
If you devide table3 into table4, redundancy
decreases. Both hobby and club have plural values against name. This
property is called multivalued dependency. In this case the
multivalued dependencies are expressed as
namehobby and
name
club, which means that the
original relation can be decomposed into two relations of
and
. It is obvious that even if decomposed
relations are joined to a relation, no information loss occurs.
Table 4: Decomposition by multivalued dependencies
Dependencies other than functional and multivalued dependencies are considered. The most important thing here is that joined relations have the same amount of information as decomposed relations. Muximum relationship among decomposable attributes are called join dependency.
Next, let's think about data manipulations of relations. In the relational model there are relational algebra which centers on set operations and relational calculus which uses first-order predicate calculus.
Relational algebra includes the following five operations.
The above basic operations are combined to define the following operations.
Those mentiond operations including the first five basic ones and
the latter three composites are so-called relational algebra.
On the other hand, relational calculus is generaly a set of tuples
which satisfies a logical formula
where an
argument t is a tuple variable. It has been proved that all the
relational algebraic operations can be translated in relational
calculus and if the scope of tuple variables are specified,
they are equivalent.