**Related Content:** CS403 - VU Lectures, Handouts, PPT Slides, Assignments, Quizzes, Papers & Books of Database Management Systems

- Types of Joins
- Relational Calculus
- Normalization

In the previous lecture we have studied the basic operators of relational algebra along with different examples. From this lecture we will study the different types of joins, which are very important and are used extensively in relational calculus.

Join is a special form of cross product of two tables. It is a binary operation that allows combining certain selections and a Cartesian product into one operation. The join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes. Following are the different types of joins: -

- Theta Join
- Equi Join
- Semi Join
- Natural Join
- Outer Joins

We will now discuss them one by one

In theta join we apply the condition on input relation(s) and then only those selected rows are used in the cross product to be merged and included in the output. It means that in normal cross product all the rows of one relation are mapped/merged with all the rows of second relation, but here only selected rows of a relation are made cross product with second relation. It is denoted as under: -

**R X b S**

If R and S are two relations then b is the condition, which is applied for select operation on one relation and then only selected rows are cross product with all the rows of second relation. For Example there are two relations of FACULTY and COURSE, now we will first apply select operation on the FACULTY relation for selection certain specific rows then these rows will have across product with COURSE relation, so this is the difference in between cross product and theta join. We will now see first both the relation their different attributes and then finally the cross product after carrying out select operation on relation. From this example the difference in between cross product and theta join becomes clear.

FACULTY

facId |
facName |
dept |
salary |
rank |

F234 | Usman | CSE | 21000 | lecturer |

F235 | Tahir | CSE | 23000 | Asso Prof |

F236 | Ayesha | ENG | 27000 | Asso Prof |

F237 | Samad | ENG | 32000 | Professor |

COURSE

crCode |
crTitle |
fId |

C3456 | Database Systems | F234 |

C3457 | Financial Management | |

C3458 | Money & Capital Market | F236 |

C3459 | Introduction to Accounting | F237 |

facId |
facName |
dept |
salary |
rank |
crCode |
crTitle |
fId |

F235 | Tahir | CSE | 27000 | Asso Prof | C3456 | Database Systems | F234 |

F235 | Tahir | CSE | 27000 | Asso Prof | C3457 | Financial Management | |

F235 | Tahir | CSE | 27000 | Asso Prof | C3458 | Money & Capital Market | F236 |

F235 | Tahir | CSE | 27000 | Asso Prof | C3459 | Introduction to Accounting | F237 |

F236 | Ayesha | ENG | 27000 | Asso Prof | C3456 | Database Systems | F234 |

F236 | Ayesha | ENG | 27000 | Asso Prof | C3457 | Financial Management | |

F236 | Ayesha | ENG | 27000 | Asso Prof | C3458 | Money & Capital Market | F236 |

F236 | Ayesha | ENG | 27000 | Asso Prof | C3459 | Introduction to Accounting | F237 |

Fig. 1: Two tables with an example of theta join

In this example after fulfilling the select condition of Associate professor on faculty relation then it is cross product with course relation

This is the most used type of join. In equi–join rows are joined on the basis of values of a common attribute between the two relations. It means relations are joined on the basis of common attributes between them; which are meaningful. This means on the basis of primary key, which is a foreign key in another relation. Rows having the same value in the common attributes are joined. Common attributes appear twice in the output. It means that the attributes, which are common in both relations, appear twice, but only those rows, which are selected. Common attribute with the same name is qualified with the relation name in the output. It means that if primary and foreign

keys of two relations are having the same names and if we take the equi – join of both
then in the output relation the relation name will precede the attribute name. For Example, if we take the equi – join of faculty and course relations then the output would be as under: -

**FACULTY **FACULTY..facId=COURSE.fId COURSE

facId |
facName |
dept |
salary |
rank |
crCode |
crTitle |
fID |

F234 | Usman | CSE | 21000 | lecturer | C3456 | Database Systems | F234 |

F236 | Ayesha | ENG | 27000 | Asso Prof | C3458 | Money & Capital Market | F236 |

F237 | Samad | ENG | 32000 | Professor | C3459 | Introduction to Accounting | F237 |

F**ig. 2: Equi-join on tables of figure 1**

In the above example the name of common attribute between the two tables is different, that is, it is facId in FACULTY and fId in COURSE, so it is not required to qualify; however there is no harm doing it still. Now in this example after taking equi–join only those tuples are selected in the output whose values are common in both the relations.

Natural Join:

This is the most common and general form of join. If we simply say join, it means the natural join. It is same as equi–join but the difference is that in natural join, the common attribute appears only once. Now, it does not matter which common attribute should be part of the output relation as the values in both are same. For Example if we take the natural join of faculty and course the output would be as under: -

**FACULTY** facId, fId COURSE

facId |
facName |
dept |
salary |
rank |
crCode |
crTitle |

F234 | Usman | CSE | 21000 | Lecturer | C3456 | Database Systems |

F236 | Ayesha | ENG | 27000 | Asso Prof | C3458 | Money & Capital Marke |

F237 | Samad | ENG | 32000 | Professor | C3459 | Introduction to Accounting |

**Fig. 4: Natural join of FACULTY and COURSE tables of figure 1**

In this example the common attribute appears only once, rest the behavior is same. Following are the different types of natural join:-

In left outer join all the tuples of left relation remain part of the output. The tuples that have a matching tuple in the second relation do have the corresponding tuple from the second relation. However, for the tuples of the left relation, which do not have a matching record in the right tuple have Null values against the attributes of the right relation. The example is given in figure 5 below. It can be described in another way. Left outer join is the equi-join plus the non matching rows of the left side relation having Null against the attributes of right side relation

In right outer join all the tuples of right relation remain part of the output relation, whereas on the left side the tuples, which do not match with the right relation, are left as null. It means that right outer join will always have all the tuples of right relation and those tuples of left relation which are not matched are left as Null.

COURSE | ||

bkId |
bkTitile |
stId |

B10001 | Intro to Database Systems | S104 |

B10002 | Programming Fundamentals | S101 |

B10003 | Intro Data Structures | S101 |

B10004 | Modern Operating Systems | S103 |

B10005 | Computer Architecture | |

B10006 | Advanced Networks | S104 |

STUDENT | |

stId |
stName |

S101 | Ali Tahir |

S103 | Farah Hasan |

S104 | Farah Naz |

S106 | Asmat Dar |

S107 | Liaqat Ali |

COURSE left outer join STUDENT | ||||

bkId |
bkTitile |
BOOK.stId |
STUDENT.stId |
stName |

B10001 | Intro to Database Systems | S104 | S104 | Farah Naz |

B10002 | Programming Fundamentals |
S101 | S101 | Ali Tahir |

B10003 | Intro Data Structures | S101 | S101 | Ali Tahir |

B10004 | Modern Operating Systems |
S103 | S103 | Farah Hasan |

B10006 | Advanced Networks | S104 | S104 | Farah Naz |

B10005 | Computer Architecture | Null | Null | Null |

COURSE right outer join STUDENT | ||||

bkId |
bkTitile |
BOOK.stId |
STUDENT.stId |
stName |

B10001 | Intro to Database Systems | S104 | S104 | Farah Naz |

B10002 | Programming Fundamentals | S101 | S101 | Ali Tahir |

B10003 | Intro Data Structures | S101 | S101 | Ali Tahir |

B10004 | Modern Operating Systems | S103 | S103 | Farah Hasan |

B10006 | Advanced Networks | S104 | S105 | Farah Naz |

Null | Null | Null | S106 | Asmat Dar |

Null | Null | Null | S107 | Liaqat Ali |

Fig. 5: Input tables and left outer join and right outer join

In outer join all the tuples of left and right relations are part of the output. It means that all those tuples of left relation which are not matched with right relation are left as Null. Similarly all those tuples of right relation which are not matched with left relation are left as Null.

COURSE outer join STUDENT | ||||

bkId |
bkTitile |
BOOK.stId |
STUDENT.stId |
stName |

B10001 | Intro to Database Systems | S104 | S104 | Farah Naz |

B10002 | Programming Fundamentals | S101 | S101 | Ali Tahir |

B10003 | Intro Data Structures | S101 | S101 | Ali Tahir |

B10004 | Modern Operating Systems | S103 | S103 | Farah Hasan |

B10006 | Advanced Networks | S104 | S104 | Farah Naz |

B10005 | Computer Architecture | Null | Null | Null |

Null | Null | Null | S106 | Asmat Dar |

Null | Null | Null | S107 | Liaqat Ali |

Fig. 6: outer join operation on tables of figure 5

In semi join, first we take the natural join of two relations then we project the attributes of first table only. So after join and matching the common attribute of both relations only attributes of first relation are projected. For Example if we take the semi join of two relations faculty and course then the resulting relation would be as under:-

FACULTY/ COURSE

facId |
facName |
Dept |
Salary |
Rank |

F234 | Usman | CSE | 21000 | lecturer |

F236 | Ayesha | ENG | 27000 | Asso Prof |

F237 | Samad | ENG | 32000 | Professor |

Fig. 7: Semi-join operation on tables of figure 1

Now the resulting relation has attributes of first relation only after taking the natural join of both relations.

Relational Calculus is a nonprocedural formal relational data manipulation language in which the user simply specifies what data should be retrieved, but not how to retrieve it. It is an alternative standard for relational data manipulation languages. The relational calculus is not related to the familiar differential and integral calculus in mathematics, but takes its name from a branch of symbolic logic called the predicate calculus. It has two following forms: -

**Tuple Oriented Relational Calculus****Domain Oriented Relational Calculus**

In tuple oriented relational calculus we are interested primarily in finding relation tuples for which a predicate is true. To do so we need tuple variables. A tuple variable is a variable that takes on only the tuples of some relation or relations as its range of values. It actually corresponds to a mathematical domain. We specify the range of a tuple variable by a statement such as:

Here, S is the tuple variable and STUDENT is the range, so that S always represents a tuple of STUDENT. It is expressed as

{S | P (S)}

We will read it as find the set of all tuples S such that P(S) is true, where P implies the predicate condition now suppose range of R is STUDENT

{R | R.Credits > 50}

We will say like find the stuId, stuName, majors etc of all students having more than 50 credits.

There are four types of anomalies, which are of concern, redundancy, insertion, deletion and updation. Normalization is not compulsory, but it is strongly recommended that normalization must be done. Because normalized design makes the maintenance of database much easier. While carrying out the process of normalization, it should be applied on each table of database. It is performed after the logical database design. This process is also being followed informally during conceptual database design as well.

There are different forms or levels of normalization. They are called as first, second and so on. Each normalized form has certain requirements or conditions, which must be fulfilled. If a table or relation fulfills any particular form then it is said to be in that normal form. The process is applied on each relation of the database. The minimum form in which all the tables are in is called the normal form of entire database. The main objective of normalization is to place the database in highest form of normalization.

In this lecture we have studied the different types of joins, with the help of which we can join different tables. We can get different types of outputs from joins. Then we studied relational calculus in which we briefly touched upon tuple and domain oriented relational calculus. Lastly we started the process of normalization which is a very important topic and we will discuss in detail this topic in the coming lectures.

Draw two tables of PROJECT and EMPLOYEE along with different attribute, include a common attribute between the two to implement the PK/FK relationship and populate both the tables. Then apply all types of joins and observe the difference in the output relations