CS403 - Database Management Systems - Lecture Handout 18

User Rating:  / 1
PoorBest 

Related Content: CS403 - VU Lectures, Handouts, PPT Slides, Assignments, Quizzes, Papers & Books of Database Management Systems

Overview of Lecture:

  • Types of Joins
  • Relational Calculus
  • Normalization

In the previous lecture we have studied the basic operators of relational algebra along with different examples. From this lecture we will study the different types of joins, which are very important and are used extensively in relational calculus.

Types of Joins

Join is a special form of cross product of two tables. It is a binary operation that allows combining certain selections and a Cartesian product into one operation. The join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes. Following are the different types of joins: -

  1. Theta Join
  2. Equi Join
  3. Semi Join
  4. Natural Join
  5. Outer Joins

We will now discuss them one by one

Theta Join:

In theta join we apply the condition on input relation(s) and then only those selected rows are used in the cross product to be merged and included in the output. It means that in normal cross product all the rows of one relation are mapped/merged with all the rows of second relation, but here only selected rows of a relation are made cross product with second relation. It is denoted as under: -

R X b S

If R and S are two relations then b is the condition, which is applied for select operation on one relation and then only selected rows are cross product with all the rows of second relation. For Example there are two relations of FACULTY and COURSE, now we will first apply select operation on the FACULTY relation for selection certain specific rows then these rows will have across product with COURSE relation, so this is the difference in between cross product and theta join. We will now see first both the relation their different attributes and then finally the cross product after carrying out select operation on relation. From this example the difference in between cross product and theta join becomes clear.

FACULTY

facId facName dept salary rank
F234 Usman CSE 21000 lecturer
F235 Tahir CSE 23000 Asso Prof
F236 Ayesha ENG 27000 Asso Prof
F237 Samad ENG 32000 Professor

COURSE

crCode crTitle fId
C3456 Database Systems F234
C3457 Financial Management  
C3458 Money & Capital Market F236
C3459 Introduction to Accounting F237

(A rank = ‘Asso Prof’(FACULTY)) X COURSE

facId facName dept salary rank crCode crTitle fId
F235 Tahir CSE 27000 Asso Prof C3456 Database Systems F234
F235 Tahir CSE 27000 Asso Prof C3457 Financial Management  
F235 Tahir CSE 27000 Asso Prof C3458 Money & Capital Market F236
F235 Tahir CSE 27000 Asso Prof C3459 Introduction to Accounting F237
F236 Ayesha ENG 27000 Asso Prof C3456 Database Systems F234
F236 Ayesha ENG 27000 Asso Prof C3457 Financial Management  
F236 Ayesha ENG 27000 Asso Prof C3458 Money & Capital Market F236
F236 Ayesha ENG 27000 Asso Prof C3459 Introduction to Accounting F237

Fig. 1: Two tables with an example of theta join

In this example after fulfilling the select condition of Associate professor on faculty relation then it is cross product with course relation

Equi–Join:

This is the most used type of join. In equi–join rows are joined on the basis of values of a common attribute between the two relations. It means relations are joined on the basis of common attributes between them; which are meaningful. This means on the basis of primary key, which is a foreign key in another relation. Rows having the same value in the common attributes are joined. Common attributes appear twice in the output. It means that the attributes, which are common in both relations, appear twice, but only those rows, which are selected. Common attribute with the same name is qualified with the relation name in the output. It means that if primary and foreign
keys of two relations are having the same names and if we take the equi – join of both then in the output relation the relation name will precede the attribute name. For Example, if we take the equi – join of faculty and course relations then the output would be as under: -

FACULTY FACULTY..facId=COURSE.fId COURSE

facId facName dept salary rank crCode crTitle fID
F234 Usman CSE 21000 lecturer C3456 Database Systems F234
F236 Ayesha ENG 27000 Asso Prof C3458 Money & Capital Market F236
F237 Samad ENG 32000 Professor C3459 Introduction to Accounting F237

Fig. 2: Equi-join on tables of figure 1

In the above example the name of common attribute between the two tables is different, that is, it is facId in FACULTY and fId in COURSE, so it is not required to qualify; however there is no harm doing it still. Now in this example after taking equi–join only those tuples are selected in the output whose values are common in both the relations.

Natural Join:

This is the most common and general form of join. If we simply say join, it means the natural join. It is same as equi–join but the difference is that in natural join, the common attribute appears only once. Now, it does not matter which common attribute should be part of the output relation as the values in both are same. For Example if we take the natural join of faculty and course the output would be as under: -

FACULTY facId, fId COURSE

facId facName dept salary rank crCode crTitle
F234 Usman CSE 21000 Lecturer C3456 Database Systems
F236 Ayesha ENG 27000 Asso Prof C3458 Money & Capital Marke
F237 Samad ENG 32000 Professor C3459 Introduction to Accounting

Fig. 4: Natural join of FACULTY and COURSE tables of figure 1

In this example the common attribute appears only once, rest the behavior is same. Following are the different types of natural join:-

Left Outer Join:

In left outer join all the tuples of left relation remain part of the output. The tuples that have a matching tuple in the second relation do have the corresponding tuple from the second relation. However, for the tuples of the left relation, which do not have a matching record in the right tuple have Null values against the attributes of the right relation. The example is given in figure 5 below. It can be described in another way. Left outer join is the equi-join plus the non matching rows of the left side relation having Null against the attributes of right side relation

Right Outer Join:

In right outer join all the tuples of right relation remain part of the output relation, whereas on the left side the tuples, which do not match with the right relation, are left as null. It means that right outer join will always have all the tuples of right relation and those tuples of left relation which are not matched are left as Null.

COURSE    
bkId bkTitile stId
B10001 Intro to Database Systems S104
B10002 Programming Fundamentals S101
B10003 Intro Data Structures S101
B10004 Modern Operating Systems S103
B10005 Computer Architecture  
B10006 Advanced Networks S104

 

STUDENT  
stId stName
S101 Ali Tahir
S103 Farah Hasan
S104 Farah Naz
S106 Asmat Dar
S107 Liaqat Ali
   

 

COURSE left outer join STUDENT
bkId bkTitile BOOK.stId STUDENT.stId stName
B10001 Intro to Database Systems S104 S104 Farah Naz
B10002 Programming
Fundamentals
S101 S101 Ali Tahir
B10003 Intro Data Structures S101 S101 Ali Tahir
B10004 Modern Operating
Systems
S103 S103 Farah Hasan
B10006 Advanced Networks S104 S104 Farah Naz
B10005 Computer Architecture Null Null Null

 

COURSE right outer join STUDENT        
bkId bkTitile BOOK.stId STUDENT.stId stName
B10001 Intro to Database Systems S104 S104 Farah Naz
B10002 Programming Fundamentals S101 S101 Ali Tahir
B10003 Intro Data Structures S101 S101 Ali Tahir
B10004 Modern Operating Systems S103 S103 Farah Hasan
B10006 Advanced Networks S104 S105 Farah Naz
Null Null Null S106 Asmat Dar
Null Null Null S107 Liaqat Ali

Fig. 5: Input tables and left outer join and right outer join

Outer Join:

In outer join all the tuples of left and right relations are part of the output. It means that all those tuples of left relation which are not matched with right relation are left as Null. Similarly all those tuples of right relation which are not matched with left relation are left as Null.

COURSE outer join STUDENT
bkId bkTitile BOOK.stId STUDENT.stId stName
B10001 Intro to Database Systems S104 S104 Farah Naz
B10002 Programming Fundamentals S101 S101 Ali Tahir
B10003 Intro Data Structures S101 S101 Ali Tahir
B10004 Modern Operating Systems S103 S103 Farah Hasan
B10006 Advanced Networks S104 S104 Farah Naz
B10005 Computer Architecture Null Null Null
Null Null Null S106 Asmat Dar
Null Null Null S107 Liaqat Ali

Fig. 6: outer join operation on tables of figure 5

Semi Join:

In semi join, first we take the natural join of two relations then we project the attributes of first table only. So after join and matching the common attribute of both relations only attributes of first relation are projected. For Example if we take the semi join of two relations faculty and course then the resulting relation would be as under:-

FACULTY/ COURSE

facId facName Dept Salary Rank
F234 Usman CSE 21000 lecturer
F236 Ayesha ENG 27000 Asso Prof
F237 Samad ENG 32000 Professor

Fig. 7: Semi-join operation on tables of figure 1

Now the resulting relation has attributes of first relation only after taking the natural join of both relations.

Relational Calculus

Relational Calculus is a nonprocedural formal relational data manipulation language in which the user simply specifies what data should be retrieved, but not how to retrieve it. It is an alternative standard for relational data manipulation languages. The relational calculus is not related to the familiar differential and integral calculus in mathematics, but takes its name from a branch of symbolic logic called the predicate calculus. It has two following forms: -

  • Tuple Oriented Relational Calculus
  • Domain Oriented Relational Calculus

Tuple Oriented Relational Calculus:

In tuple oriented relational calculus we are interested primarily in finding relation tuples for which a predicate is true. To do so we need tuple variables. A tuple variable is a variable that takes on only the tuples of some relation or relations as its range of values. It actually corresponds to a mathematical domain. We specify the range of a tuple variable by a statement such as:

RANGE OF S IS STUDENT

Here, S is the tuple variable and STUDENT is the range, so that S always represents a tuple of STUDENT. It is expressed as

{S | P (S)}

We will read it as find the set of all tuples S such that P(S) is true, where P implies the predicate condition now suppose range of R is STUDENT

{R | R.Credits > 50}

We will say like find the stuId, stuName, majors etc of all students having more than 50 credits.

Domain Oriented Relational Calculus:

Normalization

There are four types of anomalies, which are of concern, redundancy, insertion, deletion and updation. Normalization is not compulsory, but it is strongly recommended that normalization must be done. Because normalized design makes the maintenance of database much easier. While carrying out the process of normalization, it should be applied on each table of database. It is performed after the logical database design. This process is also being followed informally during conceptual database design as well.

Normalization Process

There are different forms or levels of normalization. They are called as first, second and so on. Each normalized form has certain requirements or conditions, which must be fulfilled. If a table or relation fulfills any particular form then it is said to be in that normal form. The process is applied on each relation of the database. The minimum form in which all the tables are in is called the normal form of entire database. The main objective of normalization is to place the database in highest form of normalization.

Summary

In this lecture we have studied the different types of joins, with the help of which we can join different tables. We can get different types of outputs from joins. Then we studied relational calculus in which we briefly touched upon tuple and domain oriented relational calculus. Lastly we started the process of normalization which is a very important topic and we will discuss in detail this topic in the coming lectures.

Exercise:

Draw two tables of PROJECT and EMPLOYEE along with different attribute, include a common attribute between the two to implement the PK/FK relationship and populate both the tables. Then apply all types of joins and observe the difference in the output relations