Logical database design – the process of transforming the conceptual data model into a logical data model

Relational data model (Codd 1970) – represents data in the form of tables

3 components:

  1. data structure – a relation/a two-dimensional table
  2. data manipulation – SQL
  3. data integrity – constraints

Relational model

File processing

User view

Relation

File

Table

Tuple

Record

Row

Attribute

Field

Column

Key

Index

Identifier

Notation

RELATION(Key, Attribute_1, Foreign_Key, Attribute_2, …, Attribute_n)

key – a minimal set of attributes that uniquely identifies each row in a relation

  1. composite key – a key consisting of more than one attribute
  2. candidate key – an attribute/group of attributes that identify a unique row in a relation
  3. primary key – the candidate key designated for principal use in uniquely identifying rows in a relation
  4. foreign key – a set of attributes in one relation that constitutes a key in some other relation

Properties of relations

  1. Entries must be atomic/single-valued
  2. Entries must be from the same domain (domain constraint)
  3. No two rows may be identical
  4. The sequence of columns (left to right) is insignificant
  5. The sequence of rows (top to bottom) is insignificant
  6. Each relation has a unique name
  7. Each attribute has a unique name

Integrity constraints

  1. Domain
  2. Entity integrity – every relation has a non-null primary key
  3. Referential integrity – the value of a non-null foreign key must be an actual key value in some relation
  4. Operational – business rule(s)

E-R models à relational models
 

Elements

E-R model

Relational model

Entity

Entity type

A relation

 

Simple attribute

An attribute of the relation

 

Composite attribute

(Fig. 5-9)

Each component attributes becomes an attribute in the relation

 

Multivalued attribute

(Fig. 5-10)

2 relations are created:

  • The 1st relation contains all the attributes except the multivalued attribute 
  • The 2nd relation contains the multivalued attribute and the primary key of the 1st relation

 

Weak entity type

The primary key of the "owner" relation is included as its foreign key

Relationship

Binary 1:1

(Fig. 5-14)

  • One relation is created for each entity type 
  • The key of either one of the relations is placed as the foreign key in the other relation

 

Binary 1:M

(Fig. 5-12)

  • One relation is created for each entity type 
  • The key of the relation on the "one" side of the relationship (parent) is placed as the foreign key in the relation representing the "many" side of the relationship (child)

 

Binary M:N

(Fig. 5-13)

or n-ary, n>2

  • One relation is created for each entity type 
  • Create an intersection/association relation to represent the relationship itself

 

Unary 1:1 or 1:M

(Fig. 5-17)

A recursive foreign key is placed in the relation

 

Unary M:N 

(Fig. 5-18)

2 relations are created:

  • The 1st relation represents the entity type 
  • The 2nd relation contains the non key attribute of the relationship and two attributes that take their values from the primary key of the 1st relation

 

Subtype/supertype

(Fig. 5-20)

  • One relation is created for the supertype 
  • One relation is created for each subtype 
  • The key of the generic/supertype relation is placed as the key in the specialized/subtype relation

Normalization – the process of converting (through loseless decomposition) a relation to a standard form to avoid problems of data redundancy, data inconsistency, update anomaly, deletion anomaly, and insertion anomaly

Summary of normal forms
 

Form

Conditions

What to look for?

1NF

Entries are atomic

Multivalued attribute(s) or Repeating group(s)

2NF

All nonkey attributes are dependent on the whole key

Composite key

A+B à C, D

Aà C

3NF

All nonkey attributes are dependent on nothing but the key

à C, D

Cà D

BCNF

Every determinant is a candidate key

A+Bà C, D

à B

4NF

A relation does not contain two or more independent multi-valued facts about an entity

Ternary relationship

àà B

àà C

5NF/

PJNF

Every join dependency in a relation is implied by its candidate key

Ternary relationship that is 3-decomposable

ABC=AB join BC join CA

Functional dependencies – the value of an attribute in a row determines the value of another attribute of that row. For any relation R, attribute B is functionally dependent on attribute A if for every valid instance of A, that value of A uniquely determines the value of B.

                        ®B

                        ­

Determinant

A key is a determinant that functionally determine the entire row

Undesirable dependencies

Partial dependency – certain nonkey attributes are dependent on part of the composite key

A+B® C

A® C

Transitive dependency – certain nonkey attributes are dependent on other nonkey attributes

A® C, D

® D

If every determinant is a key then there is no transitive dependency (Boyce-Codd NF)

"® B, A is a key

Multivalued dependency – when there are at least three attributes (e.g., A, B, and C) in a relation, and for each value of A there is a well-defined set of values for B and a well-defined set of values for C, but the set of values of B is independent of set C.

® ® B

® ® C

Join dependency – when a relation can be divided into two (or more) relations (i.e., n-decomposable, n>2) such that the resulting relations can be recombined to form the original relation (lossless join)

Natural join -- a join of a relation A having attribute A1 with relation B having attribute B1 where A1=B1. The join relation, C, contains either column A1 or B1 but not both.

Equijoin -- a process of joining relation A containing attribute A1 with B containing attribute B1 to form relation C, so that for each row in C, A1=B1 and both A1 and B1 are represented in C.

Projection -- a relational algebra operation performed on a relation, A, that results in a relation, b, where B has a subset of attributes of A.
 
 

Converting a relation to BCNF:

STUDENT-MAJOR-ADVISOR
 

STUDENT#

MAJOR

ADVISOR

111

Accounting

Smith

111

IS

Johnson

222

Accounting

Smith

333

IS

White

STUDENT# + MAJOR ® ADVISOR

ADVISOR ® MAJOR

ADVISOR is a determinant but not a candidate key for STUDENT-MAJOR-ADVISOR relation
 
 

STUDENT-ADVISOR
 

STUDENT#

ADVISOR

111

Smith

111

Johnson

222

Smith

333

White

ADVISOR-MAJOR
 

ADVISOR

MAJOR

Smith

Accounting

Johnson

IS

White

IS


 
 
 

Converting a relation to 4NF:

STUDENT-MAJOR-ACTIVITY
 

STUDENT#

MAJOR

ACTIVITY

111

Accounting

IS

Swimming

AISP

333

IS

Football

Reading

STUDENT# ®® MAJOR

STUDENT# ®® ACTIVITY
 
 

STUDENT-MAJOR
 

STUDENT#

MAJOR

111

Accounting

111

IS

333

IS


 

STUDENT-ACTIVITY
 

STUDENT#

ACTIVITY

111

Swimming

111

AISP

333

Football

333

Reading

5NF/PJNF (Projection-Join Normal Form)

A relation R is in 5NF if and only if every join dependency in R is implied by the candidate keys of R

e.g., STUDENT(StudentID, Name, Major, Address)

StudentID à Name, Major, Address

STUDENT = (StudentID, Name) Join (StudentID, Major) Join (StudentID, Address)
 
 

e.g.,      m:m:m ternary relationship in SUPPLIER-PART-JOB

If a supplier supplies a certain part

And the part is used in a certain job,

And the job is supplied by that supplier,

Then the supplier supplies that part for that job.

SPJ(S#, P#, J#)
 

S#

P#

J#

S1

P1

J2

S1

P2

J1

If (S2, P1, J1) is added, (S1, P1, J1) must also be added but not the converse.
 
 

SPJ(S#, P#, J#)
 

S#

P#

J#

S1

P1

J2

S1

P2

J1

S2

P1

J1

S1

P1

J1

But if (S1, P1, J1) is deleted, (S2, P1, J1) must also be deleted but not the converse.

SP
 

S#

P#

S1

P1

S1

P2

S2

P1

PJ
 

P#

J#

P1

J2

P2

J1

P1

J1

JS
 

J#

S#

J2

S1

J1

S1

J1

S2


 

Join SP, PJ over P#                                                                 
 

E#

P#

J#

Join JS over J#, S#

----------------------8

S#

P#

J#

S1

P1

J2

S1

P1

J2

S1

P1

J1

S1

P1

J1

S1

P2

J1

S1

P2

J1

S2

P1

J2

S2

P1

J1

S2

P1

J1

Original SPJ


 

View integration -- merging relations

1.      Synonyms -- different names but the same meaning

2.      Homonyms -- more than one meaning

3.      Undesirable dependencies

4.      Subtype/supertype relationships
 
 

An example:

STUDENT(SID, Student_Name, Address)

STUDENT_MAJOR(SID, Major)

STUDENT_ADVISOR(Student#, Advisor)

ADVISOR_MAJOR(Advisor, Major)

UNDERGRAD_STUDENT(SID, Student_Name, SAT)