Conceptual data modeling -- capturing the meaning of data from the
viewpoint of the user
Concept
|
Definition
|
Model
|
A set of concepts to describe the structure of and operations on a
database
|
Schema
|
A language or diagrammatic conventions that can be used to provide the
modeling construct
|
Instance
|
A description of reality at a given point in time
|
Data modeling tools
- Entity-relationship diagram
(1976 by Peter Chen)
- Object-oriented diagram
Basic constructs of the E-R model
Concept
|
Definition
|
Examples
|
Entity
|
A person, place, object, event or concept
|
Employee, department, building, sale, account
|
Relationship
|
An entity that serves to interconnect two or more entity types
|
Assignment (Employee-Department)
|
Attribute
|
A property or characteristic of an entity/relationship type
|
Employee_name, department_location,
sale_date
|
Constraints
|
Guiding policies or rules that defines or restricts the structure and
processing of a database
|
All business majors must have a GPA of 2.9 or above
|
Entity
- Entity type/set/class - a
collection of entities that share common properties, Ei,
not mutually disjoint, i.e., e Î E1,
and e Î E2
- Entity instance - a single
occurrence of an entity type, e
- Strong entity type - an
entity that exists independently of other entity types
- Weak/ID-dependent entity
type - an entity type whose existence depends on some other entity type,
e.g., BUILDING-ROOM
Relationship
- Relationship type -
association among entity types (Fig. 3-10a)
- Relationship instance -
association among entity instances (Fig. 3-10b)
- Identifying relationship -
the relationship between a weak entity type and its owner (Fig. 3-5)
- Gerund/Associative
entity - a relationship that is represented as an entity type (Fig. 3-11b)
- Degree of a relationship -
the number of entity types that participate in a relationship
- Cardinality constraint -
specifies the number of instances of one entity that can be associated
with each instance of another entity
- Subtype/Supertype
relationship - applying specialization/generalization technique to capture
the subset/superset relationship between two entity types (Fig. 4-1)
4 conditions for a gerund
- Related to the participating
entity types in a "many" relationship
- Identified with a single-attribute
identifier
- Has one or more attributes
- Participates in one or more
relationships beyond the associated relationship
Degree of a relationship
- Unary/recursive - a
relationship between the instances of a single entity type, e.g., Student-Room_With (Fig. 3-12a)
- Binary - a relationship
between the instances of two entity types, e.g,
Faculty-Teach-Course (Fig. 3-12b)
- Ternary - a relationship
among the instances of three entity types, e.g.,
FACULTY-Offer-COURSE-Offer-DEPARTMENT
Cardinality constraints
- one-to-one, e.g, STUDENT-Assign-PARKING
- one-to-many, e.g.,
DEPARTMENT-Offer-COURSE
- many-to-many, e.g.,
STUDENT-register-COURSE
Attribute
- Simple/atomic attribute - an
attribute that cannot be broken down into component parts e.g., Student_ID
- Composite attribute - an
attribute that can be broken down into component parts, e.g., Student_Name (First, Middle, Last)
- Single-valued attribute - an
attribute that can take on only one value for a given entity instance, e.g, Student_DateOfBirth
- Multi-valued attribute - an
attribute that may take on more than one value for a given entity
instance, e.g., Student_Major
- Base attribute - an attribute
whose values are stored in the database, e.g., Student_Phone
- Derived attribute - an
attribute whose values can be calculated from related attribute values,
e.g., Student_GPA
- Identifier - an attribute
(Simple) or combination of attributes (Composite) that uniquely identifies
individual entity instance, e.g., Student_ID
Criteria for selecting identifiers
- Permanent
- Non-null
- Non-derived
- Simple
Constraints
Business rules (Fig. 4-11)
- Structural - domain
constraint, e.g., AGE(e) Î (20,
65) where e Î EMPLOYEE
- Operational -
procedural/declarative
Subtype/supertype rules
- Completeness: total/partial
specialization rule
- Disjointness;
disjoint/overlap rule
Basic E-R notation (Fig. 3-2)
Concept
|
Symbol
|
Entity
|
Rectangle
|
Weak entity
|
Double rectangle
|
Associative entity
|
Diamond within a rectangle
|
Relationship
|
Diamond
|
Identifying relationship
|
Double diamond
|
Cardinality
|
Crow's foot
|
Mandatory cardinality
|
Solid "|"(s) superimposed on the relationship line
|
Optional cardinality
|
A "0" superimpose on the relationship line
|
Subtype/Supertype
|
Circle
|
Direction of subtype/supertype
|
Open-end of the "U" points towards a supertype
|
Total specialization
|
Double line extending from a supertype
|
Disjoint rule
|
A "d" in the circle joining the supertype
& its subtypes
|
Overlap rule
|
An "o" in the circle joining the supertype
& its subtypes
|
Attribute
|
Ellipse
|
Multi-valued attribute
|
Double ellipse
|
Identifier
|
Underlined
|
Derived attribute
|
Broken ellipse
|
4 steps in designing a conceptual data model using the E-R diagram
- Identify entity sets
- Define the value sets,
attributes and primary key for each entity set
- Identify relationship sets
and semantic information (cardinality, subtype/supertype)
for each relationship set
- Integrate multiple views of
entities, attributes, and relationships
Guidelines for identifying entities & attributes
- Entities have descriptive
information; identifying attributes do not
- Multivalued
attributes should be classified as entities
- If a descriptor in one
entity has a many-to-one relationship with another entity, the descriptor
should be classified as an entity
- Attach attributes to
entities that they describe most directly
- Avoid composite identifiers
as much as possible
- If a generalization or
subset hierarchy among entities is detected, reattach attributes to the
relevant entities
Guidelines for defining relationships
- Eliminate redundant
relationships
- A ternary relationship is
defined only when the association cannot be represented by several binary
relationships among those entities
Which of the following contains a redundant relationship?
- An employee, who lives in a
city, belongs to a professional association that can be located in many
cities
- An employee work on many
projects that are located in various cities in
which an employee works.
- A student belongs to many
clubs located in a school where the student attends.
Note: Locate is a redundant relationship
Which of the following should be defined as a ternary relationship?
- Each student can be
involved in many projects and can work under the instruction of several
teachers for any of these projects, and each teacher can instruct many
students on any project.
- Each student can be
involved in several projects and work under the instruction of several
teachers, but the student must work under the instruction of exactly one
teacher on one project.
- Each employee can be
working on any of several projects and using the same notebooks on each
project
- Each employee can be
working on any of several projects but use exactly one notebook for each
project and that notebook belongs only to one employee
Evaluating a conceptual data model
- Flexibility/extensibility -- the ease with which a model can be
adapted to changing requirements
- Expressiveness -- the ability to bring out the different
abstractions and relationships in a natural way without the need for
further explanation
- Simplicity -- easy to use and understand
- Formality -- standardized to provide unique interpretation
- Self-explanatory -- no additional annotations
- Clarity -- no guesswork or ambiguity
- Completeness -- all relevant features of the application domain are
represented
- Correctness -- syntactic (concepts are properly defined) &
semantic (concepts e.g., entities, relationships, are used according to
their definition)
- Minimality -- no concept can be deleted from the
schema without losing some information
- Readability -- aesthetically presented (symmetry, minimal crossing
& bends, etc.)
Example of an E-R diagram not meeting the expressiveness, clarity and
readability criteria:
Example of an E-R diagram not meeting the clarity and readability
criteria: