Data administration -- a high-level function that is responsible for the overall management of data resources in an organization, including:

Database administration -- a hands-on, physical involvement with the management of a database including:


Data security

Protection of data against accidental or intentional loss, destruction, or misuse

-- increased difficulty due to Internet access and client/server technologies

 

Threats

  1. Accidental losses
  2. Theft & fraud
  3. Loss of privacy/confidentiality
  4. Loss of data integrity
  5. Loss of availability

Internet Security Measures

Layer

Measures

System/Hardware

       Router

       Firewall

       Intrusion detection system (Figure 10-9)

Network

       Separate the Web server from the database server and other business/internal systems

       Minimize sharing of files among networked servers

       Monitor network logs

       Patch any discovered vulnerabilities

Operating system

       Patch any vulnerabilities

       Install anti-virus software

Web server

       Restrict the number of users on the Web server

       Restrict access to the Web server

       Restrict all CGI scripts to one subdirectory

Data Security Measures

  1. View restriction
  2. User identification
  3. Access restriction
  4. Encryption
  5. Authentication
  6. Backup

Understanding MS Access Security

Database recovery (Table 12-1)

Restoring a database quickly and accurately after loss or damage

Type of failure

Recovery measures

Aborted transaction

  • Rollback
  • Rollforward/rerun transactions to state just prior to abort

Incorrect data

  • Rollback
  • Error correction
  • Restart

Systems failure

  • Switch
  • Rollback
  • Restart

Database destruction

  • Switch
  • Rollforward
  • Restart

 

Recovery measures

  1. Backup
  2. Journalizing
  1. Checkpoint

Recovery techniques

Technique

Purpose

How

Rollback

undo unwanted changes to the database

Before image + new à original

Rollforward

Move the database forward to a later state

After image + original à new

Restore/reun

Reprocess the day's transactions

Backup + transaction log à new

Switch

Use a mirrored database

Duplicated copy

Database integrity

Ensure that a database is accurate and consistent

Problems

  1. Lost updates -- one user's update is lost due to interference between another user's transaction
  2. Inconsistent read -- one user reads data that have been partially updated by another user

Measures

  1. Locking -- any data that are retrieved by a user for updating must be locked, or denied to other users, until the update is completed or aborted
  2. Versioning -- each transaction is restricted to a view of the database as of the time that transaction started and timstamped. Earlier transaction is given priority in updating the database.

Locking level

  1. Database
  2. Table
  3. Page
  4. Record
  5. Field

Types of lock

  1. Shared
  2. Exclusive

Deadlock management

  1. Deadlock prevention
  2. Deadlock resolution

Example of concurrency processing

Request 1         withdraw $50 from savings

Request 2         withdraw $100 from savings

Request 1

Request 2

Action 1           obtain savings balance

Action 1           obtain savings balance

Action 2           withdraw $50

Action 2           withdraw $100

Action 3           update new savings balance

Action 3           update new savings balance

Locking policy:

Request 1

Request 2

Action 1           request lock for savings records

Action 1 request lock for savings records

Action 2 lock savings records

Action 2           wait for record to be unlocked

Action 3           subtract $50

 

Action 4           write new savings balance

 

Action 5 unlock savings records

Action 3 lock savings records

 

Action 4 subtract $100

 

Action 5 write new savings balance

 

Action 6 unlock savings records

** delay changes to the database until the transaction is committed **

Example of a deadlock situation

Request 1         transfer $50 from savings to checking

Request 2         transfer $100 from checking to savings

Request 1

Request 2

Action 1           request lock for savings records

Action 1           request lock for checking records

Action 2 lock savings records

Action 2           lock checking records

Action 3           request lock for checking records

Action 3           request lock for savings records

Action 4           wait for checking to be unlock

Action 4 wait for savings to be unlock

 

Deadlock prevention protocol:

Every transaction lock all the data items it needs in advance.

None of the items is locked if any of the data items cannot be obtained.

Request 1

Request 2

Action 1           request lock for savings records

Action 1           request lock for checking records

Action 2 lock savings records

 

Action 3           request lock for checking records

 

Action 4           lock checking records

 

Action 5 subtract $50 from savings

 

Action 6 unlock savings records

 

Action 7 add $50 to checking

 

Action 8 unlock checking records

Action 2           lock checking records

 

Action 3           request lock for savings records

 

Action 4 lock savings records

 

Action 5 subtract $100 from checking

 

Action 6 unlock checking records

 

Action 7 add $100 to savings

 

Action 8 unlock savings records

Criteria of useful data (Data Quality)

Information gap -- drowning in data & starving for information

Operation processing -- capture, store, manipulate data for daily operations

Information processing -- analyze summarized or other forms of data for decision support

Data Warehouse

An integrated, subject-oriented, time-variant, non-volatile database that provides support for decision making. It represents the entire history of the company.

  1. Integrated à data from the entire organization conform to a common definition and representation
  2. Subject-oriented à data organized & summarized by topics for easy query
  3. Time-variant à data has a time component which is used to generate time-dependent aggregates
  4. Non-volatile à data are never deleted

Purpose -- to consolidate and integrate data from a variety of sources and to format those data in a context for making accurate business decisions

Data warehouse architecture (Figure 11-2)

Layer

Function

Important consideration

Data acquisition

(Operational Data)

Gather, refine, cleanse, aggregate data from existing systems

STANDARDIZATION

  • Data source
  • Data restructuring (time attribute)

Data storage (Reconciled Data)

Provide a single logical store of point-in-time data (Enterprise Data Warehouse)

FLEXIBILITY

  • Data are stored at its most granular level
  • Minimal denormalization

Data delivery (Derived Data)

Provide information to end-users (Data Mart)

USER INTERACTION

  • Presentation tool
  • Analytical tool

Data Mart

A small, single-subject data warehouse subset that provides decision support to small groups within the organization.

Analytical tools

  1. Traditional query & reporting tools, e.g., spreadsheet, database, etc.
  2. On-line analytical processing (OLAP) tools -- multidimensional data analysis
  3. Data-mining tools -- statistics, AI, computer graphics
  4. Data visualization tools -- graphical & multimedia representation

OLAP

Data-mining -- look for patterns or trends in a collection of facts to:

Pg.440 Field Exercises #2

Techniques (Table 11-4)

  1. Case-based reasoning
  2. Rule discovery (Induction)
  3. Cluster analysis
  4. Neural nets
  5. Genetic algorithms

New challenges to standard relational technology

  1. Size of the databases: growing
  2. Composition of the databases:  graphics, video, sound, …
  3. User interfaces:  graphical
  4. Modeling techniques:  more realistic representation

 

Object-oriented concepts

Object – an abstract representation of a real-world entity that has a unique identity (ID), embedded properties (attributes), and the ability to interact with other objects and itself (methods)

·        encapsulation

·        dynamic binding

 

Approaches to adding OO features to DBMS:

Approach

Description

Example

Comments

Large objects

·        Complex data is stored as binary/text field separately from other data in a table

·        Software external to the DBMS (ActiveX, Java applets, web browser plugins) displays and manipulates complex data

Access

DBMS does not understand the structure and operations of complex data à poor performance, no query on complex data

Media servers

·        Complex data resides on its own database

·        A media server is dedicated to manipulating complex data

·        Software contains a mix of SQL and API calls to the media server to retrieve, update, and transform complex data

Oracle 7.3 with text and spatial data servers

No query on complex data

Object database middleware

·        Same as above except an object middleware layer is used to relieve the user of knowing a separate API for each media server

Oracle 8

Same as above

Object relational

·        Complex data is added as a user-defined data type

·        A collection function is defined for each user-defined data type

·        An object query processor (SQL3 + parser + optimizer + display manager) for user-defined data types

Informix Universal Server

Uncertain reliability

Object oriented

·        Fundamental changes to a DBMS to support objects

·        Object Database Management Group (ODMG) are working on ODL and OQL standards

·        ObjectStore

·        UniSQL

·        O2

·        Versant

·        Gemstone

       Emphasize support for complex data in large systems

       Currently used in applications where ad hoc query and transaction processing are not important