Monday, 27 October 2014

RDBMS

Unit-I:

Introduction

Database Management System (DBMS)

DBMS contains information about a particular enterprise
Collection of interrelated data
Set of programs to access the data
An environment that is both convenient and efficient to use
Database Applications:
Banking: all transactions
Airlines: reservations, schedules
Universities:  registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized recommendations
Manufacturing: production, inventory, orders, supply chain
Human resources:  employee records, salaries, tax deductions
Databases touch all aspects of our lives

Purpose of Database Systems
 In the early days, database applications were built directly on top of file systems
Drawbacks of using file systems to store data:
Data redundancy and inconsistency
Multiple file formats, duplication of information in different files
Difficulty in accessing data
Need to write a new program to carry out each new task
Data isolation — multiple files and formats

Integrity problems
Integrity constraints (e.g. account balance > 0) become “buried” in program code rather than being stated explicitly
Hard to add new constraints or change existing ones

Atomicity of updates
               Failures may leave database in an inconsistent state with partial updates carried out
Example: Transfer of funds from one account to another should either complete or not happen at all
Concurrent access by multiple users
Concurrent access is needed for performance
Uncontrolled concurrent accesses can lead to inconsistencies
Example: Two people reading a balance and updating it at the same time

Security problems

Hard to provide user access to some, but not all, data
Database systems offer solutions to all the above problems.
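As a rough illustration of the atomicity point above, the following sketch uses Python's built-in sqlite3 module. The table layout, the account numbers, the amounts, and the helper name transfer are all assumptions made for this example, and the constraint is written as balance >= 0 so that an empty account is allowed.

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move `amount` from `source` to `target` as one all-or-nothing unit."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every change in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-102", 100)]
)
conn.commit()

ok = transfer(conn, "A-101", "A-102", 200)       # enough funds
failed = transfer(conn, "A-101", "A-102", 1000)  # would overdraw A-101
balances = dict(conn.execute("SELECT account_number, balance FROM account"))
print(ok, failed, balances)
```

In the second call the credit is applied first and the debit then fails, so the rollback is what keeps the database from holding a partial update. Note also that the integrity constraint is stated once in the table definition rather than buried in program code.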

Levels of Abstraction

Physical level: describes how a record (e.g., customer) is stored.
Logical level: describes data stored in database, and the relationships among the data.
    type customer = record
        customer_id : string;
        customer_name : string;
        customer_street : string;
        customer_city : string;
    end;
View level: application programs hide details of data types.  Views can also hide information (such as an employee’s salary) for security purposes.
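A view that hides an employee's salary might look like the sketch below, again using sqlite3. The employee table, its columns, the sample row, and the view name employee_public are assumptions for illustration. SQLite has no per-user permissions, so this only shows what a view exposes, not how access to the base table would be restricted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id TEXT, employee_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES ('E-1', 'Hayes', 52000)")

# The view exposes only the columns that users of this view may see.
conn.execute(
    "CREATE VIEW employee_public AS "
    "SELECT employee_id, employee_name FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_public")
visible_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()
print(visible_columns, rows)
```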
    Instances and Schemas
Similar to types and variables in programming languages
Schema – the logical structure of the database
Example: The database consists of information about a set of customers and accounts and the relationship between them
Analogous to type information of a variable in a program
Physical schema: database design at the physical level
Logical schema: database design at the logical level
Instance – the actual content of the database at a particular point in time
Analogous to the value of a variable
Data Independence

Physical Data Independence – the ability to modify the physical schema without changing the logical schema
Applications depend on the logical schema
In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.

Logical Data Independence:

The ability to modify the logical schema without having to rewrite application programs. Modifications at this level are necessary whenever the logical structure of the database changes.



Data Models
A collection of tools for describing
Data
Data relationships
Data semantics
Data constraints
Relational model
Entity-Relationship data model (mainly for database design)
Object-based data models (Object-oriented and Object-relational)
Other older models:
Network model 
Hierarchical model                                      
Database languages:
Data Manipulation Language (DML)
Language for accessing and manipulating the data organized by the appropriate data model
DML is also known as a query language
 Two classes of languages
Procedural – user specifies what data is required and how to get those data
Declarative (nonprocedural) – user specifies what data is required without specifying how to get those data
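The difference between the two classes can be sketched by answering the same question twice: once with an explicit loop that spells out how to find the data, and once with an SQL query that only states what is wanted. The sample rows and branch names are made up for this example.

```python
import sqlite3

rows = [
    ("A-101", "Downtown", 500),
    ("A-102", "Perryridge", 400),
    ("A-201", "Perryridge", 900),
]

# Procedural style: we say HOW to get the data (scan every row, test, collect).
procedural = []
for account_number, branch_name, balance in rows:
    if branch_name == "Perryridge":
        procedural.append(account_number)

# Declarative style: we say WHAT we want; the system decides how to find it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT, branch_name TEXT, balance INTEGER)"
)
conn.executemany("INSERT INTO account VALUES (?, ?, ?)", rows)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account "
        "WHERE branch_name = 'Perryridge' ORDER BY account_number"
    )
]
print(procedural, declarative)
```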
Data Definition Language (DDL)
Specification notation for defining the database schema
Example:
    create table account (
        account_number char(10),
        branch_name char(10),
        balance integer)
DDL compiler generates a set of tables stored in a data dictionary
Data dictionary contains metadata (i.e., data about data)
Database schema
Data storage and definition language
Specifies the storage structure and access methods used
  Integrity constraints
  Domain constraints
  Referential integrity (e.g. branch_name must correspond to a valid branch in the branch table)
 Authorization
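The account example can be extended into a small runnable sketch with sqlite3 that shows the data dictionary and the two kinds of integrity constraint. The branch table, the CHECK on balance, the sample rows, and the helper try_insert are assumptions for illustration. In SQLite the catalog table sqlite_master plays the role of the data dictionary, and foreign keys are enforced only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE branch (branch_name CHAR(10) PRIMARY KEY)")
conn.execute(
    "CREATE TABLE account ("
    " account_number CHAR(10) PRIMARY KEY,"
    " branch_name CHAR(10) REFERENCES branch (branch_name),"  # referential integrity
    " balance INTEGER CHECK (balance >= 0))"                  # restriction on allowed values
)
conn.execute("INSERT INTO branch VALUES ('Perryridge')")

def try_insert(row):
    """Return 'accepted' or 'rejected' depending on the declared constraints."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

valid = try_insert(("A-102", "Perryridge", 400))
unknown_branch = try_insert(("A-999", "Nowhere", 100))
negative_balance = try_insert(("A-103", "Perryridge", -5))

# Metadata: the schema itself is stored as rows in a catalog table.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(valid, unknown_branch, negative_balance, table_names)
```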
Transaction Management
         A transaction is a collection of operations that performs a single logical function in a database application
        Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
        Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.
Storage Management
Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
The storage manager is responsible for the following tasks:
Interaction with the file manager
Efficient storing, retrieving and updating of data
Issues:
Storage access
File organization
Indexing and hashing
Database Administrator
Coordinates all the activities of the database system
Has a good understanding of the enterprise’s information resources and needs.
Database administrator's duties include:
Storage structure and access method definition
Schema and physical organization modification
Granting users authority to access the database
Backing up data
Monitoring performance and responding to changes
Database tuning
Database Users
Users are differentiated by the way they expect to interact with the system
Application programmers – interact with system through DML calls
Sophisticated users – form requests in a database query language
Specialized users – write specialized database applications that do not fit into the traditional data processing framework
Naïve users – invoke one of the permanent application programs that have been written previously
Examples: people accessing a database over the web, bank tellers, clerical staff

The Entity-Relationship Model
Models an enterprise as a collection of entities and relationships
Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects
Described by a set of attributes
Relationship: an association among several entities.
A database can be modeled as:
a collection of entities,
relationships among entities.
An entity is an object that exists and is distinguishable from other objects.
Example:  specific person, company, event, plant
Entities have attributes
Example: people have names and addresses        
An entity set is a set of entities of the same type that share the same properties.
Relationship Sets
A relationship is an association among several entities
Example: Hayes (customer entity) is linked by depositor (relationship set) to A-102 (account entity)
A relationship set is a mathematical relation among n ≥ 2 entities, each taken from entity sets:
    {(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship
Example: (Hayes, A-102) ∈ depositor
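Read this way, a relationship set is simply a set of tuples whose components come from the participating entity sets. A minimal Python sketch follows; only Hayes and A-102 come from the notes, and the other entities and the helper is_relationship_set are hypothetical.

```python
# Entity sets E1 and E2 (sample members are made up, apart from Hayes and A-102).
customer = {"Hayes", "Jones", "Smith"}
account = {"A-101", "A-102", "A-215"}

# A relationship set: each tuple is one relationship.
depositor = {("Hayes", "A-102"), ("Jones", "A-101"), ("Smith", "A-215")}

def is_relationship_set(relation, *entity_sets):
    """Check that every tuple takes its i-th component from the i-th entity set."""
    return all(
        len(relationship) == len(entity_sets)
        and all(entity in entity_set
                for entity, entity_set in zip(relationship, entity_sets))
        for relationship in relation
    )

print(("Hayes", "A-102") in depositor)
print(is_relationship_set(depositor, customer, account))
```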
[Figure: Relationship Set borrower]
An attribute can also be a property of a relationship set. For instance, the depositor relationship set between entity sets customer and account may have the attribute access-date.
Degree of a Relationship Set
Refers to number of entity sets that participate in a relationship set.
Relationship sets that involve two entity sets are binary (or degree two). Most relationship sets in a database system are binary.
Relationship sets may involve more than two entity sets, but such relationship sets are rare. (More on this later.)
Attributes
An entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set.
Domain – the set of permitted values for each attribute

Attribute types:
Simple and composite attributes.
Single-valued and multi-valued attributes
Example: multivalued attribute: phone_numbers
Derived attributes
Can be computed from other attributes
Example:  age, given date_of_birth
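The attribute types above can be sketched with a small Python class. The class name Customer, its fields, and the sample values are assumptions for illustration: phone_numbers is multivalued (a list), and age is derived, so it is computed from date_of_birth rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Customer:
    customer_id: str                                    # simple, single-valued
    date_of_birth: date                                 # stored attribute
    phone_numbers: list = field(default_factory=list)   # multivalued

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

hayes = Customer("C-1", date(1990, 6, 15), ["555-0100", "555-0101"])
print(hayes.age_on(date(2014, 10, 27)))
```

Storing age directly would force an update on every birthday, which is why derived attributes are usually computed on demand.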
[Figure: Composite Attributes]
Design Issues
Use of entity sets vs. attributes
Choice mainly depends on the structure of the enterprise being modeled, and on the semantics associated with the attribute in question.
Use of entity sets vs. relationship sets
A possible guideline is to designate a relationship set to describe an action that occurs between entities
Binary versus n-ary relationship sets
Although it is possible to replace any nonbinary (n-ary, for n > 2) relationship set by a number of distinct binary relationship sets, an n-ary relationship set shows more clearly that several entities participate in a single relationship.
   Mapping Cardinality Constraints:
Express the number of entities to which another entity can be associated via a relationship set.
Most useful in describing binary relationship sets.
For a binary relationship set the mapping cardinality must be one of the following types:
One to one
One to many
Many to one
Many to many
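The four types can be told apart by counting how many partners each entity has. The sketch below classifies one instance of a binary relationship set from A to B, given as a set of (a, b) pairs; the function name and the sample pairs are hypothetical. A mapping cardinality is a constraint on every valid instance, so a check like this can only show which types a given instance is consistent with.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Classify an instance of a binary relationship set from A to B."""
    pairs = set(pairs)
    partners_of_a = Counter(a for a, _ in pairs)  # number of B entities per A entity
    partners_of_b = Counter(b for _, b in pairs)  # number of A entities per B entity
    a_has_at_most_one_b = all(n <= 1 for n in partners_of_a.values())
    b_has_at_most_one_a = all(n <= 1 for n in partners_of_b.values())

    if a_has_at_most_one_b and b_has_at_most_one_a:
        return "one to one"
    if b_has_at_most_one_a:
        return "one to many"   # an A entity may have many B entities
    if a_has_at_most_one_b:
        return "many to one"   # many A entities may share one B entity
    return "many to many"

print(mapping_cardinality({("Hayes", "A-102"), ("Hayes", "A-103")}))
```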
[Figure: Mapping Cardinalities]
Keys

A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.

A candidate key of an entity set is a minimal super key
customer_id is a candidate key of customer
account_number is a candidate key of account
Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
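The definitions of super key and candidate key can be checked mechanically against sample data, as in the sketch below. The rows and the helper names are hypothetical. Keys are a property of the schema, so a test on one instance can rule a key out but cannot prove that it holds for all future data.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Minimal super keys: super keys that contain no smaller super key."""
    keys = []
    # Sizes are tried in increasing order, so any non-minimal super key
    # contains a candidate key that has already been found.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if not contains_smaller_key and is_super_key(rows, attrs):
                keys.append(attrs)
    return keys

customers = [
    {"customer_id": "C-1", "customer_name": "Hayes", "customer_city": "Harrison"},
    {"customer_id": "C-2", "customer_name": "Jones", "customer_city": "Harrison"},
    {"customer_id": "C-3", "customer_name": "Hayes", "customer_city": "Rye"},
    {"customer_id": "C-4", "customer_name": "Jones", "customer_city": "Harrison"},
]
attributes = ["customer_id", "customer_name", "customer_city"]
print(candidate_keys(customers, attributes))
```

Here (customer_id, customer_name) is a super key but not a candidate key, because customer_id alone already identifies each row.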

Keys for Relationship Sets

The combination of primary keys of the participating entity sets forms a super key of a relationship set.
(customer_id, account_number) is a super key of depositor
NOTE: this means a pair of entities can have at most one relationship in a particular relationship set.

Example: if we wish to track all access_dates to each account by each customer, we cannot assume a relationship for each access. We can use a multivalued attribute instead.

We must consider the mapping cardinality of the relationship set when deciding what the candidate keys are
We need to consider the semantics of the relationship set when selecting the primary key if there is more than one candidate key
 E-R Diagrams:
Rectangles represent entity sets.
Diamonds represent relationship sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent attributes
Double ellipses represent multivalued attributes.
Dashed ellipses denote derived attributes.
Underlining indicates primary key attributes

Weak Entity Sets
An entity set that does not have a primary key is referred to as a weak entity set.
The existence of a weak entity set depends on the existence of an identifying entity set
It must relate to the identifying entity set via a total, one-to-many relationship set from the identifying to the weak entity set
Identifying relationship depicted using a double diamond
The discriminator (or partial key) of a weak entity set is the set of attributes that distinguishes among all the entities of a weak entity set that depend on the same identifying entity.
The primary key of a weak entity set is formed by the primary key of the strong entity set on which the weak entity set is existence dependent, plus the weak entity set’s discriminator.
We depict a weak entity set by double rectangles.
We underline the discriminator of a weak entity set  with a dashed line.
payment_number – discriminator of the payment entity set
Primary key for payment – (loan_number, payment_number)
Note: the primary key of the strong entity set is not explicitly stored with the weak entity set, since it is implicit in the identifying relationship.
If loan_number were explicitly stored, payment could be made a strong entity, but then the relationship between payment and loan would be duplicated by an implicit relationship defined by the attribute loan_number common to payment and loan
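When the loan and payment example is written out as tables, the combined key becomes visible. The sqlite3 sketch below is one possible table form, not part of the E-R diagram itself: in a table loan_number has to be stored with each payment, and the column types, amounts, and ON DELETE CASCADE rule are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys

conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount INTEGER)")
conn.execute(
    "CREATE TABLE payment ("
    " loan_number TEXT NOT NULL"
    "  REFERENCES loan (loan_number) ON DELETE CASCADE,"  # existence dependence
    " payment_number INTEGER NOT NULL,"                   # discriminator
    " payment_amount INTEGER,"
    " PRIMARY KEY (loan_number, payment_number))"
)
conn.executemany("INSERT INTO loan VALUES (?, ?)", [("L-17", 1000), ("L-23", 2000)])

# payment_number 1 appears under two different loans: allowed.
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)",
    [("L-17", 1, 50), ("L-17", 2, 75), ("L-23", 1, 100)],
)

# The same payment_number twice under one loan is rejected.
try:
    conn.execute("INSERT INTO payment VALUES ('L-17', 1, 60)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting a loan removes its payments, since they cannot exist without it.
conn.execute("DELETE FROM loan WHERE loan_number = 'L-17'")
remaining = conn.execute(
    "SELECT loan_number, payment_number FROM payment"
).fetchall()
print(duplicate_rejected, remaining)
```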
Extended E-R Features:
Specialization
Top-down design process; we designate subgroupings within an entity set that are distinctive from other entities in the set.
These subgroupings become lower-level entity sets that have attributes or participate in relationships that do not apply to the higher-level entity set.
Depicted by a triangle component labeled ISA (E.g. customer “is a” person).

Generalization
A bottom-up design process – combine a number of entity sets that share the same features into a higher-level entity set.
Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way.
The terms specialization and generalization are used interchangeably.
Can have multiple specializations of an entity set based on different features. 
E.g. permanent_employee vs. temporary_employee, in addition to officer  vs. secretary vs. teller
Each particular employee would be a member of one of permanent_employee or temporary_employee and also a member of one of officer, secretary, or teller.
The ISA relationship is also referred to as a superclass-subclass relationship.
Aggregation
Consider the ternary relationship works_on among employee, branch, and job. Suppose we want to record managers for tasks performed by an employee at a branch.
Relationship sets works_on and manages represent overlapping information
Every manages relationship corresponds to a works_on relationship
However, some works_on relationships may not correspond to any manages relationships
So we can’t discard the works_on relationship
Eliminate this redundancy via aggregation
Treat relationship as an abstract entity
Allows relationships between relationships
Abstraction of relationship into new entity
Without introducing redundancy, the following diagram represents:
An employee works on a particular job at a particular branch
An employee, branch, job combination may have an associated manager
[Figure: E-R Diagram With Aggregation]
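One way to picture aggregation in table form is to let manages refer to a whole works_on row instead of repeating the employee, branch, and job information on its own. The sqlite3 sketch below is only an illustration; the column names, the identifiers, and the choice of at most one manager per combination are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce foreign keys

conn.execute(
    "CREATE TABLE works_on ("
    " employee_id TEXT, branch_name TEXT, job_title TEXT,"
    " PRIMARY KEY (employee_id, branch_name, job_title))"
)
# manages relates a manager to the works_on relationship treated as one entity.
conn.execute(
    "CREATE TABLE manages ("
    " employee_id TEXT, branch_name TEXT, job_title TEXT,"
    " manager_id TEXT NOT NULL,"
    " PRIMARY KEY (employee_id, branch_name, job_title),"
    " FOREIGN KEY (employee_id, branch_name, job_title)"
    "  REFERENCES works_on (employee_id, branch_name, job_title))"
)
conn.executemany(
    "INSERT INTO works_on VALUES (?, ?, ?)",
    [("E-1", "Perryridge", "teller"), ("E-2", "Perryridge", "auditor")],
)
conn.execute("INSERT INTO manages VALUES ('E-1', 'Perryridge', 'teller', 'M-9')")

# A manages row without a matching works_on row is rejected.
try:
    conn.execute("INSERT INTO manages VALUES ('E-3', 'Downtown', 'teller', 'M-9')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Some works_on relationships have no manager, which is why works_on is kept.
unmanaged = conn.execute(
    "SELECT employee_id FROM works_on w WHERE NOT EXISTS ("
    " SELECT 1 FROM manages m"
    " WHERE m.employee_id = w.employee_id"
    "   AND m.branch_name = w.branch_name"
    "   AND m.job_title = w.job_title)"
).fetchall()
print(orphan_rejected, unmanaged)
```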
Attribute inheritance – a lower-level entity set inherits all the attributes and relationship participation of the higher-level entity set to which it is linked. 
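Attribute inheritance behaves much like class inheritance in a programming language. In the Python sketch below the entity sets person, customer, and employee and their attributes are illustrative assumptions, as is the helper mailing_label, which stands in for a relationship that the higher-level entity set participates in.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:                 # higher-level entity set
    name: str
    street: str
    city: str

@dataclass
class Customer(Person):       # customer ISA person
    credit_rating: int

@dataclass
class Employee(Person):       # employee ISA person
    salary: int

def mailing_label(person: Person) -> str:
    """Anything defined for Person also applies to its lower-level entity sets."""
    return f"{person.name}, {person.street}, {person.city}"

hayes = Customer("Hayes", "Main", "Harrison", credit_rating=7)
customer_attributes = [f.name for f in fields(Customer)]
print(customer_attributes)
print(mailing_label(hayes))
```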
