Tải bản đầy đủ (.ppt) (68 trang)

Giáo trình cơ sở dữ liệu

Bạn đang xem bản rút gọn của tài liệu. Xem và tải ngay bản đầy đủ của tài liệu tại đây (391.44 KB, 68 trang )





1
1
Data
Data
: Data is collection of records.
: Data is collection of records.
Database
Database
: A database is a collection of data that represents important objects in a user's business.
: A database is a collection of data that represents important objects in a user's business.
Usage of database
Usage of database
: Database is being used in almost all possible aspects of business and life.
: Database is being used in almost all possible aspects of business and life.
1) S
1) S
ingle User
ingle User
: 2)
: 2)
Multiple Users
Multiple Users
: 3)
: 3)
Very Large Application
Very Large Application
: 4)


: 4)
Database used by Internet
Database used by Internet
Application
Application
:
:
Evolution of Database Management System
Evolution of Database Management System
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 1
DATABASE MANAGEMENT SYSTEM
DATABASE MANAGEMENT SYSTEM
:
:


APPLICATION
PROGRAM
FILE
FILE
FILE
FILE
DBMS
FILE- PROCESSING SYSTEM:
FILE- PROCESSING SYSTEM:


APPLICATION

PROGRAM




2
2
1.
1.
It is self-describing
It is self-describing


2.
2.
It is a collection of integrated records (Users Data, Meta Data, Indexes, Application
It is a collection of integrated records (Users Data, Meta Data, Indexes, Application
Metadata)
Metadata)
3. A Database is Model of Model
3. A Database is Model of Model
Therefore based on the characteristics of the Database we can formulate the definition as: A
Therefore based on the characteristics of the Database we can formulate the definition as: A
database is self-describing collection of integrated records.
database is self-describing collection of integrated records.
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 1
Characteristics of Database Management System
Characteristics of Database

1. It is collection of databases
2. It is a computerized record-keeping system
3. It contains facilities that allow the user to

Add and delete files to the system

Insert, retrieve, update, and delete data in existing files
• A DBMS may contain many databases that can be used for separate
purposes or combined to provide useful information
A database management system (DBMS) is a program that allows users to define,
manipulate, and process the data in a database in order to produce meaningful
information.




3
3
Important models in Database Management Systems
Important models in Database Management Systems
1.
1.
The Relational Model : concepts of relational algebra to the problem of storing large data.
The Relational Model : concepts of relational algebra to the problem of storing large data.


(E.F.Codd- 1970)
(E.F.Codd- 1970)
.
.

Relational Database: databases in which data is organized into tables.
Relational Database: databases in which data is organized into tables.


GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 1
E.F.Codd formulated rules for Relational Database Manangement Systems which can be tablulated as below:
Information Representation All data must be represented logically in tables
Guaranteed logical accessibility All data is accessible logically using a combination of table name, key name, key value
and column
NULL support A Null value represents missing information and are not to be confused with empty, blank
or zero filled data. They also are not necessarily equal
Dynamic online catalog The definition of data is represented in the same manner as data so that it can
be relationally accessed. Example every database has catalog tables
Comprehensive data sub-language (HLL Like SQL ) A unique language which is supported along with
several other programming languages to achieve data definition, view and manipulation
Updatable views All views that can theoretically exists can be updated
High level insert, updates and delete Both base and derived relations can be handled as singular
requests and apply to retrieval, insert, update and deletion of data
Physical data independence Program and terminal activities are preserved when changes are made
in storage representation or access methods
Logical data independence Program and terminal activities are preserved when changes are made to base tables
Lintegrity independence Integrity constraints must be definable in the unique sub-language and stored in the catalog
Distribution independence e The unique sub-language must support database distribution while preserving program
and terminal activities
Nonsubersion A relational system which perform low-level processing of records cannot supplant the integrity rules in effect
in high level, record processing by a relational language
Relational Table database Defined as a rectangular array of columns and rows where columns are homogenous,
non-grouped, uniquely named data items( attributes ) and rows are not duplicated. It is also important to impose the property

that column and row order within the array is immaterial




4
4
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 1
Microcomputer DBMS Products
Dbase-II
(Ashon-Tate)
1979
Client Server Database Application
Access
FoxPro

A client server application

Database server
MS SQL,
SQL Anywhere,
DB2, Oracle
Database Using Internet Applications

Distributed Database Processing

Object- Oriented DBMS (ODBMS)
ASP,

ColdFusion,
XML
Single Database Application




5
5
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 1
Components of Database
1. User Data:
StudentID Student Name StudentPhone InstructorName InstructorPhone
100 Adam 111-222-0987 Gary 111-222-3333
101 John 111-221-3457 Gary 111-222-3333
102 Jenny 111-220-9822 Tim 111-219-3267
103 Smantha 111-220-7644 Tim 111-219-3267
104 Stacy 111-222-3478 Gary 111-222-3333
Table StudentIns
StudentID Student Name StudentPhone InstructorName
100 Adam 111-222-0987 Gary
101 John 111-221-3457 Gary
102 Jenny 111-220-9822 Tim
103 Smantha 111-220-7644 Tim
104 Stacy 111-222-3478 Gary
Table Student
InstructorName InstructorPhone
Gary 111-222-3333

Tim 111-219-3267
Table Instructor
2. MetaData:
Table Name Number of Columns Primary Key
Student 4 StudentID
Instructor 2 InstructorName
Course 4 Courseid
Registration 4 {Reg_No, zip}
Hypothetical Systems Table




6
6
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 1
3. Indexes:
Components of Database (cont.)
StudentName StudentID
Adam 100
Smith 101
Chris 102
Nancy 103
Andrew 104
Student Name Index
4. Application Metadata
Application metadata stores the structure and format of user forms, reports, queries and other
application components. Usually application metadata is accessed by using DBMS tools.





7
7
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 1
Features of DBMS subsystem
OPERATING
SYSTEM
DBMS
ENGINE
DESIGN TOOLS:
•TABLE CREATION
•QUERY CREATION

REPORT CREATION

PROCEDURAL LANGUAGE
RUN-TIME FACILITIES:
•FORM PROCESSOR
•QUERY PROCESSOR

REPORT WRITE

PROCEDURAL RUN TIME





8
8
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 2
The Entity Relationship Model (E-R Model)
Entities
StudentID StudentName Address City State Zip TelephoneNumber

Student entity

StudentID StudentName Address City State Zip TelephoneNumber
101 Adam 100 Gateway Drive Seattle WA 98101 206-334-9876

Instance of STUDENT

Attributes Student Name, Student ID, Class ID etc. Attributes is analogous to a column in the relational table.
Composite Attribute Example of Composite Attribute is Address, which consists of group of
attributes [Street, City, State, Zip]
Multi Value Attributes Example can be anyone's E-mail ID as person may have more
then one E-Mail ID.

Relationships
Entity Relationship Diagrams
A Relationship represents an association between two or more entities.

Entity classes are shown by rectangles
•Relationships are shown by diamonds

•Maximum cardinality of the relationship is inside the diamond

The name of the relationship is shown near the diamond

To show Minimum cardinality one can place a hash mark (|) across the relationship line to indicate that the entity
must exist in the relationship and place an oval across the relationship line to indicate that there may or may not be
an entity in the relationship




9
9
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 2
Important Utilities of SQL Server
7.0
Creating database in SQL Server
The SQL Server Enterprise Manager
•The Create Database statement

The database creation Wizard
Create database Dbase_name
[ ON [PRIMARY]
( [ NAME = logical_file_name, ]
FILENAME = 'os_file_name'
[, SIZE = size]
[, MAXSIZE = { max_size | UNLIMITED } ]
[, FILEGROWTH = growth_increment] ) [, n]]

[ LOG ON
{ ( [ NAME = logical_file_name, ]
FILENAME = 'os_file_name'
[, SIZE = size]
[, MAXSIZE = { max_size | UNLIMITED } ]
[, FILEGROWTH = growth_increment] ) [, n]
[, n]} ]
[ FOR LOAD | FOR ATTACH ]
Create Tables
Can be created both by using Query Analyzer and
Enterprise Manager (EM)
Syntax
Create Table [database. [Owner.] table_name
( column_name datatype [identity|constraint|NULL|NOT NULL]
[…]
)
Example

Create Table student1
(sid tinyint IDENTITY NOT NULL,
Fname char (20),
Lname char (20) NOT NULL,
Address1 varchar (30),
Address2 varchar (30),
City varchar (30),
State varchar (30),
Zipcode char (10),
Add_date datetime,
marks1 tinyint NOT NULL,
marks2 tinyint NOT NULL,

Total AS marks1+marks2
)
SQL Server 7.0




10
10
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 2
SQL Server Data Types
I. Strings (char (n), varchar)- String contain character data made up of letters, numbers and symbols. In case of char (n),
data will always contain n bytes of information. If its less then n, it will be padded with spaces and if more then n, data will be
truncated.
Varchar (n) is used when one expects a wide variation in data size. Varchar option is good from memory standpoint but SQL
Server has to do overhead in determining the variable-length field.
II. Binary Data (binary (n), varbinary) - Binary data is stored as a series of 0s and 1s, which are represented on input and
output as a hexadecimal pairs. These hexadecimal pairs are made up of characters 0 through 9 and A through F
III. Integers (tinyint, smallint, int)- Of these three types of integer data type difference is in the amount of storage space
they require and range of values they can store
Data Type Length Range
TinyInt 1 0-255
SmallInt 2 +32,767
Int 4 +2,147,483,647
IV. Approximate and Exact Numeric Data Types
Approximate Numeric data types (float (n), real)- Floats and Reals are useful for scientific and statstical data where
absolute accuracy is not required, but where value reaches from extremely small to extremely large numbers. Floats can have
a precision from 1 to 38 digits.

Real have a precision of 7 digits and require 4 bytes of data storage. So Float with a precision less then 7 is real data type.
Ex. Variable1 real; variable2 float (8)
Exact Numeric data Types (decimal and numeric)- Ex. decimal (7,2)




11
11
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 2
SQL Server Data Types (cont)
V. Special Data Types
•Bit- It's a logical data type used to store Boolean information. Used in case of true/ false or yes/no or on/off. The value stored
are 0 or 1. Bit data type single byte of storage space. Ex. Paid bit NULL or Passed bit NOT NULL

Text and Image- Used when storage requirements are more then 8000 character column limitations. These data types are
often referred to as BLOBs. The text and image data type can store up to 2GB of Binary or text data per declaration.
Ex. Comments text, Students image

RowGUID-Used mostly in special type of replication called merge replication to provide each column in replicated table with
a unique identifier.
•Date and Time Data Types
Data Type Range
Datetime 1/1/1753-12/31/9999
Smalldatetime 1/1/1900-6/6/2079

Money (money and smallmoney)
Data Type Range

Money +922,337,203,685,447.5808
Smallmoney +214,748,3647
VI. User- Defined Data Types- One can create user-defined database for specific database or one can place same in a
Model Database.
To create user-defined data Type, one must use system-provided data types.
For defining use-defined data type one needs to use sp_addtype system stored procedure
EXEC sp_addtype ssn, 'VARCHAR (11)', 'NOT NULL'
To drop user defined data type use:sp_droptype sid
User-Defined data types can be created and dropped using EM too.




12
12
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Normalization
Normalization is a process for converting a relation that has certain problems to two or more relations that do not have
problem. We can say that Normalization is used to check the desirability and correctness of relations. In Relational Model
Tables are termed as relations.
A relation is a 2-D table with each row holding data that pertain to some thing or a portion of some thing. Each column has
a data regarding an attribute. The term relational comes from the mathematical field of relational algebra. So, Relational models
are composed of relations, or two-dimensional tables, which follow the operations described in relational algebra.
For a table to be in a relation, it must follow these rules
•Cells of the table must be of single value

All entries of the column must be of same type


Each column/attribute need to have a unique name
•No two rows in a table may be identical

Also order of the rows and columns in a relationship is insignificant
Functional Dependency
It's a relationship between or among attributes. Means if we know value of 1 attribute we can obtain the value of another
attribute. Like if we know StudentId we can know his admission date. And we can say that admission dates are functionally
dependent on the StudentId. Or in general attribute Y is dependent on attribute X. One can comprehend that the storage and
retrieval of functional dependencies is the only reason for having a database
StudentId admissiondate
(SID, Subject) Grade




13
13
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Key: It is a group of one or more than one attributes that uniquely identifies a row
StudentId CourseName Fee
100 IT203 2220
101 IT161 1000
102 IT107 2000
103 IT121 1000
104 IT331 200
A
Relation: Courses
StudentId CourseName Fee

100 IT203 2220
101 1T161 1000
102 IT107 2000
103 IT121 1000
101 IT107 2000
103 IT203 2220
104 IT331 200
B
Relation: Courses1
StudentId CourseName
100 IT203
101 1T161
102 IT107
103 IT121
101 IT107
103 IT203
104 IT331
Relation: Student - Courses
CourseName Fee
IT203 2220
1T161 1000
IT107 2000
IT121 1000
IT107 2000
IT203 2220
IT331 200

Relation: Courses - Fee
Modification Anomalies:
delete Studen 104 from A or B, or insert another course IT176 to A or B





14
14
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
In early 70s each time anomaly was found in a relation, relation theorists worked on same and thought of the way to prevent
same. This way Relations were classified based on anomalies. These classes of Relations and the techniques for preventing
anomalies are called Normal Forms. This way 1st, 2nd and 3rd Normal Forms were defined. It was followed by Boyce-Codd
Normal form, then 4th and 5th Normal form. All these Normal forms are nested means for a relation by 5NF it need
to be 4NF, 3NF, 2NF, and 1NF.

First Normal Form
Any table of data that meets the definition of a relation is said to be in First Normal Form. For a table to be in a relation, it must

Cells of the table must be of single value (No arrays or repeating groups are allowed as value)
• All entries in any column must be of same type
• Each column/attribute need to have a unique name

No two rows in a table may be identical

Also order of the rows and columns in a relationship is insignificant
StudentId CourseName Fee
100 IT203 2220
101 IT161 1000
102 IT107 2000
103 IT121 1000

104 IT331 200
A
Relation: Courses




15
15
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Second Normal Form
Consider once again Relation Course1, now this relation also suffers from modification anomalies as discussed before. In this
relation one student can register for one or more courses. That means that StudentId itself only cannot be key. So we consider
(
StudentId, CourseName
), (
StudentId, Fee
) and (C
ourseName, Fee
). But since two courses can have same fee we can not
have (
CourseName, Fee
) and also (
StudentId, Fee
) can not determine uniqueness in row as student 100, for example, could
engage in two different activities both having same fee. Thus only (
StudentId, CourseName
) can uniquely determine each row.

StudentId CourseName Fee
100 IT203 2220
101 1T161 1000
102 IT107 2000
103 IT121 1000
101 IT107 2000
103 IT 203 2220
104 IT331 200
B
Relation: Courses1
The key is combination of (StudentId, CourseName),
but relation has a dependency like this:
CourseName Fee
We define that a relation is said to be in second Normal Form if all its non-key attributes are dependent on
all of the key and not part of it. Therefore we also conclude that if a relation has a single attribute as a key
then it's automatically in 2NF. Thus 2NF concerns only those, which have composite keys




16
16
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Third Normal Form
Relations in 2NF too can have anomalies. Consider HOUSING Relation below:
StudentId BldgName Rent
100 CityHall 1200
101 CityWoods 1000

102 CityHall 1200
103 CalmPlace 1500
104 UrbaneColony 1200
105 CityHall 1200
106 CityWoodsl 1000
C
Relation: HOUSING
In above relation StudentId is a key, and the functional dependencies
are:
StudentId BldgName
and BldgName Rent
StudentId BldgName
100 CityHall
101 CityWoods
102 CityHall
103 CalmPlace
104 UrbaneColony
105 CityHall
106 CityWoodsl
C1
Relation: Student-Bldg
BldgName Rent
CityHall 1200
CityWoods 1000
CityHall 1200
CalmPlace 1500
UrbaneColony 1200
CityHall 1200
CityWoodsl 1000
C2

Relation: BldgN-Rent




17
17
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if every determinant is a candidate key.
Key: One or more attributes that uniquely identify a tuple (row) in a relation
Those determinants that are keys we initially call candidate keys. o Eventually, we select a single
candidate key to be the primary key for the relation
FundID InvestmentType Manager
99 Common Stock Smith
99 Municipal Bonds Jones
33 Common Stock Green
22 Growth Stocks Brown
11 Common Stock Smith
D
Relation: INVESTMENT

FundID, InvestmentType -> Manager

FundID, Manager -> InvestmentType
• Manager -> InvestmentType
Checking this relation R(FundID, InvestmentType, Manager) if its in 1NF, 2NF or 3NF
Assuming we pick

FundID, InvestmentType
as the Primary Key

1NF for sure as relation is as per definition of 1NF

2NF because all of the non-key attributes (Manager) is dependant on all of the key
• 3NF because there are no transitive dependencies




18
18
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Consider what happens if we delete the tuple with FundID 22. We loose the fact that Brown manages the InvestmentType
"Growth Stocks." The following are steps to normalize a relation into BCNF:

List all of the determinants

See if each determinant can act as a key (candidate keys)
• For any determinant that is not a candidate key, create a new relation from the functional dependency. Retain
thedeterminant in the original relation
For our example:
Rorig (FundID, InvestmentType, Manager)
The determinants are:
•FundID, InvestmentType

FundID, Manager


Manager
Which determinants can act as keys ?

FundID, InvestmentType
YES

•FundID, Manager
YES


Manager
NO

Create a new relation from the functional dependency:
Rnew (Manager, InvestmentType) : Key - Manager
Rorig (FundID, Manager) : Key - FundID, Manager
FundID InvestmentType Manager
99 Common Stock Smith
99 Municipal Bonds Jones
33 Common Stock Green
22 Growth Stocks Brown
11 Common Stock Smith
D
Relation: INVESTMENT




19

19
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Fourth Normal Form
Id -> -> Major
Id -> ->Minor
This attribute dependency is called multi-value dependency.

Student can enroll in several different majors and several different minors

Major and Minor are independent of each other
• Only possible key is (id, major, minor)
Above relation is:

1NF for sure
• 2NF because there is no non-key attributes

3NF because there are no transitive dependencies

BCNF as it does not have any functional dependency, so this relation is in BCNF
Above relation also suffers from anomalies as:
• If we want to signup student 10 for minor Chemistry, then we can add row (10, Physics, Chemistry) but this will mean that
student can take minor Chemistry with Major Physics only and not with Math, which is wrong. So we will have to add row
(10, Math, Chemistry) too. This is an update anomaly as too much information needs to be updated for simple change in
the data

Also if a student drops a major, all the entries associated with major need to be dropped, this causes deletion anomaly
To make this relation 4NF we need to:
Create 2 relations each storing data only for one multi multi-value attribute.

(SID, Major) : Key(SID, Major)
(SID, Minor) : Key (SID, Minor)
A relation meets the requirements of 4NF only if it is in 3NF (or BCNF) and has no multi-valued dependencies.
Id Major Minor
10 Physics Geography
10 Math Geography
10 Physics Computers
10 Math Computers
10 Geology Economics
E
Relation: MAJOR-MINIOR




20
20
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Foundations of Relational Implementation
1. Data Definition language (DDL) - Allows one to define database objects at the conceptual level. It consists of commands
to create objects and alter the structure of objects, such as tables, views, indexes etc. Commonly used DDL statements are
Create, Drop etc. If we have to create table Student then use the following syntax:
Create Table Student
(Student_id Integer Primary Key,
Student_nm Varchar (30),
Course_id Varchar (15),
Phone Varchar (10),
Address Varchar (50))

2. Data Manipulation language (DML) - Allows one to retrieve or update data within a database. It is used for query, insertion,
and deletion and updating of information stored in databases. E.g.: Select, Insert, Update, Delete.
E.g.
1.Select * From Student where Student_nm = 'Dave'
2.Delete Student where Student_nm = 'James'
3.Update Student Set Address = '146, Fauntleroy Way Sw, Seattle- 98116' where Student_nm = 'Jim'
3. Data control language (DCL) -In RDBMS one of the main advantage is the security for the data in the database. You
can allow some user to do a specific operation or all operations on certain objects. Examples for DCL statements are
Grant, Revoke statements.
While Grant is used to grant permission to user, Revoke is used to revoke permission to user.
4. Indexes are referred as Physical key too. Physical Key signifies a column that has a special data structure defined to it in
order to improve performance. Whereas in Databases we often use term key for Logical key which identifies the unique column/s
in a database. One can choose any column or columns of the table to act as Physical key (index). A Logical key need not be a
Physical key, and a Physical key need not be a logical key.




21
21
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 3
Defining the database structure to the DBMS:
There are various ways in which the structure of database can be described to the database management system.
Using a text file to describe a data structure as the one we define in Query Analyzer of the SQL Server which use DDL
Using a GUI based interface for defining the structure of database as in SQL Server Enterprise Manager.
DML Interface to the DBMS
There are multiple ways in which user can interface to the database. They can use form and report capability supplied by the
DBMS or can use query language or process application using application interface. All broad categories are discussed below:

1.Data Manipulation by means of Forms: Most of RDBMS have tools for building forms. Each product has its own way of
providing of these forms. Access has a very helpful wizard for same. Some product comes up with the auto creation of forms
as tables are defined and these forms can be customized later.
2.Query/ Update Language: While using query languages, the user enters the query language commands and then these
commands are decoded by the DBMS and appropriate action is taken. The single most important Query Language is SQL.
3.Stored Procedure Interface: Stored Procedures are precompiled T-SQL scripts and are stored in a DBMS. Stored
Procedures are of great help to users who have little knowledge about the Query Language. Also since they are precompiled
they run fast. DBMS products come with system written stored procedures too to make things easier.
4.Application Program Interface: The type of data access interface is through application programs written in programming
languages like C, C++, JAVA, and ASP etc.




22
22
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 4
Relation Algebra
It's similar to the algebra used in high school. Although the concepts both in Relational Algebra and Mathematical Algebra are
same, the basic difference is that numeric variables of Mathematical Algebra are replaced by relations in Relational Algebra.
An important observation point is that Relations are sets. The rows of relations can be considered elements of a set, and so the
operations that can be performed on sets can also be performed on Relations.
1. Union: The union of two Relations is formed by adding the rows from one table to those of a second table to produce a
third table or Relation.
The Union Relations of A and B is represented as: A+B
For this Operation to work relations must be Union Compatible which means:
• Each Relation must have the same number of attributes


Attributes from the corresponding columns must come from same domain

Duplicate rows must be eliminated




23
23
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 4
Example:
Student (Sid, Name, City)
Sid Name City
1001 Dan Seattle
1002 Adam Redmond
1003 John Kirkland
3003 Jack Seattle
Relation: Student(Sid, Name, City)
Attribute Domain Format
Sid Identifier Integer (4)
Name PersonName Char (14)
City Residencecity Char (10)
Iid Name City
5001 Steve Seattle
5002 Neena Redmond
3003 Jack Seattle

Relation: Instructor(Iid, Name, City)

Attribute Domain Format
Iid Identifier Integer (4)
Name PersonName Char (14)
City Residencecity Char (10)
Instructor (Iid, Name, City)
Assuming that Jack is both a Student and Instructor in the Institute
.
Now in above example both the tables are Union Compatible so there Union will result into:
Sid or Iid Name City
1001 Dan Seattle
1002 Adam Redmond
1003 John Kirkland
5001 Steve Seattle
5002 Neena Redmond
3003 Jack Seattle
Relation: Union




24
24
2. Difference
The difference of two relations provides third relation, which has rows that are present in first relation but not in second.
So A-B for example above will be:
So remember A-B will be different from B-A
3. Intersection
The Intersection of the two relations is provided by third relation having rows that appear in both the first and second
relations. Prerequisite is that Relations should be Union compatible. So Intersection for Relations in above example would
be:

GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 4
Sid or Iid Name City
1001 Dan Seattle
1002 Adam Redmond
1003 John Kirkland
3003 Jack Seattle

A-B
Sid or Iid Name City
3003 Jack Seattle

Intersection




25
25
GIÁO TRÌNH CƠ SỞ DỮ LIỆU
Nguyễn Quang Tấn
Bài 4
5. Product
Product of two relations is the joining
of each row of one relation with each
row of another relation. The Product (AXB) of
Relation A having m rows with a Relation B
having n rows would be
m x n

.
A:
Instructor (Iid, Name, City
)
B:
InstructorMajor (Iid, Major)
AxB
Obviously above information does not make
much sense so more operations need to be
done to get useful Information
from same.
Iid Name City
5001 Steve Seattle
5002 Neena Redmond
3003 Jack Seattle
A.
Instructor
Iid Major
5001 IT
5009 English
B.
InstructorMajor
Iid Name City Id Major
5001 Steve Seattle 5001 IT
5002 Neena Redmond 5001 IT
3003 Jack Seattle 5001 IT
5001 Steve Seattle 5009 English
5002 Neena Redmond 5009 English
3003 Jack Seattle 5009 English


A x B

Tài liệu bạn tìm kiếm đã sẵn sàng tải về

Tải bản đầy đủ ngay
×