4.4 View Integration 67
words, the concept of order has been variously represented as an entity,
a relationship, and an attribute, depending on perspective.
There are four basic steps needed for conceptual schema integration:
1. Preintegration analysis
2. Comparison of schemas
3. Conformation of schemas
4. Merging and restructuring of schemas
4.4.1 Preintegration Analysis
The first step, preintegration analysis, involves choosing an integration
strategy. Typically, the choice is between a binary approach with two
Figure 4.4 Schemas: placement of an order
(a) The concept of order as an entity
(b) The concept of order as a relationship
1
N
N
1
Customer
places
Order
NN
orders
Product
Product
for-a
Customer
(c) The concept of order as an attribute
NN
purchases
Product
Customer
order-no
Teorey.book Page 67 Saturday, July 16, 2005 12:57 PM
68 CHAPTER 4 Requirements Analysis and Conceptual Data Modeling
schemas merged at one time and an n-ary approach with n schemas
merged at one time, where n is between 2 and the total number of sche-
mas developed in the conceptual design. The binary approach is attrac-
tive because each merge involves a small number of data model con-
structs and is easier to conceptualize. The n-ary approach may require
only one grand merge, but the number of constructs may be so large
that it is not humanly possible to organize the transformations properly.
4.4.2 Comparison of Schemas
In the second step, comparison of schemas, the designer looks at how
entities correspond and detects conflicts arising from schema diversity—
that is, from user groups adopting different viewpoints in their respec-
tive schemas. Naming conflicts include synonyms and homonyms. Syn-
onyms occur when different names are given for the same concept;
these can be detected by scanning the data dictionary, if one has been
established for the database. Homonyms occur when the same name is
used for different concepts. These can only be detected by scanning the
different schemas and looking for common names.
Structural conflicts occur in the schema structure itself. Type con-
flicts involve using different constructs to model the same concept. In
Figure 4.4, for example, an entity, a relationship, or an attribute can be
used to model the concept of order in a business database. Dependency
conflicts result when users specify different levels of connectivity (one-
to-many, etc.) for similar or even the same concepts. One way to resolve
such conflicts might be to use only the most general connectivity—for
example, many-to-many. If that is not semantically correct, change the
names of entities so that each type of connectivity has a different set of
entity names. Key conflicts occur when different keys are assigned to the
same entity in different views. For example, a key conflict occurs if an
employee’s full name, employee ID number, and Social Security number
are all assigned as keys.
4.4.3 Conformation of Schemas
The resolution of conflicts often requires user and designer interaction.
The basic goal of the third step is to align or conform schemas to make
them compatible for integration. The entities, as well as the key
attributes, may need to be renamed. Conversion may be required so that
concepts modeled as entities, attributes, or relationships are conformed
to be only one of them. Relationships with equal degree, roles, and con-
Teorey.book Page 68 Saturday, July 16, 2005 12:57 PM
4.4 View Integration 69
nectivity constraints are easy to merge. Those with differing characteris-
tics are more difficult and, in some cases, impossible to merge. In addi-
tion, relationships that are not consistent—for example, a relationship
using generalization in one place and the exclusive OR in another—
must be resolved. Finally, assertions may need to be modified so that
integrity constraints remain consistent.
Techniques used for view integration include abstraction, such as
generalization and aggregation to create new supertypes or subtypes, or
even the introduction of new relationships. As an example, the generali-
zation of Individual over different values of the descriptor attribute “job-
title” could represent the consolidation of two views of the database—
one based on an individual as the basic unit of personnel in the organi-
zation, and another based on the classification of individuals by job
titles and special characteristics within those classifications.
4.4.4 Merging and Restructuring of Schemas
The fourth step consists of the merging and restructuring of schemas.
This step is driven by the goals of completeness, minimality, and under-
standability. Completeness requires all component concepts to appear
semantically intact in the global schema. Minimality requires the
designer to remove all redundant concepts in the global schema. Exam-
ples of redundant concepts are overlapping entities and truly semanti-
cally redundant relationships; for example, Ground-Vehicle and Auto-
mobile might be two overlapping entities. A redundant relationship
might occur between Instructor and Student. The relationships “direct-
research” and “advise” may or may not represent the same activity or
relationship, so further investigation is required to determine whether
they are redundant or not. Understandability requires that the global
schema make sense to the user.
Component schemas are first merged by superimposing the same
concepts and then restructuring the resulting integrated schema for
understandability. For instance, if a supertype/subtype combination is
defined as a result of the merging operation, the properties of the sub-
type can be dropped from the schema because they are automatically
provided by the supertype entity.
4.4.5 Example of View Integration
Let us look at two different views of overlapping data. The views are
based on two separate interviews of end users. We adapt the interesting
Teorey.book Page 69 Saturday, July 16, 2005 12:57 PM
70 CHAPTER 4 Requirements Analysis and Conceptual Data Modeling
example cited by Batini, Lenzerini, and Navathe [1986] to a hypothetical
situation related to our example.
In Figure 4.5a we have a view that focuses on reports and includes
data on departments that publish the reports, topic areas in reports, and
contractors for whom the reports are written. Figure 4.5b shows another
view, with publications as the central focus and keywords on publication
as the secondary data. Our objective is to find meaningful ways to inte-
grate the two views and maintain completeness, minimality, and under-
standability.
We first look for synonyms and homonyms, particularly among the
entities. Note that a synonym exists between the entities Topic-area in
schema 1 and Keyword in schema 2, even though the attributes do not
Figure 4.5 View integration: find meaningful ways to integrate
NN
N
N
N1
written-
for
contains
title
address
(a) Original schema 1, focused on reports
name
name
Report
publishesDepartment
Topic-area
Publication
N
N
contains Keyword
title
code
title
code
research-area
dept-name
(b) Original schema 2, focused on publications
name
Contractor
address
Teorey.book Page 70 Saturday, July 16, 2005 12:57 PM
4.4 View Integration 71
match. However, we find that the attributes are compatible and can be
consolidated. This is shown in Figure 4.6a, which presents a revised
schema, schema 2.1. In schema 2.1 Keyword has been replaced by Topic-
area.
Next we look for structural conflicts between schemas. A type con-
flict is found to exist between the entity Department in schema 1 and
the attribute “dept-name” in schema 2.1. The conflict is resolved by
keeping the stronger entity type, Department, and moving the attribute
type “dept-name” under Publication in schema 2 to the new entity,
Department, in schema 2.2 (see Figure 4.6b).
Figure 4.6 View integration: type conflict
Publication
Department
N
N
contains Topic-area
title
code
title
code
research-area
title
code
research-area
dept-name
(a) Schema 2.1, in which Keyword has changed to Topic-area
dept-name
1
has
title
code
N
(b) Schema 2.2, in which the attribute dept-name has changed
to an attribute and an entity(b)
Publication
N
N
contains Topic-area
Teorey.book Page 71 Saturday, July 16, 2005 12:57 PM