
Q.1) Describe Three Levels of Data Abstraction?

Ans.
For understanding:
1. Physical level: how and where the data is physically stored in the database.
2. Logical level: what information or data is stored in the database (for example, its datatype and format).
3. View level: the level at which end users work. Any amendment made at this level can be saved under another name.

The major purpose of a database system is to provide users with an abstract
view of the system.
The system hides certain details of how data is stored, created and
maintained.
Complexity should be hidden from database users.
Data abstraction is the process of representing the essential features without
including implementation details.
Because many database-system users are not computer trained, developers hide the
complexity from users through several levels of abstraction, to simplify users'
interactions with the system:
1) Physical level.
The lowest level of abstraction describes how the data are
actually stored. The physical level describes complex low-level data
structures in detail.

E.g. index, B-tree, hashing.

2) Logical level.
The next-higher level of abstraction describes what data are
stored in the database, and what relationships exist among those data. The
logical level thus describes the entire database in terms of a small number of
relatively simple structures.

3) View level.
The highest level of abstraction describes only part of the entire
database. A large database stores a wide variety of information. Many users
of the database system do not need all this information; instead, they need
to access only a part of the database. The view level of abstraction exists to
simplify their interaction with the system.

E.g. tellers in a bank get a view of customer accounts, but not of payroll data.
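
The three levels can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, columns and sample rows are assumptions made up for the example, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: what data is stored (hypothetical bank tables).
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, owner TEXT, balance REAL)")
conn.execute("CREATE TABLE payroll (emp_id INTEGER PRIMARY KEY, salary REAL)")
conn.execute("INSERT INTO account VALUES (1, 'Harris', 250.0)")
conn.execute("INSERT INTO payroll VALUES (7, 4000.0)")

# Physical level: an index changes how rows are located, not what is stored.
conn.execute("CREATE INDEX idx_account_owner ON account(owner)")

# View level: a teller sees customer accounts but nothing about payroll.
conn.execute("CREATE VIEW teller_view AS SELECT acc_no, owner, balance FROM account")

teller_rows = conn.execute("SELECT * FROM teller_view").fetchall()
print(teller_rows)
```

The teller's queries go through teller_view only, so the payroll table and the index stay hidden from that user.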
Q.2) What is Data Abstraction?
Ans.
Abstraction, in general, is the process of taking away or removing
characteristics from something in order to reduce it to a set of essential
characteristics. As in abstract art, the representation is likely to be one
potential abstraction of a number of possibilities. A database abstraction
layer, for example, is one of a number of such possibilities.
Data abstraction is usually the first step in database design. A
complete database is much too complex a system to be developed without
first creating a simplified framework. Data abstraction makes it possible for
the developer to start from essential elements -- data abstractions -- and
incrementally add data detail to create the final system.

Example: A doctor sees (abstracts) a person as a patient. The doctor
is interested in the name, height, weight, age, blood group, previous or existing
diseases etc. of a person.
An employer sees (abstracts) a person as an employee. The employer is
interested in the name, age, health, degree of study, work experience etc. of a person.

Q.3) Define the term Data Independence?


Ans.

Data Independence:
Data independence is the type of data transparency that matters for a centralized DBMS. It refers
to the immunity of user applications to changes made in the definition and organization of data.
A database management system holds a lot of data besides the users' data. A DBMS comprises three kinds
of schemas, which are data about data (metadata). Metadata is stored along with the database and, once
stored, is hard to modify. But as a DBMS expands, its metadata needs to change over time to satisfy the requirements of
users. If all of this data were highly interdependent, such changes would become tedious and highly complex.

[Image: Data independence]


The metadata itself is organized in a layered architecture, so that a change at one layer does not affect the
data at another layer. The layers are independent of each other but mapped onto each other.

Logical Data Independence


Logical data is data about the database, that is, it stores information about how data is managed inside. For example, a
table (relation) stored in the database and all the constraints applied on that relation.
Logical data independence is a mechanism that decouples the logical schema from the actual data stored on the disk. If we
change the table format, the data residing on the disk should not have to change.

Physical Data Independence

All schemas are logical, while the actual data is stored in bit format on the disk. Physical data independence is the power to
change the physical data without impacting the schema or logical data.
For example, changing or upgrading the storage system itself, such as using SSDs instead of hard disks, should not have any impact on the logical data or schemas.
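
A minimal sketch of both kinds of independence, using Python's built-in sqlite3 module. The student table and its rows are hypothetical. Adding an index stands in for a physical change, and adding a column stands in for a change of table format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(2, "Asha"), (1, "Ravi")])

query = "SELECT name FROM student WHERE roll_no = 1"
before = conn.execute(query).fetchall()

# Physical change: add an index. The schema seen by the query is untouched.
conn.execute("CREATE INDEX idx_student_roll ON student(roll_no)")
after_index = conn.execute(query).fetchall()

# Change of table format: add a column. The old query still works unchanged.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_alter = conn.execute(query).fetchall()

print(before == after_index == after_alter)
```

The same query text returns the same rows before and after both changes, which is what the application programmer relies on.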

Q.4) Define an entity, entity set , relationship and relationship set?


Ans.
Entity:
An entity is a person, place, thing or event for which data is collected and maintained.
For example, a library system may contain data about different entities like BOOK and MEMBER.
A college system may include entities like STUDENT, TEACHER and CLASS.

Entity Set:

An entity set is a set of entities of the same type (e.g., all persons having an
account at a bank).
Entity sets need not be disjoint. For example, the entity set employee (all
employees of a bank) and the entity set customer (all customers of the bank)
may have members in common.
An entity is represented by a set of attributes.
   o E.g. name, S.I.N., street, city for the "customer" entity.
   o The domain of an attribute is the set of permitted values (e.g. the
     telephone number must be seven digits).
Formally, an attribute is a function which maps an entity set into a domain.
   o Every entity is described by a set of (attribute, data value) pairs.
   o There is one pair for each attribute of the entity set.
   o E.g. a particular customer entity is described by the set {(name, Harris),
     (S.I.N., 890-123-456), (street, North), (city, Georgetown)}.
An analogy can be made with the programming language notion of type definition.
The concept of an entity set corresponds to the programming language type
definition.
A variable of a given type has a particular value at a point in time.
Thus, a programming language variable corresponds to an entity in the E-R
model.
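
The analogy with type definitions can be sketched in Python. The class name Customer and the field name sin are assumptions chosen for the illustration; the values come from the example above.

```python
from dataclasses import dataclass, asdict

# The class plays the role of the type definition (entity set "customer").
@dataclass(frozen=True)
class Customer:
    name: str
    sin: str
    street: str
    city: str

# A particular object plays the role of one entity.
harris = Customer("Harris", "890-123-456", "North", "Georgetown")

# The entity set is a set of entities of the same type.
customer_set = {harris}

# Every entity is described by (attribute, data value) pairs.
pairs = set(asdict(harris).items())
print(sorted(pairs))
```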

Relationships & Relationship Sets


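A relationship is an association between several entities.
A relationship set is a set of relationships of the same type.
Formally, it is a mathematical relation on $n \geq 2$ (possibly non-distinct) sets.
If $E_1, E_2, \ldots, E_n$ are entity sets, then a relationship set $R$ is a subset of

$$\{(e_1, e_2, \ldots, e_n) \mid e_1 \in E_1, e_2 \in E_2, \ldots, e_n \in E_n\}$$

where $(e_1, e_2, \ldots, e_n)$ is a relationship.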

For example, consider the two entity sets customer and account. (Fig. 2.1 in the text).
We define the relationship CustAcct to denote the association between customers and
their accounts. This is a binary relationship set (see Figure 2.2 in the text).
Going back to our formal definition, the relationship set CustAcct is a subset of all the
possible customer and account pairings.
This is a binary relationship. Occasionally there are relationships involving more than
two entity sets.
The role of an entity is the function it plays in a relationship. For example, the
relationship works-for could be ordered pairs of employee entities. The first employee
takes the role of manager, and the second one will take the role of worker.
A relationship may also have descriptive attributes. For example, date (last date of
account access) could be an attribute of the CustAcct relationship set.
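
The idea that CustAcct is a subset of all possible customer and account pairings can be checked with a small Python sketch. The customer names, account numbers and the date are hypothetical sample data.

```python
from itertools import product

customers = {"Harris", "Smith"}
accounts = {"A-101", "A-215"}

# All possible (customer, account) pairings.
all_pairings = set(product(customers, accounts))

# The relationship set: only the pairings that actually hold.
cust_acct = {("Harris", "A-101"), ("Smith", "A-215"), ("Smith", "A-101")}
is_relationship_set = cust_acct <= all_pairings

# A descriptive attribute belongs to the relationship, not to either entity.
last_access = {("Harris", "A-101"): "2015-01-10"}

print(len(all_pairings), is_relationship_set)
```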

Q.5) What are the different types of database system users?


Ans
1. Application programmers or Ordinary users
2. End users
3. Database Administrator (DBA)
4. System Analyst
1. Application programmers or Ordinary users: These users write application programs
to interact with the database. Application programs can be written in a programming
language such as COBOL, PL/I, C++ or Java, or in a higher-level fourth-generation language.
Such programs access the database by issuing the appropriate request, typically an SQL statement,
to the DBMS.
2. End Users: End users are the users who use the applications developed. End users need not
know about the working, database design, the access mechanism etc. They just use the system to
get their task done. End users are of two types:

a) Direct users b) Indirect users

a) Direct users: Direct users are the users who use the computer and the database system directly, by
following instructions provided in the user interface. They interact using the application
programs already developed, for getting the desired result. E.g. people at railway reservation
counters, who directly interact with the database.
b) Indirect users: Indirect users are those users who derive benefit from the work of the DBMS
indirectly. They use the outputs generated by the programs, for decision making or any other
purpose. They are just concerned with the output and are not bothered about the programming
part.
3. Database Administrator (DBA): The Database Administrator (DBA) is the person who
makes the strategic and policy decisions regarding the data of the enterprise, and who provides
the necessary technical support for implementing these decisions. Therefore, the DBA is responsible
for overall control of the system at a technical level. In a database environment, the primary
resource is the database itself and the secondary resource is the DBMS and related software.
Administering these resources is the responsibility of the Database Administrator (DBA).
4. System Analyst: The System Analyst determines the requirements of end users, especially naive
and parametric end users, and develops specifications for transactions that meet these
requirements. The System Analyst plays a major role in designing the database, its properties and its structure,
and prepares the system requirement statement, which involves the feasibility aspect, economic
aspect, technical aspect etc. of the system.

Q.6) Define Simple , composite attributes and single valued and multivalued
attributes?
Ans

Simple and Composite Attributes


Attributes can be classified as having many parts to them or as just a single unbreakable attribute. A
composite attribute is an attribute that can be subdivided into other simple attributes with meanings of
their own. A simple attribute is an attribute that cannot be subdivided into parts.
Example: Imagine that the entity Student, instead of having the three
attributes stu_LastName, stu_MiddleName, stu_FirstName, had one attribute called stu_Name.
The attribute stu_Name would be considered a composite attribute since it can be subdivided into
the other three attributes: stu_LastName, stu_MiddleName, stu_FirstName. The rest of the attributes
would be considered simple attributes since they can't be subdivided into parts.

Single-valued and multi-valued Attributes


Attributes can be classified as single-valued or multi-valued. A single-valued attribute can have only one
value, while a multi-valued attribute can store multiple values.
Example: In the entity Student, stu_Address could be considered a multi-valued attribute since a
student could have multiple addresses where he or she lives. An example of a single-valued attribute
would be stu_LastName, since a student usually has only one last name.
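
A minimal Python sketch of these two classifications, assuming a hypothetical Student record. The field names follow the stu_ naming used above; the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Name:
    # Composite attribute: subdivided into three simple attributes.
    stu_FirstName: str
    stu_MiddleName: str
    stu_LastName: str

@dataclass
class Student:
    stu_Name: Name
    # Multi-valued attribute: a student may have several addresses.
    stu_Address: List[str] = field(default_factory=list)

s = Student(Name("Ann", "B.", "Lee"), ["12 Hill Road", "Hostel 4"])
print(s.stu_Name.stu_LastName, len(s.stu_Address))
```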

Q.7) Explain Derived Attribute with example?


Ans
The last category of attributes is the derived attribute, where one attribute is
calculated from another attribute. The derived attribute need not be stored in the database; it can
instead be calculated using an algorithm.
Example: In the entity Student, stu_Age would be considered a derived attribute since it can be
calculated from the student's date of birth and the current date.
Common examples of derived attributes are age (derived from the stored DOB) and a salary that is calculated from other stored values.
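
As a sketch, stu_Age can be computed on demand instead of being stored. The class below is hypothetical, and the current date is passed in as a parameter so the result is reproducible.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Student:
    stu_DOB: date  # stored attribute

    def stu_Age(self, today: date) -> int:
        """Derived attribute: whole years between stu_DOB and today."""
        years = today.year - self.stu_DOB.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.stu_DOB.month, self.stu_DOB.day):
            years -= 1
        return years

s = Student(date(2000, 6, 15))
print(s.stu_Age(date(2020, 6, 14)))  # 19
print(s.stu_Age(date(2020, 6, 15)))  # 20
```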

Q.8) What is total and partial participation?


Ans
Total participation
Every member of the entity set must participate in the relationship.

It is represented by a double line from the entity rectangle to the relationship diamond.

E.g., a Class entity cannot exist unless it is related to a Faculty member entity
(in this example, not necessarily at Juniata).

You can set this double line in Dia.

In a relational model we will use the references clause.

Partial participation
There exists at least one instance of the first entity type that does not share an instance
of the relationship type with the other entity type.
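
The difference can be sketched as a set check in Python. The entity sets, the teaches relationship and the helper participation() are all hypothetical names introduced for this illustration.

```python
classes = {"DB101", "OS201"}
faculty = {"Rao", "Mehta", "Khan"}

# Relationship set of (faculty member, class) pairs.
teaches = {("Rao", "DB101"), ("Mehta", "OS201")}

def participation(entity_set, relationship, position):
    """Return 'total' if every entity appears in the relationship, else 'partial'."""
    participating = {pair[position] for pair in relationship}
    return "total" if entity_set <= participating else "partial"

print(participation(classes, teaches, 1))  # total: every class has a faculty member
print(participation(faculty, teaches, 0))  # partial: Khan teaches no class
```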

Q.9) Define Primary key , Candidate key and Super Key?


Ans
A primary key is a column (or columns) in a table that uniquely identifies the rows
in that table.

CUSTOMERS
CustomerNo | FirstName | LastName
           | Sally     | Thompson
           | Sally     | Henderson
           | Harry     | Henderson
           | Sandra    | Wellington

For example, in the table above, CustomerNo is the primary key.


The values placed in primary key columns must be unique for each row: no
duplicates can be tolerated. In addition, nulls are not allowed in primary key
columns.
Definition: A candidate key is a combination of attributes that can be uniquely used to
identify a database record without any extraneous data. Each table may have one or
more candidate keys. One of these candidate keys is selected as the table's primary key.

Definition: A superkey is a combination of attributes that can be uniquely used to
identify a database record. A table might have many superkeys. Candidate keys are a
special subset of superkeys that do not have any extraneous information in them.
Examples: Imagine a table with the fields <Name>, <Age>, <SSN> and <Phone
Extension>. This table has many possible superkeys. Three of these are <SSN>,
<Phone Extension, Name> and <SSN, Name>. Of those listed, <SSN> is a
candidate key. <SSN, Name> is not, because Name is not necessary to uniquely identify
records. <Phone Extension, Name> is a candidate key only if neither of its attributes
identifies records on its own.
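
The relation between superkeys and candidate keys can be sketched in Python. The sample rows are hypothetical, and uniqueness is checked only against those rows; a real key is a constraint that must hold for every possible state of the table.

```python
from itertools import combinations

# Hypothetical sample rows; "Ext" stands for Phone Extension.
rows = [
    {"Name": "Ann", "Age": 30, "SSN": "111", "Ext": 10},
    {"Name": "Ann", "Age": 30, "SSN": "222", "Ext": 11},
    {"Name": "Bob", "Age": 30, "SSN": "333", "Ext": 10},
]
attrs = ["Name", "Age", "SSN", "Ext"]

def is_superkey(cols):
    """True if no two sample rows agree on all of the given columns."""
    projected = {tuple(r[c] for c in cols) for r in rows}
    return len(projected) == len(rows)

superkeys = [set(c) for n in range(1, len(attrs) + 1)
             for c in combinations(attrs, n) if is_superkey(c)]

# A candidate key is a superkey with no smaller superkey inside it.
candidate_keys = [k for k in superkeys
                  if not any(other < k for other in superkeys)]

print(candidate_keys)
```

In this sample, SSN alone and the pair (Name, Ext) are minimal, while (SSN, Name) is a superkey that carries an extraneous attribute.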

Q.10) What is Weak Entity set and Strong Entity Set?


Ans
WEAK ENTITY SET
A weak entity set is an entity set that does not have sufficient attributes to form a primary key.
E.g. an entity set payment with the following attributes: payment-number, payment-date
and payment-amount.
STRONG ENTITY SET
A strong entity set is an entity set that has a primary key.
E.g. an entity set customer with the following attributes: customer-name, customer-id and
balance.
Q.11) Distinguish Between Specialization And Generalization?
Ans
Generalization and specialization are important relationships that exist between a higher-level entity
set and one or more lower-level entity sets.
1. Generalization is the result of taking the union of two or more lower-level entity sets to produce a
higher-level entity set.

Specialization is the result of taking subsets of a higher-level entity set to form lower-level entity
sets.
2. In generalization, each higher-level entity must also be a lower-level entity.
In specialization, some higher-level entities may not belong to any lower-level entity set.
3. Specialization is a top-down process, whereas generalization is a bottom-up process.
Q.12) What is Aggregation?
Ans.

Aggregation refers to performing an operation on a group of values to get a single
result.

One limitation of the E-R model is that it cannot express relationships among
relationships. Aggregation is used to overcome this.

Aggregation is a process in which the relationship between two entities is treated as a single
entity.

Aggregation represents abstract entities by allowing relationships between
relationships.

Aggregation is a special type of association.

Aggregation is also called a "has-a" relationship.


5 Marks
Q.1) Compare database systems with file systems.

Database Systems                             | File Systems
Data are non-redundant and consistent        | Data are redundant and inconsistent
Easy to access the data                      | Difficult to access the data
It ensures integrity, security and atomicity | It does not ensure integrity, security and atomicity
Data can be accessed concurrently            | It leads to concurrent access anomalies
Appropriate data can be retrieved easily     | Since data is scattered in various files, new application programs have to be written to retrieve the appropriate data

Q.2) Define and map E R Model to Relational Model?


Ans

Entity-Relationship Model
The Entity-Relationship model is based on the notion of real-world entities and the relationships among them. While formulating
a real-world scenario into a database model, the ER model creates entity sets, relationship sets, general attributes and
constraints.
The ER model is best used for the conceptual design of a database.
The ER model is based on:
   Entities and their attributes
   Relationships among entities
These concepts are explained below.

[Image: ER Model]

Entity
An entity in the ER model is a real-world entity, which has some properties called attributes. Every attribute is defined by
its set of values, called its domain.

For example, in a school database, a student is considered an entity. A student has various attributes like name, age
and class.

Relationship
The logical association among entities is called a relationship. Relationships are mapped with entities in various ways.
Mapping cardinalities define the number of associations between two entities.
Mapping cardinalities:
   one to one
   one to many
   many to one
   many to many

Converting (Mapping) E-R Model to Relational Model

E-R Model -> Relational Model -> Normalization -> Database

Previously, we covered modeling the user's view as an E-R diagram: entities, relationships,
attributes and identifiers.
We now need to convert this conceptual representation to a model that can be implemented
directly in a database.
The relational model includes relations, tuples, attributes, keys and foreign keys.
   o Relation: a two-dimensional table made up of tuples. (This is a simple definition that
     we will define more rigorously in a later chapter.)
   o Tuple: a row of data in a relation, made up of one or more attributes.
   o Attribute: a characteristic of the relation, contained in a tuple.
Attribute: A characteristic of the relation contained in a tuple.
The following vocabulary is commonly used. Note the different terms used
depending on the model being discussed.

ER Model        | Relational Model | Database | Traditional Programmer
Entity          | Relation         | Table    | File
Entity Instance | Tuple            | Row      | Record
Attribute       | Attribute        | Column   | Field
Identifier      | Key              | Key      | Key (or link)
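
One possible mapping can be sketched with Python's built-in sqlite3 module. The entities student and class and the many-to-many relationship enrolls are hypothetical examples: each entity becomes a table, each identifier becomes a primary key, and the relationship becomes a table of foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce foreign keys

conn.executescript("""
-- Entity -> table, identifier -> primary key
CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE class (class_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Many-to-many relationship -> table holding the two foreign keys
CREATE TABLE enrolls (
    stu_id INTEGER REFERENCES student(stu_id),
    class_id INTEGER REFERENCES class(class_id),
    PRIMARY KEY (stu_id, class_id)
);

INSERT INTO student VALUES (1, 'Asha');
INSERT INTO class VALUES (10, 'DBMS');
INSERT INTO enrolls VALUES (1, 10);
""")

rows = conn.execute(
    "SELECT s.name, c.title FROM enrolls e "
    "JOIN student s ON s.stu_id = e.stu_id "
    "JOIN class c ON c.class_id = e.class_id"
).fetchall()
print(rows)
```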

Q.3) What is relational algebra? Describe any three operations with examples.
Ans

Relational algebra is:
   the formal description of how a relational database operates
   an interface to the data stored in the database itself
   the mathematics which underpins SQL operations
Operators in relational algebra are not necessarily the same as SQL
operators, even if they have the same name. For example, the SELECT
statement exists in SQL, and also exists in relational algebra. These two uses
of SELECT are not the same. The DBMS must take whatever SQL statements
the user types in and translate them into relational algebra operations before
applying them to the database.

Terminology
Relation - a set of tuples.
Tuple - a collection of attributes which describe some real world entity.
Attribute - a real world role played by a named domain.
Domain - a set of atomic values.
Set - a mathematical definition for a collection of objects which contains
no duplicates.
Select Operation ($\sigma$)
Selects tuples that satisfy the given predicate from a relation.
Notation: $\sigma_p(r)$
Where $p$ stands for the selection predicate and $r$ stands for the relation. $p$ is a propositional logic
formula which may use connectors like and, or and not. Its terms may use
relational operators like $=, \neq, \geq, <, >, \leq$.
For example:

$\sigma_{\text{subject} = \text{"database"}}(\text{Books})$
Output: Selects tuples from Books where subject is 'database'.

$\sigma_{\text{subject} = \text{"database"} \land \text{price} = \text{"450"}}(\text{Books})$
Output: Selects tuples from Books where subject is 'database' and price is 450.

$\sigma_{\text{subject} = \text{"database"} \land \text{price} < \text{"450"} \lor \text{year} > \text{"2010"}}(\text{Books})$
Output: Selects tuples from Books where subject is 'database' and price is less than 450, or where the
publication year is greater than 2010, that is, published after 2010.
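
The three selections above can be mimicked in Python by filtering a list of tuples with a predicate. The helper select() and the sample Books rows are assumptions made for this sketch; prices and years are stored as numbers here.

```python
# Hypothetical contents of the Books relation.
books = [
    {"title": "DB Basics", "subject": "database", "price": 450, "year": 2008},
    {"title": "SQL Deep", "subject": "database", "price": 300, "year": 2012},
    {"title": "Nets", "subject": "networks", "price": 200, "year": 2015},
]

def select(relation, predicate):
    """Keep only the tuples of the relation that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

r1 = select(books, lambda t: t["subject"] == "database")
r2 = select(books, lambda t: t["subject"] == "database" and t["price"] == 450)
r3 = select(books, lambda t: (t["subject"] == "database" and t["price"] < 450)
            or t["year"] > 2010)

print([t["title"] for t in r1])  # ['DB Basics', 'SQL Deep']
print([t["title"] for t in r2])  # ['DB Basics']
print([t["title"] for t in r3])  # ['SQL Deep', 'Nets']
```

In the third predicate, "and" binds more tightly than "or", which is why the parentheses only make the grouping explicit.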
Project Operation ($\Pi$)
Projects the listed column(s) of a relation.
Notation: $\Pi_{A_1, A_2, \ldots, A_n}(r)$
Where $A_1, A_2, \ldots, A_n$ are attribute names of relation $r$.
Duplicate rows are automatically eliminated, as a relation is a set.
For example:
$\Pi_{\text{subject}, \text{author}}(\text{Books})$

Selects and projects the columns named subject and author from relation Books.
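
A minimal Python sketch of projection with duplicate elimination. The helper project() and the sample rows are hypothetical.

```python
# Hypothetical contents of the Books relation.
books = [
    {"title": "DB Basics", "subject": "database", "author": "Rao"},
    {"title": "DB Advanced", "subject": "database", "author": "Rao"},
    {"title": "Nets", "subject": "networks", "author": "Khan"},
]

def project(relation, attributes):
    """Keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # a relation is a set, so duplicates are removed
            seen.add(row)
            result.append(row)
    return result

result = project(books, ["subject", "author"])
print(result)  # [('database', 'Rao'), ('networks', 'Khan')]
```

Three input tuples yield two output tuples because the first two books share the same subject and author.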

Union Operation ($\cup$)
The union operation performs a binary union between two given relations and is defined as:
$r \cup s = \{\, t \mid t \in r \text{ or } t \in s \,\}$
Notation: $r \cup s$
Where $r$ and $s$ are either database relations or relation result sets (temporary relations).

For a union operation to be valid, the following conditions must hold:
   r and s must have the same number of attributes.
   Attribute domains must be compatible.
Duplicate tuples are automatically eliminated.

$\Pi_{\text{author}}(\text{Books}) \cup \Pi_{\text{author}}(\text{Articles})$

Output: Projects the names of the authors who have written a book, an article or both.
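
A sketch of union in Python, where each relation is a set of tuples. The helper union() and the author names are hypothetical, and only the first validity condition (same number of attributes) is checked.

```python
def union(r, s):
    """Set union of two relations that have the same number of attributes."""
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("relations must have the same number of attributes")
    return r | s

# Hypothetical results of projecting author from Books and from Articles.
book_authors = {("Rao",), ("Khan",)}
article_authors = {("Khan",), ("Mehta",)}

authors = union(book_authors, article_authors)
print(sorted(authors))  # [('Khan',), ('Mehta',), ('Rao',)]
```

Khan appears in both inputs but only once in the result, because duplicate tuples are eliminated.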

Q.4) Explain EER Features?


Ans

Attribute inheritance
The attributes of a higher-level entity set are inherited by the lower-level entity sets.
Aggregation
Aggregation is an abstraction in which relationship sets are treated as higher-level entity
sets. Here a relationship set is embedded inside an entity set, and these entity sets can
participate in relationships.
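
Attribute inheritance resembles class inheritance in a programming language. In this Python sketch the entity sets Person and Employee and their attributes are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    # Higher-level entity set.
    name: str
    city: str

@dataclass
class Employee(Person):
    # Lower-level entity set: inherits name and city, adds salary.
    salary: float

e = Employee("Harris", "Georgetown", 5000.0)
attribute_names = [f.name for f in fields(Employee)]
print(attribute_names)  # ['name', 'city', 'salary']
```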

Q.7) Explain Join Operation of Relational Algebra?


Ans
From Korth book
Q.8) Write queries in relational algebra.
Ans

1. Consider a database with the following schema:

Person ( name, age, gender )         name is a key
Frequents ( name, pizzeria )         (name, pizzeria) is a key
Eats ( name, pizza )                 (name, pizza) is a key
Serves ( pizzeria, pizza, price )    (pizzeria, pizza) is a key

Write relational algebra expressions for the following nine queries.
(Warning: some of the later queries are a bit challenging.)
If you know SQL, you can try running SQL queries to match your
relational algebra expressions. We've created a file for
download with schema declarations and sample data. (See our quick
guide for SQL system instructions.) To check your queries, the
correct results are found in the answers section below.
a. Find all pizzerias frequented by at least one person under the age of 18.
b. Find the names of all females who eat either mushroom or pepperoni pizza (or both).
c. Find the names of all females who eat both mushroom and pepperoni pizza.
d. Find all pizzerias that serve at least one pizza that Amy eats for less than $10.00.
e. Find all pizzerias that are frequented by only females or only males.
f. For each person, find all pizzas the person eats that are not served by any pizzeria the person frequents. Return all such person (name) / pizza pairs.
g. Find the names of all people who frequent only pizzerias serving at least one pizza they eat.
h. Find the names of all people who frequent every pizzeria serving at least one pizza they eat.
i. Find the pizzeria serving the cheapest pepperoni pizza. In the case of ties, return all of the cheapest-pepperoni pizzerias.

2. Consider a schema with two relations, R(A, B) and S(B, C), where
all values are integers. Make no assumptions about keys. Consider
the following three relational algebra expressions:

Two of the three expressions are equivalent (i.e., produce the same
answer on all databases), while one of them can produce a different
answer. Which query can produce a different answer? Give the
simplest database instance you can think of where a different
answer is produced.

3. Consider a relation R(A, B) that contains r tuples, and a
relation S(B, C) that contains s tuples; assume r > 0 and s > 0. Make
no assumptions about keys. For each of the following relational
algebra expressions, state in terms of r and s the minimum and
maximum number of tuples that could be in the result of the
expression.

4. Two more exotic relational algebra operators we didn't cover are
the semijoin and antijoin. Semijoin is the same as natural join,
except only attributes of the first relation are returned in the result.
For example, if we have relations Student(ID, name) and Enrolled(ID,
course), and not all students are enrolled in courses, then the query
"Student $\ltimes$ Enrolled" returns the ID and name of all students who
are enrolled in at least one course. In the general case,
E1 $\ltimes$ E2 returns all tuples in the result of expression E1 such that
there is at least one tuple in the result of E2 with matching values for
the shared attributes. Antijoin is the converse: E1 $\triangleright$ E2 returns all
tuples in the result of expression E1 such that there are no tuples in
the result of E2 with matching values for the shared attributes. For
example, the query "Student $\triangleright$ Enrolled" returns the ID and name of
all students who are not enrolled in any courses.
Like some other relational operators (e.g., intersection, natural join),
semijoin and antijoin are abbreviations: they can be defined in
terms of other relational operators. Define E1 $\ltimes$ E2 in terms of other
relational operators. That is, give an equation "E1 $\ltimes$ E2 = ??",
where ?? on the right-hand side is a relational algebra expression
that doesn't use semijoin. Similarly, give an equation "E1 $\triangleright$ E2 = ??",
where ?? on the right-hand side is a relational algebra expression
that doesn't use antijoin.
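
The behaviour of the two operators, as described in words above, can be sketched in Python. This only illustrates what they return and is not the relational algebra equation the exercise asks for. The helper names and the sample rows are hypothetical.

```python
# Hypothetical sample data for Student(ID, name) and Enrolled(ID, course).
student = [{"ID": 1, "name": "Amy"}, {"ID": 2, "name": "Ben"}]
enrolled = [{"ID": 1, "course": "DBMS"}]

def matches(t1, t2):
    """True if the two tuples agree on every attribute they share."""
    shared = t1.keys() & t2.keys()
    return all(t1[a] == t2[a] for a in shared)

def semijoin(e1, e2):
    """Tuples of e1 that have at least one matching tuple in e2."""
    return [t1 for t1 in e1 if any(matches(t1, t2) for t2 in e2)]

def antijoin(e1, e2):
    """Tuples of e1 that have no matching tuple in e2."""
    return [t1 for t1 in e1 if not any(matches(t1, t2) for t2 in e2)]

print(semijoin(student, enrolled))  # [{'ID': 1, 'name': 'Amy'}]
print(antijoin(student, enrolled))  # [{'ID': 2, 'name': 'Ben'}]
```

Every Student tuple lands in exactly one of the two results, and only attributes of the first relation are returned.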

5. Consider a relation Temp(regionID, name, high, low) that records
historical high and low temperatures for various regions. Regions
have names, but they are identified by regionID, which is a key.

Consider the following query, which uses the linear notation
introduced at the end of the relational algebra videos.

State in English what is computed as the final Result. The answer
can be articulated in a single phrase.
1. Sample solutions; in general there are many correct expressions
for each query.

Query results for SQL data:
a. Straw Hat, New York Pizza, Pizza Hut
b. Amy, Fay
c. Amy
d. Little Caesars, Straw Hat, New York Pizza
e. Little Caesars, Chicago Pizza, New York Pizza
f. Amy: mushroom, Dan: mushroom, Gus: mushroom
g. Amy, Ben, Dan, Eli, Fay, Gus, Hil
h. Fay
i. Straw Hat, New York Pizza

2. Query (c) is different. Let R = {(3, 4)} and S = {(1, 2)}. Then
query (a) and (b) produce an empty result while (c) produces {(3, 2)}.

3. a. Minimum = max(r, s) (if one relation is a subset of the other)
      Maximum = r + s (if the relations are disjoint)
   b. Minimum = 0 (if there are no shared B values)
      Maximum = r × s (if all of the B values are the same)
   c. Minimum = 0 (if there are no shared B values)
      Maximum = min(r, s) (if one relation's B values are a subset of the other's, and all B
      values are distinct)
   d. (equivalent to R)
      Minimum = r, Maximum = r
   e. Minimum = 0 (if A = B in all tuples of R)
      Maximum = r (if A <> B in all tuples of R)

4.

5. Names of regions with the highest high temperature and/or
lowest low temperature

https://www.classle.net/book/dbms
