
Data Source

Datasource is a name given to the connection set up to a database from a server.


The name is commonly used when creating a query to the database. The DSN
(Datasource Name) does not have to be the same as the filename of the database.
For example, a database file named “friends.mdb” could be set up with a DSN of
“school”. The DSN “school” would then be used to refer to the database when
performing a query.

A DataSource object is the representation of a data source in the Java programming
language. In basic terms, a data source is a facility for storing data. It can be as
sophisticated as a complex database for a large corporation or as simple as a file
with rows and columns. A data source can reside on a remote server, or it can be on
a local desktop machine. Applications access a data source using a connection, and
a DataSource object can be thought of as a factory for connections to the particular
data source that the DataSource instance represents. The DataSource interface
provides two methods for establishing a connection with a data source.
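
The "factory for connections" idea can be sketched in Python. This is only an illustration of the concept, not the Java DataSource API: the class name SimpleDataSource, its get_connection method, and the friends table are hypothetical, and an in-memory SQLite database stands in for a real data source.

```python
import sqlite3


class SimpleDataSource:
    """Hypothetical connection factory; a stand-in, not the Java interface."""

    def __init__(self, database: str):
        self.database = database  # where the data lives (file path or ":memory:")

    def get_connection(self) -> sqlite3.Connection:
        # Each call hands back a new connection to the same data source.
        return sqlite3.connect(self.database)


source = SimpleDataSource(":memory:")
conn = source.get_connection()
conn.execute("CREATE TABLE friends (name TEXT)")
conn.execute("INSERT INTO friends VALUES ('Mac')")
names = [row[0] for row in conn.execute("SELECT name FROM friends")]
conn.close()
```

The application only asks the data source for a connection; it does not need to know how the connection is created.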

Source of Data
• From another Database

• From Web / Network / User / Groups

Data File Environment
Q8/I (A) – 2006 Q5 (A) –
• A data file environment, also called a file system (often also written as
filesystem), is a method of storing and organizing computer files and their
data. Each file in this system is isolated and has little or no connection
with any other.

• In a data file environment, files are produced using various tools and
applications, so file integrity is much lower.

• Files can be stored on different hard disk partitions according to user
requirements, and more files can be added to the same partition until the disk
is full.
• Security is generally low in a Data File Environment and sharing integrity is
also low.
Database Environment
Q8/II (A) – 2006 Q5 (A) –
• In a database environment, data is logically stored in tabular form, and
tables often have relations and connections with other tables.

• In a database environment, all files (databases) are created, opened,
edited, and deleted using the same tool (the DBMS software), so file integrity
is very high.

• Databases are broken down into smaller “data files”, which are stored at
arbitrary locations on the related server. Such data files are logically
connected but physically scattered across the server’s storage device.

• Different usage and access rights are awarded to different levels of users,
which helps keep the database environment secure. It is also highly sharable,
since most database software shares the same core language (SQL).

Data model (Database Models)


A data model in software engineering is an abstract model that describes how data
are represented and accessed. Data models formally define data elements and
relationships among data elements for a domain of interest. According to Hoberman
(2009), "A data model is a wayfinding tool for both business and IT professionals,
which uses a set of symbols and text to precisely explain a subset of real
information to improve communication within the organization and thereby lead to
a more flexible and stable application environment." A data model explicitly
determines the structure of data or structured data. Typical applications of data
models include database models, design of information systems, and enabling
exchange of data. Usually data models are specified in a data modeling language.

A database model is a theory or specification describing how a database is
structured and used. Several such models have been suggested. Common models
include:

• Flat model: This may not strictly qualify as a data model. The flat (or
table) model consists of a single, two-dimensional array of data elements,
where all members of a given column are assumed to be similar values, and all
members of a row are assumed to be related to one another.

• Hierarchical model: In this model data is organized into a tree-like
structure, implying a single upward link in each record to describe the
nesting, and a sort field to keep the records in a particular order in each
same-level list.

• Network model: This model organizes data using two fundamental constructs,
called records and sets. Records contain fields, and sets define one-to-many
relationships between records: one owner, many members.

• Relational model: A database model based on first-order predicate logic. Its
core idea is to describe a database as a collection of predicates over a finite
set of predicate variables, describing constraints on the possible values and
combinations of values.

• Object-relational model: Similar to a relational database model, but
objects, classes and inheritance are directly supported in database schemas
and in the query language.

• Concept-oriented model: This is the conceptual structuring of a database.
The structure actually implemented may differ from it, because it depends
heavily on the system or database designer, who may conceive the problem
differently from the way it is finally implemented.

• Star schema: The simplest style of data warehouse schema. The star schema
consists of a few "fact tables" (possibly only one, justifying the name)
referencing any number of "dimension tables". The star schema is
considered an important special case of the snowflake schema.
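
To make the star schema concrete, here is a small sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_store) and all the data are invented for illustration; the point is only that one fact table references several dimension tables and that queries join outward from the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
-- The fact table holds the measurements and points at each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    amount     INTEGER
);
INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Book');
INSERT INTO dim_store   VALUES (1, 'Delhi'), (2, 'Paris');
INSERT INTO fact_sales  VALUES (1, 1, 10), (1, 2, 5), (2, 1, 20);
""")

# Total sales per city: join the fact table to one dimension and aggregate.
sales_by_city = dict(conn.execute("""
    SELECT s.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY s.city
"""))
conn.close()
```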

Properties of Databases (ACID)


Atomicity

Atomicity requires that database modifications follow an all-or-nothing rule.
A transaction is said to be atomic if, when one part of the transaction fails, the
entire transaction fails and the database state is left unchanged. It is critical that
the database management system maintains the atomic nature of transactions in
spite of any application, DBMS, operating system or hardware failure. An atomic
transaction cannot be subdivided, and must be processed in its entirety or not at
all. Atomicity means that users do not have to worry about the effect of incomplete
transactions. Transactions can fail for several kinds of reasons:

• Hardware failure: A disk drive fails, preventing some of the transaction's database
changes from taking effect
• System failure: The user loses their connection to the application before providing
all necessary information
• Database failure: E.g., the database runs out of room to hold additional data
• Application failure: The application attempts to post data that violates a rule that
the database itself enforces, such as attempting to create a new account without
supplying an account number
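
The all-or-nothing rule can be seen in a short sketch with Python's sqlite3 module. The account table, the transfer function, and the balances are assumptions made for the example. The failure here is of the "application failure" kind: the second update violates a rule the database enforces, so the first update is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


failed = transfer(conn, "A", "B", 150)   # would drive A below zero
after_failure = dict(conn.execute("SELECT id, balance FROM account"))
succeeded = transfer(conn, "A", "B", 40)
after_success = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer, account B does not keep the credit it briefly received, because the whole transaction was undone.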

Consistency

The consistency property ensures that the database remains in a consistent state.
More precisely, it says that any transaction will take the database from one
consistent state to another consistent state.

The consistency rule applies only to integrity rules that are within its scope. Thus, if
a DBMS allows fields of a record to act as references to another record, then
consistency implies the DBMS must enforce referential integrity: by the time any
transaction ends, each and every reference in the database must be valid. If a
transaction consisted of an attempt to delete a record referenced by another, each
of the following mechanisms would maintain consistency:

• Abort the transaction, rolling back to the consistent, prior state
• Delete all records that reference the deleted record (this is known as cascade
delete)
• Nullify the relevant fields in all records that point to the deleted record.
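
The three mechanisms above correspond to foreign-key actions in SQL. The sketch below tries each one with Python's sqlite3 module; the dept and emp tables and the helper name delete_parent are assumptions for the example, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3


def delete_parent(on_delete):
    """Delete a referenced row under the given ON DELETE action."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE dept (name TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE emp (ecode TEXT PRIMARY KEY, "
        f"dept TEXT REFERENCES dept(name) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO dept VALUES ('Sales')")
    conn.execute("INSERT INTO emp VALUES ('E305', 'Sales')")
    try:
        conn.execute("DELETE FROM dept WHERE name = 'Sales'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "aborted"  # the delete was refused, prior state kept
    rows = conn.execute("SELECT ecode, dept FROM emp").fetchall()
    conn.close()
    return outcome, rows


restrict_result = delete_parent("RESTRICT")
cascade_result = delete_parent("CASCADE")
set_null_result = delete_parent("SET NULL")
```

In every case no employee row is left pointing at a department that no longer exists.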

Isolation

Isolation refers to the requirement that other operations cannot access or see data
that has been modified during a transaction that has not yet completed. Each
transaction must remain unaware of other concurrently executing transactions,
except that one transaction may be forced to wait for the completion of another
transaction that has modified data that the waiting transaction requires.
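
A minimal sketch of isolation, assuming SQLite with its default settings and two connections to the same temporary database file (the file name demo.db and table t are invented). The reader does not see the writer's row until the writer commits.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(path)
    reader = sqlite3.connect(path)
    writer.execute("CREATE TABLE t (x INTEGER)")
    writer.commit()

    # The INSERT opens a transaction on the writer that is not yet committed.
    writer.execute("INSERT INTO t VALUES (1)")
    seen_before = reader.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]

    writer.commit()
    seen_after = reader.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]

    # Close both connections before the temporary directory is removed.
    writer.close()
    reader.close()
```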

Durability

Durability is the DBMS's guarantee that once the user has been notified of a
transaction's success, the transaction will not be lost. The transaction's data
changes will survive system failure, and all integrity constraints have been
satisfied, so the DBMS won't need to reverse the transaction. Many DBMSs
implement durability by writing transactions into a transaction log that can be
reprocessed to recreate the system state right before any later failure. A
transaction is deemed committed only after it is entered in the log.
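
The transaction-log idea can be illustrated with a toy sketch. It is not how any particular DBMS implements its log: the functions commit and recover, the JSON-lines file format, and the account values are all assumptions. It shows only the principle that a change counts as committed once it is in the log, and that replaying the log rebuilds the state.

```python
import json
import os
import tempfile


def commit(log_path, changes):
    """Append one transaction to the log and force it to disk."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(changes) + "\n")
        log.flush()
        os.fsync(log.fileno())  # committed only once this returns


def recover(log_path):
    """Rebuild the state by replaying every logged transaction in order."""
    state = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            state.update(json.loads(line))
    return state


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "tx.log")
    commit(log_path, {"A": 60})
    commit(log_path, {"B": 40})
    commit(log_path, {"A": 55})
    # Pretend the in-memory state was lost; only the log file survives.
    recovered = recover(log_path)
```
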
Deeper into Database modeling language
Q2 (C) – 2007 Q4 (B) –
• Hierarchical model

o A hierarchy can link entities either directly or indirectly, and either
vertically or horizontally. The only direct links in a hierarchy, in so far
as they are hierarchical, are to one's immediate superior or to one of
one's subordinates, although a system that is largely hierarchical can
also incorporate alternative hierarchies. Indirect hierarchical links can
extend "vertically" upwards or downwards via multiple links in the
same direction, following a path.

o Degree of branching

Degree of branching refers to the number of direct subordinates or
children an object has (in graph terms, the number of child vertices a
node is directly connected to). Hierarchies can be categorized based on
the "maximum degree", the highest degree present in the system as a
whole. Categorization in this way yields two broad classes: linear and
branching.

▪ In a linear hierarchy, the maximum degree is 1. In other
words, all of the objects can be visualized in a lineup, and each
object (excluding the top and bottom ones) has exactly one
direct subordinate and one direct superior. Note that this is
referring to the objects and not the levels; every hierarchy has
this property with respect to levels, but normally each level can
have an infinite number of objects. An example of a linear
hierarchy is the hierarchy of life.

▪ In a branching hierarchy, one or more objects have a degree
of 2 or more (and therefore the maximum degree is 2 or higher).
For many people, the word "hierarchy" automatically evokes an
image of a branching hierarchy. Branching hierarchies are
present within numerous systems, including organizations and
classification schemes. The broad category of branching
hierarchies can be further subdivided based on the degree.

▪ A flat hierarchy is a branching hierarchy in which the
maximum degree approaches infinity, i.e., with a wide span.
Most often, systems intuitively regarded as hierarchical have at
most a moderate span. Therefore, a flat hierarchy is often not
viewed as a hierarchy at all at first blush. For example,
diamond and graphite are flat hierarchies of numerous carbon
atoms, which can be further decomposed into subatomic
particles.

▪ An overlapping hierarchy is a branching hierarchy in which at
least one object has two parent objects. For example, a
graduate student can have two co-supervisors to whom they
report directly and equally, and who have the same level of
authority within the university hierarchy (i.e., they have the
same position or tenure status).
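
The categories above can be checked mechanically. In the sketch below a hierarchy is represented as a dictionary mapping each object to its list of direct subordinates; that representation, the function names, and the sample hierarchies are assumptions chosen for illustration.

```python
def max_degree(children):
    """Largest number of direct subordinates held by any object."""
    return max((len(kids) for kids in children.values()), default=0)


def classify(children):
    """Return ('linear' or 'branching', whether any object has two parents)."""
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, set()).add(parent)
    overlapping = any(len(p) > 1 for p in parents.values())
    kind = "linear" if max_degree(children) <= 1 else "branching"
    return kind, overlapping


chain = {"domain": ["kingdom"], "kingdom": ["phylum"], "phylum": []}
organization = {"ceo": ["cto", "cfo"], "cto": ["dev"], "cfo": [], "dev": []}
supervision = {
    "dean": ["prof_a", "prof_b"],
    "prof_a": ["student"],
    "prof_b": ["student"],  # the student reports to both supervisors
    "student": [],
}

chain_kind = classify(chain)
organization_kind = classify(organization)
supervision_kind = classify(supervision)
```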

• Network model

o The network model is a database model conceived as a flexible way of
representing objects and their relationships. Its distinguishing feature
is that the schema, viewed as a graph in which object types are nodes
and relationship types are arcs, is not restricted to being a hierarchy or
lattice.

• Object model

o A collection of objects or classes through which a program can
examine and manipulate some specific parts of its world. In other
words, the object-oriented interface to some service or system. Such
an interface is said to be the object model of the represented service
or system.
• Relational model

o Its central idea is to describe a database as a collection of predicates
over a finite set of predicate variables, describing constraints on the
possible values and combinations of values. The content of the
database at any given time is a finite (logical) model of the database,
i.e. a set of relations, one per predicate variable, such that all
predicates are satisfied. A request for information from the database (a
database query) is also a predicate.

o The purpose of the relational model is to provide a declarative method
for specifying data and queries: we directly state what information the
database contains and what information we want from it, and let the
database management system software take care of describing data
structures for storing the data and retrieval procedures for getting
queries answered.
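
As a rough illustration of "a query is also a predicate", a relation can be modelled in Python as a set of tuples and a query as a condition that selects the tuples satisfying it. The Employee type, the query helper, and the sample rows are assumptions for the sketch; a real DBMS chooses its own storage structures and retrieval procedures.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    code: str
    name: str
    city: str


# A relation: an unordered set of tuples, with no duplicates.
employees = {
    Employee("E1", "Mac", "Delhi"),
    Employee("E2", "Sandra", "CA"),
    Employee("E3", "Henry", "Paris"),
}


def query(relation, predicate):
    """Return the tuples of the relation for which the predicate holds."""
    return {row for row in relation if predicate(row)}


# We state what we want (employees in Paris), not how to find them.
in_paris = query(employees, lambda e: e.city == "Paris")
```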

Inverted lists and other methods are also used. A given database management
system may provide one or more of the four models. The optimal structure depends
on the natural organization of the application's data, and on the application's
requirements (which include transaction rate (speed), reliability, maintainability,
scalability, and cost).

The dominant model in use today is the ad hoc one embedded in SQL, despite the
objections of purists who believe this model is a corruption of the relational model,
since it violates several of its fundamental principles for the sake of practicality and
performance. Many DBMSs also support the Open Database Connectivity (ODBC) API,
which provides a standard way for programmers to access the DBMS.
DBMS Concepts
A relation is a table in which data are inserted and maintained. One or more such
tables may be linked using different types of keys to form a database. Such links
help maintain referential integrity (all related areas are updated when a common
field is updated) and reduce redundancy and the errors that multiply from it.

A relation is logically divided into rows and columns. The columns represent
different attributes of the table, one of which is generally a primary key (used to
identify each row uniquely). The rows, frequently referred to as tuples in database
terminology, each hold the complete information on a single item of the kind the
table was made to record.

Keys in DBMS
Primary key: The attribute or combination of attributes that uniquely identifies a
row or record.

Foreign key: An attribute or combination of attributes in a table whose value
matches a primary key in another table.

Composite key: A primary key that consists of two or more attributes is known as
a composite key.

Candidate key: A column, or minimal combination of columns, that uniquely
identifies each row and could therefore serve as the primary key.

Alternate key: Any candidate key that is not selected to be the primary key.

Super key: A super key is defined in the relational model as a set of attributes of a
relation variable for which it holds that in all relations assigned to that variable
there are no two distinct tuples (rows) that have the same values for the attributes
in this set. Equivalently, a super key can be defined as a set of attributes of a
relation variable upon which all attributes of the relation are functionally dependent.

Secondary key: Another name for an alternate key, that is, a candidate key not
chosen as the primary key.
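
The key types can be seen side by side in a small schema. This is a sketch using Python's sqlite3 module; the employee and assignment tables, the email column, and the helper name rejected are hypothetical and chosen only to show each kind of key being enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
conn.executescript("""
CREATE TABLE employee (
    ecode TEXT PRIMARY KEY,        -- primary key
    email TEXT NOT NULL UNIQUE     -- candidate key not chosen: an alternate key
);
CREATE TABLE assignment (
    ecode    TEXT REFERENCES employee(ecode),  -- foreign key
    projcode TEXT,
    hours    INTEGER,
    PRIMARY KEY (ecode, projcode)              -- composite key
);
INSERT INTO employee VALUES ('E1', 'mac@example.com');
INSERT INTO assignment VALUES ('E1', 'P2', 48);
""")


def rejected(sql):
    """True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_primary = rejected("INSERT INTO employee VALUES ('E1', 'x@example.com')")
duplicate_alternate = rejected("INSERT INTO employee VALUES ('E2', 'mac@example.com')")
dangling_foreign = rejected("INSERT INTO assignment VALUES ('E9', 'P2', 10)")
duplicate_composite = rejected("INSERT INTO assignment VALUES ('E1', 'P2', 10)")
valid_row = rejected("INSERT INTO assignment VALUES ('E1', 'P5', 40)")
```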


DBMS Terminologies
• Database management system (DBMS): Software for establishing,
updating, and querying (i.e., managing) a database.
Q2 (A) – 2007

• Database: A collection of files organized into related units that are then
viewed as a single store. The data in the database are generally made
available to a wide range of users by sharing them and by assigning
different rights and roles to different classes of users.

• SQL (Structured Query Language): This is the core language of relational
databases and the common platform through which different database
engines interact.
• Data warehouse: This is a physical repository where relational data are
organized to provide clean, enterprise-wide data in a standardized format.
A data warehouse is a huge database that stores current and historical data of
potential interest to decision makers throughout the company. These data
originate in different transaction processing systems (TPS) and through other
external entry methods.
Q5 (B) – 2007

• Data marts: These are subsets of a data warehouse in which a
summarized and highly focused portion of the organization’s data is placed in
a separate database for a specific set of users. Companies often build
enterprise-wide warehouses where a central data warehouse serves the
entire organization; or they create small decentralized warehouses called
data marts.

• Entity: An entity may be defined as a thing which is recognized as being
capable of an independent existence and which can be uniquely identified.
Entities carry attributes by which they are uniquely identified.

• Relationship: Two different entities that have some logical association
are connected using a relationship. Relationships may also have
attributes attached to them.

• Attributes: These are the features or identifiable characteristics of
an element (entity or relationship).

Relevance of relational design in DSS
Q2 (B) – 2007 Q5 (A) – Q1 (A) – 2005 Q2 (A) – 2006
• Multidimensional problem solving: In DSS architecture, problem solving
requires evaluating the problem in multiple ways and collecting the requisite
information for each different evaluation.

• Critical queries: DBMS and RDBMS can handle complex queries and
information search which is very useful in DSS.

• Referentially integrated inputs: An RDBMS and relational structuring of data
help in connecting related fields and information about a single item or object.

• Data warehousing support: An RDBMS can connect remotely to different
servers and fetch data across boundaries to create a centralized
data access medium, which eventually gives rise to data warehouses.

• Data mart support: An RDBMS, through its access rights and different views of
the same data, can create data marts for high-involvement decision making.

• Sharability and scalability of information: Since a database accepts
concurrent access, multiple users can work with the same data from different
geographical locations or at different decision points. Information stored in
the database is highly scalable, which offers flexibility at the information
searcher’s end.

Database Normalization
Q8/I (B) – 2006 Q8/II (A) –
Normalization is the scientific method of breaking down complex table structures
into simple table structures using certain rules. This method is used to reduce
redundancy in a table and eliminate the problems of inconsistency and wasted disk
space. The normalization theory is based on the fundamental notion of functional
dependency: given a relation (table) R, attribute A is functionally dependent on
attribute B if each value of B in R is associated with precisely one value of A.

E.g., >>

Code  Name    City
E1    Mac     Delhi
E2    Sandra  CA
E3    Henry   Paris
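
Whether a functional dependency holds in a given set of rows can be tested directly. The function below is a sketch (its name and the dictionary-per-row representation are assumptions); it checks that every value of the determinant is paired with exactly one value of the dependent attributes. The extra row in the second check is hypothetical and added only to show a violation.

```python
def functionally_determines(rows, determinant, dependent):
    """True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"Code": "E1", "Name": "Mac", "City": "Delhi"},
    {"Code": "E2", "Name": "Sandra", "City": "CA"},
    {"Code": "E3", "Name": "Henry", "City": "Paris"},
]

holds = functionally_determines(rows, ["Code"], ["Name", "City"])

# Hypothetical conflicting row: the same Code with a different City.
conflicting = rows + [{"Code": "E1", "Name": "Mac", "City": "Paris"}]
violated = functionally_determines(conflicting, ["Code"], ["Name", "City"])
```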

Not Normalized Form

The relation is kept without any normalization rules and guidelines. E.g., >>

ECODE  DEPT     DEPTHEAD  PROJCODE  HOURS
E101   Systems  E901      P27       90
                          P51       101
                          P20       60
E305   Sales    E906      P27       109
                          P22       98
E508   Admin    E908      P51       NULL
                          P27       72

First Normal Form (1NF)

A table is said to be in 1NF if each cell of the table contains precisely one value.
E.g., >>

ECODE  DEPT     DEPTHEAD  PROJCODE  HOURS
E101   Systems  E901      P27       90
E101   Systems  E901      P51       101
E101   Systems  E901      P20       60
E305   Sales    E906      P27       109
E305   Sales    E906      P22       98
E508   Admin    E908      P51       NULL
E508   Admin    E908      P27       72
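
The step from the unnormalized form to 1NF amounts to repeating the employee columns for every project, so that each cell holds one value. A minimal sketch, assuming the repeating group is held in a list under a field named PROJECTS (a name invented here) and NULL is represented by None:

```python
unnormalized = [
    {"ECODE": "E101", "DEPT": "Systems", "DEPTHEAD": "E901",
     "PROJECTS": [("P27", 90), ("P51", 101), ("P20", 60)]},
    {"ECODE": "E305", "DEPT": "Sales", "DEPTHEAD": "E906",
     "PROJECTS": [("P27", 109), ("P22", 98)]},
    {"ECODE": "E508", "DEPT": "Admin", "DEPTHEAD": "E908",
     "PROJECTS": [("P51", None), ("P27", 72)]},
]


def to_1nf(records):
    """Emit one flat row per (employee, project) pair."""
    rows = []
    for rec in records:
        for projcode, hours in rec["PROJECTS"]:
            rows.append(
                (rec["ECODE"], rec["DEPT"], rec["DEPTHEAD"], projcode, hours)
            )
    return rows


rows_1nf = to_1nf(unnormalized)
```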

Second Normal Form (2NF)

A table is said to be in 2NF when it is in 1NF and every attribute in the row is
functionally dependent on the whole key, and is not just a part of the key.

Guidelines to convert a table to 2NF:


• Find and remove attributes that are functionally dependent on only a part of
the key and not on the whole key. Place them in a different table.
• Group the remaining attributes.

E.g., >>

ECODE  DEPT     DEPTHEAD
E101   Systems  E901
E305   Sales    E906
E508   Admin    E908

ECODE  PROJCODE  HOURS
E101   P27       90
E101   P51       101
E101   P20       60
E305   P27       109
E305   P22       98
E508   P51       NULL
E508   P27       72
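
The 2NF split can be reproduced by projecting the 1NF table onto two column sets, under the assumption that the key of the 1NF table is (ECODE, PROJCODE). The project helper and variable names are hypothetical. Joining the two projections on ECODE gives back the original rows, so no information is lost.

```python
COLUMNS = ("ECODE", "DEPT", "DEPTHEAD", "PROJCODE", "HOURS")
rows_1nf = [
    ("E101", "Systems", "E901", "P27", 90),
    ("E101", "Systems", "E901", "P51", 101),
    ("E101", "Systems", "E901", "P20", 60),
    ("E305", "Sales", "E906", "P27", 109),
    ("E305", "Sales", "E906", "P22", 98),
    ("E508", "Admin", "E908", "P51", None),
    ("E508", "Admin", "E908", "P27", 72),
]


def project(rows, keep):
    """Keep only the named columns; the set removes duplicate rows."""
    idx = [COLUMNS.index(c) for c in keep]
    return {tuple(row[i] for i in idx) for row in rows}


# DEPT and DEPTHEAD depend on ECODE alone, which is only part of the key.
employee_dept = project(rows_1nf, ("ECODE", "DEPT", "DEPTHEAD"))
assignment = project(rows_1nf, ("ECODE", "PROJCODE", "HOURS"))

# Rejoin on ECODE to confirm the decomposition is lossless.
rejoined = {
    e + a[1:] for e in employee_dept for a in assignment if e[0] == a[0]
}
```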

Third Normal Form (3NF)


A table is said to be in 3NF when it is in 2NF and every non-key attribute is
functionally dependent only on the primary key.

Guidelines to convert a table to 3NF:


• Find and remove non-key attributes that are functionally dependent on
attributes other than the primary key. Place them in a different table.
• Group the remaining attributes.

E.g., >>

DEPT     DEPTHEAD
Systems  E901
Sales    E906
Admin    E908
Finance  E909

ECODE  DEPT
E101   Systems
E305   Sales
E402   Finance
E508   Admin
E607   Finance
E608   Finance
E104   Systems
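
One practical benefit of the 3NF split is that a department head is stored in exactly one place. The sketch below holds the two tables as Python dictionaries (a representation chosen for brevity); the helper head_of and the replacement head code E910 are hypothetical.

```python
emp_dept = {
    "E101": "Systems", "E305": "Sales", "E402": "Finance", "E508": "Admin",
    "E607": "Finance", "E608": "Finance", "E104": "Systems",
}
dept_head = {"Systems": "E901", "Sales": "E906", "Admin": "E908", "Finance": "E909"}


def head_of(ecode):
    """Follow ECODE -> DEPT -> DEPTHEAD through the two tables."""
    return dept_head[emp_dept[ecode]]


finance_staff = sorted(e for e, d in emp_dept.items() if d == "Finance")
heads_before = {e: head_of(e) for e in finance_staff}

# One update in one row changes the head for every Finance employee.
dept_head["Finance"] = "E910"  # hypothetical new head
heads_after = {e: head_of(e) for e in finance_staff}
```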

Boyce – Codd Normal Form

A relation is in BCNF only if every determinant is a candidate key.

Guidelines to convert a table to BCNF


• Find and remove the overlapping candidate keys. Place the part of candidate
key and the attribute it is functionally dependent on, in another table.
• Group the remaining items into a table.

E.g., >>

ECODE  NAME      PROJCODE  HOURS
E1     Veronica  P2        48
E2     Anthony   P5        100
E3     Mac       P6        15
E4     Susan     P2        250
E4     Susan     P5        75
E1     Veronica  P5        40

ECODE  PROJCODE  HOURS
E1     P2        48
E2     P5        100
E3     P6        15
E4     P2        250
E4     P5        75
E1     P5        40
