UNIT-TWO
TRANSACTION PROCESSING
What is a transaction?
A transaction is a mechanism for applying a desired set of modifications/operations to a database. As in real life, the database instance left after a successful manipulation of the content of the database is the most up-to-date copy of the database.
A transaction is an action, or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database (i.e., a logical unit of work on the database).
By comparison with a database transaction, an application program is a series of transactions with non-database processing in between.
What we are interested in here is the online transaction: the interaction between the users of a database system and the shared data stored in the database. The transaction program contains the steps involved in the business transaction.
Transactions can be started, attempted, then committed or aborted via data manipulation
commands of SQL.
A single transaction might require several queries, each reading and/or writing information in the database. When this happens it is usually important to be sure that the database is not left with only some of the queries carried out. For example, when doing a money transfer, if the money was debited from one account, it is important that it also be credited to the receiving account. Also, transactions should not interfere with each other.
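As an illustration, here is a minimal sketch of such a transfer in Python using the standard-library sqlite3 module; the account table, account numbers, and amounts are made up for the example:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
    conn.execute("INSERT INTO account VALUES (1, 500.0)")
    conn.execute("INSERT INTO account VALUES (2, 100.0)")
    conn.commit()

    def transfer(from_acc, to_acc, amount):
        try:
            conn.execute("UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                         (amount, from_acc))
            conn.execute("UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                         (amount, to_acc))
            conn.commit()            # both updates become permanent together
        except Exception:
            conn.rollback()          # on any failure, neither update survives
            raise

    transfer(1, 2, 50.0)
    print(conn.execute("SELECT * FROM account").fetchall())
    # [(1, 450.0), (2, 150.0)] -- debit and credit succeed or fail as one unit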
A: Atomicity
C: Consistency
I: Isolation
D: Durability
These are referred to as the ACID properties of a transaction. Without the ACID properties, the integrity of the database cannot be guaranteed.
Atomicity
The "all or none" property.
Every transaction should be considered an atomic process that cannot be subdivided into smaller tasks. Due to this property, just like an atom, which either exists or does not exist, a transaction has only two possible outcomes:
Done - the transaction completes successfully and its effects are visible in the database.
Never Started - if the transaction fails during execution, all its modifications are undone to bring the database back to the last consistent state, i.e., the effects of the failed transaction are removed.
Consistency
If the transaction code is correct, then the transaction, at the end of its execution, must leave the database consistent. A transaction should transform the database from one consistent state to another consistent state.
Isolation
A transaction must execute without interference from other concurrent transactions and its intermediate or
partial modifications to data must not be visible to other transactions.
Durability
The effects of a completed transaction must persist in the database, i.e., its updates must be available to other transactions immediately after the end of its execution, and they should not be affected by failures that occur after the transaction has completed.
In practice, these properties are often relaxed somewhat to provide better performance.
States of a Transaction
A transaction is an atomic operation from the user's perspective, but it consists of a collection of operations and can pass through a number of states during its execution.
1. Successful Termination: the transaction completes the execution of all its operations and reaches the COMMIT command.
2. Suicidal Termination: the transaction detects an error during its processing, decides to abort itself before the end of the transaction, and performs a ROLLBACK.
3. Murderous Termination: the DBMS or the system forces the transaction to abort for some reason.
[Figure: transaction state diagram. A transaction is initiated and starts modifying data. If no error occurs, it is OK to commit: the transaction ends with COMMIT and the database modified. If the transaction or the system detects an error, it aborts, a ROLLBACK is performed, and the transaction ends with the database unmodified.]
(a) Serial Execution: transactions are executed strictly serially. Transaction Ti completes and writes its results to the database, and only then is the next transaction Tj scheduled for execution. At any one time only one transaction is being executed in the system, and data is not shared between transactions at any specific time.
In serial execution, one transaction being executed does not interfere with the execution of any other transaction.
Execution is correct, i.e., if the input is correct then the output will be correct.
Execution is fast, since all the resources are available to the active transaction.
The worst thing about serial execution is its very inefficient resource utilization.
Example of a serial execution. Suppose data items X = 10 and Y = 6, N = 1, and T1 and T2 are the transactions:

T1: X := X+N; read(Y); Y := Y+N; write(Y)
T2: X := X+N; write(X)

Time   T1                  T2
t1     X := X+N {X = 11}
t2     read(Y)  {Y = 6}
t3     Y := Y+N {Y = 7}
t4     write(Y) {Y = 7}
t5                         X := X+N {X = 12}
t6                         write(X)
(b) Concurrent Execution: the reverse of serial execution; the individual operations of the transactions, i.e., reads and writes, are interleaved in some order.

Time   T1                  T2
t1     X := X+N {X = 11}
t2                         X := X+N {X = 11}
t3                         write(X)
t4     read(Y)  {Y = 6}
t5     Y := Y+N {Y = 7}
t6     write(Y) {Y = 7}
The final values at the end of T1 and T2 are X = 11 and Y = 7. This interleaving improves resource utilization but, unfortunately, gives an incorrect result. The correct value of X is 12, but in the concurrent execution X = 11; the reason for this error is the incorrect sharing of X by T1 and T2. In the serial execution T2 read the value of X written by T1 (i.e., 11), but in the concurrent execution T2 read the same value of X (i.e., 10) as T1 did, and the update made by T1 was overwritten by T2's update. This is why the final value of X is one less than what the serial execution produces.
The same lost-update pattern holds for any data item: in such a schedule on a data item A, the final value of A would be 200, overriding the update made by the first transaction, which had changed the value from 100 to 90.
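The lost update above can be replayed directly. Here is a minimal Python sketch that simulates each transaction's local copy of X (the variable names are illustrative):

    N = 1

    # Serial: T1 completes before T2 starts.
    X = 10
    x1 = X; x1 += N; X = x1      # T1: read, add, write -> X = 11
    x2 = X; x2 += N; X = x2      # T2: read, add, write -> X = 12
    print("serial:", X)          # 12

    # Concurrent (interleaved): both transactions read the original value 10.
    X = 10
    x1 = X                       # T1 reads X = 10
    x2 = X                       # T2 reads X = 10 (the same value!)
    x1 += N; X = x1              # T1 writes X = 11
    x2 += N; X = x2              # T2 writes X = 11, overwriting T1's update
    print("interleaved:", X)     # 11 -- T1's update is lost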
UNIT-THREE
CONCURRENCY CONTROL
The main techniques for controlling concurrent execution are:
1. Locking methods
2. Time stamping
3. Optimistic methods
Locking and time stamping are pessimistic (conservative) approaches, since they delay transactions in case they conflict with other transactions. The optimistic approach assumes that conflict is rare: it allows transactions to proceed without delay and checks for conflicts only at the end, when the transaction commits.
Locking Method
A LOCK is a mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. Locks are one way of enforcing concurrency control policies: a transaction uses locks to deny access to other transactions and so prevent incorrect updates.
A lock prevents another transaction from modifying an item, or even from reading it in the case of a write lock.
Lock(X): if a transaction T1 applies a lock on data item X, then X is locked and is not available to any other transaction.
Unlock(X): T1 unlocks X; X becomes available to other transactions.
Types of a Lock
Shared lock: a read operation does not change the value of a data item, so a data item can be read by two different transactions simultaneously under shared lock mode. To only read a data item, T1 will do: Shared lock(X), then Read(X), and finally Unlock(X).
Exclusive lock: a write operation changes the value of the data item, so two write operations from two different transactions, or a write from T1 and a read from T2, are not allowed. A data item can be modified only under an exclusive lock. To modify a data item, T1 will do: Exclusive lock(X), then Write(X), and finally Unlock(X).
When these locks are applied, a transaction must behave in a special way; this special behavior of a transaction is referred to as being well-formed.
Well-formed: a transaction is well-formed if it does not lock an already locked data item and does not try to unlock an unlocked data item.
Example: T1 and T2 are two transactions executed under locking as follows. T1 locks A in exclusive mode. When T2 wants to lock A, it finds it locked by T1, so T2 waits for T1 to unlock A. When A is released, T2 locks A and begins execution.
Now suppose a lock on a data item is applied, the data item is processed, and it is unlocked immediately after the reading/writing is completed, as in the two schedules below.
Initial values of A = 10 and B = 20.
Take T1 = {Lock(A); A := A+100; B := B+10} and T2 = {B := B*5; A := A+20}, each transaction unlocking an item immediately after using it:

Schedule 1              Schedule 2
T1          T2          T1          T2
Lock(A)                 Lock(A)
A := A+100              A := A+100
B := B+10                           B := B*5
            B := B*5    B := B+10
            A := A+20               A := A+20

Schedule 1 ends with A = 130 and B = (20+10)*5 = 150; Schedule 2 ends with A = 130 and B = 20*5+10 = 110. Releasing each lock immediately after use does not, in general, guarantee that the result is equivalent to some serial execution. Two-phase locking (2PL) provides that guarantee by dividing every transaction's locking actions into two phases (a minimal sketch follows):
Growing phase - acquires all locks but cannot release any locks.
Shrinking phase - releases locks but cannot acquire any new locks.
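Below is a minimal, single-threaded Python sketch of 2PL with shared/exclusive locks. The LockManager class and transaction names are illustrative, and a transaction that must wait is shown simply as a refused request:

    class LockManager:
        def __init__(self):
            self.locks = {}                      # item -> (mode, set of txn ids)

        def acquire(self, txn, item, mode):      # mode: "S" (shared) or "X" (exclusive)
            held = self.locks.get(item)
            if held is None:
                self.locks[item] = (mode, {txn})
                return True
            held_mode, holders = held
            if mode == "S" and held_mode == "S": # S is compatible with S
                holders.add(txn)
                return True
            return holders == {txn}              # otherwise only the holder itself

        def release(self, txn, item):
            mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

    lm = LockManager()
    # Growing phase: T1 acquires every lock it needs before releasing any.
    assert lm.acquire("T1", "A", "X")
    assert lm.acquire("T1", "B", "X")
    assert not lm.acquire("T2", "A", "S")    # T2 must wait: A is exclusively locked
    # ... T1 reads/writes A and B here ...
    # Shrinking phase: T1 releases; it may not acquire new locks after this point.
    lm.release("T1", "A")
    lm.release("T1", "B")
    assert lm.acquire("T2", "A", "S")        # now T2 can proceed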
Timeout
Deadlock detection can be done using the technique of TIMEOUT. Every transaction is given a maximum time to wait. If a transaction waits idle for this predefined period of time, the DBMS assumes that a deadlock has occurred, and it aborts and restarts the transaction.
Timestamp Ordering
The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order.
Cascading Rollback
Whenever a transaction T tries to issue a Read_Item(X) or a Write_Item(X) operation, the basic timestamp ordering algorithm compares the timestamp of T with the read timestamp and the write timestamp of X, to ensure that the timestamp order of execution is not violated. If it is violated, T is rolled back; and any transaction that has already read a value written by T must then be rolled back as well, which can cascade.
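A minimal Python sketch of the basic timestamp ordering checks (timestamps and item names are illustrative):

    # read_ts/write_ts record the youngest transaction that read/wrote each item;
    # a transaction whose timestamp is too old is rolled back (and restarted).
    read_ts, write_ts = {}, {}

    def read_item(ts, x):
        if ts < write_ts.get(x, 0):            # x was already written by a younger txn
            raise Exception(f"rollback T{ts}: too late to read {x}")
        read_ts[x] = max(read_ts.get(x, 0), ts)

    def write_item(ts, x):
        if ts < read_ts.get(x, 0) or ts < write_ts.get(x, 0):
            raise Exception(f"rollback T{ts}: too late to write {x}")
        write_ts[x] = ts

    write_item(2, "X")      # T2 writes X
    try:
        read_item(1, "X")   # the older T1 arrives late: conflicting operations
    except Exception as e:  # must run in timestamp order, so T1 is rolled back
        print(e)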
Optimistic Technique
Locking, and assigning and checking timestamp values, may be unnecessary for some transactions. The optimistic technique is based on the assumption that conflict is rare and that it is more efficient to let transactions proceed without delays. When a transaction reaches its commit point, a check is performed to determine whether a conflict has occurred; if there is a conflict, the transaction is rolled back and restarted. This potentially allows greater concurrency than the traditional protocols.
Three phases:
1. Read
2. Validation
3. Write
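A minimal Python sketch of these three phases, using read/write sets and backward validation (the class and variable names are illustrative):

    committed_write_sets = []        # write sets of already-committed transactions

    class Transaction:
        def __init__(self):
            self.start = len(committed_write_sets)   # commits seen at our start
            self.read_set, self.write_set = set(), set()

        def read(self, item):  self.read_set.add(item)    # read phase: no locks,
        def write(self, item): self.write_set.add(item)   # no timestamp checks

        def commit(self):
            # Validation phase: conflict if a later committer wrote what we read.
            for ws in committed_write_sets[self.start:]:
                if ws & self.read_set:
                    raise Exception("validation failed: rollback and restart")
            # Write phase: make our updates visible.
            committed_write_sets.append(self.write_set)

    t1, t2 = Transaction(), Transaction()
    t1.read("X"); t1.write("X")
    t2.read("X"); t2.write("X")
    t1.commit()                       # the first committer wins
    try:
        t2.commit()                   # t2 read X, which t1 wrote -> conflict
    except Exception as e:
        print(e)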
Lock Granularity
Granularity has an effect on the performance of the system. Since locking prevents access to data, the size of the data item that is locked determines how many other transactions are blocked. If the entire database is locked, consistency is easy to maintain, but the performance of the system suffers. If a single data item is locked, concurrent processing and performance are enhanced, but consistency is harder to guarantee. Thus, as one goes from the entire database down to a single value, performance and concurrent processing improve, but consistency is at greater risk and needs a good concurrency control mechanism and strategy.
Transaction Subsystem
Transaction manager
Scheduler
Recovery manager
Buffer manager
Transaction Manager: coordinates transactions on behalf of application programs.
Scheduler: ensures that the individual steps of different transactions preserve consistency.
Recovery Manager: ensures that the database is restored to the state it was in before a failure occurred.
Buffer Manager: responsible for the transfer of data between disk storage and main memory.
UNIT - FOUR
Database Recovery
Database recovery is the process of restoring the database to a correct state in the event of a failure, i.e., of eliminating the effects of the failure from the database. In database systems terminology, recovery means restoring the last consistent state of the data items.
Types of failures
A failure is a state in which data inconsistency would be visible to transactions if they were scheduled for execution. The kinds of failure include:
System crashes, resulting in loss of main memory.
Media failures, resulting in loss of parts of secondary storage.
Application software errors.
To keep the database secure, one should formulate a "plan of attack" in advance. The plan is used in case of database damage, which may range from a minor inconsistency to total loss of the data due to hazardous events.
The basic steps in performing a recovery are:
1. Isolating the database from other users. Occasionally, you may need to drop and re-create the database to continue the recovery.
2. Restoring the database from the most recent usable dump.
3. Applying transaction log dumps, in the correct sequence, to the database to make the data as current as possible.
Example:
The initial values are A = 100, B = 200, and C = 300, and the required state after the execution of T1 is A = 500, B = 800, and C = 700. Thus S1 = (100, 200, 300) and S2 = (500, 800, 700).
Transaction (T1)
Time  Operation
1     A = A + 200
2     B = B - 100
Suppose the DBMS starts at time t0 but fails at time tf, and assume the data for transactions T2 and T3 has already been written to secondary storage. T1 and T6, still active at the time of failure, have to be undone. In the absence of any other information, the recovery manager has to redo T2, T3, T4, and T5. tc is the time at which the DBMS took a checkpoint; with the checkpoint information, only the transactions that committed after tc need to be redone.
Recovery Facilities
A DBMS should provide the following facilities to assist with recovery:
Backup mechanism: makes periodic backup copies of the database.
Logging facility: keeps track of the current state of transactions and of database changes.
Checkpoint facility: enables updates to the database that are in progress to be made permanent.
Recovery manager: allows the DBMS to restore the database to a consistent state following a failure.
Recovery Techniques
Damage to the database can be either physical, resulting in the loss of the stored data, or logical, leaving the database in an inconsistent state after the failure. For each case we can have a recovery mechanism:
1. If the database has been damaged (extensive damage/catastrophic failure, such as physical media failure): restore the last backup copy of the database and re-execute the committed transactions from the log, up to the time of failure.
2. If the database has become inconsistent: recovery is required only if the database was updated, and the kind of recovery depends on the kind of update made on the database.
A transaction's updates to the database can be applied in three ways, giving three main recovery techniques:
1. Deferred Update
2. Immediate Update
3. Shadow Paging
Deferred Update
Updates are not written to the database until after the transaction has reached its commit point.
If the transaction fails before commit, it will not have modified the database, so no undoing of changes is required.
It may be necessary to redo the updates of committed transactions, since their effects may not yet have reached the database.
A transaction first modifies all its data items and then writes all its updates to the final copy of the database. No change is recorded in the database before commit; the changes are made only in the transaction's local workspace. The update of the actual database is made after commit, and after the change has been recorded in the log. Since there is never any need to perform an undo operation, this is also called the NO-UNDO/REDO algorithm.
Immediate Update
As soon as a transaction updates a data item, it updates the final copy of the database on the database disk. While making the update, the change is recorded in the transaction log to permit a rollback operation in case of failure. Both UNDO and REDO may be required to make the database consistent, so this is called the UNDO/REDO algorithm: it undoes all updates made in place by uncommitted transactions, and it redoes the operations of transactions that completed and committed but whose updates may not yet have reached the database. If the second scenario cannot occur, i.e., all updates are forced to the database before commit completes, the variation of this algorithm is called the UNDO/NO-REDO algorithm.
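A minimal Python sketch of the UNDO/REDO idea, with a write-ahead log of before- and after-images (the data item, values, and transaction names are illustrative):

    db = {"A": 100}
    log = []                           # (txn, item, before, after) or (txn, "COMMIT")

    def write(txn, item, value):
        log.append((txn, item, db[item], value))  # write-ahead: log before updating
        db[item] = value                          # immediate update of the database

    def commit(txn):
        log.append((txn, "COMMIT"))

    write("T1", "A", 90); commit("T1")            # T1 committed before the crash
    write("T2", "A", 200)                         # T2 was still active at the crash

    def recover():
        committed = {r[0] for r in log if r[1] == "COMMIT"}
        for txn, *rest in reversed(log):          # UNDO uncommitted, newest first
            if txn not in committed and len(rest) == 3:
                item, before, _ = rest
                db[item] = before
        for txn, *rest in log:                    # REDO committed, oldest first
            if txn in committed and len(rest) == 3:
                item, _, after = rest
                db[item] = after

    recover()
    print(db)                                     # {'A': 90} -- T2 undone, T1 redone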
Shadow Paging
Two page tables are maintained during the life of a transaction: the current page table and the shadow page table.
When the transaction starts, the two page tables are the same.
The shadow page table is never changed thereafter and is used to restore the database in the event of failure.
During the transaction, the current page table records all updates to the database.
When the transaction completes, the current page table becomes the shadow page table.
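A minimal Python sketch of shadow paging (the page store and slot numbers are illustrative):

    store = {0: "page0-data", 1: "page1-data"}   # disk: slot -> page contents
    shadow = {0: 0, 1: 1}                        # shadow page table (never changed)
    current = dict(shadow)                       # current page table (copy at start)
    next_slot = 2

    def write_page(page_no, data):
        global next_slot
        store[next_slot] = data                  # write the new version to a fresh slot
        current[page_no] = next_slot             # only the current table is redirected
        next_slot += 1

    write_page(1, "page1-NEW")

    def read(table, page_no):
        return store[table[page_no]]

    print(read(current, 1))   # 'page1-NEW'  -- the transaction sees its update
    print(read(shadow, 1))    # 'page1-data' -- the recovery state is untouched
    shadow = dict(current)    # commit: the current table becomes the shadow table
    # On failure before commit, simply discard `current`; `shadow` restores the DB.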
UNIT - FIVE
DISTRIBUTED DATABASE SYSTEMS
Unlike a centralized database, a distributed database distinguishes two kinds of transactions:
o Local Transaction: a transaction that accesses data only at the single site where it was initiated.
o Global Transaction: a transaction that accesses data at several sites.
Parallel DBMS: a DBMS running across multiple processors and disks that is designed
to execute operations in parallel, whenever possible, in order to improve performance.
Three architectures for parallel DBMS:
Shared Memory - fast data access for a limited number of processors.
Shared Disk - for applications that are inherently centralized.
Shared Nothing - massively parallel.
What makes a DDBMS different is that:
o the various sites are aware of each other, and
o each site provides a facility for executing both local and global transactions.
The different sites can be physically connected in different topologies:
o Fully Connected Network,
o Partially Connected Network,
o Tree Network,
o Star Network, and
o Ring Network.
The differences between these topologies are based on:
o Installation Cost: the cost of linking the sites physically.
o Communication Cost: the cost to send and receive messages and data.
o Reliability: resistance to failure.
o Availability: the degree to which data can be accessed despite failures.
The distribution of the database sites could be over:
1. a Large Geographical Area: Long-Haul Network
relatively slow
less reliable
uses telephone lines, microwave, satellite
2. a Small Geographical Area: Local Area Network
higher speed
lower rate of error
uses twisted pair, baseband coaxial, broadband coaxial, fiber optics
Even though integration of data implies centralized storage and control, in distributed database systems the intention is different: data is stored in different database systems in a decentralized manner but acts as if centralized, through the use of computer networks.
A distributed database system consists of loosely coupled sites that share no physical
component and database systems that run on each site are independent of each other.
How is data stored in DDBMS?
There are several ways of storing a single relation in a distributed database system.
1. Replication:
o The system maintains multiple identical copies of the data, stored at different sites, for faster retrieval and for fault tolerance.
o Duplicate copies of the tables can be kept on each system (replicated). With this option, updates to the tables can become involved (unless, of course, the copies of the tables are read-only).
o Advantages: availability, and increased parallelism (if the copies are mostly read).
o Disadvantage: increased overhead on update.
2. Fragmentation:
o The relation is partitioned into several fragments stored at distinct sites. The partitioning can be horizontal, vertical, or both, as illustrated in the sketch after this list.
o Horizontal Fragmentation
Sites share the responsibility of storing information from a single table, with individual sites storing groups of rows (performed by the Selection operation); the whole relation is reconstructed by taking the union of the fragments.
o Vertical Fragmentation
Sites share the responsibility of storing particular attributes of a table (performed by the Projection operation).
Needs the primary key attribute (or a tuple number) to be repeated in every fragment.
The whole content of the relation is reconstructed using the Natural JOIN operation on the repeated key attribute (or tuple number).
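A minimal Python sketch of both kinds of fragmentation and their reconstruction, on a toy EMPLOYEE relation (the attribute names are illustrative):

    employee = [
        {"EmpID": 1, "Name": "Abebe", "Dept": "Sales"},
        {"EmpID": 2, "Name": "Sara",  "Dept": "HR"},
    ]

    # Horizontal fragmentation: each site stores a subset of the rows (selection).
    site1 = [r for r in employee if r["Dept"] == "Sales"]
    site2 = [r for r in employee if r["Dept"] != "Sales"]
    reunion = site1 + site2                   # reconstruction by union

    # Vertical fragmentation: each site stores a subset of the columns
    # (projection); the key EmpID is repeated in every fragment.
    frag_a = [{"EmpID": r["EmpID"], "Name": r["Name"]} for r in employee]
    frag_b = [{"EmpID": r["EmpID"], "Dept": r["Dept"]} for r in employee]
    # Reconstruction by a natural join on the repeated key.
    rejoin = [{**a, **b} for a in frag_a for b in frag_b
              if a["EmpID"] == b["EmpID"]]
    print(rejoin == employee)                 # True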
Issues in DDBMS:
1. Distribution transparency: even though there are many systems, they appear as one, i.e., they are seen as a single logical entity.
2. Replication transparency: the copies of data spread across the sites also appear as just one copy to developers and users.
3. Distributed database recovery: if one machine goes down, how does that affect the others?
4. Security: just like any computer network, a distributed system needs a common way to validate users entering from any computer in the network of servers.
5. Common data dictionary: your schema now has to be distributed and work in connection with schemas created on many systems.
Why DDBMS/Advantages
1. Many existing systems
Distributed Query Processing
There are different strategies for processing a specific query, which in turn affect the performance of the system through the processing time and cost. In addition to the cost estimates we have for a centralized database (disk accesses, relation sizes, etc.), in distributed query processing we have to consider the cost of transferring data between sites.
For replicated data allocation, even though parallel processing is used to increase performance, an update has a great impact, since all the sites containing the data item must be updated. For fragmentation, an update works much as in a centralized database, but reconstruction of the whole relation requires accessing data from all sites containing part of the relation.
Example: let a distributed database have three sites (S1, S2, and S3), and let two relations, EMPLOYEE and DEPARTMENT, be located at S1 and S2 respectively, without any fragmentation. A query is initiated from S3 to retrieve employees, with the department they work in, as [First Name (15 bytes), Last Name (15 bytes), and Department Name (10 bytes)], a total of 40 bytes per result record.
Let:
For EMPLOYEE we have the following information:
- 10,000 records
- each record is 100 bytes long
For DEPARTMENT we have the following information:
- 100 records
- each record is 35 bytes long
There are three ways of executing this query:
1. Transfer DEPARTMENT and EMPLOYEE to S3 and perform the join there: needs a transfer of 10,000*100 + 100*35 = 1,003,500 bytes.
2. Transfer EMPLOYEE to S2, perform the join there (the result is 40*10,000 = 400,000 bytes), and transfer the result to S3: 1,000,000 + 400,000 = 1,400,000 bytes need to be transferred.
3. Transfer DEPARTMENT to S1, perform the join there (the result is again 400,000 bytes), and transfer the result to S3: 3,500 + 400,000 = 403,500 bytes need to be transferred.
One can then select the strategy that minimizes the data transfer cost for this specific query, as computed in the sketch below. Other optimization steps may also be included to make the processing more efficient, e.g., by reducing the size of the relations transferred using projection.
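A minimal Python sketch that reproduces the three transfer costs above and picks the cheapest strategy:

    EMP_RECS, EMP_SIZE = 10_000, 100    # EMPLOYEE at S1
    DEP_RECS, DEP_SIZE = 100, 35        # DEPARTMENT at S2
    RESULT_SIZE = 40                    # bytes per joined result record

    strategies = {
        "1: ship both relations to S3":
            EMP_RECS * EMP_SIZE + DEP_RECS * DEP_SIZE,
        "2: ship EMPLOYEE to S2, result to S3":
            EMP_RECS * EMP_SIZE + EMP_RECS * RESULT_SIZE,
        "3: ship DEPARTMENT to S1, result to S3":
            DEP_RECS * DEP_SIZE + EMP_RECS * RESULT_SIZE,
    }
    for name, cost in strategies.items():
        print(f"{name}: {cost:,} bytes")
    print("best:", min(strategies, key=strategies.get))  # strategy 3: 403,500 bytes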
Transaction Management
A transaction is a logical unit of work constituted by one or more operations executed by a single user. A transaction begins with the user's first executable query statement and ends when it is committed or rolled back.
There are two types of transactions in a DDBMS that access data at other sites:
1. Remote Transaction: contains only statements that access a single remote node. A remote query statement is a query that selects information from one or more remote tables, all of which reside at the same remote node or site.
o For example, such a query could access data from the dept table in the Addis schema (the site) of the remote sales database.
o A remote update statement is an update that modifies data in one or more tables, all of which are located at the same remote node. For example, an update of the branch table in the Addis schema of the remote sales database might end with:
WHERE BranchNo = 5;
2. Distributed Transaction: contains statements that access more than one node.
{Employee data is stored in Dessie and sales data is stored in Addis; there is an employee responsible for each sale.}
Remote query:
select client_nm
from clients@accounts.motorola.com;
Distributed query:
select project_name, student_nm
from intership@accounts.motorola.com i, student s
where s.stu_id = i.stu_id;
Remote Update
Distributed Update
Concurrency Control
o There are various techniques used for concurrency control in centralized database systems. The techniques in a distributed database system are similar to the centralized approach, with additional implementation requirements or modifications.
o The main difference, i.e., the change that must be incorporated, is the way the lock manager is implemented and how it functions.
o There are different schemes for concurrency control in a DDBS:
1. Non-Replicated Scheme
o No data is replicated in the system.
o Every site maintains a local lock manager (local lock and unlock).
o If site Si needs a lock on a data item at site Sj, it sends a message to the lock manager of site Sj, and the locking is handled by site Sj.
o All the locking and unlocking principles are handled by the local lock manager of the site at which the data object resides.
o Simple to implement.
o Needs three message transfers (see the sketch below):
o to request a lock,
o to be notified of the grant of the lock, and
o to request an unlock.
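A minimal Python sketch of this message flow, with the remote site's lock manager simulated by direct calls (the site and transaction names are illustrative):

    class Site:
        def __init__(self, name):
            self.name = name
            self.locked = {}                    # item -> owning transaction

        def handle(self, msg, txn, item):       # the site's local lock manager
            if msg == "LOCK":                   # message 1: request a lock
                if item in self.locked:
                    return "WAIT"               # txn must wait for the holder
                self.locked[item] = txn
                return "GRANTED"                # message 2: notify grant of lock
            if msg == "UNLOCK":                 # message 3: request unlock
                del self.locked[item]
                return "DONE"

    s_j = Site("Sj")                            # data item X resides at site Sj
    print(s_j.handle("LOCK", "T1@Si", "X"))     # GRANTED
    print(s_j.handle("LOCK", "T2@Sk", "X"))     # WAIT -- X is held at site Sj
    print(s_j.handle("UNLOCK", "T1@Si", "X"))   # DONE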
UNIT - SIX
OBJECT-ORIENTED DATABASES
Attributes
• Called instance variables
• Each has a domain
Object state
• The object's values at any given time, i.e., the values of its attributes at any given point in time
Methods
• Code/function that performs operations on the object's data
• Has a name and a body
Messages
• The means by which objects communicate
• A request from one object to another to activate one of its methods
• Invokes/calls the method to be applied
• Sent to an object from the real world or from another object
• Notation: Object.Method
• E.g.: StaffObject.updatesalary(salary)
Classes
• Blueprint for defining similar objects
• Objects with similar attributes that respond to the same messages are grouped together
• Defined only once and used by many objects
• A collection of similar objects, sharing attributes and structure
• Superclass
• Subclass
Inheritance
• The ability of an object to inherit the data structure and behavior of the classes above it
• Single inheritance - the class has one immediate superclass
• Multiple inheritance - the class has more than one immediate superclass
Method Overriding
• A subclass can redefine (override) a method it inherits from its superclass, so that the same message sent to different objects invokes different behavior, as in the sketch below.
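A minimal Python sketch tying these terms together (the class and attribute names follow the StaffObject example above; the bonus rule in the subclass is made up):

    class Staff:                            # class: blueprint for similar objects
        def __init__(self, name, salary):
            self.name = name                # attributes (instance variables);
            self.salary = salary            # their values form the object state

        def updatesalary(self, salary):     # method: operates on the object's data
            self.salary = salary

    class Manager(Staff):                   # subclass with single inheritance
        def updatesalary(self, salary):     # method overriding: redefines the
            self.salary = salary + 500      # inherited method (made-up bonus rule)

    staff = Staff("Abebe", 5000)
    staff.updatesalary(6000)                # message: Object.Method, as in
                                            # StaffObject.updatesalary(salary)
    mgr = Manager("Sara", 8000)
    mgr.updatesalary(9000)                  # the overriding method runs instead
    print(staff.salary, mgr.salary)         # 6000 9500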
Object Classification
• Simple
Only single-valued attributes
No attributes refer to other objects
• Composite
At least one multi-valued attribute
No attributes refer to other objects
• Compound
At least one attribute that references another object
OO vs. E-R Model Components
OO Data Model        E-R Model
OID                  N/A
OODBMS
• Object-oriented database technology is a marriage of object-oriented programming and database technologies.
• An OODBMS integrates the benefits of typical database systems with the characteristics of the object-oriented data model.
• Handles a mix of data types (since an OODBMS permits new data definitions)
• Follows OO rules
• Follows DBMS rules
OO and Database Design
• Provides data identification and the procedures for data manipulation
• Data and procedures form a self-contained entity
• Design is iterative and incremental
• The DBA does more programming
• Lack of standards
OODBMS Advantages
• More semantic information
• Support for complex objects
• Extensibility of data types (user defined data types)
• May improve performance with efficient caching
• Versioning
• Polymorphism: one operation shared by many objects and each acting differently
• Reusability
• Inheritance speeds development and application: defining new objects in terms of previously defined objects (incremental definition)
• Potential to integrate DBMSs into single environment
• Relationship between objects is represented explicitly supporting both navigational
and associative access to information.
OODBMS Disadvantages
• Strong opposition from the established RDBMSs
• Lack of theoretical foundation
UNIT - SEVEN
DATABASE SECURITY AND INTEGRITY
Privacy - the ethical and legal rights that individuals have with regard to control over the dissemination and use of their personal information.
Database integrity - the mechanisms applied to ensure that the data in the database is correct and consistent.
A good database security management system has the following characteristics:
- data independence,
- shared access,
- minimal redundancy.
With strong enforcement and management of these, the database system can effectively prevent accidental security and integrity threats arising from system errors, improper authorization, and concurrent-usage anomalies. In addition, to be an effective system, it should prevent malicious or intentional security and integrity threats, in which a computer system operator or a programmer can bypass security just as a hacker can.
There are certain security policy issues that we should recognize: we should consider administrative control policies, decide which of the security features offered by the DBMS are used to implement the system, decide whether the focus of security administration is left with the DBA and whether it is centralized or decentralized, and decide on the ownership of shared data as well.
The levels of security protection range across:
organization & administrative security,
physical & personnel security,
communication security, and
information systems security.
Database security and integrity is about protecting the database from being made inconsistent and from being disrupted; such events are collectively called database misuse. Database misuse can be intentional or accidental, and accidental misuse is easier to cope with than intentional misuse.
Accidental inconsistency could occur due to:
System crash during transaction processing
Anomalies due to concurrent access
Anomalies due to redundancy
Logical errors
1. Physical Level: concerned with securing the site containing the computer system. The
backup systems should also be physically protected from access except for authorized
users. In other words, the site or sites containing the computer systems must be
physically secured against armed or sneaky entry by intruders.
2. Human Level: concerned with the authorization of database users to access the content at different levels and with different privileges.
3. Operating System Level: concerned with the weakness and strength of the operating system's security over data files. No matter how secure the database system is, a weakness in operating system security may serve as a means of unauthorized access to the database. This also includes protection of data in primary and secondary memory from unauthorized access.
4. Database System Level: concerned with the access limits enforced by the database system itself, such as passwords and isolated transactions. Some database system users may be authorized to access only a limited portion of the database; other users may be allowed to issue queries but be forbidden to modify the data. It is the responsibility of the database system to ensure that these authorization restrictions are not violated.
5. Application Level: since almost all database systems allow remote access through terminals or networks, software-level security within the network software is as important as physical security, both on the Internet and on networks private to an enterprise.
Even though we can have different levels of security and authorization on data objects and users, deciding who accesses which data is a policy matter rather than a technical one.
Database Integrity
Integrity constraints contribute to maintaining a secure database system by preventing data from becoming invalid and hence giving misleading or incorrect results.
Domain integrity means that each column in any table has a set of allowed values and cannot assume any value other than those specified in its domain.
Entity integrity means that in each table the primary key (which may be composite) satisfies two conditions:
1. the primary key is unique within the table, and
2. the primary key column(s) contain no null values.
Referential integrity means that, in the database as a whole, if a column exists in two or more tables (typically as a primary key in one table and as a foreign key in one or more other tables), then any change to a value in that column in one table is reflected in corresponding changes to that value wherever it occurs in the other tables. The RDBMS must be set up so as to take appropriate actions to spread a change made in one table to the other tables where the change must also occur. The effect of maintaining referential integrity is, in short, that every occurrence of such a column contains only values that are consistent across the database.
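A minimal Python sketch that checks the three integrity rules over toy tables held as lists of dicts (the table and column names are illustrative):

    branch   = [{"BranchNo": 1}, {"BranchNo": 2}]
    employee = [{"EmpID": 10, "Age": 25, "BranchNo": 1},
                {"EmpID": 11, "Age": 30, "BranchNo": 2}]

    # Domain integrity: every column value must come from its allowed set.
    assert all(18 <= r["Age"] <= 70 for r in employee)

    # Entity integrity: primary key values are unique and never NULL (None).
    keys = [r["EmpID"] for r in employee]
    assert None not in keys and len(keys) == len(set(keys))

    # Referential integrity: every foreign key value must exist in the
    # referenced table (or be NULL).
    branch_keys = {r["BranchNo"] for r in branch}
    assert all(r["BranchNo"] is None or r["BranchNo"] in branch_keys
               for r in employee)
    print("all integrity constraints hold")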
Database Security - the mechanisms that protect the database against intentional or accidental threats. Database security encompasses hardware, software, people, and data.
The designer and the administrator of a database should first identify the possible threats that the system might face, in order to take countermeasures.
A threat is any situation or event, whether intentional or accidental, that may adversely affect a system and consequently the organization.
Examples of threats:
Using another person's means of access
Unauthorized amendment/modification or copying of data
Program alteration
Inadequate policies and procedures that allow a mix of confidential and normal output
Wire-tapping
Illegal entry by hackers
Blackmail
Theft of data, programs, and equipment
Failure of security mechanisms, giving greater access than normal
Staff shortages or strikes
Authorization
The granting of a right or privilege that enables a subject to have legitimate
access to a system or a system’s object
Authorization controls can be built into the software, and govern not only
what system or object a specified user can access, but also what the user may
do with it
Authorization controls are sometimes referred to as access controls
The process of authorization involves authentication of subjects (i.e. a user or
program) requesting access to objects (i.e. a database table, view, procedure,
trigger, or any other object that can be created within the system)
Views
A view is the dynamic result of one or more relational operations on the base relations to produce another relation
A view is a virtual relation that does not actually exist in the database, but is
produced upon request by a particular user
The view mechanism provides a powerful and flexible security mechanism by
hiding parts of the database from certain users
Integrity
Integrity constraints contribute to maintaining a secure database system by
preventing data from becoming invalid and hence giving misleading or
incorrect results
Domain Integrity: setting the allowed set of values
Entity integrity: demanding Primary key values not to assume a NULL value
Referential integrity: enforcing Foreign Key values to have a value that already
exist in the corresponding Candidate Key attribute(s) or be NULL.
Key constraints: the rules the Relational Data Model imposes on the different kinds of keys.
Encryption
Authorization may not be sufficient to protect data in database systems, especially when there is a
situation where data should be moved from one location to the other using network facilities.
Encryption is used to protect information stored at a particular site or transmitted between sites
from being accessed by unauthorized users.
If a database system holds particularly sensitive data, it may be deemed necessary to encode
it as a precaution against possible external threats or attempts to access it
The DBMS can access data after decoding it, although there is a degradation in performance
because of the time taken to decode it
Encryption also protects data transmitted over communication lines
To transmit data securely over insecure networks requires the use of a cryptosystem, which includes:
1. An encryption key to encrypt the data (plaintext)
2. An encryption algorithm that, with the encryption key, transforms the plaintext into ciphertext
3. A decryption key to decrypt the ciphertext
4. A decryption algorithm that, with the decryption key, transforms the ciphertext back into plaintext
The Data Encryption Standard (DES) is an approach that performs both a substitution of characters and a rearrangement of their order on the basis of an encryption key.
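As a toy illustration only (this is not DES, merely the two ideas of keyed substitution and keyed transposition), consider the following Python sketch:

    def toy_encrypt(plaintext, shift, order):
        cols = len(order)
        padded = plaintext + " " * (-len(plaintext) % cols)   # pad to full rows
        # Substitution: shift every character by the key amount.
        substituted = "".join(chr((ord(c) + shift) % 256) for c in padded)
        # Transposition: write in rows of len(order), read columns in key order.
        rows = [substituted[i:i+cols] for i in range(0, len(substituted), cols)]
        return "".join(row[k] for k in order for row in rows)

    def toy_decrypt(ciphertext, shift, order):
        cols, nrows = len(order), len(ciphertext) // len(order)
        rows = [[""] * cols for _ in range(nrows)]
        pos = 0
        for k in order:                       # refill the columns in key order
            for row in rows:
                row[k] = ciphertext[pos]; pos += 1
        unshifted = "".join(chr((ord(c) - shift) % 256) for r in rows for c in r)
        return unshifted.rstrip()             # drop the padding

    c = toy_encrypt("TRANSFER 500 BIRR", shift=7, order=[2, 0, 1])
    print(c)                                  # unreadable without the key
    print(toy_decrypt(c, shift=7, order=[2, 0, 1]))   # 'TRANSFER 500 BIRR'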
The hardware that the DBMS runs on must be fault-tolerant, meaning that the DBMS should continue to operate even if one of the hardware components fails. This suggests having redundant components that can be seamlessly integrated into the working system whenever one or more components fail. The main hardware components that should be fault-tolerant include disk drives, disk controllers, the CPU, power supplies, and cooling fans. Disk drives are the most vulnerable components, with the shortest times between failures of any of the hardware components.
RAID works by having a large disk array comprising an arrangement of several independent disks organized to improve reliability and, at the same time, increase performance.
Performance is increased through data striping.
Data striping - the data is segmented into equal-size partitions (the striping unit), which are transparently distributed across multiple disks (see the sketch below).
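A minimal Python sketch of striping a byte string across three simulated disks (the unit size and data are illustrative):

    def stripe(data, n_disks, unit):
        disks = [[] for _ in range(n_disks)]
        chunks = [data[i:i+unit] for i in range(0, len(data), unit)]
        for i, chunk in enumerate(chunks):
            disks[i % n_disks].append(chunk)       # round-robin placement
        return disks

    def read_back(disks):
        out, i = [], 0
        while any(disks):                          # visit the disks round-robin
            if disks[i % len(disks)]:
                out.append(disks[i % len(disks)].pop(0))
            i += 1
        return b"".join(out)

    disks = stripe(b"ABCDEFGHIJKL", n_disks=3, unit=2)
    print(disks)             # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF', b'KL']]
    print(read_back(disks))  # b'ABCDEFGHIJKL' -- one large read can touch
                             # several disks in parallel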
The database should be able to check for all the three components before processing
any request. The checking is performed by the security subsystem of the DBMS.
AUTHENTICATION
All users of the database have different access levels and permissions for different data objects, and authentication is the process of checking whether a user has the privilege for the requested access level, i.e., of checking that users are who they say they are.
Each user is given a unique identifier, which is used by the operating system to determine who they are. Associated with each identifier is a password, chosen by the user and known to the operating system, which must be supplied to enable the operating system to authenticate who the user claims to be. Thus the system checks whether the user presenting a specific username and password is entitled to use the resource.
AUTHORIZATION/PRIVILEGE
Authorization refers to the process that determines the mode in which a particular (previously authenticated) client is allowed to access a specific resource controlled by a server.
Most of the time, authorization is implemented by using views.
Views are named virtual relations containing part of one or more base relations, creating a customized/personalized view for different users.
Views are used to hide data that a user does not need to see.
Different users, depending on their power, can have one or a combination of the following forms of authorization on different data objects.
User authorization on the database schema:
1. Index Authorization: permission to create or delete an index on a relation.
2. Resource Authorization: permission to add/create a new relation in the database.
3. Privilege Grant: giving different levels of privileges to different users and user groups.
There are two broad approaches to security; the two types of database security mechanisms are:
1. Discretionary security mechanisms
Grant different privileges to different users and user groups on various data objects.
The privilege is to access different data objects; the mode of the privilege can be read, insert, delete, or update of files, records, or fields.
More flexible: one user can have A but not B, while another user has B but not A.
2. Mandatory security mechanisms
Enforce multilevel security by classifying data and users into various security classes (or levels) and implementing the appropriate security policy of the organization.
Each data object is given a certain classification level and each user a certain clearance level; only users whose clearance passes the object's classification level can access it (see the sketch below).
Comparatively inflexible/rigid: if one user can have A but not B, then B is accessible only to users with higher privilege, and we cannot have a user with B but not A.
The ability to classify users into a hierarchy of groups provides a powerful tool for administering large systems with thousands of users and objects.
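A minimal Python sketch of such a clearance check (the level names, users, and objects are illustrative):

    LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top-secret": 3}

    classification = {"salary_table": "confidential", "payroll_audit": "secret"}
    clearance = {"alice": "secret", "bob": "unclassified"}

    def can_read(user, obj):
        # Access is granted only when the user's clearance level is at least
        # the object's classification level.
        return LEVELS[clearance[user]] >= LEVELS[classification[obj]]

    print(can_read("alice", "payroll_audit"))   # True  -- secret >= secret
    print(can_read("bob", "salary_table"))      # False -- clearance too low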
A database system can support one or both of these security mechanisms to protect its data.
Statistical databases contain information about individuals that may not be permitted to be seen by others as individual records. Statistical databases therefore need additional security techniques that prevent the retrieval of individual records:
Only queries with statistical aggregate functions, such as average, sum, min, max, standard deviation, median, and count, should be executed.
To keep the user from making inferences about individuals from the retrieved data, one can also impose a constraint on the minimum number of records or tuples in the resulting relation by setting a threshold (see the sketch below).
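A minimal Python sketch of such a threshold check (the data, threshold value, and function names are illustrative):

    MIN_ROWS = 5

    salaries = {"emp%d" % i: 3000 + 100 * i for i in range(20)}   # toy data

    def stat_query(aggregate, predicate):
        rows = [v for k, v in salaries.items() if predicate(k, v)]
        if len(rows) < MIN_ROWS:
            # The qualifying set is too small: answering would let the user
            # infer an individual's record, so the query is refused.
            raise PermissionError("result set below threshold: query refused")
        return aggregate(rows)

    print(stat_query(lambda r: sum(r) / len(r), lambda k, v: True))  # average: fine
    try:
        stat_query(max, lambda k, v: k == "emp7")   # isolates one individual
    except PermissionError as e:
        print(e)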