The Flat-File Approach
Flat files are data files that contain records with no structured relationships to other files. The flat-file approach is most often associated with so-called legacy systems.

Three Significant Problems in the Flat-File Environment:
Data Storage - Efficient data management captures and stores data only once and makes this single source available to all users who need it. [1] Flat files fall short of this because each user maintains separate files, so the same data are stored repeatedly.
Data Updating - Organizations store a great deal of data on master files and reference files that require periodic updating to reflect changes. [1]
Currency of Information - In contrast to the problem of performing multiple updates is the problem of failing to update all the user files that are affected by a change in status. [1]
Task-Data Dependency - Another problem with the flat-file approach is the user's inability to obtain additional information as his or her needs change; this is known as task-data dependency. [1]

The Database Approach
Access to the data resource is controlled by a database management system (DBMS). The DBMS is a special software system that is programmed to know which data elements each user is authorized to access. [1]

Elimination of Data Storage Problem
Each data element is stored only once, thereby eliminating data redundancy and reducing data collection and storage costs.

Elimination of Data Update Problem
Because each data element exists in only one place, it requires only a single update procedure. [1]

Elimination of Currency Problem
A single change to a database attribute is automatically made available to all users of the attribute. [1]

Elimination of Task-Data Dependency Problem
The most striking difference between the database model and the flat-file model is the pooling of data into a common database that is shared by all organizational users. [1]

KEY ELEMENTS OF THE DATABASE ENVIRONMENT

Database Management System
Typical Features:
1. Program development. The DBMS contains application development software. Both programmers and end users may employ this feature to create applications to access the database.
2. Backup and recovery. During processing, the DBMS periodically makes backup copies of the physical database. In the event of a disaster (disk failure, program error, or malicious act) that renders the database unusable, the DBMS can recover to an earlier version that is known to be correct.
3. Database usage reporting. This feature captures statistics on what data are being used, when they are used, and who uses them. [1]
4. Database access. The most important feature of a DBMS is to permit authorized user access, both formal and informal, to the database.

Data Definition Language
- The data definition language (DDL) is a programming language used to define the database to the DBMS. The DDL identifies the names and the relationship of all data elements, records, and files that constitute the database.

Database Views
Internal View/Physical View. The physical arrangement of records in the database is presented through the internal view. [1]
Conceptual View/Logical View (Schema). The schema (or conceptual view) describes the entire database. [1]
External View/User View (Subschema). The subschema, or user view, defines the user's section of the database: the portion that an individual user is authorized to access.

Users
Users access the database in two ways: formally, through application interfaces, and informally, through queries.

FORMAL ACCESS: APPLICATION INTERFACES
Data manipulation language (DML) is the proprietary programming language that a particular DBMS uses to retrieve, process, and store data. [2]
The following description of DBMS operation is generic, and certain technical details are omitted.
1. A user program sends a request for data to the DBMS.
2. The DBMS analyzes the request by matching the called data elements against the user view and the conceptual view.
3. The DBMS determines the data structure parameters from the internal view and passes them to the operating system, which performs the actual data retrieval.
4. Using the appropriate access method (an operating system utility program), the operating system interacts with the disk storage device to retrieve the data from the physical database.
5. The operating system then stores the data in a main memory buffer area managed by the DBMS.
6. The DBMS transfers the data to the user's work location in main memory.
7. When processing is complete, Steps 4, 5, and 6 are reversed to restore the processed data to the database.

INFORMAL ACCESS: QUERY LANGUAGE
The second method of database access is the informal method of queries. A query is an ad hoc access methodology for extracting information from a database. [2]

The Database Administrator
The DBA is responsible for managing the database resource.

FUNCTIONS OF THE DATABASE ADMINISTRATOR
Database Planning:
- Develop organization's database strategy
- Define database environment
- Define data requirements
- Develop data dictionary
Design:
- Logical database (schema)
- External users' views (subschemas)
- Internal view of databases
- Database controls
Implementation:
- Determine access policy
- Implement security controls
- Specify test procedures
- Establish programming standards
Operation and Maintenance:
- Evaluate database performance
- Reorganize database as user needs demand
- Review standards and procedures
Change and Growth:
- Plan for change and growth
- Evaluate new technology

The Data Dictionary
- Describes every data element in the database.

The Physical Database
The physical database is the fourth major element of the database approach. This is the lowest level of the database and the only level that exists in physical form.

Typical File Processing Operations
1. Retrieve a record from the file based on its primary key value.
2. Insert a record into a file.
3. Update a record in the file.
4. Read a complete file of records.
5. Find the next record in a file.
6. Scan a file for records with common secondary keys.
7. Delete a record from a file.

Data Structures
- Data structures are the bricks and mortar of the database.

Data Organization
- The organization of a file refers to the way records are physically arranged on the secondary storage device. This may be either sequential or random.

Data Access Methods
- The access method is the technique used to locate records and to navigate through the database.

The criteria that influence the selection of the data structure include:
1. Rapid file access and data retrieval
2. Efficient use of disk storage space
3. High throughput for transaction processing
4. Protection from data loss
5. Ease of recovery from system failure
6. Accommodation of file growth

DBMS Models
A data model is an abstract representation of the data about entities, including resources (assets), events (transactions), and agents (personnel or customers, etc.) and their relationships in an organization.

DATABASE TERMINOLOGY
Data Attribute/Field. A data attribute (or field) is a single item of data, such as customer's name, account balance, or address.
Entity. An entity is a database representation of an individual resource, event, or agent about which we choose to collect data. Entities may be physical (inventories, customers, and employees) or conceptual (sales, accounts receivable, and depreciation expense).
Record Type (Table or File). When we group together the data attributes that logically define an entity, they form a record type.
Database. A database is the set of record types that an organization needs to support its business processes.
Associations. Record types that constitute a database exist in relation to other record types. This is called an association.

Three basic record associations are:
- One-to-one association
- One-to-many association
- Many-to-many association

The Hierarchical Model
This was a popular method of data representation because it reflected, more or less faithfully, many aspects of an organization that are hierarchical in relationship.

Navigational Databases
The hierarchical data model is called a navigational database because traversing the files requires following a predefined path.

Limitations of the Hierarchical Model
1. A parent record may have one or more child records.
2. No child record can have more than one parent.

The Network Model
The network model grew out of the Conference on Data Systems Languages (CODASYL), which formed a database task group to develop standards for database design.

The Relational Model
E. F. Codd originally proposed the principles of the relational model in the late 1960s.

DATABASES IN A DISTRIBUTED ENVIRONMENT
The physical structure of the organization's data is an important consideration in planning a distributed system. In addressing this issue, the planner has two basic options: the databases can be centralized or they can be distributed. Distributed databases fall into two categories: partitioned databases and replicated databases.

Centralized Databases
The first approach involves retaining the data in a central location. Remote IT units send requests for data to the central site, which processes the requests and transmits the data back to the requesting IT unit. The actual processing of the data is performed at the remote IT unit. The central site performs the functions of a file manager that services the data needs of the remote sites. A fundamental objective of the database approach is to maintain data currency. This can be a challenging task in a distributed data processing (DDP) environment.

Distributed Databases
Distributed databases can be either partitioned or replicated.

Partitioned Databases
The partitioned database approach splits the central database into segments or partitions that are distributed to their primary users. The advantages of this approach follow:
- Having data stored at local sites increases users' control.
- Transaction processing response time is improved by permitting local access to data and reducing the volume of data that must be transmitted between IT units.
- Partitioned databases can reduce the potential effects of a disaster. By locating data at several sites, the loss of a single IT unit does not eliminate all data processing by the organization.
The partitioned approach, which is illustrated in Figure 4.16, works best for organizations that require minimal data sharing among their distributed IT units. The primary user manages data requests from other sites. To minimize data access from remote users, the organization needs to carefully select the host location. Identifying the optimum host requires an in-depth analysis of user data needs.

The Deadlock Phenomenon - In a distributed environment, it is possible for multiple sites to lock out each other from the database, thus preventing each from processing its transactions.
Deadlock - A deadlock is a permanent condition that must be resolved by special software that analyzes each deadlock condition to determine the best solution. Because of the implication for transaction processing, accountants should be aware of the issues pertaining to deadlock resolutions.

Deadlock Resolution
Resolving a deadlock usually involves terminating one or more transactions to complete processing of the other transactions in the deadlock. The pre-empted transactions must then be reinitiated. In pre-empting transactions, the deadlock resolution software attempts to minimize the total cost of breaking the deadlock. Some of the factors that are considered in this decision follow:
- The resources currently invested in the transaction. This may be measured by the number of updates that the transaction has already performed and that must be repeated if the transaction is terminated.
- The transaction's stage of completion. In general, deadlock resolution software will avoid terminating transactions that are close to completion.
- The number of deadlocks associated with the transaction. Because terminating the transaction breaks all deadlock involvement, the software should attempt to terminate transactions that are part of more than one deadlock.

Replicated Databases
Replicated databases are effective in companies where there exists a high degree of data sharing but no primary user. Since common data are replicated at each IT unit site, the data traffic between sites is reduced considerably. The primary justification for a replicated database is to support read-only queries. With data replicated at every site, data access for query purposes is ensured, and lockouts and delays due to data traffic are minimized. The problem with this approach is maintaining current versions of the database at each site. Since each IT unit processes only its own transactions, common data replicated at each site are affected by different transactions and reflect different values.

Concurrency Control
Database concurrency is the presence of complete and accurate data at all user sites. System designers need to employ methods to ensure that transactions processed at each site are accurately reflected in the databases of all the other sites. Because of the implication for the accuracy of accounting records, the concurrency problem is a matter of concern for auditors. A commonly used method for concurrency control is to serialize transactions. This method involves labelling each transaction by two criteria. First, special software groups transactions into classes to identify potential conflicts. The second part of the control process is to time-stamp each transaction.

Database Distribution Methods and the Accountant
The decision to distribute databases is one that should be entered into thoughtfully. There are many issues and trade-offs to consider. Here are some of the most basic questions to be addressed:
- Should the organization's data be centralized or distributed?
- If data distribution is desirable, should the databases be replicated or partitioned?
- If replicated, should the databases be totally replicated or partially replicated?
- If the database is to be partitioned, how should the data segments be allocated among the sites?
The choices involved in each of these questions impact the organization's ability to maintain data integrity. The preservation of audit trails and the accuracy of accounting records are key concerns. Clearly, these are decisions that the modern auditor should understand and influence intelligently.

CONTROLLING AND AUDITING DATA MANAGEMENT SYSTEMS
Controls over data management systems fall into two general categories: access controls and backup controls. Access controls are designed to prevent unauthorized individuals from viewing, retrieving, corrupting, or destroying the entity's data. Backup controls ensure that in the event of data loss due to unauthorized access, equipment failure, or physical disaster the organization can recover its database.

Access Controls
Users of flat files maintain exclusive ownership of their data. In spite of the data integration problems associated with this model, it creates an environment in which unauthorized access to data can be effectively controlled. When not in use by the owner, a flat file is closed to other users and may be taken off-line and physically secured in the data library. In contrast, the need to integrate and share data in the database environment means that databases must remain on-line and open to all potential users. In the shared database environment, access control risks include corruption, theft, misuse, and destruction of data. These threats originate from both unauthorized intruders and authorized users who exceed their access privileges. Several control features are now reviewed.

User Views
The user view or subschema is a subset of the total database that defines the user's data domain and provides access to the database.

Database Authorization Table
The database authorization table contains rules that limit the actions a user can take. This technique is similar to the access control list used in the operating system. Each user is granted certain privileges that are coded in the authority table, which is used to verify the user's action requests.

User-Defined Procedures
A user-defined procedure allows the user to create a personal security program or routine to provide more positive user identification than a single password. Thus, in addition to a password, the security procedure asks a series of personal questions (such as the user's mother's maiden name), which only the legitimate user should know.

Data Encryption
Database systems also use encryption procedures to protect highly sensitive stored data, such as product formulas, personnel pay rates, password files, and certain financial data, thus making it unreadable to an intruder "browsing" the database.

Biometric Devices
The ultimate in user authentication procedures is the use of biometric devices, which measure various personal characteristics, such as fingerprints, voice prints, retina prints, or signature characteristics. These user characteristics are digitized and stored permanently in a database security file or on an identification card that the user carries. When an individual attempts to access the database, a special scanning device captures his or her biometric characteristics, which it compares with the profile data stored on file or the ID card. If the data do not match, access is denied. Biometric technology is currently being used to secure ATM cards and credit cards. Because of the distributed nature of modern systems, the degree of remote access to systems, the decline in costs of biometric systems, and the increased effectiveness of biometric systems, biometric devices have a great potential to serve as effective means of access control, especially from remote locations.

Inference Controls
One advantage of the database query capability is that it provides users with summary and statistical data for decision making. For example, managers might ask the following questions:
- What is the total value for inventory items with monthly turnover less than three?
- What is the average charge to patients with hospital stays greater than 8 days?
- What is the total cost of Class II payroll for department XYZ?
Answers to these types of questions are needed routinely for resource management, facility planning, and operations control decisions. Legitimate queries sometimes involve access to confidential data. Thus, individual users may be granted summary and statistical query access to confidential data to which they normally are denied direct access.
To preserve the confidentiality and integrity of the database, inference controls should be in place to prevent users from inferring, through query features, specific data values that they otherwise are unauthorized to access. Inference controls attempt to prevent three types of compromises to the database.
1. Positive compromise: the user determines the specific value of a data item.
2. Negative compromise: the user determines that a data item does not have a specific value.
3. Approximate compromise: the user is unable to determine the exact value of an item but is able to estimate it with sufficient accuracy to violate the confidentiality of the data.

Audit Objective Relating to Database Access
Verify that database access authority and privileges are granted to users in accordance with their legitimate needs.

Audit Procedures for Testing Database Access Controls
Responsibility for Authority Tables and Subschemas - The auditor should verify that database administration (DBA) personnel retain exclusive responsibility for creating authority tables and designing user views. Evidence may come from three sources: (1) by reviewing company policy and job descriptions, which specify these technical responsibilities; (2) by examining programmer authority tables for access privileges to data definition language (DDL) commands; and (3) through personal interviews with programmers and DBA personnel.
Appropriate Access Authority - The auditor can select a sample of users and verify that their access privileges stored in the authority table are consistent with their job descriptions and organizational levels.
Biometric Controls - The auditor should evaluate the costs and benefits of biometric controls. Generally, these would be most appropriate where highly sensitive data are accessed by a very limited number of users.
Inference Controls - The auditor should verify that database query controls exist to prevent unauthorized access via inference. The auditor can test controls by simulating access by a sample of users and attempting to retrieve unauthorized data via inference queries.
Encryption Controls - The auditor should verify that sensitive data, such as passwords, are properly encrypted. This can be done by printing the file contents to hard copy.

Backup Controls - Data can be corrupted and destroyed by malicious acts from external hackers, disgruntled employees, disk failure, program errors, fires, floods, and earthquakes. To recover from such disasters, organizations must implement policies, procedures, and techniques that systematically and routinely provide backup copies of critical files.

Backup Controls in the Flat-File Environment - The backup technique employed will depend on the media and the file structure. Sequential files (both tape and disk) use a backup technique called grandparent-parent-child (GPC). This backup technique is an integral part of the master file update process. Direct access files, by contrast, need a separate backup procedure.
Direct Access File Backup - Data values in direct access files are changed in place through a process called destructive replacement. Therefore, once a data value is changed, the original value is destroyed, leaving only one version (the current version) of the file. To provide backup, direct access files must be copied before being updated. The timing of the direct access backup procedures will depend on the processing method being used.
Off-Site Storage - As an added safeguard, backup files created under both the GPC and direct access approaches should be stored off-site in a secure location.

Audit Objective Relating to Flat-File Backup - Verify that backup controls in place are effective in protecting data files from physical damage, loss, accidental erasure, and data corruption through system failures and program errors.

Audit Procedures for Testing Flat-File Backup Controls
Sequential File (GPC) Backup. The auditor should select a sample of systems and determine from the system documentation that the number of GPC backup files specified for each system is adequate. If insufficient backup versions exist, recovery from some types of failures may be impossible.
Backup Transaction Files. The auditor should verify through physical observation that transaction files used to reconstruct the master files are also retained. Without corresponding transaction files, reconstruction is impossible.
Direct Access File Backup. The auditor should select a sample of applications and identify the direct access files being updated in each system. From system documentation and through observation, the auditor can verify that each of them was copied to tape or disk before being updated.
Off-Site Storage. The auditor should verify the existence and adequacy of off-site storage. This audit procedure may be performed as part of the review of the disaster recovery plan or computer center operations controls.

Backup Controls in the Database Environment
Since data sharing is a fundamental objective of the database approach, this environment is particularly vulnerable to damage from individual users. One unauthorized procedure, one malicious act, or one program error can deprive an entire user community of its information resource. Also, because of data centralization, even minor disasters such as a disk failure can affect many or all users. When such events occur, the organization needs to reconstruct the database to pre-failure status. This can be done only if the database was properly backed up in the first place.

Backup
The backup feature makes a periodic backup of the entire database. This is an automatic procedure that should be performed at least once a day. The backup copy should then be stored in a secure remote area.

Transaction Log (Journal)
The transaction log feature provides an audit trail of all processed transactions. It lists transactions in a transaction log file and records the resulting changes to the database in a separate database change log.

Checkpoint Feature
The checkpoint facility suspends all data processing while the system reconciles the transaction log and the database change log against the database. At this point, the system is in a quiet state. Checkpoints occur automatically several times an hour. If a failure occurs, it is usually possible to restart the processing from the last checkpoint. Thus, only a few minutes of transaction processing must be repeated.

Recovery Module
The recovery module uses the logs and backup files to restart the system after a failure.

Audit Objective Relating to Database Backup
Verify that controls over the data resource are sufficient to preserve the integrity and physical security of the database.

Audit Procedures for Testing Database Backup Controls
The auditor should verify that backup is performed routinely and frequently to facilitate the recovery of lost, destroyed, or corrupted data without excessive reprocessing. Production databases should be copied at regular intervals (perhaps several times an hour). Backup policy should strike a balance between the inconvenience of frequent backup activities and the business disruption caused by excessive reprocessing that is needed to restore the database after a failure.
The auditor should verify that automatic backup procedures are in place and functioning, and that copies of the database are stored off-site for further security.
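
Illustration: choosing a transaction to terminate in a deadlock
The three deadlock resolution factors in these notes (updates already performed, stage of completion, and number of deadlocks the transaction is part of) can be combined into a single cost score. The sketch below is only one possible way to do so: the class name, field names, and weights are hypothetical and are not taken from any particular DBMS.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    txn_id: str
    updates_done: int     # updates that must be repeated if terminated
    pct_complete: float   # stage of completion, 0.0 to 1.0
    deadlocks: int        # number of deadlocks this transaction is part of


def termination_cost(t: Txn) -> float:
    """Hypothetical score: higher means more costly to terminate.

    Invested work and closeness to completion raise the cost;
    involvement in several deadlocks lowers it, because terminating
    that transaction breaks all of them at once.
    """
    return t.updates_done + 10 * t.pct_complete - 5 * t.deadlocks


def choose_victim(txns: list[Txn]) -> Txn:
    """Pick the transaction that is cheapest to terminate."""
    return min(txns, key=termination_cost)


deadlocked = [
    Txn("T1", updates_done=8, pct_complete=0.9, deadlocks=1),
    Txn("T2", updates_done=2, pct_complete=0.2, deadlocks=2),
    Txn("T3", updates_done=3, pct_complete=0.3, deadlocks=1),
]
victim = choose_victim(deadlocked)
print(victim.txn_id)
```

With these assumed weights, the transaction that has done little work and sits in two deadlocks is selected; real resolution software would tune or replace the scoring rule.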
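
Illustration: a database authorization table
The authorization table described in these notes holds the privileges granted to each user and is consulted to verify each action request. A minimal sketch of that lookup follows; the user names, resource names, and action names are invented for illustration, and a real DBMS stores and enforces these rules internally.

```python
# Hypothetical authority table: user -> resource -> set of permitted actions.
AUTHORITY_TABLE = {
    "clerk_ar": {
        "customer": {"read"},
        "ar_ledger": {"read", "insert"},
    },
    "credit_mgr": {
        "customer": {"read", "update"},
        "ar_ledger": {"read"},
    },
}


def is_authorized(user, resource, action, table=AUTHORITY_TABLE):
    """Return True only if the table explicitly grants the action.

    Unknown users and unknown resources fall through to an empty set,
    so anything not listed is denied by default.
    """
    return action in table.get(user, {}).get(resource, set())


print(is_authorized("clerk_ar", "ar_ledger", "insert"))
print(is_authorized("clerk_ar", "customer", "update"))
```

An auditor testing appropriate access authority could compare such a table, user by user, against job descriptions.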
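
Illustration: a simple inference control
These notes state that inference controls should stop users from deducing individual values through summary queries, but they do not name a technique. One approach often described for this purpose is to withhold any statistic computed over too few records. The sketch below assumes that approach; the threshold, field names, and sample data are hypothetical.

```python
MIN_QUERY_SET = 3  # hypothetical minimum number of records per statistic


def average_salary(records, predicate, min_set=MIN_QUERY_SET):
    """Return the average salary of matching records.

    The result is withheld when fewer than `min_set` records match,
    since an average over one or two people can reveal an individual
    value (a positive or approximate compromise).
    """
    matched = [r["salary"] for r in records if predicate(r)]
    if len(matched) < min_set:
        raise PermissionError("query set too small; result withheld")
    return sum(matched) / len(matched)


records = [
    {"dept": "A", "salary": 40},
    {"dept": "A", "salary": 50},
    {"dept": "A", "salary": 60},
    {"dept": "A", "salary": 70},
    {"dept": "B", "salary": 90},
]

print(average_salary(records, lambda r: r["dept"] == "A"))
```

A size threshold alone does not stop every inference attack (for example, differencing two large overlapping queries), so it should be read as a starting point rather than a complete control.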
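
Illustration: recovery from a backup and a change log
The backup, log, and recovery module features work together: the backup gives a known-correct starting point, and the logged changes made since then are reapplied to it. The sketch below models that idea with an in-memory dictionary. The class and method names are hypothetical, and checkpoints and the separate transaction log are omitted for brevity.

```python
import copy


class TinyDB:
    """Toy model of backup plus change-log recovery (not a real DBMS)."""

    def __init__(self):
        self.data = {}         # the "physical database"
        self.backup = {}       # last periodic backup copy
        self.change_log = []   # changes recorded since the last backup

    def update(self, key, value):
        # Record the change in the log before applying it.
        self.change_log.append((key, value))
        self.data[key] = value

    def take_backup(self):
        # Copy the whole database; earlier log entries are no longer needed.
        self.backup = copy.deepcopy(self.data)
        self.change_log.clear()

    def recover(self):
        # Restore the backup, then reapply every logged change in order.
        self.data = copy.deepcopy(self.backup)
        for key, value in self.change_log:
            self.data[key] = value


db = TinyDB()
db.update("A", 100)
db.take_backup()
db.update("A", 80)
db.update("B", 5)
db.data = {}      # simulate a failure that destroys the database
db.recover()
print(db.data)
```

The more often backups (or checkpoints) are taken, the shorter the log that must be reapplied, which is the trade-off between backup effort and reprocessing that the audit procedures ask the auditor to assess.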