
IELM 511: Information System design

Introduction

Part I. ISD for well-structured data: relational and other DBMS
- Info storage (modeling, normalization)
- Info retrieval (relational algebra, calculus, SQL)
- DB-integrated APIs

Part II. ISD for systems with non-uniformly structured data
- Basics of web-based IS (www, web2.0, ...)
- Markups, HTML, XML
- Design tools for Info Sys: UML

Part III: (one out of)
- APIs for mobile apps
- Security, cryptography
- IS product lifecycles
- Algorithm analysis, P, NP, NPC

Agenda
- Structured Query Language (SQL)
- DB APIs

Recall our Bank DB design


BRANCH( b_name, city, assets)
CUSTOMER( cssn, c_name, street, city, banker, banker_type)
LOAN( l_no, amount, br_name)
PAYMENT( l_no, pay_no, date, amount)
EMPLOYEE( e_ssn, e_name, tel, start_date, mgr_ssn)
ACCOUNT( ac_no, balance)
SACCOUNT( ac_no, int_rate)
CACCOUNT( ac_no, od_amt)
BORROWS( cust_ssn, loan_num)
DEPOSIT( c_ssn, ac_num, access_date)
DEPENDENT( emp_ssn, dep_name)

[Figure: relationship diagram linking the tables above, with 1:n and n:m cardinalities marked on the links.]

Background: Structured Query Language


Basics of SQL:
A DataBase Management System (DBMS) is an IT system.
Core requirements:
- A structured way to store the definition of data [why?]: DDL
- Manipulation of data [obviously!]: DML
SQL: a combined DDL + DML

SQL as a DDL
A critical element of any design is to store the definitions of its components. In DB design, we deal with tables, using table names, attribute names, etc. Each of these terms should have unambiguous syntax and semantics.
A systematic way to specify and store these meta-data is to use a Data Definition Language (DDL). The information about the data is stored in a Data Dictionary.
SQL provides a unified DDL + Data Manipulation Language (DML).

SQL as a DDL: create command


To create a new database:
    create database my_database;

To create a new table:
    create table my_table (
        attribute_name attribute_type,
        ...,
        constraint,
        ...
    );

To create an index on a table:
    create index my_index on my_table( attribute);

Notes:
- A table stores data.
- A DB stores one or more tables and one or more indexes.
- An index is a special file for faster DB look-up when searching the specified table for some data using the specified attribute.

SQL as a DDL: create command examples


    create database bank;

LOAN( l_no, amount, br_name)

    create table loan (
        l_no     char(10),
        amount   double,
        br_name  char(30) references branch(b_name),
        primary key (l_no)
    );

BORROWS( cust_ssn, loan_num)

    create table borrows (
        cust_ssn  char(11),
        loan_num  char(10),
        primary key (cust_ssn, loan_num),
        constraint borrows_c1 foreign key (cust_ssn) references customer( cssn),
        constraint borrows_c2 foreign key (loan_num) references loan( l_no)
    );
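The create statements can be tried out from a short program. The sketch below is a minimal example that assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the course DBMS; the trimmed-down customer table and the index name loan_branch_idx are illustrative assumptions, not part of the bank design. SQLite has no create database command: opening a connection creates the DB.

```python
import sqlite3

# Assumption: an in-memory SQLite DB stands in for "create database bank".
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.executescript("""
create table branch (
    b_name char(30) primary key,
    city   char(30),
    assets double
);
-- Trimmed-down customer table (illustrative; the full design has more attributes).
create table customer (
    cssn   char(11) primary key,
    c_name char(30)
);
create table loan (
    l_no    char(10),
    amount  double,
    br_name char(30) references branch(b_name),
    primary key (l_no)
);
create table borrows (
    cust_ssn char(11),
    loan_num char(10),
    primary key (cust_ssn, loan_num),
    constraint borrows_c1 foreign key (cust_ssn) references customer(cssn),
    constraint borrows_c2 foreign key (loan_num) references loan(l_no)
);
-- Hypothetical index name, to speed up look-ups of loans by branch.
create index loan_branch_idx on loan(br_name);
""")

# The table definitions are themselves stored in a catalog table (sqlite_master).
tables = [row[0] for row in conn.execute(
    "select name from sqlite_master where type = 'table' order by name")]
print(tables)
```

Reading sqlite_master at the end shows that the DDL statements are stored as data, which is the role of the data dictionary.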

Note on metadata: system catalogs


Metadata = data about data.
The DBMS manages a data dictionary, sometimes called the system catalog, with:
- When the DB and each table was created/modified,
- Name of each attribute, its data type, and comments describing it,
- List of all users who can access the DB, and their passwords,
- Which user can do what (read/add/update/delete/authorize) to the data.
The system catalog itself is stored in a table, and users can see (if they have authority) the data in it.

SQL as a DML: insert, drop commands


To add one row into a table:
    insert into branch values( 'Downtown', 'Brooklyn', 9000000);
    insert into loan values( 'L17', 1000, 'Downtown');

Notes:
- char( ), date, datetime types: data must be quoted. integer, single, double (number data types) are not quoted.
- The sequence in which you execute insert matters! The insert into loan will fail unless table branch already has a row with 'Downtown'.

To remove an entire table from the DB:
    drop table branch;

Note: this drop command will fail if, e.g., there is data in table loan [why?]
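The effect of insert order can be checked directly. The following sketch assumes SQLite via Python's sqlite3 module, with foreign-key checking switched on explicitly (SQLite leaves it off by default); the helper try_insert_loan is a hypothetical name used only here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to check references
conn.execute("create table branch (b_name char(30) primary key, "
             "city char(30), assets double)")
conn.execute("create table loan (l_no char(10) primary key, amount double, "
             "br_name char(30) references branch(b_name))")


def try_insert_loan():
    """Return True if the loan row is accepted, False if the DBMS rejects it."""
    try:
        conn.execute("insert into loan values ('L17', 1000, 'Downtown')")
        return True
    except sqlite3.IntegrityError:
        # Raised when a constraint (here: the foreign key on br_name) is violated.
        return False


before = try_insert_loan()  # branch 'Downtown' does not exist yet
conn.execute("insert into branch values ('Downtown', 'Brooklyn', 9000000)")
after = try_insert_loan()   # the referenced branch row now exists
print(before, after)
```

The same referential constraint is behind the drop table question: rows in loan still refer to rows in branch.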

SQL as a DML: select command


To get some data from a (set of) table(s):

    select    attribute_1, ..., attribute_n                                      (required)
    from      table_1, ..., table_m                                              (required)
    where     selection_or_join_condition_1, ..., selection_or_join_condition_r  (optional)
    group by  attribute_i                                                        (optional)
    having    aggregate_function( attribute_j, ...)                              (optional)
    order by  attribute_k                                                        (optional)

SQL as a DML: select command


To get some data from a (set of) table(s):

    select customer, loan_no from borrows;
    select * from borrows;

    customer       loan_no
    111-12-0000    L17
    222-12-0000    L23
    333-12-0000    L15
    444-00-0000    L93
    666-12-0000    L17
    111-12-0000    L11
    999-12-0000    L17
    777-12-0000    L16

    select customer as "customer ssn" from borrows;

    customer ssn
    111-12-0000
    222-12-0000
    333-12-0000
    444-00-0000
    666-12-0000
    111-12-0000
    999-12-0000
    777-12-0000

    select distinct loan_no from borrows;

    loan_no
    L17
    L23
    L15
    L93
    L11
    L16

Notes:
- * is a wildcard (all attributes).
- as: gives an alias name to an attribute (quote the alias if it contains a space).
- distinct: removes duplicate rows from the output.

SQL select: row filters


Example: Find the names of all branches that have given loans larger than 1200

LOAN
    loan_number    amount    branch_name
    L17            1000      Downtown
    L23            2000      Redwood
    L15            1500      Pennyridge
    L93            500       Mianus
    L11            900       Round Hill
    L16            1300      Pennyridge

    select distinct branch_name
    from loan
    where amount > 1200

Note: all operations in where are applied one row at a time.

Result:
    branch_name
    Redwood
    Pennyridge

SQL select: joins


Example: Find the customer ssn, loan no, amount and branch name for all loans > 1200

Tables: LOAN and BORROWS, as listed in the previous examples.

    select customer, loan.*
    from borrows, loan
    where loan_no = loan_number      (θ-condition for the join of loan, borrows)
      and amount > 1200              (selection condition)

Result:
    customer       loan_number    amount    branch_name
    222-12-0000    L23            2000      Redwood
    333-12-0000    L15            1500      Pennyridge
    777-12-0000    L16            1300      Pennyridge

WHERE clause:
- multiple θ-conditions can be combined with and, or, not
- cell values are compared with >, =, !=, <, etc.
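The row-filter and join examples can be reproduced with a few lines of code. This is a minimal sketch that assumes SQLite through Python's sqlite3 module and loads the LOAN and BORROWS rows from the slides; the order by clauses are added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, amount integer, branch_name text)")
conn.execute("create table borrows (customer text, loan_no text)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"), ("L15", 1500, "Pennyridge"),
    ("L93", 500, "Mianus"), ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
])
conn.executemany("insert into borrows values (?, ?)", [
    ("111-12-0000", "L17"), ("222-12-0000", "L23"), ("333-12-0000", "L15"),
    ("444-00-0000", "L93"), ("666-12-0000", "L17"), ("111-12-0000", "L11"),
    ("999-12-0000", "L17"), ("777-12-0000", "L16"),
])

# Row filter: the where condition is tested against each row of loan in turn.
branches = [row[0] for row in conn.execute(
    "select distinct branch_name from loan where amount > 1200 order by branch_name")]

# Join plus row filter: one join condition and one selection condition.
big_loans = conn.execute("""
    select customer, loan.*
    from borrows, loan
    where loan_no = loan_number
      and amount > 1200
    order by customer
""").fetchall()

print(branches)
for row in big_loans:
    print(row)
```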

SQL select: joins with table and column aliases


Example: Find the names of employees and their manager.

EMPLOYEE (used twice, under the aliases E and M)
    e_ssn          e_name    tel      start_date    mgr_ssn
    111-22-3333    Jones     12345    Nov-2005      321-32-4321
    333-11-4444    Smith     54321    Mar-1998      111-22-3333
    123-45-6789    Lee       54321    Mar-1998      111-22-3333
    555-66-8888    Turner    55555    Aug-2002      321-32-4321
    987-65-4321    Jones     87621    Mar-1995      888-99-9999
    888-99-9999    Chan      87654    Feb-1980      777-77-7777
    321-32-4321    Adams     77777    Feb-1990      777-77-7777
    777-77-7777    Black     99111    Jan-1980      null

    select E.e_name as worker, M.e_name as boss
    from employee as E, employee as M
    where E.mgr_ssn = M.e_ssn

Result:
    worker    boss
    Jones     Adams
    Smith     Jones
    Lee       Jones
    Turner    Adams
    Jones     Chan
    Chan      Black
    Adams     Black
    Black     null

Notes:
- E, M are aliases (copies) of the employee table.
- Black has mgr_ssn = null, so the condition E.mgr_ssn = M.e_ssn is never true for that row; strictly, the last row (Black, null) is produced only if the join is written as a left outer join.
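The self-join is easy to experiment with. The sketch below assumes SQLite via Python's sqlite3 module and keeps only the three EMPLOYEE columns the query needs; it runs the query from the slide and, for comparison, a left outer join variant (not in the original slide) that also returns the employee without a manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employee (e_ssn text primary key, e_name text, mgr_ssn text)")
conn.executemany("insert into employee values (?, ?, ?)", [
    ("111-22-3333", "Jones", "321-32-4321"),
    ("333-11-4444", "Smith", "111-22-3333"),
    ("123-45-6789", "Lee", "111-22-3333"),
    ("555-66-8888", "Turner", "321-32-4321"),
    ("987-65-4321", "Jones", "888-99-9999"),
    ("888-99-9999", "Chan", "777-77-7777"),
    ("321-32-4321", "Adams", "777-77-7777"),
    ("777-77-7777", "Black", None),  # None is stored as SQL null
])

# Query from the slide: E and M are two aliases of the same table.
inner = conn.execute("""
    select E.e_name as worker, M.e_name as boss
    from employee as E, employee as M
    where E.mgr_ssn = M.e_ssn
""").fetchall()

# Variant (an addition for comparison): keep workers with no matching manager row.
outer = conn.execute("""
    select E.e_name as worker, M.e_name as boss
    from employee as E left join employee as M on E.mgr_ssn = M.e_ssn
""").fetchall()

print(len(inner), len(outer))
```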

SQL select: nested queries, in


Example: Find ssn of customers who have both deposit and loan

DEPOSIT
    c_ssn          ac_num    accessDate
    888-12-0000    A101      Jan 1, 09
    222-12-0000    A215      Feb 1, 09
    333-12-0000    A102      Feb 28, 09
    555-00-0000    A305      Mar 10, 09
    888-12-0000    A201      Mar 1, 98
    111-12-0000    A217      Mar 1, 09
    000-12-0000    A101      Feb 25, 09

BORROWS: as listed in the earlier examples.

    select c_ssn
    from deposit
    where c_ssn in ( select customer from borrows)

Note: in performs a set membership test.

Result:
    c_ssn
    222-12-0000
    333-12-0000
    111-12-0000

SQL select: nested queries, in


Example: Find ssn of customers who have a deposit but no loan

Tables: DEPOSIT and BORROWS, the same as in the previous example.

    select c_ssn
    from deposit
    where c_ssn not in ( select customer from borrows)

Note: not in is true if in is false.

Result:
    c_ssn
    888-12-0000
    555-00-0000
    888-12-0000
    000-12-0000
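Both nested queries can be run side by side. A minimal sketch, assuming SQLite via Python's sqlite3 module and only the DEPOSIT and BORROWS columns that the queries use; the helper ssns is a hypothetical name, and it sorts the results only so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table deposit (c_ssn text, ac_num text)")
conn.execute("create table borrows (customer text, loan_no text)")
conn.executemany("insert into deposit values (?, ?)", [
    ("888-12-0000", "A101"), ("222-12-0000", "A215"), ("333-12-0000", "A102"),
    ("555-00-0000", "A305"), ("888-12-0000", "A201"), ("111-12-0000", "A217"),
    ("000-12-0000", "A101"),
])
conn.executemany("insert into borrows values (?, ?)", [
    ("111-12-0000", "L17"), ("222-12-0000", "L23"), ("333-12-0000", "L15"),
    ("444-00-0000", "L93"), ("666-12-0000", "L17"), ("111-12-0000", "L11"),
    ("999-12-0000", "L17"), ("777-12-0000", "L16"),
])


def ssns(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# Deposit and loan: the inner query yields the set of borrowers.
both = ssns("select c_ssn from deposit "
            "where c_ssn in (select customer from borrows)")

# Deposit but no loan: one output row per deposit row, so 888-12-0000 appears twice.
deposit_only = ssns("select c_ssn from deposit "
                    "where c_ssn not in (select customer from borrows)")

print(both)
print(deposit_only)
```

Adding distinct to the outer select would remove the repeated ssn from the second result.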

SQL select: nested, correlated queries, exists


Existential quantifier (a generalization of in)
Example: Find the names of branches that have given no loan

BRANCH
    branch_name    city          assets
    Downtown       Brooklyn      9000000
    Redwood        Palo Alto     2100000
    Pennyridge     Horseneck     1700000
    Mianus         Horseneck     400000
    Round Hill     Horseneck     8000000
    Pownal         Bennington    300000
    North Town     Rye           3700000
    Brighton       Brooklyn      7100000

LOAN: as listed in the earlier examples.

    select branch_name
    from branch
    where not exists ( select * from loan
                       where branch.branch_name = loan.branch_name)

Notes:
1. Correlated: the where clause of the inner query refers to the outer query.
2. exists is true if there is >= 1 row in the result of the inner query; not exists is true if exists is false.
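The slide gives no output for this query, so it is worth running. A minimal sketch, assuming SQLite via Python's sqlite3 module and the BRANCH and LOAN rows from the slides; the result is sorted only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table branch (branch_name text, city text, assets integer)")
conn.execute("create table loan (loan_number text, amount integer, branch_name text)")
conn.executemany("insert into branch values (?, ?, ?)", [
    ("Downtown", "Brooklyn", 9000000), ("Redwood", "Palo Alto", 2100000),
    ("Pennyridge", "Horseneck", 1700000), ("Mianus", "Horseneck", 400000),
    ("Round Hill", "Horseneck", 8000000), ("Pownal", "Bennington", 300000),
    ("North Town", "Rye", 3700000), ("Brighton", "Brooklyn", 7100000),
])
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"), ("L15", 1500, "Pennyridge"),
    ("L93", 500, "Mianus"), ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
])

# Correlated query: the inner query is evaluated for each branch row,
# because its where clause refers to branch.branch_name from the outer query.
no_loans = sorted(row[0] for row in conn.execute("""
    select branch_name
    from branch
    where not exists (select * from loan
                      where branch.branch_name = loan.branch_name)
"""))
print(no_loans)
```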

SQL select: arithmetic operations on columns


Example: Report the branch name and assets in units of millions

BRANCH: as listed in the previous example.

    select branch_name, assets*0.000001 as "assets (m)"
    from branch

Result:
    branch_name    assets (m)
    Downtown       9.0
    Redwood        2.1
    Pennyridge     1.7
    Mianus         0.4
    Round Hill     8.0
    Pownal         0.3
    North Town     3.7
    Brighton       7.1

Note: arithmetic ops can be used in SELECT, WHERE, HAVING.

SQL select: group by, group-wise aggregation functions


Example: Report the average, maximum amount, and number of loans by branch

LOAN: as listed in the earlier examples.

    select branch_name, avg( amount) as Avg, max( amount) as Max,
           count( branch_name) as no_loans
    from loan
    group by branch_name
    order by no_loans desc

Result:
    branch_name    Avg     Max     no_loans
    Pennyridge     1400    1500    2
    Downtown       1000    1000    1
    Redwood        2000    2000    1
    Mianus         500     500     1
    Round Hill     900     900     1

Notes:
1. Aggregating functions: avg, max, min, sum, count
2. avg/max return the average/max for each group

SQL select: group by, having


having is used to screen out groups from the output.
Example: Report the small loans (<= 1500) held by 2 or more people.

Tables: LOAN and BORROWS, as listed in the earlier examples.

    select loan_number, amount, count( loan_number) as no_debtors
    from loan, borrows
    where loan_number = loan_no and amount <= 1500
    group by loan_number, amount
    having count( loan_number) >= 2

Result:
    loan_number    amount    no_debtors
    L17            1000      3

Notes:
- Every non-aggregated attribute in the select list (here loan_number and amount) should appear in group by.
- having conditions are applied only after rows have been grouped.
- order by used with group by will be applied to groups.
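The two grouping examples can be checked together. A minimal sketch, assuming SQLite via Python's sqlite3 module; count(*) is used in place of count(branch_name) or count(loan_number), which gives the same counts here, and branch_name is added as a second sort key (an addition to the slide's query) so that ties come out in a fixed order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, amount integer, branch_name text)")
conn.execute("create table borrows (customer text, loan_no text)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"), ("L15", 1500, "Pennyridge"),
    ("L93", 500, "Mianus"), ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
])
conn.executemany("insert into borrows values (?, ?)", [
    ("111-12-0000", "L17"), ("222-12-0000", "L23"), ("333-12-0000", "L15"),
    ("444-00-0000", "L93"), ("666-12-0000", "L17"), ("111-12-0000", "L11"),
    ("999-12-0000", "L17"), ("777-12-0000", "L16"),
])

# One output row per branch; the aggregates are computed within each group.
per_branch = conn.execute("""
    select branch_name, avg(amount) as avg_amount, max(amount) as max_amount,
           count(*) as no_loans
    from loan
    group by branch_name
    order by no_loans desc, branch_name
""").fetchall()

# where filters rows before grouping; having filters the groups afterwards.
shared_small_loans = conn.execute("""
    select loan_number, amount, count(*) as no_debtors
    from loan, borrows
    where loan_number = loan_no and amount <= 1500
    group by loan_number, amount
    having count(*) >= 2
""").fetchall()

print(per_branch)
print(shared_small_loans)
```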

SQL select: date functions


SQL provides special functions to handle dates, times and strings. The functions used below, datediff and getdate, are the Transact-SQL (Microsoft SQL Server, Sybase) names; other DBMSs provide equivalents under different names.
Example: report those customers who have been inactive for over 5 years

DEPOSIT: as listed in the earlier examples.

    select c_ssn
    from deposit
    where datediff( yy, accessDate, getdate( ) ) > 5

Result:
    c_ssn
    888-12-0000

(matching DEPOSIT row: ac_num A201, accessDate Mar 1, 98)

datediff units: yy (years), ..., ns (nanoseconds)

SQL select: string functions


It is often useful to use wild-cards for string matching.

CUSTOMER
    ssn            name        street     city          banker         b_type
    111-12-0000    Jones       Main       Harrison      321-32-4321    CRM
    222-12-0000    Smith       North      Rye           321-32-4321    CRM
    333-12-0000    Hayes       Main       Harrison      321-32-4321    CRM
    444-12-0000    Curry       North      Rye           333-11-4444    LO
    555-12-0000    Turner      Putnam     Stamford      888-99-9999    DO
    666-12-0000    Williams    Nassau     Princeton     333-11-4444    LO
    777-12-0000    Adams       Spring     Pittsfield    123-45-6789    LO
    888-12-0000    Johnson     Alma       Palo Alto     888-99-9999    DO
    999-12-0000    Brooks      Senator    Brooklyn      123-45-6789    LO
    000-12-0000    Lindsay     Park       Pittsfield    888-99-9999    DO

    select ssn, name, street, city
    from customer
    where name LIKE 'J%' or street LIKE '[^mnp]%' or city LIKE '%[ ]%'

Wildcards:
- %       matches zero or more chars
- [asd]   matches one char out of the list a, s, d
- [^asd]  matches any one char except a, s, d

Result:
    ssn            name       street     city
    111-12-0000    Jones      Main       Harrison
    777-12-0000    Adams      Spring     Pittsfield
    888-12-0000    Johnson    Alma       Palo Alto
    999-12-0000    Brooks     Senator    Brooklyn

SQL as a DML: update command


To modify an entry in a cell:

    update loan
    set amount = amount - 200
    where loan_number in ( select loan_no
                           from borrows, customer
                           where customer = ssn and name = 'Jones')

Tables: LOAN, BORROWS and CUSTOMER, as listed in the earlier examples.

Note: customer Jones (111-12-0000) holds two loans, L17 and L11, so the inner query returns two rows; the outer condition must therefore use in rather than =.

    select * from loan

LOAN (after the update)
    loan_number    amount    branch_name
    L17            800       Downtown
    L23            2000      Redwood
    L15            1500      Pennyridge
    L93            500       Mianus
    L11            700       Round Hill
    L16            1300      Pennyridge
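The update can be replayed on a small copy of the data. A minimal sketch, assuming SQLite via Python's sqlite3 module; only the ssn and name columns of CUSTOMER are loaded, for three of the customers, and the column references in the inner query are written with their table names to avoid any confusion between the table customer and the column borrows.customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, amount integer, branch_name text)")
conn.execute("create table borrows (customer text, loan_no text)")
conn.execute("create table customer (ssn text, name text)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"), ("L15", 1500, "Pennyridge"),
    ("L93", 500, "Mianus"), ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
])
conn.executemany("insert into borrows values (?, ?)", [
    ("111-12-0000", "L17"), ("222-12-0000", "L23"), ("333-12-0000", "L15"),
    ("444-00-0000", "L93"), ("666-12-0000", "L17"), ("111-12-0000", "L11"),
    ("999-12-0000", "L17"), ("777-12-0000", "L16"),
])
conn.executemany("insert into customer values (?, ?)", [
    ("111-12-0000", "Jones"), ("222-12-0000", "Smith"), ("333-12-0000", "Hayes"),
])

# Reduce by 200 every loan held by a customer named Jones.
conn.execute("""
    update loan
    set amount = amount - 200
    where loan_number in (select borrows.loan_no
                          from borrows, customer
                          where borrows.customer = customer.ssn
                            and customer.name = 'Jones')
""")

# Map each loan number to its amount after the update.
amounts = dict(conn.execute("select loan_number, amount from loan"))
print(amounts)
```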

SQL as a DML: delete command


To delete rows from a table:

    delete from loan;
        (all rows of the loan table are deleted)

    delete from customer where name = 'Jones';
        (request to delete the row of the customer table with name = 'Jones' [will it succeed?])

Tables: BORROWS and CUSTOMER, as listed in the earlier examples.

Views in SQL
A view is a virtual table defined on a given database.

The columns of the view are either
(i) columns from some (actual or virtual) table of the DB, or
(ii) columns that are computed (from other columns).

Main uses of a view:
- Security (selective display of information to different users)
- Ease of use
  -- Explicit display of derived attributes
  -- Explicit display of related information from different tables
  -- Intermediate table can be used to simplify SQL query

Views in SQL..
Create a view showing the names of employees, their ssn, telephone number, their manager's name, and how many years they have worked in the bank.

    create view bank_employee as
    select e.e_ssn as ssn, e.e_name as name, e.tel as phone, m.e_name as manager,
           datediff( yy, e.start_date, getdate( )) as n_years
    from EMPLOYEE as e, EMPLOYEE as m
    where e.mgr_ssn = m.e_ssn

    select * from bank_employee

Result:
    ssn            name      phone    manager    n_years
    111-22-3333    Jones     12345    Adams      15
    333-11-4444    Smith     54321    Jones      12
    123-45-6789    Lee       54321    Jones      12
    555-66-8888    Turner    55555    Adams      8
    987-65-4321    Jones     87621    Chan       15
    888-99-9999    Chan      87654    Black      30
    321-32-4321    Adams     77777    Black      30
    777-77-7777    Black     99111    null       30

Note: Black has no manager (mgr_ssn is null), so this row is returned only if the join is written as a left outer join.

Operations on Views
- The view definition is persistent: once you define it, the definition stays permanently in the DB until you drop the view.
- The DBMS only computes the data in a view when it is referenced in a SQL command (e.g. in a select command); no physical table corresponding to the view is kept in storage.
- You can use the view in any SQL query just the same as any other table, BUT
  (1) You cannot modify the value of a computed attribute.
  (2) If an update/delete command is executed on the view, the underlying data in the referenced table of the view is updated/deleted. [This can cause unexpected changes in your DB.]
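The first two points can be observed directly. The sketch below assumes SQLite via Python's sqlite3 module; the view large_loans and the change of loan L17 to 5000 are hypothetical, made up for this illustration. SQLite views are read-only, so point (2) cannot be shown with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, amount integer, branch_name text)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"), ("L15", 1500, "Pennyridge"),
    ("L93", 500, "Mianus"), ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
])

# Hypothetical view: only its definition is stored, not a copy of the rows.
conn.execute("""
    create view large_loans as
    select loan_number, amount from loan where amount > 1200
""")


def count_large_loans():
    """Query the view like any other table."""
    return conn.execute("select count(*) from large_loans").fetchone()[0]


before = count_large_loans()
# Hypothetical change to the underlying table (not part of the slides).
conn.execute("update loan set amount = 5000 where loan_number = 'L17'")
after = count_large_loans()  # recomputed from loan when the view is referenced

# The persistent part is the definition, kept in SQLite's catalog table.
definition = conn.execute(
    "select sql from sqlite_master where type = 'view' and name = 'large_loans'"
).fetchone()[0]

print(before, after)
print(definition)
```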

Concluding remarks on SQL


- SQL has some other useful commands and operators.
- In addition, most DBMSs provide many non-standard operators and services to facilitate information system deployment and administration.
- DBMSs can handle very large amounts of data, and process queries very fast: IBM's DB2 can handle over 6m transactions per minute (tpm); Oracle 10g, over 4m tpm.
- To speed up queries, you can use indexes.
- Common DBMSs: IBM DB2, Oracle 10g, Microsoft SQL Server, Sybase, MySQL. All support SQL.

Database APIs
Most people use DBs, but always through some computer program interface (API).

Most DBMSs provide program libraries (collections of compiled functions) with functions to:
- Connect to the DBMS
- Select a DB
- Send a SQL command, and receive the response in some standard data structure.

Each DBMS provides one library for each programming language. On Windows (and several other) systems, these libraries are called ODBC.

[Figure: a client app (your code, ODBC function calls, more code) linked with the ODBC library (DLL) sends a SQL query to the DBMS, which accesses the DB and returns the response.]
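The three library functions listed above (connect, select a DB, send SQL and receive the response) map onto very little code. The sketch below uses Python's standard sqlite3 module as a stand-in for an ODBC library, which is an assumption made so that the example runs without a database server; the function names open_bank_db and loans_above are hypothetical.

```python
import sqlite3


def open_bank_db():
    """Connect and select a DB; with SQLite both happen in connect()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table loan (loan_number text, amount integer, branch_name text)")
    conn.executemany("insert into loan values (?, ?, ?)", [
        ("L17", 1000, "Downtown"), ("L23", 2000, "Redwood"), ("L15", 1500, "Pennyridge"),
        ("L93", 500, "Mianus"), ("L11", 900, "Round Hill"), ("L16", 1300, "Pennyridge"),
    ])
    conn.commit()
    return conn


def loans_above(conn, threshold):
    """Send one SQL query and return the response as a list of dicts."""
    cur = conn.cursor()
    # The ? placeholder lets the library pass the value safely to the DBMS.
    cur.execute("select loan_number, amount, branch_name from loan "
                "where amount > ? order by loan_number", (threshold,))
    columns = [col[0] for col in cur.description]  # column names of the result
    return [dict(zip(columns, row)) for row in cur.fetchall()]


conn = open_bank_db()
big = loans_above(conn, 1200)
for record in big:
    print(record)
conn.close()
```

With an ODBC or vendor library the calls have different names, but the sequence of steps is the same.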

Bank tables..
BRANCH
    branch_name    city          assets
    Downtown       Brooklyn      9000000
    Redwood        Palo Alto     2100000
    Pennyridge     Horseneck     1700000
    Mianus         Horseneck     400000
    Round Hill     Horseneck     8000000
    Pownal         Bennington    300000
    North Town     Rye           3700000
    Brighton       Brooklyn      7100000

EMPLOYEE
    e_ssn          e_name    tel      start_date    mgr_ssn
    111-22-3333    Jones     12345    Nov-2005      321-32-4321
    333-11-4444    Smith     54321    Mar-1998      111-22-3333
    123-45-6789    Lee       54321    Mar-1998      111-22-3333
    555-66-8888    Turner    55555    Aug-2002      321-32-4321
    987-65-4321    Jones     87621    Mar-1995      888-99-9999
    888-99-9999    Chan      87654    Feb-1980      777-77-7777
    321-32-4321    Adams     77777    Feb-1990      777-77-7777
    777-77-7777    Black     99111    Jan-1980      null

CUSTOMER
    ssn            name        street     city          banker         b_type
    111-12-0000    Jones       Main       Harrison      321-32-4321    CRM
    222-12-0000    Smith       North      Rye           321-32-4321    CRM
    333-12-0000    Hayes       Main       Harrison      321-32-4321    CRM
    444-12-0000    Curry       North      Rye           333-11-4444    LO
    555-12-0000    Turner      Putnam     Stamford      888-99-9999    DO
    666-12-0000    Williams    Nassau     Princeton     333-11-4444    LO
    777-12-0000    Adams       Spring     Pittsfield    123-45-6789    LO
    888-12-0000    Johnson     Alma       Palo Alto     888-99-9999    DO
    999-12-0000    Brooks      Senator    Brooklyn      123-45-6789    LO
    000-12-0000    Lindsay     Park       Pittsfield    888-99-9999    DO

DEPOSIT
    c_ssn          ac_num    accessDate
    888-12-0000    A101      Jan 1, 09
    222-12-0000    A215      Feb 1, 09
    333-12-0000    A102      Feb 28, 09
    555-00-0000    A305      Mar 10, 09
    888-12-0000    A201      Mar 1, 98
    111-12-0000    A217      Mar 1, 09
    000-12-0000    A101      Feb 25, 09

BORROWS
    customer       loan_no
    111-12-0000    L17
    222-12-0000    L23
    333-12-0000    L15
    444-00-0000    L93
    666-12-0000    L17
    111-12-0000    L11
    999-12-0000    L17
    777-12-0000    L16

LOAN
    loan_number    amount    branch_name
    L17            1000      Downtown
    L23            2000      Redwood
    L15            1500      Pennyridge
    L93            500       Mianus
    L11            900       Round Hill
    L16            1300      Pennyridge

Not all tables of our normalized design are shown; please create and populate for practice.

References and Further Reading

Silberschatz, Korth, Sudarshan, Database System Concepts, McGraw-Hill

Next: IS for non-structured data
