
STAGING TABLES - E-R MODELING FOR STAGING TABLES

ACCTS Acct_process_nbr acct_nbr acct_type cust_id acct_start_date acct_end_date ref_acct_nbr Empno

CUSTOMER_DET cust_id name_prefix first_name last_name gender marital_status street_nbr street_name postal_code city_name state_code

CUSTOMER cust_id income age years_with_bank nbr_children gender marital_status

SAVINGS acct_nbr acct_type cust_id ref_acct_nbr Empno minimum_balance per_check_fee account_active acct_start_date acct_end_date starting_balance ending_balance

CHECKING acct_nbr acct_type cust_id ref_acct_nbr Empno minimum_balance account_active acct_start_date acct_end_date starting_balance ending_balance

SERVICES Trans_id Acct_Nbr Service Tran_Amt Tran_Charge Tran_tot_amt

LOAN acct_nbr cust_id Agent_id credit_limit credit_rating account_active acct_start_date acct_end_date starting_balance ending_balance

CREDIT acct_nbr cust_id Agent_id credit_limit credit_rating account_active acct_start_date acct_end_date starting_balance ending_balance

SAVING_TRAN Tran_Id Cust_Id Acct_Nbr Channel_Nbr Session_Id Tran_Duration Tran_Amt Principal_Amt Interest_Amt New_Balance Tran_Date Tran_Time Channel Tran_Code

CHECK_TRAN Tran_Id Cust_Id Acct_Nbr Channel_Nbr Session_Id Tran_Duration Tran_Amt Principal_Amt Interest_Amt New_Balance Tran_Date Tran_Time Channel Tran_Code

AGENT Agent_id Agent_name Agent_type Location

LOAN_TRAN Tran_Id Cust_Id Acct_Nbr Channel_Nbr Session_Id Tran_Duration Tran_Amt Principal_Amt Interest_Amt New_Balance Tran_Date Tran_Time Channel Tran_Code

CREDIT_TRAN Tran_Id Cust_Id Acct_Nbr Channel_Nbr Session_Id Tran_Duration Tran_Amt Principal_Amt Interest_Amt New_Balance Tran_Date Tran_Time Channel Tran_Code

TRANS cust_id acct_type tran_period tran_channel tran_code tran_type tran_count tran_total

EMPLOYEE Empno Empname Deptno Location Manager Salary Designation

WIRE_TRANS Trans_id Acct_Nbr Amt_trans Trans_amt_fee

TABLES - DIMENSIONAL MODELING

PRODUCT_DIM Acct_Key Acct_nbr Acct_type Acct_start_date Acct_end_date Trans_code Trans_id Earnings Transaction_fee Service diagram Account_active Account_bal_credit Channel Ref_acct_nbr

DATE_DIM Date_key DT_calender_Date DT_weekday_full DT_weekend_full DT_calen_week_numb DT_calen_month_numb DT_calen_qtr_numbr DT_calen_monthend DT_calen_quater_number_month DT_calen_year_nmbr DT_calen_FISICALYear

CUSTOMER_DIM Cust_Key Cust_id Name Income Age Year_with_bank nbr_children gender marital_status acct_start_date acct_end_date street_number street_name customer_effi_points customer_track_points customer_ref_points

BANK_TRANS_FACT Cust_key Acct_Key Date_key Trans_key Amt_on_transaction Amt_of_total_earning Amt_on_internet_emi Profit_on_services Profit_on_loan_credit

TRANSACTION_DIM Trans_Key Trans_id Trans_code Channel_nbr Agent_id Session_IT Transaction_charge Transaction_amt Transaction_time

Dimension Overview

Based on the business requirements just listed, the grain and dimensionality of the initial model begin to emerge. We start with a core fact table that records the primary balances of every account at the end of each month. The grain of the fact table is therefore one row for each account at the end of each month. Based on this grain declaration, we initially envision a design with only two dimensions: month and account. A data-centric designer might argue that all the other descriptive information, such as household, branch, and product characteristics, should be embedded as descriptive attributes of the account dimension because each account has only one household, branch, and product associated with it.

While this schema accurately represents the many-to-one and many-to-many relationships in the snapshot data, it does not adequately reflect the natural business dimensions. Rather than collapsing everything into one huge account dimension table, additional analytic dimensions such as product and branch mirror the instinctive way that banking users think about their business. These supplemental dimensions provide much smaller points of entry to the fact table, so they address both the performance and usability objectives of a dimensional model.

Finally, given that the master account dimension in a big bank may approach 10 million members, type 2 slowly changing dimension (SCD) handling of this huge dimension must be kept workable. The product and branch attributes are convenient groups of attributes to remove from the account dimension in order to cut down on the type 2 SCD effects. Later we'll squeeze the changing demographic and behavioral attributes out of the account dimension for the same reasons.
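The grain declaration above can be made concrete with a small sketch. It uses Python's built-in sqlite3 module and a hypothetical table named month_account_snapshot_fact (the table and column names are assumptions for illustration, not part of the model above). A composite primary key on month and account rejects a second row for the same account in the same month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical monthly snapshot fact: one row per account per month end.
conn.execute(
    """
    CREATE TABLE month_account_snapshot_fact (
        month_key       INTEGER NOT NULL,
        account_key     INTEGER NOT NULL,
        primary_balance REAL    NOT NULL,
        PRIMARY KEY (month_key, account_key)  -- enforces the declared grain
    )
    """
)


def load_snapshot(month_key, account_key, balance):
    """Insert one snapshot row; return False if it would violate the grain."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO month_account_snapshot_fact VALUES (?, ?, ?)",
                (month_key, account_key, balance),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = load_snapshot(1, 1001, 250.0)       # accepted
duplicate = load_snapshot(1, 1001, 300.0)   # same account, same month: rejected
next_month = load_snapshot(2, 1001, 300.0)  # same account, new month: accepted
row_count = conn.execute(
    "SELECT COUNT(*) FROM month_account_snapshot_fact"
).fetchone()[0]
```

Declaring the grain as a key constraint is one possible design choice; a load process could equally check for duplicates before inserting.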

The product and branch dimensions are two separate dimensions because there is a many-to-many relationship between products and branches. They both change slowly but on different rhythms. Most important, business users think of them as basic, distinct dimensions of the banking business.

Based on further study of the bank's requirements, we ultimately choose the following dimensions for our initial schema: month end date, account, household, branch, product, and status. At the intersection of these six dimensions, we take a monthly snapshot and record the primary balance and any other metrics that make sense across all products, such as interest paid, interest charged, and transaction count. Remember that account balances are just like inventory balances in that they are not additive across any measure of time. Instead, we must average the account balances by dividing the balance sum by the number of months.

Product Dimension

The product dimension consists of a simple product hierarchy that describes all the bank's products, including the name of the product, type, and category. The need to construct a generic product categorization in the bank is the same need that causes grocery stores to construct a generic merchandise hierarchy. The main difference between the bank and grocery store examples is that the bank also develops a large number of custom product attributes for each product type. We'll defer discussion regarding the handling of these custom attributes until the end of this chapter.

The account status dimension records the condition of the account at the end of each month. The status records whether the account is active or inactive or whether a status change occurred during the month, such as a new account opening or an account closure. Rather than whipsawing the large account dimension or merely embedding a cryptic status code or abbreviation directly in the fact table, we treat status as a full-fledged dimension with descriptive status decodes, groupings, and status reason descriptions as appropriate. In many ways we could consider the account status dimension to be another example of a minidimension.

Customer Dimension

Rather than focusing solely on the bank's accounts, users also want the ability to analyze the bank's relationship with a customer. They are interested in understanding the overall profile of a customer, the magnitude of the existing relationship with the customer, and what additional products should be sold to the customer. They also want to track customer demographics such as income, age, and number of children. These demographic attributes change over time; as you might suspect, the users want to track the changes. If the bank focuses on accounts for commercial entities rather than consumers, it likely has similar requirements to identify and link corporate families.

From the bank's perspective, a customer may comprise several accounts and individual account holders. For example, consider John and Mary Smith as a single customer household. John has a checking account, and Mary has a savings account. In addition, John and Mary have a joint checking account, credit card, and mortgage with the bank. All five of these accounts are considered to be a part of the same Smith household despite the fact that minor inconsistencies may exist in the operational name and address information.

The process of relating individual accounts to households (or the commercial business equivalent of a residential household) is not to be taken lightly. Householding requires the development of business rules and algorithms to assign accounts to households. There are specialized products and services to do the matching necessary to determine household assignments. It is very common for a large financial services organization to invest significant resources in specialized capabilities to support its householding needs.
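The household grouping and the earlier rule that balances are not additive across time can be illustrated together. The sketch below uses made-up month-end balances for two of the Smith accounts (all figures and names are hypothetical). Balances are summed across accounts within a month, which is valid, and then averaged across months rather than summed.

```python
from collections import defaultdict

# Hypothetical month-end snapshots: (household, account, month, balance).
snapshots = [
    ("Smith", "john_checking", 1, 100.0),
    ("Smith", "john_checking", 2, 200.0),
    ("Smith", "john_checking", 3, 300.0),
    ("Smith", "mary_savings", 1, 1000.0),
    ("Smith", "mary_savings", 2, 1000.0),
    ("Smith", "mary_savings", 3, 1600.0),
]


def average_monthly_balance(rows, household):
    """Sum balances across accounts per month, then average over the months."""
    per_month = defaultdict(float)
    for row_household, _account, month, balance in rows:
        if row_household == household:
            per_month[month] += balance
    if not per_month:
        raise ValueError(f"no snapshots for household {household!r}")
    return sum(per_month.values()) / len(per_month)


smith_average = average_monthly_balance(snapshots, "Smith")

# Adding balances across months overstates the relationship: this total
# counts the same money once per month it was on deposit.
naive_sum = sum(b for hh, _a, _m, b in snapshots if hh == "Smith")
```

Here the household holds 1100, 1200, and 1900 at the three month ends, so the average monthly balance is 1400, while the naive sum across time is 4200.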

We decide to treat account and customer as separate dimensions because of the size of the account dimension and the volatility of the account constituents within a household, as referenced earlier. In a large bank, the account dimension is huge, with easily over 10 million rows that group into several million households. The customer dimension provides a somewhat smaller point of entry into the fact table without traversing a 10-million-row account dimension table. In addition, given the changing nature of the relationship between accounts and customers, we elect to use the fact table to capture the relationship rather than merely including the household attributes on each account dimension row. In this way we avoid using the type 2 SCD approach with the large account dimension.

Other Dimensions

So far we have discussed customer and product analysis. The bank also needs agent, transaction, and employee dimensions. The agent dimension maintains a history of agent information by location and supports passing other agents' policies on to agents. The transaction dimension is maintained daily for credit accounts and holds the complete details of each credit transaction. The employee dimension holds employee information by bank location.

Time Dimension

So far we've restricted our discussions in this financial services chapter to month-end balance snapshots because this level of detail typically is sufficient for analysis. If required, we could supplement the monthly-grained snapshot fact table with a second fact table that provides merely the most current snapshot as of the last nightly update or perhaps is extended to provide daily-balance snapshots for the last week or month. However, what if we face the requirement to report an account's balance at any arbitrarily picked historical point in time? Creating daily-balance snapshots for a large bank over a lengthy historical time span would be overwhelming given the density of the snapshot data. If the bank has 10 million accounts, daily snapshots translate into approximately 3.65 billion fact rows per year.

Assuming that business requirements already have driven the need to make transaction detail data available for analysis, we could leverage this transaction detail to determine an arbitrary point-in-time balance. To simplify matters, we'll boil the account transaction fact table down to an extremely simple design. The transaction type key joins to a small dimension table of permissible transaction types. The transaction sequence number is a continuously increasing number running for the lifetime of the account. The final flag indicates whether this is the last transaction for an account on a given day. The transaction amount is self-explanatory. The balance fact is the ending account balance following the transaction event.

Here we are taking advantage of a special situation that exists with the surrogate date key. The date key is a set of integers running from 1 to N with a meaningful, predictable sequence. We assign consecutive integers to the date surrogate key so that we can physically partition a large fact table based on the date. This neatly segments the fact table so that we can perform discrete administrative actions on certain date ranges, such as moving archived data to offline storage or dropping and rebuilding indexes. The date dimension is the only dimension whose surrogate keys have any embedded semi-intelligence. Due to its predictable sequence, it is the only dimension on which we dare place application constraints. This ordering is what allows a query to locate the most recent prior end-of-day transaction.

Fact Overview

The heterogeneous product technique just discussed is appropriate for fact tables in which a single logical row contains many product-specific facts. Snapshots usually fit this pattern.

On the other hand, transaction-grained fact tables often have a single fact that is generically the target of a particular transaction. In such cases the fact table has an associated transaction dimension that interprets the amount column. In the case of transaction-grained fact tables, we typically do not need specific line-of-business fact tables. We get by with only one core fact table because there is only one fact. However, we still can have a rich set of heterogeneous products with diverse attributes. In this case we would generate the complete portfolio of custom product dimension tables and use them as appropriate, depending on the nature of the application. In a cross-product analysis, we would use the core product dimension table because it is capable of spanning any group of products. In a single-product analysis, we optionally could use the custom-product dimension table instead of the core dimension if we wanted to take advantage of the custom attributes specific to that product type.
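The transaction-grained design, combined with the ordered date key from the Time Dimension discussion, can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the original SQL. The table name account_transaction_fact, its column names, and the sample rows are assumptions. The query finds the latest end-of-day transaction on or before the requested date and returns its balance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account_transaction_fact (
        account_key          INTEGER NOT NULL,
        date_key             INTEGER NOT NULL,  -- consecutive integers in calendar order
        transaction_type_key INTEGER NOT NULL,
        transaction_seq_nbr  INTEGER NOT NULL,  -- increases over the account's lifetime
        final_flag           INTEGER NOT NULL,  -- 1 = last transaction of that day
        amount               REAL    NOT NULL,
        balance              REAL    NOT NULL   -- ending balance after the transaction
    )
    """
)
conn.executemany(
    "INSERT INTO account_transaction_fact VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1001, 10, 1, 1, 0, 500.0, 500.0),
        (1001, 10, 2, 2, 1, -100.0, 400.0),
        (1001, 14, 1, 3, 1, 50.0, 450.0),
        (1001, 20, 2, 4, 1, -200.0, 250.0),
    ],
)


def balance_as_of(account_key, date_key):
    """Return the end-of-day balance in effect on date_key, or None if none exists."""
    row = conn.execute(
        """
        SELECT balance
        FROM account_transaction_fact
        WHERE account_key = ?
          AND final_flag = 1
          AND date_key = (
              SELECT MAX(date_key)
              FROM account_transaction_fact
              WHERE account_key = ? AND final_flag = 1 AND date_key <= ?
          )
        """,
        (account_key, account_key, date_key),
    ).fetchone()
    return None if row is None else row[0]


# Row count that daily snapshots would require for 10 million accounts.
daily_snapshot_rows_per_year = 10_000_000 * 365
```

For example, balance_as_of(1001, 12) falls back to the end-of-day row for date key 10 because no transactions occurred on days 11 and 12. The MAX(date_key) comparison is only meaningful because the date surrogate keys are assigned in calendar order.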

SYSTEM DEVELOPMENT

5.1 Specifications for Target Tables

Target table: Bank Transaction

Target Column          Key    Source Column          Condition                                                   Lookup
CUST_KEY               PK/FK  CUST_ID                IF CUSTOMER_DIM.CUST_ID = BANK_TRANS_SOURCE.CUST_ID         Lookup on CUSTOMER_DIM (CUST_KEY, CUST_ID)
ACCT_KEY               PK/FK  ACCT_NBR               IF PRODUCT_DIM.ACCT_NBR = BANK_TRANS_SOURCE.ACCT_NBR        Lookup on PRODUCT_DIM (ACCT_KEY, ACCT_NBR)
AGENT_KEY              PK/FK  AGENT_ID               IF AGENT_DIM.AGENT_ID = BANK_TRANS_SOURCE.AGENT_ID          Lookup on AGENT_DIM (AGENT_KEY, AGENT_ID)
TRAN_KEY               PK/FK  TRANS_ID               IF TRANSACTION_DIM.TRANS_ID = BANK_TRANS_SOURCE.TRANS_ID    Lookup on TRANSACTION_DIM (TRAN_KEY, TRANS_ID)
EMP_KEY                PK/FK  EMPNO                  IF EMP_DIM.EMPNO = BANK_TRANS_SOURCE.EMPNO                  Lookup on EMP_BANKDIM (EMP_KEY, EMPNO)
AMT_ON_TRANSACTION            AMT_ON_TRANSACTION
AMT_OF_TOTAL_EARNINGS         AMT_OF_TOTAL_EARNINGS
PROFIT_ON_SERVICES            PROFIT_ON_SERVICES

Additional values listed in the specification: SYSDATE, ORACLE USERNAME
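The lookups in this specification amount to replacing each natural key from the source row with the surrogate key held in the matching dimension. The sketch below illustrates that step with in-memory dictionaries standing in for the dimension tables. The sample key values and the choice to skip a row with an unmatched key are assumptions, not part of the specification.

```python
# Each target key column maps to (source column, {natural key: surrogate key}).
# The dictionary contents are hypothetical stand-ins for the dimension tables.
LOOKUPS = {
    "CUST_KEY": ("CUST_ID", {501: 1, 502: 2}),
    "ACCT_KEY": ("ACCT_NBR", {9001: 10}),
    "AGENT_KEY": ("AGENT_ID", {7: 70}),
    "TRAN_KEY": ("TRANS_ID", {3001: 300}),
    "EMP_KEY": ("EMPNO", {42: 4}),
}

# Measures are copied from the source row unchanged.
MEASURES = ("AMT_ON_TRANSACTION", "AMT_OF_TOTAL_EARNINGS", "PROFIT_ON_SERVICES")


def to_fact_row(source_row):
    """Resolve surrogate keys for one source row; return None if any lookup fails."""
    fact_row = {}
    for key_column, (source_column, dimension) in LOOKUPS.items():
        natural_key = source_row[source_column]
        if natural_key not in dimension:
            return None  # assumed policy: skip rows with no matching dimension row
        fact_row[key_column] = dimension[natural_key]
    for measure in MEASURES:
        fact_row[measure] = source_row[measure]
    return fact_row


good_source = {
    "CUST_ID": 501, "ACCT_NBR": 9001, "AGENT_ID": 7, "TRANS_ID": 3001, "EMPNO": 42,
    "AMT_ON_TRANSACTION": 120.0, "AMT_OF_TOTAL_EARNINGS": 3.5, "PROFIT_ON_SERVICES": 1.25,
}
orphan_source = dict(good_source, CUST_ID=999)  # customer not present in the dimension

good_fact = to_fact_row(good_source)
orphan_fact = to_fact_row(orphan_source)
```

A production load might route unmatched rows to a reject table or assign a default "unknown" dimension key instead of dropping them.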

5.2 Staging Column Specification

Target Table Name: CUSTOMER_DETAILS
Source File Name: CUSTOMER_DETAILS

Column Name        PK/FK  Format        Null  Source Table        Mapping
CUST_ID            PK     NUMBER        N     CUSTOMER_DETAILS    1-to-1
INCOME                    NUMBER        N     CUSTOMER_DETAILS    1-to-1
AGE                       NUMBER        N     CUSTOMER_DETAILS    1-to-1
YEARS_WITH_BANK           NUMBER        Y     CUSTOMER_DETAILS    1-to-1
NBR_CHILDERN              NUMBER        Y     CUSTOMER_DETAILS    1-to-1
STREET_NBR                NUMBER        Y     CUSTOMER_DETAILS    1-to-1
STREET_NAME               DATE          Y     CUSTOMER_DETAILS    1-to-1
POSTAL_CODE               VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
CITY_NAME                 VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
STATE_CODE                VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
NAME_PREFIX               VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
FIRST_NAME                VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
LAST_NAME                 VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
GENDER                    VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
MARITAL_STATUS            VARCHAR2(30)  Y     CUSTOMER_DETAILS    1-to-1
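A column specification like this one can drive a simple row check before data is loaded into staging. The sketch below encodes a subset of the columns with their format and null rules and reports violations. The function name validate_row and the error messages are hypothetical, and the checks are a rough approximation of NUMBER and VARCHAR2(30) rather than Oracle's actual type rules.

```python
# Subset of the staging specification: column -> format and whether NULL is allowed.
STAGING_SPEC = {
    "CUST_ID": {"format": "NUMBER", "nullable": False},
    "INCOME": {"format": "NUMBER", "nullable": False},
    "AGE": {"format": "NUMBER", "nullable": False},
    "YEARS_WITH_BANK": {"format": "NUMBER", "nullable": True},
    "FIRST_NAME": {"format": "VARCHAR2(30)", "nullable": True},
}


def validate_row(row):
    """Return a list of rule violations for one staging row (empty if clean)."""
    errors = []
    for column, rule in STAGING_SPEC.items():
        value = row.get(column)
        if value is None:
            if not rule["nullable"]:
                errors.append(f"{column}: required")
            continue
        if rule["format"] == "NUMBER":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"{column}: expected a number")
        elif rule["format"] == "VARCHAR2(30)":
            if not isinstance(value, str):
                errors.append(f"{column}: expected text")
            elif len(value) > 30:
                errors.append(f"{column}: longer than 30 characters")
    return errors


clean = validate_row({"CUST_ID": 1, "INCOME": 50000, "AGE": 40})
dirty = validate_row({"INCOME": "high", "AGE": 40, "FIRST_NAME": "x" * 31})
```

In this example the first row passes, while the second is flagged for a missing CUST_ID, a non-numeric INCOME, and an overlong FIRST_NAME.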

5.3 Dimension Modeling

Relationship between fact and dimension tables

[Figure: the Bank Transactions fact table linked to the Customer, Time, Transaction, and Product dimensions]

Here is the system-generated schema for the Bank Product Analysis.

5.4 SCHEMA

5.5 Source table for Bank Product Analysis


Customer_details
Column Name          Data Type
Cust_Id (PK)         Number
Name prefix          Varchar2(4)
First name           Varchar2(30)
Last name            Varchar2(30)
Gender               Varchar2(10)
Marital status       Varchar2(1)
Street nbr           Number
Street name          Varchar2(30)
Postal code          Varchar2(5)
City name            Varchar2(20)
State code           Varchar2(2)

Customer
Column Name          Data Type
Cust_id (PK)         Number
Income               Number(9,2)
Age                  Number
Years_with_bank      Number
Nbr_childern         Number
Gender               Varchar2(1)
Marital_status       Varchar2(1)

Account
Column Name          Data Type
Acct_process_nbr     Number
Acct_nbr (PK)        Number(16)
Acct_type            Varchar2(2)
Cust_id              Number
Acct_start_date      Date
Acct_end_date        Date
Ref_acct_nbr         Number(16)
Empno                Number(3)

Checking_acct
Column Name          Data Type
Acct_nbr (PK)        Number(3)
Acct_type            Varchar2(2)
Cust_id              Number
Ref_acct_nbr         Number(16)
Empno                Number(3)
Minimum_balance      Number(9,2)
Per_check_fee        Number(9,2)
Account_active       Varchar2(1)
Account_start_date   Date
Acct_end_date        Date
Starting_balance     Number(9,2)
Ending_balance       Number(9,2)

Checking_tran
Column Name          Data Type
Tran_id              Number(9,2)
Cust_id              Number
Acct_nbr             Number(16)
Channel_nbr          Number
Session_id           Number(9,2)
Check_nbr            Number
Tran_duration        Number
Tran_amt             Number(9,2)
Principal_amt        Number(9,2)
Interest_amt         Number(9,2)
New_balance          Number(9,2)
Tran_date            Date
Tran_time            Varchar2(6)
Channel              Varchar2(1)
Tran_code            Varchar2(2)

Savings_acct
Column Name          Data Type
Acct_nbr (PK)        Number(16)
Acct_type            Varchar2(2)
Cust_id              Number
Ref_acct_nbr         Number(16)
Empno                Number(3)
Minimum_balance      Number(9,2)
Account_active       Varchar2(1)
Acct_start_date      Date
Acct_end_date        Date
Starting_balance     Number(9,2)
Ending_balance       Number(9,2)

Savings_tran
Column Name          Data Type
Tran_id (PK)         Number
Cust_id              Number
Acct_nbr             Number(16)
Channel_nbr          Number
Session_id           Number(9,2)
Tran_duration        Number
Tran_amt             Number(9,2)
Principal_amt        Number(9,2)
Interest_amt         Number(9,2)
New_balance          Number(9,2)
Tran_date            Date
Tran_time            Varchar2(6)
Channel              Varchar2(1)
Tran_code            Varchar2(2)

Credit_acct
Column Name          Data Type
Acct_nbr (PK)        Number(16)
Agent_id             Number
Cust_id              Number
Credit_limit         Number(9,2)
Credit_rating        Number
Minimum_balance      Number(9,2)
Account_active       Varchar2(1)
Acct_start_date      Date
Acct_end_date        Date
Starting_balance     Number(9,2)
Ending_balance       Number(9,2)

Credit_tran
Column Name          Data Type
Tran_id (PK)         Number
Cust_id              Number
Acct_nbr             Number(16)
Channel_nbr          Number
Agent_id             Number(9,2)
Session_id           Number(9,2)
Tran_duration        Number
Tran_amt             Number(9,2)
Principal_amt        Number(9,2)
Interest_amt         Number(9,2)
New_balance          Number(9,2)
Tran_date            Date
Tran_time            Varchar2(6)
Channel              Varchar2(1)
Tran_code            Varchar2(2)

Banking_services
Column Name          Data Type
Trans_id             Number(9,2)
Acct_nbr             Number(9,2)
Service              Varchar2(10)
Tran_amt             Varchar2(20)
Tran_charge          Number(9,2)
Tran_tot_amt         Number(9,2)

Bank_trans_source
Column Name          Data Type
cust_id              Number
Acct_nbr             Number
Agent_id             Number
Trans_id             Number
Empno                Number(4)
Transacation_amt     Number(9,2)
Amt_total_earning    Number(9,2)
Profit_on_services   Number(9,2)
