
Informatica Overview

Contents
- Introduction
- Clients
- Server(s)
- Source, Target, Repository
- Connectivity

What is Informatica?
An ETL tool that allows you to load data into a centralized location, such as a data mart or data warehouse.
- Extract data from multiple sources
- Transform the data according to business logic and needs
- Load the transformed data into file and relational targets

Example
Source: EMPLOYEE (EMP_ID, EMP_NAME, EMP_CITY, EMP_STATE, EMP_COUNTRY, EMP_DATE_OF_JOINING)
Target: EMP_DETAILS (EMP_ID, EMP_NAME, EMP_CITY, EMP_STATE, EMP_COUNTRY, YRS_OF_SRV)

Transform Date of Joining to Yrs of Service
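
The transformation in this example can be sketched in Python. This is only an illustration outside Informatica: the function names are hypothetical, and it assumes that years of service means whole completed years as of an explicitly supplied date, which the slide does not specify.

```python
from datetime import date


def years_of_service(date_of_joining: date, as_of: date) -> int:
    """Whole years completed between the joining date and the as_of date."""
    years = as_of.year - date_of_joining.year
    # Subtract one year if the anniversary has not yet occurred in as_of's year.
    if (as_of.month, as_of.day) < (date_of_joining.month, date_of_joining.day):
        years -= 1
    return years


def to_emp_details(employee: dict, as_of: date) -> dict:
    """Map one EMPLOYEE row to one EMP_DETAILS row."""
    # Carry every column across except the joining date, which is replaced.
    details = {k: v for k, v in employee.items() if k != "EMP_DATE_OF_JOINING"}
    details["YRS_OF_SRV"] = years_of_service(employee["EMP_DATE_OF_JOINING"], as_of)
    return details
```

Passing the reference date as a parameter keeps the calculation repeatable, which matters when a load is rerun on a later day.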

Data Warehousing
[Figure: Operational Sources -> Extract, Transform, Load -> Data Warehouse, with Developer, End User, and Metadata]

- Informatica Architecture
- Design Process
- Client Tool Review
  - Repository Manager
  - Designer
  - Server Manager

Informatica's Architecture
[Figure: Data Models, Designer, Repository Manager, Server Manager, Repository, Sources 1-n, PowerPlugs, Server, Targets 1-n]

Informatica Design Process


[Figure: numbered design-process diagram showing Source Def, Target Def, Mapping, and Sessions]

1. Create Repository
2. Import Source Definitions
3. Create Target Schema
4. Create Mappings
5. Load Data

Informatica Client
Repository Manager - View much of the metadata in the Repository.
Designer - Create source-to-target mappings that contain transformation instructions for the Informatica Server.
Server Manager - Create, schedule, and monitor sessions. You create a session based on a mapping and schedule it to run on the Informatica Server.

Informatica Client
Repository Manager

Metadata Repository
- Information about the data mart system
- Catalogs the repository
- Directs the server
- Contains record of user access
- Can be shared
- Can be searched and reported
- Bridged through Metadata Exchange

Repository Manager

Navigator Window

Analysis Window

Dependency Window

Output Window

Folder Attributes
FOLDER OWNER - user who serves as focal point for folder permissions
PERMISSIONS - rights to read, write, and/or execute objects in a folder
SHARED - property that allows you to make shortcuts to objects in a folder
SHORTCUT - a dynamic link to an object stored in a shared folder
VERSIONS - folder iterations that indicate development stages

Informatica Client
Designer

Designer Workspace
Open Folder List

Navigator Workspace

Output Window Status Bar

Workbook Tabs

Designer Options

General - workspace options, reload objects on open, group source definitions
Tables - columns viewed, column size, object size, object colors
Format - workspace colors, import keys, automatic Source Qualifier creation

Informatica Client
Server Manager

Server Manager
Navigator Configure Window

Monitor Window Output Window

Designer
- Source Analyzer
- Warehouse Designer
- Transformation Developer
- Mapplet Designer
- Mapping Designer

Source Analyzer
- Identify the sources used to build the warehouse
- Create repository definitions for these sources

Analyzing Sources
Relational - Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata
File - Fixed and delimited flat file, COBOL file, and XML
Extended - PeopleSoft, SAP R/3, Siebel, and IBM MQSeries (need to purchase additional products for these sources)
Mainframe - Need to purchase additional products
Other - Microsoft Excel, Microsoft Access

Warehouse Designer
- Create relational tables in the target database
- Edit target definitions
- Preview relational target data

Targets
Relational - Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server, and Teradata
File - Fixed and delimited flat files and XML
Extended - SAP BW, IBM MQSeries (need to purchase additional products for these targets)
Other - Microsoft Access

Mixing Sources and Targets


You can combine data from different platforms and source types.

Oracle Sybase Flat File

Transformation Developer
- Generates, modifies, and passes data through ports
- 12 objects for transforming data

Transformation Types
Source Qualifier - represents all data queried from the source
Normalizer - normalizes records from VSAM or relational sources
Expression - performs simple calculations
Filter - serves as a conditional filter
Aggregator - performs aggregate calculations
Rank - limits records to top or bottom range

Transformation Types (contd.)
Update Strategy - allows for logic to insert, update, delete, or reject data
Lookup - looks up values and passes to other objects
Stored Procedure - calls a stored procedure and captures return values
External Procedure - calls a procedure defined in a shared library
Sequence Generator - generates unique ID values
Joiner - allows for heterogeneous joins

Source Qualifier Transformation
- Represents the records that the Informatica Server reads when it runs a session
- Automatically attached when a Source is added to a mapping

Use a Source Qualifier to:
- Filter records when the Informatica Server reads source data
- Specify sorted ports (ORDER BY clause)
- Select only distinct values from a source
- Create a custom query for the Informatica Server to read source data
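
The Source Qualifier options listed here correspond to clauses of the query issued against a relational source. A minimal Python sketch of that idea follows; the function and parameter names are hypothetical, and it assumes that sorted ports means ordering by the first N ports in port order. The SQL is assembled from strings for illustration only and is not safe for untrusted input.

```python
import sqlite3


def build_source_query(table, ports, source_filter=None,
                       sorted_ports=0, select_distinct=False):
    """Build the SELECT statement a Source Qualifier-like step might issue."""
    keyword = "SELECT DISTINCT" if select_distinct else "SELECT"
    sql = f"{keyword} {', '.join(ports)} FROM {table}"
    if source_filter:
        # Filtering at the source means fewer rows enter the pipeline.
        sql += f" WHERE {source_filter}"
    if sorted_ports > 0:
        # Assumption: sort by the first `sorted_ports` ports, in port order.
        sql += " ORDER BY " + ", ".join(ports[:sorted_ports])
    return sql


def read_source(conn, sql):
    """Run the generated query and return all rows."""
    return conn.execute(sql).fetchall()
```

For example, `build_source_query("EMPLOYEE", ["EMP_CITY"], "EMP_COUNTRY = 'IN'", sorted_ports=1, select_distinct=True)` combines the filter, distinct, and sorted-ports options in one statement.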

Expression Transformation
- Calculate values in a single row
  - Adjust employee salaries, concatenate first and last names, convert string to number
- Perform any non-aggregate calculations
- Test conditional statements before you output to target

Example
Source: EMPLOYEE (EMP_ID, EMP_NAME, ROLE_CODE, BASIC_SALARY)
Target: EMP_SALARY (EMP_ID, EMP_NAME, ROLE_CODE, GROSS_SALARY)

Gross Salary = Basic Salary * 3.5
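
The row-by-row nature of this calculation can be shown with a short Python sketch. The function name is hypothetical, and using `Decimal` for the salary is an assumption made here to avoid floating-point rounding; the slide does not specify a datatype.

```python
from decimal import Decimal

GROSS_MULTIPLIER = Decimal("3.5")  # factor taken from the example above


def expression_transform(row: dict) -> dict:
    """Compute GROSS_SALARY for one EMPLOYEE row, using only that row's values."""
    return {
        "EMP_ID": row["EMP_ID"],
        "EMP_NAME": row["EMP_NAME"],
        "ROLE_CODE": row["ROLE_CODE"],
        # Convert via str so float inputs do not carry binary rounding noise.
        "GROSS_SALARY": Decimal(str(row["BASIC_SALARY"])) * GROSS_MULTIPLIER,
    }
```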

Aggregator Transformation
- Allows you to perform aggregate calculations, such as averages and sums
- While the Expression transformation works on a row-by-row basis, the Aggregator can perform calculations on groups

Example
Source: REVENUE (PU_CODE, PROJECT_CODE, REVENUE)
Target: PU_REVNUE (PU_CODE, MAX_REVENUE, MIN_REVENUE, AVG_REVENUE)

Aggregator Transformation:
Max Revenue = Max(Revenue)
Min Revenue = Min(Revenue)
Avg Revenue = Avg(Revenue)
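
A Python sketch of the same group-level calculation follows. It assumes the rows are grouped by PU_CODE, which the target columns suggest but the slide does not state; the function name is hypothetical.

```python
from collections import defaultdict


def aggregate_revenue(rows):
    """Group REVENUE rows by PU_CODE and compute max, min, and average revenue."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["PU_CODE"]].append(row["REVENUE"])
    # One output row per group, unlike a row-by-row expression.
    return [
        {
            "PU_CODE": pu_code,
            "MAX_REVENUE": max(values),
            "MIN_REVENUE": min(values),
            "AVG_REVENUE": sum(values) / len(values),
        }
        for pu_code, values in sorted(groups.items())
    ]
```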

Filter Transformation
- Provides the means for filtering rows in a mapping
- Example: employees who are currently working in the project NML of WENA as SE
- Only rows that meet the condition pass through the mapping.

Filter Transformation
- All ports are input/output
- Returns TRUE or FALSE for each row passed through the mapping, based on the condition
- Discarded rows do not appear in the session log or reject files
- The input ports must come from only one transformation

Filter vs Source Qualifier (SQ)


- SQ provides better performance
- SQ only lets you filter rows from relational sources; the Filter Transformation filters rows from any source
- SQ only uses standard SQL; Filter can use any statement or function that returns True/False

Example
Source: EMPLOYEE (EMP_ID, EMP_NAME, PROJECT_CODE, PU_CODE, ROLE_CODE)
Target: NM_EMP_DETAILS (EMP_ID, EMP_NAME, PROJECT_CODE, PU_CODE, ROLE_CODE)

Filter Transformation: Where Project = NML and PU = WENA and Role = SE
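
The filter in this example amounts to a condition evaluated once per row. A minimal Python sketch, with hypothetical function names, might look like this:

```python
def filter_condition(row: dict) -> bool:
    """TRUE only for rows in project NML, PU WENA, with role SE."""
    return (
        row["PROJECT_CODE"] == "NML"
        and row["PU_CODE"] == "WENA"
        and row["ROLE_CODE"] == "SE"
    )


def filter_transform(rows, condition=filter_condition):
    """Pass through rows that meet the condition; drop the rest silently."""
    return [row for row in rows if condition(row)]
```

Rows that fail the condition are simply not emitted, which mirrors the point that discarded rows do not appear in the session log or reject files.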

Router Transformation
- Groups data into many groups
- Routes rows of data that do not meet any condition to a default group
- Can enter any expression that returns a single value
- Condition returns True or False for each row
- If the condition = NULL, the row is treated as FALSE

Router Transformation
- One Group can be connected to One transformation or target
- One Output Port can be connected to multiple transformations or targets
- Multiple Output ports in One Group can be connected to multiple transformations or targets
- CANNOT connect more than One Group to One Transformation or Target
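
The routing behaviour described for the Router (named groups, a default group, and NULL treated as FALSE) can be sketched in Python as follows. The function name and the "DEFAULT" key are hypothetical, and the sketch assumes that a row meeting several group conditions is passed to each of those groups.

```python
def route(rows, groups):
    """Route rows into named groups.

    `groups` maps a group name to a condition returning True, False, or None.
    """
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            # None stands in for NULL here and is treated as FALSE.
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            # Rows that meet no condition go to the default group.
            routed["DEFAULT"].append(row)
    return routed
```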

Lookup Transformation
- Looks up data in a relational table
  - Can be the Source, Target, or any database that the Informatica Client and Server can connect to
  - Lookup table can be a single table or can join multiple tables
- Lookups can:
  - Get a related value (your source includes Employee_ID and you want Employee_Name)
  - Perform a calculation
  - Update a slowly changing dimension table (check if records exist on a target)

Lookup Transformation
- For each input row, the Informatica Server queries the lookup table based on the lookup ports and the condition in the transformation
- The Informatica Server can either:
  - return values from that lookup (static cache), or
  - insert a row into the cache to flag rows as new or existing (dynamic cache)

Connected and Unconnected Lookup Transformations

Example
Source: EMPLOYEE_PROJECT (EMP_ID, EMP_NAME, PROJECT_CODE)
Lookup table: PROJECT (PROJECT_CODE, PROJECT_DESC)
Target: NM_EMP_DETAILS (EMP_ID, EMP_NAME, PROJECT_CODE, PROJECT_DESC)

Lookup Transformation: Get PROJECT.PROJECT_DESC Where PROJECT.PROJECT_CODE = NM_EMP_DETAILS.PROJECT_CODE
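
A minimal Python sketch of this lookup follows, using a dictionary built once as a stand-in for a static cache. The function names are hypothetical, and returning `None` for an unmatched PROJECT_CODE is an assumption of this sketch.

```python
def build_static_cache(project_rows):
    """Read the PROJECT lookup table once into a dict keyed on PROJECT_CODE."""
    return {p["PROJECT_CODE"]: p["PROJECT_DESC"] for p in project_rows}


def lookup_transform(employee_rows, cache):
    """Add PROJECT_DESC to each EMPLOYEE_PROJECT row from the cache."""
    output = []
    for row in employee_rows:
        enriched = dict(row)
        # No matching PROJECT_CODE yields None rather than dropping the row.
        enriched["PROJECT_DESC"] = cache.get(row["PROJECT_CODE"])
        output.append(enriched)
    return output
```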

Update Strategy
Two ways of setting an update strategy:
- Within a Session
- Within a Mapping

Update Strategy
- Within a Session: Instruct the Informatica Server how to treat the rows when the session is configured
- Within a Mapping: Use the Update Strategy transformation to flag records for insert, delete, update, or reject

Constants for each Database Operation

Operation   Constant    Numeric Value
Insert      DD_INSERT   0
Update      DD_UPDATE   1
Delete      DD_DELETE   2
Reject      DD_REJECT   3
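
The constants in this table can be modelled as an enumeration, with a function that flags each row. This Python sketch is illustrative only: the flagging rule (reject rows with no key, update existing keys, insert new ones) and the names `RowOp` and `flag_row` are assumptions, not part of the slides.

```python
from enum import IntEnum


class RowOp(IntEnum):
    """Update strategy constants and their numeric values."""
    DD_INSERT = 0
    DD_UPDATE = 1
    DD_DELETE = 2
    DD_REJECT = 3


def flag_row(row: dict, existing_ids: set) -> RowOp:
    """Flag one row for insert, update, or reject (hypothetical rule)."""
    if row.get("EMP_ID") is None:
        return RowOp.DD_REJECT  # no key, so the row cannot be applied
    if row["EMP_ID"] in existing_ids:
        return RowOp.DD_UPDATE  # key already present in the target
    return RowOp.DD_INSERT
```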

Joiner Transformation
- Active Transformation
- Join two flat files
- Join two tables from different databases
- Join a flat file with a relational table
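
Joining a flat file with a relational table can be sketched in Python with the standard `csv` and `sqlite3` modules. The function name, the PROJECT table, and the choice of an inner join on PROJECT_CODE are all assumptions made for illustration.

```python
import csv
import sqlite3


def join_flat_file_with_table(flat_file_lines, conn):
    """Inner-join delimited flat-file rows with the PROJECT table on PROJECT_CODE."""
    # Read the relational side once into memory, keyed on the join column.
    master = dict(conn.execute("SELECT PROJECT_CODE, PROJECT_DESC FROM PROJECT"))
    joined = []
    for row in csv.DictReader(flat_file_lines):
        desc = master.get(row["PROJECT_CODE"])
        if desc is not None:  # inner join: unmatched flat-file rows are dropped
            joined.append({**row, "PROJECT_DESC": desc})
    return joined
```

`flat_file_lines` can be an open file or any iterable of lines whose first line is the header. Values read from the flat file arrive as strings.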

Transformation Overview
Three views:
- Iconized View -- shows transformation in relation to mapping
- Normal View -- shows data flow through transformation
- Edit View -- shows transformation properties and allows for editing

Transformation Overview
Normal view shows data flow through the transformation
- Data passes through I/O ports unchanged
- DATE_ENTERED passes into the transformation through an input port.
- It is used in the MTH port to extract the month.
- The month is passed through the MTH output port to another transformation.

Transformation Overview
Edit view provides flexibility in defining transformation rules
- Define port level handling
- Switch between transformations
- Enter comments
- Make reusable
- Define transformation level properties

Transformations and Expressions


- Calculation or conditional statement
- Used in Expression, Aggregator, Rank, Filter, Update Strategy
- Performs calculation based on ports, functions, operators, variables, literals, constants, and return values from other transformations

Mapplets
- Reusable object
- Include multiple transformations
- Include source definitions
- Multiple groups of output ports

Mapping
- Move and transform data from sources to targets
- Includes source definitions, target definitions, and transformations

[Figure: a Mapping containing Source, Transformations, and Target]

Mapping Designer
Transformation Toolbar Mapping List Iconized

Status Bar

Validation

Three different levels of validation:
- Connection validation
- Expression validation
- Mapping validation

Connection Validation

Examples of invalid connections:
- Connecting ports with mismatched datatypes
- Connecting output ports to a source
- Connecting a source to anything but a Source Qualifier or Normalizer
- Connecting an output to an output, or an input to an input
- Connecting more than one active transformation to another transformation
- Copying columns to a target definition

Expression Validation
- Parse the current expression, with remote port searching (references to a port in another transformation are resolved)
- Parse expression attributes such as filter condition, lookup condition, SQL Query, etc.
- Parse default values

Mapping Validation
Mapping validation takes place with the menu commands:
- Mapping | Validate
- Repository | Save

Mapping validation will:
- Perform connection validation
- Perform expression validation
- Check the mapping flow validation
  - Data from Source Qualifier mapped to a target
  - Targets are connected to transformations

Informatica Server
- Reads information from the Repository
- Extracts data from the Sources and stores the data in memory while it applies the transformation rules you created
- Loads the transformed data into the mapping targets

Transformation Process
[Figure: Repository (Source Def, Mapping, Target Def, Session), Server Manager, Server, Sources, Targets]

Session metadata:
- source information
- target information
- mapping
- scheduling
- error handling
- pre- / post-session scripts
- tuning parameters
- output log information
- transformation overrides

Definitions
Session - A set of instructions that tells the Informatica Server how and when to move data from sources to targets
Batch - A group of sessions that are run together


Process
- Configure server
- Create session
- Run session
- Monitor session
- View logs
- Tune session

Server Configuration
Server Variables
- Establish default directories for files and caches.
- Variables are server specific.
- Allows for easy deployment.
- Can be overridden at session level.
- Changing the variable updates sessions.
- Directories must exist prior to session launch.
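
The relationship between server-level defaults, session-level overrides, and the requirement that directories already exist can be sketched in Python. The function name and the variable keys used with it (such as `cache_dir`) are hypothetical and are not Informatica variable names.

```python
from pathlib import Path


def resolve_directories(server_defaults, session_overrides=None):
    """Merge server-level directory defaults with session-level overrides."""
    # A session override wins over the server default for the same key.
    resolved = {**server_defaults, **(session_overrides or {})}
    # Directories must exist before the session is launched.
    missing = [name for name, path in resolved.items() if not Path(path).is_dir()]
    if missing:
        raise FileNotFoundError(f"directories must exist before launch: {missing}")
    return resolved
```

Because sessions that do not override a key read the server default, changing that default changes the directory those sessions use.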

Server Output

[Figure: server outputs: Target Data, Control, .perf, .dat, .idx, E-mail, .bad, .log, Event Log, Error Log]

Source Settings - Session Wizard


Source
- Select source type
- File, Relational, Heterogeneous

Treat rows as:
- Source Type: Relational, File
- Tells server how to treat source rows
- Insert, update, delete, data driven
- Works in conjunction with Target Options

Source Database
- Source Type: Relational
- Database connection

Source Options...
- Source Type: Relational
  - Database name
- Source Type: File
  - fixed / delimited properties
  - file list
  - FTP properties

Target Settings - Session Wizard


Target
- Select target type
- File, Relational

Flat File Options:
- File properties
- FTP properties
- Loader properties

Target Database
- Target Type: Relational
- Database connection

Target Options:
- Target Type: Relational
- Tell server how to load target
  - Insert, Update, Delete
  - Truncate target
  - Bulk, Normal, Test

Launching Sessions
- Manual start - Manually launch a session from within the Server Manager
- Session Schedule - Schedule a session using business cycle start, stop, and repeat intervals
- Batching - Use batches to run sessions concurrently or sequentially
- Event based - Configure a session to launch based upon the appearance of an indicator file
- Command Line - Launch a session from the command line prompt
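
The event-based option, launching when an indicator file appears, can be sketched in Python as a check that a scheduler would call repeatedly. The function name and the `run_session` callback are hypothetical, and removing the indicator file after the launch is an assumption of this sketch rather than documented behaviour.

```python
from pathlib import Path


def launch_if_indicator_present(indicator_path, run_session, remove_indicator=True):
    """Run the session once if the indicator file exists; report whether it ran."""
    indicator = Path(indicator_path)
    if not indicator.exists():
        return False  # nothing to do yet; the caller polls again later
    run_session()
    if remove_indicator:
        # Assumption: remove the file so the same event does not trigger twice.
        indicator.unlink()
    return True
```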

After the Session Launches...


- Poll/Refresh Session Status
- View Session Details
- View Performance Monitor
- View Logs
- Tune the Session
- Correct Session Problems

Monitor Session
Select Server Requests | Session Details
- Number of rows loaded/failed
- Read/Write throughput
- Most current Server message
- Audit trail in repository or session log

Monitor Session
Performance Monitor
- Select Server Requests | Session Performance Details or open file <sessionname>.perf
- Need to configure session properties to save the performance detail counters
- Helps determine where session performance can be improved

Log Files - Session Wizard


Log Files...
- Log file path and name
- Reject file path and name
- Session log archive options

Connectivity Overview
