1) What is the Admin Console?
ANS:
The Admin Console is a web-based client tool for administering all the Informatica client tools such as Repository Manager, Designer, etc., and for administering users and groups.
The administrator is responsible for creating the Repository Services and Integration Services and for controlling access to them.
------------------------------------------------------------------------------------------------------------------------------------------
2) Can anyone explain clearly what is meant by a dirty dimension and a junk dimension, with an example?
ANS:
Junk Dimension:
A dimension table containing flags, gender codes, free text, etc. that are not useful for generating reports is called a junk dimension.
Dirty Dimension:
3) I need your help with faster loading from a flat file to an Oracle table.
ANS:
1) Use the external loading option in Informatica at the session level. For this you need to configure SQL*Loader.
2) Split your single flat file in UNIX into about 10 files based on size, enable partitioning in the session, and select the target load type "Bulk".
(or)
I know there is the SQL*Loader feature in Informatica... Is there anything I have to do at the database level? I mean on the table, as the table is currently in an Oracle database.
4) Can anybody explain the difference between
1) a session variable and a mapping variable?
2) a session parameter and a mapping parameter?
ANS:
Session variable:
Yes, they are the same, but if we define the same variable in both the session and the mapping with some value, Informatica will take the value that has been defined in the session.
Mapping variable:
A mapping variable represents a value that can change during the mapping run.
A mapping variable is required to define incremental extraction; it is used together with the Source Qualifier transformation to perform incremental extraction.
5) 1) How do you load the first half of the records to one target and the remaining half to another target?
3) There are 2 workflows, wkf1 and wkf2. wkf2 should be executed only if wkf1 succeeds. How?
4) What is the vi editor?
6) Can a lookup transformation return more than one value?
(I) No, it cannot. It can return only one value. Google it and you will get better answers.
(II) No lookup can return more than one value.
If the lookup finds multiple matches, there are a few options you can select:
1) Custom
2) First
3) Last
In an Expression transformation you can split that field again into two fields and use them as two separate fields.
(III) Using an unconnected lookup we can still return multiple column values. Strictly speaking,
an unconnected lookup returns only one value, so here I am concatenating all the column values
using the || operator and returning that single column from the lookup. After that I split
the column back into multiple values using SUBSTR and INSTR.
Note: enable the lookup SQL override: select name || '~' || sal || '~' || loc || '~' || dname from
emp (give the return port the datatype string and length 10000 in the lookup).
For a given eno the lookup returns all the columns as one string; based on '~' we split the single column into multiple columns.
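The concatenate-and-split idea above can be sketched roughly as follows; the lookup name lkp_emp_info, the port names, and the '~' delimiter are illustrative assumptions, not from the original post.

    Lookup SQL override (returns one concatenated string per employee):
        SELECT empno, ename || '~' || sal || '~' || loc || '~' || dname AS emp_info FROM emp

    Expression transformation ports (variable/output port : expression):
        v_info  : :LKP.lkp_emp_info(empno)
        o_name  : SUBSTR(v_info, 1, INSTR(v_info, '~', 1, 1) - 1)
        o_sal   : SUBSTR(v_info, INSTR(v_info, '~', 1, 1) + 1, INSTR(v_info, '~', 1, 2) - INSTR(v_info, '~', 1, 1) - 1)
        o_loc   : SUBSTR(v_info, INSTR(v_info, '~', 1, 2) + 1, INSTR(v_info, '~', 1, 3) - INSTR(v_info, '~', 1, 2) - 1)
        o_dname : SUBSTR(v_info, INSTR(v_info, '~', 1, 3) + 1)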
7) How do you load the file name of the flat file into the session statistics table, that is, the audit table?
(I) Which flat file name do you want to load? I mean, is it a source or a target, or anything else?
(II) From Informatica 8 onwards it is very simple.
This means after session1 is completed I will run session2 (this is the one that loads into the audit table).
Here I am wondering how we can get the "Currently Processed File Name" from session1 into session2.
------------------------------------------------------------------------------------------------------------------
8) Joiner and Lookup transformations: which one gives better performance?
1) How can we tell which particular transformation (lookup or joiner) will give good performance?
Ans) My answer is that both are the same, but instead of using multiple transformations we use a mapplet; however, the mapplet will not show the transformations....
Both have their own advantages and disadvantages; performance differs based on the requirement.
A Joiner takes the master table into the cache by default, so it always uses a cache for the join.
A Lookup also uses a cache, but you can minimize the cache usage in a lookup, and you can write a SQL override in the lookup to filter only the required fields and records by mentioning some conditions in the override.
2) Instead of multiple transformations we do not use mapplets. Mapplets are preferred for holding repeated logic and reusing it in different mappings. Performance will not differ by changing multiple transformations into a mapplet.
(II) A Lookup is a passive transformation, which means it processes all the records whether the condition is true or false: if the condition is true it sends the corresponding value, and if the condition is false it sends NULL. A Joiner, on the other hand, is active and passes only the records that satisfy the join condition, so the Joiner is the better performer here.
But we cannot say which one is good in general; that depends on the requirement.
------------------------------------------------------------------------------------------------------------
9) 1) How do you load the first half of the records to one target and the remaining half to another target?
3) There are 2 workflows, wkf1 and wkf2. wkf2 should be executed only if wkf1 succeeds. How?
4) What is the vi editor?
-----------------------------------------------------------------------------------------------------------------
i/p2: e
f
g
h
ANS:
What I am asking is: are i/p 1 and 2 different files, or are the two ports in the same file?
Source1 --- SQ ---> SEQ --->
Source2 --- SQ ---> SEQ ---> Joiner [it joins the 2 pipelines into one; the condition is Source1.SEQ = Source2.SEQ] ---> load the records into the target.
Here I am generating sequence numbers for the first file and the second file, so based on the sequence numbers I join the two files.
10) Please, can anybody explain the UNIX commands which are frequently used in Informatica development?
ANS:
Regarding UNIX usage in Informatica:
First of all, in Informatica development projects UNIX is not mandatory; some projects use a Windows-based environment. You do not need to know UNIX commands when you are working in a Windows environment.
Even if you are working in a UNIX environment, if the sources and the target are databases then you hardly use the UNIX environment.
UNIX basic commands used in development:
11) How do you split one half of the records to one target and the other half to another target using an unconnected lookup? How is this done? Can anyone explain?
14) How can I look up data without using a Lookup transformation?
15) Table A:
15) table A:
col1
bangalore#chennai#Hyd
Delhi#bombay
Required format
ANS:
I have a flat file; the content of the file is huge and it contains a lot of unwanted data, and we should not use any transformations. The source is connected to the target as a one-to-one mapping only.
How do we remove the unwanted data and how can we achieve this? Please help me.
ANS:
(I) 1) Put some filter condition in the Source Qualifier, like Deptno = 10.
2) Use a pre-session UNIX script to validate the flat file, copy the required records from the flat file to another file, and then trigger the workflow.
(II) Can you please tell me where we have to implement the UNIX script?
(III) On the Informatica server itself.
Move/copy your file into a particular directory on the UNIX server where Informatica is installed, and from there use a UNIX script to do all the data validation and format correction, then trigger your Informatica job.
(IV) In the Source Qualifier transformation, write a query in the SQL override and use the user-defined join, source filter, and select distinct options; with these we can restrict the unwanted data.
17) I have two employee tables with the same structure but with different data. Now I want to find the max(sal) from the two tables and compare them, like max(sal) of emp1 > max(sal) of emp2. If emp1 has the max(sal) I need to pass one set of records, and if emp2 has the max(sal) I need to pass another set of records.
Please help me ASAP.
ANS:
1. The associated port is a default port that appears when we enable the dynamic cache in the Lookup transformation. It is compared with the input ports, and the comparison result populates the "NewLookupRow" port.
2. Data validation means that when source data comes into the staging area we have to validate it; we have 4 techniques for this.
3. Versioning is a good concept in Informatica: when a group of members accesses the same repository and you want to restrict others from accessing your mappings, use version control. There we have 2 options: check in and check out.
Check-in is used to commit the mapping; the Informatica server gives it a version number starting with 1.
Check-out is used to edit the mapping and apply changes.
19) Do you have any idea about an LRF (Load Ready File) in Informatica?
ANS:
(I) An LRF is an "indicator file". Using the Event Wait task in Informatica, we wait for that file in order to trigger the jobs/sessions.
(II) I think LRF is a different concept,
because an "indicator file" maintains row ids and is one of the session log statistics,
a type of output file.
The indicator file gives each and every record a different id:
if it is an insert it gives one id,
if it is an update it gives another id,
and so on.
20) Which transformation is used instead of a lookup to check whether a record already exists or not?
ANS:
(I) Stored Procedure transformation
(II) Joiner transformation
ANS:
@shashank
No need for a Union transformation here......
(III) Taking 3 instances of the target works, but the full set of records is not coming; I have also removed the primary key. When I used a Union transformation they got loaded successfully, with 3 duplicates of every record.
It is defined by the Informatica developers, depending on the logic. Mapping parameters and mapping variables are user-defined variables.
System variable:
---------------------
It is denoted by a triple dollar ($$$).
Example: $$$SessStartTime
@Chocsweet: Please check the points above; what you are saying is not the exact answer.
(III) If $ is used for predefined variables, then do session parameters come under predefined variables?
$session parameter
$InputFile
----------------------------------------------------------------------------------------------------
2. What is pipeline partitioning?
ANS: THROUGHPUT is nothing but the speed at which the Informatica server reads the data from the source and writes the data to the target per second. It is displayed when you double-click a session in the Workflow Monitor; it shows a window with the speed at which the Informatica server reads and writes. Check it out.
ANS:
1. The SQL transformation supports DDL and DML commands, whereas other transformations do not support DDL commands.
2. Use it when you want to create a table dynamically (i.e., while the session is running).
3. Whether to use it also depends on the LLD preparation; if the LLD calls for it, we must use the SQL transformation. It also depends on client requirements.
ANS
There are different situations in which mappings or workflows are locked by other users.
1) The repository gets disconnected while you are working on one of the mappings. If you reconnect to the repository and open the mapping you were working on, sometimes it says the mapping is locked by user (Pavan).
2) If you have not closed the Designer, or the Workflow Manager was not disconnected properly and you directly closed the window, locks will remain on the corresponding mapping or workflow.
3) If two people are working on the same mapping/workflow, the second person will be prompted with a message "m_test mapping is locked by user (pavan)".
If you want to release the locks you have to go to the Repository Manager, check the user locks, and release them.
(III) It is a write-intent lock if some user has the object open.
This lock can be released by disconnecting the respective connection id (user specific) for that object in the Admin Console.
(IV) I think from Infa 8.6.1 onwards this activity has been pushed to the Admin Console to manage,
but in earlier versions all the locks could be managed from the Repository Manager.
ANS:
(I) I think a Normalizer will not work here; maybe it can be achieved by using an Aggregator transformation, but I do not know exactly.
(II) I think the Normalizer transformation would give us the input of this example as its output, if the output of this particular example were used as the Normalizer's input.
27) In which scenario do we go for mapplets? Please tell me a real-time example.
ANS:
(I) Definition and Limits
Mapplets
When the server runs a session using a mapplet, it expands the mapplet. The server then runs the session as it would any other session, passing data through each transformation in the mapplet.
If you use a reusable transformation in a mapplet, changes to it can invalidate the mapplet and every mapping using the mapplet.
Mapplet objects:
(a) Input transformation
(b) Source qualifier
(c) Transformations, as you need
(d) Output transformation
A mapplet cannot include:
- Joiner
- Normalizer
- Pre/Post-session stored procedure
- Target definitions
- XML source definitions
Types of Mapplets:
Copied mapplets are not instances of the original mapplet. If you make changes to the original, the copy does not inherit your changes.
You can use a single mapplet more than once in a mapping.
Ports
Example
In one of my projects we have an error strategy which is applicable to all the mappings.
It captures the error records, flags them, attaches an error message to the error records, and writes them to the error table. This logic is common to all the mappings, so we implemented it in a MAPPLET.
Anyone in any domain can use this as an example of a mapplet.
We route the error records based on whether the key columns are null or contain invalid values; whatever scenarios we consider to be errors, we capture those and send only the error records to the mapplet input.
In the mapplet, we convert a single row into multiple rows if the record contains more than one error; for example, if one row has three errors we create three records out of it and show the three errors for the same record.
In the mapplet:
The mapplet actually only segregates the errors, assigns the error code, looks up the error message table based on the error code, assigns the error message to each error, and also calculates an error count for each record in an Expression transformation; based on that count we set the occurs option in the Normalizer to split the single row into multiple rows.
This part is outside the mapplet; I am just explaining the flow up to loading the data into the error table:
after that, the flow goes from the mapplet output ---> Normalizer ---> error table.
In the Normalizer, we have to set the occurs option based on the maximum number of possible errors.
In the Expression we validate what kind of error occurred and assign a code for each type of error; based on the code we look up the error message file.
There is no specific reason to use an unconnected lookup; you can use a connected one also. We required only one return port from the lookup, so we used an unconnected lookup, which also gives a performance improvement. If you use a connected lookup, it creates a cache for each column in the lookup, which might hurt performance, so we used an unconnected one.
28) Hi All,
I am getting an error while installing Infa 8.6 on my PC. The OS is Vista and the database is Oracle 10g (Vista compatible).
Error details:
Informatica PowerCenter 8.6.0
Use the error below and catalina.out and node.log in the server/tomcat/logs directory on the current machine to get more information. Select Retry to continue the installation.
EXIT CODE: S
ANS:
(II) Thanks a lot, Aparna... As per your suggestions the Infa server got successfully installed.
I found a few errors with respect to the configuration in the Admin Console and Client; I will let you know later.
29) I am new to Informatica... Can anyone explain the incremental aggregation property to me? I know we need to check the incremental aggregation property in the session...
but my question is: just by checking that option, how does Informatica filter out only the new records from the source? Do we need to do some kind of lookup in the mappings, or is it enough if we just check the incremental aggregation option?
ANS:
(I) When you send data a second time and you have new values to be aggregated, then you go for incremental aggregation. When you select incremental aggregation in the session properties, the data is copied into a cache, the cache is checked for the new values, and the aggregation is performed.
(II) Thanks Shashank... You mean to say the source data is copied as-is into the cache,
and it compares the new source with the cache and processes only the new records for aggregation?
(III) It is not the source data which goes into the cache; it is the target data, and the source data is compared with this cache.
(IV) Okay... but the target is aggregated... for example,
how does it compare?
(V) As per my knowledge,
the property (checkbox) specified in the session tab would NOT be responsible for filtering out the new records and aggregating them.
The option is used only for incremental aggregation, also called running aggregation.
The onus is on the user to filter the delta records (changed/new) on the source side and pass them on to the Aggregator.
When you check the option in the session properties, Infa creates two sets of caches for data and index (an original set of caches and a backup set of caches, which is used for recovery purposes).
eg:
existing data in the emp table as on 20sep
-------------------------------------------------
empid allowances date
1 100 20sep
2 200 20sep
Note the aggregation of the allowances that has taken place above and also the new record that has been inserted into the table.
(VI) Yes, that is correct, as explained by Shiva Prasad. In real time, incremental aggregation is used, for example, in the telecom industry and the insurance domain.
* In the telecom domain, for postpaid connections, the total balance of what you have talked until today for this month is calculated with incremental aggregation.
30) scenario
id name
1 abc
2 def
3 ghi
1 abc
2 def
2 def
3 ghi
----------
Target
-----------
id name
1 abc
1 abc1
2 def
2 def2
2 def3
3 ghi1
3 ghi2
How do we achieve this logic?
ANS:
(I) select id, name || decode(row_number() over (partition by id order by id), 1, null, row_number() over (partition by id order by id) - 1)
from Table
Use the above query in the Source Qualifier and load the records into your target.
(II) If this is a flat file, how do we solve this?
(III) Source (flat file) -- SQ --> Rank transformation (group by ID and select the number of ranks as, say, 10000) --> Expression (drag all the fields from the Rank transformation: id, name, rank).
Disable the output port in the Expression for the columns name and rank.
Create one output port named name_rn and give it this formula: name || decode(rank, 1, null, rank - 1).
Now connect id and name_rn from the Expression to the target.
--------------------------------------------------------------------------
31) Please tell me about ETL testing using Informatica:
the roles of ETL testing experts, and
at what stage ETL testing people come into the project.
ANS:
(I) Basically ETL testing is of 4 types. 1. Unit test: for developed mappings.
test procedure:
test case:
expected result:
actual result:
status: pass or fail
(II) Apart from mappings... the following test scenarios should be considered:
ANS:
(I) PETL_24049 Failed to get the initialization properties from the master service process for the prepare phase [Session task instance [Session_name]: Unable to read variable definition from parameter file [file_name]] with error code [32522632]
ANS:
(I) Migration of code from dev ---> test ---> prod has different methods; it differs from project to project.
ANS:
(I) Informatica, from 7.1.4 onwards, provides separate repository tables for error logging.
In the session properties you have an option to log the errors into error tables; Informatica builds 4 separate error tables to log the errors.
For error information, Infa 7.1 onwards provides 4 in-built error tables:
PMERR_SESS, PMERR_TRANS, PMERR_DATA, PMERR_MSG. These tables log the error details on a row-by-row basis.
These errors relate only to Informatica functions, validations, or technical errors, not to data-related errors; Informatica logs technical issues related to the Informatica process.
... and 3 columns into target2.
How?
ANS:
(I) From the Source Qualifier, directly connect the first 3 columns to the first target and the remaining 3 columns to the second target.
Siva's answer is correct: you can directly connect the columns to different targets based on the requirement.
(III) I don't think you need to go for a Sorter and a Rank for this solution.
Check the problem clearly; it is simply loading from source to target:
the first three columns need to be loaded to target 1 and the next three columns to target 2.
Without using an Update Strategy transformation, can we update the target in Informatica?
If it is possible, how?
ANS:
(I) By using the session properties (set "Treat Source Rows As" to Update).
ANS: (I) In Cognos you will find the GO Sales database with lakhs of records.
(II) We have the SH (Sales History) and OE (Order Entry) schemas in Oracle, so those may be helpful for your requirement.
Please answer this question for me; I faced this question in a DELL interview.
ANS:
(I) Hi... The only solution that I can think of as of now is to split the file into two based on the size and then load them into their respective targets.
1) Two DB users
2) Oracle port number (default is 1521), Oracle SID (default is ORCL)
And the other parts of the installation are
40) I have 2 workflows, namely wkf1 and wkf2. After the execution of wkf1 we have to execute wkf2. If wkf1 is not executed, wkf2 should not be executed; we have to do this automatically.
ANS:
(I) Use a Command task in workflow1 as the final task and start workflow2 using the pmcmd command. Once all the tasks in workflow1 are executed successfully, the pmcmd command in the Command task will trigger workflow2 automatically.
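For reference, the Command task entry would typically be a pmcmd call along these lines; the service, domain, user, folder, and password values here are placeholders, and the exact options depend on the PowerCenter version:

    pmcmd startworkflow -sv INT_SVC_DEV -d Domain_Dev -u admin -p admin_pwd -f DEV_FOLDER wkf2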
(II) Hi Prasad,
With this approach we cannot achieve all the requirements: you can start wkf2 at any time in your approach, but the requirement is that wkf2 should not run if wkf1 has not succeeded.
The solution in this case is to create a flat file at the end of wkf1 by using touch xyz.txt, and in wkf2 to use an Event Wait task with a file-watch event on that flat file. With this solution, if you run wkf2 it will wait for the flat file to be created; if the file is not created, wkf2 will not run.
ANS:
(I) Use a Sorter transformation and check the Distinct option in the Properties tab.
While loading from a flat file (fixed width) to an Oracle table, the data got loaded successfully, but when I checked the session log file I got the below error:
Severity: ERROR
Timestamp: 9/16/2010 1:31:45 PM
Node: NODE_02
Thread: TRANSF_1_1_1
Process ID: 6749
Message Code: TT_11132
Message: Transformation [e_Donnelly] had an error evaluating output column [v_DATE2]. Error
message is [<<Expression Error>> [TO_DATE]: invalid string for converting to Date
... t:TO_DATE(s:LTRIM(s:RTRIM(s:' ',s:' '),s:' '),s:'YYYY-MM-DD')].
ANS:
(I) Hi Karthik,
This error occurs when incompatible date formats are used. The default date format in PowerCenter is MM/DD/YYYY HH24:MI:SS, and hence SYSDATE is converted to MM/DD/YYYY HH24:MI:SS, but in this case TO_DATE is expecting the input in the DD-MON-YYYY format.
Solution
To resolve this, convert the date format of SYSDATE to the format used by the TO_DATE function.
Example:
TO_DATE(TO_CHAR(SYSDATE,'DD-MON-YYYY'),'DD-MON-YYYY')
The data is getting loaded, but some rows have no value and this is creating errors for those rows.
ANS:
(I) You have to define the join condition in the Source Qualifier.
source1 --- source1_SQ
                        ---------> Expression transformation
source2 --- source2_SQ
I want to send ...
How do I do this?
ANS:
(I) Use an Expression transformation to compare the rows and set a flag, then use a Router transformation.
Table data:
DEPTNO,DEPT_NAME
10,AAA,CHE
10,BBB,BGR
20,CCC,HYD
30,DDD,KMU
30,EEE,TPJ
-----------------------------------------------------
45) Please explain this query:
Select * from emp e where &N = (select count(distinct(sal)) from emp f where f.sal >= e.sal)
Before executing this query you can give the command "set autotrace on". Then you will know how the SQL got executed.
(II) Hi Aparna,
Thanks for your reply, but it is showing the following errors. Could you help me?
SQL> set autotrace on
SP2-0613: Unable to verify PLAN_TABLE format or existence
SP2-0611: Error enabling EXPLAIN report
SP2-0618: Cannot find the Session Identifier. Check PLUSTRACE role is enabled
SP2-0611: Error enabling STATISTICS report
(IV) I am using Oracle 9i. You have to run these scripts before you set autotrace on:
$ORACLE_HOME/sqlplus/admin/plustrce.sql
$ORACLE_HOME/rdbms/admin/utlxplan.sql
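As for what the query itself does: the correlated subquery counts how many distinct salaries are greater than or equal to the current row's salary, so substituting a number for &N returns the employee(s) with the Nth highest salary. A small worked sketch (the sample salaries are illustrative only):

    -- N = 2 returns the employee(s) with the 2nd highest salary
    SELECT *
    FROM   emp e
    WHERE  2 = (SELECT COUNT(DISTINCT f.sal)
                FROM   emp f
                WHERE  f.sal >= e.sal);
    -- e.g. with salaries 5000, 3000, 3000, 1000 the rows with sal = 3000 are returned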
ANS: The Router is an active transformation because it can change the number of rows that pass through it (a row can satisfy more than one group condition or none at all), so it changes the row count of the pipeline.
---------------------------------------------------------------------------------------------------------------
47) Suppose I have one source which is linked to 3 targets. When the workflow runs for the first time, only the first target should be populated and the other two (second and third) should not be populated. When the workflow runs for the second time, only the second target should be populated and the other two (first and third) should not be populated. When the workflow runs for the third time, only the third target should be populated and the other two (first and second) should not be populated. Could anyone help?
ANS:
Table1 : MOD(SEQ_VALUE, 3) = 0
Table2 : MOD(SEQ_VALUE, 3) = 1
Table3 : MOD(SEQ_VALUE, 3) = 2
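One way to obtain such a SEQ_VALUE per run is a mapping variable that counts the workflow runs (incremented once per run, for example with SETVARIABLE in an Expression transformation, or supplied through a parameter file). A rough sketch of the Router group conditions, assuming a variable $$RUN_COUNT holds the current run number (the variable and group names are illustrative):

    TABLE1_GROUP : MOD($$RUN_COUNT, 3) = 1
    TABLE2_GROUP : MOD($$RUN_COUNT, 3) = 2
    TABLE3_GROUP : MOD($$RUN_COUNT, 3) = 0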
id
1
2
3
target2:
id
1
1
1
2
2
2
3
3
3
ANS:
(I) This question is similar to a past post.
To do this...
I need the output in such a way that the unique values are routed into one target and the duplicate values into the other, within a given mapping only.
values into other with in a given mapping only
SNO,SNAME,EDUCATION
101,VIJAY,BTECH
102,PRAMODH,BCOM
104,SHANKAR,MSC
107,SNEHA,MCA
Din & Ramesh is correct. may be u got confused with Jai answer using aggregator. now
aggregator is not required. here i'm giving you the coding check it out.
So what I want to suggest is …. We can have a look up to the existing table … and can
give a condition …. If that found put into table 1 if not put into table 2.
(VII) Shankar wants to findout the duplicate , unique record into two different targets in the
same day.
u are thinking abt next day also.If you use the lookup on the target table first day ur
approach won't workout. first day all the records won't be there in the target table so all
the records go to one target. i can understand ur point but shankar requirement is not
that.
ANS: You are solution is not completed/Correct. may be u got confused with the question.
Check the similar question in the old postshttp://www.orkut.co.in/Main#CommMsgs?
cmm=791012&tid=2543660813794278811&kw=v_count
---------------------------------------------------------------------------------------------------------------
50) What are the source commit and target commit intervals?
Where can we find rejected records and how do we reload those records?
ANS:
(I) The main difference between a Lookup and a Joiner is that
a Lookup can perform non-equijoins, whereas a Joiner performs only equijoins.
The second difference is that whenever we perform a join using a Joiner it needs a primary key - foreign key relationship,
while with a Lookup transformation we just need a matching port.
Target-based commit: the PowerCenter Server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size, the commit interval, and the PowerCenter Server configuration for writer timeout.
Source-based commit: the PowerCenter Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.
User-defined commit: the PowerCenter Server commits data based on transactions defined in the mapping properties. You can also configure some commit and rollback options in the session properties.
Rejected records are saved in the form of a .bad file along with the session log.
(III) The source commit interval is after how many source records you want to commit the data loaded to the target...
To improve the session performance you can increase the commit interval.
51) In an Expression transformation no values are getting passed from the output port... Even if I 'hard code' any value in the output port it is not coming to the next transformation...
ANS:
(I) If the value is coming into the expression then it is not hard-coded.
Hard-coded means that the value remains the same for that port.
For example, if we want a port STATE to have the value 'New York', then we make that port output (checked), input (unchecked), and write 'New York' in the expression part.
51) Hi All,
Can anybody clarify these questions for me?
1) A surrogate key comes under which category?
2) Where do we use a surrogate key (technically)?
3) Which should we use first, and why?
i) Expression transformation
ii) Filter transformation
4) What is the basic difference between a star schema and a snowflake schema?
5) How do we do performance tuning in Informatica?
6) What is the difference between data migration and a data warehouse?
7) What are dynamic and static lookup transformations?
8) Why choose Oracle or SQL Server for a data warehouse?
ANS:
(I) 1) A surrogate key comes under which category?
2) Where do we use a surrogate key (technically)?
A surrogate key is an artificial identifier for an entity. Surrogate key values are generated by the system sequentially (like the Identity property in SQL Server and a Sequence in Oracle).
They do not describe anything.
A primary key is a natural identifier for an entity. For primary keys, all the values are entered manually by the user and are uniquely identified; there will be no repetition of data.
The need for a surrogate key rather than the natural primary key:
If a column is made a primary key and later there needs to be a change in the datatype or the length of that column, then all the foreign keys that depend on that primary key must also be changed, making the database unstable.
Surrogate keys make the database more stable because they insulate the primary and foreign key relationships from changes in the data types and lengths.
But as per the requirement, if necessary you can use an Expression transformation first. For example, you may have to set some flag and, based on that flag, filter records; in such a situation you have to use the Expression first.
(III) 5)How to do performance tuning in Informatica?
The most common performance bottleneck occurs when the PowerCenter Server
writes to a target database.
You can identify performance bottlenecks by the following methods:
♦ Running test sessions. You can configure a test session to read from a flat file source or
to write to a flat file target to identify source and target bottlenecks.
♦ Studying performance details. You can create a set of information called performance
details to identify session bottlenecks. Performance details provide information such as
buffer input and output efficiency.
♦ Monitoring system performance. You can use system monitoring tools to view percent
CPU usage, I/O waits, and paging to identify system bottlenecks.
Once you determine the location of a performance bottleneck, you can eliminate
the Bottleneck by following these guidelines:
♦ Eliminate source and target database bottlenecks. Have the database administrator
optimize database performance by optimizing the query, increasing the database network
packet size, or configuring index and key constraints.
♦ Eliminate mapping bottlenecks. Fine tune the pipeline logic and transformation settings
and options in mappings to eliminate mapping bottlenecks.
♦ Eliminate session bottlenecks. You can optimize the session strategy and use
performance details to help tune session configuration.
♦ Eliminate system bottlenecks. Have the system administrator analyze information from
system monitoring tools and improve CPU and network performance.
(IV) 4) What is the basic difference between a star schema and a snowflake schema?
Snowflake schema: a snowflake schema is a term that describes a star schema structure normalized through the use of outrigger tables, i.e., dimension table hierarchies are broken into simpler tables.
In a star schema every dimension will have a primary key.
• In a star schema, a dimension table will not have any parent table, while in a snowflake schema a dimension table will have one or more parent tables.
• Hierarchies for the dimensions are stored in the dimension table itself in a star schema, whereas hierarchies are broken into separate tables in a snowflake schema. These hierarchies help to drill down the data from the topmost level to the lowermost level.
• A star schema uses fewer joins than a snowflake schema, so the performance is faster.
• Last but not least, a star schema is more common than a snowflake schema.
(V) 6) What is the difference between data migration and a data warehouse?
Data migration: it is the process of migrating data from one database location, either relational or non-relational, to another database location.
7) What are dynamic and static lookup caches? A lookup can be uncached, use a static cache, or use a dynamic cache.
Static cache:
When the condition is true, the PowerCenter Server returns a value from the lookup table or cache.
When the condition is not true, the PowerCenter Server returns the default value for connected transformations and NULL for unconnected transformations.
Dynamic cache:
When the condition is true, the PowerCenter Server either updates rows in the cache or leaves the cache unchanged, depending on the row type.
This indicates that the row is in the cache and the target table; you can pass updated rows to the target table.
When the condition is not true, the PowerCenter Server either inserts rows into the cache or leaves the cache unchanged, depending on the row type.
This indicates that the row is not in the cache or the target table; you can pass inserted rows to the target table.
(VI) 8) Why choose Oracle or SQL Server for a data warehouse?
This is a debatable topic; there can be many reasons, such as technical, financial, user requirements, etc. One should give the reason after analysing all the factors. I have worked in both SQL Server and Oracle; both have pros and cons. One cannot easily say that Oracle is better than SQL Server.
----------------------------------------------------------------------------------------
52) 1) I have one table containing 1000 records. I want to load the first five records into one target and the next 5 records into a second target, alternating like this up to the 1000th record.
2) I have one table containing 10 records and that table contains some duplicates; now I want to load the duplicate records and the original records into two targets.
3) How do we load the same table 10 times into the target in one mapping?
ANS:
(I) Q1 -
4. At the workflow level you should also be able to use a workflow variable and a Decision task, I guess.
5. You can also schedule your workflow to run 10 times, one after another.
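For question 1, a common sketch is a running row counter in an Expression transformation feeding a Router; the port and group names below are illustrative only:

    Expression transformation ports:
        v_rownum (variable port) : v_rownum + 1
        o_group  (output port)   : MOD(FLOOR((v_rownum - 1) / 5), 2)

    Router groups:
        TARGET1_GROUP : o_group = 0    -- rows 1-5, 11-15, ...
        TARGET2_GROUP : o_group = 1    -- rows 6-10, 16-20, ...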
ANS:
(I) I guess the SQL transformation can come to your rescue here.
In the Filter, check the condition count1 > count2. After that you can use a SQL transformation and write your SELECT query there.
54) hi,
in a.txt
-------------
C1,C2,C3
1,2,3
in b.txt
-----------
C1,C2,C3
2,3,4
like this up to n.txt
i want to insert my data into a target as
C1 C2 C3 C4
1 2 3 a.txt(source file name)
2 3 4 b.txt( " )
...................
..................
Briefly,
I want to load into the target
all the data from the sources (a.txt, b.txt, ..., n.txt) as well as the source file name,
and
if I add another (n+1)th file to the source system, it should also be added to the target.
ANS:
(I) We have the option "Add Currently Processed Flat File Name Port" in the flat file source definition; if you check that option you will achieve your desired output.
(II) Hi,
First read your input files using the Indirect source file type option in the session; then, with the "currently processed file name" port enabled in the flat file source properties, you will achieve the desired output. For the indirect option, check the Informatica help.
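For reference, with the Indirect source file type the file named in the session is just a list file containing one data file path per line, for example (the paths are illustrative):

    /data/src/a.txt
    /data/src/b.txt
    /data/src/c.txt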
ANS:
(I) Push and pull strategies determine how data comes from the source system to the ETL server.
Push: in this case the source system pushes data to (i.e., sends data to) the ETL server.
Pull: in this case the ETL server pulls data from (i.e., gets data from) the source system.
(II) A fact table without any measures is called a factless fact table.
(III) A "junk" dimension is a collection of random transactional codes, flags, and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.
Hi guys, can someone help me on the conditions under which we prefer pre- and post-SQL overrides, and what their advantages and disadvantages are?
ANS:
(I) If you want to do anything on the table before data loading or after data loading, we use pre- and post-SQL.
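A typical use, sketched below with an illustrative index name (not from the original post), is to disable an index before a bulk load and rebuild it afterwards:

    -- Pre SQL (runs before the load)
    ALTER INDEX trg_sales_idx UNUSABLE;

    -- Post SQL (runs after the load)
    ALTER INDEX trg_sales_idx REBUILD;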
What is the main difference between the source filter which we give in the Source Qualifier and the filter condition which we add by overriding the Source Qualifier query?
ANS:
(I)
Filter
Hi,
If you set the filter at the SQ level, it limits the source data (it filters the data at the source level, so it increases the performance). If you use the filter condition with the Filter transformation, it limits the target rows (it retrieves all the rows from the source, then applies the filter condition, and then loads the data into the target); it takes some time to retrieve the records from the source, so it will increase the session time.
(II) Paluri was not talking about the difference between the source filter in the Source Qualifier transformation and the filter in the Filter transformation.
His question was about the Source Qualifier source filter versus a SQL override of the source with a filter condition.
Both are similar: in the source filter you can only give a filter condition to filter out unwanted records, or for testing purposes,
whereas the SQL override of the Source Qualifier is there to customize the default SQL which is generated by the Source Qualifier.
(III) Is there any difference in the way they execute? I mean, do both filter conditions execute at the database level?
(IV) The purpose of both is the same, but they execute at different levels.
The SQL override executes at the database level, whereas a plain filter condition executes at the Informatica level: the Source Qualifier reads all the data from the database and then passes the records based on the filter condition.
(V) If you observe the session log, the SELECT query which is issued to the database will include the WHERE clause (the WHERE clause will contain the condition which we added in the source filter of the Source Qualifier transformation).
As per my understanding, the source filter condition is added to the SELECT statement first and then the statement is issued to the database.
(VI) For your clarity, try to run a mapping with a SQL override containing a WHERE clause and no filter, and then run the same mapping with just a filter condition (no SQL override); then you will understand the difference.
(VII) If we run with just a filter condition (no SQL override), then observe the log for the SQL query which is issued to the database: it will include just the filter condition.
If we override the query, then it will take the overridden query.
(VIII) If we don't give any filter condition or SQL override, the log shows the default SQL query without any customizations.
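To make the distinction concrete, a rough illustration (the table and column names are examples only): with a source filter of deptno = 10 the Source Qualifier still generates its default query and simply appends the condition, whereas a SQL override replaces the generated query entirely.

    -- Default generated query plus source filter
    SELECT emp.empno, emp.ename, emp.deptno FROM emp WHERE emp.deptno = 10

    -- SQL override written by the developer (fully customized)
    SELECT e.empno, e.ename, e.deptno
    FROM   emp e
    WHERE  e.deptno = 10
    AND    e.hiredate > SYSDATE - 30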
Hi Friends,
ANS:
(I) Your question and what you want to achieve are totally different...
Hi guys, can anybody explain why the Union is an active transformation? I tried the Union but it does not delete any duplicate records.
ANS:
First, we know a transformation is said to be active if it changes the number of rows that pass through it. The Union combines two tables of similar structure, so there is a change in the number of records; hence it is active.
If you want to eliminate duplicate records in SQL you use UNION only; otherwise you use UNION ALL to display the records with duplicates.
3) Scenario: I have 3 tables. The 1st table has Emp ID; the 2nd table has Telephone Number, Address, Location; and the 3rd table has Bank Name, Account Number. I want all the 3 tables in one target table (all columns converted to rows; note that the table is in denormalized form). ** EmpID is common to all 3 tables.
Other part of the question
Scenario: Can I solve the above problem with an unconnected lookup?
Other part
Scenario: If we are using a Joiner, what is the join condition?
4) Scenario: I want to update the target table only, without using a dynamic cache.
5) Scenario: I have used a SQL override in a lookup. There are 5 ports from Col A to Col D, but I have used the override for the first 3 columns (Col A, Col B, Col C) with order by Col B, and the mapping is validated. When I run the mapping the session throws an error that the SQL override is not valid.
6) Scenario: I have to join tables; is it better to use a SQL override, a lookup, or a Joiner? Performance-wise, which is better?
8) Scenario: In a workflow I have a reusable session, and the same session is reused in another workflow. Does a change made in either of the sessions reflect in the other sessions?
9) Can we make changes in reusable and non-reusable sessions often?
10) Scenario: In my mapping the Update Strategy is not updating.
ANSWERS:
1> Direct: when we want to load data from 1 flat file only.
Indirect: when we want to load data from 2 or more flat files with the same data structure.
2> Static cache, dynamic cache, persistent cache, recache (refresh) cache, shared cache.
4> Use a Filter after the dynamic-cache lookup and give the filter condition NewLookupRow = 0.
5> After the SQL override add "--" (without quotes), else the override will not work.
6> If the tables are from the same database use the Source Qualifier, else use a Joiner. A lookup is not a good option.
8> When you make changes to a reusable task definition in the Task Developer, the changes reflect in the instance of the task in the workflow only if you have not edited the instance.
10> Check whether you have set the option "Treat Source Rows As" to Update in the session properties or not. Set it to Update.
3> We need to join the three tables using Joiners. We need 2 Joiner transformations to join them. The join condition would be empid of one table = empid of another table.
The other option is to set "Treat Source Rows As" to Update in the session properties.
2) In the source we have 1000 rows and I have 3 targets. The first 100 rows have to go into the 1st target, the next 200 rows into the 2nd target, and the rest of the rows into the 3rd target.
3) I have some duplicate rows in the source table and I have 2 targets; the unique records have to be loaded into the 1st target and the duplicate records into the 2nd target.
4) I have Empno, Name, Loc in the source table, and we have two targets: Tgt_India and Tgt_USA. When an employee moves from India to the USA, the row for that employee must be inserted into the USA target and deleted from the Indian target, and vice versa.
SOLUTIONS
1> Skip the first row to delete the header. Not sure about the footer.
2> Use an Expression transformation after the source and use a mapping variable with the Count aggregation type. Then use a Router to filter the records into 3 groups and load them into the 3 target tables. (REFER TO THE FLATFILES FOLDER MAPPING)
4> Insert the row into the appropriate table using a Router and use a post-session success command to delete the row from the other table. (REFER TO THE FLATFILES MAPPING)
SQL STATEMENT:
1> delete from emp_india a
   where empno = (select empno from emp_usa b
                  where a.empno = b.empno);
Aim of Informatica:
Informatica is an ETL tool used to extract data from OLTP sources, apply transformations to cleanse the data before applying the business logic, and then load it into the star tables (warehouse).
ARCHITECTURE
Data Warehouse Architecture
Informatica Architecture
SCHEMA
1. Star Schema
STAR SCHEMA
Star schema architecture is the simplest data warehouse design. The main feature of a star schema is a table at the center, called the fact table, surrounded by the dimension tables, which allow browsing of specific categories, summarizing, drill-downs, and specifying criteria.
Typically, most of the fact tables in a star schema are in database third normal form, while dimension tables are de-normalized (second normal form).
Fact table
Typical fact tables in a global enterprise data warehouse are (apart from those, there may be some company- or business-specific fact tables):
Dimension table
SNOWFLAKE SCHEMA
Snowflake schema architecture is a more complex variation of the star schema design. The main difference is that the dimension tables in a snowflake schema are normalized, so they have a typical relational database design.
Snowflake schemas are generally used when a dimension table becomes very big and when a star schema cannot represent the complexity of a data structure. For example, if a PRODUCT dimension table contains millions of rows, the use of a snowflake schema should significantly improve performance by moving some data out to another table (with BRANDS, for instance).
The problem is that the more normalized the dimension tables are, the more complicated the SQL joins that must be issued to query them. This is because, in order for a query to be answered, many tables need to be joined and aggregates generated.
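As a small sketch of the difference described above (table and column names are illustrative only), the same PRODUCT dimension could be modelled either way:

    -- Star schema: one denormalized product dimension
    CREATE TABLE dim_product (
      product_key  NUMBER PRIMARY KEY,
      product_name VARCHAR2(100),
      brand_name   VARCHAR2(100),
      category     VARCHAR2(100)
    );

    -- Snowflake schema: brand attributes moved out to an outrigger table
    CREATE TABLE dim_brand (
      brand_key  NUMBER PRIMARY KEY,
      brand_name VARCHAR2(100)
    );
    CREATE TABLE dim_product_sf (
      product_key  NUMBER PRIMARY KEY,
      product_name VARCHAR2(100),
      brand_key    NUMBER REFERENCES dim_brand(brand_key)
    );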
GALAXY SCHEMA
For each star schema or snowflake schema it is possible to construct a fact constellation schema.
This schema is more complex than star or snowflake architecture, which is because it contains
multiple fact tables. This allows dimension tables to be shared amongst many fact tables.
That solution is very flexible, however it may be hard to manage and support.
The main disadvantage of the fact constellation schema is a more complicated design because
many variants of aggregation must be considered.
In a fact constellation schema, different fact tables are explicitly assigned to the dimensions,
which are for given facts relevant. This may be useful in cases when some facts are associated
with a given dimension level and other facts with a deeper dimension level.
Use of that model should be reasonable when for example, there is a sales fact table (with details
down to the exact date and invoice header id) and a fact table with sales forecast which is
calculated based on month, client id and product id.
In that case using two different fact tables on a different level of grouping is realized through a
fact constellation model.
Normalization
What is Normalization?
Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process:
First Normal Form (1NF) sets the very basic rules for an organized database.
Second Normal Form (2NF) further addresses the concept of removing duplicative data.
Third Normal Form (3NF) removes columns which are not dependent upon the primary key.
Next
Install
Go to Program Files
Informatica PowerCenter 7.1.1 > Repository Server > Repository Setup.
You see the prompt screen of the Repository Server with different options.
Set the "Server Port Number" to any value between 5002 and 65535.
Set the "administrative password" of your own choice. It is case sensitive.
Now we have successfully configured the Repository Server.
Start Services
Log in to your database.
Create two users, one for the Repository and the other for the Target database.
These are necessary to secure the data in the repositories.
For e.g., Repository (Username :: Rep_one, Password :: r) and Target (Username :: trg_wh, Password :: t).
After creating the users, test them by connecting to the database.
Previous
Go to All Programs
Informatica PowerCenter 7.1.1 > Informatica PowerCenter - Server > Informatica Server Setup
It prompts with a box... click on the Continue button.
Now you find a bigger prompt with a set of tabs with different options.
In the first tab, the "Server" tab:
Server Name :: temporary server name
TCP/IP Host Name :: computer name
Click on the "Repository" tab:
Repository Name :: name of the repository created on the console side
Repository User :: username of the Repository
Repository Password :: password of the Repository
Repository Server Host Name :: computer name
Now move to the "License" tab:
Enter the Option key, click on Update.
Enter the Connectivity key, click on Update.
Click on OK.
REPOSITORY MANAGER
Actions
DESIGNER MENU
DESIGNER OVERVIEW
Design Mappings
Represent how to move data from source to target.
Design Mapplets
Create Reusable and Non-Reusable Transformation
Access multiple Repositories and folders templates/tables at a time.
Many more features like Data Profiling , Propagate , Debugger ,Versioning etc.,
DESIGNER MENU
Important Designer Tools
Source Analyzer
Create Source Definition - Import data from Flat Files/ Relational/ Application/ XML/ COBOL
Warehouse Designer
Transformation Developer
Mapplet Designer
Mapping Designer
Create mappings - represent how the data is to be moved from the source to the target table. A mapping consists of source definitions, mapplets, transformations, and target definitions.
MAPPLETS
If you need a particular set of transformations which use the same logic in multiple mappings, a mapplet lets you reuse that group of transformations across those mappings.
Create Mapplet
1. You can create mapplets in the Mapplet Designer tool.
2. A Mapplet Input transformation is used only when you don't want to use a source definition in the Mapplet Designer.
3. A Mapplet Output transformation is always used whenever you create a mapplet.
4. Example mapplet flows:
Source > Sorter > Expression > Mapplet Output
Mapplet Input > Sorter > Expression > Mapplet Output
Advantages
Include source definitions. You can use multiple source definitions and source qualifiers
to provide source data for a mapping.
Accept data from sources in a mapping. If you want the mapplet to receive data from the
mapping, you can use an Input transformation to receive source data.
Include multiple transformations. A mapplet can contain as many transformations as you
need.
Pass data to multiple transformations. You can create a mapplet to feed data to multiple
transformations. Each Output transformation in a mapplet represents one output group in
a mapplet.
Contain unused ports. You do not have to connect all mapplet input and output ports in a
mapping.
Limitations
VERSIONING
Slowly Changing Dimension mapping types:
SCD Type 1 (Slowly Changing Dimension): Inserts new dimensions. Overwrites existing dimensions with changed dimensions. (Shows current data.)
SCD Type 2 / Version Data (Slowly Changing Dimension): Inserts new and changed dimensions. Creates a version number and increments the primary key to track changes.
SCD Type 2 / Flag Current (Slowly Changing Dimension): Inserts new and changed dimensions. Flags the current version and increments the primary key to track changes.
SCD Type 2 / Date Range (Slowly Changing Dimension): Inserts new and changed dimensions. Creates an effective date range to track changes.
SCD Type 3 (Slowly Changing Dimension): Inserts new dimensions. Updates changed values in existing dimensions. Optionally uses the load date to track changes.
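As an illustrative sketch of what an SCD Type 2 / Date Range dimension might look like physically (the table and column names below are examples only, not from the notes):

    CREATE TABLE dim_customer (
      customer_key   NUMBER PRIMARY KEY,  -- surrogate key, new value for each version
      customer_id    NUMBER,              -- natural key from the source system
      customer_name  VARCHAR2(100),
      effective_date DATE,                -- start of the version's validity
      end_date       DATE                 -- NULL (or a high date) for the current version
    );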
Data Profiling
Data profiling is a technique used to analyze source data. PowerCenter Data Profiling can help
you evaluate source data and detect patterns and exceptions. PowerCenter lets you profile source
data to suggest candidate keys, detect data patterns, evaluate join criteria, and determine
information, such as implicit datatype.
You can use Data Profiling to analyze source data in the following situations:
VERSIONING
Advantage of Versioning ?
A repository enabled for version control maintains an audit trail of version history. It stores
multiple versions of an object as you check out, modify, and check it in. As the number of
versions of an object grows, you may want to view the object version history. You may want to
do this for the following reasons:
Determine what versions are obsolete and no longer necessary to store in the repository.
Troubleshoot changes in functionality between different versions of metadata.
WORKFLOW MANAGER
Actions
Create Reusable tasks , Worklets , Workflows.
Schedule Workflows.
Configure tasks.
Workflow
A workflow is a set of instructions that describes how and when to run tasks related to extracting,
transforming, and loading data.
Worklets
A worklet is an object that represents a set of tasks.
TASKS
There are many tasks available , which are used to create workflows and worklets.
Types of Tasks
WORKFLOW MONITOR
You can monitor workflows and tasks in the Workflow Monitor. View details about a workflow
or task in Gantt Chart view or Task view.
Actions
You can run, stop, abort, and resume workflows from the Workflow Monitor.
You can view the log file and Performance Data
TRANSFORMATIONS
Transformation | Active/Passive | Description
SORTER | Active | Sorts the tables in ascending or descending order and can also be used to obtain distinct records.
You can use mapping parameters and variables to make mappings more flexible.
You can reuse a mapping by varying the parameters and variables.
Representation:
$$parametername / $$variablename
Parameters A mapping parameter represents a constant value that you can define before running
a session. A mapping parameter retains the same value throughout the entire session.
Variables
A mapping variable represents a value that can change through the session. The PowerCenter
Server saves the value of a mapping variable to the repository at the end of each successful
session run and uses that value the next time you run the session.
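A mapping parameter or variable usually receives its initial value from a parameter file; a minimal sketch of one (the folder, workflow, session, and parameter names are placeholders):

    [MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
    $$LoadDate=2010-09-16
    $$Region=SOUTH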
DEBUGGER
Actions
You can debug a valid mapping to gain troubleshooting information about data and error
conditions.
o Before you run a session. After you save a mapping, you can run some initial
tests with a debug session before you create and configure a session in the Workflow
Manager.
o After you run a session. If a session fails or if you receive unexpected results in
your target, you can run the Debugger against the session.
2. What is a junk dimension? What is the difference between a junk dimension and a degenerate dimension?
A "junk" dimension is a collection of random transactional codes, flags, and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes, whereas a degenerate dimension is data that is dimensional in nature but stored in the fact table.
Junk dimension: the columns which we use rarely or not at all, formed into a dimension, make a junk dimension.
Degenerate dimension: the columns which we take from the source table itself and use as a dimension form a degenerate dimension.
For example, if we take only the columns empno and ename from the EMP table and form a dimension, this is called a degenerate dimension.
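As an illustrative sketch of a junk dimension table (the names and values are examples only):

    CREATE TABLE dim_junk (
      junk_key       NUMBER PRIMARY KEY,  -- surrogate key referenced from the fact table
      payment_flag   CHAR(1),             -- Y / N
      gift_wrap_flag CHAR(1),             -- Y / N
      order_channel  VARCHAR2(10)         -- WEB / PHONE / STORE
    );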
A fact table consists of measurements of business requirements and foreign keys of dimensions
tables as per business rules.
There can just be SKs within a star schema, which itself is de-normalized. Now, if there were FKs on the dimensions as well, I would agree. Being in normal form, more granularity is achieved with less coding, i.e., fewer joins while retrieving the fact.
The basic difference is that E-R modeling has both a logical and a physical model, while dimensional modeling has only a physical model. E-R modeling is used for normalizing the OLTP database design; dimensional modeling is used for de-normalizing the ROLAP/MOLAP design. Adding to the point:
E-R modeling revolves around the entities and their relationships to capture the overall process of the system.
In E-R modeling the data is in normalized form, so there are more joins, which may adversely affect system performance, whereas in dimensional modeling the data is denormalized, so there are fewer joins, by which system performance will improve.
Conformed dimensions are dimensions which can be used across multiple data marts in combination with multiple fact tables accordingly.
Conformed facts are allowed to have the same name in separate tables and can be combined and compared mathematically. Conformed dimensions are those tables that have a fixed structure; there is no need to change the metadata of these tables and they can go along with any number of facts in that application.
A dimension table which is used by more than one fact table is known as a conformed dimension.
1. Ralph Kimball Model (Bottom-Up approach :: Data Marts --> Data Warehouse)
The Kimball model is always structured as a denormalized (dimensional) structure.
Data validation is done to make sure that the loaded data is accurate and meets the business requirements. Strategies are the different methods followed to meet the validation requirements.
A surrogate key is the primary key for a dimension table; it is a substitution for the natural primary key.
Data warehouses typically use a surrogate key (also known as an artificial or identity key) for the dimension tables' primary keys. They can use the Informatica Sequence Generator, an Oracle sequence, or SQL Server identity values for the surrogate key.
It is useful because the natural primary key (i.e., Customer Number in the Customer table) can change, which makes updates more difficult; it is also used in SCDs to preserve historical data.
10. What is meant by metadata in context of a Data warehouse and how it is important?
Metadata or Meta data is data about data. Examples of metadata include data element
descriptions, data type descriptions, attribute/property descriptions, range/domain descriptions,
and process/method descriptions. The repository environment encompasses all corporate
metadata resources: database catalogs, data dictionaries, and navigation services. Metadata
includes things like the name, length, valid values, and description of a data element. Metadata is
stored in a data dictionary and repository. It insulates the data warehouse from changes in the
schema of operational systems. Metadata Synchronization The process of consolidating, relating
and synchronizing data elements with the same or similar meaning from different systems.
Metadata synchronization joins these differing elements together in the data warehouse to allow for
easier access.
In the context of a data warehouse, metadata means the information about the data. This information is stored in the designer repository. Metadata is the data about data; the business analyst or data modeler usually captures information about data - the source (where and how the data originated), the nature of the data (char, varchar, nullable, existence, valid values, etc.), and the behavior of the data (how it is modified/derived and its life cycle) - in a data dictionary, a.k.a. metadata. Metadata is also present at the data mart level: subsets, facts and dimensions, ODS, etc. For a DW user, metadata provides vital information for analysis / DSS.
12. What is the main difference between schema in RDBMS and schemas in Data
Warehouse?
RDBMS Schema:
* Used for OLTP systems
* Traditional and old schema
* Normalized
* Difficult to understand and navigate
* Cannot easily solve extract and complex problems
* Poorly modelled
DWH Schema:
* Used for OLAP systems
* New generation schema
* De-normalized
* Easy to understand and navigate
* Extract and complex problems can be easily solved
* Very good model
In Dimensional Modeling, Data is stored in two kinds of tables: Fact Tables and Dimension
tables.
Fact Table contains fact data e.g. sales, revenue, profit etc.....
Dimension table contains dimensional data such as Product Id, product name, product
description etc.....
Dimensional Modeling is a design concept used by many data warehouse designers to build their
data warehouse. In this design model all the data is stored in two types of tables - Facts table and
Dimension table. Fact table contains the facts/measurements of the business and the dimension
table contains the context of measurements i.e., the dimensions on which the facts are calculated.
The data model is also detailed enough to be used by the database developers to use as a
"blueprint" for building the physical database. The information contained in the data model will
be used to define the relational tables, primary and foreign keys, stored procedures, and triggers.
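As a rough illustration (table and column names are made up), the fact and dimension tables described
above could be defined like this:
CREATE TABLE dim_product (
    product_sk   NUMBER PRIMARY KEY,   -- surrogate key
    product_id   VARCHAR2(20),         -- natural key from the source
    product_name VARCHAR2(100),
    product_desc VARCHAR2(400)
);
CREATE TABLE fact_sales (
    date_sk     NUMBER,                                       -- FK to a date dimension
    product_sk  NUMBER REFERENCES dim_product (product_sk),   -- FK to the dimension above
    store_sk    NUMBER,                                       -- FK to a store dimension
    sales_amt   NUMBER(12,2),                                 -- measure
    qty_sold    NUMBER                                        -- measure
);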
A poorly designed database will require more time in the long-term. Without careful planning
you may create a database that omits data required to create critical reports, produces results that
are incorrect or inconsistent, and is unable to accommodate changes in the user's requirements.
It also determines the amount of space required for the database. Level of granularity indicates the
extent of aggregation that will be permitted on the fact data. More granularity (a finer grain) implies
more aggregation potential, and vice versa. In simple terms, the level of granularity defines the
extent of detail. As an example, consider a geographical level of granularity: we may analyze data at
the levels of COUNTRY, REGION, TERRITORY, CITY and STREET; in this case the finest (most detailed) level
of granularity is STREET. The level of granularity is thus the level of the hierarchy down to which we
can see/drill the data in the fact table.
In a data warehouse the time dimension is loaded manually. Every data warehouse maintains a time
dimension, kept at the most granular level at which the business runs (e.g. week, day of the month and
so on). Depending on the data loads, these time dimensions are updated: a weekly process gets updated
every week and a monthly process every month.
18. Difference between Snowflake and Star Schema. What are situations where Snow flake
Schema is better than Star Schema to use and when the opposite is true?
Star schema and snowflake both serve the purpose of dimensional modeling when it comes to
data warehouses.
Star schema is a dimensional model with a fact table (large) and a set of dimension tables
(small). The whole set-up is totally denormalized.
However, in cases where the dimension tables are split into many tables, that is, where the schema is
slightly inclined towards normalization (to reduce redundancy and dependency), we get the snowflake
schema.
The nature/purpose of the data that is to be fed into the model is the key to the question of which is
better.
Star schema
contains the dimension tables mapped around one or more fact tables.
It is a denormalized model.
No need to use complicated joins.
Queries return results fast.
Snowflake schema
Sometimes we need to carve separate dimensions out of existing dimensions; in that case we go for a
snowflake schema.
Disadvantage of snowflake: query performance is lower because more joins are needed, as the sketch
below shows.
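A small sketch of the difference (hypothetical tables): the snowflake query needs one extra join to
reach the same attribute.
-- Star schema: the category sits in the denormalized product dimension.
SELECT d.product_category, SUM(f.sales_amt)
FROM   fact_sales f
JOIN   dim_product d ON d.product_sk = f.product_sk
GROUP  BY d.product_category;
-- Snowflake schema: the category is split into its own table, so an extra join is needed.
SELECT c.category_name, SUM(f.sales_amt)
FROM   fact_sales f
JOIN   dim_product  d ON d.product_sk  = f.product_sk
JOIN   dim_category c ON c.category_sk = d.category_sk
GROUP  BY c.category_name;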
Conformed dimensions are the dimensions which can be used across multiple data marts in
combination with multiple fact tables accordingly.
Conformed facts are allowed to have the same name in separate tables and can be combined and
compared mathematically. Conformed dimensions are those tables that have a fixed structure; there is
no need to change the metadata of these tables, and they can go along with any number of fact tables
in that application without any changes.
A dimension table which is used by more than one fact table is known as a conformed dimension.
They are dimension tables in a star schema data mart that adhere to a common structure, and
therefore allow queries to be executed across star schemas. For example, the Calendar dimension
is commonly needed in most data marts. By making this Calendar dimension adhere to a single
structure, regardless of which data mart it is used in within your organization, you can query by
date/time from one data mart to another.
Conformed dimensions are dimensions which are common to multiple cubes (a cube being a schema
containing fact and dimension tables).
For example, if Cube-1 contains F1, D1, D2, D3 and Cube-2 contains F2, D1, D2, D4 as facts and
dimensions, then D1 and D2 are the conformed dimensions.
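A sketch of how a conformed date dimension lets us compare two star schemas (all names hypothetical):
each fact is aggregated separately and the results are joined on the shared dimension attribute.
SELECT s.calendar_month, s.sales, r.returns
FROM  (SELECT d.calendar_month, SUM(f.sales_amt) AS sales
       FROM   fact_sales f JOIN dim_date d ON d.date_sk = f.date_sk
       GROUP  BY d.calendar_month) s
JOIN  (SELECT d.calendar_month, SUM(f.return_amt) AS returns
       FROM   fact_returns f JOIN dim_date d ON d.date_sk = f.date_sk
       GROUP  BY d.calendar_month) r
  ON  r.calendar_month = s.calendar_month;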
A dimension table in a data warehouse is a table whose entries describe data in a fact table; dimension
tables contain the data from which dimensions are created. A fact table in a data warehouse describes
the transaction data; it contains characteristics and key figures.
24. What are semi-additive and factless facts, and in which scenarios would you use such kinds
of fact tables?
Semi-additive: semi-additive facts are facts that can be summed up over some of the dimensions
in the fact table, but not over the others. For example, suppose Current Balance and Profit Margin are
the facts. Current Balance is a semi-additive fact: it makes sense to add it up across all accounts
(what is the total current balance for all accounts in the bank?), but it does not make sense to add it
up through time (adding up all current balances for a given account for each day of the month does not
give us any useful information).
Factless: a factless fact table contains no measures at all, only the foreign keys of the dimensions;
it is used to record events or coverage, for example student attendance or a product being on promotion.
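As a sketch (hypothetical snapshot fact table), the two directions of summing the semi-additive fact
look like this:
-- Summing the balance across accounts for one day is meaningful.
SELECT SUM(current_balance)
FROM   fact_account_snapshot
WHERE  snapshot_date = DATE '2024-01-31';
-- Summing it through time for one account is not; an average (or the last value) is used instead.
SELECT AVG(current_balance)
FROM   fact_account_snapshot
WHERE  account_sk = 1001
AND    snapshot_date BETWEEN DATE '2024-01-01' AND DATE '2024-01-31';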
Conventional load: before loading the data, all the table constraints are checked against the data.
Direct load (faster loading): all the constraints are disabled and the data is loaded directly; later
the data is checked against the table constraints and the bad data is not indexed. The conventional and
direct load methods are applicable only to Oracle; this naming convention is not a general one
applicable to other RDBMSs such as DB2 or SQL Server.
Aggregate tables contain redundant data that is summarized from other data in the warehouse.
These are the tables which contain aggregated/summarized data, e.g. yearly or monthly sales
information, and they are used to reduce query execution time.
An aggregate table contains the summary of existing warehouse data, grouped to certain levels of
dimensions. Retrieving the required data from the actual table, which may have millions of records,
takes more time and also affects server performance. To avoid this we can aggregate the table to the
required level and use it. These tables reduce the load on the database server, improve query
performance and return results very quickly.
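For example, a monthly summary could be built from a detailed sales fact table roughly as follows
(hypothetical names); reports then read the small summary instead of millions of detail rows:
CREATE TABLE agg_sales_monthly AS
SELECT d.calendar_year,
       d.calendar_month,
       f.product_sk,
       SUM(f.sales_amt) AS total_sales,
       SUM(f.qty_sold)  AS total_qty
FROM   fact_sales f
JOIN   dim_date   d ON d.date_sk = f.date_sk
GROUP  BY d.calendar_year, d.calendar_month, f.product_sk;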
A dimensional table is a collection of hierarchies and categories along which the user
can drill down and drill up. it contains only the textual attributes.
25. Why are OLTP database designs not generally a good idea for a Data Warehouse
OLTP cannot store historical information about the organization. It is used for storing the details
of daily transactions while a datawarehouse is a huge storage of historical information obtained
from different datamarts for making intelligent decisions about the organization.
26. What is the need of surrogate key; why primary key not used as surrogate key
A surrogate key is an artificial identifier for an entity. Surrogate key values are generated by the
system sequentially (like the Identity property in SQL Server or a sequence in Oracle). They do not
describe anything.
A primary key is a natural identifier for an entity. Primary key values are entered by the user and
uniquely identify each row; there is no repetition of data.
If a column is made a primary key and later the data type or length of that column needs to change,
then all the foreign keys that depend on that primary key must also be changed, making the database
unstable.
Surrogate keys make the database more stable because they insulate the primary and foreign key
relationships from changes in data types and lengths.
For example: you are extracting customer information from an OLTP source and, after the ETL process,
loading it into a dimension table (DW). For SCD Type 1 you can use the source CustomerID as the
primary key of the dimension table. But if you would like to preserve the customer's history in the
dimension table, i.e. Type 2, you need another unique number apart from CustomerID; there you have to
use a surrogate key.
Another reason: if CustomerID is alphanumeric, you have to use a surrogate key in the dimension table.
It is advisable to have a system-generated small integer number as the surrogate key in the dimension
table, so that indexing and retrieval are much faster, as the sketch below illustrates.
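A minimal sketch of such a Type 2 dimension (hypothetical names): the surrogate key lets the same
CustomerID appear in several history rows.
CREATE TABLE dim_customer (
    customer_sk   NUMBER PRIMARY KEY,   -- surrogate key, sequence-generated
    customer_id   VARCHAR2(20),         -- natural key from the OLTP source
    customer_name VARCHAR2(100),
    city          VARCHAR2(50),
    eff_start_dt  DATE,
    eff_end_dt    DATE,
    current_flag  CHAR(1)
);
-- Two history rows for the same customer after a change of city:
--   (1, 'C100', 'Smith', 'Dallas', 01-JAN-2020, 14-JUN-2023, 'N')
--   (2, 'C100', 'Smith', 'Austin', 15-JUN-2023, NULL,        'Y')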
Data cleansing: the act of detecting and removing and/or correcting a database's dirty data (i.e.,
data that is incorrect, out of date, redundant, incomplete, or formatted incorrectly).
It can be done using existing ETL tools or third-party tools such as Trillium.
Data Mart is a segment of a data warehouse that can provide data for reporting and analysis on a
section, unit, department or operation in the company, e.g. sales, payroll, production. Data marts
are sometimes complete individual data warehouses which are usually smaller than the corporate
data warehouse.
Data mart: a data mart is a small data warehouse. In general, a data warehouse is divided into
smaller units according to the business requirements. For example, the data warehouse of an
organization may be divided into the following individual data marts; data marts are used to improve
performance during data retrieval.
e.g.: Data Mart of Sales, Data Mart of Finance, Data Mart of Marketing, Data Mart of HR, etc.
But we can use these degenerate dimensions as a primary key of the fact table.
The data items that are not facts, and the data items that do not fit into the existing dimensions, are
termed degenerate dimensions.
Degenerate dimensions are used when fact tables represent transactional data.
32. Give examples of degenerate dimensions
Date Key (FK), Product Key (FK), Store Key (FK), Promotion Key (FK), and POS
Transaction Number
• Identifying sources
• Identifying facts
• Defining dimensions
• Defining attributes
2. Once the business requirements are clear, identify the grains (levels).
3. Once the grains are defined, design the dimension tables with the lowest-level grains.
4. Once the dimensions are designed, design the fact table with the key performance indicators
(facts).
5. Once the dimension and fact tables are designed, define the relationships between the tables using
primary and foreign keys. In the logical phase the database design looks like a star, so it is named
the star schema design.
In this architecture end users access data that is derived from several sources through the data
warehouse.
Whenever the data derived from the sources needs to be cleaned and processed before being put into the
warehouse, a staging area is used.
When the warehouse architecture is customized for different groups in the organization, data marts are
added and used.
Architecture: Source --> Staging Area --> Warehouse --> Data Marts --> End Users
Q5>How to find the no. of rows committed to the target when a session fails.
Ans :: Log file
Q6>How to remove the duplicate records from flat file (other than using sorter trans. and
mapping variables)
Ans :: (i)Dynamic Lookup (ii) sorter and aggregator
Q7>How to generate a sequence of values when the target has more than 2 billion records (with the
Sequence Generator we can generate only up to 2 billion values).
Ans :: Create a stored procedure (or function) at the database level and call it using the Stored
Procedure transformation, as sketched below.
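A rough sketch of the database side (hypothetical names): an Oracle sequence is not limited to 2
billion, and a small function wrapping it can be called from the Stored Procedure transformation.
CREATE SEQUENCE big_key_seq START WITH 1 INCREMENT BY 1 MAXVALUE 999999999999;

CREATE OR REPLACE FUNCTION next_big_key RETURN NUMBER IS
    v_key NUMBER;
BEGIN
    SELECT big_key_seq.NEXTVAL INTO v_key FROM dual;   -- works on older Oracle versions too
    RETURN v_key;
END;
/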
Q8>I have to generate a target field in Informatica which doesn't exist in the source table. It is the
batch number. There are 1000 rows altogether. The first 100 rows should have the same batch
number 100, the next 100 should have batch number 101, and so on. How can we do this using
Informatica?
Ans :: develop a mapping flow
Source > Sorter > Sequence Generator (generate numbers) > Expression (batch number, derived from the
generated number with DECODE or arithmetic) > Target
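The batch-number arithmetic itself can be sketched in SQL (hypothetical table and column names); the
same expression, applied to the Sequence Generator output in the Expression transformation, gives 100
for rows 1-100, 101 for rows 101-200, and so on:
SELECT s.*,
       100 + FLOOR((ROW_NUMBER() OVER (ORDER BY s.order_key) - 1) / 100) AS batch_number
FROM   src_table s;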
Q9>Let's say we have a flat file in the source system and it is in the correct path; when we ran the
workflow we got the error "File Not Found". What might be the reason?
Ans :: The "source file name" was not entered properly at the session level.
target Definition ::
Store_id, Item, Qty, Price
101, battery, 3, 2.99
101, battery, 1 , 3.19
101, battery, 2, 2.59
Source Definition::
101, battery, 3, 2.99
101, battery, 1 , 3.19
101, battery, 2, 2.59
101, battery, 2,17.34
Ans :: Source > Aggregator (group by store_id, item, qty) > Target
Tip :: if Sorted Input is not used, the Aggregator does not preserve the input row order; by default it
returns the last row of each group for ports without an aggregate function.
Q12> If the default query is not generated in the Source Qualifier, what is the reason? How do we
solve it?
Ans :: (i) If the source is a flat file, you cannot use this feature in the Source Qualifier.
(ii) If you are using a relational source and you forget to make the connection from the Source
Qualifier to the next transformation, you cannot generate the SQL query.
ABBREVIATIONS
BI Business Intelligence
BO Business Object
C/S Client/Server
DM Data Modeling
DW Data Warehouse
ERD Entity Relationship Diagram
GB Gigabytes
MB Megabytes
OS Operating System
QA Quality Assurance
TB Terabytes
ANS:
Manages session and batch scheduling: when you start the Informatica server, the Load Manager launches
and queries the repository for a list of sessions configured to run on the Informatica server. When you
configure a session, the Load Manager maintains the list of sessions and session start times. When you
start a session, the Load Manager fetches the session information from the repository to perform
validations and verifications prior to starting the DTM process.
Locking and reading the session: when the Informatica server starts a session, the Load Manager locks
the session in the repository. Locking prevents you from starting the same session again while it is
running.
Reading the parameter file: if the session uses a parameter file, the Load Manager reads the parameter
file and verifies that the session-level parameters are declared in the file.
Verifying permissions and privileges: when the session starts, the Load Manager checks whether or not
the user has the privileges to run the session.
Creating log files: the Load Manager creates the log file containing the status of the session.
ANS:
After the Load Manager performs validations for the session, it creates the DTM process. The DTM's role
is to create and manage the threads that carry out the session tasks. It creates the master thread, and
the master thread creates and manages all the other threads.
DTM means Data Transformation Manager. In Informatica this is the main background process; it runs
after the Load Manager completes. In this process the Informatica server looks up the source and target
connections in the repository and, if they are correct, fetches the data from the source and loads it
into the target.
ANS:
ANS:
Data movement modes determine how the Informatica server handles character data. You choose the data
movement mode in the Informatica server configuration settings.
Two data movement modes are available in Informatica:
ASCII mode
Unicode mode
What are the output files that the Informatica server creates during a session run?
ANS:
Informatica server log: the Informatica server (on UNIX) creates a log for all status and error messages
(default name: pm.server.log). It also creates an error log for error messages.
These files are created in the Informatica home directory:
Session log file: the Informatica server creates a session log file for each session. It writes
information about the session into the log file, such as the initialization process, the creation of
SQL commands for reader and writer threads, errors encountered and the load summary. The amount of
detail in the session log file depends on the tracing level that you set.
Session detail file: this file contains load statistics for each target in the mapping. Session details
include information such as the table name and the number of rows written or rejected. You can view
this file by double-clicking the session in the Monitor window.
Performance detail file: this file contains information known as session performance details, which
helps you identify where performance can be improved. To generate this file, select the performance
detail option in the session property sheet.
Reject file: this file contains the rows of data that the writer does not write to targets.
Control file: the Informatica server creates a control file and a target file when you run a session
that uses the external loader. The control file contains information about the target flat file, such
as the data format and loading instructions for the external loader.
Post-session email: post-session email allows you to automatically communicate information about a
session run to designated recipients. You can create two different messages: one if the session
completes successfully, the other if the session fails.
Indicator file: if you use a flat file as a target, you can configure the Informatica server to create
an indicator file. For each target row, the indicator file contains a number to indicate whether the
row was marked for insert, update, delete or reject.
Output file: if the session writes to a target file, the Informatica server creates the target file
based on the file properties entered in the session property sheet.
Cache files: when the Informatica server creates a memory cache, it also creates cache files.
The Informatica server creates index and data cache files for the following transformations:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation
ANS:
What is polling?
ANS:
It displays the updated information about the session in the monitor window. The monitor
window displays the status of each session when you poll the informatica server.
-----------------------------------------------------------------------------------------------------------------------------------------
ANS:
ANS: NO.
ANS:
Session parameters, like mapping parameters, represent values you might want to change between
sessions, such as database connections or source files.
The Server Manager also allows you to create user-defined session parameters. The following are
user-defined session parameters:
Database connections
Source file name: use this parameter when you want to change the name or location of the
session source file between session runs.
Target file name: use this parameter when you want to change the name or location of the
session target file between session runs.
Reject file name: use this parameter when you want to change the name or location of the
session reject file between session runs.
ANS:
A parameter file is used to define the values for parameters and variables used in a session. A
parameter file is a text file created with a text editor such as WordPad or Notepad.
You can define the following values in a parameter file:
Mapping parameters
Mapping variables
Session parameters
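A small sketch of what such a parameter file might look like (folder, workflow, session and parameter
names here are hypothetical):
[MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
$DBConnection_Source=ORA_SRC
$DBConnection_Target=ORA_DWH
$InputFile_Sales=/data/in/sales_20240131.dat
$$LOAD_DATE=2024-01-31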
ANS: The goal of performance tuning is to optimize session performance so that sessions run within the
available load window for the Informatica server. Session performance can be increased as follows.
The performance of the Informatica server is related to network connections. Data generally moves
across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times
faster. Network connections therefore often affect session performance, so avoid unnecessary network
hops.
Flat files: if your flat files are stored on a machine other than the Informatica server, move those
files to the machine the Informatica server runs on.
Relational data sources: minimize the connections to sources, targets and the Informatica server to
improve session performance. Moving the target database onto the server machine may improve session
performance.
Staging areas: if you use staging areas, you force the Informatica server to perform multiple data
passes. Removing staging areas may improve session performance.
You can run multiple Informatica servers against the same repository. Distributing the session load
across multiple Informatica servers may improve session performance.
Running the Informatica server in ASCII data movement mode improves session performance, because ASCII
data movement mode stores a character value in one byte, whereas Unicode mode takes two bytes to store
a character.
If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve
performance. Also, single-table SELECT statements with an ORDER BY or GROUP BY clause may benefit from
optimization such as adding indexes.
We can improve session performance by configuring the network packet size, which controls how much data
crosses the network at one time. To do this, go to the Server Manager and choose the server to
configure database connections.
If your target has key constraints and indexes, they slow the loading of data. To improve session
performance in this case, drop the constraints and indexes before you run the session and rebuild them
after the session completes, for example as sketched below.
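For example (hypothetical index name), the pre- and post-session SQL could look roughly like this:
-- Pre-session SQL: make the index unusable before the bulk load.
ALTER INDEX idx_fact_sales_prod UNUSABLE;
-- Post-session SQL: rebuild it after the load completes.
ALTER INDEX idx_fact_sales_prod REBUILD;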
Running parallel sessions by using concurrent batches also reduces data loading time, so concurrent
batches may increase session performance.
Partitioning the session improves session performance by creating multiple connections to sources and
targets and loading data in parallel pipelines.
In some cases, if a session contains an Aggregator transformation, you can use incremental aggregation
to improve session performance.
Avoid transformation errors to improve session performance.
If the session contains a Lookup transformation, you can improve session performance by enabling the
lookup cache.
If your session contains a Filter transformation, create that Filter transformation nearer to the
sources, or use a filter condition in the Source Qualifier (see the sketch below).
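For instance, instead of filtering downstream, the condition can be pushed into the Source Qualifier as
a source filter or SQL override (hypothetical table and column), so unwanted rows never leave the
database:
SELECT order_id, customer_id, order_amt
FROM   orders
WHERE  order_status = 'ACTIVE'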
Aggregator, Rank and Joiner transformations often decrease session performance because they must group
the data before processing it. To improve session performance in this case, use the sorted ports
option. Increasing the temporary database space also improves performance.
1) In a single flat file, if I have multiple delimiters, how can I load the flat file?
2) If a flat file is a "," delimited flat file and, while loading data, one of my fields or attributes
contains a "," in its value, how can I handle this case?
3) While loading multiple flat files using the indirect loading method, how can I generate the list
file if I have n number of flat files of similar structure?
4) While loading multiple flat files using indirect loading, I want to load the data into one target
and the file names into another target. How can this be done?
ANS:
1) According to my knowledge, check the answers below.
1) The flat file definition's Delimiters property can actually accept more than one delimiter character
(see the description below); if that does not fit the file layout, request the source data team to send
the file with a single delimiter.
2) A field value containing "," will be split incorrectly unless an optional quote character is used
around the field (see Optional Quotes below); otherwise, request the source system to change the
delimiter to some other character.
3) The list of files is appended to a file list using a UNIX script. Once all the files are FTPed to
the source directory, the UNIX script generates the file list with all the files that exist in the
source file directory.
4) In the mapping we can get the file name along with the field values (the CurrentlyProcessedFileName
port). Connect that file name field to a Sorter or Aggregator to get the distinct file names and load
them into the file-name target table or file; connect the remaining fields, other than the file name,
to the other target.
One or more characters used to separate columns of data. Delimiters can be either
printable or single-byte unprintable characters, and must be different from the escape
character and the quote character (if selected). To enter a single-byte unprintable
character, click the Browse button to the right of this field. In the Delimiters dialog box,
select an unprintable character from the Insert Delimiter list and click Add. You cannot
select unprintable multibyte characters as delimiters.
Maximum number of delimiters is 80.
Optional Quotes
Select No Quotes, Single Quote, or Double Quotes. If you select a quote character, the
Integration Service ignores delimiter characters within the quote characters. Therefore,
the Integration Service uses quote characters to escape the delimiter.
For example, a source file uses a comma as a delimiter and contains the following row:
342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6.
If you select the optional single quote character, the Integration Service ignores the
commas within the quotes and reads the row as four fields.
The Designer adds the CurrentlyProcessedFileName port as the last column on the
Columns tab. The CurrentlyProcessedFileName port is a string port with default
precision of 256 characters.