A. There are two ways to look up the target table to verify whether a row
exists:
1. Use a connected dynamic-cache Lookup and check the value of the
NewLookupRow output port to decide whether the incoming record
already exists in the table/cache.
2. Use an unconnected Lookup, call it from an Expression
transformation, and check whether the lookup return value is null or
not null to decide whether the incoming record already exists in the
table.
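The dynamic-cache behavior in option 1 can be sketched in plain Python. This is an illustrative analogue, not Informatica code: a dictionary stands in for the lookup cache, and the returned flag mimics the NewLookupRow output port (1 = row inserted into the cache, 0 = row already existed).

```python
# Simplified sketch of a dynamic lookup cache: the NewLookupRow-style
# flag tells downstream logic whether the incoming key was already cached.
def dynamic_lookup(cache, key, value):
    """Return 1 if the row is new (inserted), 0 if it already exists."""
    if key in cache:
        return 0          # row exists: route to the update/reject path
    cache[key] = value    # row is new: insert into cache (and target)
    return 1

cache = {}
print(dynamic_lookup(cache, 101, "Alice"))  # 1 -> new row
print(dynamic_lookup(cache, 101, "Alice"))  # 0 -> already exists
```

Downstream logic then routes on the flag exactly as the answer describes: flag 1 goes to the insert path, flag 0 to the update or reject path.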
Subset - A code page is a subset of another code page when all characters
in the code page are also encoded in the other code page.
What is a Code Page used for?
A code page identifies the character encoding of data, which may be in
different languages. For example, if you are importing Japanese data into
a mapping, you must select the Japanese code page for the source data.
When the PowerCenter Server runs a session, the DTM performs the
following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled.
Checks query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mappings, reader, writer, and transformation threads
to extract, transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.
• The DTM allocates process memory for the session and divides it
into buffers. This is also known as buffer memory. It creates the
main thread, which is called the master thread. The master thread
creates and manages all other threads.
• If we partition a session, the DTM creates a set of threads for
each partition to allow concurrent processing. When the Integration
Service writes messages to the session log, it includes the thread type
and thread ID.
Pre- and Post-Session Thread - One thread each to perform pre- and
post-session operations.
Reader Thread - One thread per partition for each source pipeline.
Writer Thread - One thread per partition, if a target exists in the
source pipeline, to write to the target.
Transformation Thread - One or more threads per partition to
perform the calculations.
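The per-partition thread model above can be illustrated with a small Python sketch (an analogy only, not how the DTM is implemented): each partition gets its own reader and writer thread connected by a queue, so partitions are processed concurrently.

```python
# Sketch of per-partition reader -> writer threads connected by queues,
# analogous to the thread sets the DTM creates for a partitioned session.
import threading, queue

def reader(rows, out_q):
    for row in rows:
        out_q.put(row)
    out_q.put(None)                      # end-of-data marker

def writer(in_q, target):
    while (row := in_q.get()) is not None:
        target.append(row * 10)          # stand-in "transformation"

partitions = [[1, 2], [3, 4]]            # two partitions of source data
targets = [[] for _ in partitions]
threads = []
for rows, target in zip(partitions, targets):
    q = queue.Queue()
    threads += [threading.Thread(target=reader, args=(rows, q)),
                threading.Thread(target=writer, args=(q, target))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(targets[0] + targets[1]))  # [10, 20, 30, 40]
```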
Q. Where should you place the flat file to import the flat file
definition to the designer?
A. Place it in a local folder accessible to the PowerCenter Client
machine.
Q. What is a mapplet?
A. A mapplet is a reusable set of transformations. It should have a
mapplet Input transformation, which receives input values, and an
Output transformation, which passes the final modified data back to
the mapping. When the mapplet is used within a mapping, only its
input and output ports are displayed, so the internal logic is hidden
from the end user.
Q. What is a transformation?
A. It is a repository object that generates, modifies or passes data.
2. Do not forget to check the Sorted Input option on the Aggregator,
which tells the Aggregator that the input is sorted on the same keys
as the group by. The key order is also very important.
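Why the key order matters can be seen with Python's `itertools.groupby`, which behaves like a sorted-input Aggregator: it only merges *adjacent* rows with equal keys, so unsorted input silently produces wrong groups.

```python
from itertools import groupby
from operator import itemgetter

# Like a sorted-input Aggregator, groupby only merges adjacent equal keys,
# so the input must already be sorted on the same keys, in the same order.
rows = [("A", 10), ("A", 5), ("B", 7), ("B", 3)]
totals = {key: sum(amount for _, amount in grp)
          for key, grp in groupby(rows, key=itemgetter(0))}
print(totals)  # {'A': 15, 'B': 10}

# Unsorted input splits "A" into two groups -- the pitfall the note warns about:
bad = [("A", 10), ("B", 7), ("A", 5)]
print([k for k, _ in groupby(bad, key=itemgetter(0))])  # ['A', 'B', 'A']
```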
Q: What tools have you used in Power Center and/or Power Mart?
A: Designer, Server Manager, and Repository Manager.
Q: What is a Mapping?
A: A mapping represents the data flow between sources and targets.
Q: What is Transformation?
A: A transformation is a repository object that generates, modifies, or
passes data. Each transformation performs a specific function. There
are two types of transformations:
1. Active
Can change the number of rows that pass through it. E.g.:
Aggregator, Filter, Joiner, Normalizer, Rank, Router, Source
Qualifier, Update Strategy, ERP Source Qualifier, Advanced
External Procedure.
2. Passive
Does not change the number of rows that pass through it. E.g.:
Expression, External Procedure, Input, Lookup, Stored Procedure,
Output, Sequence Generator, XML Source Qualifier.
Q: If you have 2 files to join, which file will you use as the
master file?
A: Use the file with the smaller number of records as the master file.
Q: What did you do in the stored procedure? Why did you use
stored proc instead of using expression?
A:
Q: How did you handle reject data? What file does Informatica
create for bad data?
A: Informatica saves the rejected data in a .bad file. Informatica adds a
row indicator to each rejected record showing whether the row was
rejected by the writer or the target. Additionally, for every column
there is a column indicator specifying whether the data was rejected
due to overflow, null, truncation, etc.
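As a rough illustration of those indicators, here is a hypothetical Python parser for one line of a reject file. It assumes the common layout (row indicator first, then an indicator/value pair per column, comma-delimited); the exact format of a given .bad file may differ.

```python
# Hypothetical parser for one .bad (reject) file line, assuming the layout:
# row-indicator, col-indicator, value, col-indicator, value, ...
ROW = {"0": "Insert", "1": "Update", "2": "Delete", "3": "Reject"}
COL = {"D": "valid", "O": "overflow", "N": "null", "T": "truncated"}

def parse_bad_line(line):
    parts = line.rstrip("\n").split(",")
    row_op = ROW.get(parts[0], "unknown")
    cols = [(COL.get(ind, "unknown"), val)
            for ind, val in zip(parts[1::2], parts[2::2])]
    return row_op, cols

op, cols = parse_bad_line("3,D,1921,N,")
print(op)    # Reject
print(cols)  # [('valid', '1921'), ('null', '')]
```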
Incremental aggregation?
In the session properties there is an option for performing
incremental aggregation. When the Integration Service performs
incremental aggregation, it passes new source data through the
mapping and uses historical cache (index and data cache) data to
perform new aggregation calculations incrementally.
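The idea can be sketched in a few lines of Python (a conceptual analogue, not the actual cache mechanism): only the new rows are processed, and a persisted dictionary of prior aggregates is updated rather than recomputed from scratch.

```python
# Sketch of incremental aggregation: only new rows pass through, and a
# persisted cache of prior aggregates is updated instead of recomputed.
def incremental_aggregate(cache, new_rows):
    """cache maps group key -> running total; only new_rows are processed."""
    for key, amount in new_rows:
        cache[key] = cache.get(key, 0) + amount
    return cache

cache = {"A": 100}                      # historical aggregate from prior runs
incremental_aggregate(cache, [("A", 10), ("B", 5)])
print(cache)  # {'A': 110, 'B': 5}
```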
What are the three areas where the rows can be flagged for
particular treatment?
In the mapping, in the session Treat Source Rows As property, and in
the session target options.
2. Sources
Set a Filter transformation after each Source Qualifier with a FALSE
condition so that no records pass through. If the time taken is the
same, there is a source bottleneck.
You can also identify a source bottleneck by:
Read Test Session - Copy the mapping with the sources and Source
Qualifiers, remove all transformations, and connect to a file target.
If the performance is the same, there is a source bottleneck.
Using a database query - Copy the read query directly from the
session log and execute it against the source database with a query
tool. If the time it takes to execute the query and the time to fetch
the first row are significantly different, the query can be modified
using optimizer hints.
Solutions:
Optimize Queries using hints.
Use indexes wherever possible.
3. Mapping
If both source and target are OK, then the problem could be in the
mapping. Add a Filter transformation with a FALSE condition just
before the target; if the time is the same, there is a mapping
bottleneck.
(Or) Look at the performance monitor in the session properties
sheet and view the counters.
Solutions:
High error rows and rows in the lookup cache indicate a mapping
bottleneck.
Optimize Single-Pass Reading.
Optimize the Lookup transformation:
1. Caching the lookup table:
When caching is enabled, the Integration Service caches the lookup
table and queries the cache during the session. When this option is
not enabled, the server queries the lookup table on a row-by-row
basis.
Cache types: Static, Dynamic, Shared, Un-shared, and Persistent.
2. Optimizing the lookup condition:
When multiple conditions are used, the condition with the equality
sign should take precedence.
3. Indexing the lookup table:
The cached lookup table should be indexed on the ORDER BY
columns. The session log contains the ORDER BY statement.
For an un-cached lookup, since the server issues a SELECT
statement for each row passing into the Lookup transformation, it is
better to index the lookup table on the columns in the condition.
4. Sessions
If you do not have a source, target, or mapping bottleneck, you may
have a session bottleneck. You can identify a session bottleneck by
using the performance details. The Integration Service creates
performance details when you enable Collect Performance Data on
the General tab of the session properties.
Performance details display information about each Source Qualifier,
target definition, and individual transformation. All transformations
have some basic counters that indicate the number of input rows,
output rows, and error rows.
Any value other than zero in the readfromdisk and writetodisk
counters for Aggregator, Joiner, or Rank transformations indicates a
session bottleneck. Low BufferInput_efficiency and
BufferOutput_efficiency counters also indicate a session bottleneck.
Small cache size, low buffer memory, and small commit intervals can
cause session bottlenecks.
5. System (Networks)
1 bit = a 1 or 0 (b)
4 bits = 1 nibble
8 bits = 1 byte (B)
1024 bytes = 1 Kilobyte (KB)
1024 Kilobytes = 1 Megabyte (MB)
1024 Megabytes = 1 Gigabyte (GB)
1024 Gigabytes = 1 Terabyte (TB)
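The table above uses binary (1024-based) units, i.e. each step is a factor of 2**10. A quick Python check:

```python
# Each binary unit above is a successive power of 1024 (= 2**10) bytes.
UNITS = ["B", "KB", "MB", "GB", "TB"]
for power, unit in enumerate(UNITS):
    print(f"1 {unit} = {1024 ** power} bytes")
# e.g. 1 GB = 1024 ** 3 = 1073741824 bytes
```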
We need answers:
1) Why do we use a Source Qualifier in a mapping? What is the basic
need?
2) When we import files into a mapping, how do we ignore the header
and footer of the file?
3) When we use a lookup condition in a Lookup transformation it
returns one record, but the SQL transformation returns all records
matching the condition. Is there some internal process that differs
between the Lookup and SQL transformations?
Q) my source is
id
1
1
1
1
2
2
2
2
3
3
3
3
then my targets are like
target1:
id
1
2
3
target2:
id
1
1
1
2
2
2
3
3
3
Ans)
Din & Ramesh are correct. You may have been confused by Jai's
answer using an Aggregator, but an Aggregator is not required here.
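The routing logic itself (the original "coding" was not included in the source) can be sketched in Python as a hypothetical illustration: the first occurrence of each id goes to target1, and every repeat goes to target2.

```python
# Route each id: first occurrence -> target1, repeats -> target2.
def split_rows(ids):
    seen, target1, target2 = set(), [], []
    for i in ids:
        if i in seen:
            target2.append(i)      # duplicate: send to target2
        else:
            seen.add(i)
            target1.append(i)      # first occurrence: send to target1
    return target1, target2

src = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
t1, t2 = split_rows(src)
print(t1)  # [1, 2, 3]
print(t2)  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

In a mapping the same effect is typically achieved with an Expression transformation tracking the previous id and a Router splitting on the first-occurrence flag.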
Q) Mapping variable :
Ans.
In the Designer, you can create mapping variables in a mapping or
mapplet. After you create a mapping variable, it appears in the
Expression Editor. You can then use it in any expression in the
mapping or mapplet. You can also use mapping variables in a source
qualifier filter, user-defined join, or extract override, and in the
Expression Editor of reusable transformations.
TIMESTAMP = $$IncludeDateTime
Q) What is new in Power Center 8?
Ans.
1)The architecture of Power Center 8 has changed a lot; PC8 is service-oriented for
modularity, scalability and flexibility.
2) The Repository Service and Integration Service (as replacement for Rep Server and
Informatica Server) can be run on different computers in a network (so called nodes),
even redundantly.
3) Management is centralized, that means services can be started and stopped on nodes
via a central web interface.
4) Client Tools access the repository via that centralized machine, resources are
distributed dynamically.
5) Running all services on one machine is still possible, of course.
6) It has a support for unstructured data which includes spreadsheets, email, Microsoft
Word files, presentations and .PDF documents. It provides high availability, seamless fail
over, eliminating single points of failure.
7) It has added performance improvements. (To bump up system performance,
Informatica has added "pushdown optimization", which moves data transformation
processing to the native relational database I/O engine whenever it is most appropriate.)
8) Informatica has now added more tightly integrated data profiling, cleansing, and
matching capabilities.
9) Informatica has added a new web based administrative console.
10) Ability to write a Custom Transformation in C++ or Java.
11) Midstream SQL transformation has been added in 8.1.1, not in 8.1.
12) Dynamic configuration of caches and partitioning
13) Java transformation is introduced
14) User-defined functions
15) Power Center 8 release has "Append to Target file"
Q) Workflow
I have two workflows, wkf1 and wkf2. After the execution of wkf1 we have to execute
wkf2. If wkf1 has not executed successfully, wkf2 should not execute, and this must
happen automatically.
Ans.
With links alone we can't achieve this requirement: wkf2 could be started at any time,
but the requirement is that wkf2 should not run unless wkf1 has succeeded.
Solution: In this case, create a flat file at the end of wkf1 (for example, with a command
task running touch xyz.txt), and in wkf2 use an Event-Wait task with a file-watch event
on that flat file. With this solution, if you run wkf2 it will wait for the flat file to be
created; until the file exists, wkf2 will not proceed.
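The touch-file handshake described above can be sketched in Python (an illustration of the mechanism, not Informatica code; the marker filename is arbitrary): wkf1 writes a marker file on success, and wkf2 polls for it the way an Event-Wait file-watch task would.

```python
# Sketch of the touch-file handshake: wkf1 writes a marker file on success,
# wkf2 polls (like an Event-Wait task) and proceeds only once the file exists.
import os
import tempfile
import time

marker = os.path.join(tempfile.gettempdir(), "wkf1_done.txt")  # arbitrary name

def wkf1_finish():
    open(marker, "w").close()          # equivalent of `touch xyz.txt`

def wkf2_wait(timeout=5.0, poll=0.1):
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(marker):
            return True                # file seen: wkf2 may proceed
        time.sleep(poll)
    return False                       # wkf1 never succeeded: skip wkf2

wkf1_finish()
print(wkf2_wait())  # True
```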