Informatica
Topics of Discussion
Identify Bottleneck
Identifying Bottlenecks
[Flowchart] Writing to a slow target? Reading from a slow source? Check for bottlenecks in this order: Target, Source, Mapping, Session, System. After addressing each one, ask "Performance OK?" If yes, you are done; if no, move on to the next bottleneck.
Target Bottleneck
Common causes:
- Indexes or key constraints
- Frequent database checkpoints
- Small database network packet size
- Too many target instances in the mapping
- Target table is too wide

Common solutions:
- Drop indexes and key constraints before loading, then rebuild them afterward
- Use bulk loading wherever practical
- Increase the database network packet size
- Decrease the frequency of database checkpoints
- When using partitions, consider partitioning the target table
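The drop-then-rebuild pattern can be sketched outside Informatica as well. This is a minimal illustration using Python's built-in sqlite3 as a stand-in for the real target database and bulk loader; the table and index names are hypothetical.

```python
# Sketch of "drop indexes before loading, then rebuild": the load does not
# pay the per-row cost of index maintenance, and the index is built once.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE target (id INTEGER, name TEXT)")
cur.execute("CREATE INDEX idx_target_id ON target (id)")

rows = [(i, f"name_{i}") for i in range(10_000)]

# 1. Drop the index so the load does not maintain it row by row.
cur.execute("DROP INDEX idx_target_id")

# 2. Bulk-load in one transaction (analogous to bulk mode / a large
#    commit interval).
cur.executemany("INSERT INTO target VALUES (?, ?)", rows)

# 3. Rebuild the index once, after the data is in place.
cur.execute("CREATE INDEX idx_target_id ON target (id)")
conn.commit()

print(cur.execute("SELECT COUNT(*) FROM target").fetchone()[0])  # 10000
```

On a real target the rebuild step would typically also re-enable any key constraints that were disabled before the load.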
Source Bottleneck
Common causes:
- Slow query
- Index issues
- Highly complex query
- Small database network packet size
- Wide source tables

Common Solutions:
- Analyze the query with the help of show plan or other tools
- Consider using database optimizer hints when joining several tables
- Consider indexing tables when you have ORDER BY or GROUP BY clauses
- Test a Source Qualifier conditional filter versus filtering at the database level
- Increase the database network packet size
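The "analyze the plan, then index for GROUP BY" advice can be demonstrated with any database that exposes its plan. This sketch uses sqlite3's EXPLAIN QUERY PLAN as a stand-in for the real source database's show plan facility; the table and index names are hypothetical.

```python
# Sketch of checking the query plan before and after adding an index that
# supports a GROUP BY: the plan changes once the index can serve the grouping.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE src (id INTEGER, region TEXT, amount REAL)")

query = "SELECT region, SUM(amount) FROM src GROUP BY region"

# Without an index, grouping generally needs a temporary sort structure.
plan_before = " ".join(r[-1] for r in cur.execute("EXPLAIN QUERY PLAN " + query))

# A covering index on (region, amount) lets the grouping walk the index.
cur.execute("CREATE INDEX idx_src_region ON src (region, amount)")
plan_after = " ".join(r[-1] for r in cur.execute("EXPLAIN QUERY PLAN " + query))

print(plan_before)  # e.g. a scan plus a temp B-tree for the GROUP BY
print(plan_after)   # e.g. a scan using the covering index
```

The exact plan text varies by database and version; the point is to compare plans, not to memorize them.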
Mapping Bottleneck
Common causes:
- Too many transformations
- Unused links between ports
- Too many input/output or output ports in an Aggregator or Rank transformation
- Unnecessary datatype conversions

Common Solutions:
- Eliminate transformation errors
- If several mappings read from the same source, try single-pass reading
- Optimize datatypes: use integers for comparison, and don't convert back and forth between datatypes
- Optimize lookups and lookup tables, using caches and indexed tables
- Put filters early in the dataflow, and use simple filter conditions
Mapping Bottleneck
Common Solutions
- For Aggregators, use sorted input and integer group-by columns, and simplify expressions
- Use reusable Sequence Generators; increase the number of cached values
- If you use the same logic in different data streams, apply it before the streams branch off
- Optimize expressions: isolate slow and complex expressions, and reduce or simplify aggregate functions
- Use local variables to encapsulate repeated computations
- Integer computations are faster than character computations
- Use operators rather than the equivalent functions: || is faster than CONCAT()
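The "local variables for repeated computations" point is the same idea as local variable ports in an Expression transformation. A minimal Python sketch of row processing (the field names are invented for illustration):

```python
# Sketch: the same derived value is needed by several output ports.
# Computing it once in a "local variable" avoids repeating the work.

def transform_slow(row):
    # The expensive expression is written out (and evaluated) three times.
    return {
        "net":   row["qty"] * row["price"] * (1 - row["discount"]),
        "tax":   row["qty"] * row["price"] * (1 - row["discount"]) * 0.2,
        "gross": row["qty"] * row["price"] * (1 - row["discount"]) * 1.2,
    }

def transform_fast(row):
    # Local variable computed once, reused by every output port.
    net = row["qty"] * row["price"] * (1 - row["discount"])
    return {"net": net, "tax": net * 0.2, "gross": net * 1.2}

row = {"qty": 10, "price": 2.5, "discount": 0.1}
assert transform_slow(row) == transform_fast(row)
print(transform_fast(row)["net"])
```

In Informatica the equivalent is a variable port computed once per row and referenced by the output ports.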
Session Bottleneck
Common causes:
- Inappropriate memory allocation settings
- Running in series rather than in parallel
- Error tracing override set to a high level
Common Solutions
- Experiment with the DTM buffer pool and buffer block size
- If your mapping allows it, use partitioning
- Run sessions in parallel with concurrent batches whenever possible
- Increase the database commit interval
- Leave recovery and decimal arithmetic turned off (they're off by default)
- Use the debugger rather than high error tracing, and always reduce your tracing level during production runs
- Don't stage your data if you can avoid it
System Bottleneck
Common causes:
- Slow network connections
- Overloaded or under-powered servers
- Slow disk performance

Common Solutions:
- Get the best machines available to run the servers
- Use multiple CPUs and session partitioning
- Make sure the Informatica servers and database servers are located close together on your network
- If possible, consider running the Informatica server and the database server on the same machine
Identifying Bottlenecks
Things to check:
- Read/write throughput
- Rows failed
- Number and type of objects in the mapping
- Running in parallel/partitioned?
- What is the size of the source/target/database hardware?
- What kind of pipeline is set up for sourcing and targeting?

Database issues:
- GROUP BY / ORDER BY clauses
- DISTINCT clauses
- WHERE clause (filters) using non-indexed fields
- Invalid plans
Aggregator Problems
- Too many multi-level aggregates
- Not tuning the data and index caches

Joiner Problems
- Incorrect selection of the master table
- Too many (or too wide) join columns
- Not tuning the data and index caches

Rank Problems
- Not tuning the data and index caches

Target Problems
- Too many fields; width (precision) issues
- Implicit data conversions
- Update Strategies
- Too many targets per mapping
- No use of the bulk loader

Session Problems
- Not enough RAM given to the session
- Commit point too high or too low
- Too many sessions running in parallel
Common mapping problems:
1. Too many targets in a single mapping
2. Data width is too large (too many columns passing through the mapping)
3. Too many Aggregators, Lookups, Joiners and Ranks in the mapping
4. Not tuning data/index settings for the above objects
5. Too many objects in a single mapping
6. Unused ports in cached Lookups
7. Source query/joins not tuned
8. Lookup query/cache not tuned
9. Ports passed through the mapping but not passed to the target
10. Huge expressions
Common session problems:
1. Not controlling the log file override
2. Not tuning the data and index caches for Lookups, Aggregators, Ranks and Joiners
3. Not tuning the commit point to match the database performance setup
4. Assuming that giving the mapping more memory will make it run faster
5. Assuming that increasing the commit point will make it run faster
6. Not utilizing the partitioning available
7. Running too many sessions in parallel on an undersized, overutilized machine
8. Not architecting for failed rows
9. Not setting the line buffer length for flat files
10. Not testing the session for performance when targeting a flat file
Contention points:
- Multiple targets: a single thread is used for the write processes
- Multiple Aggregators: a single thread is used for moving the data
- Stacked Aggregators fight for memory, disk and the cache directory within a single session
- Filter condition is too lengthy or not optimized
- An Expression performs only a single calculation, forcing the whole row to be processed when only one field should flow through
- Disk contention is high: with 4 targets and a single writer thread, I/O becomes a hotspot
Expression/Filter Contention
[Diagram] Before: the Filter alone evaluates the full condition IIF(EmpId = 'A' AND ... OR ...). After: an Expression computes B_Rowpass = IIF(EmpId = 'A' AND ... OR ...), and the Filter tests only B_Rowpass.
- Expressions are built for evaluation speed; Filters take a different, slower code path
- Passing a numeric integer flag to the Filter keeps it fast (observed throughput gains ranging from 0.5x to 3x)
- Filter expressions should be as simple as possible to maximize throughput
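The B_Rowpass pattern can be sketched as plain Python row processing. The predicate below is a hypothetical stand-in for the long IIF condition; the point is that the filter stage itself only tests an integer flag.

```python
# Sketch: an expression stage evaluates the complex condition once per row
# and emits an integer flag; the filter stage tests only the flag.

rows = [
    {"EmpId": "A", "dept": 10},
    {"EmpId": "B", "dept": 20},
    {"EmpId": "A", "dept": 30},
]

def complex_predicate(row):
    # Stand-in for the long IIF(EmpId = 'A' AND ... OR ...) condition.
    return row["EmpId"] == "A" and row["dept"] >= 10

# Expression stage: evaluate once, emit B_Rowpass as 1 or 0.
for row in rows:
    row["B_Rowpass"] = 1 if complex_predicate(row) else 0

# Filter stage: a trivial integer test, the fast code path.
passed = [row for row in rows if row["B_Rowpass"]]

print(len(passed))  # 2
```

In the mapping, the filter condition becomes simply B_Rowpass, with all the logic living in the Expression transformation.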
Aggregator Contention
[Diagram] The same set of Aggregators arranged for serial execution (stacked in one session, contending for memory, disk and cache) versus parallel execution (split across concurrent sessions).
Update Strategies
[Diagram] Three Update Strategy transformations feed Target1, Target2 and Target3 within a single mapping.
Steps to tuning:
- Make a copy of the mapping for each target
Key-range source partitioning:
- Provides the best read ranges if you know what data you are after
- Allows for process parallelism
- Breaks source data into manageable parts (e.g., key ranges M-S and T-Z)
- Caution: requires additional maintenance
- Faster indexes
- Allows parallel queries to be run
- Potentially decreases I/O on the source side
- Assists in the processing component of the data movement
Joiner Object
- The master's rows are read first into the data and index caches; RAM impact depends on the number of rows read
- The Joiner's fields are read into the data and index (D&I) caches, and there is no control over the D&I memory sizes
- Contrary to popular belief, the Joiner is not slow (compared to a cached Lookup)
- The Joiner is a powerful object for performance tuning, especially for getting rid of cached Lookups
Joiner Contention
- The data and index caches must be sized properly
- Master-detail join: the master should be the smaller table
- Detail outer join: the master should be the smaller table
- Full outer join: both tables should be relatively small
- Good for heterogeneous sources
- Rarely necessary when a staging-table architecture is employed
- Consider the size of shared memory for large-scale join operations
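The "master should be the smaller table" rule is the classic hash-join build-side choice. A minimal sketch (the tables and fields are invented for illustration):

```python
# Sketch of a master-detail join: the cache is built from the small master,
# and the large detail is streamed row by row, so memory grows with the
# master only - which is why the master should be the smaller table.

master = [  # small table: cached in full (data + index cache)
    {"dept_id": 10, "dept": "Sales"},
    {"dept_id": 20, "dept": "Ops"},
]
detail = [  # large table: streamed, never cached
    {"emp": "a", "dept_id": 10},
    {"emp": "b", "dept_id": 20},
    {"emp": "c", "dept_id": 10},
    {"emp": "d", "dept_id": 99},  # no matching master row
]

build = {m["dept_id"]: m for m in master}  # cache sized by the master

joined = [
    {**d, "dept": build[d["dept_id"]]["dept"]}
    for d in detail
    if d["dept_id"] in build  # normal (inner) join: drop non-matches
]

print(len(joined))  # 3
```

Swapping build and probe sides would make the cache as large as the detail table, which is exactly the mistake the rule guards against.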
Lookup Object
- Initialization caches ALL data identified in the ports of the Lookup, including the data to be matched on; RAM impact depends on the number of rows read
- The Lookup's two flaws: it fights for resources during execution, and its initialization speed depends on the speed of the SQL beneath it
- WIDE lookups can soak up a lot of time and space, especially with the ORDER BY clause that's generated and appended to the SQL
Lookup Contention
- Lookups should be cached when the table is narrow with a huge number of rows (primary key only), or of any width with a small number of rows (fewer than 100,000). Exception: if the hardware has enough RAM for all concurrent sessions plus the cache, cache the lookup anyway. A cached lookup uses the data and index caches.
- Uncached lookups should be used when an extremely large number of rows is sourced, a wide table is sourced, or RAM is scarce.
- An uncached lookup opens its own database connection and should always retrieve by primary key.
- Cached or uncached, the connection to the database should use the maximum packet size.
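The cached/uncached trade-off can be sketched with sqlite3 standing in for the lookup table's database; the table and keys are hypothetical. The cached variant pays one full scan up front, the uncached variant pays one primary-key query per source row.

```python
# Sketch: cached lookup = read the whole lookup table once into a dict
# (the data cache); uncached lookup = one indexed query per source row.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lkp (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO lkp VALUES (?, ?)", [(1, "one"), (2, "two")])

source_rows = [1, 2, 1, 3]

# Cached lookup: one scan up front, then in-memory hits per row.
cache = dict(conn.execute("SELECT id, name FROM lkp"))
cached_out = [cache.get(k) for k in source_rows]

# Uncached lookup: one primary-key query per source row.
def uncached(key):
    row = conn.execute("SELECT name FROM lkp WHERE id = ?", (key,)).fetchone()
    return row[0] if row else None

uncached_out = [uncached(k) for k in source_rows]

print(cached_out)  # ['one', 'two', 'one', None]
assert cached_out == uncached_out
```

With few source rows or a very large/wide lookup table, the per-row queries win; with many source rows against a small table, the cache wins.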
Aggregator Object
- Initialization caches ALL data identified in the ports of the Aggregator; RAM impact depends on the number of rows read
- The Aggregator absorbs all rows before pushing them to the target
- WIDE aggregates can soak up a lot of space, and they can also increase I/O if the data isn't sorted on the way in
Rank Object
- A lot like the Aggregator: it reads all rows into the data and index caches; RAM impact depends on the number of rows read
- WIDE ranks can soak up space, and they can also increase I/O if the data isn't sorted on the way in
- All rows must be evaluated to get the top or bottom X%
Sort Object
- A lot like the Aggregator: it reads all rows into the data and index caches; RAM impact depends on the number of rows read
- WIDE rows can soak up space; depending on what is set up as the sort key, the index cache can also grow large
Router Transformation
[Diagram] An Expression feeds Filter1, Filter2 and Filter3, which feed Target1, Target2 and Target3.
This totals 6 passes of the data for 3 filters, which can double the amount of work needed to pass a row through.
Router Transformation
[Diagram] The Expression feeds a single Router, which feeds Target1, Target2 and Target3.
Compared to 6 passes of the data, there are now 4, which improves performance.
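The Router-versus-filters idea can be sketched as plain Python: three filters each test every row, while a router examines each row once and sends it to exactly one output group. The regions and groups are invented for illustration.

```python
# Sketch: three independent filters vs. one router over the same rows.

rows = [{"region": r} for r in ["east", "west", "north", "east"]]

# Three-filter version: every row is tested by every filter (3 passes).
t1 = [r for r in rows if r["region"] == "east"]
t2 = [r for r in rows if r["region"] == "west"]
t3 = [r for r in rows if r["region"] == "north"]

# Router version: one pass, one test per row, mutually exclusive groups.
groups = {"east": [], "west": [], "north": []}
for r in rows:
    groups[r["region"]].append(r)

# Same targets either way; the router just does less work per row.
assert (t1, t2, t3) == (groups["east"], groups["west"], groups["north"])
print(len(groups["east"]))  # 2
```

The benefit grows with the number of output groups, since the filter version re-reads the full row set for each added target.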
Bulk-Loaders Modes
Loader modes range from fast to slow:
- Fast loads usually bypass referential integrity checks and indexing; even faster modes append to the target tables
- Slow loads run directly through the database engine (the same path as other applications)
- Loaders will switch modes if certain criteria aren't met before the job starts
Session Speeds
Simple maps:
- Simple maps run faster given more memory, but there is a 1.8 GB real-memory limit per map
- Performance depends on how many maps are running in parallel; eventually too much parallelism slows down the entire system
- Even simple maps run only as fast as the source and the target

Complex maps:
- Each thread is linked by memory management: as the block sizes change, the thread speed slides
- Speed tests indicate the best average performance is achieved with 128 KB block sizes and 24 MB of RAM (shared memory)
Alternatives to a lookup
[Diagram] A Lookup feeding an Expression can be replaced by a Source Qualifier feeding a Joiner.
- Lookups should be used for relatively small tables; a Lookup caches all the data identified in its ports
- A Joiner caches only the keys on which the join happens
Alternatives to a lookup
[Diagram] An unconnected Lookup feeding an Expression can be replaced by a Source Qualifier, an Expression, an Aggregator and a Joiner.
This is an unconnected Lookup, used to get values based on a key and a code.

Input (key, code, value):
1 1 Txt1
1 2 Txt2
1 3 Txt3

Result set (one row per key):
1 Txt1 Txt2 Txt3
2 txta txtb txtc

The Lookup is replaced by the Source Qualifier, Expression, Aggregator and Joiner. The Expression calculates the proper values, and the Aggregator keeps the record with the accurate information (in this case, the last record).
Group by Clause
[Diagram] A Source Qualifier feeds an Aggregator.
- A database GROUP BY clause can be replaced by an Aggregator with sorted input (where possible)
- If the data arrives sorted, the Aggregator is always faster than the database GROUP BY
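Why sorted input helps: with rows ordered on the group-by key, each group can be summed and emitted the moment the key changes, so only one group ever sits in memory. A minimal Python sketch with invented data, using itertools.groupby (which, like a sorted-input Aggregator, assumes pre-sorted input):

```python
# Sketch of a sorted-input aggregation: stream the rows, emit each group's
# total as soon as the key changes, hold only one group at a time.
from itertools import groupby

rows = [  # already sorted on the group-by key, as a sorted source would be
    ("east", 10), ("east", 5),
    ("north", 7),
    ("west", 1), ("west", 2), ("west", 3),
]

totals = {
    key: sum(amount for _, amount in grp)
    for key, grp in groupby(rows, key=lambda r: r[0])
}

print(totals)  # {'east': 15, 'north': 7, 'west': 6}
```

An unsorted-input aggregator must instead hold every group in its data and index caches until the last row arrives, which is where the cache-sizing problems above come from.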
Deletes
[Diagram, before] Src1 and Src2 both feed the Source Qualifier, which feeds an Update Strategy into the target (Src1).
Source 2 is the driving table, and Source 1 is the table from which the data is deleted. Normally this process fetches the keys in Source 1 from which data is to be deleted, based on some condition, and the data is then deleted accordingly. Source 2 contains a few records on a key which forms part of the composite primary key in Source 1.
[Diagram, after] Only Src2 feeds the Source Qualifier, which feeds an Update Strategy into the target (Src1).
In the tuned version, Source 2 is the only table that is used; the target instance contains an update override, based on which the deletion happens on the particular key coming from the source table.
Thank You!