NOTE: FEW ANSWERS MAY BE WRONG/INACCURATE
------------------------------------------------------------------------------
========
=======
Normal
(Default)
Master outer
Detail outer
Full outer
Simple Pass through maping :
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing target :
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.
ETL
General
Typical
relational databases are designed for on-line transactional processing (OLTP)
and do not meet the requirements for effective on-line analytical processing
(OLAP). As a result, data warehouses are designed differently than traditional
relational databases.
When you create lookup transformation that time INFA asks for table name so
you can choose source, target, import and skip.
So click skip and
the use the sql overide property in properties tab to join two tables for
lookup
In ver 7x u have the option of looking up (lookup) on a flat file.
U can write to XML target.
Versioning
LDAP authentication
Support of 64 bit architectures
.
------------------------------------------------------------------------------
1. While importing the
relational source definition from database, what
are the Meta data of source
you import?
A. Source name, Database
location, Column names, Data types, and Key constraints
2. How many ways you can update a relational
source definition and what r they?
A. 1. Edit the definition
2. Reimport the definition
3. Where should U place the flat file to import
the flat file definition to the designer?
A.
Place it in local folder
4. To provide support for Main frames source data,
which files r used as a source
definitions?
A. COBOL files
5. Which transformation should
u need while using the cobol sources as source defintions?
A. Normalizer transformaiton
which is used to normalize the data.
Since Cobol sources r often consists of Denaormailzed data.
6. How can U create or import
flat file definition in to the warehouse designer?
A. U can not create or import flat file defintion in to
warehouse designer directly.
B. Instead U must analyze the file in source analyzer, then
drag it into the warehouse designer.
C. When U drag the
flat file source definition into warehouse designer workspace, the warehouse designer creates a
relational target defintion not a file defintion.
D. If u want to load
to a file,configure the session to write to a flat file.When the informatica
server runs the session,it creates and loads
the flatfile.
7.What is the mapplet?
A. Mapplet is a set of
transformations that you build in the mapplet designer and U can use in
multiple mappings.
8. what is a transforamation?
A. It is a repostitory object that
generates,modifies or passes data.
9. What r the deigner tools for
creating tranformations?
A. Mapping designer, Tansforamtion developer,
Mapplet designer
10. What r the active and
passive transforamtions?
A. An
active transforamtion can change the number of rows that pass through it.A
passive transformation does not change the number of rows that pass through it.
11. What r the connected or
unconnected transformations?
A. An
unconnected transformation is not connected to other transformations in the
mapping. Connected transformation is connected to other transformations in the mapping.
12. How many ways u create
ports?
A. 1.Drag the port from another
transformation
2.Click the add button on the ports
tab.
14. What r the reusable
transformations?
A. Reusable transformations can be used in multiple
mappings. When u need to incorporate
this transformation into mapping add an
instance of it to Mapping. Later if U
change the definition of the
transformation, all instances
of it inherit the changes. Since the instance of reusable transformations is a
pointer to that transformation can
change the transformation in the transformation developer, its instances automatically reflect
these changes. This feature can save U
great deal of work.
15. What r the methods for
creating reusable transformations?
A.
Two methods
1.Design it in the transformation
developer.
2.Promote a standard transformation
from the mapping designer. After U add a Transformation to the mapping, U can
promote it to the status of reusable
Transformation. Once U promote a standard transformation to reusable
status,U can demote it to a Standard
transformation at any time. If u change
the properties of a reusable transformation in
mapping,U can revert it to the original reusable transformation
properties by clicking the revert button.
16. What r the unsupported repository objects
for a mapplet?
A.
COBOL source definition
Joiner transformations
Normalizer transformations
Non reusable sequence generator
transformations.
Pre or post session stored procedures
Target definitions
Power mart 3.5 style Look Up functions
XML source definitions
IBM MQ source defintions
17. What r the mapping
paramaters and mapping variables?
Mapping parameter represents a constant
value that U can define before running a session. A mapping parameter retains
the same value throughout the entire session. when u use the mapping parameter,
U declare and use the parameter in a mapping or Mapplet. Then define the value
of parameter in a parameter file for the session.
Unlike a mapping parameter,a mapping variable
represents a value that can change throughout the session.The informatica
server saves the value of mapping variable to the repository at the end of
session run and uses that value next
time U run the session.
18. Can U use the mapping
parameters or variables created in one mapping into another mapping?
NO. We can use mapping parameters or
variables in any transformation of the same mapping or mapplet in which U have
created mapping parameters or variables.
19.Can u use the mapping parameters or variables created in
one mapping into any other reusable transformation?
Yes. Because reusable tranformation is not
contained with any mapplet or mapping.
20.How can U improve session performance in aggregator
transformation?
Use sorted input.
21.What is
aggregate cache in aggregator transforamtion?
The aggregator stores data in the
aggregate cache until it completes aggregate calculations.When u run a session
that uses an aggregator transformation,the informatica server creates index and
data caches in memory to process the transformation.If the informatica server
requires more space,it stores overflow values in cache files.
22.What r the diffrence between joiner transformation
and source qualifier transformation?
U can join hetrogenious data sources in
joiner transformation which we can not
achieve in source qualifier transformation.U need matching keys to join
two relational sources in source
qualifier transformation.Where as u doesn't need matching keys to join two sources.Two
relational sources should come from same datasource in sourcequalifier.whereas U can join relational sources which r coming
from different sources in joiner
transformation .
23. In which conditions we cannot use joiner
transformation (Limitations of joiner transformation)
Both pipelines begin with the
same original data source.
Both input pipelines originate
from the same Source Qualifier transformation.
Both input pipelines originate
from the same Normalizer transformation.
Both input pipelines originate
from the same Joiner transformation.
Either input pipelines contains
an Update Strategy transformation.
Either input pipelines contains
a connected or unconnected Sequence
Generator transformation.
24. What r the setting that u use to configure the
joiner transformation?
Master and detail source, Type of join
and Condition of the join
25. What r the join types in joiner transformation?
Normal (Default), Master outer, Detail
outer, full outer
Master Outer :All rows from Detail
and only matching rows from Master
Detail outer :All rows from Master and only matching rows
from Detail.
full outer join: keeps all rows of data
from both the master and detail sources
26.What r the joiner caches?
When a Joiner transformation occurs in a
session, the Informatica Server reads all the records from the master source
and builds index and data caches based on the master rows.After building the
caches, the Joiner transformation reads records from the detail source and
perform joins.
27.what is the look up transformation?
Use lookup transformation in u'r mapping
to lookup data in a relational table, view, synonym. Flat file Informatica server queries the look up table based on the
lookup ports in the transformation.It
compares the lookup transformation port values to lookup table column values
based on the look up condition.
28.Why use the lookup transformation ?
Get a related value. For example, if your
source table includes employee ID, but you want to include the employee name in
your target table to make your summary data
Easier to read. Perform a calculation. Many normalized tables include
values used in a Calculation, such as gross sales per invoice or sales tax, but
not the calculated value (such as net
sales).
Update slowly changing dimension tables.
You can use a Lookup transformation to Determine whether records already exist
in the target.
29.What r the types of lookup?
Connected and unconnected
30.Differences between
connected and unconnected lookup?
Connected
Lookup
|
Unconnected
Lookup
|
Receives input values directly from the
pipeline.
|
Receives input values from the result of
a :LKP expression in another transformation.
|
You can use a dynamic or static cache.
|
You can use a static cache.
|
Cache includes all lookup columns used in
the mapping (that is, lookup source columns included in the lookup condition
and lookup source columns linked as output ports to other transformations).
|
Cache includes all lookup/output ports in
the lookup condition and the lookup/return port.
|
Can return multiple columns from the same
row or insert into the dynamic lookup cache.
|
Designate one return port (R). Returns
one column from each row.
|
If there is no match for the lookup
condition, the PowerCenter Server returns the default value for all output
ports. If you configure dynamic caching, the PowerCenter Server inserts rows
into the cache or leaves it unchanged.
|
If there is no match for the lookup
condition, the PowerCenter Server returns NULL.
|
If there is a match for the lookup
condition, the PowerCenter Server returns the result of the lookup condition
for all lookup/output ports. If you configure dynamic caching, the
PowerCenter Server either updates the row the in the cache or leaves the row unchanged.
|
If there is a match for the lookup
condition, the PowerCenter Server returns the result of the lookup condition
into the return port.
|
Pass multiple output values to another
transformation. Link lookup/output ports to another transformation.
|
Pass one output value to another
transformation. The lookup/output/return port passes the value to the
transformation calling :LKP expression.
|
Supports user-defined default values.
|
Does not support user-defined default
values.
|
|
|
31.what is meant by lookup caches?
The informatica server builds a cache in
memory when it processes the first row of a data in a cached look up transformation.It allocates memory
for the cache based on the amount u configure in the
transformation or session
properties.The informatica server stores condition values in the index cache and output values
in the data cache.
32.What r the types of lookup caches?
Persistent cache: U can save
the lookup cache files and reuse them the next time the informatica server
processes a lookup transformation configured to use the cache.
Recache from database: If the
persistent cache is not synchronized with the lookup table,U can configure the
lookup transformation to rebuild the
lookup cache.
Static cache: U can configure a
static or read-only cache for only lookup table.By default informatica server
creates a static cache. It caches the lookup table and lookup values in the
cache for each row that comes into the transformation. when the lookup
condition is true,the informatica server does not update the cache while it
processes the lookup transformation.
Dynamic cache: If u want to
cache the target table and insert new rows into cache and the target,u can create a look up
transformation to use dynamic cache.The informatica server dynamically inserts
data to the target table.
shared cache: U can share the
lookup cache between multiple transactions U can share unnamed cache between
transformations in the same mapping.
33. Difference between static
cache and dynamic cache
Uncached
|
Static
Cache
|
Dynamic
Cache
|
You cannot insert or update the cache.
|
You cannot insert or update the cache.
|
You can insert or update rows in the
cache as you pass rows to the target.
|
You cannot use a flat file lookup.
|
You can use a relational or a flat file
lookup.
|
You can use a relational lookup only.
|
When the condition is true, the
PowerCenter Server returns a value from the lookup table or cache.
When the condition is not true, the
PowerCenter Server returns the default value for connected transformations
and NULL for unconnected transformations.
For details, see Working with an Uncached Lookup or Static Cache.
|
When the condition is true, the
PowerCenter Server returns a value from the lookup table or cache.
When the condition is not true, the
PowerCenter Server returns the default value for connected transformations
and NULL for unconnected transformations.
For details, see Working with an Uncached Lookup or Static Cache.
|
When the condition is true, the
PowerCenter Server either updates rows in the cache or leaves the cache
unchanged, depending on the row type. This indicates that the row is in the
cache and target table. You can pass updated rows to the target table.
When the condition is not true, the
PowerCenter Server either inserts rows into the cache or leaves the cache
unchanged, depending on the row type. This indicates that the row is not in
the cache or target table. You can pass inserted rows to the target table.
For details, see Updating the Dynamic Lookup Cache.
|
34. Which transformation should we use to normalize the
COBOL and relational sources?
Normalizer Transformation. When U drag
the COBOL source in to the mapping
Designer workspace, the normalizer transformation automatically appears,
creating input and output ports for every column in the source.
35 .How the informatica server sorts the string values
in Ranktransformation?
When the informatica server runs in the
ASCII data movement mode it sorts session data using Binary sort order.If U configure the session
use a biary sort order, the informatica server calculates the binary value of
each string and returns the specified Number of rows with the highest binary
values for the string.
36.What r the rank caches?
During the session ,the informatica server
compares an in out row with rows in the data cache. If the input row out-ranks
a stored row, the informatica server
replaces the stored row with the input
row. The informatica server stores group information in an index cache and row
data in a data cache.
37.What is the Rankindex in Ranktransformation?
The Designer automatically creates a
RANKINDEX port for each Rank transformation. The Informatica Server uses the
Rank Index port to store the ranking position for each record in a group. For
example, if you create a Rank transformation that ranks the top 5 salespersons
for each quarter, the rank index numbers the salespeople from 1 to 5:
38. What is the
Router transformation?
A
Router transformation is similar to a Filter transformation because both
transformations allow you to use a
condition to test data. However, a Filter transformation tests data for one
condition and drops the
rows of data that do not meet
the condition. A Router transformation tests data for one or more conditions
and gives you the option to route rows of data that do not meet any of the
conditions to a default output
group. If you need to test the
same input data based on multiple conditions, use a Router Transformation in a
mapping instead of creating multiple Filter transformations to perform the same
task.
39.What r the types of groups
in Router transformation?
Input group, Output group
The designer copies property information
from the input ports of the input group to create a set of output ports for each output group.
Two types of output groups
User defined groups
Default group
U can not modify or delete default groups.
40.Why we use stored procedure transformation?
For populating and maintaining
data bases.
42.What r the types of data that passes between
informatica server and stored procedure?
3 types of data
Input/Out put parameters
Return Values
Status code.
43.What is the status code?
Status code provides error handling for the
Informatica server during the session.The stored procedure issues a status code
that notifies whether or not stored procedure Completed sucessfully. The user
cannot see this value. It only used by the informatica server to determine
whether to continue running the Session or stop.
44. What is source qualifier transformation?
Ans. When U add a relational or
a flat file source definition to a mappings, U need to connect it to a source qualifier transformation. The source
qualifier transformation represents the records that the informatica server
reads when it runs a session.
45.What r the tasks that source qualifier perform?
Join data originating from same source
database.
Filter records when the informatica server
reads source data.
Specify an outer join rather than the
default inner join
Specify sorted records.
Select only distinct values from the
source.
Creating custom query to issue a special
SELECT statement for the informatica server to read Source data.
46. What is the target load order?
U specify the target load order based on
source qualifiers in a mapping. If u have the multiple Source qualifiers
connected to the multiple targets, U can designate the order in which
informatica server loads data into the targets.
47. What is the default join that source qualifier
provides?
Inner equijoin.
48. What r the basic needs to join two sources in a
source qualifier?
Two sources should have primary and
foreign key relation ships.
Two sources should have matching data
types.
49.what is update strategy transformation?
This transformation is used to maintain the
history data or just most recent Changes
into target table.
50.Describe two levels in which update strategy
transformation sets?
Within a session. When you
configure a session, you can instruct the Informatica Server to either treat
all records in the same way (for example, treat all records as inserts), or use
instructions coded into the
session mapping to flag records
for different database operations.
Within a mapping. Within a
mapping, you use the Update Strategy transformation to flag records for insert,
delete, update, or reject.
51.What is the default source option for update
strategy transformation?
Data driven.
52. What is Datadriven?
The informatica server follows
instructions coded into update strategy transformations With in the session
mapping determine how to flag records for insert,update,delete or Reject .If u
do not choose data driven option setting, the informatica server ignores all
update strategy transformations in the mapping.
53.What r the options in the target session of update
strategy transformation?
Insert ,Delete ,Update , Update as update
,Update as insert , Update esle insert
,Truncate table
54. What r the types of mapping wizards that r to be
provided in Informatica?
The Designer provides two
mapping wizards to help you create mappings quickly and easily. Both wizards
are designed to create mappings for loading and maintaining star schemas, a
series of dimensions related to a central fact table.
Getting
Started Wizard. Creates mappings to load static fact and dimension tables, as
well as slowly growing dimension tables.
Slowly Changing Dimensions
Wizard. Creates mappings to load slowly changing dimension tables based on the amount of
historical dimension data you want to keep and
the method you choose to handle historical dimension data.
55. What r the types of mapping in Getting Started
Wizard?
Simple Pass through mapping:
Loads a static fact or dimension table
by inserting all rows. Use this mapping when you Want to drop all existing data
from your table before loading new data.
Slowly Growing target:
Loads a slowly growing fact or dimension
table by inserting new rows. Use this mapping to load new data when existing
data does not require updates.
56. What r the mappings that we use for slowly
changing dimension table?
Type1: Rows containing changes to existing
dimensions are updated in the target by
overwriting the existing dimension. In the Type 1 Dimension mapping, all
rows contain current dimension data. Use the Type 1 Dimension mapping to update
a slowly changing dimension table when
you do not need to keep any previous versions of dimensions in the
table.
Type 2: The Type 2 Dimension Data mapping inserts
both new and changed dimensions into the
target. Changes are tracked in the target table by versioning the primary key and creating a version number for
each dimension in the table. Use the
Type 2 Dimension/Version Data mapping to update a slowly
changing dimension table when you want to keep a full
history of dimension data in the table. Version numbers and versioned primary
keys track the order of changes to each
dimension.
Type 3: The Type 3 Dimension
mapping filters source rows based on user-defined comparisons and inserts only those found to
be new dimensions to the target. Rows
containing changes to existing dimensions are updated in the target.
When updating an existing dimension, the Informatica Server saves existing data
in different columns of the same row and
replaces the existing data with the updates
57. What r the different types
of Type2 dimension mapping?
Type2 Dimension/Version Data
Mapping: In this mapping the updated dimension in he source will gets inserted in target along with a new
version number. And newly added dimension in source will inserted into target
with a primary key.
Type2 Dimension/Flag current
Mapping: This mapping is also used for slowly changing dimensions. In addition
it creates a flag value for changed or new dimension.Flag indicates the
dimension is new or newlyupdated.Recent dimensions will gets saved with current
flag value 1. And updated dimensions r saved with the value 0.
Type2 Dimension/Effective Date Range Mapping: This
is also one flavour of Type2 mapping used for slowly changing dimensions.This
mapping also inserts both new and changed dimensions in to the target.And
changes r tracked by the effective date range for each version of each
dimension.
58. How can u recognise whether
or not the newly added rows in the source r gets insert in the target ?
In the Type2 mapping we have
three options to recognise the newly added rows Version number, flagvalue,
Effective date Range
59. What r two types of processes that informatica runs
the session?
Load manager Process: Starts
the session, creates the DTM process, and sends post-session email when the
session completes.
The DTM process. Creates
threads to initialize the session, read, write, and transform data, and handle
pre- and post-session operations.
60. What r the new features of the server manager in
the informatica 5.0?
U can use command line arguments for a
session or batch.This allows U to change the
values of session parameters,and mapping
parameters and mapping variables.
Parallel data processig: This feature is
available for powercenter only.If we use the
informatica server on a SMP system,U can
use multiple CPU's to process a session
concurently.
Process session data using threads:
Informatica server runs the session in two
processes.Explained in previous
question.
61. Can u
generate reports in Informatcia?
Yes. By using Metadata reporter we can generate reports in
informatica.
62. What is
metadata reporter?
It is a web based application that
enables you to run reports against repository metadata. With a metadata
reporter,u can access information about U'r repository with out having knowledge of sql,transformation
language or underlying tables in the
repository.
63. Define mapping and sessions?
Mapping: It is a set of source and
target definitions linked by transformation objects that define the rules for
transformation.
Session : It is a set of instructions
that describe how and when to move data from
source to targets.
64. Which tool U use to create and manage
sessions and batches and to monitor and stop the informatica server?
Informatica server manager.
65. Why we use partitioning the session in
informatica?
Partitioning achieves the session
performance by reducing the time period of reading the source and loading the data into target.
66. To achieve the session partition what r the necessary tasks u have to do?
Configure the session to partition
source data. Install the informatica server on a machine with multiple CPU's.
67 .How the informatica server increases the
session performance through partitioning the
source?
For a relational sources informatica
server creates multiple connections for each
partition of a single source and
extracts seperate range of data for each
connection.Informatica server reads multiple partitions of a single
source concurently.Similarly for loading
also informatica server creates multiple connections to the target and loads partitions of data
concurently.
For XML and file sources,informatica
server reads multiple files concurently.For
loading the data informatica server creates a seperate file for each
partition(of a source file).U can choose
to merge the targets.
68. Why u use repository connectivity?
When u edit,schedule the sesion each
time,informatica server directly communicates
the repository to check whether or not the session and users r valid.All
the metadata of sessions and mappings
will be stored in repository.
69. What r the tasks that Loadmanger process will do?
Manages the session and batch
scheduling: When u start the informatica server the load maneger launches and
queries the repository for a list of
sessions configured to run on the
informatica server. When u configure the session the loadmanager maintains list
of list of sessions and session start times.When u sart a session loadmanger fetches
the session information from the repository to perform the validations and
verifications prior to starting DTM
process.
Locking and reading the session: When the
informatica server starts a session
loadmaager locks the session from the repository.Locking prevents U
starting the session again and again.
Reading the parameter file: If the
session uses a parameter files,loadmanager reads the parameter file and
verifies that the session level parematers are declared in the file
Verifies permission and privelleges: When
the sesson starts load manger checks
whether or not the user have privelleges to run the session.
Creating log files: Loadmanger creates
logfile contains the status of session.
70. What is DTM process?
After the loadmanger performs validations for session,it creates the
DTM process.DTM is to create and manage
the threads that carry out the session tasks.I
creates the master thread.Master thread creates and manges all the other
threads.
71. What r the different
threads in DTM process?
Master thread: Creates and manages all
other threads
Mapping thread: One mapping thread will
be creates for each session.Fectchs session and mapping info
Pre and post session threads: This will
be created to perform pre and post session operations.
Reader thread: One thread will be
created for each partition of a
source.It reads data from source.
Writer thread: It will be created to
load data to the target.
Transformation thread: It will be
created to tranform data.
72. What r the data movement modes in
informatcia?
Datamovement modes determines how
informatica server handles the charector data.U choose the datamovement in
the informatica server configuration
settings.Two types of datamovement modes
avialable in informatica.
ASCII mode
Uni code mode.
73. What r the out put files that the informatica
server creates during the session running?
Informatica server log: Informatica
server(on unix) creates a log for all
status and error messages(default name:
pm.server.log).It also creates an error log for error messages.These files will be created in
informatica home directory.
Session log file: Informatica server
creates session log file for each session.It writes information about session into log files such
as initialization process,creation of sql
commands for reader and writer threads,errors encountered and load
summary.The amount of detail in session log file depends on the tracing level
that u set.
Session detail file: This file contains
load statistics for each targets in mapping.Session detail include information such as table
name,number of rows written or rejected.U can
view this file by double clicking on the session in monitor window
Performance detail file: This file
contains information known as session performance details which helps U where performance can
be improved.To genarate this file select
the performance detail option in the session property sheet.
Reject file: This file contains the rows
of data that the writer does notwrite to targets.
Control file: Informatica server creates
control file and a target file when U run a
session that uses the external loader.The control file contains the
information about the target flat file
such as data format and loading instructios for the external loader.
Post session email: Post session email
allows U to automatically communicate
information about a session run to designated recipents.U can create two
different messages.One if the session
completed sucessfully the other if the session fails.
Indicator file: If u
use the flat file as a target,U can configure the informatica server to create indicator file.For each target
row,the indicator file contains a number to indicate whether the row was marked for
insert,update,delete or reject.
output file: If session writes to a target
file,the informatica server creates the target file based on file prpoerties entered in the
session property sheet.
Cache files: When the informatica server
creates memory cache it also creates cache
files.For the following circumstances informatica server creates index
and datacache files.
Aggreagtor transformation
Joiner transformation
Rank transformation
Lookup transformation
74. In which circumstances that informatica
server creates Reject files?
When it encounters the DD_Reject in
update strategy transformation. Violates database constraint Filed in the rows was truncated or
overflowed.
75. What is polling?
It displays the updated information
about the session in the monitor window.The monitor window
displays the status of each session when
U poll the informatica server
76. Can u copy the session to a different
folder or repository?
Yes. By using copy session wizard u can
copy a session in a different folder or
repository.But that target folder or repository should consists of mapping of
that session.If target folder or
repository is not having the mapping of copying session , u should have to copy that mapping first
before u copy the
session.
77. What is batch and describe about types of
batches?
Grouping of session is known as
batch.Batches r two types
Sequential: Runs sessions one after the
other
Concurrent: Runs session at same time.
If u have sessions with source-target
dependencies u have to go for sequential batch to start the sessions one after
another.If u have several independent
sessions u can use concurrent
batches.Whcih runs all the sessions at
the same time.
78. Can u copy the batches? NO
79. How many number of sessions that u can create
in a batch?
Any number of sessions.
80. When the informatica server marks that a
batch is failed?
If one of session is configured to
"run if previous completes"
and that previous session fails.
81. What is a command that used
to run a batch?
pmcmd is used to start a batch.
82. What r the different
options used to configure the sequential
batches?
Two options Run the session only if
previous session completes
successfully.
Always runs the session.
83. In a sequential batch can u run the session
if previous session fails?
Yes.By setting the option always runs the
session.
84. Can u start a batches with in a
batch?
U can not. If u want to start batch that
resides in a
batch,create a new independent
batch and copy the necessary sessions into the new batch.
85. Can u start a session inside a batch
idividually?
We can start our required session only in
case of sequential batch.in case of
concurrent batch we cant do like this.
86. How can u stop a batch? By using server manager or pmcmd.
87. What r the session parameters?
Session parameters r like mapping
parameters,represent values U might want to change between sessions such as database connections or
source files.
Server manager also allows U to create
userdefined session parameters.Following r user defined session
parameters.
Database connections
Source file names: use this parameter
when u want to change the name or location of session source file between
session runs
Target file name : Use this parameter when u want to change the
name or location of session target file between session runs.
Reject file name : Use this parameter when u want to change
the name or location of
session reject files between session
runs.
88. What is parameter file?
Parameter file is to define the values
for parameters and variables used in a
session. A parameter file is a file
created by text editor such as word pad or notepad. U can define the following
values in parameter file
Mapping parameters
Mapping variables
session parameters
89. How can u access the remote source into U'r
session?
Relational source: To acess relational
source which is situated in a remote place ,u need configure database connection to the
datasource.
FileSource : To access the remote source
file U must configure the FTP connection to the host machine before u
create the session.
Hetrogenous : When U'r mapping contains
more than one source type,the server manager creates a hetrogenous session that displays source
options for all types.
90. What is difference between
partioning of relatonal target and partitioning of file targets?
If u partition a session with a
relational target informatica server creates multiple Connections to the target database to write
target data concurently.
If u partition a session with a
file target the informatica server creates one target file for each partition.U
can configure session properties to merge these target files.
91. what r the transformations that restricts the
partitioning of sessions?
Advanced External procedure
tranformation and External procedure transformation:
This transformation contains a check box on the properties tab to allow
partitioning.
Aggregator Transformation: If u use sorted
ports u can not parttion the assosiated
Source
Joiner Transformation : U can not
partition the master source for a joiner
Transformation
Normalizer Transformation XML targets.
92. Performance tuning in
Informatica?
The goal of performance tuning is
optimize session performance so sessions run
during the available load window for the Informatica Server.Increase the
session performance by following.
The performance of the Informatica
Server is related to network connections. Data
generally moves across a network at less
than 1 MB per second, whereas a local disk
moves data five to twenty times faster. Thus network connections ofteny
affect on session performance.So aviod
netwrok connections.
Flat files: If u'r flat files
stored on a machine other than the informatca server, move those files to the machine that consists of
informatica server.
Relational datasources: Minimize
the connections to sources ,targets and informatica server to improve session performance.Moving
target database into server system may improve
session performance. Staging
areas: If u use staging areas u force informatica server to perform multiple datapasses.
Removing of staging areas may improve
session performance.
U can run the multiple informatica
servers againist the same
repository.Distibuting the
session load to multiple informatica
servers may improve session
performance.
Run the informatica server in ASCII
datamovement mode improves
the session
performance.Because ASCII datamovement
mode stores a character
value in one
byte.Unicode mode takes 2 bytes to store
a character.
If a session joins multiple source tables
in one Source
Qualifier, optimizing the
query
may improve performance. Also, single
table select statements
with an ORDER BY or
GROUP BY clause may benefit from
optimization such as adding
indexes.
We can improve the session performance by
configuring the network
packet size,which
Allows
data to cross the network at one time.To do this go to
server manger ,choose
server configure database connections.
If u r target consists key constraints
and indexes u slow the
loading of data.To improve
the session performance in this case drop
constraints and indexes
before u run the
session and rebuild them after completion
of session.
Running a parallel sessions by using
concurrent batches will
also reduce the time of
loading the
data.So concurent batches may also
increase the session
performance.
Partittionig the session improves the
session performance by
creating multiple
connections to sources and targets
and loads data in paralel
pipe lines.
In some cases if a session contains a
aggregator transformation
,u can use incremental
aggregation to improve session
performance.
Aviod transformation errors to improve
the session
performance.
If the sessioin containd lookup
transformation u can improve
the session performance
by enabling the look up cache.
If U'r session contains filter transformation
,create that
filter transformation nearer
to
the sources
or u can use filter condition in source
qualifier.
Aggreagator,Rank and joiner
transformation may oftenly decrease
the session
performance .Because they must group
data before processing
it.To improve session
performance in this case use sorted
ports option.
92. What is difference
between mapplet and reusable transformation?
Mapplet consists of set of
transformations that is reusable.A reusable transformation is A single transformation that can be
reusable.
If u create a variables or parameters
in mapplet that can not be used in another mapping or mapplet.Unlike the variables that r
created in a reusable transformation can be
usefull in any other mapping or mapplet.
We can not include source definitions
in reusable
transformations.But we can
add
sources to a mapplet.
Whole transformation logic will be hided in case of
mapplet.But it is transparent
in
case of reusable transformation.
We cant use COBOL source
qualifier,joiner,normalizer
transformations in
mapplet.Where as we can make them as a
reusable
transformations.
93. Define
informatica repository?
The Informatica repository is a
relational database that stores information, or metadata, used by the Informatica Server
and Client tools.
Metadata can include information such as mappings describing
how to transform source
data, sessions indicating when you want the
Informatica Server to perform the transformations, and
connect strings for sources and
targets. The repository also stores
administrative information such as
usernames and passwords, permissions and privileges, and
product version.
Use repository manager to create the
repository.The Repository Manager connects to
the repository database and runs the code needed to create the repository tables.Thsea tables
stores metadata in specific format the informatica server,client
tools use.
94. What r the types of
metadata that stores in repository?
Following
r the types of metadata that stores in the repository
Database connections
Global objects
Mappings
Mapplets
Multidimensional metadata
Reusable transformations
Sessions and batches
Short cuts
Source definitions
Target defintions
Transformations
95. What is power center
repository?
The PowerCenter repository allows you to
share metadata across repositories to create
a data mart domain. In a data mart domain, you can create a single
global repository to store metadata used
across an enterprise, and a number of local repositories to share the global metadata as needed.
96. How can u work with remote database in informatica?did u work directly
by using remote connections?
To work with remote datasource u need to
connect it with remote connections.But it is
not preferable to work with that remote source directly by using remote
connections . Instead u bring that source
into U r local machine where informatica server resides.If u work directly
with remote source the session performance will decreases by passing
less amount of data across the network
in a particular time.
97. What r the new features in
Informatica 5.0?
U can Debug U'r mapping in mapping
designer
U can view the work space over the
entire screen
The designer displays a new icon for a
invalid mappings in the navigator window
U can use a dynamic lookup cache in a
lokup transformation
Create mapping parameters or mapping
variables in a mapping or mapplet to make mappings more flexible
U can export objects into repository and
import objects from repository.when u export
a repository object,the designer or
server manager creates an XML file to describe the
pository metadata.
The designer allows u to use Router
transformation to test data for multiple
conditions.Router transformation allows u route groups of data to
transformation or target.
U can use XML data as a source or target.
Server Enahancements:
U can use the command line program pmcmd
to specify a parameter file to run sessions
or batches.This allows you to change the values of session parameters,
and mapping parameters and variables at runtime.
If you run the Informatica Server on a
symmetric multi-processing system, you can use
multiple CPUs to process a session
concurrently. You configure partitions in the
session properties based on source qualifiers. The Informatica Server
reads, transforms, and writes partitions
of data in parallel for a single session. This is avialable for Power center only.
Informatica server creates two processes
like loadmanager process,DTM process to
run the sessions.
Metadata Reporter: It is a web based
application which is used to run reports againist
repository metadata.
U can copy the session across the folders and
reposotories using the copy session
wizard in the informatica server manager
With new email variables, you can
configure post-session email to include information, such as the mapping used during the
session
98. what is incremantal aggregation?
When using incremental aggregation, you
apply captured changes in the source to
aggregate calculations in a session. If the source changes only
incrementally and you can capture
changes, you can configure the session to process only those changes. This allows the Informatica Server to update your
target incrementally, rather than forcing it
o process the entire source and recalculate the same calculations each
time you run the session.
99. What r the scheduling
options to run a sesion?
U can shedule a session to run at a given
time or intervel,or u can manually run the
session.
Different options of scheduling
Run only on demand: Informatica server
runs the session only when user starts session
explicitly
Run once: Informatica server runs the
session only once at a specified date and time.
Run every: Informatica server runs the
session at regular intervels as u configured.
Customized repeat: Informatica server
runs the session at the dats and times secified
in the repeat dialog box.
100 . What is tracing level and
what r the types of tracing level?
Tracing level represents the amount of
information that
informatcia server writes in
a
log file.
Types of tracing level
Normal Verbose Verbose init Verbose data
101. What is
difference between stored procedure transformation and external procedure transformation?
In case of storedprocedure
transformation procedure will be
compiled and executed in a relational
data source.U need database connection to import the stored Procedure in to u'r mapping.Where as in
external procedure transformation
procedure or function will be executed out side of data
source.Ie u need to make
it as a DLL to access in u r mapping.No
need to have data base
connection in case of external
procedure transformation.
102. Explain about Recovering sessions?
If you stop a session or if an error
causes a session to stop, refer to the session and error logs to determine the cause of failure.
Correct the errors, and then complete the
session. The method you use to complete the session depends on the
properties of the mapping, session, and
Informatica Server configuration.
Use one of the following
methods to complete the session:
• Run the session again if the Informatica Server has not
issued a commit.
• Truncate the target tables and run the session again if
the session is not recoverable.
• Consider performing recovery if the Informatica Server has
issued at least one commit.
103. If a session fails after
loading of 10,000 records in to the target. How can u load the records from 10001 the record when u run the session
next time?
As explained above informatica server has 3
methods to recovering the sessions. Use performing recovery to load the records from where the
session fails.
104. Explain about perform
recovery?
When the Informatica Server starts a
recovery session, it reads the OPB_SRVR_RECOVERY
table and notes the row ID of the last row committed to the target database.
The Informatica Server then reads all sources again and starts Processing from the next row ID. For
example, if the Informatica
Server commits 10,000 rows before the session fails, when you
run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with
row 10,001.
By default, Perform Recovery is
disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server
setup before you run a session so the
Informatica Server can create and/or write entries in the
OPB_SRVR_RECOVERY Table.
105. How to recover the
standalone session?
A Standalone session is a session that is not
nested in a batch. If a standalone session fails, you can run recovery using a menu command or
pmcmd. These options are not available for batched sessions. To recover sessions using the menu:
1. In the Server Manager,
highlight the session you want to recover.
2. Select Server Requests-Stop
from the menu.
3. With the failed session highlighted, select
Server Requests-Start
Session in Recovery Mode from
the menu.
To recover sessions using pmcmd:
1.From the command line, stop
the session.
2. From the command line, start
recovery.
106. How can u recover the session
in sequential batches?
If you configure a session in a sequential
batch to stop on failure, you can run recovery
Starting with the failed session. The Informatica Server completes the
session and then Runs the rest of the
batch. Use the Perform Recovery session property
To recover sessions in
sequential batches configured to stop on
failure:
1. In the Server Manager, open
the session property sheet.
2. On the Log Files tab, select
Perform Recovery, and click OK.
3. Run the session.
4. After the batch completes,
open the session property sheet.
5. Clear Perform Recovery, and
click OK.
If you do not clear Perform
Recovery, the next time you run the
session, the Informatica Server
attempts to recover the previous session.
If you do not configure a
session in a sequential batch to stop on
failure, and the remaining
sessions in the batch complete, recover the
failed session as a standalone
session.
107. How to recover sessions in
concurrent batches?
If multiple sessions in a concurrent
batch fail, you might want to truncate all targets and run the batch again. However, if a session in
a concurrent batch fails and the rest of the
sessions complete successfully, you can recover the session as a
standalone session.
To recover a session in a
concurrent batch:
1.Copy the failed session using
Operations-Copy Session.
2.Drag the copied session
outside the batch to be a standalone session.
3.Follow the steps to recover a
standalone session.
4.Delete the standalone copy.
108. How can u complete
unrecoverable sessions?
Under certain circumstances, when a
session does not complete, you need to truncate the target tables and run the session from
the beginning. Run the session from the
beginning when the Informatica Server cannot run recovery or when
running recovery might result in
inconsistent data.
109. What r the circumstances
that informatica server results an unreciverable session?
The source qualifier transformation
does not use sorted ports .if u change the partition information after the
initial session fails. Perform
recovery is disabled in the informatica server
configuration. If the sources or targets changes after
initial session fails.
If the mapping consists of sequence
generator or normalizer
transformation.
If a concurrent batch contains multiple failed sessions.
110. If i done any
modifications for my table in back end does it reflect in informatica Warehouse or mapping designer or source
analyzer?
NO. Informatica is not at all concern
with back end database. It displays u all the
Information that is to be stored in repository. If want to reflect back
end changes to informatica screens,
again u have to import from back end to informatica by valid connection. And u have to replace the
existing files with
imported files.
========
·
What are the types of lookup?
· Connected
look up and un-connected look up
·
Unconnected Lookup and Connected Lookup
· connected
and unconnected
·
1.connected 2.unconnected
· connected
|
||||||||||||||||||||||||||||||||||||||||||||||
·
How can you recognise whether or not the newly added rows in the
source are gets insert in the target?
|
||||||||||||||||||||||||||||||||||||||||||||||
·
How can you create or import flat file definition in to the warehouse
designer?
· By giving
server connection path
|
||||||||||||||||||||||||||||||||||||||||||||||
·
What is aggregate cache in aggregator transforamtion?
· r8ir678
· IT IS A
CACHE
· Aggregate
value will stored in data cache, grouped column value will stored in index
cache
|
||||||||||||||||||||||||||||||||||||||||||||||
·
What are the settiings that you use to cofigure the joiner
transformation?
· Master
group flow detail group flow join condition type of join
· take less
no. of rows table as master table, more no of table as detail table and join
condition. joiner will put all row from master table into chache and check
condition with detail table rows.
|
||||||||||||||||||||||||||||||||||||||||||||||
·
How the informatica server increases the session performance through
partitioning the source?
|
||||||||||||||||||||||||||||||||||||||||||||||
·
What is source qualifier transformation?)
|
||||||||||||||||||||||||||||||||||||||||||||||
·
Why use the lookup transformation?
· Used to
look up data in a relational table or view.
· in Inf7.1,
we can get from flat file also
· look up is
used to perform one of the following task: -to get related value -to perform
calculation -to update slowley changing dimension table
|
||||||||||||||||||||||||||||||||||||||||||||||
·
What is the look up transformation?
· Used to
look up data in a reational table or view.
· Lookup is
a passive transformation and used to look up data in a flat file or a
relational table
|
||||||||||||||||||||||||||||||||||||||||||||||
·
What are the join types in joiner transformation?
·
· the
following are the join types
|
||||||||||||||||||||||||||||||||||||||||||||||
|
If you have 10 mappings designed and you need to implement
some chages (may be in existing mapping or new mapping need to be designed) then
how much time it takes from easier to complex?
|
|
What is the filename which you need to configure in UNIX
while installing infromatica?
|
|
What happens if you increase commit intervals and also
decrease commitExplain grouped cross tab?
|
|
What is hash partition?
|
|
What is the approximate size of data warehouse?
|
|
What is data quality? How can a data quality solution be
implemented into my informatica
transformations, even internationally?
|
|
What is the difference between view and materialised view?
|
|
How is Data Models Used in Practice?
|
|
What is an MDDB? What is the difference between MDDBs and
RDBMSs?
|
|
What is active and passive transformation?
|
|
Why do we use DSS database for OLAP tools?
|
|
What is up date strategy and what are the options for
update strategy?
|
|
What is data merging, data cleansing and sampling?
|
|
What is staging area?
|
|
What is difference between a connected look up and
unconnected look up?
|
|
When you add a relational or a flat file source definition
to a mapping, you need to connect it to a Source Qualifier transformation. The
Source Qualifier represents the rows that the Informatica Server reads when it
executes a session.
What r
the diffrence between joiner transformation and source qualifier transformation
U can join hetrogenious data sources in joiner transformation which we
can not achieve in source qualifier transformation.
U need matching keys to join two relational sources in source qualifier transformation.Where as u doesn’t need matching keys to join two sources.
Two relational sources should come from same datasource in sourcequalifier.U can join relatinal sources which r coming from diffrent sources also.
U need matching keys to join two relational sources in source qualifier transformation.Where as u doesn’t need matching keys to join two sources.
Two relational sources should come from same datasource in sourcequalifier.U can join relatinal sources which r coming from diffrent sources also.
Source qualifier – Homogeneous source
Joiner – Heterogeneous source
The Source Qualifier transformation provides
an alternate way to filter rows. Rather than filtering rows from within a
mapping, the Source Qualifier transformation filters rows when read from a
source. The main difference is that the source qualifier limits the row set extracted from a source, while the
Filter transformation limits the row set sent to a target. Since a
source qualifier reduces the number of rows used throughout the mapping, it
provides better performance.
However, the Source Qualifier transformation
only lets you filter rows from relational sources, while the Filter
transformation filters rows from any type of source. Also, note that since it
runs in the database, you must make sure that the filter condition in the
Source Qualifier transformation only uses standard SQL.
what r
the transformations that restricts the partitioning of sessions?
Advanced External procedure tranformation and External procedure
transformation: This
transformation contains a check box on the properties tab to allow partitioning.
Aggregator Transformation: If u use sorted ports u can not parttion the assosiated source
Joiner Transformation : U can not partition the master source for a joiner transformation
Normalizer Transformation
XML targets.
transformation contains a check box on the properties tab to allow partitioning.
Aggregator Transformation: If u use sorted ports u can not parttion the assosiated source
Joiner Transformation : U can not partition the master source for a joiner transformation
Normalizer Transformation
XML targets.
What are
Data driven Sessions?
The informatica server follows instructions coded into update strategy
transformations with in the session mapping to determine how to flag records
for insert,update,delete or reject. If you do not choose data driven optionn
setting, the informatica server ignores all update strategy transformations in
the mapping
Once you load the data
in your DW,you can update the new data with the following options in your
session properties:-
1.update
2.insert2.delete and datadriven, and all these options are present in your
session properties,now if you select the datadriven option ,informatica
takes the logic to update,delete or reject data from your designer update
strategy transformation.it will look some thing like this
IIF( JOB =
'MANAGER', DD_DELETE, DD_INSERT ) this expression marks jobs with an ID of
manager for deletion, and all other items for insertion.
Hope answer the
question.
What r
the unsupported repository objects for a mapplet?
COBOL source definition
Joiner transformations
Normalizer transformations
Non reusable sequence generator transformations.
Pre or post session stored procedures
Target defintions
Power mart 3.5 style Look Up functions
XML source definitions
IBM MQ source definitions
Joiner transformations
Normalizer transformations
Non reusable sequence generator transformations.
Pre or post session stored procedures
Target defintions
Power mart 3.5 style Look Up functions
XML source definitions
IBM MQ source definitions
The following answer is from Informatica Help Documnetation
· You cannot include the following objects in a
mapplet:
- Normalizer transformations
- Cobol sources
- XML Source Qualifier transformations
- XML sources
- Target definitions
- Pre- and post- session stored procedures
- Other mapplets
What is
difference between maplet and reusable transformation?
Maplet consists of set of transformations that is reusable.A reusable
transformation is a
single transformation that can be reusable.
If u create a variables or parameters in maplet that can not be used in another maping or maplet.Unlike the variables that r created in a reusable transformation can be usefull in any other maping or maplet.
We can not include source definitions in reusable transformations.But we can add sources to a maplet.
Whole transformation logic will be hided in case of maplet.But it is transparent in case of reusable transformation.
We cant use COBOL source qualifier,joiner,normalizer transformations in maplet.Where as we can make them as a reusable transformations.
single transformation that can be reusable.
If u create a variables or parameters in maplet that can not be used in another maping or maplet.Unlike the variables that r created in a reusable transformation can be usefull in any other maping or maplet.
We can not include source definitions in reusable transformations.But we can add sources to a maplet.
Whole transformation logic will be hided in case of maplet.But it is transparent in case of reusable transformation.
We cant use COBOL source qualifier,joiner,normalizer transformations in maplet.Where as we can make them as a reusable transformations.
Maplet: one or more transformations
Reusable transformation: only one transformation
Why we
use partitioning the session in informatica?
Partitioning achieves the session performance by reducing the time
period of reading the source and loading the data into target.
Performance can be
improved by processing data in parallel in a single session by creating
multiple partitions of the pipeline.
Informatica server can
achieve high performance by partitioning the pipleline and performing the
extract , transformation, and load for each partition in parallel
What are the mapping parameters and mapping variables?
Mapping parameter represents a constant value that U can
define before running a session.
A mapping parameter retains the same value throughout the
entire session.
When u use the mapping parameter, U declare and use the parameter in a mapping or mapplet. Then define the value of parameter in a parameter file for the session.
Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The informatica server saves the value of mapping variable to the repository at the end of session run and uses that value next time U run the session.
When u use the mapping parameter, U declare and use the parameter in a mapping or mapplet. Then define the value of parameter in a parameter file for the session.
Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The informatica server saves the value of mapping variable to the repository at the end of session run and uses that value next time U run the session.
Can you use the mapping parameters or variables created in one mapping
into another mapping?
NO.
We can use mapping parameters or variables in any transformation of the same mappings or mapplet in which U have created mapping parameters or variables.
We can use mapping parameters or variables in any transformation of the same mappings or mapplet in which U have created mapping parameters or variables.
Can you use the mapping
parameters or variables created in one mapping into any other reusable
transformation?
Yes. Because reusable transformation is not contained
with any maplet or mapping.
What is aggregate cache in aggregator transformation?
The aggregator stores data in the aggregate cache until
it completes aggregate calculations. When u run a session that uses an aggregator
transformation, the informatica server creates index and data caches in memory
to process the transformation. If the informatica server requires more space,
it stores overflow values in cache files.
What r the difference between joiner transformation and source
qualifier transformation?
U can join heterogeneous data sources in joiner
transformation, which we cannot achieve in source qualifier transformation.
U need matching keys to join two relational sources in source qualifier transformation. Where as u doesn’t need matching keys to join two sources.
Two relational sources should come from same datasource in source qualifier. U can join relational sources which r coming from different sources also.
U need matching keys to join two relational sources in source qualifier transformation. Where as u doesn’t need matching keys to join two sources.
Two relational sources should come from same datasource in source qualifier. U can join relational sources which r coming from different sources also.
What r the settings that u use to configure the joiner transformation?
Master and detail source
Type of join
Condition of the join
Type of join
Condition of the join
What r the join types in joiner transformation?
Master outer
Detail outer
Full outer
What r the joiner caches?
When a Joiner transformation occurs in a session, the
Informatica Server reads all the records from the master source and builds
index and data caches based on the master rows.
After building the caches, the Joiner transformation reads records from the detail source and performs joins.
After building the caches, the Joiner transformation reads records from the detail source and performs joins.
What is the look up transformation?
Use lookup transformation in u’r mapping to lookup data
in a relational table, view, and synonym.
Informatica server queries the look up table based on the lookup ports in the transformation. It compares the lookup transformation port values to lookup table column values based on the look up condition.
Informatica server queries the look up table based on the lookup ports in the transformation. It compares the lookup transformation port values to lookup table column values based on the look up condition.
Why use the lookup transformation?
To perform the following tasks.
Get a related value. For example, if your source table includes employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.
Get a related value. For example, if your source table includes employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.
Differences between connected and unconnected lookup?
Connected lookup
|
Unconnected
lookup
|
Receives
input values directly from the pipe line. |
Receives
input values from the result of a lkp expression in a another
transformation. |
U can use
a dynamic or static cache |
U
can use a static cache. |
Cache
includes all lookup columns used in the mapping |
Cache
includes all lookup out put ports in the lookup condition and the
lookup/return port. |
Support
user defined default values |
Does
not support user defined default values |
What is meant by lookup caches?
The informatica server builds a cache in memory when it
processes the first row of a data in a cached look up transformation. It
allocates memory for the cache based on the amount u configure in the
transformation or session properties. The informatica server stores condition
values in the index cache and output values in the data cache.
What r the types of lookup caches?
Persistent cache: U can save the lookup cache files and
reuse them the next time the informatica server processes a lookup
transformation configured to use the cache.
Recache from database: If the persistent cache is not synchronized with he lookup table,U can configure the lookup transformation to rebuild the lookup cache.
Static cache: U can configure a static or readonly cache for only lookup table.By default informatica server creates a static cache.It caches the lookup table and lookup values in the cache for each row that comes into the transformation.when the lookup condition is true,the informatica server does not update the cache while it prosesses the lookup transformation.
Dynamic cache: If u want to cache the target table and insert new rows into cache and the target,u can create a look up transformation to use dynamic cache.The informatica server dynamically inerts data to the target table.
shared cache: U can share the lookup cache between multiple transactions.U can share unnamed cache between transformations inthe same maping.
Recache from database: If the persistent cache is not synchronized with he lookup table,U can configure the lookup transformation to rebuild the lookup cache.
Static cache: U can configure a static or readonly cache for only lookup table.By default informatica server creates a static cache.It caches the lookup table and lookup values in the cache for each row that comes into the transformation.when the lookup condition is true,the informatica server does not update the cache while it prosesses the lookup transformation.
Dynamic cache: If u want to cache the target table and insert new rows into cache and the target,u can create a look up transformation to use dynamic cache.The informatica server dynamically inerts data to the target table.
shared cache: U can share the lookup cache between multiple transactions.U can share unnamed cache between transformations inthe same maping.
Difference between static cache and dynamic cache
Static cache
|
Dynamic
cache
|
U can not
insert or update the cache |
U
can insert rows into the cache as u pass to the target |
The
informatic server returns a value from the lookup table or cache when the
condition is true.When the condition is not true, informatica server returns
the default value for connected transformations and null for unconnected
transformations. |
The
informatic server inserts rows into cache when the condition is false.This
indicates that the the row is not in the cache or target table. U can pass
these rows to the target table |
Which transformation should we use to normalize the COBOL and relational
sources?
Normalizer Transformation.
When U drag the COBOL source in to the mapping Designer workspace, the normalizer transformation automatically appears, creating input and output ports for every column in the source.
When U drag the COBOL source in to the mapping Designer workspace, the normalizer transformation automatically appears, creating input and output ports for every column in the source.
How the informatica server sorts the string values in
Ranktransformation?
When the informatica server runs in the ASCII data
movement mode it sorts session data using Binary sortorder.If U configure the
seeion to use a binary sort order,the informatica server caluculates the binary
value of each string and returns the specified number of rows with the higest
binary values for the string.
What r the rank caches?
During the session, the informatica server compares an
inout row with rows in the datacache.If the input row out-ranks a stored
row,the informatica server replaces the stored row with the input row.The
informatica server stores group information in an index cache and row data in a
data cache.
What is the Rankindex in Ranktransformation?
The Designer automatically creates a RANKINDEX port for
each Rank transformation. The Informatica Server uses the Rank Index port to
store the ranking position for each record in a group. For example, if you
create a Rank transformation that ranks the top 5 salespersons for each
quarter, the rank index numbers the salespeople from 1 to 5:
What is the Router transformation?
A Router transformation is similar to a Filter
transformation because both transformations allow you to use a condition to
test data. However, a Filter transformation tests data for one condition and
drops the rows of data that do not meet the condition. A Router transformation
tests data for one or more conditions and gives you the option to route rows of
data that do not meet any of the conditions to a default output group.
If you need to test the same input data based on multiple conditions, use a Router Transformation in a mapping instead of creating multiple Filter transformations to perform the same task.
If you need to test the same input data based on multiple conditions, use a Router Transformation in a mapping instead of creating multiple Filter transformations to perform the same task.
What r the types of groups in Router transformation?
Input group Output group
The designer copies property information from the input ports of the input group to create a set of output ports for each output group.
Two types of output groups
User defined groups
Default group
U can not modify or delete default groups.
The designer copies property information from the input ports of the input group to create a set of output ports for each output group.
Two types of output groups
User defined groups
Default group
U can not modify or delete default groups.
What r the types of data that passes between informatica server and
stored procedure?
3 types of data
Input/Out put parameters
Return Values
Status code.
Input/Out put parameters
Return Values
Status code.
What is the status code?
Status code provides error handling for the informatica
server during the session.The stored procedure issues a status code that
notifies whether or not stored procedure completed sucessfully.This value can
not seen by the user.It only used by the informatica server to determine
whether to continue running the session or stop.
What is source qualifier transformation?
When U add a relational or a flat file source definition
to a maping,U need to connect it to
a source qualifer transformation.The source qualifier transformation represnets the records
that the informatica server reads when it runs a session.
a source qualifer transformation.The source qualifier transformation represnets the records
that the informatica server reads when it runs a session.
What r the tasks that source qualifier performs?
Join data originating from same source data base.
Filter records when the informatica server reads source data.
Specify an outer join rather than the default inner join
specify sorted records.
Select only distinct values from the source.
Creating custom query to issue a special SELECT statement for the informatica server to read
source data.
Filter records when the informatica server reads source data.
Specify an outer join rather than the default inner join
specify sorted records.
Select only distinct values from the source.
Creating custom query to issue a special SELECT statement for the informatica server to read
source data.
What is the target load order?
U specify the target loadorder based on source qualifiers
in a maping.If u have the multiple
source qualifiers connected to the multiple targets,U can designatethe order in which informatica
server loads data into the targets.
source qualifiers connected to the multiple targets,U can designatethe order in which informatica
server loads data into the targets.
What r the basic needs to join two sources in a source qualifier?
Two sources should have primary and Foreign key relation
ships.
Two sources should have matching data types.
Two sources should have matching data types.
Describe two levels in which update strategy transformation sets?
Within a session. When you configure a session, you can
instruct the Informatica Server to either treat all records in the same way
(for example, treat all records as inserts), or use instructions coded into the
session mapping to flag records for different database operations.
Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.
Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.
What is Datadriven?
The informatica server follows instructions coded into
update strategy transformations with in the session maping determine how to
flag records for insert, update, delete or reject. If u do not choose data
driven option setting,the informatica server ignores all update strategy
transformations in the mapping.
What r the options in the target session of update strategy
transsformatioin?
Insert
Delete
Update
Update as update
Update as insert
Update esle insert
Truncate table
Delete
Update
Update as update
Update as insert
Update esle insert
Truncate table
What r the types of maping wizards that r to be provided in
Informatica?
The Designer provides two mapping wizards to help you
create mappings quickly and easily. Both wizards are designed to create
mappings for loading and maintaining star schemas, a series of dimensions
related to a central fact table.
Getting Started Wizard. Creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard. Creates mappings to load slowly changing dimension tables based on the amount of historical dimension data you want to keep and the method you choose to handle historical dimension data.
Getting Started Wizard. Creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard. Creates mappings to load slowly changing dimension tables based on the amount of historical dimension data you want to keep and the method you choose to handle historical dimension data.
What r the types of maping in Getting Started Wizard?
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing target :
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.
What r the mapings that we use for slowly changing dimension table?
Type1: Rows containing changes to existing dimensions are
updated in the target by overwriting the existing dimension. In the Type 1
Dimension mapping, all rows contain current dimension data.
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updates
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updates
What r the different types of Type2 dimension maping?
Type2 Dimension/Version Data Maping: In this maping the
updated dimension in the source will gets inserted in target along with a new
version number.And newly added dimension
in source will inserted into target with a primary key.
Type2 Dimension/Flag current Maping: This maping is also used for slowly changing dimensions.In addition it creates a flag value for changed or new dimension.
Flag indiactes the dimension is new or newlyupdated.Recent dimensions will gets saved with cuurent flag value 1. And updated dimensions r saved with the value 0.
Type2 Dimension/Effective Date Range Maping: This is also one flavour of Type2 maping used for slowly changing dimensions.This maping also inserts both new and changed dimensions in to the target.And changes r tracked by the effective date range for each version of each dimension.
in source will inserted into target with a primary key.
Type2 Dimension/Flag current Maping: This maping is also used for slowly changing dimensions.In addition it creates a flag value for changed or new dimension.
Flag indiactes the dimension is new or newlyupdated.Recent dimensions will gets saved with cuurent flag value 1. And updated dimensions r saved with the value 0.
Type2 Dimension/Effective Date Range Maping: This is also one flavour of Type2 maping used for slowly changing dimensions.This maping also inserts both new and changed dimensions in to the target.And changes r tracked by the effective date range for each version of each dimension.
How can u recognise whether or not the newly added rows in the source r
gets insert in the target ?
In the Type2 maping we have three options to recognise
the newly added rows
Version number
Flagvalue
Effective date Range
Version number
Flagvalue
Effective date Range
What r two types of processes that informatica runs the session?
Load manager Process: Starts the session, creates the DTM
process, and sends post-session email when the session completes.
The DTM process. Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
The DTM process. Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
What r the new features of the server manager in the informatica 5.0?
can use command line arguments for a session or
batch.This allows U to change the values of session parameters,and mapping
parameters and maping variables.
Parallel data processing: This feature is available for powercenter only.If we use the informatica server on a SMP system,U can use multiple CPU’s to process a session concurently.
Process session data using threads: Informatica server runs the session in two processes.Explained in previous question.
Parallel data processing: This feature is available for powercenter only.If we use the informatica server on a SMP system,U can use multiple CPU’s to process a session concurently.
Process session data using threads: Informatica server runs the session in two processes.Explained in previous question.
Can u generate reports in Informatcia?
Yes. By using Metadata reporter we can generate reports
in informatica.
What is metadata reporter?
It is a web based application that enables you to run
reports againist repository metadata.
With a meta data reporter,u can access information about U’r repository with out having knowledge of sql,transformation language or underlying tables in the repository.
With a meta data reporter,u can access information about U’r repository with out having knowledge of sql,transformation language or underlying tables in the repository.
Define maping and sessions?
Maping: It is a set of source and target definitions
linked by transformation objects that define the rules for transformation.
Session : It is a set of instructions that describe how and when to move data from source to targets.
Session : It is a set of instructions that describe how and when to move data from source to targets.
Which tool U use to create and manage sessions and batches and to
monitor and stop the informatica server
Informatica server manager
Why we use partitioning the session in informatica?
Partitioning achieves the session performance by reducing
the time period of reading the source and loading the data into target.
To achieve the session partition what r the necessary tasks u have to
do?
Configure the session to partition source data.
Install the informatica server on a machine with multiple CPU’s
Install the informatica server on a machine with multiple CPU’s
How the informatica server increases the session performance through
partitioning the source?
For a relational sources informatica server creates
multiple connections for each parttion of a single source and extracts seperate
range of data for each connection.Informatica server reads multiple partitions
of a single source concurently.Similarly for loading also informatica server
creates multiple connections to the target and loads partitions of data
concurently.
For XML and file sources,informatica server reads multiple files concurently.For loading the data informatica server creates a seperate file for each partition(of a source file).U can choose to merge the targets.
For XML and file sources,informatica server reads multiple files concurently.For loading the data informatica server creates a seperate file for each partition(of a source file).U can choose to merge the targets.
Why u use repository connectivity?
When u edit,schedule the sesion each time,informatica
server directly communicates the repository to check whether or not the session
and users r valid.All the metadata of sessions and mappings will be stored in
repository.
What is DTM process?
After the loadmanger performs validations for session,it
creates the DTM process.DTM is to create and manage the threads that carry out
the session tasks.I creates the master thread.Master thread creates and manges
all the other threads.
What r the different threads in DTM process?
Master thread: Creates and manages all other threads
Maping thread: One maping thread will be creates for each session.Fectchs session and maping information.
Pre and post session threads: This will be created to perform pre and post session operations.
Reader thread: One thread will be created for each partition of a source.It reads data from source.
Writer thread: It will be created to load data to the target.
Transformation thread: It will be created to tranform data.
Maping thread: One maping thread will be creates for each session.Fectchs session and maping information.
Pre and post session threads: This will be created to perform pre and post session operations.
Reader thread: One thread will be created for each partition of a source.It reads data from source.
Writer thread: It will be created to load data to the target.
Transformation thread: It will be created to tranform data.
What r the data movement modes in informatcia?
Datamovement modes determines how informatcia server
handles the charector data.U choose the datamovement in the informatica server
configuration settings.Two types of datamovement modes avialable in
informatica.
ASCII mode
Uni code mode
ASCII mode
Uni code mode
What r the out put files that the informatica server creates during the
session running?
Informatica server log: Informatica server(on unix)
creates a log for all status and error messages(default name: pm.server.log).It
also creates an error log for error messages.These files will be created in
informatica home directory.
Session log file: Informatica server creates session log file for each session.It writes information about session into log files such as initialization process,creation of sql commands for reader and writer threads,errors encountered and load summary.The amount of detail in session log file depends on the tracing level that u set.
Session detail file: This file contains load statistics for each targets in mapping.Session detail include information such as table name,number of rows written or rejected.U can view this file by double clicking on the session in monitor window
Performance detail file: This file contains information known as session performance details which helps U where performance can be improved.To genarate this file select the performance detail option in the session property sheet.
Reject file: This file contains the rows of data that the writer does notwrite to targets.
Control file: Informatica server creates control file and a target file when U run a session that uses the external loader.The control file contains the information about the target flat file such as data format and loading instructios for the external loader.
Post session email: Post session email allows U to automatically communicate information about a session run to designated recipents.U can create two different messages.One if the session completed sucessfully the other if the session fails.
Indicator file: If u use the flat file as a target,U can configure the informatica server to create indicator file.For each target row,the indicator file contains a number to indicate whether the row was marked for insert,update,delete or reject.
output file: If session writes to a target file,the informatica server creates the target file based on file prpoerties entered in the session property sheet.
Cache files: When the informatica server creates memory cache it also creates cache files.For the following circumstances informatica server creates index and datacache files.
Aggreagtor transformation
Joiner transformation
Rank transformation
Lookup transformation
Session log file: Informatica server creates session log file for each session.It writes information about session into log files such as initialization process,creation of sql commands for reader and writer threads,errors encountered and load summary.The amount of detail in session log file depends on the tracing level that u set.
Session detail file: This file contains load statistics for each targets in mapping.Session detail include information such as table name,number of rows written or rejected.U can view this file by double clicking on the session in monitor window
Performance detail file: This file contains information known as session performance details which helps U where performance can be improved.To genarate this file select the performance detail option in the session property sheet.
Reject file: This file contains the rows of data that the writer does notwrite to targets.
Control file: Informatica server creates control file and a target file when U run a session that uses the external loader.The control file contains the information about the target flat file such as data format and loading instructios for the external loader.
Post session email: Post session email allows U to automatically communicate information about a session run to designated recipents.U can create two different messages.One if the session completed sucessfully the other if the session fails.
Indicator file: If u use the flat file as a target,U can configure the informatica server to create indicator file.For each target row,the indicator file contains a number to indicate whether the row was marked for insert,update,delete or reject.
output file: If session writes to a target file,the informatica server creates the target file based on file prpoerties entered in the session property sheet.
Cache files: When the informatica server creates memory cache it also creates cache files.For the following circumstances informatica server creates index and datacache files.
Aggreagtor transformation
Joiner transformation
Rank transformation
Lookup transformation
In which circumstances that informatica server creates Reject files?
When it encounters the DD_Reject in update strategy
transformation.
Violates database constraint
Filed in the rows was truncated or overflowed.
Violates database constraint
Filed in the rows was truncated or overflowed.
What is polling?
It displays the updated information about the session in
the monitor window. The monitor window displays the status of each session when
U poll the informatica server
Can u copy the session to a different folder or repository?
Yes. By using copy session wizard u can copy a session in
a different folder or repository.But that
target folder or repository should consists of mapping of that session.
If target folder or repository is not having the maping of copying session ,
u should have to copy that maping first before u copy the session
target folder or repository should consists of mapping of that session.
If target folder or repository is not having the maping of copying session ,
u should have to copy that maping first before u copy the session
What is batch and describe about types of batches?
Grouping of session is known as batch.Batches r two types
Sequential: Runs sessions one after the other
Concurrent: Runs session at same time.
If u have sessions with source-target dependencies u have to go for sequential batch to start the
sessions one after another.If u have several independent sessions u can use concurrent batches.
Whch runs all the sessions at the same time.
Sequential: Runs sessions one after the other
Concurrent: Runs session at same time.
If u have sessions with source-target dependencies u have to go for sequential batch to start the
sessions one after another.If u have several independent sessions u can use concurrent batches.
Whch runs all the sessions at the same time.
What r the session parameters?
Session parameters r like maping parameters,represent
values U might want to change between
sessions such as database connections or source files.
Server manager also allows U to create userdefined session parameters.Following r user defined
session parameters.
Database connections
Source file names: use this parameter when u want to change the name or location of
session source file between session runs
Target file name : Use this parameter when u want to change the name or location of
session target file between session runs.
Reject file name : Use this parameter when u want to change the name or location of
session reject files between session runs.
sessions such as database connections or source files.
Server manager also allows U to create userdefined session parameters.Following r user defined
session parameters.
Database connections
Source file names: use this parameter when u want to change the name or location of
session source file between session runs
Target file name : Use this parameter when u want to change the name or location of
session target file between session runs.
Reject file name : Use this parameter when u want to change the name or location of
session reject files between session runs.
What is parameter file?
| |
|
Parameter file is to define the values for parameters
and variables used in a session.A parameter
file is a file created by text editor such as word pad or notepad. U can define the following values in parameter file Maping parameters Maping variables session parameters |
How can u access the remote source into U’r session?
Relational source: To acess relational source which is
situated in a remote place ,u need to
configure database connection to the datasource.
FileSource : To access the remote source file U must configure the FTP connection to the
host machine before u create the session.
Hetrogenous : When U’r maping contains more than one source type,the server manager creates
a hetrogenous session that displays source options for all types.
configure database connection to the datasource.
FileSource : To access the remote source file U must configure the FTP connection to the
host machine before u create the session.
Hetrogenous : When U’r maping contains more than one source type,the server manager creates
a hetrogenous session that displays source options for all types.
What is difference between partioning of relatonal target and
partitioning of file targets?
If u parttion a session with a relational target
informatica server creates multiple connections
to the target database to write target data concurently.If u partition a session with a file target
the informatica server creates one target file for each partition.U can configure session properties
to merge these target files.
to the target database to write target data concurently.If u partition a session with a file target
the informatica server creates one target file for each partition.U can configure session properties
to merge these target files.
what r the transformations that restricts the partitioning of sessions?
Advanced External procedure tranformation and External
procedure transformation: This
transformation contains a check box on the properties tab to allow partitioning.
Aggregator Transformation: If u use sorted ports u can not parttion the assosiated source
Joiner Transformation : U can not partition the master source for a joiner transformation
Normalizer Transformation
XML targets.
transformation contains a check box on the properties tab to allow partitioning.
Aggregator Transformation: If u use sorted ports u can not parttion the assosiated source
Joiner Transformation : U can not partition the master source for a joiner transformation
Normalizer Transformation
XML targets.
Performance tuning in Informatica?
The goal of performance tuning is optimize session
performance so sessions run during the available load window for the
Informatica Server.Increase the session performance by following.
The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Thus network connections ofteny affect on session performance.So aviod netwrok connections.
Flat files: If u’r flat files stored on a machine other than the informatca server, move those files to the machine that consists of informatica server.
Relational datasources: Minimize the connections to sources ,targets and informatica server to
improve session performance.Moving target database into server system may improve session
performance.
Staging areas: If u use staging areas u force informatica server to perform multiple datapasses.
Removing of staging areas may improve session performance.
U can run the multiple informatica servers againist the same repository.Distibuting the session load to multiple informatica servers may improve session performance.
Run the informatica server in ASCII datamovement mode improves the session performance.Because ASCII datamovement mode stores a character value in one byte.Unicode mode takes 2 bytes to store a character.
If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single table select statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.
We can improve the session performance by configuring the network packet size,which allows data to cross the network at one time.To do this go to server manger ,choose server configure database connections.
If u r target consists key constraints and indexes u slow the loading of data.To improve the session performance in this case drop constraints and indexes before u run the session and rebuild them after completion of session.
Running a parallel sessions by using concurrent batches will also reduce the time of loading the
data.So concurent batches may also increase the session performance.
Partittionig the session improves the session performance by creating multiple connections to sources and targets and loads data in paralel pipe lines.
In some cases if a session contains a aggregator transformation ,u can use incremental aggregation to improve session performance.
Aviod transformation errors to improve the session performance.
If the sessioin containd lookup transformation u can improve the session performance by enabling the look up cache.
If U’r session contains filter transformation ,create that filter transformation nearer to the sources
or u can use filter condition in source qualifier.
Aggreagator,Rank and joiner transformation may oftenly decrease the session performance .Because they must group data before processing it.To improve session performance in this case use sorted ports option.
The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Thus network connections ofteny affect on session performance.So aviod netwrok connections.
Flat files: If u’r flat files stored on a machine other than the informatca server, move those files to the machine that consists of informatica server.
Relational datasources: Minimize the connections to sources ,targets and informatica server to
improve session performance.Moving target database into server system may improve session
performance.
Staging areas: If u use staging areas u force informatica server to perform multiple datapasses.
Removing of staging areas may improve session performance.
U can run the multiple informatica servers againist the same repository.Distibuting the session load to multiple informatica servers may improve session performance.
Run the informatica server in ASCII datamovement mode improves the session performance.Because ASCII datamovement mode stores a character value in one byte.Unicode mode takes 2 bytes to store a character.
If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single table select statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.
We can improve the session performance by configuring the network packet size,which allows data to cross the network at one time.To do this go to server manger ,choose server configure database connections.
If u r target consists key constraints and indexes u slow the loading of data.To improve the session performance in this case drop constraints and indexes before u run the session and rebuild them after completion of session.
Running a parallel sessions by using concurrent batches will also reduce the time of loading the
data.So concurent batches may also increase the session performance.
Partittionig the session improves the session performance by creating multiple connections to sources and targets and loads data in paralel pipe lines.
In some cases if a session contains a aggregator transformation ,u can use incremental aggregation to improve session performance.
Aviod transformation errors to improve the session performance.
If the sessioin containd lookup transformation u can improve the session performance by enabling the look up cache.
If U’r session contains filter transformation ,create that filter transformation nearer to the sources
or u can use filter condition in source qualifier.
Aggreagator,Rank and joiner transformation may oftenly decrease the session performance .Because they must group data before processing it.To improve session performance in this case use sorted ports option.
What is difference between maplet and reusable transformation?
Maplet consists of set of transformations that is
reusable.A reusable transformation is a
single transformation that can be reusable.
If u create a variables or parameters in maplet that can not be used in another maping or maplet.Unlike the variables that r created in a reusable transformation can be usefull in any other maping or maplet.
We can not include source definitions in reusable transformations.But we can add sources to a maplet.
Whole transformation logic will be hided in case of maplet.But it is transparent in case of reusable transformation.
We cant use COBOL source qualifier,joiner,normalizer transformations in maplet.Where as we can make them as a reusable transformations.
single transformation that can be reusable.
If u create a variables or parameters in maplet that can not be used in another maping or maplet.Unlike the variables that r created in a reusable transformation can be usefull in any other maping or maplet.
We can not include source definitions in reusable transformations.But we can add sources to a maplet.
Whole transformation logic will be hided in case of maplet.But it is transparent in case of reusable transformation.
We cant use COBOL source qualifier,joiner,normalizer transformations in maplet.Where as we can make them as a reusable transformations.
Define informatica repository?
The Informatica repository is a relational database that
stores information, or metadata, used by the Informatica Server and Client
tools. Metadata can include information such as mappings describing how to
transform source data, sessions indicating when you want the Informatica Server
to perform the transformations, and connect strings for sources and targets.
The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.
Use repository manager to create the repository.The Repository Manager connects to the repository database and runs the code needed to create the repository tables.Thsea tables
stores metadata in specific format the informatica server,client tools use.
The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.
Use repository manager to create the repository.The Repository Manager connects to the repository database and runs the code needed to create the repository tables.Thsea tables
stores metadata in specific format the informatica server,client tools use.
How can u work with remote database in informatica?did u work directly
by using remote
connections?
To work with remote datasource u need to connect it with
remote connections.But it is not
preferable to work with that remote source directly by using remote connections .Instead u bring that source into U r local machine where informatica server resides.If u work directly with remote source the session performance will decreases by passing less amount of data across the network in a particular time.
preferable to work with that remote source directly by using remote connections .Instead u bring that source into U r local machine where informatica server resides.If u work directly with remote source the session performance will decreases by passing less amount of data across the network in a particular time.
what is incremantal aggregation?
When using incremental aggregation, you apply captured
changes in the source to aggregate calculations in a session. If the source
changes only incrementally and you can capture changes, you can configure the
session to process only those changes. This allows the Informatica Server to
update your target incrementally, rather than forcing it to process the entire
source and recalculate the same calculations each time you run the session
What is tracing level and what r the types of tracing level?
Tracing level represents the amount of information that
informatcia server writes in a log file.
Types of tracing level
Normal
Verbose
Verbose init
Verbose data
Types of tracing level
Normal
Verbose
Verbose init
Verbose data
If a session fails after loading of 10,000 records in to the target.How
can u load the records from 10001 th record when u run the session next time?
informatcia server has 3 methods to recovering the
sessions.Use performing recovery to load the records from where the session
fails.
Explain about perform recovery?
When the Informatica Server starts a recovery session, it
reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed
to the target database. The Informatica Server then reads all sources again and
starts processing from the next row ID. For example, if the Informatica Server
commits 10,000 rows before the session fails, when you run recovery, the
Informatica Server bypasses the rows up to 10,000 and starts loading with row
10,001.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.
What is Data cleansing..?
The process of finding and removing or correcting data
that is incorrect, out-of-date, redundant, incomplete, or formatted
incorrectly.
ETL
What is a staging area? Do we need it? What is the purpose of a staging
area?
Staging area is place where you hold temporary tables on
data warehouse server. Staging tables are connected to work area or fact
tables. We basically need staging area to hold the data , and perform data cleansing
and merging , before loading the data into warehouse.
What is a three tier data warehouse?
Three tier data warehouse contains three tier
such as bottom tier,middle tier and top tier.
Bottom tier deals with retrieving related datas or information from various information repositories by using SQL.
Middle tier contains two types of servers.
1.ROLAP server
2.MOLAP server
Top tier deals with presentation or visualization of the results .
Bottom tier deals with retrieving related datas or information from various information repositories by using SQL.
Middle tier contains two types of servers.
1.ROLAP server
2.MOLAP server
Top tier deals with presentation or visualization of the results .
General
What is a Data Warehousing
Data Warehouse is a repository of integrated information,
available for queries and analysis. Data and information are extracted from
heterogeneous sources as they are generated....This makes it much easier and
more efficient to run queries over data that originally came from different
sources.
What are Data Marts
Data Mart is a segment of a data warehouse that can
provide data for reporting and analysis on a section, unit, department or
operation in the company, e.g. sales, payroll, production. Data marts are sometimes
complete individual data warehouses which are usually smaller than the
corporate data warehouse.
What is a Star Schema
A relational database schema organized around a central
table (fact table) joined to a few smaller tables (dimension tables) using
foreign key references. The fact table contains raw numeric items that
represent relevant business facts (price, discount values, number of units
sold, dollar value, etc.)
What is Dimensional Modelling
In Dimensional Modeling, Data is stored in
two kinds of tables: Fact Tables and Dimension tables.
Fact Table contains fact data e.g. sales, revenue, profit etc.....
Dimension table contains dimensional data such as Product Id, product name, product description etc.....
Fact Table contains fact data e.g. sales, revenue, profit etc.....
Dimension table contains dimensional data such as Product Id, product name, product description etc.....
What Snow Flake Schema
Snowflake schemas normalize dimensions to eliminate
redundancy. That is, the dimension data has been grouped into multiple tables
instead of one large table. For example, a product dimension table in a star
schema might be normalized into a products table, a product_category table, and
a product_manufacturer table in a snowflake schema. While this saves space, it
increases the number of dimension tables and requires more foreign key joins.
The result is more complex queries and reduced query performance
What is the Difference between OLTP and OLAP
OLTP
Current data
Short database transactions
Online update/insert/delete
Normalization is promoted
High volume transactions
Transaction recovery is necessary
OLAP
Current and historical data
Long database transactions
Batch update/insert/delete
Denormalization is promoted
Low volume transactions
Transaction recovery is not necessary
Current data
Short database transactions
Online update/insert/delete
Normalization is promoted
High volume transactions
Transaction recovery is necessary
OLAP
Current and historical data
Long database transactions
Batch update/insert/delete
Denormalization is promoted
Low volume transactions
Transaction recovery is not necessary
What is Fact table
table in a data warehouse whose entries describe data
in a fact table. Dimension tables contain the data from which dimensions are
created.
|
What is data mining
Data mining is a process of extracting hidden trends
within a datawarehouse. For example an insurance dataware house can be used to
mine data for the most high risk people to insure in a certain geographial
area.
What is ODS
ODS
stands for Online Data Storage.
It is used to maintain, store the current and up to date information and the transactions regarding the source databases taken from the OLTP system.
It is directly connected to the source database systems instead of to the staging area.
It is further connected to data warehouse and moreover can be treated as a part of the data warehouse database.
It is used to maintain, store the current and up to date information and the transactions regarding the source databases taken from the OLTP system.
It is directly connected to the source database systems instead of to the staging area.
It is further connected to data warehouse and moreover can be treated as a part of the data warehouse database.
What is a level of Granularity of a fact table
Level of granularity means level of detail
that you put into the fact table in a data warehouse. For example: Based on
design you can decide to put the sales data in each transaction. Now, level of
granularity would mean what detail are you willing to put for each
transactional fact. Product sales with respect to each minute or you want to
aggregate it upto minute and put that data
Why are OLTP database designs not generally a good idea for a Data
Warehouse
OLTP cannot store historical information about the
organization. It is used for storing the details of daily transactions while a
datawarehouse is a huge storage of historical information obtained from
different datamarts for making intelligent decisions about the organization
=====
what
are mapping parameters and varibles in which ...
|
Mapping parameters have a constant value through out the session whereas in mapping variable the values change and the informatica server saves the values in the repository and uses next time when u run the session. |
what
is difference between lookup cashe and unchas...
|
the
difference between cache and uncacheed lookup iswhen you configure the lookup
transformation cache lookup it stores all the lookup table data in the cache
when the first input record enter into the lookup transformation, in cache
lookup the select statement executes only once and compares the values of the
input record with the values in the cachebut in uncache lookup the the select
statement executes for each input record entering into the lookup
transformation and it has to connect to database each time entering the new
record
|
what
are partition points?
|
Partition points mark the thread boundaries in a source pipeline and
divide the pipeline into stages. |
With
out using Updatestretagy and sessons options,...
|
By default all the rows
in the session is set as insert flag ,you can change it in the session
general properties -- Treate source rows as :update so, all the incoming rows will be set with update flag.now you can update the rows in the target table |
What
is the procedure or steps implementing versio...
|
Hi, For version control in ETL layer using informatica, first of all after doing anything in your designer mode or workflow manager, do the following steps..... 1> First save the changes or new implementations. 2>Then from navigator window, right click on the specific object you are currently in. There will be a pop up window. In that window at the lower end side, you will find versioning->Check In. A window will be opened. Leave the information you have done like "modified this mapping" etc. Then click ok button. OK......Still there is also another shortcut method for this...But I'll give that some other time |
If
you have four lookup tables in the workflow. Ho...
|
there r many ways to improve the mapping which has multiple lookups. 1) we can create an index for the lookup table if we have permissions(staging area). 2) divide the lookup mapping into two (a) dedicate one for insert means: source - target,, these r new rows . only the new rows will come to mapping and the process will be fast . (b) Dedicate the second one to update: source=target, these r existing rows. only the rows which exists already will come into the mapping. 3)we can increase the cache size of the lookup. |
When
a workflow has multiple lookup tables use shared cache
In
a sequential Batch how can we stop single sess...
|
we have a task called wait event using that we can stop. we start using raise event. this is as per my knowledge. |
hi
all, recently i was asked by the interviewer li...
|
u can do onething after running the mapping,, in workflow manager start-------->session. Right click on the session u will get a menu, in that go for persistent values, there u will find the last value stored in the repository regarding to mapping variable. then remove it and put |
|
U can also define the value of the variable in the parameter file. Power
center server searches the value of the variable first in parameter file then
searches for the value stored in repository. |
it
takes value of 51 but u can override the saved variable in the repository by
defining the value in the parameter file. if there is a parameter file for the
mapping variable it uses the value in the parameter file not the value+1 in the
repository for example assign the value of the mapping variable as 70.in other
words higher preference is given to the value in the parameter file
How
to use mapping parameters and what is their us...
|
Mapping
parameters and variables make the use of mappings more flexible. And also it
avoids creating of multiple mappings. It helps in adding incremental data.
Mapping parameters and variables has to create in the mapping designer by
choosing the menu option as Mapping ----> parameters and variables and the
enter the name for the variable or parameter but it has to be preceded by $$.
and choose type as parameter/variable, datatypeonce defined the
variable/parameter is in the any expression for example in SQ transformation
in the source filter properties tab. just enter filter condition and finally
create a parameter file to assign the value for the variable / parameter and configure
the session properties. however the final step is optional. if the parameter
is not present it uses the initial value which is assigned at the time of
creating the variable
|
What
is the use of incremental aggregation? Explain...
|
Its a session option. When the informatica server performs incremental
aggr. it passes new source data through the mapping and uses historical cache
data to perform new aggregation calculations incrementally. For performance
we will use it. |
:
How to lookup the data on multiple tables.
|
If
u want to lookup data on multiple tables at a time u can do one thing join
the tables which u want then lookup that joined table. Informatica provides
lookup on joined tables, hats off to informatica.
|
How
to retrieve the records from a rejected file. e...
|
During
the execution of workflow all the rejected rows will be stored in bad
files(where your informatica server get installed:\Program Files\Informatica Power
Center 7.1\Server) These bad files can be imported as flat a file in source
then thro' direct mapping we can load these files in desired format.
|
What
are variable ports and list two situations wh...
|
variable port is used to break the complex expression into simpler and also it is used to store intermediate values |
What
is difference between IIF and DECODE function...
|
u
can use decode in conditioning coloumns also while we cann't use iff but
u can use case. but by using decode retrieveing data is quick
|
which
is better among connected lookup and unconne...
|
Its not a easy question to say which is better out of connected,
unconnected lookups. Its depends upon our experience and upon the requirement. When you compared both basically connected lookup will return more values and unconnected returns one value. |
If you are having defined source you can use connected,
source is not well defined or from different database you can go for
unconnected
We are using like that only
think
the better one is connected look up. beacaz we can use dynamic cache with it ,,
also connected loop up can send multiple columns in a single row, where as
unconnected is concerned it has a single return port.(in case of etl
informatica is concerned)
We
are using Update Strategy Transformation in map...
|
Operation
Constant
Numeric value Insert DD_INSERT 0 Update DD_UPDATE 1 Delete DD_DELETE 2 Reject DD_REJECT 3 |
Suppose
session is configured with commit interval...
|
Source based commit will commit the data into target based on commit
interval.so,for every 10,000 rows it will commit into target. Target based commit will commit the data into target based on buffer size of the target.i.e., it commits the data into target when ever the buffer fills.Let us assume that the buffer size is 6,000.So,for every 6,000 rows it commits the data. |
How
do we estimate the number of partitions that a ...
|
Hi,
it depends upon the informatica version we r using. Suppose if we r using
informatica 6 it supports only 32 partitions where as informatica 7 supports
64 partitions.
|
How
do you decide whether you need to do aggregation...
|
It depends upon our requirement only. If you have good processing database
you can create aggregation table or view at database level else it’s better
to use informatica. Here I’m explaining why we need to use informatica. what ever it may be informatica is a third party tool, so it will take more time to process aggregation compared to the database, but in Informatica an option we called "Incremental aggregation" which will help you to update the current values with current values +new values. No necessary to process entire values again and again. Unless this can be done if nobody deleted that cache files. If that happened total aggregation we need to execute on informatica also. In database we don't have Incremental aggregation facility. |
How
to join two tables without using the Joiner Tr...
|
Joiner
transformation is used to join n (n>1) tables from same or different databases,
but source qualifier transformation is used to join only n tables from same database.
|
In
a filter expression we want to compare one date...
|
The
db2 date format is "yyyymmdd" where as sysdate in
oracle will give "dd-mm-yy" so conversion of db2 date format to
local database date format is compulsory. other wise u will get that type of
error
|
how
to create the staging area in your database
|
A Staging area in a DW is used as a temporary space to hold all the
records from the source system. So more or less it should be exact replica of
the source systems except for the load startegy where we use truncate and
reload options. So create using the same layout as in your source tables or using the Generate SQL option in the Warehouse Designer tab |
whats
the diff between Informatica powercenter ser...
|
Repository is a database in which all informatica componets are stored in
the form of tables. The reposiitory server controls the repository and
maintains the data integrity and Consistency across the repository when
multiple users use Informatica. Powercenter Server/Infa Server is responsible
for execution of the components (sessions) stored in the repository. |
What
are the Differences between Informatica Power...
|
the main difference between informatica 5.1 and 6.1 is that in 6.1 they
introduce a new thing called repository server and in place of server
manager(5.1), they introduce workflow manager and workflow monitor |
Differences
between Informatica 6.2 and Informati...
|
Features in 7.1 are : 1.union and custom transformation 2.lookup on flat file 3.grid servers working on different operating systems can coexist on same server 4.we can use pmcmdrep 5.we can export independent and dependent rep objects 6.we can move mapping in any web application 7.version controlling 8.data profilling |
What
is the difference between connected and uncon...
|
Unconnected: The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping. connected: The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation |
how
to get the first 100 rows from the flat file i...
|
please check this one, task ----->(link) session (workflow manager) double click on link and type $$source success rows(parameter in session variables) = 100 it should automatically stops session. |
can we lookup a table from a source qualifer trans... |
No. we can't do. I will explain you why. 1) Unless you assign the output of the source qualifier to another transformation or to target no way it will include the feild in the query. 2) source qualifier don't have any variables feilds to utalize as expression |
What
is the difference between
|
Normal Load: Bulk Mode: Bulk load will not write information to the database log file so that if any recovery is needed we can't do any thing in such cases. Comparatively Bulk load is pretty faster than normal load. |
What
are main advantages and purpose of using Normalizer Transformation
in Informatica?
|
Normalizer
Transformation is used mainly with COBOL sources where most of the time data
is stored in de-normalized format. Also, Normalizer transformation can be
used to create multiple rows from a single row of data
|
On
connected or unconnected transformations
|
Here is the deal, Connected transformation is a part of your data flow in the pipeline while unconnected Transformation is not. Much like calling a program by name and by reference. Use unconnected transforms when you want call the same transform many times in a single mapping. |
What are 2 modes of data
movement in Informatica Server?
The
data movement mode depends on whether Informatica Server should process single
byte or multi-byte character data. This mode selection can affect the
enforcement of code page relationships and code page validation in the
Informatica Client and Server.
a) Unicode - IS allows 2 bytes for each
character and uses additional byte for each non-ascii character (such as
Japanese characters)
b) ASCII - IS holds all data in a single byte
RE:
Explain the informatica Architecture in detail...
|
informatica server connects source data and target data using native odbc drivers again it connect to the repository for running sessions and retrieving metadata information source------>informatica server--------->target | | REPOSITORY |
How
many ways you can update a relational source d...
|
in 2 ways we can do it 1) by reimport the source definition 2) by edit the source definition |
How
can U improve session performance in aggregator transformation?
|
use sorted input: 1. use a sorter before the aggregator 2. donot forget to check the option on the aggregator that tell the aggregator that the input is sorted on the same keys as group by. the key order is also very important. |
what
is meant by lookup caches?
The
informatica server builds a cache in memory when it processes the first row af
a data in a cached look up transformation.It allocates memory for the cache
based on the amount u configure in the transformation or session properties.The
informatica server stores condition values in the index cache and output values
in the data cache
What
r the types of lookup caches?
Persistent
cache: U can save the lookup cache files and reuse them the next time the
informatica server processes a lookup transformation configured to use the
cache.
Recache from database: If the persistent cache is not synchronized with he lookup table,u can configure the lookup transformation to rebuild the lookup cache.
Static cache: U can configure a static or read-only cache for only lookup table. By default informatica server creates a static cache. It caches the lookup table and lookup values in the cache for each row that comes into the transformation. When the lookup condition is true, the informatica server does not update the cache while it processes the lookup transformation.
Dynamic cache: If u want to cache the target table and insert new rows into cache and the target can create a look up transformation to use dynamic cache. The informatica server dynamically inserts data to the target table.
Shared cache: U can share the lookup cache between multiple transactions can share unnamed cache between transformations in the same mapping.
Recache from database: If the persistent cache is not synchronized with he lookup table,u can configure the lookup transformation to rebuild the lookup cache.
Static cache: U can configure a static or read-only cache for only lookup table. By default informatica server creates a static cache. It caches the lookup table and lookup values in the cache for each row that comes into the transformation. When the lookup condition is true, the informatica server does not update the cache while it processes the lookup transformation.
Dynamic cache: If u want to cache the target table and insert new rows into cache and the target can create a look up transformation to use dynamic cache. The informatica server dynamically inserts data to the target table.
Shared cache: U can share the lookup cache between multiple transactions can share unnamed cache between transformations in the same mapping.
Difference
between static cache and dynamic cache
Static cache
|
Dynamic
cache
|
U can not insert or update the cache |
U can insert rows
into the cache as u pass to the target |
The informatic server returns a value from the lookup table or cache when
the condition is true. When the condition is not true, informatica server
returns the default value for connected transformations and null for
unconnected transformations. |
The informatica
server inserts rows into cache when the condition is false.This indicates
that the row is not in the cache or target table. U can pass these rows to
the target table |
What
r the mappings that we use for slowly changing dimension table?
Type1:
Rows containing changes to existing dimensions are updated in the target by
overwriting the existing dimension. In the Type 1 Dimension mapping, all rows
contain current dimension data.
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updates
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updates
What
r two types of processes that informatica runs the session?
Load
manager Process: Starts the session, creates the DTM process, and sends
post-session email when the session completes.
The DTM process. Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
The DTM process. Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
Can
u generate reports in Informatca?
|
It is a ETL tool, you could not make reports from here,
but you can generate metadata report, that is not going to be used for
business analysis
|
Why
we use partitioning the session in informatica...
|
Performance can be improved by processing data in parallel in a single
session by creating multiple partitions of the pipeline. Informatica server can achieve high performance by partitioning the pipleline and performing the extract , transformation, and load for each partition in parallel. |
Define
informatica repository?
The
Informatica repository is a relational database that stores information, or
metadata, used by the Informatica Server and Client tools. Metadata can include
information such as mappings describing how to transform source data, sessions
indicating when you want the Informatica Server to perform the transformations,
and connect strings for sources and targets.
The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.
Use repository manager to create the repository.The Repository Manager connects to the repository database and runs the code needed to create the repository tables.Thsea tables
stores metadata in specific format the informatica server,client tools use.
The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.
Use repository manager to create the repository.The Repository Manager connects to the repository database and runs the code needed to create the repository tables.Thsea tables
stores metadata in specific format the informatica server,client tools use.
What
is Data Transformation Manager?
After
the load manager performs validations for the session, it creates the DTM
process. The DTM process is the second process associated with the session run.
The primary purpose of the DTM process is to create and manage threads that
carry out the session tasks.
· The DTM allocates process memory for the
session and divide it into buffers. This is also known as buffer memory. It
creates the main thread, which is called the master thread. The master thread
creates and manages all other threads.
· If we partition a session, the DTM creates
a set of threads for each partition to allow concurrent processing.. When
Informatica server writes messages to the session log it includes thread type
and thread ID. Following are the types of threads that DTM creates:
Master thread - Main
thread of the DTM process. Creates and manages all other threads. Mapping
thread - One Thread to Each Session. Fetches Session and Mapping
Information.Pre and Post Session Thread-One Thread each to Perform Pre and Post
Session Operations. reader thread-One Thread for Each Partition for Each Source
Pipeline. WRITER THREAD-One Thread for Each Partition if target exist in the
source pipeline write to the target.tRANSFORMATION THREAD - One or More
Transformation Thread For Each Partition.
Active/Passive Transformations
Transformations
can be active or passive. An active transformation can change the number of
rows that pass through it, such as a Filter transformation that removes rows
that do not meet the filter condition. A passive transformation does not change
the number of rows that pass through it, such as an Expression transformation
that performs a calculation on data and passes all rows through the
transformation.
======
Q: How do I connect job streams/sessions or
batches across folders? (30
October 2000 )
·
For quite a while there's been a deceptive
problem with sessions in the Informatica repository. For management and
maintenance reasons, we've always wanted to separate mappings, sources,
targets, in to subject areas or functional areas of the business. This
makes sense until we try to run the entire Informatica job stream.
Understanding of course that only the folder in which the map has been defined
can house the session. This makes it difficult to run jobs / sessions
across folders - particularly when there are necessary job dependencies which
must be defined. The purpose of this article is to introduce an
alternative solution to this problem. It requires the use of
shortcuts.
The basics are like this: Keep
the map creations, sources, and targets subject oriented. This allows
maintenance to be easier (by subject area). Then once the maps are done,
change the folders to allow shortcuts (done from the repository manager).
Create a folder called: "MY_JOBS" or something like that. Go in
to designer, open "MY_JOBS", expand the source folders, and create
shortcuts to the mappings in the source folders.
Go to the session manager, and
create sessions for each of the short-cut mappings in MY_JOBS. Then batch
them as you see fit. This will allow a single folder for running jobs and
sessions housed anywhere in any folder across your repository.
Q: How do I get maximum speed out of my database
connection? (12 September
2000 )
·
In Sybase or MS-SQL Server, go to the Database
Connection in the Server Manager. Increase the packet size.
Recommended sizing depends on distance traveled from PMServer to Database - 20k
Is usually acceptable on the same subnet. Also, have the DBA increase the
"maximum allowed" packet size setting on the Database itself.
Following this change, the DBA will need to restart the DBMS. Changing
the Packet Size doesn't mean all connections will connect at this size, it just
means that anyone specifying a larger packet size for their connection may be
able to use it. It should increase speed, and decrease network
traffic. Default IP Packets are between 1200 bytes and 1500 bytes.
In Oracle: there are two
methods. For connection to a local database, setup the protocol as IPC
(between PMServer and a DBMS Server that are hosted on the same machine).
IPC is not a protocol that can be utilized across networks (apparently).
IPC stands for Inter Process Communication, and utilizes memory piping (RAM)
instead of client context, through the IP listner. For remote connections
there is a better way: Listner.ORA and TNSNames.ORA need to be modified
to include SDU and TDU settings. SDU = Service Layer Data Buffer, and TDU
= Transport Layer Data Buffer. Both of which specify packet sizing in
Oracle connections over IP. Default for Oracle is 1500 bytes. Also
note: these settings can be used in IPC connections as well, to control the IPC
Buffer sizes passed between two local programs (PMServer and Oracle Server)
Both the
Server and the Client need to be modified. The server will allow packets
up to the max size set - but unless the client specifies a larger packet size,
the server will default to the smallest setting (1500 bytes). Both SDU
and TDU should be set the same. See the example below:
TNSNAMES.ORA
LOC = (DESCRIPTION =
(SDU = 20480) (TDU=20480)
LOC = (DESCRIPTION =
(SDU = 20480) (TDU=20480)
LISTENER.ORA
LISTENER = ....
.... (SID_DESC =
(SDU = 20480) (TDU=20480) (SID_NAME = beqlocal) ....
LISTENER = ....
.... (SID_DESC =
(SDU = 20480) (TDU=20480) (SID_NAME = beqlocal) ....
Q: How do I get a Sequence Generator to
"pick up" where another "left off"? (8 June 2000 )
·
To perform this mighty trick, one can use an
unconnected lookup on the Sequence ID of the target table. Set the
properties to "LAST VALUE", input port is an ID. the condition
is: SEQ_ID >= input_ID. Then in an expression set up a variable
port: connect a NEW self-resetting sequence generator to a new input port
in the expression. The variable port's expression should read: IIF( v_seq
= 0 OR ISNULL(v_seq) = true, :LKP.lkp_sequence(1), v_seq). Then, set up
an output port. Change the output port's expression to read: v_seq +
input_seq (from the resetting sequence generator). Thus you have just
completed an "append" without a break in sequence numbers.
Q: How do I query the repository to see which
sessions are set in TEST MODE? (8
June 2000 )
·
Run the following select:
select * from opb_load_session where bit_option = 13;
It's actually BIT # 2 in this bit_option setting, so if you have a mask, or a bit-level function you can then AND it with a mask of 2, if this is greater than zero, it's been set for test load.
select * from opb_load_session where bit_option = 13;
It's actually BIT # 2 in this bit_option setting, so if you have a mask, or a bit-level function you can then AND it with a mask of 2, if this is greater than zero, it's been set for test load.
Q: How do I "validate" all my mappings
at once? (31 March
2000 )
·
Issue the following command WITH
CARE.
UPDATE OPB_MAPPING SET IS_VALID = 1;
UPDATE OPB_MAPPING SET IS_VALID = 1;
Then disconnect from the database, and re-connect. In session
manager, and designer as well.
Q: How do I validate my entire repository?
(12 September 2000 )
·
To add the menu option, change this registry
entry on your client.
HKEY_CURRENT_USER/Software/Informatica/PowerMart
Client Tools/4.7/Repository Manager Options
add the following string Name:
EnableCheckReposit Data: 1
Validate Repository forces
Informatica to run through the repository, and check the repo for errors
Q: How do I work around a bug in 4.7? I
can't change the execution order of my stored procedures that I've imported? (31 March 2000 )
·
Issue the following statements WITH CARE:
select widget_id from OPB_WIDGET where WIDGET_NAME = <widget name>
(write down the WIDGET ID)
select widget_id from OPB_WIDGET where WIDGET_NAME = <widget name>
(write down the WIDGET ID)
select * from OPB_WIDGET_ATTR where WIDGET_ID = <widget_id>
update OPB_WIDGET_ATTR set attr_value = <execution
order> where WIDGET_ID = <widget_id> and attr_id = 5
COMMIT;
The <execution order> is the number of the order in which you want
the stored proc to execute. Again, disconnect from both designer and
session manager repositories, and re-connect to "re-read" the local
cache.
Q: How do I keep the session manager from
"Quitting" when I try to open a session? (23 March 2000 )
·
Informatica Tech Support has said: if you are
using a flat file as a source, and your "file name" in the "Source
Options" dialog is longer than 80 characters, it will "kill" the
Session Manager tool when you try to re-open it. You can fix the session
by: logging in to the repository via SQLPLUS, or ISQL, and finding the table
called: OPB_LOAD_SESSION, find the Session ID associated with the session name
- write it down. Then select FNAME from OPB_LOAD_FILES where Session_ID =
<session_id>. Change / update OPB_LOAD_FILES set FNAME= <new
file name> column, change the length back to less than 80 characters, and
commit the changes. Now the session has been repaired. Try to keep
the directory to that source file in the DIRECTORY entry box above the file
name box. Try to keep all the source files together in the same source
directory if possible.
Q: How do I repair a "damaged"
repository? (16 March 2000 )
·
There really isn't a good repair tool, nor is
there a "great" method for repairing the repository. However, I
have some suggestions which might help. If you're running in to a session
which causes the session manager to "quit" on you when you try to
open it, or you have a map that appears to have "bad sources", there
may be something you can do. There are varying degrees of damage to the
repository - mostly caused because the sequence generator that PM/PC relies on
is buried in a table in the repository - and they generate their own sequence
numbers. If this table becomes "corrupted" or generates the
wrong sequences, you can get repository errors all over the place. It can
spread quickly. Try the following steps to repair a repository: (USE AT
YOUR OWN RISK) The recommended path is to backup the repository, send it to
Technical Support - and tell them it's damaged.
1. Delete
the session, disconnect, re-connect, then re-create the session, then attempt
to edit the new session again. If the new session won't open up
(srvr mgr quits), then there are more problems - PM/PC is not successfully
attaching sources and targets to the session (SEE: OPB_LOAD_SESSION table
(SRC_ID, TARGET_ID) columns - they will be zero, when they should contain an
ID.
2. Delete
the session, then open the map. Delete the source and targets from the
MAP. Save the map and invalidate it - forcing an update to the repository
and it's links. Drag the sources and targets back in to the map and re-connect
them. Validate and Save. Then try re-building the session (back to
step one). If there is still a failure, then there are more problems.
3. Delete
the session and the map entirely. Save the repository changes - thus
requesting a delete in the repository. While the "delete" may
occur - some of the tables in the repository may not be
"cleansed". There may still be some sources, targets, and
transformation objects (reusable) left in the repository. Rebuild the map
from scratch - then save it again... This will create a new MAP ID in the
OPB_MAPPING table, and force PM/PC to create new ID links to existing Source
and Target objects (as well as all the other objects in the map).
4. If
that didn't work - you may have to delete the sources, reusable objects, and
targets, as well as the session and the map. Then save the repository -
again, trying to "remove" the objects from the repository
itself. Then re-create them. This forces PM/PC to assign new ID's
to ALL the objects in the map, the map, and the session - hopefully creating a
"good" picture of all that was rebuilt.
·
Or try this method:
1. Create
a NEW repository -> call it REPO_A (for reference only).
2. Copy
any of the MAPPINGS that don't have "problems" opening in their
respective sessions, and copy the mappings (using designer) from the old
repository (REPO_B) to the new repository (REPO_A). This will create NEW
ID's for all the mappings, CAUTION: You will lose your sessions.
3. DELETE
the old repository (REPO_B).
4. Create
a new repository in the OLD Repository Space (REPO_B)..
5. Copy
the maps back in to the original repository (Recreated Repository) From REPO_A
to REPO_B.
6. Rebuild
the sessions, then re-create all of the objects you originally had trouble
with.
·
You can apply this to FOLDER level and Repository
Manager Copying, but you need to make sure that none of the objects within a
folder have any problems.
·
What this does: creates new ID's, resets the
sequence generator, re-establishes all the links to the objects in the tables,
and drop's out (by process of elimination) any objects you've got problems
with.
·
Bottom line: PM/PC client tools have trouble
when the links between ID's get broken. It's fairly rare that this
occurs, but when it does - it can cause heartburn.
Q: How do I clear the locks that are left in the
repository? (3 March 2000 )
· Clearing
locks is typically a task for the repository manager. Generally it's done
from within the Repository Manager: Edit Menu -> Show Locks. Select
the locks, then press "remove". Typically locks are left on
objects when a client is rebooted without properly exiting Informatica.
These locks can keep others from editing the objects. They can also keep
scheduled executions from occurring. It's not uncommon to want to clear
the locks automatically - on a prescheduled time table, or at a specified
time. This can be done safely only if no-one has an object out for
editing at the time of deletion of the lock. The suggested method is to
log in to the database from an automated script, and issue a "delete from
OPB_OBJECT_LOCKS" table.
Q: How do I turn on the option for Check
Repository? (3 March 2000 )
·
According to Technical Support, it's only
available by adjusting the registry entries on the client. PM/PC need to be
told it's in Admin mode to work. Below are the steps to turn on the
Administration Mode on the client. Be aware - this may be a security
risk, anyone using that terminal will have access to these features.
1)start repository manager
2) repository menu go to check repository
3) if the option is not there you need to edit your registry using regedit
go to: HKEY_CURRENT_USER>>SOFTWARE>>INFORMATICA>>PowerMart Client Tools>>Repository Manager Options
go to your specific version 4.5 or 4.6 and then go to Repository Manager. In
there add two strings:
1) EnableAdminMode 1
2) EnableCheckReposit 1
1)start repository manager
2) repository menu go to check repository
3) if the option is not there you need to edit your registry using regedit
go to: HKEY_CURRENT_USER>>SOFTWARE>>INFORMATICA>>PowerMart Client Tools>>Repository Manager Options
go to your specific version 4.5 or 4.6 and then go to Repository Manager. In
there add two strings:
1) EnableAdminMode 1
2) EnableCheckReposit 1
both
should be spelled as shown the value for both is 1
Q: How do
I generate an Audit Trail for my repository (ORACLE / Sybase) ?
·
Download
one of two *USE AT YOUR OWN RISK* zip files. The first is available now
for PowerMart 4.6.x and PowerCenter 1.6x. It's a 7k zip file: Informatica
Audit Trail v0.1a The other file (for 4.5.x is coming...).
Please note: this is FREE software that plugs in to ORACLE 7x, and ORACLE 8x,
and Oracle 8i. It has NOT been built for Sybase, Informix, or DB2.
If someone would care to adapt it, and send it back to me, I'll be happy to
post these also. It has limited support - has not been fully tested in a
multi-user environment, any feedback would be appreciated. NOTE:
SYBASE VERSION IS ON IT'S WAY.
Q: How do
I "tune" a repository? My repository is slowing down after a
lot of use, how can I make it faster?
·
In
Oracle: Schedule a nightly job to ANALYZE TABLE for ALL INDEXES, creating
histograms for the tables - keep the cost based optimizer up to date with the
statistics. In SYBASE: schedule a nightly job to UPDATE STATISTICS
against the tables and indexes. In Informix, DB2, and RDB, see your
owners manuals about maintaining SQL query optimizer statistics.
Q: How do
I achieve "best performance" from the Informatica tool set?
·
By
balancing what Informatica is good at with what the databases are built
for. There are reasons for placing some code at the database level -
particularly views, and staging tables for data. Informatica is extremely
good at reading/writing and manipulating data at very high rates of
throughput. However - to achieve optimum performance (in the Gigabyte to
Terabyte range) there needs to be a balance of Tuning in Oracle, utilizing
staging tables, views for joining source to target data, and throughput of
manipulation in Informatica. For instance: Informatica will never achieve
the speeds of "append" or straight inserts that Oracle SQL*Loader, or
Sybase BCP achieve. This is because these two tools are written
internally - specifically for the purposes of loading data (direct to tables /
disk structures). The API that Oracle / Sybase provide Informatica with
is not nearly as equipped to allow this kind of direct access (to eliminate
breakage when Oracle/Sybase upgrade internally). The basics of
Informatica are: 1) Keep maps as simple as possible 2) break complexity up in
to multiple maps if possible 3) rule of thumb: one MAP per TARGET table 4) Use
staging tables for LARGE sets of data 5) utilize SQL for it's power of sorts,
aggregations, parallel queries, temp spaces, etc... (setup views in the
database, tune indexes on staging tables) 6) Tune the database - partition
tables, move them to physical disk areas, etc... separate the logic.
Q: How do
I get an Oracle Sequence Generator to operate faster?
·
The
first item is: use a function to call it, not a stored procedure. Then,
make sure the sequence generator and the function are local to the SOURCE or
TARGET database, DO NOT use synonyms to place either the sequence or function
in a remote instance (synonyms to a separate schema/database on the same
instance may be only a slight performance hit). This should help - possibly
double the throughput of generating sequences in your map. The other item
is: see slide presentations on performance tuning for your sessions / maps for
a "best" way to utilize an Oracle sequence generator. Believe
it or not - the write throughput shown in the session manager per target table
is directly affected by calling an external function/procedure which is
generating sequence numbers. It does NOT appear to affect the read
throughput numbers. This is a difficult problem to solve when you have
low "write throughput" on any or all of your targets. Start
with the sequence number generator (if you can), and try to optimize the map
for this.
Q: I have
a mapping that runs for hours, but it's not doing that much. It takes 5 input
tables, uses 3 joiner transformations, a few lookups, a couple expressions and
a filter before writing to the target. We're running PowerMart 4.6 on an NT 4
box. What tuning options do I have?
·
Without
knowing the complete environment, it's difficult to say what the problem is,
but here's a few solutions with which you can experiment. If the NT box is not
dedicated to PowerMart (PM) during its operation, identify what it contends
with and try rescheduling things such that PM runs alone. PM needs all the
resources it can get. If it's a dedicated box, it's a well known fact that PM
consumes resources at a rapid clip, so if you have room for more memory, get
it, particularly since you mentioned use of the joiner transformation. Also toy
with the caching parameters, but remember that each joiner grabs the full
complement of memory that you allocate. So if you give it 50Mb, the 3 joiners
will really want 150Mb. You can also try breaking up the session into parallel
sessions and put them into a batch, but again, you'll have to manage memory
carefully because of the joiners. Parallel sessions is a good option if
you have a multiple-processor CPU, so if you have vacant CPU slots,
consider adding more CPU's. If a lookup table is relatively big (more than a
few thousand rows), try turning the cache flag off in the session and see what
happens. So if you're trying to look up a "transaction ID" or
something similar out of a few million rows, don't load the table into memory.
Just look it up, but be sure the table has appropriate indexes. And last, if
the sources live on a pretty powerful box, consider creating a view on the
source system that essentially does the same thing as the joiner
transformations and possibly some of the lookups. Take advantage of the source
system's hardware to do a lot of the work before handing down the result to the
resource constrained NT box.
Q: Is
there a "best way" to load tables?
·
Yes - If
all that is occurring is inserts (to a single target table) - then the BEST
method of loading that target is to configure and utilize the bulk loading
tools. For Sybase it's BCP, for Oracle it's SQL*Loader. With
multiple targets, break the maps apart (see slides), one for INSERTS only, and
remove the update strategies from the insert only maps (along with unnecessary
lookups) - then watch the throughput fly. We've achieved 400+ rows per
second per table in to 5 target Oracle tables (Sun Sparc E4500, 4 CPU's,
Raid 5, 2 GIG RAM, Oracle 8.1.5) without using SQL*Loader. On an NT
366 mhz P3, 128 MB RAM, single disk, single target table, using SQL*Loader
we've loaded 1 million rows (150 MB) in 9 minutes total - all the map had was
one expression to left and right trim the ports (12 ports, each row was 150
bytes in length). 3 minutes for SQL*Loader to load the flat file - DIRECT,
Non-Recoverable.
Q: How do
I guage that the performance of my map is acceptable?
·
If you
have a small file (under 6MB) and you have pmserver on a Sun Sparc 4000,
Solaris 5.6, 2 cpu's, 2 gigs RAM, (baseline configuration - if your's is
similar you'll be ok). For NT: 450 MHZ PII 128 MB RAM (under 3 MB file
size), then it's nothing to worry about unless your write throughput is sitting
at 1 to 5 rows per second. If you are in this range, then your map is too
complex, or your tables have not been optimized. On a baseline defined
machine (as stated above), expected read throughput will vary - depending on
the source, write throughput for relational tables (tables in the database)
should be upwards of 150 to 450+ rows per second. To calculate the total
write throughput, add all of the rows per second for each target together, run
the map several times, and average the throughput. If your map is running
"slow" by these standards, then see the slide presentations to
implement a different methodology for tuning. The suggestion here is:
break the map up - 1 map per target table, place common logic in to maplets.
Q: How do
I create a “state variable”?
·
Create a
variable port in an expression (v_MYVAR), set the data type to Integer (for
this example), set the expression to: IIF( ( ISNULL(v_MYVAR) = true or v_MYVAR
= 0 ) [ and <your condition> ], 1, v_MYVAR).> What happens here,
is that upon initialization Informatica may set the v_MYVAR to NULL, or
zero.> The first time this code is executed it is set to “1”.>
Of course – you can set the variable to any value you wish – and carry that
through the transformations.> Also – you can add your own AND
condition (as indicated in italics), and only set the variable when a specific
condition has been met.> The variable port will hold it’s value for
the rest of the transformations.> This is a good technique to use for
lookup values when a single lookup value is necessary based on a condition
being met (such as a key for an “unknown” value).> You can change the
data type to character, and use the same examination – simply remove the “or
v_MYVAR = 0” from the expression – character values will be first set to NULL.
Q: How do
I pass a variable in to a session?
·
There is
no direct method of passing variables in to maps or sessions.> In
order to get a map/session to respond to data driven (variables) – a data
source must be provided.> If working with flat files – it can be
another flat file, if working with relational data sources it can be with another
relational table.> Typically a relational table works best, because
SQL joins can then be employed to filter the data sets, additional maps and
source qualifiers can utilize the data to modify or alter the parameters during
run-time.
Q: How
can I create one map, one session, and utilize multiple source files of the
same format?
·
In
UNIX it’s very easy: create a link to the source file desired, place the link
in the SrcFiles directory, run the session.> Once the session has
completed successfully, change the link in the SrcFiles directory to point to
the next available source file.> Caution: the only downfall is that
you cannot run multiple source files (of the same structure) in to the database
simultaneously.> In other words – it forces the same session to be run
serially, but if that outweighs the maintenance and speed is not a major issue,
feel free to implement it this way.> On NT you would have to
physically move the files in and out of the SrcFiles directory. Note: the difference between creating a link
to an individual file, and changing SrcFiles directory to link to a specific
directory is this: changing a link to an individual file allows multiple
sessions to link to all different types of sources, changing SrcFiles to be a
link itself is restrictive – also creates Unix Sys Admin pressures for
directory rights to PowerCenter (one level up).
Q: How
can I move my Informatica Logs / BadFiles directories to other disks without
changing anything in my sessions?
·
Use the
UNIX Link command – ask the SA to create the link and grant read/write
permissions – have the “real” directory placed on any other disk you wish to
have it on.
Q: How do
I handle duplicate rows coming in from a flat file?
·
If you
don't care about "reporting" duplicates, use an aggregator. Set
the Group By Ports to group by the primary key in the parent target
table. Keep in mind that using an aggregator causes the following: The
last duplicate row in the file is pushed through as the one and only row, loss
of ability to detect which rows are duplicates, caching of the data before
processing in the map continues. If you wish to report duplicates, then
follow the suggestions in the presentation slides (available on this web site)
to institute a staging table. See the pro's and cons' of staging tables,
and what they can do for you.
Where can
I find …?
Q: Where
can I find a history / metrics of the load sessions that have occurred in
Informatica? (8 June 2000 )
·
The
tables which house this information are OPB_LOAD_SESSION, OPB_SESSION_LOG, and
OPB_SESS_TARG_LOG. OPB_LOAD_SESSION contains the single session entries,
OPB_SESSION_LOG contains a historical log of all session runs that have taken
place. OPB_SESS_TARG_LOG keeps track of the errors, and the target tables
which have been loaded. Keep in mind these tables are tied together
by Session_ID. If a session is deleted from OPB_LOAD_SESSION, it's
history is not necessarily deleted from OPB_SESSION_LOG, nor from
OPB_SESS_TARG_LOG. Unfortunately - this leaves un-identified session ID's
in these tables. However, when you can join them together, you can get
the start and complete times from each session. I would suggest using a
view to get the data out (beyond the MX views) - and record it in another
metrics table for historical reasons. It could even be done by putting a
TRIGGER on these tables (possibly the best solution)...
Q: Where
can I find more information on what the Informatica Repository Tables
are?
·
On this
web-site. We have published an unsupported view of what we believe to be
housed in specific tables in the Informatica Repository. Check it out -
we'll be adding to this section as we go. Right now it's just a belief of
what we see in the tables. Repository Table Meta-Data Definitions
(http://www.coreintegration.com/innercore/library/infa/InfAuditTrail.zip)
Q: Where
can I find / change the settings regarding font's, colors, and layouts for the
designer?
·
You can
find all the font's, colors, layouts, and controls in the registry of the
individual client. All this information is kept at:
HKEY_CURRENT_USER\Software\Informatica\PowerMart Client
Tools\<ver>. Below here, you'll find the different folders which
allow changes to be made. Be careful, deleting items in the registry
could hamper the software from working properly.
Q: Where
can I find tuning help above and beyond the manuals?
·
Right
here. There are slide presentations, either available now, or soon which
will cover tuning of Informatica maps and sessions - it does mean that the
architectural solution proposed here be put in place.
Q: Where
can I find the map's used in generating performance statistics?
·
A
windows ZIP file will soon be posted, which houses a repository backup, as well
as a simple PERL program that generates the source file, and a SQL script which
creates the tables in Oracle. You'll be able to download this, and
utilize this for your own benefit.
Why
doesn’t …?
Q: Why
doesn't constraint based load order work with a maplet? (08 May 2000)
·
If your
maplet has a sequence generator (reusable) that's mapped with data straight to
an "OUTPUT" designation, and then the map splits the output to two
tables: parent/child - and your session is marked with "Constraint Based Load
Ordering" you may have experienced a load problem - where the constraints
do not appear to be met?? Well - the problem is in the perception of what
an "OUTPUT" designation is. The OUTPUT component is NOT an
"object" that collects a "row" as a row, before pushing it
downstream. An OUTPUT component is merely a pass-through structural
object - as indicated, there are no data types on the INPUT or OUTPUT
components of a maplet - thus indicating merely structure. To make the
constraint based load order work properly, move all the ports through a single
expression, then through the OUTPUT component - this will force a single row to
be "put together" and passed along to the receiving maplet.
Otherwise - the sequence generator generates 1 new sequence ID for each split
target on the other side of the OUTPUT component.
Q: Why
doesn't 4.7 allow me to set the Stored Procedure connection information in the
Session Manager -> Transformations Tab? (31 March 2000 )
·
This
functionality used to exist in an older version of PowerMart/PowerCenter.
It was a good feature - as we could control when the procedure was executed
(ie: source pre-load), but execute it in a target database connection. It
appears to be a removed piece of functionality. We are asking Informatica
to put it back in.
Q: Why
doesn't it work when I wrap a sequence generator in a view, with a lookup
object?
·
First -
to wrap a sequence generator in a view, you must create an Oracle stored
function, then call the function in the select statement in a view.
Second, Oracle dis-allows an order by clause on a column returned from a user
function (It will cut your connection - and report an oracle error). I
think this is a bug that needs to be reported to Oracle. An Informatica
lookup object automatically places an "order by" clause on the return
ports / output ports in the order they appear in the object. This
includes any "function" return. The minute it executes a
non-cached SQL lookup statement with an order by clause on the function return
(sequence number) - Oracle cuts the connection. Thus keeping this
solution from working (which would be slightly faster than binding an external
procedure/function).
Q: Why
doesn't a running session QUIT when Oracle or Sybase return fatal errors?
·
The
session will only QUIT when it's threshold is set: "Stop on 1
errors". Otherwise the session will continue to run.
Q: Why
doesn't a running session return a non-successful error code to the command
line when Oracle or Sybase return any error?
·
If the
session is not bounded by it's threshold: set "Stop on 1 errors" the
session will run to completion - and the server will consider the session to
have completed successfully - even if Oracle runs out of Rollback or Temp Log
space, even if Sybase has a similar error. To correct this - set the
session to stop on 1 error, then the command line: pmcmd will return a non-zero
(it failed) type of error code. - as will the session manager see that the
session failed.
Q: Why
doesn't the session work when I pass a text date field in to the to_date
function?
·
In order
to make to_date(xxxx,<format>) work properly, we suggest surrounding your
expression with the following: IIF( is_date(<date>,<format>) =
true, to_date(<date>,<format>), NULL) This will prevent
session errors with "transformation error" in the port. If you
pass a non-date to a to_date function it will cause the session to bomb
out. By testing it first, you ensure 1) that you have a real date, and 2)
your format matches the date input. The format should match the expected
date input directly - spaces, no spaces, and everything in between. For
example, if your date is: 1999103022:31:23 then you want a format to be:
YYYYMMDDHH24:MI:SS with no spaces.
Q: Why
doesn't the session control an update to a table (I have no update strategy in
the map for this target)?
·
In order
to process ANY update to any target table, you must put an update
strategy in the map, process a DD_UPDATE command, change the session to
"data driven". There is a second method: without utilizing an
update strategy, set the SESSION properties to "UPDATE" instead of
"DATA DRIVEN", but be warned ALL targets will be updated in place -
with failure if the rows don't exist. Then you can set the update flags
in the mapping's sessions to control updates to the target. Simply
setting the "update flags" in a session is not enough to force the
update to complete - even though the log may show an update SQL statement, the
log will also show: cannot insert (duplicate key) errors.
Q: What
happens when I don't connect input ports to a maplet? (14 June 2000 )
·
Potentially
Hazardous values are generated in the maplet itself. Particularly for
numerics. If you didn't connect ALL the ports to an input on a maplet,
chances are you'll see sporadic values inside the maplet - thus sporadic
results. Such as ZERO in certain decimal cases where NULL is
desired. This is because both the INPUT and OUTPUT objects of a maplet
are nothing more than an interface, which defines the structure of a data row -
they are NOT like an expression that actually "receives" or
"puts together" a row image. This can cause a misunderstanding
of how the maplet works - if you're not careful, you'll end up with unexpected
results.
Q: What
is the Local Object Cache? (3
March 2000 )
·
The
local object cache is a cache of the Informatica objects which are retrieved
from the repository when a connection is established to a repository. The
cache is not readily accessed because it's housed within the PM/PC client
tool. When the client is shut-down, the cache is released.
Apparently the refresh cycle of this local cache requires a full
disconnect/reconnect to the repository which has been updated. This cache
will house two different images of the same object. For instance: a
shared object, or a shortcut to another folder. If the actual source
object is updated (source shared, source shortcut), updates can only be seen in
the current open folder if a disconnect/reconnect is performed against that
repository. There is no apparent command to refresh the cache from the
repository. This may cause some confusion when updating objects then
switching back to the mapping where you'd expect to see the newly updated
object appear.
Q: What
is the best way to "version control"?
·
It seems
the general developer community agrees on this one, the Informatica Versioning
leaves a lot to be desired. We suggest not utilizing the versioning
provided. For two reasons: one, it's extremely unwieldy (you lose all
your sessions), and the repository grows exponentially because Informatica
copies objects to increase the version number. We suggest two different
approaches; 1) utilizing a backup of the repository - synchronize Informatica
repository backups (as opposed to DBMS repo backups) with all the
developers. Make your backup consistently and frequently. Then - if
you need to back out a piece, restore the whole repository. 2) Build on
this with a second "scratch" repository, save and restore to the
"scratch" repository ONE version of the folders. Drag and drop
the folders to and from the "scratch" development repository.
Then - if you need to VIEW a much older version, restore that backup to the
scratch area, and view the folders. In this manner - you can check in the
whole repository backup binary to an outside version control system like PVCS,
CCS, SCM, etc... Then restore the whole backup in to acceptance - use the
backup as a "VERSION" or snapshot of everything in the repository -
this way items don't get lost, and disconnected versions do not get migrated up
in to production.
Q: What
is the best way to handle multiple developer environments?
·
The
school of thought is still out on this one. As with any - there are many
many ways to handle this. One idea is presented here (which seems to work
well, and be comfortable to those who already worked in shared Source Code
environments). The idea is this: All developers use shared folders,
shared objects, and global repositories. In development - it's all about
communication between team members - so that the items being modified are
assigned to individuals for work. With this methodology - all maps can
use common mapplets, shared sources, targets, and other items. The one
problem with this is that the developers MUST communicate about what they are
working on. This is a common and familiar method to working on shared
source code - most development teams feel comfortable with this, as do
managers. The problem with another commonly utilized method (one folder
per developer), is that you end up with run-away development environments.
Code re-use, and shared object use nearly always drop to zero percent (caveat:
unless you are following SEI / CMM / KPA Level 5 - and you have a dedicated CM
(Change Management) person in the works. Communication is still of utmost
importance, however now you have the added problem of "checking in"
what looks like different source tables from different developers, but the
objects are named the same... Among other problems that arise.
Q: What
is the web address to submit new enhancement requests?
·
Informatica's
enhancement request web address is: mailto:featurerequest@informatica.com
Q: What
is the execution order of the ports in an expression?
·
All
ports are executed TOP TO BOTTOM in a serial fashion, but they are done in the
following groups: All input ports are pushed values first. Then all
variables are executed (top to bottom physical ordering in the
expression). Last - all output expressions are executed to push values to
output ports - again, top to bottom in physical ordering. You can utilize
this to your advantage, by placing lookups in to variables, then using the
variables "later" in the execution cycle.
Q: What
is a suggested method for validating fields / marking them with errors?
·
One of
the successful methods is to create an expression object, which contains
variables.> One variable per port that is to be checked.> Set
the error “flag” for that field, then at the bottom of the expression trap each
of the error fields.> From this port you can choose to set flags based
on each individual error which occurred, or feed them out as a combination of
concatenated field names – to be inserted in to the database as an error row in
an error tracking table.
Q: What
does the error “Broken Pipe” mean in the PMSERVER.ERR log on Unix?
·
One of
the known causes for this error message is: when someone in the client User
Interface queries the server, then presses the “cancel” button that appears
briefly in the lower left corner.> It is harmless – and poses no
threat.
Q: What
is the best way to create a readable “DEBUG” log?
·
Create a
table in a relational database which resembles your flat file source (assuming
you have a flat file source).> Load the data in to the relational
table.> Then – create your map from top to bottom and turn on VERBOSE
DATA log at the session level.> Go back to the map, over-ride the SQL
in the SQL Qualifier to only pull one to three rows through the map, then run
the session.> In this manner, the DEBUG log will be readable, errors
will be much easier to identify – and once the logic is fixed, the whole data
set can be run through the map with NORMAL logging.> Otherwise you may
end up with a huge (Megabyte) log.> The other two ways to create debugging
logs are: 1) switch the session to TEST LOAD, set it to 3 rows, and run… The
problem with this is that the reader will read ALL of the source
data.> 2) change the output to a flat file…. The problem with this is
that your log ends up huge (depends on the number of source rows you have).
Q: What
is the best methodology for utilizing Informatica’s Strengths?
·
It
depends on the purpose. However – there is a basic definition of how well the
tool will perform with throughput and data handling, if followed in general
principal – you will have a winning situation.> 1) break all complex
maps down in to small manageable chunks.> Break up any logic you can
in to steps.> Informatica does much better with smaller more
maintainable maps. 2) Break up complex
logic within an expression in to several different expressions.> Be
wary though: the more expressions the slower the throughput – only break up the
logic if it’s too difficult to maintain.> 3) Follow the guides for
table structures and data warehouse structures which are available on this web
site.> For reference: load flat files to staging tables, load staging
tables in to operational data stores / reference stores / data warehousing
sources, load data warehousing sources in to star schemas or snowflakes, load
star schemas or snowflakes in to highly de-normalized reporting
tables.> By breaking apart the logic you will see the fastest
throughput.
Q: When
is it right to use SQL*Loader / BCP as a piped session versus a tail process?
·
SQL*Loader
/ BCP as a piped session should be used when no intermediate file is necessary,
or the source data is too large to stage to an intermediate file, there is not
enough disk or time to place all the source data in to an intermediate
file.> The downfalls currently are this: as a piped process (for
PowerCenter 1.5.2 and 1.6 / PowerMart v4.52. and 4.6)> the core does
NOT stop when either BCP or SQL*Loader “quit” or terminate.> The core
will only stop after reading all of the source data in to the data reader
thread.> This is dangerous if you have a huge file you wish to process
– and it’s scheduled as a monitored process.> Which means: a 5 hour
load (in which SQL*Loader / BCP stopped within the first 5 minutes) will only
stop and signal a page after 5 hours of reading source data.
Q: What
happens when Informatica causes DR Watson's on NT? (30 October 2000 )
·
This is
just my theory for now, but here's the best explanation I can come up
with. Typically this occurs when there is not enough physical RAM
available to perform the operation. Usually this only happens when
SQLServer is installed on the same machine as the PMServer - however if this is
not your case, some of this may still apply. PMServer starts up child
threads just like Unix. The threads share the global shared memory area -
and rely on NT's Thread capabilities. The DR Watson seems to appear
when a thread attempts to deallocate, or allocate real memory. There's
none left (mostly because of SQLServer). The memory manager appears to
return an error, or asks the thread to wait while it reorganizes virtual RAM to
make way for the physical request. Unfortunately the thread code doesn't
pay attention to this requrest, resulting in a memory violation. The
other theory is the thread attempts to free memory that's been swapped to
virtual, or has been "garbage collected" and cleared already - thus
resulting again in a protected memory mode access violation - thus a DR
Watson. Typically the DR Watson can cause the session to "freeze
up". The only way to clear this is to stop and restart the PMSERVER
service - in some cases it requires a full machine reboot. The only other
possibility is when PMServer is attempting to free or shut down a thread -
maybe there's an error in the code which causes the DR Watson. In any case,
the only real fix is to increase the physical RAM on the machine, or to
decrease the number of concurrent sessions running at any given point, or to
decrease the amount of RAM that each concurrent session is using.
Q: What
happens when Informatica CORE DUMPS on Unix? (12 April 2000 )
·
Many
things can cause a core dump, but the question is: how do you go about
"finding out" what cuased it, how do you work to solve it, and is
there a simple fix? This case was found to be frequent (according to tech
support) among setups of New Unix Hardware - causing unnecessary core
dumps. The IPC semaphore settings were set too low - causing X
number of concurrent sessions to "die" with "writer process
died" and "reader process died" etc... We are on a Unix
Machine - Sun Solaris 5.7, anyone with this configuration might want to check
the settings if they experience "Core Dumps" as well.
1.
Run
"sysdef", examine the IPC Semaphores section at the bottom of the
output.
2.
the
folowing settings should be "increased"
3.
SEMMNI -
(semaphore identifiers), (7 x # of concurrent sessions to run in Informatica) +
10 for growth + DBMS setting (DBMS Setting: Oracle = 2 per user, Sybase =
40 (avg))
4.
SEMMNU -
(undo structures in system) = 0.80 x SEMMNI value
5.
SEMUME -
(max undo entries per process) = SEMMNU
6.
SHMMNI -
(shared memory identifiers) = SEMMNI + 10
·
These
settings must be changed by ROOT: etc/system file.
·
About
the CORE DUMP: To help Informatica figure out what's going wrong you can
run a unix utility: "truss" in the following manner:
1.
Shut down
PMSERVER
2.
login as
"powermart" owner of pmserver - cd to the pmserver home directory.
3.
Open
Session Manager on another client - log in, and be ready to press
"start" for the sessions/batches causing problems.
4.
type:
truss -f -o truss.out pmserver <hit return>
5.
On the
client, press "start" for the sessions/batches having trouble.
6.
When all
the batches have completed or failed, press "stop server" from the
Server Manager
·
Your
"truss.out" file will have been created - thus giving you a log of
all the forked processes, and memory management /system calls that will help
decipher what's happing. you can examine the "truss.out" file -
look for: "killed" in the log.
·
DONT
FORGET: Following a CORE DUMP it's always a good idea to shut down the unix
server, and bounce the box (restart the whole server).
Q: What
happens when Oracle or Sybase goes down in the middle of a transformation?
·
It’s up
to the database to recover up to the last commit point.> If you’re
asking this question, you should be thinking about re-runnability of your
processes.> Designing re-runability in to the processing/maps up front
is the best preventative measure you can have.> Utilizing the recovery
facility of PowerMart / PowerCenter appears to be sketchy at best – particularly
in this area of recovery.> The transformation itself will eventually
error out – stating that the database is no longer available (or something to
that effect).
Q: What
happens when Oracle (or Sybase) is taken down for routine backup, but nothing
is running in PMServer at the time?
·
PMServer
reports that the database is unavailable in the PMSERVER.err log.>
When Oracle/Sybase comes back on line, PMServer will attempt to re-connect (if
the repository is on the Oracle/Sybase instance that went down), and eventually
it will succeed (when Oracle/Sybase becomes available again).> However
– it is recommended that PMServer be scheduled to shutdown before Oracle/Sybase
is taken off-line and scheduled to re-start after Oracle/Sybase is put back
on-line.
Q: What
happens in a database when a cached LOOKUP object is created (during a
session)?
·
The
session generates a select statement with an Order By clause. Any time
this is issued, the databases like Oracle and Sybase will select (read) all the
data from the table, in to the temporary database/space. Then the
data will be sorted, and read in chunks back to Informatica server. This
means, that hot-spot contention for a cached lookup will NOT be the table it
just read from. It will be the TEMP area in the database, particularly if
the TEMP area is being utilized for other things. Also - once the
cache is created, it is not re-read until the next running session re-creates
it.
Generic
Questions
Q: Can
you explain how "constraint based load ordering" works? (27 Jan 2000 )
·
Constraint
based load ordering in PowerMart / PowerCenter works like this: it controls the
order in which the target tables are committed to a relational database.
It is of no use when sending information to a flat file. To construct the
proper constraint order: links between the TARGET tables in Informatica need to
be constructed. Simply turning on "constraint based load
ordering" has no effect on the operation itself. Informatica does
NOT read constraints from the database when this switch is turned on.
Again, to take advantage of this switch, you must construct primary / foreign
key relationships in the TARGET TABLES in the designer of Informatica.
Creating primary / foreign key relationships is difficult - you are only
allowed to link a single port (field) to a single table as a primary / foreign
key.
Q: It
appears as if "constraint based load ordering" makes my session
"hang" (it never completes). How do I fix this? (27 Jan 2000 )
·
We have
a suggested method. The best known method for fixing this "hang" bug
is to 1) open the map, 2) delete the target tables (parent / child pairs) 3)
Save the map, 4) Drag in the targets again, Parent's FIRST 5) relink the ports,
6) Save the map, 7) refresh the session, and re-run it. What it does: Informatica
places the "target load order" as the order in which the targets are
created (in the map). It does this because the repository is Seuqence ID Based
and the session derives it's "commit" order by the Sequence ID
(unless constraint based load ordering is ON), then it tries to re-arrange the
commit order based on the constraints in the Target Table definitions (in
PowerMart/PowerCenter). Once done, this will solve the commit ordering
problems, and the "constraint based" load ordering can even be turned
off in the session. Informatica claims not to support this feature in a session
that is not INSERT ONLY. However -we've gotten it to work successfully in DATA
DRIVEN environments. The only known cause (according to Technical Support) is
this: the writer is going to commit a child table (as defined by the key links
in the targets). It checks to see if that particular parent row has been
committed yet - but it finds nothing (because the reader filled up the memory
cache with new rows). The memory that was holding the
"committed" rows has been "dumped" and no longer
exists. So - the writer waits, and waits, and waits - it never sees a
"commit" for the parents, so it never "commits" the child
rows. This only appears to happen with files larger than a certain number
of rows (depending on your memory settings for the session). The only fix
is this: Set "ThrottleReader=20" in the PMSERVER.CFG file. It
apparently limits the Reader thread to a maximum of "20" blocks for
each session - thus leaving the writer more room to cache the commit
blocks. However - this too also hangs in certain situations. To fix
this, Tech Support recommends moving to PowerMart 4.6.2 release (internal core
apparently needs a fix). 4.6.2 appears to be "better" behaved
but not perfect. The only other way to fix this is to turn off constraint
based load ordering, choose a different architecture for your maps (see my
presentations), and control one map/session per target table and their order of
execution.
Q: Is
there a way to copy a session with a map, when copying a map from repository to
repository? Say, copying from Development to Acceptance?
·
Not that
anyone is aware of. There is no direct straight forward method for
copying a session. This is the one downside to attempting to version control
by folder. You MUST re-create the session in Acceptance (UNLESS) you
backup the Development repository, and RESTORE it in to acceptance. This
is the only way to take all contents (and sessions) from one repository to
another. In this fashion, you are versioning all of the repository at
once. With the repository BINARY you can then check this whole binary in
to PVCS or some other outside version control system. However, to
recreate the session, the best method is to: bring up Development folder/repo,
side by side with Acceptance folder/repo - then modify the settings in
Acceptance as necessary.
Q: Can I
set Informatica up for Target flat file, and target relational database?
·
Up
through PowerMart 4.6.2, PowerCenter 1.6.2 this cannot be done in a single
map. The best method for this is to stay relational with your first map,
add a table to your database that looks exactly like the flat file (1 for 1
with the flat file), target the two relational tables. Then, construct
another map which simply reads this "staging" table and dumps it to
flat file. You can batch the maps together as sequential.
Q: How
can you optimize use of an Oracle Sequence Generator?
·
In order
to optimize the use of an Oracle Sequence Generator you must break up you
map. The generic method for calling a sequence generator is to
encapsulate it in a stored procedure. This is typically slow - and kills
the performance. Your version of Informatica's tool should contain
maplets to make this easier. Break the map up in to inserts only,
and updates only. The suggested method is as follows: 1) Create a staging
table - bring the data in straight from the flat file in to the staging
table. 2) Create a maplet with the current logic in it. 3) create
one INSERT map, and one Update map (separate inserts from updates) 4)
create a SOURCE called: DUAL, containing the fields: DUMMY char(1), NEXTVAL
NUMBER(15,0), CURRVAL number(15,0), 5) Copy the source in to your INSERT map,
6) delete the Source Qualifier for "dummy" 7) copy the
"nextval" port in to the original source qualifier (the one that
pulls data from the staging table) 8) Over-ride the SQL in the original source
qualifier, (generate it, then change DUAL.NEXTVAL to the sequence name:
SQ_TEST.NEXTVAL. 9) Feed the "nextval" port through the
mapplet. 10) Change the where clause on the SQL over-ride to select only
the data from the staging table that doesn't exist in the parent target (to be
inserted. This is extremely fast, and will allow your inserts only map to
operate at incredibly high throughput while using an Oracle Sequence
Generator. Be sure to tune your indexes on the Oracle tables so that
there is a high read throughput.
Q: Why
can't I over-ride the SQL in a lookup, and make the lookup non-cached?
·
Apparently
Informatica hasn't made this feature available yet in their tool. It's a
shame - it would simplify the method for pulling Oracle Sequence numbers from
the database. For now - it's simply not implemented.
Q: Does
it make a difference if I push all my ports (fields) through an expression, or
push only the ports which are used in the expression?
·
From the
work that has been done - it doesn't make much of an impact on the overall
speed of the map. If the paradigm is to push all ports through the
expressions for readability then do so, however if it's easier to push the
ports around the expression (not through it), then do so.
Q: What
is the affect of having multiple expression objects vs one expression object
with all the expressions?
·
Less
overall objects in the map make the map/session run faster. Consolodating
expressions in to a single expression object is most helpful to throughput -
but can increase the complexity (maintenance). Read the question/answer
about execution cycles above for hints on how to setup a large expression like
this.