Friday 15 April 2016

INFORMATICA INTERVIEW QUESTIONS AND ANSWERS

NOTE: SOME ANSWERS MAY BE WRONG/INACCURATE
------------------------------------------------------------------------------

1. While importing the relational source definition from a database, what metadata of the source do you import?

A. Source name, Database location, Column names, Data types, and Key constraints

 

2. In how many ways can you update a relational source definition, and what are they?

A.   1. Edit the definition

       2. Reimport the definition

 

3. Where should you place the flat file to import the flat file definition into the Designer?

A. Place it in a local folder.

 

4. To provide support for mainframe source data, which files are used as source definitions?

A. COBOL files.

 

5. Which transformation do you need when using COBOL sources as source definitions?

A. The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.

 

6. How can you create or import a flat file definition into the Warehouse Designer?

A. You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica server runs the session, it creates and loads the flat file.

 

7. What is a mapplet?

A. A mapplet is a set of transformations that you build in the Mapplet Designer and can reuse in multiple mappings.

 

8. What is a transformation?

A. It is a repository object that generates, modifies, or passes data.

 

9. What are the Designer tools for creating transformations?

A. Mapping Designer, Transformation Developer, Mapplet Designer.

 

10. What are active and passive transformations?

A. An active transformation can change the number of rows that pass through it. A passive transformation does not change the number of rows that pass through it.

 

11. What are connected and unconnected transformations?

A. An unconnected transformation is not connected to other transformations in the mapping. A connected transformation is connected to other transformations in the mapping.

 

12. In how many ways can you create ports?

A.   1. Drag the port from another transformation.
     2. Click the Add button on the Ports tab.

 

14. What are reusable transformations?

A. Reusable transformations can be used in multiple mappings. When you need to incorporate a reusable transformation into a mapping, you add an instance of it to the mapping. Later, if you change the definition of the transformation, all instances of it inherit the changes. Since an instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect those changes. This feature can save you a great deal of work.

 

 

15. What are the methods for creating reusable transformations?

A. Two methods:
     1. Design it in the Transformation Developer.
     2. Promote a standard transformation from the Mapping Designer. After you add a transformation to a mapping, you can promote it to the status of a reusable transformation. Once you promote a standard transformation to reusable status, you cannot demote it back to a standard transformation. If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button.

 

16. What are the unsupported repository objects for a mapplet?

A.    COBOL source definitions
      Joiner transformations
      Normalizer transformations
      Non-reusable Sequence Generator transformations
      Pre- or post-session stored procedures
      Target definitions
      PowerMart 3.5-style LOOKUP functions
      XML source definitions
      IBM MQ source definitions

 

17. What are mapping parameters and mapping variables?

A mapping parameter represents a constant value that you can define before running a session. A mapping parameter retains the same value throughout the entire session. When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, then define the value of the parameter in a parameter file for the session.

Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica server saves the value of a mapping variable to the repository at the end of a session run and uses that value the next time you run the session.
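For illustration, a minimal sketch of how a mapping parameter might be used; the parameter name $$State and the filter condition below are hypothetical, not part of the original answer. After declaring $$State under Mappings > Parameters and Variables in the Designer, a Filter transformation condition could reference it as

     STATE = '$$State'

and the session's parameter file would then supply the value (see the sample file under question 88).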

 

18. Can you use the mapping parameters or variables created in one mapping in another mapping?

A. No. We can use mapping parameters or variables only in transformations of the same mapping or mapplet in which the parameters or variables were created.

 

 

19. Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?

A. Yes, because a reusable transformation is not contained within any mapplet or mapping.

 

20. How can you improve session performance in an Aggregator transformation?

A. Use sorted input.

 

21. What is the aggregate cache in an Aggregator transformation?

A. The aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation. If the Informatica server requires more space, it stores overflow values in cache files.

 

22. What are the differences between the Joiner transformation and the Source Qualifier transformation?

A. You can join heterogeneous data sources in a Joiner transformation, which you cannot achieve in a Source Qualifier transformation. You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner transformation. Two relational sources must come from the same data source in a Source Qualifier, whereas you can join relational sources coming from different data sources in a Joiner transformation.

 

23. In which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?

Both pipelines begin with the same original data source.
Both input pipelines originate from the same Source Qualifier transformation.
Both input pipelines originate from the same Normalizer transformation.
Both input pipelines originate from the same Joiner transformation.
Either input pipeline contains an Update Strategy transformation.
Either input pipeline contains a connected or unconnected Sequence Generator transformation.

 

24. What are the settings that you use to configure the Joiner transformation?

A. Master and detail sources, type of join, and condition of the join.

 

25. What are the join types in a Joiner transformation?

A. Normal (default), master outer, detail outer, full outer.

Master outer: All rows from the detail source and only matching rows from the master.
Detail outer: All rows from the master source and only matching rows from the detail.
Full outer: Keeps all rows of data from both the master and detail sources.

 

26. What are the joiner caches?

A. When a Joiner transformation occurs in a session, the Informatica server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins.

 

27. What is the Lookup transformation?

A. Use a Lookup transformation in your mapping to look up data in a relational table, view, synonym, or flat file. The Informatica server queries the lookup table based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup table column values based on the lookup condition.

 

28. Why use the Lookup transformation?

To get a related value. For example, your source table includes an employee ID, but you want to include the employee name in your target table to make your summary data easier to read.

To perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).

To update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.

 

29. What are the types of lookup?

A. Connected and unconnected.

 

30. Differences between connected and unconnected lookup?

Input: A connected Lookup receives input values directly from the pipeline. An unconnected Lookup receives input values from the result of a :LKP expression in another transformation.

Cache type: A connected Lookup can use a dynamic or static cache. An unconnected Lookup can use a static cache only.

Cache contents: A connected Lookup caches all lookup columns used in the mapping (that is, lookup source columns included in the lookup condition and lookup source columns linked as output ports to other transformations). An unconnected Lookup caches all lookup/output ports in the lookup condition and the lookup/return port.

Return values: A connected Lookup can return multiple columns from the same row or insert into the dynamic lookup cache. An unconnected Lookup designates one return port (R) and returns one column from each row.

No match: If there is no match for the lookup condition, a connected Lookup returns the default value for all output ports; if you configure dynamic caching, the PowerCenter Server inserts the row into the cache or leaves it unchanged. An unconnected Lookup returns NULL.

Match: If there is a match for the lookup condition, a connected Lookup returns the result of the lookup condition for all lookup/output ports; if you configure dynamic caching, the PowerCenter Server either updates the row in the cache or leaves it unchanged. An unconnected Lookup returns the result of the lookup condition into the return port.

Output: A connected Lookup passes multiple output values to another transformation by linking lookup/output ports to it. An unconnected Lookup passes one output value; the lookup/output/return port passes the value to the transformation calling the :LKP expression.

Default values: A connected Lookup supports user-defined default values. An unconnected Lookup does not.
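As a small sketch of the unconnected style, an output port in an Expression transformation calls the lookup with a :LKP expression; the lookup and port names here are hypothetical:

     -- output port EMP_NAME in an Expression transformation
     :LKP.lkp_get_employee_name(EMPLOYEE_ID)

The lookup lkp_get_employee_name would compare EMPLOYEE_ID in its lookup condition and designate one port as the return port (R); the calling expression receives that single return value.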
 
 

                            

           

31. What is meant by lookup caches?

A. The Informatica server builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Informatica server stores condition values in the index cache and output values in the data cache.

 

 

32. What are the types of lookup caches?

Persistent cache: You can save the lookup cache files and reuse them the next time the Informatica server processes a Lookup transformation configured to use the cache.

Recache from database: If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache.

Static cache: You can configure a static, or read-only, cache for any lookup table. By default, the Informatica server creates a static cache. It caches the lookup table and looks up values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica server does not update the cache while it processes the Lookup transformation.

Dynamic cache: If you want to cache the target table and insert new rows into the cache and the target, you can configure the Lookup transformation to use a dynamic cache. The Informatica server dynamically inserts data into the target table.

Shared cache: You can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping.

 

 

33. Difference between static cache and dynamic cache?

Uncached lookup:
     You cannot insert or update the cache.
     You cannot use a flat file lookup.
     When the condition is true, the PowerCenter Server returns a value from the lookup table or cache. When the condition is not true, it returns the default value for connected transformations and NULL for unconnected transformations.

Static cache:
     You cannot insert or update the cache.
     You can use a relational or a flat file lookup.
     When the condition is true, the PowerCenter Server returns a value from the lookup table or cache. When the condition is not true, it returns the default value for connected transformations and NULL for unconnected transformations.

Dynamic cache:
     You can insert or update rows in the cache as you pass rows to the target.
     You can use a relational lookup only.
     When the condition is true, the PowerCenter Server either updates rows in the cache or leaves the cache unchanged, depending on the row type; this indicates that the row is in the cache and target table, and you can pass updated rows to the target table. When the condition is not true, the PowerCenter Server either inserts rows into the cache or leaves the cache unchanged, depending on the row type; this indicates that the row is not in the cache or target table, and you can pass inserted rows to the target table.

34. Which transformation should we use to normalize COBOL and relational sources?

A. The Normalizer transformation. When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source.

 

 

35. How does the Informatica server sort string values in a Rank transformation?

A. When the Informatica server runs in ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.

 

36. What are the rank caches?

A. During the session, the Informatica server compares an input row with rows in the data cache. If the input row out-ranks a stored row, the Informatica server replaces the stored row with the input row. The Informatica server stores group information in an index cache and row data in a data cache.

 

 

37. What is the RANKINDEX in a Rank transformation?

A. The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica server uses the RANKINDEX port to store the ranking position for each record in a group. For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1 to 5.

 

38. What is the Router transformation?

A. A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. However, a Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. A Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group. If you need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task.

 

39. What are the types of groups in a Router transformation?

A. Input group and output groups.

The Designer copies property information from the input ports of the input group to create a set of output ports for each output group.

There are two types of output groups: user-defined groups and the default group. You cannot modify or delete the default group.

 

40. Why do we use the Stored Procedure transformation?

A. For populating and maintaining databases.

 

42. What are the types of data that pass between the Informatica server and a stored procedure?

A. Three types of data:
                     Input/output parameters
                     Return values
                     Status codes

 

43. What is the status code?

A. The status code provides error handling for the Informatica server during the session. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. The user cannot see this value; it is only used by the Informatica server to determine whether to continue running the session or stop.

 

44. What is the Source Qualifier transformation?

A. When you add a relational or flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the records that the Informatica server reads when it runs a session.

 

45. What are the tasks that the Source Qualifier performs?

     Join data originating from the same source database.
     Filter records when the Informatica server reads source data.
     Specify an outer join rather than the default inner join.
     Specify sorted ports.
     Select only distinct values from the source.
     Create a custom query to issue a special SELECT statement for the Informatica server to read source data (an example of a generated query appears below).
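As a sketch of how a few of these tasks shape the generated SQL, the default query for a Source Qualifier joining two tables, with a user-defined source filter and sorted ports, might look like the following; the table and column names are invented for the example:

     SELECT EMPLOYEES.EMP_ID, EMPLOYEES.EMP_NAME, DEPARTMENTS.DEPT_NAME
     FROM EMPLOYEES, DEPARTMENTS
     WHERE DEPARTMENTS.DEPT_ID = EMPLOYEES.DEPT_ID
       AND EMPLOYEES.STATUS = 'ACTIVE'
     ORDER BY EMPLOYEES.EMP_ID

The join predicate comes from the primary key-foreign key link, the extra predicate from the source filter, and the ORDER BY from the sorted-ports setting.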

 

46. What is the target load order?

A. You specify the target load order based on the source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica server loads data into the targets.

 

47. What is the default join that the Source Qualifier provides?

A. Inner equijoin.

 

48. What are the basic needs to join two sources in a Source Qualifier?

A. The two sources should have a primary key-foreign key relationship, and they should have matching data types.

 

49. What is the Update Strategy transformation?

A. This transformation is used to maintain history data, or just the most recent changes, in a target table.

 

50. Describe the two levels at which the update strategy can be set.

Within a session: When you configure a session, you can instruct the Informatica server to either treat all records in the same way (for example, treat all records as inserts) or use instructions coded into the session mapping to flag records for different database operations.

Within a mapping: Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.

 

51. What is the default source option for the Update Strategy transformation?

A. Data driven.

 

 

52. What is data driven?

A. The Informatica server follows the instructions coded into the Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject. If you do not choose the data driven option, the Informatica server ignores all Update Strategy transformations in the mapping.
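For example, a typical Update Strategy expression flags a row for insert or update depending on whether a prior lookup found it in the target; the port name LKP_CUSTOMER_KEY is a hypothetical illustration:

     IIF(ISNULL(LKP_CUSTOMER_KEY), DD_INSERT, DD_UPDATE)

DD_INSERT, DD_UPDATE, DD_DELETE, and DD_REJECT are the built-in constants the server acts on when the session is set to data driven.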

 

53. What are the options in the target session for the Update Strategy transformation?

A. Insert, Delete, Update (update as update, update as insert, update else insert), and Truncate table.

 

 

54. What are the types of mapping wizards provided in Informatica?

 

The Designer provides two mapping wizards to help you create mappings quickly and easily. Both wizards are designed to create mappings for loading and maintaining star schemas, a series of dimensions related to a central fact table.

 

Getting Started Wizard. Creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.

Slowly Changing Dimensions Wizard. Creates mappings to load slowly changing  dimension tables based on the amount of historical dimension data you want to keep and  the method you choose to handle historical dimension data.

 

 

55. What are the types of mappings in the Getting Started Wizard?

Simple pass-through mapping: Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.

Slowly growing target: Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.

 

56. What are the mappings that we use for slowly changing dimension tables?

Type 1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension data. Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.

Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table. Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.

Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions into the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica server saves existing data in different columns of the same row and replaces the existing data with the updates.

 

 

57. What are the different types of Type 2 dimension mappings?

Type 2 Dimension/Version Data mapping: In this mapping, an updated dimension in the source is inserted into the target along with a new version number, and a newly added dimension in the source is inserted into the target with a primary key.

Type 2 Dimension/Flag Current mapping: This mapping is also used for slowly changing dimensions. In addition, it creates a flag value for a changed or new dimension. The flag indicates whether the dimension is new or newly updated. Recent dimensions are saved with a current flag value of 1, and updated dimensions are saved with the value 0.

Type 2 Dimension/Effective Date Range mapping: This is also one flavor of Type 2 mapping used for slowly changing dimensions. This mapping also inserts both new and changed dimensions into the target, and changes are tracked by the effective date range for each version of each dimension.
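A small worked illustration of the version-data flavor, with invented sample rows (the wizard-generated columns are named along the lines of PM_PRIMARYKEY and PM_VERSION_NUMBER; the exact key values depend on the mapping): if customer 100 moves from Boston to Chicago, the target keeps both rows:

     PM_PRIMARYKEY   PM_VERSION_NUMBER   CUSTOMER_ID   CITY
     6500            0                   100           Boston
     6501            1                   100           Chicago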

 

 

58. How can you recognize whether or not the newly added rows in the source get inserted into the target?

A. In the Type 2 mapping we have three options to recognize the newly added rows: version number, flag value, and effective date range.

 

 

59. What are the two types of processes that the Informatica server uses to run a session?

Load Manager process: Starts the session, creates the DTM process, and sends post-session email when the session completes.

DTM process: Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.

 

60. What are the new features of the Server Manager in Informatica 5.0?

You can use command-line arguments for a session or batch. This allows you to change the values of session parameters, and mapping parameters and mapping variables.

Parallel data processing: This feature is available for PowerCenter only. If you use the Informatica server on an SMP system, you can use multiple CPUs to process a session concurrently.

Process session data using threads: The Informatica server runs the session in two processes, as explained in the previous question.

 

61. Can you generate reports in Informatica?

A. Yes. By using the Metadata Reporter we can generate reports in Informatica.

    

62. What is the Metadata Reporter?

A. It is a web-based application that enables you to run reports against repository metadata. With the Metadata Reporter, you can access information about your repository without having knowledge of SQL, the transformation language, or the underlying tables in the repository.

 

 

63. Define mapping and session.

Mapping: A set of source and target definitions linked by transformation objects that define the rules for data transformation.
Session: A set of instructions that describe how and when to move data from sources to targets.

 

64. Which tool do you use to create and manage sessions and batches, and to monitor and stop the Informatica server?

A. The Informatica Server Manager.

 

 

65. Why do we use partitioning of sessions in Informatica?

A. Partitioning improves session performance by reducing the time needed to read the source and load the data into the target.

 

66. To achieve session partitioning, what are the necessary tasks you have to do?

A. Configure the session to partition source data, and install the Informatica server on a machine with multiple CPUs.

 

67. How does the Informatica server increase session performance through partitioning the source?

A. For relational sources, the Informatica server creates multiple connections, one for each partition of a single source, and extracts a separate range of data for each connection. The Informatica server reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica server creates multiple connections to the target and loads partitions of data concurrently.

For XML and file sources, the Informatica server reads multiple files concurrently. For loading the data, the Informatica server creates a separate file for each partition of a source file. You can choose to merge the targets.

 

68. Why do you use repository connectivity?

A. Each time you edit or schedule a session, the Informatica server communicates directly with the repository to check whether or not the session and users are valid. All the metadata of sessions and mappings is stored in the repository.

 

69. What are the tasks that the Load Manager process performs?

Manages session and batch scheduling: When you start the Informatica server, the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica server. When you configure a session, the Load Manager maintains a list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform validations and verifications prior to starting the DTM process.

Locking and reading the session: When the Informatica server starts a session, the Load Manager locks the session in the repository. Locking prevents you from starting the session again while it is already running.

Reading the parameter file: If the session uses a parameter file, the Load Manager reads the parameter file and verifies that the session-level parameters are declared in the file.

Verifying permissions and privileges: When the session starts, the Load Manager checks whether or not the user has the privileges to run the session.

Creating log files: The Load Manager creates a log file containing the status of the session.

 

 

70. What is the DTM process?

A. After the Load Manager performs validations for the session, it creates the DTM process. The DTM's job is to create and manage the threads that carry out the session tasks. It creates the master thread, and the master thread creates and manages all the other threads.

 

71. What are the different threads in the DTM process?

Master thread: Creates and manages all other threads.
Mapping thread: One mapping thread is created for each session. It fetches session and mapping information.
Pre- and post-session threads: Created to perform pre- and post-session operations.
Reader thread: One thread is created for each partition of a source. It reads data from the source.
Writer thread: Created to load data to the target.
Transformation thread: Created to transform data.

 

 

72. What are the data movement modes in Informatica?

A. The data movement mode determines how the Informatica server handles character data. You choose the data movement mode in the Informatica server configuration settings. Two data movement modes are available in Informatica:

       ASCII mode
       Unicode mode

 

73. What are the output files that the Informatica server creates while running a session?

Informatica server log: The Informatica server (on UNIX) creates a log for all status and error messages (default name: pm.server.log). It also creates an error log for error messages. These files are created in the Informatica home directory.

Session log file: The Informatica server creates a session log file for each session. It writes information about the session into the log file, such as the initialization process, creation of SQL commands for reader and writer threads, errors encountered, and the load summary. The amount of detail in the session log file depends on the tracing level that you set.

Session detail file: This file contains load statistics for each target in the mapping. Session details include information such as table name and the number of rows written or rejected. You can view this file by double-clicking on the session in the Monitor window.

Performance detail file: This file contains session performance details that help you determine where performance can be improved. To generate this file, select the performance detail option in the session property sheet.

Reject file: This file contains the rows of data that the writer does not write to targets.

Control file: The Informatica server creates a control file and a target file when you run a session that uses the external loader. The control file contains information about the target flat file, such as the data format and loading instructions for the external loader.

Post-session email: Post-session email allows you to automatically communicate information about a session run to designated recipients. You can create two different messages: one if the session completed successfully, the other if the session fails.

Indicator file: If you use a flat file as a target, you can configure the Informatica server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject.

Output file: If the session writes to a target file, the Informatica server creates the target file based on the file properties entered in the session property sheet.

Cache files: When the Informatica server creates a memory cache, it also creates cache files. The Informatica server creates index and data cache files for the following transformations:

      Aggregator transformation
      Joiner transformation
      Rank transformation
      Lookup transformation

 

74. In which circumstances does the Informatica server create reject files?

A. When it encounters DD_Reject in an Update Strategy transformation, when a row violates a database constraint, or when a field in a row is truncated or overflowed.

 

75. What is polling?

A. It displays updated information about the session in the Monitor window. The Monitor window displays the status of each session when you poll the Informatica server.

 

76. Can you copy a session to a different folder or repository?

A. Yes. By using the Copy Session wizard, you can copy a session to a different folder or repository. But the target folder or repository must contain the mapping of that session. If the target folder or repository does not have the mapping of the session being copied, you have to copy that mapping first before you copy the session.

 

77. What is a batch, and what are the types of batches?

A. A grouping of sessions is known as a batch. Batches are of two types:

Sequential: Runs sessions one after the other.
Concurrent: Runs sessions at the same time.

If you have sessions with source-target dependencies, use a sequential batch to start the sessions one after another. If you have several independent sessions, you can use concurrent batches, which run all the sessions at the same time.

 

78. Can you copy batches?     No.

 

79. How many sessions can you create in a batch?

Any number of sessions.

 

80. When does the Informatica server mark a batch as failed?

A. If one of its sessions is configured to "run if previous completes" and that previous session fails.

 

81. What command is used to run a batch?

A. pmcmd is used to start a batch.
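As an illustrative sketch only: the exact pmcmd syntax varies by version, and in later PowerCenter releases sessions and batches became tasks and workflows, started with pmcmd startworkflow. The service, domain, user, folder, and workflow names below are invented:

     pmcmd startworkflow -sv IntSvc_Dev -d Domain_Dev -u admin -p secret -f SalesFolder wf_load_sales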

 

82. What are the different options used to configure sequential batches?

A. Two options:
       Run the session only if the previous session completes successfully.
       Always run the session.

 

83. In a sequential batch, can you run a session if the previous session fails?

A. Yes, by setting the option to always run the session.

 

84. Can you start batches within a batch?

A. You cannot. If you want to start a batch that resides in a batch, create a new independent batch and copy the necessary sessions into the new batch.

 

85. Can you start a session inside a batch individually?

A. We can start the required session individually only in the case of a sequential batch; in the case of a concurrent batch we cannot do this.

 

86. How can you stop a batch?  By using the Server Manager or pmcmd.

 

87. What are the session parameters?

A. Session parameters are like mapping parameters; they represent values you might want to change between sessions, such as database connections or source files. The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:

       Database connections
       Source file name: Use this parameter when you want to change the name or location of a session source file between session runs.
       Target file name: Use this parameter when you want to change the name or location of a session target file between session runs.
       Reject file name: Use this parameter when you want to change the name or location of session reject files between session runs.

 

88. What is a parameter file?

A. A parameter file defines the values for parameters and variables used in a session. A parameter file is created with a text editor such as WordPad or Notepad; a sample file appears below. You can define the following values in a parameter file:

         Mapping parameters
         Mapping variables
         Session parameters
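A minimal sketch of such a file; the folder, session, and parameter names are invented, and the section-header format varies by version:

     [Production.s_m_load_orders]
     $DBConnection_Source=ORA_SRC
     $InputFile_Orders=/data/in/orders.dat
     $$LoadDate=04/15/2016

Each section names a folder and session, and the lines under it assign values to session parameters (names beginning with a single $) and mapping parameters or variables (names beginning with $$).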

 

89. How can you access a remote source in your session?

Relational source: To access a relational source situated in a remote place, you need to configure a database connection to the data source.

File source: To access a remote source file, you must configure an FTP connection to the host machine before you create the session.

Heterogeneous: When your mapping contains more than one source type, the Server Manager creates a heterogeneous session that displays source options for all types.

 

90. What is the difference between partitioning of relational targets and partitioning of file targets?

A. If you partition a session with a relational target, the Informatica server creates multiple connections to the target database to write target data concurrently. If you partition a session with a file target, the Informatica server creates one target file for each partition. You can configure the session properties to merge these target files.

 

91. What are the transformations that restrict the partitioning of sessions?

Advanced External Procedure transformation and External Procedure transformation: These transformations contain a check box on the Properties tab to allow partitioning.
Aggregator transformation: If you use sorted ports, you cannot partition the associated source.
Joiner transformation: You cannot partition the master source for a Joiner transformation.
Normalizer transformation.
XML targets.

 

92. Performance tuning in Informatica?

A. The goal of performance tuning is to optimize session performance so that sessions run during the available load window for the Informatica server. Increase session performance with the following:

The performance of the Informatica server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Thus network connections often affect session performance, so avoid network connections where possible.

Flat files: If your flat files are stored on a machine other than the Informatica server, move those files to the machine on which the Informatica server runs.

Relational data sources: Minimize the connections to sources, targets, and the Informatica server to improve session performance. Moving the target database onto the server system may improve session performance.

Staging areas: If you use staging areas, you force the Informatica server to perform multiple data passes. Removing staging areas may improve session performance.

You can run multiple Informatica servers against the same repository. Distributing the session load across multiple Informatica servers may improve session performance.

Running the Informatica server in ASCII data movement mode improves session performance, because ASCII data movement mode stores a character value in one byte, whereas Unicode mode takes two bytes to store a character.

If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single-table SELECT statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.

You can improve session performance by configuring the network packet size, which allows more data to cross the network at one time. To do this, go to the Server Manager and choose Server Configure Database Connections.

If your target has key constraints and indexes, they slow the loading of data. To improve session performance in this case, drop the constraints and indexes before you run the session and rebuild them after the session completes.

Running parallel sessions by using concurrent batches also reduces the time for loading the data, so concurrent batches may also increase session performance.

Partitioning the session improves session performance by creating multiple connections to sources and targets and loading data in parallel pipelines.

In some cases, if a session contains an Aggregator transformation, you can use incremental aggregation to improve session performance.

Avoid transformation errors to improve session performance.

If the session contains a Lookup transformation, you can improve session performance by enabling the lookup cache.

If your session contains a Filter transformation, create the Filter transformation as near to the sources as possible, or use a filter condition in the Source Qualifier.

Aggregator, Rank, and Joiner transformations may often decrease session performance, because they must group data before processing it. To improve session performance in this case, use the sorted ports option.

 

92. What is the difference between a mapplet and a reusable transformation?

A. A mapplet consists of a set of transformations that is reusable; a reusable transformation is a single transformation that can be reused.

If you create variables or parameters in a mapplet, they cannot be used in another mapping or mapplet. By contrast, variables created in a reusable transformation can be used in any other mapping or mapplet.

We cannot include source definitions in reusable transformations, but we can add sources to a mapplet.

The whole transformation logic is hidden in the case of a mapplet, but it is transparent in the case of a reusable transformation.

We cannot use COBOL source qualifier, Joiner, or Normalizer transformations in a mapplet, whereas we can make them reusable transformations.

 

93. Define the Informatica repository.

A. The Informatica repository is a relational database that stores information, or metadata, used by the Informatica server and client tools. Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica server to perform the transformations, and connect strings for sources and targets. The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.

Use the Repository Manager to create the repository. The Repository Manager connects to the repository database and runs the code needed to create the repository tables. These tables store metadata in a specific format that the Informatica server and client tools use.

 

94. What are the types of metadata stored in the repository?

The following are the types of metadata stored in the repository:

      Database connections
      Global objects
      Mappings
      Mapplets
      Multidimensional metadata
      Reusable transformations
      Sessions and batches
      Shortcuts
      Source definitions
      Target definitions
      Transformations

 

95. What is the PowerCenter repository?

      The PowerCenter repository allows you to share metadata across repositories to create   a data mart domain. In a data mart domain, you can create a single global repository to  store metadata used across an enterprise, and a number of local repositories to share   the global metadata as needed.

 

96. How can you work with a remote database in Informatica? Did you work directly by using remote connections?

A. To work with a remote data source, you need to connect to it with remote connections. But it is not preferable to work with a remote source directly by using remote connections; instead, bring the source onto the local machine where the Informatica server resides. If you work directly with a remote source, session performance decreases because less data passes across the network in a given time.

 

97. What are the new features in Informatica 5.0?

You can debug your mapping in the Mapping Designer.
You can view the workspace over the entire screen.
The Designer displays a new icon for invalid mappings in the Navigator window.
You can use a dynamic lookup cache in a Lookup transformation.
You can create mapping parameters or mapping variables in a mapping or mapplet to make mappings more flexible.
You can export objects to and import objects from the repository. When you export a repository object, the Designer or Server Manager creates an XML file to describe the repository metadata.
The Designer allows you to use the Router transformation to test data for multiple conditions. The Router transformation allows you to route groups of data to a transformation or target.
You can use XML data as a source or target.

Server enhancements:

You can use the command-line program pmcmd to specify a parameter file to run sessions or batches. This allows you to change the values of session parameters, and mapping parameters and variables, at runtime.

If you run the Informatica server on a symmetric multiprocessing system, you can use multiple CPUs to process a session concurrently. You configure partitions in the session properties based on source qualifiers. The Informatica server reads, transforms, and writes partitions of data in parallel for a single session. This is available for PowerCenter only.

The Informatica server creates two processes, the Load Manager process and the DTM process, to run sessions.

Metadata Reporter: A web-based application used to run reports against repository metadata.

You can copy a session across folders and repositories using the Copy Session wizard in the Informatica Server Manager.

With new email variables, you can configure post-session email to include information such as the mapping used during the session.

 

98. What is incremental aggregation?

A. When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you can capture the changes, you can configure the session to process only those changes. This allows the Informatica server to update your target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time you run the session.

 

99. What are the scheduling options to run a session?

A. You can schedule a session to run at a given time or interval, or you can manually run the session.

The different scheduling options:

        Run only on demand: The Informatica server runs the session only when the user starts the session explicitly.
        Run once: The Informatica server runs the session only once at a specified date and time.
        Run every: The Informatica server runs the session at regular intervals, as you configure.
        Customized repeat: The Informatica server runs the session at the dates and times specified in the Repeat dialog box.

 

100. What is the tracing level, and what are the types of tracing levels?

A. The tracing level represents the amount of information that the Informatica server writes in a log file.

Types of tracing levels: Normal, Verbose, Verbose init, Verbose data.

 

101. What is the difference between the Stored Procedure transformation and the External Procedure transformation?

A. In the case of the Stored Procedure transformation, the procedure is compiled and executed in a relational data source, and you need a database connection to import the stored procedure into your mapping. In the External Procedure transformation, the procedure or function is executed outside of the data source; that is, you need to make it a DLL to access it in your mapping. No database connection is needed for the External Procedure transformation.

 

102. Explain about recovering sessions.

A. If you stop a session or if an error causes a session to stop, refer to the session and error logs to determine the cause of failure. Correct the errors, and then complete the session. The method you use to complete the session depends on the properties of the mapping, session, and Informatica server configuration.

Use one of the following methods to complete the session:

           Run the session again if the Informatica server has not issued a commit.
           Truncate the target tables and run the session again if the session is not recoverable.
           Consider performing recovery if the Informatica server has issued at least one commit.

 

103. If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session the next time?

A. As explained above, the Informatica server has three methods for recovering sessions. Use perform recovery to load the records from the point where the session failed.

 

104. Explain about perform recovery.

A. When the Informatica server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to the target database. The Informatica server then reads all sources again and starts processing from the next row ID. For example, if the Informatica server commits 10,000 rows before the session fails, when you run recovery, the Informatica server bypasses the rows up to 10,000 and starts loading with row 10,001.

By default, Perform Recovery is disabled in the Informatica server setup. You must enable Recovery in the Informatica server setup before you run a session so the Informatica server can create and/or write entries in the OPB_SRVR_RECOVERY table.

 

105. How do you recover a standalone session?

A. A standalone session is a session that is not nested in a batch. If a standalone session fails, you can run recovery using a menu command or pmcmd. These options are not available for batched sessions.

To recover a session using the menu:
1. In the Server Manager, highlight the session you want to recover.
2. Select Server Requests-Stop from the menu.
3. With the failed session highlighted, select Server Requests-Start Session in Recovery Mode from the menu.

To recover a session using pmcmd:
1. From the command line, stop the session.
2. From the command line, start recovery.

 

106. How can you recover sessions in sequential batches?

A. If you configure a session in a sequential batch to stop on failure, you can run recovery starting with the failed session. The Informatica server completes the session and then runs the rest of the batch. Use the Perform Recovery session property.

To recover sessions in sequential batches configured to stop on failure:
1. In the Server Manager, open the session property sheet.
2. On the Log Files tab, select Perform Recovery, and click OK.
3. Run the session.
4. After the batch completes, open the session property sheet.
5. Clear Perform Recovery, and click OK.

If you do not clear Perform Recovery, the next time you run the session, the Informatica server attempts to recover the previous session. If you do not configure a session in a sequential batch to stop on failure, and the remaining sessions in the batch complete, recover the failed session as a standalone session.

 

107. How do you recover sessions in concurrent batches?

A. If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if a session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the session as a standalone session.

To recover a session in a concurrent batch:
1. Copy the failed session using Operations-Copy Session.
2. Drag the copied session outside the batch to be a standalone session.
3. Follow the steps to recover a standalone session.
4. Delete the standalone copy.

 

108. How can you complete unrecoverable sessions?

A. Under certain circumstances, when a session does not complete, you need to truncate the target tables and run the session from the beginning. Run the session from the beginning when the Informatica server cannot run recovery or when running recovery might result in inconsistent data.

 

109. What are the circumstances in which the Informatica server results in an unrecoverable session?

        The Source Qualifier transformation does not use sorted ports.
        You change the partition information after the initial session fails.
        Perform recovery is disabled in the Informatica server configuration.
        The sources or targets change after the initial session fails.
        The mapping consists of a Sequence Generator or Normalizer transformation.
        A concurrent batch contains multiple failed sessions.

 

 

 

110. If I make any modifications to my table in the back end, do they reflect in the Informatica Warehouse Designer, Mapping Designer, or Source Analyzer?

A. No. Informatica is not at all concerned with the back-end database; it displays only the information that is stored in the repository. If you want back-end changes reflected in the Informatica screens, you have to import the table from the back end into Informatica again using a valid connection, and replace the existing definitions with the imported ones.

========

·  What are the types of lookup?    
·  Connected look up and un-connected look up
·  Unconnected Lookup and Connected Lookup
·  connected and unconnected
·  1.connected 2.unconnected
·  connected

·  How can you recognise whether or not the newly added rows in the source are gets insert in the target?
·  In the type-2 mapping we have three options to recognise the newly added rows. i) Version Number ii) Flag Value iii) Effective Date Range

 

·  How can you create or import flat file definition in to the warehouse designer?
·  By giving server connection path

·  What is aggregate cache in aggregator transforamtion?  
·  r8ir678
·  IT IS A CACHE
·  Aggregate value will stored in data cache, grouped column value will stored in index cache

·  What are the settiings that you use to cofigure the joiner transformation?   
·  Master group flow detail group flow join condition type of join
·  take less no. of rows table as master table, more no of table as detail table and join condition. joiner will put all row from master table into chache and check condition with detail table rows.
·  How the informatica server increases the session performance through partitioning the source?
·  What is source qualifier transformation?)
·  SQ is an active tramsformation. It performs one of the following task: to join data from the same source database to filtr the rows when Power centre reads source data to perform an outer join to select only distinct values from the source

 

    
·  Why use the lookup transformation?
·  Used to look up data in a relational table or view.
·  in Inf7.1, we can get from flat file also
·  look up is used to perform one of the following task: -to get related value -to perform calculation -to update slowley changing dimension table

 
·  What is the look up transformation?
·  Used to look up data in a reational table or view.
·  Lookup is a passive transformation and used to look up data in a flat file or a relational table
·  What are the join types in joiner transformation?
·  Normal Join Master Join Detail Join Outer Join
·  the following are the join types Normal,MasterOuter,Detail Outer,Full Outer
  
·  What are the designer tools for creating tranformations?  
·  Source Analyzer, Warehouse Designer, Maping Designer
·  in Mapplet designer and transformation developer also we can create transformations

 

·  In which condtions we can not use joiner transformation(Limitaions of joiner transformation)?
 
·  no restrictions
·  you perform the following task before configuring the joiner transformation configure the transformation to use sorted data and define the join condition to recieve sorted data in the same order as the sort origin
·  In the conditions; Either input pipeline contains an Update Strategy transformation, You connect a Sequence Generator transformation directly before the Joiner transformation
·  What are the diffrence between joiner transformation and source qualifier transformation?
·  1. Source Qualifier Operates only with relational sources within the same schema. Joiner can have either heterogenous sources or relation sources in different schema 2. Source qualifier requires atleats one matching column to perform a join. Joiner joins based on matching port. 3. Additionally, Joiner requires two separate input pipelines and should not have an update strategy or Sequence generator (this is no longer true from Infa 7.2). Hope this helps. Thanks, Prasanna
·  How can you improve session performance in aggregator transformation?  
·  How can you improve session performance in aggregator transformation?
·  By using Incremental Aggregation
·  create the sorter transformation before the aggregator
·  sorted input
 
·  Can you use the maping parameters or variables created in one maping into any other reusable transformation?


·  Yes. Because reusable transformation is not contained with any mapplet or mapping.

 

·  What is meant by lookup caches?
·  Session will read all unique rows from the reference table/ file to fill the local buffer first; then for each row received from up-stream transformation, it tries to match them against the local buffer
   
·  Difference between static cache and dynamic cache?
 
·  Which transformation should we use to normalize the COBOL and relational sources?  
·  Normalizer Transformation. Thanks
·  What are the basic needs to join two sources in a source qualifier?   
·  Need matching Kesy and Should come from same datasource. Thanks, Siva
·  What is the default join that source qualifier provides?
·  Normal join

 
·  What is the target load order?
·  The order, based on the source qualifiers in a mapping, in which the Informatica Server loads data into the targets. (Normal and bulk are target load types, not the load order.)

·  What are the tasks that source qualifier performs?
·  Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.
·  Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
·  Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.
·  Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
·  Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
·  Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.
·  What is the status code?
·  The status code provides error handling for the Informatica Server during the session. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. This value cannot be seen by the user; it is only used by the Informatica Server to determine whether to continue running the session or stop.

·  What are the types of data that pass between the informatica server and stored procedure?
·  Input/output parameters, return values, and the status code.
·  Why we use stored procedure transformation?
·  It is used for error handling, to drop and recreate indexes, and for specialized calculations.

·  What are the types of groups in Router transformation?
·  Input group and output groups; output groups are either user-defined groups or the default group.
·  What is the Router transformation?
·  Router transformations are used to apply multiple conditions.
·  It is like a Filter transformation, except that we can use multiple conditions.
·  What is the Rankindex in Rank transformation?
·  A RANKINDEX port is created automatically whenever you use a Rank transformation; ranks are allocated depending on the rank port selected.
·  It is used to hold (store) the rank position in the Rank transformation.
·  What are the rank caches?
·  Can you use the mapping parameters or variables created in one mapping in another mapping?
·  No, you cannot use mapping parameters or variables created in one mapping in another mapping.

·  What are the mapping parameters and mapping variables?
·  A mapping parameter is a constant value defined before the session runs; a mapping variable can change during the session.
·  What are Sessions and Batches?
 
·  What is Data Transformation Manager?  
·  What is Load Manager?    
·  What is Router transformation?   
·  What is Code Page used for?  
·  A code page identifies the character set, so that characters in other encodings can be interpreted correctly.

·  What is Code Page Compatibility?  
·  What are two modes of data movement in Informatica Server?
·  Unicode data movement and ASCII data movement.
·  What are Dimensions and various types of Dimensions?  
 
·  What are various types of Aggregation?  
·  SUM, AVG, MAX, MIN, COUNT, etc.
·  What are Aggregate transformations?
·  The Aggregator transformation is used to perform calculations on groups of data.

 

·  How do you identify existing rows of data in the target table using lookup transformation?
 
·  What are Target Options on the Servers?  
·  What is a source qualifier?
·  Why do we use lookup transformations?
·  While importing the relational source definition from the database, what is the metadata of the source you import?
·  What are the unsupported repository objects for a mapplet?
·  What are the methods for creating reusable transformations?
·  What are the reusable transformations?

·  How does the informatica server sort the string values in Rank transformation?
·  By setting the session property to use a binary sort order. The server calculates the binary value of each string and, according to those binary values, returns the top (highest-valued) strings.


 

 

 
If you have 10 mappings designed and you need to implement some changes (maybe in an existing mapping, or a new mapping needs to be designed), how much time does it take, from the easiest case to the most complex?
What is the filename which you need to configure in UNIX while installing Informatica?
What happens if you increase commit intervals and also decrease commit intervals?
Explain grouped cross tab?
What is hash partition?
What is the approximate size of data warehouse?
What is data quality? How can a data quality solution be implemented into my informatica transformations, even internationally?
What is the difference between view and materialised view?
How are data models used in practice?
What is an MDDB? What is the difference between MDDBs and RDBMSs?
What is active and passive transformation?
Why do we use DSS database for OLAP tools?
What is update strategy and what are the options for update strategy?
What is data merging, data cleansing and sampling?
What is staging area?
What is difference between a connected look up and unconnected look up?

 


 RE: What is source qualifier transformation?

 
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.
·  Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.
·  Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
·  Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.
·  Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
·  Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
·  Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.
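As a hedged illustration of how these options change the generated SQL (the ORDERS table and its columns are hypothetical, not from the original document):

    -- Default query the Source Qualifier generates:
    SELECT ORDERS.ORDER_ID, ORDERS.CUSTOMER_ID, ORDERS.AMOUNT
    FROM ORDERS

    -- With a source filter, Select Distinct, and one sorted port,
    -- the Informatica Server rewrites it along these lines:
    SELECT DISTINCT ORDERS.ORDER_ID, ORDERS.CUSTOMER_ID, ORDERS.AMOUNT
    FROM ORDERS
    WHERE ORDERS.AMOUNT > 0
    ORDER BY ORDERS.ORDER_ID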


What are the differences between the joiner transformation and the source qualifier transformation?

You can join heterogeneous data sources in a Joiner transformation, which we cannot achieve in a Source Qualifier transformation.
You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner.
Two relational sources should come from the same data source in a Source Qualifier; with a Joiner you can also join relational sources coming from different sources.

 

Source qualifier – Homogeneous source

Joiner – Heterogeneous source

The Source Qualifier transformation provides an alternate way to filter rows. Rather than filtering rows from within a mapping, the Source Qualifier transformation filters rows when read from a source. The main difference is that the source qualifier limits the row set extracted from a source, while the Filter transformation limits the row set sent to a target. Since a source qualifier reduces the number of rows used throughout the mapping, it provides better performance.

However, the Source Qualifier transformation only lets you filter rows from relational sources, while the Filter transformation filters rows from any type of source. Also, note that since it runs in the database, you must make sure that the filter condition in the Source Qualifier transformation only uses standard SQL.

What are the transformations that restrict the partitioning of sessions?

Advanced External Procedure transformation and External Procedure transformation: these
transformations contain a check box on the Properties tab to allow partitioning.

Aggregator transformation: if you use sorted ports, you cannot partition the associated source.

Joiner transformation: you cannot partition the master source for a Joiner transformation.

Normalizer transformation.

XML targets.

 


What are Data driven Sessions?

The Informatica Server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject. If you do not choose the Data Driven option setting, the Informatica Server ignores all Update Strategy transformations in the mapping.

Once you load the data into your DW, you can treat rows with the following options in your session properties:

1. Insert, 2. Update, 3. Delete, and Data Driven; all these options are present in your session properties. If you select the Data Driven option, Informatica takes the logic to update, delete, or reject data from the Update Strategy transformation in your Designer mapping. It will look something like this:

IIF( JOB = 'MANAGER', DD_DELETE, DD_INSERT ) marks jobs with an ID of MANAGER for deletion, and all other items for insertion.
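A slightly fuller hedged sketch of a data-driven expression, assuming hypothetical CUST_ID and UPDATED_FLAG ports, showing all four decision constants:

    IIF( ISNULL(CUST_ID), DD_REJECT,
         IIF( UPDATED_FLAG = 'Y', DD_UPDATE,
              IIF( JOB = 'MANAGER', DD_DELETE, DD_INSERT )))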

What are the unsupported repository objects for a mapplet?

COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session stored procedures
Target definitions
PowerMart 3.5-style LOOKUP functions
XML source definitions
IBM MQ source definitions


 RE: What r the unsupported repository objects for a ma...

 
·  Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
·  Target definitions. Definitions of database objects or files that contain the target data.
·  Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
·  Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
·  Reusable transformations. Transformations that you can use in multiple mappings.
·  Mapplets. A set of transformations that you can use in multiple mappings.
·  Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

The following answer is from the Informatica help documentation:

·  You cannot include the following objects in a mapplet:

  • Normalizer transformations
  • Cobol sources
  • XML Source Qualifier transformations
  • XML sources
  • Target definitions
  • Pre- and post- session stored procedures
  • Other mapplets

What is the difference between a mapplet and a reusable transformation?

A mapplet consists of a set of transformations that is reusable; a reusable transformation is a
single transformation that can be reused.

Variables or parameters created in a mapplet cannot be used in another mapping or mapplet, unlike the variables created in a reusable transformation, which can be used in any other mapping or mapplet.

We cannot include source definitions in reusable transformations, but we can add sources to a mapplet.

The whole transformation logic is hidden in the case of a mapplet, but it is transparent in the case of a reusable transformation.

We cannot use COBOL source qualifier, Joiner, or Normalizer transformations in a mapplet, whereas we can make them reusable transformations.

Mapplet: one or more transformations.

Reusable transformation: only one transformation.

 

Why do we use partitioning of the session in Informatica?

Partitioning improves session performance by reducing the time needed to read the source and load the data into the target.

Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline.

The Informatica Server can achieve high performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.

 

=======
 
 
What are the mapping parameters and mapping variables?
A mapping parameter represents a constant value that you can define before running a session.
A mapping parameter retains the same value throughout the entire session.
When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, then define the value of the parameter in a parameter file for the session.
Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica Server saves the value of a mapping variable to the repository at the end of the session run and uses that value the next time you run the session.
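As a hedged illustration (the parameter and variable names are hypothetical), a mapping parameter such as $$StartDate can be referenced in a Source Qualifier source filter, and a mapping variable can be advanced with a variable function whose result is saved to the repository at the end of the run:

    -- source filter using a mapping parameter:
    SALES_DATE >= TO_DATE('$$StartDate', 'MM/DD/YYYY')

    -- expression port advancing a mapping variable each run:
    SETMAXVARIABLE($$LastOrderId, ORDER_ID)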
 
Can you use the mapping parameters or variables created in one mapping in another mapping?
NO.
We can use mapping parameters or variables in any transformation of the same mapping or mapplet in which you created them.
 
Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?
Yes, because a reusable transformation is not contained within any mapplet or mapping.
 
What is aggregate cache in aggregator transformation?
The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.
 
What are the differences between the joiner transformation and the source qualifier transformation?
You can join heterogeneous data sources in a Joiner transformation, which we cannot achieve in a Source Qualifier transformation.
You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner.
Two relational sources should come from the same data source in a Source Qualifier; with a Joiner you can also join relational sources coming from different sources.
 
What are the settings that you use to configure the joiner transformation?
Master and detail source
Type of join
Condition of the join
 
What are the join types in joiner transformation?
Normal (Default)
Master outer
Detail outer
Full outer
 
What are the joiner caches?
When a Joiner transformation occurs in a session, the Informatica Server reads all the records from the master source and builds index and data caches based on the master rows.
After building the caches, the Joiner transformation reads records from the detail source and performs joins.
 
 
 
What is the look up transformation?
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym.
The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares the Lookup transformation port values to lookup table column values based on the lookup condition.
 
Why use the lookup transformation?
To perform the following tasks.
Get a related value. For example, if your source table includes employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.
 
Differences between connected and unconnected lookup?

Connected lookup:
Receives input values directly from the pipeline.
Can use a dynamic or static cache.
The cache includes all lookup columns used in the mapping.
Supports user-defined default values.

Unconnected lookup:
Receives input values from the result of a :LKP expression in another transformation.
Can use only a static cache.
The cache includes all lookup output ports in the lookup condition and the lookup/return port.
Does not support user-defined default values.
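For instance, a hedged sketch of calling an unconnected lookup from an Expression transformation port (the lookup name lkp_get_dept_name and its input port are hypothetical); the lookup returns the value of its designated return port:

    :LKP.lkp_get_dept_name(DEPT_ID)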
What is meant by lookup caches?
The Informatica Server builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Informatica Server stores condition values in the index cache and output values in the data cache.
 
What are the types of lookup caches?
Persistent cache: You can save the lookup cache files and reuse them the next time the Informatica Server processes a Lookup transformation configured to use the cache.

Recache from database: If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache.

Static cache: You can configure a static, or read-only, cache for any lookup table. By default the Informatica Server creates a static cache. It caches the lookup table and lookup values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica Server does not update the cache while it processes the Lookup transformation.

Dynamic cache: If you want to cache the target table and insert new rows into the cache and the target, you can configure a Lookup transformation to use a dynamic cache. The Informatica Server dynamically inserts data into the target table.

Shared cache: You can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping.
 
Difference between static cache and dynamic cache

Static cache:
You cannot insert rows into or update the cache.
The Informatica Server returns a value from the lookup table or cache when the condition is true. When the condition is not true, the Informatica Server returns the default value for connected transformations and NULL for unconnected transformations.

Dynamic cache:
You can insert rows into the cache as you pass them to the target.
The Informatica Server inserts rows into the cache when the condition is false. This indicates that the row is not in the cache or target table, and you can pass these rows to the target table.
 
Which transformation should we use to normalize the COBOL and relational sources?
The Normalizer transformation.
When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source.
 
How does the informatica server sort the string values in Rank transformation?
When the Informatica Server runs in the ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica Server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.
 
What are the rank caches?
During the session, the Informatica Server compares an input row with rows in the data cache. If the input row out-ranks a stored row, the Informatica Server replaces the stored row with the input row. The Informatica Server stores group information in an index cache and row data in a data cache.
 
What is the Rankindex in Rank transformation?
The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica Server uses the Rank Index port to store the ranking position for each record in a group. For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1 to 5.
 
What is the Router transformation?
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. However, a Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. A Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
If you need to test the same input data based on multiple conditions, use a Router Transformation in a mapping instead of creating multiple Filter transformations to perform the same task.
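As a hedged illustration (group names and conditions are hypothetical), one Router with two user-defined groups replaces two Filter transformations:

    Group HIGH_VALUE: AMOUNT > 10000
    Group LOW_VALUE:  AMOUNT <= 10000
    Default group:    receives rows that satisfy neither condition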
What are the types of groups in Router transformation?
Input group and output groups.
The Designer copies property information from the input ports of the input group to create a set of output ports for each output group.
There are two types of output groups:
User-defined groups
Default group
You cannot modify or delete the default group.
What are the types of data that pass between the informatica server and stored procedure?
Three types of data:
Input/output parameters
Return values
Status code
 
What is the status code?
The status code provides error handling for the Informatica Server during the session. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. This value cannot be seen by the user; it is only used by the Informatica Server to determine whether to continue running the session or stop.
 
What is source qualifier transformation?
When you add a relational or a flat file source definition to a mapping, you need to connect it to
a Source Qualifier transformation. The Source Qualifier transformation represents the records
that the Informatica Server reads when it runs a session.
 
What are the tasks that source qualifier performs?
Join data originating from the same source database.
Filter records when the Informatica Server reads source data.
Specify an outer join rather than the default inner join.
Specify sorted records.
Select only distinct values from the source.
Create a custom query to issue a special SELECT statement for the Informatica Server to read
source data.
 
What is the target load order?
You specify the target load order based on the source qualifiers in a mapping. If you have multiple
source qualifiers connected to multiple targets, you can designate the order in which the Informatica
Server loads data into the targets.
 
What are the basic needs to join two sources in a source qualifier?
The two sources should have a primary key - foreign key relationship.
The two sources should have matching data types.
 
Describe the two levels at which the update strategy can be set?
Within a session. When you configure a session, you can instruct the Informatica Server to either treat all records in the same way (for example, treat all records as inserts), or use instructions coded into the session mapping to flag records for different database operations.

Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.
 
What is Datadriven?
The Informatica Server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject. If you do not choose the Data Driven option setting, the Informatica Server ignores all Update Strategy transformations in the mapping.
 
 
What are the options in the target session of the update strategy transformation?
Insert
Delete
Update
Update as update
Update as insert
Update else insert
Truncate table
 
What are the types of mapping wizards provided in Informatica?
The Designer provides two mapping wizards to help you create mappings quickly and easily. Both wizards are designed to create mappings for loading and maintaining star schemas, a series of dimensions related to a central fact table.

Getting Started Wizard. Creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard. Creates mappings to load slowly changing dimension tables based on the amount of historical dimension data you want to keep and the method you choose to handle historical dimension data.
 
What are the types of mapping in the Getting Started Wizard?
Simple pass-through mapping:
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.

Slowly growing target:
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.
 
What are the mappings that we use for slowly changing dimension tables?
Type1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension data.
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.

Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.

Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updates.
 
What are the different types of Type 2 dimension mapping?
Type 2 Dimension/Version Data mapping: In this mapping, an updated dimension in the source gets inserted into the target along with a new version number, and a newly added dimension
in the source is inserted into the target with a primary key.

Type 2 Dimension/Flag Current mapping: This mapping is also used for slowly changing dimensions. In addition, it creates a flag value for each changed or new dimension.
The flag indicates whether the dimension is new or newly updated. Recent dimensions are saved with a current flag value of 1, and updated dimensions are saved with the value 0.

Type 2 Dimension/Effective Date Range mapping: This is another flavour of Type 2 mapping used for slowly changing dimensions. This mapping also inserts both new and changed dimensions into the target, and changes are tracked by the effective date range for each version of each dimension.
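An illustrative sketch of Type 2 versioning (table contents and column names are hypothetical): the same business key appears once per version, and the changed row is inserted rather than overwritten:

    CUSTOMER_ID  ADDRESS         VERSION
    101          12 Old Street   0
    101          98 New Avenue   1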
 
How can you recognise whether or not newly added rows in the source get inserted into the target?
In the Type 2 mapping we have three options to recognise the newly added rows:
Version number
Flag value
Effective date range
 
What are the two types of processes that run an informatica session?
Load Manager process: starts the session, creates the DTM process, and sends post-session email when the session completes.
The DTM process: creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
 
What are the new features of the Server Manager in Informatica 5.0?
You can use command line arguments for a session or batch. This allows you to change the values of session parameters, mapping parameters, and mapping variables.

Parallel data processing: This feature is available for PowerCenter only. If you run the Informatica Server on an SMP system, you can use multiple CPUs to process a session concurrently.

Process session data using threads: The Informatica Server runs the session in two processes, as explained in the previous question.
 
Can you generate reports in Informatica?
Yes, by using the Metadata Reporter we can generate reports in Informatica.

What is the Metadata Reporter?
It is a web-based application that enables you to run reports against repository metadata.
With the Metadata Reporter, you can access information about your repository without knowledge of SQL, the transformation language, or the underlying tables in the repository.
 
Define mapping and sessions?
Mapping: a set of source and target definitions linked by transformation objects that define the rules for transformation.
Session: a set of instructions that describe how and when to move data from sources to targets.

Which tool do you use to create and manage sessions and batches, and to monitor and stop the informatica server?
The Informatica Server Manager.
 
Why do we use partitioning of the session in informatica?
Partitioning improves session performance by reducing the time needed to read the source and load the data into the target.
 
To achieve session partitioning, what are the necessary tasks you have to do?
Configure the session to partition source data.

Install the Informatica Server on a machine with multiple CPUs.
 
How does the informatica server increase session performance through partitioning the source?
For relational sources, the Informatica Server creates multiple connections, one for each partition of a single source, and extracts a separate range of data through each connection. The Informatica Server reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica Server creates multiple connections to the target and loads partitions of data concurrently.

For XML and file sources, the Informatica Server reads multiple files concurrently. For loading the data, the Informatica Server creates a separate file for each partition of a source file. You can choose to merge the target files.
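A hedged sketch of what this looks like for a relational source with two key-range partitions on a hypothetical ORDER_ID column; the server issues the two queries concurrently:

    -- partition 1
    SELECT ORDER_ID, AMOUNT FROM ORDERS
    WHERE ORDER_ID >= 1 AND ORDER_ID < 50000
    -- partition 2
    SELECT ORDER_ID, AMOUNT FROM ORDERS
    WHERE ORDER_ID >= 50000 AND ORDER_ID < 100000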
 
Why do you use repository connectivity?
Each time you edit or schedule a session, the Informatica Server communicates directly with the repository to check whether the session and users are valid. All the metadata of sessions and mappings is stored in the repository.
 
What is the DTM process?
After the Load Manager performs validations for the session, it creates the DTM process. The DTM's job is to create and manage the threads that carry out the session tasks. It creates the master thread, which in turn creates and manages all the other threads.
 
What are the different threads in the DTM process?
Master thread: creates and manages all other threads.

Mapping thread: one mapping thread is created for each session; it fetches session and mapping information.

Pre- and post-session threads: created to perform pre- and post-session operations.

Reader thread: one thread is created for each partition of a source; it reads data from the source.

Writer thread: created to load data to the target.

Transformation thread: created to transform data.
 
What are the data movement modes in informatica?
The data movement mode determines how the Informatica Server handles character data. You choose the data movement mode in the Informatica Server configuration settings. Two data movement modes are available in Informatica:

ASCII mode
Unicode mode
What are the output files that the informatica server creates during a session run?
Informatica server log: The Informatica Server (on UNIX) creates a log for all status and error messages (default name: pm.server.log). It also creates an error log for error messages. These files are created in the Informatica home directory.

Session log file: The Informatica Server creates a session log file for each session. It writes information about the session into the log file, such as the initialization process, creation of SQL commands for reader and writer threads, errors encountered, and the load summary. The amount of detail in the session log file depends on the tracing level that you set.

Session detail file: This file contains load statistics for each target in the mapping. Session details include information such as table name and the number of rows written or rejected. You can view this file by double-clicking on the session in the Monitor window.

Performance detail file: This file contains session performance details which help you determine where performance can be improved. To generate this file, select the performance detail option in the session property sheet.

Reject file: This file contains the rows of data that the writer does not write to targets.

Control file: The Informatica Server creates a control file and a target file when you run a session that uses the external loader. The control file contains information about the target flat file, such as the data format and loading instructions for the external loader.

Post-session email: Post-session email allows you to automatically communicate information about a session run to designated recipients. You can create two different messages: one if the session completes successfully, the other if the session fails.

Indicator file: If you use a flat file as a target, you can configure the Informatica Server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject.

Output file: If the session writes to a target file, the Informatica Server creates the target file based on the file properties entered in the session property sheet.

Cache files: When the Informatica Server creates a memory cache, it also creates cache files. The Informatica Server creates index and data cache files for the following transformations:

Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation
 
In which circumstances does the informatica server create reject files?
When it encounters DD_REJECT in an Update Strategy transformation.
When a row violates a database constraint.
When a field in a row is truncated or overflows.
 
What is polling?
It displays the updated information about the session in the Monitor window. The Monitor window displays the status of each session when you poll the Informatica Server.
 
Can you copy a session to a different folder or repository?
Yes, by using the Copy Session Wizard you can copy a session to a different folder or repository. But the
target folder or repository must contain the mapping for that session.
If the target folder or repository does not have the mapping of the session being copied,
you have to copy that mapping first, before you copy the session.
 
What is a batch, and describe the types of batches?
A grouping of sessions is known as a batch. Batches are of two types:
Sequential: runs sessions one after the other.
Concurrent: runs sessions at the same time.

If you have sessions with source-target dependencies, you have to use a sequential batch to start the
sessions one after another. If you have several independent sessions, you can use concurrent batches,
which run all the sessions at the same time.
 
What are the session parameters?
Session parameters are like mapping parameters; they represent values you might want to change between
sessions, such as database connections or source files.

The Server Manager also allows you to create user-defined session parameters. The following are user-defined
session parameters:
Database connections
Source file name: use this parameter when you want to change the name or location of a
session source file between session runs.
Target file name: use this parameter when you want to change the name or location of a
session target file between session runs.
Reject file name: use this parameter when you want to change the name or location of
session reject files between session runs.
 
What is a parameter file?

A parameter file defines the values for parameters and variables used in a session. A parameter
file is a file created with a text editor such as WordPad or Notepad.
You can define the following values in a parameter file:
Mapping parameters
Mapping variables
Session parameters
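A minimal hedged sketch of a parameter file (the folder, workflow, session, and parameter names are hypothetical; $$ marks mapping parameters and variables, $ marks session parameters):

    [MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
    $$StartDate=04/15/2016
    $DBConnection_Source=ORA_SRC_DB
    $InputFile1=/data/incoming/sales.dat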
 
How can you access a remote source in your session?
Relational source: to access a relational source situated in a remote place, you need to
configure a database connection to the data source.
File source: to access a remote source file, you must configure an FTP connection to the
host machine before you create the session.
Heterogeneous: when your mapping contains more than one source type, the Server Manager creates
a heterogeneous session that displays source options for all types.
 
What is the difference between partitioning of relational targets and partitioning of file targets?
If you partition a session with a relational target, the Informatica Server creates multiple connections
to the target database to write target data concurrently. If you partition a session with a file target,
the Informatica Server creates one target file for each partition. You can configure session properties
to merge these target files.
 
 
Performance tuning in Informatica?
The goal of performance tuning is to optimize session performance so that sessions run during the available load window for the Informatica Server. Increase session performance with the following:

The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Network connections therefore often affect session performance, so avoid network hops where possible.

Flat files: If your flat files are stored on a machine other than the Informatica Server, move those files to the machine hosting the Informatica Server.
Relational data sources: Minimize the connections to sources, targets, and the Informatica Server to
improve session performance. Moving the target database onto the server system may improve session
performance.
Staging areas: If you use staging areas, you force the Informatica Server to perform multiple data passes.
Removing staging areas may improve session performance.

You can run multiple Informatica Servers against the same repository. Distributing the session load across multiple Informatica Servers may improve session performance.
Running the Informatica Server in ASCII data movement mode improves session performance, because ASCII data movement mode stores a character value in one byte, whereas Unicode mode takes 2 bytes to store a character.

If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single-table SELECT statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.
We can improve session performance by configuring the network packet size, which controls how much data crosses the network at one time. To do this, go to the Server Manager and choose Server Configure Database Connections.

If your target contains key constraints and indexes, they slow the loading of data. To improve session performance in this case, drop the constraints and indexes before you run the session and rebuild them after the session completes.
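For example, a hedged sketch of the pre- and post-session SQL for this (index and table names are hypothetical):

    -- pre-session SQL: drop the index before the load
    DROP INDEX idx_sales_customer;

    -- post-session SQL: rebuild it after the load completes
    CREATE INDEX idx_sales_customer ON sales (customer_id);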

Running parallel sessions by using concurrent batches also reduces the data loading time, so concurrent
batches may also increase session performance.

Partitioning the session improves session performance by creating multiple connections to sources and targets and loading data in parallel pipelines.

In some cases, if a session contains an Aggregator transformation, you can use incremental aggregation to improve session performance.

Avoid transformation errors to improve session performance.

If the session contains a Lookup transformation, you can improve session performance by enabling the lookup cache.

If your session contains a Filter transformation, create that Filter transformation nearer to the sources,
or use a filter condition in the Source Qualifier.

Aggregator, Rank, and Joiner transformations often decrease session performance because they must group data before processing it. To improve session performance in this case, use the sorted ports option.
 
 
Define the informatica repository?
The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.

Use the Repository Manager to create the repository. The Repository Manager connects to the repository database and runs the code needed to create the repository tables. These tables
store metadata in the specific format that the Informatica Server and client tools use.
 
How can you work with a remote database in informatica? Did you work directly by using remote connections?
To work with a remote data source, you need to connect to it with remote connections. But it is not
preferable to work with a remote source directly using remote connections; instead, bring that source onto the local machine where the Informatica Server resides. If you work directly with a remote source, session performance decreases because less data can pass across the network in a given amount of time.
 
What is incremental aggregation?
When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you can capture changes, you can configure the session to process only those changes. This allows the Informatica Server to update your target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time you run the session
 
What is tracing level and what are the types of tracing level?
The tracing level represents the amount of information that the Informatica Server writes to a log file.
Types of tracing level:
Terse
Normal
Verbose initialization
Verbose data
 
If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session the next time?
The Informatica Server has three methods for recovering sessions. Use the Perform Recovery option to load the records starting from where the session failed.
 
Explain perform recovery?
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to the target database. The Informatica Server then reads all sources again and starts processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the session fails, when you run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.
 
What is data cleansing?
The process of finding and removing or correcting data that is incorrect, out-of-date, redundant, incomplete, or formatted incorrectly.
 
 

ETL

 
What is a staging area? Do we need it? What is the purpose of a staging area?
A staging area is a place where you hold temporary tables on the data warehouse server. Staging tables are connected to the work area or fact tables. We basically need a staging area to hold the data and perform data cleansing and merging before loading the data into the warehouse.
 
What is a three-tier data warehouse?
A three-tier data warehouse contains three tiers: a bottom tier, a middle tier, and a top tier.
The bottom tier deals with retrieving related data or information from various information repositories by using SQL.
The middle tier contains two types of servers:
1. ROLAP server
2. MOLAP server
The top tier deals with presentation or visualization of the results.
 

                                               

 

General

 
What is a Data Warehouse
Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated....This makes it much easier and more efficient to run queries over data that originally came from different sources.
Typical relational databases are designed for on-line transactional processing (OLTP) and do not meet the requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed differently than traditional relational databases.
What are Data Marts
Data Mart is a segment of a data warehouse that can provide data for reporting and analysis on a section, unit, department or operation in the company, e.g. sales, payroll, production. Data marts are sometimes complete individual data warehouses which are usually smaller than the corporate data warehouse.
 
What is a Star Schema
A relational database schema organized around a central table (fact table) joined to a few smaller tables (dimension tables) using foreign key references. The fact table contains raw numeric items that represent relevant business facts (price, discount values, number of units sold, dollar value, etc.)
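To make the structure concrete, a hedged sketch of a star-schema fragment (table and column names are hypothetical):

    CREATE TABLE dim_product (
        product_id   INT PRIMARY KEY,
        product_name VARCHAR(100)
    );

    CREATE TABLE fact_sales (
        sale_date    DATE,
        product_id   INT REFERENCES dim_product (product_id),
        units_sold   INT,
        dollar_value DECIMAL(12,2)
    );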
 
What is Dimensional Modelling
In dimensional modeling, data is stored in two kinds of tables: fact tables and dimension tables.
A fact table contains fact data, e.g. sales, revenue, profit, etc.
A dimension table contains dimensional data such as product ID, product name, product description, etc.
 
 
 
What is a Snowflake Schema
Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data has been grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product_category table, and a product_manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance.
 
What is the Difference between OLTP and OLAP
OLTP 
Current data 
Short database transactions 
Online update/insert/delete 
Normalization is promoted 
High volume transactions 
Transaction recovery is necessary 
 
 
OLAP 
Current and historical data 
Long database transactions 
Batch update/insert/delete 
Denormalization is promoted 
Low volume transactions 
Transaction recovery is not necessary 
 
What is a Fact table
The central table in a star schema, joined to the dimension tables by foreign keys. Its entries are raw numeric items that represent relevant business facts (price, discount values, number of units sold, dollar value, etc.).

What is data mining
Data mining is the process of extracting hidden trends from a data warehouse. For example, an insurance data warehouse can be used to mine data for the highest-risk people to insure in a certain geographical area.
 
What is ODS
ODS stands for Operational Data Store.

It is used to maintain and store current, up-to-date information and transactions from the source databases, taken from the OLTP system.

It is directly connected to the source database systems instead of to the staging area.

It is further connected to data warehouse and moreover can be treated as a part of the data warehouse database.
 
What is a level of Granularity of a fact table
Level of granularity means the level of detail that you put into the fact table in a data warehouse. For example, based on the design you can decide to put the sales data of each transaction into the fact table. The level of granularity then determines how much detail you are willing to keep for each transactional fact: product sales with respect to each minute, or aggregated up to the minute.
 
Why are OLTP database designs not generally a good idea for a Data Warehouse
OLTP systems cannot store historical information about the organization. They are used for storing the details of daily transactions, while a data warehouse is a huge store of historical information obtained from different data marts, used for making intelligent decisions about the organization.
=====
 
 
what are mapping parameters and varibles in which ...

 
Mapping parameters have a constant value throughout the session,
whereas mapping variables' values change; the Informatica Server saves those values in the repository and uses them the next time you run the session.
 
 
 
what is difference between lookup cache and unchas...

 
The difference between cached and uncached lookup is this: when you configure the Lookup transformation as a cached lookup, it stores all the lookup table data in the cache when the first input record enters the Lookup transformation. In a cached lookup the SELECT statement executes only once, and the values of each input record are compared with the values in the cache. In an uncached lookup, the SELECT statement executes for each input record entering the Lookup transformation, so it has to connect to the database each time a new record enters.
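A hedged sketch of the difference in issued SQL (table and column names are hypothetical):

    -- cached lookup: one query when the first input row arrives,
    -- results held in the cache
    SELECT DEPT_ID, DEPT_NAME FROM DEPT ORDER BY DEPT_ID, DEPT_NAME

    -- uncached lookup: one query per input row
    SELECT DEPT_NAME FROM DEPT WHERE DEPT_ID = <value from current row>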
 
 
 
what are partition points?

 
Partition points mark the thread boundaries in a source pipeline and divide
the pipeline into stages.
 
 
 
Without using Update Strategy and session options,...

 
By default, all the rows in the session are flagged as insert. You can change this in the session's general properties by setting "Treat source rows as" to Update,
so all the incoming rows are flagged as update; you can then update the rows in the target table.
 
 
 
What is the procedure or steps implementing versio...

 
Hi,
For version control in the ETL layer using Informatica, after doing anything in Designer mode or the Workflow Manager, follow these steps:
1. First save the changes or new implementations.
2. Then, from the Navigator window, right-click on the specific object you are currently in. A pop-up window appears; near its lower end you will find Versioning -> Check In. A window will be opened. Enter a note about what you have done, like "modified this mapping", then click the OK button.
OK... There is also another shortcut method for this, but I'll give that some other time.
 
 
If you have four lookup tables in the workflow. Ho...

 
There are many ways to improve a mapping which has multiple lookups:
1) We can create an index on the lookup table if we have permissions (staging area).
2) Divide the lookup mapping into two: (a) dedicate one to inserts (source - target, for new rows); only the new rows come into the mapping, so the process is fast. (b) Dedicate the second one to updates (source = target, for existing rows); only the rows which already exist come into the mapping.
3) We can increase the cache size of the lookup.

When a workflow has multiple lookup tables, use a shared cache.
 
 
In a sequential Batch how can we stop single sess...

 
We have a task called Event-Wait; using that we can stop a session.
We start it again using Event-Raise.
This is as per my knowledge.
 
 
 
hi all, recently i was asked by the interviewer li...

 
You can do one thing: after running the mapping, in the Workflow Manager,
              Start --------> Session.
Right-click on the session and you will get a menu; in it, go to Persistent Values. There you will find the last value stored in the repository for the mapping variable. Remove it and put your desired one, then run the session. I hope your task will be done.

You can also define the value of the variable in the parameter file. The PowerCenter Server searches for the value of the variable first in the parameter file, then for the value stored in the repository.
It takes the value of 51, but you can override the variable saved in the repository by defining the value in the parameter file. If there is a parameter file for the mapping variable, the server uses the value in the parameter file, not the value+1 in the repository; for example, you can assign the mapping variable the value 70. In other words, higher preference is given to the value in the parameter file.
 
 
 
 
 
How to use mapping parameters and what is their us...

 
Mapping parameters and variables make the use of mappings more flexible, and they also avoid the creation of multiple mappings. They help in adding incremental data. Mapping parameters and variables have to be created in the Mapping Designer by choosing the menu option Mappings -> Parameters and Variables, entering a name for the variable or parameter (it has to be preceded by $$), and choosing the type (parameter/variable) and datatype. Once defined, the variable/parameter can be used in any expression, for example in the source filter properties of a Source Qualifier transformation: just enter the filter condition. Finally, create a parameter file to assign the value for the variable/parameter and configure the session properties. The final step is optional, however: if the parameter file is not present, the server uses the initial value assigned at the time the variable was created.
 
 
 
What is the use of incremental aggregation? Explain...

 
It is a session option. When the Informatica Server performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform the new aggregation calculations incrementally. We use it for performance.
 
 
 
How to lookup the data on multiple tables.

 
If you want to look up data on multiple tables at a time, you can do one thing: join the tables you want, then look up the joined result. Informatica provides lookup on joined tables; hats off to Informatica.

When you create a Lookup transformation, Informatica asks for a table name; you can choose Source, Target, Import, or Skip.
So click Skip and then use the SQL Override property in the Properties tab to join the two tables for the lookup.
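A hedged sketch of such a lookup SQL override (table, alias, and column names are hypothetical):

    SELECT E.EMP_ID    AS EMP_ID,
           E.EMP_NAME  AS EMP_NAME,
           D.DEPT_NAME AS DEPT_NAME
    FROM   EMP E
           INNER JOIN DEPT D ON E.DEPT_ID = D.DEPT_ID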
 
 
How to retrieve the records from a rejected file. e...

 
During the execution of the workflow, all the rejected rows are stored in bad files (under the Informatica Server installation, e.g. \Program Files\Informatica PowerCenter 7.1\Server). These bad files can be imported as a flat file source, and then through a direct mapping we can load these rows in the desired format.
 
 
What are variable ports and list two situations wh...

 
A variable port is used to break a complex expression into simpler ones,
and it is also used to store intermediate values.
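A hedged sketch of both uses in an Expression transformation (port names are hypothetical; the v_ prefix for variable ports is only a common convention):

    v_LINE_TOTAL (variable) = PRICE * QUANTITY
    v_DISCOUNTED (variable) = v_LINE_TOTAL * (1 - DISCOUNT_RATE)
    o_NET_AMOUNT (output)   = ROUND(v_DISCOUNTED, 2)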
 
 
What is difference between IIF and DECODE function...

 
You can use DECODE in conditional columns as well, while we cannot use IIF there (but you can use CASE). Also, retrieving data with DECODE is quick.
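For comparison, a hedged sketch of equivalent expressions (the port name and thresholds are hypothetical): DECODE evaluates its search/result pairs in order, while IIF must be nested:

    DECODE(TRUE,
           SALES > 100, 'HIGH',
           SALES > 50,  'MEDIUM',
                        'LOW')

    IIF(SALES > 100, 'HIGH', IIF(SALES > 50, 'MEDIUM', 'LOW'))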
 
 
which is better among connected lookup and unconne...

 
It is not an easy question to say which is better out of connected and unconnected lookups; it depends upon our experience and upon the requirement.
When you compare both: basically, a connected lookup can return multiple values, while an unconnected lookup returns one value. A connected lookup is in the same pipeline as the source and it accepts dynamic caching. An unconnected lookup does not have that facility, but in some special cases we can use unconnected: if the output of one lookup goes as the input of another lookup, unconnected lookups are favourable.
If you have a well-defined source you can use connected; if the source is not well defined or comes from a different database, you can go for unconnected.
We are using it like that only.

I think the better one is the connected lookup, because we can use a dynamic cache with it; also, a connected lookup can send multiple columns in a single row, whereas an unconnected lookup has a single return port (as far as ETL with Informatica is concerned).
 
 
 
We are using Update Strategy Transformation in map...

 
Operation    Constant       Numeric value
Insert       DD_INSERT      0
Update       DD_UPDATE      1
Delete       DD_DELETE      2
Reject       DD_REJECT      3
 
 
Suppose session is configured with commit interval...

 
Source-based commit commits the data into the target based on the commit interval; so, for every 10,000 rows it commits into the target.
Target-based commit commits the data into the target based on the buffer size of the target, i.e. it commits the data into the target whenever the buffer fills. Let us assume that the buffer size is 6,000: then for every 6,000 rows it commits the data.
 
 
 
How do we estimate the number of partitions that a ...

 
It depends upon the Informatica version we are using. Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.
 
 
How do you decide whether you need to do aggregation...

 
It depends upon the requirement. If you have a database with good processing power you can create the aggregation table or view at the database level; otherwise it is better to use Informatica. Here is why you might still use Informatica:
Whatever it may be, Informatica is a third-party tool, so it takes more time to process aggregation compared to the database. But Informatica has an option called "incremental aggregation" which updates the stored values with current values + new values, so there is no need to process the entire data set again and again. This holds as long as nobody deletes the cache files; if that happens, the total aggregation has to be executed in Informatica again.
In the database we don't have an incremental aggregation facility.
 
 
How to join two tables without using the Joiner Tr...

 
A Joiner transformation is used to join n (n>1) tables from the same or different databases, whereas a Source Qualifier transformation can join n tables only from the same database.
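For example, the SQL override of a Source Qualifier over two hypothetical tables in the same database:

SELECT ORDERS.ORDER_ID, ORDERS.AMOUNT, CUSTOMERS.CUST_NAME
FROM ORDERS, CUSTOMERS
WHERE ORDERS.CUST_ID = CUSTOMERS.CUST_ID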
 
 
Briefly explain the Versioning Concept in Power Ce...

 
When you create a version of a folder referenced by shortcuts, all shortcuts continue to reference their original object in the original version. They do not automatically update to the current folder version.
For example, if you have a shortcut to a source definition in the Marketing folder, version 1.0.0, then you create a new folder version, 1.5.0, the shortcut continues to point to the source definition in version 1.0.0.
Maintaining versions of shared folders can result in shortcuts pointing to different versions of the folder. Though shortcuts to different versions do not affect the server, they might prove more difficult to maintain. To avoid this, you can recreate shortcuts pointing to earlier versions, but this solution is not practical for much-used objects. Therefore, when possible, do not version folders referenced by shortcuts.
 
 
 
 
In a filter expression we want to compare one date...

 
The DB2 date format is "yyyymmdd", whereas SYSDATE in Oracle gives "dd-mm-yy", so converting the DB2 date format to the local database date format is compulsory; otherwise u will get that type of error.
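A sketch of the conversion in the filter expression (the port name is hypothetical), turning the DB2-style string into a real date before comparing:

TO_DATE(DB2_DATE_STR, 'YYYYMMDD') = TRUNC(SESSSTARTTIME, 'DD')

Using TO_DATE with an explicit format avoids depending on either database's default date format.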
 
 
How to create the staging area in your database

 
A staging area in a DW is used as a temporary space to hold all the records from the source systems. So, more or less, it should be an exact replica of the source systems, except for the load strategy, where we use truncate and reload options.
So create the staging tables using the same layout as your source tables, or use the Generate SQL option in the Warehouse Designer.
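A sketch in Oracle syntax (the table name is hypothetical): clone the source layout with no rows, then truncate before each load:

CREATE TABLE STG_CUSTOMER AS SELECT * FROM CUSTOMER WHERE 1 = 2;
TRUNCATE TABLE STG_CUSTOMER;

The TRUNCATE can be issued as pre-load SQL or via the session's truncate target table option.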
 
 
whats the diff between Informatica powercenter ser...

 
The repository is a database in which all Informatica components are stored in the form of tables. The Repository Server controls the repository and maintains data integrity and consistency across the repository when multiple users use Informatica. The PowerCenter Server (Informatica Server) is responsible for execution of the components (sessions) stored in the repository.
 
 
What are the Differences between Informatica Power...

 
The main difference between Informatica 5.1 and 6.1 is that 6.1 introduced a new thing called the Repository Server, and in place of the Server Manager (5.1) it introduced the Workflow Manager and Workflow Monitor.
In version 7.x u have the option of looking up (lookup) on a flat file.
U can write to an XML target.
Versioning.
LDAP authentication.
Support for 64-bit architectures.
 
 
 
 
Differences between Informatica 6.2 and Informati...

 
Features in 7.1 are:
1. Union and Custom transformations
2. Lookup on flat files
3. Grid servers working on different operating systems can coexist on the same server
4. We can use pmrep from the command line
5. We can export independent and dependent repository objects
6. We can move mappings into any web application
7. Version controlling
8. Data profiling
 
What is the difference between connected and uncon...

 
Unconnected:
The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.
connected:
The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation
 
how to get the first 100 rows from the flat file i...

 
Please check this one:
task ----->(link) session (Workflow Manager)
Double-click on the link and type a condition on the session's predefined SrcSuccessRows variable, e.g. $s_my_session.SrcSuccessRows = 100 (the session task name here is hypothetical).
It should automatically stop the session.
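An alternative sketch inside the mapping itself (port names hypothetical): number the rows with a variable port in an Expression and cap them with a Filter:

v_ROWNUM (variable port) = v_ROWNUM + 1
o_ROWNUM (output port)   = v_ROWNUM
Filter condition:        o_ROWNUM <= 100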
Can we lookup a table from a source qualifier trans...

 
No, we can't.
I will explain why:
1) Unless you assign the output of the source qualifier to another transformation or to a target, there is no way it will include the field in the query.
2) The source qualifier doesn't have any variable fields to utilize in an expression.
 
What is the difference between Normal load and Bulk...

 
Normal load: normal load writes information to the database log file, which is helpful if any recovery is needed. When the source is a text file and you are loading data to a table, in such cases you should use normal load only, else the session will fail.
Bulk mode: bulk load does not write information to the database log file, so if any recovery is needed we can't do anything in such cases.
Comparatively, bulk load is quite a bit faster than normal load.
 
 
What are main advantages and purpose of using Normalizer Transformation in Informatica?

 
Normalizer Transformation is used mainly with COBOL sources where most of the time data is stored in de-normalized format. Also, Normalizer transformation can be used to create multiple rows from a single row of data 
 
 
On connected or unconnected transformations

 
Here is the deal:
A connected transformation is a part of your data flow in the pipeline, while an unconnected transformation is not.
It is much like calling a program by name versus by reference.
Use unconnected transformations when you want to call the same transformation many times in a single mapping.
 
 
 
 
 
 
What are 2 modes of data movement in Informatica Server?
 
The data movement mode depends on whether Informatica Server should process single byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.
a) Unicode - the Informatica Server allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters)
b) ASCII - the Informatica Server holds all data in a single byte
 
Explain the informatica Architecture in detail...

 
The Informatica Server connects to source data and target data using native ODBC drivers; it also connects to the repository for running sessions and retrieving metadata information.

source ------> informatica server ---------> target
                       |
                       |
                  REPOSITORY
 
 
How many ways you can update a relational source d...

 
In 2 ways we can do it:
  1) by reimporting the source definition
  2) by editing the source definition
 
 
 
 
 
 
 
 
 
How can U improve session performance in aggregator transformation?

 
Use sorted input:
1. Use a Sorter before the Aggregator.
2. Do not forget to check the Sorted Input option on the Aggregator, which tells the Aggregator that the input is sorted on the same keys as the group by.
The key order is also very important.
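A small configuration sketch (port names hypothetical):

Sorter keys:          DEPT_ID (ascending), ITEM_ID (ascending)
Aggregator group by:  DEPT_ID, ITEM_ID
Aggregator property:  Sorted Input = checked

The sort keys must match the group-by ports in the same order, or the session will fail or aggregate incorrectly.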
 
 
what is meant by lookup caches?
The Informatica Server builds a cache in memory when it processes the first row of data in a cached lookup transformation. It allocates memory for the cache based on the amount u configure in the transformation or session properties. The Informatica Server stores condition values in the index cache and output values in the data cache.
 
 
What r the types of lookup caches?
 
Persistent cache: U can save the lookup cache files and reuse them the next time the Informatica Server processes a lookup transformation configured to use the cache.

Recache from database: If the persistent cache is not synchronized with the lookup table, u can configure the lookup transformation to rebuild the lookup cache.

Static cache: U can configure a static or read-only cache for any lookup table. By default the Informatica Server creates a static cache. It caches the lookup table and lookup values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica Server does not update the cache while it processes the lookup transformation.

Dynamic cache: If u want to cache the target table and insert new rows into both the cache and the target, u can create a lookup transformation that uses a dynamic cache. The Informatica Server dynamically inserts the data into the target table.

Shared cache: U can share the lookup cache between multiple transformations. U can share an unnamed cache between transformations in the same mapping.
 
 
Difference between static cache and dynamic cache
 
 
Static cache:
- U can not insert or update the cache.
- The Informatica Server returns a value from the lookup table or cache when the condition is true. When the condition is not true, it returns the default value for connected transformations and NULL for unconnected transformations.

Dynamic cache:
- U can insert rows into the cache as u pass them to the target.
- The Informatica Server inserts rows into the cache when the condition is false. This indicates that the row is not in the cache or target table, and u can pass these rows to the target table.
 
 
 
 
What r the mappings that we use for slowly changing dimension table?
 
Type 1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension data.
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.

Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.

Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updates.
 
 
 
 
What r the two types of processes that informatica runs for a session?
 
Load Manager process: starts the session, creates the DTM process, and sends post-session email when the session completes.
DTM process: creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
 
 
Can u generate reports in Informatica?

 
Informatica is an ETL tool; you cannot build business reports from it, but you can generate a metadata report, which is not meant to be used for business analysis.
 
 
Why we use partitioning the session in informatica...

 
Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline.
The Informatica Server can achieve high performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.
 
 
What is parameter file?

 
When you start a workflow, you can optionally enter the directory and name of a parameter file. The Informatica Server runs the workflow using the parameters in the file you specify.
For UNIX shell users, enclose the parameter file name in single quotes:
-paramfile '$PMRootDir/myfile.txt'
For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes:
-paramfile "$PMRootDir\my file.txt"
Note: When you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where the variable is defined expands the server variable.
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt'
 
 
Define informatica repository?
 
The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.

Use the Repository Manager to create the repository. The Repository Manager connects to the repository database and runs the code needed to create the repository tables. These tables store metadata in a specific format that the Informatica Server and Client tools use.
 
 
What r the types of metadata that stores in reposi...

 
·  Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
·  Target definitions. Definitions of database objects or files that contain the target data.
·  Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
·  Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
·  Reusable transformations. Transformations that you can use in multiple mappings.
·  Mapplets. A set of transformations that you can use in multiple mappings.
·  Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.
 
 
 
 
 
 
 
What is Data Transformation Manager?
 
After the load manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run. The primary purpose of the DTM process is to create and manage threads that carry out the session tasks.
· The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. It creates the main thread, which is called the master thread. The master thread creates and manages all other threads.
· If we partition a session, the DTM creates a set of threads for each partition to allow concurrent processing. When the Informatica Server writes messages to the session log, it includes the thread type and thread ID. Following are the types of threads that the DTM creates:
Master thread - main thread of the DTM process; creates and manages all other threads.
Mapping thread - one thread for each session; fetches session and mapping information.
Pre- and post-session threads - one thread each to perform pre- and post-session operations.
Reader thread - one thread for each partition for each source pipeline.
Writer thread - one thread for each partition if a target exists in the source pipeline, to write to the target.
Transformation thread - one or more transformation threads for each partition.
 
Active/Passive Transformations
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.
Transformation | Type | Description
Advanced External Procedure | Active / Connected | Calls a procedure in a shared library or in the COM layer of Windows.
Aggregator | Active / Connected | Performs aggregate calculations.
Application Source Qualifier | Active / Connected | Represents the rows that the Informatica Server reads from an application, such as an ERP source, when it runs a workflow.
Expression | Passive / Connected | Calculates a value.
External Procedure | Passive / Connected or Unconnected | Calls a procedure in a shared library or in the COM layer of Windows.
Filter | Active / Connected | Filters data.
Input | Passive / Connected | Defines mapplet input rows. Available only in the Mapplet Designer.
Joiner | Active / Connected | Joins data from different databases or flat file systems.
Lookup | Passive / Connected or Unconnected | Looks up values.
Normalizer | Active / Connected | Source qualifier for COBOL sources. Can also be used in the pipeline to normalize data from relational or flat file sources.
Output | Passive / Connected | Defines mapplet output rows. Available only in the Mapplet Designer.
Rank | Active / Connected | Limits records to a top or bottom range.
Router | Active / Connected | Routes data into multiple transformations based on group conditions.
Sequence Generator | Passive / Connected | Generates primary keys.
Sorter | Active / Connected | Sorts data based on a sort key.
Source Qualifier | Active / Connected | Represents the rows that the Informatica Server reads from a relational or flat file source when it runs a workflow.
Stored Procedure | Passive / Connected or Unconnected | Calls a stored procedure.
Update Strategy | Active / Connected | Determines whether to insert, delete, update, or reject records.
XML Source Qualifier | Passive / Connected | Represents the rows that the Informatica Server reads from an XML source when it runs a workflow.
======
Q: How do I connect job streams/sessions or batches across folders? (30 October 2000)
·         For quite a while there's been a deceptive problem with sessions in the Informatica repository.  For management and maintenance reasons, we've always wanted to separate mappings, sources, targets, in to subject areas or functional areas of the business.  This makes sense until we try to run the entire Informatica job stream.  Understanding of course that only the folder in which the map has been defined can house the session.  This makes it difficult to run jobs / sessions across folders - particularly when there are necessary job dependencies which must be defined.  The purpose of this article is to introduce an alternative solution to this problem.  It requires the use of shortcuts. 
The basics are like this: Keep the map creations, sources, and targets subject oriented.  This allows maintenance to be easier (by subject area).  Then once the maps are done, change the folders to allow shortcuts (done from the repository manager).  Create a folder called: "MY_JOBS" or something like that.  Go in to designer, open "MY_JOBS", expand the source folders, and create shortcuts to the mappings in the source folders.
Go to the session manager, and create sessions for each of the short-cut mappings in MY_JOBS.  Then batch them as you see fit.  This will allow a single folder for running jobs and sessions housed anywhere in any folder across your repository.
Q: How do I get maximum speed out of my database connection? (12 September 2000)
·         In Sybase or MS-SQL Server, go to the Database Connection in the Server Manager.  Increase the packet size.  Recommended sizing depends on distance traveled from PMServer to Database - 20k is usually acceptable on the same subnet.  Also, have the DBA increase the "maximum allowed" packet size setting on the Database itself.  Following this change, the DBA will need to restart the DBMS.  Changing the Packet Size doesn't mean all connections will connect at this size, it just means that anyone specifying a larger packet size for their connection may be able to use it.  It should increase speed, and decrease network traffic.  Default IP Packets are between 1200 bytes and 1500 bytes.
In Oracle: there are two methods.  For connection to a local database, set up the protocol as IPC (between PMServer and a DBMS server hosted on the same machine).  IPC is not a protocol that can be utilized across networks (apparently).  IPC stands for Inter Process Communication, and utilizes memory piping (RAM) instead of client context, through the IP listener.  For remote connections there is a better way: LISTENER.ORA and TNSNAMES.ORA need to be modified to include SDU and TDU settings.  SDU = Session Data Unit, and TDU = Transport Data Unit.  Both of these specify packet sizing in Oracle connections over IP.  Default for Oracle is 1500 bytes.  Also note: these settings can be used in IPC connections as well, to control the IPC buffer sizes passed between two local programs (PMServer and Oracle Server).
 
 
 
Both the Server and the Client need to be modified.  The server will allow packets up to the max size set - but unless the client specifies a larger packet size, the server will default to the smallest setting (1500 bytes).  Both SDU and TDU should be set the same. See the example below:
TNSNAMES.ORA
LOC = (DESCRIPTION =
    (SDU = 20480) (TDU=20480)
LISTENER.ORA
LISTENER = ....
   .... (SID_DESC =
        (SDU = 20480) (TDU=20480) (SID_NAME = beqlocal) ....
Q: How do I get a Sequence Generator to "pick up" where another "left off"? (8 June 2000)
·         To perform this mighty trick, one can use an unconnected lookup on the Sequence ID of the target table.  Set the properties to "LAST VALUE"; the input port is an ID.  The condition is: SEQ_ID >= input_ID.  Then in an expression set up a variable port: connect a NEW self-resetting sequence generator to a new input port in the expression.  The variable port's expression should read: IIF( v_seq = 0 OR ISNULL(v_seq) = true, :LKP.lkp_sequence(1), v_seq).  Then set up an output port.  Change the output port's expression to read: v_seq + input_seq (from the resetting sequence generator).  Thus you have just completed an "append" without a break in sequence numbers.
Q: How do I query the repository to see which sessions are set in TEST MODE? (8 June 2000)
·         Run the following select:
select * from opb_load_session where bit_option = 13;
It's actually BIT # 2 in this bit_option setting, so if you have a mask, or a bit-level function you can then AND it with a mask of 2, if this is greater than zero, it's been set for test load.
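In an Oracle repository the same bit test can be written directly, as a sketch:

select * from OPB_LOAD_SESSION where BITAND(bit_option, 2) > 0;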
Q: How do I "validate" all my mappings at once?  (31 March 2000)
·         Issue the following command WITH CARE. 
UPDATE OPB_MAPPING SET IS_VALID = 1;
Then disconnect from the database, and re-connect.  In session manager, and designer as well.
Q: How do I validate my entire repository?  (12 September 2000)
·         To add the menu option, change this registry entry on your client.
HKEY_CURRENT_USER/Software/Informatica/PowerMart Client Tools/4.7/Repository Manager Options
add the following string Name: EnableCheckReposit Data: 1
Validate Repository forces Informatica to run through the repository, and check the repo for errors
Q: How do I work around a bug in 4.7?  I can't change the execution order of my stored procedures that I've imported? (31 March 2000)
·         Issue the following statements WITH CARE:
select widget_id from OPB_WIDGET where WIDGET_NAME = <widget name>
(write down the WIDGET ID)
select * from OPB_WIDGET_ATTR where WIDGET_ID = <widget_id>
update OPB_WIDGET_ATTR set attr_value = <execution order> where WIDGET_ID = <widget_id> and attr_id = 5
COMMIT;
The <execution order> is the number of the order in which you want the stored proc to execute.  Again, disconnect from both designer and session manager repositories, and re-connect to "re-read" the local cache.
Q:  How do I keep the session manager from "Quitting" when I try to open a session? (23 March 2000)
·         Informatica Tech Support has said: if you are using a flat file as a source, and your "file name" in the "Source Options" dialog is longer than 80 characters, it will "kill" the Session Manager tool when you try to re-open it.  You can fix the session by: logging in to the repository via SQLPLUS, or ISQL, and finding the table called: OPB_LOAD_SESSION, find the Session ID associated with the session name - write it down.  Then select FNAME from OPB_LOAD_FILES where Session_ID = <session_id>.  Change / update OPB_LOAD_FILES set FNAME= <new file name> column, change the length back to less than 80 characters, and commit the changes.  Now the session has been repaired.  Try to keep the directory to that source file in the DIRECTORY entry box above the file name box.  Try to keep all the source files together in the same source directory if possible.
Q: How do I repair a "damaged" repository? (16 March 2000)
·         There really isn't a good repair tool, nor is there a "great" method for repairing the repository.  However, I have some suggestions which might help.  If you're running in to a session which causes the session manager to "quit" on you when you try to open it, or you have a map that appears to have "bad sources", there may be something you can do.  There are varying degrees of damage to the repository - mostly caused because the sequence generator that PM/PC relies on is buried in a table in the repository - and they generate their own sequence numbers.  If this table becomes "corrupted" or generates the wrong sequences, you can get repository errors all over the place.  It can spread quickly.  Try the following steps to repair a repository: (USE AT YOUR OWN RISK) The recommended path is to backup the repository, send it to Technical Support - and tell them it's damaged.
1.      Delete the session, disconnect, re-connect, then re-create the session, then attempt to edit the new session again.  If the new session won't open up (srvr mgr quits), then there are more problems - PM/PC is not successfully attaching sources and targets to the session (SEE: OPB_LOAD_SESSION table (SRC_ID, TARGET_ID) columns - they will be zero, when they should contain an ID).
2.      Delete the session, then open the map.  Delete the source and targets from the MAP.  Save the map and invalidate it - forcing an update to the repository and its links.  Drag the sources and targets back in to the map and re-connect them.  Validate and Save.  Then try re-building the session (back to step one).  If there is still a failure, then there are more problems.
3.      Delete the session and the map entirely.  Save the repository changes - thus requesting a delete in the repository.  While the "delete" may occur - some of the tables in the repository may not be "cleansed".  There may still be some sources, targets, and transformation objects (reusable) left in the repository.  Rebuild the map from scratch - then save it again... This will create a new MAP ID in the OPB_MAPPING table, and force PM/PC to create new ID links to existing Source and Target objects (as well as all the other objects in the map).
4.      If that didn't work - you may have to delete the sources, reusable objects, and targets, as well as the session and the map.  Then save the repository - again, trying to "remove" the objects from the repository itself.  Then re-create them.  This forces PM/PC to assign new ID's to ALL the objects in the map, the map, and the session - hopefully creating a "good" picture of all that was rebuilt.
·         Or try this method:
1.      Create a NEW repository -> call it REPO_A (for reference only).
2.      Copy any of the MAPPINGS that don't have "problems" opening in their respective sessions, and copy the mappings (using designer) from the old repository (REPO_B) to the new repository (REPO_A).  This will create NEW ID's for all the mappings, CAUTION: You will lose your sessions.
3.      DELETE the old repository (REPO_B).
4.      Create a new repository in the OLD Repository Space (REPO_B)..
5.      Copy the maps back in to the original repository (Recreated Repository) From REPO_A to REPO_B.
6.      Rebuild the sessions, then re-create all of the objects you originally had trouble with.
·         You can apply this to FOLDER level and Repository Manager Copying, but you need to make sure that none of the objects within a folder have any problems.
·         What this does: creates new ID's, resets the sequence generator, re-establishes all the links to the objects in the tables, and drop's out (by process of elimination) any objects you've got problems with.
·         Bottom line: PM/PC client tools have trouble when the links between ID's get broken.  It's fairly rare that this occurs, but when it does - it can cause heartburn.
 
 
 
Q: How do I clear the locks that are left in the repository? (3 March 2000)
·   Clearing locks is typically a task for the repository manager.  Generally it's done from within the Repository Manager: Edit Menu -> Show Locks.  Select the locks, then press "remove".  Typically locks are left on objects when a client is rebooted without properly exiting Informatica.  These locks can keep others from editing the objects.  They can also keep scheduled executions from occurring.  It's not uncommon to want to clear the locks automatically - on a prescheduled time table, or at a specified time.  This can be done safely only if no-one has an object out for editing at the time of deletion of the lock.  The suggested method is to log in to the database from an automated script, and issue a "delete from OPB_OBJECT_LOCKS" table.
Q: How do I turn on the option for Check Repository? (3 March 2000)
·         According to Technical Support, it's only available by adjusting the registry entries on the client. PM/PC need to be told it's in Admin mode to work.  Below are the steps to turn on the Administration Mode on the client.  Be aware - this may be a security risk, anyone using that terminal will have access to these features.

1) Start the Repository Manager.
2) In the Repository menu, go to Check Repository.
3) If the option is not there, you need to edit your registry using regedit.
Go to: HKEY_CURRENT_USER >> SOFTWARE >> INFORMATICA >> PowerMart Client Tools >> Repository Manager Options.
Go to your specific version (4.5 or 4.6) and then go to Repository Manager. In there add two strings:
1) EnableAdminMode 1
2) EnableCheckReposit 1
Both should be spelled as shown; the value for both is 1.
Q: How do I generate an Audit Trail for my repository (ORACLE / Sybase) ?
·         Download one of two *USE AT YOUR OWN RISK* zip files.  The first is available now for PowerMart 4.6.x and PowerCenter 1.6x. It's a 7k zip file: Informatica Audit Trail v0.1a.  The other file (for 4.5.x) is coming.  Please note: this is FREE software that plugs in to ORACLE 7x, ORACLE 8x, and Oracle 8i.  It has NOT been built for Sybase, Informix, or DB2.  If someone would care to adapt it, and send it back to me, I'll be happy to post these also.  It has limited support - it has not been fully tested in a multi-user environment, and any feedback would be appreciated.  NOTE: SYBASE VERSION IS ON ITS WAY.
 
 
 
Q: How do I "tune" a repository?  My repository is slowing down after a lot of use, how can I make it faster?
·         In Oracle: Schedule a nightly job to ANALYZE TABLE for ALL INDEXES, creating histograms for the tables - keep the cost based optimizer up to date with the statistics.  In SYBASE: schedule a nightly job to UPDATE STATISTICS against the tables and indexes.  In Informix, DB2, and RDB, see your owners manuals about maintaining SQL query optimizer statistics.
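As a sketch for Oracle, a nightly script could run statements like these for each repository table (OPB_MAPPING is just one example):

ANALYZE TABLE OPB_MAPPING COMPUTE STATISTICS;
ANALYZE TABLE OPB_MAPPING COMPUTE STATISTICS FOR ALL INDEXED COLUMNS;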
Q: How do I achieve "best performance" from the Informatica tool set?
·         By balancing what Informatica is good at with what the databases are built for.  There are reasons for placing some code at the database level - particularly views, and staging tables for data.  Informatica is extremely good at reading/writing and manipulating data at very high rates of throughput.  However - to achieve optimum performance (in the Gigabyte to Terabyte range) there needs to be a balance of Tuning in Oracle, utilizing staging tables, views for joining source to target data, and throughput of manipulation in Informatica.  For instance: Informatica will never achieve the speeds of "append" or straight inserts that Oracle SQL*Loader, or Sybase BCP achieve.  This is because these two tools are written internally - specifically for the purposes of loading data (direct to tables / disk structures).  The API that Oracle / Sybase provide Informatica with is not nearly as equipped to allow this kind of direct access (to eliminate breakage when Oracle/Sybase upgrade internally).  The basics of Informatica are: 1) Keep maps as simple as possible 2) break complexity up in to multiple maps if possible 3) rule of thumb: one MAP per TARGET table 4) Use staging tables for LARGE sets of data 5) utilize SQL for its power of sorts, aggregations, parallel queries, temp spaces, etc... (setup views in the database, tune indexes on staging tables) 6) Tune the database - partition tables, move them to physical disk areas, etc... separate the logic.
Q: How do I get an Oracle Sequence Generator to operate faster?
·         The first item is: use a function to call it, not a stored procedure.  Then, make sure the sequence generator and the function are local to the SOURCE or TARGET database, DO NOT use synonyms to place either the sequence or function in a remote instance (synonyms to a separate schema/database on the same instance may be only a slight performance hit).  This should help - possibly double the throughput of generating sequences in your map.  The other item is: see slide presentations on performance tuning for your sessions / maps for a "best" way to utilize an Oracle sequence generator.  Believe it or not - the write throughput shown in the session manager per target table is directly affected by calling an external function/procedure which is generating sequence numbers.  It does NOT appear to affect the read throughput numbers.  This is a difficult problem to solve when you have low "write throughput" on any or all of your targets.  Start with the sequence number generator (if you can), and try to optimize the map for this.
 
 
 
Q: I have a mapping that runs for hours, but it's not doing that much. It takes 5 input tables, uses 3 joiner transformations, a few lookups, a couple expressions and a filter before writing to the target. We're running PowerMart 4.6 on an NT 4 box. What tuning options do I have?
·         Without knowing the complete environment, it's difficult to say what the problem is, but here's a few solutions with which you can experiment. If the NT box is not dedicated to PowerMart (PM) during its operation, identify what it contends with and try rescheduling things such that PM runs alone. PM needs all the resources it can get. If it's a dedicated box, it's a well known fact that PM consumes resources at a rapid clip, so if you have room for more memory, get it, particularly since you mentioned use of the joiner transformation. Also toy with the caching parameters, but remember that each joiner grabs the full complement of memory that you allocate. So if you give it 50Mb, the 3 joiners will really want 150Mb. You can also try breaking up the session into parallel sessions and put them into a batch, but again, you'll have to manage memory carefully because of the joiners. Parallel sessions is a good option if you  have a multiple-processor CPU, so if you have vacant CPU slots, consider adding more CPU's. If a lookup table is relatively big (more than a few thousand rows), try turning the cache flag off in the session and see what happens. So if you're trying to look up a "transaction ID" or something similar out of a few million rows, don't load the table into memory. Just look it up, but be sure the table has appropriate indexes. And last, if the sources live on a pretty powerful box, consider creating a view on the source system that essentially does the same thing as the joiner transformations and possibly some of the lookups. Take advantage of the source system's hardware to do a lot of the work before handing down the result to the resource constrained NT box.
Q: Is there a "best way" to load tables?
·         Yes - If all that is occurring is inserts (to a single target table) - then the BEST method of loading that target is to configure and utilize the bulk loading tools.  For Sybase it's BCP, for Oracle it's SQL*Loader.  With multiple targets, break the maps apart (see slides), one for INSERTS only, and remove the update strategies from the insert only maps (along with unnecessary lookups) - then watch the throughput fly.  We've achieved 400+ rows per second per table in to 5 target Oracle tables (Sun Sparc E4500, 4 CPU's, Raid  5, 2 GIG RAM, Oracle 8.1.5) without using SQL*Loader.  On an NT 366 mhz P3, 128 MB RAM, single disk, single target table, using SQL*Loader we've loaded 1 million rows (150 MB) in 9 minutes total - all the map had was one expression to left and right trim the ports (12 ports, each row was 150 bytes in length).  3 minutes for SQL*Loader to load the flat file - DIRECT, Non-Recoverable.
Q: How do I gauge that the performance of my map is acceptable?
·         If you have a small file (under 6MB) and you have pmserver on a Sun Sparc 4000, Solaris 5.6, 2 cpu's, 2 gigs RAM,  (baseline configuration - if your's is similar you'll be ok).  For NT: 450 MHZ PII 128 MB RAM (under 3 MB file size), then it's nothing to worry about unless your write throughput is sitting at 1 to 5 rows per second.  If you are in this range, then your map is too complex, or your tables have not been optimized.  On a baseline defined machine (as stated above), expected read throughput will vary - depending on the source, write throughput for relational tables (tables in the database) should be upwards of 150 to 450+ rows per second.  To calculate the total write throughput, add all of the rows per second for each target together, run the map several times, and average the throughput.  If your map is running "slow" by these standards, then see the slide presentations to implement a different methodology for tuning.  The suggestion here is: break the map up - 1 map per target table, place common logic in to maplets.
Q: How do I create a “state variable”?
·         Create a variable port in an expression (v_MYVAR), set the data type to Integer (for this example), and set the expression to: IIF( ( ISNULL(v_MYVAR) = true or v_MYVAR = 0 ) [ and <your condition> ], 1, v_MYVAR).  What happens here is that upon initialization Informatica may set v_MYVAR to NULL, or zero.  The first time this code is executed it is set to "1".  Of course - you can set the variable to any value you wish - and carry that through the transformations.  Also - you can add your own AND condition (as indicated in brackets), and only set the variable when a specific condition has been met.  The variable port will hold its value for the rest of the transformations.  This is a good technique to use for lookup values when a single lookup value is necessary based on a condition being met (such as a key for an "unknown" value).  You can change the data type to character, and use the same examination - simply remove the "or v_MYVAR = 0" from the expression - character values will be first set to NULL.
Q: How do I pass a variable in to a session?
·         There is no direct method of passing variables in to maps or sessions.  In order to get a map/session to respond to data driven (variable) input - a data source must be provided.  If working with flat files - it can be another flat file; if working with relational data sources it can be another relational table.  Typically a relational table works best, because SQL joins can then be employed to filter the data sets, and additional maps and source qualifiers can utilize the data to modify or alter the parameters during run-time.
Q: How can I create one map, one session, and utilize multiple source files of the same format?
·         In UNIX it's very easy: create a link to the source file desired, place the link in the SrcFiles directory, and run the session.  Once the session has completed successfully, change the link in the SrcFiles directory to point to the next available source file.  Caution: the only downfall is that you cannot run multiple source files (of the same structure) in to the database simultaneously.  In other words - it forces the same session to be run serially, but if that outweighs the maintenance and speed is not a major issue, feel free to implement it this way.  On NT you would have to physically move the files in and out of the SrcFiles directory.  Note: the difference between creating a link to an individual file, and changing SrcFiles directory to link to a specific directory is this: changing a link to an individual file allows multiple sessions to link to all different types of sources, while changing SrcFiles to be a link itself is restrictive - it also creates Unix Sys Admin pressures for directory rights to PowerCenter (one level up).
 
 
Q: How can I move my Informatica Logs / BadFiles directories to other disks without changing anything in my sessions?
·         Use the UNIX Link command – ask the SA to create the link and grant read/write permissions – have the “real” directory placed on any other disk you wish to have it on.
Q: How do I handle duplicate rows coming in from a flat file?
·         If you don't care about "reporting" duplicates, use an aggregator.  Set the Group By Ports to group by the primary key in the parent target table.  Keep in mind that using an aggregator causes the following: The last duplicate row in the file is pushed through as the one and only row, loss of ability to detect which rows are duplicates, caching of the data before processing in the map continues.  If you wish to report duplicates, then follow the suggestions in the presentation slides (available on this web site) to institute a staging table.  See the pro's and cons' of staging tables, and what they can do for you.
Where can I find …?
Q: Where can I find a history / metrics of the load sessions that have occurred in Informatica? (8 June 2000)
·         The tables which house this information are OPB_LOAD_SESSION, OPB_SESSION_LOG, and OPB_SESS_TARG_LOG.  OPB_LOAD_SESSION contains the single session entries, OPB_SESSION_LOG contains a historical log of all session runs that have taken place.  OPB_SESS_TARG_LOG keeps track of the errors, and the target tables which have been loaded.  Keep in mind these tables are tied together by Session_ID.  If a session is deleted from OPB_LOAD_SESSION, its history is not necessarily deleted from OPB_SESSION_LOG, nor from OPB_SESS_TARG_LOG.  Unfortunately - this leaves un-identified session ID's in these tables.  However, when you can join them together, you can get the start and complete times from each session.  I would suggest using a view to get the data out (beyond the MX views) - and record it in another metrics table for historical reasons.  It could even be done by putting a TRIGGER on these tables (possibly the best solution)...
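A sketch of such a join (the Session_ID key is per the above; which further columns to pick depends on your repository version):

select s.*, l.*
from OPB_LOAD_SESSION s, OPB_SESSION_LOG l
where s.session_id = l.session_id;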
Q: Where can I find more information on what the Informatica Repository Tables are? 
·         On this web-site.  We have published an unsupported view of what we believe to be housed in specific tables in the Informatica Repository.  Check it out - we'll be adding to this section as we go.  Right now it's just a belief of what we see in the tables.  Repository Table Meta-Data Definitions (http://www.coreintegration.com/innercore/library/infa/InfAuditTrail.zip)
 
 
 
Q: Where can I find / change the settings regarding font's, colors, and layouts for the designer?
·         You can find all the font's, colors, layouts, and controls in the registry of the individual client.  All this information is kept at: HKEY_CURRENT_USER\Software\Informatica\PowerMart Client Tools\<ver>.  Below here, you'll find the different folders which allow changes to be made.  Be careful, deleting items in the registry could hamper the software from working properly.
Q: Where can I find tuning help above and beyond the manuals?
·         Right here.  There are slide presentations, either available now, or soon which will cover tuning of Informatica maps and sessions - it does mean that the architectural solution proposed here be put in place.
Q: Where can I find the map's used in generating performance statistics?
·         A windows ZIP file will soon be posted, which houses a repository backup, as well as a simple PERL program that generates the source file, and a SQL script which creates the tables in Oracle.  You'll be able to download this, and utilize this for your own benefit.
Why doesn’t …?
Q: Why doesn't constraint based load order work with a maplet? (08 May 2000)
·         If your maplet has a sequence generator (reusable) that's mapped with data straight to an "OUTPUT" designation, and then the map splits the output to two tables: parent/child - and your session is marked with "Constraint Based Load Ordering" you may have experienced a load problem - where the constraints do not appear to be met??  Well - the problem is in the perception of what an "OUTPUT" designation is.  The OUTPUT component is NOT an "object" that collects a "row" as a row, before pushing it downstream.  An OUTPUT component is merely a pass-through structural object - as indicated, there are no data types on the INPUT or OUTPUT components of a maplet - thus indicating merely structure.  To make the constraint based load order work properly, move all the ports through a single expression, then through the OUTPUT component - this will force a single row to be "put together" and passed along to the receiving maplet.  Otherwise - the sequence generator generates 1 new sequence ID for each split target on the other side of the OUTPUT component.
Q: Why doesn't 4.7 allow me to set the Stored Procedure connection information in the Session Manager -> Transformations Tab? (31 March 2000)
·         This functionality used to exist in an older version of PowerMart/PowerCenter.  It was a good feature - as we could control when the procedure was executed (ie: source pre-load), but execute it in a target database connection.  It appears to be a removed piece of functionality.  We are asking Informatica to put it back in.
 
Q: Why doesn't it work when I wrap a sequence generator in a view, with a lookup object?
·         First - to wrap a sequence generator in a view, you must create an Oracle stored function, then call the function in the select statement in a view.  Second, Oracle dis-allows an order by clause on a column returned from a user function (It will cut your connection - and report an oracle error).  I think this is a bug that needs to be reported to Oracle.  An Informatica lookup object automatically places an "order by" clause on the return ports / output ports in the order they appear in the object.  This includes any "function" return.  The minute it executes a non-cached SQL lookup statement with an order by clause on the function return (sequence number) - Oracle cuts the connection.  Thus keeping this solution from working (which would be slightly faster than binding an external procedure/function).
Q: Why doesn't a running session QUIT when Oracle or Sybase return fatal errors?
·         The session will only QUIT when its threshold is set: "Stop on 1 errors".  Otherwise the session will continue to run.
Q: Why doesn't a running session return a non-successful error code to the command line when Oracle or Sybase return any error?
·         If the session is not bounded by its threshold ("Stop on 1 errors"), the session will run to completion - and the server will consider the session to have completed successfully - even if Oracle runs out of Rollback or Temp Log space, or if Sybase has a similar error.  To correct this - set the session to stop on 1 error; then the command line pmcmd will return a non-zero (it failed) type of error code - and the session manager will likewise see that the session failed.
Q: Why doesn't the session work when I pass a text date field in to the to_date function?
·         In order to make to_date(xxxx,<format>) work properly, we suggest surrounding your expression with the following: IIF( is_date(<date>,<format>) = true, to_date(<date>,<format>), NULL)  This will prevent session errors with "transformation error" in the port.  If you pass a non-date to a to_date function it will cause the session to bomb out.  By testing it first, you ensure 1) that you have a real date, and 2) your format matches the date input.  The format should match the expected date input directly - spaces, no spaces, and everything in between.  For example, if your date is: 1999103022:31:23  then you want a format to be: YYYYMMDDHH24:MI:SS with no spaces.
Q: Why doesn't the session control an update to a table (I have no update strategy in the map for this target)?
·         In order to process ANY update to any target table, you must put an update strategy in the map, process a DD_UPDATE command, change the session to "data driven".  There is a second method: without utilizing an update strategy, set the SESSION properties to "UPDATE" instead of "DATA DRIVEN", but be warned ALL targets will be updated in place - with failure if the rows don't exist.  Then you can set the update flags in the mapping's sessions to control updates to the target.  Simply setting the "update flags" in a session is not enough to force the update to complete - even though the log may show an update SQL statement, the log will also show: cannot insert (duplicate key) errors.
Q: What happens when I don't connect input ports to a maplet? (14 June 2000)
·         Potentially Hazardous values are generated in the maplet itself.  Particularly for numerics.  If you didn't connect ALL the ports to an input on a maplet, chances are you'll see sporadic values inside the maplet - thus sporadic results.  Such as ZERO in certain decimal cases where NULL is desired.  This is because both the INPUT and OUTPUT objects of a maplet are nothing more than an interface, which defines the structure of a data row - they are NOT like an expression that actually "receives" or "puts together" a row image.  This can cause a misunderstanding of how the maplet works - if you're not careful, you'll end up with unexpected results.
Q: What is the Local Object Cache? (3 March 2000)
·         The local object cache is a cache of the Informatica objects which are retrieved from the repository when a connection is established to a repository.  The cache is not readily accessed because it's housed within the PM/PC client tool.  When the client is shut-down, the cache is released.  Apparently the refresh cycle of this local cache requires a full disconnect/reconnect to the repository which has been updated.  This cache will house two different images of the same object.  For instance: a shared object, or a shortcut to another folder.  If the actual source object is updated (source shared, source shortcut), updates can only be seen in the current open folder if a disconnect/reconnect is performed against that repository.  There is no apparent command to refresh the cache from the repository.  This may cause some confusion when updating objects then switching back to the mapping where you'd expect to see the newly updated object appear.
Q: What is the best way to "version control"?
·         It seems the general developer community agrees on this one, the Informatica Versioning leaves a lot to be desired.  We suggest not utilizing the versioning provided.  For two reasons: one, it's extremely unwieldy (you lose all your sessions), and the repository grows exponentially because Informatica copies objects to increase the version number.  We suggest two different approaches; 1) utilizing a backup of the repository - synchronize Informatica repository backups (as opposed to DBMS repo backups) with all the developers.  Make your backup consistently and frequently.  Then - if you need to back out a piece, restore the whole repository.  2) Build on this with a second "scratch" repository, save and restore to the "scratch" repository ONE version of the folders.  Drag and drop the folders to and from the "scratch" development repository.  Then - if you need to VIEW a much older version, restore that backup to the scratch area, and view the folders.  In this manner - you can check in the whole repository backup binary to an outside version control system like PVCS, CCS, SCM, etc...  Then restore the whole backup in to acceptance - use the backup as a "VERSION" or snapshot of everything in the repository - this way items don't get lost, and disconnected versions do not get migrated up in to production.
Q: What is the best way to handle multiple developer environments?
·         The school of thought is still out on this one.  As with any - there are many many ways to handle this.  One idea is presented here (which seems to work well, and be comfortable to those who already worked in shared Source Code environments).  The idea is this: All developers use shared folders, shared objects, and global repositories.  In development - it's all about communication between team members - so that the items being modified are assigned to individuals for work.  With this methodology - all maps can use common mapplets, shared sources, targets, and other items.  The one problem with this is that the developers MUST communicate about what they are working on.  This is a common and familiar method to working on shared source code - most development teams feel comfortable with this, as do managers.  The problem with another commonly utilized method (one folder per developer), is that you end up with run-away development environments.  Code re-use, and shared object use nearly always drop to zero percent (caveat: unless you are following SEI / CMM / KPA Level 5 - and you have a dedicated CM (Change Management) person in the works.  Communication is still of utmost importance, however now you have the added problem of "checking in" what looks like different source tables from different developers, but the objects are named the same... Among other problems that arise.
Q: What is the web address to submit new enhancement requests?
·         Informatica's enhancement request web address is: mailto:featurerequest@informatica.com
Q: What is the execution order of the ports in an expression?
·         All ports are executed TOP TO BOTTOM in a serial fashion, but they are done in the following groups: All input ports are pushed values first.  Then all variables are executed (top to bottom physical ordering in the expression).  Last - all output expressions are executed to push values to output ports - again, top to bottom in physical ordering.  You can utilize this to your advantage, by placing lookups in to variables, then using the variables "later" in the execution cycle. 
Q: What is a suggested method for validating fields / marking them with errors?
·         One of the successful methods is to create an expression object which contains variables.  One variable per port that is to be checked.  Set the error "flag" for that field, then at the bottom of the expression trap each of the error fields.  From this port you can choose to set flags based on each individual error which occurred, or feed them out as a combination of concatenated field names - to be inserted in to the database as an error row in an error tracking table.
Q: What does the error “Broken Pipe” mean in the PMSERVER.ERR log on Unix?
·         One of the known causes for this error message is: when someone in the client User Interface queries the server, then presses the "cancel" button that appears briefly in the lower left corner.  It is harmless - and poses no threat.
Q: What is the best way to create a readable “DEBUG” log?
·         Create a table in a relational database which resembles your flat file source (assuming you have a flat file source).  Load the data into the relational table.  Then create your map from top to bottom and turn on VERBOSE DATA log at the session level.  Go back to the map, over-ride the SQL in the SQL Qualifier to only pull one to three rows through the map, then run the session.  In this manner the DEBUG log will be readable, errors will be much easier to identify, and once the logic is fixed, the whole data set can be run through the map with NORMAL logging.  Otherwise you may end up with a huge (megabyte) log.  The other two ways to create debugging logs are: 1) switch the session to TEST LOAD, set it to 3 rows, and run - the problem with this is that the reader will read ALL of the source data; 2) change the output to a flat file - the problem with this is that your log ends up huge (depending on the number of source rows you have).  A hedged example of the row-limiting over-ride follows.
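·         As a sketch (Oracle syntax, with a hypothetical staging table and columns - substitute your own), the SQL over-ride in the Source Qualifier might look like:

              SELECT CUST_ID, CUST_NAME, AMOUNT
              FROM   STG_CUSTOMER        -- hypothetical table loaded from the flat file
              WHERE  ROWNUM <= 3         -- Oracle-specific; keeps the VERBOSE DATA log to a few rows

          On other databases the row limit differs (e.g. SET ROWCOUNT on Sybase); the point is simply to push only a handful of rows through the map while debugging.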
Q: What is the best methodology for utilizing Informatica’s Strengths?
·         It depends on the purpose.  However, there is a basic definition of how well the tool will perform with throughput and data handling; if followed in general principle, you will have a winning situation.  1) Break all complex maps down into small manageable chunks.  Break up any logic you can into steps.  Informatica does much better with smaller, more maintainable maps.  2) Break up complex logic within an expression into several different expressions.  Be wary though: the more expressions, the slower the throughput - only break up the logic if it's too difficult to maintain.  3) Follow the guides for table structures and data warehouse structures which are available on this web site.  For reference: load flat files to staging tables, load staging tables into operational data stores / reference stores / data warehousing sources, load data warehousing sources into star schemas or snowflakes, load star schemas or snowflakes into highly de-normalized reporting tables.  By breaking apart the logic you will see the fastest throughput.
Q: When is it right to use SQL*Loader / BCP as a piped session versus a tail process?
·         SQL*Loader / BCP as a piped session should be used when no intermediate file is necessary, or when the source data is too large to stage to an intermediate file and there is not enough disk or time to place all the source data into an intermediate file.  The downfall currently is this: as a piped process (for PowerCenter 1.5.2 and 1.6 / PowerMart 4.52 and 4.6), the core does NOT stop when either BCP or SQL*Loader “quits” or terminates.  The core will only stop after reading all of the source data into the data reader thread.  This is dangerous if you have a huge file you wish to process and it’s scheduled as a monitored process.  Which means: a 5 hour load (in which SQL*Loader / BCP stopped within the first 5 minutes) will only stop and signal a page after 5 hours of reading source data.
 
 
 
Q: What happens when Informatica causes Dr. Watson errors on NT? (30 October 2000)
·         This is just my theory for now, but here's the best explanation I can come up with.  Typically this occurs when there is not enough physical RAM available to perform the operation.  Usually this only happens when SQLServer is installed on the same machine as the PMServer - however if this is not your case, some of this may still apply.  PMServer starts up child threads just like Unix.  The threads share the global shared memory area and rely on NT's thread capabilities.  The Dr. Watson seems to appear when a thread attempts to deallocate or allocate real memory.  There's none left (mostly because of SQLServer).  The memory manager appears to return an error, or asks the thread to wait while it reorganizes virtual RAM to make way for the physical request.  Unfortunately the thread code doesn't pay attention to this request, resulting in a memory violation.  The other theory is that the thread attempts to free memory that's been swapped to virtual, or has been "garbage collected" and cleared already - thus resulting again in a protected memory mode access violation, and thus a Dr. Watson.  Typically the Dr. Watson can cause the session to "freeze up".  The only way to clear this is to stop and restart the PMSERVER service - in some cases it requires a full machine reboot.  The only other possibility is when PMServer is attempting to free or shut down a thread - maybe there's an error in the code which causes the Dr. Watson.  In any case, the only real fix is to increase the physical RAM on the machine, to decrease the number of concurrent sessions running at any given point, or to decrease the amount of RAM that each concurrent session is using.
Q: What happens when Informatica CORE DUMPS on Unix? (12 April 2000)
·         Many things can cause a core dump, but the question is: how do you go about "finding out" what caused it, how do you work to solve it, and is there a simple fix?  This case was found to be frequent (according to tech support) among setups of new Unix hardware, causing unnecessary core dumps.  The IPC semaphore settings were set too low, causing X number of concurrent sessions to "die" with "writer process died" and "reader process died" etc...  We are on a Unix machine - Sun Solaris 5.7; anyone with this configuration might want to check the settings if they experience "Core Dumps" as well.
1.      Run "sysdef", examine the IPC Semaphores section at the bottom of the output.
2.      The following settings should be increased:
3.      SEMMNI - (semaphore identifiers), (7 x # of concurrent sessions to run in Informatica) + 10 for growth + DBMS setting  (DBMS Setting: Oracle = 2 per user, Sybase = 40 (avg))
4.      SEMMNU - (undo structures in system) = 0.80 x SEMMNI value
5.      SEMUME - (max undo entries per process) = SEMMNU
6.      SHMMNI - (shared memory identifiers) = SEMMNI + 10
·         These settings must be changed by ROOT in the /etc/system file (a hedged example follows below).
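·         As an illustrative /etc/system fragment (the values below are examples only, derived from the formulas above for roughly ten concurrent sessions plus a DBMS allowance - compute your own):

              * /etc/system - example IPC tunables; a reboot is required for these to take effect
              set semsys:seminfo_semmni=100
              set semsys:seminfo_semmnu=80
              set semsys:seminfo_semume=80
              set shmsys:shminfo_shmmni=110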
·         About the CORE DUMP:  To help Informatica figure out what's going wrong you can run a unix utility: "truss" in the following manner:
1.      Shut down PMSERVER
2.      Log in as "powermart" (the owner of pmserver) and cd to the pmserver home directory.
3.      Open the Server Manager on another client - log in, and be ready to press "start" for the sessions/batches causing problems.
4.      type: truss -f -o truss.out pmserver     <hit return>
5.      On the client, press "start" for the sessions/batches having trouble.
6.      When all the batches have completed or failed, press "stop server" from the Server Manager
·         Your "truss.out" file will have been created, giving you a log of all the forked processes and memory management / system calls that will help decipher what's happening.  You can examine the "truss.out" file - look for: "killed" in the log.
·         DON'T FORGET: Following a CORE DUMP it's always a good idea to shut down the unix server and bounce the box (restart the whole server).
Q: What happens when Oracle or Sybase goes down in the middle of a transformation?
·         It’s up to the database to recover up to the last commit point.  If you’re asking this question, you should be thinking about re-runnability of your processes.  Designing re-runnability into the processing/maps up front is the best preventative measure you can have.  Utilizing the recovery facility of PowerMart / PowerCenter appears to be sketchy at best, particularly in this area of recovery.  The transformation itself will eventually error out, stating that the database is no longer available (or something to that effect).
Q: What happens when Oracle (or Sybase) is taken down for routine backup, but nothing is running in PMServer at the time?
·         PMServer reports that the database is unavailable in the PMSERVER.err log.  When Oracle/Sybase comes back on line, PMServer will attempt to re-connect (if the repository is on the Oracle/Sybase instance that went down), and eventually it will succeed (when Oracle/Sybase becomes available again).  However, it is recommended that PMServer be scheduled to shut down before Oracle/Sybase is taken off-line, and scheduled to re-start after Oracle/Sybase is put back on-line.
Q: What happens in a database when a cached LOOKUP object is created (during a session)?
·         The session generates a select statement with an Order By clause.  Any time this is issued, databases like Oracle and Sybase will select (read) all the data from the table into the temporary database/space.  Then the data will be sorted and read in chunks back to the Informatica server.  This means that the hot-spot contention for a cached lookup will NOT be the table it just read from; it will be the TEMP area in the database, particularly if the TEMP area is being utilized for other things.  Also - once the cache is created, it is not re-read until the next running session re-creates it.
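·         For illustration only (the table and column names are hypothetical), the generated statement is of this general shape:

              SELECT CUST_NAME, CUST_STATUS, CUST_ID
              FROM   CUSTOMER_DIM
              ORDER BY CUST_ID          -- the ORDER BY is what pushes the sort into the database's TEMP space

          The exact column list and sort order depend on the lookup ports and condition defined in the Lookup transformation.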
 
 
 
Generic Questions
Q: Can you explain how "constraint based load ordering" works? (27 Jan 2000)
·         Constraint based load ordering in PowerMart / PowerCenter works like this: it controls the order in which the target tables are committed to a relational database.  It is of no use when sending information to a flat file.  To construct the proper constraint order: links between the TARGET tables in Informatica need to be constructed.  Simply turning on "constraint based load ordering" has no effect on the operation itself.  Informatica does NOT read constraints from the database when this switch is turned on.  Again, to take advantage of this switch, you must construct primary / foreign key relationships in the TARGET TABLES in the designer of Informatica.  Creating primary / foreign key relationships is difficult - you are only allowed to link a single port (field) to a single table as a primary / foreign key.
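·         To make the parent/child idea concrete: the keys live in the Designer's TARGET definitions, not in the database, but the relationship being modeled is the same one this hypothetical DDL expresses (names invented for illustration):

              CREATE TABLE ORDER_HDR (
                  ORDER_ID NUMBER(10) NOT NULL PRIMARY KEY           -- parent: committed first
              );
              CREATE TABLE ORDER_DTL (
                  LINE_ID  NUMBER(10) NOT NULL PRIMARY KEY,
                  ORDER_ID NUMBER(10) NOT NULL
                           REFERENCES ORDER_HDR (ORDER_ID)           -- child: committed after its parent
              );

          In the designer you would link the single ORDER_ID port between the two target definitions as the primary / foreign key pair.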
Q: It appears as if "constraint based load ordering" makes my session "hang" (it never completes).  How do I fix this? (27 Jan 2000)
·         We have a suggested method.  The best known method for fixing this "hang" bug is to: 1) open the map, 2) delete the target tables (parent / child pairs), 3) save the map, 4) drag in the targets again, parents FIRST, 5) relink the ports, 6) save the map, 7) refresh the session, and re-run it.  What it does: Informatica places the "target load order" as the order in which the targets are created (in the map).  It does this because the repository is Sequence ID based, and the session derives its "commit" order from the Sequence ID (unless constraint based load ordering is ON); then it tries to re-arrange the commit order based on the constraints in the target table definitions (in PowerMart/PowerCenter).  Once done, this will solve the commit ordering problems, and the "constraint based" load ordering can even be turned off in the session.  Informatica claims not to support this feature in a session that is not INSERT ONLY; however, we've gotten it to work successfully in DATA DRIVEN environments.  The only known cause (according to Technical Support) is this: the writer is going to commit a child table (as defined by the key links in the targets).  It checks to see if that particular parent row has been committed yet - but it finds nothing (because the reader filled up the memory cache with new rows).  The memory that was holding the "committed" rows has been "dumped" and no longer exists.  So the writer waits, and waits, and waits - it never sees a "commit" for the parents, so it never "commits" the child rows.  This only appears to happen with files larger than a certain number of rows (depending on your memory settings for the session).  The only fix is this: set "ThrottleReader=20" in the PMSERVER.CFG file.  It apparently limits the reader thread to a maximum of "20" blocks for each session, thus leaving the writer more room to cache the commit blocks.  However, this too also hangs in certain situations.  To fix this, Tech Support recommends moving to the PowerMart 4.6.2 release (the internal core apparently needs a fix).  4.6.2 appears to be "better" behaved but not perfect.  The only other way to fix this is to turn off constraint based load ordering, choose a different architecture for your maps (see my presentations), and control one map/session per target table and their order of execution.
 
 
Q: Is there a way to copy a session with a map, when copying a map from repository to repository?  Say, copying from Development to Acceptance?
·         Not that anyone is aware of.  There is no direct, straightforward method for copying a session.  This is the one downside to attempting to version control by folder.  You MUST re-create the session in Acceptance, UNLESS you back up the Development repository and RESTORE it into Acceptance.  This is the only way to take all contents (and sessions) from one repository to another.  In this fashion, you are versioning all of the repository at once.  With the repository BINARY you can then check this whole binary in to PVCS or some other outside version control system.  However, to recreate the session, the best method is to bring up the Development folder/repo side by side with the Acceptance folder/repo, then modify the settings in Acceptance as necessary.
Q: Can I set Informatica up for Target flat file, and target relational database?
·         Up through PowerMart 4.6.2 / PowerCenter 1.6.2 this cannot be done in a single map.  The best method for this is to stay relational with your first map: add a table to your database that looks exactly like the flat file (1 for 1 with the flat file), and target the two relational tables.  Then construct another map which simply reads this "staging" table and dumps it to flat file.  You can batch the maps together as sequential.  A hedged DDL sketch of such a staging table follows.
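·         As an illustration only (the file layout and names are invented), a 1-for-1 staging image of a three-field flat file might be:

              -- hypothetical Oracle staging table mirroring the flat file field-for-field
              CREATE TABLE STG_EXTRACT (
                  CUST_ID   VARCHAR2(10),
                  CUST_NAME VARCHAR2(60),
                  AMOUNT    VARCHAR2(15)    -- keep flat-file fields as character; convert types downstream
              );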
Q: How can you optimize use of an Oracle Sequence Generator?
·         In order to optimize the use of an Oracle Sequence Generator you must break up your map.  The generic method for calling a sequence generator is to encapsulate it in a stored procedure.  This is typically slow - and kills the performance.  Your version of Informatica's tool should contain mapplets to make this easier.  Break the map up into inserts only and updates only.  The suggested method is as follows: 1) create a staging table - bring the data in straight from the flat file into the staging table; 2) create a mapplet with the current logic in it; 3) create one INSERT map and one UPDATE map (separate inserts from updates); 4) create a SOURCE called DUAL, containing the fields: DUMMY char(1), NEXTVAL number(15,0), CURRVAL number(15,0); 5) copy the source into your INSERT map; 6) delete the Source Qualifier for "dummy"; 7) copy the "nextval" port into the original source qualifier (the one that pulls data from the staging table); 8) over-ride the SQL in the original source qualifier (generate it, then change DUAL.NEXTVAL to the sequence name: SQ_TEST.NEXTVAL); 9) feed the "nextval" port through the mapplet; 10) change the where clause on the SQL over-ride to select only the data from the staging table that doesn't exist in the parent target (to be inserted); see the SQL sketch below.  This is extremely fast, and will allow your inserts-only map to operate at incredibly high throughput while using an Oracle Sequence Generator.  Be sure to tune your indexes on the Oracle tables so that there is a high read throughput.
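·         A hedged sketch of the over-ride from steps 8 and 10 (staging and target names are hypothetical; SQ_TEST.NEXTVAL is the sequence name used in the answer above):

              SELECT SQ_TEST.NEXTVAL,            -- step 8: DUAL.NEXTVAL replaced by the real sequence
                     STG.CUST_ID,
                     STG.CUST_NAME
              FROM   STG_CUSTOMER STG
              WHERE  NOT EXISTS (                -- step 10: only rows not already in the parent target
                         SELECT 1
                         FROM   CUSTOMER_DIM TGT
                         WHERE  TGT.CUST_ID = STG.CUST_ID )

          Depending on your Oracle version, sequence restrictions may force a MINUS or outer-join formulation instead of NOT EXISTS; the intent is the same - hand out NEXTVAL only for rows that truly need inserting.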
Q: Why can't I over-ride the SQL in a lookup, and make the lookup non-cached?
·         Apparently Informatica hasn't made this feature available yet in their tool.  It's a shame - it would simplify the method for pulling Oracle Sequence numbers from the database.  For now - it's simply not implemented.
Q: Does it make a difference if I push all my ports (fields) through an expression, or push only the ports which are used in the expression?
·         From the work that has been done - it doesn't make much of an impact on the overall speed of the map.  If the paradigm is to push all ports through the expressions for readability then do so, however if it's easier to push the ports around the expression (not through it), then do so.
Q: What is the effect of having multiple expression objects vs. one expression object with all the expressions?
·         Fewer objects overall in the map make the map/session run faster.  Consolidating expressions into a single expression object is most helpful to throughput, but can increase the complexity (maintenance).  Read the question/answer about execution cycles above for hints on how to set up a large expression like this.