Informatica Questions and Answers

Informatica PowerCenter is a data integration tool for ETL processes, featuring components like Designer and Workflow Manager. Key concepts include Active vs. Passive transformations, Lookup transformations, and performance tuning techniques such as Pushdown Optimization. The document also covers various transformations, error handling, and best practices for managing workflows and data processing in Informatica.

Uploaded by mani
Copyright: © All Rights Reserved

1. What is Informatica PowerCenter?

Informatica PowerCenter is a data integration tool used for ETL processes,
consisting of Designer, Repository Manager, Workflow Manager, and
Workflow Monitor. It helps in extracting data from various sources,
transforming it, and loading it into targets.

2. Explain the difference between Active and Passive transformations.

Active transformations can change the number of rows passing through them
(e.g., Filter, Router). Passive transformations do not change the row count
(e.g., Lookup, Expression).

3. What is a Lookup Transformation in Informatica?

A Lookup transformation is used to perform lookups on a source or target. It
can be connected or unconnected. It's typically used to fetch related data
from a table or file.

4. How do you handle performance tuning in Informatica?

Use techniques like Pushdown Optimization, parallel processing, partitioning,
indexing, and minimizing the use of transformations that require memory
such as Aggregator or Sorter.

5. What are the different types of caches in Informatica?

Informatica uses three types of caches: Index Cache, Data Cache, and
Persistent Cache. Index Cache stores the index of lookup values, Data Cache
stores the actual data, and Persistent Cache stores cache information across
sessions.

6. What is a Source Qualifier in Informatica?

The Source Qualifier transformation extracts data from a source and converts
it into a format that can be processed by Informatica. It’s the starting point of
any mapping.

7. What is a Joiner Transformation?

The Joiner transformation is used to join data from two sources. It supports
four join types: Normal (inner), Master Outer, Detail Outer, and Full Outer.
In SQL terms, the two outer types correspond to left and right outer joins,
depending on which source is designated the master.

8. How do you perform Incremental Load in Informatica?

Incremental load is done using a technique where only the new or changed
data since the last load is fetched, typically based on a timestamp or an ID.
This can be done using filters and lookup transformations.
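
The timestamp-based approach above can be sketched outside Informatica as follows. This is only an illustration of the watermark idea, not PowerCenter code; the column name `updated_at` and the helper names are assumptions made for the example.

```python
from datetime import datetime


def incremental_rows(rows, last_load_time):
    """Keep only rows changed after the previous load's watermark."""
    return [r for r in rows if r["updated_at"] > last_load_time]


def next_watermark(rows, last_load_time):
    """Latest change timestamp seen, or the old watermark if nothing changed."""
    return max((r["updated_at"] for r in rows), default=last_load_time)


# Hypothetical source rows with a last-modified timestamp.
source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]
watermark = datetime(2024, 1, 3)          # stored after the previous run
delta = incremental_rows(source, watermark)
watermark = next_watermark(delta, watermark)  # persist for the next run
```

In a mapping, the watermark would typically live in a mapping variable or parameter file and the comparison would sit in the source filter.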

9. What are the types of lookups in Informatica?

There are two types of lookups in Informatica: Connected Lookup and
Unconnected Lookup. A connected lookup can return multiple values,
whereas an unconnected lookup returns a single value.

10. What is the use of Router Transformation?

The Router transformation is used to route data to different output groups
based on specified conditions. It’s similar to a Filter transformation but allows
multiple output groups.

11. Scenario: Your session failed due to memory issues. How do you debug
and fix it?

Check session logs for memory usage patterns. Tune cache sizes for lookups
and aggregators, adjust the DTM buffer size to fit the memory actually
available, and ensure parallel processing (partitioning) is set up correctly.

12. How do you migrate a mapping from Dev to Prod in Informatica?

Export the mapping as an XML file using Repository Manager and import it
into the Prod repository, or move it with a Deployment Group. Either way,
validate connections, parameter files, and DB links before deployment.

13. How do you schedule workflows in Informatica?

Use Informatica Scheduler, or external tools like Control-M or cron jobs for
scheduling. Use event wait and trigger actions based on file dependencies.

14. What are session logs and how do you manage them?

Session logs contain details of session run status, errors, and performance
metrics. Manage logs by using Workflow Monitor and setting retention
policies.

15. What is pmcmd and where is it used?

pmcmd is a command-line utility to start, stop, and monitor workflows. It is
used for automating ETL tasks, integration with scheduling tools, and CI/CD
pipelines.
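
As a rough illustration of how a scheduler or CI job might call pmcmd, the sketch below only builds a `pmcmd startworkflow` command line. The service, domain, folder, user, and workflow names are hypothetical, and the password is assumed to be supplied through an environment variable referenced with `-pv`.

```python
import shlex


def build_startworkflow(service, domain, user, password_var, folder,
                        workflow, wait=True):
    """Build the argument list for `pmcmd startworkflow`."""
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", service,        # Integration Service name
        "-d", domain,          # domain name
        "-u", user,            # repository user
        "-pv", password_var,   # name of the env var holding the password
        "-f", folder,          # repository folder
    ]
    cmd.append("-wait" if wait else "-nowait")
    cmd.append(workflow)
    return cmd


# All values below are placeholders for illustration.
cmd = build_startworkflow("IS_DEV", "Domain_Dev", "etl_user",
                          "PM_PASSWORD", "SALES", "wf_load_sales")
print(shlex.join(cmd))
```

On a machine where pmcmd is installed, the list could be passed to `subprocess.run(cmd, check=True)` so that a non-zero exit code fails the calling job.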

16. What is Pushdown Optimization?

Pushdown Optimization is the process where as much data processing as
possible is offloaded to the database to improve performance. It’s applied on
transformations like Filter, Aggregator, etc.

17. How do you handle a scenario where you need to filter records based on
a range of values in Informatica?

Use a Filter transformation to create conditions that filter out unwanted
records. This can include range-based filters such as date or numeric ranges.

18. What is a Transaction Control Transformation?

Transaction Control transformation is used to manage commit and rollback of
data in target tables based on transaction control rules. It helps in controlling
commit behavior during data loading.

19. How do you implement audit logging in your ETL process?

Create an Audit Table with fields like Workflow_Name, Mapping_Name,
Start_Time, End_Time, Source_Count, Target_Count, Status, Error_Message.
Use Post-Session Commands or a Stored Procedure transformation to insert
audit info.
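
A minimal sketch of such an audit table, using an in-memory SQLite database purely for illustration. The table name `etl_audit`, the `log_run` helper, and the rule that unequal counts are flagged as `COUNT_MISMATCH` are assumptions; mappings that filter rows legitimately produce different counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE etl_audit (
           workflow_name TEXT, mapping_name TEXT,
           start_time TEXT, end_time TEXT,
           source_count INTEGER, target_count INTEGER,
           status TEXT, error_message TEXT)"""
)


def log_run(conn, workflow, mapping, start, end, src, tgt, error=None):
    """Insert one audit row and return the derived status."""
    if error:
        status = "FAILED"
    elif src == tgt:
        status = "SUCCEEDED"
    else:
        status = "COUNT_MISMATCH"  # assumption: flag unequal counts for review
    conn.execute(
        "INSERT INTO etl_audit VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (workflow, mapping, start, end, src, tgt, status, error),
    )
    conn.commit()
    return status


first_status = log_run(conn, "wf_load_sales", "m_load_sales",
                       "2024-01-01 01:00", "2024-01-01 01:05", 100, 100)
```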

20. How do you handle changes in source data schema (e.g., adding a
column) during an active ETL process?

You need to modify the mappings to reflect the changes. This could involve
adding new fields to the source definitions, updating the transformation
logic, and updating the target schema accordingly.

21. What is a Rank Transformation?

The Rank transformation is used to select the top or bottom N records from a
group of records. It’s typically used when you want to find the highest or
lowest values based on certain conditions.
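
The top-N-per-group behavior can be illustrated with a short sketch. It mimics the idea only; the function and column names are made up for the example.

```python
import heapq
from collections import defaultdict


def top_n_per_group(rows, group_key, rank_key, n):
    """Return the n highest rows per group, ordered from highest to lowest."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        group: heapq.nlargest(n, members, key=lambda r: r[rank_key])
        for group, members in groups.items()
    }


sales = [
    {"dept": "A", "amount": 100},
    {"dept": "A", "amount": 300},
    {"dept": "A", "amount": 200},
    {"dept": "B", "amount": 50},
]
ranked = top_n_per_group(sales, "dept", "amount", 2)
```

Swapping `heapq.nlargest` for `heapq.nsmallest` gives the bottom-N variant.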

22. What is the difference between a Filter Transformation and a Router
Transformation?

A Filter transformation is used to filter data based on a condition, while a
Router transformation is used to route data to multiple output groups based
on multiple conditions.
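
The difference can be sketched as follows: a filter keeps or drops a row on one condition, while a router tests every group condition and sends unmatched rows to a default group. The group names and conditions are hypothetical.

```python
def route(rows, conditions):
    """Send each row to every group whose condition it meets.

    Rows that meet no condition go to the DEFAULT group.
    """
    out = {name: [] for name in conditions}
    out["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in conditions.items():
            if condition(row):
                out[name].append(row)
                matched = True
        if not matched:
            out["DEFAULT"].append(row)
    return out


orders = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 500},
    {"id": 3, "amount": -10},
]
routed = route(orders, {
    "SMALL": lambda r: 0 <= r["amount"] < 100,
    "LARGE": lambda r: r["amount"] >= 100,
})
```

A plain filter is the single-condition case: `[r for r in orders if r["amount"] >= 100]`, with rejected rows simply discarded.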

23. What is the difference between a Joiner and Lookup Transformation?

A Joiner transformation is used to join two sources based on a common field.
A Lookup transformation is used to search for a value from one source and
return the associated value from another source, typically for adding more
context or reference data.

24. How do you handle NULL values in Informatica?

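Informatica provides the ISNULL() function, usually combined with IIF(), to
handle NULL values, for example IIF(ISNULL(col), 'N/A', col). Database
functions such as NVL() can be used in a SQL override. You can also use
expressions to filter out NULL values or replace them with default values
during transformations.

25. What is the difference between a Normalizer and a Pivot Transformation?

Normalizer transformation is used to normalize data: it turns repeating
columns in a single row into multiple rows (columns to rows). PowerCenter has
no built-in Pivot transformation; the reverse operation, turning rows into
columns, is usually built with an Aggregator or Expression transformation.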
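
To make the direction concrete, here is a small sketch assuming the usual Normalizer behavior, where one row with repeating columns becomes several rows. The `occurrence` field is an illustrative stand-in for the index the transformation generates, and all column names are hypothetical.

```python
def normalize(row, id_col, repeating_cols):
    """Turn one wide row into one output row per repeating column."""
    return [
        {id_col: row[id_col], "occurrence": i, "value": row[col]}
        for i, col in enumerate(repeating_cols, start=1)
    ]


wide = {"store": "S1", "q1_sales": 10, "q2_sales": 20,
        "q3_sales": 30, "q4_sales": 40}
tall = normalize(wide, "store",
                 ["q1_sales", "q2_sales", "q3_sales", "q4_sales"])
```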

26. Explain the concept of a Target Load Order in Informatica.

The Target Load Order (Target Load Plan) is set in the mapping to specify the
order in which targets are loaded. It helps in managing dependencies between
tables, ensuring the correct sequence of loading.

27. What is a mapping parameter in Informatica?

A mapping parameter is a variable whose value can be set before the
session runs. It allows you to reuse the mapping for different input values
without changing the underlying logic.

28. How do you handle slowly changing dimensions (SCD) in Informatica?

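Informatica does not ship dedicated SCD transformations. SCD Type 1, Type 2,
and Type 3 logic is built from Lookup, Expression, and Update Strategy
transformations, and the Slowly Changing Dimensions wizard in Designer can
generate starter mappings. These mappings help in managing historical data
changes and updates.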

29. What is the difference between SCD Type 1 and SCD Type 2?

SCD Type 1 overwrites old data with new data without preserving history,
while SCD Type 2 preserves historical data by adding a new record for each
change, typically with effective and expiry dates.
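
The Type 2 behavior can be sketched in a few lines. This is an illustration of the logic rather than a mapping; the column names `effective_date` and `expiry_date`, the high-date convention, and the sample data are assumptions.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed marker for the current version


def apply_scd2(dimension, incoming, key, tracked, today):
    """Expire the current version of changed rows and append new versions."""
    for new in incoming:
        current = next(
            (d for d in dimension
             if d[key] == new[key] and d["expiry_date"] == HIGH_DATE),
            None,
        )
        if current and all(current[c] == new[c] for c in tracked):
            continue  # no change, keep the current version
        if current:
            current["expiry_date"] = today  # close the old version
        dimension.append(
            {**new, "effective_date": today, "expiry_date": HIGH_DATE}
        )
    return dimension


dim = [{"cust_id": 1, "city": "Pune",
        "effective_date": date(2020, 1, 1), "expiry_date": HIGH_DATE}]
changes = [{"cust_id": 1, "city": "Delhi"}, {"cust_id": 2, "city": "Goa"}]
apply_scd2(dim, changes, "cust_id", ["city"], date(2024, 6, 1))
```

A Type 1 load would instead overwrite `city` on the existing row and keep a single version per key.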

30. How do you manage performance issues in a large volume ETL process?

Use partitioning, parallel processing, optimize transformations (e.g.,
minimize use of Aggregators and Sorters), and ensure proper indexing in the
source and target databases. Pushdown optimization can also be used to
offload processing to the database.

31. What is the difference between an Unconnected and Connected Lookup?

A connected lookup is part of the data flow, receives input directly from the
pipeline, and can return multiple columns. An unconnected lookup is called
within an expression and returns a single value.

32. What are some best practices for managing and organizing workflows in
Informatica?

Group related workflows into folders, use descriptive names for sessions and
workflows, manage dependencies using pre- and post-session commands,
and maintain consistent logging to help with troubleshooting.

33. What is the role of the Repository in Informatica?

The repository stores metadata for Informatica, including information about
mappings, workflows, transformations, and sessions. It ensures
consistency and version control across ETL processes.

34. How can you handle errors during ETL processing?

Errors can be handled using error tables, custom error handling in the
transformation logic, post-session commands, or error logs. It’s important to
set up proper logging to capture error details for debugging.

35. What is the difference between a Stored Procedure Transformation and
an External Procedure Transformation?

A Stored Procedure Transformation allows you to execute a stored procedure
in the database during ETL. An External Procedure Transformation is used to
call an external program or function (e.g., a DLL or an executable) for custom
processing.

36. How do you perform incremental data load with Informatica?

Informatica’s incremental load is typically achieved by tracking the changes
in source data, using a timestamp or sequence number, and loading only the
modified records. This is often done using a Lookup Transformation to fetch
only new or changed records.

37. What is the difference between a Source Definition and a Target
Definition?

A Source Definition represents the structure of the data that is being
extracted, while a Target Definition represents the structure of the
destination where the data will be loaded.

38. How can you achieve parallel processing in Informatica?

Parallel processing in Informatica can be achieved using session partitioning,
where the session is divided into multiple parts and processed concurrently,
or by using pipeline partitioning to parallelize data flow.

39. What is the use of a Sequence Generator Transformation?

The Sequence Generator transformation is used to generate unique
sequential numbers. It is commonly used for generating surrogate keys for
target tables.

40. What is a Data Masking Transformation in Informatica?

Data Masking is a process of replacing sensitive data with masked values.
The Data Masking Transformation is used in Informatica to protect sensitive
information, such as credit card numbers or personal identifiers.

41. What is the difference between the Connected and Unconnected Lookup
transformation?

A Connected Lookup transformation is directly connected to the data flow
and can return multiple columns, while an Unconnected Lookup is called
using a function within an expression and returns a single value.

42. How do you optimize the performance of an Aggregator transformation?

To optimize an Aggregator transformation, minimize the number of ports, use
sorted input, reduce the number of groups, and apply aggregation functions
efficiently. Using the sorted input feature helps reduce memory usage.

43. What is the difference between a Short-Circuit and a Normalizer
transformation?

PowerCenter has no transformation named Short-Circuit. To stop a session when
an error occurs, use the ABORT() function in an expression or the Stop on
Errors session setting. Normalizer is a transformation that splits a single
row containing repeating columns into multiple rows, generally when working
with hierarchical or repeated data.

44. How do you handle situations where your source data has duplicate
records?

To handle duplicate records, use a Sorter transformation with a distinct
option, or use the Aggregator transformation to aggregate and eliminate
duplicates. Alternatively, use a Filter transformation to discard duplicates
based on a defined condition.
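
The two main options behave differently, which the sketch below tries to show: a distinct sort removes rows that match in every column, while grouping by a key keeps one row per key. Keeping the last row per group is an assumption about the Aggregator's behavior when no aggregate function is used; the data is made up.

```python
def distinct_rows(rows):
    """Drop rows that are identical in every column (distinct-style)."""
    seen, out = set(), []
    for row in rows:
        signature = tuple(sorted(row.items()))
        if signature not in seen:
            seen.add(signature)
            out.append(row)
    return out


def last_per_key(rows, key_cols):
    """Keep one row per key, the last one seen (group-by-style)."""
    latest = {}
    for row in rows:
        latest[tuple(row[c] for c in key_cols)] = row
    return list(latest.values())


rows = [
    {"id": 1, "name": "a"},
    {"id": 1, "name": "a"},   # exact duplicate
    {"id": 1, "name": "b"},   # same key, different attribute
    {"id": 2, "name": "c"},
]
by_row = distinct_rows(rows)
by_key = last_per_key(rows, ["id"])
```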

45. How do you handle late arriving data in Informatica?

Late arriving data can be managed by setting up a delay for processing the
incoming data or using a custom logic in transformations like the Lookup or
Update Strategy transformation to handle the late data correctly.

46. What is the difference between a Sequence Generator and a Surrogate
Key?

A Sequence Generator generates a sequence of unique numbers for use as
surrogate keys, while a Surrogate Key is a key generated by a database or
ETL process, used to uniquely identify a record in a dimension table.

47. What is the difference between a Passive and Active transformation in
terms of row processing?

A Passive transformation does not alter the number of rows in the pipeline,
while an Active transformation can change the number of rows. For example,
a Filter (Active) can drop rows, but an Expression (Passive) can only modify
data within the same row.

48. What are the benefits of using Pushdown Optimization in Informatica?

Pushdown Optimization helps to improve performance by pushing the data
processing logic to the source or target database instead of performing it in
the Informatica engine. This reduces ETL processing time by leveraging
database processing power.

49. How would you implement a Slowly Changing Dimension (SCD) Type 3 in
Informatica?

In SCD Type 3, changes to data are tracked by maintaining only limited
historical data (e.g., current and previous values). You can implement this
using an Expression transformation to track and store both the current and
previous values in different columns.
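
A minimal sketch of the current/previous column idea, assuming a hypothetical `previous_<column>` naming convention:

```python
def apply_scd3(dim_row, new_value, column):
    """Shift the current value into the previous column when it changes."""
    previous_col = f"previous_{column}"
    if dim_row[column] == new_value:
        return dim_row  # unchanged, nothing to track
    return {**dim_row, previous_col: dim_row[column], column: new_value}


customer = {"cust_id": 1, "city": "Pune", "previous_city": None}
moved_once = apply_scd3(customer, "Delhi", "city")
moved_twice = apply_scd3(moved_once, "Goa", "city")
```

After the second change the original value is gone, which is the limited history that distinguishes Type 3 from Type 2.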

50. How would you approach error handling in an ETL workflow in
Informatica?

Error handling in an ETL workflow can be done by using a combination of
Error Tables, Post-Session Commands, and custom expressions. You can also
configure error ports in transformations and use the Error Handling option in
the session properties to redirect failed records to a specific table.

51. How do you create reusable objects in Informatica?

You can create reusable objects like transformations, mapplets, and sessions
by defining them and saving them in the repository. These reusable
components can then be used in multiple mappings and workflows to ensure
consistency and save time.

52. What is the function of the Expression Transformation in Informatica?

The Expression transformation allows you to perform row-wise computations
and transformation of data, such as string manipulation, mathematical
operations, conditional checks, and more.

53. What are the different types of sessions available in Informatica?

Sessions in Informatica are either reusable (created in the Task Developer
and shared across workflows) or non-reusable (created inside a single
workflow). External commands such as scripts are run by a separate Command
task, or by pre- and post-session commands, rather than by a session type.

54. How do you handle a scenario where the target database is not available
during the ETL process?

In such a scenario, you can configure your session to retry the connection,
use a database-specific failover mechanism, or write data to a temporary
staging area, which can be later reloaded when the target database becomes
available.

55. What is the difference between the Source Qualifier and the Lookup
transformation?

The Source Qualifier is used to filter, join, or aggregate data from source
tables directly, whereas the Lookup transformation is used to look up
additional data from a reference source or cache, typically used to enrich
source data with reference data.

56. What is the purpose of a Stored Procedure Transformation?

A Stored Procedure transformation allows you to call a stored procedure in
a relational database as part of your ETL process. It can be used for tasks like
data validation, auditing, and running complex business logic.

57. What is the difference between a Target Definition and a Target Load
Plan?

A Target Definition represents the structure of the target data that is being
loaded, while a Target Load Plan defines the sequence in which the target
tables should be loaded, managing the dependencies between multiple
targets in a mapping.

58. How do you implement change data capture (CDC) in Informatica?

Change Data Capture (CDC) in Informatica can be implemented using either
the Source Qualifier with a filter to capture changed data or through
incremental load techniques using timestamps or sequence numbers to track
changes between ETL cycles.

59. What is the role of the Repository Manager in Informatica?

Repository Manager in Informatica is used to manage and organize the
repository metadata. It is used to perform tasks like importing/exporting
objects, performing version control, and checking in/out mappings, sessions,
and other repository objects.

60. What is the use of the Dynamic Lookup Cache in Informatica?

The Dynamic Lookup Cache is used when you want to update existing
records in the target based on a lookup match and insert new records if no
match is found. It dynamically updates the cache during session execution to
handle both inserts and updates.
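
The sketch below illustrates the idea with a dictionary standing in for the cache. The return codes follow the NewLookupRow convention (0 = no change, 1 = insert, 2 = update); the function and column names are hypothetical.

```python
NO_CHANGE, INSERT, UPDATE = 0, 1, 2  # NewLookupRow-style flags


def lookup_dynamic(cache, row, key, compare_cols):
    """Check a row against the cache and update the cache as a side effect."""
    cached = cache.get(row[key])
    if cached is None:
        cache[row[key]] = dict(row)
        return INSERT
    if any(cached[c] != row[c] for c in compare_cols):
        cache[row[key]] = dict(row)
        return UPDATE
    return NO_CHANGE


cache = {}
flags = [
    lookup_dynamic(cache, {"id": 1, "name": "a"}, "id", ["name"]),
    lookup_dynamic(cache, {"id": 1, "name": "a"}, "id", ["name"]),
    lookup_dynamic(cache, {"id": 1, "name": "b"}, "id", ["name"]),
]
```

Because the cache changes during the run, the second occurrence of a key in the same load is not inserted twice. Downstream, an Update Strategy transformation would typically map flag 1 to an insert and flag 2 to an update.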

61. How do you handle the scenario where the data needs to be processed in
parallel for improved performance?

To process data in parallel, you can use partitioning in the session properties
or employ pipeline partitioning. You can also use parallel session execution to
divide and conquer large datasets by processing them in smaller chunks
simultaneously.

62. How do you use the Update Strategy transformation in Informatica?

The Update Strategy transformation is used to specify whether a row should
be inserted, updated, deleted, or rejected in the target. It helps manage the
logic for handling different types of data changes in the target.

63. What is the role of the Cache File in the Lookup transformation?

In the Lookup transformation, the Cache File stores the cached data that is
used for lookups. The cache allows the transformation to perform faster
lookups by keeping a local copy of the reference data instead of querying the
source database for each lookup.

64. What is a Mapplet in Informatica?

A Mapplet is a reusable object in Informatica that consists of one or more
transformations. It can be used across multiple mappings to encapsulate
logic that can be reused in different mappings.

65. What are the different types of errors that can occur during ETL
processing?

Errors during ETL processing can include data truncation, data type
mismatches, constraint violations, connection issues, transformation logic
errors, and external system errors such as unavailability of source or target
systems.

66. How do you configure a session to run a shell script after the session
completes?

You can configure a session to run a shell script by using the Post-Session
Command option in the session properties. This allows you to specify the
script or executable to run after the session finishes processing.

67. How do you handle large volumes of data in Informatica?

For handling large volumes of data, use parallel processing, partitioning, and
pushdown optimization. Also, ensure that your session and database
configurations are optimized to manage large datasets efficiently.

68. What is the function of the Joiner Transformation in Informatica?

The Joiner transformation is used to join two different sources. It supports
four join types (Normal, Master Outer, Detail Outer, and Full Outer, which
cover inner, left or right outer, and full outer joins) and can handle data
from heterogeneous sources.

69. How do you extract data from a flat file in Informatica?

Data extraction from a flat file can be done using the Flat File Source
Definition in Informatica, where you define the structure of the flat file (fixed
width, delimited, etc.) and map the fields to the required targets.

70. How do you handle a situation where you have to combine data from
multiple sources with different data formats?

Use the appropriate source definitions for each data format (e.g., Flat File,
Relational, XML), and then perform any necessary transformations to
harmonize and consolidate the data before loading it into the target.

71. What is the use of the Expression Transformation in Informatica?

The Expression transformation is used to perform row-level calculations,
string manipulations, conditional logic, and other transformations within a
mapping. It allows you to modify or create new columns based on the
existing input data.

72. How do you handle duplicate data in a source system using Informatica?

To handle duplicate data in a source system, you can use the Sorter
transformation with the "Distinct" option enabled, or use the Aggregator
transformation to group data and eliminate duplicates based on specific
fields.

73. What is a Dynamic Cache in the Lookup transformation?

A Dynamic Cache in the Lookup transformation is a cache that is updated
dynamically during session execution. It is used when you want to perform
both updates and inserts to the target table based on the lookup operation.

74. What is the significance of the Target Definition in Informatica?

The Target Definition defines the structure of the target table or file in a
mapping. It contains details about the columns and their data types in the
target, which will receive the transformed data from the source.

75. What are the main differences between Informatica PowerCenter and
Informatica Cloud?

Informatica PowerCenter is an on-premise ETL tool, while Informatica Cloud is
a cloud-based data integration platform. The main difference is that
PowerCenter is typically used for on-premise integration, whereas Cloud is
used for cloud-based and hybrid data integration.

76. What is a Data Driven Session in Informatica?

Data Driven is a setting of the Treat Source Rows As session property. With
it, the Integration Service follows the flags set by Update Strategy
transformations in the mapping (DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT)
to decide, row by row, how each record is applied to the target.

77. What is an ETL Mapping in Informatica?

An ETL Mapping in Informatica defines the data flow from sources to targets.
It includes the transformations that will be applied to the data during
extraction, transformation, and loading.

78. What is the difference between a Session and a Workflow in Informatica?

A Session is the runtime process that executes a mapping, and it performs
the data extraction, transformation, and loading. A Workflow, on the other
hand, is a higher-level object that contains multiple sessions or tasks and
controls the sequence of execution.

79. What is the function of the XML Source and Target in Informatica?

The XML Source and Target are used to read and write data from and to XML
files. You can define an XML schema, which outlines the structure of the XML
file, and map the data to and from relational tables.

80. What is a Post-Session Command in Informatica?

A Post-Session Command is a script or command that is executed after a
session has finished running. It can be used to perform actions such as
sending emails, moving files, or executing other processes once the session
is complete.

81. What is an Aggregator Transformation in Informatica?

The Aggregator transformation is used to perform aggregate calculations on
data, such as summing, averaging, counting, and finding
minimum/maximum values. It is similar to the GROUP BY SQL operation.

82. How do you optimize a session for better performance in Informatica?

To optimize session performance, enable pushdown optimization, use
partitioning, adjust memory settings, minimize the use of complex
transformations (like Aggregator or Sorter), and ensure that the source and
target systems are properly indexed.

83. What is the use of the Sorter Transformation in Informatica?

The Sorter transformation is used to sort data based on one or more columns
in ascending or descending order. Sorting is often required before performing
certain operations like aggregation, joining, or partitioning.

84. What is the function of the Workflow Monitor in Informatica?

The Workflow Monitor allows you to monitor the execution status of
workflows and sessions in real-time. It provides details about the session
execution, including success, failure, and performance metrics, and helps in
troubleshooting errors.

85. How do you process and load unstructured data into a database using
Informatica?

Unstructured data can be processed using custom parsing logic in an
Expression transformation, regular expressions, or by using a specialized
parser. After processing, the data can be loaded into a structured format in
the target database.

86. What is the function of the Parameter File in Informatica?

A Parameter File in Informatica stores parameter values for workflows,
sessions, or mappings, and is used to pass dynamic values at runtime. It
helps in making mappings and sessions reusable by decoupling the logic
from specific values.
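
As an illustration of what such a file looks like, the sketch below renders one section in the `[folder.WF:workflow.ST:session]` heading style with `name=value` lines. The folder, workflow, session, parameter, and connection names are all hypothetical.

```python
def render_parameter_file(sections):
    """Render {heading: {name: value}} as parameter-file text."""
    lines = []
    for heading, params in sections.items():
        lines.append(f"[{heading}]")
        lines.extend(f"{name}={value}" for name, value in params.items())
    return "\n".join(lines) + "\n"


text = render_parameter_file({
    "SALES.WF:wf_load_sales.ST:s_m_load_sales": {
        "$$LastLoadDate": "2024-01-01",    # mapping parameter ($$ prefix)
        "$DBConnection_Source": "ORA_SRC",  # session connection parameter
    },
})
print(text)
```

Generating the file from a script like this is one way to change values per environment or per run without editing the mapping.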

87. How do you handle different file formats (CSV, XML, JSON) as sources in
Informatica?

Informatica provides different connectors and source definitions for each file
format (e.g., Flat File Source, XML Source, JSON Source). For CSV files, you
can define the file structure; for XML and JSON, you define schemas to map
the data.

88. How do you implement a process to load data from multiple sources into
a single target?

To load data from multiple sources into a single target, you can use multiple
Source Definitions, and then use transformations like Joiner, Union, or Lookup
to merge the data before loading it into the target.

89. What are the types of partitions in Informatica?

In Informatica, there are different types of partitioning strategies, including
Key Range, Round Robin, and Hash partitioning. These partitioning strategies
help in distributing the data across multiple threads for parallel processing.
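
How the three strategies assign rows can be sketched as below. This illustrates the distribution logic only, not how the Integration Service implements it; the hash function, boundaries, and column names are assumptions.

```python
import bisect
import zlib


def round_robin(rows, n):
    """Deal rows out in turn so partitions stay evenly sized."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, n, key):
    """Send rows with the same key value to the same partition."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[zlib.crc32(str(row[key]).encode()) % n].append(row)
    return parts


def key_range(rows, key, boundaries):
    """boundaries are sorted, exclusive upper bounds; the last partition
    takes everything at or above the final boundary."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect.bisect_right(boundaries, row[key])].append(row)
    return parts


rows = [{"id": i, "amount": a}
        for i, a in enumerate([50, 100, 150, 250, 99, 200])]
rr = round_robin(rows, 2)
kr = key_range(rows, "amount", [100, 200])
hp = hash_partition(rows + rows, 3, "id")
```

Round robin balances row counts regardless of content, while hash and key range keep related rows together, which matters before grouping or joining.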

90. What is a session log in Informatica, and what does it contain?

A session log in Informatica contains detailed information about the session
run, including session start and end times, number of records processed,
error messages, warnings, and performance metrics. It is critical for
debugging and performance monitoring.

91. How do you handle an unbalanced source in Informatica?

An unbalanced source, where data distribution is skewed, can be handled
using partitioning. By using key range or round-robin partitioning, you can
evenly distribute the data across multiple threads to improve performance.

92. What are the different types of transformations in Informatica?

Informatica transformations are classified as either Active or Passive.
Commonly used transformations include Lookup, Joiner, Expression, Aggregator,
Filter, Router, Sorter, Union, Rank, Sequence Generator, and others.

93. What is the purpose of the Workflow Manager in Informatica?

The Workflow Manager is used to define, manage, and schedule workflows in
Informatica. It helps in creating workflows by grouping sessions and tasks,
setting dependencies, and managing workflow execution.

94. How do you perform data validation in an ETL process?

Data validation can be performed using the Expression transformation,
where you can check for data quality issues like missing values, incorrect
formats, or data inconsistencies. You can also use custom validation rules in
the mapping to filter or reject invalid records.

95. What is the use of the Pre-Session Command in Informatica?

A Pre-Session Command is a command or script that is executed before the
session begins. It can be used to perform actions such as data extraction
from an external system, preparation of source files, or initialization of the
environment before data processing.

96. How do you handle file dependencies in Informatica?

File dependencies can be managed using the Event Wait and Event Raise
tasks in a workflow. These tasks help in controlling the execution sequence
by waiting for specific files to be available before proceeding with the ETL
processing.

97. How would you perform real-time data integration in Informatica?

Real-time data integration can be performed by using Informatica Cloud or
the Real-Time feature of PowerCenter. You can use change data capture
(CDC) or event-driven processing to load and transform data as it changes in
real time.

98. What is the role of the Integration Service in Informatica?

The Integration Service is responsible for executing mappings, sessions, and
workflows. It performs all the ETL operations and communicates with the
repository to retrieve and store metadata.

99. What is the difference between a Normalizer Transformation and a Rank
Transformation?

A Normalizer transformation is used to convert a single row with repeating
columns into multiple rows, typically for normalizing denormalized or
hierarchical data. A Rank transformation is used to select the top or bottom
N records based on a specific condition, often for ranking data.

100. How do you manage version control in Informatica?

Version control can be managed through the use of the repository, where
different versions of objects (such as mappings, sessions, and workflows) are
maintained. You can check in and check out versions, and ensure that
changes are tracked across different environments.
