Talend

The tMap component in Talend allows mapping of input to output data. It can add or remove columns, apply transformations, filter data, reject data, multiplex/demultiplex data, and concatenate data. Context variables are parameters that a Job can access during runtime and there are three ways to define them.

Uploaded by saipriyacoool
Copyright © All Rights Reserved

https://www.upgrad.com/blog/talend-interview-questions-answers/

What is the tMap Component? What are the Various Functions That Can Be Performed Using the tMap Component?
tMap is a core component of Talend's ‘Processing’ family. It allows you to map input data to output data.

Its functions are:

1. It allows you to add or remove columns
2. Transformation rules can be applied to any type of field
3. Input data and output data can be filtered using the specified constraints
4. It allows you to reject data
5. You can multiplex or demultiplex data using the tMap component
6. It allows you to concatenate data
7. It allows you to interchange data
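
As a rough illustration of the functions listed above, the following plain-Java sketch imitates what a tMap does to each row: it applies a transformation, concatenates two columns, filters on a constraint, and sends the remaining rows to a reject output. This is not code generated by Talend. The class `TMapSketch`, the row layout, and the filter rule are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not code generated by Talend.
class TMapSketch {

    // Two output flows: rows that pass the filter, and rejected rows.
    static final class Result {
        final List<String> main = new ArrayList<>();
        final List<String> rejects = new ArrayList<>();
    }

    // Each input row is {firstName, lastName, country}.
    static Result map(List<String[]> input) {
        Result out = new Result();
        for (String[] row : input) {
            // Transformation and concatenation: build a single fullName column.
            String fullName = row[0].trim() + " " + row[1].trim().toUpperCase();
            // Filter constraint (assumed rule): keep only rows whose country is "IN".
            if ("IN".equals(row[2])) {
                out.main.add(fullName);
            } else {
                out.rejects.add(fullName); // reject output
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Result r = map(Arrays.asList(
                new String[] {"Asha ", "rao", "IN"},
                new String[] {"John", "smith", "US"}));
        System.out.println("main=" + r.main + " rejects=" + r.rejects);
    }
}
```

In a real Job the same logic is configured in the tMap editor through expressions and filter settings rather than written as a loop.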

Explain The Various Types of Connections in Talend.

Row: This connection represents the data flow. Some row connections are Lookup, Multiple Input/Output and Uniques/Duplicates. Apart from these, Filter, Output, Rejects and ErrorRejects are also row connections.
Iterate: Using the Iterate connection, you can loop over files in a directory, rows, or database entries.
Trigger: A Trigger connection creates a dependency between Jobs or Subjobs, which are then triggered one after another according to the nature of the trigger.
Link: Using the Link connection, a user can transfer the information in a table schema to the ELT mapper in Talend.

What are The Types of Triggers in Talend?


There are two categories of Triggers:

1. Subjob Triggers, which include OnSubjobOk, OnSubjobError and Run if. OnSubjobOk is triggered once the previous Subjob has finished executing without error.

2. Component Triggers, which include OnComponentOk, OnComponentError and Run if. OnComponentOk is triggered once the previous component has finished executing without error.

Define Context Variables


Context variables are user-defined parameters that a Job can access at runtime. Their values can change as the Job moves from the Development stage to the Test and Production stages.

There are three ways to define Context Variables:

1. Embedded Context Variables
2. Repository Context Variables
3. External Context Variables

What is The Use of tContextLoad?


tContextLoad is part of Talend’s ‘Misc’ components. It loads a context from an incoming data flow, which lets you modify the values present in the active context at runtime.
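
The idea behind tContextLoad can be sketched in plain Java: start from the default context values and let incoming key/value pairs override them. This is only a sketch of the concept, not Talend's implementation. The class `ContextLoadSketch` and the variable names `db_host` and `db_port` are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch of the idea behind tContextLoad; not Talend's own code.
class ContextLoadSketch {

    // Copy the default context, then override it with incoming "key=value" lines.
    static Properties load(Properties defaults, String keyValueLines) {
        Properties context = new Properties();
        context.putAll(defaults);
        Properties incoming = new Properties();
        try {
            incoming.load(new StringReader(keyValueLines)); // parses key=value pairs
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        context.putAll(incoming); // loaded values replace the defaults
        return context;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("db_host", "localhost"); // assumed variable names
        defaults.setProperty("db_port", "5432");
        Properties ctx = load(defaults, "db_host=prod-db.example.com\n");
        System.out.println(ctx.getProperty("db_host") + ":" + ctx.getProperty("db_port"));
    }
}
```

Variables that are not present in the incoming flow keep their default values, which is why only `db_host` changes in this example.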

Differentiate between ‘OnComponentOk’ and ‘OnSubjobOk’.

| OnComponentOk | OnSubjobOk |
|---|---|
| 1. Belongs to Component Triggers | 1. Belongs to Subjob Triggers |
| 2. The linked Subjob starts executing only when the previous component successfully finishes its execution | 2. The linked Subjob starts executing only when the previous Subjob completely finishes its execution |
| 3. This link can be used with any component in a Job | 3. This link can only be used with the first component of the Subjob |

Differentiate between tMap and tJoin.

| tMap | tJoin |
|---|---|
| 1. It is a powerful component which can handle complicated cases | 1. Can only handle basic Join cases |
| 2. Can accept multiple input links (one is main and the rest are lookups) | 2. Can accept only two input links (main and lookup) |
| 3. Can have more than one output link | 3. Can have only two output links (main and reject) |
| 4. Supports multiple types of join models like unique join, first join, and all join etc. | 4. Supports only unique join |
| 5. Supports inner join and left outer join | 5. Supports only inner join |
| 6. Can filter data using filter expressions | 6. Cannot filter data using filter expressions |

Explain the purpose of tDenormalizeSortedRow.

tDenormalizeSortedRow belongs to the ‘Processing’ family of components. It synthesizes a sorted input flow in order to save memory: all sorted input rows that belong to the same group are combined into one row, in which the distinct values are joined with item separators.
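
A minimal plain-Java sketch of this kind of denormalization is shown below. Because the input is already sorted by the group key, only the current group needs to be held in memory. The class `DenormalizeSortedSketch`, the two-column row layout, and the `key|values` output format are assumptions for illustration, not Talend's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not code generated by Talend.
class DenormalizeSortedSketch {

    // Input rows are {key, value} and must already be sorted by key.
    static List<String> denormalize(List<String[]> sortedRows, String separator) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        StringBuilder values = new StringBuilder();
        for (String[] row : sortedRows) {
            // A new key means the previous group is complete: emit it and reset.
            if (currentKey != null && !currentKey.equals(row[0])) {
                out.add(currentKey + "|" + values);
                values.setLength(0);
            }
            if (values.length() > 0) {
                values.append(separator);
            }
            values.append(row[1]);
            currentKey = row[0];
        }
        // Emit the last group, if any rows were read.
        if (currentKey != null) {
            out.add(currentKey + "|" + values);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"A", "1"},
                new String[] {"A", "2"},
                new String[] {"B", "3"});
        System.out.println(denormalize(rows, ",")); // prints [A|1,2, B|3]
    }
}
```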

Discuss the difference between Xmx and Xms parameters.

The Xms parameter (JVM option -Xms) specifies the initial heap size in Java, whereas the Xmx parameter (JVM option -Xmx) specifies the maximum heap size in Java.
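
To see the effect of these two options, a small Java program can print the heap sizes reported by the JVM. The class name `HeapSettingsSketch` and the values in the sample command are illustrative assumptions. For example, it could be started with `java -Xms256m -Xmx1024m HeapSettingsSketch`.

```java
// Hypothetical helper for inspecting the heap limits the JVM was started with.
class HeapSettingsSketch {

    private static final long MB = 1024L * 1024L;

    // Upper bound the heap may grow to (controlled by -Xmx).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    // Heap currently reserved by the JVM (starts near the -Xms value).
    static long currentHeapMb() {
        return Runtime.getRuntime().totalMemory() / MB;
    }

    public static void main(String[] args) {
        System.out.println("current heap (MB): " + currentHeapMb());
        System.out.println("maximum heap (MB): " + maxHeapMb());
    }
}
```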

Explain the error handling in Talend.

There are a few ways in which errors in Talend can be handled:

o For simple Jobs, one can rely on the exception throwing process of Talend Open Studio, which is displayed in the Run View as a red stack trace.
o Each Subjob and component returns a code that drives additional processing. The Subjob Ok/Error and Component Ok/Error links can be used to direct the error towards an error handling routine.
o The basic way of handling an error is to define an error handling Subjob which executes whenever an error occurs.
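
The routing idea in the second and third points can be pictured with an ordinary try/catch. The sketch below is only an analogy in plain Java. The class `ErrorRoutingSketch` and the log messages are hypothetical, and in a real Job the routing is done with trigger links rather than handwritten code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical analogy for Ok/Error trigger routing; not Talend-generated code.
class ErrorRoutingSketch {

    // Runs a "Subjob" and records which branch would be followed afterwards.
    static List<String> run(Runnable subjob) {
        List<String> log = new ArrayList<>();
        try {
            subjob.run();
            log.add("OnSubjobOk: continue with the next Subjob");
        } catch (RuntimeException e) {
            // The error branch hands control to an error handling routine.
            log.add("OnSubjobError: " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(() -> { }));
        System.out.println(run(() -> { throw new IllegalStateException("input file missing"); }));
    }
}
```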

Differentiate between the usage of tJava, tJavaRow, and tJavaFlex components.

| Functions | tJava | tJavaRow | tJavaFlex |
|---|---|---|---|
| 1. Can be used to integrate custom Java code | Yes | Yes | Yes |
| 2. Will be executed only once at the beginning of the Subjob | Yes | No | No |
| 3. Needs input flow | No | Yes | No |
| 4. Needs output flow | No | Only if output schema is defined | Only if output schema is defined |
| 5. Can be used as the first component of a Job | Yes | No | Yes |
| 6. Can be used as a different Subjob | Yes | No | Yes |
| 7. Allows Main Flow or Iterator Flow | Both | Only Main | Both |
| 8. Has three parts of Java code | No | No | Yes |
| 9. Can auto propagate data | No | No | Yes |
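
Row 8 of the table refers to the three code parts of tJavaFlex: a start part that runs once before the first row, a main part that runs for every row, and an end part that runs once after the last row. The plain-Java sketch below shows how such parts fit around a row loop. The class `JavaFlexSketch` and its row handling are made up for the example and are not what Talend generates.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the start / main / end structure; not Talend-generated code.
class JavaFlexSketch {

    static String process(List<String> rows) {
        // Start part: runs once, before the first row.
        int count = 0;
        StringBuilder result = new StringBuilder();

        for (String row : rows) {
            // Main part: runs once for every row.
            count++;
            result.append(row.toUpperCase()).append(';');
        }

        // End part: runs once, after the last row.
        result.append("rows=").append(count);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("a", "b"))); // prints A;B;rows=2
    }
}
```

By comparison, a tJavaRow only supplies the equivalent of the main part, and a tJava runs its code once without a row loop.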

How can you execute a Talend Job remotely?

You can execute a Talend Job remotely from the command line. To do so, export the Job along with its dependencies and then run its launcher files from the terminal.

What is the purpose of ‘tXMLMap’ component?

This component transforms and routes data from single or multiple sources to single or multiple destinations. It is an advanced component designed for transforming and routing XML data flows, especially when numerous XML data sources need to be processed.

How can you improve the performance of a Talend Job which has a complex design?

To improve the performance of a Talend Job, we can do the following things:

o Remove redundant fields/columns using the tFilterColumns component
o Remove unwanted data/records using the tFilterRow component
o Use a Select query to retrieve data from the database
o Use Database Bulk components
o Use Talend ELT components when needed
o Split the Talend Job into smaller Subjobs

Which types of joins are supported by the tMap component?

The tMap component supports multiple join types and join models, which are as follows:

Joins: Inner join, Left join

Join models: Unique join, First join, All join, etc.
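
The difference between the join models shows up when a main row has several matching lookup rows. The plain-Java sketch below is one way to picture it, assuming that a unique join keeps the last match, a first join keeps the first match, and an all join emits one output row per match. The class `JoinModelSketch`, its enum, and the `key=value` output format are hypothetical, so check the behavior against the tMap documentation for your Talend version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of join types and join models; not Talend-generated code.
class JoinModelSketch {

    enum Model { UNIQUE_MATCH, FIRST_MATCH, ALL_MATCHES }

    // mainKeys are the join keys of the main flow; lookup rows are {key, value}.
    static List<String> join(List<String> mainKeys, List<String[]> lookup,
                             Model model, boolean innerJoin) {
        List<String> out = new ArrayList<>();
        for (String key : mainKeys) {
            List<String> matches = new ArrayList<>();
            for (String[] row : lookup) {
                if (row[0].equals(key)) {
                    matches.add(row[1]);
                }
            }
            if (matches.isEmpty()) {
                // Inner join drops the main row; left join keeps it with a null value.
                if (!innerJoin) {
                    out.add(key + "=null");
                }
                continue;
            }
            switch (model) {
                case FIRST_MATCH:
                    out.add(key + "=" + matches.get(0));
                    break;
                case UNIQUE_MATCH:
                    // Assumption: only the last matching lookup row is kept.
                    out.add(key + "=" + matches.get(matches.size() - 1));
                    break;
                default:
                    for (String m : matches) {
                        out.add(key + "=" + m);
                    }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> main = Arrays.asList("1", "2");
        List<String[]> lookup = Arrays.asList(new String[] {"1", "a"}, new String[] {"1", "b"});
        System.out.println(join(main, lookup, Model.ALL_MATCHES, false)); // prints [1=a, 1=b, 2=null]
    }
}
```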

What is the tReplicate component?

The tReplicate component duplicates the incoming schema into two identical output flows, which allows us to perform different operations on the same schema. It can be used to replicate a row as many times as needed.

What are SQL templates?

Talend Studio provides a range of SQL templates that simplify the most common tasks. It also contains an SQL editor that allows us to customize existing templates or design our own.
SQL templates are used with the components of the Talend ELT family, which includes tSQLTemplate, tSQLTemplateFilterColumns, tSQLTemplateRollback, tSQLTemplateCommit, tSQLTemplateAggregate, tSQLTemplateFilterRows and tSQLTemplateMerge. These components execute the selected SQL statements.

With the help of these SQL templates, we can make more efficient use of our DBMS (database management system) by storing and retrieving data according to its structural requirements.

Explain the tJoin component.

The tJoin component performs inner and outer joins between the main data flow and the lookup flow. It helps us ensure the data quality of any source data against a reference data source.

Why do we use the tLogRow component in Talend?

The tLogRow component displays data or results in the Run console window. It is mainly used to monitor the data being processed.

Why do we use the tSortRow component?

The tSortRow component sorts the input data based on one or more columns, by sort type and order.

Its main purpose is to help us create metrics and classification tables.
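
Sorting on several columns, each with its own sort type and order, can be sketched in plain Java with chained comparators. The class `SortRowSketch` and the two-column row layout (city, amount) are assumptions for the example. Here the first column is sorted alphabetically in ascending order and the second numerically in descending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of a multi-column sort; not Talend-generated code.
class SortRowSketch {

    // Rows are {city, amount}.
    static List<String[]> sort(List<String[]> rows) {
        // Criterion 1: city, alphabetical, ascending.
        Comparator<String[]> byCity = Comparator.comparing(r -> r[0]);
        // Criterion 2: amount, numeric, descending.
        Comparator<String[]> byAmountDesc = Comparator.comparing(
                (String[] r) -> Integer.valueOf(r[1]), Comparator.<Integer>reverseOrder());

        List<String[]> sorted = new ArrayList<>(rows); // leave the input untouched
        sorted.sort(byCity.thenComparing(byAmountDesc));
        return sorted;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"Pune", "10"},
                new String[] {"Delhi", "5"},
                new String[] {"Pune", "30"});
        for (String[] r : sort(rows)) {
            System.out.println(r[0] + " " + r[1]);
        }
    }
}
```

Parsing the amount as a number matters: sorted as text, "5" would come after "30".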

What is the tLoqateAddressRow component?

The tLoqateAddressRow component compares address data against reference data to make sure that it is correct and complete. If changes are needed, we can correct the spelling and add missing address data such as city, area of the city, postcode or region, and any other related data.

Why do we use the tXMLMap component?

The tXMLMap component is used to transform and route data from single or multiple sources to single or multiple destinations.

http://rathinasamyy.blogspot.com/2015/02/talend-interview-questions-and-answers.html

1.(http://www.deepinopensource.com/talend-interview-questions/)
1. Talend – Merge multiple files into single file with sorting operation.
2. Loading Fact Table Using Talend
3. ROWNUM Analytical Function in Talend
4. SCD-2 Implementations in Talend
5. Deployment strategies in Talend
6. Custom Header Footer in Talend
7. Data Masking Using Talend
8. How to use Shared DB Connection in Talend
9. Load all rows from source to target except last 5
10. Late Arriving Dimension Using Talend
11. Date Dimension Using Talend
12. Dynamic Column Ordering Of Source File Using Talend
13. Incremental Load Using Talend
14. Getting Files From FTP Server
15. Initializing Context At Run Time Using Popup
16. User-Defined Function In Talend
17. Calling DB Sequence From Talend

2.(http://www.talendtutorials.com/talend-interview-questions)
1. Difference between tAggregateRow and tAggregateSortedRow in Talend
2. How to resume Job execution from the same location if the Job fails in Talend
3. How to execute more than one Subjob in parallel in Talend
4. How to iterate over filenames and directories in Talend
5. What is the difference between OnSubjobOk and OnComponentOk in Talend
6. How can you pass a value from a parent Job to a child Job in Talend
7. How to call a stored procedure and function in a Talend Job
8. How to export a Job and execute it outside Talend Studio
9. How to pass a value from outside into Talend
10. Can I define the schema of a database or tables at run time
11. What is tReplicate in Talend
12. What is tUnite in Talend Open Studio
13. How to optimize a Talend Job to stop the outOfMemory runtime error
14. How to optimize Talend performance
15. How to execute multiple SQL statements with one component in Talend
16. What is the tSystem component in Talend
17. Can I execute multiple commands at one time with a tSystem component
18. What is the difference between tMap and tFilterRow in Talend

3.(http://www.cram.com/flashcards/talend-interview-questions-5197224)
1. What is the difference between the ETL and ELT components of Talend Open Studio?
2. How does one deploy Talend projects?
3. What are the elements of a Talend project?
4. What is the most current version of Talend Open Studio?
5. How do you implement versioning for Talend jobs?
6. What is the tMap component?
7. What is the difference between the tMap and tJoin components?
8. Which component is used to sort data?

4.(http://msureshreddy.blogspot.in/2013/08/talend-interview-questions.html)
5. How to implement versioning for Talend Jobs?
6. What is the tMap component?
7. What is the difference between the tMap and tJoin components?
8. Which component is used to sort the data?
9. How to perform aggregate operations/functions on data in Talend?
10. What types of joins are supported by the tMap component?
11. How to schedule a Talend Job?
12. How to run a Talend Job as a web service?
13. How to integrate SVN with Talend?
14. How to run Talend Jobs on a remote server?
15. How to pass data from a parent Job to child Jobs through the tRunJob component?
16. How to load context variables dynamically from a file/database?
17. How to run Talend Jobs in parallel?
18. What are context variables?
19. How to export a Talend Job?

5.(Talend best practices)
1. The Talend workspace path should not contain any spaces.
2. Never forget to perform null handling.
3. Create Repository Metadata for DB connections and retrieve the database table schema for DB tables.
4. Use Repository Schema for Files/DB and DB connections.
5. Create the database connection using the t<Vendor>Connection component and use this connection in the Job. Do not make a new connection with every component.
6. Always close the connection to the database using the t<Vendor>Close component.
7. Create a Repository Document corresponding to every Talend Job, including revision history.
8. Provide a Subjob title for every Subjob to describe its purpose/objective.
9. Avoid hard coding in Talend Job components. Instead use Talend context variables.
10. Create Context Groups in the Repository.
11. Use a Talend.properties file to provide the values of context variables using tContextLoad.
12. Create variables in tMap and use the variables to assign values to target fields.
13. Create user routines/functions for common transformations and validations.
14. Develop Talend Jobs iteratively.
15. Always exit Talend Open Studio before shutting down the PC.
16. Always rename Main flows in a Talend Job to meaningful names.
17. Always design Talend Jobs with performance in mind.
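
Null handling (point 2 above) usually comes down to guarding an expression before a method is called on a value that may be null. The plain-Java sketch below shows the pattern with a ternary expression. The class `NullHandlingSketch` and its helper names are hypothetical. Similar one-line guards can be written in any place where a Job accepts a Java expression.

```java
// Hypothetical helpers showing null-safe expressions; not part of Talend.
class NullHandlingSketch {

    // Returns the trimmed value, or a default when the input is null.
    static String safeTrim(String value, String defaultValue) {
        return value == null ? defaultValue : value.trim();
    }

    // Returns the length of the value, treating null as an empty string.
    static int safeLength(String value) {
        return value == null ? 0 : value.length();
    }

    public static void main(String[] args) {
        System.out.println(safeTrim("  Pune ", "UNKNOWN")); // prints Pune
        System.out.println(safeTrim(null, "UNKNOWN"));      // prints UNKNOWN
        System.out.println(safeLength(null));               // prints 0
    }
}
```
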

Talend Job Design - Performance Optimization Tips
1. Remove unnecessary fields/columns as early as possible using the tFilterColumns component.
2. Remove unnecessary data/records as early as possible using the tFilterRow component.
3. Use a Select query to retrieve data from the database.
4. Use Database Bulk components.
5. Use the Store on Disk option.
6. Allocate more memory to the Jobs.
7. Use parallelism.
8. Use Talend ELT components when required.
9. Use the SAX parser over Dom4J whenever required.
10. Index database table columns.
11. Split the Talend Job into smaller Subjobs.
