This tutorial shows how to read a CSV file using Talend Open Studio for Data Integration. The key steps are:
1. Create a new job and add a tFileInputDelimited component to define the CSV file input
2. Configure the tFileInputDelimited component by specifying the file path, schema, and delimiter settings
3. Add a tLogRow component to output the file contents and link it to tFileInputDelimited
4. Run the job to read the CSV file and display its contents in the log
This tutorial uses Talend Open Studio for Data Integration version 6.
1. Create a New Job
a. Ensure that the Integration perspective is selected.
b. In the Project Repository, right-click Job Designs and click Create Standard Job in the menu.
c. In the Name field of the New Job wizard, enter readCSVFile as the name of the Job.
d. It is good practice to add a purpose and a description to a Job. Then, click Finish to create your Job.
The Job Designer opens an empty Job.
2. Add a tFileInputDelimited component
3. Configure the tFileInputDelimited_1 component
a. In the Job Designer, click the tFileInputDelimited_1 component.
b. To define the Basic settings for the component, in the Component view, click the Component tab.
   o Property Type defines how you will read the data source.
   o File Name/Stream shows the complete path of the input file. You can either type the path manually or use the ellipsis button [...] to browse to the file.
   o Row Separator and Field Separator define the characters that separate rows and fields in the file.
   o Header and Footer indicate the number of rows at the start and at the end of the file that should be ignored.
   o Limit shows the maximum number of lines to read from the file.
   o Schema defines the data structure of the file.
c. To specify the path and name of the file to be read, click [...] next to the File Name field, select the file on the local disk, and click Open.
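To make the effect of these settings concrete, here is a minimal plain-Python sketch of what a delimited-file reader does with a field separator, header, footer, and limit. It is not the code Talend generates; the function name `read_delimited`, the `;` separator, and the sample data (including the `title` column) are assumptions made for illustration.

```python
import csv
import io


def read_delimited(stream, field_separator=";", header=0, footer=0, limit=None):
    """Return the data rows of a delimited file as lists of strings.

    header and footer are the number of rows to ignore at the start and
    at the end of the file; limit caps the number of rows returned.
    """
    rows = list(csv.reader(stream, delimiter=field_separator))
    rows = rows[header:]                      # ignore the header rows
    if footer:
        rows = rows[:-footer] if footer < len(rows) else []  # ignore the footer rows
    if limit is not None:
        rows = rows[:limit]                   # keep at most `limit` rows
    return rows


# Hypothetical file content: one header row, two data rows, one footer row.
sample = io.StringIO("movieID;title\n1;Alien\n2;Brazil\n-- end of file --\n")
rows = read_delimited(sample, header=1, footer=1)
for row in rows:
    print(row)
```

With Header and Footer both set to 1, only the two data rows remain; raising Limit above 2 would change nothing for this sample, while setting it to 1 would keep only the first data row.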
Talend takes the complexity out of integration
Based on open source Scalable Future-proof Predictable cost Visit www.talend.com Follow us on Twitter @Talend Talend Tutorial Task Aid >
4. Define the schema for the tFileInputDelimited_1 component
a. To define the schema for the tFileInputDelimited_1 component, click [...] next to the Edit schema field.
The Schema of tFileInputDelimited_1 wizard opens.
   o The [+] button adds a column to the schema.
   o The [x] button removes the selected items from the schema.
   o The up and down arrow buttons move selected items up or down in the schema.
b. In the Schema wizard, click the [+] button to add a column.
c. In the Column column, enter the field name as movieID.
d. To designate this field as the key, select the Key checkbox.
e. In the Type column, select Integer.
f. Ensure that the Nullable checkbox is cleared, so that any null value for this column is rejected.
g. In the Length column, enter 4.
h. Repeat steps b to g for each field in the CSV file.
i. To close the Schema wizard, click OK.
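The schema defined above can be pictured as a list of column descriptions that each incoming row is checked against. The following plain-Python sketch illustrates that idea only; the `Column` class, the `apply_schema` function, and the `title` column are hypothetical, and the sketch records Length without enforcing it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    name: str
    type: type
    key: bool = False
    nullable: bool = True
    length: Optional[int] = None  # recorded only; this sketch does not enforce it


def apply_schema(row, schema):
    """Convert a row of strings into a dict of typed values.

    Raises ValueError if the row has the wrong number of fields, if a
    non-nullable column is empty, or if a value cannot be converted.
    """
    if len(row) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(row)}")
    record = {}
    for raw, column in zip(row, schema):
        if raw == "":
            if not column.nullable:
                raise ValueError(f"{column.name} must not be null")
            record[column.name] = None
        else:
            record[column.name] = column.type(raw)
    return record


# movieID follows the tutorial; title is an assumed second column.
schema = [
    Column("movieID", int, key=True, nullable=False, length=4),
    Column("title", str),
]
record = apply_schema(["12", "Alien"], schema)
print(record)
```

A row whose movieID field is empty is rejected with a ValueError here, which mirrors the effect of leaving Nullable unchecked for that column.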
5. Add the logging component and propagate the data
a. Add a tLogRow component to the Job. The tLogRow component displays in the console all the rows of data it receives.
b. To propagate data from the tFileInputDelimited_1 component to the tLogRow_1 component, in the Job Designer, right-click tFileInputDelimited_1, hold, and drag to tLogRow_1.
Alternative method: To link the components, you can also right-click the source component and click Row > Main.
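Conceptually, a Row > Main link passes each row produced by the source component to the next component in the Job. The sketch below models that flow with plain Python generators. It is an analogy, not Talend's implementation; the function names, the sample rows, and the `|` separator used when printing are assumptions.

```python
def source():
    """Stand-in for the input component: yields one row at a time."""
    yield (1, "Alien")
    yield (2, "Brazil")


def log_row(rows, separator="|"):
    """Stand-in for the logging component: prints each row, then passes it on."""
    for row in rows:
        print(separator.join(str(value) for value in row))
        yield row


# Linking the two stages plays the role of the Row > Main connection.
passed = list(log_row(source()))
```

Because `log_row` yields every row after printing it, a further stage could be chained after it without changing the source.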
6. Run the Job
a. In the Run view for the Job readCSVFile, click Run.
The tFileInputDelimited component reads the file, and the tLogRow component displays its content on the console.