SSIS Tutorial: SQL Server 2005 Integration Services Tutorial
Downloads Required:
Exercise Files
Sample DB from CodePlex
Chapter 16: SQL Server Integration Services
In this chapter:
• The Import and Export Wizard
• Creating a Package
• Working with Connection Managers
• Building Control Flows
• Building Data Flows
• Creating Event Handlers
• Saving and Running Packages
Files needed:
• ISProject1.zip
• ISProject2.zip
Microsoft says that SQL Server Integration Services (SSIS) “is a platform for building high
performance data integration solutions, including extraction, transformation, and load
(ETL) packages for data warehousing.” A simpler way to think of SSIS is that it’s the
solution for automating SQL Server. SSIS provides a way to build packages made up of
tasks that can move data around from place to place and alter it on the way. There are
visual designers (hosted within Business Intelligence Development Studio) to help you
build these packages as well as an API for programming SSIS objects from other
applications.
In this chapter, you’ll see how to build and use SSIS packages. First, though, we’ll look at
a simpler facet of SSIS: The SQL Server Import and Export Wizard.
If you choose to use the supplied solution files rather than building your
own, you’ll need to edit the properties of the OLE DB Connection
Managers within the projects to point to your own test server. You’ll learn
more about Connection Managers in the "Working with Connection
Managers" section later in this chapter.
16. Click the Edit button in the Mapping column for the HumanResources.Department
table.
17. The Column Mappings dialog box lets you change the name, data type, and other
properties of the destination table columns. You can also set other options here,
such as whether to overwrite or append data when importing data to an existing
table. Click Cancel when you’re done inspecting the options.
18. Click Next.
19. Check Execute Immediately and click Next.
20. Click Finish to perform the import. SQL Server will display progress as it performs
the import, as shown in Figure 16-2.
Figure 16-2: Import Wizard results
12. Click the Advanced icon to move to the Advanced page of the dialog box.
13. Click the New button.
14. Change the Name of the new column to DepartmentName.
15. Click OK.
16. Right-click the DepartmentList Connection Manager and select Copy.
17. Right-click in the Connection Managers area and select Paste.
18. Click on the new DepartmentList 1 connection to select it.
19. Use the Properties Window to change properties of the new connection. Change
the Name property to DepartmentListBackup. Change the ConnectionString
property to C:\DepartmentsBackup.txt.
Figure 16-5 shows the SSIS package with the three Connection Managers defined.
Task                           Purpose
Back Up Database               Back up an entire database to file or tape
Check Database Integrity       Perform database consistency checks
Execute SQL Server Agent Job   Run a job
Execute T-SQL Statement        Run any T-SQL script
History Cleanup                Clean out history tables for other maintenance tasks
Maintenance Cleanup            Clean up files left by other maintenance tasks
Notify Operator                Send e-mail to SQL Server operators
Rebuild Index                  Rebuild a SQL Server index
Reorganize Index               Compacts and defragments an index
Shrink Database                Shrinks a database
Update Statistics              Update statistics used to calculate query plans
Table 16-3: SSIS maintenance plan tasks
Container   Purpose
For Loop    Repeat a task a fixed number of times
Foreach     Repeat a task by enumerating over a group of objects
Sequence    Group multiple tasks into a single unit for easier management
Table 16-4: SSIS containers
Try It!
To add control flows to the package you've been building, follow these steps:
1. If the Toolbox isn't visible already, hover your mouse over the Toolbox tab until it
slides out from the side of the BIDS window. Use the pushpin button in the Toolbox
title bar to keep the Toolbox visible.
2. Make sure the Control Flow tab is selected in the Package Designer.
3. Drag a File System Task from the Toolbox and drop it on the Package Designer.
4. Drag a Data Flow Task from the Toolbox and drop it on the Package Designer,
somewhere below the File System Task.
5. Click on the File System Task on the Package Designer to select it.
6. Drag the green arrow from the bottom of the File System Task and drop it on top
of the Data Flow Task. This tells SSIS the order of tasks when the File System Task
succeeds.
7. Double-click the connection between the two tasks to open the Precedence
Constraint Editor.
8. Change the Value from Success to Completion, because you want the Data Flow
Task to execute whether the File System Task succeeds or not.
9. Click OK.
10. Select the File System Task in the designer. Use the Properties Window to set
properties of the File System Task. Set the Source property to DepartmentList. Set
the Destination property to DepartmentListBackup. Set the
OverwriteDestinationFile property to True.
Figure 16-6 shows the completed set of control flows.
As it stands, this package uses the File System Task to copy the file specified by the
DepartmentList connection to the file specified by the DepartmentListBackup connection,
overwriting any target file that already exists. Because the precedence constraint is set to
Completion, it then executes the Data Flow Task whether or not the copy succeeded. In the
next section, you’ll see how to configure the Data Flow Task.
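The control flow built in the steps above can be modeled in a few lines of Python. This is only a conceptual sketch, not SSIS code or the SSIS API: the function names (file_system_task, data_flow_task, run_package) and the temporary file paths are assumptions made for illustration. It shows how a Completion constraint differs from a Success constraint when the copy fails.

```python
import shutil
import tempfile
from pathlib import Path


def file_system_task(source: Path, destination: Path) -> bool:
    """Copy source over destination; report success instead of raising."""
    try:
        shutil.copyfile(source, destination)  # overwrites an existing file
        return True
    except OSError:
        return False


def data_flow_task(log: list) -> None:
    """Placeholder for the data flow configured in the next section."""
    log.append("data flow ran")


def run_package(source: Path, destination: Path, constraint: str = "Completion") -> list:
    """Run the two tasks in order, honoring the precedence constraint."""
    log = []
    succeeded = file_system_task(source, destination)
    log.append("copy succeeded" if succeeded else "copy failed")
    # "Success" runs the next task only after a successful copy;
    # "Completion" runs it either way.
    if constraint == "Completion" or (constraint == "Success" and succeeded):
        data_flow_task(log)
    return log


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        missing = Path(tmp) / "Departments.txt"  # deliberately absent
        backup = Path(tmp) / "DepartmentsBackup.txt"
        print(run_package(missing, backup, "Completion"))
        print(run_package(missing, backup, "Success"))
```

With the source file missing, the Completion run still reaches the data flow step, while the Success run stops after the failed copy.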
SSIS Tutorial: Building Data Flows
The Data Flow tab of the Package Designer is where you specify the details of any Data
Flow tasks that you’ve added on the Control Flow tab. Data Flows are made up of various
objects that you drag and drop from the Toolbox:
• Data Flow Sources are ways that data gets into the system. Table 16-5 lists the
available data flow sources.
• Data Flow Transformations let you alter and manipulate the data in various ways.
Table 16-6 lists the available data flow transformations.
• Data Flow Destinations are the places that you can send the transformed data.
Table 16-7 lists the available data flow destinations.
Source       Use
DataReader   Extracts data from a database using a .NET DataReader
Excel        Extracts data from an Excel workbook
Flat File    Extracts data from a flat file
OLE DB       Extracts data from a database using an OLE DB provider
Raw File     Extracts data from a raw file
XML          Extracts data from an XML file
Table 16-5: Data flow sources
Transformation              Effect
Aggregate                   Aggregates and groups values in a dataset
Audit                       Adds audit information to a dataset
Character Map               Applies string operations to character data
Conditional Split           Evaluates and splits up rows in a dataset
Copy Column                 Copies a column of data
Data Conversion             Converts data to a different datatype
Data Mining Query           Runs a data mining query
Derived Column              Calculates a new column from existing data
Export Column               Exports data from a column to a file
Fuzzy Grouping              Groups rows that contain similar values
Fuzzy Lookup                Looks up values using fuzzy matching
Import Column               Imports data from a file to a column
Lookup                      Looks up values in a reference dataset
Merge                       Merges two sorted datasets
Merge Join                  Merges data from two datasets by using a join
Multicast                   Creates copies of a dataset
OLE DB Command              Executes a SQL command on each row in a dataset
Percentage Sampling         Extracts a subset of rows from a dataset
Pivot                       Builds a pivot table from a dataset
Row Count                   Counts the rows of a dataset
Row Sampling                Extracts a sample of rows from a dataset
Script Component            Executes a custom script
Slowly Changing Dimension   Updates a slowly changing dimension in a data warehouse
Sort                        Sorts data
Term Extraction             Extracts terms from the text in a column
Term Lookup                 Looks up the frequency of a term in a column
Union All                   Merges multiple datasets
Unpivot                     Normalizes a pivot table
Table 16-6: Data Flow Transformations
Destination                  Use
Data Mining Model Training   Sends data to an Analysis Services data mining model
DataReader                   Sends data to an in-memory ADO.NET DataReader
Dimension Processing         Processes a cube dimension
Excel                        Sends data to an Excel worksheet
Flat File                    Sends data to a flat file
OLE DB                       Sends data to an OLE DB database
Partition Processing         Processes an Analysis Services partition
Raw File                     Sends data to a raw file
Recordset                    Sends data to an in-memory ADO Recordset
SQL Server                   Sends data to a SQL Server database
SQL Server Mobile            Sends data to a SQL Server Mobile database
Table 16-7: Data Flow Destinations
Try It!
To customize the data flow task in the package you're building, follow these steps:
1. Select the Data Flow tab in the Package Designer. The single Data Flow Task in the
package will automatically be selected in the combo box.
2. Drag an OLE DB Source from the Toolbox and drop it on the Package Designer.
3. Drag a Character Map Transformation from the Toolbox and drop it on the Package
Designer.
4. Drag a Flat File Destination from the Toolbox and drop it on the Package Designer.
5. Click on the OLE DB Source on the Package Designer to select it.
6. Drag the green arrow from the bottom of the OLE DB Source and drop it on top of
the Character Map Transformation.
7. Click on the Character Map Transformation on the Package Designer to select it.
8. Drag the green arrow from the bottom of the Character Map Transformation and
drop it on top of the Flat File Destination.
9. Double-click the OLE DB Source to open the OLE DB Source Editor.
10. Select the HumanResources.Department table. Figure 16-7 shows the completed
OLE DB Source Editor.
The data flows in this package take a table from the Chapter16 database, transform one
of the columns in that table to all uppercase characters, and then write that transformed
column out to a flat file.
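To make the source, transformation, and destination roles concrete, here is a rough Python analogy of this data flow. It is not how SSIS is implemented: the three function names are stand-ins invented for this sketch, the sample rows are illustrative, and an in-memory stream takes the place of the flat file.

```python
import csv
import io


def ole_db_source():
    """Stand-in for the OLE DB Source: yields rows as dictionaries.

    The sample rows are illustrative, not read from a real database.
    """
    yield {"DepartmentID": 1, "Name": "Engineering"}
    yield {"DepartmentID": 2, "Name": "Tool Design"}


def character_map(rows, column):
    """Stand-in for the Character Map transformation: uppercase one column."""
    for row in rows:
        yield {**row, column: row[column].upper()}


def flat_file_destination(rows, column, stream):
    """Stand-in for the Flat File Destination: write one column, one row per line."""
    writer = csv.writer(stream, lineterminator="\n")
    for row in rows:
        writer.writerow([row[column]])


if __name__ == "__main__":
    out = io.StringIO()
    flat_file_destination(character_map(ole_db_source(), "Name"), "Name", out)
    print(out.getvalue(), end="")
```

Each stage consumes rows from the one before it, which mirrors the way the green arrows connect the components on the Data Flow tab.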
SSIS Tutorial: Creating Event Handlers
SSIS packages also support a complete event system. You can attach event handlers to a
variety of events for the package itself or for the individual tasks within a package. Events
within a package “bubble up.” That is, suppose an error occurs within a task inside of a
package. If you’ve defined an OnError event handler for the task, then that event handler
is called. Otherwise, an OnError event handler for the package itself is called. If no event
handler is defined for the package either, the event is ignored.
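The lookup order described above can be sketched in Python as follows. This models only the behavior stated in this section (task handler first, then package handler, otherwise ignored). The Executable class and its raise_event method are hypothetical names for this illustration and are not part of the SSIS object model.

```python
class Executable:
    """A package or task that may define handlers and has an optional parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.handlers = {}  # event name -> callable(source_name, message)

    def raise_event(self, event, message):
        """Call the nearest handler for the event, walking up toward the package."""
        node = self
        while node is not None:
            handler = node.handlers.get(event)
            if handler is not None:
                handler(self.name, message)
                return node.name  # the level that handled the event
            node = node.parent
        return None  # no handler at any level: the event is ignored


if __name__ == "__main__":
    package = Executable("Package")
    task = Executable("File System Task", parent=package)
    package.handlers["OnError"] = lambda source, msg: print(f"package saw {source}: {msg}")
    task.raise_event("OnError", "copy failed")
```

Here the task defines no OnError handler of its own, so the error raised by the task is handled at the package level.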
Event handlers are defined on the Event Handlers tab of the Package Designer. When you
create an event handler, you handle the event by building an entire secondary SSIS
package, and you have access to the full complement of data flows, control flows, and
event handlers to deal with the original event.
By adding event handlers to the OnError event that call the Send Mail
task, you can notify operators by e-mail if anything goes wrong in the
course of running an SSIS package.
Try It!
To add an event handler to the package we’ve been building, follow these steps:
1. Open SQL Server Management Studio and connect to your test server.
2. Create a new query and select the Chapter16 database in the available databases
list on the toolbar.
3. Enter this text into a query window:
SQL Server also includes a command-line utility, dtexec, that lets you
run packages from batch files.
2. When the package finishes executing, click the hyperlink underneath the
Connection Managers pane to stop the debugger.
3. Click the Execution Results tab to see detailed information on the package, as
shown in Figure 16-12.
Figure 16-12: Information on package execution
All of the events you see in the Execution Results pane are things that
you can create event handlers to react to within the package. As you can
see, SSIS issues quite a number of events, from progress events to
warnings about extra columns of data that we retrieved but never used.
7. Click Execute.
8. Click Close twice to dismiss the progress dialog box and the Execute Package
Utility.
9. Enter this text into a query window with the Chapter16 database selected:
SELECT * FROM DepartmentExports
10. Click the Execute toolbar button to verify that the package was run. You should see
one entry for when the package was run from BIDS and one from when you ran it
from SQL Server Management Studio.
SSIS Tutorial: Exercises
One common use of SSIS is in data warehousing: collecting data from a variety of
different sources into a single database that can be used for unified reporting. In this
exercise you’ll use SSIS to perform a simple data warehousing task.
Use SSIS to create a text file, c:\EmployeeList.txt, containing the last names and network
logins of the AdventureWorks employees. Retrieve the last names from the
Person.Contact table in the AdventureWorks database. Retrieve the logins from the
HumanResources.Employee table in the Chapter16 database.
You can use the Merge Join data flow transformation to join data from two sources. One
tip: the inputs to this transformation need to be sorted on the joining column.
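The reason for the sorting requirement is easier to see in code. The following Python sketch shows a generic sort-merge inner join, which is the general technique a merge join relies on. It is not the SSIS implementation, and the function name and sample rows are made up for illustration.

```python
def merge_join(left, right, key):
    """Inner-join two lists of dicts that are already sorted on `key`.

    Both inputs are walked once, front to back, which only works
    because matching keys appear in the same order on each side.
    """
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair this left row with every right row sharing the key.
            k = j
            while k < len(right) and right[k][key] == lk:
                joined.append({**left[i], **right[k]})
                k += 1
            i += 1
    return joined


if __name__ == "__main__":
    # Hypothetical sample rows; the real data comes from the two databases.
    contacts = [{"ContactID": 3, "LastName": "Lee"}, {"ContactID": 1, "LastName": "Smith"}]
    employees = [{"ContactID": 1, "LoginID": "jsmith"}, {"ContactID": 2, "LoginID": "kjones"}]
    contacts.sort(key=lambda row: row["ContactID"])
    employees.sort(key=lambda row: row["ContactID"])
    for row in merge_join(contacts, employees, "ContactID"):
        print(row["LastName"], row["LoginID"])
```

If either input were unsorted, the single forward pass could skip past a matching row, which is why a Sort transformation sits in front of each input in the solution.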
Solutions to Exercises
1. Launch Business Intelligence Development Studio.
2. Select File > New > Project.
3. Select the Business Intelligence Projects project type.
4. Select the Integration Services Project template.
5. Select a convenient location.
6. Name the new project ISProject2 and click OK.
7. Right-click in the Connection Managers area of your new package and select New
OLE DB Connection.
8. Click New to create a new data connection.
9. In the Connection Manager dialog box, select the SQL Native Client provider.
10. Select your test server and provide login information.
11. Select the AdventureWorks database.
12. Click OK.
13. Right-click in the Connection Managers area of your new package and select New
OLE DB Connection.
14. Select the existing connection to the Chapter16 database and click OK.
15. Right-click in the Connection Managers area of your new package and select New
Flat File Connection.
16. Enter EmployeeList as the Connection Manager Name.
17. Enter C:\EmployeeList.txt as the File Name.
18. Check the Column Names in the First Data Row checkbox.
19. Click the Advanced icon to move to the Advanced page of the dialog box.
20. Click the New button.
21. Change the Name of the new column to LastName.
22. Click the New button.
23. Change the Name of the new column to Login.
24. Click OK.
25. Select the Control Flow tab in the Package Designer.
26. Drag a Data Flow Task from the Toolbox and drop it on the Package Designer.
27. Select the Data Flow tab in the Package Designer. The single Data Flow Task in the
package will automatically be selected in the combo box.
28. Drag an OLE DB Source from the Toolbox and drop it on the Package Designer.
29. Drag a second OLE DB Source from the Toolbox and drop it on the Package
Designer.
30. Drag a Sort Transformation from the Toolbox and drop it on the Package Designer.
31. Drag a second Sort Transformation from the Toolbox and drop it on the Package
Designer.
32. Drag a Merge Join Transformation from the Toolbox and drop it on the Package
Designer.
33. Drag a Flat File Destination from the Toolbox and drop it on the Package Designer.
34. Click on the first OLE DB Source on the Package Designer to select it.
35. Drag the green arrow from the bottom of the first OLE DB Source and drop it on
top of the first Sort Transformation.
36. Click on the second OLE DB Source on the Package Designer to select it.
37. Drag the green arrow from the bottom of the second OLE DB Source and drop it on
top of the second Sort Transformation.
38. Click on the first Sort Transformation on the Package Designer to select it.
39. Drag the green arrow from the bottom of the first Sort Transformation and drop it
on top of the Merge Join Transformation.
40. In the Input Output Selection dialog box, select the Merge Join Left Input.
41. Click OK.
42. Click on the second Sort Transformation on the Package Designer to select it.
43. Drag the green arrow from the bottom of the second Sort Transformation and drop
it on top of the Merge Join Transformation.
44. Click on the Merge Join Transformation on the Package Designer to select it.
45. Drag the green arrow from the bottom of the Merge Join Transformation and drop
it on top of the Flat File Destination. Figure 16-14 shows the Data Flow tab with
the connections between tasks.
Figure 16-14: Data flows to merge two sources
46. Double-click the first OLE DB Source to open the OLE DB Source Editor.
47. Select the connection to the AdventureWorks database.
48. Select the Person.Contact table.
49. Click OK.
50. Double-click the second OLE DB Source to open the OLE DB Source Editor.
51. Select the connection to the Chapter16 database.
52. Select the HumanResources.Employee table.
53. Click OK.
54. Double-click the first Sort Transformation.
55. Check the ContactID column.
56. Click OK.
57. Double-click the second Sort Transformation.
58. Check the ContactID column.
59. Click OK.
60. Double-click the Merge Join Transformation.
61. Check the Join Key checkbox for the ContactID column in both tables.
62. Check the selection checkbox for the LastName column in the left-hand table and
the ContactID and LoginID columns in the right-hand table. Figure 16-15 shows
the completed Merge Join Transformation Editor.