What is SSIS?
SSIS stands for SQL Server Integration Services. It is one of the services of SQL Server and was introduced for the first time in SQL Server 2005. It’s a powerful replacement for the legacy DTS (Data Transformation Services) that exists in SQL Server 2000.
Why we need SSIS?
The main uses of SSIS package are,
- Merging Data from Heterogeneous Data Stores Populating Data
- Warehouses and Data Marts Cleaning and Standardizing Data Building
- Business Intelligence into a Data Transformation Process Automating
- Administrative Functions and Data Loading
Comparison of SQL Server Editions for SSIS
Feature
|
Enterprise
|
BI
|
Standard
|
Web/Express
|
Import and Export Wizard
|
Yes |
Yes |
Yes |
Yes |
Basic Data Adapters
|
Yes |
Yes |
Yes |
Yes |
SSIS Designer
|
Yes |
Yes |
Yes |
|
Basic Transformations
|
Yes |
Yes |
Yes |
|
Advanced Data Adapters
|
Yes |
|
|
|
Advanced Transformations
|
Yes |
|
|
|
ETL Process
ETL (Extract, Transform, and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse. ETL involves the following tasks.
- extracting the data from source systems (SAP, ERP, other operational systems), data from different source systems is converted into one consolidated data warehouse format which is ready for transformation processing.
- transforming the data may involve the following tasks,
- applying business rules (so-called derivations, e.g., calculating new measures and dimensions);
- cleaning (e.g., mapping NULL to 0 or "Male" to "M" and "Female" to "F" etc.);
- filtering (e.g., selecting only certain columns to load);
- splitting a column into multiple columns and vice-versa;
- joining together data from multiple sources (e.g., lookup, merge);
- transposing rows and columns;
- applying any kind of simple or complex data validation (e.g., if the first 3 columns in a row are empty then reject the row from processing).
- loading the data into a data warehouse or data repository or other reporting applications.
Implementation
Step 1
- Click Start, point to All Programs, expand Microsoft SQL Server 2012, and click SQL Server Data Tools.
- On the File menu, point to New, and click Project.
- Expand Business Intelligence in the "Installed Templates" pane and select "Integration Services". Name the project and click OK.
Step 2
- Click on the Data Flow.
- From the SSIS Toolbox panel, drag and drop the OLE DB Source, Lookup, and OLE DB Destination.
Step 3
- Double click on the OLE DB Source. The following screen will appear.
- Make the new DB connection by clicking on the New… button as we used to make while deploying the cube.
- Specify the name of the table or type the SQL query as a source data.
- Click on the Preview… button to see the data that will be transferred.
- Click on the Columns option in the left side panel, select the desired columns, and click OK.
Step 4
- Link the OLE DB Source component with the Lookup Component by dragging-dropping the blue arrow over the Lookup component.
- Double click the Lookup component and click on the Connection option in the left panel. The following screen will appear.
What is Lookup?
Lookup is a Synchronous Transformation that allows you to perform an equi-join between values in the transformation input and values in the reference dataset, similar to T-SQL. This transformation is used to join two datasets at a time. To join more than two datasets, we need to put multiple Lookup transformations, similar to a T-SQL join condition.
- Specify the table name or use the SQL query the same way we did in OLE DB Source component.
- Click on the Columns option in the left panel, map the columns as shown in the figure given below, and click OK.
Step 5
- Join the blue arrow from Lookup component to the OLE DB destination.
- Double-click the OLE DB Destination component the following screen will appear.
- Specify the destination database and the destination table.
- Click on the Mappings option in the left panel, map the columns as shown in the figure below, and click OK.
Step 6
- Run the project.
- Open SQL Server and view the data in the destination table. It must be populated with the desired data.