This document discusses various data flow transformations in SQL Server Integration Services (SSIS). It begins with an introduction to the different types of transformations, including row transformations and rowset transformations. It then provides examples and demonstrations of specific transformations like Character Map, Derived Column, Aggregate, Pivot, and Percentage Sampling. The document aims to explain how each transformation works and how it can be used to modify or aggregate data in an SSIS data flow.
3. Recap and Q&A
Data Flow Task
Pipeline Architecture
Data Sources
◦ Demo: ADO.NET Source
◦ Demo: Excel Source
◦ Demo: Flat File Source
◦ Demo: OLE DB Source
◦ Demo: XML Source
◦ Demo: Raw File Destination
◦ Demo: Raw File Source
Data Destinations
◦ Demo: OLE DB Destination
◦ Demo: DataReader Destination
◦ Demo: Excel Destination
◦ Demo: Flat File Destination
◦ Demo: SQL Server Destination
Analysis Services Destinations
◦ Demo: Dimension Processing
◦ Demo: Partition Processing
@copyright 2014 (pramod_singla@yahoo.co.in)
4. Data Flow Transformations
These are the components that aggregate, merge, distribute, and modify data.
Data flow transformations are broadly classified into two types:
Type 1 – Synchronous transformations.
Type 2 – Asynchronous transformations.
By function, data flow transformations are broadly categorized as:
Row Transformations
Rowset Transformations
Split and Join Transformations
Business Intelligence Transformations
Auditing Transformations
Custom Transformations
7. Row Transformations
These transformations update column values or create new columns.
They are applied to each row in the pipeline input.
8. Character Map (Demo)
The transformation that applies string functions to character data.
The following character mappings are available:
Lowercase: changes all characters to lowercase
Uppercase: changes all characters to uppercase
Byte reversal: reverses the byte order of each character
Hiragana: maps Katakana characters to Hiragana characters
Katakana: maps Hiragana characters to Katakana characters
Half width: changes double-byte characters to single-byte characters
Full width: changes single-byte characters to double-byte characters
Linguistic casing: applies linguistic casing rules instead of system casing rules
Simplified Chinese: maps traditional Chinese to simplified Chinese
Traditional Chinese: maps simplified Chinese to traditional Chinese
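A few of these mappings can be approximated outside SSIS to see what they do. The sketch below is plain Python, not SSIS code. The function names and the MAPPINGS dictionary are hypothetical, and the width mappings are assumed to cover only the printable ASCII range and the space character.

```python
# Illustrative only: approximates four Character Map operations in plain Python.
# All names here are hypothetical; width mappings handle printable ASCII only.

FULLWIDTH_OFFSET = 0xFEE0  # distance from '!' (U+0021) to its full-width form (U+FF01)


def to_full_width(text: str) -> str:
    """Map half-width ASCII characters to their full-width forms."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("\u3000")  # ideographic (full-width) space
        elif "!" <= ch <= "~":
            out.append(chr(ord(ch) + FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


def to_half_width(text: str) -> str:
    """Map full-width ASCII variants back to half-width characters."""
    out = []
    for ch in text:
        if ch == "\u3000":
            out.append(" ")
        elif "\uff01" <= ch <= "\uff5e":
            out.append(chr(ord(ch) - FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


MAPPINGS = {
    "Lowercase": str.lower,
    "Uppercase": str.upper,
    "Full width": to_full_width,
    "Half width": to_half_width,
}
```

For example, MAPPINGS["Uppercase"]("ssis") returns "SSIS", and converting a string to full width and back to half width returns the original string.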
9. Data Conversion (Demo)
The transformation that converts the data type of a column to a different data type.
It can perform the following types of data conversions:
Change the data type
Set the column length of string data and the precision and scale of numeric data
Specify a code page
If the length of an output column of string data is shorter than the length of its corresponding input column, the output data is truncated.
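The truncation rule can be pictured with a small Python sketch. This is not SSIS code: convert_string is a hypothetical helper, and returning a truncated flag alongside the value is an assumption made here so the effect is visible.

```python
from typing import Tuple


def convert_string(value: str, output_length: int) -> Tuple[str, bool]:
    """Fit a string into an output column of the given length.

    Hypothetical helper: returns the converted value and whether data was lost.
    """
    truncated = len(value) > output_length
    return value[:output_length], truncated


# An 11-character input written to a 5-character output column loses its tail.
short_value, was_truncated = convert_string("Integration", 5)
```

Here short_value is "Integ" and was_truncated is True. An input that already fits is returned unchanged with the flag set to False.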
10. Derived Column (Demo)
The transformation that populates columns with the results of expressions.
If an expression references an input column that is overwritten by the Derived Column transformation, the expression uses the original value of the column, not the derived value.
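The original-value rule is easier to see with a worked example. The sketch below models it in plain Python under the assumption that every expression is evaluated against an untouched copy of the input row. The derive function and the column names Price, Qty, and Total are hypothetical.

```python
from typing import Any, Callable, Dict

Row = Dict[str, Any]


def derive(row: Row, expressions: Dict[str, Callable[[Row], Any]]) -> Row:
    """Apply derived-column expressions to one row (hypothetical model)."""
    original = dict(row)  # expressions only ever read this snapshot
    result = dict(row)
    for column, expression in expressions.items():
        result[column] = expression(original)
    return result


row = {"Price": 10, "Qty": 3}
out = derive(row, {
    "Price": lambda r: r["Price"] * 2,         # overwrites the Price column
    "Total": lambda r: r["Price"] * r["Qty"],  # still reads the original Price (10)
})
```

The output row has Price 20 and Total 30. Total is computed from the original Price of 10, not from the derived value of 20.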
11. Export Column(Demo)
The transformation that inserts data from a data flow into a file.
It uses pairs of input columns: one column contains a file name, and the other column contains the data.
The data to be written must have a DT_TEXT, DT_NTEXT, or DT_IMAGE data type.
The Append and Truncate settings, together with whether the file already exists, determine the result:
Append | Truncate | File exists | Results
False  | False    | No          | The transformation creates a new file and writes the data to the file.
True   | False    | No          | The transformation creates a new file and writes the data to the file.
False  | True     | No          | The transformation creates a new file and writes the data to the file.
True   | True     | No          | The transformation fails design-time validation. It is not valid to set both properties to true.
False  | False    | Yes         | A run-time error occurs. The file exists, but the transformation cannot write to it.
False  | True     | Yes         | The transformation deletes and re-creates the file and writes the data to the file.
True   | False    | Yes         | The transformation opens the file and writes the data at the end of the file.
True   | True     | Yes         | The transformation fails design-time validation. It is not valid to set both properties to true.
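The eight combinations above reduce to a short decision function. The following Python sketch only restates the table. The function name and the returned strings are hypothetical and are not part of SSIS.

```python
def export_column_outcome(append: bool, truncate: bool, file_exists: bool) -> str:
    """Return the outcome for one row of the Append / Truncate / File exists table."""
    if append and truncate:
        # Invalid in every case, whether or not the file exists.
        return "design-time validation failure"
    if not file_exists:
        return "create new file and write"
    if truncate:
        return "delete, re-create, and write"
    if append:
        return "open file and write at end"
    return "run-time error"
```

Checking the conflicting pair of settings first mirrors the table: that combination fails validation before any file is touched.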
12. OLE DB Command (Demo)
The transformation that runs SQL commands for each row in a data flow.
Configure the OLE DB Command transformation in the following ways:
Provide the SQL statement that the transformation runs for each row.
Specify the number of seconds before the SQL statement times out.
Specify the default code page.
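The per-row behavior can be imitated with any database API. The sketch below uses Python's built-in sqlite3 module as a stand-in for an OLE DB connection. The Product table, its columns, and the pipeline_rows list are hypothetical. The point is only that one parameterized statement runs once for every row in the flow.

```python
import sqlite3

# In-memory database standing in for the OLE DB connection (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (Id INTEGER PRIMARY KEY, Price REAL)")
conn.executemany("INSERT INTO Product VALUES (?, ?)", [(1, 10.0), (2, 20.0)])

# Rows arriving on the data flow; each one triggers one execution of the statement.
pipeline_rows = [{"Id": 1, "NewPrice": 12.5}, {"Id": 2, "NewPrice": 18.0}]

# Parameter markers (?) are filled from the columns of the current row.
statement = "UPDATE Product SET Price = ? WHERE Id = ?"
for row in pipeline_rows:
    conn.execute(statement, (row["NewPrice"], row["Id"]))
conn.commit()

prices = dict(conn.execute("SELECT Id, Price FROM Product ORDER BY Id"))
```

Because the statement runs once per row, the cost grows with the number of rows, which is worth keeping in mind for large data flows.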
13. Rowset Transformations (Demo)
These transformations create new rowsets.
The rowset can include aggregate and sorted values, sample rowsets, or pivoted and unpivoted rowsets.
Transformation                     | Description
Aggregate Transformation           | The transformation that performs aggregations such as AVERAGE, SUM, and COUNT.
Sort Transformation                | The transformation that sorts data.
Percentage Sampling Transformation | The transformation that creates a sample data set using a percentage to specify the sample size.
Row Sampling Transformation        | The transformation that creates a sample data set by specifying the number of rows in the sample.
Pivot Transformation               | The transformation that creates a less normalized version of a normalized table.
Unpivot Transformation             | The transformation that creates a more normalized version of a nonnormalized table.
14. Pivot
The transformation that creates a less normalized version of a normalized table.
Comparable to the PIVOT operator in T-SQL.
Steps to use the Pivot transformation:
Configure an OLE DB Source and use the above query as the source in the Data Flow task.
Drag in and open the Pivot transformation and go to Input Columns. Select all inputs, because all of them are used in the pivot.
Go to Input and Output Properties and expand Pivot Default Input. Here we configure how each input column is used in the pivot operation through its PivotUsage property.
Expand Pivot Default Output, click Output Columns, and click Add Column. Note that our destination has five columns, and every one of them must be created manually in this section. For each output column, configure:
Name – The name of the output column.
PivotKeyValue – The value in the pivoted column that goes into this output column.
SourceColumn – The lineage ID of the input column that holds the value for the output column.
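What the configured transformation produces can be sketched in plain Python. This is a conceptual model, not SSIS code. The pivot function, the sample rows, and the column names Customer, Product, and Qty are hypothetical, and pivot_key_values plays the role of the PivotKeyValue settings by mapping each pivot value to an output column name.

```python
def pivot(rows, set_key, pivot_column, value_column, pivot_key_values):
    """Turn one row per (set key, pivot value) into one row per set key."""
    output = {}  # one output row per set-key value; dicts keep insertion order
    for row in rows:
        key = row[set_key]
        if key not in output:
            output[key] = {set_key: key}
            for name in pivot_key_values.values():
                output[key][name] = None  # empty until a matching row arrives
        pivot_value = row[pivot_column]
        if pivot_value in pivot_key_values:
            output[key][pivot_key_values[pivot_value]] = row[value_column]
    return list(output.values())


sales = [
    {"Customer": "Kate", "Product": "Ham", "Qty": 2},
    {"Customer": "Kate", "Product": "Soda", "Qty": 6},
    {"Customer": "Fred", "Product": "Ham", "Qty": 1},
]
pivoted = pivot(sales, "Customer", "Product", "Qty", {"Ham": "Ham", "Soda": "Soda"})
```

The three input rows become two output rows, one per customer, with a Ham column and a Soda column. Fred has no Soda row, so that cell stays empty (None).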
15. Unpivot (Demo)
The transformation that creates a more normalized version of a non-normalized table.
Comparable to the UNPIVOT operator in T-SQL.
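The reverse operation can be modeled the same way. In the Python sketch below, the unpivot function and the column names are hypothetical, and skipping empty cells is an assumption made for this example rather than documented SSIS behavior.

```python
def unpivot(rows, key_column, value_columns, pivot_name, value_name):
    """Turn each listed column of a wide row into its own narrow row."""
    output = []
    for row in rows:
        for column in value_columns:
            if row.get(column) is None:
                continue  # assumption: empty cells produce no output row
            output.append({
                key_column: row[key_column],
                pivot_name: column,        # the former column name becomes data
                value_name: row[column],
            })
    return output


wide = [
    {"Customer": "Kate", "Ham": 2, "Soda": 6},
    {"Customer": "Fred", "Ham": 1, "Soda": None},
]
narrow = unpivot(wide, "Customer", ["Ham", "Soda"], "Product", "Qty")
```

The two wide rows become three narrow rows, each holding one customer, one product name, and one quantity.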
16. Aggregate (Demo)
The transformation that performs aggregations such as AVERAGE, SUM, and COUNT.
The Aggregate transformation supports the following operations:
Operation      | Description
Group by       | Divides datasets into groups. Columns of any data type can be used for grouping. For more information, see GROUP BY (Transact-SQL).
Sum            | Sums the values in a column. Only columns with numeric data types can be summed. For more information, see SUM (Transact-SQL).
Average        | Returns the average of the values in a column. Only columns with numeric data types can be averaged. For more information, see AVG (Transact-SQL).
Count          | Returns the number of items in a group. For more information, see COUNT (Transact-SQL).
Count distinct | Returns the number of unique nonnull values in a group.
Minimum        | Returns the minimum value in a group. For more information, see MIN (Transact-SQL). In contrast to the Transact-SQL MIN function, this operation can be used only with numeric, date, and time data types.
Maximum        | Returns the maximum value in a group. For more information, see MAX (Transact-SQL). In contrast to the Transact-SQL MAX function, this operation can be used only with numeric, date, and time data types.
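The operations in the table can be tried on a tiny data set. The Python sketch below is a conceptual model only. The aggregate function and the Region and Amount columns are hypothetical, and the sample data is assumed to contain no null values so that null handling does not come into play.

```python
from collections import defaultdict


def aggregate(rows, group_by, column):
    """Group rows by one column and compute the listed operations on another."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[column])  # Group by
    result = []
    for key, values in groups.items():
        result.append({
            group_by: key,
            "Sum": sum(values),
            "Average": sum(values) / len(values),
            "Count": len(values),
            "CountDistinct": len(set(values)),
            "Minimum": min(values),
            "Maximum": max(values),
        })
    return result


orders = [
    {"Region": "East", "Amount": 10},
    {"Region": "East", "Amount": 10},
    {"Region": "East", "Amount": 40},
    {"Region": "West", "Amount": 5},
]
summary = aggregate(orders, "Region", "Amount")
```

For the East group this gives a sum of 60, an average of 20.0, a count of 3, and a distinct count of 2, because the value 10 appears twice.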
17. Sort (Demo)
This transformation sorts data.
Multiple sorts can be applied to an input; each sort is identified by a numeral that sets its position in the sort order.
The Sort transformation can also remove duplicate rows as part of its sort.
This transformation has one input and one output. It does not support error outputs.
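A multi-column sort with optional duplicate removal can be sketched in Python as follows. The sort_rows function and the column names are hypothetical. Treating rows as duplicates when their sort-column values match, and keeping the first such row, are assumptions of this sketch.

```python
def sort_rows(rows, sort_columns, remove_duplicates=False):
    """Sort by several columns; sort_columns is a list of (column, descending)
    pairs given in sort-order position, first position first."""
    result = list(rows)
    # Stable sorts applied from the last key to the first yield a multi-key sort.
    for column, descending in reversed(sort_columns):
        result.sort(key=lambda r: r[column], reverse=descending)
    if remove_duplicates:
        seen = set()
        unique = []
        for row in result:
            key = tuple(row[column] for column, _ in sort_columns)
            if key not in seen:  # keep only the first row per sort-key value
                seen.add(key)
                unique.append(row)
        result = unique
    return result


rows = [
    {"City": "B", "Amt": 1},
    {"City": "A", "Amt": 2},
    {"City": "A", "Amt": 3},
]
by_city_then_amt = sort_rows(rows, [("City", False), ("Amt", True)])
one_per_city = sort_rows(rows, [("City", False)], remove_duplicates=True)
```

The first call orders rows by City ascending and then Amt descending. The second keeps a single row for each City value.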
18. Percentage Sampling
The transformation that creates a sample data set using a percentage to specify the sample size.
It is useful for creating sample data sets.
The number of rows in the sample output may not exactly reflect the specified percentage.
Two named outputs are created:
Sampled Output
Unselected Output
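One way to see why the sample size can drift from the requested percentage is to model the selection as an independent random choice per row. That selection model is an assumption of this Python sketch, not a description of the SSIS algorithm, and the percentage_sample function and its seed parameter are hypothetical.

```python
import random


def percentage_sample(rows, percentage, seed=None):
    """Split rows into (sampled, unselected) using a per-row random draw."""
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    sampled, unselected = [], []
    for row in rows:
        # Each row is chosen independently, so the sampled count is only
        # approximately percentage% of the input.
        if rng.random() * 100 < percentage:
            sampled.append(row)
        else:
            unselected.append(row)
    return sampled, unselected


rows = list(range(1000))
sampled, unselected = percentage_sample(rows, 10, seed=42)
```

Every input row lands in exactly one of the two outputs, and the sampled output holds roughly, not exactly, 10 percent of the rows.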
19. Row Sampling (Demo)
The transformation that creates a sample data set by specifying the number of rows in the sample.
The exact size of the output sample can be specified.
This transformation is useful:
for random sampling.
during package development, for creating a small but representative data set.
It is similar to the Percentage Sampling transformation.
It has one input and two outputs. It has no error output.
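Sampling an exact number of rows into two outputs can be sketched in Python as follows. The row_sample function and its seed parameter are hypothetical, and capping the sample at the number of available rows is an assumption of this sketch.

```python
import random


def row_sample(rows, sample_size, seed=None):
    """Split rows into (sampled, unselected) with an exact sample size."""
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    k = min(sample_size, len(rows))  # assumption: cannot sample more rows than exist
    chosen = set(rng.sample(range(len(rows)), k))
    sampled = [row for i, row in enumerate(rows) if i in chosen]
    unselected = [row for i, row in enumerate(rows) if i not in chosen]
    return sampled, unselected


rows = list(range(100))
sampled, unselected = row_sample(rows, 10, seed=7)
```

Unlike the percentage-based approach, the sampled output here always contains exactly the requested number of rows, as long as the input has at least that many.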