Data Flow Task

Overview

Data Flow Task Plus is SSIS control flow task and it is an extension of the standard Microsoft Data Flow Task.

Data Flow Task Plus (1.4) allows setup of dynamic data flows. The dynamic setup options can be controlled with parameters from outside. This feature lessens the need to manually open and modify the data flow design, when new source and destination columns have to be accommodated.

Data Flow Task Plus supports also features similar to the features implemented for Script Task Plus. Data Flow Task Plus allows exporting of data flow logic and reuse in other packages. Keeping your SSIS data flow logic separate from your package, helps you keep track of the changes independently in your source control system. Data Flow Task Plus provides also customizable user interface for setup of your data flow. You customize the data flow setup dialog by implementing an SSIS script, part of the data flow task.

We believe these features will help you be more more productive and allow you to componentize your common data flow logic.

Demonstration

Setup (1.4)

The task setup is similar to the Script Task Plus dialog. It has the following features:

  • Setup tab - select data flow initialization parameters. These data flow parameters will be initialized before the data flow is executed. The page configuration is prepared with attributes in the initialization script.
  • Dynamic tab - specifies dynamic data flow options. For details check below.
  • Advanced tab - specifies advanced task options.
    • General page - specifies data flow task name and description.
    • Script page - specifies data flow task script, which is used for Setup tab customization.
    • Expressions page - specifies standard SSIS expressions.
  • Export Data FlowExport Data Flow - export SSIS data flow to external file (.dfx).
  • Link Data FlowLink Data Flow - link to SSIS data flow from external file (.dfx). The link can be removed by pressing the button again.

Setup (1.3 and older)

The task setup is similar to the Script Task Plus dialog. It has the following features:

  • General page - specifies data flow task name and description.
  • Initialize page - select data flow initialization parameters. These data flow parameters will be initialized before the data flow is executed. The page configuration is prepared with attributes in the initialization script.
  • Script page - specifies data flow task script, which is used for Initialize page customization.
  • Expressions page - specifies standard SSIS expressions.
  • Export Data FlowExport Data Flow - export SSIS data flow to external file (.dfx).
  • Link Data FlowLink Data Flow - link to SSIS data flow from external file (.dfx). The link can be removed by pressing the button again.

Dynamic Data Flow (1.4)

Data Flow Task Plus can dynamically modify source and destination components at runtime.

Supported data flow components:

Type Component
Source ADO NET, Excel, Flat File, OLE DB, XML, Data Flow, EDI
Destination ADO NET (SQL 2008 only), DataReader, Excel, Flat File, OLE DB, Recordset, SQL Server, Informix, Oracle, DB2, Data Flow.

The task constructs the extra available columns, assisted by the user-specified options below.

Enabled

Set component to be processed dynamically.

FilterType (Source)

Select filter source type. This property has the options listed in the following table.

Value Description
FilterDirect Set to a regular expression that defines the filter. Selecting the value displays the dynamic option FilterDirect.
FilterFile Select a file that contains the filter. Selecting the value displays the dynamic option FilterFile.
FilterVariable Set to a variable that defines the filter. Selecting the value displays the dynamic option FilterVariable.
FilterDirect (Source)

Type the filter regular expression.

Related Topics: Regular Expression Editor

FilterFile (Source)

Select an existing File connection manager, or click <New connection...> to create a connection manager.

Related topics: File Connection Manager

FilterVariable (Source)

Select an existing user-defined variable, or click <New variable...> to create a variable. The filter can be regular expression, ICollection or ADODB Recordset.

Related Topics: Integration Services Variables, Add Variable

IsMappingVariable (Destination)

Indicates whether the mapping is stored in a variable. This property has the options listed in the following table.

Value Description
True The mapping is stored in a variable. Selecting the value displays the dynamic option MappingVariable.
False The mapping is specified in a File connection manager. Selecting the value displays the dynamic option Mapping.
Mapping (Destination)

Select an existing File connection manager, or click <New connection...> to create a connection manager. The mapping should be specified as key to value text in the following format:

SourceColumn1=DestinationColumn1
SourceColumn2=DestinationColumn2
...

Related topics: File Connection Manager

MappingVariable (Destination)

Select an existing user-defined variable, or click <New variable...> to create a variable. The mapping can be specified as:

  • Key to value text in the following format:
    SourceColumn1=DestinationColumn1\r\nSourceColumn2=DestinationColumn2\r\n...
  • Object supporting IDictionary interface. For sample dictionary implementation, check Dictionary Destination script.
  • ADODB Recordset object. You can use standard Execute SQL Task to get result in this type of object.

Related Topics: Integration Services Variables, Add Variable

MissingColumnFail (Destination) (1.5 SR-2)

Indicates how to handle missing source columns. This property has the options listed in the following table.

Value Description
True The task will fail when source columns are missing in the destination.
False The task will ignore the missing source columns in the destination.
IsColumnDelimiterVariable (Flat File Delimited Source and Destination) (1.5 SR-2)

Indicates whether the column delimiter is stored in a variable. This property has the options listed in the following table.

Value Description
True The column delimiter is stored in a variable. Selecting the value displays the dynamic option ColumnDelimiterVariable.
False The column delimiter is directly specified. Selecting the value displays the dynamic option ColumnDelimiter.
ColumnDelimiter (Flat File Delimited Source and Destination)

Specify flat file column delimiter.

ColumnDelimiterVariable (Flat File Delimited Source and Destination) (1.5 SR-2)

Select an existing user-defined variable, or click <New variable...> to create a variable.

Related Topics: Integration Services Variables, Add Variable

IsColumnsCountVariable (Flat File Fixed and Ragged Right Source) (1.5 SR-2)

Indicates whether the column count is stored in a variable. This property has the options listed in the following table.

Value Description
True The columns count is stored in a variable. Selecting the value displays the dynamic option ColumnsCountVariable.
False The columns count is directly specified. Selecting the value displays the dynamic option ColumnsCount.
ColumnsCountVariable (Flat File Fixed and Ragged Right Source)

Select an existing user-defined variable, or click <New variable...> to create a variable.

Related Topics: Integration Services Variables, Add Variable

ColumnsCount (Flat File Fixed and Ragged Right Source) (1.5 SR-2)

Specify flat file columns count.

IsColumnWidthVariable (Flat File Ragged Right Source) (1.5 SR-2)

Indicates whether the column width is stored in a variable. This property has the options listed in the following table.

Value Description
True The column width is stored in a variable. Selecting the value displays the dynamic option ColumnWidthVariable.
False The column width is directly specified. Selecting the value displays the dynamic option ColumnWidth.
ColumnWidth (Flat File Ragged Right Source)

Specify flat file ragged right column width.

ColumnWidthVariable (Flat File Ragged Right Source) (1.5 SR-2)

Select an existing user-defined variable, or click <New variable...> to create a variable.

Related Topics: Integration Services Variables, Add Variable

Configuration (EDI Source)

Select an existing File connection manager, or click <New connection...> to create a connection manager.

Related topics: File Connection Manager

How to create completely dynamic data flow? (1.4 SR-3)

Starting from 1.4 SR-3 the data flow can be completely dynamic - the need to have one common column across different iteration sets is no longer required. To have completely dynamic components, you have to remove all statically defined columns (both in the components and related connection managers). However there are components, which require at least one column to pass successfully validation. We have implemented a workaround, which is based on thunk (not used) column. If you encounter a component requiring at least one column, define a column named THUNK_COLUMN. This column will be maintained and ignored at runtime by the Data Flow Task Plus.

How to build data flow setup?

The initialization and user interface for a data flow is defined by implementing properties on the script EntryPoint class. The documentation and samples of Script Task Plus for defining parameters is relevant to Data Flow Task Plus initialization setup, so we will not repeat it here. Here you will find description of DataFlowSetting attribute, used to apply parameter value to specific data flow component setting.

  • Go to Toolbox and open Choose Toolbox Items ... dialog. Press SSIS Control Flow Items tab and select Script Task Plus.

    Add ReferenceAdd Reference

  • Drag-in Data Flow Task Plus into your control flow package and open Design Script IDE.
  • Add reference in the script project to CozyRoc.SSISPlus

    Add ReferenceAdd Reference

  • Include Imports CozyRoc.SqlServer.SSIS.Attributes in script. It is also recommended to include Imports System.Drawing.
    Imports System.Drawing
    Imports CozyRoc.SqlServer.SSIS.Attributes
    
    Public Class ScriptMain
  • Now you are ready to define your first Data Flow Task Plus Parameter.

DataFlowSetting attribute

DataFlowSetting attribute defines a list of references to different component properties in the SSIS data flow. It defines the link between initialization parameter and specific SSIS data flow component setting. You can apply one initialization parameter to multiple data flow components and properties. Each reference is a text string with separate sections. Each section is separated with "." dot. Every reference starts with component, followed by "keyword" specifying setting type. Here is a list of sample DataFlowSetting attribute reference types, which are supported currently by Data Flow Task Plus component:

  • DataReader Source.Property.PreCompile - set property PreCompile.
  • DataReader Source.Connection.IDbConnection - set connection IDbConnection.
  • DataReader Source.Column.Input.DataReader Input.DateKey - set input column DateKey in DataReader Input input.
  • DataReader Source.Column.Output.DataReader Output.DateKey - set output column DateKey in DataReader Output output.
  • DataReader Source.Column.OutputExternal.DataReader Output.DateKey - set external output column DateKey in DataReader Output output.

where DataReader Source is a reference to data flow component.

In SSIS different data flow elements like components, properties, connections, columns can be referenced either by name or ID. DataFlowSetting attribute reference supports both.

Data Flow property setting

  • Sample script to setup CommandTimeout property of DataReader Source component.
    <DataFlowSetting(New String() {"DataReader Source.Property.CommandTimeout"})> _
    Public Property CommandTimeout() As Integer
        Get
            CommandTimeout = m_commandTimeout
        End Get
        Set(ByVal value As Integer)
            m_commandTimeout = value
        End Set
    End Property

Data Flow connection setting

  • Sample script to setup IDbConnection connection of DataReader Source component.
    <Connection("ConnectionType")> _
    <DataFlowSetting(New String() {"DataReader Source.Connection.IDbConnection"})> _
    Public Property Connection() As String
        Get
            Connection = m_connection
        End Get
        Set(ByVal value As String)
            m_connection = value
        End Set
    End Property
    
    <Browsable(False)> _
    Public ReadOnly Property ConnectionType() As String
        Get
            ConnectionType = "OLEDB"
        End Get
    End Property

Data Flow column setting

  • Sample script to setup column with ID 3018 in DataReader Output output of DataReader Source component.
    <DataFlowSetting(New String() {"DataReader Source.Column.Output.DataReader Output.3018"})> _
    Public Property Column() As String
        Get
            Column = m_column
        End Get
        Set(ByVal value As String)
            m_column = value
        End Set
    End Property

Dynamic Column Custom Handler (1.4 SR-3)

  • To setup custom routine in the data flow script for additional setup of dynamically added data flow column, attribute your script main class with DataFlowColumnAdded attribute. Set as parameter the name of the script method. Check below the required handler method layout.
    <DataFlowColumnAdded("OnColumnAdded")> _
    Public Class ScriptMain
        Inherits UserComponent
    ...
        Public Sub OnColumnAdded(ByVal component As IDTSComponentMetaData90, ByVal isInput As Boolean, ByVal colName As String)
    ...

Samples

  • The sample package used from the demonstration video can be download from here.
  • A sample package contributed by Paul McMillan, demonstrating dynamic data flow from source to destination database and driven by metadata.