Home » Derived Column Transformation in SSIS

Derived Column Transformation in SSIS

The Derived Column transformation in SSIS allows us to create new column values by applying expressions to transformation input columns.

An expression can be created by any combination of variables, functions, operators, and columns from the transformation input columns.



Lets see the implementation of Derived column transformation in SSIS which creates a new columns and load into SQL table.

As you can see below, we have a sample excel file Exl_CustomerData.xlsx which contains the Customer Details. This is our source file.

Now following is our destination table in SQL database.

Create table dbo.CustomerDetails
(CustomerKey int,
CustomerName nvarchar(255),
Gender nvarchar (255),
MaritalStatus nvarchar (255),
YearlyIncome numeric (8,2),
Country nvarchar (255),
State nvarchar (255),
CustomerName_Len Int,
CustomerName_UpperCase nvarchar (255),
Gender_1 nvarchar (6),
Country_1 nvarchar (255)
)

You can see, table does not have any records currently.

Now open SSDT (SQL Server Data Tool), and take New SSIS project, or you can also create a new SSIS package in existing project.

Here we are creating a new package in existing project.

Right click on SSIS Packages folder under Soliton Explorer, then select New SSIS package from context menu.

You will see a new package with name package1.dtsx is created, lets give it a meaningful name. Just right click on package1.dtsx and click on Rename.




Lets give it name as DerivedColumnsTransformation.dtsx.

Note: do not change or remove the .dtsx extension from package name, just change the package name only.

Lets got to Toolbox pane, the select and drag a Data flow task into Control Flow, as shown below.

After that double click on Data Flow Task, it opens a Data flow view for this selected Data Flow Task.

You can see, Data flow view opens, as our source is Excel file lets got to Other transformation and a drag a Excel source into Data flow.

Lets double click on Excel Source and assign a connection manager to Excel Source.

You will see a Excel Source Editor opens, you have already have an existing connection manager than select from dropdown else can create new by clicking on New.. button.

Lets click on New.. button, you will see a Excel Connection Manger windows open.

Now you need to specify excel file path, also can select excel version.

Just check the option first row has column names, if excel file has columns name in first row other wise uncheck this.

Once you done with this, just click on OK button.

 

Once you click on OK button, you will get back to Excel Source Editor window, there you can see a Excel connection manager dropdown is auto filled with connection you just created.

Now leave the Data access mode as Table or view, now select the excel sheet name from the Name of the Excel sheet dropdown.

After that, if you want to see preview of data then just click on Preview button.

You can see data is coming perfectly in Preview Query Results window, which also ensure that connection manger is configured correctly and data is showing in proper format.



Just click on close button to closing the Preview window.

Now, in Column tab, you can modify column name, or remove any column by unchecking them.

After that click on OK button.

After that, in Toolbox pane under the Common, drag a Derived Column component into Data flow.

Derived Column Transformation

Now using Derived column transformation we will create new column values by applying expressions to transformation input columns means the column values that we are getting from excel source.

Now will add the path, means connect the Excel source component which is the source of path to the Derived Column component which is the end of path.

For this select Excel source component then drag the arrow and connect it to Derived Column as shown below.

After that, just double click on Derived column component.

Once you double click on Derived Column you will see a Derived column transformation Editor window opens.

Now you can create a new column which can be created by combination of variables, functions, operators, and columns from the transformation input columns.



Lets create first Derived column, that will count the Length of CustomerName.

Expand the Columns folder, select CustomerName column and drag it into row column Derived Column Name as shown below.

Rename the Derived column as CustomerName_Len.

Next, if you want to create this Derived column as a new column then select Derived column as <add as new column> which is set by default. Or in case if you want to replace this Derived column with an existing column then can select that column from dropdown.

After that Go to String Functions folder on the right side, expand it then find the LEN function and drag LEN function into Expression column, then specify the column name to Function, here we want to count the length of customer name so specify column CustomerName.

Lets create Second Derived column that will convert CustomerName values into uppercase, for this repeat the same step, drag CustomerName column into Derived column name.

Renamed as CustomerName_Uppercase, and in Expression drag UPPER function as shown below.

Lets create third Derived columns based on if-else condition, which will check if gender value is “M” then display “Male” and If gender value is “F” then display “Female”

First drag the Gender column into Derived column name named as Gender_1 then write below expression in Expression column.

Gender == "M" ? "Male" : "Female"

Lets create Fourth Derived column, that will be a duplicate column of Country.

Its very simply just drag country column and named as Country_1.

After that click on OK button.

Now we have created four derived column, you can also explore other in built function based on your requirements.

Now lets add destination which is SQL Server table so we will take OLEDB Destination   component.

Go to Other Destinations and select and drag OLEDB Destination component into Data Flow view.

Now connect the Derived column component to OLEDB Destination,

Lets assign a connection manager to OLEDB destination.

Double click on OLEDB Destination component, you will see a OLED Destination Editor window opens

Once you click on New.. button, a Configure OLEDB Destination Connection Manager window opens, here you can select an existing connection manager or can create new.

Here we are creating a New, lets click on New.. button.



Once you click on New.. button, a Connection Manger window opens, here you need to specify Server name, Authentication mode and database name as shown below.

Also you can verify connection by clicking on Test Connection button.

After that click on OK button.

After that, you will see a connection manager is auto reflecting in OLEDB connection manager dropdown, now select Data access mode as Table or view, then select a destination table to which you want to load data.

After that, go to Mappings tab. You can map the appropriate input columns with output columns.

You can see by default source file columns that is input columns and destination table column that is output columns are mapped automatically.

If any columns are not mapped correctly or missed then you can rectify or mapped them manually. You can see  columns CustomerName_Len, CustomerName_UpperCase, Gender_1 and Country_1 are not mapped.

Lets map them, just select CustomerName_Len column from Input and drag it and connect with CustomerName_Len output column as shown below.





Lets do the same for remaining columns as well.

Now you can see, all the input columns are mapped with destination output columns as shown below.

After that in Error Output Tab,  you can specified the behavior of the component in case of failure, you can specified any one option as follows:

    • Ignore Failure: It ignores any failure while reading rows from source and the package will continue executing even any error occurred.
    • Redirect Row: It redirects the failed rows to other component which is connected with the error precedence constraints.
    • Fail component: It stops the execution of package in case of any failure.

We go with default selected value that is Fail component.

After that click on OK button.

Now we have created specified Source and Destination connections, derived column transformation. Lets execute the package.

Right click on Data flow view, then select Execute task from context menu.

Now you can see, package run successfully and 9 records are loaded from excel file to SQL Server table.

Lets check SQL Server table, and you can see records are loaded into table, also can see values for columns CustomerName_Len, CustomerName_uppercase, Gender_1 and Country_1 are coming from Derived Column transformation which seems correctly as per our requirement.

Now you can stop the package debugging by clicking on Stop Debugging button.

 

Also Read..

Create a new SSIS project.

Load Flat file data into SQL Server table using SSIS

Load Excel file data into SQL Server Table

 

 435 total views,  1 views today

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.