The Data Transfer transformation helps us move data efficiently. Using it, we can push operations such as GROUP BY or ORDER BY down to the database table.
Example: assume we do a lookup on some data and a GROUP BY on the same data in the same dataflow. On millions of records, that is a performance hit. If we use a Data Transfer transformation instead, the dataflow is split: the lookup runs as one sub-dataflow and the GROUP BY/ORDER BY runs as another.
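The push-down idea can be sketched outside Data Services with plain SQL: instead of pulling every row into the engine and aggregating in memory, the aggregation is sent to the database, which returns only the summarized result. A minimal sketch using Python's built-in sqlite3 (the table and column names are illustrative, not taken from the job above):

```python
import sqlite3

# Hypothetical Orders table; in Data Services this would live in the source datastore.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER)")
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(1, 10, 5), (2, 10, 3), (3, 20, 7)])

# Without push-down: every row is fetched and aggregated in the engine's memory.
rows = conn.execute("SELECT ProductID, Quantity FROM Orders").fetchall()
totals = {}
for product_id, qty in rows:
    totals[product_id] = totals.get(product_id, 0) + qty

# With push-down: the database performs the GROUP BY and returns only the result.
pushed = dict(conn.execute(
    "SELECT ProductID, SUM(Quantity) FROM Orders GROUP BY ProductID"))

print(totals == pushed)  # True: both give {10: 8, 20: 7}
```

Both queries produce the same answer; the difference is how many rows cross the wire, which is exactly what the Data Transfer transformation lets the engine optimize.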
See the job below, which does a lookup on the data and also a GROUP BY. Using the Data Transfer transformation, it will run them as separate processes-
In the query_lkp transform, we perform a lookup on the Orders table-
Using the Data_transfer transformation, we configure the following settings-
First, enable the transfer by selecting the 'Enable transfer' check box.
Transfer Type: Table, File, or Automatic
Table type transfer: if you select Table, the remaining tabs (Options, Bulk Loader Options, Pre-Load Commands, Post-Load Commands) become visible and the Table Options are activated. Select the datastore and specify a table name that does not already exist in that datastore. This is because the transfer table is created at run time and automatically deleted at the end of the job.
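The transfer-table lifecycle described above can be sketched as: create the table at run time, let the first sub-dataflow write its output there, let the second sub-dataflow push its GROUP BY down onto it, then drop the table when the job ends. A rough sketch in Python with sqlite3 (the name TRANSFER_TMP and the surrounding logic are illustrative, not Data Services internals):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (ProductID INTEGER, Quantity INTEGER)")
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(10, 5), (10, 3), (20, 7)])

# Step 1: the job creates the transfer table at run time (its name must not already exist).
conn.execute("CREATE TABLE TRANSFER_TMP (ProductID INTEGER, Quantity INTEGER)")

# Step 2: the first sub-dataflow (e.g. the lookup) writes its output to the transfer table.
conn.execute("INSERT INTO TRANSFER_TMP SELECT ProductID, Quantity FROM Orders")

# Step 3: the second sub-dataflow pushes its GROUP BY down onto the transfer table.
result = conn.execute(
    "SELECT ProductID, SUM(Quantity) FROM TRANSFER_TMP "
    "GROUP BY ProductID ORDER BY ProductID").fetchall()

# Step 4: the transfer table is dropped automatically at the end of the job.
conn.execute("DROP TABLE TRANSFER_TMP")
print(result)  # [(10, 8), (20, 7)]
```

Because the intermediate data already sits in a database table, the second step becomes a plain SQL aggregation the database can execute itself.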
File type transfer: this works in a similar way to the table type transfer.
Automatic type transfer: when you select Automatic, Data Services automatically determines the transfer type based on the source you specified.
Using this query transform, we perform a GROUP BY on the ProductID field.
While performing a GROUP BY on one field, we need to apply aggregate functions such as MIN() or MAX() to the remaining fields.
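This rule mirrors standard SQL: every selected column must either appear in the GROUP BY clause or be wrapped in an aggregate function. A small illustration with Python's sqlite3 (the column names are examples, not the exact schema of the job):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER)")
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                 [(1, 10, 5), (2, 10, 3), (3, 20, 7)])

# ProductID is the grouping column; the remaining fields are wrapped in MIN()/MAX().
rows = conn.execute("""
    SELECT ProductID, MIN(OrderID), MAX(Quantity)
    FROM Orders
    GROUP BY ProductID
    ORDER BY ProductID
""").fetchall()
print(rows)  # [(10, 1, 5), (20, 3, 7)]
```

Selecting OrderID or Quantity bare alongside the GROUP BY would be ambiguous (which row's value?), which is why the aggregate wrapper is required.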
Run the job and observe in the trace log how the job splits each operation into a separate dataflow-