Changed-Data Capture

This post introduces the concept of changed-data capture (CDC). You use CDC techniques to identify the changes made to a source table since a given point in time, such as the previous data extraction. The changes that CDC captures are row inserts, updates, and deletes. CDC can involve variables, parameters, custom (user-defined) functions, and scripts.

Exercise overview

You will create two jobs in this exercise. The first job (Initial) loads all of the rows from a source table. You will then introduce a change to the source table. The second job (Delta) identifies only the rows that have been added or changed and loads them into the target table. You will create the target table from a template.
Both jobs contain the following objects:
• An initialization script that sets values for two global variables: $GV_STARTTIME and $GV_ENDTIME
• A data flow that loads only the rows with dates that fall between $GV_STARTTIME and $GV_ENDTIME
• A termination script that updates a database table that stores the last $GV_ENDTIME
