A Validation transform is very much similar to the case transform. This also comes under ‘Platform’ set of transform in data services.
It is used to validate the data and transfer it to Pass and Fail tables. The validation rules can be defined in this transform. They can be simple or complex.
The rules can be written for each single column.
One important point to mention here is that a FAIL rule is stronger that PASS rule as a row will pass once it satisfies all the conditions but it will fail if any one of the condition is not satisfied.
A Validation transform has an input and two output schema.
In the Fail validation schema, there are two extra columns, one is Error Action and the other is Error Column.
The error action column will tell whether the row is sent to pass or fail or both the schema.
The error column will have the information about which column has failed.
One extra table ‘Validation_RuleViolation’ will also be generated having the error details.
Also, there is option for ‘Action on Failure’. Here, you can direct the system to transfer the failed record to pass table, fail table or both the tables and also substitute a text in place of the failed column value.
Here, we will be using the same database as created in Case transform example.
Create a new project ‘validation_transform’.
Create a new job ‘validation’ under it.
Create a dataflow ‘DF_Validation’ under the job.
Add the regionId table as source.
Drag the validation transform to the canvas. Join the source table to the validation transform.
Double click on the validation transform to see the options.
Click on ‘Add’ button to add a validation rule.
Give following validation rule for regionId:
Press on ‘OK’. Observe the check mark appearing before the regionId column:
Give another condition for employeeId as:
Give a condition of country columns as:
Now add two template tables and connect as target to your validation transform. One table is for Pass records and other is for Fail records:
Run the job and observe the data in the template tables.
You can see that no records are there in the pass table. All the records have come in the FAIL table and the ERRORCOLUMN specifies the condition which has failed:
Scenarios where Validation Transform can be used:
- To pass only rows with valid date format to the target
- Usually if there is condition on single row, you can use the where condition in the Query transform or the case transform
- If there are conditions on multiple columns, you can use validation transform
Go into your validation transform, for employeeID and country validation rules, select ‘Send to Pass’.
Now you can see some data has come to the Pass table, but you cannot identify which data has failed the rules.
Open your validation transform again, for the employeeId and country validation rules, give a substitute value for send to Pass items:
Execute the job again and check the pass table values:
Understanding Match Pattern option in Validation Transform
Create a table in the database (MS SQL Server) as below:
Name of the table: valid_pattern
Populate the table with some data:
The correct pattern for hall ticket is Uppercase-Lowercase followed by couple of numbers then Uppercase and Lowercase followed by couple of numbers.
You can observe that the table above has some incorrect entries, we will try to separate them out using validation transform.
Login to your Data Services and import the table valid_pattern into your datastore. Create a job, workflow and dataflow. Add your table, make it as source, add the validation transform:
Go inside your validation transform, specify a validation rule for the ticket_no column.
Observe that the wild card character for uppercase alphabet is ‘X’, for lowercase it is ‘x’, for two digit number, it is ’99’
Add a validation rule for name column as ‘exists in table’ RegionId. This will act as referential integrity of the employee names:
Add the Pass and Fail template tables as target. Save and Validate your Dataflow.
Execute the job. Observe the entries in the Pass and Fail tables.
Points to note:
- There is no wild character for symbols, only for alphanumerics
- You can also add multiple patterns by adding rules in the validation transform