Data Insights in BODS– INFORMATION STEWARD

Data Insight is used to do the DHA (Data Health Assessment) on the data i.e. to see if the data is good to use. You use the tool data Insight to do a test / profiling on the data before we use the data for the ETL process. You can also say that Insight is used to do data investigation. It automates the analysis and monitors the data.

Using Data Insight we can perform the following tasks-

  • Data Profiling
  • Column query
  • Integrity test and Custom query
  • Scheduling
  • Creating a trend reports
  • Sampling reports

Getting Started

Creating Connection

Navigation to data Insight

Note:- First you need to start the Data insight Engine before you use the Data Insight tool.

To start the Data Insight Engine, follow the below navigation-

Start – Program Files – SAP Information Steward – Information Steward

This opens a web based tool.

b8.png

Click on the Enterprise Project.

Following screen is displayed-

b9.png

 

SAP BusinessObjects Information Steward creates a special project, named “Enterprise”, which cannot be deleted. Only the profile and rule results in the Enterprise project are available for the “Data Insight Connection” tables on the Metadata Management tab. Therefore, you can view profile results and data quality scores at the table and column level for tables under the “Data Insight Connection” directory in Metadata Management.

Add tables and files to a Data Insight project so that you can profile the data, run validation rules to monitor the data quality, and create data quality scorecards to measure quality against data quality metrics (or expected data quality targets). The tables and files that you add to your project are displayed on the Workspace Home tab in your project.

 

You can add tables and files in the following ways:

  • Add a table when you know its exact name
  • Browse Data Insight connections to find the table or file
  • Search for a table by name or description
  • View the data in the table
  • Select the tables or files and add them to the Data Insight project

You can specify the connections in Central Management Console (CMC)

Go to the selected table and expand it.

Here you can see 4 tabs (Data Profile, Column Query, Referential Integrity, Custom Query) using which you can perform different types of tests on the data.

 

Data Profile

Using data profile functionality, you can perform tests like Summary on the data, Comparison,  Frequency Test, Word Frequency test, Uniqueness of data, Redundancy test.

Summary test

Summary will give the snap shot of the data for decision making or further drilldown.

You can perform the summary on the table level or on a column level as well.  Select the Check box under the Summary column and click on RUN.

It will give you the Summary Profile on the data which gives a complete DHA on the data.

You can check on ‘Save report’ and click on close. Now it will ask you to save the Profiler report. Click on Yes and give the Profiler name and click on OK.

Now if you notice, the last run column is populated with the time stamp and the result. Click on the result next to the time stamp to see the results.

Comparison test

Comparison is used to get the report of Count and percentages of rows with incomplete column values.

To do a comparison test, Click on the check box under comparison at the table level or the row level and click on RUN.

Comparison test gives the result of the matched or duplicates records available. In our case we don’t have any duplicates or matched records.

You can also click on print report to generate the report and also you can export the report to different formats-

Frequency (FRQ)

Frequency (FRQ) is used to find the frequency distribution of distinct values in columns.

The working procedure is same as the above. Click on the check box under the FRQ and click on RUN to see the results. You can also click on print report to export it into different formats. You can also save the result by checking save report and click on close and give the profiler name.

WFRQ (Word frequency)

WFRQ (Word frequency) refers to frequency distribution of single word.

Same as the above procedure, click on check box and click on run to see the results.

UNQ (Unique)

This gives the count and percentages of the rows with non-unique column values.

Same as the above procedure, click on check box and click on run to see the results.

RDN (Redundancy)

This test is to identify the commonalities and outliers between the columns.

Same as the above procedure, click on check box and click on run to see the results.

Column Query

This is used to Analyze the data within the Data Insight.

Select the column on which you want to perform the test and right click-add combined column query.

We can perform the following test using the Combined column Query.

  • Occurrence                                   Search for the occurrence (<, >, =,<=, >=) ‘n’ times
  • Pattern                                            Pattern of the data in the column
  • Pattern recognition                         Recognizing the string pattern with                                                                  special chars
  • Range                                               specify the min and max values for the                                                            range
  • Reference column                         reference column on which we have to                                                             refer this column
  • Specific value test                         Search the column with a specific value

Select the Radio buttons on the left side and the respective selections will be activated on the right side.

Once you select the query type on the left side, chose the respective options on the right side and click on return data check box and click on run.

In our example, we take the ‘specific values’ test.

Select the specific values on left side and specify a value on the right hand side. Select the Return data check box and click on run. You will get the below result. You can click on print report to see the data in a report format, or you can click the check box save data and click on close.

Now the column query is generated and we can see the time stamp along with the result.

You can also click on the setup report to generate an adhoc report. Click on Setup report, give the chart title , select Chart as Graph and chart type as PIE.

Click on save and close and click on Print report to see the PIE chart as below-

You can also click on Export to export the report in to various formats like PDF or word.

Referential Integrity

Using this we can perform tests like (Uniqueness of data, Foreign key reference, Primary key Unreference, Primary key inference and Auto business rules).

Also you can notice that on the left hand side where you see the columns, an icon is shown representing which is a primary key and which is a foreign key.

You can select the FK REF (foreign key reference) for the column and also click on Return Data and click on Run.

Custom Query

Custom query is a user defined query.

Go to the Custom Query tab, select the data base and right click and choose Add Custom Query.

For example, you can see the report by joining the Orders table with the Order details table.

You can notice that both the tables are automatically linked on the primary key basis and also the SQL code is generated at the bottom. Click on RUN to see the result.

You can also write your custom query to get the specified output.

 

Scheduling test in Data Insight

 

You can schedule all the above tests to monitor the health of the data on a regular basis.

To schedule the reports, go to the menu ‘Tasks’ and add task.

Give the batch name and expand the database to see the saved tests.

Select the test and click on ADD-> to add the reports to the selected items.

Once you add all or required reports for scheduling, click on save, schedule.

Select the name of the report and choose ‘recur once every’ option, give end date and the start time.

Now click on ADD and click on close.

 

Creating a Trend Report

To create a trend report, select Reports – Add Trend report

Give the Trend Name and Chart title.

Expand the database and select the test on which you want to create a trend report and click on ADD. Select the Chart type and data type. You can also choose – Include past data and the scaling unit.

Now click on save and run.

This concludes the data profiling using Data Insight.