Joins and Lookup in SAP Data Services

Courtesy: SAP HELP DOCUMENTS ON SAP BusinessObjects Data Services 4.0 (14.0.1). All the images are copyright of the SAP Software.

 

This post discusses on join conditions and look up in Data Services.

Populating the Sales Fact Table from Multiple Relational Tables

The exercise joins data from two source tables and loads it into an output table. Data Services features introduced in this exercise are:
• Using the query transform FROM clause to perform joins
• Adding columns to an output table
• Mapping column values using Data Services functions
• Using metadata reports to view the sources for target tables and columns

In this exercise, you will:
• Populate the SalesFact table from two source tables:
• Table SalesItem – columns Cust_ID and Order_Date
• SalesOrder – columns Sales_Order_Number, Sales_Line_Item_ID, Mtrl_ID, and Price.
• Use the FROM clause in the Query transform to join the two source tables and add a filter to bring a subset of sales orders to the target.

Populating the Sales Fact Table from Multiple Relational Tables
• Use the LOOKUP_EXT() function to obtain the value for the Ord_status column from the Delivery source table rather than from the SalesOrder table.
• Use metadata reports to view:
• Names of the source tables that populate the target SalesFact table
• Names of source columns that populate the target columns

To add the SalesFact job objects
1. Add a new job and name it JOB_SalesFact.
2. Add a work flow and name it WF_SalesFact.
3. Click the WF_SalesFact name to open it in the workspace.
4. Add a data flow to the work flow definition and name it DF_SalesFact.

To define the data flow as shown below that will generate the sales fact table-

fact table data flow

Defining the details of the Query transform

1. Open the query editor.
2. In the FROM tab, in the Join pairs area, select ODS_SALESORDER from the Left drop-down list.
3. Select ODS_SALESITEM from the Right drop-down list.
The relationship between the SalesItem and SalesOrder sales tables is defined by the common column Sales_Order_Number. The Join option generates a join expression based on primary/foreign keys and column names.
The resulting relationship appears in the From clause text box:

SALESITEM.SALES_ORDER_NUMBER = SALESORDER.SALES_ORDER_NUMBER

4. Click the smart editor icon.
5. In the Smart Editor, type the following text. Use all uppercase, as shown. This statement filters the sales orders by date and brings one year’s sales orders into the target. Also note that as you type the function names, a pop-up window prompts you with options. To select an option that is highlighted, press Enter (or double-click any option in the list).

AND ODS_SALESORDER.ORDER_DATE >= to_date('2007.01.01','yyyy.mm.dd')
AND ODS_SALESORDER.ORDER_DATE <= to_date('2007.12.31','yyyy.mm.dd')

6. Click OK.
7. Map the source columns to output columns. Remember that to map the contents of a column, select the column in the source schema and drag it to the target schema.

Defining the details of the lookup_ext function

You will take the Sales_Fact table order status value (column ORD_STATUS) from the Delivery source table (column DEL_ORDER_STATUS) rather than from the SalesOrder table.

1. In the query editor, select the ORD_STATUS column in the target schema.
2. On the Mapping tab, click the Functions button.
3. In the Select Function window, in the Function categories box, click Lookup Functions.
4. Click lookup_ext from Function name.
5. Click Next.

6. To define the LOOKUP_EXT() function, complete the dialog box with the values shown in the following graphic. To add an expression, you can either drag column names into the Expression and Default fields or click the ellipses button to open the Smart Editor.

lookup function
a. In the Lookup table drop-down list, select ODS_DS. Select the Delivery source table and click OK. The lookup table is where the LOOKUP_EXT() function will obtain the value to put into the ORD_STATUS column.
b. In the “Available parameters” area, expand the “Lookup table” and “Input schema” to display the columns in the Delivery table and SalesItems table, respectively.
7. Set up the first Condition. The conditions identify the row to look up in the lookup table. You need two conditions in this exercise because each item in an order can have a different order status.
a. Drag the DEL_SALES_ORDER_NUMBER column from the Delivery table to the Column in lookup table under “Condition”.
b. Verify the operator is equal to (=).
c. Click the ellipses next to the Expression field to open the Smart Editor. On the Data tab, expand the SalesItem table. Drag the SALES_ORDER_NUMBER column to the right side and click OK.
8. Similarly, set the second condition for DEL_ORDER_ITEM_NUMBER equal to
ODS_SALESITEM.SALES_LINE_ITEM_ID. You can either drag the SALES_LINE_ITEM_ID column into the Expression field, or you can click the ellipses button to open the Smart Editor to add the column to the expression.
9. The “Output” parameter specifies the column in the lookup table that contains the value to put in the ORD_STATUS column in the query. For the “Output”, drag the DEL_ORDER_STATUS column from the ODS_Delivery under “Column in Lookup table”.

10. Click Finish.
The final lookup function should read as follows (word wrapping is optional):

lookup_ext(
[ODS_ds.DBO.DELIVERY, 'PRE_LOAD_CACHE', 'MAX'],
[ DEL_ORDER_STATUS ],
[ 'N/A' ],
[ DEL_SALES_ORDER_NUMBER,'=',
ODS_SALESITEM.SALES_ORDER_NUMBER,
DEL_ORDER_ITEM_NUMBER, '=', ODS_SALESITEM.SALES_LINE_ITEM_ID ])

11. Click the Back arrow in the tool bar. Row13

Next verify that the data flow has been constructed properly.
Click the Validate All button on the toolbar.
If your design contains syntax errors, a dialog box appears with a message describing the error.
As before, warning messages are OK.

Viewing Impact and Lineage Analysis for the SALES_FACT target table

Using the metadata reporting tool, you can easily browse reports about metadata associated with a Data Services job.

The metadata reporting tool is a Web-based application.
In this example, you will use the metadata reporting tool to view reports about the SALES_FACT table to identify when the table was last updated and what are the source tables and column mappings.

1. Open the Impact and Lineage Analysis tool.

a. Open and log in to the Designer.
b. From the Tools menu, click Data Services Management Console.
You can also open the Management Console from the Start menu by clicking Start > Programs > SAP BusinessObjects Data Services 4.0 >  Data Services Management Console.
A browser window that shows the Management Console opens.
c. Log in using the same credentials you used to log in to the Designer.
The Management Console home screen appears.
d. Click Impact & Lineage Analysis.

2. To ensure that Metadata Reports includes the most up-to-date information about sources for columns in target tables, click Settings in the upper right corner.

a. Ensure that Repository contains your current repository name.
b. Click the Refresh Usage Data tab.
c. Ensure that Job Server contains your current Job Server name. Click the Calculate Column Mapping button to determine where the data comes from for each column in the target tables. You should receive the following successful message:
Column mappings are calculated successfully.
d. Click Close.

3. Expand the Target_DS datastore.

A list of tables in the datastore displays.

4. Expand “Data Flow Column Mapping Calculation” to view the calculation status of each data flow.
5. Click the SALES_FACT table.

The Overview tab for Table: SALES_FACT opens on the right side of the browser window. This tab displays general information about the table such as the date and time this target table was last updated (Last Update Date).

6. Click the Lineage tab.

The following Lineage tab displays the sources for the SALES_FACT target table. When you move the pointer over a source table icon, the name of the datastore, data flow, and owner appear.

lineage

7. Expand the SALES_FACT table on the left side of the browser window to display a list of columns.
8. Click the ORD_STATUS column to display information about it on the right side.
The following Lineage tab shows that the ALES_FACT.ORD_STATUS column is based on information in the source columns ODS_SALESITEM.SALES_LINE_ITEM_ID and
ODS_SALESITEM.SALES_ORDER_NUMBER.
These are the source columns that you defined in the condition for the LOOKUP_EXT() function.

You can print these reports by clicking the printer icon on the toolbar.

Courtesy: SAP HELP DOCUMENTS ON SAP BusinessObjects Data Services 4.0 (14.0.1). All the images are copyright of the SAP Software.