ABAP Dataflow & Transports in BODS

This post introduces the SAP BusinessObjects Data Services objects for extracting data from SAP application sources:

  •  SAP Applications datastores
  •  ABAP data flows
  •  Transports

Note:

The procedures in this section require that the software has the ability to connect to an SAP server.
The sample tables provided with Data Services that are used in these procedures do not work with all versions of SAP because the structure of standard SAP tables varies in different versions.

Defining an SAP Applications datastore

1. In the local object library, click the Datastores tab.
2. Right-click inside the blank space and click New.
The “Datastore Editor” dialog box opens.
3. In the Datastore name field, type SAP_DS.

This name identifies the database connection inside the software.
4. In the Datastore type list, click SAP Applications to specify the datastore connection path to the database.
5. In the Application server field, type the name of the remote SAP Applications computer (host) to which Data Services connects.
6. Enter the user name and password for the SAP application server.
7. Click OK.
Data Services saves your SAP application database connection to the metadata repository.

To import metadata for individual SAP application source tables

Continue reading →

Recoverable workflow in BODS

This post describes how to:

• Design and implement recoverable work flows.
• Use Data Services conditionals.
• Specify and use the Auto correct load option.
• Replicate and rename objects in the object library.

 

Recovery Mechanisms

Creating a recoverable work flow manually

A recoverable work flow is one that can run repeatedly after failure without loading duplicate data.
Examples of failure include source or target server crashes or target database errors that could cause a job or work flow to terminate prematurely.
In the following exercise, you will learn how to:
• Design and implement recoverable work flows
• Use Data Services conditionals
• Specify and use the Auto correct load option on the target table
• Replicate and rename objects in the object library.

Adding the job and defining local variables

1. In the Class_Exercises project, add a new job named JOB_Recovery.
2. Open the job and declare these local variables:

Variable              Type
$recovery_needed      int
$end_time             varchar(20)

The $recovery_needed variable determines whether or not to run a data flow in recovery mode. The $end_time variable determines the value of $recovery_needed. These local variables are initialized in the script named GetWFStatus (which you will add in the next procedure). Continue reading →
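
As a preview, GetWFStatus usually amounts to a couple of lines of Data Services script along these lines. The datastore (Target_DS), status table (STATUS_TABLE), and column (END_TIME) names here are only illustrative; the full procedure is in the linked post.

    # Read the end time recorded by the previous run; it is NULL/empty if that run did not finish
    $end_time = sql('Target_DS', 'select END_TIME from STATUS_TABLE');

    # Run the data flow in recovery mode only when no end time was recorded
    $recovery_needed = ifthenelse(($end_time IS NULL) or ($end_time = ''), 1, 0);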

CDC in BODS

Changed-Data Capture

This post introduces the concept of changed-data capture (CDC). You use CDC techniques to identify changes in a source table at a given point in time (such as since the previous data extraction). CDC captures changes such as inserting a row, updating a row, or deleting a row. CDC can involve variables, parameters, custom (user-defined) functions, and scripts.

Exercise overview

You will create two jobs in this exercise. The first job (Initial) initially loads all of the rows from a source table. You will then introduce a change to the source table. The second job (Delta) identifies only the rows that have been added or changed and loads them into the target table. You will create the target table from a template.
Both jobs contain the following objects; the two scripts are sketched below.
• An initialization script that sets values for two global variables: $GV_STARTTIME and $GV_ENDTIME
• A data flow that loads only the rows with dates that fall between $GV_STARTTIME and $GV_ENDTIME
• A termination script that updates a database table that stores the last $GV_ENDTIME
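
A minimal sketch of those two scripts in the Data Services script language, assuming the last end time is kept in a one-row control table called CDC_time with a LAST_TIME column (the datastore, table, and column names are illustrative):

    # Initialization script of the Delta job: start where the previous run ended
    $GV_STARTTIME = to_date(sql('Target_DS', 'select LAST_TIME from CDC_time'), 'YYYY.MM.DD HH24:MI:SS');
    $GV_ENDTIME = sysdate();

    # Termination script: remember this run's end time for the next delta extraction
    sql('Target_DS', 'delete from CDC_time');
    sql('Target_DS', 'insert into CDC_time values ({$GV_ENDTIME})');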

Continue reading →

Joins and Lookup in SAP Data Services

This post discusses join conditions and lookups in Data Services.

Populating the Sales Fact Table from Multiple Relational Tables

The exercise joins data from two source tables and loads it into an output table. Data Services features introduced in this exercise are:

• Using the query transform FROM clause to perform joins
• Adding columns to an output table
• Mapping column values using Data Services functions
• Using metadata reports to view the sources for target tables and columns

In this exercise, you will:

• Populate the SalesFact table from two source tables:
• Table SalesItem – columns Sales_Order_Number, Sales_Line_Item_ID, Mtrl_ID, and Price
• Table SalesOrder – columns Cust_ID and Order_Date.
• Use the FROM clause in the Query transform to join the two source tables and add a filter to bring a subset of sales orders to the target (see the SQL sketch after this list).

• Use the LOOKUP_EXT() function to obtain the value for the Ord_status column from the Delivery source table rather than from the SalesOrder table.
• Use metadata reports to view:

• Names of the source tables that populate the target SalesFact table
• Names of source columns that populate the target columns
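
As an illustration, the join and filter you configure in the Query transform's FROM and WHERE settings correspond to SQL roughly like the following (the date in the filter is made up for illustration):

    SELECT SalesItem.Sales_Order_Number,
           SalesItem.Sales_Line_Item_ID,
           SalesItem.Mtrl_ID,
           SalesItem.Price,
           SalesOrder.Cust_ID,
           SalesOrder.Order_Date
    FROM   SalesItem, SalesOrder
    WHERE  SalesItem.Sales_Order_Number = SalesOrder.Sales_Order_Number
      AND  SalesOrder.Order_Date >= '2007.01.01'   -- illustrative filter for a subset of sales orders

The Ord_status value is then fetched separately with the LOOKUP_EXT() function against the Delivery table rather than being joined in.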

To add the SalesFact job objects Continue reading →

Debugger in SAP Data Services

This post describes how to use the interactive debugger in Data Services.

Using the interactive debugger

The Designer includes an interactive debugger that allows you to examine and modify data row by row by placing filters and breakpoints on lines in a data flow diagram.
A debug filter functions as a simple query transform with a WHERE clause. Use a filter to reduce a data set in a debug job execution. A breakpoint is the location where a debug job execution pauses and returns control to you.
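
For example, the condition you attach to a filter or breakpoint is just a WHERE-style expression on the line you select, something like the following (the table and column names are illustrative):

    CUSTOMER.COUNTRY = 'US'
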
This exercise demonstrates how to set a breakpoint and view data in debug mode. Continue reading →

SAP Data Services Designer tool

This post gives you a short overview of the Data Services product and terminology. Refer to the post SAP BO DATA Integrator / Data Services for more details.

Data Services Components

The following diagram illustrates the Data Services product components and their relationships:

Continue reading →

SQL Transform in SAP BODS

[Image: SQL transform]

The SQL transform lets you import a schema into your data flow that acts as a source.

Create a new batch job, then add a work flow and a data flow. In your data flow, drag in the SQL transform from the local object library; you can find it under the ‘Platform’ group of transforms.

Double click on the transform.

(Refer to the post Validation Transform to create the required tables in the database.)

Select your datastore and write a simple SELECT statement against a table already imported into that datastore, then click ‘Update Schema’.
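
For example, assuming the datastore already contains an imported table named CUSTOMER (the table and the filter are only illustrative):

    select * from CUSTOMER where COUNTRY = 'US'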

Observe that the schema now appears in the top window.

[Image: imported schema shown in the SQL transform editor]

Add a Query transform and map the required columns to the output schema. Connect it to a target template table, then save and execute your job.

[Image: data flow with the SQL transform, Query transform, and target template table]

This transform is particularly useful when you want to pull only a specific subset of data from a database table and push it through to the target.

Refer to the ebook for more details:

New Ebook – SAP BODS Step by Step

SAP BO DATA Integrator / Data Services

Data Services, also known as SAP BODS (SAP BusinessObjects Data Services), is part of the SAP BusinessObjects suite. It is integrated with SAP applications and also supports non-SAP databases.

Data Services performs ETL processing: it loads data from source to target, transforming and cleansing it along the way. For every data flow, backend SQL is generated; you can view it from the main menu via Validation > Display Optimized SQL.
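
For instance, for a data flow that reads one source table, filters it in a Query transform, and loads a target, the Display Optimized SQL window typically shows a pushed-down statement along these lines (the table and column names are illustrative):

    SELECT "CUSTOMER"."CUST_ID", "CUSTOMER"."CUST_NAME"
    FROM "DBO"."CUSTOMER" "CUSTOMER"
    WHERE "CUSTOMER"."COUNTRY" = 'US'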

[Image: Display Optimized SQL window]

ETL jobs process data in memory, which is one reason SAP BODS is among the faster ETL tools on the market.

Advantages of Data Services over SAP BI/BW ETL process

  • It is simple to use and has a user-friendly framework
  • It has built-in connectivity to many types of data sources, such as flat files, XML, Hadoop, etc.
  • It has many built-in transforms, such as Key Generation, Case, Merge, etc.
  • It supports separate job types for batch execution and real-time loads, and it can also perform delta loads
  • There is no concept of process chains, DTPs, or InfoPackages when you use Data Services to load the data

Data Integrator / Data Services Architecture

 

[Image: Data Integrator / Data Services architecture diagram]

 

Data Integrator Components

Designer

[Image: Data Services Designer]

  • It is used to create ETL data flows
  • All Designer objects are reusable

Management Console (URL-based / web-based tool)

[Image: Data Services Management Console]

  • It is used to activate repositories
  • You can create users and user groups and assign roles and privileges here
  • It lets you auto-schedule batch jobs, monitor them, and view their execution history

Access Server

  • It receives XML input (real-time data)
  • XML inputs can be loaded into the warehouse through the Access Server
  • It is responsible for executing online/real-time jobs

Repository Manager

[Image: Repository Manager]

  • It allows you to create repositories (local, central, and profiler)
  • Repositories are created on a standard database (Oracle, Microsoft SQL Server, HANA, etc.)
  • The Data Services system tables live in the repository, which is itself a database (in Microsoft SQL Server, for example). Any table imported in the Designer is stored in the local repository as an external table, while template tables that have not been imported are stored as internal tables in the local repository.
  • A novice learner can skip the central repository; it is only needed when multiple users access the Designer.

Metadata Integrator

  • It generates auto documentation
  • It generates sample reports and semantic layers
  • It generates job-based statistics dashboards

Job Server

This is the server responsible for executing jobs. Without a local/central repository assigned to it, you cannot execute jobs.

Designer Objects

Projects:-

A project is a folder where you store all related jobs in one place; think of it as a way to organize jobs. Only one project can be open at a time in the Data Services Designer.

Jobs:-

Jobs are the executable units of Data Services. A job sits under a project. There are two types of jobs:

  1. Batch jobs
  2. Real-time jobs

Work Flows:-

A work flow acts as a folder that contains related data flows. Work flows are reusable, and they are optional, i.e. you can execute a job containing a data flow and no work flow.

Conditionals:-

A conditional contains work flows or data flows, and it is controlled by a condition expression, typically based on variables set in scripts; the script logic decides whether the conditional is triggered.
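
For example, in the recoverable work flow described earlier, the conditional's If expression could simply test a flag that a preceding script has set (the variable name is taken from that example):

    $recovery_needed = 1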

Scripts:-

Scripts are blocks of code used to define or initialize global variables, control the flow of conditionals and the overall execution, print statements at run time, and assign default values to variables, as sketched below.
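
A small script might look like this in the Data Services script language (the variable name and message are illustrative):

    # Assign a default value to a global variable
    $GV_LOAD_DATE = sysdate();

    # Print a message to the trace log at run time
    print('Load started at: ' || to_char($GV_LOAD_DATE, 'YYYY.MM.DD HH24:MI:SS'));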

Data Flow:-

The actual data processing happens here. 

Source Data Store:-

This datastore connects your Data Services Designer to your source system.

Target Data Store:-

This datastore connects your Data Services Designer to your target system database.

Transformations:-

These are the transforms used to carry out the ETL process. They are broadly categorized into three groups: Platform, Data Quality, and Data Integrator.

File Format:-

File formats describe files from various legacy systems, such as XLS, XLSX, CSV, TXT, etc.

Variables:-

You can create local and global variables and use them in your project. Variable names start with the “$” symbol.

Functions:-

Numerous built-in functions (string, math, lookup, enrich, and so on) are provided in the Designer.

Template Table:-

These are temporary tables used to hold intermediate or final data. They can be converted to permanent tables (i.e. tables stored in the database) by importing them.

Data Store:-

A datastore acts as a port through which you define connections to source or target systems.

 

Refer to the ebook for more details:

New Ebook – SAP BODS Step by Step