RAS Core Research Note G00130597, Bill Gassman, 30 September 2006
Options Proliferate for ‘Real-Time Data Integration’ Technology
• Numerous applications are emerging for real-time or low-latency data integration. Consequently, IT development and operations managers must identify the most cost-effective and appropriate offerings to satisfy on-demand and event-driven needs.
• Analysis: Real-time data integration is the technology and process that provides application access to up-to-date information to meet time-sensitive business requirements.
• Technology Options
  • direct access
  • real-time data acquisition …
  • publish-subscribe model, change data capture, message exchange
  • master data centralization
  • File transfer protocol (FTP)
  • …
SAP AG 2006, HTWG AK SWT 101106 / 3
Challenges of Real-time Data Warehousing
Looking beyond traditional ETL: expand the concept of data acquisition in order to reduce data latency
Latency – time lag between the completion of an activity in an ERP environment and the availability of the completed activity data in a state-of-the-art data warehouse environment
Latency = zero: demand for tools that enable direct access to data and information without any latency across platforms
SAP NetWeaver 2004s BI capabilities for Real-time Data Warehousing
– Real-time Data Acquisition (RDA)
– Direct Access
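The latency definition above amounts to a simple time difference. The following is a minimal Python sketch of that arithmetic; the helper name `data_latency` and all timestamps are illustrative assumptions, not measurements or SAP functionality.

```python
from datetime import datetime, timedelta


def data_latency(completed_at: datetime, available_at: datetime) -> timedelta:
    """Time lag between completion of an activity in the ERP system and
    availability of that activity's data in the data warehouse."""
    if available_at < completed_at:
        raise ValueError("data cannot be available before the activity completes")
    return available_at - completed_at


# Illustrative timestamps (assumptions, not measurements):
completed = datetime(2006, 11, 10, 14, 0, 0)     # activity completed in ERP
nightly_load = datetime(2006, 11, 11, 2, 0, 0)   # available after a nightly batch load
rda_load = datetime(2006, 11, 10, 14, 1, 0)      # available after a one-minute load cycle

batch_latency = data_latency(completed, nightly_load)
rda_latency = data_latency(completed, rda_load)
print(batch_latency)  # 12:00:00
print(rda_latency)    # 0:01:00
```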
ETL, Flavours of ETL, Alternatives
ETL (Extraction, Transformation, Load)
– New DataSource concept with SAP NW2004s BI
– DataSource
– Transformation
– Data Transfer Process (DTP)
Flavours of ETL
– Classical Staging Processes
– Real-time Data Acquisition (RDA)
Alternatives
– Direct Access
What does the implementation of ETL processes look like in an SAP NetWeaver 2004s BI environment?
Real-time Data Warehousing - Introduction
Overview NW 2004s BI - ETL Capabilities
Real-time Data Warehousing with SAP NW 04s BI
– Direct Access
– Real-time Data Acquisition (RDA)
Outlook and Summary
ETL, Flavours of ETL, Alternatives
ETL (Extraction, Transformation, Load)
– New DataSource concept with SAP NW2004s BI
– DataSource
– Transformation
– Data Transfer Process (DTP)
Flavours of ETL
Classical Staging Processes
– Persistency of transactional and master data in BI
– Extraction using DataSources with services like packaging and sequencing, delta handling
– Transformations with graphical UI and sophisticated formula builder
– Data loads based on batch scheduling
– Data loads can even be scheduled hourly
– Process Chains for modeling complex load processes using InfoPackages and DTPs
– Scheduling options for Process Chains using SAP NetWeaver batch scheduling or Redwood Cronacle (OEM)
SAP NW2004s BI: Data Flow Concept – Example
[Diagram: within SAP NetWeaver BI, a Process Chain controls the flow from the Source System via InfoPackage into the DataSource (PSA), then via Transformation and Data Transfer Process into the InfoProvider.]
Data Acquisition Layer – Data Source/Source Systems
Create DB Connection and DataSource
DataSource interfaces:
– DB Connect
– UD Connect
– BI Service API
– File Interface
– Web Service
Source types:
– Relational source (e.g. MSS, DB2, Teradata)
– Multidimensional source (e.g. Hyperion)
– SAP source (e.g. SAP ERP)
– File
– XML
New BI DataSource concept with NetWeaver 2004s
Highlights
– Uniform look and feel for all DataSource types
– Direct/remote access is always an option
– Preview feature is standard
– Automated conversions (e.g. date format detection)
– InfoPackages only write into PSA
– New Transformation handling
– New Data Transfer Process logic
Source System Tree
Source systems are categorized:
– SAP vs. non-SAP
– File vs. database
– Relational vs. multidimensional DB
– ABAP vs. Java
– XML vs. text/binary
Transformation – Graphical UI
Source fields
Target fields
Note: Key figures, characteristics and date fields are shown on the same level (transformation group)
Rules per group
Transformation - Rules
Information on
– Rule type
– Currency/Unit Conversion
– Source fields
– Target fields
Data Transfer Process: Complex Example
[Diagram: in SAP NetWeaver BI, Source System 1 and Source System 2 feed two DataSources (PSA) through InfoPackages (IP); Transformations (TR) and Data Transfer Processes (DTP) connect these to DataStore Objects 1, 2 and 3 and to an InfoProvider; Process Chains control the loads.]
Complex Implementation Example
Two given Options for Real-time Data Warehousing
ETL (Extraction, Transformation, Load)
– New DataSource concept with SAP NW2004s BI
– DataSource
– Transformation
– Data Transfer Process (DTP)
Flavours of ETL
Classical Staging Processes
Real-time Data Acquisition (RDA)
– Integrated in data flow concept
– DataSources with specific adapter types like ‘WebService Push’ and ‘Real-time Extraction from SAP System’
– DTP type ‘Real-time Data Acquisition’
– Daemon-based processing with modified request handling
Alternatives
Direct Access
– Modeling based on VirtualProvider
– DataSources are Direct Access enabled
– DTP type ‘for Direct Access’
– Driven by query handling
Direct Access: Simple Example
[Diagram: a Query in BI reads from a VirtualProvider, which uses a Data Transfer Process for Direct Access and a Transformation to read from a DataSource for direct access in the Source System.]
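The direct access chain above can be illustrated with a small read-through sketch in Python. It is not SAP code: the class `VirtualProviderSketch`, the in-memory `source_table` and the field names are hypothetical stand-ins chosen for illustration. The point it shows is that nothing is stored on the BI side, so each query sees the current state of the source.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


class VirtualProviderSketch:
    """Hypothetical stand-in for a VirtualProvider: it stores no data itself."""

    def __init__(self, read_source: Callable[[], Iterable[Row]],
                 transform: Callable[[Row], Row]) -> None:
        self._read_source = read_source  # plays the role of the DataSource for direct access
        self._transform = transform      # plays the role of the transformation

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Every query reads the source again, so results reflect its current state.
        return [row for row in map(self._transform, self._read_source())
                if predicate(row)]


# In-memory list standing in for the source system (assumption).
source_table: List[Row] = [{"po": "4500000001", "amount_cents": 12500}]

provider = VirtualProviderSketch(
    read_source=lambda: list(source_table),
    transform=lambda r: {"po": r["po"], "amount": r["amount_cents"] / 100},
)

before = provider.query(lambda r: True)
source_table.append({"po": "4500000002", "amount_cents": 9900})  # new activity in the source
after = provider.query(lambda r: True)
print(len(before), len(after))  # 1 2
```

The trade-off in this sketch is that every query pays the cost of reading and transforming source data at query time.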
Direct Access - Implementation
Create VirtualProvider
Create Transformation and DTP
Direct Access: Complex Example
[Diagram: a Query in BI reads from a VirtualProvider and from a Remote Characteristic used as InfoProvider. Each is supplied through its own DTP for Direct Access and Transformation (TR) from a DataSource (direct access) in the Source System, one for transactional data and one for master data.]
Implementation - Overview
Verify master data handling during Reporting
Create Remote Characteristic as InfoProvider
Final Query Result with Remote Master Data Texts
Columns: Customer; Customer Names (remote names from external source)
Direct Access Scenarios
BI DataSource types with Direct Access capabilities
File DataSource
Access file system of application server or local workstation
DB Connect DataSource
Access SAP NetWeaver DB platforms using ODBC technique
UD Connect DataSource
Based on the Universal Data Integration (UDI) concept
Access to any source system using SAP J2EE Connection Framework based on BI Java Connectors
– JDBC Connector for any DB providing JDBC (MS Access, Teradata, …)
– ODBO Connector for any multidimensional source providing OLE DB for OLAP (Hyperion, …)
– XML/A Connector for sources providing XML for Analysis (SAP NW BI, …)
BI Service API DataSources
Access to SAP ERP source systems
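The scenarios above can be read as a lookup from the kind of source to the DataSource type that offers direct access. The Python sketch below encodes only what this slide lists; the dictionary keys and the function name `direct_access_datasource` are hypothetical labels introduced for illustration.

```python
# Keys are hypothetical labels; values name the DataSource types listed above.
DIRECT_ACCESS_OPTIONS = {
    "file": "File DataSource",
    "sap_netweaver_db": "DB Connect DataSource",
    "jdbc_database": "UD Connect DataSource (JDBC Connector)",
    "ole_db_for_olap": "UD Connect DataSource (ODBO Connector)",
    "xml_for_analysis": "UD Connect DataSource (XML/A Connector)",
    "sap_erp": "BI Service API DataSource",
}


def direct_access_datasource(source_kind: str) -> str:
    """Return the DataSource type listed for the given kind of source."""
    try:
        return DIRECT_ACCESS_OPTIONS[source_kind]
    except KeyError:
        raise ValueError(
            f"no direct access option listed for {source_kind!r}"
        ) from None


print(direct_access_datasource("ole_db_for_olap"))
# UD Connect DataSource (ODBO Connector)
```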
Strategic vs. Tactical Decision-Making
Standard Data Acquisition vs. Real-time Data Acquisition
– Goal: strategic decision-making (long-term planning) vs. tactical decision-making (day-to-day decisions)
– Data staging: request oriented (nightly batch job) vs. stream oriented (always active batch job)
– Upload frequency: 1/day … 1/week vs. 1/min … 1/hour
– Availability for reporting: after a certain period vs. close to real-time
– Computing power: usually done at night (load balancing) vs. permanent resource consumption
Use Real-time Data Acquisition only if necessary.
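One way to read this recommendation is as a rule of thumb driven by the latency a report can tolerate. The Python sketch below is an interpretation, not an SAP guideline: the function name `choose_acquisition` and the two thresholds (one minute and one hour) are assumptions derived from the upload frequencies named in this deck.

```python
from datetime import timedelta


def choose_acquisition(max_latency: timedelta) -> str:
    """Pick the least resource-intensive option that still meets the
    tolerated latency (thresholds are assumptions, see lead-in)."""
    if max_latency < timedelta(0):
        raise ValueError("tolerated latency must not be negative")
    if max_latency < timedelta(minutes=1):
        # Below the shortest RDA interval only reading the source directly helps.
        return "Direct Access"
    if max_latency < timedelta(hours=1):
        # RDA covers upload frequencies from 1/min to 1/hour.
        return "Real-time Data Acquisition"
    # Classical staging can be scheduled hourly or less often.
    return "Standard Data Acquisition"


print(choose_acquisition(timedelta(minutes=5)))  # Real-time Data Acquisition
print(choose_acquisition(timedelta(days=1)))     # Standard Data Acquisition
```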
Example: RDA using SAP ERP Service API
[Diagram: in the SAP source system, the application fills the delta queue by real-time update, and the Service API exposes the delta queue. In SAP NetWeaver BI, the daemon pulls about once per minute: the InfoPackage for Real-time Data Acquisition loads the PSA of the DataSource, and the Data Transfer Process for Real-time Data Acquisition updates the DataStore Objects of the operational data store.]
Service API-based real-time data acquisition is a two-stage process:
– Data is pulled into the PSA by the real-time InfoPackage
– Data is transferred to the DataStore Object using the real-time DTP
In both cases the process of data movement is initiated by the system daemon
Real-time Data Acquisition (RDA) - Processing Daemon
In the UNIX world the term is often expanded as Disk And Execution MONitor (a backronym)
System process that initiates data loads at regular intervals, from once per minute to hourly
The BI daemon data load includes three steps:
– Initiate the BI Service API data pull into the PSA using the InfoPackage for RDA (SAP source systems)
– Track the status of the data transfer from the source system
– Initiate the update of the DataStore Object using the DTP
Successful execution of each step is tracked in a control table
– Allows restarting
– A restart resumes at the step after the last successfully executed step
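The step tracking and restart behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not the daemon's actual implementation: the step names, the list used as control table and the simulated failure are all assumptions.

```python
from typing import Callable, Dict, List

# Hypothetical names for the three steps of one daemon load cycle.
STEPS = ["pull_into_psa", "track_transfer_status", "update_datastore_object"]


def run_cycle(actions: Dict[str, Callable[[], None]],
              control_table: List[str]) -> None:
    """Run one load cycle, skipping steps already recorded as successful."""
    for step in STEPS:
        if step in control_table:
            continue                 # completed before the restart
        actions[step]()              # may raise; nothing is recorded in that case
        control_table.append(step)   # record success only after the step finishes


log: List[str] = []
attempts = {"track": 0}


def pull() -> None:
    log.append("pull")


def track() -> None:
    attempts["track"] += 1
    if attempts["track"] == 1:       # simulated failure on the first attempt
        raise RuntimeError("source system not reachable")
    log.append("track")


def update() -> None:
    log.append("update")


actions = {
    "pull_into_psa": pull,
    "track_transfer_status": track,
    "update_datastore_object": update,
}
control_table: List[str] = []

try:
    run_cycle(actions, control_table)   # fails in the second step
except RuntimeError:
    pass

run_cycle(actions, control_table)       # restart resumes at the second step
print(log)  # ['pull', 'track', 'update']
```

In this sketch the pull is not repeated after the restart because its success was already recorded; a periodic loop would clear the control table before starting the next cycle.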
Daemon Monitoring
How do you check the status of the RDA daemon?
Call transaction RSRDA or press
– Is the daemon (batch job) still running?
Refresh in RSRDA
The status of the daemon is indicated by one of the following icons:
– Daemon active and running
– Daemon not active
– Daemon stopped with an error
– Daemon stopped by user (will change to “not active” or “erroneous”)
Display runtime information about the daemon (context menu in RSRDA)
RDA using SAP ERP Service API
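[Diagram: the daemon process in SAP NetWeaver BI drives the RDA InfoPackage, which loads from SAP ECC into the DataSource (PSA), and the RDA Data Transfer Process, which updates the DataStore Object through the Transformation.]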
RDA using SAP ERP Service API - Modelling View
Create InfoPackage and DTP for RDA
RDA Example Using Service API – Administration View
Create a daemon in the RDA monitor; this generates an open request that waits for transactional SAP ERP purchase order (PO) data
Monitor: Real-time Data Acquisition
Run SAP ERP Purchasing to push a data record into BI using this channel
RDA – Implementation Scenario
Enhance an established data flow with RDA capabilities:
– Implement an additional DataStore Object for operational reporting
– Replace the standard delta InfoPackage by an RDA InfoPackage
– Regular data loads can be scheduled after closing the RDA InfoPackage request using the appropriate Process Chain feature
– Typically data is deleted regularly from the DataStore Object supplied using RDA
– Standard reporting can be enhanced by operational reporting using the report-report interface
[Diagram: in the OLTP system, the application fills the delta queue of DataSource 2LIS_02_VAITM by real-time update. In BI, the RDA daemon loads the DataSource (PSA) and supplies a DataStore Object via RDA, alongside the DataStore Object of the established data flow.]
Master and Process Data Replication
• periodic
  • SAP NW BI ETL
  • SAP NW BI RDA
  • …
• continuous
  • EAI (SAP NW Process Integration)
  • …
• other
  • SAP NW BI Direct Access
  • …