Event Driven Real Time Analytics
Jon Mead, Rittman Mead September 2012
T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E :
[email protected] W: www.rittmanmead.com
Sunday, 30 September 12
Introductions
• Jon Mead
  ‣ CEO/co-founder of...
• Rittman Mead Consulting
  ‣ Oracle BI & DW Consultancy
  ‣ Gold Partner
  ‣ Long(est) running Oracle BI blog
  ‣ Annual BI Forum
  ‣ OBIEE Oracle Press book
• Customer-facing FTSE listed
  ‣ UK based and leading
  ‣ Internet based
  ‣ Retail based
Agenda
• Understanding the Project
  ‣ Legacy architecture
  ‣ Proposed architecture
  ‣ Reporting requirements
• Technical Infrastructure
  ‣ Hardware and Software
• Data Warehouse Architecture
  ‣ Adopting the Oracle reference architecture for real time
• Design Challenges
  ‣ De-queuing
• Operational
  ‣ ODI Logging
  ‣ Multi-threading and scalability
• Further thoughts
  ‣ Middleware or memory based applications
The point of this presentation is to give you an idea of how to approach a real time event driven BI system using Oracle's current toolset.
Understanding the Project
Business Goal
• Part of a major re-architecture program
• Covering ERP, CRM and BI
Driver: single view of customer
Delivered by: channel consolidation into single enterprise data warehouse
• Data migration
• Enterprise Architecture
• Enterprise Service Bus
• Real-time reporting
• Legacy reporting
• BAU reporting
• Revenue and Profit
• Liability and risk
• Up/cross-sell
Legacy Architecture
• Legacy architecture consisted of two completely separate systems
• Retail stored shop based transactions
• Online stored transactions generated online
[Diagram: Retail trading systems feed the Retail Data Warehouse (SQL Server 2005, 6TB) through a 24 hour batch (DTS). Online trading systems feed the Online Data Warehouse (SQL Server 2008, 3TB) through a 24 hour batch (DTS).]
Proposed Architecture
[Diagram: Retail and Online trading systems send real-time feeds to the Enterprise Architecture (TIBCO queue). ODI loads real-time feeds of transactions and reference data into the Real Time Data Warehouse (Exadata, with a DR Exadata). The legacy Online Data Warehouse (SQL Server 2008, 3TB) and Retail Data Warehouse (SQL Server 2005, 6TB) are loaded through a once-off ODI data migration.]
Current state to future state includes a data migration
Reporting Requirements
• Real time monitoring
  ‣ Risk and liability
  ‣ Profit and loss
• Analytic reporting
  ‣ Consolidated analytics
  ‣ Legacy reporting
• Operational reporting
  ‣ Detail level
  ‣ Support analytical reports
  ‣ Drill through
We need to understand the different drivers behind each of these requirements, and the value that real time reporting provides while high transaction events are running
Technical Infrastructure
Volumetrics
• Initially processing data from 2500 shops, scaling to capacity
  ‣ 8 TB of migrated data
  ‣ Processing 1.8 million transactions a day
  ‣ Processing 4,000 reference data items a day
  ‣ Approximately 9 million transaction rows being processed a day
  ‣ All transactions read from a TIBCO queue
  ‣ Approximately 200,000 reference data changes a day
  ‣ 30,834 transaction processing cycles a day (one every ~2.8s)
  ‣ 2,701 reference data cycles a day
  ‣ 680,000 recalculations a day
• Online transactions will follow
  ‣ 2 million transactions a day
  ‣ Comparable downstream figures
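The retail figures above can be cross-checked with some simple arithmetic. This is only a sketch: the variable names are ours, and the derived averages assume the load is spread evenly across a 24 hour day, which (as later slides show) it is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on the Volumetrics slide (retail only).
transactions_per_day = 1_800_000
transaction_rows_per_day = 9_000_000
transaction_cycles_per_day = 30_834
reference_cycles_per_day = 2_701

# Derived averages, assuming an even spread over the day.
seconds_per_cycle = SECONDS_PER_DAY / transaction_cycles_per_day
seconds_per_reference_cycle = SECONDS_PER_DAY / reference_cycles_per_day
avg_tps = transactions_per_day / SECONDS_PER_DAY
rows_per_transaction = transaction_rows_per_day / transactions_per_day
transactions_per_cycle = transactions_per_day / transaction_cycles_per_day

print(f"One transaction cycle every {seconds_per_cycle:.1f}s")
print(f"One reference data cycle every {seconds_per_reference_cycle:.0f}s")
print(f"Average of {avg_tps:.1f} transactions per second")
print(f"About {rows_per_transaction:.0f} rows per transaction")
print(f"About {transactions_per_cycle:.0f} transactions per cycle")
```

The averages are modest; the design challenges later in the deck come from the peaks, not from the daily mean.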
Exadata and ODI
• Standard X2-2 Quarter Rack
  ‣ 2 compute nodes
  ‣ All databases split across the nodes
• The storage is configured in dual redundancy mode to offer about 9TB of usable space; however, we use a couple of TB of that for backups and archived redo logs
• The flash storage has been set up as 250GB on each node as a local cache, with 110GB from each being used to provide a 160GB flash disc
• The database version is 11.2.0.3, and the client has the tuning and diagnostics pack and Heterogeneous Services on top of the usual Exadata software set
  ‣ Both the 11.2.0.2 and 11.2.0.3 Oracle Homes still exist
• ODI agents for UAT and PROD run off Node 2
• Running up to 30 ODI agents for PROD to get the speed to read off the TIBCO queues. Each agent runs with 512MB, with the calling agent running with 1GB
Exadata and ODI
[Diagram: compute nodes and ODI installs. Node 1: ODI agents for DEV/NFT (DEV01, DEV02, NFT). Node 2: ODI agents for PROD and UAT. Oracle databases: PROD and UAT ODI work schemas, PROD and UAT repositories, SSD.]
Currently only one Exadata server available, so shared platform
Data Warehouse Architecture
Key Drivers
• Part of integrated Enterprise Architecture
• The enterprise data model was designed and developed in Enterprise Architect by the middleware architects
• The architects wanted to base the approach on the Oracle Reference Data Warehouse architecture
• There were different reporting needs for real time and business as usual reporting
• Write performance was likely to be as big a factor as read performance
Oracle Reference Architecture
• Simplified view of Oracle’s Data Warehouse Reference Architecture
• Enterprise Architecture was XML based
[Diagram: the Enterprise Architecture feeds the Active Data Warehouse, which consists of Staging, Foundation and Performance layers, with Audit and Reconciliation alongside. OBIEE sits on top for analysis and for operational and real-time reporting.]
One of the design drivers was that the foundation layer should reflect the enterprise data model
Limitations
• The ODS would reflect the enterprise architecture
  ‣ Non-database centric view
• Considerable processing to get data into ODS
  ‣ Data processing from Staging to Foundation was too complex to support SLAs
• Performance layer also had to reflect the existing, more data warehouse like structures
The result was structures that were non-performant and unusable for real-time reporting
Alternative Architecture
• Split the processing between real-time and BAU
  ‣ Process 1: Staging to Performance (real-time)
  ‣ Process 2: Staging to ODS to Foundation
• Independent control of either process
• Mechanism to handle peaks in data
• Needed to ensure consistency between processes
[Diagram, by layer. Staging: ODI real-time ETL loads TIBCO messages into the STG_xxx tables, controlled by STG_CTL. Foundation: ODI micro-batch ETL loads the 3NF tables (BET, BETSLIP). Performance: ODI real-time ETL loads the decomposition and aggregate tables (BET, BETSLIP) for real-time SQL queries, and ODI micro-batch ETL loads the dimension and fact tables (BET, BETSLIP) for near-real-time SQL. OBIEE reads the near-real-time data.]
Design Challenges
De-queuing
• Concern that ODI would not be able to de-queue
  ‣ A lot of fluctuations, depending on events
• XML messages were verbose
  ‣ Large amount of processing time for each batch of messages
• Scalability provided by creating more agents
  ‣ What would the limitations be in terms of RAM?
  ‣ What would the limitations be in terms of connections?
  ‣ What would the limitations be in terms of management?
2 Queues
• Queues are unstructured, so the data can arrive in any order
  ‣ Difficulty of processing business logic
  ‣ Timestamps not always accurate
  ‣ Keys not always present
• One of the most challenging areas of the project
  ‣ Often need to do manual lookup of keys
Solution: recycle mechanism
[Diagram: the TIBCO transactions and reference data queues are loaded by real-time ETL into the STG_xxx tables, controlled by STG_CTL, with a Recycle loop back into the staging tables.]
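The recycle idea can be sketched as follows. This is an illustration only, not the project's ODI implementation: the message fields (`id`, `ref_key`), the `max_attempts` limit and the rejection step are all assumptions made for the example. A transaction whose reference key has not arrived yet is parked and retried on the next cycle rather than failed outright.

```python
from collections import deque


def process_batch(messages, reference_keys, recycle, max_attempts=3):
    """Load messages whose reference key is known; recycle the rest.

    Returns (loaded_ids, rejected_ids). Messages still waiting for their
    reference data are left in `recycle` for the next cycle.
    """
    loaded, rejected = [], []
    # Retry previously recycled messages first, then the new arrivals.
    pending = list(recycle) + [dict(m, attempts=0) for m in messages]
    recycle.clear()
    for msg in pending:
        if msg["ref_key"] in reference_keys:
            loaded.append(msg["id"])
        elif msg["attempts"] + 1 >= max_attempts:
            # Hypothetical policy: give up and flag for manual lookup.
            rejected.append(msg["id"])
        else:
            msg["attempts"] += 1
            recycle.append(msg)
    return loaded, rejected


recycle = deque()
reference_keys = {"SHOP-1"}

# Cycle 1: the transaction for SHOP-2 arrives before its reference data.
first = process_batch(
    [{"id": 1, "ref_key": "SHOP-1"}, {"id": 2, "ref_key": "SHOP-2"}],
    reference_keys,
    recycle,
)

# The reference data queue catches up, then cycle 2 retries the parked row.
reference_keys.add("SHOP-2")
second = process_batch([], reference_keys, recycle)

print(first, second, len(recycle))
```

Capping the number of attempts keeps rows with genuinely missing keys from circulating forever; those are the ones that end up needing a manual lookup.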
High number of writes
• The real-time write process generated a very high number of writes
• Exadata optimised for bulk reads
• Contention for REDO logs (see also ODI Logging)
• Exadata configured for more of an OLTP system than a Data Warehousing system
  ‣ However both share the same server
• Resolution: lots of work by the DBAs to optimise database configuration
Isn't this a little bit like an OLTP system?
Operational Challenges
ODI Parser
• XML Parser
  ‣ The XML data definition files were dynamically generated
  ‣ The current version of the XML Parser does not do a double pass of the definition file
  ‣ Any referenced complex definitions needed to be defined in the order they were accessed
  ‣ The software generating the XML definition files did not do this
• Resolution
  ‣ Build a Java program to re-parse the XML data definition file and output a correctly ordered version
  ‣ This is a once per release process
  ‣ This behaviour is fixed in 11.1.1.7 of ODI (I think)
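The re-ordering step is essentially a topological sort: every complex definition has to appear before anything that references it. The project used a Java program for this; the sketch below shows the same idea in Python with the standard library's `graphlib`, using made-up definition names and a hand-written dependency map in place of a real parsed definition file.

```python
from graphlib import TopologicalSorter


def order_definitions(dependencies):
    """Return definition names so that every referenced definition
    comes before the definitions that use it.

    `dependencies` maps a definition name to the set of definitions
    it references.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical complex types, listed in the "wrong" order a generator
# might emit them: BetSlip is declared before the types it references.
deps = {
    "BetSlip": {"Bet", "Customer"},
    "Bet": {"Selection"},
    "Selection": set(),
    "Customer": set(),
}

ordered = order_definitions(deps)
print(ordered)
```

A single-pass parser can then read the definitions in `ordered` sequence without meeting a forward reference. `TopologicalSorter` raises `graphlib.CycleError` if the definitions reference each other circularly, which a single-pass parser could not handle in any order.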
De-queuing Performance
• ODI struggled to keep up with, or fell behind, the queue at peak times
  ‣ Volumes of messages were not regular
• We also found agents failing
  ‣ Hence we needed a resumption mechanism
Scaling agents
• Because of the failing agents, we couldn’t just increase their number
• Set up parent agents
  ‣ One for each queue
  ‣ One for monitoring and maintenance scripts
• Each parent agent ran a number of child agents
  ‣ Each child agent was actually two agents
  ‣ Second agent acted as redundancy
  ‣ Agents killed after 50 executions
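[Diagram: queues Q1 and Q2 are served by parent agents P1 and P2, with a third parent for monitoring and maintenance (M&M). The parents run child agents (C1, C2, C3, C4, C6), each backed by a group of agents (A1, A2, A3, A4).]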
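One way to picture the child agent pairs is as a slot holding a primary and a standby, where the primary is retired after a fixed number of executions. The sketch below is our own simplification, not how ODI agents are actually started or stopped: the class names are hypothetical, and "promote the standby, then start a fresh one" is an assumed reading of "second agent acted as redundancy".

```python
MAX_EXECUTIONS = 50  # agents were killed after 50 executions


class ChildAgent:
    """Stand-in for one agent process; only counts its executions."""

    def __init__(self, name):
        self.name = name
        self.executions = 0

    def run(self, job):
        self.executions += 1
        return f"{self.name}:{job}"


class AgentSlot:
    """A primary agent plus a standby that takes over when it is retired."""

    def __init__(self, slot_id):
        self.slot_id = slot_id
        self.generation = 0
        self.restarts = 0
        self.primary = self._spawn()
        self.secondary = self._spawn()

    def _spawn(self):
        self.generation += 1
        return ChildAgent(f"{self.slot_id}-g{self.generation}")

    def run(self, job):
        if self.primary.executions >= MAX_EXECUTIONS:
            # Retire the primary, promote the standby, start a new standby.
            self.primary = self.secondary
            self.secondary = self._spawn()
            self.restarts += 1
        return self.primary.run(job)


slot = AgentSlot("A1")
results = [slot.run(job) for job in range(120)]
print(results[0], results[50], results[100], slot.restarts)
```

Because a standby already exists at the moment the primary is retired, the slot never has to wait for a new agent to start before taking the next job.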
Impacts of Multiple Agents
• Memory
  ‣ Parent agent 1024MB
  ‣ Child agent 512MB
• Total number of agents used
  ‣ 3 parents
  ‣ 18 child (primary) and 18 child (secondary)
  ‣ Total: 39 = 21504MB (approx 21GB)
• However we didn’t get anywhere near linear scaling
  ‣ Max TPS = 176
  ‣ Max queue TPS = 480
• Second option is to increase the number of queues
  ‣ Split by functional area
Connections: Every time an ODI agent read from the queue a new connection was created and destroyed. There didn’t seem to be any pooling.
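The memory total on this slide follows directly from the agent counts. The snippet below reproduces it; the final ratio assumes that "Max TPS" is the best rate the agents achieved and "Max queue TPS" is the peak arrival rate on the queue, which is our reading of the slide rather than something it states.

```python
PARENT_MB = 1024
CHILD_MB = 512

parents = 3
primary_children = 18
secondary_children = 18

total_agents = parents + primary_children + secondary_children
total_mb = parents * PARENT_MB + (primary_children + secondary_children) * CHILD_MB
total_gb = total_mb / 1024

# Assumed interpretation: processing rate achieved vs. peak queue rate.
max_tps = 176
max_queue_tps = 480
shortfall = max_queue_tps / max_tps

print(f"{total_agents} agents using {total_mb}MB ({total_gb:.0f}GB)")
print(f"Peak queue rate is {shortfall:.2f}x the achieved rate")
```

On that reading, closing the gap by adding agents alone would take far more memory than the shared Exadata server could spare, which is why splitting the queues by functional area is raised as the second option.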
ODI Logging
• ODI Logging
  ‣ The ODI processes create 900GB of log files a day
  ‣ ODI logging needs high IOPS
  ‣ Exadata, by default, does not allocate enough IOPS resource to the ODI logging
  ‣ ODI logging then becomes a limiting factor on the database performance
  ‣ Target is SNP_SESS_TASK_LOG
  ‣ Log writer process cannot keep up
  ‣ The number of active processes means the database will be performing as hard as it can, and more activity will slow everything down
ODI does the same logging at every level; lower levels of logging only reduce the volume that is preserved. So in fact, lower levels of logging could be more I/O demanding, as more data is deleted.
SNP_SESS_TASK_LOG
• The most demanding SQL statement on the system is, and always has been, the update to SNP_SESS_TASK_LOG
• This table holds 3 CLOB columns. The update is "lazy": all columns are updated each time. Thus each update can potentially update:
  ‣ the table
  ‣ three CLOB indexes
  ‣ three CLOB tables
ODI Logging Impact
• Initially the system was throttled on SNP_SESS_TASK_LOG
• ODI IOPS demand was maxing out the physical disc IOPS capability of the box
• Resolution: move SNP_SESS_TASK_LOG and SNP_SESS_TASK to a new ASM diskgroup created from the SSD storage in Exadata
A simple solution to the ODI logging problem is to move the database to another server
ODI temporary tables
• The ODI real-time processing creates a large number of I$ and other internal tables
• Once the processing around these is complete, they are put in the Recycle Bin
• The Recycle Bin becomes either full or unmanageable
There is a wider issue here that affects scalability of the whole solution, discussed in next slides
Future Challenges
Scalability
• The main bottleneck we are experiencing is I/O
  ‣ High number of writes to the database
  ‣ ODI $ internal tables
  ‣ ODI logging
• We should address this problem by making better use of memory
• We also have a constraint on the amount of memory Exadata can provide the agents
  ‣ Any allocated memory has the opportunity cost of not being used by the database
• We should also explore other ‘logical’ approaches to solving this problem
Accessing data in memory reduces the I/O read activity when querying the data, which provides faster and more predictable performance than disk
Process Flexibility
• The source system splits the data into real time and batch
  ‣ ESB provides 2 separate queues
• The XSD is the same for both queues
• The processing for both queues is constantly running in a loop
  ‣ The batch queue is much larger than the real time queue
  ‣ The foundation layer requires data from both
• The data from each queue lands in the same stage tables partitioned by the queue name
• Entire process controlled by maintaining BATCH_IDs
[Diagram: the source systems feed splits into non-critical data and real-time data. Each stream passes through a DQ process into the event stage tables in the Stage schema, tracked by CTL_EVENT (batch_id, batch_type, ODS_processed, RTF_processed, STG_processed). A CDC process for each stream populates the STG_ tables. ODS load processing fills the foundation layer tables in the Foundation schema; real-time processing fills the reporting tables in the Performance schema.]
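A control table of this shape lets the two downstream processes consume the same staged batches independently. The sketch below uses SQLite purely so that it runs anywhere. The column names come from the CTL_EVENT box in the diagram, but the batch type values, the 0/1 flag encoding and the rule that a batch must be staged before either process picks it up are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ctl_event (
        batch_id      INTEGER PRIMARY KEY,
        batch_type    TEXT    NOT NULL,
        stg_processed INTEGER NOT NULL DEFAULT 0,
        rtf_processed INTEGER NOT NULL DEFAULT 0,
        ods_processed INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.executemany(
    "INSERT INTO ctl_event (batch_id, batch_type, stg_processed) VALUES (?, ?, ?)",
    [(1, "REALTIME", 1), (2, "BATCH", 1), (3, "REALTIME", 0)],
)

DOWNSTREAM_FLAGS = {"rtf_processed", "ods_processed"}


def pending_batches(conn, flag):
    """Batches that are staged but not yet handled by the given process."""
    if flag not in DOWNSTREAM_FLAGS:
        raise ValueError(f"unknown flag: {flag}")
    rows = conn.execute(
        f"SELECT batch_id FROM ctl_event "
        f"WHERE stg_processed = 1 AND {flag} = 0 ORDER BY batch_id"
    ).fetchall()
    return [row[0] for row in rows]


def mark_processed(conn, flag, batch_id):
    """Record that one process has finished with a batch."""
    if flag not in DOWNSTREAM_FLAGS:
        raise ValueError(f"unknown flag: {flag}")
    conn.execute(f"UPDATE ctl_event SET {flag} = 1 WHERE batch_id = ?", (batch_id,))


rtf_before = pending_batches(conn, "rtf_processed")
mark_processed(conn, "rtf_processed", 1)
rtf_after = pending_batches(conn, "rtf_processed")
ods_pending = pending_batches(conn, "ods_processed")
print(rtf_before, rtf_after, ods_pending)
```

Because each process has its own flag, the real-time load can run ahead of the ODS load during a peak without either losing track of which batches it still owes.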
In-Memory Processing
• Will require re-writing of the Knowledge Modules
  ‣ Should also persist connections
• Option 1: Remove the writes to the $ tables and attempt to do more operations on-the-fly
  ‣ Potential loss of audit trail and reconciliation points
  ‣ Currently all outer joins are materialised
  ‣ Will need to perform
• Option 2: Use In-Memory database?
Exalytics?
Conclusion
• The Oracle Reference Data Warehouse architecture can support real time event driven ETL, however it may need modifications
• ODI has some rough edges and kinks that need to be ironed out for it to act at this kind of enterprise level
• Don’t underestimate the effort of doing a data migration
• It’s important to understand the implications and differences of middleware centric data models and processing compared with database centric ones
The objectives of the project were achieved. The resulting data is being used on a daily basis and providing significant value to the organisation
Questions?