Five Steps to Mastering Master Data Management
Ron Lewis November 19, 2009
Presentation Overview
• Introduction
• What is Master Data Management?
• The 5 Steps for Master Data Management:
• Discovery – finding all of the data sources, who they are used by and how they are used
• Analysis – identifying authoritative sources, discrepancies, and candidates for consolidation
• Design – designing the metadata repository
• Implementation – implementing a metadata repository
• Establish data governance
• Leveraging Technology to facilitate:
• Business Process and Data Modeling
• Data Governance and Discovery
• Metadata Repository Implementation
• Metadata Management
• Presentation Focus: The Discovery and Analysis Phases
19/11/2009
Master Data Management
• Master Data Management
• Master Data is: Principal business data essential for conducting business
• MDM provides an enterprise perspective on the critical Business Processes and the Data necessary to support them
• Bottom line: Improve decision making
• Core Tasks
• Building the Business Process Models
• Data Governance (Standardizing data - nomenclature, domains, data quality and consumption rules)
• Synchronizing related operational systems using the data
• Integrating/reconciling disparate data silos to provide single enterprise view
• Building and managing an enterprise metadata repository
• Challenge: Must Shift Thinking to the Enterprise Perspective
11/15/2009
Discovery Phase
• Step 1 – Discovery
• Capturing and modeling the essential business processes
• Mapping processes to the data necessary to complete each process successfully
• Identifying data sources and gathering appropriate metadata
• Primary Challenges
• Cost - It’s Expensive and Disruptive
• Gaining Executive Leadership Support – (“You mean we don’t have this already?”)
• Solution
• Start with what’s most important
• What’s important should be obvious
Discovery Phase
• Involve your infrastructure and/or security personnel
• Iteration I: Capture existing data and schemas
• Find your database servers, respective owners and access
• Reverse engineer your physical data models
• Build a master data dictionary and catalog
• Iteration II: Profile existing applications to help with business
• Database Centric: ETL, Stored Procedures, and Triggers
• Application Source Code and User Behavior
• Tools You’ll Need
• Infrastructure/security tools (Nessus)
• Data Modeling and Profiling tools (ER/Studio Data Architect/DBOptimizer)
• Application Profiling tools (NitroSecurity APM)
• Repository to manage the metadata byproducts
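Iteration I (capture schemas, then build a data dictionary and catalog) can be sketched in a few lines of Python. This is only an illustration of the idea, not the output of ER/Studio or any other tool named above: it uses the standard library's sqlite3 module with an in-memory database as a stand-in for a discovered server, and the `build_catalog` helper, the table names, and the catalog layout are all assumptions made for the example.

```python
import sqlite3

def build_catalog(conn, source_name):
    """Return one catalog row per column found in the connected database."""
    catalog = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            catalog.append({
                "source": source_name,
                "table": table,
                "column": column,
                "type": col_type,
                "nullable": not notnull,
                "primary_key": bool(pk),
            })
    return catalog

# Hypothetical stand-in for one discovered database server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT NOT NULL);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, total REAL);
""")
catalog = build_catalog(conn, "crm_db")
for row in catalog:
    print(row["source"], row["table"], row["column"], row["type"])
```

Running the same function against each discovered source and concatenating the results gives a single list that can serve as a starting point for the master catalog.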
Infrastructure / Security Tooling
Use ER Studio to Reverse Engineer
Reverse Engineer Physical Schemas
Example Reverse Engineered Model
Start Building Master Data Catalog
Exporting Catalog for Sharing
Discovery – Profiling Data Use
• Biggest Challenges We’re Solving:
• Reconciling and integrating disparate “Data Silos” into a central location
• Identifying duplicative data elements (or attributes)
• Laying the foundation for identifying which of the data sources contain the actual “source data”
• High Percentage of Business Logic is encapsulated as Programming Logic
• Stored Procedures and Trigger code stored in the database
• Application Source Code
• Extract, Transform, and Load Scripts
• We need visibility to this logic, and we need to be able to store it somewhere
• Tools necessary for this:
• DSAuditor and DB Optimizer or Performance Center (to capture live data use)
• Source Code Analyzers (I like Fortify SCA and Embarcadero JBuilder)
• Profile ETL using Embarcadero’s MetaWizard (usually convert ETL to XML)
• Store metadata in ER/Studio Data Architect’s Data Lineage and Transform Rules Support
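As a rough illustration of gaining visibility into logic held in stored procedures, the Python sketch below scans a procedure body for the tables it reads and writes. It is not how DB Optimizer or the source code analyzers above work; the regular expression is a deliberately simple heuristic (for instance, it would count the table in a DELETE FROM as a read), and the `tables_used` helper and the sample procedure are hypothetical.

```python
import re
from collections import defaultdict

# Matches the identifier that follows FROM, JOIN, INTO, or UPDATE.
# A regex is a rough heuristic; a real SQL parser would be more reliable.
TABLE_REF = re.compile(
    r"\b(FROM|JOIN|INTO|UPDATE)\s+([A-Za-z_][\w.]*)", re.IGNORECASE
)

def tables_used(sql_text):
    """Map each referenced table to the set of usage modes seen for it."""
    usage = defaultdict(set)
    for keyword, table in TABLE_REF.findall(sql_text):
        mode = "write" if keyword.upper() in ("INTO", "UPDATE") else "read"
        usage[table.lower()].add(mode)
    return dict(usage)

# Hypothetical stored procedure body.
proc = """
CREATE PROCEDURE refresh_totals AS
BEGIN
    INSERT INTO sales_summary (cust_id, total)
    SELECT o.cust_id, SUM(o.total)
    FROM orders o JOIN customer c ON c.cust_id = o.cust_id
    GROUP BY o.cust_id;
    UPDATE customer SET last_refresh = CURRENT_DATE;
END
"""
usage = tables_used(proc)
for table in sorted(usage):
    print(table, sorted(usage[table]))
```

Storing this per-procedure usage next to the catalog entries is one way to record which code touches which data element.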
Profiling Data Use with DBOptimizer
Analysis Phase
• Step 2 – Analysis
• Identifying authoritative sources, discrepancies, and candidates for consolidation
• Evaluating Data Flow and Transform Rules
• Capturing/Defining Synonyms and Assigning Aliases
• Setting the Foundation for Data Governance
• Primary Challenges
• Cost – It’s Time Consuming and is a “Team Effort”
• Getting ancillary information that teams don’t want to share
• Solution
• Start with what’s most important
• What’s important should be obvious
Analysis Phase
• Iteration I: Evaluate ETL for data lineage and transform rules
• Start by reverse engineering the ETL, converting it to XML
• Incorporate it into the repository
• Iteration II: Identify synonymous elements and build alias list
• Evaluate data domains and transform rules for issues such as state and use
• Enlist database and development staff to identify aliases and tag the data elements in the master catalog
• Tools You’ll Need
• Data Modeling tools (ER/Studio and MetaWizard)
• Repository to manage the metadata byproducts (ER/Studio)
Analysis Phase – Evaluating ETL
• Biggest Challenges We’re Solving:
• Finding which data source is feeding what other data sources
• Collecting Data Lineage metadata
• Making it accessible to the right team members
• Convert the ETL to a form that allows manipulation (such as XML)
• Import the metadata into the data modeling tool
• Build, publish and control access to your master data repository
• Start gathering and applying metadata tags
• Tools necessary for this:
• MetaWizard
• ER/Studio Data Architect (or the like)
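Once ETL is in XML, finding which source feeds which becomes a graph problem. The Python sketch below shows the idea under stated assumptions: the `<mapping>` element and its `source`, `target`, and `rule` attributes are a made-up format for illustration, not the XML that MetaWizard produces, and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical ETL description; real tool output will look different.
ETL_XML = """
<etl>
  <mapping source="crm_db.customer.cust_name" target="dw.dim_customer.name" rule="TRIM, UPPER"/>
  <mapping source="erp_db.client.client_nm" target="dw.dim_customer.name" rule="TRIM"/>
  <mapping source="dw.dim_customer.name" target="mart.report.customer_name" rule="copy"/>
</etl>
"""

def lineage_edges(xml_text):
    """Return (source, target, rule) tuples, one per mapping element."""
    root = ET.fromstring(xml_text)
    return [
        (m.get("source"), m.get("target"), m.get("rule"))
        for m in root.iter("mapping")
    ]

def upstream(edges, element):
    """Return every element that feeds `element`, directly or indirectly."""
    found, stack = set(), [element]
    while stack:
        current = stack.pop()
        for source, target, _rule in edges:
            if target == current and source not in found:
                found.add(source)
                stack.append(source)
    return found

edges = lineage_edges(ETL_XML)
feeds = upstream(edges, "mart.report.customer_name")
# Elements that are never a target: candidates to review as original sources.
origins = {s for s, _t, _r in edges} - {t for _s, t, _r in edges}
print(sorted(feeds))
print(sorted(origins))
```

Elements that appear only as sources are candidates to review when looking for authoritative sources; the lineage alone does not prove which one is authoritative.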
Data Lineage and Transform Rules
Setting the Foundation for Governance
Analysis Phase – Identifying Synonyms
• Biggest Challenges We’re Solving:
• Identifying like data elements and candidates for consolidation
• Building Aliases
• Establishing the foundation for Data Governance
• Evaluate data nomenclature using tool functions such as Merge and Compare to identify the obvious overlaps
• Compare descriptors from database staff
• Compare data use and consumption rules derived from tools such as DB Optimizer
• Tools necessary for this:
• ER/Studio Data Architect (or the like)
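The "obvious overlaps" in nomenclature can be approximated by normalizing names before comparing them. The Python sketch below is a stand-in for the idea only, not ER/Studio's Merge and Compare functions: the abbreviation map, the `normalize` and `synonym_candidates` helpers, and the sample catalog rows are all assumptions, and in practice the abbreviation list would come from database and development staff.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation map; a real one would come from database staff.
ABBREVIATIONS = {
    "cust": "customer",
    "nm": "name",
    "num": "number",
    "no": "number",
    "addr": "address",
}

def normalize(column_name):
    """Reduce a column name to a canonical tuple of lowercase tokens."""
    # Insert a space at camelCase boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", column_name)
    tokens = re.split(r"[^A-Za-z0-9]+", spaced)
    return tuple(ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens if t)

def synonym_candidates(catalog):
    """Group fully qualified columns whose normalized names match."""
    groups = defaultdict(list)
    for source, table, column in catalog:
        groups[normalize(column)].append(f"{source}.{table}.{column}")
    return {key: names for key, names in groups.items() if len(names) > 1}

# Hypothetical rows taken from a master catalog.
catalog = [
    ("crm_db", "customer", "cust_name"),
    ("erp_db", "client", "CustomerName"),
    ("billing", "account", "CUST_NM"),
    ("billing", "account", "acct_no"),
]
candidates = synonym_candidates(catalog)
for key, names in candidates.items():
    print(" ".join(key), "->", names)
```

Matching names only suggest synonyms; the descriptor and data-use comparisons listed above are still needed before assigning an alias.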
Performing Analysis With Compare Utility
Exporting to Excel for Input into Database
Candidates for Consolidation
Step 3 Building the Repository
• Step 3 – Building the Metadata Repository
• Populating the Repository with the right metadata
• Establishing and Controlling Access to the metadata
• Performing metadata management
• Primary Challenges
• Defining who needs access to what metadata
• Establishing the rules of use
• Suggestions
• Implement a change control and auditing tool
• What’s important should be obvious
• Understand the impact of the metadata on profitability
Step 4 Implementing the repository
• Step 4 - Implementing the repository
• Mapping the metadata to the requisite business processes
• Leveraging the metadata to determine candidates for business process re-engineering
• Primary Challenges
• Getting the processes down in modeled form
• Obtaining Middle Level Management and Senior Leadership buy-in to changes identified by metadata
• Suggestions
• Leverage a modeling tool that facilitates data to process mapping (integrated metadata)
• Focus on what’s most important to the business; try not to focus on EVERYTHING
Step 5 Establishing Data Governance
• Step 5 – Establishing Data Governance
• All of the above steps lay the foundation for good data governance
• Get Senior Leadership to stipulate policy enforcing the rules you’ve derived
• Build a Plan and Standardize Iteratively – (don’t try to fix everything all at once)
• Primary Challenges
• Fundamental Opposition to Change
• Maintaining Momentum
• Suggestions
• Find a quick kill – tackle the biggest organizational problem you can handle
• Focus on what’s most important to the business, and what drives easily visible ROI
Summary
• What We Covered:
• Defined Master Data and Master Data Management
• The 5 Steps for Master Data Management:
• Discovery – finding all of the data sources, who they are used by and how they are used
• Analysis – identifying authoritative sources, discrepancies, and candidates for consolidation
• Design – designing the metadata repository
• Implementation – implementing a metadata repository
• Establish data governance
• Demonstrated how to leverage specific technology to facilitate:
• Business Process and Data Modeling
• Data Governance and Discovery
• Metadata Repository Implementation
• Metadata Management
Questions and Answers
• Tools Discussed:
• Nessus
• ER/Studio Data Architect / Business Architect and ER/Studio Repository
• DBOptimizer
• Change Manager
• Technologies Discussed:
• Building the Data Catalog
• Capturing and Storing Metadata
• Metadata Analysis
• Contact Info: Ron Lewis, [email protected]