
APPLICATION OF DATA WAREHOUSING AND DATA MINING

Submitted to: Mr. GAURAV CHANDIOK, Department of Information Technology

Submitted by: M.N.SRIKANTH (C-36) SARAN KUMAR REDDY DUVVURU (C-38) SRI HARSHA LOKKINEDDI (C-51)


CERTIFICATE

This is to certify that the project work titled APPLICATIONS OF DATA WAREHOUSING AND DATA MINING, being submitted by Mr. M.N.SRIKANTH, Mr. SARAN KUMAR REDDY DUVVURU and Mr. SRI HARSHA LOKKINEDDI as a part of the course MBA (General), is a record of bona fide work carried out by us under the guidance and supervision of Mr. GAURAV CHANDIOK at the DEPARTMENT OF INFORMATION TECHNOLOGY, AMITY BUSINESS SCHOOL, UTTAR PRADESH.

Mr. M.N.SRIKANTH Mr. SARAN KUMAR REDDY DUVVURU Mr. SRI HARSHA LOKKINEDDI


ACKNOWLEDGEMENT

We wish to express our deep sense of gratitude to Mr. Rokkam Sai Sankar, Principal Solutions Engineer, Oracle, for his able guidance and useful suggestions, which helped us a lot in completing this project work.

M.N.SRIKANTH SARAN KUMAR REDDY DUVVURU SRI HARSHA LOKKINEDDI


INDEX

Title page
Certificate
Acknowledgement

TABLE OF CONTENTS
1. Introduction
2. Applications
3. Business Aspects
4. Technical Aspects
5. SWOT Analysis

APPENDIX
1. Articles of reference
2. Papers of reference


1. INTRODUCTION

Need for Data Warehousing and Data Mining

Data collection and database creation (1960s and earlier)
- Primitive file processing

Database management systems (1970s)
- Network and relational database systems
- Data modeling tools
- Indexing and data organization techniques
- Query languages and query processing
- User interfaces
- Optimization methods

Advanced database systems (mid-1980s to present)
- Advanced data models: extended-relational, object-oriented, object-relational
- Application-oriented: spatial, temporal, multimedia, active, scientific, knowledge bases, World Wide Web

Data warehousing and data mining (late 1980s to present)
- Data warehouse and OLAP technology
- Data mining and knowledge discovery

New generation of information systems (2000 onward)

Fig 1: Evolution of database technology

Data warehousing: A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources. In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users. A common way of introducing data warehousing is to refer to the characteristics of a data warehouse. They are:

- Subject Oriented: Data warehouses are designed to help you analyze data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.
- Integrated: Integration is closely related to subject orientation. Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure. When they achieve this, they are said to be integrated.
- Nonvolatile: Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.
- Time Variant: In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.
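To make the "integrated" and "subject oriented" ideas concrete, here is a minimal Python sketch (source layouts and field names are invented for illustration) that consolidates two transaction extracts disagreeing on naming and units into one consistent format, then answers the sample question above:

from collections import defaultdict

# Two hypothetical source extracts with naming and unit conflicts:
# system A reports amounts in dollars, system B in cents.
orders_a = [
    {"cust": "Acme Corp", "item": "widget", "amount_usd": 1200.0, "year": 2004},
    {"cust": "Bell Inc",  "item": "widget", "amount_usd":  800.0, "year": 2004},
]
orders_b = [
    {"customer_name": "Acme Corp", "product": "widget", "cents": 50_000, "yr": 2004},
]

def integrate(a_rows, b_rows):
    # Resolve naming conflicts and unit inconsistencies into one format.
    unified = []
    for r in a_rows:
        unified.append({"customer": r["cust"], "item": r["item"],
                        "sales": r["amount_usd"], "year": r["year"]})
    for r in b_rows:
        unified.append({"customer": r["customer_name"], "item": r["product"],
                        "sales": r["cents"] / 100.0, "year": r["yr"]})
    return unified

def best_customer(warehouse, item, year):
    # "Who was our best customer for this item last year?"
    totals = defaultdict(float)
    for row in warehouse:
        if row["item"] == item and row["year"] == year:
            totals[row["customer"]] += row["sales"]
    return max(totals.items(), key=lambda kv: kv[1])

warehouse = integrate(orders_a, orders_b)
print(best_customer(warehouse, "widget", 2004))  # ('Acme Corp', 1700.0)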

Fig 2: Architecture of Data Warehousing


Data mining: The major reason that data mining has attracted a great deal of attention in the information industry in recent years is the wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge. The information and knowledge gained can be used for applications ranging from business management, production control, and market analysis to engineering design and science exploration. Data mining can be viewed as a result of the natural evolution of information technology. An evolutionary path has been witnessed in the database industry in the development of the following functionalities: data collection and database creation; data management (including data storage and retrieval, and database transaction processing); and data analysis and understanding (involving data warehousing and data mining). For instance, the early development of data collection and database creation mechanisms served as a prerequisite for the later development of effective mechanisms for data storage and retrieval, and query and transaction processing. With numerous database systems offering query and transaction processing as common practice, data analysis and understanding has naturally become the next target.

Data mining refers to extracting or "mining" knowledge from large amounts of data. The term is actually a misnomer. Remember that the mining of gold from rocks or sand is referred to as gold mining rather than rock or sand mining. Thus, "data mining" should more appropriately have been named "knowledge mining from data", which is unfortunately somewhat long. "Knowledge mining", a shorter term, may not reflect the emphasis on mining from large amounts of data. Nevertheless, mining is a vivid term characterizing the process of finding a small set of precious nuggets in a great deal of raw material. Thus, this misnomer, which carries both "data" and "mining", became a popular choice. There are many other terms carrying a similar or slightly different meaning to data mining, such as knowledge mining from databases, knowledge extraction, data/pattern analysis, data archaeology, and data dredging. Many people treat data mining as a synonym for another popularly used term, "Knowledge Discovery in Databases", or KDD. Alternatively, others view data mining as simply an essential step in the process of knowledge discovery in databases. It consists of an iterative sequence of the following steps:

- Data cleaning (to remove noise or irrelevant data)
- Data integration (where multiple data sources may be combined)
- Data selection (where data relevant to the analysis task are retrieved from the database)
- Data transformation (where data are transformed or consolidated into forms appropriate for mining, by performing summary or aggregation operations, for instance)
- Data mining (an essential process where intelligent methods are applied in order to extract data patterns)

- Pattern evaluation (to identify the truly interesting patterns representing knowledge based on some interestingness measures)
- Knowledge presentation (where visualization and knowledge representation techniques are used to present the mined knowledge to the user)
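As a rough illustration of how these steps chain together, the following Python sketch (toy data and field names invented for illustration, not from any real admissions dataset) runs a miniature version of the cleaning, integration, transformation and mining steps end to end:

from collections import Counter

# Hypothetical toy records from two sources: (test_score, status); None marks noise.
raw_source_1 = [(620, "Enrolled"), (480, "Not enrolled"), (None, "Enrolled")]
raw_source_2 = [(710, "Enrolled"), (390, "Not enrolled")]

def clean(rows):
    # 1. Data cleaning: drop noisy/irrelevant records.
    return [r for r in rows if r[0] is not None]

def integrate(*sources):
    # 2. Data integration: combine multiple data sources.
    return [row for src in sources for row in src]

def transform(rows):
    # 3./4. Data selection and transformation: keep the relevant fields and
    # consolidate raw scores into coarse bins suitable for mining.
    return [("high" if score >= 500 else "low", status) for score, status in rows]

def mine(rows):
    # 5. Data mining: extract a simple pattern (enrollment rate per score bin).
    counts = Counter(rows)
    bins = {b for b, _ in rows}
    return {b: counts[(b, "Enrolled")] /
               sum(v for (bb, _), v in counts.items() if bb == b)
            for b in bins}

# 6./7. Pattern evaluation and knowledge presentation: report the rates.
patterns = mine(transform(integrate(clean(raw_source_1), clean(raw_source_2))))
for bin_name, rate in sorted(patterns.items()):
    print(f"score={bin_name}: enrollment rate {rate:.0%}")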

Fig 3: Data mining as a process of knowledge discovery

Fig 4: Architecture of a typical data mining system

2. APPLICATIONS

Data warehousing: The Hyperion methodology, used here, gives a general overview of Hyperion's data warehousing approach and highlights the benefits to its clients.

Problem statement (Data warehousing): Early data warehousing efforts focused on separating the decision-support environment from operational transaction-processing systems. The ultimate goal was to create a centralized source of data for accurate, consistent reporting. Success came when individuals from different parts of the enterprise could attend the same meeting, each equipped with a common set of figures. The greatest cost of implementing data warehouses comes from the process of extracting, transforming and integrating data from source systems; a typical data warehouse implementation is 70 percent or more extraction, transformation and loading (ETL).

Initial industry attention centered on identifying sound architectural approaches to warehouse construction and management. This was essential to address the IT department's challenge of managing change as new source systems, business rules and subject areas were added to the warehouse project. Today, data warehousing has entered a new era. The lessons gained since data warehousing became mainstream more than a decade ago have largely resolved the architectural challenges, enabling organizations to place more attention on the ultimate purpose of delivering value to end users. Today's data warehouses have moved far beyond the executive information systems of the early 1990s and are serving as strategic tools for many departments in an organization, such as marketing, sales, finance, operations and human resources. This increased role is driving the need for data access and analytical capabilities, and increases the number of individuals in the organization directly interacting with and benefiting from the data warehouse. IT managers can no longer afford to limit warehouse utilization to a set of standard queries and reports made available to an elite group of users.

Data mining: This methodology focuses on using Naïve Bayes, one of the data mining algorithms shipped in the box with Analytic Services, to develop a model that solves a typical business problem in the admissions department of an academic university. The paper details the approach taken by the user to solve the problem and explains the various steps performed using Analytic Services in general, and the Analytic Services Data Mining Framework in particular, in arriving at the solution.

Problem statement (Data mining): One of the problems related to managing admissions that typical universities face is being able to predict with reasonable accuracy the likelihood that an applicant will eventually enroll in an academic program. Universities typically incur considerable expense in promoting their programs and in following up with prospective candidates. Identifying applicants with a higher likelihood of enrollment will help the university channel the promotional expenditure in a more gainful way. Candidates typically apply to more than one university to widen their chances of getting enrolled within that academic year, so universities that can quickly arrive at a decision on an applicant stand a higher chance of getting acceptance from candidates. The University collects a variety of data from applicants as part of the admissions process: demographic, geographic, test scores, financial information, etc. In addition, the admissions department at the University has acceptance information from the previous year's admissions process. The problem at hand is to use all this available data to predict whether an applicant will choose to enroll or not. The University is also interested in analyzing the composite factors influencing the enrollment decision. This additional analysis is useful in adjusting the admissions policy at the university and in ensuring effective cost management in the admissions department.


Available data

Data Warehousing:


Data mining:


3. BUSINESS ASPECTS

Data Warehousing: Competitive pressures and a thriving new economy have driven thousands of businesses around the world to make enormous investments in their information technology solutions. Investments in enterprise resource planning (ERP) software and customer relationship management (CRM) solutions are only two examples of the new trends. Businesses use ERP software to improve operational efficiency and increase profitability, and CRM solutions to automate customer interaction processes that improve customer service and increase sales productivity. More recently, the Internet and electronic commerce have revolutionized business. Many organizations are pioneering innovative Web-based approaches to educate customers and offer more convenient services, and in the process they collect a wealth of information about those customers. However, with these emerging opportunities to reach new markets and expand global presence comes a new set of business challenges. Many traditional methods of buying, selling and servicing customers are being transformed literally overnight. Established companies are suddenly finding themselves faced with the need to embrace change quickly or perish. The fact is that in today's modern technological world, companies are data rich and information poor. The growing need to anticipate changing market conditions and customer preferences, develop intelligent business plans, and act proactively is fueled by the availability of critical information. And because so much of that essential information is locked away within ERP, CRM and other transaction-oriented systems, the need for data warehousing is more compelling than ever. Without the ability to move from data to information to knowledge to action, companies cannot stay competitive in today's ever changing economy.

Companies spend a good portion of their time reporting on data contained within source and legacy systems, on activities that have already occurred. Much time, energy and effort is expended with limited value. As companies move data into a consolidated data store, they gain the ability to begin doing analysis on the information. This allows the creation of new sources of information about customers, market segments, product profiles and competitive information. As companies begin to amass a set of analytical tools, the ability to model what-if scenarios moves the newly acquired information to the knowledge stage. Here companies gain the ability to do sophisticated and informed strategic planning. In the final stage, companies use the acquired knowledge to plan future activities and to chart courses of action. Data warehousing is the process that enables data to be transformed into meaningful business information. Organizations seeking to gain competitive advantage and increase revenue are driving the explosive growth in the data warehousing market.

Data mining: Data Mining is the process of knowledge discovery involving finding hidden patterns and associations, constructing analytical models, performing classification and prediction, and presenting mining results. Data Mining is one of the functional groups that is offered with Hyperion System 9 BI+ Analytic Services – a highly scalable enterprise class architecture analytic server (OLAP). The Data Mining Framework within Analytic Services integrates data mining functions with OLAP and provides the users with highly flexible and extensible on-line analytical mining capabilities. On-line analytical mining greatly enhances the power of exploratory data analysis by providing users with the facilities for data mining on different subsets of data at different levels of abstraction in combination with the core analytic services like drill up, drill down, pivoting, filtering, slicing and dicing – all performed on the same OLAP data source.


4. TECHNICAL ASPECTS

Data warehousing:

Hyperion Platform for Analytic Applications

Hyperion is the leading provider of analytic application software that helps organizations extend their data warehouse assets. Hyperion's analytic solutions platform includes packaged analytic applications, OLAP server technology, data and application integration technologies, and a family of tools for client/server and Web-enabled reporting, analysis, presentation and application development. A comprehensive range of tightly integrated third-party tools and applications extends Hyperion's analytic platform, maximizing customer choice and flexibility, and delivering the lowest total cost of ownership. Hyperion products integrate well with existing applications, and use of the products is not limited to a particular architecture or infrastructure. Hyperion leverages clients' existing investments in software and tools while extending the capability to mine and extract useful information from their warehouse.

Hyperion Essbase

Virtually all data warehouse best-practice methodologies embrace variants of what is often referred to as a "hub and spoke" architecture. Data warehouses function as the hub, or staging area, for feeding application-specific data marts. Relational databases store and manage ever increasing volumes of detailed data in the warehouse. That data is extracted, transformed, and integrated from multiple source systems. Specialized extraction, transformation and load tools often automate the process of developing and maintaining data warehouses. As the "single source of the truth," the data warehouse feeds analytic application data marts and Hyperion® Essbase® applications.

The storage and calculation requirements for analytic applications are notably different from the data management requirements of the data warehouse. Analytic applications must be easy to use and quick to deploy, store and present information in an intuitive fashion, perform specialized sophisticated calculations and deliver speed-of-thought responses to end-user queries. Unlike relational databases, OLAP servers are designed specifically for this purpose. International Data Corporation, META Group, and other industry research firms recognize Hyperion Essbase as the market-leading OLAP server for analytic applications in the data warehousing market. As an enterprise-class solution, Hyperion Essbase is designed for performance, scalability and ease of deployment on all popular Unix, Windows and midrange server platforms. With a published application programming interface that is supported by more than 80 third-party tools, Hyperion Essbase is an open solution that provides users with an extensive range of choices. These advantages become increasingly important as the scope of a warehousing project expands. Success spawns more concurrent users, more applications, more sophistication within applications and larger data volumes. Hyperion Essbase scales to meet these demands, delivering sub-second query response times even with thousands of concurrent users accessing the same information on the same server. Audited, industry-standard benchmarks consistently demonstrate the superior performance and scalability of Hyperion Essbase compared to other OLAP servers.

Analytic applications, such as product and customer profitability analysis, often require calculations that are not practical using SQL tools against relational databases. For example, a retail planner may need to understand the percent of revenue share per product and product family for 50,000 products and 500 product families. A marketing manager may want to rank aggregate revenues by customer for 25,000 customers, by month, quarter and year. The powerful calculation features of Hyperion Essbase make it possible to answer such questions.

As data warehousing projects roll out to geographically and functionally dispersed divisions of the enterprise, the ability to run applications on different hardware platforms becomes a critical requirement. Hyperion Essbase is supported on multiple server platforms, including Windows 95, 98 and Windows NT; Sun Solaris; HP-UX; IBM AIX; and IBM AS/400.
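The revenue-share example translates into only a few lines of conventional code; a hedged Python sketch with invented fact rows is shown below. In Essbase itself such a ratio would typically be expressed as a member formula evaluated across the product hierarchy rather than a hand-written loop:

from collections import defaultdict

# Hypothetical fact rows: (product_family, product, revenue).
facts = [
    ("Gadgets", "widget",   500.0),
    ("Gadgets", "sprocket", 300.0),
    ("Tools",   "hammer",   200.0),
]

# Percent of revenue share per product within its product family: the kind of
# cross-member ratio that is awkward in plain SQL but natural in an OLAP outline.
family_totals = defaultdict(float)
for family, _, revenue in facts:
    family_totals[family] += revenue

for family, product, revenue in facts:
    share = 100.0 * revenue / family_totals[family]
    print(f"{product:10s} {share:5.1f}% of {family} revenue")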


Essbase Integration Services

Data warehousing is an iterative process. As users become familiar with the information that is available, new questions arise, requirements change, and new application needs emerge. Unexpected business situations often create the need for "on-demand" single-use applications. Essbase Integration Services is the link between the data warehouse and/or data marts and analytic applications. It provides a centralized graphical environment that makes it easy to quickly create, change, and manage Hyperion Essbase applications that are tightly integrated with the data warehouse and/or data marts. To further streamline application development, Essbase Integration Services integrates metadata with warehouse development tools from leading vendors, such as Acta, IBM, Informatica and Sagent. To empower end users with the ability to analyze data at all levels, application administrators can tailor applications built using Essbase Integration Services to allow drill-through from derived data in their Hyperion Essbase applications to detailed data in the warehouse or marts. For lights-out production automation, administrators can schedule distributed applications to load data at specified intervals on a full or incremental basis.

Information Delivery

An enterprise data warehousing strategy must provide data access, presentation and analysis options that accommodate the needs of different classes of users. Specialized power users, for example, may require sophisticated client/server tools for applications such as statistical analysis or data mining. Financial analysts often prefer to work with information in spreadsheet applications. Product managers may be better served with Web-enabled interactive tools, and a broader class of end users may be served with Web-enabled read-only access to daily operational information. Hyperion offers a comprehensive suite of client/server and Web-enabled tools for financial reporting, sophisticated interactive analysis, integration with Microsoft Excel and Lotus 1-2-3, and application development. In addition, Hyperion Alliance Partners offer more than 80 Hyperion Essbase-Ready™ tools and applications for visualization, statistics, data mining, production reporting and application development.

Project Management

Hyperion Solutions employs a strong project management methodology, Structured Techniques for Assured Results (STAR®). This methodology provides for staged and phased implementations and ensures proper checkpoints so that both Hyperion and the client project remain on track.

The STAR Methodology:
- Ensures an overall view of the project from the beginning. This ensures that the solutions implemented are in line with the overall project strategy and the project does not become a piecemeal implementation of disjointed parts.
- Allows "sanity checks" for expectations and deadlines. Because the methodology takes a total-project perspective, individual tasks undergo checks to ensure that times and resources make sense in view of the total project. Critical checkpoints ensure that client expectations are in line with what is slated for delivery.
- Creates a centralized escalation point for effective risk management. STAR provides a centralized escalation point for issues or problems that need to be dealt with. This reduces the risk of untimely delays caused by a lack of coordination and an inability to provide timely information on project status.
- Provides the framework for strategic and tactical multiproduct and integration advice. Since there is an overall approach, the methodology ensures that products and activities are done in unison and work in unison to support the total business solution. Multiple products are considered during installation so that products already installed do not conflict or suffer ill effects from the installation of other products.
- Provides the framework for change control leadership. The change control process is well defined, and project issues are dealt with in a timely and expedient fashion. Changes in scope are completely analyzed, consequences are determined, and information is propagated to both the client and the appropriate team members. The impacts of changes are immediately known.
- Provides the framework for rapid response to project issues and situations. The methodology allows issues and day-to-day situations to be dealt with effectively, in short order and with the appropriate team members (both client and internal teams).

TWO-TRACK DATA WAREHOUSE IMPLEMENTATION APPROACH

Open, Best-of-Breed Data Warehousing Solution

Data warehousing is an essential practice for all competitive enterprises. The most successful data warehousing strategies empower users at all levels of the organization with the information they need to understand and optimize business performance. The data warehouse is the central repository for data consolidation, cleansing, transformation and storage in a format best organized for reporting, extraction and data mining. Hyperion Essbase and Essbase Integration Services leverage the information stored in the data warehouse and are critical components of a flexible, sustainable architecture that is specifically designed to maximize analytic performance and comply with data warehousing best practices. Best-of-breed solutions from Hyperion enable organizations to deliver application-specific data marts to a diverse and broad range of constituencies across the enterprise. Hyperion's approach to data warehouse solutions is composed of two complementary tracks. These tracks are often executed in parallel; however, they may be deployed as multiple phases. The first track implements Hyperion Essbase and populates the analytical applications with data directly from source (legacy, transactional or ERP) systems. In Track 2, Hyperion leverages and extends Track 1's capabilities to develop a complete decision support solution, encompassing some or all of the following: extraction/transformation/load (ETL), staging and/or an operational data store (ODS), the data warehouse, data mart(s), and analytical applications (Hyperion Essbase).


Implementation Tracks

Track 1 provides the ability to extract data directly from a source system into a Hyperion Essbase application. This will allow the client to solve some immediate needs within their organization and recognize some immediate return on investment (ROI).

Fig 5: Track 1 allows users to extract data directly from a source system into a Hyperion Essbase application


Fig 6: Track 2 is a complete decision support system

Track 2 extends the solution to include a solid infrastructure for corporate decision support. Hyperion will leverage the data mapping and source extraction effort from Track 1 and extend this capability to include additional business requirements. A decision support system (DSS) will be implemented as a base upon which to cleanse, standardize, consolidate and aggregate the data from the many source systems into one best source of information. From this decision support system, Hyperion will populate the data marts and, using Essbase Integration Services, the Hyperion Essbase analytical applications.


Fig 7: The Total Hyperion Solution facilitates the creation of Hyperion Essbase cubes, allowing users to drill back to information in the data warehouse

The final diagram, labeled "Total Hyperion Solution," depicts the deployment created by a typical Hyperion data warehouse implementation. Under this implementation, the creation of all of the Hyperion Essbase cubes becomes an automated process occurring at scheduled intervals or on demand. This solution also provides the ability for users to drill back from the analytical application(s) to the underlying detail information in the data warehouse.


Data mining:

Preparing mining attributes

The available input data can broadly be of two data types: 'number' or 'string'. However, since measures in Analytic Services are essentially stored in the database in a numerical format, 'string' type input data has to be encoded into 'number' type data before being stored in Analytic Services. For example, if gender information is available as a string stating 'Male' or 'Female', it needs to be encoded into a numeric value, like '1' or '0', before being stored as a measure in the Analytic Services OLAP database.

Mining attributes can be of two types: 'categorical' or 'numerical'. Mining attributes that describe discrete information content, like gender ('Male' or 'Female'), zip code (95054, 94304, 90210, etc.), customer category ('Gold', 'Silver', 'Blue'), or status information ('Applied', 'Approved', 'Declined', 'On Hold'), are termed 'categorical' attribute types. Mining attributes that describe continuous information content, like sales, revenue, or income, are termed 'numerical' attribute types. The Analytic Services Data Mining Framework has the capability of working with algorithms that can handle both categorical and numerical attribute types. Among the algorithms shipped in the box with the Analytic Services Data Mining Framework, the Naïve Bayes and Decision Tree algorithms can handle both categorical and numerical mining attribute types and treat them accordingly.

One of the key steps in data mining is the data auditing or data conditioning phase. This involves putting together, cleansing, categorizing, normalizing, and properly encoding the data. This step is usually performed outside the data mining tool. The effectiveness of the data mining algorithm is largely dependent on the quality and completeness of the source data. In some cases, for various mathematical reasons, the available input data may also need to be transformed before it is brought into a data mining environment. Transformations may sometimes also include splitting or combining input data columns. Some of these transformations may be done on the input dataset outside the Data Mining Framework by using standard data manipulation techniques available in ETL tools or RDBMS environments. For the current case the input data does not need any mathematical transformation, but some encoding is needed to convert data into a format that can be processed within the Analytic Services OLAP environment.

In the current problem at the ABC University, the available set of input data consisted of both 'string' and 'number' data types. The list below gives some of the input data that needed encoding from 'string' type into 'number' type:
- Identity-related data, like Gender, City, State, Ethnicity
- Data related to the application process, like Application Status, Primary Source of Contact, Applicant Type, etc.
- Date-related data, like Application Date, Source Date, etc. (Dates were available in the original dataset as strings, in two different formats, "yymmdd" and "mm/dd/yy", and had to be encoded into a number.)

In the current case study, these encodings were done outside the Analytic Services environment by constructing look-up master tables in which the 'string' type inputs were listed in a tabular format and the records were sequentially numbered. Subsequently, each 'string' type input was referred to by its corresponding numeric identifier during data load into Analytic Services. Table 1 shows a few samples of what such mapping tables look like.

State ID   State name        Application status ID   Application status
1          VT                3                       Applied
2          CA                4                       Offered admission
3          MA                5                       Paid fees
4          MI                6                       Enrolled
5          NH
6          NJ

Table 1: Typical mapping of numeric identifiers
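A minimal Python sketch of this encoding step follows; the state list mirrors Table 1, while the epoch date and helper names are illustrative choices, not part of the case study:

import datetime

# Hypothetical string-typed inputs as described above.
states = ["VT", "CA", "MA", "MI", "NH", "NJ"]

# Build a look-up master table: each distinct string gets a sequential numeric ID.
state_ids = {name: i for i, name in enumerate(states, start=1)}

def encode_state(name):
    return state_ids[name]

# Dates arrived in two formats ("yymmdd" and "mm/dd/yy"); encode both into a
# single numeric form (days since an arbitrary epoch) before loading.
EPOCH = datetime.date(1990, 1, 1)

def encode_date(text):
    for fmt in ("%y%m%d", "%m/%d/%y"):
        try:
            d = datetime.datetime.strptime(text, fmt).date()
            return (d - EPOCH).days
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")

print(encode_state("CA"))       # 2
print(encode_date("040115"))    # same day in the first format...
print(encode_date("01/15/04"))  # ...yields the same numeric encoding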


Preparing the cube

After all the input data has been identified and made ready, the next step is to design an outline and load the data into an Analytic Services cube. In the context of the current case, the Analytic Services outline was created as follows:
- All the input data (measures in the OLAP context) were organized into five groups (a two-level hierarchy in the measures dimension) based on a logical grouping of measures. The details of each measure group are explained in Table 2 below.
- Data load is performed just as it is normally done for any Analytic Services cube.

At this stage we have:
- Designed an Analytic Services cube
- Loaded it with relevant data

It should be noted that the steps described so far are generic to Analytic Services cube building and did not need any specific support from the Analytic Services Data Mining Framework.

Explanation of the five measure groups:
- Measures related to information about the applicants' identity were organized into this group. Some of these measures were transformed from 'string' type to 'number' type to facilitate modeling them within the Analytic Services database context.
- Measures related to various test scores and high school examination results were organized into this group.
- Measures related to the context of the applicant's application processing were organized into this group.
- Measures related to the academic background.
- Measures providing information about the financial support and funding associated with the applicant.

Table 2: Analytic Services outline expanded

Identifying the optimal set of mining attributes

It is necessary to reduce the number of attributes/variables presented to an algorithm so that the information content is enhanced and the noise minimized. This is usually done with supporting mathematical techniques that ensure the most significant attributes are retained within the dataset presented to the algorithm. It should be noted that the choice of significant attributes is driven more by the particular data than by the problem itself. Attribute analysis or attribute conditioning is one of the initial steps in the data mining process and is currently performed outside the Data Mining Framework. The main objective during this exercise is to identify a subset of mining attributes that are highly correlated with the predicted attribute, while ensuring that the correlation within the identified subset of attributes is as low as possible.

The Analytic Services platform provides a wide variety of tools and techniques that can be used in the attribute selection process. One method to identify an optimal set of attributes is to use special data reduction techniques implemented within Analytic Services through Custom Defined Functions (CDFs). Additionally, users can use other data visualization tools, like Hyperion Visual Explorer, to arrive at a decision on the effectiveness of specific attributes in contributing to the overall predictive strength of the data mining algorithm. Depending on the nature of the problem, users may choose the appropriate tool and technique for deciding the optimal set of attributes. One of the advantages of working with the Analytic Services Data Mining Framework is the inherent capability in Analytic Services to support customized methods for attribute selection through Custom Defined Functions. This is essential since the process of mining attribute selection can vary significantly across problems, and an extensible toolkit comes in very handy for customizing a method to suit a specific problem.

In the current case at the ABC University, a CDF was used to identify the correlation effects among the available set of mining attributes. A thorough analysis of various subsets of the available mining attributes was performed to identify a subset that is highly correlated with the predicted mining attribute and at the same time has low correlation scores within itself. Since some data mining algorithms (like Naïve Bayes and neural networks) are quite sensitive to inter-attribute dependencies, an attempt was made to outline the clusters of mutually dependent attributes, with a certain degree of success. From each cluster a single, most convenient attribute was selected. For this case study, an expert made the decision, but this process can be generalized to a large degree. An optimal set of five mining attributes was identified after this exercise. Table 3 shows the list of identified mining attributes, grouped by the input attribute type, categorical or numerical.


Categorical Type      Numerical Type
FA Received           Student Budget
App Status            Total Award
Application Type

Table 3: Optimal set of mining attributes identified

At this stage we have:
- Designed an Analytic Services cube
- Loaded it with relevant data
- Identified the optimal subset of measures (mining attributes)
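The correlation-driven selection described above can be sketched generically. The following Python fragment (toy data, invented attribute values, and arbitrary thresholds, not the actual CDF used in the case study) applies the same two criteria: keep attributes correlated with the target, and skip near-duplicates of attributes already chosen:

import math

def pearson(xs, ys):
    # Plain Pearson correlation coefficient.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical candidate attributes and the predicted attribute (enrolled 0/1).
candidates = {
    "total_award":    [9, 1, 8, 2, 7, 1],
    "student_budget": [9, 1, 8, 2, 6, 1],   # nearly duplicates total_award
    "shoe_size":      [4, 5, 6, 4, 5, 6],   # uncorrelated noise
}
enrolled = [1, 0, 1, 0, 1, 0]

# Greedy selection: take attributes highly correlated with the target,
# skipping any that is too correlated with an attribute already chosen.
selected = []
for name, values in sorted(candidates.items(),
                           key=lambda kv: -abs(pearson(kv[1], enrolled))):
    if abs(pearson(values, enrolled)) < 0.3:
        continue  # weak predictor
    if all(abs(pearson(values, candidates[s])) < 0.9 for s in selected):
        selected.append(name)

print(selected)  # ['total_award']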

MODELING THE PROBLEM

We will now use the Data Mining Framework to define an appropriate model for the business problem, based on the Analytic Services cube and the identified subset of mining attributes (measures). Setting up the model includes selecting the algorithm, defining the algorithm parameters, and identifying the input and output data locations for the algorithm.

Choosing the algorithm

The next step in the data mining process is to pick the appropriate algorithm. A set of six basic algorithms is provided in the Data Mining Framework: Naïve Bayes, Regression, Decision Tree, Neural Network, Clustering and Association Rules. The Analytic Services Data Mining Framework also allows for the inclusion of new algorithms through a well-defined process described in the vendor guide that is part of the Data Mining SDK. The six basic algorithms are a sample set shipped with the product to provide a starting point for using the Data Mining Framework. Choosing an algorithm for a specific problem requires basic knowledge of the problem domain and of the applicability of specific mathematical techniques to efficiently solve problems in that domain.

The specific problem discussed in this paper falls into a class of problems termed classification problems. The need here is to classify each applicant into a discrete set of classes on the basis of certain numerical and categorical information available about the applicant. The 'class' referred to in this context is the status of the applicant's application from an enrollment perspective: "will enroll" or "will not enroll". Historical data is available indicating which kinds of applicants (with specific combinations of categorical and numerical factors associated with them) accepted offers from the ABC University and subsequently enrolled in its programs. Data is available for the negative case as well, i.e. applicants who did not eventually enroll in the program. Given that this problem can be viewed as a classification problem and that historical information is available, one suitable algorithm for the analysis is the Naïve Bayes classification algorithm. We chose Naïve Bayes for modeling this particular business problem.
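For readers unfamiliar with the algorithm, the sketch below shows the core idea of Naïve Bayes on toy categorical data: estimate class priors and per-attribute likelihoods from historical records, multiply them under an independence assumption, and pick the higher-scoring class. This illustrates the technique only; it is not the implementation shipped with Analytic Services, and all records and attribute names are invented:

from collections import defaultdict

# Toy historical records: attributes -> known outcome.
history = [
    ({"app_type": "early",   "fin_aid": "yes"}, "enroll"),
    ({"app_type": "early",   "fin_aid": "no"},  "enroll"),
    ({"app_type": "regular", "fin_aid": "yes"}, "enroll"),
    ({"app_type": "regular", "fin_aid": "no"},  "no_enroll"),
    ({"app_type": "regular", "fin_aid": "no"},  "no_enroll"),
]

def train(records):
    class_counts = defaultdict(int)                        # for P(class)
    value_counts = defaultdict(lambda: defaultdict(int))   # for P(attr=value | class)
    for attrs, label in records:
        class_counts[label] += 1
        for attr, value in attrs.items():
            value_counts[label][(attr, value)] += 1
    return class_counts, value_counts

def predict(model, attrs):
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best, best_score = None, 0.0
    for label, n in class_counts.items():
        # Naive independence assumption: multiply per-attribute likelihoods,
        # with add-one smoothing so unseen values do not zero out the product.
        score = n / total
        for attr, value in attrs.items():
            score *= (value_counts[label][(attr, value)] + 1) / (n + 2)
        if score > best_score:
            best, best_score = label, score
    return best

model = train(history)
print(predict(model, {"app_type": "early",   "fin_aid": "yes"}))  # enroll
print(predict(model, {"app_type": "regular", "fin_aid": "no"}))   # no_enroll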

Deciding on the algorithm parameters

Every algorithm has a set of parameters that control its behavior. Algorithm users need to choose the parameters based on their knowledge of the problem domain and the characteristics of the input data. Analytic Services provides adequate support for such preliminary analysis of data using Hyperion Visual Explorer or the Analytic Services Spreadsheet Client. Users are free to analyze the data using any convenient tool and determine their choices for the various algorithm parameters. Each algorithm has a set of parameters that determine the way it will process the input data. For the current case, the algorithm chosen is Naïve Bayes, and it has four parameters that need to be specified: "Categorical, Numerical, RangeCount, Threshold". The details of each of the parameters and the implications of setting them are described in the online help documentation.

Out of the selected list of attributes we have a few that are of categorical type, and hence our choice for the 'Categorical' parameter is 'yes'. Similarly, there are attributes of numerical type, and hence the choice for the 'Numerical' parameter is also 'yes'. The data was analyzed using a histogram plot to understand the distribution before deciding on the value for the 'RangeCount' parameter. This parameter needs to be large enough to allow the algorithm to use all the variety available in the data, and at the same time small enough to prevent overfitting. From the analysis of the input data for this particular case, setting this parameter to '12' seemed reasonable. The 'RangeCount' controls the binning process in the algorithm. It should be emphasized that binning schemes (including the bin count) really depend on the specific circumstances and may vary to a great degree between problems.

At this stage we have:
- Designed an Analytic Services cube
- Loaded it with relevant data
- Identified the optimal subset of measures (mining attributes)
- Chosen the algorithm suitable for the problem
- Identified the parameter values for the chosen algorithm
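As a sketch of what a RangeCount-style parameter governs, the fragment below performs equal-width binning of a numerical attribute into a fixed number of ranges. The framework's actual binning scheme may differ, and the budget figures are invented:

# Equal-width binning of a numerical attribute into range_count bins.
def make_bins(values, range_count=12):
    lo, hi = min(values), max(values)
    width = (hi - lo) / range_count
    def bin_index(v):
        # Clamp the maximum value into the last bin.
        return min(int((v - lo) / width), range_count - 1)
    return bin_index

student_budget = [4200, 9800, 15500, 7200, 23100, 11900, 30000, 18400]
to_bin = make_bins(student_budget, range_count=12)
for v in student_budget:
    print(v, "-> bin", to_bin(v))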

APPLYING THE DATA MINING FRAMEWORK

Now that we have completed all the preparatory steps for data mining, the next step is to use the Data Mining Wizard in the Administration Services Console to build a data mining model for the business problem. There are three steps involved in effectively using the data mining functionality to provide predictive solutions to business problems:
1. Building the data mining model
2. Testing the data mining model
3. Applying the data mining model


Each of these steps, performed using the Data Mining Wizard in the Administration Services Console, uses MDX expressions to define the context within the cube in which to perform the data mining operation. Various accessors, specified as MDX expressions, identify data locations within the cube. The framework uses the data in those locations as input to the algorithm, or writes output to the specified location. Accessors need to be defined for each of the algorithms so as to let the algorithm know the specific context for each of the following:
- The attribute domain: the expression identifying the factors of our analysis that will be used for prediction. [In the current context this expression pertains to the mining attributes that we identified.]
- The sequence domain: the expression identifying the cases/records that need to be analyzed. [In the current context this expression identifies the list of applicants.]
- The external domain: the expression identifying whether multiple models need to be built. [Not relevant in the current context.]
- The anchor: the expression specifying additional restrictions from dimensions that are not really participating in this data mining operation. [In the current context all the dimensions of the cube we used are relevant to the problem. Accordingly, the anchor here only helps restrict the algorithm scope to the right measure in the 'Measures' dimension.]

Building the data mining model

To access the Data Mining Framework, you will need to bring up the Data Mining Wizard in the Administration Services Console and choose the appropriate application and database, as shown in Figure 8.


Figure 8: Choosing the application and database In the next screen (Figure 9 below), depending on whether you are building a new model or revising an existing model, you choose the appropriate task option.


Figure 9: Creating a Build Task

Figure 10: Settings to handle missing data

This will bring up the wizard screen for setting the algorithm parameters and the accessor information associated with the chosen algorithm, in this case Naïve Bayes. The user will select a node in the left pane to see and provide values for the appropriate options and fields displayed in the right pane. As shown in Figure 10, select "Choose mining task settings" to set how missing data in the cube is handled. The choice in this case is to replace it with 'As NaN' (Not-a-Number). The Naïve Bayes algorithm requires that we declare upfront whether we plan to use either or both of the 'Categorical' and 'Numerical' predictors. In the context of the current case, we have both categorical and numerical attribute types, and hence the choice is 'True' for both of these parameters. 'RangeCount' was decided at 12. 'Threshold' was fixed at 1e-4, a very small value. Figure 11 shows the completed parameter settings screen.

Figure 11: Setting parameters

The Naïve Bayes algorithm has two predictor accessors, 'Numerical Predictor' and 'Categorical Predictor', and one target accessor. Figure 12 shows the various domains that need to be defined for the accessors. Table 4 shows the values that were used for the case being discussed. All the information provided during this stage of model building is preserved in a template file so as to facilitate reuse of the information if necessary.

Figure 12: Accessors associated with Naive Bayes algorithm


Table 4: Setting up accessors for the “build” mode while using Naive Bayes algorithm

Figure 13: Generating the template and model


Once the accessors are defined, the Data Mining Wizard will prompt the user to provide names for the template and model that will be generated at this stage. Figure 13 shows the screen in which the model and template names need to be defined.

At this stage we have:
- Built a Data Mining model using the Naïve Bayes algorithm

Testing the data mining model

The next step is to test the newly built model to verify that it satisfies the level of statistical significance needed for the model to be put to use. Ideally, a part of the input data (with valid known outcomes, i.e. historical data) is set aside as a test dataset to verify the goodness of the data mining model developed using the algorithm. Testing the model on this test dataset and comparing the outcomes predicted by the model against the known outcomes (historical data) is one of the processes supported by the Data Mining Wizard. A 'test' mode template can be created by a process similar to creating a 'build' mode template, as described in the previous section. While building the 'test' mode template the user needs to provide a 'Confidence' parameter to let the Data Mining Framework know the minimum confidence level necessary to declare the model valid. We specified a value of 0.95 for the 'Confidence' parameter. The exact steps in the wizard and descriptions of the various parameters can be obtained from the online help documentation.

Once the process is completed, the results of the test (whose name was specified in the last step of the Data Mining Wizard) appear under the 'Model Results' node. Figure 14 shows the node in the Administration Services Console 'Enterprise View' pane where the 'Mining Results' node is visible. The model can be queried within the Administration Services Console interface to obtain a list of the model accessors by using the "Query Result" functionality. Invoking "Show Result" for the 'Test' accessor indicates the result of the test. Figure 15 below shows the list of model accessors in the result set of a model based on the Naïve Bayes algorithm used in test mode. If the 'Test' accessor has a value of 1.0, the test is deemed successful and the model is declared 'good', or 'valid' for prediction. Figure 16 shows the result of the test for the case discussed in this paper.

At this stage we have:
- Built a Data Mining model using the Naïve Bayes algorithm
- Verified the model as valid with 95% confidence

Figure 14: Model Results node in the Administration Services Console interface

Figure 15: Model accessors for result set associated with a model based on Naive Bayes algorithm


Figure 16: Test results
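Independent of any particular tool, the hold-out idea described above looks roughly like this in Python; train_fn and predict_fn stand for any trainer and classifier (for example, the Naïve Bayes sketch given earlier), and the split fraction and acceptance threshold are illustrative:

import random

# Evaluate a classifier on a held-out slice of the historical data,
# then require a minimum accuracy before declaring the model valid.
def holdout_test(records, train_fn, predict_fn, test_fraction=0.25, seed=7):
    rows = records[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    train_rows, test_rows = rows[:cut], rows[cut:]
    model = train_fn(train_rows)
    hits = sum(predict_fn(model, attrs) == label for attrs, label in test_rows)
    return hits / len(test_rows)

# Usage with the earlier Naive Bayes sketch (illustrative threshold):
# accuracy = holdout_test(history, train, predict)
# model_is_valid = accuracy >= 0.95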

Applying the data mining model

The intent at this stage is to use the newly constructed data mining model to predict whether new applicants are likely to enroll in the program. Using the data mining model in 'apply' mode is similar to the earlier two steps. The Data Mining Wizard guides the user through providing the parameters appropriate to 'apply' mode. The 'Target' domain is usually different in 'apply' mode, since data is written back to the cube. The details of the various accessors and the associated domains can be obtained from the online help documentation. Table 5 shows the values that were provided to the Data Mining Wizard to use the model in 'apply' mode. Just as in 'build' mode, the names of the results model and template are specified in the wizard, and the template is saved before the model is executed. The results of the prediction are written into the location specified by the 'Target' accessor: the mining attribute referred to by the MDX expression {[ActualStatus]}. The results can be visualized either by querying the model results in the Administration Services Console using the "Query Result" functionality described in the previous section, or by accessing the cube and reviewing the data written back to it. One option for viewing the results is to use the Analytic Services Spreadsheet Client to connect to the database and view the cube data for the 'Actual Status' measure.

INTERPRETING THE RESULTS

The results of the data mining model need to be interpreted in the context of the business problem it is attempting to solve. Any transformation done to the input measures needs to be appropriately adjusted for when interpreting the results. In the context of the case discussed in this paper, the intent was to predict whether applicants were likely to enroll at the ABC University. The possible outcomes are that the applicant either will enroll or will not enroll. The model was verified against the entire set of available data (over 11,300 records).

Table 5: Setting up accessors for the “apply” mode while using Naive Bayes algorithm

The confusion matrix

You can construct a confusion matrix by listing the 'false positives' and 'false negatives' in a tabular format. A 'false positive' occurs when the model predicts that an applicant will enroll and in reality the applicant does not enroll. A 'false negative' occurs when the model predicts that an applicant will not enroll and in reality the applicant does enroll. The results predicted by the model can be compared with the actual outcomes available in the historical data to build the confusion matrix. In general, for such classification problems, one of these ('false positives' or 'false negatives') is likely to be somewhat more important than the other in a business context. In the case discussed in this paper, a 'false negative' means lost revenue, whereas a 'false positive' means additional promotional expenditure in following up on an applicant who will eventually not enroll. The importance of each should be analyzed in the context of the business, and the model rebuilt if necessary with a different training set (historical data) or a different set of attributes. Figure 17 below shows the confusion matrix constructed using the dataset analyzed as part of this case study. It is evident from the confusion matrix that the model predicted that 1550 (1478 + 72) students would enroll. Of those, only 1478 actually enrolled and 72 did not enroll, which means there were 72 false positives. Similarly, the model predicted that 9805 (9356 + 449) students would not enroll. Of those, 9356 actually did not enroll, whereas 449 actually did enroll, which means there were 449 false negatives.

                              Actual: Will Enroll    Actual: Will Not Enroll
Predicted: Will Enroll        1478                   72
Predicted: Will Not Enroll    449                    9356

Figure 17: Confusion matrix to analyze the model's effectiveness in prediction

Analyzing the results

On further analysis of the results, the following observations can be made:

Incorrect Predictions   No. of Cases   Percentage of Cases
False positives         72             0.634%
False negatives         449            3.954%
Total                   521            4.59%

Success rate of the model: 95.41% (only 521 incorrect predictions in 11,355 cases)
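These percentages follow directly from the four cells of Figure 17; the short Python check below reproduces them:

# Recompute the figures above from the confusion matrix cells.
tp, fp, fn, tn = 1478, 72, 449, 9356

total = tp + fp + fn + tn                            # 11355 cases
print(f"false positives: {fp / total:.3%}")          # 0.634%
print(f"false negatives: {fn / total:.3%}")          # 3.954%
print(f"incorrect total: {(fp + fn) / total:.2%}")   # 4.59% (521 cases)
print(f"success rate:    {(tp + tn) / total:.2%}")   # 95.41%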


ADDITIONAL FUNCTIONALITY

The Analytic Services Data Mining Framework offers more functionality that can be used when deploying models in real business scenarios. Some of the further steps that can be considered include:

Transformations

The Data Mining Framework offers the ability to apply a transform to the input data just before it is presented to the algorithm. Similarly, the output data can be transformed before being written into the Analytic Services cube. The Data Mining Framework offers a basic list of transformations (exp, log, pow, scale, shift, linear) that can be used through the Data Mining Wizard. The details of each of these transformations, what they do and how to use them, can be obtained from the Analytic Services online help documentation. This list of transformations is further extensible through the import of custom Java routines written specifically for the purpose. The details of how to write Java routines to be imported as additional transforms can be obtained from the vendor guide that is shipped as part of the Data Mining SDK.
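As a generic illustration of why such transforms matter (a sketch of the idea, not the framework's own implementation), a log transform compresses a heavily skewed input measure before mining, and exp inverts it on the way back:

import math

# A small shift keeps zero values legal under the log; values are invented.
def log_transform(values, shift=1.0):
    return [math.log(v + shift) for v in values]

def inverse(values, shift=1.0):
    return [math.exp(v) - shift for v in values]

awards = [0, 500, 1200, 40000]
conditioned = log_transform(awards)
print([round(v, 2) for v in conditioned])
print([round(v) for v in inverse(conditioned)])  # round-trips to the input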

Mapping

In some cases a model has been developed in one context and needs to be used elsewhere; the 'Mapping' functionality is useful here. Through this functionality the user can tell the Data Mining Framework how to interpret the existing model accessors in the new context in which the model is being deployed. More information on using this functionality can be obtained from the online help documentation.

Import/export of PMML models

The Data Mining Framework allows for portability through the import and export of mining models in the PMML format.


Setting up models for scoring

Data mining models built using the Analytic Services Data Mining Framework can also be set up for 'scoring'. In 'scoring' mode the user interacts with the model in real time, and the results are not written to the database. The input data can be sourced either from the cube or through data templates that the user fills in during execution. The 'scoring' mode of deployment can be combined with custom applications built using developer tools provided by Hyperion Application Builder to create applications that cater to a specific business process while leveraging the powerful predictive analytic capability of the Analytic Services Data Mining Framework. The online help documentation provides additional details on how to 'score' a data mining model.

Using the data mining framework in batch mode

There is also a batch-mode interface to the functionality provided in the Data Mining Framework. Scripts written using the MaxL command interface can perform almost all of the functionality that is exposed through the Data Mining Wizard. Details of the MaxL commands and their usage can be obtained from the online help documentation.

Building custom applications

Custom applications can be developed using Analytic Services as the back-end database and the developer tools provided with Hyperion Application Builder. The functionality provided by the Data Mining Framework can be invoked through APIs.


5. SWOT ANALYSIS

Strengths
- Quality of client references
- Launch of MicroStrategy 9
- Innovation in Mobile BI
- Lower administration cost paradigm
- Strong customer support
- Product strengths

Weaknesses
- Challenges associated with organic growth in a consolidating market
- Dependence on big deals to grow
- Low market share in emerging markets
- High learning curve for end users and developers
- Lack of performance management capabilities of its own

Opportunities
- Add functionality from ancillary markets
- Attacking the midmarket
- Improve go-to-market opportunities with more OEM deals

Threats
- Improving data performance undermines differentiation
- The rise of data discovery tools
- Reliance on professional services
- Economic downturn


ARTICLES OF REFERENCE

1. http://findarticles.com/p/articles/mi_m0EIN/is_2005_Oct_11/ai_n15682597/

BUSINESS INTELLIGENCE MADE EASY WITH DEBUT OF HYPERION SYSTEM 9 BI+

SANTA CLARA, Calif. -- Hyperion Delivers Broadest BI Functionality within a Single Workspace, Allowing Users to Report Against Both Relational and Multidimensional Data Sources

Hyperion (Nasdaq:HYSL), the global leader in Business Performance Management software, today introduced Hyperion System 9 BI+, a complete business intelligence platform that addresses all types of management, financial and production reporting, and delivers highly interactive, multi-source dashboards within the Hyperion System 9 Workspace. Hyperion System 9 BI+ links a powerful and easy-to-use management reporting solution with market-leading advanced analytics for seamless reporting against both relational and multidimensional data sources. The solution is built on a common Foundation Services layer that streamlines deployment, management and administration to increase productivity and reduce total cost of ownership. Hyperion System 9 BI+ is the business intelligence component of Hyperion System 9. Launched today, Hyperion System 9 is the most comprehensive performance management solution ever and the first to integrate financial management applications with a BI platform into a modular system that will adapt to any business need. This comprehensive, integrated solution delivers consistent visibility into past, present and future business operations, enabling sustained improvement in business performance.

"Hyperion System 9 BI+ actually puts intelligence into business intelligence, by helping users quickly and intuitively report against relational and multidimensional sources in one session and using one interface," said John Kopcke, chief technology officer for Hyperion. "But we're also putting the 'plus' in BI+ through the integration of reporting with financial management applications in Hyperion System 9. For the first time, customers are not forced to choose between a great BI platform and a seamless path to performance management. With Hyperion System 9 and its BI+ component, they get the best of both worlds when they want it."

Personalized Workspace and Foundation Services

Users of Hyperion System 9 BI+ will benefit from the Hyperion System 9 Workspace, the first personalized workspace for Business Performance Management. The unified Workspace means that end users do not have to shuffle between multiple systems and technologies to get the answers to their business questions. It provides users with one simple interface for financial, interactive and production reporting, as well as rich analytics, enterprise metrics, scorecards, dashboards and master data management.

Hyperion System 9 BI+ also leverages the Foundation Services layer of Hyperion System 9, which delivers common user provisioning, common alerting, shared metadata and license management. The infrastructure enables IT organizations to build, deploy, scale and manage across all of their users, organizations and applications, thereby ensuring ease of use and lowering TCO.

Work the Way Customers Work

Hyperion worked closely with Frog Design, a leading industrial design firm, on a study of how users work with reports, dynamic forecasts, financial consolidation and advanced analytical software. The study, which closely tracked 70 customers and their use of software over a period of months, directly resulted in the creation of Hyperion's unique Business Performance Management Workspace. The Workspace, which fundamentally changes the way users interact with their performance management software, is at the heart of the innovations Hyperion is introducing with Hyperion System 9.

"With Hyperion System 9, we will have much faster, simpler and more efficient delivery of information through flexible, interactive dashboards," said Greg Backhus, director of data warehousing, Helzberg Diamonds. "We'll be able to easily and transparently present data from both relational and multidimensional data sources in one dashboard. We will no longer need to build custom-coded dashboards and one-off reports. The unified system will reduce our administration and support costs as well."


Industry Leading Functionality

For years, Hyperion has offered the easiest-to-create and most interactive dashboards, self-service reporting that spans from production to financial, and leading advanced analytics that support its customers' entire BI environments. Hyperion System 9 BI+ continues to offer companies the industry's leading BI functionality with additional proven technologies:

--Master Data Management: Hyperion BI+, as part of Hyperion System 9, works with Hyperion Master Data Management (MDM) Services to provide consistent master data definitions across the enterprise, and allows business users to directly manage complex, rapidly changing master data. Hyperion is the only Business Performance Management company to offer such capabilities.

--Service-Oriented Architecture: Hyperion System 9 and Hyperion System 9 BI+ are built on a service-oriented architecture. The Hyperion System 9 BI+ technology leverages key standards-based technologies, including SOA concepts, to enable enterprise-class deployment and easier integration within an enterprise IT infrastructure.

--SAP BW Functionality: The SAP BW functionality within Hyperion System 9 BI+ provides industry-leading financial reporting capabilities; high-volume, pixel-perfect production reporting; and ad hoc Web analysis against SAP BW. It also provides high-powered, yet easy-to-use wizard-based dashboarding capabilities and the ability to combine SAP and non-SAP data seamlessly in one report and one user interface. Hyperion BI+ accesses SAP BW through certified BAPI interfaces and is certified as Powered by SAP NetWeaver.

--Itanium(R)/64-Bit Scalability: The BI platform is fully enabled on 64-bit Intel Itanium 2-based HP Integrity server platforms running HP-UX 11i and Microsoft Windows Server 2003 operating environments. Intel's Itanium 2 architecture delivers four billion times the addressable capacity of 32-bit processing, which translates into unmatched scalability and performance, lower total cost of ownership (TCO) and a secure investment as scalability requirements increase over time.


Hyperion System 9 BI+ also provides new innovations requested by customers:

--Smart View for Office: This feature allows users to embed BI content into, and to present that content in, Microsoft Excel, Microsoft Word and Microsoft PowerPoint. This version of Smart View also provides support for Hyperion's leading financial management applications, also a part of Hyperion System 9.

--Transparent User Access to Any Source: Hyperion System 9 BI+ provides a common view into all data sources and across all management processes, allowing users to make better, faster decisions based on a single reliable source of information that spans all areas of the operations.

--Metrics and Scorecard Integration: Scorecard and metrics information are rapidly becoming expected integrated components of a BI platform. Having these offerings within one thin-client environment further reduces the cost of maintaining user interfaces, allowing users quick and easy access to scorecard and metrics information from one place.

--Data Mining: Data mining within the solution enables the exchange of data mining models via PMML between advanced analytics and external data mining engines. Hyperion System 9 BI+ now supports new data mining models, such as those for credit scoring, enabling new applications including credit risk management.

--Change Management: As companies change, so do their business practices, which in turn impact their data stores. For example, if a company changes the name of a column within the data warehouse, all of the reports on top of it become disconnected. Hyperion System 9 BI+ provides a way for companies to quickly identify what has changed and allows them to automatically update those reports despite underlying data store changes.

About Hyperion

Hyperion Solutions Corporation is the global leader in Business Performance Management software. More than 10,000 customers rely on Hyperion software to provide visibility into how their businesses are performing and to help them plan and model to improve that performance. Using Hyperion software, customers collect data, organize and analyze it, then communicate it across the enterprise. Hyperion offers the industry's only Business Performance Management solution that integrates financial management applications with a business intelligence platform into a single system. Named one of the FORTUNE 100 Best Companies to Work For (2004), Hyperion serves global customers in 45 countries. A network of more than 600 partners provides the company's innovative and specialized solutions and services. Hyperion generated revenues of USD 703 million for the fiscal year that ended June 30, 2005 and is traded under the Nasdaq symbol HYSL. For more information, please visit www.hyperion.com.

Safe Harbor Statement

Statements in this press release other than statements of historical fact are forward-looking statements, including, but not limited to, statements concerning the potential success of anticipated product features, the anticipated product offerings and the potential market opportunities for business performance management software. Such statements constitute anticipated outcomes and do not assure results. Actual results may differ materially from those anticipated by the forward-looking statements due to a variety of factors, including, but not limited to, the company's ability to retain and attract key employees, the successful and timely development of new products, the impact of competitive products and pricing, customer demand, and technological shifts. For a more detailed discussion of factors that could affect the company's performance and cause actual results to differ materially from those anticipated in the forward-looking statements, interested parties should review the company's filings with the Securities and Exchange Commission, including the Report on Form 10-K filed on August 31, 2005 and the Report on Form 10-Q filed on May 10, 2005. The company does not undertake an obligation to update its forward-looking statements to reflect future events or circumstances.


2. http://findarticles.com/p/articles/mi_m0CGN/is_1999_Feb_9/ai_53910291/?tag=content;col1

HYPERION SPINS OFF APPSOURCE; GOES FOR MS OLAP SPACE

Hyperion Solutions Corp has spun off its decision support arm as a wholly owned subsidiary in a bid to compete head on in the Microsoft SQL Server OLAP Services market space. The new company will be renamed Appsource Corp, its original name before it was purchased, in December 1997, by Arbor Software. Arbor subsequently merged with Hyperion in May 1998, and since then Appsource has sold its decision support tools under the Hyperion brand name.

Now, with the majority of Hyperion's Essbase multidimensional customers under its belt, Hyperion decided to spin off Appsource to enable the subsidiary to focus exclusively on selling tools into Microsoft's OLAP Services space, said Adrian Marshall, VP of marketing for the newly formed company. "We wanted to set up a small, highly responsive unit that could react quickly to change and compete head on in the marketplace," he told ComputerWire.

The company also took the opportunity to announce the next version of its Wired for OLAP tool set. Release 4.0 of the software offers a number of improvements over the previous version, including enhanced support for Microsoft's SQL Server 7.0 OLAP Services with features such as ranking (of top products or companies, for example), write-back (the ability to enter data at the front end which is automatically updated on the OLAP server), and a host of advanced selections, including the ability to drill down and find out information at a regional level. The latest version also interoperates seamlessly across both Windows and Web-based clients and features a new, Web-style interface on all client editions. And by enabling the Wired repository to be stored in any leading relational database, including SQL Server, Oracle 8 and IBM's DB2, the repository is more scalable and can be accessed by thousands of users, as opposed to just 50, as was the case with the earlier version.

Wired for OLAP version 4.0 is available immediately. It costs $100 per user for the Web viewer-only edition; $300 for the Web edition; $600 for the standard client version; and $1,000 for the professional edition. Web clients also require the Wired application server, which costs $12,500 and serves up to 250 concurrent users. Sales are indirect, through channels, telesales and e-commerce partners.


3. http://www.information-management.com/news/1018391-1.html

HYPERION ACQUIRES RAZZA SOLUTIONS; DELIVERS MASTER DATA MANAGEMENT FOR BUSINESS PERFORMANCE MANAGEMENT

Hyperion, a global leader in business performance management software, announced it has acquired substantially all of the assets of Razza Solutions, Inc., an Austin, Texas-based private software company. A Hyperion partner since 2000, Razza provides a market-leading solution for synchronizing master data across business performance management (BPM), including business intelligence (BI) platforms, financial and analytical applications and transactional systems. The transaction closed on January 21, 2005. Hyperion does not expect the impact on revenue and earnings to be material. Razza's 15-person staff became Hyperion employees, effective immediately.

Line-of-business and management leaders need to accurately plan for and track business performance. They need to have a single point of control with accurate, timely updates that fully reflect the current business reality. The myriad of complex and dynamic business variables, such as new and/or retiring products, organizational changes, pricing changes, customer wins and the impacts of mergers and acquisitions, add enormous complexity to the job of ensuring data integrity.

For the past five years, Hyperion customers including Fifth Third Bank, HCA Healthcare and Mentor Graphics have relied on Razza to manage master data for their reporting, analysis and planning needs, and to ensure critical data integrity for reporting hierarchies and business dimensions. Razza counts more than 30 customers among the Fortune 500.

Hyperion will market the solution as Hyperion Master Data Management (MDM) Server for Business Performance Management, and integrate it as a core service of the Hyperion Business Intelligence Platform. Hyperion MDM Server will complement and leverage Hyperion Hub's open infrastructure foundation.

In the past, businesses have leveraged master data management to harmonize information and context for key transactional and ERP systems, and to synchronize and monitor the customer records, product lists and geographical data upon which these systems rely. Hyperion MDM Server extends the benefits of this approach to business performance constructs such as business dimensions, reporting structures, hierarchies, attributes and business rules. It provides common context and consistency for this master data across BI systems, enterprise data warehouses, financial and analytic applications and transactional systems. It enables a common vocabulary for business users and ensures a single version of the truth throughout the Business Performance Management system.

The need is real. According to the Tower Group, 50 percent of enterprises maintain master data separately in 11 or more source systems, and 80 percent of enterprises plan on centralizing master data. These companies are seeking a way to streamline and remove complexity from the process.

4. http://news.cnet.com/Arbor-Software,-Hyperion-merge/2100-1001_3-211517.html

ARBOR SOFTWARE, HYPERION MERGE

Arbor Software and Hyperion Software said today that they are merging into a new software company, to be called Hyperion Solutions, through the exchange of stock. In a joint statement, the two software companies said Arbor will exchange 0.95 shares of its common stock for each share of Hyperion, creating a company with more than $350 million in revenues, based on pro forma results for the 12 months ended March 31. Arbor shareholders will own approximately 40 percent of the continuing company, while Hyperion shareholders will own approximately 60 percent.

Arbor is an online analytical processing (OLAP) software maker. Hyperion is a budget and financial applications vendor with close partnerships with many other applications vendors, most notably Baan. Last year the business applications vendor signed a deal with Hyperion to integrate their offerings, build new Internet apps, and pool distribution efforts.


Today's deal is being accounted for as a pooling of interests and is expected to close in the late summer, the companies said, adding that based on the Nasdaq closing prices of both companies on May 22, the combined market capitalization would be approximately $1.3 billion.

Analysts see the merger as further entrenchment of a three-front battlefield in the OLAP war between Microsoft, Oracle, and Arbor. "This will strengthen the Arbor camp significantly, specifically on the applications side," said Mike Schiff, an analyst with Current Analysis. "It places [Arbor] in a stronger position against Microsoft and Oracle." He did not comment on how this would affect Hyperion's relationship with Baan. The merger between the two companies also makes sense in the wake of Microsoft's decision to bundle its Plato OLAP Server as a component of its SQL Server 7.0, because it shows Arbor is looking to dig in and do battle in the market space with its own OLAP solution, enhanced by Hyperion technology.

The companies said both of their respective boards of directors have unanimously approved the definitive merger agreement. Hyperion Solutions will have its headquarters in Sunnyvale, California, and will develop analytic application software. The company is expected to have 1,800 employees working in 26 countries.

John Dillon, Arbor's current chairman and chief executive officer, will become CEO of the combined company, and Jim Perakis, Hyperion's current chairman and CEO, will become chairman of the combined company. Stephen Imbler, Arbor's current CFO; Bill Binch, Arbor's current head of sales; Kirk Cruikshank, Arbor's current head of marketing; Craig Schiff, Hyperion's current head of services; and Mark Bilger, Hyperion's current head of development, will hold these positions in the combined company. The board of directors of the merged organization will include four directors from Hyperion and three directors from Arbor.

Arbor Software recorded $82 million in revenues for the twelve months ended March 31, 1998, while Hyperion recorded revenues of $271 million during the same period.

5. http://www.computerworld.com/s/article/110706/Hyperion_to_acquire_data_quality_vendor_UpStream_Software

HYPERION TO ACQUIRE DATA-QUALITY VENDOR UPSTREAM SOFTWARE

Computerworld - Hyperion Solutions Corp. today announced plans to acquire UpStream Software Corp., a Rochester, Mich.-based provider of financial data-quality technology.

Santa Clara, Calif.-based Hyperion plans to offer UpStream's tools as part of its business performance management suite to help enterprises ensure that data required for compliance with the Sarbanes-Oxley Act and other regulations is accurate, according to Hyperion officials. Terms of the deal, which is expected to close in two weeks, were not disclosed.

While many other data-quality vendors focus on cleansing customer and product data for marketing campaigns, UpStream's tools focus exclusively on ensuring the accuracy of financial data, said Rich Clayton, Hyperion's vice president of product marketing. "Financial data quality has a similar process, but has a very different set of business requirements -- things like a repository for a [Sarbanes-Oxley] audit, the ability to monitor your internal audit controls and the ability to go from your source system to what you report to Wall Street," Clayton said.

Hyperion will begin marketing the UpStream products -- which include a guided Web-based workflow interface, a data preparation server and predefined adapters for Hyperion applications -- within two weeks, Clayton said. The tools are designed to eliminate the multiple points of failure between the submission of data from local business units and the consolidation of that information globally, so organizations can avoid restating financial results.

UpStream will continue to operate from Rochester, and Hyperion plans to keep all of its employees, according to Clayton. Duplicate sales offices may be consolidated, however.


PAPERS OF REFERENCE

1. http://www.oracle.com/technetwork/middleware/smart-view-for-office/overview/smartview-overview-wp-134759.pdf
2. http://h20195.www2.hp.com/v2/GetPDF.aspx/4AA1-2356ENW.pdf

