Australian Bureau of Statistics Gains Powerful Analytic Facilities with Integrated Data Warehouse
“The Oracle data warehouse allowed us to store large amounts of data in a single location and introduce more efficient analytic capabilities. Using tools like Oracle Discoverer, we can access and view data with flexibility and fast response times that we didn’t have previously.” — Don Bartley, Chief Technology Officer, Australian Bureau of Statistics
Australian Bureau of Statistics
Canberra, Australia
www.abs.gov.au
Industry:
Public Sector
Employees:
2,000 to 4,999
Established in 1905, the Australian Bureau of Statistics (ABS) is Australia’s central statistical authority, providing high quality, objective, and timely statistics about the nation’s economy and its social and cultural characteristics. Its collective databases contain approximately 4 terabytes of data, collected over 15 years.
The ABS was an early adopter of information technology and has assembled a large range of applications, platforms, and toolsets over the past 25 years. Up until the late 1980s, the organization stored information in many systems and across multiple platforms. This made it difficult to manage data and reuse information across economic survey areas.
In 2002, ABS launched a Business Statistics Innovation Program aimed at improving data quality, provider management, and efficiency and productivity. As part of the program, the organization developed an Input Data Warehouse using Oracle technology to collect and store economic statistics for real-time operations and analysis. As a result, ABS was able to reengineer its input supply chain for maximum efficiency, enhance data capture and storage techniques, improve data protection measures, and introduce new reporting capabilities.
“The Oracle data warehouse allowed us to store large amounts of data in a single location and introduce more efficient analytic capabilities,” said Don Bartley, chief technology officer for the Australian Bureau of Statistics. “Using tools like Oracle Discoverer, we can access and view data with flexibility and fast response times.”
Oracle Products & Services:
Oracle Database
Oracle Warehouse Builder
Oracle Application Server
Oracle Enterprise Manager with Management Packs
Oracle Advanced Security
Oracle Partitioning
Oracle Internet Directory
Oracle Discoverer
Oracle E-Business Suite
Oracle Financials
Oracle Payroll
Designed for Powerful Performance
Key Benefits:
Increased efficiency by reengineering the input supply chain, enhancing data capture and storage techniques
Enabled swift querying, performance tabulations, and data aggregation by adopting a star schema database structure
Facilitated fast response times and provided the flexibility to access and view data using Oracle Discoverer
Ensured analytic computations can be run faster by preassembling certain elements of the results
Protected data from unauthorized viewing through the use of Oracle Fine Grained Access Control to define roles and assign privileges to users
Eased database administration and improved service levels with Oracle Enterprise Manager
Enabled implementation of a conceptual framework for metadata based on international standards (ISO 11179)
ABS built its first data warehouse 12 years ago. Known today as the Output Data Warehouse, it was a significant achievement because it enabled the organization to store and manage data from one location and generate a range of products in different formats from a single data source. Information in the Output Data Warehouse feeds the ABS Web site and its publications workbench.
Buoyed by the success of this project, ABS prototyped a second data warehouse in 2001. The Input Data Warehouse (IDW) currently stores business statistics only and was designed to improve data quality and integration, streamline provider management, and enhance the organization’s capacity to meet new statistical challenges.
“We had previously been storing data in separate systems and different platforms, which made it difficult to meet certain objectives. What we needed was a well-catalogued data store that could be shared across organizational boundaries,” said Bartley. “Another aim was to enhance statistical and analytical use of data and better understand the impact of different survey processes on the quality of the aggregated data.”
ABS based the IDW structure on a star schema design. A large fact table (the ‘star’) resides at the center of the model, surrounded by various points, or reference tables. The fact table in the IDW has around 3 billion rows. The basic principle behind the schema is to provide quick, easy, and flexible access to data.
“It’s a very good structure for fast querying, performance tabulations, and data aggregation,” said Brian Studman, director of Database Administration, Technical Services Division, at the Australian Bureau of Statistics.
Rather than one star, the IDW comprises a cluster of stars. “The stars are cross-related to each other, meaning they might share the same set of properties, such as the information provider or the location in which the data was collected,” said Studman.
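The star layout described above can be illustrated with a minimal sketch. It is not the ABS design: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and every table name, column name, and value (dim_provider, dim_location, fact_survey, fact_tax, the sample rows) is hypothetical. It shows two small fact tables that share the same provider and location reference tables, which is one way to read the "cluster of stars" idea.

```python
import sqlite3


def build_star(conn):
    """Create two toy fact tables ("stars") that share two reference tables.

    All names and rows are invented for illustration only.
    """
    conn.executescript(
        """
        CREATE TABLE dim_provider (provider_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, state TEXT);

        -- First star: survey responses keyed by provider and location.
        CREATE TABLE fact_survey (
            provider_id INTEGER REFERENCES dim_provider(provider_id),
            location_id INTEGER REFERENCES dim_location(location_id),
            turnover REAL
        );

        -- Second star: administrative tax data sharing the same reference tables.
        CREATE TABLE fact_tax (
            provider_id INTEGER REFERENCES dim_provider(provider_id),
            location_id INTEGER REFERENCES dim_location(location_id),
            tax_paid REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_provider VALUES (?, ?)",
        [(1, "Provider A"), (2, "Provider B")],
    )
    conn.executemany(
        "INSERT INTO dim_location VALUES (?, ?)",
        [(10, "ACT"), (20, "NSW")],
    )
    conn.executemany(
        "INSERT INTO fact_survey VALUES (?, ?, ?)",
        [(1, 10, 100.0), (2, 10, 50.0), (2, 20, 25.0)],
    )
    conn.executemany(
        "INSERT INTO fact_tax VALUES (?, ?, ?)",
        [(1, 10, 10.0), (2, 20, 5.0)],
    )


def turnover_by_state(conn):
    """Aggregate one fact table by an attribute of a reference table."""
    rows = conn.execute(
        """
        SELECT l.state, SUM(f.turnover)
        FROM fact_survey AS f
        JOIN dim_location AS l ON l.location_id = f.location_id
        GROUP BY l.state
        ORDER BY l.state
        """
    ).fetchall()
    return dict(rows)


def totals_by_provider(conn):
    """Bring both stars together on the shared provider reference table."""
    return conn.execute(
        """
        SELECT p.name,
               (SELECT COALESCE(SUM(s.turnover), 0.0)
                  FROM fact_survey AS s WHERE s.provider_id = p.provider_id),
               (SELECT COALESCE(SUM(t.tax_paid), 0.0)
                  FROM fact_tax AS t WHERE t.provider_id = p.provider_id)
        FROM dim_provider AS p
        ORDER BY p.name
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star(connection)
    print(turnover_by_state(connection))
    print(totals_by_provider(connection))
```

In this sketch the fact tables hold only keys and measures, so a tabulation is a single join and GROUP BY against a reference table, and data from the two stars can be combined through the provider key they have in common.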
Data can be brought together on common attributes with minimal intervention from IT staff, while maintaining each data set’s unique attributes. To speed data processing time, the IDW includes Materialized Views that natively provide basic online analytical processing (OLAP)-like performance in the relational database management
past, and Oracle Enterprise Manager has been able to provide us with efficiencies in database administration. It also gave us the ability to improve service levels through greater visibility into the entire Oracle environment.”
Future Plans
ABS is currently gearing up for Census 2006, a national survey conducted every five years that collects social, economic, and housing information to produce a snapshot of Australian society. Data from the census will be collected, stored, and processed in an Oracle data warehouse with a detailed transaction history store utilizing a similar structure to the IDW.
Bartley said ABS is interested in exploring Oracle’s XML capabilities. “In order to access data from a resource such as a corporate directory or the IDW, we need to know the shape of the packet and some criteria. These are described in an XML schema.
“One of the things we are interested in is how we can use Oracle to generate an XML data stream and XML schemas.
“Another area of interest is business process management. Oracle has some business process management tools and we’d like to explore how we can use them in conjunction with our own administration tools.”
Why Oracle?
Maintaining data in separate systems was not delivering business value to ABS, leading the organization to move to a relational database environment in 1990.
“There was a real need to move to an industry-standard database structure to improve performance and expand the range of functions available,” said Bartley. “We knew OLAP and other data mining tools could be integrated with relational databases, so we could bolt the two products together.
“We were already using Oracle products and it was logical to explore the company’s relational database offering. Oracle has always delivered the high performance, stability, security, and scalability our business requires.”
“Oracle is a vital data management tool for ABS,” added Studman.
“It is an enterprise-class database management system with exceptional functionality. In addition to enabling more efficient information management, the more complex database design allows us to carry out sophisticated querying, reporting, and analytic tasks.”
Implementation Process
The project to develop and implement the IDW was undertaken solely by ABS. At various points before and during the deployment, the organization called on advice and assistance from the database community, external groups, and sister agencies overseas. It also engaged independent consultants to review the implementation at the end of each phase to ensure the project was on track to meet business goals.
Phase one (prototyping) was undertaken between November 2001 and December 2002. It involved building a ‘production pilot’ warehouse for storing, processing, and analyzing selected data sets from the Australian Taxation Office. In-house facilities combined with Oracle Warehouse Builder and Oracle Discoverer were the primary tools used to load, analyze, and query data during this phase. The first phase of the project helped ABS gain a better understanding of the range of issues faced by ABS statisticians and assisted the organization to develop a strong business case justifying the progression to phase two.
Phase two commenced in December 2002. It involved extending the phase one pilot warehouse and replacing existing stores for the selected data sets. Business survey and other administrative data were also incorporated into the larger warehouse. This phase provided ABS with more opportunities to evaluate the use of transactional data, undertake more extensive analysis, and explore the links with other ABS systems.
At the end of September 2003, a business case was prepared outlining the costs and benefits of moving the IDW into full production (phase three). A migration and implementation plan formed part of the business case.
Phase three began in October 2003 with the implementation of the IDW.
ABS then took the opportunity presented by a central data store to review the survey processes within the IDW to see whether improvements (such as significance editing) could be made. The first version of the IDW was released into production in December 2003. The second version was released a year later.
The Australian Bureau of Statistics (ABS) is the government agency responsible for collecting and publishing statistical information about Australia. As well as providing financial and economic statistics, the ABS presents information on contemporary social issues and areas of public policy concern.