Taxonomy as a Service for Enterprise Applications: An Approach to Improving Information Findability
Richard Iams
Introduction

Access to current, accurate, and relevant information is vital for businesses to effectively serve their customers, partners, and employees. Business intelligence and data warehouse solutions continue to evolve in order to efficiently mine structured data (data that is organized in a defined format) from multiple applications. These solutions provide information about key relationships and entities important to the business, such as clients, products, services, and partners. Using terms consistently to organize and describe collected business data is an important factor in building effective systems and supporting business intelligence strategies.

However, this same approach is rarely taken for managing unstructured information (information that has no defined format), which makes up a significant volume of the information created and shared among business system users. A business’s unstructured information may reside anywhere, in any format, and in any medium: reports, PowerPoint presentations, wikis, network file shares, emails, intranets and portals, blogs, and local hard drives. Linking structured and unstructured information about a related subject is difficult, time consuming, and costly. Searching all possible sources to find information on a single topic in order to generate new analyses, satisfy an information request, or comply with legal and regulatory requirements is a daunting task. The answer to the challenge is to implement a shareable vocabulary, or taxonomy, that can be used for organizing and searching content across all information sources.

Unstructured business information is important because it can provide vital context and history not necessarily revealed by structured data contained in databases. Reports and memos can identify the important contributors to a business decision and the logic used to derive a solution, details that are missing from a common project plan, time accounting system, or financial reporting application.
Often, however, business units organize their content in a manner that works for them but not for the rest of the organization. They may apply specific classifications and terms in an application or storage repository that meets their own departmental needs but is not intuitive to other departments. Equally, individual users often organize their personal files in a manner that makes sense to them but not to their colleagues. As a result, existing tools for organizing business information are used ineffectively. Highly relevant information may go undiscovered and unused because it cannot be found. This can lead to re-creation of the information, a duplication of effort. Worse yet, the information may remain unfound, resulting in incomplete and flawed analyses.

Even improved findability through text-based search does not guarantee that relevant structured and unstructured information can be found. Both types of information often remain disconnected from each other, which requires users to spend time searching different systems, each with its own vocabulary and method of organization.

1760 Old Meadow Road., McLean, VA 22102 • 703.748.7082 • www.ppc.com

Business Case Evaluation

This white paper discusses an approach for implementing a business taxonomy as a service, which provides a shareable, reusable vocabulary across enterprise systems containing both structured and unstructured information. Users, system managers, developers, and content managers must reference the same terms to describe and organize business information in order to maximize its findability.

This approach was developed for a global non-profit organization that ran donation-based sponsorship programs for children and families in low-resource countries. The organization delivered its services in many different countries, matching donors’ contributions to recipients through a sponsorship model. In addition, the organization partnered with affiliate donor organizations in other countries to collect and distribute donations using the headquarters’ operating model. Country programs were accountable for reporting financials and results to headquarters.
Relevant information included demographic data about program recipients (children and families with resource needs) and narrative reports about participant progress. In addition, the organization managed its own marketing and volunteer campaigns and maintained a collection of multimedia data related to its events and fundraising campaigns, as well as images of program and participant activities.

A significant challenge for the organization was managing the information contained in its various repositories in order to accurately track financials, provide analyses designed to optimize donations, and support program growth. The related stories and images produced by the programs and participants were crucial to documenting successes and growing the organization’s model for providing need-based aid programs. The program stories offered compelling marketing material that complemented the routine business data documenting the impact of donations and the work of personnel. Combining measurable results with individual success stories about the children and families assisted was critical for sustaining trust within the donor community.

In addition to enterprise databases housing information about its donors, partners, and service recipients, the business maintained a corporate intranet, a public website, a digital media repository, network file shares, and shared portal workspaces as sources of unstructured information. Despite shared access to data applications and content repositories, important business information was not easily found by users. Business reports generated by different departments produced inconsistent analyses of strategic topics, such as clients, products, and services, because the departments defined related terms inconsistently.

The evaluation demonstrates the capability to centrally manage a controlled vocabulary of strategic terms and metadata values, and to syndicate those terms to subscribing applications with reusable code for custom navigation and controlled vocabulary choice lists. Commercially available products are used to store standard terms and relationships and apply them to the enterprise systems that need them. Developers use application programming interfaces, common programming techniques, and reusable code as needed to extract the required business taxonomy and implement it within the system architecture and user interface. Through consistent content tagging and classification, this practice improves the findability and presentation of related information for users regardless of the content repository and the information format, structured or unstructured.

The Importance of the Business Taxonomy

In order to address the challenges of information findability, Project Performance Corporation (PPC) recommends the use of a business taxonomy design over a traditional taxonomy design. Unlike a traditional taxonomy, which may classify a deep and complex set of terms, a business taxonomy focuses on simpler, more intuitive topics familiar to enterprise users regardless of their role or experience. These topics provide an easy means to classify business content, promoting findability of information. The topics should represent key business terms independent of the format of the information source, structured or unstructured, and be applicable across information systems. Core business activities and functions are thus described in simple terms and used to group similar information across the business.
When the business taxonomy is applied consistently across all enterprise applications, users are more successful in finding the information they need. Designing the proper taxonomy for users is only part of the issue. The taxonomy must also be accessible to the systems that will apply the defined vocabulary, which enables enterprise-wide use and improves the findability of information. A business application capturing structured data is designed to minimize data duplication and reuse shared reference values; it accomplishes this by controlling the input and selection choices for users. Similar benefits come from applying a standard, controlled hierarchy of terms to describe and organize unstructured information as it is published.

Because of its simple design, a business taxonomy can be maintained using outlines and spreadsheets and used as needed in a selected application. Portals and intranets, common applications for a business taxonomy design, may provide their own internal method to manage terms and their relationships to one another. Business taxonomies are often associated with intranets, extranets, websites, and portals, but they may be applied wherever simple, intuitive organization of information for identified audiences is required. The taxonomies should reflect key topics and concepts that are useful for describing business activities and strategic interests, so they tend to relate closely to the sources already used to organize information within the business.

However, the issue for many organizations is that each application repository maintains its own taxonomy and metadata scheme, and terms are not necessarily used consistently among them. For instance, a label used for location, such as state or country, may be named differently among applications, and the choices used for classification may vary, with some using full names and others using abbreviations (e.g., NY versus New York). Sales staff may describe their functions and organize information in relation to business activities differently from those in Manufacturing. As a result, the navigation and categorization of each supporting application will vary. Confusion and complexity result as employees and customers attempt to use each application to find the information they need from the different departments. Familiar and consistent terms, applied across all information sources, can alleviate the problem.

The Value of a Taxonomy Management Tool

A business taxonomy and metadata design standardizes and defines terms and their relationships in a way that is intuitive to business users. These terms reflect familiar business vocabulary that can be used to classify and organize information contained in both structured and unstructured repositories. Managing even the simplest taxonomy intended for an enterprise system, such as a portal, by using paper or spreadsheets presents challenges once the taxonomy changes to reflect business activities and information requirements. Populating the terms in a taxonomy management tool offers the potential for immediate and long-term benefits on two fronts: 1) it offers application owners and administrators the ability to maintain the taxonomy centrally, and 2) it offers those administrators the ability to implement changes uniformly in multiple applications through a service that pushes taxonomy changes to the subscribing applications. If designed and managed correctly, this can provide an organization with consistent navigation and metadata “tagging” across all applications within the enterprise.
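The two benefits named above, central maintenance and uniform propagation, can be illustrated with a minimal publish/subscribe sketch. It is not drawn from any particular taxonomy product: the class `TaxonomyHub`, the helper `make_receiver`, and the two stand-in applications are hypothetical names chosen for illustration, and a real tool would push changes over its own service interface rather than through in-process callbacks.

```python
from typing import Callable, Dict, List

# Hypothetical callback type: a subscribing application receives a list name
# and the full, current set of approved terms for that list.
Receiver = Callable[[str, List[str]], None]


class TaxonomyHub:
    """Illustrative central store that pushes each change to all subscribers."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[str]] = {}
        self._receivers: List[Receiver] = []

    def subscribe(self, receiver: Receiver) -> None:
        self._receivers.append(receiver)

    def update(self, list_name: str, terms: List[str]) -> None:
        # The change is made once, centrally, then pushed unchanged to everyone.
        self._lists[list_name] = list(terms)
        for receive in self._receivers:
            receive(list_name, list(terms))


def make_receiver(store: Dict[str, List[str]]) -> Receiver:
    """Build a stand-in application that keeps its own copy of each list."""
    def receive(list_name: str, terms: List[str]) -> None:
        store[list_name] = terms
    return receive


portal_lists: Dict[str, List[str]] = {}
intranet_lists: Dict[str, List[str]] = {}

hub = TaxonomyHub()
hub.subscribe(make_receiver(portal_lists))
hub.subscribe(make_receiver(intranet_lists))

hub.update("State", ["New York", "Virginia"])
hub.update("State", ["New York", "Virginia", "Maryland"])  # one edit, two apps
```

After the second update, both stand-in applications hold the same three state names, spelled the same way, without either list having been edited by hand.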
With a centrally managed taxonomy, applications no longer exist in verticals that are navigated differently and cannot “talk” to each other in order to link related information. Without a centralized management tool, the business taxonomy must be created manually in each of the consuming business applications (e.g., portals, websites, collaboration workspaces), each of which uses a different application interface. If these applications are to work well together, they need to share the common controlled vocabulary that users rely on to classify, organize, and find information about the business consistently. Without a common vocabulary, their information cannot be reliably related. Furthermore, without a taxonomy management tool, any change to the taxonomy needs to be replicated identically in each application. In effect, copies of the business taxonomy are created that may be subject to unapproved changes, and the systems can quickly fall out of sync with each other.

Governance processes are also necessary to ensure the taxonomy design meets the ongoing needs of the users. Once a system exists to store and use the taxonomy, policies and procedures must be in place to manage change, because manually maintaining taxonomies in multiple applications creates a massive administrative burden. Governance that represents all units of the business and all types of users requires direct meetings and collaboration to communicate proposed changes among stakeholders. This quickly becomes a tedious, labor-intensive process as business terms are added, new relationships are created, and existing terms are modified in order to support enterprise applications. A taxonomy management tool allows stakeholders to participate in governance remotely and to monitor proposed changes as terms emerge to reflect business activities. A management tool enforces policies designed to maintain an accurate and consistent taxonomy, and it allows decentralized ownership of specific terms and shared input into the overall design. Stakeholders who review change requests that affect the taxonomy and its subscribing systems can remain informed without the need for frequent face-to-face meetings.

Implementing Taxonomy as a Service

The concept of Taxonomy as a Service presented in this paper draws on principles of Service Oriented Architecture (SOA) in order to promote reusability of controlled terms and interoperability between applications. Rather than embedding controlled, managed vocabularies in each consuming application, a taxonomy management tool becomes the central repository for the strategic terms, concepts, and reference values needed by any enterprise application. The management tool publishes approved terms to provide:

• User navigation: the defined browsing hierarchy within an application;
• Choice lists, or controlled values, for classifying (tagging) content; and
• Filters for controlling and grouping the data presented in views to users.
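The three uses listed above can all be derived from one set of term records, which is the point of keeping the terms in a single place. The following is a minimal sketch under stated assumptions: the `Term` record, the sample terms, and the three helper functions are hypothetical and loosely modeled on the non-profit business case, not taken from any product.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Term:
    label: str
    parent: Optional[str] = None  # None marks a top-level topic


# Hypothetical approved terms, invented for illustration only.
TERMS = [
    Term("Programs"),
    Term("Sponsorship", parent="Programs"),
    Term("Volunteer Campaigns", parent="Programs"),
    Term("Finance"),
    Term("Donations", parent="Finance"),
]


def navigation(terms: List[Term]) -> Dict[str, List[str]]:
    """Browsing hierarchy: every term mapped to its direct child terms."""
    tree: Dict[str, List[str]] = {t.label: [] for t in terms}
    for t in terms:
        if t.parent is not None:
            tree[t.parent].append(t.label)
    return tree


def choice_list(terms: List[Term], parent: str) -> List[str]:
    """Controlled values a user may pick when tagging content under a topic."""
    return sorted(t.label for t in terms if t.parent == parent)


def filter_items(items: List[dict], tag: str) -> List[dict]:
    """View filter: keep only the items tagged with the given term."""
    return [item for item in items if tag in item["tags"]]
```

A portal page might call `navigation(TERMS)` to build its menu, `choice_list(TERMS, "Programs")` to populate a tagging drop-down, and `filter_items(...)` to group a view, all from the same approved terms.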
The functionality of governance and change management is provided by one tool rather than many when using Taxonomy as a Service. Business requirements determine which terms and choice lists maintained in the management tool are used by which applications, and the application developers determine the appropriate methods for implementation according to application specifications. Terms that are appropriate for strategic, enterprise-wide information classification and sharing can be stored and managed centrally, while those that are relevant to specific business units can be managed within the application(s) that support those discrete units.

Business applications designed to manage and report structured data are not necessarily compatible or integrated with portal solutions, particularly with regard to decentralized information management and collaboration features. One pitfall of relying on collaborative features to improve findability of information is permitting staff to devise unique classification schemes, or requiring no minimum “tagging” of content in the first place. While this may work well for the team, it inhibits the findability of useful information by other members of the business and exemplifies the overarching challenge of organizing and finding enterprise content. In effect, this practice replicates the findability problem the portal was supposed to solve.

Instead, when a minimum enterprise standard of approved metadata tags is enforced, portal and cross-site searches can produce more relevant results. Users are provided a consistent set of choices, stored in the shared taxonomy management tool, to describe files and content intuitively while also representing concepts and activities important to the business.
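One way to picture a minimum enterprise standard of approved tags is a check that runs when content is submitted. The sketch below is an assumption about how such a check could look, not a description of any portal’s behavior: the field names, the country and content-type values, and the function `validate_tags` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical enterprise minimum: every item needs one value for each field,
# and each value must come from the centrally approved choice list.
APPROVED_CHOICES: Dict[str, List[str]] = {
    "Country": ["Kenya", "Guatemala", "India"],
    "Content Type": ["Financial Report", "Progress Report", "Image"],
}


def validate_tags(tags: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the tags meet the standard."""
    problems = []
    for field_name, allowed in APPROVED_CHOICES.items():
        value = tags.get(field_name)
        if value is None:
            problems.append(f"missing required tag: {field_name}")
        elif value not in allowed:
            problems.append(f"unapproved value for {field_name}: {value}")
    return problems


ok = validate_tags({"Country": "Kenya", "Content Type": "Image"})
bad = validate_tags({"Country": "KE"})  # abbreviation plus a missing field
```

Here `ok` is an empty list, while `bad` reports both the unapproved abbreviation and the missing content type, which is the kind of feedback that keeps team-specific schemes from creeping into shared repositories.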
While portal solutions provide basic controls to create standard lists of terms and choices, they are limited in their capability to manage and document changes in business terminology as it evolves and to publish these lists to other applications. Embedding controlled, managed vocabularies independently in a portal application may limit the overall usefulness of the business taxonomy and increase the complexity of its use. This limitation can be eliminated by a taxonomy management tool that is programmatically connected to the portal solution and can publish the centrally managed and approved terms as custom columns to assigned sites, update columns, and manage the allowable choices for classifying stored items. In addition, site-by-site variances from the approved taxonomy can be reported back to the management tool so administrators can monitor compliance with approved standards.

While some level of standardization can be implemented across sites within the portal solution architecture, the portal’s role in this case is that of a subscribing enterprise application rather than the host of the enterprise taxonomy. This promotes a higher degree of term reusability in other enterprise applications, as opposed to just within the portal application itself. For example, the administrators of a data application might identify a new value required to label a strategic enterprise program. Updating a list to enable entry and classification of transactional data can result in a change notice to the central management tool, thereby making the new value available to other enterprise applications and to portal users, who will submit unstructured files about the new program.

A management tool’s application programming interface (API) provides extensibility, allowing a variety of methods not only to read from the tool but also to write to it. As such, it enables the use of reusable code to publish current vocabulary and reference lists to the subscribing applications, and it allows these applications to push changes to the central repository.
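The read and write paths described above can be sketched as a small in-memory service. This is an illustrative stand-in only: `TaxonomyService`, its method names, the single-call approval step, and the sample program name are hypothetical, and a commercial tool’s actual API and governance workflow will differ.

```python
from typing import Dict, List, Tuple


class TaxonomyService:
    """Hypothetical stand-in for a management tool's read/write API."""

    def __init__(self) -> None:
        self._approved: Dict[str, List[str]] = {}
        # Each pending change is (list name, proposed value, requesting app).
        self._pending: List[Tuple[str, str, str]] = []

    def get_choices(self, list_name: str) -> List[str]:
        # Read path: subscribers only ever see approved values.
        return list(self._approved.get(list_name, []))

    def propose(self, list_name: str, value: str, requester: str) -> None:
        # Write path: an application submits a change notice for review.
        self._pending.append((list_name, value, requester))

    def approve_all(self) -> int:
        # Governance step, collapsed into one call here for brevity.
        approved = 0
        for list_name, value, _requester in self._pending:
            values = self._approved.setdefault(list_name, [])
            if value not in values:
                values.append(value)
                approved += 1
        self._pending.clear()
        return approved


service = TaxonomyService()
service.propose("Program", "Clean Water Initiative", requester="data-app")
before = service.get_choices("Program")  # proposal is not yet visible
service.approve_all()
after = service.get_choices("Program")   # now available to every subscriber
```

Keeping proposals separate from approved values means a data application can request a new term without that term reaching portal users until the change has been reviewed.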
The ability to reuse code, written once and leveraged across multiple business applications, speeds up development and ensures consistent use of the management tool’s functionality across all applications. In effect, this allows the management tool to serve as a central reference authority for all strategic terms, vocabularies, reference data, and attributes contained in web-service-enabled applications. Applications can read these values and return lists and relationships between all the enterprise metadata. In the business case example, filtered views of data, originating from a master data management application and presented in a custom portal page, may be updated and controlled programmatically with values maintained in the management tool. Functional, departmental, and other role designations may be used to provide custom information, data views, and file links to users based on their assigned profiles. This may promote targeting and personalization of business information, giving users rapid access to the information that is most useful and relevant to them as it becomes available.

Summary

Organizing, finding, and moving information throughout a business is a critical function. The business taxonomy provides a foundation for improving the ability of users to find the information they need so they can take appropriate action. Whether the information is structured or unstructured, the use of consistent terminology in enterprise repositories is important for linking related information from various sources. A taxonomy management tool provides the central storage and governance capabilities needed to effectively maintain key business terms that require syndication to various systems.
The long-term value of reusable terms and code, improved findability, and access to relevant business information are important factors for an organization’s leadership to consider when investing in a taxonomy management tool or an alternative solution. Deploying taxonomy as a service creates new and improved opportunities for linking related enterprise information, both structured and unstructured, contained in multiple sources, and for giving users a more complete, accurate view of the information they need to conduct their daily work. This approach may offer new pathways for users to find and discover important business information regardless of their preferred method (search or browse) or application source. It also gives those who create, or are responsible for, new business information better opportunities to reach more colleagues in the enterprise as a result of improved classification and findability.

The use of a separate application for managing the business taxonomy, and the implementation of taxonomy as a service, may be justified in cases where multiple information repositories exist and a high degree of application integration and information findability is required to support long-term consistency and reuse of strategic business terms across the enterprise.

For more information, contact Richard Iams at 703-748-7116 or [email protected].

About the Author

Richard Iams is a Senior Analyst in PPC’s internationally known Knowledge Management Practice. He has experience supporting information system projects in a variety of integrated software and hardware environments. His focus is on deploying solutions that improve organizational collaboration, including business taxonomy design, portal technologies, and communities of practice, and on providing sustainable management strategies for implementations. He is an expert in the requirements gathering, selection, and implementation of taxonomy tools and classifiers.