Big Data: Opportunities, Strategies and Challenges Executive Summary Gregg Barrett
Acknowledgment
This report draws extensively on, and focuses on, the work and viewpoints of industry participants including: Diversity Limited, Economist Intelligence Unit, Gartner, HBR, Hortonworks, IBM, ITG, Intel, McKinsey, Ordnance Survey, John Standish Consulting, Christopher Bienko @ IBM, Dirk deRoos @ IBM, John Choi @ IBM, Marc Andrews @ IBM, Paul Zikopoulos @ IBM, Rick Buglio @ IBM, Strategy Meets Action
References are included in-text as well as in the References section at the end of the report.
Challenges facing the industry
Difficult and uncertain economic conditions, low interest rates, decreasing underwriting profitability, higher combined ratios and low investment returns are placing insurers under stress. Insurers also have to confront commoditisation of the business, more informed consumers, high customer churn rates, new distribution channels and strong competition. If this were not enough, natural perils, increases in regulatory intervention and greater demands for transparency by regulators, together with ever-increasing compliance requirements, are placing immense strain on the capabilities of insurers. According to IBM (2013), to thrive in this environment insurers must gain a specific set of capabilities that will allow them to:
- Build a customer-centric business model
- Find profitable ways to sustain growth
- Develop new, competitively priced products
- Increase claims efficiency and effectiveness
- Improve capital management and investment decisions
- Improve risk management and regulatory reporting
(IBM, 2013, pg. 2)
Insurers are turning to analytics
The business of insurance is based on analysing data to understand and evaluate risks. Two important insurance professions, actuarial and underwriting, emerged at the beginning of the modern insurance era in the 17th century. These both revolve around and are dependent upon the analysis of data. (Strategy Meets Action, 2012, pg. 3)
While the insurance industry has long been recognized for analysing data, the new news involves the overwhelming amount of data that is now available for analysis and the sophistication of the technology tools that can be used to perform the analysis. The opportunities for advanced analysis are many and the potential business impact is enormous. (Strategy Meets Action, 2013, pg. 3)
The Concept of Big Data
In simple terms, Big Data refers to a data environment that cannot be handled by traditional technologies. Big Data is often described in terms of the three V’s, and if you are at IBM, it is likely to be the four V’s. Figure 1 below illustrates the IBM four V representation of Big Data:
Figure 1: Big Data in dimensions
Figure 1. Four dimensions of big data. Copyright 2012 by IBM. Reprinted with permission.
Volume refers to the quantity (gigabytes, terabytes, petabytes etc.) of data that organizations are trying to harness. Importantly, there is no specific measure of volume that defines Big Data, as what constitutes truly “high” volume varies by industry and even geography. What is clear is that data volumes continue to rise.
Variety refers to different types (forms) of data and data sources. Data types include numeric, text, image, audio, web, log files etc., whether structured or unstructured. The growth of data sources such as social media, smart devices, sensors and the Internet of Things has not only resulted in increases in the volume of data but increases in the types of data as well.
Velocity refers to the speed at which data is created, processed and analysed. Velocity impacts latency, which is the lag time between when data is created or captured, and when it is processed into an output form for decision making purposes. Importantly, certain types of data must be analysed in real time to be of value to the business, a task that places impossible demands on traditional systems where the ability to capture, store and analyse data in real time is severely limited.
Veracity refers to the level of reliability associated with certain types of data. According to IBM some data is inherently uncertain, for example: sentiment and truthfulness in humans; GPS sensors bouncing among the skyscrapers of Manhattan; weather conditions; economic factors; and the future. When dealing with these types of data, no amount of data cleansing can correct for it. Yet despite uncertainty, the data still contains valuable information. The need to acknowledge and embrace this uncertainty is a hallmark of Big Data. (IBM, 2012, pg. 5)
The Big Data Impact
According to McKinsey (2011), Big Data creates value in several ways:
- Creating transparency
- Enabling experimentation to discover needs, expose variability, and improve performance
- Segmenting populations to customize actions
- Replacing/supporting human decision making with automated algorithms
- Innovating new business models, products, and services
To understand the impact at an organisational level, Erik Brynjolfsson with a team at MIT, working in partnership with McKinsey, Lorin Hitt at Wharton and the MIT doctoral student Heekyung Kim, conducted structured interviews with executives at 330 public North American companies about their organizational and technology management practices, and gathered performance data from their annual reports and independent sources. Based on the analyses they conducted one relationship stood out: The more companies characterized themselves as data-driven, the better they performed on objective measures of financial and operational results. In particular, companies in the top third of their industry in the use of data-driven decision making were, on average, 5% more productive and 6% more profitable than their competitors. This performance difference remained robust after accounting for the contributions of labour, capital, purchased services, and traditional IT investment. (HBR, 2012)
Further, an IBM study based on survey responses of more than 1,000 business and IT executives from more than 60 countries revealed four transformative shifts in the use of Big Data:
1. A solid majority of organizations are now realizing a return on their Big Data investments within a year.
2. Customer centricity still dominates analytics activities, but organizations are increasingly solving operational challenges using Big Data.
3. Integrating digital capabilities into business processes is transforming organizations.
4. The value driver for Big Data has shifted from volume to velocity.
(IBM, 2014, pg. 1)
While Big Data has resulted in significant opportunity, it has also brought new challenges. According to Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014), some challenges include:
- Greater volumes of data than ever before: placing more demands on the organisation’s security plan.
- The experimental and analytical usage of the data: democratizing data within the organisation requires building trust into the Big Data platform. A data governance framework covering lineage, ownership etc. is required for any successful Big Data project.
- The nature and characteristics of Big Data: the data consists of more sensitive personal details than ever before, raising governance, risk and compliance concerns.
- The adoption of technologies that are still maturing: Big Data technologies like Hadoop (and much of the NoSQL world) do not have all of the enterprise hardening from a security perspective that’s needed, and there’s no doubt compromises are being made.
A look at Big Data in Insurance
Exploration and discovery
Big Data necessitates an approach of exploration and discovery. As articulated by Gartner (2013), business analysts have typically worked to a requirements-based model, answering clearly-defined business questions. Big Data, however, demands a different approach, using opportunistic analytics and exploring answers to ill-formed or non-existent questions. (Gartner, 2013, pg. 1)
Figure 2: Culture change - Discovery versus control
Figure 2. A better assessment of the data around and connected to a single piece of information enables a more complete, in-context understanding. Copyright 2013 by IBM. Reprinted with permission.
Moving to a data driven culture
Gartner (2014) has found that many insurance IT departments lack a consistent, enterprise-wide business intelligence and data management strategy because of siloed, line-of-business-centric IT systems. (Gartner, 2014, pg. 6) In embracing the Big Data paradigm, the Economist Intelligence Unit (2013) suggests moving towards what it calls a “data driven culture”. According to the report, in promoting a data driven culture organisations should consider:
- Data-driven companies place a high value on sharing. Companies own data, not employees. Data are a resource that can power growth, not something to be hoarded.
- Shared data should be utilised by as many employees as possible, which in practice means rolling out training wherever it is needed.
- Data collection needs to be a primary activity across departments.
- Perhaps most importantly, implementing a data driven culture requires buy-in from the top; without that, little will change.
(Economist Intelligence Unit, 2013, pg. 11)
Emerging techniques in Big Data on the insurance front
According to Ordnance Survey (2013) the following are some of the emerging techniques being deployed by insurers:
- Predictive modelling: already well used by insurance companies, this works even better when more data is fed into the model.
- Data-clustering: automated grouping of similar data points can provide new insights into apparently familiar situations. Livehoods.org is an example of how social media and ‘machine learning’ can reveal previously-unseen patterns.
- Sentiment analysis: textual keyword analysis can help analyse the mood of Twitter chatter on a given topic or brand.
- Web crawling: sophisticated programmes that can identify an individual’s ‘web footprint’ as a result of posting on social media websites, blogs and photo-sharing services. Using data-matching, this can be linked to public records and data from other third parties to build a multi-dimensional profile of an individual.
(Ordnance Survey, 2013, pg. 22)
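As a rough illustration of the keyword-based sentiment analysis mentioned in the list above, the sketch below scores short messages by counting matches against two small word lists. The word lists, the sample messages and the function names `keyword_sentiment` and `label` are hypothetical and chosen only for illustration; production systems would typically rely on much larger lexicons or trained models.

```python
import re

# Hypothetical keyword lists; a real lexicon would be far larger.
POSITIVE = {"great", "fast", "helpful", "love", "easy"}
NEGATIVE = {"slow", "rude", "denied", "hate", "terrible"}


def keyword_sentiment(text: str) -> int:
    """Return (#positive keywords - #negative keywords) found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label(score: int) -> str:
    """Map a numeric score to a coarse mood label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample messages about a fictional insurer.
messages = [
    "Claim paid in two days, great and helpful service",
    "Terrible experience, my claim was denied and the agent was rude",
    "Renewed my policy today",
]

for message in messages:
    print(label(keyword_sentiment(message)), "|", message)
```

Simple keyword counting ignores negation and sarcasm, which is one reason the veracity of this kind of data has to be treated with care.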
Data protection, a lurking risk
In addition to the transformative shifts in the use of Big Data mentioned earlier, the same IBM report found that respondents rated data protection lowest on the list of data priorities; only 11 percent of respondents identified it as a “top three” priority. Given the proliferation of large-scale data breaches in recent years, organizations risk the loss of customer and business partner confidence if adequate precautions are not taken to safeguard data, as well as legal and remediation fees. Moreover, business leaders should thoughtfully consider how their organizations use data to minimize any potential backlash in perceived privacy infringement. (IBM, 2014, pg. 9)
Skills gap
The Big Data environment requires a skill set that is new to most organisations, requiring people with deep expertise in statistics and machine learning, as well as managers and analysts who know how to operate companies by using insights from Big Data. According to McKinsey (2011), the United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyse Big Data and make decisions based on their findings. In addressing the skills gap, IBM (2014) suggests organisations should consider the following:
Learn from the best within your organization.
- Tap into the pockets of talent within the organization - those few using predictive or prescriptive analytics - to expand the skills of others.
- Create a strong internal professional program to arm analysts and executives who already understand the organization’s business fundamentals with analytics. Sharing resources and knowledge is a cost-effective way to build skills and helps limit the need to seek talent elsewhere.
Externally supplement skills based on business case. Not all organizations need a data scientist full time; the same is true for niche analytics skills that may be used only to solve specific challenges.
- Organizations should invest in the talent and skills they need to solve the majority of their analytics demands.
- Consider vendors to supplement critical niche skills that are hard to find and expensive to employ.
(IBM, 2014, pg. 15)
Big Data technologies
Apache Hadoop is the starting point for most organizations wanting to take the plunge into Big Data analysis.
The Hadoop ecosystem
In their book, Big Data Beyond the Hype, Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014) classify Hadoop as an ecosystem of software packages that provides a computing framework. These include MapReduce, which leverages a K/V (key/value) processing framework (don’t confuse that with a K/V database); a file system (HDFS); and many other software packages that support everything from importing and exporting data (Sqoop) to storing transactional data (HBase), orchestration (Avro and ZooKeeper), and more. When you hear that someone is running a Hadoop cluster, it’s likely to mean MapReduce (or some other framework like Spark) running on HDFS, but others will be using HBase (which also runs on HDFS). Vendors in this space include IBM (with BigInsights for Hadoop), Cloudera, Hortonworks, MapR, and Pivotal. On the other hand, NoSQL refers to non-RDBMS SQL database solutions such as HBase, Cassandra, MongoDB, Riak, and CouchDB, among others. (Zikopoulos, deRoos, Bienko, Buglio, Andrews, 2014, pg. 38)
Key components of many Big Data environments:
MapReduce
MapReduce is a system for parallel processing of large data sets.
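A minimal sketch of the map and reduce phases in plain Python, assuming a toy workload of counting records per key. The input partitions and the function names `map_partition` and `reduce_counts` are hypothetical, and a local thread pool stands in for the cluster of machines; this illustrates the pattern only and is not Hadoop's implementation.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical input, already split into partitions (one per worker).
partitions = [
    ["motor", "home", "motor"],
    ["home", "travel"],
    ["motor", "travel", "travel"],
]


def map_partition(records):
    """Map step: count the records in one partition, keyed by value."""
    return Counter(records)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


# The map step runs on each partition independently (threads here,
# separate machines in a real cluster); the reduce step combines results.
with ThreadPoolExecutor() as pool:
    partial_counts = list(pool.map(map_partition, partitions))

totals = reduce(reduce_counts, partial_counts, Counter())
print(dict(totals))
```

Because each partition is counted without reference to the others, adding more workers lets the map step scale with the volume of data, while the reduce step only has to merge small partial results.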
According to IBM (2015), as an analogy, you can think of map and reduce tasks as the way a census was conducted in Roman times, where the census bureau would dispatch its people to each city in the empire. Each census taker in each city would be tasked to count the number of people in that city and then return their results to the capital city. At the capital, the results from each city would be reduced to a single count (sum of all cities) to determine the overall population of the empire. This mapping of people to cities, in parallel, and then combining the results (reducing) is much more efficient than sending a single person to count every person in the empire in a serial fashion. (IBM, 2015)
Hadoop
MapReduce is the heart of Hadoop. Hadoop is an open source software stack that runs on a cluster of machines. Hadoop provides distributed storage and distributed processing for very large data sets.
NoSQL
NoSQL is a database environment. Using the definition from Planet Cassandra (2015), a NoSQL database environment is, simply put, a non-relational and largely distributed database system that enables rapid, ad-hoc organization and analysis of extremely high-volume, disparate data types. NoSQL databases were developed in response to the sheer volume of data being generated, stored and analyzed by modern users (user-generated data) and their applications (machine-generated data). (Planet Cassandra, 2015)
Spark
What is Spark and what does it mean for Hadoop?
IBM (2014) refers to Spark as an open source engine for fast, large-scale data processing that can be used with Hadoop, boasting speeds up to 100 times faster than Hadoop MapReduce in memory, or 10 times faster on disk. As with the early enthusiasm around Hadoop, Spark should not be thought of as a singular platform for analytics, as it can be used with existing investments for the widest variety of data types and analytics workloads.
Figure 3: Example of a Big Data environment
Figure 3. Application Enrichment with Hadoop. Copyright 2013 by Hortonworks Inc. Reprinted with permission.
The impact of Hadoop
According to IBM (2015), Hadoop changes the economics and dynamics of large-scale computing by enabling a solution that is:
- Scalable: Add new nodes as needed without changing data formats, how data is loaded, how jobs are written or the applications on top.
- Cost-effective: Hadoop brings massively parallel computing to commodity servers. The result is a significant decrease in the cost per terabyte of storage, which in turn makes it affordable to model all your data.
- Flexible: Hadoop is schema-less, and can absorb any type of data, structured or not, from a number of sources. Data from multiple sources can be joined and aggregated in arbitrary ways, enabling deeper analyses than any one system can provide by itself.
- Fault-tolerant: When you lose a node, the system redirects work to another location of the data and continues processing without missing a beat.
(IBM, 2015, pg. 2)
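The fault-tolerance point above can be pictured with a toy model in which each block of data is stored on several nodes and work is simply assigned to any surviving copy. The block names, node names and the `assign_work` helper are hypothetical; this is a simplified sketch of the idea, not how Hadoop schedules work internally.

```python
# Hypothetical block placement: each data block is stored on several nodes.
block_replicas = {
    "block-1": ["node-a", "node-b", "node-c"],
    "block-2": ["node-b", "node-c", "node-d"],
    "block-3": ["node-a", "node-c", "node-d"],
}


def assign_work(replicas, live_nodes):
    """Pick, for each block, the first replica that sits on a live node."""
    assignment = {}
    for block, nodes in replicas.items():
        candidates = [node for node in nodes if node in live_nodes]
        if not candidates:
            raise RuntimeError(f"all replicas of {block} are unavailable")
        assignment[block] = candidates[0]
    return assignment


all_nodes = {"node-a", "node-b", "node-c", "node-d"}
before = assign_work(block_replicas, all_nodes)

# Simulate losing node-a: its work moves to another copy of the same data.
after = assign_work(block_replicas, all_nodes - {"node-a"})

print(before)
print(after)
```

In this model processing only stops if every copy of a block is lost at once, which is why keeping multiple copies on commodity hardware can still give a dependable system.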
Hadoop is not without its own set of challenges. According to IBM (2014), there are four key areas of Hadoop that need to mature in order to drive wider adoption. These include:
1) Performance
2) The reduction of skills
3) Data governance
4) Deep integration with existing technologies
Along similar lines, TDWI Research (2015) in a recent survey found respondents struggling with the following barriers to Hadoop implementation:
- Skills gap
- Weak business support
- Security concerns
- Data management hurdles
- Tool deficiencies
- Containing costs
(TDWI Research, 2015)
According to a study by the International Technology Group, organisations need to be particularly mindful of the highly skilled programming requirements demanded by most Hadoop environments. The study notes that:
Although the field of players has since expanded to include hundreds of venture capital-funded startups, along with established systems and services vendors and large end users, social media businesses continue to control Hadoop. Most of the more than one billion lines of code – more than 90 percent, according to some estimates – in the Apache Hadoop stack has to date been contributed by these.
The priorities of this group have inevitably influenced Hadoop evolution. There tends to be an assumption that Hadoop developers are highly skilled, capable of working with “raw” open source code and configuring software components on a case-by-case basis as needs change. Manual coding is the norm. Decades of experience have shown that, regardless of which technologies are employed, manual coding offers lower developer productivity and greater potential for errors than more sophisticated techniques. (ITG, 2013, pg. 2)
Big Data in the context of traditional technologies
The Big Data environment has been brought about by the advancement in technology enabling the processing and storage of the volume, variety, velocity and veracity of data, which is beyond the capabilities of traditional technology.
Big Data supplements traditional systems
As illustrated in Figure 3, the Big Data environment supports traditional technology, extending capabilities into areas previously unsupported. Gartner (2013) suggests that Big Data doesn't replace traditional data and analytics: “…big data technologies are not really replacing incumbents such as business intelligence, relational database management systems and enterprise data warehouses. Instead, they supplement traditional information management and analytics.” (Gartner, 2013, pg. 13)
Examples of three insurance use cases with Big Data
According to Gartner (2013), Big Data and the associated technology have been shown to provide the following benefits:
- Detection and prevention of fraud or other security violations
- High ROI
- Little operational disruption
(Gartner, 2013, pg. 5)
Big Data to fight fraud
According to John Standish Consulting (2013), mobilizing Big Data is gaining wider attention in anti-fraud circles. Insurers are sitting on troves of data, hard and soft. Much is never accessed for fraud-fighting. Insurers can dramatically increase their anti-fraud assertiveness by insightfully accessing, analyzing and mobilizing their large volumes of untapped data. Marshaling analytics and big data with current rules and indicators into a seamless and unified anti-fraud effort creates an expansive world of possibilities.
- Imagine the ability to search a billion rows of data and derive incisive answers to complex questions in seconds.
- Imagine being able to comb through huge numbers of claim files quickly.
- Imagine more-quickly linking numerous ring members and entities acting in well-disguised concert. These suspects likely could not be detected with sole or even primary reliance on basic methods such as fraud indicators.
- Ultimately, imagine analyzing entire caseloads faster and more completely, thus addressing the largest fraud problems and cost drivers in any of an insurer’s coverage territories.
Case study: Fraud at IBC
The Insurance Bureau of Canada (IBC) is the national insurance industry association representing Canada’s home, car and business insurers. Because investigation of cases of suspected automobile insurance fraud often took several years, the company’s investigative services division wanted to accelerate its process. The IBC worked with IBM to conduct a proof of concept (POC) in Ontario, Canada that explored new ways to increase the efficiency of fraud identification. The POC showed how IBM solutions for big data can help identify suspect individuals and flag suspicious claims. IBM solutions also help users visualize relationships and linkages to increase the accuracy and speed of discovering potential fraud. In the POC, more than 233,000 claims from six years were analyzed. The IBM solutions identified more than 2,000 suspected fraudulent claims with a value of CAD41 million. IBM and the IBC estimate that these solutions could save the Ontario automobile insurance industry approximately CAD200 million per year. (IBM, 2012)
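To put the POC figures above in proportion, the short calculation below derives the share of claims flagged and the average value per flagged claim. The inputs are the rounded figures quoted in the case study (each stated as "more than"), so the derived values are indicative only and do not appear in the source.

```python
# Figures quoted in the case study (each stated as "more than").
claims_analysed = 233_000
suspected_fraud_claims = 2_000
suspected_fraud_value_cad = 41_000_000

# Derived values; these are illustrative and do not appear in the source.
flagged_share = suspected_fraud_claims / claims_analysed
average_value_cad = suspected_fraud_value_cad / suspected_fraud_claims

print(f"Share of claims flagged: {flagged_share:.2%}")
print(f"Average value per flagged claim: CAD {average_value_cad:,.0f}")
```

On these figures, fewer than one claim in a hundred was flagged, which suggests why manual review of every claim file is an inefficient way to find fraud.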
Big Data for customer segmentation
Case study: Customer segmentation at Progressive
In July 2012, Progressive Insurance released new findings from an analysis of five billion real-time driving miles, confirming that driving behaviour has more than twice the predictive power of any other insurance rating factor. Loss costs for drivers with the highest-risk driving behaviour are approximately two-and-a-half times the costs for drivers with the lowest-risk behaviour. These results suggest that car insurance rates could be far more personalized than they are today. Progressive has also found that 70% of drivers who have signed up for its Snapshot UBI program pay less for their insurance. The program involves installing a small monitoring device in the car (900,000 drivers have already done this) and driving normally. After the device has collected enough data, customers receive a personalized rate for their insurance. Progressive is currently expanding access to Snapshot to all drivers - not just Progressive customers - who can take a free test drive of the technology and after 30 days find out whether their own driving behaviour can lower the price they pay for insurance. The problem with today's less granular systems of customer classification in the property and casualty insurance market is that the majority of drivers who present a lower risk subsidize the minority of higher-risk drivers. (Gartner, 2013, pg. 5)
Big Data for underwriting
Case study: Improving underwriting decisions
A large global property casualty insurance company wanted to accelerate catastrophe risk modelling in order to improve underwriting decisions and determine when to cap exposures in its portfolio. The current modelling environment was too slow and unable to handle the large-scale data volumes that the company wanted to analyze. The goal was to run multiple scenarios and model losses in hours, but the current environment required up to 16 weeks. As a result, the company conducted analysis only three or four times per year. A proof of concept demonstrated that the company could improve performance by 100 times, accelerating query execution from three minutes to less than three seconds. The company decided to implement IBM solutions for big data, and can now run multiple catastrophe risk models every month instead of only three or four times per year. Once data is refreshed, the company can create “what-if” scenarios in hours rather than weeks. With a better and faster understanding of exposures and probable maximum losses, the company can take action sooner to change loss reserves and optimize its portfolio. (IBM, 2013, pg. 7)
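The two performance figures quoted above (a 100-fold improvement, and three minutes reduced to less than three seconds) can be reconciled with simple arithmetic. The sketch below assumes the query originally took exactly three minutes; the derived 1.8-second figure is an inference and is not stated in the source.

```python
# Query times quoted in the case study.
before_seconds = 3 * 60   # "three minutes"
after_seconds = 3         # upper bound: "less than three seconds"

# Speed-up implied if the query took exactly three seconds.
minimum_speedup = before_seconds / after_seconds

# Query time that a 100-fold improvement would imply (derived, not quoted).
implied_after_seconds = before_seconds / 100

print(f"Speed-up at three seconds: {minimum_speedup:.0f}x")
print(f"Query time at 100x: {implied_after_seconds:.1f} s")
```

A run time of exactly three seconds would be a 60-fold gain, so the 100-fold claim is consistent with a query completing in roughly 1.8 seconds, which does fall under "less than three seconds".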
Costs associated with typical Big Data implementations
Although a Big Data environment such as that illustrated in Figure 3 can be constructed from open source software, such as Hadoop and a NoSQL database such as MongoDB, there are still substantial costs involved. These include:
1) Hardware costs
2) IT and operational costs in setting up a machine cluster and supporting it
3) Cost of personnel to work on the ecosystem
These costs are NOT trivial for the following reasons:
- Dealing with cutting edge technology and finding people who know the technology is challenging
- The technology introduces a different programming paradigm, frequently requiring additional training of existing engineering teams
- These technologies are new and still evolving and are not yet mature in the enterprise ecosystem
- The hardware is server grade and large clusters require resources including network administration, security administration, system administration etc., as well as data centre operational costs including electricity, cooling etc.
Infrastructure as a Service (IaaS)
One consideration that can mitigate the cost implications of hardware and support personnel is the use of a cloud offering. As pointed out by Intel (2015), clouds are already deployed on pools of server, storage, and networking resources and can scale up or down as needed. Cloud computing offers a cost-effective way to support Big Data technologies and the advanced analytics applications that can drive business value. Diversity Limited (2010) defines Infrastructure as a Service (IaaS) as “a way of delivering Cloud Computing infrastructure – servers, storage, network and operating systems – as an on-demand service. Rather than purchasing servers, software, datacenter space or network equipment, organisations instead buy those resources as a fully outsourced service on demand.”
Recommended course for Big Data
IBM (2015) recommends that organisations consider the following when embarking on the Big Data journey:
1. Choose projects with a high potential return on investment, for which data sources are readily accessible and already in electronic form, and establish clear goals and quantifiable metrics. There should be a strong business need for making the resulting data easily accessible to broad user communities.
2. The data architecture should be extensible to allow addition of other data sources, including streaming data, as needed.
3. As the project continues, create a feedback loop to inform other departments of insights derived about products, marketing and sales. This helps promote the value of analytics, builds a culture that focuses on deriving even better information from analytics, and instils a high level of trust in the data’s veracity and completeness.
4. Surround Hadoop with a strong ecosystem of Big Data tools and analytics capabilities. The richer the portfolio of capabilities in the selected Hadoop solution, the more freedom teams have to solve problems and advance the organization’s insights.
(IBM, 2015, pg. 4)
Recommended Big Data platform
- Utilise an IaaS offering
- Explore the MapR and the IBM BigInsights offerings further.
IBM BigInsights example:
IBM BigInsights is based on 100 percent open source Hadoop. It extends Hadoop with enterprise-grade technology including administration and integration capabilities, visualization and discovery tools as well as security, audit history and performance management. According to IBM, the BigInsights platform offers:
- Increased performance: An average 4 times performance gain over open source Hadoop.
- Usability: BigInsights is optimized for a wide range of roles, including integration developers, administrators, data scientists, analysts and line-of-business contacts.
- Integrated with IBM Watson™ Foundations big data platform: BigInsights comes bundled with search and streaming analytics capabilities.
- Analytics: Built-in Hadoop analytics capabilities for machine data, social data, text and Big R enable you to locate actionable insights from data in the Hadoop cluster rather than having to move the data around.
Figure 4: Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications – Averages for All Installations
Figure 4. Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications. Copyright 2013 by the International Technology Group. Reprinted with permission.
Conclusion
Big Data is having a substantive impact on the P&C insurance industry. Insurers are combining Big Data and analytics to overcome many of the challenges confronting the industry, and to support new capabilities. Although implementing a Big Data platform is not without its challenges, through careful consideration the organisation should be able to generate an appreciable return on its Big Data and analytics initiative. The availability of IaaS platforms for Big Data reduces many of the initial risks that would traditionally be associated with such projects. In addition, the Big Data offerings from MapR Technologies and IBM, based on initial research, appear to be strong candidates for evaluation.
References
Diversity Limited. (2010). Moving your infrastructure to the cloud. [pdf]. Retrieved from http://diversity.net.nz/wp-content/uploads/2011/01/Moving-to-the-Clouds.pdf
Economist Intelligence Unit. (2013). Fostering a data-driven culture. [pdf]. Retrieved from http://www.economistinsights.com/search/node/sites%20default%20files%20downloads%20Tableau%20DataCulture%20130219%20pdf
Gartner. (2013). Characteristics of the traditional versus the big data approach. [Table]. Retrieved from Gartner. (2013). Big data business benefits are hampered by 'culture clash'. [pdf]. Retrieved from https://www.gartner.com/doc/2588415
Gartner. (2013). Use big data to solve fraud and security problems. [pdf]. Retrieved from https://www.gartner.com/doc/2397715
Gartner. (2013). How IT should deepen big data analysis to support customer-centricity. [pdf]. Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2013). Consistent view of the customer for big data. [Diagram]. Retrieved from Gartner. (2013). How IT should deepen big data analysis to support customer-centricity. [pdf]. Retrieved from https://www.gartner.com/doc/2531116
Gartner. (2014). Agenda overview for p&c and life insurance. [pdf]. Retrieved from https://www.gartner.com/doc/2643327
HBR. (2012). Big Data: The management revolution. [pdf]. Retrieved from https://hbr.org/2012/10/big-data-the-management-revolution/ar
Hortonworks. (2013). Application enrichment with hadoop. [Diagram]. Retrieved from Hortonworks. (2013). Apache Hadoop patterns of use. [pdf]. Retrieved from http://hortonworks.com/blog/apache-hadoop-patterns-of-use-refine-enrich-and-explore/
IBM. (2012). Four dimensions of big data. [Diagram]. Retrieved from IBM. (2012). Analytics: the real-world use of big data. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Analytics: the real-world use of big data. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF
IBM. (2012). Insurance bureau of Canada. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=AB&infotype=PM&appname=SWGE_IM_IM_USEN&htmlfid=IMC14775USEN&attachment=IMC14775USEN.PDF
IBM. (2013). A better assessment of the data around and connected to a single piece of information enables a more complete, in-context understanding. [Diagram]. Retrieved from IBM. (2013). The future of insurance. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/imw14671usen/IMW14671USEN.PDF
IBM. (2013). Harnessing the power of big data and analytics for insurance. [pdf]. Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/imw14672usen/IMW14672USEN.PDF
IBM. (2014). Analytics: The speed advantage. [pdf]. Retrieved from http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/
IBM. (2014). IBM expands hadoop commitment with support for spark. [blog]. Retrieved from http://www.ibmbigdatahub.com/blog/ibm-expands-hadoop-commitment-support-spark
IBM. (2015). BigInsights for apache hadoop quick start edition. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BR&htmlfid=IMB14164USEN#loaded
IBM. (2015). Making the case for hadoop and big data in the enterprise. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BK&htmlfid=IMM14161USEN#loaded
ITG. (2013). Business case for enterprise big data deployments. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
ITG. (2013). Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache Hadoop for Major Applications. [Diagram]. Retrieved from ITG. (2013). Business case for enterprise big data deployments. [pdf]. Retrieved from http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww
Intel. (2015). Big data cloud technology. [pdf]. Retrieved from http://www.intel.co.za/content/dam/www/public/us/en/documents/product-briefs/big-data-cloud-technologies-brief.pdf
McKinsey. (2011). Big data: The next frontier for innovation, competition, and productivity. [pdf]. Retrieved from http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
Ordnance Survey. (2013). The big data rush: how data analytics can yield underwriting gold. [pdf]. Retrieved from http://events.marketforce.eu.com/big-data-underwriting-report-email
Planet Cassandra. (2015). Nosql databases defined and explained. [web page]. Retrieved from http://www.planetcassandra.org/what-is-nosql/
Standish, J. (2013). Speed to detection - strategically leveraging advanced analytics for insurance fraud. [blog]. Retrieved from http://www.johnstandishconsultinggroup.com/JohnStandishConsultingGroup.com/Blog/Entries/2013/8/9_Speed_to_Detection_-_Strategically_Leveraging_Advanced_Analytics_for_Insurance_Fraud.html
Strategy Meets Action. (2012). Data and analytics in insurance. [pdf]. Retrieved from https://www.acord.org/library/Documents/2012_SMA_Data_Analytics.pdf
Strategy Meets Action. (2013). Data and analytics in insurance: p&c plans and priorities for 2013 and beyond. [pdf]. Retrieved from https://strategymeetsaction.com/data-and-analytics-in-insurance-p-and-c-plans-and-priorities-for-2013-and-beyond/
Zikopoulos, P., deRoos, D., Bienko, C., Buglio, R., Andrews, M. (2014). Big data beyond the hype. [pdf]. Retrieved from https://www.ibm.com/developerworks/community/blogs/SusanVisser/entry/big_data_beyond_the_hype_a_guide_to_conversations_for_today_s_data_center?lang=en