big data - Data Types

Published on July 2016 | Categories: Documents | Downloads: 28 | Comments: 0 | Views: 229

Data Types, Characteristics and Uses


Big Data Technology


Big Data Everywhere!


Lots of data is being collected and warehoused
• Web data, e-commerce
• Purchases at department/grocery stores
• Bank/credit card transactions
• Social networks

How much data?

• Google processes 20 PB a day (2008)
• Wayback Machine has 3 PB + 100 TB/month (3/2009)
• Facebook has 2.5 PB of user data + 15 TB/day (4/2009)
• eBay has 6.5 PB of user data + 50 TB/day (5/2009)
• CERN’s Large Hadron Collider (LHC) generates 15 PB a year

“640K ought to be enough for anybody.”
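
To put the growth figures on this slide in perspective, the sketch below converts each quoted growth rate into the time the archive would need to double in size. It assumes the quoted rates stay constant, that 1 PB = 1000 TB, and that a month has 30 days; the function name `days_to_double` is illustrative and not from the slides.

```python
TB_PER_PB = 1000  # decimal units assumed


def days_to_double(stored_pb: float, growth_tb_per_day: float) -> float:
    """Days until the stored volume doubles at a constant daily growth rate."""
    return stored_pb * TB_PER_PB / growth_tb_per_day


# (stored volume in PB, growth in TB/day), taken from the slide figures.
archives = {
    "Wayback Machine (3/2009)": (3.0, 100 / 30),  # 100 TB/month, 30-day month assumed
    "Facebook (4/2009)": (2.5, 15.0),
    "eBay (5/2009)": (6.5, 50.0),
}

for name, (stored_pb, rate) in archives.items():
    print(f"{name}: ~{days_to_double(stored_pb, rate):.0f} days to double")
```

Under these assumptions, every archive listed would double in well under three years, which is the point the slide is making about scale.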

The Earthscope
• The Earthscope is the world's largest science project.
Designed to track North America's geological evolution,
this observatory records data over 3.8 million square
miles, amassing 67 terabytes of data. It analyzes
seismic slips in the San Andreas fault, sure, but also the
plume of magma underneath Yellowstone and much,
much more.
(http://www.msnbc.msn.com/id/44363598/ns/technology_and_science-future_of_technology/#.TmetOdQ--uI)

Types of Data

Relational Data (Tables/Transaction/Legacy Data)
Text Data (Web)
Semi-structured Data (XML)
Graph Data
• Social Network, Semantic Web (RDF), …
Streaming Data
• You can only scan the data once
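
The "scan once" constraint on streaming data means any statistic has to be updated item by item, without storing or revisiting earlier values. A minimal sketch of one such single-pass technique (Welford's running mean and variance) follows; the class name `RunningStats` and the sample readings are hypothetical and only serve as illustration.

```python
from typing import Iterator


class RunningStats:
    """Count, mean and sample variance of a stream, updated one item at a time."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new value into the statistics (Welford's update)."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two items have been seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def readings() -> Iterator[float]:
    """Hypothetical stream: a generator can be consumed only once."""
    yield from (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)


stats = RunningStats()
for value in readings():
    stats.update(value)

print(stats.n, stats.mean, stats.variance)
```

Memory use stays constant regardless of stream length, because only three numbers are kept between updates.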

Big Data Analysis Example


Big data can generate significant financial value across sectors


Who is collecting all of this data?

Government Agencies

(Hey, I didn’t say which government!)

Big Pharmaceutical Companies

Who is collecting all this data?

Consumer Products Companies

Big Box Stores

Who is collecting what?

Credit Card Companies

What data are they getting?
• Airline ticket
• Restaurant check
• Grocery bill
• Hotel bill

Why are they collecting all this data?
Target Marketing
• To send you catalogs for exactly the merchandise you typically purchase.
• To suggest medications that precisely match your medical history.
• To “push” television channels to your set instead of your “pulling” them in.
• To send advertisements on those channels just for you!

Targeted Information
• To know what you need before you even know you need it, based on past purchasing habits!
• To notify you of your expiring driver’s license or credit cards, or of the last refill on a prescription (Rx), etc.
• To give you turn-by-turn directions to a shelter in case of emergency.

What to do with these data?

Aggregation and Statistics
• Data warehouse and OLAP
Indexing, Searching, and Querying
• Keyword based search
• Pattern matching (XML/RDF)
Knowledge discovery
• Data Mining
• Statistical Modeling
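
Two of the operations listed above can be sketched in a few lines: an OLAP-style roll-up (aggregation along one dimension) and keyword-based search over an inverted index. This is a toy illustration under stated assumptions, not how a data warehouse or search engine is built; the records, documents, and function names (`rollup`, `build_index`, `search`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical transaction records (illustrative data, not from the slides).
sales = [
    {"store": "north", "category": "grocery", "amount": 40.0},
    {"store": "north", "category": "apparel", "amount": 25.0},
    {"store": "south", "category": "grocery", "amount": 60.0},
]


def rollup(rows, key):
    """OLAP-style roll-up: total amount per value of one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


def build_index(docs):
    """Inverted index: lowercase word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    1: "credit card transactions",
    2: "grocery store purchases",
    3: "credit card fraud",
}
index = build_index(docs)

print(rollup(sales, "category"))
print(sorted(search(index, "credit card")))
```

The roll-up answers "how much per category or store", while the index answers "which records mention these words" without scanning every document at query time.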

Where Is This “Big Data” Coming From?
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 30 billion RFID tags today (1.3B in 2005)
• 76 million smart meters in 2009… 200M by 2014
• 2+ billion people on the Web by end 2011

With Big Data, We’ve Moved into a New Era of Analytics

Volume: 12+ terabytes of Tweets created daily.
Velocity: 5+ million trade events per second.
Variety: 100s of different types of data.
Veracity: Only 1 in 3 decision makers trust their information.

The number of organizations who see analytics as a competitive advantage is growing.
[Chart: 2010, 2011, 2012; analytics moves from business initiative to BUSINESS IMPERATIVE; values shown: 70%, 57%, 63%]

Four Characteristics of Big Data

Volume: Cost efficiently processing the growing Volume
• 50x growth from 2010 to 2020 (35 ZB)

Velocity: Responding to the increasing Velocity
• 30 billion RFID sensors and counting

Variety: Collectively analyzing the broadening Variety
• 80% of the world’s data is unstructured

Veracity: Establishing the Veracity of big data sources
• 1 in 3 business leaders don’t trust the information they use to make decisions
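
The volume figures on this slide (50x, 2010, 35 ZB, 2020) imply a compound annual growth rate. The sketch below works it out, assuming the slide means a 50-fold increase between 2010 and 2020 that ends at 35 ZB, with steady year-over-year growth in between; the variable and function names are illustrative.

```python
GROWTH_FACTOR = 50.0   # "50x" from the slide
VOLUME_2020_ZB = 35.0  # "35 ZB" from the slide
YEARS = 2020 - 2010

# Constant yearly multiplier that compounds to 50x over ten years.
annual_factor = GROWTH_FACTOR ** (1 / YEARS)

# Starting volume implied by the two slide figures.
volume_2010_zb = VOLUME_2020_ZB / GROWTH_FACTOR


def projected_volume_zb(year: int) -> float:
    """Volume in ZB for a given year under steady compound growth (assumption)."""
    return volume_2010_zb * annual_factor ** (year - 2010)


print(f"implied annual growth: {annual_factor - 1:.1%}")
for year in (2010, 2015, 2020):
    print(year, f"{projected_volume_zb(year):.2f} ZB")
```

Read this way, the slide implies data volume growing by roughly half again every year, starting from well under 1 ZB in 2010.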

The 5 Key Big Data Use Cases

Big Data Exploration
Find, visualize, understand all big data to improve decision making

Enhanced 360º View of the Customer
Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources

Security/Intelligence Extension
Lower risk, detect fraud and monitor cyber security in real-time

Operations Analysis
Analyze a variety of machine data for improved business results

Data Warehouse Augmentation
Integrate big data and data warehouse capabilities to increase operational efficiency

Big Data Exploration: Needs
Find, visualize, understand all big data to improve decision making

• Struggling to manage and extract value from the growing 3 V’s of data in the enterprise; need to unify information across federated sources
• Inability to relate “raw” data collected from system logs, sensors, clickstreams, etc., with customer and line-of-business data managed in enterprise systems
• Risk of exposing unsecure personally identifiable information (PII) and/or privileged data due to lack of information awareness

Big Data Exploration: Value & Diagram
[Diagram: data sources (Relational Data, File Systems, Content Management, Email, CRM, Supply Chain, ERP, RSS Feeds, Cloud, Custom Sources) connected through Data Explorer to Application/Users]

Find, Visualize & Understand all big data to improve business knowledge
• Greater efficiencies in business processes
• New insights from combining and analyzing data types in new ways
• Develop new business models with resulting increased market presence and revenue

Enhanced 360º View of the Customer: Needs
Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources

• Need a deeper understanding of customer sentiment from both internal and external sources
• Desire to increase customer loyalty and satisfaction by understanding what meaningful actions are needed
• Challenged in getting the right information to the right people to provide customers what they need to solve problems, cross-sell & up-sell

Security/Intelligence Extension: Needs
Security/Intelligence Extension enhances traditional security solutions by analyzing all types and sources of under-leveraged data

Enhanced Intelligence & Surveillance Insight
Analyze data-in-motion & at rest to:
• Find associations
• Uncover patterns and facts
• Maintain currency of information

Real-time Cyber Attack Prediction & Mitigation
Analyze network traffic to:
• Discover new threats early
• Detect known complex threats
• Take action in real-time

Crime Prediction & Protection / Reduce Customer Churn
Analyze Telco & social data to:
• Gather criminal evidence
• Prevent criminal activities
• Proactively apprehend criminals
• Customer Retention
© 2013 IBM Corporation

Operations Analysis: Needs
Analyze a variety of machine data for improved business results

Business Challenges:
• Complexity and rapid growth of machine data
• Difficult to capture even a small fraction of machine data for better decisions
• Inability to analyze machine data and combine it with enterprise data for a full-view analysis

Benefits:
• Gain real-time visibility into operations, customer experience, transactions and behavior
• Proactively plan to increase operational efficiency
• Identify and investigate anomalies
• Monitor end-to-end infrastructure to proactively avoid service degradation or outages
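
As one illustration of "identify and investigate anomalies" in machine data, the sketch below flags outlying response times using a modified z-score based on the median and the median absolute deviation (MAD). The slides do not prescribe a method; this technique, the 3.5 threshold (a common rule of thumb), the function name `find_anomalies`, and the latency values are all assumptions chosen for the example.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Indices of values whose modified z-score exceeds the threshold.

    Uses the median and MAD rather than mean and standard deviation,
    so a single extreme value does not hide itself by inflating the spread.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against; flag nothing
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical response times in milliseconds taken from a service log.
latencies_ms = [102, 98, 101, 99, 100, 97, 103, 480, 100, 99]

for i in find_anomalies(latencies_ms):
    print(f"sample {i}: {latencies_ms[i]} ms looks anomalous")
```

A robust statistic is used here because, with so few samples, the one slow request would inflate an ordinary standard deviation enough to fall below a typical 3-sigma cutoff.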
