BigDataOverview SB

Published June 2016
BigData Overview

Big Data – The journey begins…

Objective
• Share a contemporary understanding of Big Data.
• Create awareness and spark interest in exploring new avenues in Big Data trends and technologies.
• Present Big Data initiatives at Thomson Reuters.

Content
The rise of the Bytes
Astonishing facts and figures
World Data forecast
Broad classification of Big Data
Characteristics of Big Data – The 3 Vs of Big Data
Challenges of Big Data and next-generation tools
Big Data’s impact on Thomson Reuters

The rise of the Bytes …
1000^8 bytes = 1 YB -> Yottabyte
1000^7 bytes = 1 ZB -> Zettabyte
1000^6 bytes = 1 EB -> Exabyte
1000^5 bytes = 1 PB -> Petabyte
1000^4 bytes = 1 TB -> Terabyte
1000^3 bytes = 1 GB -> Gigabyte
1000^2 bytes = 1 MB -> Megabyte
1000 bytes = 1 KB -> Kilobyte
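
The unit ladder above can be sketched as a small converter. This is a minimal illustration that assumes decimal (power-of-1000) units as on the slide; the names `UNITS`, `to_bytes`, and `humanize` are made up for this example.

```python
# Decimal byte units in increasing powers of 1000, as listed on the slide.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value, unit):
    """Convert a value in the given unit to bytes (1 KB = 1000 bytes)."""
    return value * 1000 ** UNITS.index(unit)


def humanize(n_bytes):
    """Express a byte count in a readable decimal unit."""
    value = float(n_bytes)
    for unit in UNITS[:-1]:
        if value < 1000:
            return f"{value:g} {unit}"
        value /= 1000
    # Anything this large is reported in the biggest unit available.
    return f"{value:g} {UNITS[-1]}"


print(to_bytes(1, "YB") == 1000 ** 8)  # True
print(humanize(5 * 1000 ** 6))         # 5 EB
```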

Astonishing facts and figures …
Eric Schmidt, Chairman of Google, said:

“From the dawn of humanity to 2003, the human race produced 5 exabytes of data (1 EB = 1000^6 bytes), and now we create that much data every 2 days.”

World Data forecast

• In 2010, the estimated amount of digital data in the world was 1.2 ZB.
• In 2013, web data reached 4 zettabytes.
• Data volume in 2020 will be 44 times greater than in 2009.
• Data volume doubles every 1.2 years.
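
The two growth figures above can be related with a short calculation. This sketch assumes steady exponential growth, which the slides do not state; the figures may come from different estimates, so they need not agree exactly. The function names are illustrative only.

```python
import math


def doubling_time(growth_factor, years):
    """Years per doubling if data grows by growth_factor over the given span."""
    return years * math.log(2) / math.log(growth_factor)


def growth_over(years, doubling_years):
    """Total growth factor over a span, given a fixed doubling period."""
    return 2 ** (years / doubling_years)


span = 2020 - 2009  # the 11-year window quoted on the slide
print(f"44x over {span} years -> doubling every {doubling_time(44, span):.2f} years")
print(f"Doubling every 1.2 years -> {growth_over(span, 1.2):.0f}x over {span} years")
```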

Big Data: Broad classification
• Structured data
– Fits into tables; stored in an RDBMS
– About 20% of the world's data
• Semi-structured data
• Unstructured data
– Semi-structured and unstructured data together make up about 80% of the world's data

Big Data: Characteristics
The 3 Vs of Big Data:
• Volume: A huge volume of data is being generated by different sources.
• Velocity: The speed at which data arrives in real time from different sources.
• Variety: The different forms of data.
– Machine-generated data: sensors, machines, satellites, weather data
– User-generated data: social media sites such as Facebook and Twitter
– Operational data: stock markets, application logs

Big Data: Significant data producers

• NYSE trading produces 1 TB of data per day.
• 571 new websites are created every minute.
• Google processes 20 petabytes of data per day.
• 100 terabytes of data are uploaded to Facebook daily.
• Aadhaar card for India…
– UIDs for an Indian population of 1.5 billion
– 5 MB per resident
– 30 TB of I/O every day

Big Data: Challenges
• Handle the variety of data.
• Store huge volumes of data that exist in different forms.
• Process and analyze this huge amount of data.
E.g.: Decoding the human genome using the traditional RDBMS approach takes 10 years.

What next?
• Next-generation data tools and techniques such as Hadoop and NoSQL databases are needed to handle Big Data.
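
To give a flavor of the processing style associated with Hadoop, here is a minimal single-machine sketch of the MapReduce model (map, shuffle, reduce) applied to word counting. It is plain Python for illustration and does not use or represent Hadoop's actual API; all function names and sample documents are assumptions.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["big data big tools", "data tools"]))
```

In a real cluster the map and reduce steps would run in parallel on many machines, with the framework handling the shuffle between them.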

What we intend…

Linked Data - RDF
• RDF (Resource Description Framework) is a
standard model for data interchange on the Web
• It’s the foundation upon which the web of
semantic data is built
• Organized into triples [Subject, Predicate, Object]
[Diagram: a Subject node connected to an Object node by a Predicate edge]

• A “predicate” defines the relationship between the “subject”
and “object” nodes
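
The triple structure described above can be sketched with plain Python tuples. This is an illustration only: the sample data and the `match` helper are hypothetical, and real RDF identifies each term with a URI rather than a bare string.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical example data, not taken from the slides.
GRAPH = [
    Triple("Dan", "works_for", "ExampleCorp"),
    Triple("Dan", "lives_in", "London"),
    Triple("ExampleCorp", "based_in", "London"),
]


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything the graph says about Dan.
for triple in match(GRAPH, subject="Dan"):
    print(triple)
```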

RDF Example
RDF/XML: an XML-based syntax for writing triples, with resources identified by URIs

Inferred relationships…

Subject = Dan,
Predicate = is_from,
Object = England

This relationship is not stated explicitly; it is inferred from the other two, which creates new knowledge.
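
A minimal sketch of this kind of inference follows. The two starting triples (`born_in` and `located_in`) and the rule itself are assumptions made for illustration, since the slide's diagram with the original pair of relationships is not available in the text.

```python
# Hypothetical starting facts; the slide's actual two relationships are unknown.
FACTS = {
    ("Dan", "born_in", "London"),
    ("London", "located_in", "England"),
}


def infer_is_from(triples):
    """Assumed rule: X born_in P and P located_in C implies X is_from C."""
    inferred = set()
    for person, p1, place in triples:
        if p1 != "born_in":
            continue
        for inner, p2, outer in triples:
            if p2 == "located_in" and inner == place:
                inferred.add((person, "is_from", outer))
    # Report only triples that were not already stated.
    return inferred - set(triples)


print(infer_is_from(FACTS))
```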

The Graph – Federated Machine Readable Knowledge

Questions?
