BIG SPATIAL DATA has been with us for ages in various forms…but pretty invisible!!
5
Ancient Egypt
River Nile: 6695 Km long
Engineers used to try data analysis to predict crop yields
6
Challenges Perceptions Concepts Basic Intro
…the 15 min route to THANK YOU slide
7
An English professor wrote the words:
“A Woman without her man is nothing”
on the chalk board and asked his students to punctuate it correctly….
“A Woman, without her man, is nothing”
“A Woman: Without her, man is nothing”
8
How we understand it? (bar chart of responses; values not recoverable)
- Social media data
- The latest buzzword
- Large volumes of Geo data
- Non traditional forms of Geo data
- Data influx from new technologies
- Real time Geo information
- New kinds of Geo data and analysis
- A greater scope of Geo Int info
DEFINING BIG SPATIAL DATA
9
BIG SPATIAL DATA
Spatial data sets exceeding the capacity of current computing systems to manage, process or analyze the data with reasonable effort, due to Volume, Velocity, Variety and Veracity
10
DATA is Exploding in Volume
Velocity
VARIETY
While decreasing in Veracity
11
BIG SPATIAL DATA
Finding actionable info in massive volumes of both structured and unstructured geo data that is so large and complex that it’s difficult to process with traditional database and software techniques……
U.S. drone aircraft sent back 24 years worth of video footage in 2009
90% of data in the world was created in the last 2 years
2.5 EB of data is created every day
13
Growth of geospatial data is outpacing both software and services and is set to become a major contributor to the overall growth of the industry
* Estimated revenue FY 2013
14
100% security is a myth
No one has said this!!! But it remains a fact
Increasing attack surface
15
The technology is ready….
But are we ready?
16
DISASTER RELIEF RETAIL UTILITIES FINANCIAL
FRAUD DETECTION DISEASE SURVEILLANCE ECO-ROUTING TELECOMMUNICATIONS INSURANCE CALL CENTER REQUESTS
17
The other side of the story
18
Security challenges before we adopt Big spatial data
19
ONE
Distributed programming frameworks
Utilise parallelism in computation & storage to process massive amounts of data
20
Diagram: Input file → Map → Local Reduce (Combining) → Intermediate → Shuffle → Reduce → Output File
Mapper performs computation & outputs key/value pairs
Reducer combines the values belonging to each distinct key and outputs the result
21
MAP
- Splits the input data-set into independent chunks which are processed in a completely parallel manner
REDUCE
- Aggregates results from map phase
- Performs a summary operation
FRAMEWORK
- Schedules and re-runs tasks
- Splits the input
- Moves map outputs to reduce inputs
- Receives the results
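The MAP / REDUCE / FRAMEWORK roles above can be sketched in plain Python. This is a single-process illustration only, not how Hadoop or any specific framework is implemented: the function names (`split_input`, `mapper`, `shuffle`, `reducer`, `run_job`), the sample points and the choice of a 1-degree grid cell as the key are all assumptions made for the example.

```python
import math
from collections import defaultdict
from itertools import chain


def split_input(records, n_chunks):
    """FRAMEWORK: split the input into independent chunks."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def mapper(chunk):
    """MAP: emit one (key, value) pair per point; the key is a 1-degree grid cell."""
    return [((math.floor(lat), math.floor(lon)), 1) for lat, lon in chunk]


def shuffle(mapped_chunks):
    """FRAMEWORK: move map outputs to reduce inputs, grouped by key."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped_chunks):
        groups[key].append(value)
    return groups


def reducer(key, values):
    """REDUCE: summary operation over all values of one distinct key."""
    return key, sum(values)


def run_job(records, n_chunks=3):
    chunks = split_input(records, n_chunks)
    # On a real cluster each chunk would be mapped on a different node in parallel.
    mapped = [mapper(chunk) for chunk in chunks]
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())


# Hypothetical (lat, lon) points, used only to exercise the sketch.
points = [(28.61, 77.20), (28.70, 77.10), (19.07, 72.87), (28.45, 77.02), (19.20, 72.97)]
counts = run_job(points)
print(counts)
```

Each stage only sees its own chunk or its own key, which is what lets a framework run the stages on separate machines.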
Read 1 TB
- One Machine: 4 I/O channels, each channel 100 MB/s: 45 Min
- 10 Machines: 4 I/O channels each, each channel 100 MB/s: 4.5 Min
So the challenge is not storage but I/O speed
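The arithmetic behind the 1 TB read comparison can be checked with a few lines of Python. The sketch assumes a decimal terabyte (1 TB = 1,000,000 MB) and perfectly parallel reads with no overhead; the function name and its parameters are illustrative only. Under those assumptions the result is roughly 42 minutes and 4.2 minutes, so the slide's 45 and 4.5 minutes appear to be rounded figures.

```python
def read_time_minutes(data_mb, machines, channels_per_machine=4, mb_per_s_per_channel=100):
    """Time to read data_mb when every channel on every machine reads in parallel."""
    throughput_mb_per_s = machines * channels_per_machine * mb_per_s_per_channel
    return data_mb / throughput_mb_per_s / 60


ONE_TB_MB = 1_000_000  # assumption: decimal terabyte

one_machine = read_time_minutes(ONE_TB_MB, machines=1)
ten_machines = read_time_minutes(ONE_TB_MB, machines=10)

print(f"1 machine:   {one_machine:.1f} min")
print(f"10 machines: {ten_machines:.1f} min")
```

The tenfold speed-up comes entirely from aggregate I/O bandwidth; the amount of data stored is the same in both cases.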
23
Untrusted Mappers
Securing the data in the presence of an untrusted mapper
24
TWO
NO SQL ISSUES
25
First off: the name
NoSQL is not “NEVER SQL”
NoSQL is not “No To SQL”
26
NoSQL Is simply
Not Only SQL!!!!!
27
NoSQL DBs (e.g. MongoDB, Redis) are still evolving with respect to security infrastructure
28
Data storage & transaction logs
29
STORAGE TIERS
- Multi-tiered storage media
- Necessitated by scalable size
- Different categories of data
- Different types of storage
30
Keeping track of data location
Lower tier means reduced security and looser access controls
31
INPUT VALIDATION/FILTERING
32
How can we trust data?
Validating data when the source of input data is not reliable?
Filtering malicious data @ BYOD
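A minimal sketch of input filtering for geo records is given below. It is only a first line of defence: it rejects malformed or out-of-range coordinates, but it cannot tell whether a well-formed point from an unreliable device is actually true. The record layout (`lat` / `lon` keys) and the function names are assumptions made for the example.

```python
import math


def is_valid_point(record):
    """Accept only records with finite, in-range latitude and longitude."""
    try:
        lat = float(record["lat"])
        lon = float(record["lon"])
    except (KeyError, TypeError, ValueError):
        return False  # missing field, wrong type, or unparseable value
    if not (math.isfinite(lat) and math.isfinite(lon)):
        return False  # rejects NaN and infinities
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def filter_records(records):
    """Split incoming records into accepted and rejected lists."""
    accepted, rejected = [], []
    for record in records:
        (accepted if is_valid_point(record) else rejected).append(record)
    return accepted, rejected


# Hypothetical incoming records from untrusted devices.
incoming = [
    {"lat": 28.61, "lon": 77.20},     # valid
    {"lat": 123.0, "lon": 77.20},     # latitude out of range
    {"lat": "nan", "lon": 10.0},      # parses to NaN
    {"lon": 5.0},                     # missing latitude
    {"lat": "12.5", "lon": "-45.0"},  # valid after parsing
]
accepted, rejected = filter_records(incoming)
print(len(accepted), "accepted,", len(rejected), "rejected")
```

Keeping the rejected records, rather than silently dropping them, leaves material for later monitoring and audit.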
33
REAL TIME MONITORING
34
Humongous number of alerts!!!!
False positives
Filtering malicious data @ BYOD
35
Secure communication
36
End to end security ?
Data encryption: attribute-based encryption!!! To be made richer
37
Granular audits
38
New attacks will keep happening…and to find out we need detailed audit logs
Missed true positives
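One way to picture a detailed audit log is a list of per-action entries, each recording who did what to which resource and when. The sketch below additionally chains each entry to the previous one with a SHA-256 hash so that later edits to the log can be detected. The chaining is an assumption added for illustration and is not something the slides prescribe, and the field names, actors and resources are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(body):
    """Stable SHA-256 over an entry's fields (keys sorted for determinism)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, actor, action, resource, timestamp):
    """Append one granular audit record, linked to the previous record's hash."""
    body = {
        "actor": actor,
        "action": action,
        "resource": resource,
        "timestamp": timestamp,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify_chain(log):
    """Return True only if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "analyst01", "read", "tiles/zone-7", "2014-01-01T10:00:00Z")
append_entry(log, "mapper-node-3", "write", "intermediate/part-0001", "2014-01-01T10:00:05Z")

ok_before = verify_chain(log)
log[0]["action"] = "delete"  # simulate tampering with an old record
ok_after = verify_chain(log)
print(ok_before, ok_after)
```

A chain like this only detects tampering. It does not prevent it, and it does not decide what is worth logging in the first place.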
39
PRIVACY ISSUES
40
E.g.: How a retailer was able to identify that a teenager was pregnant before her father knew
In the world of big data, privacy invasion is a business model
41
And...
We Also Have cloud with us?
42
At 1.4% in 2011-12 Cloud was a very small percentage of the total IT spend
43
Pace of Big Spatial Data adoption has been
Sluggish
44
There is unlikely to be a day in the near future when we have a
“FIND TERRORIST” BUTTON
45
We have mostly been reactive till date…..
46
USE KERBEROS FOR NODE AUTHENTICATION (BUT WE KNOW IT’S A PAIN TO SET UP)