Challenges for BIG Data

Published: January 2017

Big Data refers to data sets so large, and often so unstructured, that commonly used database
management systems such as an RDBMS cannot handle them. The amount of data stored in the
world has exploded, because data is constantly gathered from a growing number of sources. The
world's capacity to store data is estimated to double roughly every two years, and around two
and a half exabytes of data are created every day. Big Data analysis uses statistical inference
to find regressions, nonlinear relationships and dependencies in large volumes of data.
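
As a small illustration of the kind of regression mentioned above, the following sketch fits a straight line by ordinary least squares. It is only a toy: the function name fit_line and the sample numbers are assumptions made up for this example, and a real Big Data workload would run such a fit on a distributed engine rather than on an in-memory list.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample: hypothetical record counts observed on five days.
days = [1, 2, 3, 4, 5]
records = [10.0, 12.0, 14.0, 16.0, 18.0]
slope, intercept = fit_line(days, records)
print(f"records = {slope:.1f} * day + {intercept:.1f}")
```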

Data From All Around
The Internet, mobile devices, remote sensing devices, cameras and microphones, wireless sensor
networks and Radio Frequency Identification (RFID) readers all generate huge amounts of data.
The software logs they produce are voluminous too.

Big Challenges
The challenges for BIG Data are divided into the three classical fronts.

Volume
o  The unceasing increase in the amount of data created every day is overwhelming.
o  The sheer size can bring a long-running software system to a standstill, because the
   data cannot be processed within an acceptable time limit.

Velocity
o  The speed of data flowing in and out, together with the transactions and analysis the
   business expects, can outpace what existing systems are able to deliver.

Variety
o  If variety is the spice of life, here in the Big Data world it could very well be the
   reason for sleepless nights for the technology gurus.
o  Deciphering the range of data types and sources is itself a challenge. Only then comes
   the question of devising methods to capture, curate and store them.
o  Once this is done comes the challenge of allowing meaningful analysis, search and
   visualization of the data.
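
To make the variety problem concrete, here is a minimal sketch that captures records from two differently shaped sources (CSV text and newline-delimited JSON) and maps them onto one common layout. The field names (device, device_id, reading, value), the common schema and the sample data are all hypothetical and chosen only for illustration.

```python
import csv
import io
import json


def records_from_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def records_from_json_lines(text):
    """Parse newline-delimited JSON into a list of dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def first_present(record, *names):
    """Return the value of the first field name that exists in the record."""
    for name in names:
        if name in record:
            return record[name]
    raise KeyError(f"none of {names} found in record")


def to_common(record, source):
    """Map a source-specific record onto one (hypothetical) common schema."""
    return {
        "source": source,
        "device": str(first_present(record, "device", "device_id")),
        "reading": float(first_present(record, "reading", "value")),
    }


# Made-up sample inputs with different field names and value types.
csv_text = "device,reading\nsensor-1,21.5\n"
json_text = '{"device_id": "sensor-2", "value": 19}\n'

unified = [to_common(r, "csv") for r in records_from_csv(csv_text)]
unified += [to_common(r, "json") for r in records_from_json_lines(json_text)]
print(unified)
```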

To sum it up, the growth and digitization of global information storage capacity is itself a
challenge today.

Putting Big Data to Use
Big Data systems can be implemented by following these steps to arrive at a mature and
meaningful data set.

o  Data Profiling
o  Data Cleansing
o  Data Integration of structured and unstructured data
o  Data Merging
o  Data Migration
o  Data Replication
o  ETL / ELT / ETLT Design and Development
o  Interfacing legacy systems with the modern approach
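
The first steps in the list above (profiling, cleansing and merging) can be sketched on a toy scale as follows. This is an assumption-laden illustration and not a prescribed method: the function names, the email key and the sample rows are invented here, and production pipelines would normally use a dedicated ETL tool.

```python
def profile(rows, field):
    """Data profiling: count rows, missing values and distinct values of a field."""
    values = [str(row.get(field) or "").strip().lower() for row in rows]
    present = [v for v in values if v]
    return {
        "rows": len(rows),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def cleanse(rows, key):
    """Data cleansing: normalise the key, drop rows without it, drop duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        norm = str(row.get(key) or "").strip().lower()
        if not norm or norm in seen:
            continue
        seen.add(norm)
        cleaned.append({**row, key: norm})
    return cleaned


def merge(left, right, key):
    """Data merging: inner join of two cleaned row lists on a shared key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Made-up sample data from two hypothetical systems.
crm = [
    {"email": " Ann@Example.com ", "name": "Ann"},
    {"email": "ann@example.com", "name": "Ann (duplicate)"},
    {"email": "", "name": "No address"},
    {"email": "bob@example.com", "name": "Bob"},
]
orders = [
    {"email": "ANN@example.com", "orders": 3},
    {"email": "carol@example.com", "orders": 1},
]

report = profile(crm, "email")
merged = merge(cleanse(crm, "email"), cleanse(orders, "email"), "email")
print(report)
print(merged)
```

Cleansing before merging matters here: without normalising the key, the differently cased addresses for Ann would not match in the join.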

Big Data Tools
The Big Data toolkit needs no introduction. Hadoop is a framework for distributed storage and
processing; its storage layer is the Hadoop Distributed File System (HDFS), and MapReduce is
its programming model for processing data in parallel across a cluster. Hive provides data
summarization and ad hoc queries. Pig offers a high-level scripting language for data-flow
programs that run in parallel on Hadoop, and Sqoop transfers data between Hadoop and an RDBMS.
HBase is a distributed, column-oriented store commonly used for very large tables. Flume is
generally used to collect log data and move it into centralized data repositories. And last but
not least, creating a rage in the BIG Data world is MongoDB, the cross-platform,
document-oriented open source database.
Using BIG Data intelligently, together with Business Intelligence tools that turn the data into
meaningful information, is a challenging proposition for most organizations.
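
To show what the MapReduce model mentioned above looks like in practice, here is a single-process sketch of the classic word count. It does not use the Hadoop API; the function names and the sample lines are assumptions for illustration only. On a real cluster the map and reduce phases run in parallel on many machines, and the framework performs the shuffle between them.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence within a single process."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Made-up input lines.
counts = word_count(["big data big challenges", "Big tools"])
print(counts)
```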
