1.      Introduction


The term big data is used to describe the growth and availability of huge
amounts of structured and unstructured data. It refers to data sets that are
beyond the ability of commonly used software tools to create, manage and
process within a suitable time. Big data is important because the more data is
collected, the more accurate the results are and the better business processes
can be optimized; it therefore matters for both business and society. The data
comes from everywhere: sensors used to gather climate information, posts and
data shared on social media sites, video, movies, audio and so on. This
collection of data is called "Big Data". Big data is a general term for the
massive amount of digital data being collected from various sources that is
too large and raw in form. Big data brings new challenges such as complexity,
security and risks to privacy. It is redefining data management, from
extraction, transformation and processing to cleaning and reducing [7]. The
amount of data generated by the web has grown enormously in recent years, to
the point that it has become difficult to analyze with traditional mining
methods. The term big data has been coined for data that exceeds this
processing capability [6].


2.      Big Data Characteristics and challenges

Huge volume of data: Rather than thousands or millions of rows, Big Data can
be billions of rows and millions of columns; the size of data is now larger
than terabytes and petabytes. This large scale makes it difficult to analyze
using conventional methods.

Variety of data types and structures: Big Data reflects the variety of new
data sources and structures, including digital traces being left on the web
and other digital repositories for subsequent analysis. Big data comes from
various sources. It is designed to handle structured, semi-structured as well
as unstructured data, whereas the traditional methods were designed to handle
structured data only, and not at such large volume.

Velocity of new data creation and growth: Big Data can describe high-velocity
data, with rapid data ingestion and near-real-time analysis. Big data
techniques should be able to mine a large amount of data within a pre-defined
period of time, whereas the traditional methods of mining may take a very long
time to mine such a volume of data.

Variability: Variability describes the amount of variance in the summaries
kept within the data bank and refers to how closely clustered or spread out
the values are within the data set.
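
As a rough illustration of what "closely clustered or spread out" means in practice, the sketch below compares the variance of two small samples. The function name `spread_summary` and the sample values are made up for this example; it uses only Python's standard `statistics` module.

```python
from statistics import mean, pvariance, pstdev

def spread_summary(values):
    """Return the mean, population variance and standard deviation of a sample."""
    return {
        "mean": mean(values),
        "variance": pvariance(values),
        "std_dev": pstdev(values),
    }

# Hypothetical readings: both samples have the same mean but different spread.
tight = [10, 10, 11, 9, 10]   # values clustered around the mean
wide = [1, 20, 5, 15, 9]      # values spread far from the mean

tight_summary = spread_summary(tight)
wide_summary = spread_summary(wide)

print(tight_summary)
print(wide_summary)
```

Both samples average 10, yet the second has a far larger variance, which is the kind of difference the variability dimension is meant to capture.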

Value: All e-commerce systems and enterprises are keen to improve their
relationship with customers by providing value-added services. To do so,
customer trends and attitudes in the market have to be studied and analyzed.
Users can also query the data store to find business trends and change their
master plan or strategies accordingly. Making big data open to all creates
functional analysis.

Veracity: There are several challenges:
· How can one cope with imprecision, uncertainty, missing values,
misstatements or untruths?
· How good is the data? How broad is the coverage?
· How fine is the sampling resolution? How frequent are the readings?
· How well understood are the sampling biases?
· Is there data available at all?

3.5 Data Discovery: It is a huge challenge to find high-quality data in the
vast collections of data that are out there on the Web.

3.6 Relevance and Quality: The challenge is determining the quality of data
sets and their relevance to particular issues.

3.7 Data Comprehensiveness: Are there areas without coverage? What are the
implications?

3.8 Management Challenges: The main management challenges are data privacy,
governance, ethics and security.

Data Sources:
· Mobile sensors
· Social media
· Video surveillance
· Video rendering
· Smart grids
· Geophysical exploration
· Medical imaging
· Gene sequencing



Several industries have led the way in developing their ability to gather and
exploit data:
• Credit card companies monitor every purchase their customers make and can
identify fraudulent purchases with a high degree of accuracy using rules
derived by processing billions of transactions.
• Mobile phone companies analyze subscribers’ calling patterns to determine,
for example, whether a caller’s frequent contacts are on a rival network. If
that rival network is offering an attractive promotion that might cause the
subscriber to defect, the mobile phone company can proactively offer the
subscriber an incentive to remain in her contract.
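
The calling-pattern analysis in the second example can be sketched in a few lines. This is only an illustration under stated assumptions: the function `rival_contact_share`, the network names, the call log and the 0.5 threshold are all hypothetical and do not describe any operator's actual method.

```python
from collections import Counter

def rival_contact_share(calls, network_of, rival, top_n=3):
    """Fraction of a subscriber's top_n most-called contacts that are on `rival`."""
    counts = Counter(calls)
    top = [contact for contact, _ in counts.most_common(top_n)]
    if not top:
        return 0.0
    on_rival = sum(1 for contact in top if network_of.get(contact) == rival)
    return on_rival / len(top)

# Hypothetical call log for one subscriber: one entry per outgoing call.
calls = ["ann", "bob", "ann", "cy", "bob", "ann", "dee", "cy"]
# Hypothetical lookup of which network each contact uses.
network_of = {"ann": "RivalTel", "bob": "HomeTel", "cy": "RivalTel", "dee": "HomeTel"}

share = rival_contact_share(calls, network_of, rival="RivalTel")
# Assumed rule: flag the subscriber if at least half of the top contacts are on the rival.
at_risk = share >= 0.5
print(share, at_risk)
```

A real system would run a calculation of this kind over billions of call records, which is where the tools discussed in this paper come in.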

For companies such as LinkedIn and Facebook, data itself is their primary
product. The valuations of these companies are heavily derived from the data
they gather and host, which contains more and more intrinsic value as the data
grows.

Although the volume of Big Data tends to attract the most
attention, generally the variety and velocity of the data provide a more apt
definition of Big Data. (Big Data is sometimes described as having 3 Vs:
volume, variety, and velocity.) Due to its size or structure, Big Data cannot
be efficiently analyzed using only traditional databases or methods. Big Data
problems require new tools and technologies to store, manage, and realize the
business benefit. These new tools and technologies enable creation,
manipulation, and management of large datasets and the storage environments
that house them (see Fig. 1).

Section 2 discusses the related literature of renowned researchers in the
field, section 3 provides details about the dimensions of e-learning systems,
and section 4 explains the related work done in network architectures.
Section 5 describes the working and flow of a traditional web mining system.
Section 6 describes the proposed framework and its working, and section 7
concludes the paper.

3.    Technology Used for Big Data Mining.

Big data has great potential to produce useful information for companies,
which can benefit the way they manage their problems. Big data analysis is
becoming indispensable for discovering the intelligence contained in
frequently occurring patterns and hidden rules. These massive data sets are
too large and complex for humans to effectively extract useful information
without the aid of computational tools. Emerging technologies such as the
Hadoop framework and MapReduce offer new and exciting ways to process and
transform big data, defined as complex, unstructured, or large amounts of
data, into meaningful knowledge. There are many techniques available for data
management.
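
To give a feel for the MapReduce model mentioned above, here is a minimal single-machine sketch of its three stages (map, shuffle, reduce) applied to word counting. It is plain Python, not the Hadoop API; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are chosen for this example only.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run the three stages in sequence over a list of text records."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical input records.
counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

In a real cluster the map and reduce calls would run in parallel on many machines, with the framework handling the shuffle step and failures; the sketch only shows the data flow.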

