
International Journal of Innovative Research in Computer Science & Technology (IJIRCST)

ISSN: 2347-5552, Volume-10, Issue-2, March 2022


https://doi.org/10.55524/ijircst.2022.10.2.37
Article ID IRPV1058, Pages 185-188
www.ijircst.org

A Review Paper on Big Data Analytics


Ankur Gupta
Assistant Professor, Department of Computer Science & Engineering, RIMT University, Mandi Gobindgarh, Punjab, India
Correspondence should be addressed to Ankur Gupta; [email protected]
Copyright © 2022 Ankur Gupta. This is an open-access article distributed under the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT- The information revolution has made huge quantities of data available to decision makers. Big data refers to datasets that are not only large but also high in variety and velocity, which makes them hard to handle with standard tools and approaches. Because such data is growing fast, solutions need to be explored and provided so that these datasets can be handled and value and knowledge can be extracted from them. Moreover, policymakers need to be able to gain useful insights from such diverse and fast-changing data, ranging from everyday transactions to customer interactions and social network information. Big Data Analytics, the application of advanced analytics methodologies to big data, can deliver this value. The purpose of this article is to review several analytical methods and tools that may be applied to big data, and the opportunities that Big Data Analytics offers in diverse decision-making areas.

KEYWORDS- Analytics, Big data, Data mining, Datasets, Network.

I. INTRODUCTION
Envision a world without data storage: a situation in which every detail about an individual or group, every transaction made and every documented element is lost right after use. Organizations would lose the capacity to extract important information and knowledge, perform comprehensive analyses, and provide new opportunities and benefits. Anything from client names and addresses to available items, transactions made and staff hired has become crucial to continuity on a daily basis. Data is the foundation on which every company prospers [1-3].
Consider now the level of detail and the volume of data and information that technological and Internet advances have made available. With increased storage capacity and data collection technologies, enormous volumes of data are easily available. More and more data are produced every second, and they have to be stored and analyzed in order to extract value. In addition, data have become less expensive to keep, allowing companies to make full use of the large volumes of data collected.
The contribution of this study is to analyze existing publications on big data analytics. Some of the many big data tools, methodologies and technologies available will thus be examined, and their uses and potential in a number of decision domains will be described.

A. Big Data Analytics
The phrase "Big Data" has come to refer to datasets that grow so large that they become difficult to work with using standard database management techniques. They are data sets whose size is beyond the ability of the usual software and storage systems to capture, store, manage and analyze within an acceptable span of time.
Big data sizes increase steadily, from a few dozen terabytes to several petabytes in a single data set. As a result, acquiring, storing, searching, sharing, processing and analyzing big data are among the challenges.
In this part, we begin by examining the characteristics and importance of big data. Business advantage can usually be extracted from analyzing larger and more complex data sets that require real-time or near-real-time capabilities; however, this requires new data architectures, analysis methods and tools. The next section therefore explains the strategies and methodologies of big data analytics, from big data storage and management through to big data analytic processing.

1) Characteristics of Big Data
Big data is data whose size, distribution, variety and timeliness require new technical architectures, analytics and tools in order to provide insights that unlock new sources of business value. Big data is distinguished by three major features: volume, variety and velocity, or the three Vs. Volume is the size and quantity of the data. Velocity relates to the rate at which the data changes or how often it is created. Finally, variety covers the many data formats and types, as well as the different uses and ways of analyzing the data [4]. The size of big data may be measured in TBs and PBs, as well as by the number of records, transactions, tables or files. Moreover, one of the things that makes big data big is that it comes from a broader range of sources, including logs, clickstreams and social media [5-8].

2) Big Data Analytics Tools and Methods
Conventional data management and analysis methods and technologies can no longer readily handle these data sets. New tools and methodologies for big data analytics, and the architectures needed to store and manage such data, are therefore required. The advent of big data thus has an impact on every aspect, from data gathering and processing to the final decisions derived.
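
As one concrete instance of the newer processing models that this review goes on to cover, the MapReduce idea (mapping input to key/value pairs, then merging all values that share a key) can be sketched in a few lines of Python. This is a minimal single-machine sketch under stated assumptions, not Hadoop's API: the function names map_phase, shuffle, reduce_phase and word_count are hypothetical, and word counting is an assumed example task.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit one (word, 1) key/value pair per word in the text."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: merge all values that share the same key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of text documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    # Hypothetical input: two tiny "documents" standing in for a large data set.
    print(word_count(["big data needs new tools", "new tools for big data"]))
```

In a real cluster the map and reduce calls would run in parallel on different machines and the grouping step would move data over the network; here everything runs in one process so that only the data flow is visible.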

Innovative Research Publication 185


Early efforts map the many big data storage, management and processing tools, analytics tools and methods, and visualization and evaluation tools onto the various phases of the decision-making process. Each of these areas is addressed further in this part.
B. Big Data Storage and Management
The conventional techniques for storing and retrieving structured data include relational databases, data marts and data warehouses. Data is loaded into storage from operational data stores using extract, transform, load (ETL) or extract, load, transform (ELT) tools, which extract the data from external sources and transform it to suit operational requirements. The data is thus cleansed, transformed and catalogued before it is made available for data mining and online analytical functions [9]. The big data environment, meanwhile, calls for Magnetic, Agile, Deep (MAD) analytics capabilities, which differ from the features of a typical enterprise data warehouse (EDW). First and foremost, typical EDW methods
prohibit the inclusion of new sources of data until they have been cleaned up. Big data environments, by contrast, have to be magnetic, so that all data sources are attracted irrespective of data quality.

C. Big Data Analytic Processing
Analytical processing takes place after big data storage. Four major requirements for big data processing exist. First, fast data loading is necessary. Because disk and network traffic interferes with query execution during data loading, data loading time must be reduced. Second, fast query processing is necessary. Many queries are response-time critical, in order to satisfy the requirements of heavy workloads and real-time requests. The third requirement for big data processing is highly efficient use of storage space. Because the rapid increase of user activity may demand scalable storage capacity and computing power, limited disk space requires that data storage be well managed during processing and that the question of how to store the data so as to optimize space usage be addressed. The fourth requirement is strong adaptivity to highly dynamic workload patterns. Because big data sets are examined by different platforms and applications, for different goals and by different methods, the underlying system should be highly adaptive to unanticipated dynamics in data processing rather than tailored to specific workload patterns.
The first phase of a MapReduce job maps the input data to a set of key/value pairs as output. The "Map" function thus splits large computing jobs into smaller tasks and assigns them to the relevant key/value pairs. Unstructured data such as text can therefore be mapped to a structured key/value pair in which the key could be, for instance, a word in the text. This output is then the input to the "Reduce" function. Reduce then collects and combines this output to produce the final result of the computing job, by merging all values that share the same key [10].

Figure 1: How MapReduce nodes and HDFS work together

Figure 1 demonstrates the collaboration between MapReduce nodes and HDFS. In step 1, a very large data set is available, comprising log files, sensor data or anything else. HDFS stores replicas of the data, represented by the blue, yellow, beige and rose symbols, across the DataNodes. In step 2, the client defines and runs a map job and a reduce job on a given data set. In step 3, the JobTracker distributes the jobs across the TaskTrackers. The TaskTracker runs the mapper and produces output, which is then stored in the HDFS file system. In step 4, the reduce job runs on the mapped data to generate the result.

D. Big Data Analytics and Decision Making
From the decision maker's point of view, the importance of big data lies in its capacity to provide information and knowledge of value on which to base judgments. The managerial decision-making process has been a major topic of research throughout the years. Big data is becoming an essential asset for policy-makers. Large volumes of highly detailed information from many sources, such as scanners, mobile phones, loyalty cards, the Internet and social media channels, offer companies important benefits [11]. This is only feasible if the data is correctly analyzed and useful insights are revealed, allowing decision makers to exploit the opportunities that come from the wealth of historical and real-time data created by supply chains, industrial processes, consumption habits, etc.
The first step of the decision-making process is the intelligence phase, during which data that can be used to identify challenges and opportunities are collected from internal and external data sources. In this phase, the sources of big data should be identified, and the data should be acquired from the diverse sources, processed, stored and transferred to the end user [12]. Once the sources and kinds of information required for the analysis have been identified, the
selected information is gathered, stored and used in one of the distributed databases and management software tools already described. Once it has been gathered and stored, big data is organized, prepared and analyzed.
The next step in decision making is the design phase, in which alternative courses of action are generated and assessed through a conceptualization, or representative model, of the problem. This phase is subdivided into three steps: model planning, data analytics and analysis. A model is thus selected, planned and then applied for data analytics, such as those previously described, and the results are assessed.

1) Risk Management and Fraud Detection
Big data analytics in the field of risk management may be used by industries such as investment and retail banking, as well as reinsurance. Because evaluating and bearing risk is a crucial component of the financial services industry, big data analytics can help select investments by weighing the likelihood of gains against the likelihood of losses. Internal and external big data can also be analyzed for a comprehensive and dynamic assessment of risk.
Big data analytics can also be used to reduce fraud, in particular in the public, banking and insurance industries. Although analytics are already frequently employed in automated fraud detection, companies and industries want to exploit the potential of big data to improve their systems [13].

2) Improvement and Quality Management
In order to improve profitability and decrease expenses, big data may be used for quality management, especially in the manufacturing, energy, utilities and telecommunications industries, by increasing the quality of the goods and services offered. For instance, predictive analytics may be used in the production process to decrease variability in performance and to prevent quality problems by providing early warning alerts. This can reduce scrap rates and shorten time to market, since identifying any interruption to the manufacturing process before it arises can save a considerable amount of money [14].
Big data may also be employed to better understand variations in the location, frequency and intensity of weather and climate. Citizens and businesses, including farmers, tourists and transportation firms, can profit from this. Furthermore, with new sensors and analytic tools for long-term climate models and nearer-term weather predictions, weather-related natural disasters may be foreseen, and preventative or adaptive actions can be taken in advance.

II. LITERATURE REVIEW
S. Kumari notes that the phrase "Big Data" refers to novel approaches and technologies for capturing, storing, distributing, managing and analyzing datasets of petabytes or more that have high velocity and varied structures. Big data may be structured, unstructured or semi-structured, so standard data management approaches cannot be used. Parallelism is applied for the economical and efficient processing of very large quantities of data. Big data is data whose scale, diversity and complexity demand new architectures, methods, algorithms and analytics to manage it and to extract value and knowledge from it. Hadoop is the main platform for big data processing and addresses the difficulty of making the data useful for analysis. Hadoop is an open-source software project for processing large-scale data sets across clusters of servers. It is designed to scale from a single server to thousands of machines, with a high degree of fault tolerance [15].
Tessa Van Der Valk et al. observe that the application of social network analysis (SNA) in policy design and evaluation is fairly limited. The goal of their paper is to highlight areas of innovation studies that may benefit from the use of SNA, and to investigate the managerial and policy implications of using SNA in the academic community. Three important study topics are identified: cooperation networks, network infrastructure and technology networks. The managerial and regulatory consequences and possible guidance are addressed [16].
Sitaram Asur et al. observe that social media have become ubiquitous and crucial for social networking and content sharing in recent years, yet the content produced on these sites remains mostly untapped. The authors show how social media content may be used to predict real-world outcomes. In particular, they use the chatter on Twitter.com to forecast film box-office revenue. They show that a simple model built from the rate at which tweets are created about particular topics can outperform market-based predictors. They also show how sentiments extracted from Twitter can be used to further increase the forecasting power of social media [17].

III. DISCUSSION
The literature has thus been reviewed to analyze the concepts of Big Data Analytics under study and their significance to decision making. Big data, its characteristics and its importance were examined. In addition, several of the tools and methodologies for big data analysis were studied. Big data storage and management were described, as was big data analytic processing. Some of the more sophisticated data analytics approaches were also examined. This review thus offers people and companies examples of the different tools, techniques and technologies that may be applied to big data. It gives users a sense of the technology they need, and developers an idea of what they can do to deliver better solutions for Big Data Analytics in support of decision making. The contribution of Big Data Analytics to decision making has thus been shown.

IV. CONCLUSION
In this study, we examined the emerging subject of big data, which has attracted a great deal of attention due to its exceptional potential and advantages. In this information era, large volumes of high-velocity data are created every day, and hidden knowledge patterns and intrinsic features lie within that data. Big data analytics can thus be applied to improve the analysis of big data and to uncover hidden insights
and important knowledge, using sophisticated analytical technologies to make business change easier. Furthermore, any new technology, if implemented appropriately, may provide numerous potential advantages and improvements, let alone big data, which is an important field with a promising future if properly addressed. It requires adequate storage, management, integration, federation, cleansing, processing, analysis, etc. On top of all the difficulties of traditional data management, big data multiplies these challenges enormously because of the greater volume, velocity and variety of data and sources. Future studies might therefore concentrate on developing a Big Data Management roadmap or framework that addresses these problems. In this era of data overflow, we believe that big data analytics is of tremendous importance and can bring unanticipated insights and advantages to policymakers in numerous sectors. If properly exploited and implemented, big data analytics may provide the basis for scientific, technical and humanitarian advances.

REFERENCES
[1] Siddiqui MHF, Kumar R. Interpreting the Nature of Rainfall with AI and Big Data Models. In: Proceedings of International Conference on Intelligent Engineering and Management, ICIEM 2020. 2020.
[2] Sehgal D, Agarwal AK. Real-time sentiment analysis of big data applications using twitter data with Hadoop framework. In: Advances in Intelligent Systems and Computing. 2018.
[3] Sehgal D, Agarwal AK. Sentiment analysis of big data applications using Twitter Data with the help of HADOOP framework. In: Proceedings of the 5th International Conference on System Modeling and Advancement in Research Trends, SMART 2016. 2017.
[4] TechAmerica Foundation's Federal Big Data Commission. Demystifying Big Data: A Practical Guide To Transforming The Business of Government Listing of Leadership and Commissioners Global Executive Vice President and General Manager. UNICOM Gov. 2012;1–40.
[5] Jain N, Awasthi Y. WSN-AI based Cloud computing architectures for energy efficient climate smart agriculture with big data analysis. Int J Adv Trends Comput Sci Eng. 2019.
[6] Gupta P, Tyagi N. An approach towards big data - A review. In: International Conference on Computing, Communication and Automation, ICCCA 2015. 2015.
[7] Al-Bahri B, Noronha H, Pandey J, Singh AV, Rana A. Evaluate the Role of Big Data in Enhancing Strategic Decision Making for E-governance in E-Oman Portal. In: ICRITO 2020 - IEEE 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions). 2020.
[8] Gupta D, Rana A, Tyagi S. A novel representative dataset generation approach for big data using hybrid Cuckoo search. Int J Adv Soft Comput its Appl. 2018.
[9] Cuzzocrea A, Song IY, Davis KC. Analytics over large-scale multidimensional data: The big data revolution! Int Conf Inf Knowl Manag Proc. 2011;101–3.
[10] Herodotou H, Lim H, Luo G, Borisov N, Dong L, Cetin FB, et al. Starfish: A self-tuning system for big data analytics. CIDR 2011 - 5th Bienn Conf Innov Data Syst Res Conf Proc. 2011;261–72.
[11] Elgendy N, Elragal A. Big Data Analytics in Support of the Decision Making Process. Procedia Comput Sci. 2016;100:1071–84.
[12] Tiwari S, Wee HM, Daryanto Y. Big data analytics in supply chain management between 2010 and 2016: Insights to industries. Comput Ind Eng. 2018.
[13] SAS. The Value of Big Data and the Internet of Things to the UK Economy. Rep SAS by Cent Econ reforms. 2016;(February):54.
[14] Wamba SF, Gunasekaran A, Akter S, Ren SJ fan, Dubey R, Childe SJ. Big data analytics and firm performance: Effects of dynamic capabilities. J Bus Res. 2017.
[15] Choi TM, Wallace SW, Wang Y. Big Data Analytics in Operations Management. Prod Oper Manag. 2018.
[16] Van Der Valk T, Gijsbers G. The use of social network analysis in innovation studies: Mapping actors and technologies. Innov Manag Policy Pract. 2010;12(1):5–17.
[17] Asur S, Huberman BA. Predicting the future with social media. In: Proceedings - 2010 IEEE/WIC/ACM International Conference on Web Intelligence, WI 2010. 2010.
