1 Introduction To Data Science Lecture 1 KG Sir OEC M 621 (E)

Data science

Uploaded by

31PIYALI MAISAL

Introduction to data science (lecture 1)

Dr. Koushik Ghosh


Department of Mathematics, University Institute of
Technology, The University of Burdwan, Golapbag
(North), Burdwan-713104, India
What is Big Data?
Big Data can be defined as data sets so large that they require high-level
computation to assemble, manage, and analyze in order to draw meaningful
inferences.
Big Data Analytics refers to the systematic, well-ordered set of data
management tools, equipment, and procedures used for the effective
investigation and scrutiny of big data, both structured and unstructured.
Nowadays it is popularly known as Data Science.
Definition due to Diebold:
“Recently much good science, whether physical, biological, or social,
has been forced to confront—and has often benefited from—the Big Data
phenomenon. Big Data refers to the explosion in the quantity (and
sometimes, quality) of available and potentially relevant data, largely the
result of recent and unprecedented advancements in data recording and
storage technology”. [Reference: Diebold, F.X. (2012). On the origin(s)
and development of the term ‘Big Data’. PIER Working Paper 12-037.
Penn Institute for Economic Research, Department of Economics,
University of Pennsylvania]
Big data is characterized by three important properties, viz. i) volume
(data quantity or size), ii) velocity (data speed) and iii) variety (data
types). On this basis, an alternative definition of big data is as follows:
“Big Data is high volume, high velocity, and/or high variety information
assets that require new forms of processing to enable enhanced decision-
making, insight discovery and process optimization”. [Reference: Laney,
D. (2001). 3D data management: Controlling data volume, velocity, and
variety, Application Delivery Strategies, META Group]
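The "and/or" in the definition above means that a data set need not be high on all three Vs at once. The following is a minimal sketch of that logic in Python. The numeric thresholds, the `DatasetProfile` class, and the function names are hypothetical and chosen only for illustration; neither the lecture nor Laney (2001) prescribes any cut-off values.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, for illustration only. The source definition
# gives no numeric thresholds for "high" volume, velocity, or variety.
VOLUME_THRESHOLD_BYTES = 1e12       # 1 terabyte
VELOCITY_THRESHOLD_PER_SEC = 1e4    # records arriving per second
VARIETY_THRESHOLD_TYPES = 3         # number of distinct data types


@dataclass
class DatasetProfile:
    """A rough description of a data set along the three Vs."""
    name: str
    volume_bytes: float
    records_per_second: float
    data_types: frozenset


def high_dimensions(profile: DatasetProfile) -> list:
    """Return the Vs on which the data set exceeds the assumed cut-offs."""
    flags = []
    if profile.volume_bytes >= VOLUME_THRESHOLD_BYTES:
        flags.append("volume")
    if profile.records_per_second >= VELOCITY_THRESHOLD_PER_SEC:
        flags.append("velocity")
    if len(profile.data_types) >= VARIETY_THRESHOLD_TYPES:
        flags.append("variety")
    return flags


def is_big_data(profile: DatasetProfile) -> bool:
    """'and/or' reading: being high on at least one V is enough."""
    return len(high_dimensions(profile)) > 0


# Two made-up profiles: a web click stream and a small class survey.
click_stream = DatasetProfile("click stream", 5e12, 2e4, frozenset({"log"}))
class_survey = DatasetProfile("class survey", 2e6, 0.0, frozenset({"table"}))

print(click_stream.name, high_dimensions(click_stream), is_big_data(click_stream))
print(class_survey.name, high_dimensions(class_survey), is_big_data(class_survey))
```

Under these assumed thresholds the click stream qualifies on volume and velocity but not on variety, while the class survey qualifies on none. Changing the thresholds changes the classification, which is why the definition is best read as qualitative.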
Examples:
• Call Centre Logs
• Client Chats
• SMS Texts
• Instagram Pictures
• Click Stream on the Web
• Social Media
• Blogs
• CCTV
• Barcode Scanner
• Geographic Information Systems (GIS)
• Genomics
• YouTube
• Internet of Things (IoT)
• Astrophysical and Geophysical Data
• Climate Data
The Significance of Big Data in the Present Era

Manyika et al. argued that society is ‘on the cusp of a tremendous
wave of innovation, productivity, and growth as well as new modes of
competition and value capture—all driven by Big Data’. [Reference:
Manyika, J. et al. (2011). Big Data: The next frontier for innovation,
competition, and productivity. McKinsey Global Institute, McKinsey &
Co.] By means of Big Data Analytics we can identify and analyze hidden
patterns and extract meaning and insight from big data, which enables
better decision making and predictive analysis.
Challenges in Big Data:
Every day we generate 2.5 quintillion (2.5 × 10^18) bytes of data, and
possibly 90% of all existing data has been created in the last two to
three years.
[Reference: Siegel, E. (2013). Predictive analytics: The power to predict
who will click, buy, lie, or die. Hoboken, New Jersey: John Wiley &
Sons]
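To put the daily figure in perspective, the short arithmetic sketch below converts 2.5 × 10^18 bytes per day into exabytes per day and terabytes per second. It assumes decimal (SI) units, i.e. 1 exabyte = 10^18 bytes and 1 terabyte = 10^12 bytes; the function names are illustrative and not taken from the source.

```python
# Daily data volume quoted in the lecture (attributed to Siegel, 2013).
BYTES_PER_DAY = 2.5e18  # 2.5 quintillion bytes

# Decimal (SI) unit sizes; using SI rather than binary units is an assumption.
EXABYTE = 1e18
TERABYTE = 1e12
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_volume_in_exabytes(bytes_per_day: float) -> float:
    """Express a daily byte count in exabytes."""
    return bytes_per_day / EXABYTE


def rate_in_terabytes_per_second(bytes_per_day: float) -> float:
    """Average generation rate, assuming data arrives evenly over the day."""
    return bytes_per_day / SECONDS_PER_DAY / TERABYTE


print(f"{daily_volume_in_exabytes(BYTES_PER_DAY):.1f} exabytes per day")
print(f"{rate_in_terabytes_per_second(BYTES_PER_DAY):.1f} terabytes per second")
```

In other words, the quoted figure corresponds to 2.5 exabytes per day, or an average of roughly 29 terabytes every second.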
However, the real problem with Big Data is not storage, since the cost of
storage has fallen. The challenge lies in finding effective strategies to
transform data reliably into useful information.
[Reference: Moldoveanu, M. C. (2013). The ingenuity imperative: What
Big Data means for big business. Rotman Magazine, 59–63]
The challenges include capture, search, sharing, analysis and
visualization of big data.
[Courtesy: Janakiraman Moorthy (Coordinator), Rangin Lahiri, Neelanjan
Biswas, Dipyaman Sanyal, Jayanthi Ranjan, Krishnadas Nanath, and Pulak
Ghosh (2015). Big Data: Prospects and Challenges. VIKALPA: The Journal
for Decision Makers, 40(1), 74–96. Indian Institute of Management,
Ahmedabad; SAGE Publications]
