
Word Practice 1

This document discusses big data, its importance in the digital world, and some of the key challenges associated with it. It defines big data using the three V's of volume, variety, and velocity. It describes how digital transformation has led to unprecedented amounts of data being generated from a variety of sources. Some of the main challenges of big data include the need to store and process large volumes of data from multiple sources in real-time.

Uploaded by

zoeolaizola
Copyright
© All Rights Reserved

Big data and its importance in the digital world


Content
BIG DATA: DESCRIPTION AND DEFINITION
WHERE IS ALL THIS BIG DATA? BIG FIGURES AND SOURCES OF DATA
THE TECHNOLOGICAL COMPONENT OF BIG DATA
KEY CHALLENGES
DATA SCIENCE: MINING INFORMATION FROM DATA
SKILLS AND KNOW-HOW
WHAT IS DATA SCIENCE
THE DATA ECONOMY AND BIG DATA, AN ESSENTIAL BINOMIAL
MAIN OBJECTIVES
OBSTACLES AND ADOPTION OF BIG DATA IN SPAIN AND LATIN AMERICA

Big data is frequently characterized using the three V’s: volume,
variety and velocity.
• Volume: the most obvious attribute and that captured in the
term ‘big’. The growth is evident in the evolution from metrics
such as megabytes, gigabytes and terabytes to petabytes.
• Variety: both in terms of the type of data and its sources, so
that we have gone from handling static data structured in
databases (derived from limited sources) to processing: (i)
structured, semi-structured and unstructured information; (ii)
dynamic or continuously changing data; (iii) data generated by
people, machines, sensors, etc.
• Velocity: in the capture, movement and processing of data to
the point of taking place in real time.
Some organizations have added new V’s to the definition of big data:
veracity (the quality of the data captured), variability (the handling of
inconsistencies due to change in the meaning of the data) and value
(the derived income or profits).

Big data: description and definition
With statements and similes as eye-catching as “the oil of the twenty-
first century”, big data has emerged as one of the major technology
revolutions unfolding today. This term refers to the emergence and
utilization of large volumes of data, some of which already existed
but had never before been seen as a source of value for businesses,
of transparency for governments or as the means to enhanced self-
management for all citizens in their everyday lives. Hand in hand with
digitalization – the transformation by which processes that were
formerly carried out physically, manually or mechanically are now
carried out using information technology – everything that happens in
our daily lives leaves a trace or footprint in the form of data.

So what is data? Data is a record that is stored in
data silos and lakes, saving variables such as the
actor that produced an interaction, at which time it
took place, in what place, what characteristics the
actor presented, what the context was, etc.
To provide an example from our everyday lives, each time a user
makes a purchase online a footprint is established in a database
recording what he or she bought, how much and how he or she paid,
what other products he or she bought at the same time, where he or
she was when he or she made the purchase, along with a myriad of other
information. Consider that at present, in Spain alone, the major online
commerce portals are seeing their revenue grow year after year,
which means billions of transactions and millions of users generating
all of the information itemized above. That’s why it is big.
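
As a rough illustration of the footprint described above, a single purchase could be modelled as one record. This is only a sketch: the class name `PurchaseEvent`, its fields and the sample values are assumptions made for the example, not a schema used by any real portal. Prices are kept in cents to avoid rounding issues.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass
class PurchaseEvent:
    """One hypothetical footprint left by an online purchase."""
    user_id: str          # who produced the interaction
    timestamp: datetime   # when it took place
    location: str         # where the user was
    payment_method: str   # how the user paid
    items: Dict[str, int] = field(default_factory=dict)  # product -> price in cents

    def total_cents(self) -> int:
        """Total amount spent in this purchase, in cents."""
        return sum(self.items.values())


# Made-up sample record, for illustration only.
event = PurchaseEvent(
    user_id="u-001",
    timestamp=datetime(2024, 3, 5, 18, 30),
    location="Madrid",
    payment_method="card",
    items={"running shoes": 5990, "socks": 510},
)
print(event.total_cents())
```

Billions of such records, one per transaction, are what turn an ordinary database into big data.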
Digitalization has not only enabled the ‘sensorization’ of many of our
movements and actions, it has paved the way for economic scaling
to an unprecedented magnitude. This process has enabled
commerce to transcend many of the physical barriers it previously
faced: a shoe store’s business was limited by the size of the store,
the number of salespersons or its opening hours; nowadays any
clothing company with an online portal can provide service in any
part of the world and customers can buy its products at any time and
in any place. Of course, there are still limits: the physical products
continue to require a logistics, supplier and manufacturing platform
that is hard to scale up at the same pace as potential sales but the
digitalization of the industry and its continuous monitoring is enabling
process optimization to a degree we are still unable to grasp. The
natural consequence of digital transformation, another buzzword, is
the generation of unprecedented volumes of data.

Where is all this big data? Big figures and sources of data
We are not limited to the data at our fingertips, which we can now
‘sensorize’ thanks to digital transformation: there are also third parties that
allow us to use their data, subject to the associated legal
constraints, to build information or applications from it.
This is the case of social networks such as Twitter or Facebook,
which offer application programming interfaces (API, or more
colloquially, data download connection points) to third parties,
enabling them to extract detailed information about their users and
their behavior patterns. YouTube has other APIs to extract statistics
from the videos its portal hosts and other networking sites such as
LinkedIn, Instagram and Google Plus do likewise.
This big data that we users are generating outside of firms such as
banks, insurers, large retailers, supermarkets or fashion apparel
brands is being used by those firms to learn more about our tastes
and interests, align their products and offers, understand our daily
mobility patterns and adapt their marketing campaigns so as to be
less disruptive. If the companies’ internal databases already qualify
as big data, the volume of data handled by these external sources is
mind-boggling.
As shown in the next chart, in just one minute nearly half a million
tweets are sent, 3.5 million Google searches performed and nearly a
million users log on to Facebook. Every one of these events is a new
thread in an immense table of data that reflects what we do every
day, how we feel or where we spend our time.
But was there no data before this digital revolution? Of course there
was. In that case, are big data and data analysis a totally new
phenomenon? Of course not.
One of the oldest recorded instances of the concerted collection and
exploitation of data is the analysis of patients who died from cholera
performed by Dr. John Snow, one of the precursors of epidemiology,
in London in 1854. By visualizing geographically the deaths from
cholera in the month of September that year, this doctor verified that
the variable that best explained their occurrence was a well of
contaminated water located on Broad Street. Thanks to his analysis
of the data, he was also able to explain why people who lived far
from that street fell ill and why at a nearby workhouse where over 500
people worked, only five got sick.
There are several reasons why a new term has been embraced to refer to
these technologies and techniques and why it has suddenly
become a buzzword across all sectors of the economy:

The main one, already mentioned, is the digital transformation that
has triggered the collection of information on a large scale from
various business processes.
The second is the advent of technologies that allow us to store and
process large volumes of information in a cheap and scalable
manner.
The third is the acquisition of know-how in the world of business with
respect to a wide variety of statistical, mathematical and IT
techniques, with which the academic and scientific communities have
been familiar since the 1950s, that allows us to identify patterns in
our data with which to anticipate what might happen in the future.
The fourth and last reason is, precisely, what has come to be known
as the data economy, the awareness that both the data itself and the
value that can be derived from it (information, conclusions,
predictions, etc.) can be used to help businesses generate more
sales, reduce costs and, even, create new data-driven businesses.

The technological component of big data
The advent of all of the sources of data mentioned earlier, among
many others, created a technological challenge for businesses, as
their traditional IT systems were not equipped to handle the new
characteristics of this new data compared to their legacy sources.

Key Challenges
The key challenges arising in the wake of the advent of the new
sources of digital data were the following:
• The need to store enormous volumes of data: the most
obvious challenge was surely the need to store and process
such vast amounts of data.
• Intake of data from multiple sources: the emergence of
different points of access to data, in various formats and
different means of connection, etc. For the advanced analysis
that machine learning techniques enable, it is necessary to
model our problem in the most detailed manner possible, in
turn making it necessary to incorporate several sources of
information, both internal (using new and existing company
tools, marketing apps, etc.) and external (social media, public
data, meteorology data, event data, localization data, etc.).
• Data capture rates: some of these sources not only generate
high volumes of data, they also do it at speeds that vary over
time, punctuated by huge peaks. By way of illustration,
although the number of tweets per minute that mention a
football player is high, when this player scores a goal, the
number of mentions concentrated in a very short time lapse
surges.
• Unstructured data: sources of data are emerging that, instead
of contributing semantically specific information, need to be
pre-processed to extract its true meaning. For example, a
company’s customer database includes information about the
age or city where its customers live (fields containing
unequivocal semantic information) but also includes the
opinions those customers share in chats, in the form of free
text; the fact that the machines are capable of storing these
opinions does not mean they are capable of understanding
them (which users have complained about the technical
service in the past month?).
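
The question posed in the last bullet can be approximated, very crudely, with keyword rules. The sketch below is an illustration under stated assumptions: the chat records, the keyword lists `TOPIC` and `NEGATIVE` and the helper names are all made up, and real systems would rely on proper text-processing techniques rather than substring matching.

```python
from datetime import date, timedelta

# Hypothetical chat records: (user, date, free text).
chats = [
    ("ana", date(2024, 5, 20), "The technical service never called me back."),
    ("luis", date(2024, 5, 22), "Great prices, fast delivery!"),
    ("marta", date(2024, 3, 1), "Technical service was slow and unhelpful."),
    ("ana", date(2024, 5, 25), "Still waiting for the technician..."),
]

# Assumed keyword lists; a real project would need far richer rules.
TOPIC = ("technical service", "technician")
NEGATIVE = ("never", "slow", "unhelpful", "waiting")


def complained(text: str) -> bool:
    """Naive rule: the text mentions the topic and at least one negative cue."""
    lowered = text.lower()
    return any(t in lowered for t in TOPIC) and any(n in lowered for n in NEGATIVE)


def recent_complainers(records, today: date, days: int = 30) -> set:
    """Users with at least one matching message in the last `days` days."""
    cutoff = today - timedelta(days=days)
    return {user for user, day, text in records
            if day >= cutoff and complained(text)}


print(recent_complainers(chats, today=date(2024, 6, 1)))
```

Even this toy version shows why free text is harder than a field such as age or city: the answer depends on how well the rules capture what people actually write.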
The traditional standard databases, known as relational databases,
were very robust from the standpoint of the companies’ processes
and operations and guaranteed consistency, durability and isolation
over time but they were not efficient enough to deal with the issues
posed above. Large companies such as Google and Yahoo invested
heavily in research and development to resolve this deficit. The result
was an absolute paradigm shift in terms of data storage and
processing systems. Without getting embroiled in the technical
details, they came up with a solution that enabled the storage and
processing of information in a distributed manner (among many
servers) with far fewer data structuring requirements compared to the
legacy systems. This enables, in a very simple manner, what is
known as horizontal scalability, the ability to grow storage and
processing capacity over time by simply adding new servers to the
existing IT systems without impacting what is already there.
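
The idea of distributed storage and processing can be sketched in a few lines. This is a toy model, not how any particular product works: the "servers" are plain Python lists, the placement rule (CRC32 of the key modulo the number of servers) is an arbitrary choice for the example, and all function names are hypothetical.

```python
from collections import Counter
import zlib


def shard_for(key: str, n_servers: int) -> int:
    """Pick a server for a key with a stable hash (CRC32, an arbitrary choice)."""
    return zlib.crc32(key.encode("utf-8")) % n_servers


def distribute(records, n_servers):
    """Store each (key, value) record on the server chosen by its key."""
    servers = [[] for _ in range(n_servers)]
    for key, value in records:
        servers[shard_for(key, n_servers)].append((key, value))
    return servers


def total_per_key(servers):
    """Each server sums its own records; the partial results are then merged."""
    partials = []
    for local in servers:
        counts = Counter()
        for key, value in local:
            counts[key] += value
        partials.append(counts)
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return merged


# Made-up sales records: (product, units sold).
records = [("shoes", 2), ("socks", 1), ("shoes", 3), ("hats", 4), ("socks", 5)]
small = total_per_key(distribute(records, n_servers=2))
large = total_per_key(distribute(records, n_servers=5))
print(small == large)
```

The totals are the same whether the data sits on two servers or five, which is the essence of horizontal scalability. Note that this simple modulo rule would move many records when a server is added; that is a limitation of the sketch.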
The key technology development to emerge, in 2008, was Hadoop,
an implementation of the above-mentioned distributed scheme.
However, the major novelty introduced by this technology was not
only the fact that it resolved the problems the new data sources had
posed but also the fact that Yahoo and the Nutch project (the two main
parties involved in its development) released it as open source
software, which meant that a large community of developers helped
to move it along and it was ultimately embraced by the technology
players as the standard. Since then a myriad of technologies have
been developed around the Hadoop environment and today there are
other distributed processing models that outperform it. However, it
was the strategic gambit of releasing this know-how as open source
software that made big data, in its technological manifestation, one of
the companies’ main objectives.

Data science: mining information from data
This new technology helps businesses with their everyday work of
managing the data they produce and capture from other sources;
however, storing and processing the data does not in itself
extract value from it.
Imagine an immense table of data in which you can search for, add
or transform information. But to what end? What use would these
huge data centers be for businesses and public entities if they did not
know what to do with them?

Although the term ‘big data’ originally referred to
technology, it has become so popular that it is often
used to also refer to the uses given to the data.
Company surveys asking about how they use big
data typically cover ‘use cases’ and the ways they
extract value from it.
One of the main challenges facing organizations when it comes to
tackling their big data projects is indeed defining what they want to
solve with their data. This task requires, on the one hand, sufficient
knowledge of the business to establish the organization’s
requirements and estimate what is to be gained from carrying out
these projects; but it also requires, on the other hand, the technical
know-how to transform a business problem or use into a scientific
variable that can be used, based on the evidence the data provides
and the application of analytical, statistical and mathematical tools, to
extract robust conclusions, propose data-based actions or predict
future behavior. This new class of tasks, requiring a mix of business
and scientific know-how, is what is today known as data science and
is entirely different in nature to the technological issue addressed
above.

Skills and Know-how


The professionals tasked with carrying out these duties are data
scientists and they are very much in demand in today’s job market.
They offer a mixed bag of mathematical, statistical, IT and business
skills and know-how and they are put in charge of projects and
developments that can be broken down into well-differentiated parts:
• Identifying the problem: formulating the question that, if
answered, will resolve a specific business problem.
• Getting the data: studying which sources of data are needed
to solve the problem.
• Exploring and analyzing the data: using analytical, statistical
and visualization techniques, searching for the answers from a
descriptive standpoint to the specific problem.
• Modelling the data: certainly another of the key disruptions in
the world of big data. Mathematical models can now be created
from millions of historical records, paving the
way, among other things, for discovering behavior patterns,
predicting how customers and users will act, developing
automated recommendation mechanisms, etc.
• Communicating the results: the data scientists must translate
all of these technical tasks and their outcomes into business
language so that they can be understood by an organization’s
decision-makers.
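
The modelling step listed above can be illustrated with the simplest possible model, a straight line fitted by ordinary least squares. This is a minimal sketch: the figures are invented, the function name `fit_line` is hypothetical, and real projects would use far more records and usually a statistics library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: site visits vs. purchases per customer.
visits = [1, 2, 3, 4, 5]
purchases = [1, 3, 5, 7, 9]

slope, intercept = fit_line(visits, purchases)
predicted = slope * 6 + intercept  # expected purchases for a customer with 6 visits
print(slope, intercept, predicted)
```

The fitted line is the "pattern" extracted from the historical records, and applying it to a new value is the prediction.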

What is Data Science


In short, data science is the universe of techniques that allows us to
go from a data warehouse to the applications that extract value from
that data. Even if we don’t see them, these models are already
present in our lives: when we turn on a GPS system and it estimates
how long it will take to get to our destination based on past
experience and traffic data, there is a mathematical model behind it;
when we visit an e-commerce portal and based on what we have
bought in the past and our profiles, it recommends products that
match our preferences, it is an automated recommendation system
that picks the suggestions; when a company decides to open a franchise in
one place instead of another, the decision is the result of a data-
based study analyzing the corresponding success factors such as
footfall, mobility flows and the socio-demographic profile in the area.
We are surrounded by algorithms and models that facilitate decision-
making and know us thanks to the digital footprint we leave in our
wake, in the virtual and physical world alike.
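
To make the recommendation example concrete, here is a minimal sketch of one simple approach: suggesting products bought by customers with overlapping purchase histories. The customers, products and the `recommend` function are assumptions for illustration only; production recommenders are considerably more elaborate.

```python
from collections import Counter

# Hypothetical purchase histories: customer -> set of products bought.
history = {
    "ana": {"tent", "sleeping bag", "stove"},
    "luis": {"tent", "sleeping bag"},
    "marta": {"tent", "stove", "boots"},
    "pablo": {"boots", "socks"},
}


def recommend(customer: str, history: dict, top_n: int = 2) -> list:
    """Suggest products bought by customers who share purchases with `customer`."""
    own = history[customer]
    scores = Counter()
    for other, basket in history.items():
        if other == customer:
            continue
        overlap = len(own & basket)    # shared purchases act as a similarity score
        for product in basket - own:   # only suggest products not yet bought
            scores[product] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, score in ranked if score > 0][:top_n]


print(recommend("luis", history))
```

The only input is the digital footprint itself: what each customer bought in the past.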

The data economy and big data, an essential binomial
Big data today/Monetizing data: internal data, external data, new
business models, open innovation, data-based products, data
services.
At this juncture, we have learned that companies have figured out
how to tackle the technological issue posed by having to handle the
overwhelming amounts of data in existence today and, having
resolved the issue, undertake data-driven projects. But how do they
extract real value from their data? What kinds of projects are being
undertaken across the various sectors of the economy?

Main Objectives
The data science projects companies are taking on can have a host
of objectives but the main ones can be summed up as follows:
• Learning more about customers and users: by analyzing
customer behavior patterns enterprises can design strategies
to boost sales and customer loyalty by using this information
to enhance customer relations. Depending on a company’s
business, this can take several forms: if we are talking about a
digital business, it can mean studying and taking decisions on
the basis of how users browse, the content they visit and other
factors that shape the user-friendliness of a portal with the aim
of multiplying conversion rates; if we are talking about a
brick-and-mortar retail chain and we are capable of measuring the
places a customer lingers and relating that information to what
he or she ultimately buys, that information can be used to fine-
tune strategy with the aim of increasing sales.
• Cost-cutting: enterprises’ internal data is a reflection of what is
happening at the organization and can be used to identify
inefficiencies in its processes and reporting structures that can
be corrected on the basis of that analysis. A common example
of how costs can be streamlined is by predicting demand over
different periods of time. For example, in the manufacturing
sector, in order to branch out their services across a large
territory, companies use maintenance and logistics partners
whose agreements can be adjusted as a function of forecast
demand. If a company knows in advance the level of demand
it will encounter it can tailor these services, saving
unnecessary costs while raising the standard of customer
service provided in parallel.
• Creation of new data-driven products and services: the
footprint left by users in companies’ databases can be used to
generate recommendations for customers with the aim of
better matching their tastes or improving the service offered.
The online real estate portals are a clear example of this new
way of extracting value from data. These portals enable users
to find offers for apartments and commercial premises,
whether for rent or sale, and get in contact with the owners.
Consider the information these companies warehouse: on the
one hand they build a repository of property prices across an
extensive territory and they track user browsing patterns,
which provide a very good proxy for demand by region, on the
other. By crossing this information they can create automated
house appraisal tools and even offer data services to third
parties such as the ability to offer property price estimates as
a function of the properties’ characteristics and locations.
• New businesses: enterprise data is useful for third-party companies
with entirely different businesses, offering them the chance to
conceive of completely new business lines that generate data-
driven income. A case of such a new business for the banking
sector would be the creation of software based on the data
generated by face-to-face transactions paid for by credit card
with the idea that governments, for example, can understand
how tourists behave when visiting cities (in which areas they
spend more depending on their origin or what places of
interest they congregate in).
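
The demand-prediction example in the cost-cutting objective can be sketched with a basic moving average. Everything here is an assumption for illustration: the weekly figures, the window of three periods, the capacity of 40 jobs per crew and the function names are invented, and a moving average is only one of many possible forecasting techniques.

```python
import math


def moving_average_forecast(demand, window=3):
    """Forecast the next period as the mean of the last `window` periods."""
    if len(demand) < window:
        raise ValueError("not enough history for the chosen window")
    recent = demand[-window:]
    return sum(recent) / window


def crews_needed(forecast, jobs_per_crew):
    """Maintenance crews to contract so that forecast demand is covered."""
    return math.ceil(forecast / jobs_per_crew)


# Hypothetical weekly service requests for one region.
weekly_requests = [120, 135, 150, 160, 170]
forecast = moving_average_forecast(weekly_requests)  # mean of 150, 160 and 170
print(forecast, crews_needed(forecast, jobs_per_crew=40))
```

Knowing the expected demand in advance is what lets a company adjust its agreements with partners instead of paying for capacity it will not use.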

This raises the question: is this a reality? Are
companies managing to get value from their data
along these four facets?
A company that knows its customers better is bound to do a better
job at retaining them, keeping them satisfied and getting them to
purchase a larger number of products, which, in general, should
translate into a better business performance. A well-known case
study is that of the Four Seasons hotel chain, which analyzed the
comparative performance of its restaurants to create an ideal
experience that all guests should encounter when visiting them. As a
result, it managed to increase average expenditure in its restaurants
as well as boosting overall guest satisfaction. Moreover, it
adapted the amount of time allotted for sittings, making them longer
during holiday periods, when its guests were not in a rush, and
thereby reducing wait times.
By using big data to better understand their internal processes,
companies can identify inefficiencies that can be fixed, reducing
costs as a result. An example of how this can be achieved is found in
the project undertaken by Intel in 2012 in which it reduced the cost of
its technology by analyzing its enormous production history. What
they did was to analyze the quality control process and focus on
which tests were more useful depending on the product materials
and parts. If, on average, a part used to undergo around 19,000
tests, after the study these tests were focused on just a few parts,
without reducing the products’ life span, which translated into
immediate and direct cost savings for the company.
Lastly, and probably the most visionary aspect of data monetization,
there are also businesses based on customer data that provide data
services that help us make better decisions, even though the data in
question do not belong to the provider. Good cases in point are
personal finance management applications. Applications such as
Mint (very popular in the US), Mooverang and Fintonic use our data
histories to recommend ways to save, pose challenges, monitor our
accounts or provide automated alerts. These companies do not
generate the data but instead ask users for permission to access
their data to derive the information they need. This is a true embodiment of
the data economy.

Obstacles and adoption of big data in Spain and Latin America
• Understanding what big data can resolve: as we noted above, the
biggest challenge is translating business problems into data
problems and that is something which companies find hard to do
themselves.
• Complexity: data projects do not yield immediate results and
contain a scientific and exploratory component, which means
that the path to the solution is not always a straight line. As a
result of this uncertainty, it is possible that some of the
approaches embarked on turn out to be non-viable or that the
projects undertaken, and their associated costs, prove
protracted.
• Difficulty in finding the required talent: due to the mix of skills
needed to undertake data science projects, there is a scarcity
of skilled professionals, while companies are not typically able
to generate these skills in-house.
• Difficulty in estimating returns: In many use cases, it is hard to
estimate a big data project’s return on investment.
• Regulatory issues: some of a company’s use cases or internal
data policies can have legal consequences that must be taken
into account upfront, making the risk and governance issue a
vital aspect for certain data-related initiatives.
• Translating analysis into everyday applications: the companies
that are already undertaking big data projects find that it is
hard to put the results they obtain to work for them. On many
occasions the results of these projects are business
conclusions that can be translated into strategic initiatives but
in other instances the result needs to be integrated into the
companies’ IT systems, which requires system maintenance,
customization and development, tasks that increase the cost
of the projects and sometimes bring them to a halt.
For all of these reasons, even though big data is one of the strategic
trends taking strongest hold in companies across all sectors, the
process of embracing it - and by extension its full development -
remains ongoing.
According to several reports, 84.8% of companies in Spain are
carrying out big data projects or have plans to do so imminently.
Global analysis of how it is being adopted by sector reveals that 70%
of financial institutions report having already undertaken big data
projects, focused primarily on gaining better customer insight and
improving their product purchase propensity models, risk analysis
systems and the detection of potential fraud across a range of
transactions such as card payments or loan applications. In the case
of the telecommunications industry, roughly 60% of organizations are
executing big data projects to analyze their users’ mobility,
understand how they are connecting up, enhance their marketing
campaigns or send out personalized communications by making use of
the geographic data in their possession.
In the media, a sector in which digital transformation has had a major
impact in terms of modifying the traditional business models due to
the advent of online news portals, the main use of big data is related
to personalized content recommendations and the creation of new
data-based products; 73% of companies report being immersed in
some form of big data project. In other sectors, such as the retail or
logistics sectors, adoption of big data is tracking well above 50%.
Elsewhere, the big data phenomenon is similarly unstoppable in Latin
America and investments in big data technology are growing year
after year. The countries committing most significantly to developing
this technology are Brazil, Mexico and Argentina, where over 75% of
companies claim that they see big data as strategic and are carrying
out or planning to carry out big data projects within two years’ time.
The total market size is estimated at over $6.5 trillion.
In short, big data is a cross-cutting phenomenon that stands to
benefit all sectors of the economy and companies of all shapes and
sizes; and the world’s organizations are already extracting value from
their data. This is translating into strong demand for skilled big data
and data science professionals. Given the relatively nascent nature
of the field, there is often a shortage of the skills needed to develop
all the possible applications that could improve organizations’
business performance.
