Word Practice 1
Key Challenges
The key challenges that arose with the advent of new sources of
digital data were the following:
• The need to store enormous volumes of data: the most
obvious challenge was surely the need to store and process
such vast amounts of data.
• Intake of data from multiple sources: data now arrives through
different points of access, in various formats and over
different means of connection. The advanced analysis that
machine learning techniques enable requires modeling the
problem in as much detail as possible, which in turn makes it
necessary to incorporate several sources of information, both
internal (new and existing company tools, marketing apps,
etc.) and external (social media, public data, meteorology
data, event data, location data, etc.).
• Data capture rates: some of these sources not only generate
high volumes of data, they also do so at speeds that vary over
time, punctuated by huge peaks. By way of illustration, the
number of tweets per minute that mention a football player
may already be high, but when that player scores a goal, the
number of mentions surges within a very short time span.
• Unstructured data: sources of data are emerging that, instead
of contributing semantically specific information, need to be
pre-processed to extract their true meaning. For example, a
company’s customer database includes fields such as the
customers’ age or the city where they live (fields containing
unequivocal semantic information), but it also includes the
opinions those customers share in chats, in the form of free
text. The fact that machines are capable of storing these
opinions does not mean they are capable of understanding
them (for instance, which users have complained about the
technical service in the past month?).
The traditional standard databases, known as relational databases,
were very robust from the standpoint of companies’ processes and
operations and guaranteed consistency, durability and isolation
over time, but they were not efficient enough to deal with the
challenges described above. Large companies such as Google and Yahoo invested
Main Objectives
The data science projects companies are taking on can have a host
of objectives, but the main ones can be summed up as follows:
• Learning more about customers and users: by analyzing
customer behavior patterns, enterprises can design strategies
to boost sales and customer loyalty, using this information
to enhance customer relations. Depending on a company’s
business, this can take several forms. For a digital business,
it can mean studying, and making decisions based on, how
users browse, the content they visit and other factors that
shape the user-friendliness of a portal, with the aim of
increasing conversion rates. For a brick-and-mortar retail
chain that is able to measure where a customer lingers and
relate that information to what he or she ultimately buys, the
same insight can be used to fine-tune strategy with the aim of
increasing sales.
• Cost-cutting: enterprises’ internal data is a reflection of what is
happening at the organization and can be used to identify
inefficiencies in its processes and reporting structures that can
be corrected on the basis of that analysis. A common way to
reduce costs is to predict demand over different periods of
time. In the manufacturing sector, for example, companies that
need to extend their services across a large territory use
maintenance and logistics partners whose agreements can be
adjusted as a function of forecast demand. If a company
knows in advance the level of demand it will encounter, it can
tailor these services, avoiding unnecessary costs while at the
same time raising the standard of customer service provided.
• Creation of new data-driven products and services: the
footprint left by users in companies’ databases can be used to