Business Analytics Unit I
Data-driven companies treat their data as a business asset and actively look for
ways to turn it into a competitive advantage. Success with business analytics
depends on data quality, skilled analysts who understand the technologies and the
business, and a commitment to using data to gain insights that inform business
decisions.
In this tutorial, we're going to walk through the different phases of the data
analytics life cycle and then go over each phase in detail.
The data analytics lifecycle was designed to address Big Data problems and data
science projects. The process is iterative, reflecting how real projects unfold. To
address the specific demands of conducting analysis on Big Data, a step-by-step
methodology is required to plan the various tasks associated with the acquisition,
processing, analysis, and repurposing of data.
Phase 1: Discovery -
o The team studies data to discover the connections between variables. Later,
it selects the most significant variables as well as the most effective models.
o In this phase, the data science team creates data sets that can be used for
training, testing, and production purposes.
o The team builds and implements models based on the work completed in the
model planning phase.
o Some of the tools used commonly for this stage are MATLAB and
STATISTICA.
o The team creates datasets for training, testing as well as production use.
o The team also evaluates whether its current tools are sufficient to run the
models or whether a more robust environment is required.
o Free or open-source tools: R and PL/R, Octave, WEKA.
o Commercial tools: MATLAB, STATISTICA.
o Following the execution of the model, team members will need to evaluate
the outcomes of the model to establish criteria for the success or failure of
the model.
o The team is considering how best to present findings and outcomes to the
various members of the team and other stakeholders while taking into
consideration cautionary tales and assumptions.
o The team should determine the most important findings, quantify their value
to the business and create a narrative to present findings and summarize
them to all stakeholders.
Phase 6: Operationalize -
o The team distributes the benefits of the project to a wider audience. It sets up
a pilot project that deploys the work in a controlled manner prior to
expanding the work to the entire enterprise.
o This technique allows the team to gain insight into the performance and
constraints related to the model within a production setting at a small scale
and then make necessary adjustments before full deployment.
o The team produces the last reports, presentations, and codes.
o Open-source or free tools used in this phase include Octave, WEKA, SQL, and MADlib.
Types of Data Analytics
1. Predictive Analytics
Predictive analytics turns data into valuable, actionable information. It uses data
to determine the probable outcome of an event or the likelihood of a situation
occurring. Predictive analytics draws on a variety of statistical techniques from
modeling, machine learning, data mining, and game theory that analyze current
and historical facts to make predictions about future events. Techniques used for
predictive analytics include:
Linear Regression
Time Series Analysis and Forecasting
Data Mining
Basic Cornerstones of Predictive Analytics
Predictive modeling
Decision Analysis and optimization
Transaction profiling
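To make the idea concrete, here is a minimal sketch of predictive analytics using linear regression in Python with scikit-learn. The advertising-spend data, column meanings, and figures are invented purely for illustration and are not taken from any real case.

```python
# Minimal predictive-analytics sketch: fit a linear regression on illustrative
# historical data, then predict the probable outcome for a new situation.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical historical facts: advertising spend (in $1000s) vs. units sold.
ad_spend = np.array([[10], [15], [20], [25], [30]])
units_sold = np.array([120, 150, 185, 210, 240])

model = LinearRegression()
model.fit(ad_spend, units_sold)

# Predict the likely units sold for a planned spend of $35k.
predicted = model.predict(np.array([[35]]))
print(f"Predicted units sold: {predicted[0]:.0f}")
```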
2. Descriptive Analytics
Descriptive analytics looks at data and analyzes past events for insight into how to
approach future events. It looks at past performance and, by mining historical
data, seeks to understand the causes of past success or failure. Almost all
management reporting, such as sales, marketing, operations, and finance, uses this
type of analysis.
The descriptive model quantifies relationships in data in a way that is often used
to classify customers or prospects into groups. Unlike a predictive model that
focuses on predicting the behavior of a single customer, descriptive analytics
identifies many different relationships between customers and products.
Common examples of Descriptive analytics are company reports that provide
historic reviews like:
Data Queries
Reports
Descriptive Statistics
Data dashboard
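As a small illustration of descriptive analytics, the sketch below uses pandas to build a simple historical report and summary statistics. The sales figures and column names are hypothetical.

```python
# Descriptive-analytics sketch: summarize past performance with pandas.
import pandas as pd

sales = pd.DataFrame({
    "region": ["North", "South", "North", "East", "South", "East"],
    "revenue": [1200, 950, 1430, 800, 1010, 870],
})

# Data query / report: total, average and count of revenue per region.
report = sales.groupby("region")["revenue"].agg(["sum", "mean", "count"])
print(report)

# Descriptive statistics for the whole data set.
print(sales["revenue"].describe())
```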
3. Prescriptive Analytics
Prescriptive analytics automatically synthesizes big data, mathematical science,
business rules, and machine learning to make a prediction and then suggests
decision options to take advantage of the prediction.
Prescriptive analytics goes beyond predicting future outcomes by also suggesting
actions to benefit from the predictions and showing the decision maker the
implications of each decision option. Prescriptive analytics not only anticipates
what will happen and when it will happen but also why it will happen. Further,
prescriptive analytics can suggest decision options for how to take advantage of a
future opportunity or mitigate a future risk and illustrate the implication of each
decision option.
For example, Prescriptive Analytics can benefit healthcare strategic planning by
using analytics to leverage operational and usage data combined with data of
external factors such as economic data, population demography, etc.
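A very rough sketch of the prescriptive idea, assuming an invented demand forecast and a simple business rule: the prediction is combined with the rule to suggest a decision option and show its implication. None of the numbers come from this text.

```python
# Prescriptive-analytics sketch: combine a prediction with a business rule
# to suggest a decision option and show its expected implication.
predicted_demand = 230      # units, e.g. output of a predictive model (hypothetical)
current_stock = 150
unit_margin = 12.0          # profit per unit sold (hypothetical)
reorder_cost = 500.0        # fixed cost of placing an order (hypothetical)

shortfall = max(predicted_demand - current_stock, 0)
expected_gain = shortfall * unit_margin - reorder_cost

if expected_gain > 0:
    print(f"Suggested action: reorder {shortfall} units "
          f"(expected gain of about {expected_gain:.0f})")
else:
    print("Suggested action: do not reorder; the expected gain "
          "does not cover the order cost")
```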
4. Diagnostic Analytics
In this analysis, we generally use historical data over other data to answer any
question or for the solution of any problem. We try to find any dependency and
pattern in the historical data of the particular problem.
For example, companies go for this analysis because it gives great insight into a
problem, and it requires keeping detailed information at their disposal; otherwise,
data would have to be collected afresh for every problem, which would be very
time-consuming. Common techniques used for diagnostic analytics are:
Data discovery
Data mining
Correlations
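A minimal sketch of the correlation step in diagnostic analytics, using pandas on a small hypothetical history: a correlation matrix is a quick first check for dependencies and patterns that might explain an outcome.

```python
# Diagnostic-analytics sketch: look for dependencies in historical data.
import pandas as pd

history = pd.DataFrame({
    "ad_spend":    [10, 15, 20, 25, 30, 35],
    "site_visits": [500, 610, 690, 820, 900, 1010],
    "units_sold":  [120, 150, 185, 210, 240, 260],
})

# Pairwise correlations hint at which variables move together.
print(history.corr())
```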
Data collection is the process of collecting and evaluating information or data from
multiple sources to find answers to research problems, answer questions, evaluate
outcomes, and forecast trends and probabilities. It is an essential phase in all types
of research, analysis, and decision-making, including that done in the social
sciences, business, and healthcare.
During data collection, the researchers must identify the data types, the sources of
data, and what methods are being used. We will soon see that there are many
different data collection methods. There is heavy reliance on data collection in
research, commercial, and government fields.
Before an analyst begins collecting data, they must first answer three questions:
what types of data are needed, where the data will come from, and what methods
will be used to collect it.
Before a judge makes a ruling in a court case or a general creates a plan of attack,
they must have as many relevant facts as possible. The best courses of action come
from informed decisions, and information and data are synonymous.
The concept of data collection isn’t a new one, as we’ll see later, but the world has
changed. There is far more data available today, and it exists in forms that were
unheard of a century ago. The data collection process has had to change and grow
with the times, keeping pace with technology.
Whether you’re in the world of academia, trying to conduct research, or part of the
commercial sector, thinking of how to promote a new product, you need data
collection to help you make better choices.
Now that you know what data collection is and why we need it, let's take a look at
the different methods of data collection. While the phrase “data collection” may
sound all high-tech and digital, it doesn’t necessarily entail things like
computers, big data, and the internet. Data collection could mean a telephone
survey, a mail-in comment card, or even some guy with a clipboard asking
passersby some questions. But let’s see if we can sort the different data collection
methods into a semblance of organized categories.
Primary and secondary methods of data collection are two approaches used to
gather information for research or analysis purposes. Let's explore each data
collection method in detail:
Primary data collection involves the collection of original data directly from the
source or through direct interaction with the respondents. This method allows
researchers to obtain firsthand information specifically tailored to their research
objectives. There are various techniques for primary data collection, including:
b. Interviews: Interviews involve direct interaction between the researcher and the
respondent. They can be conducted in person, over the phone, or through video
conferencing. Interviews can be structured (with predefined questions), semi-
structured (allowing flexibility), or unstructured (more conversational).
e. Focus Groups: Focus groups bring together a small group of individuals who
discuss specific topics in a moderated setting. This method helps in understanding
opinions, perceptions, and experiences shared by the participants.
Secondary data collection involves using existing data collected by someone else
for a purpose different from the original intent. Researchers analyze and interpret
this data to extract relevant information. Secondary data can be obtained from
various sources, including:
e. Past Research Studies: Previous research studies and their findings can serve as
valuable secondary data sources. Researchers can review and analyze the data to
gain insights or build upon existing knowledge.
Now that we’ve explained the various techniques, let’s narrow our focus even
further by looking at some specific tools. For example, we mentioned interviews as
a technique, but we can further break that down into different interview types (or
“tools”).
Word Association
The researcher gives the respondent a set of words and asks them what comes to
mind when they hear each word.
Sentence Completion
Role-Playing
Respondents are presented with an imaginary situation and asked how they would
act or react if it was real.
In-Person Surveys
Online/Web Surveys
These surveys are easy to accomplish, but some users may be unwilling to answer
truthfully, if at all.
Mobile Surveys
Phone Surveys
No researcher can call thousands of people at once, so they need a third party to
handle the chore. However, many people have call screening and won’t answer.
Observation
Sometimes, the simplest method is the best. Researchers who make direct
observations collect data quickly and easily, with little intrusion or third-party bias.
Naturally, it’s only effective in small-scale situations.
The effects of data collection done incorrectly include the following -
Let us now look at the various issues that we might face while maintaining the
integrity of data collection.
Quality assurance and quality control are two strategies that help protect data
integrity and guarantee the scientific validity of study results.
Quality control - tasks that are performed during and after data collection
Quality assurance - events that happen before data gathering starts
Let us explore each of them in more detail now.
Quality Assurance
As quality assurance takes place before data collection, its primary goal is
"prevention" (i.e., forestalling problems with data collection). Prevention is the
best way to protect the accuracy of data collection. The best example of this
proactive step is the uniformity of protocol created in a thorough and exhaustive
procedures manual for data collection.
The likelihood of failing to spot issues and mistakes early in the research attempt
increases when guides are written poorly. There are several ways to show these
shortcomings:
Failure to determine the precise subjects and methods for retraining or training
staff employees in data collecting
An incomplete list of the items to be collected
There isn't a system in place to track modifications to processes that may occur
as the investigation continues.
Instead of detailed, step-by-step instructions on how to deliver tests, there is a
vague description of the data gathering tools that will be employed.
Uncertainty regarding the date, procedure, and identity of the person or people in
charge of examining the data
Incomprehensible guidelines for using, adjusting, and calibrating the data
collection equipment.
Now, let us look at how to ensure Quality Control.
Quality Control
Despite the fact that quality control actions (detection/monitoring and intervention)
take place both after and during data collection, the specifics should be
meticulously detailed in the procedures manual. Establishing monitoring systems
requires a specific communication structure, which is a prerequisite. Following the
discovery of data collection problems, there should be no ambiguity regarding the
information flow between the primary investigators and staff personnel. A poorly
designed communication system promotes slack oversight and reduces
opportunities for error detection.
Detection or monitoring can take the form of direct staff observation during site
visits, conference calls, or frequent and routine assessments of data reports to spot
discrepancies, excessive numbers, or invalid codes. Site visits might not be
appropriate for all disciplines. Still, without routine auditing of records, whether
qualitative or quantitative, it will be challenging for investigators to confirm that
data gathering is taking place in accordance with the manual's defined methods.
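As an illustration of such routine assessments, the sketch below flags out-of-range values and invalid codes in a hypothetical data report. The acceptable age range and the list of valid site codes are assumptions made for the example.

```python
# Quality-control sketch: scan a data report for out-of-range values
# and invalid codes so collection problems are detected early.
import pandas as pd

report = pd.DataFrame({
    "subject_id": [101, 102, 103, 104],
    "age": [34, 212, 45, 29],             # 212 is clearly out of range
    "site_code": ["A", "B", "Z", "A"],    # "Z" is not a valid site code here
})

problems = report[(~report["age"].between(0, 120)) |
                  (~report["site_code"].isin(["A", "B", "C"]))]
print(problems)
```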
Additionally, quality control determines the appropriate solutions, or "actions," to
fix flawed data gathering procedures and reduce recurrences.
Problems with data collection, for instance, that call for immediate action include:
Fraud or misbehavior
Systematic mistakes, procedure violations
Individual data items with errors
Issues with certain staff members or a site's performance
In the social and behavioral sciences, where primary data collection entails using
human subjects, researchers are trained to include one or more secondary measures
that can be used to verify the quality of the information being obtained from the
human subject.
There are some prevalent challenges faced while collecting data, let us explore a
few of them to understand them better and avoid them.
The main threat to the broad and successful application of machine learning is poor
data quality. Data quality must be your top priority if you want to make
technologies like machine learning work for you. Let's talk about some of the most
prevalent data quality problems and how to fix them.
2. Inconsistent Data
When working with various data sources, it's conceivable that the same
information will have discrepancies between sources. The differences could be in
formats, units, or occasionally spellings. The introduction of inconsistent data
might also occur during firm mergers or relocations. Inconsistencies in data have a
tendency to accumulate and reduce the value of data if they are not continually
resolved. Organizations that have heavily focused on data consistency do so
because they only want reliable data to support their analytics.
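The sketch below shows one way such inconsistencies in spellings, units, and date formats might be resolved with pandas. The records, the spelling map, the unit conversion, and the date formats are all assumptions for the example.

```python
# Data-consistency sketch: standardize spellings, units, and date formats.
from datetime import datetime
import pandas as pd

records = pd.DataFrame({
    "country": ["USA", "U.S.A.", "United States", "UK"],
    "weight":  [2.0, 2000.0, 1.5, 3.0],   # mixed kilograms and grams
    "unit":    ["kg", "g", "kg", "kg"],
    "date":    ["2023-01-05", "06/01/2023", "2023-01-07", "2023-01-08"],
})

# Standardize spellings with an explicit mapping.
records["country"] = records["country"].replace(
    {"U.S.A.": "USA", "United States": "USA"})

# Convert grams to kilograms so all weights share one unit.
grams = records["unit"] == "g"
records.loc[grams, "weight"] = records.loc[grams, "weight"] / 1000
records["unit"] = "kg"

# Parse the two date formats into a single consistent representation.
def parse_date(value):
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value}")

records["date"] = records["date"].map(parse_date)
print(records)
```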
3. Data Downtime
Data is the driving force behind the decisions and operations of data-driven
businesses. However, there may be brief periods when their data is unreliable or
not prepared. Customer complaints and subpar analytical outcomes are only two
ways that this data unavailability can have a significant impact on businesses. A
data engineer spends about 80% of their time updating, maintaining, and
guaranteeing the integrity of the data pipeline. In order to ask the next business
question, there is a high marginal cost due to the lengthy operational lead time
from data capture to insight.
Schema modifications and migration problems are just two examples of the causes
of data downtime. Data pipelines can be difficult due to their size and complexity.
Data downtime must be continuously monitored, and it must be reduced through
automation.
4. Ambiguous Data
Even with thorough oversight, some errors can still occur in massive databases or
data lakes. For data streaming at a fast speed, the issue becomes more
overwhelming. Spelling mistakes can go unnoticed, formatting difficulties can
occur, and column heads might be deceptive. This unclear data might cause a
number of problems for reporting and analytics.
In the Data Collection Process, there are 5 key steps. They are explained briefly
below -
1. Decide What Data You Want to Gather
The first thing that we need to do is decide what information we want to gather.
We must choose the subjects the data will cover, the sources we will use to gather
it, and the quantity of information that we would require. For instance, we may
choose to gather information on the categories of products that an average e-
commerce website visitor between the ages of 30 and 45 most frequently searches
for.
2. Establish a Deadline for Data Collection
The process of creating a strategy for data collection can now begin. We should set
a deadline for our data collection at the outset of our planning phase. Some forms
of data we might want to continuously collect. We might want to build up a
technique for tracking transactional data and website visitor statistics over the long
term, for instance. However, we will track the data throughout a certain time frame
if we are tracking it for a particular campaign. In these situations, we will have a
schedule for when we will begin and finish gathering data.
3. Select a Data Collection Method
We will select the data collection technique that will serve as the foundation of our
data gathering plan at this stage. We must take into account the type of information
that we wish to gather, the time period during which we will receive it, and the
other factors we decide on to choose the best gathering strategy.
4. Gather Information
Once our plan is complete, we can put our data collection plan into action and
begin gathering data. In our DMP, we can store and arrange our data. We need to
be careful to follow our plan and keep an eye on how it's doing. Especially if we
are collecting data regularly, setting up a timetable for when we will be checking in
on how our data gathering is going may be helpful. As circumstances alter and we
learn new details, we might need to amend our plan.
5. Analyze the Data and Implement Your Findings
It's time to examine our data and arrange our findings after we have gathered all of
our information. The analysis stage is essential because it transforms unprocessed
data into insightful knowledge that can be applied to better our marketing plans,
goods, and business judgments. The analytics tools included in our DMP can be
used to assist with this phase. We can put the discoveries to use to enhance our
business once we have discovered the patterns and insights in our data.
One of the primary purposes of data preparation is to ensure that raw data being
readied for processing and analysis is accurate and consistent so the results of BI
and analytics applications will be valid. Data is commonly created with missing
values, inaccuracies or other errors, and separate data sets often have different
formats that need to be reconciled when they're combined. Correcting data errors,
validating data quality and consolidating data sets are big parts of data preparation
projects.
Data preparation also involves finding relevant data to ensure that analytics
applications deliver meaningful information and actionable insights for business
decision-making. The data often is enriched and optimized to make it more
informative and useful -- for example, by blending internal and external data sets,
creating new data fields, eliminating outlier values and addressing imbalanced data
sets that could skew analytics results.
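As a rough sketch of these data preparation tasks, the following example corrects errors, fills a missing value, removes a duplicate, and blends an internal data set with an external one using pandas. The tables and column names are hypothetical.

```python
# Data-preparation sketch: correct, consolidate, and enrich raw data.
import numpy as np
import pandas as pd

internal = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "spend": [250.0, 180.0, 180.0, -40.0],   # duplicate row and an invalid value
})
external = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "wholesale", "retail"],
})

# Consolidate: drop exact duplicate rows.
prepared = internal.drop_duplicates().copy()

# Correct errors: treat negative spend as a data-entry mistake,
# mark it missing, and impute it with the median.
prepared.loc[prepared["spend"] < 0, "spend"] = np.nan
prepared["spend"] = prepared["spend"].fillna(prepared["spend"].median())

# Enrich: blend the internal data with an external data set.
prepared = prepared.merge(external, on="customer_id", how="left")
print(prepared)
```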
Data scientists often complain that they spend most of their time gathering,
cleansing and structuring data instead of analyzing it. A big benefit of an effective
data preparation process is that they and other end users can focus more on data
mining and data analysis -- the parts of their job that generate business value. For
example, data preparation can be done more quickly, and prepared data can
automatically be fed to users for recurring analytics applications.
Data preparation is done in a series of steps. There's some variation in the data
preparation steps listed by different data professionals and software vendors, but
the process typically involves tasks such as collecting, cleansing, structuring,
enriching and validating the data.
Data preparation can also incorporate or feed into data curation work that creates
and oversees ready-to-use data sets for BI and analytics. Data curation involves
tasks such as indexing, cataloging and maintaining data sets and their associated
metadata to help users find and access the data. In some organizations, data curator
is a formal role that works collaboratively with data scientists, business analysts,
other users and the IT and data management teams. In others, data may be curated
by data stewards, data engineers, database administrators or data scientists and
business users themselves.
What are the challenges of data preparation?
Data preparation is inherently complicated. Data sets pulled together from different
source systems are highly likely to have numerous data quality, accuracy and
consistency issues to resolve. The data also must be manipulated to make it usable,
and irrelevant data needs to be weeded out. As noted above, it's a time-consuming
process: The 80/20 rule is often applied to analytics applications, with about 80%
of the work said to be devoted to collecting and preparing data and only 20% to
analyzing it.
Data preparation can pull skilled BI, analytics and data management practitioners
away from more high-value work, especially as the volume of data used in
analytics applications continues to grow. However, various software vendors have
introduced self-service tools that automate data preparation methods, enabling both
data professionals and business users to get data ready for analysis in a streamlined
and interactive way.
The self-service data preparation tools run data sets through a workflow to apply
the operations and functions described above. They also feature graphical user
interfaces (GUIs) designed to further simplify those steps. As Donald Farmer,
principal at consultancy TreeHive Strategy, has written on self-service data
preparation, people outside of IT can use the self-service software "to do the work
of sourcing data, shaping it and cleaning it up, frequently from simple-to-use
desktop or cloud applications."
However, Gartner has noted that some tools lack the ability to scale from
individual self-service projects to enterprise-level ones or to exchange metadata
with other data management technologies, such as data quality software. Gartner recommended
that organizations evaluate products partly on those features. It also cautioned
against looking at data preparation software as a replacement for traditional data
integration technologies, particularly extract, transform and load (ETL) tools.
Several vendors that focused on self-service data preparation have now been
acquired by other companies; Trifacta, the last of the best-known data prep
specialists, agreed to be bought by analytics and data management software
provider Alteryx in early 2022. Alteryx itself already supports data preparation in
its software platform. Other prominent BI, analytics and data management vendors
that offer data preparation tools or capabilities include the following:
Altair
Boomi
Datameer
DataRobot
IBM
Informatica
Microsoft
Precisely
SAP
SAS
Tableau
Talend
Tamr
Tibco Software
In business analytics, data collection methods play a vital role in data gathering.
The techniques used in business analytics can be broadly classified into two main
types: qualitative and quantitative. Qualitative techniques are used to gather
descriptive data, while quantitative techniques are used to collect data that can be
analyzed statistically.
Primary data is the data that is collected directly from the source. It is collected
through surveys, interviews, focus groups, and observations.
Secondary data is the data that is already available and has been collected by
someone else. It can be gathered from sources like books, articles, websites, and
government reports.
The data collected through business analytics can be used to make decisions about
various aspects of the business, like marketing, product development, and human
resources. The data can also be used to improve the efficiency of business
processes.
1. Surveys
Collection prejudice:
Contextual bias:
2. Transactional Tracking
Keeping track of customers' transaction information might help you better
understand your consumer base and make judgments about focused marketing
campaigns.
3. Interviews and Focus Groups
You can employ both focus groups and interviews to collect qualitative and
quantitative data. Focus groups usually consist of multiple persons, whereas
interviews are normally conducted one-on-one. Real-time observation of
participants' interactions with your product and the recording of their emotions and
inquiries can yield insightful information.
Like surveys, focus groups and interviews are data collection techniques that allow
you to inquire about individuals' thoughts, motivations, and emotions surrounding
your brand or product. To reduce the risk of bias in how the questions are asked
and interpreted, you can employ a facilitator to plan and carry out interviews on
your behalf.
4. Observation
Due to the candour it provides, observing users engage with your product or
website can be helpful for data collection. You can see in real time whether your
customer experience is challenging or unclear.
However, organising observational sessions can take time and effort. You can
monitor a user’s involvement with a beta version of your website or product by
using a third-party service to capture users’ navigation across your site.
Observations give you the opportunity to examine how users engage with your
product or website directly. However, they are less accessible than other data
collection techniques. To enhance and build on areas of success, you can use the
qualitative and quantitative data collected from this.
5. Online Tracking
Using pixels and cookies, you can collect behavioural data. Both of these
programmes track users’ online activity across several websites and give
information about the material they are most interested in and interact with.
Additionally, you may monitor user activity on your company’s website, including
the most popular pages, whether or not visitors are perplexed while using it, and
how much time users spend on product pages. You can utilise this to enhance the
website’s look and facilitate users’ navigation to their desired location.
It’s frequently free and simple to set up to insert a pixel. The cost of implementing
cookies can be high, but the quality of the data you’ll get might make it
worthwhile. Once pixels and cookies are installed, they begin to collect data on
their own and require little to no upkeep.
It’s crucial to remember that tracking online activity may have ethical and legal
privacy concerns. Ensure you comply with regional and industry data privacy rules
before tracking users’ online activity.
6. Forms
Online forms are useful for collecting qualitative information about users,
particularly contact or demographic details. You can utilize them to gate content or
registration, such as for webinars and email newsletters, and they’re reasonably
cheap and easy to set up.
Afterwards, you may make use of this information to get in touch with potential
customers, develop demographic profiles of current clients, and carry out
remarketing activities like email workflows and content recommendations.
7. Social Media Monitoring
You can use social media data to ascertain which topics are most significant to
your following. For instance, you might observe a sharp rise in engagements when
your business posts about its environmental initiatives.
Here are 5 key reasons why hypothesis generation is so important in data science:
The million-dollar question – when in the world should you perform hypothesis
generation?
In light of the above, data modeling is the first critical step in defining the
structure of available data. Data modeling is the process of creating data models by
which data associations and constraints are described and eventually coded for
reuse. It conceptually represents data with diagrams, symbols, or text to visualize
the interrelations.
Data Modeling thus helps to increase consistency in naming, rules, semantics, and
security. This, in turn, improves data analytics. The emphasis is on the need for
availability and organization of data, independent of the manner of its application.
The best way to picture a data model is to think about a building plan of an
architect. An architectural building plan assists in putting up all subsequent
conceptual models, and so does a data model.
The following data modeling examples will clarify how data models and the
process of data modeling highlight essential data and the way to arrange it.
1. ER (Entity-Relationship) Model
This model is based on the notion of real-world entities and relationships among
them. It creates an entity set, relationship set, general attributes, and constraints.
2. Hierarchical Model
This data model arranges the data in the form of a tree with one root, to which
other data is connected. The hierarchy begins with the root and extends like a tree.
This model effectively explains several real-time relationships with a single one-
to-many relationship between two different kinds of data.
For example, one supermarket can have different departments and many aisles.
Thus, the ‘root’ node supermarket will have two ‘child’ nodes of (1) Pantry, (2)
Packaged Food.
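A tiny sketch of the hierarchical model as nested Python dictionaries, mirroring the supermarket example above; the item names under each department are made up.

```python
# Hierarchical-model sketch: a tree with one root and one-to-many children.
supermarket = {
    "Supermarket": {                      # root node
        "Pantry": ["Rice", "Flour"],      # child node with its own children
        "Packaged Food": ["Chips", "Biscuits"],
    }
}

# Every child has exactly one parent; traversal starts at the root.
for department, items in supermarket["Supermarket"].items():
    print(department, "->", items)
```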
3. Network Model
4. Relational Model
This popular data model example arranges the data into tables. The tables have
columns and rows, each cataloging an attribute present in the entity. It makes
relationships between data points easy to identify.
For example, e-commerce websites can process purchases and track inventory
using the relational model.
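A minimal sketch of the relational model using Python's built-in sqlite3 module: a products table and a purchases table with rows and columns, joined to track sales against inventory. The schema and values are invented for illustration.

```python
# Relational-model sketch: tables, columns, rows, and a join between them.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, stock INTEGER)")
cur.execute("CREATE TABLE purchases (id INTEGER PRIMARY KEY, "
            "product_id INTEGER REFERENCES products(id), qty INTEGER)")

cur.executemany("INSERT INTO products VALUES (?, ?, ?)",
                [(1, "Keyboard", 40), (2, "Monitor", 15)])
cur.executemany("INSERT INTO purchases VALUES (?, ?, ?)",
                [(1, 1, 2), (2, 2, 1), (3, 1, 1)])

# The relationship between the two tables is easy to identify via a join.
cur.execute("""
    SELECT p.name, SUM(pu.qty) AS units_sold, p.stock
    FROM purchases pu JOIN products p ON p.id = pu.product_id
    GROUP BY p.name, p.stock
""")
print(cur.fetchall())
conn.close()
```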
6. Object-Relational Model
The data modeling process helps organizations to become more data-driven. This
starts with cleaning and modeling data. Let us look at how data modeling occurs at
different levels.
Model Validation
1. Conceptual Design
The foundation of any model validation is its conceptual design, which needs
documented coverage assessment that supports the model’s ability to meet business
and regulatory needs and the unique risks facing a bank.
The design and capabilities of a model can have a profound effect on the overall
effectiveness of a bank’s ability to identify and respond to risks. For example, a
poorly designed risk assessment model may result in a bank establishing
relationships with clients that present a risk that is greater than its risk appetite, thus
exposing the bank to regulatory scrutiny and reputation damage.
2. System Validation
4. Process Validation
If done effectively, model validation will enable your bank to have every
confidence in its various models’ accuracy, as well as aligning them with the bank’s
business and regulatory expectations. By failing to validate models, banks increase
the risk of regulatory criticism, fines, and penalties.
Hold-Out: In this method, the dataset, which is usually large, is randomly divided
into three subsets:
1. Training set is a subset of the dataset used to build predictive models.
2. Validation set is a subset of the dataset used to assess the performance of model
built in the training phase. It provides a test platform for fine-tuning a model's
parameters and selecting the best-performing model. Not all modelling
algorithms need a validation set.
3. Test set, or unseen examples, is a subset of the dataset used to assess the likely
future performance of a model. If a model fits the training set much better than
it fits the test set, overfitting is probably the cause.
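A minimal hold-out sketch in Python with scikit-learn, assuming synthetic data: the dataset is split into training, validation, and test subsets, and training error is compared with validation and test error as a rough check for overfitting. The split proportions and the data itself are arbitrary choices for the example.

```python
# Hold-out sketch: random train / validation / test split plus an error check.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.5, -2.0, 0.7]) + rng.normal(scale=0.5, size=500)

# Carve out the test set first, then split the rest into train and validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)

# A training error much lower than the test error would suggest overfitting.
print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("validation MSE:", mean_squared_error(y_val, model.predict(X_val)))
print("test MSE:", mean_squared_error(y_test, model.predict(X_test)))
```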
Classification Evaluation
Regression Evaluation
The importance of data interpretation is evident and this is why it needs to be done
properly. Data is very likely to arrive from multiple sources and has a tendency to
enter the analysis process with haphazard ordering. Data analysis tends to be
extremely subjective. That is to say, the nature and goal of interpretation will vary
from business to business, likely correlating to the type of data being analyzed.
While there are several types of processes that are implemented based on
individual data nature, the two broadest and most common categories are
“quantitative and qualitative analysis”.
Yet, before any serious data interpretation inquiry can begin, it should be
understood that visual presentations of data findings are irrelevant unless a sound
decision is made regarding scales of measurement. Before any serious data
analysis can begin, the scale of measurement must be decided for the data as this
will have a long-term impact on data interpretation ROI. The varying scales
include nominal, ordinal, interval, and ratio measurements.
Once scales of measurement have been selected, it is time to
select which of the two broad interpretation processes will best suit your data
needs. Let’s take a closer look at those specific methods and possible data
interpretation problems.
When interpreting data, an analyst must try to discern the differences between
correlation, causation, and coincidence, as well as many other biases – but they
also have to consider all the factors involved that may have led to a result. There
are various data interpretation methods one can use to achieve this.
The interpretation of data is designed to help people make sense of numerical data
that has been collected, analyzed, and presented. Having a baseline method for
interpreting data will provide your analyst teams with a structure and consistent
foundation. Indeed, if several departments have different approaches to interpreting
the same data while sharing the same goals, some mismatched objectives can
result. Disparate methods will lead to duplicated efforts, inconsistent solutions,
wasted energy, and inevitably – time and money. In this part, we will look at the
two main methods of interpretation of data: qualitative and quantitative analysis.
Qualitative data analysis can be summed up in one word – categorical. With this
type of analysis, data is not described through numerical values or patterns, but
through the use of descriptive context (i.e., text). Typically, narrative data is
gathered by employing a wide variety of person-to-person techniques. These
techniques include:
After qualitative data has been collected through transcripts, questionnaires, audio
and video recordings, or the researcher’s notes, it is time to interpret it. For that
purpose, there are some common methods used by researchers and analysts.
Content analysis: As its name suggests, this is a research method used to identify
frequencies and recurring words, subjects and concepts in image, video, or audio
content. It transforms qualitative information into quantitative data to help in the
discovery of trends and conclusions that will later support important research or
business decisions. This method is often used by marketers to understand brand
sentiment from the mouths of customers themselves. Through that, they can extract
valuable information to improve their products and services. It is recommended to
use content analytics tools for this method as manually performing it is very time-
consuming and can lead to human error or subjectivity issues. Having a clear goal
in mind before diving into it is another great practice for avoiding getting lost in
the fog.
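As a simple illustration of content analysis, the sketch below counts recurring words across a handful of invented customer reviews. Dedicated content analytics tools go much further, but the basic step of turning qualitative text into quantitative frequencies looks like this.

```python
# Content-analysis sketch: turn review text into word frequencies.
import re
from collections import Counter

reviews = [
    "Fresh food and friendly staff, will come back.",
    "Food arrived cold and the portions were small.",
    "Friendly staff but the food was cold.",
]

stopwords = {"and", "the", "was", "were", "will", "but", "come", "back", "arrived"}
words = []
for review in reviews:
    words += [w for w in re.findall(r"[a-z]+", review.lower())
              if w not in stopwords]

# The most frequent terms hint at recurring subjects worth investigating.
print(Counter(words).most_common(5))
```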
Thematic analysis: This method focuses on analyzing qualitative data such as
interview transcripts, survey questions, and others, to identify common patterns
and separate the data into different groups according to found similarities or
themes. For example, imagine you want to analyze what customers think about
your restaurant. For this purpose, you do a thematic analysis on 1000 reviews and
find common themes such as “fresh food”, “cold food”, “small portions”, “friendly
staff”, etc. With those recurring themes in hand, you can extract conclusions about
what could be improved or enhanced based on your customer’s experiences. Since
this technique is more exploratory, be open to changing your research questions or
goals as you go.
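A toy sketch of the thematic step, assuming a hand-made mapping of keywords to themes: each review is assigned to the themes whose keywords it contains, and the theme counts are tallied. Real thematic analysis is usually manual or tool-assisted; the themes, keywords, and reviews here are invented.

```python
# Thematic-analysis sketch: group reviews by simple keyword-based themes.
themes = {
    "fresh food": ["fresh"],
    "cold food": ["cold"],
    "small portions": ["small", "portion"],
    "friendly staff": ["friendly", "staff"],
}
reviews = [
    "fresh food and friendly staff",
    "the food was cold and the portions were small",
    "friendly staff but the food was cold",
]

counts = {theme: 0 for theme in themes}
for review in reviews:
    for theme, keywords in themes.items():
        if any(keyword in review for keyword in keywords):
            counts[theme] += 1
print(counts)
```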
Narrative analysis: A bit more specific and complicated than the two previous
methods, narrative analysis is used to analyze stories and discover the meaning
behind them. These stories can be extracted from testimonials, case studies, and
interviews as these formats give people more space to tell their experiences. Given
that collecting this kind of data is harder and more time-consuming, sample sizes
for narrative analysis are usually smaller, which makes it harder to reproduce its
findings. However, it still proves to be a valuable technique in cases such as
understanding customers' preferences and mindsets.
Discourse analysis: This method is used to draw the meaning of any type of visual,
written, or symbolic language in relation to a social, political, cultural, or historical
context. It is used to understand how context can affect the way language is carried
out and understood. For example, if you are doing research on power dynamics,
using discourse analysis to analyze a conversation between a janitor and a CEO
and draw conclusions about their responses based on the context and your research
questions is a great use case for this technique. That said, like all methods in this
section, discourse analytics is time-consuming as the data needs to be analyzed
until no new insights emerge.
Grounded theory analysis: The grounded theory approach aims at creating or
discovering a new theory by carefully testing and evaluating the data available.
Unlike all other qualitative approaches on this list, grounded theory analysis helps
in extracting conclusions and hypotheses from the data, instead of going into the
analysis with a defined hypothesis. This method is very popular amongst
researchers, analysts, and marketers as the results are completely data-backed,
providing a factual explanation of any scenario. It is often used when researching a
completely new topic, or one about which little is known, as it gives space to start
from the ground up.
Mean: a mean represents a numerical average for a set of responses. When dealing
with a data set (or multiple data sets), a mean will represent a central value of a
specific set of numbers. It is the sum of the values divided by the number of values
within the data set. Other terms that can be used to describe the concept are
arithmetic mean, average and mathematical expectation.
Standard deviation: this is another statistical term commonly appearing in
quantitative analysis. Standard deviation reveals the distribution of the responses
around the mean. It describes the degree of consistency within the responses;
together with the mean, it provides insight into data sets.
Frequency distribution: this is a measurement gauging the rate of a response's
appearance within a data set. When using a survey, for example, frequency
distribution can determine the number of times a specific ordinal-scale response
appears (i.e., agree, strongly agree, disagree, etc.). Frequency distribution is
extremely useful in determining the degree of consensus among data points.
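The short sketch below computes a mean, a standard deviation, and a frequency distribution for a set of invented survey responses, using only Python's standard library.

```python
# Quantitative-analysis sketch: mean, standard deviation, frequency distribution.
import statistics
from collections import Counter

ratings = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]            # numeric survey responses
answers = ["agree", "strongly agree", "agree",      # ordinal-scale responses
           "disagree", "agree", "strongly agree"]

print("mean:", statistics.mean(ratings))
print("standard deviation:", statistics.stdev(ratings))

# Frequency distribution: how often each ordinal response appears.
print("frequency distribution:", Counter(answers))
```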