
Unit-2 Research Methodology & IPR

Measurements, Measurement Scales, Questionnaires and Instruments, Sampling and Methods. Data: Preparing, Exploring, Examining and Displaying.
Measurement:
Measurement is the process of observing and recording the observations that are
collected as part of a research effort. There are two major issues that will be
considered here.
First, to understand the fundamental ideas involved in measuring. Here we consider two major measurement concepts. In Levels of Measurement, we examine the meaning of the four major levels of measurement: nominal, ordinal, interval and ratio. Then we move on to the reliability of measurement, including consideration of true score theory and a variety of reliability estimators.
Second, to understand the different types of measures that you might use in social research. We consider four broad categories of measurements. Survey research includes the design and implementation of interviews and questionnaires. Scaling involves consideration of the major methods of developing and implementing a scale. Qualitative research provides an overview of the broad range of non-numerical measurement approaches. And unobtrusive measures present a variety of measurement methods that don't intrude on or interfere with the context of the research.

LEVELS OF MEASUREMENT
There are different levels of measurement. These levels differ as to how closely they
approach the structure of the number system we use. It is important to understand the
level of measurement of variables in research, because the level of measurement
determines the type of statistical analysis that can be conducted, and, therefore, the
type of conclusions that can be drawn from the research.
Nominal Level
A nominal level of measurement uses symbols to classify observations into categories
that must be both mutually exclusive and exhaustive. Exhaustive means that there
must be enough categories that all the observations will fall into some category.
Mutually exclusive means that the categories must be distinct enough that no observations will fall into more than one category. This is the most basic level of measurement; it is
essentially labeling. It can only establish whether two observations are alike or
different, for example, sorting a deck of cards into two piles: red cards and black cards.

In a survey of boaters, one variable of interest was place of residence. It was measured
by a question on a questionnaire asking for the zip code of the boater's principal place
of residence. The observations were divided into zip code categories. These categories
are mutually exclusive and exhaustive. All respondents live in one zip code category
(exhaustive) but no boater lives in more than one zip code category (mutually
exclusive). Similarly, the sex of the boater was determined by a question on the
questionnaire. Observations were sorted into two mutually exclusive and exhaustive
categories, male and female. Observations could be labeled with the letters M and F, or
the numerals 0 and 1.

The variable of marital status may be measured by two categories, married and
unmarried. But these must each be defined so that all possible observations will fit into
one category but no more than one: legally married, common-law marriage, religious
marriage, civil marriage, living together, never married, divorced, informally separated,
legally separated, widowed, abandoned, annulled, etc.
In nominal measurement, all observations in one category are alike on some property,
and they differ from the objects in the other category (or categories) on that property
(e.g., zip code, sex). There is no ordering of categories (no category is better or worse,
or more or less than another).
Ordinal Level
An ordinal level of measurement uses symbols to classify observations into categories
that are not only mutually exclusive and exhaustive; in addition, the categories have
some explicit relationship among them.
For example, observations may be classified into categories such as taller and shorter,
greater and lesser, faster and slower, harder and easier, and so forth. However, each
observation must still fall into one of the categories (the categories are exhaustive) but
no more than one (the categories are mutually exclusive). Meats are categorized as
regular, choice, or prime; the military uses ranks to distinguish categories of soldiers.

Most of the commonly used questions which ask about job satisfaction use the ordinal
level of measurement. For example, asking whether one is very satisfied, satisfied, neutral, dissatisfied, or very dissatisfied with one's job is using an ordinal scale of
measurement.

Interval Level
An interval level of measurement classifies observations into categories that are not
only mutually exclusive and exhaustive, and have some explicit relationship among
them, but the relationship between the categories is known and exact. This is the first
quantitative application of numbers.

In the interval level, a common and constant unit of measurement has been established
between the categories. For example, the commonly used measures of temperature are
interval level scales. We know that a temperature of 75 degrees is one degree warmer
than a temperature of 74 degrees, just as a temperature of 42 degrees is one degree
warmer than a temperature of 41 degrees.

Numbers may be assigned to the observations because the relationship between the
categories is assumed to be the same as the relationship between numbers in the
number system. For example, 74+1=75 and 41+1=42.
The intervals between categories are equal, but they originate from some arbitrary origin; that is, there is no meaningful zero point on an interval scale.
Ratio Level
The ratio level of measurement is the same as the interval level, with the addition of a
meaningful zero point. There is a meaningful and non-arbitrary zero point from which
the equal intervals between categories originate.
For example, weight, area, speed, and velocity are measured on a ratio level scale. In
public policy and administration, budgets and the number of program participants are
measured on ratio scales.
In many cases, interval and ratio scales are treated alike in terms of the statistical tests
that are applied.
Variables measured at a higher level can always be converted to a lower level, but not
vice versa. For example, observations of actual age (ratio scale) can be converted to
categories of older and younger (ordinal scale), but age measured as simply older or
younger cannot be converted to measures of actual age.
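To illustrate this one-way conversion, here is a minimal Python sketch using the pandas library; the ages and the cut-off value are hypothetical, chosen only for the example:

import pandas as pd

# Ratio-scale data: actual ages, measured from a meaningful zero point.
ages = pd.Series([23, 35, 47, 59, 71])

# Convert down to an ordinal scale: ordered "younger"/"older" categories.
# The cut-off of 45 is an arbitrary choice for this illustration.
age_groups = pd.cut(ages, bins=[0, 45, 120], labels=["younger", "older"])

print(age_groups)
# The ordinal categories keep the order but lose the magnitudes, so the
# original ages cannot be recovered from them.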

Questionnaires & Instruments:
A questionnaire is a research tool featuring a series of questions used to collect useful information from respondents. These instruments include written or oral questions and may follow an interview-style format. Questionnaires may be qualitative or quantitative and can be conducted online, by phone, on paper or face-to-face, and questions don't necessarily have to be administered with a researcher present.
Questionnaires feature either open or closed questions and sometimes employ a
mixture of both. Open-ended questions enable respondents to answer in their own
words in as much or as little detail as they desire. Closed questions provide
respondents with a series of predetermined responses they can choose from.

Is a Questionnaire Just Another Word for “Survey”?


While the two terms seem synonymous, they are not quite the same. A questionnaire is a set of questions created for the purpose of gathering information; that information may not be used for a survey. However, all surveys do require questionnaires. If you are using a questionnaire for survey sampling, it's important to ensure that it is designed to gather the most accurate answers from respondents.

Why Are Questionnaires Effective in Research?


Questionnaires are popular research methods because they offer a fast, efficient and inexpensive means of gathering large amounts of information from sizeable samples. These tools are particularly effective for measuring subject behavior,
preferences, intentions, attitudes and opinions. Their use of open and closed research
questions enables researchers to obtain both qualitative and quantitative data,
resulting in more comprehensive results.

Advantages of Questionnaires
Some of the many benefits of using questionnaires as a research tool include:
 Practicality: Questionnaires enable researchers to strategically manage their
target audience, questions and format while gathering large data quantities on
any subject.
 Cost-efficiency: You don’t need to hire surveyors to deliver your survey questions
— instead, you can place them on your website or email them to respondents at
little to no cost.
 Speed: You can gather survey results quickly and effortlessly using mobile tools,
obtaining responses and insights in 24 hours or less.
 Comparability: Researchers can use the same questionnaire yearly and
compare and contrast research results to gain valuable insights and minimize
translation errors.
 Scalability: Questionnaires are highly scalable, allowing researchers to
distribute them to demographics anywhere across the globe.
 Standardization: You can standardize your questionnaire with as many
questions as you want about any topic.
 Respondent comfort: When taking a questionnaire, respondents are
completely anonymous and not subject to stressful time constraints, helping
them feel relaxed and encouraging them to provide truthful responses.
 Easy analysis: Questionnaires often have built-in tools that automate analyses,
making it fast and easy to interpret your results.
Disadvantages of Questionnaires
Questionnaires also have their disadvantages, such as:
 Answer dishonesty: Respondents may not always be completely truthful with their answers — some may have hidden agendas, while others may answer in the way they think society would deem most acceptable.
 Question skipping: Make sure to require answers for all your survey questions.
Otherwise, you may run the risk of respondents leaving questions unanswered.
 Interpretation difficulties: If a question isn’t straightforward enough,
respondents may struggle to interpret it accurately. That’s why it’s important to
state questions clearly and concisely, with explanations when necessary.
 Survey fatigue: Respondents may experience survey fatigue if they receive too
many surveys or a questionnaire is too long.
 Analysis challenges: Though closed questions are easy to analyze, open
questions require a human to review and interpret them. Try limiting open-
ended questions in your survey to gain more quantifiable data you can evaluate
and utilize more quickly.
 Unconscientious responses: If respondents don’t read your questions
thoroughly or completely, they may offer inaccurate answers that can impact
data validity. You can minimize this risk by making questions as short and
simple as possible.
Types of Questionnaires in Research
There are various types of questionnaires in survey research, including:
 Postal: Postal questionnaires are paper surveys that participants receive
through the mail. Once respondents complete the survey, they mail them back
to the organization that sent them.
 In-house: In this type of questionnaire, researchers visit respondents in their
homes or workplaces and administer the survey in person.
 Telephone: With telephone surveys, researchers call respondents and conduct
the questionnaire over the phone.
 Electronic: Perhaps the most common type of questionnaire, electronic surveys
are presented via email or through a different online medium.

What are Research Instruments?


A research instrument is a tool used to collect, measure, and analyze data related to your research topic. Research instruments can be tests, surveys, scales, questionnaires, or even checklists.

Decide on the instrument to use based on the type of study you are conducting: quantitative, qualitative, or mixed-method. For instance, for a quantitative study, you may decide to use a questionnaire, and for a qualitative study, you may choose to use a scale.

While it helps to use an established instrument, as its efficacy is already established, you may, if needed, use a new instrument or even create your own.

What are the Different Types of Interview Research Instruments?


The general format of an interview is where the interviewer asks the interviewee to
answer a set of questions which are normally asked and answered verbally. There are
several different types of interview research instruments that may exist.
1. A structured interview may be used, in which a specific set of questions is formally asked of the interviewee and their responses recorded using a systematic and standard methodology.
2. An unstructured interview on the other hand may still be based on the same
general theme of questions but here the person asking the questions (the
interviewer) may change the order the questions are asked in and the specific
way in which they’re asked.
3. A focus interview is one in which the interviewer will adapt their line or content
of questioning based on the responses from the interviewee.
4. A focus group interview is one in which a group of volunteers or interviewees
are asked questions to understand their opinion or thoughts on a specific
subject.
5. A non-directive interview is one in which there are no specific questions agreed upon; instead, the format is open-ended and more reactive in the discussion between interviewer and interviewee.

What is sampling?
Sampling is a technique of selecting individual members or a subset of the population
to make statistical inferences from them and estimate characteristics of the whole
population. Different sampling methods are widely used by researchers in market
research so that they do not need to research the entire population to collect
actionable insights.

It is also a time-convenient and cost-effective method and hence forms the basis of any research design. Sampling techniques can be used in research survey software to derive optimum results.

For example, if a drug manufacturer would like to research the adverse side effects of a drug on the country's population, it is almost impossible to conduct a research study that involves everyone. In this case, the researcher selects a sample of people from each demographic and then researches them, giving him/her indicative feedback on the drug's behavior.

Types of sampling: sampling methods


Sampling in market research is of two types – probability sampling and non-probability
sampling. Let’s take a closer look at these two methods of sampling.
1. Probability sampling: Probability sampling is a sampling technique where a
researcher sets a selection of a few criteria and chooses members of a
population randomly. All the members have an equal opportunity to be a part
of the sample with this selection parameter.
2. Non-probability sampling: In non-probability sampling, the researcher chooses members for research arbitrarily rather than randomly. This sampling method is not a fixed or predefined selection process, which makes it difficult for all elements of a population to have equal opportunities to be included in a sample.

Below, we discuss the various probability and non-probability sampling methods that you can implement in any market research study.

Types of probability sampling with examples:


Probability sampling is a sampling technique in which researchers choose samples
from a larger population using a method based on the theory of probability. This
sampling method considers every member of the population and forms samples based
on a fixed process.

For example, in a population of 1000 members, every member will have a 1/1000
chance of being selected to be a part of a sample. Probability sampling eliminates bias
in the population and gives all members a fair chance to be included in the sample.
There are four types of probability sampling techniques:
 Simple random sampling: One of the best probability sampling techniques that helps in saving time and resources is the simple random sampling method. It is a reliable method of obtaining information where every single member of a population is chosen randomly, merely by chance. Each individual has the same probability of being chosen to be a part of a sample.

 For example, in an organization of 500 employees, if the HR team decides on conducting team-building activities, it is highly likely that they would prefer picking chits out of a bowl. In this case, each of the 500 employees has an equal opportunity of being selected.

 Cluster sampling: Cluster sampling is a method where the researchers divide the entire population into sections or clusters that represent the population. Clusters are identified and included in a sample based on demographic parameters like age, sex, location, etc. This makes it very simple for a survey creator to derive effective inferences from the feedback.

For example, if the United States government wishes to evaluate the number of
immigrants living in the Mainland US, they can divide it into clusters based on
states such as California, Texas, Florida, Massachusetts, Colorado, Hawaii, etc.
This way of conducting a survey will be more effective as the results will be
organized into states and provide insightful immigration data.

 Systematic sampling: Researchers use the systematic sampling method to choose the sample members of a population at regular intervals. It requires the selection of a starting point for the sample and a sampling interval that can be repeated across the population. Because this type of sampling method has a predefined range, it is the least time-consuming.

For example, a researcher intends to collect a systematic sample of 500 people in a population of 5000. He/she numbers each element of the population from 1 to 5000 and will choose every 10th individual to be a part of the sample (total population / sample size = 5000/500 = 10). A Python sketch of this and the other techniques follows this list.

 Stratified random sampling: Stratified random sampling is a method in which the researcher divides the population into smaller groups that don't overlap but together represent the entire population. While sampling, these groups can be organized, and a sample can then be drawn from each group separately.

For example, a researcher looking to analyze the characteristics of people belonging to different annual income divisions will create strata (groups) according to annual family income, e.g., less than $20,000; $21,000–$30,000; $31,000–$40,000; $41,000–$50,000; etc. By doing this, the researcher can conclude the characteristics of people belonging to different income groups. Marketers can analyze which income groups to target and which ones to eliminate to create a roadmap that would yield fruitful results.
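As noted above, here is a minimal Python sketch of three of these techniques (simple random, systematic, and stratified sampling), using the standard library plus pandas. The population size and income bands are hypothetical, taken from the examples in this list:

import random
import pandas as pd

random.seed(42)  # reproducible illustration

# A hypothetical population of 5000 numbered members.
population = list(range(1, 5001))

# Simple random sampling: every member has an equal chance of selection.
simple_sample = random.sample(population, k=500)

# Systematic sampling: interval = population size / sample size = 5000/500 = 10,
# so pick a random starting point and then take every 10th member.
interval = len(population) // 500
start = random.randrange(interval)
systematic_sample = population[start::interval]

# Stratified random sampling: divide the population into non-overlapping
# strata (hypothetical income bands) and draw a sample from each separately.
frame = pd.DataFrame({
    "member": population,
    "income_band": [random.choice(["<$20k", "$21-30k", "$31-40k", "$41-50k"])
                    for _ in population],
})
stratified_sample = frame.groupby("income_band").sample(n=25, random_state=42)

print(len(simple_sample), len(systematic_sample), len(stratified_sample))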

Uses of probability sampling
There are multiple uses of probability sampling:
 Reduce Sample Bias: Using the probability sampling method, the bias in the
sample derived from a population is negligible to non-existent. The selection
of the sample mainly depicts the understanding and the inference of the
researcher. Probability sampling leads to higher quality data collection as the
sample appropriately represents the population.

 Diverse Population: When the population is vast and diverse, it is essential to


have adequate representation so that the data is not skewed towards
one demographic. For example, if Square would like to understand the people that could use their point-of-sale devices, a survey conducted from a sample of people across the US from different industries and socio-economic backgrounds helps.
 Create an Accurate Sample: Probability sampling helps the researchers plan
and create an accurate sample. This helps to obtain well-defined data.
Types of non-probability sampling with examples
The non-probability method is a sampling method that involves a collection of feedback based on a researcher or statistician's sample selection capabilities and not on a fixed selection process. In most situations, the output of a survey conducted with a non-probability sample leads to skewed results, which may not represent the desired target population. But there are situations, such as the preliminary stages of research or cost constraints for conducting research, where non-probability sampling will be much more useful than the other type.

Four types of non-probability sampling explain the purpose of this sampling method in a better manner:
 Convenience sampling: This method is dependent on the ease of access to subjects, such as surveying customers at a mall or passers-by on a busy street. It is usually termed convenience sampling because of the researcher's ease of carrying it out and getting in touch with the subjects. Researchers have nearly no authority to select the sample elements, and selection is done purely based on proximity, not representativeness. This non-probability sampling method is used when there are time and cost limitations in collecting feedback. In situations where there are resource limitations, such as the initial stages of research, convenience sampling is used.

For example, startups and NGOs usually conduct convenience sampling at a mall to
distribute leaflets of upcoming events or promotion of a cause – they do that by
standing at the mall entrance and giving out pamphlets randomly.

 Judgmental or purposive sampling: Judgemental or purposive samples are


formed by the discretion of the researcher. Researchers purely consider the
purpose of the study, along with the understanding of the target audience. For
instance, when researchers want to understand the thought process of people
interested in studying for their master’s degree. The selection criteria will be:
“Are you interested in doing your masters in …?” and those who respond with
a “No” are excluded from the sample.

 Snowball sampling: Snowball sampling is a sampling method that researchers


apply when the subjects are difficult to trace. For example, it will be
extremely challenging to survey shelterless people or illegal immigrants. In
such cases, using the snowball theory, researchers can track a few categories
to interview and derive results. Researchers also implement this sampling
method in situations where the topic is highly sensitive and not openly
discussed—for example, surveys to gather information about HIV/AIDS. Not
many victims will readily respond to the questions. Still, researchers can
contact people they might know or volunteers associated with the cause to
get in touch with the victims and collect information.

 Quota sampling: In quota sampling, members are selected based on a pre-set standard. In this case, as a sample is formed based on specific attributes, the created sample will have the same qualities found in the total population. It is a rapid method of collecting samples.

Uses of non-probability sampling


Non-probability sampling is used for the following:
 Create a hypothesis: Researchers use the non-probability sampling method to create an assumption when little to no prior information is available. This method helps with the immediate return of data and builds a base for further research.

 Exploratory research: Researchers use this sampling technique widely when conducting qualitative research, pilot studies, or exploratory research.

 Budget and time constraints: The non-probability method is used when there are budget and time constraints, and some preliminary data must be collected. Since the survey design is not rigid, it is easier to pick respondents at random and have them take the survey or questionnaire.

How do you decide on the type of sampling to use?


For any research, it is essential to choose a sampling method accurately to meet the goals
of your study. The effectiveness of your sampling relies on various factors. Here are
some steps expert researchers follow to decide the best sampling method.
 Jot down the research goals. Generally, it must be a combination of cost,
precision, or accuracy.
 Identify the effective sampling techniques that might potentially achieve the
research goals.
 Test each of these methods and examine whether they help in achieving your goal.
 Select the method that works best for the research.

Difference between probability sampling and non-probability sampling methods


We have looked at the different types of sampling methods above and their subtypes. To
encapsulate the whole discussion, though, the significant differences between
probability sampling methods and non-probability sampling methods are as below:

Definition:
Probability sampling is a sampling technique in which samples from a larger population are chosen using a method based on the theory of probability. Non-probability sampling is a sampling technique in which the researcher selects samples based on the researcher's subjective judgment rather than random selection.

Alternatively known as:
Probability sampling is the random sampling method; non-probability sampling is the non-random sampling method.

Population selection:
In probability sampling, the population is selected randomly. In non-probability sampling, the population is selected arbitrarily.

Nature:
Probability sampling research is conclusive; non-probability sampling research is exploratory.

Sample:
In probability sampling, since there is a method for deciding the sample, the population demographics are conclusively represented. In non-probability sampling, since the sampling method is arbitrary, the population demographics representation is almost always skewed.

Time taken:
Probability sampling takes longer to conduct, since the research design defines the selection parameters before the market research study begins. Non-probability sampling is quick, since neither the sample nor its selection criteria are predefined.

Results:
Probability sampling is entirely unbiased, and hence the results are unbiased too and conclusive. Non-probability sampling is entirely biased, and hence the results are biased too, rendering the research speculative.

Hypothesis:
In probability sampling, there is an underlying hypothesis before the study begins, and the objective of this method is to prove the hypothesis. In non-probability sampling, the hypothesis is derived after conducting the research study.

Data Preparation Steps


The specifics of the data preparation process vary by industry, organization and
need, but the framework remains largely the same.
1. Gather data
The data preparation process begins with finding the right data. This can come
from an existing data catalog or can be added ad-hoc.

2. Discover and assess data
After collecting the data, it is important to discover each dataset. This step is about

getting to know the data and understanding what has to be done before the data
becomes useful in a particular context.
Discovery is a big task, but Talend’s data preparation platform offers visualization tools
which help users profile and browse their data.
3. Cleanse and validate data
Cleaning up the data is traditionally the most time-consuming part of the data preparation process, but it's crucial for removing faulty data and filling in gaps. Important tasks here include:
 Removing extraneous data and outliers.
 Filling in missing values.
 Conforming data to a standardized pattern.
 Masking private or sensitive data entries.
Once data has been cleansed, it must be validated by testing for errors in the data preparation process up to this point. Oftentimes, an error in the system will become apparent during this step and will need to be resolved before moving forward. (A short code sketch after step 5 illustrates typical cleansing and validation operations.)
4. Transform and enrich data
Transforming data is the process of updating the format or value entries in order to
reach a well-defined outcome, or to make the data more easily understood by a
wider
audience. Enriching data refers to adding and connecting data with other related
information to provide deeper insights.
5. Store data
Once prepared, the data can be stored or channeled into a third party application—
such as a business intelligence tool—clearing the way for processing and analysis to
take place.
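Here is the sketch promised under step 3: a minimal pandas illustration of cleansing and validating a dataset. The column names, the outlier rule, and the masking rule are hypothetical, not a prescribed implementation:

import pandas as pd

# Hypothetical raw data with a missing value, a faulty outlier, and a
# sensitive column that needs masking.
raw = pd.DataFrame({
    "age": [25, 32, None, 41, 230],  # None is a gap; 230 is faulty data
    "email": ["a@x.com", "b@x.com", "c@x.com", "d@x.com", "e@x.com"],
})

# Remove extraneous data and outliers (here: implausible ages).
clean = raw[raw["age"].isna() | raw["age"].between(0, 120)].copy()

# Fill in missing values with the column median.
clean["age"] = clean["age"].fillna(clean["age"].median())

# Mask private or sensitive data entries.
clean["email"] = clean["email"].str.replace(r"^[^@]+", "***", regex=True)

# Validate: test that the cleansing rules actually held before moving on.
assert clean["age"].notna().all()
assert clean["age"].between(0, 120).all()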

What is Data Exploration?
Data exploration refers to the initial step in data analysis, in which data analysts use data visualization and statistical techniques to describe dataset characteristics, such as size, quantity, and accuracy, in order to better understand the nature of the data.

Data exploration techniques include both manual analysis and automated data
exploration software solutions that visually explore and identify relationships between
different data variables, the structure of the dataset, the presence of outliers, and the
distribution of data values in order to reveal patterns and points of interest, enabling
data analysts to gain greater insight into the raw data.

Data is often gathered in large, unstructured volumes from various sources and data
analysts must first understand and develop a comprehensive view of the data before
extracting relevant data for further analysis, such as univariate, bivariate, multivariate,
and principal components analysis.
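For instance, a first exploratory pass over a dataset in Python with pandas might look like the following minimal sketch (the file name is hypothetical):

import pandas as pd

# Load a hypothetical raw dataset.
df = pd.read_csv("survey_responses.csv")

# Size and structure of the dataset.
print(df.shape)   # number of rows and columns
print(df.dtypes)  # type of each variable

# Quantity and accuracy: missing values and summary statistics.
print(df.isna().sum())  # missing values per column
print(df.describe())    # distributions of numeric values; hints at outliers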

Data Exploration Tools


Manual data exploration methods entail either writing scripts to analyze raw data or
manually filtering data into spreadsheets. Automated data exploration tools, such as
data visualization software, help data scientists easily monitor data sources and
perform big data exploration on otherwise overwhelmingly large datasets. Graphical
displays of data, such as bar charts and scatter plots, are valuable tools in visual data
exploration.

A popular tool for manual data exploration is Microsoft Excel spreadsheets, which can
be used to create basic charts for data exploration, to view raw data, and to identify the
correlation between variables. To identify the correlation between two continuous
variables in Excel, use the function CORREL() to return the correlation. To identify the
correlation between two categorical variables in Excel, the two-way table method, the
stacked column chart method, and the chi-square test are effective.
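The same two checks can be scripted outside Excel. Here is a minimal Python sketch with pandas and SciPy; the column names and values are hypothetical:

import pandas as pd
from scipy.stats import chi2_contingency

df = pd.DataFrame({
    "income": [30, 45, 52, 61, 78],                  # continuous
    "spend": [12, 20, 24, 30, 35],                   # continuous
    "sex": ["M", "F", "F", "M", "F"],                # categorical
    "owns_boat": ["yes", "no", "yes", "yes", "no"],  # categorical
})

# Correlation between two continuous variables (the CORREL() equivalent).
print(df["income"].corr(df["spend"]))

# Association between two categorical variables: a two-way table
# followed by a chi-square test.
table = pd.crosstab(df["sex"], df["owns_boat"])
chi2, p_value, dof, expected = chi2_contingency(table)
print(table)
print(p_value)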

There is a wide variety of proprietary automated data exploration solutions, including business intelligence tools, data visualization software, data preparation software vendors, and data exploration platforms. There are also open source data exploration tools that include regression capabilities and visualization features, which can help businesses integrate diverse data sources to enable faster data exploration. Most data analytics software includes data visualization features.
Why is Data Exploration Important?
Humans process visual data better than numerical data; therefore, it is extremely challenging for data scientists and data analysts to assign meaning to thousands of rows and columns of data points and communicate that meaning without any visual components.
Data visualization in data exploration leverages familiar visual cues such as shapes,
dimensions, colors, lines, points, and angles so that data analysts can effectively
visualize and define the metadata, and then perform data cleansing. Performing the
initial step of data exploration enables data analysts to better understand and visually
identify anomalies and relationships that might otherwise go undetected.
What is Data Preparation?
Data preparation is the process of cleaning and transforming raw data prior to processing and analysis. It is an important step prior to processing and often involves reformatting data, making corrections to data, and combining data sets to enrich the data.

Data preparation is often a lengthy undertaking for data professionals or business


users, but it is essential as a prerequisite to put data in context in order to turn it into
insights and eliminate bias resulting from poor data quality.

For example, the data preparation process usually includes standardizing data formats,
enriching source data, and/or removing outliers.
Benefits of Data Preparation + The Cloud
76% of data scientists say that data preparation is the worst part of their job, but efficient, accurate business decisions can only be made with clean data. Data preparation helps:
 Fix errors quickly — Data preparation helps catch errors before
processing. After data has been removed from its original source, these
errors become more difficult to understand and correct.
 Produce top-quality data — Cleaning and reformatting datasets ensures
that all data used in analysis will be high quality.

 Make better business decisions — Higher quality data that can be


processed and analyzed more quickly and efficiently leads to more timely,
efficient and high-quality business decisions.
 Additionally, as data and data processes move to the cloud, data preparation moves with it
for even greater benefits, such as:

 Superior scalability — Cloud data preparation can grow at the pace of the business. Enterprises don't have to worry about the underlying infrastructure or try to anticipate its evolution.

 Future proof — Cloud data preparation upgrades automatically so that


new capabilities or problem fixes can be turned on as soon as they are
released. This allows organizations to stay ahead of the innovation curve
without delays and added costs.

 Accelerated data usage and collaboration — Doing data prep in the cloud
means it is always on, doesn’t require any technical installation, and lets
teams collaborate on the work for faster results.

What Is Data Analysis?


Although many groups, organizations, and experts have different ways to approach
data analysis, most of them can be distilled into a one-size-fits-all definition. Data
analysis is the process of cleaning, changing, and processing raw data, and extracting
actionable, relevant information that helps businesses make informed decisions. The
procedure helps reduce the risks inherent in decision-making by providing useful
insights and statistics, often presented in charts, images, tables, and graphs.
It’s not uncommon to hear the term “big data” brought up in discussions about data
analysis. Data analysis plays a crucial role in processing big data into useful
information. Neophyte data analysts who want to dig deeper by revisiting big data
fundamentals should go back to the basic question, “What is data?”

Why is Data Analysis Important?


Here is a list of reasons why data analysis is such a crucial part of doing business today.
 Better Customer Targeting: You don't want to waste your business's precious time, resources, and money putting together advertising campaigns targeted at demographic groups that have little to no interest in the goods and services you offer. Data analysis helps you see where you should be focusing your advertising efforts.

 You Will Know Your Target Customers Better: Data analysis tracks how well
your products and campaigns are performing within your target
demographic. Through data analysis, your business can get a better idea of
your target audience's spending habits, disposable income, and most likely areas of interest. This data helps businesses set prices, determine the length
of ad campaigns, and even help project the quantity of goods needed.

 Reduce Operational Costs: Data analysis shows you which areas in your
business need more resources and money, and which areas are not
producing and thus should be scaled back or eliminated outright.

 Better Problem-Solving Methods: Informed decisions are more likely to be


successful decisions. Data provides businesses with information. You can see
where this progression is leading. Data analysis helps businesses make the
right choices and avoid costly pitfalls.

 You Get More Accurate Data: If you want to make informed decisions, you
need data, but there’s more to it. The data in question must be accurate. Data
analysis helps businesses acquire relevant, accurate information, suitable for
developing future marketing strategies, business plans, and realigning the
company’s vision or mission.

What Is the Data Analysis Process?


Answering the question “what is data analysis” is only the first step. Now we will look at
how it's performed. The data analysis process, or alternatively, the data analysis steps,
involves gathering all the information, processing it, exploring the data, and using it to
find patterns and other insights. The process consists of:

 Data Requirement Gathering: Ask yourself why you’re doing this analysis,
what type of data analysis you want to use, and what data you are planning
on analyzing.

 Data Collection: Guided by the requirements you’ve identified, it’s time to


collect the data from your sources. Sources include case studies, surveys,
interviews, questionnaires, direct observation, and focus groups. Make sure
to organize the collected data for analysis.

 Data Cleaning: Not all of the data you collect will be useful, so it’s time to clean
it up. This process is where you remove white spaces, duplicate records, and
basic errors. Data cleaning is mandatory before sending the information on
for analysis.

 Data Analysis: Here is where you use data analysis software and other tools
to help you interpret and understand the data and arrive at conclusions. Data
analysis tools include Excel, Python, R, Looker, Rapid Miner, Chartio,
Metabase, Redash, and Microsoft Power BI.

 Data Interpretation: Now that you have your results, you need to interpret
them and come up with the best courses of action, based on your findings.

 Data Visualization: Data visualization is a fancy way of saying, “graphically


show your information in a way that people can read and understand it.” You
can use charts, graphs, maps, bullet points, or a host of other methods.
Visualization helps you derive valuable insights by helping you compare
datasets and observe relationships.
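To make these steps concrete, here is a minimal Python sketch of the middle of the process (cleaning, analysis, visualization) using pandas and matplotlib. The file name and column names are hypothetical:

import pandas as pd
import matplotlib.pyplot as plt

# Data collection: load responses gathered from a hypothetical survey.
df = pd.read_csv("responses.csv")

# Data cleaning: strip white space and remove duplicate records.
df["region"] = df["region"].str.strip()
df = df.drop_duplicates()

# Data analysis: summarize a satisfaction score by region.
summary = df.groupby("region")["satisfaction"].mean()

# Data interpretation and visualization: a simple chart of the result.
summary.plot(kind="bar", title="Mean satisfaction by region")
plt.show()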

What Is the Importance of Data Analysis in Research?


A huge part of a researcher’s job is to sift through data. That is literally the definition of
“research.” However, today’s Information Age routinely produces a tidal wave of data,
enough to overwhelm even the most dedicated researcher.

Data analysis, therefore, plays a key role in distilling this information into a more accurate and relevant form, making it easier for researchers to do their job.

Data analysis also provides researchers with a vast selection of different tools, such as
descriptive statistics, inferential analysis, and quantitative analysis.
So, to sum it up, data analysis offers researchers better data and better ways to analyze
and study said data.

What is Data Analysis: Types of Data Analysis


There are a half-dozen popular types of data analysis available today, commonly employed in the worlds of technology and business. They are:

 Diagnostic Analysis: Diagnostic analysis answers the question, "Why did this happen?" Using insights gained from statistical analysis (more on that later!), analysts use diagnostic analysis to identify patterns in data. Ideally, the analysts find similar patterns that existed in the past and, consequently, can hopefully use those solutions to resolve the present challenges.

 Predictive Analysis: Predictive analysis answers the question, “What is most
likely to happen?” By using patterns found in older data as well as current
events, analysts predict future events. While there’s no such thing as 100
percent accurate forecasting, the odds improve if the analysts have plenty of
detailed information and the discipline to research it thoroughly.

 Prescriptive Analysis: Mix all the insights gained from the other data analysis
types, and you have prescriptive analysis. Sometimes, an issue can’t be solved
solely with one analysis type, and instead requires multiple insights.

 Statistical Analysis: Statistical analysis answers the question, “What


happened?” This analysis covers data collection, analysis, modeling,
interpretation, and presentation using dashboards. The statistical analysis
breaks down into two sub-categories:

1. Descriptive: Descriptive analysis works with either complete data or selections of summarized numerical data. It illustrates means and deviations in continuous data, and percentages and frequencies in categorical data.

2. Inferential: Inferential analysis works with samples derived from complete data. An analyst can arrive at different conclusions from the same comprehensive data set just by choosing different samplings (a short sketch after this list illustrates the contrast).

 Text Analysis: Also called “data mining,” text analysis uses databases and data
mining tools to discover patterns residing in large datasets. It transforms raw
data into useful business information. Text analysis is arguably the most
straightforward and the most direct method of data analysis.
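As promised above, here is a minimal sketch of the descriptive/inferential contrast, with hypothetical scores:

import pandas as pd

# Complete data, for descriptive analysis.
scores = pd.Series([67, 72, 75, 78, 81, 84, 90, 95])

# Descriptive: means and deviations computed on the complete data.
print(scores.mean(), scores.std())

# Inferential: work with samples derived from the complete data.
# Different samplings can lead to different conclusions.
sample_a = scores.sample(n=4, random_state=1)
sample_b = scores.sample(n=4, random_state=2)
print(sample_a.mean(), sample_b.mean())  # the two estimates differ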
 Displaying data in research is the last step of the research process. It is important to display data accurately because it helps in presenting the findings of the research effectively to the reader. The purpose of displaying data in research is to make the findings more visible and to make comparisons easy. When the researcher presents the research in front of the research committee, they will easily understand the findings of the research from the displayed data. The readers of the research will also be able to understand it better. Without displayed data, the data looks too scattered and the reader cannot make inferences.
 There are basically two ways to display data: tables and graphs. Tabulated data and graphical representations should both be used to give a more accurate picture of the research. In quantitative research it is very necessary to display data; in qualitative research, on the other hand, the researcher decides whether there is a need to display data or not. The researcher can use appropriate software to help tabulate and display the data in the form of graphs. Microsoft Excel is one such example; it is a user-friendly program that you can use to help display the data.

Tables for displaying data in research
 The use of tables to display data is very common in research. Tables are very effective in presenting a large amount of data. They organize data very well and make the data very visible. Badly tabulated data also occurs; in case you do not have knowledge of tables and tabulating data, consult a statistician to do this step effectively.
 Parts of a table
 To know the tables and to tabulate data in tables, you should know the parts or structure of a table. There are five parts of a table, namely:
 Title
 The title of the table speaks about the contents of the table. The title should be concise and precise, with no extra details. The title should be written in sentence case.
 Stub
 The left-most column of the table is called the stub. A stub has a stub-heading at the top of the column; not all tables have a stub. The stub shows the subcategories that are listed along the Y-axis.
 Caption
 The caption is the column heading; the variable might have subcategories, which are captioned. These subcategories are provided on the X-axis; the captions are provided at the top of each column.
 Body
 The body of the table is the actual part of the table in which the values, results, and analysis reside.
 Footnotes
 There can be many different types of notes that you may have to provide at the end of the table. The footnotes are provided just below the table, labeled as the source. A source is generally provided when the table has been taken from some other source. Footnotes are also provided to explain some point in the table. Sometimes only part of the table is taken from a source, and that should also be mentioned.
 Types of tables
 Tables are the most simple means to display data, they can be categorized into
the following;
 Univariate
 Bivariate

 Polyvariate
 These categories are based on the number of variables that need to be tabulated in the table. A univariate table has one variable to be tabulated; a bivariate table, as the name suggests, has two variables to be tabulated; and a polyvariate table has more than two variables to be tabulated.
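In practice, univariate and bivariate tables can be generated directly from the data. Here is a minimal pandas sketch with hypothetical variables:

import pandas as pd

df = pd.DataFrame({
    "sex": ["M", "F", "F", "M", "F", "M"],
    "satisfaction": ["high", "low", "high", "high", "low", "low"],
})

# Univariate table: one variable tabulated.
print(df["sex"].value_counts())

# Bivariate table: two variables cross-tabulated.
print(pd.crosstab(df["sex"], df["satisfaction"]))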
 Graphs to display data
 The purpose of displaying data is to make communication easier. Graphs should be used in displaying data when they can add to the visual appeal of the data. The researcher should decide whether a table alone is needed or whether the data should also be presented in the form of a suitable graph.
 Types of graphs
 You can use a suitable graph type depending on the type of data and the
variables involved in the data.
 The histogram
 The histogram is a graph that is widely used for displaying data. A histogram consists of rectangles that are drawn next to each other on the graph, with no space in between them. A histogram can be drawn for a single variable as well as for two or more variables. The height of the bars in the histogram represents the frequency of each value or interval. It can be drawn for both categorical and continuous variables.
 The bar chart
 The bar chart is similar to a histogram except that it is drawn only for categorical variables. Since it is used for categorical variables, it is drawn with space between the rectangles.
 The frequency polygon
 A frequency polygon is also very much like a histogram. A frequency polygon consists of frequency rectangles drawn next to each other, but the values taken to draw the rectangles are the midpoints of the intervals. The height of the rectangles describes the frequency of each interval. A line is drawn that touches the midpoints at the highest frequency level on the Y-axis and touches the X-axis at each extreme end.

 The cumulative frequency polygon


 The cumulative frequency polygon is also a frequency polygon, but it is drawn using the cumulative frequencies on the Y-axis. The values on the X-axis are taken from the endpoints of the intervals. The endpoints of the intervals are joined to each other, the reason being that the cumulative frequency is always based on the upper limit of an interval.

 The stem and leaf display
 The stem and leaf display is another easy way to display data. The stem and leaf display, if rotated 90 degrees, becomes a histogram.
 The pie chart
 The pie chart is a very different way to display data. The pie chart is a circle; as a circle has 360 degrees, the values are taken as percentages, and the whole pie or circle represents the whole population. The pie or circle is divided into slices or sections; each section represents the magnitude of a category or sub-category.

 The trend curve


 The trend curve is also called the line diagram. It is drawn by plotting the midpoints on the X-axis and the frequencies commensurate with each interval on the Y-axis. The trend curve is drawn only for a set of data that has been measured on a continuous, interval or ratio scale. A trend diagram or line diagram is most suitable for plotting values that show changes over a period of time.
 The area chart
 The area chart is a variation of the trend curve. In an area chart, the sub-categories of a variable can be displayed. The categories in the chart are displayed by shading them with different colors or patterns. For example, if there are both male and female categories in the dataset, both can be highlighted in this chart.
 The scattergram
 A scattergram is a very simple way to plot data on a chart. The scattergram is used for data where a change in one variable affects a change in the other variable. The paired values are plotted with the help of dots.
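Several of the graph types described above can be drawn with a few lines of Python using matplotlib. This is a minimal sketch with hypothetical data:

import matplotlib.pyplot as plt

ages = [22, 25, 27, 31, 34, 38, 41, 45, 52, 58]

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# Histogram: adjacent rectangles with no space between them.
axes[0].hist(ages, bins=4)
axes[0].set_title("Histogram")

# Bar chart: spaced rectangles for a categorical variable.
axes[1].bar(["M", "F"], [6, 4])
axes[1].set_title("Bar chart")

# Pie chart: slices proportional to each category's share of the whole.
axes[2].pie([60, 40], labels=["M", "F"], autopct="%1.0f%%")
axes[2].set_title("Pie chart")

plt.tight_layout()
plt.show()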
