Unit 2 Research Methods Ipr
LEVELS OF MEASUREMENT
There are different levels of measurement. These levels differ as to how closely they
approach the structure of the number system we use. It is important to understand the
level of measurement of variables in research, because the level of measurement
determines the type of statistical analysis that can be conducted, and, therefore, the
type of conclusions that can be drawn from the research.
Nominal Level
A nominal level of measurement uses symbols to classify observations into categories
that must be both mutually exclusive and exhaustive. Exhaustive means that there
must be enough categories that all the observations will fall into some category.
Mutually exclusive means that the categories must be distinct enough that no observations will fall
into more than one category. This is the most basic level of measurement; it is
essentially labeling. It can only establish whether two observations are alike or
different, for example, sorting a deck of cards into two piles: red cards and black cards.
In a survey of boaters, one variable of interest was place of residence. It was measured
by a question on a questionnaire asking for the zip code of the boater's principal place
of residence. The observations were divided into zip code categories. These categories
are mutually exclusive and exhaustive. All respondents live in one zip code category
(exhaustive) but no boater lives in more than one zip code category (mutually
exclusive). Similarly, the sex of the boater was determined by a question on the
questionnaire. Observations were sorted into two mutually exclusive and exhaustive
categories, male and female. Observations could be labeled with the letters M and F, or
the numerals 0 and 1.
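The two requirements above can be checked mechanically. A minimal sketch in Python, with illustrative category labels (not drawn from any real survey):

```python
# Check that a nominal coding scheme is exhaustive (every observation
# falls into some category) and mutually exclusive (no observation
# falls into more than one category).

def check_nominal_coding(observations, categories):
    """categories maps a label to a membership-test function."""
    for obs in observations:
        matches = [label for label, test in categories.items() if test(obs)]
        if len(matches) == 0:
            return f"not exhaustive: {obs!r} fits no category"
        if len(matches) > 1:
            return f"not mutually exclusive: {obs!r} fits {matches}"
    return "valid nominal coding"

# Illustrative coding of sex as M/F, as in the boater survey above.
categories = {
    "M": lambda s: s == "male",
    "F": lambda s: s == "female",
}
print(check_nominal_coding(["male", "female", "male"], categories))
# -> valid nominal coding
```

The same check applies to any nominal variable (zip codes, marital status): if an observation fits no category or more than one, the coding scheme needs to be redefined.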
The variable of marital status may be measured by two categories, married and
unmarried. But these must each be defined so that all possible observations will fit into
one category but no more than one: legally married, common-law marriage, religious
marriage, civil marriage, living together, never married, divorced, informally separated,
legally separated, widowed, abandoned, annulled, etc.
In nominal measurement, all observations in one category are alike on some property,
and they differ from the objects in the other category (or categories) on that property
(e.g., zip code, sex). There is no ordering of categories (no category is better or worse,
or more or less than another).
Ordinal Level
An ordinal level of measurement uses symbols to classify observations into categories
that are not only mutually exclusive and exhaustive; in addition, the categories have
some explicit relationship among them.
For example, observations may be classified into categories such as taller and shorter,
greater and lesser, faster and slower, harder and easier, and so forth. However, each
observation must still fall into one of the categories (the categories are exhaustive) but
no more than one (the categories are mutually exclusive). Meats are graded as select,
choice, or prime; the military uses ranks to distinguish categories of soldiers.
Most of the commonly used questions that ask about job satisfaction use the ordinal
level of measurement. For example, asking whether one is very satisfied, satisfied,
neutral, dissatisfied, or very dissatisfied with one's job is using an ordinal scale of
measurement.
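Ordinal categories can be assigned integer codes that preserve their order, which is what makes order-based statistics such as the median meaningful. A sketch using the five-point satisfaction scale above (the responses are illustrative):

```python
# Encode the five-point job-satisfaction scale as ordered integer codes.
SCALE = ["very dissatisfied", "dissatisfied", "neutral", "satisfied", "very satisfied"]
CODE = {label: i for i, label in enumerate(SCALE, start=1)}  # codes 1..5

responses = ["satisfied", "neutral", "very satisfied", "dissatisfied", "satisfied"]
codes = sorted(CODE[r] for r in responses)

# The median is meaningful for ordinal data; the mean generally is not,
# because the distances between adjacent categories are undefined.
median_code = codes[len(codes) // 2]
print(SCALE[median_code - 1])  # -> satisfied
```

Note that the codes only encode order: the gap between "neutral" and "satisfied" need not equal the gap between "satisfied" and "very satisfied".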
Interval Level
An interval level of measurement classifies observations into categories that are not
only mutually exclusive and exhaustive, and have some explicit relationship among
them, but the relationship between the categories is known and exact. This is the first
quantitative application of numbers.
In the interval level, a common and constant unit of measurement has been established
between the categories. For example, the commonly used measures of temperature are
interval level scales. We know that a temperature of 75 degrees is one degree warmer
than a temperature of 74 degrees, just as a temperature of 42 degrees is one degree
warmer than a temperature of 41 degrees.
Numbers may be assigned to the observations because the relationship between the
categories is assumed to be the same as the relationship between numbers in the
number system. For example, 74+1=75 and 41+1=42.
The intervals between categories are equal, but they originate from some arbitrary
origin; that is, there is no meaningful zero point on an interval scale.
Ratio Level
The ratio level of measurement is the same as the interval level, with the addition of a
meaningful zero point. There is a meaningful and non-arbitrary zero point from which
the equal intervals between categories originate.
For example, weight, area, speed, and velocity are measured on a ratio level scale. In
public policy and administration, budgets and the number of program participants are
measured on ratio scales.
In many cases, interval and ratio scales are treated alike in terms of the statistical tests
that are applied.
Variables measured at a higher level can always be converted to a lower level, but not
vice versa. For example, observations of actual age (ratio scale) can be converted to
categories of older and younger (ordinal scale), but age measured as simply older or
younger cannot be converted to measures of actual age.
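The downward conversion described above is easy to demonstrate: exact ages (ratio level) can be collapsed into ordered categories, but the categories cannot be expanded back into exact ages. A sketch, with an illustrative cutoff of 40:

```python
# Convert ratio-level ages to ordinal categories.
# The cutoff of 40 is an illustrative assumption.
ages = [23, 57, 41, 19, 66]  # ratio level: exact values with a true zero

def to_ordinal(age, cutoff=40):
    return "older" if age >= cutoff else "younger"

categories = [to_ordinal(a) for a in ages]
print(categories)  # -> ['younger', 'older', 'older', 'younger', 'older']

# The reverse mapping is impossible: the label "older" alone cannot
# recover whether the respondent was 41, 57, or 66.
```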
Questionnaires & Instruments:
A questionnaire is a research tool featuring a series of questions used to collect useful
information from respondents. These instruments pose either written or oral questions,
sometimes in an interview-style format. Questionnaires may be qualitative
or quantitative and can be conducted online, by phone, on paper or face-to-face, and
questions don’t necessarily have to be administered with a researcher present.
Questionnaires feature either open or closed questions and sometimes employ a
mixture of both. Open-ended questions enable respondents to answer in their own
words in as much or as little detail as they desire. Closed questions provide
respondents with a series of predetermined responses they can choose from.
A questionnaire may or may not be used for a survey. However, all surveys do require
questionnaires. If you are using a questionnaire for survey sampling, it’s important to
ensure that it is designed to gather the most accurate answers from respondents.
Advantages of Questionnaires
Some of the many benefits of using questionnaires as a research tool include:
Practicality: Questionnaires enable researchers to strategically manage their
target audience, questions and format while gathering large data quantities on
any subject.
Cost-efficiency: You don’t need to hire surveyors to deliver your survey questions
— instead, you can place them on your website or email them to respondents at
little to no cost.
Speed: You can gather survey results quickly and effortlessly using mobile tools,
obtaining responses and insights in 24 hours or less.
Comparability: Researchers can use the same questionnaire yearly and
compare and contrast research results to gain valuable insights and minimize
translation errors.
Scalability: Questionnaires are highly scalable, allowing researchers to
distribute them to demographics anywhere across the globe.
Standardization: You can standardize your questionnaire with as many
questions as you want about any topic.
Respondent comfort: When taking a questionnaire, respondents are
completely anonymous and not subject to stressful time constraints, helping
them feel relaxed and encouraging them to provide truthful responses.
Easy analysis: Questionnaires often have built-in tools that automate analyses,
making it fast and easy to interpret your results.
Disadvantages of Questionnaires
Questionnaires also have their disadvantages, such as:
Answer dishonesty: Respondents may not always be completely truthful with
their answers — some may have hidden agendas, while others may answer how
they think society would deem most acceptable.
Question skipping: Make sure to require answers for all your survey questions.
Otherwise, you may run the risk of respondents leaving questions unanswered.
Interpretation difficulties: If a question isn’t straightforward enough,
respondents may struggle to interpret it accurately. That’s why it’s important to
state questions clearly and concisely, with explanations when necessary.
Survey fatigue: Respondents may experience survey fatigue if they receive too
many surveys or a questionnaire is too long.
Analysis challenges: Though closed questions are easy to analyze, open
questions require a human to review and interpret them. Try limiting open-
ended questions in your survey to gain more quantifiable data you can evaluate
and utilize more quickly.
Unconscientious responses: If respondents don’t read your questions
thoroughly or completely, they may offer inaccurate answers that can impact
data validity. You can minimize this risk by making questions as short and
simple as possible.
Types of Questionnaires in Research
There are various types of questionnaires in survey research, including:
Postal: Postal questionnaires are paper surveys that participants receive
through the mail. Once respondents complete the survey, they mail them back
to the organization that sent them.
In-house: In this type of questionnaire, researchers visit respondents in their
homes or workplaces and administer the survey in person.
Telephone: With telephone surveys, researchers call respondents and conduct
the questionnaire over the phone.
Electronic: Perhaps the most common type of questionnaire, electronic surveys
are presented via email or through a different online medium.
A research instrument is a tool used to obtain, measure, and analyze data from subjects
around the research topic.
Decide which instrument to use based on the type of study you are conducting:
quantitative, qualitative, or mixed-method. For instance, for a quantitative study, you
may decide to use a questionnaire or a scale, and for a qualitative study, you may
choose to use an interview guide.
What is sampling?
Sampling is a technique of selecting individual members or a subset of the population
to make statistical inferences from them and estimate characteristics of the whole
population. Different sampling methods are widely used by researchers in market
research so that they do not need to research the entire population to collect
actionable insights.
It is also a time-convenient and a cost-effective method and hence forms the basis of
any research design. Sampling techniques can be implemented in research survey
software for optimal results.
For example, if a drug manufacturer would like to research the adverse side effects of
a drug on the country’s population, it is almost impossible to conduct a research study
that involves everyone. In this case, the researcher decides on a sample of people
from each demographic and then researches them, giving him/her indicative feedback
on the drug’s behavior.
This section discusses the various probability and non-probability sampling methods
that you can implement in any market research study.
Probability sampling is a technique in which every member of the population has a
known chance of being selected. For example, in a population of 1000 members, every
member will have a 1/1000 chance of being selected to be a part of a sample.
Probability sampling eliminates bias in the population and gives all members a fair
chance to be included in the sample.
There are four types of probability sampling techniques:
Simple random sampling: One of the best probability sampling techniques, and one
that helps in saving time and resources, is the simple random sampling method. It
is a reliable method of obtaining information where every single member of a
population is chosen randomly, merely by chance. Each individual has the
same probability of being chosen to be a part of the sample.
For example, in an organization of 500 employees, if the HR team decides on
conducting team building activities, it is highly likely that they would prefer
picking chits out of a bowl. In this case, each of the 500 employees has an
equal opportunity of being selected.
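The chit-drawing example above can be sketched with the standard library's random module. The employee IDs and sample size are illustrative:

```python
import random

# 500 employees, each with the same chance of being drawn,
# like picking chits out of a bowl.
employees = list(range(1, 501))

random.seed(42)  # fixed seed so the sketch is reproducible
sample = random.sample(employees, k=25)  # draw 25 without replacement

print(len(sample), len(set(sample)))  # 25 distinct employees
```

`random.sample` draws without replacement, so no employee can be selected twice, just as a chit leaves the bowl once drawn.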
Cluster sampling: In cluster sampling, the population is divided into groups, or
clusters, often on a geographic basis, and whole clusters are then sampled. For
example, if the United States government wishes to evaluate the number of
immigrants living in the Mainland US, they can divide it into clusters based on
states such as California, Texas, Florida, Massachusetts, Colorado, Hawaii, etc.
This way of conducting a survey will be more effective as the results will be
organized into states and provide insightful immigration data.
Stratified random sampling: In stratified sampling, the population is divided into
non-overlapping strata, such as income groups, and members are sampled from each
stratum. Marketers can analyze which income groups to target and which ones to
eliminate to create a roadmap that would bring fruitful results.
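Stratified sampling of this kind can be sketched by sampling separately within each stratum. The strata, group sizes, and 10% sampling fraction below are all illustrative assumptions:

```python
import random

# Illustrative strata: members grouped by income band.
strata = {
    "low":    [f"low_{i}" for i in range(100)],
    "middle": [f"mid_{i}" for i in range(300)],
    "high":   [f"high_{i}" for i in range(100)],
}

random.seed(0)
# Proportional allocation: sample 10% from each stratum, so every
# income band is represented in proportion to its size.
sample = []
for name, members in strata.items():
    sample.extend(random.sample(members, k=len(members) // 10))

print(len(sample))  # 10 + 30 + 10 = 50
```

Because each stratum is sampled separately, no income band can be missed by chance, which is the main advantage over simple random sampling here.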
Uses of probability sampling
There are multiple uses of probability sampling:
Reduce Sample Bias: Using the probability sampling method, the bias in the
sample derived from a population is negligible to non-existent. The selection
of the sample mainly depicts the understanding and the inference of the
researcher. Probability sampling leads to higher quality data collection as the
sample appropriately represents the population.
Four types of non-probability sampling illustrate the purpose of this sampling
method:
Convenience sampling: This method is dependent on the ease of access to subjects,
such as surveying customers at a mall or passers-by on a busy street. It is termed
convenience sampling because of the researcher’s ease of carrying it out
and getting in touch with the subjects. Researchers have nearly no authority to
select the sample elements, and it’s purely done based on proximity and not
representativeness. This non-probability sampling method is used when there are
time and cost limitations in collecting feedback, or where there are resource
limitations, such as the initial stages of research.
For example, startups and NGOs usually conduct convenience sampling at a mall to
distribute leaflets of upcoming events or promotion of a cause – they do that by
standing at the mall entrance and giving out pamphlets randomly.
Non-probability sampling is often used when conducting qualitative research, pilot
studies, or exploratory research.
Budget and time constraints: The non-probability method is used when there are
budget and time constraints, and some preliminary data must be collected.
Since the survey design is not rigid, it is easier to pick respondents at random
and have them take the survey or questionnaire.
Probability methods vs. non-probability methods:

Alternatively known as: Probability sampling is also called the random sampling
method; non-probability sampling is also called the non-random sampling method.

Population selection: In probability sampling, the population is selected randomly;
in non-probability sampling, the population is selected arbitrarily.

Time taken: Probability sampling takes longer to conduct, since the research design
defines the selection parameters before the market research study begins;
non-probability sampling is quick, since neither the sample nor the selection
criteria of the sample are defined in advance.

Hypothesis: In probability sampling, there is an underlying hypothesis before the
study begins, and the objective of this method is to prove the hypothesis; in
non-probability sampling, the hypothesis is derived after conducting the research
study.
1. Collect data
The data preparation process begins with gathering raw data from the relevant sources.
2. Discover and assess data
After collecting the data, it is important to discover each dataset. This step is about
getting to know the data and understanding what has to be done before the data
becomes useful in a particular context.
Discovery is a big task, but Talend’s data preparation platform offers visualization tools
which help users profile and browse their data.
3. Cleanse and validate data
Cleaning up the data is traditionally the most time consuming part of the data
preparation process, but it’s crucial for removing faulty data and filling in gaps.
Important tasks here include:
Removing extraneous data and outliers.
Filling in missing values.
Conforming data to a standardized pattern.
Masking private or sensitive data entries.
Once data has been cleansed, it must be validated by testing for errors in the data
preparation process up to this point. Oftentimes, an error in the system will become
apparent during this step and will need to be resolved before moving forward.
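The cleaning and validation tasks listed above can be sketched in a few lines of Python. The dataset, the median fill, and the MAD-based outlier rule are illustrative assumptions, not part of any particular platform:

```python
import statistics

# Illustrative raw records: some missing values, one obvious outlier.
raw = [12.0, 14.5, None, 13.2, 980.0, 12.8, None, 13.9]

observed = [x for x in raw if x is not None]

# 1. Fill missing values with the median (robust to the outlier).
median = statistics.median(observed)
filled = [x if x is not None else median for x in raw]

# 2. Remove outliers using the median absolute deviation (MAD),
#    a robust alternative to mean/standard-deviation rules.
mad = statistics.median(abs(x - median) for x in observed)
cleaned = [x for x in filled if abs(x - median) <= 5 * mad]

# 3. Validate: no missing values or extreme entries remain.
assert all(x is not None for x in cleaned)
print(cleaned)
```

A median-based rule is used here because a mean/standard-deviation rule computed on data that still contains the outlier can fail to flag it.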
4. Transform and enrich data
Transforming data is the process of updating the format or value entries in order to
reach a well-defined outcome, or to make the data more easily understood by a wider
audience. Enriching data refers to adding and connecting data with other related
information to provide deeper insights.
5. Store data
Once prepared, the data can be stored or channeled into a third party application—
such as a business intelligence tool—clearing the way for processing and analysis to
take place.
What is Data Exploration?
Data exploration definition: Data exploration refers to the initial step in data analysis in
which data analysts use data visualization and statistical techniques to describe dataset
characterizations, such as size, quantity, and accuracy, in order to better understand
the nature of the data.
Data exploration techniques include both manual analysis and automated data
exploration software solutions that visually explore and identify relationships between
different data variables, the structure of the dataset, the presence of outliers, and the
distribution of data values in order to reveal patterns and points of interest, enabling
data analysts to gain greater insight into the raw data.
Data is often gathered in large, unstructured volumes from various sources and data
analysts must first understand and develop a comprehensive view of the data before
extracting relevant data for further analysis, such as univariate, bivariate, multivariate,
and principal components analysis.
A popular tool for manual data exploration is Microsoft Excel spreadsheets, which can
be used to create basic charts for data exploration, to view raw data, and to identify the
correlation between variables. To identify the correlation between two continuous
variables in Excel, use the function CORREL() to return the correlation. To identify the
correlation between two categorical variables in Excel, the two-way table method, the
stacked column chart method, and the chi-square test are effective.
Why is Data Exploration Important?
Humans process visual data better than numerical data; therefore, it is extremely
challenging for data scientists and data analysts to assign meaning to thousands of
rows and columns of data points and communicate that meaning without any visual
components.
Data visualization in data exploration leverages familiar visual cues such as shapes,
dimensions, colors, lines, points, and angles so that data analysts can effectively
visualize and define the metadata, and then perform data cleansing. Performing the
initial step of data exploration enables data analysts to better understand and visually
identify anomalies and relationships that might otherwise go undetected.
What is Data Preparation?
Data preparation is the process of cleaning and transforming raw data prior to
processing and analysis. It is an important step prior to processing and often involves
reformatting data, making corrections to data and the combining of data sets to enrich
data.
For example, the data preparation process usually includes standardizing data formats,
enriching source data, and/or removing outliers.
Benefits of Data Preparation + The Cloud
76% of data scientists say that data preparation is the worst part of their job, but
efficient, accurate business decisions can only be made with clean data. Data
preparation helps:
Fix errors quickly — Data preparation helps catch errors before
processing. After data has been removed from its original source, these
errors become more difficult to understand and correct.
Produce top-quality data — Cleaning and reformatting datasets ensures
that all data used in analysis will be high quality.
Scale with the business — Cloud data preparation can grow at the pace of
the business. Enterprises don’t have to worry about the underlying
infrastructure or try to anticipate its evolution.
Accelerated data usage and collaboration — Doing data prep in the cloud
means it is always on, doesn’t require any technical installation, and lets
teams collaborate on the work for faster results.
Better Ad Targeting: Data analysis helps you see where you should be focusing
your advertising efforts.
You Will Know Your Target Customers Better: Data analysis tracks how well
your products and campaigns are performing within your target
demographic. Through data analysis, your business can get a better idea of
your target audience’s spending habits, disposable income, and most likely
areas of interest. This data helps businesses set prices, determine the length
of ad campaigns, and even project the quantity of goods needed.
Reduce Operational Costs: Data analysis shows you which areas in your
business need more resources and money, and which areas are not
producing and thus should be scaled back or eliminated outright.
You Get More Accurate Data: If you want to make informed decisions, you
need data, but there’s more to it. The data in question must be accurate. Data
analysis helps businesses acquire relevant, accurate information, suitable for
developing future marketing strategies, business plans, and realigning the
company’s vision or mission.
Data Requirement Gathering: Ask yourself why you’re doing this analysis,
what type of data analysis you want to use, and what data you are planning
on analyzing.
Data Cleaning: Not all of the data you collect will be useful, so it’s time to clean
it up. This process is where you remove white spaces, duplicate records, and
basic errors. Data cleaning is mandatory before sending the information on
for analysis.
Data Analysis: Here is where you use data analysis software and other tools
to help you interpret and understand the data and arrive at conclusions. Data
analysis tools include Excel, Python, R, Looker, Rapid Miner, Chartio,
Metabase, Redash, and Microsoft Power BI.
Data Interpretation: Now that you have your results, you need to interpret
them and come up with the best courses of action, based on your findings.
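The steps above can be strung together as a minimal end-to-end sketch. The dataset, the cleaning rules, and the business question are all illustrative:

```python
# A minimal sketch of the analysis steps described above:
# gather -> clean -> analyze -> interpret.

# 1. Data requirement gathering: we want average spend per region.
records = [
    {"region": "north", "spend": 120.0},
    {"region": "north", "spend": 120.0},  # duplicate record
    {"region": "south", "spend": 95.0},
    {"region": "south", "spend": None},   # basic error: missing value
]

# 2. Data cleaning: drop duplicates and records with missing values.
seen, cleaned = set(), []
for r in records:
    key = (r["region"], r["spend"])
    if r["spend"] is not None and key not in seen:
        seen.add(key)
        cleaned.append(r)

# 3. Data analysis: average spend per region.
totals = {}
for r in cleaned:
    region_total = totals.setdefault(r["region"], [0.0, 0])
    region_total[0] += r["spend"]
    region_total[1] += 1
averages = {region: total / n for region, (total, n) in totals.items()}

# 4. Data interpretation: pick the region with the highest average spend.
best = max(averages, key=averages.get)
print(averages, best)
```

In practice each step is done with the tools listed above (Excel, Python, R, BI platforms); the sketch only shows how the stages feed into one another.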
Data analysis, therefore, plays a key role in distilling this information into a more
accurate and relevant form, making it easier for researchers to do their job.
Data analysis also provides researchers with a vast selection of different tools, such as
descriptive statistics, inferential analysis, and quantitative analysis.
So, to sum it up, data analysis offers researchers better data and better ways to analyze
and study said data.
Diagnostic Analysis: Diagnostic analysis answers the question, “Why did this
happen?” Using insights gained from statistical analysis (more on that later!),
analysts use diagnostic analysis to identify patterns in data. Ideally, the
analysts find similar patterns that existed in the past and apply those
solutions to resolve the present challenges.
Predictive Analysis: Predictive analysis answers the question, “What is most
likely to happen?” By using patterns found in older data as well as current
events, analysts predict future events. While there’s no such thing as 100
percent accurate forecasting, the odds improve if the analysts have plenty of
detailed information and the discipline to research it thoroughly.
Prescriptive Analysis: Mix all the insights gained from the other data analysis
types, and you have prescriptive analysis. Sometimes, an issue can’t be solved
solely with one analysis type, and instead requires multiple insights.
Text Analysis: Also called “data mining,” text analysis uses databases and data
mining tools to discover patterns residing in large datasets. It transforms raw
data into useful business information. Text analysis is arguably the most
straightforward and the most direct method of data analysis.
Displaying data in research is the last step of the research process. It is important
to display data accurately because it helps in presenting the findings of the
research effectively to the reader. The purpose of displaying data in research is to
make the findings more visible and make comparisons easy. When the researcher
presents the research in front of the research committee, they will easily
understand the findings of the research from the displayed data. The readers of the
research will also be able to understand it better. Without displayed data, the data
looks too scattered and the reader cannot make inferences.
There are basically two ways to display data: tables and graphs. Tabulated data
and graphical representations should both be used to give a more accurate picture
of the research. In quantitative research it is very necessary to display data; in
qualitative research, on the other hand, the researcher decides whether there is a
need to display data or not. The researcher can use appropriate software to help
tabulate and display the data in the form of graphs. Microsoft Excel is one such
example; it is a user-friendly program that you can use to help display the data.
Tables for displaying data in research
The use of tables to display data is very common in research. Tables are very
effective in presenting a large amount of data. They organize data very well and
make the data very visible. Badly tabulated data also occurs; in case you do
not have knowledge of tables and tabulating data, consult a statistician to do this
step effectively.
Parts of a table
To know the tables and to tabulate data in tables you should know the parts or
structure of a table. There are five parts of a table, namely:
Title
The title of the table speaks about the contents of the table. The title has to be
concise and precise, with no extra details. The title should be written in
sentence case.
Stub
The left-most column of the table is called the stub. A stub has a stub heading
at the top of the column; not all tables have a stub. The stub shows the
subcategories that are listed along the Y-axis.
Caption
The caption is the column heading; the variable might have subcategories, which
are captioned. These subcategories are provided on the X-axis, and the captions are
provided at the top of each column.
Body
The body of the table is the actual part of the table in which the values, results,
and analysis reside.
Footnotes
There can be many different types of notes that you may have to provide at the
end of the table. The footnotes are provided just below the table. One common
footnote is the source, which is provided when the table has been taken from
some other source. Footnotes are also provided to explain some point in the
table. Sometimes only part of the table is taken from a source; that should
also be mentioned.
Types of tables
Tables are the simplest means to display data; they can be categorized into
the following:
Univariate
Bivariate
Polyvariate
These categories are based on the numbers of variables that need to be
tabulated in the table. A univariate table has one variable to be tabulated; a
bivariate table, as the name suggests, has two variables to be tabulated and a
polyvariate table has more than two variables to be tabulated.
Graphs to display data
The purpose of displaying data is to make communication easier. Graphs
should be used in displaying data when they can add to the visual appeal of the
data. The researcher should decide whether a table alone is needed or whether
the data should also be presented in the form of a suitable graph.
Types of graphs
You can use a suitable graph type depending on the type of data and the
variables involved in the data.
The histogram
The histogram is a graph that is widely used for displaying data. A histogram
consists of rectangles that are drawn next to each other on the graph, with no
space in between them. A histogram can be drawn for a single variable as well
as for two or more variables. The height of the bars in the histogram represents
the frequency of each interval. It is drawn for continuous variables.
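The frequency counts behind a histogram's bars can be computed directly by binning a continuous variable into equal-width intervals. The data and bin width below are illustrative:

```python
from collections import Counter

# Illustrative continuous observations (e.g. ages).
values = [21, 25, 34, 37, 41, 44, 45, 52, 58, 59]

BIN_WIDTH = 10

# Assign each value to the lower edge of its bin: 21 -> 20, 37 -> 30, ...
bins = Counter((v // BIN_WIDTH) * BIN_WIDTH for v in values)

# The bar height of each interval is its frequency.
for edge in sorted(bins):
    print(f"{edge}-{edge + BIN_WIDTH - 1}: {'#' * bins[edge]}")
```

Each printed row corresponds to one rectangle of the histogram; adjacent bins share an edge, which is why the rectangles are drawn with no space between them.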
The bar chart
The bar chart is similar to a histogram except that it is drawn only for
categorical variables. Since it is used for categorical variables, it is
drawn with space between the rectangles.
The frequency polygon
A frequency polygon is also very much like a histogram. Instead of rectangles,
however, a point is plotted at the midpoint of each interval, at a height on the
Y-axis representing the frequency of that interval. A line is drawn connecting
these midpoints, and it is brought down to touch the X-axis at each extreme end.
The stem and leaf display
The stem and leaf display is another easy way to display data. A stem and leaf
display, if rotated 90 degrees, becomes a histogram.
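A stem and leaf display is simple to build: split each value into a stem (the leading digits) and a leaf (the last digit). A sketch with illustrative two-digit data:

```python
from collections import defaultdict

# Illustrative two-digit observations.
values = [12, 15, 21, 24, 24, 31, 37, 38]

# Stem = tens digit, leaf = units digit.
stems = defaultdict(list)
for v in sorted(values):
    stems[v // 10].append(v % 10)

for stem in sorted(stems):
    print(f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}")
# Rotated 90 degrees, the rows of leaves form the bars of a histogram.
```

Unlike a histogram, the display preserves the original values: the row "2 | 1 4 4" records exactly 21, 24, and 24.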
The pie chart
The pie chart is a very different way to display data. The pie chart is a circle; as
a circle has 360 degrees, the magnitudes are taken as percentages, and the whole
pie or circle represents the whole population. The pie or circle is divided into
slices or sections; each section represents the magnitude of the category or the
subcategory.
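The slice angles follow directly from the 360-degree rule described above: each category's share of the total, applied to 360 degrees. The category counts below are illustrative:

```python
# Illustrative category counts.
counts = {"A": 50, "B": 30, "C": 20}
total = sum(counts.values())

# Each slice's angle is its fraction of the whole, applied to 360 degrees.
angles = {k: v * 360 / total for k, v in counts.items()}
print(angles)  # -> {'A': 180.0, 'B': 108.0, 'C': 72.0}
```

The angles always sum to 360, just as the category percentages sum to 100.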