The Disguised Market Research Club Presents Market Research Compendium
About Illumina
Every puzzle has a solution, but when the puzzle is about delving into the consumer psyche, it is a tough
nut to crack. Illumina, the disguised market research club of MDI, has been providing companies with a
unique opportunity to study consumer behavior for 24 years, in a format that aims to produce the
cleanest and most bias-free insights from market research. We are proud to claim to be the second-oldest
club of such a format in India.
In 2021, MDI celebrates the 25th edition of Illumina, where corporates and students of MDI will once
again collaborate to understand the psyche of Indian consumers. In the past, we have had the honor of
collaborating with brands like HUL, Colgate Palmolive, Casio, OYO Rooms, Godrej, ITC, Airtel, Flipkart,
Pepsi, Coca-Cola, Hewlett-Packard, Idea Cellular and Videocon, to name a few.
Market Research
Market research consists of systematically gathering data about people or companies – a market – and
then analyzing it to better understand what that group of people needs. The results of market research,
which are usually summarized in a report, are then used to help business owners make more informed
decisions about the company’s strategies, operations, and potential customer base.
Understanding the industry shifts, changing consumer needs and preferences, and legislative trends,
among other things, can shape where a business chooses to focus its efforts and resources. That’s the
value of market research.
Market research draws on two kinds of data:
1. Primary Research: Primary data is first-hand information you gather yourself, or with the help of a
market research firm. You control it.
2. Secondary Research: Secondary data is pre-existing public information, such as the data shared in
magazines and newspapers, government or industry reports. You can analyze the data in new ways,
but the information is available to a large number of people.
Using primary or secondary data, three types of research studies are usually conducted:
1. Exploratory Research: Exploratory market research gathers lots of open-ended data from many
people to better understand a problem or opportunity. The goal is to gather perceptions and opinions
regarding an issue, so your company can decide how to address it. But first you have to understand
how your market sees the issue. This is usually the first form of research in any field.
Example: Any form of open-ended conversations and interviews to understand how the market
perceives a product or a service would be of the exploratory type.
2. Descriptive Research: Descriptive research is defined as a research method that describes the
characteristics of the population or phenomenon that is being studied. This methodology focuses
more on the “what” of the research subject rather than the “why” of the research subject.
Example: An apparel brand that wants to understand the fashion purchasing trends among New York
buyers will conduct a demographic survey of this region, gather population data and then conduct
descriptive research on this demographic segment. The research will then uncover details on “what
is the purchasing pattern of New York buyers”, but will not cover any investigative details on “why” the
pattern exists. This is because, for an apparel brand trying to break into this market, understanding the
nature of the market is the objective of the study.
3. Causal Research: Causal research, also called explanatory research, is the investigation of (research
into) cause-and-effect relationships. To determine causality, it is important to observe variation in
the variable assumed to cause the change in the other variable(s), and then measure the changes in
the other variable(s).
Example: If a clothing company currently sells blue denim jeans, causal research can measure the
impact of the company changing the product design to the color white.
Business research, like other forms of scientific inquiry, involves a sequence of highly interrelated
activities. The stages of the research process overlap continuously. Nevertheless, it follows a general
pattern which is as below:
A researcher has to know what to measure before knowing how to measure something. The problem
definition process should suggest the concepts that must be measured. A concept can be thought of as a
generalized idea that represents something of meaning. There are two kinds of concepts:
1. Simple Concepts: These are concepts such as age, sex, height, width, weight etc. These are
relatively concrete properties and very easy to measure.
2. Complex Concepts: These are concepts which are abstract in nature. Concepts such as loyalty,
personality, channel power, trust, customer satisfaction etc. are very difficult to measure. Thus,
to understand measurement of these concepts, we use pre-existing scales from previously
published academic journals. For example, loyalty has been measured as a combination of
customer share (the relative proportion of a person’s purchase going to one competing brand)
and commitment (the degree to which a customer will sacrifice to do a business with a brand).
Concepts are the basic units of theory development. A theory is a formal, testable explanation of some
events that includes explanations of how things relate to one another. However, theories require an
understanding of the relationship among concepts. Thus, once the concepts of interest have been
identified, a researcher is interested in the relationship among these concepts.
Propositions are statements concerned with the relationships among concepts. A proposition explains
the logical linkage among certain concepts by asserting a universal connection between concepts.
For example, we might propose that treating the employees of an organization better will make them
more loyal employees and increase customer satisfaction. This is certainly a logical link between
managerial actions and employee reactions, but is quite general and not really testable in its current form.
A hypothesis is a formal statement explaining some outcome. In its simplest form, a hypothesis is a guess
and is a proposition that is empirically testable. The hypotheses should be logically derived from and
linked to our research objectives.
Measures of Scales
Business research uses many scales or number systems and not all number
systems represent the same richness of data. Similarly, not all concepts
require a rich measurement scale. Thus, depending upon the task, there are
four kinds of scales:
1. Nominal Scale: The most elementary scale which just assigns a value
to an object for identification and classification purposes.
2. Ordinal Scale: This is richer than the Nominal Scale. Here, the number can be used to arrange or
rank a concept. Research participants are often asked to rate a service from 1 to 5, with 1 being the
worst and 5 being the best.
3. Interval Scale: Scales that have both nominal and ordinal properties but also capture information
about differences in quantities of a concept from one observation to the next are called interval
scales.
A very famous example of an interval scale is temperature. Consider the temperature on June 6 as 80
deg F and on December 7 as 40 deg F. We can say that June 6 was hotter than December 7, but
we cannot say that June 6 was twice as hot as December 7.
Another important property of an interval scale is that the concept does not disappear at the value 0.
For example, 0 deg C does not mean the absence of temperature.
4. Ratio Scale: While an interval scale possesses only a relative meaning, a ratio scale has an absolute
meaning because it has a true zero. A good example of this would be the weight of an object. A weight of 0
means the complete absence of the quantity being measured. Another difference from the interval scale is that
an 80 kg person is twice as heavy as a 40 kg person. Here both the difference and the ratio convey
meaning about the data.
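The difference between interval and ratio scales can be checked with a little arithmetic. The sketch below, in Python, reuses the temperatures and weights from the examples above and shows that the "twice as hot" ratio does not survive a change of units, while the weight ratio does. The function name and the kilogram-to-pound factor are illustrative choices, not part of the original text.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (deg_f - 32) * 5 / 9


KG_TO_LB = 2.20462  # approximate conversion factor (illustrative)

# Interval scale: the ratio depends on the unit chosen.
june_f, december_f = 80, 40
ratio_f = june_f / december_f
ratio_c = fahrenheit_to_celsius(june_f) / fahrenheit_to_celsius(december_f)

# Ratio scale: the ratio is the same in any unit.
weight_ratio_kg = 80 / 40
weight_ratio_lb = (80 * KG_TO_LB) / (40 * KG_TO_LB)

print("Temperature ratio in deg F:", ratio_f)
print("Temperature ratio in deg C:", round(ratio_c, 2))
print("Weight ratio in kg:", weight_ratio_kg)
print("Weight ratio in lb:", round(weight_ratio_lb, 2))
```

The temperature ratio is 2 in Fahrenheit but about 6 in Celsius, so neither figure says anything about "how many times hotter" the day was. The weight ratio stays at 2 in both units.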
Errors in research
Hypothesis testing using sample observations is based on probability theory. We make observations of a
sample and use them to infer the probability that some statement is true within the population the sample
represents. Because we cannot make any statement about a population from a sample with complete certainty, there is
always the chance that an error will be made.
1. Type 1 error: An error caused by rejecting the null hypothesis when it is true; has a probability of
alpha. Practically a Type 1 error occurs when the researcher concludes that a relationship exists
in the population when in reality it doesn’t exist.
2. Type 2 error: An error caused by failing to reject the null hypothesis when the alternate
hypothesis is true; has a probability of beta. It occurs when the researcher concludes that no relationship exists when in reality
one does exist.
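The meaning of alpha can be illustrated with a small simulation. The sketch below, in Python, repeatedly draws samples from a population in which the null hypothesis is true (mean 0, known standard deviation 1) and runs a two-sided Z-test at alpha = 0.05 each time. The function name, sample size, number of trials and random seed are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def type1_error_rate(trials=2000, n=25, alpha=0.05, seed=0):
    """Estimate how often a true null hypothesis is rejected by a Z-test."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # The null hypothesis (population mean = 0) is true by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type 1 error
    return rejections / trials


rate = type1_error_rate()
print("Share of samples that wrongly rejected H0:", rate)
```

The share of wrong rejections should land close to 0.05, which is what "a probability of alpha" means in practice.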
Apart from the above, there are errors that can creep into the survey. These errors can be classified
broadly as shown in the flow chart below.
[Flow chart of survey errors not reproduced; the only recoverable label is "Interviewer cheating".]
Apart from the errors given in the flow chart above, some biases can also be present at
the deliberate falsification or unconscious misrepresentation stage. These are as follows:
1. Acquiescence bias: A tendency for respondents to agree with all or most questions asked to them
in a survey.
2. Extremity bias: A category of response bias that results from some individuals tending to use extremes when
responding to questions
3. Interviewer bias: A response bias that occurs because the presence of the interviewer
influences the respondent’s answer
4. Social desirability bias: Bias in responses caused by respondents’ conscious or unconscious
decision to gain prestige or appear desirable
The terms parametric statistics and non-parametric statistics refer to two major groupings of statistical
procedures. The major difference between them lies in the underlying assumptions about the data to be
analyzed.
Parametric statistics involve numbers with known, continuous distributions. When the sample size is
large and the data is interval or ratio scaled, parametric statistical procedures are appropriate. They are based
on the assumption that the data in the study are drawn from a population with a bell-shaped
distribution (Normal distribution).
Nonparametric statistics are appropriate when the numbers do not conform to a known or continuous
distribution. Thus, you cannot make the assumption that the data which is being analyzed is normal in
nature. This happens typically with nominal and ordinal scale types.
T-test
The t-distribution
Univariate t-test
The univariate t-test is appropriate for testing a hypothesis involving some observed mean against some
specified value. The t-distribution, like the standard normal distribution, is a symmetric, bell-shaped curve,
but it has heavier tails. When the sample size is larger than 30, the t-test and the Z-test are almost identical. Thus, as a
rule of thumb, whenever sample sizes are less than 30 the t-test is used, and whenever sample sizes are more
than 30 the Z-test is preferred.
The t-distribution is significantly affected by the degrees of freedom. The degrees of freedom are
determined by the number of distinct calculations that are possible given a set of information.
For the univariate t-test, the number of degrees of freedom is the sample size (n) minus one.
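A univariate t-test can be worked through by hand. The sketch below, in Python, computes the t statistic and the degrees of freedom for a small sample and compares the result with the tabulated two-sided critical value for 9 degrees of freedom at the 5% level (2.262). The satisfaction scores and the hypothesized mean of 4.5 are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, hypothesized_mean):
    """Return the t statistic and degrees of freedom for a univariate t-test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses n - 1 in the denominator
    t_stat = (mean(sample) - hypothesized_mean) / standard_error
    return t_stat, n - 1


# Hypothetical satisfaction scores from 10 respondents.
scores = [4, 5, 6, 5, 4, 6, 5, 5, 6, 4]
t_stat, df = one_sample_t(scores, hypothesized_mean=4.5)

T_CRITICAL_DF9 = 2.262  # two-sided, alpha = 0.05, from a standard t table
reject_h0 = abs(t_stat) > T_CRITICAL_DF9

print("t =", round(t_stat, 3), "with", df, "degrees of freedom")
print("Reject H0:", reject_h0)
```

Here t is roughly 1.94 with 9 degrees of freedom, which is below the critical value, so the null hypothesis that the population mean equals 4.5 is not rejected for this sample.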
The degrees of freedom influence the shape and height of the t distribution.
Note: The Z-test and the t-test are very similar and will provide the same results in most situations.
However, when the population standard deviation (SD) is known, the Z-test is more appropriate. When
SD is unknown and the sample size is greater than 30, the Z-test can be used. When SD is unknown and
the sample size is less than 30 the t-test is more appropriate.
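The rule in the note above can be written as a small decision function. This is a minimal sketch in Python; the function name is hypothetical, and because the note does not say what to do when the sample size is exactly 30, the sketch assumes the t-test is used in that case.

```python
def choose_test(population_sd_known, sample_size):
    """Pick between the Z-test and the t-test using the rule of thumb above."""
    if population_sd_known:
        return "Z-test"
    # Assumption: a sample of exactly 30 is treated as a small sample.
    if sample_size > 30:
        return "Z-test"
    return "t-test"


print(choose_test(population_sd_known=True, sample_size=12))
print(choose_test(population_sd_known=False, sample_size=50))
print(choose_test(population_sd_known=False, sample_size=12))
```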
Chi-Square test
A chi-square test is a non-parametric test, which means that it does not require the assumption that the data analyzed is
normal in nature. There are two kinds of chi-square tests:
• Chi-square test for independence
• Chi-square test for goodness of fit.
In a goodness of fit test, sample data is divided into intervals and the number of points that fall into each interval is
compared with the expected number of points in that interval.
For example, suppose a company printed baseball cards. It claimed that 30% of its cards were rookies;
60% were veterans but not All-Stars; and 10% were veteran All-Stars. We could gather a random sample
of baseball cards and use a chi-square goodness of fit test to see whether our sample distribution differed
significantly from the distribution claimed by the company.
Thus, the hypothesis would be:
H0: Data is consistent with the claim of the company
H1: Data is inconsistent with the claim of the company
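The baseball card example can be worked through numerically. The sketch below, in Python, computes the chi-square goodness of fit statistic for a sample of 100 cards against the company's claimed proportions. The observed counts (50 rookies, 45 veterans, 5 All-Stars) are hypothetical, and the critical value 5.991 is the tabulated chi-square value for 2 degrees of freedom at the 5% level.

```python
def chi_square_goodness_of_fit(observed_counts, claimed_proportions):
    """Return the chi-square statistic and degrees of freedom."""
    total = sum(observed_counts)
    expected_counts = [p * total for p in claimed_proportions]
    statistic = sum(
        (observed - expected) ** 2 / expected
        for observed, expected in zip(observed_counts, expected_counts)
    )
    return statistic, len(observed_counts) - 1


claimed = [0.30, 0.60, 0.10]   # rookies, veterans, veteran All-Stars
observed = [50, 45, 5]         # hypothetical sample of 100 cards

statistic, df = chi_square_goodness_of_fit(observed, claimed)

CHI2_CRITICAL_DF2 = 5.991  # alpha = 0.05, from a standard chi-square table
reject_h0 = statistic > CHI2_CRITICAL_DF2

print("chi-square =", round(statistic, 2), "with", df, "degrees of freedom")
print("Reject H0:", reject_h0)
```

With these counts the statistic is about 19.58, well above the critical value, so H0 would be rejected: the sample is inconsistent with the company's claim.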
One-way ANOVA
The one-way ANOVA (Analysis of Variance) is a parametric test used to determine whether the means of two or
more samples differ. Thus, the H0 for this test is that the means of all the samples selected
from the population are the same.
Consider an example where there are two people and three drinks. In a one-way ANOVA we try to
determine whether preferences for the drinks are the same or different across samples.
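The one-way ANOVA compares the variation between group means with the variation within groups. The sketch below, in Python, computes the F statistic by hand for ratings of three drinks. The ratings are hypothetical, and the critical value 5.14 is the tabulated F value for (2, 6) degrees of freedom at the 5% level.

```python
def one_way_anova_f(groups):
    """Return the F statistic with its between and within degrees of freedom."""
    group_count = len(groups)
    total_count = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / total_count

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of individual ratings around their own group mean.
    ss_within = sum(
        (value - sum(group) / len(group)) ** 2
        for group in groups
        for value in group
    )

    df_between = group_count - 1
    df_within = total_count - group_count
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical preference ratings for three drinks.
drink_ratings = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(drink_ratings)

F_CRITICAL_2_6 = 5.14  # alpha = 0.05, from a standard F table
reject_h0 = f_stat > F_CRITICAL_2_6

print("F =", round(f_stat, 2), "with", (df_between, df_within), "degrees of freedom")
print("Reject H0:", reject_h0)
```

For these ratings F is 13 on (2, 6) degrees of freedom, so the hypothesis that all three drinks have the same mean rating would be rejected.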
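When two factors are studied together, as in a two-way ANOVA, a separate null hypothesis is tested for each factor and
for their interaction. For example:
H0 (a): There is no effect of gender on the test anxiety level of university students
H0 (b): There is no effect of educational level on the test anxiety level of university students
H0 (c): There is no effect of the interaction of gender and educational level on the test anxiety level of
university students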
Simple Regression
Linear regression quantifies the relationship between one or more predictor variable(s) and one
outcome variable. Linear regression is commonly used for predictive analysis and modeling. For
example, it can be used to quantify the relative impacts of age, gender, and diet (the predictor variables)
on height (the outcome variable). Depending on the context, linear regression is also loosely referred to as multiple regression, multivariate
regression, ordinary least squares (OLS), or simply regression.
Example: The table on the right shows some data from the early
days of the Italian clothing company Benetton. Each row in the
table shows Benetton’s sales for a year and the amount spent on
advertising that year.
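A regression with one predictor can be fitted by hand using the ordinary least squares formulas. The sketch below, in Python, estimates the intercept and slope of a line relating advertising to sales. The figures are hypothetical and are not the Benetton data referred to above; the function name is also an illustrative choice.

```python
def fit_simple_regression(x_values, y_values):
    """Return (intercept, slope) of the least squares line y = intercept + slope * x."""
    n = len(x_values)
    x_mean = sum(x_values) / n
    y_mean = sum(y_values) / n
    # Slope = covariation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_values, y_values))
    sxx = sum((x - x_mean) ** 2 for x in x_values)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical yearly figures, not the actual Benetton data.
advertising = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

intercept, slope = fit_simple_regression(advertising, sales)
print("Sales =", round(intercept, 2), "+", round(slope, 2), "* Advertising")
```

In this made-up data set every extra unit of advertising goes with two extra units of sales, so the fitted slope is 2 and the intercept is 1.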
Multiple Regressions
Linear regression with a single predictor variable is known as simple regression. In real-world applications,
there is typically more than one predictor variable. Such regressions are called multiple regressions.
Returning to the Benetton example, we can include a year variable in the regression, which gives the result
that Sales = 323 + 14 Advertising + 47 Year. The interpretation of this equation is that, holding the year constant, every extra million
Euro of advertising expenditure is associated with an extra 14 million Euro of sales, and that sales grow due
to non-advertising factors by 47 million Euro per year.
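The interpretation of the coefficients can be checked by plugging numbers into the fitted equation. The sketch below, in Python, uses the coefficients quoted above (323, 14 and 47). The input values are hypothetical, and since the text does not say how the year variable is coded, the sketch assumes it is a simple year index.

```python
def predict_sales(advertising, year):
    """Predicted sales (million Euro) from Sales = 323 + 14 Advertising + 47 Year."""
    return 323 + 14 * advertising + 47 * year


# Hypothetical inputs: advertising in million Euro, year as an index (assumed coding).
baseline = predict_sales(advertising=10, year=2)
more_advertising = predict_sales(advertising=11, year=2)
next_year = predict_sales(advertising=10, year=3)

print("Baseline prediction:", baseline)
print("Effect of one extra million Euro of advertising:", more_advertising - baseline)
print("Effect of one extra year:", next_year - baseline)
```

Changing one input while holding the other fixed reproduces the coefficients: 14 for advertising and 47 for the year.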