
RM-Cha 9

This document provides an overview of quantitative data processing and analysis techniques for economists. It discusses 1) data processing which involves compiling, editing, coding, classifying and tabulating raw data, 2) descriptive analysis such as measures of central tendency and frequency distributions to describe data, and 3) inferential analysis including statistical tests like chi-square and correlation to analyze relationships between variables and test hypotheses. Regression is also introduced as a technique to assess causal relationships and predict the value of one variable based on another.


UNITY UNIVERSITY:

FACULTY - COLLEGE OF BUSINESS, ECONOMICS AND SOCIAL SCIENCE
COURSE TITLE: Research Methodology for Economists
COURSE CODE: Econ 364
DEPARTMENT: Economics
LEVEL: Undergraduate
CREDIT HOURS: 3
ACADEMIC YEAR: 2023/2024 G.C
SEMESTER: I
COURSE INSTRUCTOR: H . A
E-mail:
SKYPE:
Cellphone:
Unity University, Dec 2023
CHAPTER NINE

Overview of Data Processing and Analysis
• The goal of any research is to provide information out of raw data.

• The raw data, after collection, has to be processed and analyzed in line with the outline (plan).


Data Processing and Analysis . . .
•The compiled data must be:
✓ classified,
✓ processed,
✓ analyzed and
✓ interpreted carefully before their
complete meanings and implications can
be understood.
Data Processing
• Data processing implies compiling, editing, coding, classification and tabulation of collected data so that they are amenable to analysis.

• You are checking and cleaning the data.


Data Analysis
• It is the process of breaking a complex topic into smaller parts to get a better understanding of it.
• Thus data analysis is:
✓ further transformation of the processed data to look for patterns and relations among data groups.
✓ the computation of certain indices or measures, along with searching for patterns of relationship that exist among the data groups.
Data Analysis . .
•Analysis particularly in case of survey or
experimental data involves:
✓ estimating the values of unknown parameters of
the population and testing of hypothesis for
drawing inferences.

•Examining your data in detail


Data Analysis can be categorized as
I. Descriptive Analysis
II. Inferential (Statistical) Analysis

Data Analysis:
1. Quantitative Data Analysis
2. Qualitative Data Analysis
1. Quantitative Data Processing and Analysis
• Used for 3 important tasks:
1. To measure differences between groups
2. To assess the relationship between variables
3. To test hypotheses scientifically

• It proceeds in 4 steps:
1. Data preparation/processing
2. Describing the data/descriptive statistics
3. Drawing the inference of the data/inferential statistics
4. Interpretation of the data


Flow chart of the steps:
1. Data processing/preparation: Compilation, Editing, Coding, Classification, Data entry, Tabulation
2. Descriptive data analysis
3. Inferential analysis
4. Interpretation
Levels of analysis: Univariate, Bivariate, Multivariate
Step 1: Processing of Data/Data Preparation

• Compilation: Bringing together all the collected data and arranging them in a particular order.
• Editing: The process of examining the collected raw data to detect errors and omissions and to correct these when possible. Basically, scrutiny of the completed questionnaires.
• Coding: The process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes.
• Classification: Arranging data into homogeneous groups. Two ways:
● Classification on attributes
● Classification on class intervals
• Tabulation: The process of summarising raw data and displaying them in compact form.
1. Compilation:

• The collected data can be ordered in accordance with:
✓ serial number
✓ roll number
✓ others
2. Editing:
• The process of examining the collected raw data to detect errors and omissions (including extreme values) and to correct them when possible
• Careful scrutiny of completed questionnaires or schedules
• Check whether respondents answered the questions properly or skipped any
• Editing can be either:

I. Field editing:

II. Central editing:

3. Coding:
• Refers to the process of assigning numerical or other symbols to answers so that responses can be put into a limited number of categories or classes.
• Coding is needed when the researcher uses a computer to analyze the data; otherwise it can be avoided.
Ex: Dummy variables:
1 = Male, 0 = Female
1 = family owns a home
0 = family does not own a home
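
As a rough sketch, the dummy coding above could be applied with a small codebook. The field names `sex` and `owns_home`, the "Yes"/"No" answer labels and the sample records are hypothetical.

```python
# Codebooks taken from the dummy-variable example (answer labels are assumed).
SEX_CODES = {"Male": 1, "Female": 0}
HOME_CODES = {"Yes": 1, "No": 0}


def code_response(answer, codebook):
    """Return the numeric code for an answer, or None if it is not in the codebook."""
    return codebook.get(answer)


# Hypothetical raw questionnaire answers.
raw = [
    {"sex": "Male", "owns_home": "No"},
    {"sex": "Female", "owns_home": "Yes"},
]
coded = [
    {"sex": code_response(r["sex"], SEX_CODES),
     "owns_home": code_response(r["owns_home"], HOME_CODES)}
    for r in raw
]
print(coded)
```

Returning None for an unknown answer keeps uncodable responses visible so they can be reviewed during editing.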
4. Classification:
• Data classification is the process of arranging data in groups or classes on the basis of common characteristics.

a). Classification of data according to attributes:

• Data are grouped on the basis of common characteristics, which can be descriptive (such as literacy, sex, honesty, etc.) or numerical (such as weight, age, height, income, expenditure, etc.).
b). Classification of data according to class intervals:
• Numerical data like income, production, age, weight, etc. can be classified into intervals.

Ex:

➢ Individuals whose incomes are within, say, 1001-1500 Birr can form one group,

➢ those whose incomes are within 500-1000 Birr form another group, and so on.
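
Classification by class interval can be sketched as follows. The first two intervals come from the example above; the third interval (1501-2000), the sample incomes and the assumption that incomes are recorded in whole Birr are illustrative only.

```python
from collections import Counter

# Class intervals in Birr, inclusive at both ends.
INTERVALS = [(500, 1000), (1001, 1500), (1501, 2000)]


def classify_income(income, intervals=INTERVALS):
    """Return the label of the class interval that contains the income."""
    for low, high in intervals:
        if low <= income <= high:
            return f"{low}-{high}"
    return "other"  # falls outside every listed interval


# Hypothetical monthly incomes of six respondents.
incomes = [650, 1200, 980, 1450, 1800, 300]
table = Counter(classify_income(x) for x in incomes)
for label, count in table.items():
    print(label, count)
```

Counting the members of each class in this way already gives a simple frequency table, which is the starting point for tabulation.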
Some problems in processing
Don’t know (DK) Responses:

• During data processing, the researcher often comes across some responses that are difficult to handle.

• How should the researcher deal with DK responses?

Prevention is the best!

• The best way is to design better questions.

• Good rapport (understanding) between interviewers and respondents will minimize DK responses.


5. Tabulation:

• Put your variables/data in rows and columns


Step 2: Descriptive statistics - Describing the Data

o Descriptive statistics refers to the transformation of raw data into a form that will make them easy to understand and interpret.

o Descriptive analysis is largely the study of the distribution of one variable.

• The most common forms of describing the processed data are:

✓ Tabulation
✓ Averages
✓ Frequency distribution
✓ Percentage distribution
✓ Measurements of central tendency
✓ Measurements of dispersion
✓ Measurement of asymmetry
✓ Data transformation and index number
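
Several of the descriptive measures listed above can be computed with Python's standard `statistics` module. The sketch below uses a hypothetical variable (family size of eight households); the function name `describe` is an assumption made for the example.

```python
from collections import Counter
from statistics import mean, median, mode, stdev


def describe(values):
    """Summarise one variable: central tendency, dispersion and distributions."""
    n = len(values)
    freq = Counter(values)
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "range": max(values) - min(values),
        "std_dev": stdev(values),  # sample standard deviation
        "frequency": dict(freq),
        "percentage": {k: 100 * c / n for k, c in freq.items()},
    }


# Hypothetical family sizes reported by eight households.
family_sizes = [3, 4, 4, 5, 4, 6, 3, 4]
summary = describe(family_sizes)
for name, value in summary.items():
    print(name, value)
```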
Step 3: Drawing the inferences of the data – Inferential statistics
• Inferential analysis is used in the case of bivariate or multivariate populations

• Researchers frequently conduct it to determine:

✓ the relationship between variables and its statistical significance
Which Analysis to Use?

• It depends on the

1. Specific objective of the research

2. Data type

3. Data assumption – data requirement


1 Specific objective of the research

• If our objective is to see the relationship between 2 variables, use:

a. Chi-square test

b. Correlation test
a. Chi-square (χ²) test:

• Tests the relationship between 2 qualitative (nominal) variables, where order doesn’t matter.

• Chi-square tests are often used to test hypotheses.

• The statistic is a measure of the difference between the observed and expected frequencies of the outcomes of a set of events or variables.
Ex:

• Is there a relationship between student gender and

course choice?

• Gender vs political party preference.


Formula

$$\chi_c^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

χc² = Chi squared
Oi = Observed value
Ei = Expected value
Decision:
• If the χ² value is large (larger than the critical value), the null hypothesis is rejected.
or
P-value | Description | Hypothesis Interpretation
P-value ≤ 0.05 | The observed data would be very unlikely if the null hypothesis were true. | Rejected
P-value > 0.05 | The observed data are consistent with the null hypothesis. | Not rejected (we “fail to reject” it)
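
The chi-square formula and decision rule above can be worked through on a small table. The observed counts below (gender by course choice) are hypothetical, and the function name `chi_square` is an assumption; 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count under Ho
            stat += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: rows = female, male; columns = course A, course B.
observed = [[30, 20],
            [20, 30]]
stat, df = chi_square(observed)

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, alpha = 0.05, df = 1
reject_null = stat > CRITICAL_5PCT_DF1
print(stat, df, reject_null)
```

Here every expected count is 25, so each cell contributes (5² / 25) = 1 and the statistic is 4, which exceeds the critical value.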
b. Correlation test:

• It is a bivariate analysis that measures the strength of association between 2 variables and the direction of the relationship.

• r = +1 (perfect positive relationship), r = -1 (perfect negative relationship), r = 0 (no linear relationship)
Ex:

• If the price of oil decreases, airfares also decrease

• If the price of oil increases, so do the prices of

airplane tickets
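
A minimal sketch of the correlation coefficient r for the oil price example follows. The prices and airfares are hypothetical numbers chosen to move together exactly, and `pearson_r` is an illustrative name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: oil price and average airfare in four periods.
oil_price = [60, 70, 80, 90]
airfare = [200, 220, 240, 260]
r = pearson_r(oil_price, airfare)
print(round(r, 3))  # close to +1: the two series rise and fall together
```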
Limitation of correlation test:

• It cannot tell us about:

➢ Cause and effect between variables

➢ The marginal impact of one variable on the other

➢ What percentage of the variation in one variable (y) is explained by x

➢ Forecasting
❑ Neither test (chi-square or correlation) can establish a causal relationship between two variables.

❑ Because of the above limitations, OLS (regression) is the most useful
Regression Analysis
• Similar to correlation analysis, but goes beyond it

• It is used to assess the cause-and-effect relationship between variables

• The researcher tries to estimate or predict the average value of one variable on the basis of the value of another variable.

• Ex: The connection between rainfall (X) and crop productivity (Y)
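
As an illustration of what regression adds to correlation, the sketch below fits a straight line Y = a + bX by ordinary least squares. The rainfall and yield figures are hypothetical, and `fit_line` is an assumed name. The slope is the marginal impact of X on Y, R² is the share of the variation in Y explained by X, and the fitted line can be used for prediction.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; also returns R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot


# Hypothetical data: rainfall (mm) and crop productivity (quintals per hectare).
rainfall = [100, 200, 300, 400]
crop_yield = [12, 19, 31, 38]
intercept, slope, r_squared = fit_line(rainfall, crop_yield)

predicted = intercept + slope * 250  # predicted average yield at 250 mm of rainfall
print(intercept, slope, r_squared, predicted)
```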
Time series Analysis:

• Successive observations of the given phenomenon

over a period of time are analyzed through time

series analysis.

• It measures the relationship between variables and

time (trend)
• If our objective is to compare the average value between groups, use the mean comparison tests:

a. T-test

b. ANOVA
a). T-test
• A statistical test procedure

• Analyses whether there is a significant difference between the means of 2 groups

• 3 types of t-tests:

I. One sample t-test

II. Independent sample t-test

III. Paired sample t-test
I. One sample t-test:
• Compares the mean value of a sample with a known (reference) mean value

Ex:

• You want to find out whether the one-quintal bags of cement your company produces really weigh 100 kg on average.

• To test this, weigh 50 bags and compare their actual average weight with the weight they should have (100 kg).
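
The one sample t statistic for the cement example could be computed as in this sketch. Only six hypothetical weights are used instead of 50 to keep it short, and `one_sample_t` is an assumed name. The statistic is the sample mean minus the reference mean, divided by the standard error of the mean.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, reference_mean):
    """t statistic and degrees of freedom for comparing a sample mean with a known value."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - reference_mean) / standard_error, n - 1


# Hypothetical weights (kg) of six cement bags that should weigh 100 kg.
weights = [99.2, 100.4, 99.8, 100.9, 99.5, 100.1]
t_stat, df = one_sample_t(weights, 100)
print(t_stat, df)  # compare |t_stat| with the critical t value for df degrees of freedom
```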


II. Independent sample t-test (unpaired t-test):

• Compares the mean values of 2 independent groups/samples

• If the difference in means is large enough relative to the variation within the groups, it is concluded that the two groups differ.

Ex:
• Is there a difference in average grade between male and female students at Unity University?
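
A sketch of the independent sample t statistic follows, using the pooled-variance form that assumes both groups have equal variances. The grades are hypothetical and `independent_t` is an illustrative name.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical grade point averages of male and female students.
male_grades = [2.8, 3.1, 3.4, 2.9, 3.0]
female_grades = [3.2, 3.5, 3.1, 3.6, 3.3]
t_stat, df = independent_t(male_grades, female_grades)
print(t_stat, df)  # a negative value means the first group has the lower mean
```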
III. Paired sample t-test:

• Also called the dependent samples t-test

• Tests whether the mean values of two dependent groups differ significantly from each other.
Ex:
• You want to check whether a new drug increases memory performance. Test the memory performance of 50 people before and after they take the medicine.

• Health consciousness of the Ethiopian population before and after Covid-19
Hypothesis Test of the Drug Effect . . .

Ho: The new drug has no impact on memory performance

Ha: The new drug increases memory performance

Decision Rule
• Use either the t-statistic or the p-value
a. T-statistic
• If t calculated > t critical . . . Reject Ho
b. P-value
• If P-value < α (the chosen significance level, e.g. 0.05) . . . Reject Ho
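
The paired test and its decision rule can be sketched on the memory example. Only three hypothetical people are used instead of 50, and `paired_t` is an assumed name; 2.920 is the standard one-sided 5% critical t value for 2 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """t statistic and degrees of freedom for the mean of the paired differences."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical memory scores of the same three people before and after the drug.
before = [10, 12, 14]
after = [12, 15, 16]
t_stat, df = paired_t(before, after)

T_CRITICAL_ONE_SIDED = 2.920  # alpha = 0.05, df = 2, from a standard t table
reject_null = t_stat > T_CRITICAL_ONE_SIDED
print(t_stat, df, reject_null)
```

With differences of 2, 3 and 2 the statistic is 7, so Ho would be rejected for this toy sample.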
b. ANOVA:

• What is ANOVA? Analysis of Variance

• ANOVA is a test used to determine differences between research results from ≥ 3 unrelated samples or groups.
Ex:

• Impact of the level of employee training on customer satisfaction ratings.

• Dependent variable: The quantitative dependent variable is customer satisfaction.

• Independent variable: The level of employee training

• Use ANOVA for this research topic.

• Why ANOVA for this research topic?

✓ It helps you understand how customer satisfaction ratings differ among employees of different training levels:

➢ Beginner

➢ Intermediate and

➢ Advanced
Employee Training Impact on Customer Satisfaction
• The ANOVA result is based on the F ratio

• The F ratio compares the variation between groups with the variation within groups.

• Higher F ratio values indicate that the variation between groups is larger than the variation within groups.
• If no real difference exists between the tested groups (the null hypothesis), the ANOVA's F-ratio statistic will be close to 1.

• The larger the F ratio, the more likely it is that the groups have different means.


Hypothesis testing:

Ho: Employees at all training levels have the same mean customer satisfaction rating

Ha: At least one training level has a different mean customer satisfaction rating

Decision Rule

• If the result is statistically significant, the null hypothesis is rejected, meaning the employee groups performed differently.
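
The F ratio described above can be computed by hand for a one-way layout, as in this sketch. The satisfaction ratings for the three training levels are hypothetical and `one_way_anova_f` is an assumed name.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F ratio = (variation between groups) / (variation within groups)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)  # mean square between groups
    ms_within = ss_within / (n - k)    # mean square within groups
    return ms_between / ms_within


# Hypothetical customer satisfaction ratings by employee training level.
ratings = {
    "Beginner": [6, 7, 8],
    "Intermediate": [7, 8, 9],
    "Advanced": [8, 9, 10],
}
f_ratio = one_way_anova_f(list(ratings.values()))
print(f_ratio)  # compare with the critical F value for (k - 1, n - k) degrees of freedom
```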
Types of ANOVA
One-way ANOVA
• One-way ANOVA is the simplest form of ANOVA
• Testing differences between three or more groups based on one
independent variable
• Ex: Comparing the sales performance of different stores in a retail
chain.
Two-way ANOVA
• Used when there are two independent variables.
• Allows for the evaluation of the individual and joint effects of the
variables.
• Ex: The impact of both advertising spend and product placement
on sales revenue.
MANOVA
• Multivariate ANOVA
• Differs from ANOVA as it tests for multiple dependent variables
Cross Tabulation:
• Used to quantitatively analyze the relationship between multiple variables.
• Enables researchers to understand the correlation between the different variables.
• Compares the results for one or more variables with the results of another.
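
A cross tabulation of two categorical variables can be built by counting each combination of answers. The records and the variable names `gender` and `course` are hypothetical, as is the function name `cross_tabulate`.

```python
from collections import Counter


def cross_tabulate(records, row_var, col_var):
    """Count how often each (row category, column category) pair occurs."""
    return Counter((r[row_var], r[col_var]) for r in records)


# Hypothetical survey records.
records = [
    {"gender": "F", "course": "Economics"},
    {"gender": "F", "course": "Accounting"},
    {"gender": "M", "course": "Economics"},
    {"gender": "F", "course": "Economics"},
]
table = cross_tabulate(records, "gender", "course")
for (gender, course), count in sorted(table.items()):
    print(gender, course, count)
```

The resulting cell counts are the observed frequencies that a chi-square test would use.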
2 Data Type - Measurement Scale
• A measurement scale is used to qualify or quantify data variables in statistics.

• It determines the kind of techniques to be used for statistical analysis.

• The measurement scales are used to measure:

a. Qualitative and

b. Quantitative data
• Measurement scales are four in number:

1) Nominal scale: used to measure qualitative data
2) Ordinal scale: used to measure qualitative data
3) Interval scale: used to measure quantitative data
4) Ratio scale: used to measure quantitative data
Nominal Scale

• The nominal scale is a scale of measurement that is

used for identification purposes.


Nominal Scale Example 1
Q # 1: Which political party are you affiliated
with?
Independent
Prosperity
EDPA
• Labeling Independent as “1”, Prosperity as “2” and
EDPA as “3” does not in any way mean any of the
attributes are better than the other.
• They are just used as an identity for easy data
analysis.
Nominal Scale Example 2

Q # 2: What is your favorite color?

White

Green

Black

Gray

• Labeling White as “1”, Green as “2”, Black as “3” and Gray as “4” likewise serves only as an identity and implies no ranking.
Ordinal Scale

• The attributes on an ordinal scale are usually

arranged in ascending or descending order.


Ordinal Scale Example 1
Q # 1: How would you rate our company
software?
Excellent
Very Good
Good
Bad
Poor
• The attributes in this example are listed in
descending order.
Ordinal Scale Example 2

Q # 2: How old are you?

20 - 25
26 - 30
31 - 35
36 - 40
41 - 45

Note that: Age can also be measured as a continuous variable.
Ordinal Scale Example 3

Q # 3: What is your highest education level?


Professor
PhD
MSC/MA
Diploma
High school
Junior
Illiterate
Interval Scale

• The interval scale of data measurement is a scale in which the levels are ordered and numerically equal distances on the scale represent equal differences.
• Ex: A 5-minute interval time scale
• Some of its uses include:

➢ calculating a student’s CGPA,

➢ measuring a patient’s temperature,

➢ etc.
Ratio Scale

• Allows the researcher to compare both the differences and the relative magnitude of numbers.

• Some examples of ratio scales include length, weight, time, etc.

• Common ratio-scale variables include price, number of customers, number of competitors, etc.
In general,
• Categorical data – qualitative data
• Scale data – quantitative data
• Again quantitative data can be 2 types
a. Continuous
b. Discrete
Continuous:
• If our dependent variable is continuous, use a linear regression model (simple linear regression when there is one independent variable)
• If our dependent variable is categorical (binomial, multinomial or ordinal), the choice again depends on its type, as follows.
Ex:
• If our dependent variable is binomial we can’t use linear
regression
• Instead use logistic regression
✓ Binomial logistic regression – dichotomous
✓ Multinomial logistic regression
✓ Ordinal logistic regression
• If our dependent variable is categorical with more than 2
responses use multinomial logistic regression
• If there is an order in these response (if it is ordinal
category) use ordinal logistic regression
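
The selection rule above can be written down as a small lookup. This is only a restatement of the slide's rule in code; the function name `choose_regression_model` and the type labels are assumptions made for the example.

```python
def choose_regression_model(dependent_type):
    """Map the type of the dependent variable to the model family named in the text."""
    models = {
        "continuous": "linear regression",
        "binomial": "binomial logistic regression",
        "multinomial": "multinomial logistic regression",
        "ordinal": "ordinal logistic regression",
    }
    if dependent_type not in models:
        raise ValueError(f"unknown dependent variable type: {dependent_type!r}")
    return models[dependent_type]


# Example: a yes/no outcome such as "family owns a home" is binomial.
print(choose_regression_model("binomial"))
```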
3 Data Assumption - Requirement

• Each data analysis tool has its own assumptions

• Why assumptions?

➢ Because each data analysis tool has its own limitations

➢ So the assumptions of each tool or model must be known well before analysing your data

➢ This helps you select the appropriate data analysis tool
Ex:
• To run a linear regression analysis, the dependent variable must be continuous, but this is necessary, not sufficient.

• The assumptions of OLS should also be fulfilled:

✓ No multicollinearity

✓ Normal distribution of the data (of the error term)

✓ No autocorrelation

✓ etc.
4. Interpretation of the Data

• It means a deep, critical examination of the research results

• Making sense of the results in an understandable manner

• Refer to Chapter 10 for further detail


2. Qualitative Data Analysis Methods
Common Methods are:

1. Content analysis

2. Narrative analysis

3. Discourse analysis

4. Thematic analysis

5. Grounded Theory

6. IPA - Interpretative phenomenological analysis


1. What is content analysis?
• It is widely accepted and the most frequently
employed technique for data analysis in research
methodology.

• It can be used to analyze the documented


information from text, images, and sometimes from
the physical items.

• The research questions determine when and where this method should be used.
2. What is narrative analysis?
• This method is used to analyze content gathered from various sources such as personal interviews, field observation, and surveys

• Most of the time, the stories or opinions shared by people are examined to find answers to the research questions.

• Analysis of conversations and interactions


3. What is discourse analysis?
• Similar to narrative analysis, discourse analysis is used to
analyze the interactions with people.

• This particular method considers:

✓ the social context within which the communication between the researcher and respondent takes place.

✓ focuses on the lifestyle and day-to-day environment


while deriving any conclusion.
4. Thematic Analysis
• Used when you are trying to find out something about
people’s views, opinions, knowledge, experiences or
values from a set of qualitative data
Ex:
• How do patients perceive doctors in a hospital setting?
• What are young women’s experiences on dating sites?
• What are non-experts’ ideas and opinions about
climate change?
• How is gender constructed in high school history
teaching?
5. Grounded Theory
• To study a particular phenomenon and discover new
theories that are based on the collection and analysis of real
world data.

• The theory is “grounded” in actual data

• Which means the analysis and development of theories


happens after you have collected the data.

• When you want to explain why a particular phenomenon happened, grounded theory is the best resort for analyzing qualitative data.
6. IPA
• Interpretative phenomenological analysis -IPA
• Aims to provide detailed examinations of personal
lived experience.
• An approach to psychological qualitative research
with an idiographic focus
• Which means that it aims to offer insights into how
a given person, in a given context, makes sense of a
given phenomenon.
• Understanding people’s unique experiences of a
phenomenon
Quiz 1
1. ____________ implies compiling, editing, coding, classification and tabulation of collected data. A). Data processing B). Data analysis C). Data Interpretation D). All E). None

2. The process of breaking a complex topic into smaller parts to get a better understanding of the data is _____ A). Processing B). Analysis C). Interpretation D). All E). None

3. Estimating the values of unknown parameters of the population and testing of hypotheses for drawing inferences is: A). Data processing B). Data analysis C). Data Interpretation D). All E). None

4. Distinguish between T-test and ANOVA


Quiz 2
1. The process of summarizing raw data and displaying them in compact form is ___ A). Compiling B). Editing C). Coding D). Classification E). Tabulation F). None

2. The scale of measurement that is used for identification purposes is __ A). Interval Scale B). Ratio Scale C). Nominal D). Ordinal E). All

3. Estimating the values of unknown parameters of the population and testing of hypotheses for drawing inferences is: A). Data processing B). Data analysis C). Data Interpretation D). All E). None

4. List, in sequence, the stages that researchers undertake in quantitative and qualitative data analysis.
END OF CHAPTER NINE

THANK YOU!
