
Madhur BRM Practical File Final

This document provides instructions for using SPSS software, including how to open SPSS, navigate the interface, define variables through the Variable View, specify variable properties like name, type, label, missing values, and understand different levels of measurement for variables. The Variable View allows editing variable parameters to properly prepare data for analysis in SPSS.

Uploaded by

Doli Chawla

GURU GOBIND SINGH INDRAPRASTHA UNIVERSITY

INSTITUTE OF INNOVATION IN TECHNOLOGY & MANAGEMENT

Business Research Methodology

BBA 213

Submitted To:
Akshay Chauhan
Associate Professor (Mgmt.)

Submitted By:
Name: Madhur Kathuria
Class: BBA M3
Enrolment no: 14290301721

1
Institute of Innovation in Technology & Management, New Delhi

(Affiliated to GGSIP University)

Lesson Plan for Research Methodology Lab

Programme: BBA    Semester: III    Paper Code: 208    Academic Year: 2021-2024


Date of commencement of classes: 23rd July’

Subject Objective:
 To understand the various aspects of research and identify the various tools available to a
researcher.
 To help students become business managers who can take practical decisions.
 To develop students’ knowledge and understanding of Research Methodology.

INDEX

Lecture No.  Details of the Topics/Sub-topics to be covered              Page No.  Signature/Remarks

L-1          Introduction to SPSS                                        5-6
L-2          Defining Variables & Data Coding                            7-10
L-3          Entering Data into SPSS                                     11-12
L-4          Types of Data                                               13
L-5          Questionnaire Design                                        14-23
L-6          Reliability Test                                            24-27
L-7          Frequency tables: Using frequency tables for analyzing data 28-31
L-8          Graphical representation of statistical data:               32-47
             histogram (simple vs. clustered)
L-9          Boxplot                                                     48-50
L-10         Line charts                                                 51-53
L-11         Scatterplot                                                 54-55
L-12         Sample and Population, concept of confidence interval       56-59
L-13         Testing normality assumption in SPSS                        60-66
L-14         T-test (one sample, independent-sample)                     67-72
L-15         T-test (paired sample)                                      73-75
L-16         Chi square Test                                             76-77
L-17         Chi square Test                                             78-80
L-18         Anova                                                       81-83
L-19         Anova                                                       84-86
L-20         Correlation analysis: Spearman                              87-89
L-21         Correlation analysis: Pearson                               90-91
L-22         Regression Analysis                                         92-98
L-23         How to apply correlation                                    99-101
L-24         How to apply Regression                                     102-105

Name & Signature of Faculty Programme Director Director

4
Assignment - 1
Introduction to SPSS
What is SPSS?

SPSS is a Windows-based program that can be used to perform data entry and analysis and
to create tables and graphs. SPSS is capable of handling large amounts of data and can
perform all of the analyses covered in the text and much more. SPSS is commonly used in
the Social Sciences and in the business world, so familiarity with this program should serve
you well in the future. SPSS is updated often. This document was written around an earlier
version, but the differences should not cause any problems.

Opening SPSS

Depending on how the computer you are working on is set up, you can open SPSS in
one of two ways.
1. If there is an SPSS shortcut on the desktop, simply put the cursor on it and
double click the left mouse button.
2. Click the left mouse button on the Start button on your screen, then put your cursor on
Programs or All Programs and left click the mouse. Select SPSS 17.0 for Windows by
clicking the left mouse button. (For a while the program was called PASW
Statistics 17, but that name seems to have been given up because everyone else calls it
SPSS. The version number may change by the time you read this.)

Layout of SPSS

The Data Editor window has two views that can be selected from the lower left hand side of
the screen. Data View is where you see the data you are using. Variable View is where you
can specify the format of your data when you are creating a file or where you can check the
format of a pre-existing file. The data in the Data Editor is saved in a file with the
extension .sav.

SPSS Menus and Icons

1. File includes all of the options you typically use in other programs, such as open, save, and
exit. Notice that you can open or create new files of multiple types.
2. Edit includes the typical cut, copy, and paste commands, and allows you to specify
various options for displaying data and output.
3. Options: click on Options to open the Options dialog box. You can use this to
format the data, output, charts, etc.
4. View allows you to select which toolbars you want to show, select font size, add or
remove the gridlines that separate each piece of data, and select whether to
display your raw data or the data labels.
5. Data allows you to select several options ranging from displaying data that is sorted by a
specific variable to selecting certain cases for subsequent analyses.
6. Transform includes several options to change current variables.
7. Analyze includes all of the commands to carry out statistical analyses and to calculate
descriptive statistics.
8. Graphs includes the commands to create various types of graphs including box plots,
histograms, line graphs, and bar charts.
9. Utilities allows you to list file information, which is a list of all variables with their labels,
values, locations in the data file, and type.
10. Add-ons are programs that can be added to the base SPSS package. You probably do not
have access to any of those.
11. Window can be used to select which window you want to view (that is, Data Editor,
Output Viewer, or Syntax).
12. Help has many useful options including a link to the SPSS homepage, a statistics coach,
and a syntax guide.

6
Assignment – 2
Defining Variables & Data Coding
Variable View

If you click on the Variable View tab, you’ll get a screen that looks like this.

Each column (Name, Type, Width, etc) represents a parameter of the variable. And
each row corresponds to a variable in the Data View. At the moment, we only have one
variable, so there’s only one row.

To change the name of our variable to something meaningful, just click where it
currently says VAR00001, and replace it with “Age”. You can also change the number of
decimal places that are displayed in the Data View by clicking in the first cell of the
Decimals column, and changing 2 to 0.

7
Variable Parameters

You now know how to change the Name and Decimals parameters. Here are the other
parameters that can be specified and changed within the Variable View.

Type
Type specifies the type of a variable, such as numeric, string, date, and so on. In
our example, SPSS has correctly identified Age as a numeric type. If you need to change
the type, click inside the cell you want to change and a list of variable types will be
displayed. Select the variable type you require and click OK.

Width
This parameter relates to the number of digits or characters that are displayed for a
particular variable within its column in the Data View.

Label
Label allows you to choose the text that is displayed in any SPSS output. For example, if
you give the Age variable a label “Age Status”, then “Age Status” will appear on charts,
graphs and tables. To add a label, click inside a cell within the Label column, and type in
the value.

8
Values
Applies when you have a numeric variable. If you are using numbers to represent groups
(for example, 1 for Male and 2 for Female), you can click the box to assign a label to each number.
Missing
This parameter allows you to specify a code for missing values. You might get missing
values if people refuse to answer a particular question on a questionnaire. To set a code
for missing values, click in a cell within the “Missing” column (the cell row should
correspond to the variable for which you wish to code missing values).

Select “Discrete missing values”, and input a value that doesn’t conflict with the data in
your data set. We’ve entered 999, because nobody is 999 years old.
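
Why the missing-value code matters can be shown with a small sketch outside SPSS. The Python snippet below is only an illustration: the ages are made-up values and the variable names are assumptions, not part of the practical file. It shows how the mean is distorted if the code 999 is treated as a real age.

```python
# Hypothetical ages as typed into Data View; 999 is the missing-value code.
MISSING_CODE = 999
ages = [21, 19, 999, 22, 20, 999, 23]

# Exclude the coded missing values before computing any statistic.
valid_ages = [age for age in ages if age != MISSING_CODE]

naive_mean = sum(ages) / len(ages)                  # wrongly counts 999 as an age
correct_mean = sum(valid_ages) / len(valid_ages)    # uses only the real answers

print(len(valid_ages), correct_mean, round(naive_mean, 2))
```

Once 999 is declared under Missing, SPSS leaves those cases out of the calculation in the same way.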

Columns
This parameter relates to the width of the column in the Data View grid. To increase or
decrease the size of the column, click up or down on the arrow icon.

Align
This parameter relates to the alignment of values in the Data View grid. You can left
justify, right justify or centre the data values within a column.

Measure
This parameter relates to the level of measurement of your data.

 SPSS specifies three levels of measurement: nominal, ordinal or scale. You need
to get this right.

 Select nominal if your values are categories (for example, sex, religion, disease,
social class, species).

 Select ordinal if your values are a series of ranks, such as in the case
of a motor racing result (first, second, third, fourth) or class standing.

 Select scale if your values are numerical, where each interval (for instance, 1 metre, 1
second, 1 inch) is the same size.

Data View – Each row of data view represents one participant or one subject. Each column
within the data view, represents a single variable.

Rules/ Properties of Variable

1. No spaces between the words of a variable name (use either camel case or underscores).
2. A variable name cannot begin with a special character. It must begin with a letter.
3. A variable name cannot begin with a number.
4. Each variable name must be unique.
5. A variable name can be 64 characters or less.
6. Width - Width represents how wide or narrow a string variable can be.
7. Columns - Columns represents how wide or narrow the variable's column can be.

Four Scales of Measurement

1. Nominal Scale - meant for identity only.


2. Ordinal Scale – meant for ranking and identity.
3. Interval Scale – meant for ranking, identity and equal difference.
4. Ratio Scale – meant for ranking, identity, equal difference and true origin.

10
Assignment - 3
Entering Data into SPSS

Follow these steps to enter data into SPSS:


1. Click the Variable View tab. Type the name for your first variable under the Name
column.
2. Click the Data View tab.
3. Now you can enter values for each case.
4. Repeat these steps for each variable that you will include in your dataset.

Steps to enter data into SPSS:


1. Define your variables. In order to enter data using SPSS, you need to have some variables.
These are the columns of the spreadsheet when using "Data View", and each one will contain
data that is all the same format.
 To define your variables, double-click a column heading in "Data View". A menu will
appear, allowing you to define the variable.
 When entering a variable Name, it must begin with a letter and capitalization is ignored.
 When choosing the Type, you can choose between "String" (characters) and a variety
of numerical formats.

2. Create a multiple choice variable. If you are defining a variable that has two or
more set possibilities, you can set labels for the values. For example, if one of
your variables is whether or not an employee is active, your only two options for
that variable might be "Active" and "Former".
 Open the Labels section of the Define Variable menu, and create a numbered value for
each possibility (e.g. "1", "2", etc.).
 For each value, give it a corresponding label (e.g. "Active", "Former").
 When you enter in the data for that variable, you only have to type "1" or "2" to select
the option you want.

3. Enter your first case. Click the empty cell directly underneath the leftmost
column. Enter in the value that matches the variable type into the cell. For
example, if the column is "name", you might enter in an employee's name.
 Each row is one "case", which is referred to as a record in other database programs.

11
4. Continue filling out variables. Move to the next empty cell to the right and fill out
the appropriate value. Always fill out one complete record at a time. For example, if
you are entering employee records, you would enter a single employee's name,
address, phone number, and salary before moving on to the next employee.
 Make sure that the values you enter match the Type format. For example, entering a dollar
value in a Date-formatted column will cause an error.

5. Finish filling out your cases. After each case is finished, move down to the
next row and enter in the next. Make sure each case has an entry for every
variable.
 If you decide you need to add another variable, double-click the next open column header
and create one.

6. Manipulate your data. Once you have finished entering all of your data, you can
use the tools built-in to SPSS to start manipulating your data. Some possible
examples include:
 Create a frequency table
 Run a regression analysis
 Run an analysis of variance
 Create a scatter plot graph

7. Then you obtain the output.

8. Save the file with the extension .sav.
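
The idea behind step 2 (typing a code such as "1" or "2" and letting the value labels supply the text) can be sketched in Python. This is only an illustration outside SPSS; the dictionary and the list of entries below are hypothetical.

```python
from collections import Counter

# Hypothetical value labels, as they would be set for the variable.
status_labels = {1: "Active", 2: "Former"}

# Hypothetical coded entries, one per case (row).
status_codes = [1, 2, 1, 1, 2]

# Translate each code into its label, then count the cases per label.
labelled = [status_labels[code] for code in status_codes]
counts = Counter(labelled)

print(labelled)
print(counts["Active"], counts["Former"])
```

Only the short codes are stored for each case; the labels are applied when results are displayed.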

12
Assignment – 4
Types of Data

Generally speaking, statistical techniques are determined by the type of data. A basic
understanding of the data types is helpful for choosing statistical procedures. In
SPSS, a column is for a variable and a row is for a case. There are, generally
speaking, two major types of data:

 Qualitative variables: The data values are non-numeric categories.
Examples: Blood type, Gender.

 Quantitative variables: The data values are counts or numerical measurements. A
quantitative variable can be either discrete, such as the number of students receiving an 'A' in a
class, or continuous, such as GPA, salary and so on.

Another way of classifying data is by the measurement scales. In statistics, there are
four generally used measurement scales:

 Nominal data: data values are group labels with no numeric meaning, even when they are coded
as numbers. For example, the Gender variable can be defined as male = 0 and female = 1.

 Ordinal data (sometimes loosely called 'Discrete Data'): data values are categorical and may
be ranked in some meaningful order. For example, strongly disagree to strongly agree may be
defined as 1 to 5.

 Continuous data:

 Interval data: data values lie in a real interval, which can be as large as from negative
infinity to positive infinity. The difference between two values is meaningful; however, the ratio
of two interval values is not. Examples: temperature, IQ. A statement such as "today is 1.2 times
hotter than yesterday" is not meaningful.

 Ratio data: Both the difference and the ratio of two values are meaningful. For example, salary, weight.
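
A short worked example may make the interval/ratio distinction concrete. The Python sketch below uses made-up temperatures and salaries and an assumed exchange rate. The "1.2 times hotter" ratio changes when the unit changes from Celsius to Fahrenheit, while the salary ratio does not depend on the currency.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Hypothetical temperatures (interval data): today and yesterday.
today_c, yesterday_c = 24.0, 20.0
ratio_celsius = today_c / yesterday_c
ratio_fahrenheit = celsius_to_fahrenheit(today_c) / celsius_to_fahrenheit(yesterday_c)

# Hypothetical salaries (ratio data) and an assumed exchange rate.
salary_a, salary_b = 60000, 30000
rate = 80.0
ratio_salary = salary_a / salary_b
ratio_salary_converted = (salary_a / rate) / (salary_b / rate)

print(ratio_celsius, round(ratio_fahrenheit, 3))   # the two ratios disagree
print(ratio_salary, ratio_salary_converted)        # the two ratios agree
```

Because the zero point of a temperature scale is arbitrary, only differences are comparable. Salary has a true zero, so its ratios survive a change of unit.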

13
Assignment - 5
QUESTIONNAIRE DESIGN
A questionnaire is a research instrument consisting of a series of questions for the purpose
of gathering information from respondents. Questionnaires can be thought of as a kind of written
interview. They can be carried out face to face, by telephone, computer or post.

Objectives of Questionnaire :

1. Standardization (Fixed Pattern) of data collection.


2. Arriving at a better analysis for generalized conclusions.
3. Convenient tabulation and analysis of data.
4. Motivating and encouraging the participants to provide complete and accurate responses.
5. Minimizing response error and maximizing response rate.

SLEEP QUESTIONNAIRE

14
STEPS :

1. Go to the Variable View.
2. Put the data in different columns.
3. Go to the Values column and assign values as:
   1. Strongly Disagree
   2. Disagree
   3. Neutral
   4. Agree
   5. Strongly Agree
4. Then click on OK.

15
SLEEP QUESTIONNAIRE

16
17
18
19
20
21
22
23
Assignment – 6
RELIABILITY TEST
Reliability analysis allows you to study the properties of measurement scales and the items that
compose the scales. The Reliability Analysis procedure calculates a number of commonly used
measures of scale reliability and also provides information about the relationships between
individual items in the scale. In simple words, reliability means checking the internal
consistency of the items.

 Reliability Analysis measures the internal consistency of the scale items in the
questionnaire.
 Each scale in the questionnaire needs its own reliability analysis, so the number of scales
indicates the number of analyses to be run.
 Do not mix positively worded items with negatively worded items.

Testing: to check how strongly the items within a scale correlate with one another, and hence
how reliable the scale is.

Take away points from Reliability Analysis of Scale Items:

Strongly Disagree Disagree Neutral Agree Strongly Agree

Equal Intervals (Interval Scale)

Cronbach’s Alpha

 Cronbach’s alpha is a measure of internal consistency, that is, how closely related a
set of items are as a group. It is considered to be a measure of scale reliability.
 Analysts frequently use 0.7 as a benchmark value for Cronbach’s alpha.
 In a Cronbach's alpha analysis, a score of 0.7 or above is considered good, that is, the scale is
internally consistent. A score of 0.5 or below means that the questions need to be revised or
replaced, and in some cases, that the scale needs to be redesigned.
 If Cronbach’s alpha would fall when an item is deleted, do not delete that item.
 If Cronbach’s alpha would rise when an item is deleted, delete that item.
 In simple words, the minimum Cronbach’s alpha value is 0.7 if there are 10 or more
scale items. However, a Cronbach’s alpha of 0.5 will also work if the number of
items is less than 10.

Small Scale (fewer than 10 items): an alpha value of 0.5 is acceptable.

Large Scale (10 or more items): the alpha value should be 0.7.

25
OUTPUT

26
Interpretation:

Data. Data can be dichotomous, ordinal, or interval, but the data should be coded
numerically.

Assumptions. Observations should be independent, and errors should be uncorrelated
between items. Each pair of items should have a bivariate normal distribution. Scales
should be additive, so that each item is linearly related to the total score.

STEPS:

1. Go to Analyze.
2. Click on Scale.
3. Click on Reliability Analysis.
4. Reliability Analysis dialog box will appear.
5. Click on Statistics.
6. Reliability Analysis : Statistics dialog box will appear.
7. Under Descriptives for, select Item, Scale, and Scale if item deleted.
8. Under Summaries, select Means and Variances.
9. Under Inter-Item, select Correlations.
10. Click on Continue.
11. Click on Ok.
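
To see what lies behind the Cronbach's alpha figure in the output, the usual textbook formula can be sketched in Python: alpha = k/(k-1) x (1 - sum of the item variances / variance of the total score), where k is the number of items. The function name and the responses below are hypothetical and are not taken from the practical file's data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Raw Cronbach's alpha.

    items: one list of scores per scale item, with respondents in the same order.
    """
    k = len(items)
    sum_item_variances = sum(variance(scores) for scores in items)
    # Total score of each respondent across all items.
    totals = [sum(answers) for answers in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical answers of five respondents to three 5-point Likert items.
item1 = [4, 3, 5, 2, 4]
item2 = [5, 3, 4, 2, 4]
item3 = [4, 2, 5, 3, 5]

alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 3))
```

With these made-up answers the value is above the 0.7 benchmark. Re-running the function with one item left out gives the same kind of figure SPSS reports under Scale if item deleted.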

27
Assignment - 7
FREQUENCY TABLES
The Frequencies procedure can produce summary measures for categorical variables in the form
of frequency tables, bar charts, or pie charts.

A percentage frequency distribution is a display of data that specifies the percentage of
observations that exist for each data point or grouping of data points. It is a particularly useful
method of expressing the relative frequency of survey responses and other data. Many times,
percentage frequency distributions are displayed as tables or as bar graphs or pie charts.

Testing: examining the frequencies of the variables in statistics and chart format.

Variables:
Unique identification number (Nominal Variable)
Gender (Nominal Variable; 1 = Male, 2 = Female)
Height (Scale Variable)
Weight (Scale Variable)

28
29
OUTPUT

30
STEPS:

1. Go to Analyze.
2. Go to Descriptive Statistics.
3. Go to Frequencies.
4. Then click on OK.

Frequencies: Statistics -

 Percentile values - Quartiles

 Central Tendency - Mean and Sum

 Dispersion - Standard deviation, Variance, Range, Minimum, Maximum

 Distribution - Skewness, Kurtosis

Frequencies: Charts -

 Bar chart
 Histogram - show normal curve

Distribution Curve -

A distribution curve can take one of three shapes:

 Normal Distribution Curve
 Positively Skewed Curve
 Negatively Skewed Curve

In a Normal Distribution Curve the data are divided into two equal 50% portions.

 If most of the data lie on the left and the right side/tail is longer, it is called a
Positively Skewed Curve.
 If most of the data lie on the right and the left side/tail is longer, it is called a
Negatively Skewed Curve.
 If the data are equally divided, that is 50% on the right side and 50% on the left side,
it is known as a Normal Distribution Curve.

For a Discrete Variable we draw a Bar Graph.

For a Continuous Variable we draw a Histogram.
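
A frequency table with percentages, like the one the Frequencies procedure produces, can be sketched by hand in Python. The gender codes below are hypothetical, and the layout only approximates the SPSS output.

```python
from collections import Counter

# Hypothetical Gender column: 1 = Male, 2 = Female.
gender_codes = [1, 2, 2, 1, 1, 2, 1, 1]
labels = {1: "Male", 2: "Female"}

counts = Counter(gender_codes)
n = len(gender_codes)

# Build rows of (label, frequency, percent, cumulative percent).
table = []
cumulative = 0.0
for code in sorted(counts):
    percent = 100 * counts[code] / n
    cumulative += percent
    table.append((labels[code], counts[code], percent, cumulative))

for row in table:
    print(row)
```

Each row gives the count, its share of all cases, and the running total of the percentages, which ends at 100.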

31
Assignment – 8
Graphical Representation of Statistical Data
BAR GRAPH
 Bar charts usually present categorical variables, discrete variables or continuous variables
grouped in class intervals. They consist of an axis and a series of labelled horizontal or
vertical bars. The bars depict frequencies of different values of a variable or simply the
different values themselves.
 If your dataset includes multiple categorical variables, bar charts can help you understand
the relationship between them.
 It allows you to compare different sets of data among different groups easily. It
instantly demonstrates this relationship using two axes, where the categories are on one
axis and the various values are on the other. A bar graph can also illustrate important
changes in data throughout a period of time.
 Bar graphs are a convenient method to represent different sets of data. Apart from this,
these types of graphs are easy to describe and easily understood by the reader.

32
OUTPUT

33
34
Interpretation:

 A bar graph or chart depicts magnitudes, sizes or differences at equal intervals of time.
 Each bar represents a separate item, and collectively multiple bars are displayed
horizontally or vertically.
 It depicts the changes in the value of the dependent variable plotted on the y-axis at discrete
intervals of the independent variable on the x-axis.
 This means that the x-axis of the bar diagram carries a discrete variable while the other axis
represents a scale for a continuous variable.
 In a vertical bar graph, the lengths of the bars are proportional to their numeric values.

STEPS:

1. Go to Analyze.
2. Go to Descriptive Statistics.
3. Go to Frequencies.
4. Frequencies dialog box will appear.
5. Go to Charts.
6. Frequencies: Charts dialog box will appear.
7. From the Chart Type select Bar Charts and from the Chart Values select Frequencies.
8. Click on Continue.

35
MEANS
Mean implies average: it is the sum of a set of data divided by the number of observations. The
mean can be an effective tool when comparing different sets of data. The mean is the most
frequently used measure of central tendency because it uses all values in the data set to give you
an average. For data from skewed distributions, the median is better than the mean because it
isn't influenced by extremely large values.

Variables - Height (Scale Variable), Gender (Nominal Variable; 1 = Male, 2 = Female).

Means - Dependent List (Height), Independent List (Gender).

36
37
OUTPUT

STEPS:

1. Go to Analyze.
2. Go to Compare Means.
3. Click on Means.
4. Means dialog box will appear.
5. Place the Height in the Dependent List and place the Gender in the Independent List.
6. Click on Ok.
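
What the Means procedure reports (the mean of the dependent variable within each category of the independent variable) can be sketched in Python. The heights and gender codes below are hypothetical.

```python
from statistics import mean

# Hypothetical cases: Height is the dependent variable, Gender (1 = Male, 2 = Female)
# is the independent variable.
heights = [68, 62, 70, 64, 66, 63]
genders = [1, 2, 1, 2, 1, 2]

# Collect the heights of each gender group.
groups = {}
for gender, height in zip(genders, heights):
    groups.setdefault(gender, []).append(height)

# Mean height and number of cases per group.
group_means = {gender: mean(values) for gender, values in groups.items()}
group_sizes = {gender: len(values) for gender, values in groups.items()}

print(group_means, group_sizes)
```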

38
HISTOGRAM
A histogram is a method that uses bars to display count or frequency data. The independent
variable consists of interval- or ratio-level data and is usually displayed on the abscissa (x-axis),
and the frequency data on the ordinate (y-axis), with the height of the bar proportional to the
count.

It is used to summarize discrete or continuous data that are measured on an interval scale. It is
often used to illustrate the major features of the distribution of the data in a convenient form. It is
also useful when dealing with large data sets.

Variables – height (Scale Variable ), weight (Scale Variable).

Properties of Histogram

1. Histogram shows us the shape of the distribution.
2. Histogram shows the skewness of the distribution.

Normal Distribution (symmetric) Curve: the curve is equally divided, that is 50% on the
right side and 50% on the left side.

Mean = Median = Mode

Positively Skewed Curve: the right tail of the curve is longer.

Mean > Median > Mode

Negatively Skewed Curve: the left tail of the curve is longer.

Mode > Median > Mean

Skewness - Skewness describes how far the shape of the distribution departs from symmetry.

3. Histogram shows us Outliers / Extreme Values.

 They should be rectified manually,
 or eliminated from the data set.

4. Histogram shows us Kurtosis.

Kurtosis - The flatness or peakedness of a curve is called Kurtosis.

 Highly peaked - Leptokurtic
 Moderately peaked - Mesokurtic
 Flat - Platykurtic

5. Histogram shows the width of each bar, which indicates the lower and upper class limits.
6. Histogram is always used with an Interval or Ratio Scale.

Scale          Measure               Draw        Example          Normal Curve
Nominal Scale  Categorical variable  Bar Chart   Gender           No
Ratio Scale    Continuous variable   Histogram   Height & Weight  Yes
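
The skewness and kurtosis that a histogram shows visually can also be computed. The Python sketch below uses the simple moment-based definitions (third and fourth central moments scaled by the variance). SPSS may apply a small-sample correction, so its figures can differ slightly. The function names and data are hypothetical.

```python
from statistics import mean, median

def central_moment(data, order):
    """Average of (x - mean) raised to the given power."""
    m = mean(data)
    return sum((x - m) ** order for x in data) / len(data)

def skewness(data):
    """Moment-based skewness: zero for a symmetric distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3: negative is flatter, positive is more peaked."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_tailed = [1, 2, 2, 3, 10]   # hypothetical data with a long right tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed), mean(right_tailed), median(right_tailed))
```

For the right-tailed data the skewness is positive and the mean lies above the median, in line with Mean > Median for a positively skewed curve.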

40
41
42
OUTPUT

43
STEPS:

1. Go to Graphs.
2. Click on Chart Builder.
3. Chart Builder dialog box will appear.
4. From the Gallery choose Histogram.
5. From the Variables column place the Height and Weight on x- axis.
6. Click on Element Properties.
7. Element Properties Set Parameters dialog box will appear.
8. Put the Custom value for anchor as 0.
9. Put the interval width as 1.
10. Click on Ok.

To obtain Normal Distribution Curve :

44
45
46
Steps To run Normal Distribution Curve :
1. Go to Analyze.
2. Click on Descriptive Statistics.
3. Click on Frequencies.
4. Frequencies dialog box will appear.
5. Click on Charts.
6. Frequencies: Charts dialog box will appear.
7. From the column of Chart Type choose Histograms.
8. Click on Show normal curve on histogram.
9. Click on continue.
10. Click on Ok.

47
Assignment - 9
BOX PLOT
A box plot displays five statistics at one time, within each category, in graphic form.
The statistics are the minimum value, first quartile, median value, third quartile, and maximum
value. It helps you to find the values that fall outside the usual range of the data (outliers).

Characteristics of Box plot

1. Box plot shows the center and spread of the data.
2. Box plot helps in identifying skewness and outliers.
3. Box plot helps in comparing different variables.
4. The whiskers, end to end, show the range of the dataset.
5. The box indicates the inter-quartile range.
6. A box plot is equivalent to a five-number summary.

How to deal with Outliers :

1. Keep the outlier as it is, if it is a legitimate one.
2. Since the outlier can affect the skewness of the data set, try to apply a non-parametric test.
3. If the outlier is not legitimate (a data entry error), correct the entry.
4. Winsorize the outliers to match them with the next minimum or maximum possible value.
5. Discard the outlier if it is a multivariate outlier.
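
The five-number summary and one way of winsorizing can be sketched in Python. The heights are hypothetical, and the 1.5 x IQR rule used to flag outliers is a common convention assumed here. SPSS may compute quartiles with a slightly different rule, so its box edges can differ a little.

```python
from statistics import quantiles

# Hypothetical heights with one unusually large value.
heights = [60, 62, 63, 64, 65, 66, 67, 68, 80]

# Quartiles (the "inclusive" method treats the data as the whole population range).
q1, median, q3 = quantiles(heights, n=4, method="inclusive")
iqr = q3 - q1

five_number_summary = (min(heights), q1, median, q3, max(heights))

# Assumed convention: points beyond 1.5 x IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [h for h in heights if h < lower_fence or h > upper_fence]

# Winsorize: pull each outlier in to the nearest value that is not an outlier.
inside = [h for h in heights if lower_fence <= h <= upper_fence]
winsorized = [min(max(h, min(inside)), max(inside)) for h in heights]

print(five_number_summary, outliers, winsorized)
```

Here the value 80 is flagged and, after winsorizing, replaced by 68, the largest value that is not an outlier.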

48
OUTPUT

 Asterisk Mark representing outliers (Legitimate outlier).


 Legitimate accepted outlier.

49
Interpretation:

 The vertical axis shows the height of the respondents and the horizontal axis shows
gender of the respondents that is male and female.
 The box plot depicts that the variances of Male and Female are unequal.
 Further, the distribution of males is positively skewed from the position of median and
that of female, the distribution is slightly negatively skewed from the median (broadly
towards normal distribution) .

STEPS:

1. Go to Graphs.
2. Click on Chart Builder.
3. Chart Builder dialog box will appear.
4. From the Gallery choose Box plot.
5. From the variables column place the gender on x- axis and height on y – axis.
6. Click on Ok.

50
Assignment – 10
Line Charts
Line graphs are used to track changes over short and long periods of time. When smaller
changes exist, line graphs are better to use than bar graphs. Line graphs can also be used to
compare changes over the same period of time for more than one group.

51
52
Steps :

1. Go to Graphs.
2. Click on Chart Builder.
3. Chart Builder dialog box will appear.
4. From the variables: column drag Gender on x- axis and Height on y-axis.
5. Click on Titles/ Footnotes.
6. Click on Title 1.
7. Click on Footnote 1.
8. In the Edit Properties of : Click on Title 1.
9. In Contents box write HEIGHT ACROSS GENDER
10. Click on Apply.
11. Click on Ok.
12. In the Edit Properties of : Click on Footnote 1.
13. In Contents box write SOURCE: BBA BATCH (III SEM)
14. Click on Apply.
15. Click on Ok.

53
Assignment -11
Scatterplot

54
STEPS :

1. Go to Graphs.
2. Click on Chart Builder.
3. Chart Builder dialog box will appear.
4. From the Gallery Click on Scatter/ Dot.
5. From the Variables: column drag the weight on y- axis and drag the height on x- axis.
6. Click on Ok.

55
Assignment – 12

Sample and Population, concept of confidence interval

In statistics, the confidence level indicates the probability, with which the estimation of
the location of a statistical parameter (e.g. an arithmetic mean) in a sample survey is
also true for the population.

 Sampling confidence level: A percentage that reveals how confident you can
be that the population would select an answer within a certain range. For
example, a 95% confidence level means that you can be 95% certain the results
lie between x and y numbers.

Confidence interval depends on:

The confidence interval is based on the margin of error. There are three factors that
determine the size of the confidence interval for a given confidence level. These are:
sample size, percentage and population size. The larger your sample, the more sure
you can be that their answers truly reflect the population.

The most common confidence levels are 90%, 95% and 99%.

 Population parameters are typically unknown because it is usually impossible to
measure entire populations. By using a sample, we can estimate these parameters.
However, the estimates rarely equal the parameter precisely because of random
sampling error. Fortunately, inferential statistics procedures can evaluate a sample
and incorporate the uncertainty inherent in using samples. Confidence intervals
place a margin of error around the point estimate to help us understand how wrong
the estimate might be.
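
How a margin of error is placed around a point estimate can be sketched in Python. The sample below is hypothetical, and the interval uses the normal (z) approximation as a simplifying assumption. For a small sample like this one, the t distribution gives a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, confidence):
    """Approximate confidence interval for the mean, using the normal distribution."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # about 1.96 for 95%
    margin = z * stdev(sample) / sqrt(len(sample))        # margin of error
    return mean(sample) - margin, mean(sample) + margin

# Hypothetical sample of heights.
heights = [64, 66, 65, 67, 63, 68, 65, 66, 64, 67]

lower_95, upper_95 = confidence_interval(heights, 0.95)
lower_99, upper_99 = confidence_interval(heights, 0.99)

print(mean(heights), round(lower_95, 2), round(upper_95, 2))
print(round(lower_99, 2), round(upper_99, 2))
```

The 99% interval is wider than the 95% interval: more confidence requires a larger margin of error for the same sample.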

56
57
58
[Table: critical values of the t distribution; the individual values are not legible in this copy.]
Assignment - 13

NORMALITY TEST
Normality tests are used to determine if a data set is well - modeled by a normal distribution and
to compute how likely it is for a random variable underlying the data set to be normally
distributed.
More precisely, the tests are a form of model selection, and can be interpreted several ways,
depending on one's interpretations of probability.

Variables—Height (Scale Variable ),Gender (Nominal Variable ).

60
61
OUTPUT

62
63
64
65
Steps : To run the Normality Test

1. Go to Analyze.
2. Click on Descriptive Statistics.
3. Click on Explore.
4. Explore dialog box will appear.
5. Drag the height in the Dependent list and Drag the Gender in the Independent/ Factor list.
6. Click on Plots.
7. Explore : Plots dialog box will appear.
8. From the Boxplots column click on Factor level together.
9. From the Descriptive column click on Histogram.
10. Click on Normality plots with tests.
11. Click on continue.
12. Click on Ok.

66
Assignment -14
ONE SAMPLE T TEST
The one-sample t test is a statistical hypothesis test used to determine whether an unknown
population mean is different from a specific value.

William Sealy Gosset published the test in 1908 under the pseudonym "Student", which is why
it is known as the t test or "Student's t test".

The t test applies only:

 In case of small sample(s).
 When the population variance is unknown.
 When the population variance is estimated from the sample variance.
 When a hypothesised value of the population mean (the test value) is specified.

Degrees of Freedom (df) - The degrees of freedom are taken as n-1 in the one-sample t test.

OUTPUT

STEPS:

1. Go to Analyze.
2. Click on Compare Means.
3. Click on One – Sample T Test.
4. One – Sample T Test dialog box will appear.
5. Place the height in the Test Variable(s): column.
6. Put the Test Value as 65.
7. Click on Ok.
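
The calculation behind the One-Sample T Test dialog can be sketched as follows. The test value of 65 comes from the steps above; the heights are hypothetical, and the critical value is taken from a t table for df = 7.

```python
import math
import statistics

# Hypothetical heights in inches (illustrative only).
heights = [66.0, 67.5, 64.0, 68.0, 65.5, 69.0, 66.5, 67.0]
TEST_VALUE = 65.0                      # hypothesised population mean

n = len(heights)
df = n - 1                             # degrees of freedom
mean = statistics.mean(heights)
std_error = statistics.stdev(heights) / math.sqrt(n)

# t = (sample mean - test value) / standard error of the mean
t_stat = (mean - TEST_VALUE) / std_error

# Two-tailed 5% critical value of t for df = 7, taken from a t table.
T_CRIT_DF7 = 2.365
reject_null = abs(t_stat) > T_CRIT_DF7

print(f"t({df}) = {t_stat:.3f}, reject H0: {reject_null}")
```

SPSS reports the same t and df together with Sig. (2-tailed), so the table lookup is not needed there.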

INDEPENDENT-SAMPLES T TEST
The independent-samples t test compares the means of two groups measured at a single
point in time. It is appropriate when the two groups are independent of each other.
 Sig. in the SPSS output is the p-value.

OUTPUT

STEPS:

1. Go to Analyze.
2. Click on Compare Means.
3. Click on Independent-Samples T TEST.
4. Independent – Samples T Test dialog box will appear.
5. Place the height in the Test Variable(s): column and place the gender in the Grouping
variable.
6. Click on Define Groups.
7. Define Groups dialog box will appear.
8. Under Use specified values, enter 1 for Group 1 and 2 for Group 2.
9. Click on Continue.
10. Click on Ok.
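
As a sketch of what the test computes, the pooled-variance form of the independent-samples t statistic is shown below. It corresponds to the "Equal variances assumed" row of the SPSS output. The two groups are hypothetical heights for gender codes 1 and 2, and the critical value is taken from a t table for df = 10.

```python
import math
import statistics

# Hypothetical heights in inches for the two gender codes (illustrative only).
group1 = [68.0, 70.0, 69.5, 71.0, 67.5, 70.5]   # Group 1 (code 1)
group2 = [63.0, 65.5, 64.0, 66.0, 62.5, 64.5]   # Group 2 (code 2)

n1, n2 = len(group1), len(group2)
mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
var1, var2 = statistics.variance(group1), statistics.variance(group2)

df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / std_error

# Two-tailed 5% critical value of t for df = 10, taken from a t table.
T_CRIT_DF10 = 2.228
reject_null = abs(t_stat) > T_CRIT_DF10

print(f"t({df}) = {t_stat:.3f}, reject H0: {reject_null}")
```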

Assignment – 15
PAIRED-SAMPLES T TEST
The paired-samples t test compares two measurements taken on the same sample at
different points in time (as in a before-and-after design). Because the two sets of
measurements are dependent on each other, the paired-samples t test is appropriate.

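Assumptions:
 The differences between the paired measurements should be approximately normally distributed.
 Equal population variances (σ²) are an assumption of the independent-samples t test; the
paired test works on the differences and does not require it.
 The pairs must be sampled independently of one another.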

Interpretation:
 If the p-value (Sig.) is less than the significance level, reject the Null Hypothesis.
 If the absolute value of the calculated t is greater than the critical t, reject the Null
Hypothesis.
 Here the lower and upper bounds of the confidence interval are both negative, so the interval
does not include 0 and the Null Hypothesis is rejected.

STEPS:

1. Go to Analyze.
2. Click on Compare Means.
3. Click on Paired-Samples T TEST.
4. Paired-Samples T TEST dialog box will appear.
5. Place the height under Variable 1 column and place the weight under Variable 2 column.
6. Click on Ok.
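
The paired test works on the differences within each pair. A minimal sketch, using hypothetical before-and-after measurements (not the data of this file) and a critical value from a t table for df = 6:

```python
import math
import statistics

# Hypothetical before/after measurements on the same seven subjects.
before = [72.0, 75.5, 68.0, 80.0, 77.5, 70.0, 74.0]
after = [70.5, 73.0, 68.5, 77.0, 75.0, 69.0, 72.5]

# The test is a one-sample t test on the paired differences.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)
df = n - 1
mean_diff = statistics.mean(differences)
std_error = statistics.stdev(differences) / math.sqrt(n)
t_stat = mean_diff / std_error

# Two-tailed 5% critical value of t for df = 6, taken from a t table.
T_CRIT_DF6 = 2.447
margin = T_CRIT_DF6 * std_error
ci_lower, ci_upper = mean_diff - margin, mean_diff + margin
reject_null = abs(t_stat) > T_CRIT_DF6

print(f"t({df}) = {t_stat:.3f}")
print(f"95% CI of the difference = ({ci_lower:.3f}, {ci_upper:.3f})")
print(f"reject H0: {reject_null}")
```

With these made-up values both confidence limits are negative, so the interval excludes 0 and the two decision rules agree.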

Assignment – 16 & 17

CHI-SQUARE TEST

(A test for the statistical significance of the strength of an association)

Non-Parametric Test for One Sample

The chi-square test for independence, also called Pearson's chi-square test or the chi-square
test of association, is used to discover if there is a relationship between two categorical
variables.

Rationale and Utility

 For applying the chi-square test, data can be nominal or ordinal (categorical). The
objective is to compare the distribution of responses, or the proportions
of participants in each response category, to a known distribution.
 The observed frequencies in each response category are compared to
the frequencies that would be expected if the null hypothesis were true.
 χ² = Σ (fo − fe)² / fe
fo = observed frequencies
fe = expected frequencies

Interpretation:

 In the chi-square test, the observed frequency distribution is compared with the expected
frequency distribution.
 In a non-parametric test, the variables are categorical in nature.
 In a parametric test, the variables are measured on a metric scale.

In this question, the Null Hypothesis (H0) is that the newly launched courses have no impact
on the distribution of students' responses.

The Alternate Hypothesis (H1) is that the newly launched courses have an impact on the
distribution of students' responses.

For the chi-square test we need two things:

1. Degrees of freedom: df = k - 1, where k is the number of response categories. In this
question the number of response categories is 3, so df = 3 - 1 = 2.

2. Significance level: in this question the significance level is 5%.

If the calculated chi-square value is greater than the critical chi-square value, reject the
Null Hypothesis.

 Here the calculated value exceeds the critical value, so we have sufficient sample evidence
to reject the Null Hypothesis.

Steps:
1. Go to Data.
2. Click on Weight Cases.
3. Weight Cases dialog box will appear.
4. Select Weight cases by and move Frequency into the Frequency Variable box.
5. Click on OK.
6. Click on Analyze.
7. Click on Nonparametric Tests.
8. Click on Legacy Dialogs.
9. Click on Chi-square.
10. Chi-square Test dialog box will appear.
11. Move Exercise_Pattern into the Test Variable List.
12. Under Expected Values, select Values and add 60, 25 and 15 in that order.
13. Click on OK.
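
The chi-square calculation for this kind of one-sample test can be sketched as below. The expected ratio 60 : 25 : 15 comes from the steps above; the observed counts are hypothetical. The critical value 5.991 is the 5% value for df = 2 from a chi-square table, and the p-value uses a closed form that holds only when df = 2.

```python
import math

# Hypothetical observed counts in the three response categories.
observed = [45, 30, 25]
expected_ratio = [60, 25, 15]          # expected values entered in SPSS

total = sum(observed)
ratio_sum = sum(expected_ratio)
# Scale the expected ratio so that expected counts add up to the sample size.
expected = [total * r / ratio_sum for r in expected_ratio]

# chi-square = sum of (fo - fe)^2 / fe over all categories
chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
df = len(observed) - 1                 # k - 1

CHI2_CRIT_DF2 = 5.991                  # 5% critical value for df = 2
p_value = math.exp(-chi_square / 2)    # exact upper-tail probability for df = 2 only
reject_null = chi_square > CHI2_CRIT_DF2

print(f"chi-square({df}) = {chi_square:.3f}, p = {p_value:.4f}")
print(f"reject H0: {reject_null}")
```

Both decision rules agree here: the calculated value is above the critical value and the p-value is below 0.05.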

Assignment – 18 & 19
One-Way ANOVA
 ANOVA was developed by R. A. Fisher.
 Analysis of variance (ANOVA) is a statistical technique used to test whether one or more
factors hypothesized to influence a dependent variable lead to different group means.
Classification of ANOVA:

 One-Way ANOVA
 Two-Way ANOVA
 N-Way ANOVA

One – Way ANOVA – Is used when we have a single factor with three or more levels and
multiple observations at each level.

Two – Way ANOVA - Is used to compare the effect of multiple levels of two factors with
multiple observations at each level.

 When covariates are included in an ANOVA, it is called analysis of covariance (ANCOVA).
 When there are two or more dependent variables, multivariate analysis of variance (MANOVA)
can be used to test the hypothesis.
 The two or more dependent variables should be interval or ratio scale variables.

VARIABLE VIEW

OUTPUT

Interpretation:

 The significance value of Levene's test is greater than the significance level (0.05).
 Levene's statistic is therefore not significant, which means the variances of the job stress
quotient are equal across the groups. The variances are homogeneous (that is a good thing),
so there is no need for the Welch and Games-Howell tests.
 In ANOVA we apply the F test.
 The numerator degrees of freedom are (C - 1), where C is the number of groups (columns).
With 4 groups, the numerator df = 4 - 1 = 3.
 The denominator degrees of freedom are (N - C), where N is the total number of observations.
With 20 observations, the denominator df = 20 - 4 = 16.

STEPS :

1. Go to Analyze.
2. Click on Compare Means.
3. Click on One- Way ANOVA.
4. One- Way ANOVA dialog box will appear.
5. Place the Job Stress Quotient in the Dependent List and the Managerial Level in the
Factor box.
6. Click on Post Hoc.
7. One-Way ANOVA: Post Hoc Multiple Comparisons dialog box will appear.
8. Under Equal Variances Assumed, select Tukey.
9. Under Equal Variances Not Assumed, select Games-Howell.
10. Set the significance level to 0.05.
11. Click on Continue.
12. Click on Options.
13. One-Way ANOVA: Options dialog box will appear.
14. Under Statistics, select Descriptive, Homogeneity of variance test and Welch.
15. Select Means plot.
16. Click on Continue.
17. Click on OK.
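
The F ratio and its degrees of freedom can be worked out by hand. The sketch below assumes four managerial levels with five observations each (20 in total, matching the degrees of freedom in the interpretation); the level names and job stress scores are hypothetical. The critical value 3.24 is the 5% value of F for (3, 16) degrees of freedom from an F table.

```python
import statistics

# Hypothetical job stress quotients for four managerial levels.
groups = {
    "level_1": [20.0, 22.0, 19.0, 24.0, 25.0],
    "level_2": [28.0, 30.0, 27.0, 26.0, 29.0],
    "level_3": [32.0, 35.0, 31.0, 30.0, 32.0],
    "level_4": [38.0, 36.0, 40.0, 37.0, 39.0],
}

all_values = [x for scores in groups.values() for x in scores]
grand_mean = statistics.mean(all_values)
c = len(groups)                        # number of groups (columns)
n_total = len(all_values)              # total number of observations

# Between-groups sum of squares: spread of group means around the grand mean.
ss_between = sum(
    len(scores) * (statistics.mean(scores) - grand_mean) ** 2
    for scores in groups.values()
)
# Within-groups sum of squares: spread of scores around their own group mean.
ss_within = sum(
    (x - statistics.mean(scores)) ** 2
    for scores in groups.values()
    for x in scores
)

df_between = c - 1                     # numerator df
df_within = n_total - c                # denominator df
f_stat = (ss_between / df_between) / (ss_within / df_within)

F_CRIT_3_16 = 3.24                     # 5% critical value of F(3, 16)
reject_null = f_stat > F_CRIT_3_16

print(f"F({df_between}, {df_within}) = {f_stat:.2f}, reject H0: {reject_null}")
```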

Assignment - 20
Correlation Analysis : Spearman
Correlation is a bivariate analysis that measures the strength of association between two
variables.

The correlation coefficient varies between -1 and +1; a value closer to ±1 represents a
stronger relationship between the variables involved.

A correlation of -1.0 indicates a perfect negative correlation, and a correlation of 1.0
indicates a perfect positive correlation.

Measures of Correlation

Karl Pearson’s Coefficient of Correlation (r) –

 Is applicable to determine the strength of the linear correlation between two quantitative
variables at a time.
 It is used on interval or ratio scale variables.

Charles Spearman’s Rank Correlation (ρ) –

 Is applicable to determine the strength of correlation between two variables of a
qualitative nature that can be ranked.
 It is for ordinal (ranked) variables.

Three Categories of Correlation:

1. Simple Correlation – Two variables are involved at a time.
2. Partial Correlation – Studies the relationship between two variables while keeping a third
constant.
3. Multiple Correlation – A multiple correlation coefficient yields the maximum degree of linear
relationship that can be obtained between two or more independent variables and a single
dependent variable.

Assignment - 21
Correlation Analysis : Pearson

STEPS :
1. Go to Analyze.
2. Click on Correlate.
3. Click on Bivariate.
4. Bivariate Correlations dialog box will appear.
5. Place Height and Weight in the Variables box.
6. Under Correlation Coefficients, select Pearson and Spearman.
7. Select Flag significant correlations.
8. Click on OK.
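
Both coefficients can be computed by hand to see how they differ. In the sketch below the height and weight values are hypothetical, and the helper names (pearson_r, average_ranks, spearman_rho) are introduced only for this illustration. Spearman's coefficient is obtained by applying the Pearson formula to the ranks of the data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical height (inches) and weight (kg) values, illustrative only.
height = [60.0, 62.0, 64.0, 66.0, 68.0, 70.0]
weight = [50.0, 54.0, 53.0, 60.0, 63.0, 70.0]

r = pearson_r(height, weight)
rho = spearman_rho(height, weight)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Pearson's r uses the actual values, so it is sensitive to how far apart they are; Spearman's rho only uses their order.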

Assignment - 22
LINEAR REGRESSION ANALYSIS
Linear regression is a statistical technique that models the relationship between two variables.
It describes how an independent variable is associated with the dependent variable.

There are 3 assumptions of linear regression, which should be checked using SPSS.

Assumption- 1

No Presence of Outliers

 If outliers are present, the properties of the regression are not fulfilled.
 The minimum and maximum values of the standardized residuals must be within the range of
±3.29.

Assumption- 2

Independence of Observations

 The Durbin-Watson value must fall between 1 and 3 (1 < Durbin-Watson value < 3); a value
close to 2 indicates no autocorrelation in the residuals.

Assumption- 3

Normal Distribution of the Predicted Variable

 The independent variable is also called the Regressor, Predictor, Estimator or Cause.
 The dependent variable is also called the Effect, Regressed, Estimated or Predicted variable.
 The predicted variable is the dependent variable. Its data should be normally distributed.

Coefficient of Determination (r²)

With the help of the correlation coefficient, we can determine the coefficient of
determination. The coefficient of determination is the proportion of the variance in
the Y variable that can be explained by the X variable. Squaring the correlation
coefficient gives the coefficient of determination.

OUTPUT

STEPS :

1. Go to Analyze.
2. Click on Regression.
3. Click on Linear.
4. Linear Regression dialog box will appear.
5. Place Weight under Dependent and Height under Independent(s).
6. Click on Statistics.
7. Linear Regression: Statistics dialog box will appear.
8. Under Regression Coefficients, select Estimates and Confidence intervals; also select
Model fit and Descriptives.
9. Set the confidence interval Level (%) to 95.
10. Under Residuals, select Durbin-Watson and Casewise diagnostics.
11. Set Outliers outside to 3 standard deviations.
12. Click on Continue.
13. Click on Plots.
14. Linear Regression : Plots dialog box will appear.
15. Place ZRESID on Y and ZPRED on X.
16. Under Standardized Residual Plots, select Histogram and Normal probability plot.
17. Click on Continue.
18. Click on OK.
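
The main numbers in the regression output can be reproduced with a short least-squares sketch. Weight is regressed on Height as in the steps above, but the data are hypothetical. The standardized residual here is taken as the residual divided by the standard error of the estimate, which is an assumption about how ZRESID is scaled.

```python
import math

# Hypothetical height (inches) and weight (kg) values, illustrative only.
height = [60.0, 62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 74.0]   # independent (X)
weight = [52.0, 56.0, 59.0, 61.0, 64.0, 68.0, 73.0, 75.0]   # dependent (Y)

n = len(height)
mean_x, mean_y = sum(height) / n, sum(weight) / n
sxx = sum((x - mean_x) ** 2 for x in height)
syy = sum((y - mean_y) ** 2 for y in weight)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(height, weight))

# Least-squares line: weight = intercept + slope * height
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [y - (intercept + slope * x) for x, y in zip(height, weight)]
ss_res = sum(e ** 2 for e in residuals)

# Coefficient of determination, and the correlation coefficient it is the square of.
r_squared = 1 - ss_res / syy
r = sxy / math.sqrt(sxx * syy)

# Assumption 1: standardized residuals should stay within +/- 3.29.
std_error_estimate = math.sqrt(ss_res / (n - 2))
z_resid = [e / std_error_estimate for e in residuals]
no_outliers = all(abs(z) <= 3.29 for z in z_resid)

# Assumption 2: Durbin-Watson statistic should lie between 1 and 3.
durbin_watson = sum(
    (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)
) / ss_res
independent = 1 < durbin_watson < 3

print(f"weight = {intercept:.2f} + {slope:.3f} * height")
print(f"R square = {r_squared:.4f}, Durbin-Watson = {durbin_watson:.3f}")
print(f"no outliers: {no_outliers}, independent: {independent}")
```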

Assignment 23: How to apply correlation.

Step 1: Go to Analyze and click on Correlate. Then click on Bivariate.

Step 2: Specify the variables and check the Pearson correlation coefficient.

Step 3: Click on Options and select Means and standard deviations. Click on Continue and then
on OK.

Step 4: The output is shown below.

Step 5: Then check the Spearman correlation coefficient.

Step 6: The output is shown below.

Assignment 24: How to apply regression.

Step 1: Click on Regression and then on Linear.

Step 2: Specify the dependent and independent variables.

Step 3: Go to Statistics and check Estimates, Model fit and Descriptives.

Step 4: Click on Plots and specify the dependent and independent variables. Also check Histogram
and Normal probability plot (NPP).

Step 5: The output is shown below, with R Square as the regression value.
