SPSS
Statistical Package for Social Sciences
VERSIONS OF SPSS
SPSS Ver-1 to Ver-5: DOS versions
SPSS Ver-6 to Ver-15: Windows versions
SPSS-X: for mainframes (on various operating system platforms)
SPSS-LAN: for LANs
Web site: http://www.spss.com
BASIC APPLICATIONS
Creating data as a spreadsheet
Generating reports as tables
Statistical analysis of data
Graphic presentations

Creating data or getting data
Defining data
Modifying data
Processing data: generating tables, statistical analysis, generating graphs
Data Definition
Variable Name
Variable Type
Field Width
Decimal Positions
Variable Label
Value Labels
Missing Values
Column Width
Alignment
Scale

Variable Name
Maximum 8 characters (up to Ver 10)
First character must be a letter
Arithmetic operators, special symbols and blank spaces are not permitted
Two variables cannot have the same name in one data file
Variable Label
It helps in reading outputs. No restriction on characters.
Variable Type
Numeric (floating point)
String (character / text)
Date
Currency
Value Labels
It helps in reading tables and other outputs. For example, the variable Marital Status has five values (codes):
value 1 means Never Married
value 2 means Currently Married
value 3 means Widow/Widower
value 4 means Divorced
value 5 means Separated
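The idea of value labels can be sketched in Python as a lookup from code to label that is applied when a frequency table is printed. This is only an illustration of the concept, not how SPSS stores labels; the function name and the sample responses are invented for the example.

```python
from collections import Counter

# Value labels for the Marital Status variable, as listed above.
MARITAL_STATUS_LABELS = {
    1: "Never Married",
    2: "Currently Married",
    3: "Widow/Widower",
    4: "Divorced",
    5: "Separated",
}


def labelled_frequencies(codes, labels):
    """Count each code and report the count under its value label."""
    counts = Counter(codes)
    # Codes without a label are reported under the code itself.
    return {labels.get(code, str(code)): n for code, n in sorted(counts.items())}


# Hypothetical responses, invented for illustration only.
sample_codes = [2, 1, 2, 3, 2, 5, 1]
freq = labelled_frequencies(sample_codes, MARITAL_STATUS_LABELS)
print(freq)
```

The data file keeps the compact numeric codes; the labels only change how the output reads.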
Missing Values
These are values indicating No Response or Not Applicable in any variable. Declaring missing values tells the SPSS package to ignore the cases containing these values during analysis. A blank in an Excel or dBase/FoxPro file is treated as a missing value. In an SPSS data file, blanks appear as dots (.), denoting that these are missing values.
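A minimal Python sketch of the same idea, assuming a hypothetical coding scheme in which 9 is declared as No Response and None stands for a blank cell. The names and the scores are invented for the example.

```python
from statistics import mean

# Hypothetical coding scheme: 9 is declared as "No Response".
USER_MISSING = {9}


def valid_values(values, user_missing=USER_MISSING):
    """Drop blanks (None) and user-declared missing codes before analysis."""
    return [v for v in values if v is not None and v not in user_missing]


# Invented scores on a 1-5 scale; None stands for a blank cell.
scores = [4, 5, 9, None, 3, 4]
valid = valid_values(scores)
print(len(valid), mean(valid))
```

Without the declaration, the code 9 would be averaged in as if it were a real score and would inflate the mean.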
MANIPULATING FILES
Insert variable
Sort cases
Transpose - interchange rows and columns
Merge Files - add cases, add variables
Aggregate
Select cases - select with an if condition
Weight cases - for estimation / projection
VIEW
Status Bar - process, selection, weight, n of cases
Tool Bar - for data, syntax, chart, navigator (output)
Fonts - type, size
Grid Lines
Value Labels
Data Modifications
Compute - create a new variable in an existing data file through an arithmetic expression.
Recode - reorganize the values of a variable.
Rank cases
Auto recode
Create Time Series
Replace missing values
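Compute and Recode can be illustrated with a short Python sketch. The variable names, the age cut-offs and the three cases are assumptions made up for the example; they are not taken from any SPSS data file.

```python
# Hypothetical cases, invented for illustration.
cases = [
    {"age": 23, "income": 18.0, "expense": 12.5},
    {"age": 41, "income": 35.0, "expense": 20.0},
    {"age": 67, "income": 22.0, "expense": 19.0},
]


def recode_age(age):
    """Recode: collapse age in years into three ordered groups (assumed cut-offs)."""
    if age < 30:
        return 1  # under 30
    if age < 60:
        return 2  # 30 to 59
    return 3  # 60 and over


for case in cases:
    # Compute: a new variable built from an arithmetic expression.
    case["saving"] = case["income"] - case["expense"]
    # Recode: a new grouped variable derived from an existing one.
    case["age_group"] = recode_age(case["age"])

print([(c["saving"], c["age_group"]) for c in cases])
```

Writing the recoded values to a new variable, as here, keeps the original values available for later analysis.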
STATISTICAL PROCEDURES
OLAP Cubes (Online Analytical Processing Cubes)
Calculates univariate summary statistics within the categories of one or more categorical variables.
DESCRIPTIVE STATISTICS
Frequencies - one variable at a time with various univariate statistics.
Descriptives - univariate statistics.
Explore - studying the behaviour of variables.
Crosstabs - two-way, three-way.
Ratio Statistics
MEANS
Display mean and S.D. by groups.
One-sample t-test.
Two independent samples t-test.
Two related samples (paired samples) t-test.
One-way ANalysis Of VAriance (ANOVA) with post-hoc tests.
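As a rough illustration of what the two independent samples t-test computes, the sketch below works out the pooled-variance t statistic in plain Python. It assumes equal variances in the two groups and does not compute the significance level that SPSS reports; the function name and the two groups of scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances (equal-variance assumption).
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Invented scores for two groups.
group_1 = [12, 15, 11, 14, 13]
group_2 = [16, 18, 17, 15, 19]
t_stat, df = independent_t(group_1, group_2)
print(round(t_stat, 3), df)
```

The statistic would then be compared with the t distribution on df degrees of freedom to judge significance.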
LINEAR REGRESSION
Methods: Enter, Stepwise, Remove, Backward, Forward.
Regression Coefficients: Estimate, Standard Error, Standardized coefficients, Significance.
Residuals: Durbin-Watson test (for autocorrelation).
Save: Predicted values, Residuals etc.
Plot: Histogram, Normal Probability plot.
Others: Multicollinearity diagnostics, partial correlation, R-square change etc.
CORRELATIONS
Bivariate Correlations.
Partial Correlations.
Distances - similarities and dissimilarities.
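A bivariate (Pearson) correlation can be sketched in a few lines of Python. This only shows the coefficient itself, not the significance test; the function name and the data are invented for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```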
CLASSIFY
K-means Cluster
Hierarchical Cluster
Discriminant Analysis
DATA REDUCTION
Factor Analysis.
Correspondence Analysis.
Optimal Scaling - Homals, Princals, Overals.
FACTOR ANALYSIS
Methods: Principal Components, Principal Axis Factoring, Maximum Likelihood etc.
Criteria: Minimum eigenvalue, number of factors, number of iterations.
Rotation: Varimax, Quartimax, Equamax, Promax, Oblimin.
Display: Initial factor matrix, Rotated factor matrix.
Plot: Scree plot.
SCALES
Reliability Analysis - Alpha, Split-half, Guttman, Parallel.
Multidimensional Scaling (MDS)
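The Alpha model in Reliability Analysis refers to Cronbach's alpha. A minimal Python sketch of the standard formula follows, assuming complete data with one list of scores per item and the same respondents in the same order; the function name and the scores are invented.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per scale item."""
    k = len(items)
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


# Invented scores: three items answered by four respondents.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 4))
```

Values close to 1 indicate that the items vary together and so measure the scale consistently.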
NON-PARAMETRIC TESTS
Chi-square
Binomial
Runs test
One-sample K-S test
Two independent samples tests
Several independent samples tests
Two related samples tests
Several related samples tests
CHARTS
Bar, Line, Area, Pie, Hi-Low
Pareto Charts, Control Charts (Xbar, R, p, c)
Box Plot, Error Bar
Scatter Plot, Histogram, P-P Plot, Q-Q Plot, Sequence Charts
ROC Curve (Receiver Operating Characteristic)
Time Series: Autocorrelations, Spectral Plots, Cross-correlations
Types of Data
Nominal Data
A variable can be treated as nominal when its values represent categories with no intrinsic ranking; for example, the department of the company in which an employee works. Examples of nominal variables include region, zip code, and religious affiliation.
Ordinal Data
A variable can be treated as ordinal when its values represent categories with some intrinsic ranking; for example, levels of service satisfaction from highly dissatisfied to highly satisfied. Examples of ordinal variables include attitude scores representing degree of satisfaction or confidence and preference rating scores.
Scale Data
A variable can be treated as scale when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars.
Data Analysis
Simple Tabulation and Cross Tabulation
Univariate and Bivariate Analysis
Dependent and Independent Variables
First Stage Analysis - Simple Tabulation
Second Stage Analysis - Cross Tabulation
The Chi-square test for cross tabulation
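The chi-square test for a cross tabulation compares the observed cell counts with the counts expected if the row and column variables were independent. A minimal Python sketch of the statistic follows; it does not compute the significance level, and the function name and the 2 x 2 table are invented for the example.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Invented counts: rows are two groups, columns are two response categories.
observed = [
    [30, 20],
    [20, 30],
]
chi2, chi2_df = chi_square(observed)
print(chi2, chi2_df)
```

Here every expected count is 25, so each of the four cells contributes 1 to the statistic.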
Experimental Designs
Completely Randomized Design in a one-way ANOVA (single factor)
Randomized Block Design (single blocking factor)
Latin Square Design (two blocking factors)
Factorial Design with two or more factors
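For the completely randomized design, the one-way ANOVA F ratio divides the between-group mean square by the within-group mean square. The Python sketch below shows that arithmetic only; the function name and the three treatment groups are invented, and the blocked and factorial designs need additional terms that are not shown.

```python
from statistics import mean


def one_way_anova(groups):
    """F ratio with between-group and within-group degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Invented responses under three treatments.
treatments = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_ratio, df_between, df_within = one_way_anova(treatments)
print(f_ratio, df_between, df_within)
```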
Regression
Basically two approaches:
1. Trial-and-error approach (stepwise regression), suited to exploratory research
2. A preconceived approach
The output consists of the beta coefficients for all the independent variables in the model. It also gives the result of a t-test for the significance of each variable in the model, and the result of an F-test for the model as a whole. The coefficient of determination R² is the proportion of the total variance in Y explained by all the independent variables in the regression equation.
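To make R² and the overall F-test concrete, here is a minimal Python sketch for the simplest case of one independent variable. It is a simplification: a model with several independent variables needs matrix methods, which SPSS handles internally. The function name and the data are invented.

```python
def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with R-square and F."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Share of the total variation in y explained by the model.
    r_square = 1 - ss_resid / ss_total
    # F for the whole model: 1 and n - 2 degrees of freedom with one predictor.
    f_ratio = (ss_total - ss_resid) / (ss_resid / (n - 2))
    return slope, intercept, r_square, f_ratio


# Invented observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_square, f_ratio = simple_regression(x, y)
print(round(slope, 3), round(intercept, 3), round(r_square, 3), round(f_ratio, 3))
```

With a single predictor, the F ratio equals the square of the t statistic for the slope, so the two tests agree.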
Problem
A manufacturer and marketer of electric motors would like to build a regression model consisting of 5 or 6 independent variables to predict sales. Past data have been collected for 15 sales territories on sales and 6 independent variables. Build a regression model and recommend whether or not it should be used by the company.
Dependent variable
Y = Sales in Rs. lakh in the territory
Independent variables
X1 = Market potential in the territory
X2 = No. of dealers of the company in the territory
X3 = No. of sales people in the territory
X4 = Index of competitor activity on a 5-point scale (1 = low, 5 = high)
X5 = No. of service people in the territory
X6 = No. of existing customers in the territory
Factor Analysis
Used for data reduction.
There are two stages in factor analysis:
Factor extraction process
Rotation of principal components
THANK YOU