Working With Statistics Using Excel: K.V.S. Sarma Professor of Statistics Sri Venkateswara University Tirupati - 517 502
K.V.S. SARMA Professor of Statistics Sri Venkateswara University Tirupati 517 502 [email protected]
This is the age of Facts, Figures and Statistics. The common man is expected to read data from various sources and draw conclusions. Data is available everywhere; one has to wonder at the innumerable data sources around. Statistical tools help in converting Data into Information.
Statistics today
DATA
INFORMATION
Data does not mean numerical facts alone; it includes text, pictures as well as voice! Someone has to compile that large body of data and turn it into valuable information.
Statistics and Computers help in this process
District Level, State Level, Region Level, Country Level, Global Level
A database is a collection of records (or data files) combined and treated as a unit for information retrieval.
Statistical Databases!
The DATA should be converted into information (reports) by applying Data Analysis Tools
Establishing functional relationships between causes and effects
Computing the growth rates
Understanding the trends and making forecasts
...and many more!
Preparing a document stating the methodology and interpreting the results
How to do?
Reference to statistical books for formulae
Bypassing complex calculations and reporting the easy-to-do things alone!
A new health insurance scheme is introduced by a company for its employees. The management wishes to know the reaction of its employees to the new scheme. Opinions were collected from 50 employees on several aspects like
Age, Gender, Marital Status, Education level, Present arrangements for health check-up, Monthly income and Concept Rating.
Opinions were sought on a five-point scale (multiple choice; tick one only).
Item             Response                  Code
Age              actual years              (initially no coding)
Gender           Male                      M
                 Female                    F
Marital Status   Married                   M
                 Single                    S
Monthly income   Less than Rs.1000         1
                 Rs.1000 to Rs.2999        2
                 Rs.3000 to Rs.4999        3
                 Rs.5000 & above           4
Other multiple-choice items: codes 1 2 3 4
Analyze the Data!
Analysis is based on the questions for which the data is expected to provide answers.
Some questions:
Identify how many are interested in the new scheme and how many are either indifferent or not interested.
Cross tabulate them along Gender, Age, Education, Marital status etc.
Is there any relationship between the income level and the type of response?
Identify the factors influencing the adoption of the new scheme.
What else does the data say?
Data Entry - The First Step
Analysis with Software - The Second Step
The data collected from the field consists of filled-in questionnaires or sheets. Each sheet must have a serial number. The sheets should be converted into a data file for use in a computer. We can divide the work, make more than one file and assign the work to Data Entry Operators. The Data Entry Design should be well planned and be common for all operators. These data files can be pooled if necessary to make a project-data-file.
Taking data from book to computer
Data should be arranged as separate records, one for each individual (entity). The data should be numeric for carrying out any analysis. Names and other labels will not go in for analysis but can be used for reporting. Suitable coding should be defined before entering data in the computer.
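As an illustration of defining the coding before data entry, the sketch below turns one questionnaire into a coded record using the codes given for the insurance survey (M/F, M/S and income classes 1 to 4). The function names and the sample respondent are hypothetical; they are only there to show the idea.

```python
# Code book taken from the questionnaire; the sample respondent is hypothetical.
GENDER_CODES = {"Male": "M", "Female": "F"}
MARITAL_CODES = {"Married": "M", "Single": "S"}


def income_code(monthly_income_rs):
    """Return the income class code (1 to 4) for a monthly income in rupees."""
    if monthly_income_rs < 1000:
        return 1          # Less than Rs.1000
    if monthly_income_rs < 3000:
        return 2          # Rs.1000 to Rs.2999
    if monthly_income_rs < 5000:
        return 3          # Rs.3000 to Rs.4999
    return 4              # Rs.5000 & above


def code_record(sno, age, gender, marital_status, monthly_income_rs):
    """Build one coded record (one row of the data file) for one respondent."""
    return {
        "sno": sno,                              # serial number of the sheet
        "age": age,                              # actual years, no coding
        "gender": GENDER_CODES[gender],
        "marital": MARITAL_CODES[marital_status],
        "income": income_code(monthly_income_rs),
    }


record = code_record(1, 34, "Female", "Married", 3500)
print(record)
```

Keeping the code book in one place means every Data Entry Operator applies the same codes, so the separate files can be pooled later without recoding.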
Software for data entry and data analysis
Packages for Statistical Analysis
SPSS, SAS, MINITAB, SYSTAT
A VISIT TO EXCEL
Open Excel. On the title bar of the Excel window the file name appears as Microsoft Excel - Book1. It usually contains three sheets named Sheet1, Sheet2 and Sheet3. In Sheet1 start entering the data from cell A1. Reserve the first row for column headings like Sno, Age, Gender etc. Key in the data row wise or column wise (press the ENTER key after each entry). Save the file with a suitable name in a folder meant for this project.
Finding sums
Data sorting and filtering
Making one dimension tables
Cross tabulations
Creating different types of graphs
Making abstracts from worksheets
Changing the styles of presenting data
Linking Excel report to a document
Selecting a part of data
Sorting
Filtering
Column width
Cut, Copy & Paste
Auto Fill
Paste Special
Freeze Panes
Exporting Excel data to Word
A free package of simple statistical tools is available in Excel. It is called the Data Analysis Pak (the add-in is listed in Excel as the Analysis ToolPak). It provides for analyses like
Data Analysis Pak
Summary statistics
Comparison of groups
Correlations
Regression analysis
Statistical tests of hypothesis
...and many more
SNO  NAME           GENDER  CASTE  ENGLISH  MATHS  SCIENCE
1    RAJA. M        B       SC     60       27     45
2    ANITHA. R      G       SC     55       44     36
3    NEELIMA. K     G       ST     46       54     65
4    SIVARAJAN. A   B       OC     35       47     28
5    MUTHU. B       G       OC     20       46     35
6    GOPAL. R       B       OC     54       50     45
7    BEENA. A       G       BC     63       46     64
8    ACHUTAN. S     B       BC     54       52     65
9    PRADEEP. M     B       BC     35       40     54
10   PERUMAL. S     B       OC     25       36     45
11   VARADAN. D     B       OC     28       40     38
12   DIVYA. T       G       BC     64       56     37
13   VASUMATHI. D   G       BC     37       45     54
14   ANDAL. B       B       SC     63       44     36
15   JAYA. L        G       ST     56       52     63
16   RAMAN. N       B       BC     45       48     54
17   MUREGESH. M    B       ST     50       46     68
18   GANESH. L      B       ST     35       38     65
19   SASIKALA. R    G       BC     52       50     54
20   VALLI. M       G       SC     41       55     58
Soft Skill
You can make one-way and two-way frequency tables from an Excel sheet. Use the Data menu and select the Pivot Table and Chart sub menu. Follow the Wizard steps. You will get the required tables.
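For readers who want to see what the Pivot Table is counting, here is a minimal sketch in Python that builds the same one-way (GENDER) and two-way (GENDER by CASTE) frequency tables. The GENDER and CASTE values are copied from the 20-student sheet; the variable names are illustrative only, and this is not how Excel computes the tables internally.

```python
from collections import Counter

# GENDER and CASTE columns of the 20-student sheet, in serial-number order.
gender = "B G G B G B G B B B B G G B G B B B G G".split()
caste = "SC SC ST OC OC OC BC BC BC OC OC BC BC SC ST BC ST ST BC SC".split()

# One-way table: how many students fall in each gender category.
one_way = Counter(gender)

# Two-way table: how many students fall in each (gender, caste) cell.
two_way = Counter(zip(gender, caste))

castes = ["SC", "ST", "BC", "OC"]
print("GENDER", *castes, "TOTAL")
for g in ("B", "G"):
    row = [two_way[(g, c)] for c in castes]
    print(g, *row, sum(row))
print("TOTAL", *[sum(two_way[(g, c)] for g in ("B", "G")) for c in castes],
      len(gender))
```

The counting itself is trivial; the Pivot Table does the same thing for thousands of cases and lets you change the row and column fields without retyping anything.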
Can we do this with hand calculations if there are thousands of cases? Not impossible but difficult to do!
CERTAINLY !
ENGINEERING FUNCTIONS
STATISTICAL FUNCTIONS
DEMO FOLLOWS..
16.7 16.9 14.3 13.8 16.9 15.3 15.6 15.6 12.7 19.5 16.9 12.9
12.6 13.7 18.3 13.2 15.0 18.9 18.0 15.4 14.1 14.3 12.4 13.5
15.1 16.0 18.3 13.7 17.2 14.8 15.8 12.6 12.2 16.2 15.4 15.1
13.4 14.4 16.6 18.4 14.5 16.0 15.7 15.4 16.6 15.9 17.6 14.2
16.7 15.3 13.2 17.1 13.6 18.5 20.6 17.2 17.0 16.8 16.2 15.3
17.7 16.4 17.5 13.9 16.6 13.3 13.5 15.1 15.6 15.3 14.4 14.8
14.6 12.8 16.9 20.5 13.0 19.2 16.3 14.1 14.7 17.3 18.8 15.2
18.0 11.5 15.2 13.2 17.9 16.2 15.1 13.1 18.7 13.1 13.5 14.4
15.8 13.4 14.0 14.9 18.8 14.4 14.3 15.4 18.3 12.3 14.2 16.1
14.8 16.0 17.7 17.4 17.9 17.8 10.7 13.5 13.2 17.0 14.8 18.2
Class     Lower limit  Upper limit  Upper bound (BIN)  Freq
10 - 12   10           12.0         11.9               2
12 - 14   12           14.0         13.9               26
14 - 16   14           16.0         15.9               43
16 - 18   16           18.0         17.9               31
18 - 20   18           20.0         19.9               16
20 - 22   20           22.0         21.9               2
Total                                                  120
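The role of the upper bound (BIN) column can be sketched in a few lines of Python. The example assumes the usual Histogram convention that a value is counted in the first bin whose upper bound it does not exceed, with anything above the last bound going to an extra "More" cell. Only the first twelve readings of the data set are used here to keep the example short, so the counts are for that subset and not for all 120 values; the function name is illustrative.

```python
from bisect import bisect_left

# Upper bounds (BIN) of the six classes 10-12, 12-14, ..., 20-22.
bins = [11.9, 13.9, 15.9, 17.9, 19.9, 21.9]
labels = ["10 - 12", "12 - 14", "14 - 16", "16 - 18", "18 - 20", "20 - 22"]

# First twelve readings of the data set (a subset, for illustration only).
sample = [16.7, 16.9, 14.3, 13.8, 16.9, 15.3,
          15.6, 15.6, 12.7, 19.5, 16.9, 12.9]


def frequency_table(values, upper_bounds):
    """Count values per bin; the last count is the 'More' (overflow) cell."""
    counts = [0] * (len(upper_bounds) + 1)
    for v in values:
        # Index of the first upper bound that is >= v, so v <= bound.
        # Values below the first class also land in the first bin.
        counts[bisect_left(upper_bounds, v)] += 1
    return counts


counts = frequency_table(sample, bins)
for label, freq in zip(labels + ["More"], counts):
    print(label, freq)
```

Using 11.9, 13.9 and so on as bounds, rather than 12, 14, ..., puts a reading such as 14.0 into the class 14 - 16, which is what the class limits intend.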
ADVANCED FEATURES
The t-test
Sugali   Yanadi
20.43    17.7
22.51    21.4
18.99    20.7
20.49    19.3
23.12    21
25.63    17.9
18.08    18.6
20.63    18.5
22.55    18.2
22.43    20.3
22.77
23.23
t-test output
t-Test: Two-Sample Assuming Equal Variances
                               Sugali      Yanadi
Mean                           21.73833    19.36
Variance                       4.319215    1.898222
Observations                   12          10
Pooled Variance                3.229768
Hypothesized Mean Difference   0
df                             20
t Stat                         3.090767
P(T<=t) one-tail               0.002882
t Critical one-tail            1.724718
P(T<=t) two-tail               0.005764
t Critical two-tail            2.085962
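The main figures in this output can be reproduced by hand. The sketch below uses only Python's standard library to compute the pooled variance, the degrees of freedom and the t statistic for the Sugali and Yanadi samples; the function name is illustrative. The p-values are not recomputed (the standard library has no t distribution), so the decision is made by comparing |t| with the two-tail critical value printed in the output.

```python
from math import sqrt
from statistics import mean, variance

# The two samples from the t-test slide (12 and 10 observations).
sugali = [20.43, 22.51, 18.99, 20.49, 23.12, 25.63,
          18.08, 20.63, 22.55, 22.43, 22.77, 23.23]
yanadi = [17.7, 21.4, 20.7, 19.3, 21, 17.9, 18.6, 18.5, 18.2, 20.3]


def pooled_t_test(x, y):
    """Two-sample t statistic assuming equal variances.

    Returns (t statistic, degrees of freedom, pooled variance).
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # statistics.variance is the sample variance (divisor n - 1).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t_stat = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, df, pooled_var


t_stat, df, pooled_var = pooled_t_test(sugali, yanadi)
print("Pooled Variance", round(pooled_var, 6))
print("df", df)
print("t Stat", round(t_stat, 6))

# Two-tail critical value at the 5% level for df = 20, read off the output.
T_CRITICAL_TWO_TAIL = 2.085962
print("Reject equal means:", abs(t_stat) > T_CRITICAL_TWO_TAIL)
```

Since the t statistic exceeds the two-tail critical value, the hypothesis of equal means is rejected at the 5% level, in line with the two-tail p-value of 0.005764.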
p-p Plot
Thank you