Exer 03
Descriptive Statistics
You should now be able to load a data file into R using R-commander and have some idea about entering your own data into a spreadsheet package and saving it in a format that can then be loaded into R. This exercise introduces descriptive statistics.
Exercise Objectives:
To appropriately describe variables using descriptive statistics.
To use R to compute descriptive statistics.
To investigate using graphs to describe data.
The data set faces.txt (available on RGSweb) shows the results of an experiment in which composite pictures were made out of a number of photographs. Two sets of pictures were made: one set which blended 4 photographs together and one set which blended 32 photographs together. A large group of people were then asked to rate the pictures for attractiveness. These ratings were then averaged, providing numeric data, with higher scores indicating higher attractiveness.

The research question under investigation is whether more normal faces are more or less attractive than less normal faces. The pictures composed of 32 photographs can be assumed to be more normal than the pictures composed of only 4 photographs.

Download the data set faces.txt and load it into R. The variables are group and attractiveness. group is an unordered categorical variable and attractiveness is a numeric variable.

For the variable `group' provide the following statistics: mode, median, mean
For the variable `attractiveness' provide the following statistics: mode, median, mean, variance, standard deviation, range
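The statistics listed above can also be computed by typing commands at the R console. The following is only a minimal sketch: the data frame `toy` and its values are invented for illustration and are not the faces.txt data. With the real data you would use whatever name you gave the data set when loading it.

```r
# Hypothetical toy data, NOT the real faces.txt values
toy <- data.frame(
  group = factor(c("four", "four", "four",
                   "thirtytwo", "thirtytwo", "thirtytwo")),
  attractiveness = c(2.0, 2.5, 2.5, 3.0, 3.5, 4.0)
)

x <- toy$attractiveness

x_mean   <- mean(x)     # arithmetic mean
x_median <- median(x)   # middle value of the sorted data
x_var    <- var(x)      # sample variance (divides by n - 1)
x_sd     <- sd(x)       # square root of the sample variance
x_range  <- range(x)    # c(minimum, maximum)

# table() counts how often each category of a factor occurs
group_counts <- table(toy$group)
```

Note that range() returns the smallest and largest values rather than their difference; diff(range(x)) gives the difference if that is what you want to report.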
Hint: Many of the statistics you need can be computed in Rcmdr using the following menu commands:
Statistics > Summaries > Active data set

In order to compute the mode for a numeric variable, we need to see which number is the most frequent. This is relatively trivial information, but it can be computed in R if we first change the numeric variable attractiveness into a categorical variable (a factor) and then ask the programme which category is the most frequent.

First, save the variable attractiveness as a categorical variable using the commands:
Data > Manage variables in active data set > Convert numeric variables to factors...
Variables (pick one or more): choose attractiveness
Factor levels: use numbers
New variable name...: attractiveness.cat
OK

A new variable has now been added to the data set under the heading `attractiveness.cat'. You can see this by looking at the data using the View data set button in Rcmdr.
You can now find the most frequent category by using the commands:
Statistics > Summaries > Frequency distributions...
or by plotting a bar graph (Graphs > Bar graph...), in which the mode is the category with the highest bar.
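The same idea (convert the numeric variable to a factor, then find the most frequent category) can be sketched at the R console. This is an illustration under assumptions, not the Rcmdr procedure itself: the vector `ratings` holds made-up values, not the faces.txt data.

```r
# Hypothetical ratings, for illustration only
ratings <- c(2.0, 2.5, 2.5, 3.0, 3.5, 2.5, 3.0)

ratings_cat <- as.factor(ratings)  # each distinct value becomes a level
freq <- table(ratings_cat)         # frequency of each level

# names() gives the level as text, so convert it back to a number.
# which.max() returns only the first maximum if there are ties.
mode_value <- as.numeric(names(freq)[which.max(freq)])
```

If two or more values are tied for the highest frequency, the data have more than one mode, and it is worth inspecting the whole frequency table rather than relying on which.max() alone.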
Statistics for Individual Groups

The statistics used so far have just described the individual variables. In order to investigate the research question, we need to look at the descriptive statistics for the numeric variable attractiveness within each group of photographs. This can be achieved using the following commands:
Statistics > Summaries > Numerical summaries
Summarize by groups: group
OK
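Group-wise summaries can also be sketched at the R console with tapply(), which applies a function to a numeric variable separately for each level of a factor. The data frame `toy` below is hypothetical and its values are not taken from faces.txt.

```r
# Hypothetical toy data, NOT the real faces.txt values
toy <- data.frame(
  group = factor(c("four", "four", "four",
                   "thirtytwo", "thirtytwo", "thirtytwo")),
  attractiveness = c(2.0, 2.5, 3.0, 3.0, 3.5, 4.0)
)

# Apply mean() and sd() to attractiveness within each level of group
group_means <- tapply(toy$attractiveness, toy$group, mean)
group_sds   <- tapply(toy$attractiveness, toy$group, sd)
```

Each result is indexed by group name, so group_means[["four"]] picks out the mean of the 4-photograph group in this toy example.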
Compare the means and standard deviations for the two groups.
Illustrate the differences between the groups using a box plot:
Graphs > Boxplot
Plot by groups: group
OK
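A grouped box plot can also be drawn from the R console using the formula interface of boxplot(). The sketch below uses a hypothetical data frame `toy` with invented values, not the faces.txt data.

```r
# Hypothetical toy data, NOT the real faces.txt values
toy <- data.frame(
  group = factor(c("four", "four", "four",
                   "thirtytwo", "thirtytwo", "thirtytwo")),
  attractiveness = c(2.0, 2.5, 3.0, 3.0, 3.5, 4.0)
)

# attractiveness ~ group means "attractiveness split by group".
# boxplot() draws the plot and invisibly returns the statistics it used.
b <- boxplot(attractiveness ~ group, data = toy,
             xlab = "group", ylab = "attractiveness")

group_medians <- b$stats[3, ]  # row 3 of stats holds the medians
```

The thick line in each box is the group median, so the plot gives a quick visual comparison of the centre and spread of the two groups.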
If you feel brave, you might like to draw a back-to-back histogram. First load the package Hmisc (if you don't have it, install it). Then copy and paste the following commands into the R console:
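# histbackback() comes from the Hmisc package
library(Hmisc)
# draw the back-to-back histogram, one side per group
out <- histbackback(split(attractiveness, group), probability=TRUE)
# colour the left-hand and right-hand bars
barplot(-out$left, col="red", horiz=TRUE, space=0, add=TRUE, axes=FALSE)
barplot(out$right, col="blue", horiz=TRUE, space=0, add=TRUE, axes=FALSE)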
For full information about the back-to-back histogram, please look on the web at:
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=136

What do you conclude about the attractiveness of faces on the basis of your analysis? How are the groups distinguished? Are more normal faces more or less attractive?