
Statistics in Psychology (3)

Uploaded by

jahnavi.dubey
Copyright
© © All Rights Reserved

Statistics in Psychology

Question Bank (each question carries 10 marks)


1. What is Statistics? Explain the key terms in statistics.
2. Write a note on descriptive statistics.
3. What are the measures of central tendency?
4. What are the measures of variability?
5. Write a note on the correlation coefficient. Why does correlation not mean causation?

Introduction:
Psychology is the field of study concerned with human behaviour and the mind. It includes the
study of conscious and unconscious phenomena, as well as feeling and thought. If you take a
minute to ponder the terms you just read (‘behaviour’, ‘mind’, ‘conscious’,
‘unconscious’, ‘feeling’, ‘thought’), you will realize that all of them are imaginary
constructs that human beings have made up. You cannot point to any part of the brain and
say that behaviour or mind originates there. What does physically exist is the human brain,
and its workings are so complex that entire disciplines are dedicated to it, e.g.,
‘Neuroscience’ and ‘Cognitive Science’.
Coming back to psychology, experts in various scientific fields have raised questions
about its scientific rigor. Alex Berezow, a microbiologist, has argued that
psychology is not a real science. His arguments cannot be entirely ignored. At the heart of
his case is psychology’s lack of quantifiability and its dearth of precise terminology. He
points to research in areas like happiness, where definitions are neither rigid nor objective
and the data are not easily quantified. Here is what he has to say, in his own words:
“Happiness research is a great example of why psychology isn't science.
How exactly should "happiness" be defined? The meaning of that word
differs from person to person and especially between cultures. What
makes Americans happy doesn't necessarily make Chinese people happy.
How does one measure happiness? Psychologists can't use a ruler or a
microscope, so they invent an arbitrary scale. Today, personally, I'm
feeling about a 3.7 out of 5. How about you?”

Consider the terms we saw earlier: how do you scientifically define ‘behaviour’,
‘mind’, ‘conscious’, ‘unconscious’, ‘feeling’ or ‘thought’ when they do not exist in the
physical dimension? They are states of being, and that is where Berezow’s argument comes from.
Look at the other sciences (microbiology, zoology, botany, physics, chemistry): they are
based on physical dimensions and objects that can be measured objectively, with errors that
are small and well understood. Can we say the same about psychology, or for that matter any of the
social sciences, be it sociology, history, economics, etc.? Their subject matter cannot be
measured with that kind of precision.
However, all is not lost. There is no need to panic and switch over to a BSc or BCom. The
social sciences are sciences, and evidence of that is the Nobel Prizes that recognise and
reward scientific research in these fields. The social sciences use statistics to make their
fields scientific. Statistics not only gives the social sciences a way to quantify
human behaviour objectively but also solves another big problem.
The next big problem of the social sciences is the subject of research itself. Suppose, for
example, you wanted to do research on ‘happiness’. To come up with answers that can be
generalised to everyone, you would have to study each and every human being alive on earth.
Sciences like microbiology and physics do not face this problem. In
microbiology, an identified organism has largely consistent features throughout its species. In
physics, an electron being studied has the same characteristics as all other electrons. The
point is that in these sciences the subject of the research is not changing; it is constant.
That is not the case for human behaviour. Human behaviour is highly variable: not
only does it change from person to person, it also changes within each person in a
matter of seconds. So how can you draw generalisable and accurate conclusions about
human behaviour? The solution is statistics.
Statistics is the systematic study of samples. Instead of studying human behaviour in
its entirety, we draw a sample of behaviours and apply various statistical techniques to it.
Technically, we define the entirety of the human behaviours being measured as the “population”.
In statistical terms, a population is the entire collection of events of interest. So, if the subject of the
study is happiness, all human beings who have experienced happiness are part of the
population. If the subject of your study is depression, all human beings who have experienced
depression are part of the population. Whatever the subject of the
research, all events (participants) directly related to your subject make up the
population. Populations can be of any size. They can range from a
relatively small set of numbers, which can be collected easily, to a large but finite set of
numbers, which would be impractical to collect in its entirety. Unfortunately, in
psychology we are usually interested in very large populations. The
practical consequence is that we seldom, if ever, measure entire populations.
Instead, we are forced to draw only a sample of observations from that population and
use that sample to infer something about the characteristics of the population. But we do not
draw just any odd sample. We would like to draw a random sample. A truly random sample
follows a particular set of procedures to ensure that each and every element (event) in the
population has an equal chance of being selected. The simplest example is drawing a
winner in a lottery: each and every ticket has an equal chance of being pulled
out. In practice, however, it is not always possible to
obtain a truly random sample. One reason is that not every randomly chosen person agrees to take
part in your research or experiment, and it is painstakingly difficult to get a sample of 1000 or
even 100 participants that is truly random. Thus, researchers often settle for non-random
samples (one of the reasons why Berezow calls psychology research non-valid). Non-random
samples may not be representative of the population and can give misleading results, so it is
important to make samples as random as possible.
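The lottery idea can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 numbered participants, the sample size of 100, and the fixed seed are all assumptions made up for the example, not values from any study.

```python
import random

# Hypothetical population: 1,000 participants identified by number.
population = list(range(1, 1001))

# A seeded generator makes the illustration repeatable; a real study
# would not normally fix the seed.
rng = random.Random(42)

# Simple random sampling without replacement: every participant has
# the same chance of being drawn, like tickets in a lottery.
sample = rng.sample(population, k=100)

print(len(sample))        # number of participants drawn
print(len(set(sample)))   # same number, so nobody was drawn twice
```

Note that the code can only guarantee equal chances of being drawn. It cannot make the people who were drawn agree to take part, which is where real samples usually stop being truly random.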
After you have collected your sample, you want to know the characteristics of the
data. The first thing you do is look at the frequencies in the data. A frequency is the number of
occurrences of each data point, or of each group of data points. A table of frequencies is called a
frequency table.
Number of glasses per day    Number of people (frequency)
1                            0
2                            1
3                            2
4                            4
5                            5
6                            6
7                            5
8                            4
9                            2
10                           1
Frequency Table 1

Number of glasses per day    Number of people (frequency)
1–2                          1
3–4                          6
5–6                          11
7–8                          9
9–10                         3
Frequency Table 2
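As a rough sketch of how such tables are built, the Python below starts from the ungrouped frequencies of Frequency Table 1, expands them into one observation per person, and then counts them again in non-overlapping intervals of width 2. The variable names and the choice of interval width are assumptions made for illustration.

```python
from collections import Counter

# Frequencies copied from Frequency Table 1 (glasses per day -> people).
table_1 = {1: 0, 2: 1, 3: 2, 4: 4, 5: 5, 6: 6, 7: 5, 8: 4, 9: 2, 10: 1}

# Expand the table back into one raw observation per person.
observations = [glasses for glasses, people in table_1.items()
                for _ in range(people)]

# Counting the raw observations reproduces the ungrouped frequencies.
frequencies = Counter(observations)

# Group adjacent values into intervals of width 2: 1-2, 3-4, ..., 9-10.
grouped = Counter()
for glasses in observations:
    low = glasses - (glasses - 1) % 2      # lower edge of the interval
    grouped[(low, low + 1)] += 1

for (low, high), people in sorted(grouped.items()):
    print(f"{low}-{high}: {people}")
```

Because every observation falls into exactly one interval, the grouped frequencies add up to the same total (30 people) as the ungrouped ones.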
The same data can also be depicted in a graphical format. This is called a Histogram.

Histogram 1.
As you can see, a histogram is very easy to interpret at a glance. Presenting data
graphically makes them more intelligible. A histogram, or frequency distribution, is a way of
organizing data in a logical order. When the sample is large, it is better
to group adjacent values together into intervals. This makes large data sets more
meaningful while preserving the important trends in the data. The length of an interval, i.e.,
the size of each group of data, is left to the discretion of the data analyst. Typically, you
would try different interval lengths until the data look meaningful to you. A common
recommendation is to use around 10 to 12 intervals in total, but there are no hard
and fast rules.
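To get a feel for how the interval length changes the picture, here is a minimal text-histogram sketch in Python. The function name `text_histogram` and the three interval widths tried are assumptions for illustration; the frequencies are those of Frequency Table 1.

```python
from collections import Counter

# Frequencies from Frequency Table 1 (glasses per day -> people).
table_1 = {1: 0, 2: 1, 3: 2, 4: 4, 5: 5, 6: 6, 7: 5, 8: 4, 9: 2, 10: 1}

def text_histogram(frequencies, width):
    """Return one text line per interval of the given width."""
    grouped = Counter()
    for value, count in frequencies.items():
        low = ((value - 1) // width) * width + 1   # lower edge of interval
        grouped[low] += count
    lines = []
    for low in sorted(grouped):
        label = str(low) if width == 1 else f"{low}-{low + width - 1}"
        lines.append(f"{label:>5} | " + "#" * grouped[low])
    return lines

# Try several interval widths and compare the resulting shapes.
for width in (1, 2, 5):
    print(f"interval width = {width}")
    print("\n".join(text_histogram(table_1, width)))
```

With a width of 1 every value gets its own bar; with a width of 5 only two bars remain and most of the shape is lost, which is why the analyst has to experiment.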
Frequency Distributions:
A frequency distribution is a table or graph that shows how often different numbers,
or scores, appear in a particular set of scores (see Frequency Tables 1 and 2 and Histogram 1). Tables
can be useful, especially when dealing with small sets of data. Sometimes a more visual
presentation gives a better “picture” of the patterns in a data set, and that is when researchers
use graphs to plot the data from a frequency distribution. The most common such graph is the
histogram.
The Normal Curve: Frequency polygons (line graphs of a frequency distribution) allow
researchers to see the shape of a set of data easily. A common frequency distribution of this
type is called the normal curve. It has a very specific shape and is sometimes called the bell curve.
The normal curve is used as a model for many things that are measured, such as intelligence,
height, or weight, but even those measures only come close to a perfect distribution (provided
large numbers of people are measured). One of the reasons that the normal curve is so useful is
that it has very specific relationships to measures of central tendency and to a measurement of
variability known as the standard deviation.
Skewness: Distributions aren’t always normal in shape. Some distributions are described as
skewed. This occurs when the distribution is not even on both sides of a central score with the
highest frequency. Skewed distributions are called positively or negatively skewed,
depending on where the scores are concentrated. A concentration in the high end would be
called negatively skewed. A concentration in the low end would be called positively skewed.
The direction of the extended tail determines whether it is positively (tail to right) or
negatively (tail to left) skewed.

Descriptive Statistics
Descriptive statistics are a way of organizing numbers and summarizing them so that they can
be understood. There are two main types of descriptive statistics:
• Measures of Central Tendency. Measures of central tendency are used to summarize
the data and give you one score that seems typical of your sample.
• Measures of Variability. Measures of variability are used to indicate how spread out
the data are. Are they tightly packed or are they widely dispersed?
The actual descriptive statistics are best understood in light of the concept of a
frequency distribution. One way psychologists get started in a research project is to look at
their data, but just looking at a list of numbers wouldn’t do much good. So, we make a
histogram.

Measures of Central Tendency


A frequency distribution is a good way to look at a set of numbers, but there’s still
a lot to look at. Isn’t there some way to sum it all up? One way to sum up numerical data is to
find out what a “typical” score might be, or some central number around which all the others
seem to fall. This kind of summation is called a measure of central tendency, or the number
that best represents the central part of a frequency distribution. There are three different
measures of central tendency: the mean, the median, and the mode.
Mean: The most commonly used measure of central tendency is the mean, the
arithmetic average of a distribution of numbers. That simply indicates that you add up all the
numbers in a particular set and then divide the total by how many numbers there are. This is
usually the way teachers get the grade point average for a particular student, for example. If
Warren’s grades on the tests he has taken so far are 86, 92, 87, and 90, then the teacher would
add 86 + 92 + 87 + 90 = 355, and then divide 355 by 4 (the number of scores) to get the mean,
or grade point average, of 88.75. Here is the formula for the mean:
\[ \text{Mean} = \frac{\Sigma X}{N} \]
What does this mean?
• Σ is a symbol called sigma. It is a Greek letter and it is also called the summation
sign.
• X represents a score. Warren’s grades are represented by X.
• ΣX means add up or sum all the X scores, or ΣX = 86 + 92 + 87 + 90 = 355.
• N means the number of scores. In this case, there are four grades.
• We then divide the sum of the scores (ΣX) by N to get the mean, or
\[ \text{Mean} = \frac{\Sigma X}{N} = \frac{355}{4} = 88.75 \]
The mean is a good way to find a central tendency if the set of scores clusters around the
mean with no extremely different scores that are either far higher or far lower than the mean.
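The calculation for Warren’s grades can be checked with a short Python sketch. The extra grade of 20 in the second half is a made-up value, added only to show how one extreme score drags the mean away from the typical grade.

```python
# Warren's four test grades from the example above.
grades = [86, 92, 87, 90]

total = sum(grades)        # sum of all X scores: 355
n = len(grades)            # N = 4
mean = total / n           # 355 / 4 = 88.75

# One extreme score pulls the mean away from the typical grade.
# The extra grade of 20 is a hypothetical value for illustration.
mean_with_outlier = sum(grades + [20]) / (n + 1)   # 375 / 5 = 75.0

print(mean, mean_with_outlier)
```

None of Warren’s real grades is anywhere near 75, which is exactly the situation in which the mean stops being a good summary.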
You may hear or read about a concept called “regression to the mean.” This is
a concept that describes the tendency for measurements of a variable to even out over
the course of the measurements (Stigler, 1997). If a measurement is fairly high at first,
subsequent measurements will tend to be closer to the mean, the average measurement,
for example. This is one of the reasons that researchers want to replicate measurements
many times rather than relying on the first results, which could cause them to draw
incorrect conclusions from the data.
Median: The mean doesn’t work as well when there are extreme scores, as you
would have if only two students out of an entire class had a perfect score of 100 and
everyone else scored in the 70s or lower. If you want a truer measure of central tendency in
such a case, you need one that isn’t affected by extreme scores. The median is just such a
measure. A median is the score that falls in the middle of an ordered distribution of
scores. Half of the scores will fall above the median, and half of the scores will fall below it.
If the distribution contains an odd number of scores, it’s just the middle number, but if
the number of scores is even, it’s the average of the two middle scores. The median is also
the 50th percentile.
Think about measures of income in a particular area. If most people earn around
$35,000 per year in a particular area, but there are just a few extremely wealthy people in the
same area who earn $1,000,000 a year, a mean of all the annual incomes would no doubt
make the area look like it was doing much better than it really is economically. The median
would be a more accurate measure of the central tendency of such data.
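The income example can be made concrete with a small sketch. The ten incomes below are invented for illustration (nine residents near $35,000 and one very wealthy resident); only the general pattern comes from the text.

```python
import statistics

# Hypothetical annual incomes: nine residents near $35,000 and one
# very wealthy resident. The individual figures are invented.
incomes = [31_000, 33_000, 34_000, 35_000, 35_000,
           35_000, 36_000, 37_000, 39_000, 1_000_000]

mean_income = statistics.mean(incomes)       # 1,315,000 / 10 = 131,500
median_income = statistics.median(incomes)   # average of the two middle values

print(mean_income)     # far above what almost everyone earns
print(median_income)   # 35,000, the typical income
```

Because the number of incomes is even, the median here is the average of the fifth and sixth values once the list is sorted, and both of those are 35,000.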
Name   Allison  Ben  Carol  Denise  Evan  Fethia  George  Hal  Inga  Jay
IQ     160      150  139    102     102   100     100     100  98    95
Mode: The mode is another measure of central tendency, in which the most frequent score is
taken as the central measure. In the numbers given in the table above, the mode would be 100
because that number appears more times in the distribution than any other. Three people have
that score. This is the simplest measure of central tendency and is also more useful than the
mean in some cases, especially when there are two sets of frequently appearing scores. For
example, suppose a teacher notices that on the last exam the scores fall into two groups, with
about 15 students making a 95 and another 14 students making a 67. The mean and the median
would probably give a number somewhere between those two scores, such as 80. That
number tells the teacher a lot less about the distribution of scores than the mode would
because, in this case, the distribution is bimodal: there are two very different yet very
frequent scores.
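Both examples can be tried out in Python. The sketch below assumes, for simplicity, that the exam was taken by exactly 15 students who scored 95 and 14 who scored 67 and by nobody else; the text only says “about”, so the exact class list is an assumption.

```python
import statistics
from collections import Counter

# IQ scores from the table above.
iq_scores = [160, 150, 139, 102, 102, 100, 100, 100, 98, 95]
iq_mode = statistics.mode(iq_scores)            # 100, which appears three times

# Assumed exam results: exactly 15 scores of 95 and 14 scores of 67.
exam_scores = [95] * 15 + [67] * 14
exam_mean = statistics.mean(exam_scores)        # about 81.5, a score nobody got
top_two = Counter(exam_scores).most_common(2)   # the two frequent scores

print(iq_mode)
print(round(exam_mean, 1))
print(top_two)   # [(95, 15), (67, 14)]
```

Listing the most frequent scores together with their counts shows both peaks at once, which a single average cannot do.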

Measures of central tendency and the shape of the distribution: When the distribution is
normal or close to it, the mean, median, and mode are the same or very similar. There is no
problem. When the distribution is not normal, then the situation requires a little
more explanation.
Skewed Distributions: If the distribution is skewed, then the mean is pulled in the direction of
the tail of the distribution. The mode is still the highest point, and the median is between the
two. Let’s look at an example. In the figure to the right we have a distribution of salaries at a
company. A few people make a low wage, most make a mid-level wage, and the bosses make a lot
of money. This gives us a positively skewed distribution with the measures of central tendency
placed as in the figure. As mentioned earlier, with such a distribution, the median would be the
best measure of central tendency to report. If the distribution were negatively
skewed (tail to the left), the order of the measures of central tendency would be reversed.
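The ordering of the three measures can be seen in a small sketch. The nine salaries (in thousands) are invented for illustration, and the mirrored list is simply those salaries flipped around to produce a tail on the left.

```python
import statistics

def central_tendency(scores):
    """Return (mean, median, mode) for a list of scores."""
    return (statistics.mean(scores),
            statistics.median(scores),
            statistics.mode(scores))

# Hypothetical salaries in thousands: most are mid-level, one boss earns far more.
salaries = [20, 30, 30, 30, 35, 40, 45, 60, 150]
mean, median, mode = central_tendency(salaries)

# Positively skewed (tail to the right): mode < median < mean.
print(mode, median, round(mean, 1))

# Flipping the values produces a tail to the left, and the order reverses.
mirrored = [170 - s for s in salaries]
m_mean, m_median, m_mode = central_tendency(mirrored)
print(round(m_mean, 1), m_median, m_mode)
```

In both cases the mean sits furthest out toward the tail, which is why the median is usually the safer number to report for skewed data.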
Bimodal Distributions: If you have a bimodal distribution, then none of the measures of
central tendency will do you much good. You need to discover why you appear to have two
groups in your one distribution.

Measures of Variability:
Descriptive statistics can also determine how much the scores in a distribution
differ, or vary, from the central tendency of the data. These measures of variability are
used to discover how “spread out” the scores are from each other. The more the scores cluster
around the central scores, the smaller the measure of variability will be, and the more widely
the scores differ from the central scores, the larger this measurement will be.
There are two common ways that variability is measured. The simpler method is by
calculating the range of the set of scores, or the difference between the highest score and the
lowest score in the set of scores. The range is somewhat limited as a measure of
variability when there are extreme scores in the distribution. For example, if you look at the
table of IQ scores above, the range of those scores would be 160 – 95, or 65. But if you just look
at the numbers, you can see that there really isn’t that much variation except for the three
highest scores of 139, 150, and 160.
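A quick sketch shows how much of that range is due to the three highest scores. Dropping them is done here purely for comparison; it is not a recommended way of handling data.

```python
# IQ scores from the table above.
iq_scores = [160, 150, 139, 102, 102, 100, 100, 100, 98, 95]

score_range = max(iq_scores) - min(iq_scores)   # 160 - 95 = 65

# For comparison only: the range of the seven lowest scores.
rest = sorted(iq_scores)[:-3]
rest_range = max(rest) - min(rest)              # 102 - 95 = 7

print(score_range, rest_range)
```

Seven of the ten scores lie within 7 points of each other, yet the range reports 65, because it looks only at the two most extreme values.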
The other measure of variability that is commonly used is the one that is related to the
normal curve, the standard deviation. This measurement is simply the square root of the
average squared difference, or deviation, of the scores from the mean of the distribution. The
mathematical formula for finding the standard deviation looks complicated, but it is really
nothing more than taking each individual score, subtracting the mean from it, squaring that
number (because some numbers will be negative and squaring them gets rid of the negative
value), and adding up all of those squares. Then this total is divided by the number of scores,
and the square root of that number is the standard deviation. Written as a formula, with X for
each score, M for the mean, and N for the number of scores:

Standard Deviation Formula
\[ SD = \sqrt{\frac{\Sigma (X - M)^2}{N}} \]
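For the ten IQ scores in the table above, the steps can be followed one at a time in Python. This sketch divides by N, as the description in the text does; the variable names are assumptions. Note that many statistics packages divide by N − 1 when the scores are treated as a sample, which gives a slightly larger value.

```python
import math
import statistics

# IQ scores from the table above.
iq_scores = [160, 150, 139, 102, 102, 100, 100, 100, 98, 95]

n = len(iq_scores)                                    # N = 10
mean = sum(iq_scores) / n                             # M = 1146 / 10 = 114.6

# Subtract the mean from each score and square the result.
squared_deviations = [(x - mean) ** 2 for x in iq_scores]

variance = sum(squared_deviations) / n                # about 552.64
sd = math.sqrt(variance)                              # about 23.5

# Cross-check with the standard library's population standard deviation.
print(round(sd, 2), round(statistics.pstdev(iq_scores), 2))
```

The standard deviation of about 23.5 is much larger than the typical gap between the lower seven scores, again because of the three very high scores.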
How does the standard deviation relate to the normal curve? Let’s look at the classic
distribution of IQ scores. It has a mean of 100 and a standard deviation of 15, as set up by the
test designers. It is a bell curve. With a true normal curve, researchers know exactly what
percentage of the population lies under the curve between each standard deviation from the
mean. For example, notice in the percentages in the figure to the right that the section from the
mean to one standard deviation above it holds 34.13 percent of the population represented by the
graph. These are the scores between the IQs of 100 and 115. The section from the mean to one
standard deviation below it (−1) has exactly the same percentage, 34.13: the scores between 85
and 100. This means that 68.26 percent of the population falls within one standard deviation
from the mean, or one average “spread” from the center of the distribution. For example,
“giftedness” is normally defined as having an IQ score that is two standard deviations above the
mean. On the Wechsler Intelligence Scales, this means having an IQ of 130 or greater because
the Wechsler’s standard deviation is 15. But if the test a person took to determine giftedness
was the Stanford-Binet Fourth Edition (the previous version of the test), the IQ score must have
been 132 or greater because the standard deviation of that test was 16, not 15. The current
version, the Stanford-Binet Fifth Edition, was published in 2003, and it now has a mean of 100
and a standard deviation of 15 for composite scores.
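These percentages can be reproduced with Python’s built-in normal distribution. This is a sketch that assumes a perfectly normal curve with mean 100 and standard deviation 15; the computed one-standard-deviation figure differs from 68.26 in the last digit only because the text adds two already-rounded halves.

```python
from statistics import NormalDist

# Idealised IQ distribution: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Share of the population between the mean and one SD above it.
between_100_and_115 = iq.cdf(115) - iq.cdf(100)    # about 0.3413

# Share within one SD on either side of the mean.
within_one_sd = iq.cdf(115) - iq.cdf(85)           # about 0.6827

# "Two standard deviations above the mean" on tests with different SDs.
gifted_cutoff_sd15 = 100 + 2 * 15                  # 130
gifted_cutoff_sd16 = 100 + 2 * 16                  # 132

print(round(between_100_and_115, 4), round(within_one_sd, 4))
print(gifted_cutoff_sd15, gifted_cutoff_sd16)
```

The same cut-off rule gives 130 on a test with a standard deviation of 15 and 132 on one with a standard deviation of 16, which is all that the Wechsler and Stanford-Binet example is saying.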
Although the “tails” of this normal curve seem to touch the bottom of the graph, in
theory they go on indefinitely, never touching the base of the graph. In practice, though, any
measurement that forms a normal curve will have about 99.7 percent of the population it
measures falling within three standard deviations above or below the mean. Because
this relationship between the standard deviation and the normal curve does not change, it is
always possible to compare different test scores or sets of data that come close to a normal
curve distribution. This is done by computing a z score, which indicates how many standard
deviations you are away from the mean.
It is calculated by subtracting the mean from your score and dividing by the
standard deviation. For example, if you had an IQ of 115, your z score would be 1.0. If you
had an IQ of 70, your z score would be −2.0. So on any exam, if you had a positive z score,
you did relatively well. A negative z score means you didn’t do as well. The formula for a
z score is:
\[ z = \frac{X - M}{SD} \]
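The z score calculation is short enough to write as a one-line function. The function name `z_score` is an assumption; the numbers are the ones used in the text.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# The two IQ examples from the text (mean 100, SD 15).
print(z_score(115, 100, 15))   # 1.0
print(z_score(70, 100, 15))    # -2.0

# Scores from tests with different SDs become comparable as z scores:
# 130 on a test with SD 15 and 132 on a test with SD 16 are both 2.0.
print(z_score(130, 100, 15), z_score(132, 100, 16))
```

This is what makes it possible to compare scores from different tests: once converted to z scores, they are all expressed in the same unit, standard deviations from the mean.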

Inferential Statistics
The correlation coefficient:
A correlation is a measure of the relationship between two or more variables. Simply
stated, correlation is an expression of the degree and direction of correspondence between two
things. It reflects the degree of concomitant variation between variable X and variable Y. The
coefficient of correlation is the numerical index that expresses this relationship: it tells us the
extent to which X and Y are “co-related” or “co-occurring”.
The meaning of a correlation coefficient is interpreted by its sign and magnitude. Its
sign is “plus” (for a positive correlation), “minus” (for a negative correlation), or “none” (in
the rare instance that the correlation coefficient is exactly equal to zero). Its
value can be any number between +1 and –1. And here
is a rather intriguing fact about the magnitude of a correlation coefficient: it
is judged by its absolute value. This means that to the extent that we are impressed by
correlation coefficients, a correlation of +.99 is every bit as impressive as a correlation of
–.99. The two ways to describe a perfect correlation between two variables are as either +1 or
–1. If a correlation coefficient has a value of +1 or –1, then the relationship between the two
variables being correlated is perfect, without error in the statistical sense. And just as
perfection in almost anything is difficult to find, so too are perfect correlations.
It’s challenging to try to think of any two variables in psychological work that are perfectly
correlated.
If two variables simultaneously increase or simultaneously decrease, then those two
variables are said to be positively (or directly) correlated. The height and weight of normal,
healthy children ranging in age from birth to 10 years tend to be positively or directly
correlated. As children get older, their height and their weight generally increase
simultaneously. A positive correlation also exists when two variables
simultaneously decrease. For example, the less preparation a student does for an examination,
the lower the score on the examination. A negative (or inverse) correlation occurs when one
variable increases while the other variable decreases. For example, there tends to be an
inverse relationship between the number of miles on your car’s odometer (mileage indicator)
and the price a car dealer is willing to give you on a trade-in allowance; all other things being
equal, as the mileage increases, the amount of money offered on trade-in decreases.
If a correlation is zero, then no linear relationship exists between the two
variables. And some might consider “perfectly no correlation” to be a third variety of
perfect correlation; that is, a perfect noncorrelation. After all, just as it is nearly impossible in
psychological work to identify two variables that have a perfect correlation, so it is nearly
impossible to identify two variables that have a zero correlation. Most of the
time, two variables will be fractionally correlated. The fractional correlation may be extremely
small but seldom “perfectly” zero.
Correlation is often confused with causation. It must be emphasized that a correlation
coefficient is merely an index of the relationship between two variables, not an index of the
causal relationship between two variables. If you were told, for example, that from birth to
age 18 there is a high positive correlation between index finger size and IQ, would it be
appropriate to conclude that index finger size causes IQ? Of course not. The period from birth
to age 18 is a time of maturation in all areas, including physical size and cognitive abilities
such as IQ. Intellectual development parallels physical development during these years, and a
relationship clearly exists between physical and mental growth. Still, this doesn’t mean that
the relationship between index finger size and IQ is causal.
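The finger-size example can be imitated with a small simulation. Everything in it is invented for illustration: the growth rates, the amount of random noise, the use of a raw test score that rises with age, and the seed. The only feature that matters is that age drives both measures, while neither measure has any effect on the other.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

rng = random.Random(0)   # fixed seed so the sketch is repeatable

def simulate(ages):
    """Invented model: age drives both measures; they never affect each other."""
    finger = [2.0 + 0.3 * age + rng.gauss(0, 0.5) for age in ages]    # length in cm
    score = [10.0 + 5.0 * age + rng.gauss(0, 10.0) for age in ages]   # raw test score
    return finger, score

# Children of all ages from birth to 18: both measures grow with age.
mixed_ages = [rng.uniform(0, 18) for _ in range(2000)]
r_all_ages = pearson_r(*simulate(mixed_ages))

# Children who are all exactly 10 years old: age can no longer vary.
r_same_age = pearson_r(*simulate([10.0] * 2000))

print(round(r_all_ages, 2), round(r_same_age, 2))
```

Across all ages the correlation should come out strongly positive, while among children of the same age it should be close to zero. The relationship is produced entirely by the third variable, age, which is why a high correlation on its own says nothing about cause.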
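The formula for calculating correlation is:
\[ r = \frac{\Sigma (X - \bar{X})(Y - \bar{Y})}{n \, S_X S_Y} \]

or
\[ r = \frac{\Sigma (X - \bar{X})(Y - \bar{Y})}{\sqrt{\left[\Sigma (X - \bar{X})^2\right]\left[\Sigma (Y - \bar{Y})^2\right]}} \]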
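Both versions of the formula can be checked against each other in Python. This sketch assumes that the bars denote the means of X and Y, that n is the number of pairs, and that S_X and S_Y are standard deviations computed by dividing by n, as in the standard deviation formula earlier. The hours-and-scores data are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Second form: cross-products over the root of the two sums of squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def pearson_r_sd_form(xs, ys):
    """First form: cross-products over n times the two standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)   # divides by n
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cross / (n * sd_x * sd_y)

# Hypothetical data: hours of preparation and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

print(round(pearson_r(hours, scores), 2))          # 0.98, strongly positive
print(round(pearson_r_sd_form(hours, scores), 2))  # same value from the other form

# A perfectly inverse relationship gives -1.
print(pearson_r([1, 2, 3], [3, 2, 1]))             # -1.0
```

The two forms agree because n times S_X times S_Y is the same quantity as the square root of the product of the two sums of squares, provided the standard deviations are computed with n in the denominator.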

---------------------------------------------------The End-----------------------------------------------
