Validity and Reliability

The document discusses validity and reliability in psychological research. It defines validity as how accurate a test or study is, and discusses internal and external validity. Internal validity refers to whether a study truly tests what it aims to, while external validity is about generalizing findings. The document also defines reliability as the consistency of measures, and discusses ways to assess reliability, such as test-retest reliability and inter-observer reliability. Improving validity and reliability in experiments, questionnaires, and observations is also addressed.

Validity and reliability

Validity: How accurate (correct) is the test, study, theory, explanation, result, diagnosis, etc.?
Are the effects genuine/real?
There are two types of validity…

Internal:
Was the study testing what it was supposed to be testing?
Experiment: The IV should be the only thing that affects the DV so we can establish CAUSE and EFFECT.
It relates to whether the results can be attributed to the effects of the manipulated variable alone, rather than to any other ‘nuisance’ variables
(called confounding variables).

What is the difference between extraneous and confounding variables?

It is affected by whether or not the measures taken, or tasks used, were actually testing what they were supposed to be. A study cannot be said to
be testing what it claims to be testing in the aim if there are sources of bias, confounding variables, order effects, and/or demand characteristics.
Where any of these occur, the study has low internal validity. E.g. Milgram’s study has been criticised for having low internal validity, as participants (pps)
may not have believed the set-up and were simply playing along.

Add one more example

External:
Ecological validity- Can the findings be generalised beyond the setting in which the study was performed? E.g. if it was in a lab, can we say the
same effects would be found in real life? E.g. learning trigrams in a lab would lack ecological validity because people do not behave naturally in
a lab.

Another issue linked to this is mundane realism. Was the task in the study like ‘real life’? If you take the same task (trigrams) and ask participants to
learn it in a supermarket, the study still has low ecological validity because the task itself cannot be applied to real life.

Add two more examples

Temporal validity- Refers to how relevant the time period is in affecting the findings, e.g. a study on attitudes conducted decades ago cannot be
expected to have temporal validity due to how quickly attitudes change in society.

E.g. Freud’s psychodynamic explanation lacks temporal validity, as his ideas are a reflection of the Victorian society in which he lived, where people
weren’t free to discuss sex.

Add two more examples

Assessing validity…

Face validity: Looking at a measure/test/scale and deciding ‘on the face of it’ whether it is valid, or asking an expert to ‘look over’ it.

Concurrent validity: If the results from the test/measure etc. match the results on a similar, well-known test/measure etc. E.g. if someone devises a
personality test, they can test participants and then compare the results to a well-known one such as Eysenck’s personality inventory.
If the scores correlate at 0.8 or higher, then the test has concurrent validity.
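
The 0.8 rule for concurrent validity can be checked with a short calculation. The sketch below is only an illustration: the scores are invented, the names `pearson_r`, `new_test` and `established_test` are hypothetical, and it assumes the correlation meant here is Pearson's r, which the handout does not specify.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)

# Invented scores for the same six participants on both tests (illustration only).
new_test = [12, 15, 9, 20, 17, 11]
established_test = [14, 16, 10, 22, 18, 13]

THRESHOLD = 0.8  # the cut-off quoted in the handout
r = pearson_r(new_test, established_test)
has_concurrent_validity = r >= THRESHOLD
print(f"r = {r:.2f}, meets the 0.8 threshold: {has_concurrent_validity}")
```

The same participants must take both tests, so that each pair of scores belongs to one person.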

Improving validity…
Experiments: Use a control group. This means the researcher can confidently say a change in the DV was due to the manipulation of the IV.
E.g. when testing the effectiveness of a drug, a control group will ensure that it is the drug that has improved the symptoms, rather than
just the passage of time.
Use standardised procedures to minimise participant reactivity and investigator effects.
Use single- and double-blind procedures to reduce demand characteristics and investigator effects.

Questionnaires: Use a lie scale (e.g. like Eysenck did) to check that respondents are being consistent and not showing social desirability bias.
Maintain anonymity so participants don’t feel the need to lie.

Observations: Use covert, naturalistic observations to ensure behaviour is as natural as possible, which improves ecological validity.
Have precise behavioural categories so that the data collected is accurate.

Qualitative methods: These generally have higher ecological validity because they reflect the participants’ reality. The researcher needs to use direct
quotes to ensure they are not misinterpreting information. Use triangulation to show from other sources that the interpretation is correct.

Reliability means…

How consistent is the test, study, scales, questionnaires, experiment, observation, theory, explanation, results or measuring tools/device?

If a measure produces the same (or very similar) results each time it is repeated, then we can describe it as reliable.

For example, if someone completed an IQ test on day 1 and then sat the same test on day 4, we would expect to find the same/similar result and we
would say that the IQ test was reliable.

Assessing reliability…

Test-retest: The person completes the test/questionnaire/interview etc. and is then given the same test again on a different occasion.
Some time is left in between (to make sure they can’t remember their answers), but not so long that their
attitude/behaviour etc. has changed.

Inter-observer reliability: This links to observations. If there is more than 1 observer, they must try to ensure they are interpreting events in the
same way. We need to avoid/reduce subjective interpretations.
A small-scale pilot study is done first to consider the behavioural categories they might use, and then to make sure they
are applying them in the same way.
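
One simple way to check in a pilot study whether two observers are applying the categories in the same way is to compare their codes event by event. The sketch below is an assumption for illustration, not a method stated in this handout: the category names, the data and the function `percent_agreement` are all hypothetical, and percentage agreement is only one of several possible measures.

```python
def percent_agreement(codes_a, codes_b):
    """Proportion of events that two observers placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both observers must code the same non-empty set of events")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

# Invented pilot data: each position is one observed event, coded by both observers.
observer_1 = ["share", "hit", "share", "talk", "talk", "hit", "share", "talk", "share", "talk"]
observer_2 = ["share", "hit", "talk", "talk", "talk", "hit", "share", "talk", "share", "share"]

agreement = percent_agreement(observer_1, observer_2)

# Positions where the observers disagreed; these events are worth discussing
# so that the category definitions can be tightened before the main study.
disagreements = [i for i, (a, b) in enumerate(zip(observer_1, observer_2)) if a != b]

print(f"agreement = {agreement:.0%}, disagreed on events {disagreements}")
```

Listing the disagreements, and not just the overall figure, shows which behavioural categories the observers are interpreting differently.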

Exam practice

A head teacher wanted to increase recycling in his school. He arranged for the canteen to have three bins, one for cardboard, one for plastic and
one for food waste. A month later, a psychology teacher decided to see if the students were recycling. One lunch break she watched different
students going to the bins. Each time, she wrote down which of the three types of waste they recycled. She positioned herself so that the students
could not see her but so that she had a clear view of the bins.

(a) Identify one type of observation being used in this investigation. Justify your answer. (2)

(b) Explain the sampling technique the teacher used to record her observations. (2)

Two psychology students investigated the effect of type of play area on friendly behaviours. They watched the behaviour of six-year-old
children in two different play areas and recorded their observations using a set of behavioural categories. They observed 25 children in the first
play area and another 25 children in a second play area.

Play Area 1 was a grass space, surrounded by trees and plants.

Play Area 2 was a paved space, surrounded by brick and concrete walls.

(a) What are behavioural categories? Explain why it was important to use behavioural categories in this observation (4 marks)

Improving reliability…

Experiments: Lab experiments tend to be considered reliable as the researcher can control many of the variables that may affect the
outcome, e.g. the instructions given to pps. The more controlled the study, the more likely it is that it can be replicated. As long as pps are tested
under exactly the same conditions as last time, the same results should be found.

Questionnaires: If the test-retest correlation is LOWER than 0.8, then some of the questions (items) may be changed or completely
removed.
E.g. a question might be interpreted in different ways by different people and therefore needs rewording.

Observations: Make sure behavioural categories have been operationalised, i.e. made measurable, beforehand. Behaviours should
not overlap, e.g. hugging and cuddling!

Qualitative methods: Use the same trained interviewer and use structured interviews to avoid leading questions. Structured interviews
give more control over the psychologist’s behaviour.
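
When a questionnaire's test-retest correlation falls below 0.8, the next step is deciding which items to reword or remove. The sketch below shows one possible way to screen items. It is an assumption, not a procedure from this handout: the ratings are invented, and the item names, the function `mean_abs_change` and the cut-off of 1.0 are all hypothetical.

```python
def mean_abs_change(first, second):
    """Average absolute change in answers to one item between two sittings."""
    if len(first) != len(second) or not first:
        raise ValueError("need the same non-empty set of respondents at both sittings")
    return sum(abs(a - b) for a, b in zip(first, second)) / len(first)

# Invented 1-5 ratings from the same five respondents at two sittings.
session_1 = {"Q1": [4, 5, 2, 3, 4], "Q2": [1, 5, 2, 4, 3], "Q3": [3, 3, 4, 2, 5]}
session_2 = {"Q1": [4, 4, 2, 3, 5], "Q2": [4, 2, 5, 1, 3], "Q3": [3, 4, 4, 2, 5]}

CUTOFF = 1.0  # assumed cut-off: answers move by more than one scale point on average

# Items whose answers shift a lot between sittings are candidates for rewording.
flagged = [
    item for item in session_1
    if mean_abs_change(session_1[item], session_2[item]) > CUTOFF
]
print("items to review:", flagged)
```

A flagged item is only a candidate for review: answers may also shift because the respondent's attitude genuinely changed between sittings.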

Exam practice

1) A psychologist wanted to investigate the extent to which 2 people believed in ghosts and devised a questionnaire as a way of
testing this. There were 20 questions in total.

a. Explain what is meant by validity. Refer to the item in your answer. (3 marks)
b. Explain 2 ways that the psychologist could have improved the validity of the investigation. (4 marks)

2) Two students decided to conduct an observation on how children respond when left with someone other than their primary
caregiver. They set up a study whereby the mother left the child in a room with another adult and returned 5 minutes later.
The students decided that before the study, they should create some behavioural categories that they will look for. The students
conducted their observation and compared their results.

A. Explain what this indicates about the reliability of the findings. (2)
B. Explain how the students could improve the reliability of their observation. (4)

3) A psychologist conducts a lab experiment to test the memory of eyewitnesses. They decide to show the pps real-life
CCTV footage of a bank robbery and ask them set questions about events in the footage (such as descriptions of the criminal,
victims etc.).

a. Explain what is meant by validity. You should refer to the item in your answer. (3)
b. Outline two ways that the psychologist could ensure that the study is valid. (4)
c. Explain what is meant by reliability. You should refer to the item in your answer. (3)
d. Outline two ways that the psychologist could ensure that the study is reliable. (4)
