Handout

The document discusses the principles of learning through behaviorism, focusing on classical and operant conditioning. It explains how behaviors are learned through environmental interactions and the processes of reinforcement and punishment. Key concepts include acquisition, extinction, generalization, discrimination, and the shaping of responses in learning contexts.

Uploaded by

frankjoy900

LECTURE NOTE ON GSS 212 PSYCHOLOGY

N.B. This note is for lecture purposes only. It is not to be used for any other purpose.
INTRODUCTION
Learning involves procedures that bring about change in behaviour. It is a process by which
individuals develop thought and behavioural patterns in the course of life. People learn right from
childhood what is safe to do and what is unsafe to do, where to go and feel safe, and where not
to go because it is unsafe. It is a process that explains why one would lick one’s lips at the sight of
food thought to be pleasant, and would abstain from food that from previous experience is thought
to be unpleasant; why you handle a sharp knife with extreme care and why you scamper away at
the sight of someone who approaches you violently with a knife.
This module will explore the different behaviourist theories of learning: the classical
conditioning model, the operant conditioning model, and the social learning model.

BEHAVIOURISM
Behaviourism is a psychological position that studies what people and other animals do without
reference to thoughts, ideas, emotions or other internal states. For example, to explain why
John laughed, one may say that it is because he is happy. How do you know that he is happy?
Because he is laughing. For the behaviourists, such a circular explanation does not make sense.
They choose to explain reality from what is observable and verifiable. Mental experiences have to
be described in observable, and therefore behavioural, terms.
John Watson is known as the father of Behaviourism. In his masterpiece, the “Behaviourist
Manifesto” (1913), he posited that psychology is a natural science and that, to be truly scientific, its
subject matter has to be such that it can be reliably and objectively studied. For him, consciousness
was outside the scope of such study, and human behaviour could not be explained by instincts.
Accordingly, behaviours are caused by events and experiences in the environment, and behaviour
patterns are largely learned through experience. Even thoughts and emotions are linked to, and in
that sense caused by, those environmental events.
Jacques Loeb, one of the foremost proponents of behaviourism, expressed this radically: “Motions
caused by light or other agencies appear to the layman as expressions of will and purpose on the
part of the animal, whereas in reality the animal is forced to go where carried by its legs.”

CLASSICAL CONDITIONING
One major contribution to behaviourism is the classical conditioning of Ivan P. Pavlov. Pavlov, a
Russian physiologist, made an observation around 1897, in his research on digestion, that
contributed significantly to the understanding of learning. He posited that animals have innate
automatic connections, which he named unconditioned reflexes, between a stimulus such as food
and a response such as secreting digestive juices. New reflexes are then acquired by transferring a
response due to a given stimulus to another stimulus. Classical conditioning is therefore the
process by which an organism learns to associate two different stimuli—a neutral one and one that
is already connected to a given reflexive response— transferring to one the response due to the
other. It is otherwise called Pavlovian conditioning.
The reflexive response, the action that is elicited, is known as the Unconditioned Response (UCR), and
the stimulus that automatically evokes it is the Unconditioned Stimulus (UCS). A neutral
stimulus, one that elicits an irrelevant response or none, is then introduced. Once conditioning has
taken place it is named the Conditioned Stimulus (CS), and the response transferred to this
previously neutral stimulus is the Conditioned Response (CR), which often resembles the UCR.
Pavlov observed, in the experiment with a dog, that it salivated (UCR) automatically when food
(UCS) was presented. When a metronome sounded (neutral stimulus), the dog only lifted up its
head and looked around without salivating. This sound was then consistently paired with the
presentation of food, and after repeated pairings the dog salivated (CR) once the metronome
sounded (CS). The salivation at the sound of the metronome depended on preceding conditions,
namely the pairing of the CS with the UCS. The conditioned response is therefore the response
evoked by the conditioned stimulus as a result of the conditioning or training procedure.
Conditioning is said to occur more rapidly where the CS is relatively unfamiliar. If you have heard
a tone many times (followed by nothing) and now start hearing the tone followed by a strong
stimulus, you will be slow to show signs of conditioning. The sound of gunshots may not elicit
the same fear response in one who has lived with the army as in one who is not familiar with
military operations. Some phobias are developed through conditioning of this kind. Foods eaten
prior to stomach upset, after repeated occurrences, tend to generate aversion and allergy by
association of the food with the stomach ache, sometimes without any investigation of the health
of the person’s internal system.
Acquisition and Extinction
The process that establishes or strengthens a conditioned response is known as acquisition. The
decrease of the conditioned response is called extinction. To extinguish a classically conditioned
response, repeatedly present the conditioned stimulus (CS) without the unconditioned stimulus
(UCS). That is, acquisition of a response (CR) occurs when the CS predicts the UCS, and
extinction occurs when the CS no longer predicts the UCS. If acquisition is learning to make a
response, extinction is learning to inhibit it. After repeated presentation of the CS without the UCS,
the learned response (CR) tends to diminish. However, spontaneous recovery may occur, namely a
temporary return of an extinguished response after a pause, that is, a period in which neither
the UCS nor the CS is presented.
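The rise and fall of the conditioned response can be pictured with a toy calculation. The sketch below is not part of the lecture material: it assumes a simple error-correction rule (in the spirit of the Rescorla-Wagner model) in which the strength of the CS-CR link moves a fixed fraction of the way toward 1 on each CS+UCS pairing and toward 0 on each CS-alone trial. The function name, the learning rate of 0.3 and the ten trials per phase are hypothetical choices made only for illustration, and the sketch does not model spontaneous recovery.

```python
def run_trials(v, trials, ucs_present, rate=0.3):
    """Return the associative strength after each trial.

    v: starting strength of the CS-CR link (0 = none, 1 = maximum).
    ucs_present: True for CS+UCS pairings, False for the CS alone.
    rate: hypothetical learning rate, chosen only for illustration.
    """
    target = 1.0 if ucs_present else 0.0
    history = []
    for _ in range(trials):
        v += rate * (target - v)  # move a fraction of the way to the target
        history.append(round(v, 3))
    return history


# Acquisition: the CS is paired with the UCS on every trial.
acquisition = run_trials(0.0, 10, ucs_present=True)
# Extinction: the CS is then presented alone, starting from the acquired strength.
extinction = run_trials(acquisition[-1], 10, ucs_present=False)

print("acquisition:", acquisition)
print("extinction: ", extinction)
```

Under these assumptions the strength climbs quickly at first and then levels off during acquisition, and falls away in the same manner during extinction.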
Generalization and Discrimination
An important element of the theory is the phenomenon of stimulus generalization. This is the
extension of a conditioned response from the training stimulus to similar stimuli. The more closely
a stimulus resembles the conditioned stimulus, the more likely it is to elicit a similar response. Many
people would very likely treat a priest or seminarian based on their experience of another priest or
seminarian. Discrimination, by contrast, is the learned capacity to respond differently to similar
stimuli that predict different outcomes. For example, you discriminate between a bell that signals
time for class to start and a different bell that signals a fire alarm.
In some cases there is a blocking effect. This is a situation in which a previously established
association to one stimulus blocks the formation of an association to an added stimulus. Two
explanations are offered. First, if the first stimulus already predicts the outcome, the second
stimulus adds no new information. Second, the subject attends strongly to the stimulus that already
predicts the outcome and therefore pays less attention to the new stimulus.
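The first explanation (the added stimulus carries no new information) can be illustrated with a small calculation. The sketch below is an assumption-laden toy, not the lecture's own method: it uses a Rescorla-Wagner-style rule in which all stimuli present on a trial share one prediction error, namely the value of the UCS minus the summed strength of the stimuli present. The stimuli "tone" and "light", the learning rate and the trial counts are hypothetical.

```python
def condition(strengths, present, trials, rate=0.3, ucs=1.0):
    """Pair the stimuli named in `present` with the UCS for `trials` trials.

    All stimuli present on a trial share one prediction error:
    the UCS value minus the summed strength of the stimuli present.
    """
    for _ in range(trials):
        error = ucs - sum(strengths[s] for s in present)
        for s in present:
            strengths[s] += rate * error
    return strengths


# Blocking group: the tone alone is conditioned first, then tone and light together.
blocked = condition({"tone": 0.0, "light": 0.0}, ["tone"], 20)
blocked = condition(blocked, ["tone", "light"], 20)

# Control group: tone and light are conditioned together from the start.
control = condition({"tone": 0.0, "light": 0.0}, ["tone", "light"], 20)

print("light after pretraining on the tone:", round(blocked["light"], 3))
print("light without pretraining:          ", round(control["light"], 3))
```

Because the tone already predicts the UCS almost perfectly after the first phase, hardly any prediction error is left for the light to absorb, so in this toy model the light gains next to no strength in the blocking group while it gains a substantial share in the control group.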
An exercise: a father repeatedly returns from work angry and beats his child over nothing. After
several days, the child becomes frightened as soon as he sees the father. Identify the UCS, UCR, CS
and CR. Also think of a functionary in the seminary who is always offensive in his speeches. Attempt
to identify the neutral stimulus, UCS, UCR, CS, and CR. Identify the phenomena of generalization
and discrimination.
Pavlov’s own account of classical conditioning was that conditioning occurs because presenting
two stimuli close to each other in time develops a connection between their brain
representations. Later research showed that animals do not treat the
conditioned stimulus as if it were the unconditioned stimulus. Also, being close in time is not
enough. Learning occurs if the first stimulus predicts the second stimulus.
Classical conditioning seeks to explain human emotional and automatic responses as well as
phenomena like drug tolerance. Drug tolerance is acquired when the body’s defences (UCR/CR),
produced automatically by the assimilation of the drug in the body (UCS), become associated
with the intake of the drug (CS). The consequence is that larger quantities of the drug are
required to produce the same effect.

OPERANT CONDITIONING

Edward L. Thorndike, a Harvard graduate student, began training cats in a basement (1898). He put
cats into puzzle boxes from which they could escape by pressing a lever, pulling a string, tilting a
pole, or other means. Sometimes he placed food outside the box,
but usually the cats worked just to escape from the box. The cats learned to make whatever response
opened the box, especially if the box opened quickly. They learned by trial and error. Thorndike
concluded that learning occurs because certain behaviours are strengthened at the expense of
others. In a given situation like Thorndike’s puzzle box, an animal exhibits a series of responses
(such as pawing the door, scratching the wall, pacing around, etc.). It starts with its most probable
response (R1). If nothing special happens, it proceeds to other responses, eventually reaching one
that opens the door—for example, bumping against the pole (R7 in this example). Opening the
door reinforces the preceding behaviour, and increases the probability of repeating the behaviour
linked to the satisfaction obtained.

Thorndike’s study presents a kind of learning that is called operant conditioning (the subject
operates on the environment to produce an outcome) or instrumental conditioning (the subject’s
behaviour is instrumental in producing the outcome). Operant or instrumental conditioning is the
process of changing behaviour by providing a reinforcer after a response. The major proponent of
operant conditioning theory is Burrhus Frederic Skinner, who held that human behaviour
originates from the environment, and that the consequences of a behaviour determine the likelihood
of its being repeated.
What differentiates operant from classical conditioning is the procedure. In classical conditioning,
the behaviour of the subject has no effect on the result or on the stimuli, whereas in operant
conditioning, the behaviour of the subject produces some result that in turn influences future
behaviour; the subject has to make some response to produce the outcome. Again, classical
conditioning explains mainly automatic responses, or the learning of reflexes, whereas operant
conditioning is concerned more with behaviours and their consequences.

Reinforcement and punishment
Reinforcement is the process of increasing the future probability of the most recent response.
Thorndike summarizes his views in the law of effect: Of several responses made to
the same situation, those that are accompanied or closely followed by satisfaction to the animal
will, other things being equal, be more firmly connected with the situation, so that, when it recurs,
they will be more likely to recur. This satisfaction then serves as a reinforcer that determines a
repetition of the connected response.
A distinction is made between primary reinforcers and secondary reinforcers. Primary (or
unconditioned) reinforcers are reinforcing by their own nature and properties, like food, water, etc.,
whereas secondary (conditioned) reinforcers are reinforcing by association with something else,
like money, which is reinforcing because we can exchange it for primary reinforcers like food and
water. A student learns that a good grade wins approval, and an employee learns that increased
productivity wins the employer’s praise. In these cases, secondary means “learned.” It does not
mean unimportant. We spend most of our time working for secondary reinforcers.
Reinforcement and punishment are two ways to encourage or inhibit behaviour. While a reinforcer
increases the probability of response, a punishment decreases it. A reinforcer can be either a
presentation of something (e.g., receiving food) or a removal (e.g., stopping pain). Similarly,
punishment can be either a presentation of something (e.g., receiving pain) or a removal (e.g.,
withholding food).
Punishment is most effective when it is quick and predictable. An uncertain or delayed punishment
is less effective. For example, the burn you feel from touching a hot stove is highly effective in
teaching you something to avoid. The threat that smoking cigarettes might give you cancer many
years from now may also be effective, but less so. Punishments are not always effective. If the
threat of punishment were always effective, the crime rate would be zero. While some may say
that punishment produces no long-term effects, perhaps it could be better to say that punishment
does not greatly weaken a response when no other response is available. (If someone punished you
for breathing, you would continue breathing nevertheless.)
As regards severe punishment and child abuse, research (by Larzelere and Kuhn (2005) and by
Hyland, Alkhalaf, and Whalley (2013)) reports outcomes ranging from antisocial behaviour,
low self-esteem and hostility towards the parents to the risk of lifetime health problems.
Categories of reinforcement and punishment
Reinforcement can be either positive, when it entails presenting something satisfying, e.g. food,
or negative, when it entails escaping or avoiding something undesirable, e.g. pain. In either case,
reinforcement increases the probability that a behaviour will be repeated. Negative reinforcement
is sometimes called escape learning or avoidance learning, since reinforcement occurs by the
escape from danger or the avoidance of pain.
Punishment can also entail presenting something undesirable like pain (positive punishment), or
subtracting something desirable like food, pleasure, privileges etc. (negative punishment). What
determines whether it is reinforcement or punishment is the outcome on the target behaviour: if it
decreases behaviour, it is punishment; if it increases behaviour, it is reinforcement. If the procedure
entails the withdrawal or avoidance of something (food or pain), it is negative; if it has to do with
presenting something (desirable or undesirable), it is positive.
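The two questions above (is something presented or removed, and does the behaviour increase or decrease) give four categories. A minimal sketch of that decision rule follows. The function name and the string labels are hypothetical and chosen only for illustration.

```python
def classify(stimulus_change, behaviour_change):
    """Name the procedure from the two questions in the text.

    stimulus_change: "presented" or "removed"
    behaviour_change: "increases" or "decreases"
    """
    if stimulus_change not in ("presented", "removed"):
        raise ValueError("stimulus_change must be 'presented' or 'removed'")
    if behaviour_change not in ("increases", "decreases"):
        raise ValueError("behaviour_change must be 'increases' or 'decreases'")
    # Presenting something is "positive"; withdrawing or avoiding it is "negative".
    sign = "positive" if stimulus_change == "presented" else "negative"
    # More of the behaviour means reinforcement; less of it means punishment.
    kind = "reinforcement" if behaviour_change == "increases" else "punishment"
    return f"{sign} {kind}"


# Examples taken from the text.
print(classify("presented", "increases"))  # receiving food
print(classify("removed", "increases"))    # stopping pain
print(classify("presented", "decreases"))  # receiving pain
print(classify("removed", "decreases"))    # withholding food
```

Note that whether the stimulus is pleasant or unpleasant does not enter the rule directly. The label follows only from what is done with the stimulus and what happens to the behaviour.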

Acquisition and Extinction in Operant Conditioning


In operant conditioning, acquisition occurs when a response is reinforced repeatedly, and learning
shows itself as an increase in the behaviour. Extinction occurs if responses cease to produce
reinforcement. In classical conditioning, the CS without the UCS results in the weakening of the CR
and therefore extinction, whereas in operant conditioning, a response without reinforcement results
in the weakening of the response, and therefore extinction of the behaviour.
Generalization and Discrimination in Operant Conditioning
The more similar a new stimulus is to the original reinforced stimulus, the more likely is the same
response. This phenomenon is known as stimulus generalization. A bird that learns to avoid a
bad-tasting butterfly will also avoid other butterflies of similar appearance. A seminarian is more
likely to salute any bishop the way he salutes his own bishop. In applying for a job, one is more
likely to use the same format that succeeded in a previous application.
A stimulus that indicates which response is appropriate or inappropriate is called a discriminative
stimulus. Much of our behaviour depends on discriminative stimuli. For example, you learn
ordinarily to be quiet during a lecture but you talk when the professor encourages discussion. You
learn to drive fast on some streets and slowly on others. The ability of a stimulus to encourage
some responses and discourage others is known as stimulus control.

Shaping of Responses and Chaining behaviour
Shaping occurs when reinforcements gradually lead through simpler tasks to more complex
behaviours. For example, in education, your parents or teachers first praise you for counting your
fingers. Later, you must add and subtract to earn their congratulations. Step by step, your tasks
become more complex until you are doing advanced mathematics.
A similar thing applies to a series of actions. Ordinarily, one does not just perform one isolated
action, but accomplishes a sequence of actions. This is called chaining. Its significance is that one
can chain behaviours, reinforcing each one with the opportunity to engage in the next. People
learn chains of responses in this way. You learn to eat with a fork and spoon. Then you learn to
put your own food on the plate before eating. Eventually, you learn to plan a menu, go to the store,
buy the ingredients, cook the meal, put it on the plate, and then eat it. Reinforcement of each
behaviour entails the opportunity to engage in the next behaviour.
Schedules of reinforcement
The simplest procedure in operant conditioning is to provide reinforcement for every correct
response, a procedure known as continuous reinforcement. However, our common experience is
that continuous reinforcement is not always obtained. Reinforcement for some responses and not
for others is known as intermittent reinforcement or partial reinforcement. This has implications
for learning as we tend to behave differently when we learn that only some of our responses will
be reinforced.
In addition to continuous reinforcement, there are four schedules for the delivery of intermittent
reinforcement: fixed ratio, variable ratio, fixed interval, and variable interval. An example of a ratio
schedule is when a labourer is paid according to the amount of work done, such as the number of
blocks he has produced. An example of an interval schedule is wages paid at the end of the month.
Behaviour differs between the two cases.
A fixed-ratio schedule provides reinforcement only after a certain (fixed) number of correct
responses. Examples include factory workers who are paid for every ten pieces they turn out. A
fixed-ratio schedule requiring a small number of responses, such as two or three, produces a steady
rate of response. However, if the schedule requires many responses before reinforcement, the
typical result is a pause after each reinforcement, and then resumption of steady responding.
A variable-ratio schedule is similar to a fixed-ratio schedule, except that reinforcement occurs after
a variable number of correct responses. Applying for jobs is an example: the more applications you
submit, the better your chances,
but you cannot predict how many applications you need to submit before receiving a job offer.
Gambling pays off on a variable ratio. If you enter a lottery, each time you enter you have some
chance of winning, but you cannot predict how many times you must enter before winning (if
ever). Variable-ratio schedules generate steady response rates.
A fixed-interval schedule provides reinforcement for the first response after a specific time
interval. Animals (including humans) on such a schedule learn to pause after reinforcement and
begin to respond again toward the end of the time interval. Checking your mailbox is an example
of behaviour on a fixed-interval schedule. If your mail is delivered at about 3 p.m. and you are
eagerly awaiting an important package, you might begin to check around 2:30 and continue
checking every few minutes until it arrives. Showing up on time for class is another example of a
fixed-interval schedule. Returning from the Wednesday walk by 6 p.m. is often reinforced negatively
by escape from punishment and positively by good recommendation. It is an example of fixed-interval
reinforcement. Note that the delay between one reinforcement and the next is constant, but the
number of responses is variable.
With a variable-interval schedule, reinforcement is available after a variable amount of time.
Checking your email or your Facebook account is an example: A new message could appear at any
time, so you check occasionally. The scriptural admonition to stay awake because we never know
the time and the hour that the Son of Man will come is an example of a variable interval
reinforcement. You cannot know how long before your next response is reinforced. Consequently,
responses on a variable-interval schedule are slow but steady.
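The four intermittent schedules differ only in the rule that decides whether a given response is reinforced. The sketch below states each rule as a small function. It is an illustration under stated assumptions, not a standard implementation: the function names, the ratio of 3, the interval of 5 time units, the means of 4 and 5, and the choice of one response per time unit are all hypothetical, and the variable schedules simply redraw their requirement at random after each reinforcer.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response; time is ignored."""
    count = 0

    def respond(now):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after a varying number of responses that averages mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)

    def respond(now):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)  # redraw the requirement
            return True
        return False

    return respond


def fixed_interval(wait):
    """Reinforce the first response made at least `wait` time units after the last reinforcer."""
    last = 0

    def respond(now):
        nonlocal last
        if now - last >= wait:
            last = now
            return True
        return False

    return respond


def variable_interval(mean_wait, rng):
    """Like fixed_interval, but the required wait is redrawn after each reinforcer."""
    last = 0
    wait = rng.uniform(0, 2 * mean_wait)

    def respond(now):
        nonlocal last, wait
        if now - last >= wait:
            last = now
            wait = rng.uniform(0, 2 * mean_wait)
            return True
        return False

    return respond


def count_reinforcers(schedule, responses):
    """Make one response per time unit and count how many are reinforced."""
    return sum(schedule(t) for t in range(1, responses + 1))


rng = random.Random(0)  # seeded so that a run can be repeated

fr = fixed_ratio(3)
fr_pattern = [fr(t) for t in range(1, 10)]      # nine responses
fi = fixed_interval(5)
fi_pattern = [fi(t) for t in range(1, 13)]      # one response per time unit
vr_total = count_reinforcers(variable_ratio(4, rng), 1000)
vi_total = count_reinforcers(variable_interval(5, rng), 1000)

print("fixed ratio 3:", fr_pattern)
print("fixed interval 5:", fi_pattern)
print("variable ratio, reinforcers in 1000 responses:", vr_total)
print("variable interval, reinforcers in 1000 responses:", vi_total)
```

On the ratio schedules, responding faster earns reinforcers faster. On the interval schedules, extra responses before the interval has elapsed earn nothing, which matches the pausing and slow, steady responding described in the text.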
PRACTICAL APPLICATIONS OF OPERANT CONDITIONING
Persuasion
To persuade means to bring the other person to one's side without them realising it, giving the
illusion of having made a free choice. The concept of persuasion excludes all forms of violence.
An application of shaping is to start by reinforcing a slight degree of cooperation and then work
up to the goal little by little. Whether we want to get rats to salute the flag or soldiers to denounce
it, the most effective training technique is to start with easy behaviours, reinforce those behaviours,
and then gradually shape more complex behaviours.
Applied Behaviour Analysis/Behaviour Modification
In applied behaviour analysis, also known as behaviour modification, a psychologist removes
reinforcement for unwanted behaviours and provides reinforcement for more acceptable
behaviours. People with a painful injury avoid using the injured arm or leg and receive much
sympathy. In some cases the sympathy and the excuse for not working very hard become powerful
reinforcers, and people continue acting injured and complaining about the pain after the injury has
healed. To overcome this maladaptive behaviour, families and friends can praise or otherwise
reinforce attempts at increased mobility, and stop providing sympathy for the complaints. This
policy, of course, requires distinguishing between real pain and exaggerated pain.
NOT ALL LEARNING IS THE SAME
For the acquisition of most conditioned responses, the UCS and CS have to be separated by a very
brief interval to permit association. Automatic learning such as food aversion, however, permits a
longer interval between the intake of food and the feeling of sickness and still yields association
and learning.
Again, behaviours are ordinarily reinforced to guarantee repetition, but some learning, such as
birdsong learning, takes place even when there are no apparent trials and no apparent
reinforcement. Other examples are children learning to speak and the learning of a foreign language.

SOCIAL LEARNING
Albert Bandura (1977) outlines certain social mechanisms that facilitate learning. Not all learning
proceeds by trial and error. A priest does not just put on vestments at random to see which
arrangement fits best. He learns by copying from others. One does not cook with random recipes
or dance with random steps. According to the social-learning approach of A. Bandura, we learn
about many behaviours by observing the behaviours of others. For example, if you want to learn
to drive a car, you start by watching people who are already skilled. When you try to drive, you
receive reinforcement for driving well and punishments (possibly injuries or fines!) if you drive
badly, but your observations of others facilitate your progress.
For Bandura, we learn through personal experience, persuasion, vicarious experience (the
experience of significant or familiar others), and our physiological and affective states.
Modelling and Imitation
Why do we imitate? Other people’s behaviour often provides information. E.g. if everyone were
jumping off a cliff, it could be possible that they know something you do not know.
Another reason for imitation is that other people’s behaviour establishes a norm or rule. Doing
the same as other people is often helpful. E.g., you drive on the side of the road used by others, or
wear cassocks like others as soon as you come into the seminary, even without reading the
seminary rules and regulations.
You also imitate automatically in some cases. E.g., you imitate someone who yawns because
seeing a yawn induces yawning. Other examples of imitation, often without motivation, include
smiling or frowning when we see another smile or frown; spectators at a football match or athletic
events moving their legs or arms in synchrony with what the players or athletes are doing. People
also copy the hand gestures they see. You can demonstrate by telling someone, “Please wave your
hands” while you clap your hands. Many people copy your actions instead of following your
instructions.
Vicarious reinforcement and Punishment
People tend to imitate behaviour that has been reinforcing to someone else, especially someone
that they like. That is, you learn by vicarious reinforcement or vicarious punishment—by
substituting someone else’s experience for your own. When a sports team wins consistently, other
teams copy its style of play. Advertisers depend heavily on vicarious reinforcement. They show
happy, successful people using their product, with the implication that if you use their product, you
too will be happy and successful.
However, with adults as with children, vicarious reinforcement generally works better than
vicarious punishment, largely because most people do not identify with someone who failed or
received punishment. In an experiment with children (aged 3 to 7) who cheated in a test, those
earlier instructed on the rewards of telling the truth were more likely to tell the truth than children
who were instructed on the punishment for those who lied or children who received neutral
instruction.

Self-Efficacy in Social Learning
Self-efficacy is a central concept in the social learning theory of Bandura. The position is that you
imitate someone else’s behaviour only if you have a sense of self-efficacy, that is, the belief that you
are able to perform the task successfully. You would probably consider your strengths and weaknesses,
compare yourself to the successful person, and estimate your chance of success. We tend to imitate
the actions of successful people but only if we feel self-efficacy, a belief that we could perform the
task well.
Self-Reinforcement and Self-punishment
If your sense of self-efficacy is strong enough, you try to imitate the behaviour of a successful
person. But actually succeeding may require prolonged efforts. People typically set a goal for
themselves and monitor their progress toward that goal. Sometimes people reinforce or punish
themselves, just as if they were training someone else. (But people usually forgive themselves
without imposing the punishment.)
