Conditioning and Learning: Core Concepts and Phenomena


Learning is one of the most fundamental processes in both human and
animal behavior. It refers to the way experiences shape future behavior,
allowing organisms to adapt to their environment. Psychologists have
identified multiple forms of learning, from simple associations between
stimuli to complex cognitive processes and social influences. Conditioning
generally describes the process by which behavioral responses become
linked to specific stimuli or outcomes. This article will explore core
concepts and phenomena in conditioning and learning, organized into
several sections. We begin with the foundational principles of Pavlovian
(Classical) Conditioning, then proceed to Operant (Instrumental)
Conditioning, which involves learning through consequences. Next, we
examine Cognitive Aspects of Learning that challenged strict
behaviorist views, followed by Motor Skill Learning and Procedural
Memory, highlighting how skills are acquired and stored. Finally, we
discuss Social and Observational Learning, illustrating how learning
occurs in a social context by watching others. Throughout, both behavioral
findings and neurobiological perspectives are emphasized, demonstrating
how modern learning science integrates different levels of analysis.

I. Foundations of Conditioning: Pavlovian (Classical) Conditioning
Classical conditioning, also known as Pavlovian conditioning, was first
described by Ivan Pavlov in the early 20th century. Pavlov famously
demonstrated that dogs could learn to associate a neutral stimulus (such
as a bell sound) with an unconditioned stimulus (food) that naturally
elicited salivation. After repeated pairings, the bell alone would trigger
salivation, a learned response. This simple experiment revealed that
organisms can form associations between stimuli, resulting in new reflex-
like responses. The basic principles of classical conditioning include the
definitions of unconditioned and conditioned stimuli and responses, as
well as key processes such as acquisition, extinction, generalization, and
discrimination.

Basic Principles: UCS, UCR, CS, CR, Acquisition, Extinction, Generalization, Discrimination

Unconditioned Stimulus (UCS) and Unconditioned Response (UCR): An unconditioned stimulus is any stimulus that naturally and automatically
triggers a reflexive response without prior learning. The response it elicits
is called the unconditioned response. In Pavlov’s experiment, the UCS was
the food placed in the dog’s mouth, and the UCR was the dog’s salivation.
These are “unconditioned” because they involve no prior training; the
response is innate or reflexive.

Conditioned Stimulus (CS) and Conditioned Response (CR): A conditioned stimulus is an originally neutral stimulus that, after being
paired repeatedly with the UCS, comes to elicit a response on its own. The
response to the CS is termed a conditioned response – it is “conditioned”
because it is learned. In the Pavlov example, the sound of the bell initially
produced no salivation (it was neutral), but after association with food it
became a CS, and the dog’s salivation to the bell alone was the CR.
Notably, the CR is often similar to, but not necessarily identical to, the
original UCR. Through conditioning, the organism has learned a new
trigger for an existing reflex.

Acquisition: Acquisition is the initial learning phase during which the association between the CS and UCS is formed. Each pairing of the CS
with the UCS (for instance, each time the bell is rung and then food is
delivered) strengthens the conditioned association. Early in training, the
CR is weak or absent, but with successive pairings the CR becomes
stronger and more reliable. Acquisition typically follows a curve of
increasing learning: the first few pairings produce rapid learning, which
then levels off as it approaches a maximum performance (an asymptote).
The speed and strength of acquisition can depend on several factors,
including the intensity of the UCS, the salience of the CS, and the timing
between the CS and UCS. Generally, shorter intervals between CS and
UCS (with the CS presented slightly before the UCS) yield more effective
learning, a procedure known as forward conditioning.
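
One common way to formalize this negatively accelerated curve (a standard textbook description added here for clarity, not given in the original text) is to assume that each pairing closes a fixed fraction of the remaining gap to the asymptote: after n pairings, CR strength ≈ λ(1 − (1 − k)^n), where λ is the maximum (asymptotic) strength the CR can reach and k, a number between 0 and 1, reflects how quickly learning approaches it. Under this rule the first pairings produce the largest gains and later pairings progressively smaller ones, matching the shape described above.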

Extinction: Once a CR has been acquired, it can be diminished or eliminated through the process of extinction. Extinction occurs when the
conditioned stimulus is repeatedly presented without the unconditioned
stimulus. Without reinforcement from the UCS, the learned association
weakens over time. For example, if Pavlov kept ringing the bell without
ever giving food afterward, the dog’s salivation to the bell would gradually
decrease and eventually stop. It is important to note that extinction is not
the same as “unlearning” or erasing the original association; rather, it is a
form of new learning (inhibitory learning) that the CS no longer predicts
the UCS. Evidence for this comes from the phenomenon of spontaneous
recovery, wherein after a rest period the extinguished CR can reappear
when the CS is presented again. Spontaneous recovery suggests that the
original CS–UCS association is still stored in memory and can resurface,
though typically the recovered response is weaker and will again
extinguish if the UCS is not reintroduced.

Generalization: Stimulus generalization refers to the tendency for stimuli similar to the original CS to evoke a similar conditioned response.
In other words, once an animal has learned to respond to a CS, it may also
respond (though often to a lesser degree) to stimuli that resemble the CS.
For instance, if a dog has been conditioned to salivate to a bell tone of a
particular pitch, it might also salivate when hearing a slightly different
tone. Generalization is an adaptive feature of learning – it allows
organisms to respond to new stimuli that are similar enough to ones
previously encountered, which is useful in real-world settings where exact
repetitions are rare. However, too much generalization can be
nonadaptive if the organism reacts the same way to stimuli that are
actually quite different in significance.

Discrimination: Discrimination is the counterbalance to generalization. It is the process by which an organism learns to distinguish between
different stimuli, responding to the conditioned stimulus but not to other,
similar stimuli. Discrimination training involves selectively reinforcing the
target CS–UCS association while presenting other stimuli (often similar to
the CS) without the UCS. Over time, the subject learns that only a specific
stimulus is followed by the UCS, and thus narrows its response to that
stimulus alone. For example, if a dog hears two different tones but only
one tone is ever paired with food, the dog will learn to salivate to the food-
associated tone (CS+) and not to the other tone (CS-). Discrimination
ensures that learned responses are specific to appropriate cues, which
increases the precision of behavior.

Together, generalization and discrimination determine the breadth of stimulus control over a conditioned response – generalization makes
learning flexible and broad, while discrimination makes it precise and
specific. These basic principles of classical conditioning demonstrate how
associations are formed, strengthened, and tuned, laying the groundwork
for more complex learning phenomena.

Advanced Phenomena in Classical Conditioning: Blocking, Overshadowing, Latent Inhibition, Sensory Preconditioning, Conditioned Taste Aversion, Evaluative Conditioning

Beyond the basic paradigm, researchers have uncovered several advanced phenomena in classical conditioning that reveal additional complexities in how associations are learned:

 Blocking Effect: In blocking, prior learning with one stimulus can interfere with the acquisition of a new association with a second
stimulus. Imagine an animal is first conditioned to associate a tone
(CS1) with a shock (UCS) until the tone reliably elicits a fear
response. Next, a second stimulus, say a light (CS2), is presented
together with the tone as a compound, followed by the shock. One
might expect the light would also become a CS by virtue of these
pairings. However, the blocking effect (first demonstrated by Leon
Kamin) is that little or no conditioning occurs to the light because
the tone already fully predicted the shock. The animal learns
nothing about the light since the shock was not surprising in the
presence of the well-established tone signal. Blocking shows that
conditioning is not just about contiguity (two stimuli occurring
together) but about the informational or predictive value of stimuli –
learning depends on whether the UCS is unexpected or
“unpredicted” by existing cues.
 Overshadowing: Overshadowing is a phenomenon observed when
two stimuli are conditioned together as a compound
(simultaneously) with an unconditioned stimulus. It refers to the
finding that one stimulus may dominate or “overshadow” the other
in terms of becoming associated with the UCS. Typically, the more
salient or intense cue in the compound will produce a stronger CR,
while the less salient cue produces a weaker CR or none at all. For
example, if a loud tone and a dim light are presented together
followed by food, a dog might later salivate strongly to the loud tone
alone but very little to the dim light alone. The intense tone stimulus
overshadowed the light in the learning process. Overshadowing
suggests that stimuli compete for association strength, and stronger
or more attention-grabbing cues win out, capturing most of the
learning.
 Latent Inhibition (CS Pre-Exposure Effect): Latent inhibition
refers to the observation that pre-exposure to a neutral stimulus
without any consequence can interfere with its later ability to be
conditioned. If an animal is repeatedly exposed to a tone (with no
UCS) prior to conditioning, and then that same tone is later paired
with a UCS, the animal will learn the association more slowly
compared to an animal with no tone pre-exposure. In essence, the
stimulus that was repeatedly experienced alone has become
familiar and seemingly inconsequential, so the animal has learned
to ignore it to some extent. When conditioning is attempted, this
prior "learned irrelevance" must be overcome, resulting in slower
acquisition. Latent inhibition highlights the role of attention and
novelty in learning – novel stimuli are conditioned more readily than
those that the organism has already learned are irrelevant.
 Sensory Preconditioning: Sensory preconditioning is almost the
reverse of the typical conditioning sequence and shows that
associations can form between neutral stimuli prior to any direct
reinforcement. In sensory preconditioning, two neutral stimuli are
first paired together repeatedly (for example, a tone and a light are
presented together several times). Neither stimulus initially elicits
any strong response, and no unconditioned stimulus is used in this
stage. Next, one of those stimuli (say, the light) is paired with an
unconditioned stimulus (such as food) through standard
conditioning until the light becomes a CS that elicits a CR
(salivation). Finally, the other stimulus (the tone) is tested by itself –
remarkably, the tone will often elicit the CR (salivation) even though
it has never been directly paired with the food. The organism
indirectly learned about the tone–food relationship because it had
previously associated the tone with the light and then learned the
light–food relationship. Sensory preconditioning demonstrates that
organisms can form associations between stimuli in the absence of
any explicit UCS, and those associations can allow learning to
generalize or transfer once one of the stimuli becomes meaningful.
It suggests that animals can learn a sort of “link” or internal
representation between two events (tone and light) without any
obvious reinforcement, a finding that hints at more cognitive forms
of associative learning.
 Conditioned Taste Aversion: Conditioned taste aversion (CTA) is
a special type of classical conditioning in which an organism learns
to avoid a flavor or food that has been associated with illness or
poisoning. What makes taste aversion learning remarkable is that it
often occurs in one trial and can span very long delays between the
CS and UCS. For example, a rat that consumes a novel sweet-
flavored food and later (even several hours afterward) gets sick
(perhaps from a mild toxin or radiation inducing nausea) may
subsequently refuse to eat that food again. The sweet taste (CS)
and illness (UCS) can be separated by hours, yet a robust aversion
is learned. Additionally, taste aversions are highly specific – the rat
will avoid that flavor but not necessarily other stimuli present during
illness. John Garcia’s classic “bright, noisy water” experiments in the 1960s demonstrated that rats readily learned to associate a taste with illness, but did not associate other cues such as lights or sounds with illness, indicating a biological predisposition.
phenomenon challenged the assumption that conditioning obeys the
same rules for all stimulus combinations. Instead, it suggested the
concept of biological preparedness: animals are evolutionarily
prepared to learn certain associations (like taste-illness) more
readily than others, likely because those couplings have survival
significance in nature (avoiding poisonous foods). Conditioned taste
aversion thus broadened our understanding of classical conditioning
by showing that the temporal contiguity rule can be broken (very
long CS-UCS intervals can work) and that the nature of the stimulus
matters (some CS-UCS combinations are learned more strongly than
others due to innate predispositions).
 Evaluative Conditioning: Evaluative conditioning refers to the
process by which we form preferences or aversions (likes or dislikes)
through association, even when there is no explicit reflex evoked. It
is a variant of classical conditioning that affects our evaluative
judgments. In evaluative conditioning, a neutral stimulus (often a
brand, image, or word) is paired with another stimulus that already
evokes a positive or negative affective response (for instance,
pleasant music or unpleasant smells, or images that are emotionally
charged). After repeated pairings, the neutral stimulus takes on the
valence of the other stimulus – we come to like it if it was paired
with something we like, or dislike it if repeatedly paired with
something aversive. For example, advertisers often pair a product
(neutral stimulus) with upbeat music, attractive visuals, or
celebrities (positive stimuli) in hopes that consumers will transfer
those positive feelings to the product. Unlike the salivary or fear
responses in typical classical conditioning, evaluative conditioning is
measured by changes in attitudes or preferences. It demonstrates
that associative learning plays a role not just in reflexive responses
but also in how we come to feel about things. This form of
conditioning is more cognitive in nature and often occurs without
our conscious awareness of the pairings. It is robust and can even
occur with just a single pairing in some cases, contributing to how
our likes and dislikes can be shaped by experience.

These advanced phenomena reveal that classical conditioning is not a simple, automatic process of pairing stimuli. Instead, factors such as prior
knowledge, stimulus salience, timing, and evolutionary biases all influence
the learning outcome. Discoveries like blocking and latent inhibition
indicate that animals are sensitive to the informational value of cues and
have a limited capacity or willingness to assign significance to redundant
or previously irrelevant stimuli. Similarly, phenomena like sensory
preconditioning and evaluative conditioning show that even absent an
explicit unconditioned stimulus or overt response, learning can occur
through associations, hinting at underlying cognitive processes. Overall,
Pavlovian conditioning forms the foundation of our understanding of
associative learning, and these basic and advanced principles set the
stage for more complex theories that integrate cognitive elements.

II. Operant (Instrumental) Conditioning: Learning by Consequences
Operant conditioning, also known as instrumental conditioning, involves
learning through the consequences of behavior. Whereas classical
conditioning centers on associations between stimuli (with the response
being a reflex elicited by a stimulus), operant conditioning is about
associations between behaviors and outcomes. In operant conditioning, an
organism learns to increase behaviors that lead to rewarding or satisfying
consequences and decrease behaviors that lead to punishing or
undesirable consequences. This form of learning was pioneered by
Edward Thorndike and later advanced by B. F. Skinner, among others.
Thorndike’s early experiments in the 1890s with cats in “puzzle boxes”
led him to formulate the Law of Effect, which states that behaviors
followed by favorable consequences become more likely in the future,
while behaviors followed by unfavorable consequences become less likely.
Essentially, successful behaviors are “stamped in” and unsuccessful ones
are “stamped out.” This principle laid the groundwork for operant
conditioning, emphasizing the role of outcomes in shaping voluntary
behavior.

B. F. Skinner expanded on Thorndike’s ideas and coined the term “operant” to refer to behavior that operates on the environment to
produce consequences. Skinner devised the operant conditioning chamber
(or Skinner box) to systematically study how reinforcing or punishing
consequences affect the rate of behavior. In a typical Skinner box
experiment, an animal (say, a rat) might learn to press a lever (the
operant behavior) when doing so yields a food pellet (a consequence).
Over time, the rat presses the lever more frequently because the
consequence is rewarding. Through such experiments, key concepts in
operant conditioning were defined:

 Reinforcement is any consequence that strengthens or increases the behavior it follows. If a behavior becomes more frequent or
likely because of a consequence, that consequence is a reinforcer.
 Punishment is any consequence that weakens or decreases the
behavior it follows. If a behavior becomes less frequent because of a
consequence, that consequence is a punisher.
 Positive means adding a stimulus, and Negative means removing
a stimulus. Importantly, in behavioral psychology “positive” and
“negative” are not judgments of value (good or bad), but indicate
the addition or subtraction of something following the behavior.

Combining these concepts, we have four types of operant conditioning consequences:

 Positive Reinforcement: Adding a desirable stimulus after a behavior, which increases the likelihood of that behavior. For
example, giving a child praise or a treat for doing their homework is
positive reinforcement; the added reward makes the child more
likely to do homework in the future. In animal training, a dog getting
a biscuit for sitting on command is similarly a positive reinforcer for
the sitting behavior.
 Negative Reinforcement: Removing or avoiding an aversive
stimulus after a behavior, which increases the likelihood of that
behavior. Negative reinforcement often involves relief: a behavior is
strengthened because it helps escape or prevent something
unpleasant. For instance, taking an aspirin relieves a headache – the
reduction of pain (an aversive condition removed) reinforces the act
of taking aspirin, making one more likely to use painkillers in the
future. Similarly, a rat may learn to press a lever to turn off a loud
noise; lever-pressing increases because it results in the removal of
the annoying sound.
 Positive Punishment: Adding an aversive stimulus after a
behavior, which decreases the likelihood of that behavior. This is
sometimes simply called “punishment” in everyday language – for
example, scolding a dog or inflicting an electric shock on a rat after
an undesired response are forms of positive punishment. The
presentation of something unpleasant (a reprimand, a pain)
following the behavior makes that behavior less likely to occur
again.
 Negative Punishment: Removing a desirable stimulus after a
behavior, which decreases the likelihood of that behavior. Negative
punishment is effectively a penalty or loss of privilege following
misbehavior. For example, if a teenager breaks a rule and the
parents take away the teen’s video game privileges for a week, this
removal of something enjoyable is intended to discourage the rule-
breaking behavior in the future. In operant terms, something the
individual likes (a positive condition) is subtracted contingent on the
behavior, leading to a reduction in that behavior.

Through these four types of consequences, operant conditioning can shape an enormous range of behaviors. Reinforcement (whether
positive or negative) increases behavior, while punishment (positive or
negative) decreases it. It is also worth noting that the effectiveness of
punishment versus reinforcement can differ – reinforcement generally
leads to more sustainable behavior change and fewer negative side effects,
whereas punishment can sometimes cause fear, aggression, or avoidance
of the punishing agent rather than true behavior change. Therefore, in
practical settings (like education or animal training) reinforcement
strategies are usually preferred for encouraging good behavior, and
punishment is used sparingly and carefully if at all.

Shaping (Successive Approximations): One of the powerful techniques in operant conditioning is shaping, which allows the training of
complex behaviors that an organism would rarely if ever perform on its
own. Shaping involves reinforcing successive approximations to the target
behavior. In other words, one breaks down a complex behavior into
smaller steps and reinforces each step as it comes closer to the desired
behavior. For example, to train a rat to press a lever, a trainer might start
by reinforcing the rat whenever it moves toward the lever. As the rat
approaches the lever area regularly, the criterion for reinforcement is
raised: now only touching the lever earns a reward. After that, perhaps
only full presses of the lever get reinforced. Through this gradual process,
the rat is “shaped” into performing the complete behavior. Shaping is
essentially a way of guiding behavior by building on natural variations –
organisms naturally exhibit a range of behaviors, and by rewarding the
ones that are closer to what is desired, those responses become more
frequent. Over time, the behavior is refined. Shaping is commonly used in
animal training to teach tricks and can also be observed in human
learning (for instance, when learning a new skill, teachers or coaches
often reinforce incremental progress). It underscores that operant
conditioning is not limited to simple actions but can assemble complex
sequences of behavior by reinforcing intermediate steps.

Schedules of Reinforcement: How and when reinforcement is delivered can greatly impact learning and behavior. Skinner and colleagues
discovered that different reinforcement schedules (patterns of providing
rewards) lead to different outcomes in terms of response rates and
resistance to extinction. The simplest schedule is continuous
reinforcement, where every instance of the target behavior is
reinforced. Continuous reinforcement is very effective for initial
acquisition of a behavior because the contingency is clear – each time the
behavior occurs, a reward follows. However, continuously reinforced
behaviors tend to extinguish relatively quickly if the reward stops,
because the organism immediately notices the change in pattern.

More interesting are partial (intermittent) reinforcement schedules, where only some occurrences of the behavior are reinforced. Partial reinforcement can be based on the passage of time (interval schedules) or on the number of responses emitted (ratio schedules), and each can be fixed or variable (a brief simulation sketch after this list illustrates the four patterns):

 Fixed Ratio (FR) Schedule: Reinforcement is given after a fixed number of responses. For example, on an FR-5 schedule, an animal
might receive a food pellet after every 5 lever presses. Fixed ratio
schedules typically produce a high rate of responding, followed by a
brief pause after each reward (this pause is sometimes called a
“post-reinforcement pause”). The subject learns that a certain
number of responses is required, so it tends to respond rapidly to
reach that number, then take a short break and start again. Real-
world analogy: a factory piece-rate payment system (e.g., being
paid for every 10 items assembled) is like an FR schedule.
 Variable Ratio (VR) Schedule: Reinforcement is given after a
variable number of responses, with a certain average value. For
instance, on a VR-5 schedule, on average every 5th response is
reinforced, but the exact response count for each reward varies
unpredictably (perhaps the 3rd response, then the 7th, then 5th,
etc.). Variable ratio schedules produce very high and steady rates of
responding with no predictable pauses, because the subject never
knows which response will bring the next reward. This schedule is
famously exemplified by gambling devices like slot machines: a
gambler might win after an unpredictable number of lever pulls. The
uncertainty and possibility that “the next one might be the jackpot”
keeps the person (or animal) responding persistently. VR schedules
also make behaviors highly resistant to extinction, since the organism has learned that not every response will be rewarded.
 Fixed Interval (FI) Schedule: Reinforcement is given for the first
response that occurs after a fixed time interval has elapsed. For
example, on an FI-30-second schedule, once the subject receives a
reward, a 30-second clock starts during which responses will not be
reinforced; the very first response after 30 seconds have passed will
produce the next reward, then the interval resets. Fixed interval
schedules tend to produce a characteristic “scalloped” response
pattern: initially after a reward there is a lull in responding (since
immediately after reinforcement no new reward is available for a set
time), but as the fixed time interval progresses and the next
possible reward time draws nearer, the rate of responding
increases. Organisms learn the temporal contingency, responding
slowly right after reinforcement and more rapidly as the time for the
next reward approaches. A common example in human behavior
might be checking for mail – if mail arrives once a day, checking the
mailbox repeatedly right after it’s been delivered yields nothing, but
as the next delivery time nears, one might check more frequently.
 Variable Interval (VI) Schedule: Reinforcement is given for the
first response after a variable amount of time has passed, with a
certain average interval. For example, on a VI-30-second schedule,
the wait time for reward availability might average 30 seconds but
could be 10 seconds on one trial, 45 on another, 30 on another, and
so on. Variable interval schedules produce a moderate, steady rate
of responding without the pronounced pauses or scallops of fixed
schedules. Because the subject never knows exactly when the time
window for the next reward will open, a relatively consistent level of
checking or responding is maintained. A real-life analogy could be
random drug testing in the workplace: since it’s unpredictable when
the test will occur (on average maybe once a month, but exact
timing varies), employees might consistently refrain from the tested
behavior at all times rather than just before known test dates.
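
The contrast among these four schedules can be made concrete with a small simulation sketch. The code below is illustrative only – the schedule values (FR-5, VR-5, FI-30 s, VI-30 s), the response stream, and the helper names are assumptions for this example, not part of the original text. Each function simply answers, response by response, whether a reinforcer would be delivered.

import random
random.seed(1)

def fixed_ratio(n_responses, ratio=5):
    # FR-5: reinforce every 5th response.
    return [(i + 1) % ratio == 0 for i in range(n_responses)]

def variable_ratio(n_responses, mean_ratio=5):
    # VR-5: each response is reinforced with probability 1/5, so on average
    # every 5th response pays off, but the exact count varies unpredictably.
    return [random.random() < 1.0 / mean_ratio for _ in range(n_responses)]

def fixed_interval(response_times, interval=30.0):
    # FI-30: only the first response after each fixed 30-second interval is reinforced.
    reinforced, next_available = [], interval
    for t in response_times:
        if t >= next_available:
            reinforced.append(True)
            next_available = t + interval
        else:
            reinforced.append(False)
    return reinforced

def variable_interval(response_times, mean_interval=30.0):
    # VI-30: like FI, but each waiting time is drawn around a 30-second average.
    reinforced, next_available = [], random.expovariate(1.0 / mean_interval)
    for t in response_times:
        if t >= next_available:
            reinforced.append(True)
            next_available = t + random.expovariate(1.0 / mean_interval)
        else:
            reinforced.append(False)
    return reinforced

# Twenty responses, one every 4 seconds, just to exercise the four rules.
times = [4.0 * (i + 1) for i in range(20)]
print(sum(fixed_ratio(20)), sum(variable_ratio(20)),
      sum(fixed_interval(times)), sum(variable_interval(times)))

The structural difference is visible in the code: ratio schedules count responses, so responding faster earns rewards faster, whereas interval schedules only require one response after a time window opens – which is why ratio schedules tend to generate much higher response rates.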

An important consequence of partial reinforcement is the partial reinforcement extinction effect. Behaviors that have been maintained
on partial reinforcement schedules tend to be more resistant to extinction
than those on continuous reinforcement. The organism has learned that
persistence sometimes pays off (since not every response was previously
rewarded), so when reinforcement stops, it does not immediately quit
responding. In contrast, an animal that always got a reward for the
behavior quickly notices the change when rewards cease and stops
responding sooner. This is why habits formed by intermittent rewards (like
gambling, or a child’s tantrums being occasionally successful) can be hard
to break.

Extinction in Operant Conditioning: Similar to classical conditioning, operant behaviors can undergo extinction if the reinforcement that
maintained them is removed. Extinction in operant conditioning means
that a previously reinforced behavior is no longer followed by its reward or
consequence, and as a result, the behavior gradually diminishes. For
example, if a vending machine normally delivers a snack (reinforcement)
when you insert money and press a button, but one day it malfunctions
(no reward is delivered), you might press the button a few more times out
of frustration but eventually stop trying. The behavior of pressing the
button will decline if the reward is consistently absent. A notable effect
during operant extinction is the extinction burst – often, there is an
initial increase in the behavior’s frequency or intensity when the expected
reinforcement is first removed. The organism “tries harder” at first,
perhaps pressing the lever rapidly or with more force, as if attempting to
make the usual reward appear. If no reward comes, eventually the
behavior falls off. Over time, without reinforcement, the learned behavior
may disappear or revert to baseline levels. As with classical extinction,
however, a behavior extinguished in operant conditioning can show
spontaneous recovery after some time has passed, or it can return quickly
if the reinforcement contingency is reinstated (relearning tends to be
faster than original learning).

Operant conditioning thus provides a framework for understanding how consequences shape voluntary actions. It has immense practical
applications: from animal training to parenting techniques, from education
to behavior therapy. In education, for instance, providing positive
reinforcement (like praise or good grades) for studying and assignment
completion can motivate students to engage more with learning
materials. In the workplace, performance-based bonuses (positive
reinforcement) or loss of perks for poor performance (negative
punishment) are used to influence employee behavior. Behavior
modification programs sometimes use token economies, where tokens are
earned for good behavior and later exchanged for rewards (an application
of reinforcement principles).

From a neurobiological perspective, operant conditioning is supported by reward circuits in the brain. Dopamine neurons in the midbrain
(particularly in the ventral tegmental area, projecting to the nucleus
accumbens in the basal ganglia) are known to fire in response to
unexpected rewards and to decrease firing when an expected reward fails
to occur. These neural signals act as “prediction error” signals –
essentially the brain’s way of encoding the surprise or discrepancy
between expected and received outcomes. Such signals can strengthen
synaptic connections for actions that lead to better-than-expected results
(akin to positive reinforcement) and weaken connections for those that
lead to worse-than-expected outcomes. Over time, this reward-feedback
system in the basal ganglia helps establish habits and learned action
patterns, providing a neural implementation of the law of effect. Thus,
operant learning is not just a behavioral phenomenon but also a biological
one, with the brain’s chemistry and neural plasticity underlying the
changes in behavior we observe.
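
As a rough illustration of how such a prediction-error signal can implement the law of effect, consider the toy sketch below. It is not a model from this text: the two actions, their reward probabilities, the learning rate, and the choice rule are all invented for the example. It shows only that actions followed by better-than-expected outcomes acquire higher value and are therefore chosen more often.

import random
random.seed(0)

# Two possible actions with different (assumed) chances of producing a reward.
reward_prob = {"press_lever": 0.8, "groom": 0.1}
value = {"press_lever": 0.0, "groom": 0.0}   # learned action values
alpha = 0.1                                   # learning rate

for trial in range(500):
    # Mostly choose the currently higher-valued action, occasionally explore.
    if random.random() < 0.1:
        action = random.choice(list(value))
    else:
        action = max(value, key=value.get)
    reward = 1.0 if random.random() < reward_prob[action] else 0.0
    # Prediction-error update: the value moves toward the obtained outcome.
    value[action] += alpha * (reward - value[action])

print(value)  # lever pressing ends up with the much higher value

The update term (reward − value) plays the role of the prediction-error signal described above: positive when the outcome is better than expected, negative when it is worse.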

III. Cognitive Aspects of Learning


By the mid-20th century, psychologists began to recognize that pure
behaviorist accounts of learning (focused only on observable stimuli and
responses) were insufficient to explain all learning phenomena. Internal
cognitive processes – such as expectations, mental representations,
attention, and memory – play critical roles in how organisms learn.
Several classic findings challenged the behaviorist idea that learning is
just a mindless strengthening of responses:

Cognitive Challenges to Behaviorism: Latent Learning, Insight, Cognitive Maps, and Expectancy

One challenge came from studies of latent learning by Edward C. Tolman. In contrast to the behaviorist assumption that reinforcement is
necessary for learning, Tolman demonstrated that animals can learn
about their environment even without rewards. In a famous 1930s
experiment, Tolman and Honzik placed rats in a maze without any reward
at the exit for several days. These rats wandered through the maze during
practice runs but had no obvious incentive to reach the goal. Later, when
a food reward was finally introduced, these rats quickly demonstrated
they knew the maze layout – in fact, they ran to the goal as efficiently as
rats that had been rewarded every time. The previously unrewarded rats
had apparently formed a mental map of the maze during exploration,
even though this learning remained “latent” (hidden) until a motivation
(food) was present. Tolman coined the term cognitive map for the
internal representation the rats formed of the spatial layout. This finding
showed that learning can occur without an explicit behavioral
demonstration and without reinforcement, undermining the idea that
reward is necessary for learning to happen. The rats behaved as if they
had expectations or knowledge about the maze that they could exploit
when needed.

Tolman’s interpretation was that animals (and people) are not just passive
recipients of stimuli but are actively processing information, exploring,
and constructing knowledge. In the maze example, the rats learned the
maze cognitively (developing knowledge of paths and turns) rather than
merely forming stimulus-response chains. When conditions changed
(introduction of food), they could use that knowledge flexibly – for
example, if their preferred path was blocked, they could take an alternate
route, implying a high-level “map-like” understanding rather than just a
reflexive sequence of turns.

Another challenge to simple S–R (stimulus–response) learning came from insight learning observed by Wolfgang Köhler in the 1920s with
chimpanzees. Köhler presented chimps with problems such as obtaining
an out-of-reach banana by stacking crates or using a stick. Rather than
gradually learning through trial and error, the chimps often appeared to
have a sudden flash of insight – they would sit quietly, seemingly ponder
the problem, and then abruptly execute a solution (for example, stacking
two boxes and climbing up to grab the banana) without incremental
approximations. This “aha!” moment suggested that the chimps were
mentally analyzing the situation. Köhler’s work (published in The Mentality
of Apes, 1925) argued that animals could solve problems through
understanding relationships (like “the stick can be used to pull the banana
closer”) rather than through reinforced practice of each small step. Insight
learning implies that the organism internally restructures the problem – a
cognitive process – and that learning can be a sudden reorganization of
thoughts, not just a gradual strengthening of correct responses. This was
hard to explain with behaviorist principles, which would predict more
incremental learning shaped by reinforcement. Köhler’s findings,
alongside Tolman’s, indicated that something more cognitive was at play
in learning behavior.

Tolman summarized his view by saying learning is the acquisition of
“cognitive expectancies.” Indeed, even in classical conditioning, later
researchers like Robert Rescorla showed that it’s not merely contiguity
(the CS and UCS occurring together) that determines learning, but
contingency – the informational value of the CS about the UCS.
Rescorla’s experiments in the late 1960s demonstrated that if a CS is
occasionally paired with a shock but also shocks occur at other times
randomly, conditioning to the CS is weak. However, if the same number of
CS-shock pairings are arranged in such a way that almost all shocks are
signaled by the CS (i.e. the CS is a reliable predictor of shock and rarely
occurs alone), strong conditioning occurs. This suggests the animal is
computing the predictive relationship: learning only occurs when the CS
provides new information about the likelihood of the UCS. The animal
behaves as if it has an expectation that shock will follow the CS when that
relationship is consistent, and little expectation (and thus little fear) when
the CS is not informative. Simple pairing frequency cannot explain this
difference; rather, the probability of the UCS given the CS vs. the UCS in
absence of the CS (the contingency) is critical.
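
In symbols (notation added here for clarity; it is not spelled out in the original text), this contingency is often written as ΔP = P(UCS | CS) − P(UCS | no CS). Conditioning is strong when ΔP is clearly positive (the CS signals an increased likelihood of the UCS), weak or absent when ΔP is near zero (the CS is uninformative), and the CS can even become inhibitory when ΔP is negative (the CS signals that the UCS is less likely).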

Building on such findings, Rescorla and Wagner (1972) proposed an influential expectancy model of classical conditioning – essentially a
mathematical model that formalized how animals adjust their
expectations with each learning trial. The Rescorla–Wagner model
posits that learning is driven by the difference between the expected
outcome and the actual outcome (a prediction error). On each pairing of
CS and UCS, the associative strength of the CS is increased or decreased
in proportion to how surprising the UCS is (surprising meaning it differed
from what was predicted by all cues present). If the UCS is larger or more
present than expected, the prediction error is positive and the CS gains
associative strength (excitatory conditioning); if the UCS is absent or
smaller than expected, the prediction error is negative and the CS loses
associative strength (possibly becoming inhibitory). Over trials, the CS
comes to accurately predict the UCS and the prediction error falls to zero,
at which point learning reaches asymptote. This model elegantly
explained phenomena like blocking: when CS1 has already been learned
to predict the UCS, the UCS is no longer surprising, so when a new
stimulus CS2 is added, there is minimal prediction error left for CS2 to
acquire strength – hence CS2 is “blocked” from becoming a good
predictor. In short, the Rescorla–Wagner theory treats the learner as if it is
doing a simple computation to update expectations about the world. This
is a clear departure from the original Pavlovian notion of conditioning as a
mechanical strengthening of reflex pathways; it implies the organism is
tracking outcomes and making a kind of primitive prediction.
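
To make the update rule concrete, here is a minimal sketch of a Rescorla–Wagner simulation applied to a blocking experiment. The parameter values (learning rates, UCS magnitude, trial counts) and the variable names are illustrative assumptions, not values from the original text; the structure of the update, however, follows the model described above.

def rw_trial(V, cues, lamda, alpha=0.3, beta=1.0):
    # Prediction error = UCS magnitude (lamda; 0 if the UCS is omitted)
    # minus the summed associative strength of all cues present.
    error = lamda - sum(V[c] for c in cues)
    for c in cues:
        V[c] += alpha * beta * error   # every cue present shares the update
    return V

V = {"tone": 0.0, "light": 0.0}

# Phase 1: tone alone is paired with the UCS until it predicts it well.
for _ in range(30):
    rw_trial(V, ["tone"], lamda=1.0)

# Phase 2: tone + light compound is paired with the same UCS.
for _ in range(30):
    rw_trial(V, ["tone", "light"], lamda=1.0)

print(V)  # tone ends near 1.0; light stays near 0.0 – blocking

Because the tone already predicts the UCS by the start of phase 2, the prediction error on compound trials is close to zero, so the light gains almost no associative strength – exactly the blocking result described earlier.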

The idea of animals forming expectations connects back to behavior: even the form of a conditioned response can sometimes reflect an expectation.
For example, if a tone is paired with food, a dog might eventually
approach the place where food is normally delivered or start salivating (as
if anticipating food), whereas if a tone is paired with a shock, a rat might
freeze (an adaptive response preparing for a painful event). These
responses make sense in light of what the animal expects next (food or
pain) rather than being direct reflexes to the tone itself. Such
observations reinforced the view that learning involves knowledge about
relationships (S–S associations, stimulus-stimulus, like tone means food)
and expectations, not just new reflexive S–R bonds.

In summary, latent learning showed that learning and performance are not the same (learning can occur without immediate performance until
motivation is present), insight learning showed that not all learning is
incremental trial-and-error (some problems are solved via mental
reorganization), and the importance of contingency and models like
Rescorla–Wagner demonstrated that even in simple conditioning,
organisms behave as if they are tracking probabilities and outcomes.
These cognitive aspects led to a broader understanding of learning as an
information-processing activity. They paved the way for cognitive
learning theories, which incorporate concepts like goals, plans, and
representations. Even behaviorists had to concede that internal
processes, while not directly observable, could be inferred from behavior
and were necessary to explain complex learning.

Attention and Memory in Learning: The Role of Cognitive Processes

As the cognitive revolution took hold, researchers examined how attention and memory processes intertwine with learning. It became clear that what
an organism learns is profoundly influenced by what it pays attention to
and how it encodes and retains information.

Selective Attention in Learning: Learning often occurs in environments with multiple stimuli, yet not all stimuli are equally learned
about – animals and people focus on certain cues at the expense of
others. This is where selective attention comes in. The concept of
selective attention in learning suggests that learners pick and choose
which aspects of the environment to associate, based on salience and
relevance. For example, in a classical conditioning situation with a
compound stimulus (like a light and a tone together paired with a shock),
we saw with overshadowing that the more salient stimulus grabs the
learning. One interpretation is attentional: the animal’s limited processing
capacity is mostly devoted to the salient cue, so the other cue is scarcely
learned about. In operant conditioning, too, an animal might only pay
attention to certain signals that indicate when its behavior will be
rewarded (called discriminative stimuli). If a light signals that lever
presses will now be reinforced (light on = reward available, light off = no
reward), the rat learns to press only when the light is on – it has learned to
discriminate based on that light, effectively focusing its attention on the
relevant context for reward.

Attention can be seen as the gatekeeper of learning – events that are
attended to are more likely to be associated and remembered, while
unattended or deemed-irrelevant stimuli are ignored. The phenomenon of
latent inhibition, discussed earlier, can also be viewed through an
attentional lens: a pre-exposed stimulus (tone that occurred repeatedly
with no consequence) might lose its novelty and the animal may stop
paying much attention to it, so when conditioning is attempted with that
tone, learning is slow because the animal must first “re-attend” to the
stimulus. Some learning theories explicitly incorporate attention,
suggesting that animals might dynamically adjust which cues they attend
to. If one cue proves to be a good predictor of an outcome, the animal will
pay more attention to it in the future; conversely, if a cue is unreliable, the
animal will pay less attention to it. This helps explain why blocking occurs
(the animal is already focused on the established cue, and may not even
process the new cue much) and why surprising events can redirect
attention. In human learning, the importance of attention is evident:
students who focus on relevant parts of the material learn better than
those who are distracted by irrelevant information. Effective teaching and
training often involve guiding the learner’s attention to the key aspects of
a task.

Working Memory and Learning: Working memory refers to the system that actively holds and manipulates information in mind over short periods
(usually a few seconds) – essentially, it’s the brain’s “scratchpad” or
mental workspace. It is limited in capacity (often cited as able to hold
about 7±2 items, as per George Miller’s classic finding) and in duration
(without active maintenance, information fades within half a minute or
so). Working memory is crucial for many learning tasks, especially those
that involve reasoning, multi-step procedures, or integrating information.
For example, when learning to solve a math problem, a student must hold
some numbers and intermediate results in mind; when learning a
sentence in a new language, a person relies on working memory to hold
the beginning of the sentence while parsing the rest. If working memory
capacity is exceeded or if attention strays, the chain of thought can break,
and learning the solution or sentence may fail.

Working memory and attention are closely intertwined – we tend to keep in working memory what we’re actively focusing on. A learner with a
greater working memory capacity can juggle more pieces of information
simultaneously, which can facilitate understanding complex relationships.
However, if a task overwhelms working memory (too many things to keep
track of at once), learning can be hampered. This is why breaking
information into smaller “chunks” can greatly aid learning – chunking
effectively packages multiple elements into a single unit, reducing the
load on working memory. For instance, a long sequence of actions can be
grouped into sub-routines so that the learner can remember “do sub-task
A, then sub-task B” instead of a dozen individual steps.

Cognitive strategies like rehearsal (mentally repeating information),
elaboration (explaining or connecting new information to what one
already knows), and use of mnemonic devices all serve to manage the
limitations of working memory and promote better encoding into long-
term memory. The concept of cognitive load in instructional design
stems from this: learning material should be presented in ways that do
not overload the learner’s working memory. As learners become more
proficient, processes that initially required conscious working memory can
become automatic (offloaded to long-term procedural memory), thereby
freeing up mental resources for new learning – this is part of the expertise
development process.

Memory Consolidation: Learning is only useful if it is retained, and this brings in the critical role of memory consolidation. When we first learn
something (be it a fact, a procedure, or an association), the new memory
is initially unstable and susceptible to disruption. Memory consolidation
is the process by which fragile short-term memories are transformed into
stable long-term memories. In the brain, this involves physiological and
structural changes. At the cellular level, repeated neural activity can
strengthen synapses – a phenomenon known as long-term potentiation
(LTP). As one popular phrase puts it, “neurons that fire together, wire
together,” meaning that when a set of neurons repeatedly activates in
synchrony (such as when a particular stimulus and response occur
together, or when rehearsing a piece of information), their connections
become more efficient. These synaptic changes are believed to play a
major role in learning and memory processes. When two neurons fire at
the same time repeatedly, they become more likely to fire together in the
future; eventually, the connection between them is strengthened.

Consolidation is not instantaneous; it can take time – on the order of hours to days or even longer – for memories to solidify. During this window,
interference can disrupt learning. For example, in classic experiments,
giving animals electroconvulsive shocks or drugs that inhibit neural
activity shortly after learning an avoidance task can cause them to forget
the task, presumably by disrupting consolidation. In everyday experience,
if we cram information and then immediately flood our brain with a bunch
of new inputs, we might not retain what we studied. Sleep has been
found to play a remarkable role in consolidation: many studies show that
after learning, a period of sleep (especially certain sleep stages like slow-
wave sleep and REM sleep) enhances memory retention. During sleep, the
brain appears to reactivate and reorganize memory traces – the
hippocampus (a region deeply involved in forming new declarative
memories) “replays” activity patterns from recent experiences, which may
help strengthen those memories and gradually integrate them into the
cortex for long-term storage.

Memory consolidation isn’t just about making memories permanent; it can also involve transforming memories. Over time, some details may be
lost or generalized. When we recall a memory, it can become briefly
unstable again (a process called reconsolidation upon re-storage),
during which it might be updated or altered before settling again. This
indicates that our memories are continually being constructed and
reconstructed, not merely filed away like fixed recordings.

For learning skills or habits (procedural memories), consolidation often involves the reorganization of brain activity to become more efficient.
Early in learning a motor skill, many brain regions (including prefrontal
cortex, which handles conscious control, and motor areas) are highly
engaged and working hard; after consolidation and practice, the skill
execution becomes more localized to specific motor circuits (including the
basal ganglia and cerebellum) and requires less conscious oversight.
Essentially, with consolidation and practice, what was effortful can
become automatic.

In educational contexts, understanding consolidation leads to strategies like spaced repetition – spacing out study sessions over time allows for
reconsolidation and strengthening of the memory trace with each review,
leveraging the brain’s natural consolidation processes, as opposed to
massing study in one session (which might lead to a lot being forgotten if
not consolidated properly).

Bringing attention and memory together: effective learning requires the learner to attend to the right information (so that it is encoded) and to
engage working memory in processing that information deeply (rather
than superficially), and then consolidation mechanisms must take over to
store that learning for the long term. Cognitive psychology and
neuroscience thus provide a fuller picture of learning: not just the external
pairing of stimuli or responses with outcomes, but the internal handling of
information – what we focus on, how we hold it in mind, and how our
brains physically change to lock in the new knowledge. Modern learning
theories often integrate these elements, acknowledging that to truly
understand learning, one must consider the mind and brain in addition to
behavior.

IV. Motor Skill Learning and Procedural Memory
Not all learning is about stimuli and rewards; much of what we learn
involves skills – from tying shoelaces to playing a musical instrument.
Motor skill learning refers to the process by which movements, initially
difficult or clumsy, become smoother, faster, and more precise with
practice. This form of learning is closely tied to procedural memory,
which is memory for how to do things (the “know-how”), often acquired
implicitly. Researchers studying skill acquisition have identified stages
through which learners progress and have proposed theories about how
the brain encodes motor routines.

Stages of Skill Acquisition: Cognitive, Associative, Autonomous

A classic framework by Paul Fitts and Michael Posner (1967) divides skill
learning into three stages:

 Cognitive Stage: In the initial stage of learning a new skill, performance is consciously controlled and often slow and error-
prone. The learner is intellectually trying to understand what to do.
This phase involves a lot of thinking, explanation, and often
verbalizable knowledge. For instance, when someone is first
learning to drive a car, they have to think about every action (“Now
I need to press the clutch, shift to second gear, slowly release the
clutch while pressing the gas…”). Movements in the cognitive stage
are typically awkward; the learner makes frequent mistakes and
may need to break the task into sub-tasks. Feedback is crucial in
this phase – the learner uses external guidance or trial-and-error
outcomes to adjust their understanding of the task. Because
attention is fully engaged in performing the skill, it’s hard to focus
on anything else simultaneously. The cognitive stage is essentially
performance by following rules or instructions, and a lot of the
variability in performance comes from figuring out the strategy or
technique.
 Associative Stage: With practice, the learner enters a middle
phase where actions become more refined and efficient. The basic
patterns have been learned, and now the individual is concentrating
on how to do the movements better. Errors decrease, and
consistency improves. In this associative stage, the performer is
associating particular actions with successful outcomes –
strengthening the connections and getting rid of ineffective
movements. There is less need to think through each step; some
sub-tasks begin to be handled automatically while others still
require attention. Using the driving example, shifting gears or
steering becomes smoother as the driver no longer has to verbalize
each tiny action, but they still must pay attention, especially in
complex traffic situations. Performance feedback (either external or
self-monitoring) continues to be important, but the learner now
often can detect and correct some of their own errors in real time.
Motor memory is forming: the sequence of actions is becoming
ingrained, though not fully automatic yet. Reaction times improve
and the learner can start to focus on higher-order aspects (like
driving strategically rather than the mechanics of operating the car).
 Autonomous Stage: After extensive practice, the skill may reach a
stage of automation. In the autonomous stage, the performance
of the skill becomes largely automatic, rapid, and effortless,
requiring minimal conscious attention. The individual can perform
the movement sequences reliably even while possibly attending to
other things. For example, experienced drivers can execute the act
of driving (shifting gears, operating pedals, steering) without much
conscious thought, freeing their attention to plan the route or
respond to novel road conditions. Errors are few and usually minor
adjustments. The movement patterns have been fine-tuned to be
efficient – extraneous motions are eliminated, and the person’s
technique is optimized through practice. At this stage,
improvements still occur but are more gradual and subtle, as the
performer is already very proficient; they might refine timing or
adapt to special situations. The hallmark of the autonomous stage is
that the skill execution is now ingrained in procedural memory – the
person might even find it difficult to articulate exactly how they
perform the skill, because it’s become intuitive or muscle memory.

Progression through these stages is a continuum; not every skill achieves a fully autonomous stage (especially if practice is limited), but this model
captures how learning transforms a task from a heavily cognitive
endeavor to a largely automatic routine. It also explains why beginners
and experts think differently about tasks: the beginner relies on explicit
instructions and feedback, whereas the expert has internalized the task to
the point of intuition. Once in the autonomous stage, dual-task
performance becomes easier (e.g., an expert can carry a conversation
while driving, something a novice could hardly do). It’s important to note
that if an autonomous skill is not practiced for a long time, performance
might slip (a phenomenon known as skill decay or “use it or lose it”), but
relearning is usually faster than initial learning because some memory
traces remain.

Motor Programs and Schemas

To understand how skills become automatic, researchers introduced the
concept of motor programs. A motor program is an abstract
representation in the nervous system that defines a sequence of
movements. When you execute a learned motor program, a series of
muscle commands unfolds in a structured, pre-planned way without
requiring conscious guidance for each individual action. For instance,
when a skilled typist types a familiar word, they don’t plan each finger
movement letter-by-letter; instead, they execute a program for the whole
sequence of keystrokes in that word. Early theories of motor control
suggested that once learned, a specific movement sequence is stored as
a program that can be run off when triggered.

However, a challenge arises: if every possible movement had its own
specific program stored, that would be extremely inefficient and inflexible.
How do we produce new variations of a movement (like throwing a ball
different distances, or saying a word at different speeds or volumes)
without having practiced each exact variant? The answer came with
Richard Schmidt’s Schema Theory (1975), which proposed that what is
learned in motor skill acquisition is not a vast library of specific
movements, but rather general rules or schemas for generating
movements. According to schema theory, the brain acquires a
generalized motor program for a class of actions (say, the act of
throwing) and along with it develops schemas that relate the parameters
of the action to the outcomes. Each time you perform the action, you get
feedback (e.g., how far the object went, how accurate the throw was) and
that trial updates your schema.

In practice, this means when you face a new situation – you want to throw
a ball 20 meters whereas you’ve only practiced throws to 10 meters – you
don’t need an entirely new motor program. Instead, your brain adjusts
parameters (like force and angle) based on the learned schema to achieve
the desired outcome. Schmidt described two key schemas:

 The recall schema: used to generate the movement, it is formed
by combining past outcomes with past parameters. Essentially, it
answers, “Given my goal, what parameters should I set for the
motor program to achieve it?” As one practices, one builds a record
of, for example, how much force leads to how far a ball travels.
When a new distance is desired, the recall schema
interpolates/extrapolates the appropriate force (a minimal
computational sketch of both schemas follows this list).
 The recognition schema: used to evaluate movement outcomes, it
predicts the expected sensory consequences of a movement. For
example, based on the chosen parameters, one might expect a certain
feeling in the arm or expect to see the ball travel along a certain
trajectory; if the actual outcome deviates, that error signal can
update the schemas.
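
To make the two schemas concrete, here is a minimal sketch in Python. It
is only an illustration, not part of Schmidt’s formal theory: it assumes
the outcome (throw distance) is roughly proportional to a single force
parameter, and the names (gain, recall, recognition, learning_rate) and
numbers are invented for the example.

    # Minimal sketch of Schmidt-style recall and recognition schemas.
    # Assumption: throw distance is roughly proportional to the force
    # parameter, so the learned rule reduces to a single gain term.

    practiced = [(2.0, 5.1), (3.0, 7.4), (4.0, 10.2)]  # (force, observed distance)

    # Recall schema: summarize past parameter-outcome pairs as a gain.
    gain = sum(dist / force for force, dist in practiced) / len(practiced)

    def recall(target_distance, gain):
        """Pick the force parameter expected to produce the desired distance."""
        return target_distance / gain

    def recognition(force, gain):
        """Predict the sensory outcome (distance) expected for a given force."""
        return force * gain

    # A new goal that was never practiced directly: throw 20 meters.
    force = recall(20.0, gain)
    expected = recognition(force, gain)

    # After the throw, the gap between expected and actual outcome can
    # update the schema (here, a small proportional correction).
    actual = 18.5            # hypothetical observed distance
    learning_rate = 0.1
    gain += learning_rate * (actual - expected) / force

    print(f"force: {force:.2f}  expected: {expected:.1f} m  updated gain: {gain:.3f}")

In this picture, variable practice simply supplies the schema with a
wider and more reliable set of parameter-outcome pairs to generalize
from, which is one way to read the benefit described below.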

Through these schemas, the performer can correct errors and refine the
generalized motor program even for new conditions. A practical upshot is
that variable practice (practicing a variety of conditions, say throwing at
various targets) can strengthen the schema more than repetitive practice
of one exact movement. Variable practice teaches the learner how to
adjust parameters flexibly, leading to better performance in novel
situations – a principle that has been confirmed in many motor learning
experiments.

Thus, motor programs give a structured template for a skill, and schemas
provide the adjustable rules to adapt that template to specific
circumstances. This accounts for both the efficiency of skilled movement
and its flexibility. By the time a skill is in the autonomous stage, the
performer has a well-honed generalized motor program and robust
schemas, allowing them to perform quickly and adapt when needed with
minimal conscious deliberation.

Neural Basis of Motor Learning: Cerebellum, Basal Ganglia, and Motor Cortex

Motor skill learning is underpinned by complex changes in the brain,
engaging multiple regions. Both cortical and subcortical structures are
involved in acquiring and storing procedural memories:
 Motor Cortex: The primary motor cortex (and connected premotor
areas) is the region of the brain that directly controls voluntary
muscle movements. During the learning of a new motor skill, the
motor cortex is highly engaged as it refines the patterns of neural
firing needed to produce the desired movements. As practice
continues, the representation of the movements in motor cortex can
reorganize – a phenomenon known as cortical plasticity. For
example, studies have shown that when people learn fine motor
tasks (like a sequence of finger taps), the area of motor cortex
representing those fingers can expand and the coordination
between neurons becomes more efficient. In essence, the motor
cortex “tunes” itself to better drive the muscles for the practiced
task. In early learning, a broad network including motor cortex,
sensory cortex (for feedback), and even prefrontal cortex (for
attention and decision-making) is active. With practice, the
activation becomes more focused in motor regions, indicating that
the control of the skill is being handed over to dedicated motor
circuits. Highly skilled performers often show less overall brain
activation for the task than novices – a sign that their brains
execute the task more efficiently, using just the necessary circuits
(which fire strongly and reliably) with less extraneous activity.
 Cerebellum: The cerebellum is a brain structure at the back of the
head (behind the brainstem) critical for motor coordination, timing,
and error correction. It has long been recognized as essential for
motor learning, particularly for calibrating movements and learning
precise timings. The cerebellum receives copies of motor commands
from the cortex as well as sensory feedback from the movement
outcome; it compares the intended movement with what actually
happened. If there’s a discrepancy (an error), the cerebellum helps
generate corrective adjustments. Over repeated practice, these
corrections get incorporated so that feedforward commands
improve. A classic example of cerebellar involvement is in
adaptation tasks: if you wear prism glasses that shift your visual
field and try to throw darts at a target, initially you miss due to the
distortion. With practice, the cerebellum contributes to adapting
your throws to compensate for the shift (learning to adjust your
aim). If the glasses are removed, you initially err in the opposite
direction (an “aftereffect”), reflecting the cerebellum’s learned
adjustment, which then readapts back to normal (a simple error-driven
sketch of this adaptation appears after this list). Individuals with
cerebellar damage struggle with such adaptation tasks; they also
have difficulty with skills requiring fine timing (like tapping rhythms)
and tend to produce erratic, uncoordinated movements (ataxia),
indicating that the cerebellum failed to properly fine-tune the motor
commands. In classical conditioning, the cerebellum is known to be
the key site for learning in tasks like the eyeblink conditioning
paradigm (pairing a tone with an airpuff to the eye); lesions in
specific deep cerebellar nuclei prevent the conditioned eyeblink
response from being learned. All this illustrates that the cerebellum
acts as a specialized learning machine for motor calibration and
certain types of associative learning, using powerful plasticity
mechanisms to adjust neural output for improved motor
performance.
 Basal Ganglia: The basal ganglia are a group of nuclei deep in the
forebrain that are heavily involved in initiating movements, forming
habits, and stimulus-response learning. They are central to
procedural learning – the gradual learning of habits or skills
through repetition and reward feedback. During operant
conditioning or trial-and-error learning of actions, the basal ganglia
(in particular, the striatum) integrate information about the context,
the action, and the outcome. Dopamine signals from the midbrain
(which we discussed in operant conditioning) heavily innervate the
basal ganglia, and these signals reinforce the neural patterns
corresponding to successful actions. Over time, actions that are
consistently rewarded become more automatic and tied to
contextual cues – essentially forming habits – and this involves a
restructuring of the basal ganglia’s circuitry. A well-known insight into
basal ganglia function comes from Parkinson’s disease, where
dopamine-producing neurons degenerate: patients have trouble
initiating movements and learning new motor tasks, but their
declarative memory can be intact. Conversely, patients with certain
types of amnesia (hippocampal damage) can learn motor skills
normally while not remembering the training sessions – reflecting
that the basal ganglia (and cerebellum) can support skill learning
independently of the brain systems for conscious memory. The
basal ganglia are thought to store the “chunks” of action sequences
that make up a habit; when a context cue appears, the basal
ganglia help trigger the stored sequence without need for conscious
assembly. For example, an experienced pianist, upon seeing a
familiar piece of sheet music, can execute well-practiced passages
with little thought to each note – the sequences have become
ingrained, in part via basal ganglia circuits.
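
The cerebellum’s contribution to adaptation can be caricatured as
iterative, error-driven correction of a feedforward command. The toy
Python sketch below is not a model of actual cerebellar circuitry; the
shift size, learning rate, and trial counts are arbitrary assumptions
chosen only to reproduce the adaptation and aftereffect pattern
described in the cerebellum entry above.

    # Toy model of prism adaptation: an internal correction term is
    # updated from the error on each throw (a stand-in for what the
    # cerebellum is thought to learn).

    def throw_session(n_trials, prism_shift, correction, learning_rate=0.3):
        """Simulate dart throws at a target at position 0 under a visual shift."""
        for _ in range(n_trials):
            landing = prism_shift - correction   # where the dart lands
            error = landing - 0.0                # miss relative to the target
            correction += learning_rate * error  # fold the error into the command
        return correction

    # Phase 1: put on prism glasses that shift the visual field by 10 degrees.
    correction = throw_session(n_trials=20, prism_shift=10.0, correction=0.0)
    print(f"learned correction after adaptation: {correction:.1f} degrees")

    # Phase 2: remove the glasses; the leftover correction now causes misses
    # in the opposite direction (the aftereffect), then washes out with practice.
    print(f"first miss without glasses: {0.0 - correction:.1f} degrees")
    correction = throw_session(n_trials=20, prism_shift=0.0, correction=correction)
    print(f"correction after readaptation: {correction:.1f} degrees")

In this caricature, cerebellar damage behaves like a learning rate near
zero: the error never gets folded back into the command, which is one
way to read the finding that such patients fail to adapt.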

These three systems – motor cortex, cerebellum, and basal ganglia – work
together during skill acquisition. Early in learning, the cortex is heavily
engaged and the cerebellum is busy processing errors. Later, as
movements become reliable, the basal ganglia help chunk and automate
the sequences, the cerebellum’s error corrections diminish once
performance stabilizes, and the motor cortex representation becomes
streamlined for efficient execution. Of course, other areas such as the
premotor cortex (involved in planning movements and integrating sensory
cues) and parietal cortex (spatial processing) also play roles, but the trio
above are key players in procedural memory.

In summary, motor skill learning transforms deliberate actions into
ingrained routines. At a behavioral level, we see this in the progression
from clumsy to skilled, and at a cognitive level, in the shift from conscious
to automatic control. At a neural level, this corresponds to the distribution
of task control across different brain circuits – from cortical and cognitive
involvement to subcortical and streamlined motor representation. This
multifaceted understanding has practical implications: for example,
training regimens that gradually reduce feedback frequency exploit the
characteristics of the autonomous stage, variable practice strengthens schemas
for better generalization, and therapies for motor disorders often try to
engage the remaining healthy parts of these neural systems to relearn or
compensate for lost skills.

V. Social and Observational Learning


Learning does not occur only through direct experience; often we learn by
observing the behavior of others. Observational learning, also called
social learning, occurs when an individual acquires new behaviors or
information by watching others (called models) and the consequences
those models experience. This form of learning was famously
demonstrated by Albert Bandura in a series of studies in the 1960s on
children’s imitation of aggression. In the classic “Bobo doll” experiment,
children watched an adult model behave aggressively toward an inflatable
Bobo doll (for example, hitting it with a mallet and shouting). Later, when
given a chance to play in a room with the same doll, those children
imitated much of the aggressive behavior they had observed, often even
using the same actions and words. Importantly, Bandura showed that if
children saw the model rewarded for the aggressive behavior (for
instance, praised or given candy), they were even more likely to imitate it;
if they saw the model punished, they were less likely to reproduce the
aggression. Yet, even those who didn’t initially act out the aggression had
clearly learned it – if later incentivized (asked to demonstrate what the
adult did, or offered a reward), they could reproduce the behavior. This
showed a clear distinction between learning and performance: the
children had learned the behavior by observation regardless of whether
they performed it immediately.

From these findings Bandura formulated his Social Learning Theory,
which emphasizes the role of observational learning, imitation, and
modeling in human behavior. According to Bandura, four key processes
are at work:

 Attention: One must first pay attention to the model’s behavior
and its consequences. Various factors influence attention, such as
the model’s attractiveness, status, similarity to the observer, or the
salience of the behavior. Children, for example, are very attentive to
the actions of adults or peers they admire.
 Retention: The observed behaviors must be encoded in memory so
that they can be recalled. This often means forming a mental
representation of the behavior (either as images or verbal
descriptions). If an observer cannot remember what the model did,
they can’t imitate it later. Bandura noted that we sometimes
rehearse or mentally practice what we’ve seen to help retain it.
 Reproduction: The observer must have the ability to reproduce the
behavior. This involves converting the remembered pattern into
action. One’s physical capabilities or skills play a role here – for
instance, a child might remember how a basketball player
performed a complex maneuver but may not yet be able to execute
it. Through practice of the component actions (and sometimes
through feedback from the environment or others), the observer can
improve their ability to replicate the behavior.
 Motivation: Finally, there must be a reason or motivation to
perform the observed behavior. Even if something is learned and
one is capable of doing it, one might not perform it without
incentive. Motivation can come from direct reinforcement or
punishment (will I be rewarded or punished if I do this?), but interestingly, in
observational learning it often comes from vicarious
reinforcement or vicarious punishment – seeing someone else
get rewarded or punished can affect the observer’s willingness to do
the same. If a child observes another student being praised by the
teacher for asking questions, the child may be more motivated to
ask questions themselves, expecting similar praise. Conversely,
seeing someone scolded for a behavior can deter the observer from
that behavior (a toy numerical illustration of this vicarious updating
follows this list).
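
As a toy illustration of vicarious reinforcement, the Python sketch
below has an observer nudge their expected value of a behavior toward
the outcome they saw a model receive, scaled by how much weight the
model carries. The update rule, the weighting factor, and all names and
numbers are assumptions added here for illustration; Bandura’s theory is
verbal and does not specify such equations.

    # Toy model of vicarious reinforcement: an observer updates the expected
    # value of a behavior after watching a model experience an outcome.

    def vicarious_update(expected_value, observed_outcome,
                         model_weight=0.5, learning_rate=0.2):
        """Nudge the observer's expectation toward the outcome seen to befall a model.

        model_weight discounts observed outcomes relative to direct experience;
        attention to and similarity with the model would plausibly modulate it.
        """
        target = model_weight * observed_outcome
        return expected_value + learning_rate * (target - expected_value)

    # A child starts out neutral about asking questions in class.
    value_of_asking = 0.0

    # Repeatedly watching a classmate get praised (+1) for asking questions
    # raises the child's own valuation of the behavior.
    for _ in range(5):
        value_of_asking = vicarious_update(value_of_asking, observed_outcome=+1.0)
    print(f"after seeing praise: {value_of_asking:.2f}")

    # Watching the classmate get scolded (-1) instead pushes it back down.
    for _ in range(5):
        value_of_asking = vicarious_update(value_of_asking, observed_outcome=-1.0)
    print(f"after seeing scolding: {value_of_asking:.2f}")

Direct reinforcement, in this sketch, would simply use the experienced
outcome itself as the target, without the model_weight discount.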

Observational learning extends the reach of learning beyond one’s
personal actions. A great deal of human learning in childhood happens
through modeling: language acquisition, social etiquette, problem-solving
strategies, and emotional responses are all influenced by watching
parents, siblings, teachers, and even characters in media. Children imitate
not only overt actions but also attitudes – they pick up on how adults react
to situations, how they treat other people, etc. This is how cultural norms
and behaviors often propagate across generations.

The notion of vicarious learning was a significant departure from strict
behaviorist doctrine because it highlighted that direct reinforcement to
the learner isn’t necessary; seeing someone else experience
consequences can change an observer’s behavior. It also underscored the
cognitive aspect: the observer has to internally represent and recall the
behavior and its outcome. Bandura later renamed his theory “social
cognitive theory” to reflect the importance of these cognitive processes in
social learning.

On the neural side, a fascinating discovery supporting our propensity for
imitation was the finding of mirror neurons in the brain. In the 1990s,
neuroscientists Giacomo Rizzolatti and colleagues, while recording from
the premotor cortex of macaque monkeys, observed neurons that fired
not only when the monkey performed an action (like grasping a peanut)
but also when the monkey simply watched another individual perform the
same action. These were dubbed “mirror neurons” because they seemed
to mirror the observed behavior in the observer’s brain. In humans, direct
single-neuron recordings are limited, but brain imaging and other studies
have provided evidence of a similar mirror system. For example, when
people watch someone else move or gesture, regions of their own motor
cortex and associated areas often activate as if they were performing the
actions themselves. This neural mirroring is thought to be a basis for
understanding others’ actions and intentions – by internally simulating
what we see, we can use our own motor knowledge to interpret the
observed behavior.

Mirror neurons provide a potential mechanism for observational learning:
when you watch a teacher demonstrate a dance move, your motor system
might covertly rehearse it via mirror neuron activity, priming you to
execute the move. They might also contribute to empathy and social
understanding, as a similar phenomenon occurs for emotions (we have
empathic responses when seeing others in pain or joy, partly because our
brain mirrors those states to some degree). However, imitation in humans
is not purely an automatic mirroring – it’s also under strategic control. We
don’t imitate everything we see; we discern whom to imitate and what
behaviors are appropriate to reproduce. That decision-making involves
frontal cortex and other cognitive systems evaluating the social context
and the likely outcomes (again linking to Bandura’s motivation process).

In addition to humans, many animals show forms of observational
learning, though often to a more limited extent. Young chimpanzees, for
instance, learn to use tools by watching older chimps (e.g., cracking nuts
with stones). Birds have been shown to learn songs by listening to adult
conspecifics. There are also examples of primates and other mammals
learning fears by observation (a monkey raised in captivity might not fear
snakes, but if it observes another monkey reacting fearfully to a snake, it
can quickly develop a fear of snakes itself). These examples suggest
evolutionary advantages to learning from others: it is safer and more
efficient to learn about dangers, food sources, or skills from someone
else’s experiences rather than solely through one’s own trial-and-error.

Social learning has broad implications. In education, students learn not
just from instruction but by observing peers and mentors. In workplaces,
new employees often learn proper procedures and workplace culture
informally, by watching seasoned colleagues. In society, behaviors can
spread through observational learning – both beneficial behaviors (like
kindness, safe practices) and undesirable ones (like aggression or
prejudices seen in media or community). Understanding social learning
helps in designing interventions: for example, positive role models in
media can encourage constructive behavior in viewers, and conversely,
exposure to violent or antisocial models can have negative influences (as
research on media violence suggests).

In summary, social and observational learning highlight that learning is
not just an individual affair locked within one’s own actions; it is
profoundly social. We are equipped psychologically and biologically to
learn from others. Bandura’s work provided a framework for how imitation
works and stressed cognition and motivation in that process, while the
discovery of mirror neurons revealed that our brains have embedded
circuitry for mirroring others’ actions. Together, these insights emphasize
that much of what we learn about the world comes through social
transmission, allowing each generation to build on the knowledge of the
previous one without having to reinvent the wheel through direct
experience.

References:

1. Pavlov, I. P. (1927). Conditioned Reflexes: An Investigation of the
Physiological Activity of the Cerebral Cortex. (Translated by G. V.
Anrep). London: Oxford University Press.
2. Thorndike, E. L. (1898). Animal Intelligence: An Experimental Study
of the Associative Processes in Animals. Psychological Monograph,
2(4).
3. Skinner, B. F. (1938). The Behavior of Organisms: An Experimental
Analysis. New York: Appleton-Century.
4. Garcia, J., & Koelling, R. A. (1966). Relation of cue to consequence in
avoidance learning. Psychonomic Science, 4(1), 123–124.
5. Tolman, E. C. (1948). Cognitive maps in rats and men. Psychological
Review, 55(4), 189–208.
6. Köhler, W. (1925). The Mentality of Apes. (2nd ed., trans. E. Winter).
London: Harcourt, Brace & World.
7. Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian
conditioning: Variations in the effectiveness of reinforcement and
nonreinforcement. In A. Black & W. Prokasy (Eds.), Classical
Conditioning II: Current Research and Theory (pp. 64–99). New York:
Appleton-Century-Crofts.
8. Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H.
Bower (Ed.), The Psychology of Learning and Motivation (Vol. 8, pp.
47–89). New York: Academic Press.
9. McGaugh, J. L. (2000). Memory—a century of consolidation. Science,
287(5451), 248–251.
10. Fitts, P. M., & Posner, M. I. (1967). Human Performance.
Belmont, CA: Brooks/Cole.
11. Schmidt, R. A. (1975). A schema theory of discrete motor skill
learning. Psychological Review, 82(4), 225–260.
12. Bandura, A., Ross, D., & Ross, S. A. (1961). Transmission of
aggression through imitation of aggressive models. Journal of
Abnormal and Social Psychology, 63(3), 575–582.
13. di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., &
Rizzolatti, G. (1992). Understanding motor events: A
neurophysiological study. Experimental Brain Research, 91(1), 176–
180.
