Perception and object recognition
Module 2
Contents
Theories of Perception: Gestalt approach, Top–Down vs. Bottom-up Processing, Information Processing; Pattern Recognition: Feature
detection analysis, Template matching, Prototype matching;
Biological basis of perception and basic plan of generating sensory codes – Visual, Auditory, Touch, Pain, Smell; Basic psychophysics
and Signal Detection Theory. Visual perception - Form, Colour, Depth, Objects and Faces.
Sensation and Perception
● Detecting the presence of a certain type of energy and making use of that
energy to provide information as to the nature of the environment surrounding
us.
● Thus the term ‘sensation’ refers to that initial detection, and the term ‘perception’
refers to the process of constructing a description of the surrounding world.
● For example, there is a difference between the cells in a person’s eye
reacting to light (sensation) and that person knowing that their friend is
offering them a cup of tea (perception).
The Perceptual Process
Distinction between sensation and perception
Figure 1: The subject of the picture is a Dalmatian dog. Why is it difficult to see?
Figure 2: The subject of the picture is a Fraser spiral. Why is it difficult to see?
● Although the figure appears to form a spiral, it is actually a set of concentric circles.
Perceptual process
It was clear from an eye examination that he could see well and, by many
other criteria, it was obvious that he was not crazy. Dr. P.’s problem was
eventually diagnosed as visual form agnosia—an inability to recognize
objects—that was caused by a brain tumor. He perceived the parts of objects
but couldn’t identify the whole object, so when Sacks showed him a glove,
Dr. P. described it as “a continuous surface unfolded on itself” (Sacks, 1985).
Perceptual process
● The fact that perception often leads to action means that perception is a
continuously changing process.
● For example, the scene that Ellen is observing changes every time she shifts
her attention to something else or moves to a new location, or when
something in the scene moves.
● Knowledge is any information that the perceiver brings to a situation.
● Information that a person brings to a situation can be things learned years
ago, such as when Ellen learned to tell the difference between a moth and a
butterfly, or knowledge obtained from events that have just happened.
Bottom-up and top-down processing
1 A blindfolded student trying to work out what the unknown object they have
Imagine you are the blindfolded student. What strategies do you think you might
employ to complete the above two tasks successfully? Can you identify any key
Kanizsa’s (1976) illusory square, in which a square is perceived even though the
image does not contain a square but only four three-quarter-complete circles.
Gestalt approach to perception: ‘The whole is greater than the sum of its parts’.
Law of Prägnanz, described by Koffka as: ‘Of several geometrically possible organizations that one will
actually occur which possesses the best, simplest and most stable shape’ (Koffka, 1935, p.138).
Gibson’s theory of perception
● Gibson’s theory, a bottom-up approach to perception, is based on the premise
that the information available from the visual environment is so rich that no
cognitive processing is required at all.
● Gibson conceptualized the link between perception and action by
suggesting that perception is direct, in that the information present in light
is sufficient to allow a person to move through and interact with the
environment.
Template-Matching Theories
● Studies with chess players have shown that the temporal lobe is
indeed activated when the players access the stored chunks in their
long-term memory (Campitelli et al., 2007).
● Template-matching theories fail to explain some aspects of the
perception of letters (for example, the “THE CAT” demonstration).
● There, we identify two different letters (A and H) from only one physical form.
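The THE CAT point can be sketched in a few lines of Python. This is only an illustration of context resolving an ambiguous shape, not a model from the literature: the two-word lexicon, the `?` symbol standing for the ambiguous glyph, and the function name `read_word` are all assumptions made for the example.

```python
# Hypothetical mini-lexicon; the slide only mentions the words THE and CAT.
LEXICON = {"THE", "CAT"}

# "?" stands for the ambiguous glyph: by shape alone it fits both A and H.
SHAPE_CANDIDATES = {"?": ("A", "H")}


def read_word(glyphs):
    """Return every lexicon word that is consistent with the glyph shapes."""
    readings = [""]
    for glyph in glyphs:
        # Unambiguous glyphs have exactly one candidate: themselves.
        options = SHAPE_CANDIDATES.get(glyph, (glyph,))
        readings = [prefix + letter for prefix in readings for letter in options]
    # Word knowledge (top-down) filters the shape-based (bottom-up) readings.
    return [word for word in readings if word in LEXICON]


print(read_word("T?E"))
print(read_word("C?T"))
```

The same physical form is read as H in one word and as A in the other, which a single stored template per letter cannot account for on its own.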
Neuroscience and template theories
● Experiments suggest that the brain
processes letters and digits
differently.
● An area on or near the left fusiform gyrus
(part of the occipital and temporal lobes) is
activated significantly more when a person
is presented with letters than with digits.
● A particular barcode will always look exactly
the same, making it easy for
computers to read. Letters, by contrast,
can look different even though they depict the
same letter.
● Template matching will distinguish between
different bar codes but will not recognize
that different versions of the letter A written
in different scripts are indeed both A’s.
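The barcode versus letter contrast can be illustrated with a strict template matcher. This is a minimal sketch under stated assumptions: the 3x3 binary grids, the labels, and the function name `template_match` are invented for illustration, and real template models may tolerate some mismatch rather than requiring exact equality.

```python
def template_match(image, templates):
    """Return the label whose stored template equals the image exactly, else None."""
    for label, template in templates.items():
        if image == template:
            return label
    return None


# Made-up 3x3 binary patterns ("1" = dark, "0" = light).
BARCODE_TEMPLATES = {
    "item-1": ("101", "101", "101"),
    "item-2": ("110", "110", "110"),
}
LETTER_TEMPLATES = {"A": ("010", "111", "101")}

# A barcode always has the same form, so exact matching works.
print(template_match(("101", "101", "101"), BARCODE_TEMPLATES))

# An "A" written in another script differs in one row and is not recognized.
print(template_match(("111", "111", "101"), LETTER_TEMPLATES))
```

The second call returns `None` even though a human reader would accept the variant as an A, which is the limitation the bullet describes.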
Feature-Matching Theories
Biederman’s recognition-by-components:
● We recognize 3D objects by manipulating simple geometric shapes
called geons (for geometrical ions)
● Parts of the larger object are recognized as subobjects.
● Subobjects are categorized into types of geons
● The larger object is recognized as a pattern formed by combining
geons.
● Only edges are needed to recognize geons.
Sample geons
Example
If you see a car, you perceive it as being made up of a number of different geons.
You can recognize the car even if it is partly obscured by another object and you
can’t see all of the geons. This is because you can still infer the presence of the
other geons. Cells in the inferior temporal cortex react more strongly to changes in
geons than to changes in other geometrical properties (e.g., changes in the size
or diameter of a cylinder; Vogels et al., 2001).
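The car example can be sketched as matching a set of visible geons against stored geon inventories. The inventories below, the object names other than the car, and the function name `recognize` are hypothetical, and the sketch ignores the spatial relations between geons that the full theory also relies on.

```python
# Hypothetical geon inventories, invented for illustration only.
OBJECTS = {
    "car": {"brick", "wedge", "cylinder"},
    "mug": {"cylinder", "curved handle"},
    "flashlight": {"cylinder", "cone"},
}


def recognize(visible_geons):
    """Return, in alphabetical order, the objects whose geons include every visible geon."""
    visible = set(visible_geons)
    return sorted(name for name, geons in OBJECTS.items() if visible <= geons)


# The car's cylinder (a wheel, say) is occluded, yet the remaining geons suffice.
print(recognize(["brick", "wedge"]))

# A single cylinder is compatible with several objects, so it stays ambiguous.
print(recognize(["cylinder"]))
```

In this toy version, recognition under occlusion works because a subset of an object's geons can still single out one stored description.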
How do you distinguish one face from the other?
Recognition-by-Components Theory
● The main pathway between the eye and the cortex is the
retina-geniculate-striate pathway.
● It transmits information from the retina to V1 and then V2 via the
lateral geniculate nuclei (LGNs) of the thalamus.
● Retinotopy: retinal receptor cells are mapped to points on the surface of
the visual cortex.
RETINOTOPY
Processing in the Lateral Geniculate Nucleus
● A major function of the LGN is to regulate neural information as it flows from the
retina to the visual cortex.
● 90% of the fibers in the optic nerve arrive at the LGN and the other 10%
travel to the superior colliculus.
● The LGN receives more input back from the cortex than it receives from the retina.
For every 10 nerve impulses the LGN receives from the retina, it sends only 4
to the cortex.
● The signals arriving at the LGN are sorted and organized based on the eye
they came from, the receptors that generated them, and the type of
environmental information that is represented in them.
Information relays from the thalamus to the primary visual cortex (area V1).
Visual information then passes to the secondary areas of visual processing (V2–V8),
where aspects of color, form, and motion are processed. From there it is analyzed
in parallel streams through the ventral temporal (“What”) and the dorsal parietal
(“Where”) areas.
Visual deficits
● Damage to area V1 causes cortical blindness, or hemianopia, in the opposite
visual field.
● People with cortical blindness are sometimes able to indicate that a stimulus
is present, that it has moved, or that it is in a certain location, even though
they have no conscious ability to “see” in the conventional sense. This
phenomenon is termed blindsight.
● If area V1 is damaged in both hemispheres, complete blindness will occur.
Visual deficits
● Damage to V4 results in achromatopsia, the complete loss of ability to detect
color.
● Lesions to area V5 result in akinetopsia, or the specific inability to identify
objects in motion.
● Damage to areas V3-V5 can result in a general inability to perceive form. In
this situation, patients may be able to make a perfect copy of a drawing, but
are totally unable to understand that the connection of lines corresponds to a
specific shape or object.
Consequences of lesions in area V1.
Dr. P’s affliction was that he was visually unaware of the totality or gestalt of objects. He
could see and identify form and color but could not combine these aspects into the
higher-level meaning “this is a rose.” His only visual reality was a mechanistic identification
of features. This is typical of how visual agnosia primarily involves the processes
necessary for object recognition or object meaning while leaving intact elementary visual
processes. Also, Dr. P’s agnosia, as is usually the case, was modality specific. Although
his visual knowing was impaired, a higher sense of knowing was available through
sense of smell. Dr. P also had no problem in recognizing people by their voices.
Disorders of the ventral stream: Visual agnosias
● Apperceptive agnosia: deficits in object perception, or the inability to combine
the individual aspects of visual information such as line, shape, color, and
form together to form a “whole” percept. They seem to see in bits and pieces.
● Associative visual agnosics have difficulty to varying degrees in assigning
meaning to an object.
● Even though they can recognize differences in form between pictures of a pair
of scissors and a paper punch by matching the scissors to a like pair in a
display of office objects (with which an apperceptive agnosic would have
difficulty), they have lost the link between the visual percept and the semantic
meaning.
Disorders of the ventral stream: Visual agnosias
● If shown a pair of scissors, neither the apperceptive nor the associative
agnosic can correctly name “scissors.”
● But although the associative agnosic can pick out a pair of scissors, she or he shows
difficulties not only in naming but in explaining or demonstrating the use for scissors.
● The most common site of damage in apperceptive agnosia is the parieto-occipital area of
the right hemisphere.
● If both hemispheres are involved, the patient may have Balint’s syndrome: a disturbance
in visually guided reaching, an inability to systematically scan the environment or fixate the
eyes on an object, and an inability to be aware of more than one object at a time.
Tests for apperceptive agnosia.
● Mr. P., a 67-year-old man, had suffered a right parietal stroke. At the time of our first seeing
him (24 hours after admission), he had no visual-field defect. He did, however, have a
variety of other symptoms: Mr. P. neglected the left side of his body and of the world. When
asked to lift up his arms, he failed to lift his left arm but could do so if one took his arm and
asked him to lift it. When asked to draw a clock face, he crowded all the numbers onto the
right side of the clock. When asked to read compound words such as ice cream and
football, he read cream and ball. When he dressed, he did not attempt to put on the left side
of his clothing (a form of dressing apraxia) and when he shaved, he shaved only the right
side of his face. He ignored tactile sensation on the left side of his body. Finally, he
appeared unaware that anything was wrong with him and was uncertain what all the fuss
was about (anosognosia). Collectively, these symptoms constitute contralateral neglect.
Primary visual cortex
● The patient was a healthy 39-year-old right-handed man who, because of an explosion,
was hit by a projectile steel nut that penetrated his skull in the right parieto-occipital area.
This man could perceptually recognize objects, identify colors, and discriminate right and
left. He also had no problems in spatial depth perception and size constancy. But the
patient himself alerted his doctors that he was having trouble finding his way through the
hallways on the way to the bathroom and was having trouble reading the time. He said he
had to read each hand of the clock separately and then figure out the time.
● Staff observed him to collide “with objects on his left which he had clearly perceived a few
moments before. He was liable at table to knock over dishes on his left-hand side and
occasionally missed food on the left of his plate. He commonly failed to attend to the
left-hand page in turning the pages of a book and reading lines of disconnected words
commonly omitted the first word or two” (Paterson & Zangwill, 1944, p. 339).
Parallel processing pathways
Processing of Color: Ventral stream
● The pathway for color processing begins with cones and ends in the inferior
temporal and frontal lobes.
● Information about color is relayed to V1 by way of parvocellular neurons in
the LGN. From V1, color information is sent to the thin stripes in V2 and then
on to V4.
● Neurons in V4 appear to analyze the wavelength of objects and make
wavelength comparisons among objects in the visual field.
● This information is then relayed to the inferior temporal and frontal lobes for
further processing.
Processing of Color: Evidence (fMRI, Zeki and Marini (1998))
● When the subjects were exposed to colored objects, a pathway extending from V1 to V4
and on to the inferior temporal and frontal lobes was activated.
● However, different areas of the frontal lobe were activated, depending on whether the
objects were colored appropriately or inappropriately.
● For example, when the men were shown red strawberries, V1, V2, V4, the inferior
temporal cortex, the hippocampus, and an area on the ventrolateral prefrontal cortex were
activated.
● When the men were shown blue strawberries, the activated areas included V1, V2, V4,
the inferior temporal cortex, and the dorsolateral prefrontal cortex.
● The hippocampus was most likely activated with red strawberries because the
appropriately colored strawberries stimulated a memory process.
● The abnormally colored objects activated the dorsolateral frontal cortex, whereas the
normally colored objects activated the ventrolateral frontal cortex, which demonstrates
that the frontal lobe plays a role in analyzing the color of objects in our visual space.
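The two activation patterns reported above can be compared directly with set operations. This is just a bookkeeping sketch of the areas listed on this slide, not an analysis from the study itself; the variable names are chosen for the example.

```python
# Areas listed on the slide for each condition.
red_strawberries = {
    "V1", "V2", "V4", "inferior temporal cortex",
    "hippocampus", "ventrolateral prefrontal cortex",
}
blue_strawberries = {
    "V1", "V2", "V4", "inferior temporal cortex",
    "dorsolateral prefrontal cortex",
}

shared = red_strawberries & blue_strawberries      # active in both conditions
only_normal = red_strawberries - blue_strawberries  # only for appropriate color
only_abnormal = blue_strawberries - red_strawberries  # only for inappropriate color

print("shared:", sorted(shared))
print("only normal color:", sorted(only_normal))
print("only abnormal color:", sorted(only_abnormal))
```

The shared set is the V1 to inferior temporal pathway, while the differences isolate the hippocampus and the two prefrontal regions that the last two bullets interpret.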
Processing of Form: Ventral stream
● Information about orientation, lines, and edges is sent to the brain via
the parvocellular and magnocellular pathways.
● V1 neurons are sensitive to the orientation and features of objects.
● Different neurons in V1 respond to different types of stimuli.
● Simple cells are thought to be part of the parvo- system, with its
emphasis on form, and complex cells can be considered part of the
magno- system, with its emphasis on movement.
● A third type of neuron in V1 is the hypercomplex cell, preferring stimuli
with a particular length or width in a particular orientation.
Processing of motion
● Rods can rapidly detect visual events, which makes them especially
sensitive to motion.
● Information about motion is relayed to V1 by way of the
magnocellular pathway.
● From V1, this information is sent to the thick stripes of V2 and on to V5,
which is located in the temporal lobe, adjacent to area V4.
● Axons from area V5 project to the posterior parietal lobe, where
information about motion is analyzed.
Processing of depth and spatial relations
The auditory system
● The auditory system contains mechanical receptors designed to detect sound frequency.
● These hairlike receptors are located in the fluid of the long, coiled, snail-like cochlea of
the inner ear. As the mechanical mechanisms of the middle ear respond to external
sound waves, they cause vibrations in the fluid of the inner ear, thus vibrating the hairs of
the auditory receptors.
● These receptors synapse with the auditory nerve.
● The auditory nerve from each ear projects to the cochlear nuclei of the medulla.
● From there, each pathway branches to project auditory information to superior olivary
nuclei of the medulla.
● In this way, the auditory system differs from the visual system in that each hemisphere
receives input from both ears, resulting in bilateral representation of sound.
● This may help the person localize sound in space.
Pathways to the auditory cortex
● The auditory pathways then course through the lower brainstem and
ascend through the thalamus, from which they project to the primary
auditory cortex.
● The primary auditory cortex of each hemisphere lies deep within the
temporal lobe.
● This area, commonly termed Heschl’s gyrus, processes the
“fragments” of sound.
● A tonotopic map projects onto the auditory cortex, similar to the retinotopic
map of the visual system.
● The primary auditory cortex processes several elements of sound.
● In addition to frequency, the features of sound include loudness, timbre,
duration, and change.
Pathways to the auditory cortex
● Both the cochlear nucleus and the superior olive send projections to the
inferior colliculus in the dorsal midbrain.
● Two distinct pathways emerge from the inferior colliculus, coursing to the
medial geniculate nucleus, which lies in the thalamus.
● The ventral region of the medial geniculate nucleus projects to the primary
auditory cortex (area A1), whereas the dorsal region projects to the auditory
cortical regions adjacent to area A1.
● After the primary auditory cortex processes sound features, they are
integrated into understandable speech sounds in the secondary auditory
processing area commonly known as Wernicke’s area.
Auditory cortex
● In humans, the primary auditory cortex (A1) lies within Heschl’s gyrus and is
surrounded by secondary cortical areas (A2).
Auditory deficits
● Damage to the left-hemisphere auditory processing areas results in a partial
or total inability to decipher spoken words, known as receptive aphasia or
Wernicke’s aphasia.
● However, people with receptive aphasia can often still recognize the
emotional tone of language, because the speaker’s intent, such as anger,
sarcasm, or humor, is processed as voice intonation.
● With right-hemisphere damage, the patient accepts words at face value but
loses the nuances of jokes and emotional intention, and harmonic
and melodic abilities are impaired.
The somatosensory system