Lecture 2 Notes: Memory and Cognition: Perception
Perception
What is it?
Why is perceiving an object so difficult?
What cues does the visual system use to perceive objects?
Connecting anatomy and physiology to perception
Separating figures from the ground
Sensation vs. Perception – What’s the difference?
Detection of energy vs. Interpretation of sensory information
Bottom-up vs. top-down processing
Perception involves
1) Current sensory input
2) Previous Experience X 2
Low-Level Vision-Outline
What's your vision?
Problem to be solved
How simple computations yield more complex information
Problems to be solved
Many of the qualities of objects that we would like to know about trade off with
other qualities
Shape/Orientation
Reflected/Light Source/Shadow
Size/Distance
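The size/distance trade-off can be illustrated with visual angle. The sketch below assumes the standard visual-angle formula, angle = 2 * atan(size / (2 * distance)); the function name and the example values are illustrative and do not come from the lecture.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (in degrees) subtended by an object of a given size
    at a given distance (both in the same units)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Hypothetical example: a small object nearby and a large object far away.
near = visual_angle_deg(size=1.0, distance=10.0)
far = visual_angle_deg(size=10.0, distance=100.0)

# Both objects have the same size/distance ratio, so they subtend the same
# angle at the eye: the retinal image alone cannot tell them apart.
same_angle = math.isclose(near, far)
```

Because only the ratio of size to distance reaches the eye, the visual system needs additional cues to recover either quantity on its own.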
Size/Distance
Monocular Depth Cues
Relative size
Texture gradient
Interposition
Linear perspective
Height in plane
Binocular Depth Cues
Binocular disparity
Binocular convergence
Problems
Problem 1 is the inverse projection problem: the types of information that we
want trade off with one another
Problem 2 is that the initial information the visual system has is extremely
impoverished
How do you get from one to another?
Vision is not an exact representation of what is in the world; it is a representation
of what is probably in the world
Why Edges
An edge is a sudden discontinuity in intensity
Edges frequently correspond to the boundaries of objects; a map of edges is a
good start to identifying objects
Edges are relatively invariant to lighting conditions
How to find edges
Computationally easy to find discontinuities
Compare means of adjacent columns, rows, diagonals
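A minimal sketch of the column comparison described above, assuming the image is a small grid of intensity values. The function names, the threshold, and the example image are hypothetical, not part of the lecture.

```python
def column_means(image):
    """Mean intensity of each column in a 2D list of numbers."""
    n_rows = len(image)
    n_cols = len(image[0])
    return [sum(row[c] for row in image) / n_rows for c in range(n_cols)]

def find_vertical_edges(image, threshold):
    """Return each column index c where the mean intensity of column c and
    column c + 1 differ by more than the threshold."""
    means = column_means(image)
    return [c for c in range(len(means) - 1)
            if abs(means[c + 1] - means[c]) > threshold]

# Hypothetical 4x6 image: a dark region (10) next to a bright region (200).
image = [[10, 10, 10, 200, 200, 200] for _ in range(4)]
edges = find_vertical_edges(image, threshold=50)  # edge between columns 2 and 3
```

The same comparison applied to rows or diagonals would pick up horizontal or oblique edges.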
Biological Evidence
120 million rods and 6 million cones → 1.2 million ganglion cells
Ganglion cells: center-surround receptive fields
On-cells and off-cells
On-cells: when light is focused on the center of their receptive field, they become active
+ when light is focused in the center, they inhibit activity in the area around it
+ lateral inhibition
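As a rough illustration of why a center-surround cell signals edges rather than uniform brightness, the sketch below assumes a deliberately simplified on-center model: the response is the mean light in the center minus the mean light in the surround, floored at zero. The function and the numbers are assumptions for illustration, not a physiological model from the lecture.

```python
def on_center_response(center, surround):
    """Simplified on-center cell: excitation from the center minus
    inhibition from the surround, never below zero."""
    excitation = sum(center) / len(center)
    inhibition = sum(surround) / len(surround)
    return max(0.0, excitation - inhibition)

# Light only on the center: strong response.
spot = on_center_response([1.0], [0.0] * 8)

# Uniform light over center and surround: excitation and inhibition cancel.
uniform = on_center_response([1.0], [1.0] * 8)

# An edge crossing the field (half of the surround is lit): partial response.
edge = on_center_response([1.0], [1.0] * 4 + [0.0] * 4)
```

In this toy model the cell is silent under uniform illumination and responds when intensity changes across its receptive field, which is the situation at an edge.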
Visual Detection: Shape and Contour
Detecting lines and edges
Simple cells: respond to orientation-specific slits of light in a particular location
It does seem that some of the cells relatively early in the visual processing
stream care about the edges
Real World problems to which we can apply Gestalt Principles
Basic tendency to organize visual input
Segmentations
Determining where objects are in an image and what their boundaries are
Grouping
Grouping stuff together as part of the same objects; for example, across
occluders
Gestalt approach
Problem: we don't perceive local events in an image; we perceive
more global figures
Elucidate principles that determine the grouping of local “things” in
an image into figures
Proximity
Similarity
Collinearity
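The proximity principle can be sketched as a toy grouping rule: elements that lie close together are assigned to the same group. The function name, the one-dimensional layout, and the gap threshold below are hypothetical choices for illustration only.

```python
def group_by_proximity(positions, max_gap):
    """Group 1D positions so that neighbours no more than max_gap apart
    end up in the same group."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous dot: same group
        else:
            groups.append([x])     # large gap: start a new group
    return groups

# Hypothetical row of dots with two large gaps.
dots = [0, 1, 2, 10, 11, 12, 20, 21]
groups = group_by_proximity(dots, max_gap=2)
```

With these values the eight dots fall into three clusters, mirroring how a row of unevenly spaced dots is seen as separate clumps rather than eight individual items.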
Figure-ground Organization
Perceive a boundary belonging to a “thing-like” featural region
The figure has a definite shape and appears closer, whereas the ground appears
farther and extends behind the figure
Size, Symmetry
Frame of Reference
Ex. align a rod within this frame so that the rod is vertical
Object Recognition
Human perception is more than the sum of information in the distal stimulus
Perceiving figures is critical for object recognition
What is object recognition? Identifying a complex arrangement of sensory stimuli and
viewing it as distinct from its background
Models of Object Recognition
Template
Feature
“New wave” of feature models (3D features)
Template Model
1) Memory representation is a holistic unanalyzed entity (a template).
2) An input pattern is compared to the stored representation.
3) Identity is determined by the selection of the template with the greatest
amount of overlap.
Problems:
Size
Orientation
Need too many templates
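The three steps of the template model, and its orientation problem, can be sketched with tiny pixel grids. The 3x3 templates, the overlap measure (number of matching cells), and all names below are assumptions made for illustration.

```python
def overlap(pattern, template):
    """Number of grid cells on which the input pattern and template agree."""
    return sum(p == t
               for p_row, t_row in zip(pattern, template)
               for p, t in zip(p_row, t_row))

def recognize(pattern, templates):
    """Select the stored template with the greatest overlap."""
    return max(templates, key=lambda name: overlap(pattern, templates[name]))

# Hypothetical stored templates ('#' = ink, '.' = blank).
TEMPLATES = {
    "T": ["###", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
}

# A slightly degraded T is still matched to the T template.
noisy_t = ["###", ".#.", "..."]
best = recognize(noisy_t, TEMPLATES)

# An upside-down T overlaps its own template no better than the L template.
upside_down_t = [".#.", ".#.", "###"]
t_score = overlap(upside_down_t, TEMPLATES["T"])
l_score = overlap(upside_down_t, TEMPLATES["L"])
```

In this toy case the rotated T matches only 5 of 9 cells for both templates, so handling it would require yet another stored template, which is the "too many templates" problem.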
Feature-Analysis Models
1) Inputs are broken down into a small list of distinctive features
2) Identity is determined by selecting the feature list most similar to the input
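A minimal sketch of the two steps above, assuming letters are stored as sets of features and that similarity is measured as shared features divided by total distinct features. The feature lists are hypothetical and much cruder than any published feature set.

```python
# Hypothetical feature lists for four letters.
FEATURES = {
    "G": {"curve", "horizontal"},
    "W": {"diagonal"},
    "P": {"vertical", "closed_loop"},
    "R": {"vertical", "closed_loop", "diagonal"},
}

def similarity(a, b):
    """Shared features divided by all distinct features (0.0 to 1.0)."""
    return len(a & b) / len(a | b)

def recognize(input_features):
    """Select the letter whose feature list is most similar to the input."""
    return max(FEATURES, key=lambda letter: similarity(input_features, FEATURES[letter]))

letter = recognize({"vertical", "closed_loop", "diagonal"})

# Under these assumed lists, P and R share more features than G and W do.
pr = similarity(FEATURES["P"], FEATURES["R"])
gw = similarity(FEATURES["G"], FEATURES["W"])
```

On this account, letter pairs with many shared features, such as P and R, should be harder to tell apart than pairs with few shared features, such as G and W.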
Gibson (1969). Subjects were given a reaction-time test to determine if two letters were
the same or different.
G vs. W: RT = 458 msec
P vs. R: RT = 571 msec
Good:
Larsen & Bundesen (1996) – developed a model that identified 95% of
addresses and zip codes
Visual input does seem to be decomposed into features
Physiological evidence about edges (Hubel & Wiesel 1965)
Problems:
No role for the spatial relation between features
Moving objects appear differently
Natural objects
New Wave of Feature Models
Recognition-by-components models use three-dimensional features (geons).
36 Geons lead to 30,000 readily discriminable objects.
You usually only need to see the edges of a geon
Geons have properties that are invariant to rotations, size, and translation
Problems with Geons
Mental rotation plays a role
Local view hypothesis
Top Down vs. Bottom Up
Top-down processing is strong when a stimulus is registered for just a fraction of
a second.
Top-down processing is also strong when stimuli are incomplete or ambiguous.
Object recognition combines bottom-up and top-down processing
Word Superiority Effect
People are faster and more accurate at identifying a letter when it is part of a word
than when it is part of a non-word or presented in isolation
Face Perception
Should be a challenging task
Need to recognize faces from different angles, in different settings, with different
expressions
Alternatives
Nature of the task
Perceptual expertise
Specific to faces (Haxby and colleagues?)