Arbib: CS564 - Brain Theory and Artificial Intelligence, USC, Fall 2001. Lecture 7. Object Recognition
Object Recognition
What is Object Recognition?
- Segmentation/Figure-Ground Separation: prerequisite or consequence?
- Labeling an object [the focus of most studies]
- Extracting a parametric description as well
grasp programming
Inferotemporal Cortex
What (ventral)
AT: Goodale and Milner Lesion here: Inability to verbalize or pantomime size or orientation
Clinical Studies
Studies of patients with visual deficits strongly argue that tight interaction between the where and what/how visual streams is necessary for scene interpretation.
- Visual agnosia: can see objects, copy drawings of them, etc., but cannot recognize or name them!
- Dorsal agnosia: cannot recognize objects if more than two are presented simultaneously (a problem with localization).
- Ventral agnosia: cannot identify objects.
Inferotemporal Pathways
Later stages of IT (AIT/CIT) connect to the frontal lobe, whereas earlier ones (CIT/PIT) connect to the parietal lobe. This functional distinction may well be important in forming a complete picture of inter-lobe interaction.
neurons in cortex
- Coding: one neuron per object or population codes?
- Biologically-inspired algorithms for shape perception
- The "gist" of a scene: how can we get it in 100ms or less?
- Visual memory: how much do we remember of what we have seen?
- The world as an outside memory and our eyes as a lookup tool
Object recognition
- The basic issues
- Translation and rotation invariance
- Neural models that do it
- 3D viewpoint invariance (data and models)
- Classical computer vision approaches: template matching and matched filters; wavelet transforms; correlation; etc.
- Examples: face recognition.
- More examples of biologically-inspired object recognition systems which work remarkably well
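As an illustration of the classical template-matching and correlation approach listed above, here is a minimal sketch in plain Python. It slides a template over an image and scores each position by normalized cross-correlation. The function names (`ncc`, `match_template`) and the toy image are assumptions made for this example, not part of the lecture. Note that this scheme tolerates translation but not rotation or scale change.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equal-sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat patch or flat template: treat as "no match"
    return sum(x * y for x, y in zip(da, db)) / denom

def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))  # NCC is always >= -1, so any real score beats this
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, (r, c))
    return best

# Toy data (hypothetical): a plus-shaped template pasted into a blank 6x6 image.
template = [[0, 9, 0],
            [9, 9, 9],
            [0, 9, 0]]
image = [[0] * 6 for _ in range(6)]
for dr in range(3):
    for dc in range(3):
        image[2 + dr][3 + dc] = template[dr][dc]

score, location = match_template(image, template)
print(score, location)
```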
Attention-based analysis: Scan scene with attention, accumulate evidence from detailed local analysis at each attended location.
Main issues:
- what is the internal representation?
- how detailed is memory?
- do we really have a detailed internal representation at all?
Gist: Can very quickly (120ms) classify entire scenes or do simple recognition tasks; can only shift attention twice in that much time!
[Figure: reaction times (axis 0 to 1000) for targets and distractors, with the minimum response time marked.]
Claim: This is so quick that only feedforward processing can be involved.
Why build a detailed internal representation of the world? Too complex, not enough memory, and useless?
The world is the memory. Attention and the eyes are a look-up tool!
1) pixel-based (light intensity)
2) primal sketch (discontinuities in intensity)
3) 2½-D sketch (oriented surfaces, relative depth between surfaces)
4) 3D model (shapes, spatial relationships, volumes)
TMB2 view: This may work in ideal cases, but in general cooperative computation of multiple visual cues and perceptual schemas will be required. problem: computationally intractable!
VISIONS
A computer vision system from 1987 developed by Allen Hanson and Edward Riseman on the basis of the HEARSAY system for speech understanding (TMB2 Sec. 4.2) and Arbib's Schema Theory (TMB2 Sec. 2.2 and Chap. 5)
This is schema-based and can be mapped onto hypotheses about cooperative computation in the brain.
Key idea: Bringing context and scene knowledge into play so that recognition of objects proceeds via islands of reliability to yield a consensus interpretation of the scene. See TMB2 Sec. 5.2 for the figures.
JIM 3 (Hummel)
Collection of Fragments 2
Viewpoint Invariance
Major problem for recognition.
Biederman & Gerhardstein, 1994: We can recognize two views of an unfamiliar object as being the same object.
Direct Template Matching:
- Processing hierarchy yields activation of view-tuned units.
- A collection of view-tuned units is associated with one object.
- View-tuned units are built from V4-like units, using sets of weights which differ for each object.
e.g., Poggio & Edelman, 1990; Riesenhuber & Poggio, 1999
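The idea of a collection of view-tuned units per object can be sketched as follows. This is an illustration only: the Gaussian tuning curve, the max pooling over an object's views, the parameter `sigma`, and the three-element feature vectors standing in for V4-like activity are all assumptions of this example, chosen in the spirit of the cited models rather than taken from them.

```python
import math

def view_tuned_response(features, stored_view, sigma=1.0):
    """Gaussian tuning: response is 1 at the stored view and falls off with distance."""
    d2 = sum((f - s) ** 2 for f, s in zip(features, stored_view))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def object_response(features, stored_views, sigma=1.0):
    """Pool (here: max) over the view-tuned units associated with one object."""
    return max(view_tuned_response(features, v, sigma) for v in stored_views)

def recognize(features, objects, sigma=1.0):
    """Return the label whose pooled view-tuned response is largest."""
    return max(objects, key=lambda name: object_response(features, objects[name], sigma))

# Hypothetical stored views: each object has its own set of weight vectors.
objects = {
    "cup":  [[1.0, 0.0, 0.2], [0.8, 0.3, 0.1]],
    "lamp": [[0.0, 1.0, 0.9], [0.1, 0.8, 1.0]],
}
probe = [0.9, 0.1, 0.2]  # a view close to, but not identical to, a stored cup view
label = recognize(probe, objects, sigma=0.5)
print(label)
```

A narrower `sigma` makes each unit more view-specific, so more stored views are needed to cover the same range of viewpoints.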
the model neurons are tuned for size and 3D orientation of object
Recognition by Components
Structural approach to object recognition:
We can recognize a novel/unfamiliar object by parsing it in terms of its component pieces, then comparing the assemblage of pieces to those of known objects.
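A toy sketch of the "parse into pieces, then compare assemblages" step, assuming the parsing has already been done. Each object is described as a set of (part, relation, part) triples and compared to known objects by set overlap (Jaccard similarity). The part names, relation names, and the similarity measure are hypothetical choices for illustration, not the RBC model itself.

```python
def similarity(desc_a, desc_b):
    """Jaccard overlap between two sets of (part, relation, part) triples."""
    if not desc_a and not desc_b:
        return 1.0
    return len(desc_a & desc_b) / len(desc_a | desc_b)

def best_match(description, known_objects):
    """Return the name of the known object whose description overlaps most."""
    return max(known_objects,
               key=lambda name: similarity(description, known_objects[name]))

# Hypothetical structural descriptions. "mug" and "pail" use the same parts
# but a different relation, so they do not match each other.
known_objects = {
    "mug":  {("cylinder", "side-attached", "curved-handle")},
    "pail": {("cylinder", "top-attached", "curved-handle")},
    "lamp": {("cone", "on-top-of", "cylinder"),
             ("cylinder", "on-top-of", "brick")},
}

# A partial description, e.g. when one part of the object is hidden.
novel = {("cone", "on-top-of", "cylinder")}
print(best_match(novel, known_objects))
```

Because the comparison uses relations as well as parts, two objects built from the same pieces in different arrangements stay distinct.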
They are robust to noise (can be identified even with parts of image missing)
Support for RBC: We can recognize partially occluded objects easily if the occlusions do not obscure the set of geons which constitute the object.
Potential difficulties
A. Structural description not enough, also need metric info
B. Difficult to extract geons from real images
C. Ambiguity in the structural description: most often we have several candidates
D. For some objects, deriving a structural representation can be difficult
Edelman, 1997
visual processing
Place
Common objects
(Tjan, 1999)
[Diagram: memories 1, ..., i, ..., n yield independent decisions R1, ..., Ri, ..., Rn, which reach a "homunculus" after delays t1, ..., ti, ..., tn; the homunculus issues the response.]
Task:
[Diagram: Image -> normalize position -> normalize orientation -> downsampling -> memory]
[Diagram: Image -> normalize position -> normalize orientation -> downsampling, with separate memories labelled Site 1, Site 2, and Site 3]
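Two of the processing steps named in the diagram, position normalization and downsampling, can be sketched in plain Python. This is a minimal illustration under stated assumptions: position is normalized by moving the intensity centroid to the image centre, downsampling is done by block averaging, and orientation normalization is omitted. The function names and the toy 8x8 images are hypothetical.

```python
def normalize_position(image):
    """Shift the image (zero padded) so its intensity centroid sits at the centre."""
    h, w = len(image), len(image[0])
    total = sum(sum(row) for row in image)
    if total == 0:
        return [row[:] for row in image]  # blank image: nothing to centre
    cr = sum(r * v for r, row in enumerate(image) for v in row) / total
    cc = sum(c * v for row in image for c, v in enumerate(row)) / total
    dr = round((h - 1) / 2 - cr)
    dc = round((w - 1) / 2 - cc)
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = image[r][c]
    return out

def downsample(image, factor):
    """Replace each factor x factor block by its mean value."""
    h, w = len(image), len(image[0])
    return [[sum(image[r + i][c + j]
                 for i in range(factor) for j in range(factor)) / factor ** 2
             for c in range(0, w - factor + 1, factor)]
            for r in range(0, h - factor + 1, factor)]

def blob(top, left, size=8, value=4):
    """Toy stimulus: a 2x2 bright square on a blank size x size image."""
    img = [[0] * size for _ in range(size)]
    for r in range(top, top + 2):
        for c in range(left, left + 2):
            img[r][c] = value
    return img

a = normalize_position(blob(0, 0))
b = normalize_position(blob(5, 2))
print(a == b)            # same object at two positions gives the same normalized image
print(downsample(a, 2))  # coarse 4x4 version of the normalized image
```

After position normalization a stored view can match the same object at a new location, which the raw image cannot do.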
Study stimuli:
Test stimuli:
1) familiar (studied) views, 2) new positions, 3) new position & orientations
1800 {30%}
1500 {25%}
800 {20%}
450 {15%}
210 {10%}
[Figure labels: Site 1 (raw image); norm. pos.; Site 3]
Processing speed for each recognition module depends on recognition difficulty by that module.
[Figure: proportion correct (0 to 1) versus contrast (%), in three panels (recovered titles: "Familiar views", "Novel positions"); curve labels include "Site 1 (raw image)" and "norm. ori."]
Black curve: full model in which recognition is based on the fastest of the responses from the three stages.
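The "fastest response wins" rule can be sketched as a simple race between sites. Everything concrete here is a labelled assumption: the `SiteOutput` record, the idea that a response's arrival time is its processing time plus a fixed delay, and the numbers, which are made up for illustration and are not data from the study.

```python
from typing import NamedTuple

class SiteOutput(NamedTuple):
    site: str
    decision: str
    processing_time: float  # hypothetical units; longer when recognition is harder
    delay: float            # hypothetical fixed delay before the response arrives

def race(outputs):
    """Return the site output that arrives first (processing time plus delay)."""
    return min(outputs, key=lambda o: o.processing_time + o.delay)

# Made-up example: three sites reach decisions independently.
outputs = [
    SiteOutput("Site 1", "object A", processing_time=400.0, delay=20.0),
    SiteOutput("Site 2", "object A", processing_time=250.0, delay=60.0),
    SiteOutput("Site 3", "object B", processing_time=500.0, delay=150.0),
]
winner = race(outputs)
print(winner.site, winner.decision)
```

Note that the overall response follows the earliest site even when a slower site disagrees, so accuracy depends on which site is fastest for a given stimulus.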
- Recording from neurons: electrophysiology
- Multi-unit recording using electrode arrays
- Stimulating while recording
- Anesthetized vs. awake animals
- Single-neuron recording in awake humans
- Probing the limits of vision: visual psychophysics
- Functional neuroimaging: Techniques
- Experimental design issues
- Optical imaging
- Transcranial magnetic stimulation