Mid-Level Vision, Grouping, Illusory Contours, and Lightness
The visual system maps the retinal image onto retinotopic representations, with the left visual field projected onto the right cerebral hemisphere and vice versa. Hierarchical pathways split into dorsal and ventral streams, each processing different aspects of the scene. Early vision, from retina through V1, applies linear filters that extract local measurements such as orientation and contrast. Mid‑level vision follows, using those measurements to infer the structure of objects and scenes.
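To make the idea of a linear orientation filter concrete, here is a minimal sketch in pure Python. It assumes a Gabor function (a Gaussian envelope multiplied by a cosine grating) as the filter shape, which is a common idealization of V1 receptive fields rather than something specified in the lecture. The function names, patch size, wavelength, and sigma are all illustrative choices.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma):
    """Even-symmetric Gabor kernel as nested lists (size should be odd).

    theta is the direction of luminance modulation in radians, so
    theta = 0 gives a filter tuned to vertical bars.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Position measured along the modulation direction.
            u = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * u / wavelength))
        kernel.append(row)
    return kernel

def grating_patch(size, wavelength, theta, contrast=1.0):
    """Cosine grating stimulus sampled on the same grid as the kernel."""
    half = size // 2
    return [
        [contrast * math.cos(2 * math.pi
                             * (x * math.cos(theta) + y * math.sin(theta))
                             / wavelength)
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]

def linear_response(kernel, patch):
    """Weighted sum of the patch: the output of a purely linear filter."""
    return sum(k * p
               for k_row, p_row in zip(kernel, patch)
               for k, p in zip(k_row, p_row))

if __name__ == "__main__":
    kernel = gabor_kernel(11, 4.0, 0.0, 2.0)
    preferred = linear_response(kernel, grating_patch(11, 4.0, 0.0))
    orthogonal = linear_response(kernel, grating_patch(11, 4.0, math.pi / 2))
    print("preferred orientation:", preferred)
    print("orthogonal orientation:", orthogonal)
```

The response is large for a grating at the filter's preferred orientation and close to zero for the orthogonal one, and it scales in proportion to stimulus contrast, which is what "linear" means here. A single response therefore confounds orientation and contrast, one reason the local measurements are ambiguous on their own.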
Perceptual Grouping
Local measurements are often ambiguous; a dark patch might be paint or a shadow. The brain resolves this ambiguity by grouping elements that likely belong together. “Things that are similar tend to group together. Things that are different tend to pop out as separate.” Gestalt principles—similarity, common fate, proximity, good continuation, and closure—capture the statistical regularities of the natural world. Edges in the environment tend to be smooth, so the visual system treats smoothly aligned line segments as belonging to the same contour. At the computational level this process is a probabilistic inference based on world statistics. Neural circuitry implements the inference: neurons whose receptive fields lie along a common contour excite each other, strengthening the representation of that contour while inhibitory interactions suppress competing configurations.
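The good-continuation idea can be sketched as a pairwise affinity between oriented edge elements. The example below is a toy model, not the lecture's or any published formulation: the Gaussian falloffs and the parameters `sigma_dist` and `sigma_angle` are hand-picked assumptions standing in for statistics that would, in principle, be measured from natural images.

```python
import math

def orientation_difference(a, b):
    """Smallest absolute difference between two orientations (period pi)."""
    d = (a - b) % math.pi
    return min(d, math.pi - d)

def contour_affinity(e1, e2, sigma_dist=3.0, sigma_angle=0.4):
    """Toy score for how likely two edge elements lie on one smooth contour.

    Each element is (x, y, theta), with theta the edge orientation in radians.
    The score is high when the elements are close together (proximity) and
    both are aligned with the line joining them (good continuation).
    """
    (x1, y1, t1), (x2, y2, t2) = e1, e2
    dx, dy = x2 - x1, y2 - y1
    dist = math.hypot(dx, dy)
    phi = math.atan2(dy, dx)  # direction of the line joining the elements

    # How far each element's orientation deviates from the joining line.
    dev1 = orientation_difference(t1, phi)
    dev2 = orientation_difference(t2, phi)

    proximity = math.exp(-dist ** 2 / (2 * sigma_dist ** 2))
    continuation = math.exp(-(dev1 ** 2 + dev2 ** 2) / (2 * sigma_angle ** 2))
    return proximity * continuation

if __name__ == "__main__":
    reference = (0.0, 0.0, 0.0)
    print("collinear neighbor: ", contour_affinity(reference, (2.0, 0.0, 0.0)))
    print("orthogonal neighbor:", contour_affinity(reference, (2.0, 0.0, math.pi / 2)))
    print("distant collinear:  ", contour_affinity(reference, (10.0, 0.0, 0.0)))
```

In a network reading of this sketch, a high affinity would correspond to strong mutual excitation between the two neurons, and a low affinity to weak or no excitatory coupling. The inhibitory competition between alternative groupings described above is not modeled here.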
Illusory Contours and Completion
The visual system can perceive edges that lack local contrast, a phenomenon called modal completion. For example, a white strip on a pair of ovals induces the perception of a continuous edge even though no luminance change exists at that location. Amodal completion extends this idea, allowing the brain to infer a hidden contour behind an occluder without explicitly seeing the edge. Relatability—how smoothly two contour fragments can be connected—guides this inference. Recordings from V2 neurons reveal responses to illusory contours, indicating that these cells encode the perceived border rather than merely reacting to local image energy. Rüdiger von der Heydt first reported these V2 responses in 1984.
Figure and Ground
Assigning figure versus ground is an ill‑posed problem that the visual system solves with statistical cues. Size, convexity, and parallelism help predict which side of a contour belongs to the figure: smaller regions, convex regions, and regions bounded by parallel contours tend to be seen as figure. When the cues fail to resolve the ambiguity, perception can become bistable, flipping between alternative interpretations of the same edge. V2 neurons exhibit “border ownership,” signaling which side of a contour is perceived as the figure even when the stimulus within the receptive field remains unchanged.
Lightness Perception
Lightness perception separates surface reflectance from illumination. Luminance equals reflectance multiplied by illumination, and “you can't unmultiply [reflectance and illumination]. So it's a classic example of an ill‑posed problem.” The visual system nonetheless estimates reflectance (the intrinsic pigmentation of a surface) despite varying lighting conditions, achieving lightness constancy: the same shade of gray looks the same across different illumination levels. Lambertian surfaces, which scatter incident light equally in all directions, simplify this computation, whereas specular or translucent (non‑Lambertian) materials introduce additional complexity. As the lecturer puts it, “We're going to turn shadows into paint in your head,” highlighting the brain’s active inference about material properties.
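A short numerical sketch shows why the product cannot be "unmultiplied". The values and function names below are illustrative assumptions: reflectance is treated as a fraction between 0 and 1, and illumination is in arbitrary units.

```python
def luminance(reflectance, illumination):
    """Luminance reaching the eye: reflectance times illumination."""
    return reflectance * illumination

def consistent_explanations(observed, candidate_illuminations):
    """List (reflectance, illumination) pairs that reproduce one luminance.

    Pairs that would need a reflectance above 1.0 are dropped, since a
    surface cannot reflect more light than it receives.
    """
    pairs = []
    for illumination in candidate_illuminations:
        reflectance = observed / illumination
        if reflectance <= 1.0:
            pairs.append((reflectance, illumination))
    return pairs

def luminance_ratio(lum_a, lum_b):
    """Ratio of two luminances; equals the reflectance ratio when both
    surfaces share the same illumination (an assumption of this sketch)."""
    return lum_a / lum_b

if __name__ == "__main__":
    # Dark paint in bright light and lighter paint in dimmer light
    # send the same luminance to the eye.
    print(luminance(0.25, 80.0), luminance(0.5, 40.0))

    # One measurement is compatible with many scene interpretations.
    print(consistent_explanations(20.0, [10.0, 40.0, 80.0]))

    # Under shared illumination, the ratio between neighboring surfaces
    # does not change when the light level changes.
    for light in (40.0, 400.0):
        print(luminance_ratio(luminance(0.25, light), luminance(0.5, light)))
```

The last loop illustrates one constraint that makes constancy possible in principle: if two neighboring surfaces are lit by the same source, their luminance ratio depends only on their reflectances. This holds only under the shared-illumination assumption. At a shadow boundary it fails, which is exactly the paint-versus-shadow ambiguity.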
Takeaways
- Mid-level vision bridges linear early measurements and object recognition by making probabilistic inferences about ambiguous local cues.
- Perceptual grouping relies on statistical regularities such as smooth edges, implemented through excitatory interactions among neurons aligned along a contour.
- Illusory contours arise from modal completion, with V2 neurons encoding perceived edges even without local contrast; amodal completion infers contours hidden behind an occluder.
- Figure–ground assignment uses cues like size, convexity, and parallelism, and V2 border‑ownership cells signal which side of a contour belongs to the figure.
- Lightness constancy separates surface reflectance from illumination, but non‑Lambertian materials complicate the visual system’s estimate.
Frequently Asked Questions
How does the visual system use world statistics for perceptual grouping?
It treats grouping as a probabilistic inference, internalizing the empirical probability that two line segments belong to the same object based on their orientation and position. Neural circuits reinforce aligned neurons through mutual excitation, strengthening the perceived contour while suppressing alternatives.
What neural evidence supports the perception of illusory contours?
Recordings from V2 neurons show responses to edges that lack local contrast, indicating that these cells encode the perceptual border rather than merely reacting to image energy. This finding, first reported by Rüdiger von der Heydt in 1984, demonstrates a neural basis for modal completion.