CV_V unit notes
• The study of how shape can be inferred from such cues is sometimes called shape from X, since the individual instances are called shape from shading, shape from texture, and shape from focus.
• You can often perceive the shape of an object from its shading variation alone: as the surface normal changes across the object, the apparent brightness changes as a function of the angle between the local surface orientation and the incident illumination.
• The problem of recovering the shape of a surface from this intensity variation is
known as shape from shading.
• Most shape from shading algorithms assume that the surface under consideration has uniform albedo and reflectance, and that the light source directions are either known or can be calibrated using a reference object. Under the assumption of a distant light source and observer, the variation in intensity (the image irradiance equation) becomes purely a function of the local surface orientation: I(x, y) = R(p(x, y), q(x, y)), where (p, q) = (z_x, z_y) are the surface slopes and R(p, q) is the reflectance map.
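A minimal sketch of this reflectance map for a Lambertian surface, assuming Horn's gradient-space convention in which the normal is proportional to (−p, −q, 1) and the light direction is parameterized by gradient-space coordinates (ps, qs); the function name and the sample light direction are illustrative:

```python
import numpy as np

def reflectance_map(p, q, ps, qs, rho=1.0):
    """Lambertian reflectance map R(p, q).

    p, q   : surface slopes (z_x, z_y); the unit normal is (-p, -q, 1)/norm
    ps, qs : gradient-space coordinates of the (distant) light source
    rho    : uniform albedo
    """
    num = rho * (1.0 + p * ps + q * qs)
    den = np.sqrt(1.0 + p**2 + q**2) * np.sqrt(1.0 + ps**2 + qs**2)
    return np.clip(num / den, 0.0, None)  # self-shadowed orientations clamp to 0

# Evaluate R over a grid of orientations for a light at (ps, qs) = (0.5, 0.3).
p, q = np.meshgrid(np.linspace(-2, 2, 201), np.linspace(-2, 2, 201))
R = reflectance_map(p, q, 0.5, 0.3)
```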
Photometric stereo
• Another way to make shape from shading more reliable is to use multiple light sources that can be selectively turned on and off. This technique is called photometric stereo, since the light sources play a role analogous to that of the cameras placed at different locations in traditional stereo.
• For each light source, we have a different reflectance map, R1(p, q), R2(p, q), etc.
Given the corresponding intensities I1, I2, etc. at a pixel, we can in principle recover
both an unknown albedo ρ and a surface orientation estimate (p, q).
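A minimal per-pixel sketch of this recovery, assuming a Lambertian surface and at least three known, non-coplanar light directions; the light matrix and test values are illustrative:

```python
import numpy as np

# Calibrated unit light directions, one row per source (values illustrative).
L = np.array([[0.0, 0.0, 1.0],
              [0.7, 0.0, 0.7],
              [0.0, 0.7, 0.7]])
L = L / np.linalg.norm(L, axis=1, keepdims=True)

def photometric_stereo_pixel(I, L):
    """Recover albedo rho and unit normal n at a single pixel.

    Lambertian model: I_i = rho * (L_i . n), so stacking the k lights gives
    the linear system L g = I with g = rho * n.
    I : (k,) measured intensities, L : (k, 3) unit light directions.
    """
    g, *_ = np.linalg.lstsq(L, I, rcond=None)  # least-squares solve for k >= 3
    rho = np.linalg.norm(g)                    # albedo is the magnitude of g
    return rho, g / rho                        # normal is the direction of g

# Synthetic check with a known normal and albedo rho = 0.8.
n_true = np.array([0.2, -0.1, 0.97])
n_true = n_true / np.linalg.norm(n_true)
I = 0.8 * (L @ n_true)
rho, n = photometric_stereo_pixel(I, L)
print(rho, n)  # ~0.8 and ~n_true
```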
Shape from texture
• The variation in foreshortening observed in regular textures can also provide useful
information about local surface orientation.
• Shape from texture algorithms require a number of processing steps, including the
extraction of repeated patterns or the measurement of local frequencies to compute
local affine deformations, and a subsequent stage to infer local surface orientation.
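Only the deformation-measurement stage is sketched below, using the structure tensor as one possible proxy for local frequency and orientation analysis (this is a crude stand-in, not the full shape from texture pipeline); scipy is assumed available and the function name is illustrative:

```python
import numpy as np
from scipy import ndimage

def texture_orientation_anisotropy(img, sigma=3.0):
    """Local texture orientation and anisotropy from the structure tensor.

    Foreshortening compresses an (assumed isotropic) texture along the tilt
    direction, so local anisotropy is a crude slant cue and the dominant
    orientation a crude tilt cue.
    """
    Ix = ndimage.sobel(img.astype(float), axis=1)
    Iy = ndimage.sobel(img.astype(float), axis=0)
    # Smoothed structure tensor [[Jxx, Jxy], [Jxy, Jyy]] at each pixel.
    Jxx = ndimage.gaussian_filter(Ix * Ix, sigma)
    Jxy = ndimage.gaussian_filter(Ix * Iy, sigma)
    Jyy = ndimage.gaussian_filter(Iy * Iy, sigma)
    # Closed-form eigenvalues of the symmetric 2x2 tensor.
    tr, det = Jxx + Jyy, Jxx * Jyy - Jxy**2
    disc = np.sqrt(np.maximum(tr**2 / 4.0 - det, 0.0))
    lam1, lam2 = tr / 2.0 + disc, tr / 2.0 - disc
    theta = 0.5 * np.arctan2(2.0 * Jxy, Jxx - Jyy)      # dominant orientation
    anisotropy = (lam1 - lam2) / (lam1 + lam2 + 1e-12)  # 0 isotropic .. 1 oriented
    return theta, anisotropy
```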
Shape from focus
• A strong cue for object depth is the amount of blur, which increases as the object’s
surface moves away from the camera’s focusing distance.
• A number of techniques have been developed to estimate depth from the amount of
defocus.
• The amount of blur increases in both directions as you move away from the focus plane. Therefore, it is necessary to use two or more images captured with different focus distance settings.
• The magnification of the object can vary as the focus distance is changed or the
object is moved. This can be modeled either explicitly (making correspondence more
difficult) or using telecentric optics.
• The amount of defocus must be reliably estimated. A simple approach is to average
the squared gradient in a region, but this suffers from several problems, including the
image magnification problem mentioned above.
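A minimal depth-from-focus sketch over a focal stack, assuming the magnification problem above has already been handled (e.g., with telecentric optics) so the images are registered; it uses a smoothed squared-Laplacian focus measure, and scipy plus all names are illustrative:

```python
import numpy as np
from scipy import ndimage

def depth_from_focus(stack, focus_dists, sigma=2.0):
    """Per-pixel depth as the focus setting that maximizes local sharpness.

    stack       : (k, H, W) images taken at k focus distance settings,
                  assumed registered (e.g., telecentric optics)
    focus_dists : (k,) focus distance of each image
    """
    measures = []
    for img in stack:
        lap = ndimage.laplace(img.astype(float))                 # fine detail
        measures.append(ndimage.gaussian_filter(lap**2, sigma))  # local average
    measures = np.stack(measures)              # (k, H, W) focus measures
    best = np.argmax(measures, axis=0)         # sharpest image index per pixel
    return np.asarray(focus_dists)[best]       # (H, W) depth estimate
```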
Surface representations
• Surface representations are techniques used to model and analyze the shape and appearance of objects in three-dimensional space. These representations help in understanding and interpreting visual data.
Surface interpolation
• One of the most common operations on surfaces is their reconstruction from a set of
sparse data constraints, i.e., scattered data interpolation.
• When formulating such problems, surfaces may be parameterized as height fields z = f(x, y), as 3D parametric surfaces x(u, v), or as non-parametric models such as collections of triangles.
• Surfaces can be parameterized as height fields, often written z = f(x, y), where z is the height (or depth) value at each point in the (x, y) plane. In this representation, the surface is defined by a function that gives the height z at each point (x, y), which allows the representation of any surface that can be described in terms of its height relative to a base plane.
• When rendering, these surfaces can be visualized using techniques such as shading, which considers the surface normal derived from the gradient of the height function (a height-field sketch appears at the end of this list).
• Parametric surfaces can be rendered by evaluating the parametric equations over a grid of u and v values, generating a mesh of points that represents the surface.
• Point Clouds: Collections of data points in 3D space, often obtained from depth
sensors or 3D scanning. Each point represents a location on the object's surface.
• Mesh Representations: Composed of vertices, edges, and faces, meshes provide a
more structured way to represent surfaces. They can capture detailed geometric
features. A triangle-based mesh model is a common representation used in 3D
graphics and modeling. It consists of vertices, edges, and faces, where each face is
typically a triangle.
• NURBS (Non-Uniform Rational B-Splines): A mathematical representation that can
define complex curves and surfaces smoothly. NURBS are widely used in CAD and
modeling applications.
• Implicit Surfaces: Defined by a scalar function, where points on the surface satisfy a specific equation (e.g., f(x, y, z) = 0). These can create smooth surfaces without needing explicit connectivity (an implicit-surface sketch also appears at the end of this list).
• Voxel Grids: 3D grids where each voxel (volume element) contains information
about the space it occupies. Voxel representations are useful in volumetric rendering
and medical imaging.
• Depth Maps: 2D images where each pixel value represents the distance from the
camera to the nearest surface point.
• Level Sets: A numerical technique used to track shapes and interfaces. Level sets can
represent dynamic surfaces that change over time.
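As referenced in the height-field bullets above, a minimal sketch of scattered data interpolation into a height field z = f(x, y), followed by normals computed from the gradient of the height function for simple Lambertian shading; scipy is assumed available and the synthetic sample points are illustrative:

```python
import numpy as np
from scipy.interpolate import griddata

# Scattered height samples (x, y) -> z, standing in for sparse range data.
rng = np.random.default_rng(0)
xy = rng.uniform(-1, 1, size=(200, 2))
z = np.sin(np.pi * xy[:, 0]) * np.cos(np.pi * xy[:, 1])

# Scattered data interpolation onto a regular grid: the height field z = f(x, y).
gx, gy = np.meshgrid(np.linspace(-1, 1, 128), np.linspace(-1, 1, 128))
Z = griddata(xy, z, (gx, gy), method='cubic', fill_value=0.0)

# Surface normals from the gradient of the height function, n ~ (-z_x, -z_y, 1).
dzdy, dzdx = np.gradient(Z, gy[:, 0], gx[0, :])
n = np.dstack([-dzdx, -dzdy, np.ones_like(Z)])
n = n / np.linalg.norm(n, axis=2, keepdims=True)

# Simple Lambertian shading under a distant light, for visualization.
light = np.array([0.3, 0.3, 0.9])
light = light / np.linalg.norm(light)
shading = np.clip(n @ light, 0.0, 1.0)  # (128, 128) shaded image
```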
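And the implicit-surface sketch referenced above: a sphere defined by f(x, y, z) = x² + y² + z² − r² is sampled on a voxel grid and its zero level set extracted as a triangle mesh. scikit-image is assumed available for marching cubes, and the returned vertices are in voxel (index) coordinates:

```python
import numpy as np
from skimage import measure  # assumed available for isosurface extraction

# Sample f(x, y, z) = x^2 + y^2 + z^2 - r^2 on a voxel grid; the surface is f = 0.
r = 0.8
g = np.linspace(-1.0, 1.0, 64)
X, Y, Z = np.meshgrid(g, g, g, indexing='ij')
F = X**2 + Y**2 + Z**2 - r**2

# Extract the zero level set as a triangle mesh. No explicit connectivity is
# stored in the implicit form; marching cubes creates it.
verts, faces, normals, values = measure.marching_cubes(F, level=0.0)
print(verts.shape, faces.shape)  # vertices are in voxel (index) coordinates
```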
Volumetric representations
• Volumetric representation in 3D reconstruction involves creating a three-dimensional
model by capturing the volume of an object rather than just its surface. This
approach is particularly useful in various fields, including computer vision, medical
imaging, and virtual reality.
• Voxel Grids: A volumetric representation typically uses voxels (3D pixels) to divide
space into a grid. Each voxel contains information about the material or density at
that location, allowing for detailed internal structures.
• Methods of Reconstruction:
• CT and MRI Scanning: Common in medical imaging, these techniques generate
volumetric data by slicing through an object and capturing internal structures.
• 3D Scanning: Techniques like laser scanning or structured light capture surface
points, which can then be converted into a volumetric model.
• Rendering Techniques:
• Ray Marching: Used for rendering volumetric data, where rays are cast through the voxel grid, sampling density and color at each step (a minimal sketch follows this list).
• Direct Volume Rendering: Displays 3D data without converting it to a surface, useful
for visualizing complex structures like those in medical scans.
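The minimal ray-marching sketch referenced above, restricted to orthographic rays marching front to back along the z-axis of the voxel grid (a real renderer would cast arbitrary rays and interpolate samples); the function name and toy volume are illustrative:

```python
import numpy as np

def ray_march_z(density, color, step=1.0):
    """Front-to-back compositing along the z-axis of a voxel grid.

    density : (D, H, W) per-voxel density
    color   : (D, H, W) per-voxel emission (scalar here for simplicity)
    Returns an (H, W) image and the accumulated per-pixel opacity.
    """
    alpha = 1.0 - np.exp(-density * step)  # opacity contributed by each sample
    T = np.ones(density.shape[1:])         # transmittance remaining along each ray
    img = np.zeros(density.shape[1:])
    for a, c in zip(alpha, color):         # march slice by slice, front to back
        img += T * a * c                   # composite this sample
        T *= (1.0 - a)                     # attenuate what lies behind it
    return img, 1.0 - T

# Toy volume: a dense ball in the middle of an otherwise empty grid.
g = np.linspace(-1.0, 1.0, 64)
X, Y, Z = np.meshgrid(g, g, g, indexing='ij')
density = 4.0 * (X**2 + Y**2 + Z**2 < 0.25)
image, opacity = ray_march_z(density, color=np.ones_like(density))
```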
Image-based Rendering: View Interpolation
• Image-based rendering (IBR) is a technique used in computer graphics to generate
new views of a scene from a set of input images. One of the key methods within IBR
is view interpolation, which creates intermediate views between existing images.
Here's an overview of how it works and the techniques involved:
• Definition: View interpolation refers to the process of generating new images from a given set of images taken from different viewpoints. It is particularly useful in scenarios where capturing every possible viewpoint is impractical.
Techniques for View Interpolation
• Linear Interpolation:
• For two images taken from slightly different viewpoints, linear interpolation can be used. By blending the pixel values based on a parameter t (ranging from 0 to 1), you can create a new view that appears to lie between the two original images.
• This method is simple but may not handle occlusions or depth variations well (a minimal warp-and-blend sketch appears after this list).
• Depth-Image Based Rendering (DIBR):
• This approach uses depth information along with the color images. By
reconstructing the scene geometry based on depth maps, you can project the
images from new viewpoints.
• DIBR can handle occlusions more effectively, allowing for more realistic
interpolated views.
• Multi-View Stereo (MVS):
• MVS techniques estimate depth and 3D structure from multiple images. Once
the depth is known, it can be used to render new views by projecting the 3D
points onto the desired image plane.
• This method often involves complex algorithms to ensure accuracy in depth
estimation.
• Image Warping:
• This technique involves transforming the images to align them based on their
perspective. By using homographies or projective transformations, images
can be warped to create a seamless transition between views.
• Warping can be combined with depth information for improved results.
• Texture Mapping and View Synthesis:
• Using texture mapping techniques, one can synthesize new views by sampling
texture from existing images based on the projected geometry. This method
can leverage both color and depth information.
• Algorithms like the Lumigraph allow for more complex light interactions,
creating convincing views.
• Machine Learning Approaches:
• Recent advances in deep learning have led to neural networks that can learn
to generate new views from a set of images. Techniques like generative
adversarial networks (GANs) and convolutional neural networks (CNNs) can
produce high-quality interpolations.
• These models learn the underlying patterns of the data, allowing for more
sophisticated rendering that can handle complex scenes and lighting.
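The sketch referenced above combines the linear interpolation and image warping ideas: one view is warped into the other's frame with a homography, then the two are cross-dissolved. OpenCV is assumed available, H_ab is assumed known (e.g., from feature matching), and a plain cross-dissolve only approximates intermediate viewpoints since it ignores occlusions and parallax:

```python
import numpy as np
import cv2  # assumed available; any projective warping routine would do

def interpolate_views(img_a, img_b, H_ab, t):
    """Cross-dissolve two views after aligning them with a homography.

    img_a, img_b : input views of the same size
    H_ab         : 3x3 homography mapping view A coordinates to view B coordinates
    t            : blend parameter in [0, 1]
    """
    h, w = img_a.shape[:2]
    # Warp B into A's frame so corresponding pixels line up before blending.
    b_in_a = cv2.warpPerspective(img_b, np.linalg.inv(H_ab), (w, h))
    blended = (1.0 - t) * img_a.astype(np.float64) + t * b_in_a.astype(np.float64)
    return blended.astype(img_a.dtype)
```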
Applications
• Virtual Reality and Augmented Reality: Providing immersive experiences by generating real-time views from limited input images.
• Film and Animation: Creating seamless transitions and panoramic views without the need for extensive re-shooting.
• Video Games: Enhancing graphics and enabling smooth camera movements in 3D environments.
• Architectural Visualization: Allowing clients to see potential designs from various angles without the need for full renders.