
University of Pennsylvania

ScholarlyCommons

Technical Reports (CIS) Department of Computer & Information Science

May 1988

Range Image Segmentation for 3-D Object Recognition


Alok Gupta
University of Pennsylvania

Follow this and additional works at: https://repository.upenn.edu/cis_reports

Recommended Citation
Alok Gupta, "Range Image Segmentation for 3-D Object Recognition", May 1988.

University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-88-32.

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/cis_reports/736


For more information, please contact [email protected].


RANGE IMAGE SEGMENTATION
FOR 3-D OBJECT RECOGNITION
Alok Gupta

MS-CIS-88-32
GRASP LAB 141

Department of Computer and Information Science


School of Engineering and Applied Science
University of Pennsylvania
Philadelphia, PA 19104

May 1988

Acknowledgements: The work reported herein was supported in part by NSF grant DCR-8410771, Air Force grant AFOSR F49620-85-K-0018, Army DAAG-29-84-K-0061, NSF-CER/DCR82-19196 A02, DARPA/ONR N0014-85-K-0807, and U.S. Postal Service contract 104230-87-M-0195.
UNIVERSITY OF PENNSYLVANIA
THE MOORE SCHOOL OF ELECTRICAL ENGINEERING
SCHOOL OF ENGINEERING AND APPLIED SCIENCE

RANGE IMAGE SEGMENTATION FOR

3-D OBJECT RECOGNITION

Alok Gupta

Philadelphia, Pennsylvania
May 1988

A thesis presented to the Faculty of Engineering and Applied Science of the University of
Pennsylvania in partial fulfillment of the requirements for the degree of Master of Science
in Engineering for graduate work in Computer and Information Science.

Ruzena Bajcsy
(Advisor)

Richard Paul
(Graduate Group Chair)
Abstract

Three dimensional scene analysis in an unconstrained and uncontrolled environment is the


ultimate goal of computer vision. Explicit depth information about the scene is of tremen-
dous help in segmentation and recognition of objects. Range image interpretation with
a view to obtaining low-level features to guide mid-level and high-level segmentation and recognition processes is described. No assumptions about the scene are made, and the algorithms are applicable to any general single-viewpoint range image. Low-level features like step edges and surface characteristics are extracted from the images, and segmentation is performed based on individual features as well as combinations of features. A high-level recognition
process based on superquadric fitting is described to demonstrate the usefulness of initial
segmentation based on edges. A classification algorithm based on surface curvatures is used
to obtain initial segmentation of the scene. Objects segmented using edge information are
then classified using surface curvatures. Various applications of surface curvatures in mid
and high level recognition processes are discussed. These include surface reconstruction,
segmentation into convex patches and detection of smooth edges. Algorithms are run on
real range images and results are discussed in detail.
Acknowledgements

I would like to thank my advisor Dr. Ruzena Bajcsy for her guidance and encouragement. I am grateful to Dr. Kwangyoen Wohn for motivating me to work in the field of range image interpretation. Prof. Wohn also guided me in the initial stages of this thesis and furnished
programs for least squares fitting. Thanks to Franc Solina for many ideas and suggestions
and for superquadrics software. Finally, thanks to Gus Tsikos for help with the range image
scanner.
The support of the following contracts and grants is gratefully acknowledged: NSF grant DCR-8410771, Air Force grant AFOSR F49620-85-K-0018, Army DAAG-29-84-K-0061, NSF-CER/DCR82-19196 A02, DARPA/ONR N0014-85-K-0807, and U.S. Postal Service contract 104230-87-M-0195.
Contents

Abstract

Acknowledgements

1 Introduction

2 Acquisition and Preprocessing of Range Images


2.1 Range Image Acquisition
2.2 Scaling of Range Images
2.3 Smoothing of Range Images

3 3-D Edges and Segmentation based on edges


3.1 Edge detection using ∇²G(x,y)
3.2 Segmentation of Range Images using edge information
3.3 Segmentation Results
3.4 Recognition of segmented objects using Superquadrics
3.5 Results of Superquadric Fitting and Classification

4 Surface Characterization and Segmentation


4.1 Differential Geometry of Surfaces
4.2 Computing Surface Characteristics of Range Images
4.2.1 Estimation of partial derivatives of Depth Maps
4.2.2 Results of Initial Segmentation
4.3 Post processing of Labeled scenes
4.3.1 Obtaining Convex patches
4.3.2 Object Surface Classification

5 Discussion

A 2nd order Least squares fitting in symmetric neighborhood

B Source Code Listing


Chapter 1

Introduction

Three dimensional scene analysis in an unconstrained and uncontrolled environment is the


ultimate goal of computer vision. Most of the effort in this regard has gone into extracting three-dimensional information from intensity images and arriving at a meaningful and sufficiently unambiguous interpretation of the scene. However, the problem with monocular vision is the loss of 3-D information, which makes the interpretation process underconstrained. Shape-from-X methods have been widely studied in the last two decades to extract the depth of a scene using texture, shading, color, contour, and motion. Depth extraction from
stereo images is computationally expensive and results in sparse depth maps requiring reconstruction techniques for further interpretation. Range images, on the other hand, are obtained by real-time depth sensors and provide dense 3-D information about the visible surfaces.
Range images are dense depth maps measuring the distance of the physical surface from
a known reference plane. Different types of ranging methods are available to obtain range
information according to the application. Magnetic resonance imaging systems give true 3-D images, i.e., all points in 3-D space are specified. Visible surfaces can be scanned by time-of-flight laser range finders and amplitude-modulated laser range finders. The most
common and cheapest are the triangulation-based scanners. Structured lighting systems
scan the scene with a laser stripe to obtain depth information of the visible surface in a
calibrated workspace. Research interest in range image processing has grown tremendously
in recent years due to the increasing availability of structured-lighting range sensors. While these sensors can be employed in closed environments only and suffer from other drawbacks (like shadows and the inability to sense highly reflective surfaces and some colors), they are useful for real-time scanning of good quality at low cost.
The range images dealt with in this work are of the z(x, y) type (i.e., Monge-patch surfaces), where each pixel gives the Z-depth at the coordinates x and y. Since range images (or depth maps) contain explicit 3-D information about the scene, it is expected that surface description and object recognition should be easier to handle with range images. However, if the scene is complicated, the problem cannot be solved with range images as easily as one might think. Intensity information can be used to complement range information where ambiguity arises in interpretation, but this involves registration and correspondence problems and may even complicate the analysis.
Representation of range images is just like that of reflectance images. A two-dimensional array of depth values specifying (x, y, z) coordinates with respect to a known coordinate frame is enough for most applications. This allows many low-level intensity image processing techniques to be used directly on range images by interpreting the pixel value as 'depth' instead of 'reflectance value'. Contrast and brightness, however, have to be interpreted as surfaces of varying depths.
We have addressed the problem of object and surface segmentation in this report. Seg-
mentation is essentially goal oriented. It can be conveniently divided into two processes: initial segmentation and final segmentation. The initial segmentation process is the result of local computations done in a known neighborhood of every pixel in the image. The final segmentation process refines the initial segmentation using global constraints to arrive
at a global interpretation of the scene. We have not assumed any domain knowledge or limited the objects to be of a certain type. Our goal is to study boundary-based segmentation, surface-based segmentation, and the integration of the two methods. It is possible to segment the scene into flat, convex, and concave subparts, with detailed descriptions of individual parts, using boundary- and surface-based techniques.
An important aspect of the object recognition problem is the robustness of the recogni-
tion approach. It is essential that the algorithm be size-invariant, position-invariant, and orientation-invariant, and be able to recognize partially occluded objects. As observed by Besl and Jain [9], it is known from results in differential geometry that Gaussian and mean curvature are visible-invariant features of a surface region, in the sense that they do not change under viewpoint transformations that do not affect the visibility of that region. When a surface region is visible, its curvature measurements are invariant to changes in surface parametrization and to translations and rotations. The invariant property is important for
3-D object recognition. Since our final segmentation process will depend on the local computations, it is necessary that the low-level features be invariant.
While these two approaches are domain independent, any high level recognition approach
like model based interpretation makes use of domain specific knowledge. We use a high
level volumetric approach using superquadrics described in [23] to illustrate the usefulness
of initial segmentation in high level vision. Figure 1 presents the paradigm explored in this
work.
It is clear that the processing of range images can be divided into three major stages: low level, intermediate level, and high level. After the range image is acquired from the sensor, it needs to be smoothed before any useful operations can be performed on it. Though smoothing creates localization problems, it reduces the effect of quantization, which is important for surface
fitting. Low-level processing is data-driven, with the objective of obtaining useful local features that can be used by higher processing stages. Three-dimensional edges constitute important features. We have used the Laplacian of Gaussian operator of [24] to detect step edges. Smooth edges have a different significance in the case of range images and are more difficult to detect. This will be discussed in chapter 3 in detail.
Computation of curvature involves computing first and second order derivatives at every
pixel in the image. Based on curvature signs, initial segmentation of the scene is performed.
This is further improved by region growing done with global constraints.

[Figure 1: A paradigm for range image segmentation and 3-D object recognition. The scanner output is preprocessed (scaling, smoothing); edge detection uses the Laplacian of Gaussian, and second-order derivatives are computed by least-squares fitting; the scene is then segmented by region growing based on edges (segmenting the scene into objects) and by pixel labeling using Gaussian and mean curvature; finally, model fitting using superquadrics classifies each object on its superquadric parameters, while a surface classification procedure (region growing by least-squares fitting, classification by histogramming) classifies the object surfaces.]

Haralick et al. [6] have described a mathematical treatment for describing the topographic primal sketch of the underlying gray-tone intensity surface of a digital image. They use first and second directional derivatives to classify each picture element as one of peak, pit, ridge, ravine, saddle, flat, and hillside. Michael Brady et al. [4, 5, 7] describe a study of classes of curves as a source of
constraint on the surface on which they lie, and as a basis for describing it. Their approach
gives a curvature primal sketch of the surface. Tracing lines of curvature in real range
images is very unreliable due to the low x-y resolution of the scanner and quantization and
other sensing errors. Besides, it is noise-sensitive and computationally expensive. Besl and Jain [25, 9, 13] have done a comprehensive study of invariant surface characteristics and presented an algorithm for variable-order surface fitting for image segmentation. They have summarized the field of 3-D object recognition in their excellent survey [3].
A scale-space based algorithm for extraction and representation of physical properties of a surface, using curvature properties of the surface, is discussed in Fan, Medioni, and Nevatia [14]. Nackman [19] has described two-dimensional critical point configuration graphs for describing the behavior of smooth functions of two variables by extracting the peaks (local maxima), pits (local minima), and passes (saddle points) of a surface. Our approach is not to go into too much detail of the surface but to label the surface as flat, convex, or concave accurately. Thus local variations are ignored in favour of a more global interpretation. Yang
and Kak [33] describe an algorithm to analyze the topmost object in a pile. They compute derivatives by fitting B-splines and use local curvature information to label the object as flat or curved. Their method can only handle one type of surface for the topmost object in the scene, and it has other problems in assuming that step edges form a closed contour, which is not true in a general range image, as described in chapter 3. A new approach for surface classification using characteristic contours is proposed by Sethi and Jayaramamurthy [20].
Characteristic contours are defined as the loci of the points where the surface normals are at a constant inclination to a selected reference vector. However, this requires a segmented surface and the normal vector at every point, which limits its usefulness to surface classification in the final stage of the recognition process.
There are specific methods available to process images acquired using a light-stripe rangefinder. Smith and Kanade [34] have done contour classification of light stripes to produce object-centered 3-dimensional descriptions. Another method, by Martin Herman [35], extracts detailed, complete descriptions of polyhedral objects from light-stripe rangefinder data.
Segmentation of a scene into surface primitives is useful in many applications. Most of the techniques discussed above involve curvature determination. Hebert and Ponce [8] have used surface normals (the Extended Gaussian Images) to classify surfaces into three simple primitive surfaces: planar, cylindrical, and conic regions. Duda, Nitzan, and Barrett [36] have presented an algorithm for detecting planar regions using registered range and reflectance data.
Most of the high-level recognition approaches include model matching. Kuan and Drazovich [37] have represented objects as viewpoint-independent volumetric models based on generalized cylinders. They perform feature-to-model matching based on low-level features derived from range imagery. Constructing the 3-D model of an object involves integrating data or descriptions of an object obtained from multiple views and representing this integrated data in a coherent manner. Vemuri and Aggarwal [38] have presented an algorithm for automatic construction of models by determining the orientation of the object in the calibrated workspace and representing the object in cylindrical coordinates. Their method does not require correspondence to be established, but it requires registered intensity and range data of the scene while building the model. We have used superquadric models to recognize segmented objects. The classification procedure matches superquadric parameters with the parameters of the identifiable models. Since models are well defined by eleven superquadric parameters, there is no need to build models of the objects in advance.
Chapter 2

Acquisition and Preprocessing of


Range Images

Range images obtained by different scanners differ in the format of the output. In order to apply low-level techniques to the image, it is necessary that the image points be quantized in Z-depth format with an equal resolution factor in the X and Y directions. Once converted into Z-depth format, the image is smoothed. This chapter discusses some practical aspects of real range image processing which are important if any useful results are desired.

2.1 Range Image Acquisition

The test images used in this work were acquired by structured-lighting, triangulation-based scanners. Figure 2 shows the ranging geometry of a typical range sensor. The trigonometry of the sensor will not be described here.
Either the laser stripe moves and scans the scene, or the workspace moves under a vertical laser stripe. If the viewpoint of the sensing camera is not the same as that of the laser, then shadows (regions with missing data) are obtained. In order to discriminate between shadows and the background (the region of known depth on which the object is sitting), the background is assigned a nonzero depth.
[Figure 2: Ranging geometry of a structured-lighting scanner (a) and the Z-depth format of range images (b).]


Contrary to the popular assumption made by researchers, it may not always be possible to represent the visible surface in Z-depth format viewing perpendicular to the background. To be able to represent all the scanned points in Z-depth format, it is necessary to digitize the scene viewing parallel to the camera's line of sight. This may require rotating the scene to align the Z axis along the line of sight of the camera, thereby rotating the background, which is then no longer of constant depth. The segmentation procedure should take this into account. Also, this makes the processing viewpoint dependent. To avoid the trouble arising from this, it is often convenient to fix the viewpoint at the cost of losing some scanned points. This problem is acute with images obtained from the White scanner, where f(x, y) is not unique. The solution is to segment the scene from the background and then rotate the scene to obtain the Z-depth image.

2.2 Scaling of Range Images

The sampling interval of the scanners depends on the thickness of the laser stripe, the value of the laser stripe increment, and the resolution of the camera. More often than not, vertical resolution (along the Y axis) is different from horizontal resolution (along the X axis). Thus the sampled points are not spaced uniformly in the X and Y directions. Since we apply neighborhood operators during low-level processing of images, it is necessary to rescale the images uniformly in both directions. We have rescaled the Z-depth image by fitting a plane on three neighborhood points. Figure 3 illustrates the difference between unscaled and uniformly scaled images.

2.3 Smoothing of Range Images

Depth resolution of a range image is an important parameter in low-level processing. Range scanners usually have depth resolution good enough for most applications. In fact, a resolution of 0.01 inch/pixel is too fine and noise-sensitive for surface-fitting purposes. The problem comes in the quantization of z values. If the entire scan depth is quantized within 8 bits (the most convenient representation), the effective depth resolution is drastically reduced, thereby increasing the quantization error.

[Figure 3: Z-depth format images. Left: original resolution. Right: uniformly scaled.]

Since surface fitting is very sensitive to quantization error, we have minimized it by the following two-step procedure:

1. The original depth resolution is preserved by storing the depth value unscaled in 2 bytes. This allows 64K possible quantization levels. Scaling along the Z axis is done only when needed.

2. The image is smoothed using a Gaussian operator, and the smoothed values are stored in floating-point buffers so as not to lose any precision.

One way to reduce noise is to perform median filtering of the image. It ensures that
isolated noise is reduced and edges are not smoothed.
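For concreteness, a standard 3×3 median filter looks like the following sketch (generic code, not from the thesis):

#include <stdlib.h>

static int cmp_float(const void *a, const void *b)
{
    float d = *(const float *)a - *(const float *)b;
    return (d > 0) - (d < 0);
}

/* Replace each interior pixel by the median of its 3x3 neighborhood;
 * isolated outliers are rejected while step edges are preserved. */
void median3x3(const float *in, float *out, int w, int h)
{
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            float v[9];
            int n = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    v[n++] = in[(y + dy) * w + (x + dx)];
            qsort(v, 9, sizeof v[0], cmp_float);
            out[y * w + x] = v[4];   /* the middle of the nine values */
        }
}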
Our approach is to study the scale-space behavior of range images. We have used the Gaussian operator to smooth images. The Gaussian function in two dimensions is given by:

G(x, y) = (1 / 2πσ²) e^(−(x² + y²) / 2σ²)

Since it is separable in the X and Y directions, a one-dimensional Gaussian operator is applied separably to the image. Smoothing is controlled by the size of the operator, which is determined by σ. The Gaussian operator has some nice properties that make it a unique operator for our purposes.
Yuille and Poggio [39] have proved that the Gaussian low-pass filter is the only filter with a nice scaling behavior for linear derivative operators like the Laplacian. It also satisfies the following conditions:

1. Filtering is shift-invariant and is therefore a convolution.

2. The filter has no preferred scale. The filter is properly normalized at all scales.

3. The filter recovers the whole image at sufficiently small scales:

lim_{σ→0} F(x, σ) = δ(x)

where δ(x) is the Dirac delta function.

4. The position of the center of the filter is independent of the scale of the filter. Otherwise the zero crossings of a step edge would change their position with a change of scale.

5. The filter goes to zero as |x| → ∞ and as σ → ∞.

We have studied the behavior of increasing σ on edge detection, surface characterization, and segmentation. As the σ value increases, the window size of the Gaussian operator increases and details are lost. Figures 4 and 5 show perspective plots of a range image before and after smoothing, respectively.
Minor surface perturbations are smoothed easily. But the undesirable effect of uniform application of the Gaussian operator is the smoothing of all types of edges. Step edges (chapter 3) are smoothed to form roughly convex and concave subparts (chapter 4). This complicates edge detection, especially detection of smooth edges (concave and convex edges), which are

[Figure 4: 3-D perspective plot of original image]

[Figure 5: 3-D perspective plot of smoothed image (σ = 1)]
further smoothed. Thin objects tend to merge into the underlying objects, making segmentation difficult. As will be discussed in chapter 3, as we go up the scale, objects start merging. We have found a σ value of 1 (window size 5 × 5) to be best suited for our experiments. In the surface-based segmentation technique, smoothing alters the local behavior of the surface but makes the result more reliable, especially away from the edges. Step edges show up as adjacent convex and concave regions. Segmentation using these effects of smoothing is discussed in chapter 4 in detail.
Chapter 3

3-D Edges and Segmentation based on edges

[Figure 6: Edges in range images — step, concave roof, convex roof, convex ramp, concave ramp.]

The ∇²G operator allows boundaries to be read off as zero crossings of the LOG-operated image. We'll discuss the significance of the LOG operator in view of range images. While step edges pose no particular problem, smooth edges are difficult to detect by local operations. In range image segmentation it is of particular interest that a pile of objects be segmented into convex subparts. This requires detection of the concave edges that delimit the convex subparts. Mitiche and Aggarwal [28] have presented a probabilistic approach to detecting convex and concave edges by using domain-specific constraints.
Though 3-D edges are quite useful for object recognition, there are some inherent limitations in edge information that limit their use to aiding the higher-level recognition processes along with a host of invariant features. Edge classification depends on the orientation of the object in 3-D space and is therefore not an invariant feature. Thus edge information cannot be the only feature used by the recognizer, and it has to be used in conjunction with other features. However, as will be seen later, edge information is good enough for early segmentation of range images because the requirement of invariant features does not apply to the initial segmentation process.
In case intensity information is available, range data can be complemented by reflectance data to pick up weak 3-D edges like the step edges created by overlapping thin objects. Wong and Hayrapetian [32] used range information to segment intensity images. Gil, Mitiche, and Aggarwal [27] have described experiments in combining intensity and range edges. While intensity images are certainly useful in detecting edges in the scene, they need to be registered in the same way as range images to avoid the correspondence problem. This may
[Figure 7: Derivatives of a cross-section of a range image — a vertical section of the range image together with its first and second derivatives.]

Next we consider the possibility of extracting smooth edges using the ∇²G operator.


The Gaussian distribution in one dimension is defined as:

G(x) = (1 / √(2π) σ) e^(−x² / 2σ²)

The first and second derivatives are:

G'(x) = −(x / σ²) G(x)
G''(x) = ((x² − σ²) / σ⁴) G(x)

The cross-section of a range image and the profiles of the first and second order derivatives are shown in figure 7.
In two dimensions the LOG operator becomes:

∇²G(x, y) = (1 / πσ⁴) ( (x² + y²) / 2σ² − 1 ) e^(−(x² + y²) / 2σ²)
[Figure 8: 3-D edges detected in a synthetic range image. Upper left: original image. Upper right: thresholded step edges. Lower left: thresholded convex edges. Lower right: thresholded concave edges.]

It is clear that the behavior of the second order derivatives is unique at every type of change in the surface. There are positive spikes at a concave ramp edge, negative spikes at convex roof and ramp edges, and zero crossings at jump edges. However, there are serious practical limitations in using this response to detect concave and convex edges. The response at these edges depends on the convexity or concavity of the edge, which is roughly the measure of the angle at which the two surfaces meet. If the angle is too small and the change in depth is gradual, as in most situations, the response will be below or the same as that due to local surface changes. See figure 8 for the step, convex, and concave responses obtained in a synthetic range image having planar regions. Even in the synthetic range image the responses deteriorated as the image was smoothed, and the smooth concave and convex edges virtually disappeared.
Thresholding of zero crossings is necessary in the case of range images to avoid local surface perturbations. Responses due to weak concave and convex edges are then filtered out. Zero-crossings generated by weak step edges may also lie below the threshold value. In range images, the value of a zero-crossing has a direct relationship with the magnitude of the depth discontinuity. Thus the selection of a threshold effectively restricts the minimum detectable depth. An object with less than the acceptable height will be invisible in the edge image.
As observed in the previous chapter, it is absolutely necessary to smooth an image before attempting any local operation on it. The LOG operator gives the following image:

∇²[ G(x, y) * I(x, y) ]

which can be written as:

[ ∇²G(x, y) ] * I(x, y)
The degree of smoothing depends on the value of σ, which controls the size of the window. The larger the σ, the greater the smoothing. While this effect is interpreted in intensity images as blurring and hence a reduction of detail, in range images it is seen as a smoothing of the surface at the boundaries in addition to the reduction of detail. This causes all types of boundaries to become smoothed, and it can have undesirable effects on boundary detection and surface-based segmentation. We have observed that with increasing scale value, range images lose vital boundary information, presenting difficulties in edge-based segmentation. An empirically determined window size of 5 (σ = 1) is chosen for processing all the images.
The algorithm for edge detection is given below.

1. Read in the range image.

2. Convolve the image with the Gaussian operator, separably in the X and Y directions.

3. Convolve the G(x,y) * I(x,y) image with the Laplacian operator.

4. Label the zero-crossing (with maximum value) at every pixel in the ∇²G(x,y) * I(x,y) image. Also mark the direction along which the maximum crossing value is found.

5. Threshold the image at a predetermined value to label pixels as belonging to step edges.
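The following C sketch shows one way to realize steps 2-5. It is our reconstruction, not the thesis source listing of appendix B; the names log_filter and zero_cross and the buffer handling are assumptions, and the direction bookkeeping of step 4 is omitted for brevity.

#include <stdlib.h>
#include <math.h>

/* 1-D convolution pass with edge clamping; horiz selects X or Y. */
static void convolve_1d(const float *in, float *out, int w, int h,
                        const float *k, int n, int horiz)
{
    int r = n / 2;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            float s = 0.0f;
            for (int i = -r; i <= r; i++) {
                int xx = horiz ? x + i : x;
                int yy = horiz ? y : y + i;
                if (xx < 0) xx = 0;
                if (xx >= w) xx = w - 1;
                if (yy < 0) yy = 0;
                if (yy >= h) yy = h - 1;
                s += k[i + r] * in[yy * w + xx];
            }
            out[y * w + x] = s;
        }
}

/* Steps 2-3: separable Gaussian smoothing, then the Laplacian as the
 * sum of 1-D second differences in X and Y. */
void log_filter(const float *img, float *lap, int w, int h, float sigma)
{
    int n = 2 * (int)ceilf(2.5f * sigma) + 1;   /* window, e.g. 5 for sigma = 1 */
    float *g  = malloc(n * sizeof *g);
    float *t1 = malloc((size_t)w * h * sizeof *t1);
    float *t2 = malloc((size_t)w * h * sizeof *t2);
    float sum = 0.0f;
    for (int i = 0; i < n; i++) {
        float u = (float)(i - n / 2);
        g[i] = expf(-u * u / (2.0f * sigma * sigma));
        sum += g[i];
    }
    for (int i = 0; i < n; i++) g[i] /= sum;    /* normalize at all scales */
    convolve_1d(img, t1, w, h, g, n, 1);        /* Gaussian, X pass */
    convolve_1d(t1, t2, w, h, g, n, 0);         /* Gaussian, Y pass */
    const float L[3] = { 1.0f, -2.0f, 1.0f };   /* 1-D second difference */
    convolve_1d(t2, t1, w, h, L, 3, 1);         /* d2/dx2 */
    convolve_1d(t2, lap, w, h, L, 3, 0);        /* d2/dy2 */
    for (long i = 0; i < (long)w * h; i++) lap[i] += t1[i];
    free(g); free(t1); free(t2);
}

/* Steps 4-5: mark a step-edge pixel where the LoG output changes sign
 * and the magnitude of the crossing exceeds the threshold. */
void zero_cross(const float *lap, unsigned char *edge, int w, int h, float thr)
{
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            float c = lap[y * w + x], m = 0.0f;
            float nb[4] = { lap[y * w + x + 1],        /* E  */
                            lap[(y + 1) * w + x],      /* S  */
                            lap[(y + 1) * w + x + 1],  /* SE */
                            lap[(y + 1) * w + x - 1]   /* SW */ };
            for (int i = 0; i < 4; i++)
                if (c * nb[i] < 0.0f && fabsf(c - nb[i]) > m)
                    m = fabsf(c - nb[i]);
            edge[y * w + x] = (unsigned char)(m > thr);
        }
}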

Figures 11(b), 12(b), and other figures show the magnitude of the zero-crossings detected in range images.
It is observed that threshold selection is important in defining the acceptable depth, which in turn is determined by the amount of detail seen in the filtered image. Also, the threshold value is different at different scales. The threshold value varies inversely with the smoothing parameter (σ). As we go up the scale-space, the need for thresholding decreases. Next we discuss a segmentation technique based on the step edges detected by the LOG operator.

3.2 Segmentation of Range Images using edge information

Segmentation of objects in a range image depends on the actual requirements. One should therefore define the problem of segmentation clearly in the relevant context. In order to recognize an object it is necessary to isolate it, which is not a trivial task. In a practical environment where objects can be of any size and shape, segmentation of individual objects can be difficult. Objects in a range image will always be partially occluded, making the problem of segmentation and recognition difficult. If the scene consists of a heap of objects and we have to recognize one object, then our problem can be simplified by considering the object of immediate importance, the one on top of the heap. But this method has only specific applications, like picking parts out of a bin, and is not useful as a general segmentation strategy.
Another approach to a complete segmentation is to segment the picture according to the requirement. Sometimes image segmentation into different surface types may be useful, and at other times convex objects need to be segmented. Whatever the method, a segmentation process working on local information cannot always give the requisite results. In fact, segmentation at the lower level of processing can at best give locally valid results, which may be conflicting from a global point of view. Thus a robust segmentation process has to work with higher stages of processing to yield globally valid results. This introduces the concept of feedback from higher stages so as to work in a closed loop with the goal of object recognition. Then segmentation can be considered as a part of the object recognition stage.
As discussed before, second order derivatives give enough information to delineate objects at the step boundaries. We will develop an algorithm for segmenting the topmost object in a heap of arbitrary objects.
The first stage of the segmentation process, isolating the objects from the background, can be easily accomplished by thresholding the image at the known background depth. The difficult part is to identify an individual object in the heap of objects. At this point it is essential to define what is meant by an 'individual object'. An object can be a complex combination of various 'primitive objects' like cubes, spheres, cylinders, etc. Some important questions need answering here before any further progress can be made: what are the end boundaries (edges) of the object, what are the internal boundaries of the object, and how do we distinguish between them?
The different types of 3-D edges are jump edges, concave roof edges, concave ramp edges, convex roof edges, and convex ramp edges. Considering any of these edges as the internal or external boundary of the object is going to put restrictions on the object types that can be segmented. For example, a jump edge (unless it is an occluding edge against the background) need not be the end boundary of the actual object. Since the segmentation process is essentially a local operation and no other knowledge is used, this problem cannot be solved at this segmentation level. There are the following solutions to this problem:

1. Make assumptions about the type of the objects. For example, assume the topmost object to be convex. This is a very strong assumption and requires convex internal boundary information. As noted before, this information is difficult to derive using second order derivatives. We will see in the next chapter how surface characterization techniques can be used to approximate the presence of smooth internal boundaries of an object.

2. Use a priori knowledge about the scene. But this is against our approach towards a general
and robust segmentation algorithm.

3. Feedback from higher stages of object recognition to eliminate the need for a priori knowledge and to relax the strong assumptions made at the low-level segmentation stage. Since
higher levels of object recognition are global processes and may have knowledge about
the domain, a closed loop segmentation procedure is bound to perform better than
one having no feedback.

Thus, to achieve a reliable segmentation of the initial scene, we will assume that the topmost object is delineated by jump boundaries. This may not always be true, as two objects can join at convex or concave edges, or one object may merge into the next one due to negligible thickness of the object at the point of contact. This means that local information cannot give perfect segmentation in all cases. In such cases we need higher-level processing to figure out the right segmentation of the scene. The segmentation based on external boundary information will give only an initial estimate of segmentation. This estimate is reliable to the point that it can distinguish between objects of predetermined depth.
In the context of invariant object recognition it is important to note that step boundaries may vary with the orientation of the object. Thus they are used only to segment the object and not to recognize it. We will discuss the results of recognition and classification of the segmented object using the superquadric technique developed by Franc [23].
The block diagram of the segmentation and classification process is shown in figure 9.
A practical problem with using zero-crossings as step boundaries is that they do not form closed contours. The boundaries delineating the object may not completely enclose the object, causing the region-growing process to overflow the object and include the neighboring object as part of it. This drawback renders the final segmentation result very unreliable. Yang and Kak [33] use a priori knowledge about the width of the
[Figure 9: Flow chart of the segmentation and recognition process. The smoothed range image is segmented into objects by region growing; supporting-surface information for the segmented objects is passed to the surface classification procedure; a list of points is generated and points are added horizontally or vertically; a superquadric model is fitted and selected; and the model is classified.]
object and contour tracking to extract the closed contour surrounding the object. Their method does not guarantee success in all cases. David Heeger [26] has proposed a computational approach to gap filling. It is computationally expensive and not suited for our purpose, since we want to avoid explicit contour tracing in the entire image and want the region-growing method to take care of it. Peter Allen [29] has used a gap-filling method based on contour growing proposed by Nevatia and Babu [30]. They perform gap filling on the entire scene using the predecessor-successor graph of all connected contours; see Peter Allen's Ph.D. thesis for details. Contours are then merged based on the requirement of merging N-pixel gaps. This approach is again computationally expensive. Peter Allen observes that filling of at most two-pixel gaps is acceptable because of the ambiguities resulting from three or more pixel gap filling. We have implemented two-pixel gap filling by constraining the region-growing process near the boundaries, thus avoiding the explicit gap-filling stage.
One- and two-pixel gap filling is accomplished by simply requiring that a pixel having a boundary pixel in its 8-neighborhood not be grown recursively. Instead, the pixel and the boundary pixel are marked as grown. Figure 10 illustrates gap filling in one instance.
Thus we are able to avoid explicit contour tracing to fill gaps. Three or more pixel gaps cannot be adequately handled by gap fillers. Some sort of post-processing is necessary to further segment the segmented object in this case. One way is to trace all the boundary pixels of the segmented object and use concavity information to segment the object into parts. This approach is being implemented and will not be discussed here.
The algorithm for segmentation is given below:

1. Read in the original range image I(x,y) and the ∇²G * I(x,y) image.

label_val = 0

2. Segment the objects from the background by thresholding at the background depth (supplied by the user). In case the background is not of uniform depth, a plane can be fitted to represent the background and the objects thresholded from the scene.

3. Locate the 3×3 window with maximum height by averaging the pixel values in the 3×3 window at every pixel in the range image. This gives the seed region for region growing. Clearly the window lies on the topmost object. If there is more than one heap, still only one seed region is obtained.

[Figure 10: An example of gap filling — the pixel being grown and the nearby boundary pixel are both marked as grown.]

4. label_val = label_val + 1

5. Grow the seed region recursively in all 8 directions. For the gap-filling procedure to work it is necessary to grow pixels in the 8-neighborhood. Let p be the pixel being grown. A pixel q in the 8-connected neighborhood of p is not grown under one of the following conditions:

(a) depth(q) <= background_threshold

(b) q is already labeled.

(c) Any pixel r in the 8-connected neighborhood of q satisfies:

|laplacian(r)| >= edge_threshold

Then q is one pixel away from an edge pixel and likely to be in a gap. Make:
label(q) = label(p) and
label(r) = label(q).

6. If the number of pixels in the extracted region < acceptable size, then the region is invalid; else it is valid.

7. If the region is valid, then determine the supporting points of the region.

8. The region extracted in the first pass is the topmost region. Subsequent regions are grown from top to bottom, left to right. If any more pixels are left to be processed, then pick up any unprocessed pixel and go back to step 5 to grow a region.

9. Output the topmost region, all valid regions, and the supporting points in separate image files.
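A compact sketch of steps 4-5 follows. It is our reconstruction, not the thesis program; the globals and names are assumptions, and the thesis's recursive growth (a work-list would avoid deep recursion) is kept for fidelity. The body of grow() implements conditions (a)-(c): background and already-labeled pixels stop growth, and a pixel adjacent to an edge pixel is labeled but not grown further, which closes one- and two-pixel gaps.

#include <math.h>

static int W, H;                 /* image dimensions                */
static const float *depth, *lap; /* range image and LoG output      */
static int *label;               /* 0 = unlabeled                   */
static float bg_thr, edge_thr;   /* background and edge thresholds  */

static int near_edge(int x, int y)  /* any 8-neighbor on an edge?   */
{
    for (int dy = -1; dy <= 1; dy++)
        for (int dx = -1; dx <= 1; dx++) {
            int xx = x + dx, yy = y + dy;
            if (xx >= 0 && xx < W && yy >= 0 && yy < H &&
                fabsf(lap[yy * W + xx]) >= edge_thr)
                return 1;
        }
    return 0;
}

static void grow(int x, int y, int val)
{
    if (x < 0 || x >= W || y < 0 || y >= H) return;
    int i = y * W + x;
    if (depth[i] <= bg_thr) return;   /* condition (a): background      */
    if (label[i]) return;             /* condition (b): already labeled */
    label[i] = val;
    if (near_edge(x, y)) return;      /* condition (c): in a gap —      */
                                      /* label it, but stop growing     */
    for (int dy = -1; dy <= 1; dy++)  /* recurse into all 8 neighbors   */
        for (int dx = -1; dx <= 1; dx++)
            if (dx || dy) grow(x + dx, y + dy, val);
}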

3.3 Segmentation Results


Figure 11: Segmentation: (a) original image from RCA. (b) Edges detected. (c) Topmost object. (d) All segmented objects.

Figure 12: Segmentation: (a) original image from RCA. (b) Edges detected. (c) Topmost object. (d) All segmented objects.

Figure 13: Segmentation: (a) original image from Grasp lab scanner. (b) Edges detected. (c) Topmost object. (d) All segmented objects.

Figure 14: Segmentation: (a) original image from RCA. (b) Edges detected at σ = 1. (c) Segmented parts of one object at σ = 1. (d) Segmentation at σ = 2, for the same threshold value.

Figure 15: Segmentation at different scales: (a) image smoothed at σ = 2.0. (b) Objects segmented. (c) Image smoothed at σ = 3.0. (d) Objects segmented.
The programs are written in C and implemented on a VAX-785 running UNIX. Ikonas graphics and the PM format for images are used in all programs. Images are acquired from two sources. Most of the images used as examples are from the RCA range image database, and the remaining are scanned by the Grasp lab's range scanner. All the images are digitized in Z-depth format. RCA images have better resolution (12 bits/pixel) than Grasp lab images (8 bits/pixel); hence more detail is seen in the former. This makes a difference in the detection of thin objects. Due to the different Z resolution of the two scanners, we have used different threshold values for the two sets of images. All the images are smoothed uniformly with a Gaussian of σ = 1 (window size = 5 × 5). Zero-crossings of the LOG operator are thresholded to remove responses due to minor surface perturbations. The threshold at a given σ value also limits the thickness of objects that can be segmented. Threshold values are determined empirically, since a histogram of zero-crossings cannot be used to determine the threshold automatically as is done in intensity images. However, the threshold value remains the same for all the images acquired from the same source. This is true for all the empirically determined parameters reported in this thesis. The background value is also known to the program and is constant for a given scanner. Results of processing the images are shown in figures 11, 12, 13, and 14. In figure 11 all
the objects are segmented correctly. The topmost object is a cylindrical object. Figure 12 shows merging of objects because of very weak step boundary information. Figure 13 shows results of segmentation on the image obtained from the Grasp lab scanner. A constant offset of 100 is added to the original image depth values, and zero-crossings are enhanced for display purposes. Figure 14 exhibits different results at two scales for the same edge threshold. The scene has a single object, a box with string tied around it so that the box is divided into four partitions. Because of the high depth resolution of the image, edge information due to the string is enough to segment the box into three parts at σ = 1 (figure 14(c)). Increasing the σ value to 2 removes the details of the string, and the whole visible surface is recovered (figure 14(d)).
To study the effect of increasing σ on the zero-crossings, one of the multiple-object images is processed for σ = 1, 2, 3. Note that objects start to merge as σ increases, with thin objects undetectable at σ = 2 (figures 11, 15).
3.4 Recognition of segmented objects using Superquadrics

The surfaces extracted by the previous algorithm can be classified as one of the eight basic surface types. We will discuss this classification approach in detail in the next chapter. In this section we describe a high-level recognition and classification method that classifies the segmented object into four broad categories.
We have used the superquadric model recovery method implemented by Franc [23] to recognize the segmented object in a range image. Details of the procedure for superquadric fitting are discussed in Franc's Ph.D. thesis. Superquadrics are a family of parametric shapes that can be used as primitives for shape representation in computer vision [31]. Superquadrics are like lumps of clay that can be deformed and glued together into realistic looking models. However, we will consider only non-deformed superquadric models for classifying the object into one of the following categories:

1. flat: an object with negligible height compared to its length and width;

2. roll: a cylindrical object;

3. box: an object with comparable height, length, and width;

4. irregular: any object not falling into any of the above three categories.

The superquadric implicit equation is given by:

( (x/a1)^(2/ε2) + (y/a2)^(2/ε2) )^(ε2/ε1) + (z/a3)^(2/ε1) = 1

Parameters a1, a2, and a3 define the superquadric size in the x, y, and z directions respectively. ε1 is the squareness parameter in the latitude plane and ε2 is the squareness parameter in the longitude plane. Based on these parameter values superquadrics can model a large set of standard building blocks, like spheres, cylinders, parallelepipeds, and shapes in between. Figure 16 illustrates the various types of shapes obtainable by changing the two shape parameters. If both ε1 and ε2 are 1, the surface defines an ellipsoid. Cylindrical shapes are obtained for ε1 < 1 and ε2 = 1. Parallelepipeds are obtained when both ε1 and ε2 are < 1.
[Figure 16: Superquadric models as a function of the shape parameters (ε1, ε2) for given size parameters (a1, a2, a3).]

We have restricted the model recovery procedure to fit models with 0 < ε1, ε2 < 1. We will not discuss the details of model recovery here.
The criteria used for classification are the three size parameters, the two shape parameters, and the goodness-of-fit (GOF) measure. The superquadric procedure returns a GOF measure using the following equation:

GOF = (1/N) Σ_{i=1}^{N} [ a1 a2 a3 ( F(x_i, y_i, z_i; a1, a2, a3, ε1, ε2, φ, θ, ψ, p_x, p_y, p_z) − 1 ) ]²

where F is the superquadric inside-outside function described in Franc [23]; φ, θ, ψ define the orientation and p_x, p_y, p_z define the position of the superquadric in space.
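A minimal sketch of this measure follows (our illustration, not Franc's code). It assumes the N data points have already been rotated and translated into the superquadric-centered frame, so the pose parameters φ, θ, ψ, p_x, p_y, p_z are folded into the points rather than into F.

#include <math.h>

/* Superquadric inside-outside function: F = 1 on the surface,
 * F < 1 inside, F > 1 outside. */
double sq_F(double x, double y, double z,
            double a1, double a2, double a3, double e1, double e2)
{
    return pow(pow(fabs(x / a1), 2.0 / e2) +
               pow(fabs(y / a2), 2.0 / e2), e2 / e1) +
           pow(fabs(z / a3), 2.0 / e1);
}

/* Mean squared weighted residual over n points p[i] = {x, y, z}. */
double sq_gof(const double (*p)[3], int n,
              double a1, double a2, double a3, double e1, double e2)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) {
        double d = a1 * a2 * a3 *
            (sq_F(p[i][0], p[i][1], p[i][2], a1, a2, a3, e1, e2) - 1.0);
        s += d * d;
    }
    return s / n;
}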
The object given by the segmentation procedure has only the points visible to the scanner. Much of the volumetric information is lost in the Z-depth format of representation. While this is not a serious problem in the case of curved objects like cylinders, or of segmented surfaces having volumetric information (like a tilted box viewed from above), model fitting becomes ambiguous if the visible surface is flat. If it is known that the original scene had only one object, then the supporting surface can be assumed to be the plane parallel to the known background. The problem becomes more complicated in multiple-object scenes, where it becomes impossible to assign the correct depth to the segmented object. Given no prior knowledge about the surface type, we need to add points in every case to give volumetric information to the superquadric procedure. Points can be added in two ways:

1. The background is assumed to be the supporting surface of the object. Points are added on the background by backprojecting the visible surface onto the background (figure 17(c)). While this is desirable in the case of flat surfaces, it is not right for surfaces with volumetric information.

2. The supporting points of the segmented object are used to determine the immediate supporting surface(s) of the object. Points are added vertically to the object (figure 17(e)). This technique is more flexible, since it can handle objects not lying on the background. But it results in more points being added, in addition to assuming that the object is actually touching the neighboring objects, which may not be true in general.

[Figure 17: Horizontal and vertical addition of points. (a) Object. (b) Original points. (c) Horizontal addition of points. (d) Fit with horizontal addition. (e) Vertical addition of points. (f) Fit with vertical addition.]

In general it is not possible to extract correct supporting-surface information from a single viewpoint. We have used horizontal addition of points in our experiments, as it is faster than vertical addition and recovers the desired model.
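The horizontal addition used here (and in step 2 of the algorithm below) amounts to backprojecting every visible point onto the background plane. A sketch, with assumed names and a known background depth zbg:

/* For every visible point, add a point at the same (x, y) on the
 * background plane z = zbg.  out must hold 2*n points; returns the
 * new point count (points.orig plus the backprojected copies). */
int add_points_horizontal(const double (*pts)[3], int n,
                          double zbg, double (*out)[3])
{
    for (int i = 0; i < n; i++) {
        out[2 * i][0] = pts[i][0];
        out[2 * i][1] = pts[i][1];
        out[2 * i][2] = pts[i][2];
        out[2 * i + 1][0] = pts[i][0];   /* same (x, y) ...           */
        out[2 * i + 1][1] = pts[i][1];
        out[2 * i + 1][2] = zbg;         /* ... dropped to background */
    }
    return 2 * n;
}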
The algorithm for model fitting, selection, and classification is as follows:

1. Read the segmented object in Z-depth format.

2. Format conversion and point addition: Generate a list of points in 3-D space representing the object. Call it points.orig. For every point on the visible surface add a point at the same (x,y) coordinates on the background. Output the list of original and added points in points.add.

3. Superquadric fitting: Run the superquadric model fitting procedure on points.orig; the model obtained is model.orig. Run the superquadric model fitting procedure on points.add; the model obtained is model.add. Iterative superquadric fitting is stopped if one of the following conditions is met:

(a) The number of iterations > 15.

(b) The goodness of fit of the ith iteration (i ≤ 15) is ≤ an acceptable measure. This measure is empirically determined.

(c) For the jth (j ≥ 5) iteration, the GOF has not changed appreciably over the last five iterations.

Condition (a) assumes that model recovery is complete by the 15th iteration. Condition (b) stops the procedure if an acceptable model is obtained early in the process. Condition (c) monitors the rate of convergence of the fitting procedure: it terminates the fitting procedure if the GOF measures of the last five iterations do not vary much. All the values used in the above three conditions are empirically determined.

4. Model selection:

IF (GOF(model.add) ≤ Acceptable_fit AND GOF(model.orig) ≤ Acceptable_fit)
THEN GOTO volume_criterion
ELSE IF (GOF(model.add) ≤ Acceptable_fit)
THEN model = model.add; GOTO classify
ELSE IF (GOF(model.orig) ≤ Acceptable_fit)
THEN model = model.orig; GOTO classify
ELSE OBJECT = irregular; GOTO Done

5. volume_criterion: Volume can be approximated as a1 × a2 × a3.

IF volume(model.orig) ≥ volume(model.add)
THEN model = model.orig
ELSE model = model.add

6. classify: Classify the model using a1, a2, a3 and ε1, ε2:

(a) IF ((a3 ≪ a1) AND (a3 ≪ a2) AND (ε1 < 0.5) AND (ε2 < 0.5))
THEN OBJECT = FLAT.

(b) ELSE IF ((a1 ≪ a3) AND (a2 ≪ a3) AND (ε1 < 0.5) AND (ε2 < 0.5))
THEN OBJECT = FLAT.

(c) ELSE IF ((a1 > THRESH_BOX) AND (a2 > THRESH_BOX) AND (a3 > THRESH_BOX) AND (ε1 < 0.5) AND (ε2 < 0.5))
THEN OBJECT = BOX.

(d) ELSE IF ((a1 > THRESH_1_ROLL) AND (a2 > THRESH_1_ROLL) AND (a3 > THRESH_2_ROLL) AND (ε1 < 0.5) AND (ε2 > 0.5))
THEN OBJECT = ROLL.

(e) ELSE OBJECT = IRREGULAR.

THRESH_BOX is the minimum acceptable dimension of the box.

THRESH_1_ROLL is the minimum acceptable width and height of the roll.

THRESH_2_ROLL is the minimum acceptable length of the roll.
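Rules (a)-(e) translate directly into code. In the sketch below, the "much less than" tests of rules (a) and (b) are implemented with an assumed ratio FLAT_RATIO, and the numeric THRESH_* values are placeholders, not the thesis's empirically determined constants.

enum shape { FLAT, BOX, ROLL, IRREGULAR };

#define FLAT_RATIO     0.2   /* assumed: "negligible" = under 20% of the others */
#define THRESH_BOX    10.0   /* placeholder values; the thesis determines       */
#define THRESH_1_ROLL  5.0   /* these empirically                               */
#define THRESH_2_ROLL 20.0

enum shape classify(double a1, double a2, double a3, double e1, double e2)
{
    int blocky = (e1 < 0.5) && (e2 < 0.5);
    if (blocky && a3 < FLAT_RATIO * a1 && a3 < FLAT_RATIO * a2)
        return FLAT;                                    /* rule (a) */
    if (blocky && a1 < FLAT_RATIO * a3 && a2 < FLAT_RATIO * a3)
        return FLAT;                                    /* rule (b) */
    if (blocky && a1 > THRESH_BOX && a2 > THRESH_BOX && a3 > THRESH_BOX)
        return BOX;                                     /* rule (c) */
    if (e1 < 0.5 && e2 > 0.5 && a1 > THRESH_1_ROLL &&
        a2 > THRESH_1_ROLL && a3 > THRESH_2_ROLL)
        return ROLL;                                    /* rule (d) */
    return IRREGULAR;                                   /* rule (e) */
}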


OBJECT      Model   a1      a2      a3      e1     e2     Error

Cylinder    Orig    16.30    8.65   75.70   0.24   0.95     56.21
            Add     16.07   35.29   72.16   0.10   0.14    393.28

Box         Orig     3.47   33.6    45.29   0.10   0.81   1274.40
            Add     38.11   51.14   39.47   0.10   0.10    441.16

Flat        Orig     7.46   46.76   57.72   0.10   0.52    319.32
            Add      8.86   45.75   57.52   0.10   0.17    245.34

Other       Orig     4.928  50.41   76.79   0.10   0.43   4083.61
            Add      9.38   53.00   82.44   0.10   0.10   4178.83

Figure 18: Parameter values of the recovered models

7. Done: Output the classified model with its parameters. Determine the orientation and position of the model in the world coordinate system.

3.5 Results of Superquadric Fitting and Classification

The superquadric fitting procedure and classifier were run on the objects segmented previously. The superquadric parameters for the four types of recovered objects are shown in figure 18.
Figure 19 shows model recovery on the topmost object segmented in figure 14. The model selection process rejected model.orig due to a large fit error and accepted model.add. Even if model.orig had an acceptable error measure, model.add would have been selected due

Figure 19: Superquadric fitting and model selection (a box): (a) original points. (b) Fitted model on original points. (c) Original and added points. (d) Fitted model on original and added points.

to the larger volume. The acceptable error magnitude was empirically determined to be 500. In figure 20, model.orig is selected and classified as roll because of the tremendous error difference between the two acceptable models. model.add is accepted and classified as flat in figure 22 because of the volume consideration, although both models have acceptable error measures. Finally, the film mailer in figure 23 is classified as irregular, as the fit errors of both models are more than the acceptable error measure.


The results are shown for the four classes of objects. Tapered, bent, or concave objects cannot be represented by these models and hence will be classified as irregular. Franc Solina's superquadric method also allows for tapering and bending, along with segmentation

Figure 20: Superquadric fitting and model selection (a cylindrical object): (a) original points. (b) Fitted model on original points. (c) Original and added points. (d) Fitted model on original and added points.
Figure 21: Segmentation: Upper left: original image from Grasp lab scanner (a letter). Upper right: edges detected. Lower left: topmost object. Lower right: all segmented objects.

Figure 22: Superquadric fitting and model selection (a letter): (a) original points. (b) Fitted model on original points. (c) Original and added points. (d) Fitted model on original and added points.

Figure 23: Superquadric fitting and model selection (a film mailer): (a) original points. (b) Fitted model on original points. (c) Original and added points. (d) Fitted model on original and added points.
of complex objects into parts. In the next chapter we will describe a surface classification scheme that uses the output of the segmentation routines described in this chapter.
Chapter 4

Surface Characterization and


Segmentation

Surface characterization refers to the computational process of partitioning surfaces into regions with equal characteristics. Since our ultimate goal is object recognition, classification of the surfaces by the characteristics of the surface functions is very useful. Classical differential geometry provides a complete surface description of analytic surfaces, so as to obtain a complete set of surface characteristics. Surface characterization can be successfully used in the intermediate and high-level processing of the object recognition problem.
Important surface characteristics that are visible-invariant are the Gaussian curvature and the mean curvature. They are invariant to changes in surface parametrization and to translations and rotations of object surfaces. Gaussian curvature is an intrinsic property of the surface, while mean curvature is an extrinsic property of the surface.
From differential geometry it is well known that curvature, speed, and torsion uniquely determine the shape of 3-D curves. The surface characteristics of interest to us are the ones with a one-to-one relationship with curve shapes. The mathematics of a general surface representation scheme and the calculation of Gaussian and mean curvatures is described in the following section.
4.1 Differential Geometry of Surfaces

The parametric form of the equation for a regular surface S with respect to a known coordinate system is:

S = { (x, y, z) : x = x1(u, v), y = x2(u, v), z = x3(u, v), (u, v) ∈ D ⊆ R² }

The surface is a locus of points in Euclidean three-space defined by the end points of the vector X(u, v), with xi(u, v) the components of the vector. These real functions are assumed to be defined over an open connected domain of a Cartesian u,v plane and to have continuous second partial derivatives there. In our analysis of range images we are assuming that this condition is satisfied.
The second condition for a regular surface is automatically satisfied by Z-depth format images. It requires that the coordinate vectors

X_u = ∂X/∂u and X_v = ∂X/∂v

be linearly independent.
The surface in range images is given by:

X(u, v) = (u, v, z(u, v))

and the coordinate vectors become:

X_u = (1, 0, z_u),  X_v = (0, 1, z_v)

These vectors are linearly independent given the first condition.


It can be shown using differential geometry techniques that the first and second fundamental
forms (which exist only if the surface is analytic) uniquely characterize a general smooth
surface. The first fundamental form I of a surface is defined as:

I(u,v,du,dv) = dX \cdot dX = \begin{bmatrix} du & dv \end{bmatrix} \begin{bmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{bmatrix} \begin{bmatrix} du \\ dv \end{bmatrix}

where the [g] matrix elements are given by:

g_{11} = E = X_u \cdot X_u, \qquad g_{12} = g_{21} = F = X_u \cdot X_v, \qquad g_{22} = G = X_v \cdot X_v

The two tangent vectors X_u and X_v lie in the tangent plane T(u,v) of the surface at the
point (u,v). The [g] matrix is symmetric for an analytic surface. Figure 24 shows the
coordinate frame at the neighborhood of a point.

Figure 24: Coordinate frame at the neighborhood of a point
The first fundamental form I(u,v,du,dv) measures the small amount of movement in the
parameter space (du,dv). The first fundamental form is invariant to surface parametrization
changes and to translations and rotations of the surface. Therefore it depends on the
surface itself and not on how it is embedded in 3-D space. The metric functions E, F, G
determine all the intrinsic properties of the surface. In addition they define the area of a
surface:

A = \iint_D \sqrt{EG - F^2}\; du\, dv
The second fundamental form of the surface is given by:

II(u,v,du,dv) = -\,dX \cdot dn = \begin{bmatrix} du & dv \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \begin{bmatrix} du \\ dv \end{bmatrix}

where the [b] matrix elements are defined as:

b_{11} = n \cdot X_{uu}, \qquad b_{12} = b_{21} = n \cdot X_{uv}, \qquad b_{22} = n \cdot X_{vv}

where n is the unit surface normal and the double subscript denotes second partial
derivatives:

X_{uu}(u,v) = \frac{\partial^2 X}{\partial u^2}, \qquad X_{vv}(u,v) = \frac{\partial^2 X}{\partial v^2}, \qquad X_{uv}(u,v) = X_{vu}(u,v) = \frac{\partial^2 X}{\partial u\, \partial v}

The second fundamental form measures the correlation between the change in the normal
vector dn and the change in the surface position at a point (u,v), as a function of a small
movement (du,dv) in the parametric space. Besl and Jain [9] have discussed the properties
of the first and second fundamental forms in detail. We will consider some of the important
properties of Gaussian and mean curvature in the following paragraphs.

It can be shown that the [g] matrix and [b] matrix elements are continuous functions
with continuous second and first partial derivatives respectively, and that they uniquely
determine the surface type. From the [g] and [b] matrices calculated above, surface shape
and intrinsic surface geometry can be uniquely determined.
The Gaussian curvature function K of a surface can be defined in terms of the two
matrices as:

K = \frac{\det[b]}{\det[g]} = \frac{b_{11} b_{22} - b_{12}^2}{g_{11} g_{22} - g_{12}^2}
Figure 25: Basic surface types in range images: (a) surface types; (b) table of surface types.

and the mean curvature of a surface is defined as:

H = \frac{g_{11} b_{22} - 2 g_{12} b_{12} + g_{22} b_{11}}{2\,(g_{11} g_{22} - g_{12}^2)}
The two types of curvature are together referred to as the surface curvature functions.
They exhibit very important properties that enable them to be used as features for higher
levels of processing. For a detailed discussion of the properties of surface curvature functions
see Besl and Jain [9]. Some of the relevant properties are summarized below:

1. Surface types can be determined by the sign of the surface curvatures. They are shown
in figure 25 (a code sketch of this sign test follows the list).

2. Gaussian curvature exhibits isometric invariance properties.

3. Mean curvature is slightly less sensitive to noise than Gaussian curvature.

4. Gaussian curvature function of a convex surface uniquely determines the surface.

5. The mean curvature function of a graph surface, taken together with the boundary curve
of the graph surface, uniquely determines the graph surface from which it was computed.
6. Gaussian and mean curvature are invariant to arbitrary transformations of the (u, v)
parameters of a surface as long as the Jacobian of the transformation is always non-
zero.

7. Gaussian and mean curvatures are invariant to rotations and translations of a surface.
This property enables us to obtain view-independent characteristics.

8. Gaussian curvature is an isometric invariant of a surface. Therefore it is an intrinsic
surface quantity, independent of the way the surface is embedded in 3-D space.

9. Gaussian and mean curvature are local surface properties.

10. Another important property of surface curvatures is that Gaussian curvature indicates
the surface shape at individual surface points. When the surface is shaped like an ellipsoid
in the neighborhood of (u,v), K(u,v) > 0. It is < 0 for a locally saddle-shaped surface
and = 0 if the surface is locally flat, ridge-shaped, or valley-shaped. Mean curvature
also indicates surface shape at individual points when considered together with the
Gaussian curvature.
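As an illustration of property 1, the eight-way sign test can be written directly in code.
The following C routine is a minimal sketch (the names are ours, and eps plays the role of
the zero-threshold discussed later in this chapter); the convention follows figure 25, where
convex shapes have negative mean curvature:

typedef enum {
    PEAK, PIT, RIDGE, VALLEY, FLAT, MINIMAL, SADDLE_RIDGE, SADDLE_VALLEY
} SurfaceType;

/* Map thresholded signs of Gaussian (K) and mean (H) curvature to one
   of the eight basic surface types of figure 25.                      */
SurfaceType surface_type(double K, double H, double eps)
{
    int sk = (K > eps) ? 1 : (K < -eps) ? -1 : 0;
    int sh = (H > eps) ? 1 : (H < -eps) ? -1 : 0;

    if (sk > 0)                       /* sh == 0 cannot occur here: H*H >= K */
        return (sh < 0) ? PEAK : PIT;
    if (sk == 0)
        return (sh < 0) ? RIDGE : (sh > 0) ? VALLEY : FLAT;
    /* sk < 0 */
    return (sh < 0) ? SADDLE_RIDGE : (sh > 0) ? SADDLE_VALLEY : MINIMAL;
}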

The above observations are very important for surface classification and have been widely
studied and used in range image processing. In fact, surface characteristics constitute an
important part of realizing the ultimate goal of three-dimensional object recognition.

4.2 Computing Surface Characteristics of Range Images

Given a range image, our objective is to calculate the Gaussian and mean curvature. To
compute surface curvature we need estimates of the first and second partial derivatives
of the depth map. The equations for the partial derivatives simplify in the case of the
z-depth format range image, where the parameterization takes a very simple form:

X(u,v) = [\, u \;\; v \;\; f(u,v) \,]^T

(the T superscript indicates the transpose). This gives the following formulas for the
surface partial derivatives and the surface normal:

X_u = [\, 1 \;\; 0 \;\; f_u \,]^T, \qquad X_v = [\, 0 \;\; 1 \;\; f_v \,]^T, \qquad n = \frac{(-f_u,\, -f_v,\, 1)}{\sqrt{1 + f_u^2 + f_v^2}}
and the six fundamental form coefficients:

g_{11} = 1 + f_u^2, \qquad g_{22} = 1 + f_v^2, \qquad g_{12} = f_u f_v

b_{11} = \frac{f_{uu}}{\sqrt{1 + f_u^2 + f_v^2}}, \qquad b_{12} = \frac{f_{uv}}{\sqrt{1 + f_u^2 + f_v^2}}, \qquad b_{22} = \frac{f_{vv}}{\sqrt{1 + f_u^2 + f_v^2}}
The expression for Gaussian curvature is given by:

K = \frac{f_{uu} f_{vv} - f_{uv}^2}{(1 + f_u^2 + f_v^2)^2}

and the expression for mean curvature is given by:

H = \frac{f_{uu} + f_{vv} + f_{uu} f_v^2 + f_{vv} f_u^2 - 2 f_u f_v f_{uv}}{2\,(1 + f_u^2 + f_v^2)^{3/2}}
Thus if we are given a depth map function f(u,v) that possesses first and second partial
derivatives, the Gaussian and mean curvature can be computed directly.
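The two closed-form expressions translate directly into code. The following C helper is a
minimal sketch (the function name is ours) that evaluates K and H from the five partial
derivative estimates at a pixel:

#include <math.h>

/* Gaussian (K) and mean (H) curvature of a graph surface z = f(u,v),
   computed from the partial derivative estimates fu, fv, fuu, fuv, fvv.
   A direct transcription of the two formulas above.                    */
void graph_curvatures(double fu, double fv,
                      double fuu, double fuv, double fvv,
                      double *K, double *H)
{
    double g = 1.0 + fu*fu + fv*fv;      /* det[g] = 1 + fu^2 + fv^2 */

    *K = (fuu*fvv - fuv*fuv) / (g*g);
    *H = (fuu + fvv + fuu*fv*fv + fvv*fu*fu - 2.0*fu*fv*fuv)
         / (2.0 * pow(g, 1.5));
}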

4.2.1 Estimation of partial derivatives of Depth Maps

Partial derivatives of the range image can be obtained by fitting a continuous differentiable
function that best fits the data. Various mathematical techniques have been used by
computer vision researchers to determine the partial derivatives of depth maps.
Using Discrete Orthogonal Polynomials

Besl and Jain [9] used discrete quadratic orthogonal polynomial fitting at each pixel to
estimate derivatives. It is possible to control the neighborhood size for making local
estimates, which is important in the case of actual range images.

A quadratic surface is fit at each pixel in the image, using a window convolution operator
of the size desired by the user. Each point in the given window is associated with a
position (u,v) from the set U × U, where N is odd:

U = \{ -M, \ldots, -1, 0, 1, \ldots, M \}

The following discrete orthogonal polynomials provide the quadratic surface fit:

b_0(u) = \frac{1}{\sqrt{N}}, \qquad b_1(u) = \frac{u}{\sqrt{M(M+1)(2M+1)/3}}, \qquad b_2(u) = \frac{u^2 - M(M+1)/3}{\sqrt{P(M)}}

where M = (N - 1)/2. The b_i(u) functions are normalized orthogonal polynomials, where
P(M) is a fifth-order polynomial in M:

P(M) = \frac{M(M+1)(2M+1)(2M-1)(2M+3)}{45}
The b_i(u) vectors are computed according to the window size. First the surface estimate
function \hat{f}(u,v) is calculated:

\hat{f}(u,v) = \sum_{i+j \le 2} a_{ij}\, b_i(u)\, b_j(v)

that minimizes the mean square term:

\epsilon^2 = \sum_{(u,v) \in U \times U} \left[ \hat{f}(u,v) - f(u,v) \right]^2

The coefficients are given by:

a_{ij} = \sum_{(u,v) \in U \times U} f(u,v)\, b_i(u)\, b_j(v)

The first and second partial derivatives can then be read directly from the a_{ij}
coefficients:

f_u = a_{10}, \qquad f_v = a_{01}, \qquad f_{uv} = a_{11}, \qquad f_{uu} = 2 a_{20}, \qquad f_{vv} = 2 a_{02}

After the first and second partial derivatives are determined, the surface characteristics at
each pixel are calculated.
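As an illustration, the normalized basis vectors can be generated numerically for any
window size. The C sketch below is our own, assuming the standard discrete orthogonal
polynomials quoted above:

#include <math.h>

/* Build the normalized orthogonal basis b0, b1, b2 over u = -M, ..., M:
   phi0 = 1, phi1 = u, phi2 = u^2 - M(M+1)/3, each divided by its norm.
   Each output array must hold N = 2M+1 entries.                        */
void make_basis(int M, double *b0, double *b1, double *b2)
{
    int u, N = 2*M + 1;
    double c = M*(M + 1)/3.0;
    double n0 = 0.0, n1 = 0.0, n2 = 0.0;

    for (u = -M; u <= M; u++) {
        b0[u + M] = 1.0;
        b1[u + M] = u;
        b2[u + M] = u*u - c;
        n0 += 1.0;
        n1 += (double)u * u;
        n2 += b2[u + M] * b2[u + M];   /* equals P(M) above */
    }
    for (u = 0; u < N; u++) {
        b0[u] /= sqrt(n0);
        b1[u] /= sqrt(n1);
        b2[u] /= sqrt(n2);
    }
}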

Using Difference Operators

Brady et al. [4] have used 3 × 3 difference operators to locally compute the first and second
derivatives of the Gaussian-smoothed surface. The neighborhood size cannot be increased
in this method. The operators are:
Using B-Spline Fitting

Yang and Kak [33] have derived 3 × 3 operators using B-splines for computing the partial
derivatives of a range map. These can be combined with a Gaussian operator to increase
the window size and reduce sensitivity to noise. The operators give the partial derivatives
at the center pixel of each operator:

X_{uu}: \frac{1}{6}\begin{bmatrix} 1 & -2 & 1 \\ 4 & -8 & 4 \\ 1 & -2 & 1 \end{bmatrix} \qquad X_{vv}: \frac{1}{6}\begin{bmatrix} 1 & 4 & 1 \\ -2 & -8 & -2 \\ 1 & 4 & 1 \end{bmatrix} \qquad X_{uv}: \frac{1}{4}\begin{bmatrix} 1 & 0 & -1 \\ 0 & 0 & 0 \\ -1 & 0 & 1 \end{bmatrix}
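With any of these 3 × 3 kernels, the derivative estimate at an interior pixel is a single
window correlation. The C sketch below is illustrative only; the kernel shown is the
cross-derivative mask as written above:

/* Cross-derivative mask (1/4 scaling folded into the entries). */
static const double XUV[3][3] = {
    {  0.25, 0.0, -0.25 },
    {  0.00, 0.0,  0.00 },
    { -0.25, 0.0,  0.25 }
};

/* 3x3 correlation of 'op' with the image at interior pixel (i,j). */
double apply3x3(float **img, int i, int j, const double op[3][3])
{
    double s = 0.0;
    int di, dj;

    for (di = -1; di <= 1; di++)
        for (dj = -1; dj <= 1; dj++)
            s += op[di + 1][dj + 1] * img[i + di][j + dj];
    return s;
}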

Least Squares Polynomial Fitting

We have used a fast least squares fitting method to derive partial derivatives in the symmetric
neighborhood of a pixel. This method allows the neighborhood size to be controlled.
A surface fit of order n can be written as:

f(u,v) = \sum_{p+q \le n} a_{pq}\, u^p v^q

We have used second order (n = 2) fitting in the neighborhood of every pixel to compute
first and second order derivatives. Since the pixel at which the derivatives are computed
is at the origin, we get:

f_u = a_{10}, \qquad f_v = a_{01}, \qquad f_{uv} = a_{11}, \qquad f_{uu} = 2 a_{20}, \qquad f_{vv} = 2 a_{02}

Thus the derivatives are read off directly from the coefficients. We have also used the
general least squares fitting procedure for fitting polynomials on surface patches. For the
purpose of
Figure 26: Effect of uniform Gaussian smoothing (panels: original cross-section; smoothed
cross-section, with the concave and convex responses marked).

computing derivatives, it is observed that we always have a symmetric neighborhood around
the pixel. This fact simplifies the least squares equations. See appendix A for the simplified
least squares fitting equations for second order bivariate approximation in a symmetric
neighborhood.
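The computation is summarized by the C sketch below. It is an independent, minimal
illustration with names of our choosing (the thesis implementation is grad.c, listed in
appendix B): only the even window moments survive in a symmetric neighborhood, so
the normal equations decouple as derived in appendix A.

/* Estimate fu, fv, fuu, fuv, fvv at pixel (r,c) of depth map z by
   least-squares quadratic fitting over a (2M+1)x(2M+1) symmetric window. */
void lsq_derivatives(float **z, int r, int c, int M,
                     double *fu, double *fv,
                     double *fuu, double *fuv, double *fvv)
{
    double S0 = 0, S2 = 0, S4 = 0, S22 = 0;               /* window moments */
    double T00 = 0, T10 = 0, T01 = 0, T11 = 0, T20 = 0, T02 = 0;
    int u, v;

    for (u = -M; u <= M; u++)
        for (v = -M; v <= M; v++) {
            double w = z[r + u][c + v];
            S0 += 1.0;  S2 += u*u;
            S4 += (double)u*u*u*u;  S22 += (double)u*u*v*v;
            T00 += w;       T10 += u*w;     T01 += v*w;
            T11 += u*v*w;   T20 += u*u*w;   T02 += v*v*w;
        }

    /* odd moments vanish, so a10, a01, a11 decouple completely */
    *fu  = T10 / S2;                                      /* a10 */
    *fv  = T01 / S2;                                      /* a01 */
    *fuv = T11 / S22;                                     /* a11 */
    {
        double d = (T20 - T02) / (S4 - S22);              /* a20 - a02 */
        double s = (T20 + T02 - 2.0*S2*T00/S0)
                 / (S4 + S22 - 2.0*S2*S2/S0);             /* a20 + a02 */
        *fuu = s + d;                                     /* 2*a20 */
        *fvv = s - d;                                     /* 2*a02 */
    }
}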

4.2.2 Results of Initial Segmentation

The above mentioned process is applied to actual range images, and the results are shown
in figures 27, 28, 29, 30, 31, 32, and 33.

The smoothing behavior of the Gaussian operator was briefly discussed in chapter 2. It is
observed that step edges in range images actually appear as adjacent convex and concave
edges. This is further amplified after smoothing the image with a Gaussian operator of any
size. Brady et al. [4] have restricted the Gaussian application to the inside of the region.
We have applied the Gaussian uniformly over the range image, with the intention of
uniformly smoothing the image to obtain reliable curvature estimates (see figure 26).
Figure 27: Curvature estimation: (a) original image (192 × 256, 12 bits/pixel). (b)
smoothed image. (c) regions. (d) error in fitting.

Figure 28: Curvature estimation (left to right, from top): original image (150 × 150, 8
bits/pixel); error in fitting; segmented regions; flat regions; convex regions; concave
regions.

Figure 29: Initial labeling of scene with different threshold values: (a) 0.01 (b) 0.02
(c) 0.03 (d) 0.04.

Figure 30: Labeling of scene: (a) all regions (b) convex (c) concave (d) flat.

Figure 31: Thresholded curvatures (left: Gaussian; right: mean). Black indicates zero,
gray is positive, and white is negative value.

Figure 32: Histogram of Gaussian curvature

Figure 33: Histogram of mean curvature


For surface characterization purposes we use a higher sigma value (= 1.5), and for post-
segmentation processing we work at a lower level in scale. Step edges detected at sigma
= 1.0 are used to detect region boundaries in the higher level processing stage discussed in
the next section.

Although the response of the Gaussian and mean curvature is reliable, it is necessary to
threshold the values around zero. 1% of the maximum was used as the threshold in all
examples. Figures 27 and 28 show the labeled regions and the error in fitting a second
order polynomial in the neighborhood of each pixel, for images with 12 bits/pixel and 8
bits/pixel respectively. The fit-error is appreciable at boundaries, including smooth edges.
This means that curvature estimates at the edges are not reliable: at such points the
curvature magnitude may not be reliable, though the sign of the curvatures is. Further
results are shown only for the image in figure 27.
Figure 29 shows the effect of threshold values on curvature signs. As the threshold values
for Gaussian and mean curvature are changed, a pixel's label may change if the curvature
magnitude is not appreciable. The image thresholded at 1% of the maximum curvature
magnitude (see figure 29(c)) has correct labelings. Further results are shown only for this
threshold value. Pixels are classified as one of the eight basic types. We can classify the
entire image into concave, convex, and flat regions by simply merging all neighboring pixels
having a similar type of surface, i.e., flat, concave, or convex (see figure 30). Thresholded
values of Gaussian and mean curvature are shown in figure 31. White patches indicate zero
magnitude, gray indicates positive magnitude, and black indicates negative curvature
magnitude. It is observed that the Gaussian curvature is mostly zero except for isolated
patches, since the image has no spherical object. Non-zero mean curvature values are
obtained at step edges and on a cylindrical object. Histograms of the magnitude of Gaussian
and mean curvature (figures 32 and 33) for the entire image show appreciable mean
curvature magnitude and no significant Gaussian curvature. It can therefore be inferred
that the scene has flat and possibly cylindrical objects.
4.3 Post-processing of Labeled Scenes

The segmentation done by labeling individual pixels using the sign of Gaussian and mean
curvature is local in nature and threshold dependent. In order to interpret these labelings
globally, we need to process the labeled image with global constraints. Besl and Jain [25]
have proposed a variable order surface fitting algorithm, in which surface patches are
described as linear, quadric, or cubic.

Our approach depends on the actual requirements. We describe two methods (both
preliminary) to obtain useful segmentation given the labeled image. The first method simply
groups convex patches to form connected convex subparts of the scene. The second method
uses the segmented objects obtained from the algorithm described in chapter 3.

4.3.1 Obtaining Convex patches

As noted in the third chapter, smooth edges are difficult to extract using only local
information. Curvature information at all types of edges is easy to record. From figure 26
it is clear that edges in smoothed images can be recorded as thin convex and concave regions.
In particular, convex edges are of the convex cylinder type, with zero Gaussian curvature but
appreciable negative mean curvature. Similarly, concave edges are of the concave cylinder
type, with zero Gaussian curvature but appreciable positive mean curvature. Thus all types
of edges give either a convex or a concave cylindrical response. But the edge response is
obtained over a wider region, due to smoothing and the large window size during derivative
computation. It is therefore not possible to have exact localization of the patches obtained
by merging convex regions.
A simple algorithm for obtaining convex patches is given below :

1. Read the labeled image.

2. Label each patch as a region.

3. Initialize the region data structure to record the surface type, number of pixels, topmost
pixel in the region, neighbours of the region, extremities of the region, and the label
assigned to the region.

4. For the next unprocessed topmost region of type flat, peak (convex sphere), or
ridge (convex cylinder) with an acceptable number of pixels do:

(a) Extend the original region to include all neighboring regions of type flat or
ridge. Other types of regions are considered concave or part of another convex
subpart. Peak patches are not included because they will be selected as seed
regions.

(b) Repeat the above step to extend the region, till it is not possible to grow any
more (a code sketch of this growing step follows the algorithm).

5. Output the convex subparts. End.
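A compact sketch of the growing step (4a)-(4b) is given below. It assumes a region table
with type labels and adjacency lists has already been built in steps 1-3; the structure and
names are illustrative, not the merge.c code:

enum { FLAT, RIDGE, PEAK, OTHER };  /* RIDGE = convex cylinder, PEAK = convex sphere */

typedef struct region {
    int type;                       /* one of the labels above            */
    int merged;                     /* already part of a convex subpart?  */
    int nneighbours;
    struct region **neighbour;      /* adjacent regions                   */
} Region;

/* Grow one convex subpart from a seed region by absorbing all connected
   flat and ridge neighbours.  Peak regions are left out so that they can
   serve as seeds for other subparts; remaining types stop the growth.   */
void grow_convex(Region *r)
{
    int i;

    r->merged = 1;
    for (i = 0; i < r->nneighbours; i++) {
        Region *n = r->neighbour[i];
        if (!n->merged && (n->type == FLAT || n->type == RIDGE))
            grow_convex(n);
    }
}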

Figure 34: Convex patches

Figure 35: Convex patches

Figures 34 and 35 show the convex patches obtained from the labeled images of figure 27
and figure 28(a) respectively. The majority of the objects in figure 34 are merged into one
convex patch, while they are separated in figure 35.

4.3.2 Object Surface Classification

Surfaces on the segmented objects can be classified as one of the basic surface types using
the initial labeling based on the sign of the curvatures. Yang and Kak [33] have used
extended Gaussian images to identify the surface type of isolated surfaces. A histogram of
the labels in an isolated object can give some idea about the surface and guide the surface
fitting process.

The classification algorithm is as follows:

1. Read in the segmented objects image and labeled surface image.

2. For each object in the image do :

(a) Erode the object in the labeled image so as to remove points within a 5 pixel distance
of the object boundary. This reduces the effect of smoothing and window size during
curvature estimation, which is mainly contributed by pixels near the boundary and
does not reflect the nature of the region.

(b) Histogram the remaining pixel-label values.

(c) If more than 90% of the pixels are of one type (either flat, cylinder, or sphere), then
the surface can be classified as such. If there are two or more peaks in the histogram,
the object has more than one surface type (a code sketch of this test appears below).

(d) In single-surface cases, fit the best fitting surface to the points. Output the
description of the surface.

(e) Further processing by region growing with surface fitting is necessary to smooth the
surface patches: fit surfaces on individual patches and merge them by region growing.
This algorithm is being implemented. The initial classification process will classify the
surfaces in figure 11 as 5 planar surfaces, 1 cylindrical surface, and 1 irregular surface (the
film mailer). See figure 36.

Figure 36: Classified surfaces
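The core of steps (b) and (c) is a label histogram with a 90% dominance test. A minimal
C sketch follows (the names are ours, and the mask is assumed to be the object already
eroded by 5 pixels as in step (a)):

/* Return the dominant surface-type label (0..7) if at least 90% of the
   masked pixels carry it, or -1 if the object has mixed surface types. */
int classify_object(unsigned char **label, unsigned char **mask,
                    int rows, int cols)
{
    int hist[8] = {0}, total = 0, best = 0;
    int i, j, t;

    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            if (mask[i][j]) {            /* mask is the eroded object */
                hist[label[i][j]]++;
                total++;
            }

    for (t = 1; t < 8; t++)
        if (hist[t] > hist[best]) best = t;

    if (total > 0 && hist[best] * 10 >= total * 9)
        return best;                     /* single dominant surface type */
    return -1;                           /* two or more histogram peaks  */
}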
First and second order polynomials were fitted to the flat and non-flat surface patches
respectively in the image of figure 11. The reconstructed image is shown in figure 37.

Figure 37: Original and reconstructed images. Left: original images; right: reconstructed
images obtained by fitting 1st and 2nd order surfaces on patches labeled by the
segmentation process.

Besl and Jain [13, 25] have used the initial labeling to obtain seed regions in the final region
growing process. They perform variable order surface fitting to approximate the scene as a
collection of piecewise continuous functions.
Chapter 5

Discussion

Though the results of running the various algorithms described in the previous chapters on
images acquired from different scanners are consistent, there is scope for refinement of all
the approaches. We will discuss the merits and demerits of each method and suggest
improvements.

We need to study the scale-space behavior of range images in detail. This would lead to a
better understanding of the scale at which range images should be handled. We have noticed
that thresholding of zero-crossings makes the entire segmentation procedure dependent on
the threshold value. Though we have obtained consistent results with a fixed, empirically
determined value for all the images obtained from a particular scanner, threshold selection is
not automatic. Secondly, even with the right threshold value the region may not be completely
bounded by the zero-crossings (in the case of overlaps by thin objects, or sensor noise). To
make the whole process less sensitive to the threshold, the following post-processing steps
(region splitting) are suggested:

1. Read in the segmented object.

2. Trace the contours around the object as it is currently defined, and also any other
boundaries now lying inside the object. Except for the bounding contour, these contours
may not be closed; they may simply lie within the region and actually be the boundary
of the real object. In such a case mark the beginning and end of the contour. If the
contour touches the closed contour, then mark the point of contact as the end of the
inner contour.

3. In all the contours mark the concavities.

4. Now split the region by connecting two contours (gap filling), or connecting two points
of concavity (gap filling or region splitting), or connecting an end point of a contour
with a concavity, based on a predetermined gap filling distance.

5. The output is the segmented object.

The above method should be insensitive to threshold values on the higher side, as it splits
a region consisting of more than one region. To reduce the sensitivity to low threshold values
(which result in too many small regions) some sort of merging is required. Merging is a
much more difficult task, so it is better to keep the threshold high and have the post-
segmentation process perform the splitting, rather than have the initial segmentation perform
splitting due to a low threshold value.
Another solution to splitting is to let a higher level recognition process make globally valid
observations to split the region. The higher level procedure may use a priori information, or
may make some assumptions or apply global constraints to split the region. Franc Solina's
superquadric procedure (see [23]) can split regions into identifiable parts by performing
model fitting on individual parts of the object.
In chapter 4 we noticed that the labeling of the scene based on curvature sign is threshold
sensitive. While thresholding around zero is necessary to obtain meaningful results, it is
not clear how that value can be automatically determined. Curvature determination being
local, the labeling is sensitive to noise and surface texture. It is not well understood how to
generate a global interpretation of such surfaces.
Bibliography

[1] R. M. Bolle and D. B. Cooper; Bayesian recognition of local 3-D shape by approximating
image intensity functions with quadric polynomials; IEEE Trans. Pattern Analysis and
Machine Intelligence, PAMI-6, No. 4, 1984, pp. 418-429.

[2] R. Bajcsy; Three dimensional scene analysis; Proc. Pattern Recognition Conf.,
Miami, Florida, pp. 1064-1074, 1980.

[3] P. J. Besl and R. C. Jain; Three dimensional object recognition; ACM Computing
Surveys 17, No. 1, 1985, pp. 75-145.

[4] M. Brady, J. Ponce, A. Yuille, and H. Asada; Describing surfaces; MIT AI Lab Memo
822, January 1985.

[5] Haruo Asada and Michael Brady; The Curvature Primal Sketch; IEEE Pattern
Analysis and Machine Intelligence, PAMI-8, No. 1, January 1986.

[6] Robert Haralick, Layne Watson, and Thomas Laffey; The Topographic Primal Sketch;
International Journal of Robotics Research, vol. 2, No. 1, Spring 1983, pp. 50-72.

[7] Jean Ponce and Michael Brady; Toward a Surface Primal Sketch; IEEE Conference on
Robotics and Automation, March 1984, pp. 420-425.

[8] M. Hebert and J. Ponce; A new method for segmenting 3-D scenes into primitives;
Proc. of the 6th International Conference on Pattern Recognition, 1982.

[9] P. J. Besl and R. C. Jain; Invariant surface characteristics for 3D object recognition in
range images; Computer Vision, Graphics, and Image Processing, No. 1, 1986, pp. 33-80.

[10] R. M. Haralick; Digital step edges from zero crossings of second directional
derivatives; IEEE Trans. Pattern Analysis and Machine Intelligence, PAMI-6, No. 1, 1984,
pp. 58-68.

[11] Ballard and Brown; Computer Vision; Prentice Hall, New Jersey, 1982.

[12] B. K. P. Horn; Machine Vision.

[13] P. J. Besl and R. C. Jain; Segmentation through symbolic surface descriptions; Proc.
on Computer Vision and Pattern Recognition, 1986.

[14] T. J. Fan, G. Medioni, and R. Nevatia; Descriptions of surfaces from range data using
curvature properties; Proc. on Computer Vision and Pattern Recognition, 1986.

[15] T. C. Henderson; An efficient segmentation method for range data; Proc. of the Society
for Photo-Optical Instrumentation Engineers Conference on Robot Vision, 1982.

[16] A. Huertas and R. Nevatia; Edge detection in aerial images using ∇²G(x,y); in
Semiannual Technical Report on Image Understanding Research, University of Southern
California, 1981, pp. 16-26.

[17] S. Inokuchi et al.; A three dimensional edge region operator for range pictures; Proc.
of the 6th International Conference on Pattern Recognition, 1982.

[18] D. L. Milgrim and C. M. Bjorklund; Range image processing: planar surface extraction;
Proc. of the 5th International Conference on Pattern Recognition, 1980.

[19] L. R. Nackman; Two-dimensional critical point configuration graphs; IEEE Pattern
Analysis and Machine Intelligence, PAMI-6, July 1984.

[20] I. K. Sethi and Jayaramamurthy; Surface classification using characteristic contours;
Proc. of the 7th International Conference on Pattern Recognition, IEEE, 1984.

[21] O. D. Faugeras and M. Hebert; The representation, recognition and positioning of 3-D
shapes from range data; in Techniques for 3-D Machine Perception, edited by A. Rosenfeld;
North-Holland, 1986, pp. 13-51.

[22] B. K. P. Horn; Extended Gaussian Images; Proc. of the IEEE, vol. 72, No. 12, pp.
1671-1686, December 1984.

[23] Franc Solina; Shape Recovery and Segmentation with Deformable Part Models; Ph.D.
thesis, GRASP Laboratory, University of Pennsylvania, MS-CIS-87-111.

[24] D. Marr and E. Hildreth; Theory of edge detection; Proc. of the Royal Society of
London, B-207, pp. 187-217, 1980.

[25] P. J. Besl and Ramesh Jain; Segmentation through variable-order surface fitting; IEEE
Transactions on Pattern Analysis and Machine Intelligence, 1988.

[26] David J. Heeger; Filling in the Gaps: A Computational Theory of Contour Generation;
University of Pennsylvania, GRASP Lab, Technical Report MS-CIS-84-64.

[27] B. Gil, A. Mitiche, and J. K. Aggarwal; Experiments in combining intensity and range
edge maps; Computer Vision, Graphics, and Image Processing 21, 1983, pp. 395-411.

[28] A. Mitiche and J. K. Aggarwal; Detection of edges using range information; IEEE
Trans. Pattern Analysis and Machine Intelligence, PAMI-5, No. 2, pp. 174-178.

[29] Peter Allen; Object Recognition Using Vision and Touch; Ph.D. thesis, GRASP Lab,
University of Pennsylvania, 1985.

[30] R. Nevatia and K. R. Babu; Linear feature extraction and description; Computer
Graphics and Image Processing, vol. 13, pp. 257-269, 1980.

[31] A. P. Pentland; Perceptual organization and the representation of natural form;
Artificial Intelligence 28(3), pp. 293-331.

[32] R. Y. Wong and Hayrepetian; Image processing with intensity and range data;
Proceedings of the IEEE Pattern Recognition and Image Processing Conference, Las Vegas,
June 1982, pp. 518-520.

[33] R. S. Yang and A. C. Kak; Determination of the identity, position and orientation of
the topmost object in a pile; Computer Vision, Graphics, and Image Processing 36, 1986,
pp. 229-255.

[34] David Smith and Takeo Kanade; Autonomous scene description with range imagery;
Computer Vision, Graphics, and Image Processing, vol. 31, No. 3, Sept. 1985, pp. 322-334.

[35] Martin Herman; Generating detailed scene descriptions from range images; IEEE
Conference on Robotics and Automation, March 1984, pp. 426-431.

[36] Richard Duda, David Nitzan, and Phyllis Barrett; Use of range and reflectance data to
find planar surface regions; IEEE Pattern Analysis and Machine Intelligence, PAMI-1,
No. 3, July 1979.

[37] Darwin Kuan and Robert Drazovich; Model-based interpretation of range imagery;
Proc. of the National Conference on Artificial Intelligence, Austin, August 6-10, AAAI,
pp. 210-215.

[38] B. C. Vemuri and J. K. Aggarwal; 3-D model construction from multiple views using
range and intensity data; IEEE Conference on Computer Vision and Pattern Recognition,
1986, pp. 435-437.

[39] Alan Yuille and Tomaso Poggio; Scaling theorems for zero crossings; IEEE Pattern
Analysis and Machine Intelligence, PAMI-8, No. 1, January 1986.
Appendix A

2nd Order Least Squares Fitting in a Symmetric Neighborhood

The approximating polynomial is written as:

f(u,v) = \sum_{p+q \le 2} a_{pq}\, u^p v^q

The square error term is:

\epsilon^2 = \sum_{(u,v)} \left[ f(u,v) - z(u,v) \right]^2

where z(u,v) is the measured depth value. To minimize the least squares term, let
\partial \epsilon^2 / \partial a_{pq} = 0, which is:

\sum_{(u,v)} \Big[ \sum_{r+s \le 2} a_{rs}\, u^r v^s - z(u,v) \Big]\, u^p v^q = 0

Writing:

S_{pq0} = \sum_{(u,v)} u^p v^q, \qquad S_{pq1} = \sum_{(u,v)} u^p v^q\, z(u,v)

we get:

\sum_{r+s \le 2} a_{rs}\, S_{(p+r)(q+s)0} = S_{pq1}

In a symmetric neighborhood:

S_{pq0} = 0 \text{ for odd } p \text{ or } q, \qquad \text{and} \qquad S_{pq0} = S_{qp0}

The above system of equations reduces to:

a_{10}\, S_{200} = S_{101}, \qquad a_{01}\, S_{200} = S_{011}, \qquad a_{11}\, S_{220} = S_{111}

which can be written as:

a_{10} = \frac{S_{101}}{S_{200}}, \qquad a_{01} = \frac{S_{011}}{S_{200}}, \qquad a_{11} = \frac{S_{111}}{S_{220}}

and

\begin{bmatrix} S_{000} & S_{200} & S_{020} \\ S_{200} & S_{400} & S_{220} \\ S_{020} & S_{220} & S_{040} \end{bmatrix} \begin{bmatrix} a_{00} \\ a_{20} \\ a_{02} \end{bmatrix} = \begin{bmatrix} S_{001} \\ S_{201} \\ S_{021} \end{bmatrix}
Appendix B

Source Code Listing

Listings of the source programs are included in the following pages.

1. space.c : Performs smoothing, median filtering, Gaussian filtering, Laplacian, marking
of zero-crossings, graphics display of histograms, etc., in an interactive manner.

2. segment.c : Segments all objects and the topmost object in the scene, given the original
image and the zero-crossings of the LOG image. Also outputs the supporting points of
the topmost object.

3. rca-calib.c : Generates a list of points for a z-depth format image. Originally written
by Franc, modified to read PM files and add points horizontally and vertically.

4. classify.c : Procedure used to select and classify the superquadric models.

5. spline.c : Computes various surface characteristics of the image. Interactively displays
results and histograms using quickdraw routines. Outputs the labeled image and other
characteristics as desired by the user.

6. grad.c : Has code for fast least squares fitting. A polynomial is fitted in the symmetric
neighborhood of each pixel.

7. merge.c : Performs processing on the image labeled according to the signs of Gaussian
and mean curvature. It computes convex parts of the image and fits polynomials on
patches using the general least squares routines in solver.c.

8. solver.c : Has code for a general least squares fitting procedure.

There are other supporting programs that are vital to the algorithms. The superquadric
fitting programs were developed by Franc Solina and are not listed here.
/*********************************************************************
 * space.c
 *
 * Program for computing scale-space description of the input image
 * and lots of other things in interactive manner.  Later will
 * override scale.c.
 *
 * Modifications for display of histogram done on Feb 18, 1988.
 * March 24 1988 : To read/write in both PM-C and PM-S formats.
 * April 1 1988  : All buffers made floating pt.  Computation of
 *                 laplacian and zerocrossing made floating pt.
 *********************************************************************/

#include <stdio.h>
#include <math.h>
#include <local/pm.h>
#include <ik.h>
/* #include <local/qkopt.h> */

#define BUFSIZE 256     /* Image buffer size */
#define BUF_NUM 4       /* # of buffers available for manipulation */

char  outfile[30];                       /* name of the output file */
float result[BUFSIZE][BUFSIZE];
float buffer[BUF_NUM][BUFSIZE][BUFSIZE]; /* 0 stores original image */
int   temp_buf[BUFSIZE];                 /* temporarily store a buffer */
float x[BUFSIZE], y[BUFSIZE];            /* to store points */
int   threshold;                         /* Threshold for detecting zero-xings */
int   bool1, bool2, bool3;
char *cmt;
char  input_filename[50];
int   sizex, sizey;                      /* size of the image */
pmpic *pm1;
u_int image_format;                      /* stores the format of the last image read */

float min();
float get_median();
float abszz();
float squarezz();

main(argc, argv)
int argc;
char **argv;
{
    FILE *infile, *out;     /* pointers to input image and output image files */
    FILE *out2;
    FILE *infs, *outgmsmooth, *outputgm;
    char *smooth_cmt;
    char  ikonas_disp[10];
    char *out_cmt;
    int   count;
    unsigned char c;
    static int nhb_x[8] = { 0, 1, 1,  1,  0, -1, -1, -1 };
    static int nhb_y[8] = { 1, 1, 0, -1, -1, -1,  0,  1 };
    float nbd[1], temp;
    int   i, j, k, l, m, n, b;
    float nbr[10];
    int   gsize;
    float sigma;            /* size and sigma of the gaussian operator */
    float sum;
    float gauss[60];
    float gsum;
    int   offset;
    char  input[20];
    int   offx = 0, offy = 0;   /* offset coordinates, initialized */
    int   b1, b2, b3;
    unsigned char *pm_point;
    int   disp_row;
    short int *pms_point;   /* pointer to short integer to handle PM_S format */
    int   factor;           /* # by which PM_S image pixel to be scaled
                               for display on IKONAS */

    if (argc != 2) {
        printf("usage : scale <input-image-file-pmpic>\n");
        exit(0);
    }

    printf("want to display on ikonas ? ");
    scanf("%1s", &ikonas_disp[0]);

    printf("read\n");
    /* open the ikonas display. value of env. variable is taken */
    if (strcmp("y", ikonas_disp) == 0)
        if (ikopen(NULL) == -1) {
            printf("can't open ikonas. exiting\n");
            exit(0);
        }

    /* get comment line */
    cmt = pm_cmt(argc, argv);

    /* open input pm file */
    readpicture(argv[1], 0);
    readpicture(argv[1], 1);
    printf("Rows : %d Columns : %d\n", sizex, sizey);

    /* processing of the commands starts now; the available commands are:

       1. gauss    : convolves the image with gaussian filter.
       2. cross    : computes zero and other types of crossings in the
                     given buffer.
       3. save     : saves indicated buffer in a file.
       4. disp     : displays indicated buffer on the ikonas.
       5. add      : adds the two buffers, result is put in buffer 1.
       6. sub      : subtracts two buffers, result is put in buffer 1.
       7. buffer   : selects the current buffer.
       8. offset   : offset the picture on ikonas.
       9. original : indicated buffer gets original picture.
       10. read    : Reads a file in the designated buffer.
       11. hist    : Computes and displays the histogram using quickdraw.
       12. ikpm    : saves the image displayed on ikonas in a file.

       Active buffers are maintained to manipulate the original image. */

    printf(">");
    while (scanf("%s", input) != EOF) {
        if ((strcmp(input, "median") == 0) || (strcmp(input, "m") == 0)) {
            /* do median filtering of the image */
            b = readbuffer();

            for (i = 1; i < (sizex - 1); i++)
                for (j = 1; j < (sizey - 1); j++) {
                    count = 0;
                    for (m = i - 1; m <= i + 1; m++)
                        for (n = j - 1; n <= j + 1; n++) {
                            nbr[count] = buffer[b][m][n];
                            count++;
                        }
                    result[i][j] = get_median(nbr);
                }

            for (i = 1; i < (sizex - 1); i++)
                for (j = 1; j < (sizey - 1); j++)
                    buffer[b][i][j] = result[i][j];
        }   /* end of median filtering */
        else if ((strcmp(input, "gauss") == 0) || (strcmp(input, "g") == 0)) {
            printf("size of window :");
            scanf("%d", &gsize);

            /* compute the gaussian array */
            gsum = 0;
            for (i = -gsize/2, j = 0; i <= gsize/2; i++, j++) {
                gauss[j] = (1.0 / (sqrt((double)(2.0 * 3.1415926)) * sigma)) *
                           exp(-(double)(i * i) / (2.0 * sigma * sigma));
                           /* exponential factor truncated in the printout;
                              the standard Gaussian term is assumed here */
                gsum += gauss[j];
                printf("gauss[%d] = %f", i, gauss[j]);
            }
            if (gsum == 0) gsum = 1;
            printf("\n gsum = %f \n", gsum);

            /* separably convolve x-axis */
            for (j = 0; j < sizey; j++)
                for (i = gsize/2; i <= (sizex - gsize/2); i++) {
                    sum = 0;
                    for (k = -gsize/2; k <= gsize/2; k++)
                        sum += buffer[b][i+k][j] * gauss[k + gsize/2];
                    result[i][j] = sum / gsum;
                }