Fig. 1. Point cloud segmentation using the proposed neural network. Bottom: schematic neural network architecture. Top: Structure of the feature spaces
produced at different layers of the network, visualized as the distance from the red point to all the rest of the points (shown left-to-right are the input
and layers 1–3; rightmost figure shows the resulting segmentation). Observe how the feature space structure in deeper layers captures semantically similar
structures such as wings, fuselage, or turbines, despite a large distance between them in the original input space.
Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insight from CNNs to the point cloud world. Point clouds inherently lack topological information, so designing a model to recover topology can enrich the representation power of point clouds. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds, including classification and segmentation. EdgeConv acts on graphs dynamically computed in each layer of the network. It is differentiable and can be plugged into existing architectures. Compared to existing modules operating in extrinsic space or treating each point independently, EdgeConv has several appealing properties: It incorporates local neighborhood information; it can be stacked to learn global shape properties; and in multi-layer systems affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. We show the performance of our model on standard benchmarks, including ModelNet40, ShapeNetPart, and S3DIS.

CCS Concepts: • Computing methodologies → Neural networks; Point-based models; Shape analysis;

Additional Key Words and Phrases: Point cloud, classification, segmentation

The authors acknowledge the generous support of Army Research Office Grant No. W911NF-12-R-0011, of Air Force Office of Scientific Research Award No. FA9550-19-1-0319, of National Science Foundation Grant No. IIS-1838071, of ERC Consolidator Grant No. 724228 (LEMAN), from an Amazon Research Award, from the MIT-IBM Watson AI Laboratory, from the Toyota-CSAIL Joint Research Center, from the Skoltech-MIT Next Generation Program, and from a Google Faculty Research Award. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of these organizations.

Authors' addresses: Y. Wang, Y. Sun, S. E. Sarma, and J. M. Solomon, Massachusetts Institute of Technology; emails: [email protected], {yb_sun, sesarma, jsolomon}@mit.edu; Z. Liu, The Chinese University of Hong Kong; email: [email protected]; M. M. Bronstein, Imperial College London / USI Lugano; email: [email protected].
ACM Reference format:
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, Michael M. Bronstein, and Justin M. Solomon. 2019. Dynamic Graph CNN for Learning on Point Clouds. ACM Trans. Graph. 38, 5, Article 146 (October 2019), 12 pages. https://doi.org/10.1145/3326362

1 INTRODUCTION
Point clouds, or scattered collections of points in 2D or 3D, are arguably the simplest shape representation; they also comprise the output of 3D sensing technology, including LiDAR scanners and stereo reconstruction. With the advent of fast 3D point cloud acquisition, recent pipelines for graphics and vision often process point clouds directly, bypassing expensive mesh reconstruction or denoising due to efficiency considerations or instability of these techniques in the presence of noise. A few of the many recent applications of point cloud processing and analysis include indoor navigation (Zhu et al. 2017), self-driving vehicles (Liang et al. 2018; Qi et al. 2017a; Wang et al. 2018b), robotics (Rusu et al. 2008b), and shape synthesis and modeling (Golovinskiy et al. 2009; Guerrero et al. 2018).

These modern applications demand high-level processing of point clouds. Rather than identifying salient geometric features like corners and edges, recent algorithms search for semantic cues and affordances. These features do not fit cleanly into the frameworks of computational or differential geometry and typically require learning-based approaches that derive relevant information through statistical analysis of labeled or unlabeled datasets.

In this article, we primarily consider point cloud classification and segmentation, two model tasks in point cloud processing. Traditional methods for solving these problems employ handcrafted features to capture geometric properties of point clouds (Lu et al. 2014; Rusu et al. 2009, 2008a). More recently, the success of deep neural networks for image processing has motivated a data-driven approach to learning features on point clouds. Deep point cloud processing and analysis methods are developing rapidly and outperform traditional approaches in various tasks (Chang et al. 2015).

Adaptation of deep learning to point cloud data, however, is far from straightforward. Most critically, standard deep neural network models require input data with regular structure, while point clouds are fundamentally irregular: Point positions are continuously distributed in space, and any permutation of their ordering does not change the spatial distribution. One common approach to processing point cloud data with deep learning models is to first convert the raw point cloud into a volumetric representation, namely a 3D grid (Maturana and Scherer 2015; Wu et al. 2015). This approach, however, usually introduces quantization artifacts and excessive memory usage, making it difficult to capture high-resolution or fine-grained features.

State-of-the-art deep neural networks are designed specifically to handle the irregularity of point clouds, directly manipulating raw point cloud data rather than passing to an intermediate regular representation. This approach was pioneered by PointNet (Qi et al. 2017b), which achieves permutation invariance of points by operating on each point independently and subsequently applying a symmetric function to accumulate features. Various extensions of PointNet consider neighborhoods of points rather than acting on each independently (Qi et al. 2017c; Shen et al. 2017); these allow the network to exploit local features, improving upon the performance of the basic model. These techniques largely treat points independently at local scale to maintain permutation invariance. This independence, however, neglects the geometric relationships among points, a fundamental limitation that prevents these models from capturing local features.

To address these drawbacks, we propose a novel, simple operation, called EdgeConv, which captures local geometric structure while maintaining permutation invariance. Instead of generating point features directly from their embeddings, EdgeConv generates edge features that describe the relationships between a point and its neighbors. EdgeConv is designed to be invariant to the ordering of neighbors, and thus is permutation invariant. Because EdgeConv explicitly constructs a local graph and learns the embeddings for the edges, the model is capable of grouping points both in Euclidean space and in semantic space.

EdgeConv is easy to implement and integrate into existing deep learning models to improve their performance. In our experiments, we integrate EdgeConv into the basic version of PointNet without using any feature transformation. We show the resulting network achieves state-of-the-art performance on several datasets, most notably ModelNet40 and S3DIS for classification and segmentation.

Key Contributions. We summarize the key contributions of our work as follows:

• We present a novel operation for learning from point clouds, EdgeConv, to better capture local geometric features of point clouds while still maintaining permutation invariance.
• We show the model can learn to semantically group points by dynamically updating a graph of relationships from layer to layer.
• We demonstrate that EdgeConv can be integrated into multiple existing pipelines for point cloud processing.
• We present extensive analysis and testing of EdgeConv and show that it achieves state-of-the-art performance on benchmark datasets.

2 RELATED WORK
Hand-Crafted Features. Various tasks in geometric data processing and analysis—including segmentation, classification, and matching—require some notion of local similarity between shapes. Traditionally, this similarity is established by constructing feature descriptors that capture local geometric structure. Countless papers in computer vision and graphics propose local feature descriptors for point clouds suitable for different problems and data structures. A comprehensive overview of hand-designed point features is out of the scope of this article, but we refer the reader to Biasotti et al. (2016), Guo et al. (2014), and Van Kaick et al. (2011) for discussion.

Broadly speaking, one can distinguish between extrinsic and intrinsic descriptors. Extrinsic descriptors usually are derived from the coordinates of the shape in 3D space and include classical methods like shape context (Belongie et al. 2001), spin images (Johnson and Hebert 1999), integral features (Manay et al. 2006), distance-based descriptors (Ling and Jacobs 2007), point feature histograms (Rusu et al. 2009, 2008a), and normal histograms (Tombari et al. 2011), to name a few.
Fig. 2. Left: Computing an edge feature, e_ij (top), from a point pair, x_i and x_j (bottom). In this example, h_Θ(·) is instantiated using a fully connected layer, and the learnable parameters are its associated weights. Right: The EdgeConv operation. The output of EdgeConv is calculated by aggregating the edge features associated with all the edges emanating from each connected vertex.
Intrinsic descriptors treat the 3D shape as a manifold whose metric structure is discretized as a mesh or graph; quantities expressed in terms of the metric are invariant to isometric deformation. Representatives of this class include spectral descriptors such as global point signatures (Rustamov 2007), the heat and wave kernel signatures (Aubry et al. 2011; Sun et al. 2009), and variants (Bronstein and Kokkinos 2010). Most recently, several approaches wrap machine learning schemes around standard descriptors (Guo et al. 2014; Shah et al. 2013).

Deep Learning on Geometry. Following the breakthrough results of convolutional neural networks (CNNs) in vision (Krizhevsky et al. 2012; LeCun et al. 1989), there has been strong interest in adapting such methods to geometric data. Unlike images, geometry usually does not have an underlying grid, requiring new building blocks that replace convolution and pooling, or an adaptation to a grid structure.

As a simple way to overcome this issue, view-based (Su et al. 2015; Wei et al. 2016) and volumetric representations (Klokov and Lempitsky 2017; Maturana and Scherer 2015; Tatarchenko et al. 2017; Wu et al. 2015)—or their combination (Qi et al. 2016)—"place" geometric data onto a grid. More recently, PointNet (Qi et al. 2017b, 2017c) exemplifies a broad class of deep learning architectures on non-Euclidean data (graphs and manifolds) termed geometric deep learning (Bronstein et al. 2017). These date back to early methods to construct neural networks on graphs (Scarselli et al. 2009), recently improved with gated recurrent units (Li et al. 2016) and neural message passing (Gilmer et al. 2017). Bruna et al. (2013) and Henaff et al. (2015) generalized convolution to graphs via the Laplacian eigenvectors (Shuman et al. 2013). Computational drawbacks of this foundational approach were alleviated in follow-up works using polynomial (Defferrard et al. 2016; Kipf and Welling 2017; Monti et al. 2017b, 2018) or rational (Levie et al. 2017) spectral filters that avoid Laplacian eigendecomposition and guarantee localization.

An alternative definition of non-Euclidean convolution employs spatial rather than spectral filters. The Geodesic CNN (GCNN) is a deep CNN on meshes generalizing the notion of patches using local intrinsic parameterization (Masci et al. 2015). Its key advantage over spectral approaches is better generalization as well as a simple way of constructing directional filters. Follow-up work proposed different local charting techniques using anisotropic diffusion (Boscaini et al. 2016) or Gaussian mixture models (Monti et al. 2017a; Veličković et al. 2017). In Halimi et al. (2018) and Litany et al. (2017b), a differentiable functional map (Ovsjanikov et al. 2012) layer was incorporated into a geometric deep neural network, allowing intrinsic structured prediction of correspondence between nonrigid shapes.

The last class of geometric deep learning approaches attempts to pull back a convolution operation by embedding the shape into a domain with shift-invariant structure such as the sphere (Sinha et al. 2016), torus (Maron et al. 2017), plane (Ezuz et al. 2017), sparse network lattice (Su et al. 2018), or spline (Fey et al. 2018).

Finally, we should mention geometric generative models, which attempt to generalize models such as autoencoders, variational autoencoders (VAEs) (Kingma and Welling 2013), and generative adversarial networks (GANs) (Goodfellow et al. 2014) to the non-Euclidean setting. One of the fundamental differences between these two settings is the lack of canonical order between the input and the output vertices, thus requiring an input-output correspondence problem to be solved. In 3D mesh generation, it is commonly assumed that the mesh is given and its vertices are canonically ordered; the generation problem thus amounts only to determining the embedding of the mesh vertices. Kostrikov et al. (2017) proposed Surface Networks based on the extrinsic Dirac operator for this task. Litany et al. (2017a) introduced an intrinsic VAE for meshes and applied it to shape completion; a similar architecture was used by Ranjan et al. (2018) for 3D face synthesis. For point clouds, multiple generative architectures have been proposed (Fan et al. 2017; Li et al. 2018b; Yang et al. 2018).

3 OUR APPROACH
We propose an approach inspired by PointNet and convolution operations. Instead of working on individual points like PointNet, however, we exploit local geometric structures by constructing a local neighborhood graph and applying convolution-like operations on the edges connecting neighboring pairs of points, in the spirit of graph neural networks. We show in the following that such an operation, dubbed edge convolution (EdgeConv), has properties lying between translation invariance and non-locality.

Unlike graph CNNs, our graph is not fixed but rather is dynamically updated after each layer of the network. That is, the set of k-nearest neighbors of a point changes from layer to layer of the network and is computed from the sequence of embeddings. Proximity in feature space differs from proximity in the input, leading to nonlocal diffusion of information throughout the point cloud.

As a connection to existing work, Non-local Neural Networks (Wang et al. 2018a) explored similar ideas in the video recognition field, and follow-up work by Xie et al. (2018) proposed using non-local blocks to denoise feature maps to defend against adversarial attacks.
Fig. 3. Model architectures: The model architectures used for classification (top branch) and segmentation (bottom branch). The classification model takes as input n points, calculates an edge feature set of size k for each point at an EdgeConv layer, and aggregates features within each set to compute EdgeConv responses for corresponding points. The output features of the last EdgeConv layer are aggregated globally to form a 1D global descriptor, which is used to generate classification scores for c classes. The segmentation model extends the classification model by concatenating the 1D global descriptor and all the EdgeConv outputs (serving as local descriptors) for each point. It outputs per-point classification scores for p semantic labels. ⊕: concatenation. Point cloud transform block: The point cloud transform block is designed to align an input point set to a canonical space by applying an estimated 3 × 3 matrix. To estimate the 3 × 3 matrix, a tensor concatenating the coordinates of each point and the coordinate differences between it and its k neighboring points is used. EdgeConv block: The EdgeConv block takes as input a tensor of shape n × f, computes edge features for each point by applying a multi-layer perceptron (MLP) with the numbers of layer neurons defined as {a_1, a_2, ..., a_n}, and generates a tensor of shape n × a_n after pooling among neighboring edge features.
3.1 Edge Convolution
Consider an F-dimensional point cloud with n points, denoted by X = {x_1, ..., x_n} ⊆ R^F. In the simplest setting of F = 3, each point contains 3D coordinates x_i = (x_i, y_i, z_i); it is also possible to include additional coordinates representing color, surface normal, and so on. In a deep neural network architecture, each subsequent layer operates on the output of the previous layer, so more generally the dimension F represents the feature dimensionality of a given layer.

We compute a directed graph G = (V, E) representing local point cloud structure, where V = {1, ..., n} and E ⊆ V × V are the vertices and edges, respectively. In the simplest case, we construct G as the k-nearest neighbor (k-NN) graph of X in R^F. The graph includes self-loops, meaning each node also points to itself. We define edge features as e_ij = h_Θ(x_i, x_j), where h_Θ : R^F × R^F → R^F′ is a nonlinear function with a set of learnable parameters Θ.

Finally, we define the EdgeConv operation by applying a channel-wise symmetric aggregation operation □ (e.g., ∑ or max) on the edge features associated with all the edges emanating from each vertex. The output of EdgeConv at the i-th vertex is thus given by

    x′_i = □_{j:(i,j)∈E} h_Θ(x_i, x_j).    (1)

Making an analogy to convolution on images, we regard x_i as the central pixel and {x_j : (i, j) ∈ E} as a patch around it (see Figure 2). Overall, given an F-dimensional point cloud with n points, EdgeConv produces an F′-dimensional point cloud with the same number of points.

Choice of h and □. The choice of the edge function and the aggregation operation has a crucial influence on the properties of EdgeConv. For example, when x_1, ..., x_n represent image pixels on a regular grid and the graph G has connectivity representing patches of fixed size around each pixel, the choice θ_m · x_j as the edge function and sum as the aggregation operation yields standard convolution:

    x′_im = ∑_{j:(i,j)∈E} θ_m · x_j.    (2)

Here, Θ = (θ_1, ..., θ_M) encodes the weights of M different filters. Each θ_m has the same dimensionality as x, and · denotes the Euclidean inner product.

A second choice of h is

    h_Θ(x_i, x_j) = h_Θ(x_i),    (3)

encoding only global shape information oblivious of the local neighborhood structure. This type of operation is used in PointNet, which can thus be regarded as a special case of EdgeConv.

A third choice of h, adopted by Atzmon et al. (2018), is

    h_Θ(x_i, x_j) = h_Θ(x_j)    (4)
and

    x′_im = ∑_{j∈V} h_θ(x_j) g(u(x_i, x_j)),    (5)

where g is a Gaussian kernel and u computes pairwise distances in Euclidean space.

A fourth option is

    h_Θ(x_i, x_j) = h_Θ(x_j − x_i).    (6)

This encodes only local information, treating the shape as a collection of small patches and losing global structure.

Finally, a fifth option, which we adopt in this article, is an asymmetric edge function

    h_Θ(x_i, x_j) = h̄_Θ(x_i, x_j − x_i).    (7)

This explicitly combines global shape structure, captured by the coordinates of the patch centers x_i, with local neighborhood information, captured by x_j − x_i. In particular, we can define our operator by taking

    e′_ijm = ReLU(θ_m · (x_j − x_i) + ϕ_m · x_i),    (8)

which can be implemented as a shared MLP, and

    x′_im = max_{j:(i,j)∈E} e′_ijm,    (9)

where Θ = (θ_1, ..., θ_M, ϕ_1, ..., ϕ_M).
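To make the construction concrete, here is a minimal PyTorch sketch of Equations (8) and (9). This is an illustration rather than the released implementation; the knn helper, the (batch, points, channels) tensor layout, and the use of a single shared linear layer on the concatenated pair [x_i, x_j − x_i] (which realizes both θ_m and ϕ_m at once) are our own assumptions.

```python
import torch
import torch.nn as nn

def knn(x, k):
    # x: (B, N, F). Returns indices (B, N, k) of each point's k nearest
    # neighbors under squared Euclidean distance (self-loops included).
    inner = torch.matmul(x, x.transpose(1, 2))            # (B, N, N)
    sq = (x ** 2).sum(dim=-1, keepdim=True)               # (B, N, 1)
    pairwise = sq - 2 * inner + sq.transpose(1, 2)        # ||x_i - x_j||^2
    return pairwise.topk(k, dim=-1, largest=False).indices

class EdgeConv(nn.Module):
    """Sketch of Eq. (8)-(9): edge features from [x_i, x_j - x_i] through a
    shared MLP, aggregated by a channel-wise max over the k neighbors."""
    def __init__(self, in_dim, out_dim, k=20):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Linear(2 * in_dim, out_dim),
            nn.BatchNorm1d(out_dim),
            nn.ReLU(),
        )

    def forward(self, x):                                  # x: (B, N, F)
        B, N, F = x.shape
        idx = knn(x, self.k)                               # (B, N, k)
        neighbors = torch.gather(
            x.unsqueeze(1).expand(B, N, N, F), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, F))     # x_j: (B, N, k, F)
        center = x.unsqueeze(2).expand_as(neighbors)       # x_i, repeated
        edge = torch.cat([center, neighbors - center], -1) # (B, N, k, 2F)
        edge = self.mlp(edge.reshape(-1, 2 * F)).reshape(B, N, self.k, -1)
        return edge.max(dim=2).values                      # Eq. (9)
```

A layer such as EdgeConv(3, 64) maps an n × 3 cloud to n × 64 features; because max ranges over an unordered neighbor set, the output does not depend on how the neighbors are listed.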
3.2 Dynamic Graph Update
Our experiments suggest that it is beneficial to recompute the graph using nearest neighbors in the feature space produced by each layer. This is a crucial distinction of our method from graph CNNs working on a fixed input graph. Such a dynamic graph update is the reason for the name of our architecture, the Dynamic Graph CNN (DGCNN). With dynamic graph updates, the receptive field is as large as the diameter of the point cloud, while being sparse.

At each layer, we have a different graph G^(l) = (V^(l), E^(l)), where the l-th layer edges are of the form (i, j_i1), ..., (i, j_ik_l) such that x^(l)_ji1, ..., x^(l)_jik_l are the k_l points closest to x^(l)_i. Put differently, our architecture learns how to construct the graph G used in each layer rather than taking it as a fixed constant constructed before the network is evaluated. In our implementation, we compute a pairwise distance matrix in feature space and then take the closest k points for each point.
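Continuing the sketch above, the dynamic update amounts to letting each EdgeConv layer call knn on its own input features; the layer widths and k below are illustrative assumptions, not the paper's exact configuration.

```python
class DGCNNBackbone(nn.Module):
    """Stack of EdgeConv layers with per-layer graph recomputation: each
    layer's k-NN graph is built from the features the previous layer
    produced, so neighborhoods change from layer to layer."""
    def __init__(self, dims=(3, 64, 64, 128), k=20):
        super().__init__()
        self.layers = nn.ModuleList(
            EdgeConv(d_in, d_out, k) for d_in, d_out in zip(dims, dims[1:]))

    def forward(self, x):                    # x: (B, N, 3) input coordinates
        features = []
        for layer in self.layers:
            x = layer(x)                     # knn runs on the current features
            features.append(x)
        return torch.cat(features, dim=-1)   # concatenated local descriptors
```

A classification head in the style of Figure 3 would max-pool these per-point features over all n points into a single global descriptor and apply fully connected layers to produce the c class scores.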
3.3 Properties
Permutation Invariance. Consider the output of a layer,

    x′_i = max_{j:(i,j)∈E} h_Θ(x_i, x_j),    (10)

and a permutation operator π. The output of the layer x′_i is invariant to permutation of the inputs x_j because max is a symmetric function (other symmetric functions also apply). The global max pooling operator used to aggregate point features is also permutation-invariant.

Translation Invariance. Our operator has a "partial" translation invariance property, in that our choice of edge function, Equation (7), explicitly exposes the part of the function that can be translation-dependent and optionally can be disabled. Consider a translation T applied to x_j and x_i; we can show that part of the edge feature is preserved when shifting by T. In particular, for the translated point cloud, we have

    e′_ijm = θ_m · (x_j + T − (x_i + T)) + ϕ_m · (x_i + T)
           = θ_m · (x_j − x_i) + ϕ_m · (x_i + T).

If we only consider x_j − x_i by taking ϕ_m = 0, then the operator is fully invariant to translation. In this case, however, the model reduces to recognizing an object based on an unordered set of patches, ignoring the positions and orientations of patches. With both x_j − x_i and x_i as input, the model takes into account the local geometry of patches while keeping global shape information.
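Both properties can be spot-checked numerically on the sketches above (eval() freezes the batch-norm statistics so the layer is a fixed function):

```python
torch.manual_seed(0)
layer = EdgeConv(3, 32, k=8).eval()
x = torch.randn(1, 64, 3)

# Permutation invariance: permuting the input points permutes the output
# rows but leaves each point's feature unchanged.
perm = torch.randperm(64)
out, out_perm = layer(x), layer(x[:, perm])
assert torch.allclose(out[:, perm], out_perm, atol=1e-5)

# "Partial" translation invariance: only the phi_m . x_i half of the edge
# feature changes under x -> x + T, so outputs generally differ unless
# the x_i input is disabled (cf. taking phi_m = 0 above).
```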
ACM Transactions on Graphics, Vol. 38, No. 5, Article 146. Publication date: October 2019.
146:6 • Y. Wang et al.
Fig. 4. Structure of the feature spaces produced at different stages of our shape classification neural network architecture, visualized as the distance from the red point to the rest of the points. For each set, Left: Euclidean distance in the input R³ space; Middle: Distance after the point cloud transform stage, amounting to a global transformation of the shape; Right: Distance in the feature space of the last layer. Observe how in the feature space of deeper layers semantically similar structures such as shelves of a bookshelf or legs of a table are brought close together, although they are distant in the original space.
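In the same spirit, the distance maps of Figures 1 and 4 can be approximated from the sketches above: embed the cloud, pick a reference ("red") point, and plot the per-point feature-space distances (the backbone choice here is illustrative).

```python
backbone = DGCNNBackbone().eval()
with torch.no_grad():
    feats = backbone(x)                          # (1, N, D) per-point features
ref = 0                                          # index of the reference point
dist = (feats[0] - feats[0, ref]).norm(dim=-1)   # (N,) distances to visualize
```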
3.4 Comparison to Existing Methods
DGCNN is related to two classes of approaches, PointNet and graph CNNs, which we show to be particular settings of our method. We summarize the different methods in Table 1.

PointNet is a special case of our method with k = 1, yielding a graph with an empty edge set E = ∅. The edge function used in PointNet is h_Θ(x_i, x_j) = h_Θ(x_i), which considers global but not local geometry. PointNet++ tries to account for local structure by applying PointNet in a local manner. In our parlance, PointNet++ first constructs the graph according to the Euclidean distances between the points, and in each layer applies a graph coarsening operation. For each layer, some points are selected using farthest point sampling (FPS); only the selected points are preserved while the others are discarded after this layer. In this way, the graph becomes smaller after the operation applied at each layer. In contrast to DGCNN, PointNet++ computes pairwise distances using point input coordinates, and hence their graphs are fixed during training. The edge function used by PointNet++ is h_Θ(x_i, x_j) = h_Θ(x_j), and the aggregation operation is also a max.

Among graph CNNs, MoNet (Monti et al. 2017a), ECC (Simonovsky and Komodakis 2017), Graph Attention Networks (Veličković et al. 2017), and the concurrent work of Atzmon et al. (2018) are the most related approaches. Their common denominator is a notion of a local patch on a graph, in which a convolution-type operation can be defined.¹

Specifically, Monti et al. (2017a) use the graph structure to compute a local "pseudo-coordinate system" u in which the neighborhood vertices are represented; the convolution is then defined as an M-component Gaussian mixture,

    x′_im = ∑_{j:(i,j)∈E} θ_m · (x_j ⊙ g_w_n(u(x_i, x_j))),    (11)

where g is a Gaussian kernel, ⊙ is the elementwise (Hadamard) product, {w_1, ..., w_N} encode the learnable parameters of the Gaussians (means and covariances), and {θ_1, ..., θ_M} are the learnable filter coefficients. Equation (11) is an instance of our general operation, Equation (1), with a particular edge function

    h_θ_m,w_n(x_i, x_j) = θ_m · (x_j ⊙ g_w_n(u(x_i, x_j)))

and □ = ∑. Again, their graph structure is fixed, and u is constructed based on the degrees of nodes.

Atzmon et al. (2018) can be seen as a special case of Monti et al. (2017a) with g as predefined Gaussian functions. Removing learnable parameters (w_1, ..., w_N) and constructing a dense

¹ Simonovsky and Komodakis (2017) and Veličković et al. (2017) can be considered instances of Monti et al. (2017a), with the difference that the weights are constructed employing features from adjacent nodes instead of graph structure; Atzmon et al. (2018) is also similar except that the weighting function is hand-designed.
Table 2. Classification results on ModelNet40.

Method                                     Mean Class Accuracy (%)   Overall Accuracy (%)
3DShapeNets (Wu et al. 2015)               77.3                      84.7
VoxNet (Maturana and Scherer 2015)         83.0                      85.9
Subvolume (Qi et al. 2016)                 86.0                      89.2
VRN (single view) (Brock et al. 2016)      88.98                     —
VRN (multiple views) (Brock et al. 2016)   91.33                     —
ECC (Simonovsky and Komodakis 2017)        83.2                      87.4
PointNet (Qi et al. 2017b)                 86.0                      89.2
PointNet++ (Qi et al. 2017c)               —                         90.7
Kd-Net (Klokov and Lempitsky 2017)         —                         90.6
PointCNN (Li et al. 2018a)                 88.1                      92.2
PCNN (Atzmon et al. 2018)                  —                         92.3
Ours (baseline)                            88.9                      91.7
Ours                                       90.2                      92.9
Ours (2,048 points)                        90.7                      93.5

Table 4. Ablation study of our model. CENT denotes centralization, DYN denotes dynamical graph recomputation, and MPOINTS denotes experiments with 2,048 points.

CENT   DYN   MPOINTS   Mean Class Accuracy (%)   Overall Accuracy (%)
                       88.9                      91.7
x                      89.3                      92.2
x      x               90.2                      92.9
x      x     x         90.7                      93.5

Table 5. Results of our model with different numbers of nearest neighbors.

Number of nearest neighbors (k)   Mean Class Accuracy (%)   Overall Accuracy (%)
5                                 88.0                      90.5
10                                88.9                      91.4
20                                90.2                      92.9
40                                89.4                      92.4
Fig. 5. Left: Results of our model tested with random input dropout. The model is trained with the number of points being 1,024 and k being 20. Right: Point clouds with different numbers of points. The numbers of points are shown below the bottom row.

Fig. 7. Comparison of part segmentation results. For each set, from left to right: PointNet, ours, and ground truth.

…density. Note that PCNN (Atzmon et al. 2018) uses additional augmentation techniques like randomly sampling 1,024 points out of 1,200 points during both training and testing.
Table 6. Part segmentation results on ShapeNetPart. Metric is mIoU (%) on points.

                 mean   aero   bag    cap    car    chair  earphone  guitar  knife  lamp   laptop  motor  mug    pistol  rocket  skateboard  table
# shapes                2690   76     55     898    3758   69        787     392    1547   451     202    184    283     66      152         5271
PointNet         83.7   83.4   78.7   82.5   74.9   89.6   73.0      91.5    85.9   80.8   95.3    65.2   93.0   81.2    57.9    72.8        80.6
PointNet++       85.1   82.4   79.0   87.7   77.3   90.8   71.8      91.0    85.9   83.7   95.3    71.6   94.1   81.3    58.7    76.4        82.6
Kd-Net           82.3   80.1   74.6   74.3   70.3   88.6   73.5      90.2    87.2   81.0   94.9    57.4   86.7   78.1    51.8    69.9        80.3
LocalFeatureNet  84.3   86.1   73.0   54.9   77.4   88.8   55.0      90.6    86.5   75.2   96.1    57.3   91.7   83.1    53.9    72.5        83.8
PCNN             85.1   82.4   80.1   85.5   79.5   90.8   73.2      91.3    86.0   85.0   95.7    73.2   94.8   83.3    51.0    75.0        81.8
PointCNN         86.1   84.1   86.45  86.0   80.8   90.6   79.7      92.3    88.4   85.3   96.1    77.2   95.3   84.2    64.2    80.0        83.0
Ours             85.2   84.0   83.4   86.7   77.8   90.6   74.7      91.2    87.5   82.8   95.7    66.3   94.9   81.1    63.5    74.5        82.6
Fig. 9. Left: The mean IoU (%) improves as the ratio of kept points increases. Points are dropped from one of six sides (top, bottom, left, right, front, and back) randomly during the evaluation process. Right: Part segmentation results on partial data. Points in each row are dropped from the same side. The keep ratio is shown below the bottom row. Note that the segmentation results for the turbines improve as more points are included.
Table 7. 3D semantic segmentation results on S3DIS. MS+CU denotes multi-scale block features with consolidation units; G+RCU denotes grid-blocks with recurrent consolidation units.

Method                                  Mean IoU (%)   Overall Accuracy (%)
PointNet (baseline) (Qi et al. 2017b)   20.1           53.2
PointNet (Qi et al. 2017b)              47.6           78.5
MS + CU(2) (Engelmann et al. 2017)      47.8           79.2
G + RCU (Engelmann et al. 2017)         49.7           81.1
PointCNN (Li et al. 2018a)              65.39          —
Ours                                    56.1           84.1
Fig. 10. Semantic segmentation results. From left to right: PointNet, ours, ground truth, and point cloud with original color. Notice that our model outputs smoother segmentation results; for example, the wall (cyan) in the top two rows and the chairs (red) and columns (magenta) in the bottom two rows.
3D shapes from 16 object categories, annotated with 50 parts in total. We sampled 2,048 points from each training shape, and most sampled point sets are labeled with fewer than six parts. We follow the official train/validation/test split scheme of Chang et al. (2015) in our experiments.

Architecture. The network architecture is illustrated in Figure 3 (bottom branch). After a spatial transformer network, three EdgeConv layers are used. A shared fully connected layer (1,024) aggregates information from the previous layers. Shortcut connections are used to include all the EdgeConv outputs as local feature descriptors. At last, three shared fully connected layers (256, 256, 128) are used to transform the pointwise features. Batch norm, dropout, and ReLU are included in a similar fashion to our classification network.

Training. The same training setting as in our classification task is adopted. A distributed training scheme is further implemented on two NVIDIA TITAN X GPUs to maintain the training batch size.

Results. We use Intersection-over-Union (IoU) on points to evaluate our model and compare with other benchmarks. We follow the same evaluation scheme as PointNet: The IoU of a shape is computed by averaging the IoUs of the different parts occurring in that shape, and the IoU of a category is obtained by averaging the IoUs of all the shapes belonging to that category. The mean IoU (mIoU) is finally calculated by averaging the IoUs of all the testing shapes. We compare our results with PointNet (Qi et al. 2017b), PointNet++ (Qi et al. 2017c), Kd-Net (Klokov and Lempitsky 2017), LocalFeatureNet (Shen et al. 2017), PCNN (Atzmon et al. 2018), and PointCNN (Li et al. 2018a). The evaluation results are shown in Table 6. We also visually compare the results of our model and PointNet in Figure 7. More examples are shown in Figure 6.
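Spelled out, this evaluation protocol reads as follows. The sketch mirrors the PointNet scheme as we understand it; treating a part absent from both prediction and ground truth as IoU = 1 is a convention of the PointNet evaluation code, stated here as an assumption.

```python
import numpy as np

def shape_iou(pred, gt, part_ids):
    # pred, gt: (num_points,) integer part labels for one shape.
    # part_ids: the part labels belonging to this shape's category.
    ious = []
    for p in part_ids:
        inter = np.sum((pred == p) & (gt == p))
        union = np.sum((pred == p) | (gt == p))
        ious.append(1.0 if union == 0 else inter / union)  # absent part -> 1
    return np.mean(ious)

def category_iou(shapes):
    # shapes: list of (pred, gt, part_ids) tuples for one category.
    return np.mean([shape_iou(p, g, ids) for p, g, ids in shapes])
```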
Intra-cloud Distances. We next explore the relationships between different point clouds captured using our features. As shown in Figure 8, we take one red point from a source point cloud and compute its distance in feature space to points in other point clouds from the same category. An interesting finding is that although the points come from different sources, they are close to each other if they belong to semantically similar parts. For this experiment, we evaluate the features after the third layer of our segmentation model.

Segmentation on Partial Data. Our model is robust to partial data. We simulate an environment in which part of the shape is dropped from one of six sides (top, bottom, right, left, front, and back) with different percentages. The results are shown in Figure 9. On the left, the mean IoU versus the "keep ratio" is shown. On the right, the results for an airplane model are visualized.
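A plausible reading of this protocol, as a sketch (sorting along an axis and keeping a fraction of the coordinates is our assumption about how a "side" is dropped):

```python
import numpy as np

def drop_side(points, keep_ratio, axis=0, from_max=True):
    """Simulate partial data: drop points from one side of the shape.
    Six sides correspond to the three axes times two directions."""
    order = np.argsort(points[:, axis])      # sort along the chosen axis
    if from_max:
        order = order[::-1]                  # keep the high-coordinate side
    kept = order[: int(keep_ratio * len(points))]
    return points[kept]

# Example: keep 75% of the points, dropping from the top (largest z).
# partial = drop_side(cloud, keep_ratio=0.75, axis=2, from_max=False)
```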
4.5 Indoor Scene Segmentation
Data. We evaluate our model on the Stanford Large-Scale 3D Indoor Spaces Dataset (S3DIS) (Armeni et al. 2016) for a semantic scene segmentation task. This dataset includes 3D scan point clouds for 6 indoor areas comprising 272 rooms in total. Each point belongs to one of 13 semantic categories—e.g., board, bookcase, chair, ceiling, and beam—plus clutter. We follow the same setting as Qi et al. (2017b), where each room is split into blocks of area 1m × 1m, and each point is represented as a 9D vector (XYZ, RGB, and normalized spatial coordinates). We sampled 4,096 points for each block during the training process, and all points are used for testing. We also use the same sixfold cross validation over the six areas, and the average evaluation results are reported.
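A sketch of this block preparation (the block size, 9D features, and 4,096-point sampling are from the text above; normalizing by the room's bounding box is our assumption about the "normalized spatial coordinates"):

```python
import numpy as np

def make_blocks(points, colors, num_samples=4096, block=1.0):
    # points: (N, 3) XYZ of one room; colors: (N, 3) RGB in [0, 1].
    room_min, room_max = points.min(0), points.max(0)
    for x0 in np.arange(room_min[0], room_max[0], block):
        for y0 in np.arange(room_min[1], room_max[1], block):
            mask = ((points[:, 0] >= x0) & (points[:, 0] < x0 + block) &
                    (points[:, 1] >= y0) & (points[:, 1] < y0 + block))
            idx = np.where(mask)[0]
            if idx.size == 0:
                continue
            idx = np.random.choice(idx, num_samples,
                                   replace=idx.size < num_samples)
            xyz, rgb = points[idx], colors[idx]
            # 9D vector per point: XYZ, RGB, and coordinates normalized
            # to [0, 1] with respect to the whole room.
            norm = (xyz - room_min) / np.maximum(room_max - room_min, 1e-6)
            yield np.concatenate([xyz, rgb, norm], axis=1)  # (num_samples, 9)
```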
The model used for this task is similar to the part segmentation model, except that a probability distribution over semantic object classes is generated for each input point and no categorical vector is used here. We compare our model with both PointNet (Qi et al. 2017b) and the PointNet baseline, where additional point features (local point density, local curvature, and normal) are used to construct handcrafted features and then fed to an MLP classifier. We further compare our work with Engelmann et al. (2017) and PointCNN (Li et al. 2018a). Engelmann et al. (2017) present network architectures to enlarge the receptive field over the 3D scene. Two different approaches are proposed in their work: MS+CU for multi-scale block features with consolidation units, and G+RCU for grid-blocks with recurrent consolidation units. We report evaluation results in Table 7 and visually compare the results of PointNet and our model in Figure 10.

5 DISCUSSION
In this work, we propose a new operator for learning on point clouds and show its performance on various tasks. Our model suggests that local geometric features are important to 3D recognition tasks, even after introducing machinery from deep learning.

While our architectures easily can be incorporated as-is into existing pipelines for point cloud-based graphics, learning, and vision, our experiments also indicate several avenues for future research and extension. Some details of our implementation could be revised and/or re-engineered to improve efficiency or scalability, e.g., incorporating fast data structures rather than computing pairwise distances to evaluate k-nearest neighbor queries. We also could consider higher-order relationships between larger tuples of points, rather than considering them pairwise. Another possible extension is to design a non-shared transformer network that works on each local patch differently, adding flexibility to our model.

Our experiments suggest that intrinsic features can be equally valuable if not more valuable than point coordinates; developing a practical and theoretically justified framework for balancing intrinsic and extrinsic considerations in a learning pipeline will require insight from theory and practice in geometry processing. Given this, we will consider applications of our techniques to more abstract point clouds coming from applications like document retrieval and image processing rather than 3D geometry; beyond broadening the applicability of our technique, these experiments will provide insight into the role of geometry in abstract data processing.

REFERENCES
Iro Armeni, Ozan Sener, Amir R. Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, and Silvio Savarese. 2016. 3D semantic parsing of large-scale indoor spaces. In Proceedings of the CVPR.
Matan Atzmon, Haggai Maron, and Yaron Lipman. 2018. Point convolutional neural networks by extension operators. ACM Trans. Graph. 37, 4, Article 71 (July 2018), 12 pages. DOI: https://doi.org/10.1145/3197517.3201301
Mathieu Aubry, Ulrich Schlickewei, and Daniel Cremers. 2011. The wave kernel signature: A quantum mechanical approach to shape analysis. In Proceedings of the ICCV Workshops.
Serge Belongie, Jitendra Malik, and Jan Puzicha. 2001. Shape context: A new descriptor for shape matching and object recognition. In Proceedings of the NIPS.
Silvia Biasotti, Andrea Cerri, A. Bronstein, and M. Bronstein. 2016. Recent trends, applications, and perspectives in 3D shape similarity assessment. Comput. Graph. Forum 35, 6 (2016), 87–119.
Davide Boscaini, Jonathan Masci, Emanuele Rodolà, and Michael Bronstein. 2016. Learning shape correspondence with anisotropic convolutional neural networks. In Proceedings of the NIPS.
Andrew Brock, Theodore Lim, James Millar Ritchie, and Nicholas J. Weston. 2016. Generative and discriminative voxel modeling with convolutional neural networks. In Proceedings of the NIPS.
Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, and Pierre Vandergheynst. 2017. Geometric deep learning: Going beyond Euclidean data. IEEE Signal Process. Mag. 34, 4 (2017), 18–42.
Michael M. Bronstein and Iasonas Kokkinos. 2010. Scale-invariant heat kernel signatures for non-rigid shape recognition. In Proceedings of the CVPR.
Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. 2013. Spectral networks and locally connected networks on graphs. arXiv:1312.6203 (2013).
Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. 2015. ShapeNet: An information-rich 3D model repository. arXiv:1512.03012 (2015).
Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. 2016. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of the NIPS.
Francis Engelmann, Theodora Kontogianni, Alexander Hermans, and Bastian Leibe. 2017. Exploring spatial context for 3D semantic segmentation of point clouds. In Proceedings of the CVPR.
Danielle Ezuz, Justin Solomon, Vladimir G. Kim, and Mirela Ben-Chen. 2017. GWCNN: A metric alignment layer for deep shape analysis. Comput. Graph. Forum 36, 5 (2017), 49–57.
Haoqiang Fan, Hao Su, and Leonidas J. Guibas. 2017. A point set generation network for 3D object reconstruction from a single image. In Proceedings of the CVPR.
Matthias Fey, Jan Eric Lenssen, Frank Weichert, and Heinrich Müller. 2018. SplineCNN: Fast geometric deep learning with continuous B-spline kernels. In Proceedings of the CVPR.
Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. 2017. Neural message passing for quantum chemistry. arXiv:1704.01212 (2017).
Aleksey Golovinskiy, Vladimir G. Kim, and Thomas Funkhouser. 2009. Shape-based recognition of 3D point clouds in urban environments. In Proceedings of the ICCV.
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Proceedings of the NIPS.
Paul Guerrero, Yanir Kleiman, Maks Ovsjanikov, and Niloy J. Mitra. 2018. PCPNet: Learning local shape properties from raw point clouds. Comput. Graph. Forum 37, 2 (2018), 75–85. DOI: https://doi.org/10.1111/cgf.13343
Yulan Guo, Mohammed Bennamoun, Ferdous Sohel, Min Lu, and Jianwei Wan. 2014. 3D object recognition in cluttered scenes with local surface features: A survey. Trans. PAMI 36, 11 (2014), 2270–2287.
Oshri Halimi, Or Litany, Emanuele Rodolà, Alex Bronstein, and Ron Kimmel. 2018. Self-supervised learning of dense shape correspondence. arXiv:1812.02415 (2018).
M. Henaff, J. Bruna, and Y. LeCun. 2015. Deep convolutional networks on graph-structured data. arXiv:1506.05163 (2015).
Andrew E. Johnson and Martial Hebert. 1999. Using spin images for efficient object recognition in cluttered 3D scenes. Trans. PAMI 21, 5 (1999), 433–449.
Diederik P. Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv:1312.6114 (2013).
Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In Proceedings of the ICLR.
Roman Klokov and Victor Lempitsky. 2017. Escape from cells: Deep Kd-networks for the recognition of 3D point cloud models. In Proceedings of the ICCV.
Ilya Kostrikov, Zhongshi Jiang, Daniele Panozzo, Denis Zorin, and Joan Bruna. 2017. Surface networks. In Proceedings of the CVPR.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet classification with deep convolutional neural networks. In Proceedings of the NIPS.
Yann LeCun, Bernhard Boser, John S. Denker, Donnie Henderson, Richard E. Howard, Wayne Hubbard, and Lawrence D. Jackel. 1989. Backpropagation applied to handwritten ZIP code recognition. Neural Comput. 1, 4 (1989), 541–551.
Ron Levie, Federico Monti, Xavier Bresson, and Michael M. Bronstein. 2017. CayleyNets: Graph convolutional neural networks with complex rational spectral filters. arXiv:1705.07664 (2017).
Chun-Liang Li, Manzil Zaheer, Yang Zhang, Barnabas Poczos, and Ruslan Salakhutdinov. 2018b. Point cloud GAN. arXiv:1810.05795 (2018).
Yangyan Li, Rui Bu, Mingchao Sun, Wei Wu, Xinhan Di, and Baoquan Chen. 2018a. PointCNN: Convolution on X-transformed points. In Advances in Neural Information Processing Systems 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (Eds.). Curran Associates, Inc., 820–830. Retrieved from http://papers.nips.cc/paper/7362-pointcnn-convolution-on-x-transformed-points.pdf.
Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. 2016. Gated graph sequence neural networks. In Proceedings of the ICLR.
Ming Liang, Bin Yang, Shenlong Wang, and Raquel Urtasun. 2018. Deep continuous fusion for multi-sensor 3D object detection. In Proceedings of the ECCV.
Haibin Ling and David W. Jacobs. 2007. Shape classification using the inner-distance. Trans. PAMI 29, 2 (2007), 286–299.
Or Litany, Alex Bronstein, Michael Bronstein, and Ameesh Makadia. 2017a. Deformable shape completion with graph convolutional autoencoders. arXiv:1712.00268 (2017).
Or Litany, Tal Remez, Emanuele Rodolà, Alex M. Bronstein, and Michael M. Bronstein. 2017b. Deep functional maps: Structured prediction for dense shape correspondence. In Proceedings of the ICCV.
I. Loshchilov and F. Hutter. 2017. SGDR: Stochastic gradient descent with warm restarts. In Proceedings of the ICLR.
Min Lu, Yulan Guo, Jun Zhang, Yanxin Ma, and Yinjie Lei. 2014. Recognizing objects in 3D point clouds with multi-scale local features. Sensors 14, 12 (2014), 24156–24173.
Siddharth Manay, Daniel Cremers, Byung-Woo Hong, Anthony J. Yezzi, and Stefano Soatto. 2006. Integral invariants for shape matching. Trans. PAMI 28, 10 (2006), 1602–1618.
Haggai Maron, Meirav Galun, Noam Aigerman, Miri Trope, Nadav Dym, Ersin Yumer, Vladimir G. Kim, and Yaron Lipman. 2017. Convolutional neural networks on surfaces via seamless toric covers. In Proceedings of the SIGGRAPH.
Jonathan Masci, Davide Boscaini, Michael Bronstein, and Pierre Vandergheynst. 2015. Geodesic convolutional neural networks on Riemannian manifolds. In Proceedings of the 3dRR.
Daniel Maturana and Sebastian Scherer. 2015. VoxNet: A 3D convolutional neural network for real-time object recognition. In Proceedings of the IROS.
Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, and Michael M. Bronstein. 2017a. Geometric deep learning on graphs and manifolds using mixture model CNNs. In Proceedings of the CVPR.
F. Monti, M. M. Bronstein, and X. Bresson. 2017b. Geometric matrix completion with recurrent multi-graph neural networks. In Proceedings of the NIPS.
Federico Monti, Karl Otness, and Michael M. Bronstein. 2018. MotifNet: A motif-based graph convolutional network for directed graphs. arXiv:1802.01572 (2018).
Maks Ovsjanikov, Mirela Ben-Chen, Justin Solomon, Adrian Butscher, and Leonidas Guibas. 2012. Functional maps: A flexible representation of maps between shapes. Trans. Graph. 31, 4 (2012), 30.
Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, and Leonidas J. Guibas. 2017a. Frustum PointNets for 3D object detection from RGB-D data. arXiv:1711.08488 (2017).
Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. 2017b. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the CVPR.
Charles R. Qi, Hao Su, Matthias Nießner, Angela Dai, Mengyuan Yan, and Leonidas J. Guibas. 2016. Volumetric and multi-view CNNs for object classification on 3D data. In Proceedings of the CVPR.
Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas. 2017c. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In Proceedings of the NIPS.
Anurag Ranjan, Timo Bolkart, Soubhik Sanyal, and Michael J. Black. 2018. Generating 3D faces using convolutional mesh autoencoders. arXiv:1807.10267 (2018).
Raif M. Rustamov. 2007. Laplace-Beltrami eigenfunctions for deformation invariant shape representation. In Proceedings of the SGP.
Radu Bogdan Rusu, Nico Blodow, and Michael Beetz. 2009. Fast point feature histograms (FPFH) for 3D registration. In Proceedings of the ICRA.
Radu Bogdan Rusu, Nico Blodow, Zoltan Csaba Marton, and Michael Beetz. 2008a. Aligning point cloud views using persistent feature histograms. In Proceedings of the IROS.
Radu Bogdan Rusu, Zoltan Csaba Marton, Nico Blodow, Mihai Dolha, and Michael Beetz. 2008b. Towards 3D point cloud-based object maps for household environments. Robot. Auton. Syst. J. 56, 11 (Nov. 2008), 927–941.
Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. 2009. The graph neural network model. IEEE Trans. Neural Networks 20, 1 (2009), 61–80.
Syed Afaq Ali Shah, Mohammed Bennamoun, Farid Boussaid, and Amar A. El-Sallam. 2013. 3D-Div: A novel local surface descriptor for feature matching and pairwise range image registration. In Proceedings of the ICIP.
Yiru Shen, Chen Feng, Yaoqing Yang, and Dong Tian. 2017. Neighbors do help: Deeply exploiting local structures of point clouds. arXiv:1712.06760 (2017).
David I. Shuman, Sunil K. Narang, Pascal Frossard, Antonio Ortega, and Pierre Vandergheynst. 2013. The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30, 3 (2013), 83–98.
Martin Simonovsky and Nikos Komodakis. 2017. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In Proceedings of the CVPR.
Ayan Sinha, Jing Bai, and Karthik Ramani. 2016. Deep learning 3D shape surfaces using geometry images. In Proceedings of the ECCV.
Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, and Jan Kautz. 2018. SPLATNet: Sparse lattice networks for point cloud processing. In Proceedings of the CVPR. 2530–2539.
Hang Su, Subhransu Maji, Evangelos Kalogerakis, and Erik Learned-Miller. 2015. Multi-view convolutional neural networks for 3D shape recognition. In Proceedings of the CVPR.
Jian Sun, Maks Ovsjanikov, and Leonidas Guibas. 2009. A concise and provably informative multi-scale signature based on heat diffusion. Comput. Graph. Forum 28, 5 (2009), 1383–1392.
Maxim Tatarchenko, Alexey Dosovitskiy, and Thomas Brox. 2017. Octree generating networks: Efficient convolutional architectures for high-resolution 3D outputs. In Proceedings of the ICCV.
Federico Tombari, Samuele Salti, and Luigi Di Stefano. 2011. A combined texture-shape descriptor for enhanced 3D feature matching. In Proceedings of the ICIP.
Oliver Van Kaick, Hao Zhang, Ghassan Hamarneh, and Daniel Cohen-Or. 2011. A survey on shape correspondence. Comput. Graph. Forum 30, 6 (2011), 1681–1707.
Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2017. Graph attention networks. arXiv:1710.10903 (2017).
Shenlong Wang, Simon Suo, Wei-Chiu Ma, Andrei Pokrovsky, and Raquel Urtasun. 2018b. Deep parametric continuous convolutional neural networks. In Proceedings of the CVPR.
Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. 2018a. Non-local neural networks. In Proceedings of the CVPR.
Lingyu Wei, Qixing Huang, Duygu Ceylan, Etienne Vouga, and Hao Li. 2016. Dense human body correspondences using convolutional networks. In Proceedings of the CVPR.
Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 2015. 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the CVPR.
Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan Yuille, and Kaiming He. 2018. Feature denoising for improving adversarial robustness. arXiv:1812.03411 (2018).
Yaoqing Yang, Chen Feng, Yiru Shen, and Dong Tian. 2018. FoldingNet: Point cloud auto-encoder via deep grid deformation. In Proceedings of the CVPR.
Li Yi, Vladimir G. Kim, Duygu Ceylan, I. Shen, Mengyan Yan, Hao Su, Cewu Lu, Qixing Huang, Alla Sheffer, Leonidas Guibas, et al. 2016. A scalable active framework for region annotation in 3D shape collections. Trans. Graph. 35, 6 (2016), 210.
Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J. Lim, Abhinav Gupta, Li Fei-Fei, and Ali Farhadi. 2017. Target-driven visual navigation in indoor scenes using deep reinforcement learning. In Proceedings of the ICRA.

Received January 2019; revised May 2019; accepted June 2019