
A Generative Modelling Technique for 3D Reconstruction from a Single 2D Image

Saurabh Kumar Singh
Dept. of Computer Science & Engineering
IIT (BHU), Varanasi
Varanasi, India
[email protected]

Shrey Tanna
Dept. of Computer Science & Engineering
IIT (BHU), Varanasi
Varanasi, India
[email protected]

Abstract—3D object reconstruction is the task of predicting the 3D model of an object given a set of 2D images. In this paper, we propose an approach to solving this problem given a single 2D image, making use of several deep learning techniques. Our model consists of two parts. The first part generates multiple images of the object from different viewpoints. We include this part because reconstructing a 3D object directly from a single 2D image is quite difficult, whereas the same task is much easier given multiple images that capture different views of the same object; likewise, predicting an image from a different viewpoint is much easier than predicting the whole 3D object from an input image. The second part uses a network consisting of an Encoder, a Decoder (or Generator), and a Discriminator to predict the complete 3D voxel grid of the object. In this way, we achieve significant improvements over existing techniques.

Index Terms—Reconstruction, GANs, CNNs, Neural networks, Voxel

I. INTRODUCTION
The rapid development of fields like robotics, design, virtual reality, and medical imaging requires a crucial understanding of the shapes of objects, and hence 3D understanding of objects has evolved into an interesting problem for computer vision researchers in its own right. As computer storage capacity increases, the amount of visual data increases proportionally, and studying the shape features of this data has become a pressing concern. The problem would be far easier if we could somehow extract a 3D model from the data. This creation of a 3D model from a given set of images is called 3D reconstruction. Let us understand this with an example. Say there is a bus standing on the road, and we take a picture of it using a camera. The picture we get is a 2D image. If we are given this photo and asked to predict its 3D model, the task is quite tough. It can be seen that 3D reconstruction is just the reverse of taking a 2D photo of a 3D scene. When we photograph an object, we only get its projection onto a given plane, and the loss of most of the depth information makes the problem challenging. A human asked to predict the 3D counterpart of a given image can manage it because he or she has seen several similar pictures over a lifetime. This hints that a machine, too, can be trained with lots of visual data to accomplish the task. Since a given point on a 2D image may correspond to any depth in the 3D model, it is very difficult to predict the depth using only one image; however, the position of the point can be found as the intersection of projection rays provided we have two images of that object.

In this paper, we present a two-stage approach. The first stage focuses on generating several images with different viewpoints from a single image, using convolutional neural networks. The second stage constructs a 3D model from these multiple-view images using generative adversarial networks. This paper presents an improvement over existing deep learning methods [1], [2], which is well reflected in our results.

The organization of the paper is as follows: Section II presents a summary of the literature survey, covering methods for multiple-view image generation as well as non-neural-network-based and neural-network-based methods of 3D reconstruction. The proposed approach is discussed in Section III. The experimental analysis and results are discussed in Section IV. Finally, the paper is concluded in Section V.

II. RELATED WORK

A. Multiple views prediction from a single image

Several approaches have been studied for obtaining unseen views of a given image using image transformation or neural network methods. Transforming autoencoders [3] show how neural networks deal with variations in the orientation, scale, position, and lighting of an image. The Deep Convolutional Inverse Graphics Network (DC-IGN) [4] can generate images of the same object with a different pose, and similarly with different lighting, given an input image. It consists of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm [5]. Such tasks are particularly popular in the field of face recognition and manipulation, as factors like identity, view, and illumination are coupled in face images, and disentangling them is a major challenge. As an example, the Multi-View Perceptron (MVP) [6] disentangles the face identity and predicts features under different viewpoints. Another similar problem is comparing two images taken from different views, which turns out to be quite challenging because features are not stable under large viewpoint changes. [7] solves this problem by synthesizing the features for other views with the help of a 3D model collection of related objects, based on which feature sets for the two images are created and compared.

To render rotated objects from a single image, [8] trains a network end to end: given an image, it produces the view from a viewpoint differing by a fixed angle. A discrete set of views of the given image may therefore be generated by applying the network repeatedly, although this may lead to error accumulation for large angles. In contrast, [2] can generate views while varying the angle continuously.

B. Neural Network Based 3D Reconstruction

1) RNN-based 3D object reconstruction: Recurrent neural networks have been used for reconstructing a 3D object from both single-image and multi-image input. The paper [9] proposes a novel recurrent neural network architecture, called the 3D Recurrent Reconstruction Neural Network (3D-R2N2), for 3D object reconstruction. This network can be fed one image or multiple images, taken from arbitrarily oriented viewpoints, and the procedure needs no pre-processing elements such as segmentation, viewpoint labels, or keypoints. The method uses recurrent neural networks like LSTMs and GRUs, as well as convolutional neural networks.

The network performs object reconstruction from single- or multi-view images and consists of three components: a 2D Convolutional Neural Network (2D-CNN), a 3D Convolutional LSTM (3D-LSTM), and a 3D Deconvolutional Neural Network (3D-DCNN).

2) CNN-based 3D object reconstruction: Convolutional neural networks are used in most computer vision problems; here we discuss some methods that use only CNNs for 3D reconstruction. The paper [10] explores the point set representation of a 3D object, predicted from a single image of that object. A point cloud is chosen as the output representation because it is simple and uniform and allows geometric transformations and deformations with little manipulation. Also, in cases where the ground truth is ambiguous, this representation provides the best output. To be capable of this, the model analyzes the visible parts of the object with the help of the input image and then makes an informed guess about the rest.

The network outputs a point set representation of a 3D object given a single image of that object as input. The 3D shape of the object is represented as an unordered point set S = \{(x_i, y_i, z_i)\}_{i=1}^{N}, where N is a constant, usually taken as 1024. Such a representation admits geometric transformations (like rotation, scaling, translation, or their combinations) by simple matrix algebra, as the sketch below illustrates. There is an encoder and a decoder (or predictor) in the network.
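As a quick illustration of that matrix-algebra convenience (our own sketch, not code from [10]), an N x 3 point set can be rotated, scaled, and translated in a single expression:

```python
import numpy as np

def transform_points(points, rotation, scale=1.0, translation=(0.0, 0.0, 0.0)):
    """Rotate, scale, and translate an (N, 3) point set in one matrix expression."""
    return scale * points @ rotation.T + np.asarray(translation)

# Example: rotate a random stand-in point set (N = 1024) by 30 degrees about z.
theta = np.deg2rad(30.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
S = np.random.rand(1024, 3)
S_transformed = transform_points(S, Rz, scale=0.5, translation=(1.0, 0.0, 0.0))
```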
3) GAN based 3D Reconstruction: Generative adversarial networks [11] have several advantages over other methods: they can produce high-quality, realistic objects and outperform several other methods. Modified versions of GANs are also available, as mentioned below. WGAN and IWGAN [12] describe a method for stable training of GANs via an alternative training procedure, which includes weight clipping or enforcing a Lipschitz constraint. DCGAN [13] uses convolutional networks for implementing GANs. Stacked GAN [14] is trained to invert the hierarchical representations of a bottom-up discriminative network and is able to generate images of much higher quality than GANs without stacking. Further methods for improving GANs are described in [15], [16], and [17]. [1] presents an approach for generating a 3D object from a probabilistic space using GANs and CNNs. [18] also achieves this, but uses IWGAN. MarrNet [19] feeds 2.5D sketches such as a normal map, a depth map, and a silhouette map to the network in addition to the single input image in order to improve on 3D-GAN [1]. Most of this 3D work uses the ShapeNet dataset [20].

[21] trained generative up-convolutional neural networks that are able to generate images of objects given the object style, viewpoint, and color. [22] presented a method for joint analysis and synthesis of geometrically diverse 3D shape families. [23], [24] proposed to learn a joint embedding of 3D shapes and synthesized images; [25], [26] focused on learning discriminative representations for 3D object recognition; [27], [28], [29] discussed 3D object reconstruction from in-the-wild images, possibly with a recurrent network; and [24], [30] explored autoencoder-based networks for learning voxel-based object representations. [8] proposed a novel recurrent convolutional encoder-decoder network that is trained end to end on the task of rendering rotated objects starting from a single image.

C. Non-neural Network Based 3D Reconstruction

Many approaches are also available that do not involve any neural network techniques.

[31] presented an automated pipeline that takes pixels as information sources and produces 3D surfaces of different object categories from pictures of scenes. Their methodology uses deformable 3D models that can be learned from the 2D annotations available in existing object detection datasets. [32] proposed a two-step method: initially, they employ orthogonal matching pursuit to choose the single CAD model in the dictionary closest to the projected image; finally, they use their graph embedding based on local dense correspondence to allow for sparse linear combinations of the CAD models.

III. PROPOSED APPROACH

In this section, we introduce our approach to solving the 3D reconstruction problem, which divides naturally into the two parts discussed in the following sections.

A. Multiple view images generation step

The first part of the approach is to generate multiple view images for the given image. Several images with different viewpoints are produced using the multi-view 3D (mv3D) CNN network inspired by the paper [2]. This network takes an image as well as a viewpoint vector as input and outputs the required image. The viewpoint is described using a vector containing five elements: the sine and cosine of the azimuthal angle, the sine and cosine of the elevation angle, and the distance from the object center, as in the sketch below.
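To make this parameterization concrete, here is a minimal sketch of how such a five-element vector could be assembled; the helper name and the example angles are ours, not taken from [2]:

```python
import numpy as np

def viewpoint_vector(azimuth_deg, elevation_deg, distance):
    """(sin az, cos az, sin el, cos el, distance) encoding of a viewpoint."""
    az, el = np.deg2rad(azimuth_deg), np.deg2rad(elevation_deg)
    return np.array([np.sin(az), np.cos(az), np.sin(el), np.cos(el), distance],
                    dtype=np.float32)

# Eleven viewpoints around the object, as used later in this section; the
# uniform azimuth spacing and fixed elevation/distance are illustrative choices.
views = [viewpoint_vector(az, 20.0, 2.0) for az in np.linspace(0.0, 300.0, 11)]
```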

Fig. 1. First part of the network, showing multi-view images as the output of the mv3D network, given a single input image along with different viewpoint vectors.

The network used is an encoder-decoder network. A pair of an image and a viewpoint vector (x_i, v_i) is input to the network, and the network aims to predict the required image (with ground truth f_i). The encoder consists of five convolutional layers (stride 2) followed by a fully connected layer at the end. The viewpoint vector is processed independently by fully connected layers, and the result is merged with the encoder output. The decoder then takes the resulting vector through five deconvolutional layers. Deconvolution here is the reverse of convolution: each pixel is replaced with a 2 × 2 pixel window, with the original pixel value in the top-left position and the others filled with zero, as sketched below.
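The following PyTorch sketch shows the overall shape of such an encoder-decoder, including the zero-fill "deconvolution" just described. The layer widths and hidden sizes are our guesses for illustration; [2] should be consulted for the exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def zero_fill_unpool(x):
    """Each pixel becomes a 2x2 window: original value top-left, zeros elsewhere."""
    n, c, h, w = x.shape
    out = x.new_zeros(n, c, h * 2, w * 2)
    out[:, :, ::2, ::2] = x
    return out

class MV3DSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: five stride-2 convolutions, then a fully connected layer.
        ch = [3, 32, 64, 128, 256, 512]                       # widths are guesses
        self.enc = nn.ModuleList(
            nn.Conv2d(ch[i], ch[i + 1], 3, stride=2, padding=1) for i in range(5))
        self.enc_fc = nn.Linear(512 * 8 * 8, 1024)            # assumes 256x256 input
        # The viewpoint vector is processed independently, then merged.
        self.view_fc = nn.Sequential(nn.Linear(5, 64), nn.ReLU(), nn.Linear(64, 64))
        self.merge_fc = nn.Linear(1024 + 64, 512 * 8 * 8)
        # Decoder: five unpool-plus-convolution stages back to image resolution.
        dch = [512, 256, 128, 64, 32, 3]
        self.dec = nn.ModuleList(
            nn.Conv2d(dch[i], dch[i + 1], 3, padding=1) for i in range(5))

    def forward(self, image, viewpoint):
        h = image
        for conv in self.enc:
            h = F.relu(conv(h))
        h = F.relu(self.enc_fc(h.flatten(1)))
        v = self.view_fc(viewpoint)
        h = F.relu(self.merge_fc(torch.cat([h, v], dim=1))).view(-1, 512, 8, 8)
        for conv in self.dec:
            h = conv(zero_fill_unpool(h))                     # "deconvolution" step
        return h                                              # (N, 3, 256, 256)
```

Training this sketch against the target view with the loss of Eq. (1) below then amounts to a sum-reduced squared error between the predicted and ground-truth images.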
To train the network, any pair of snapshots from the 3D model of some object, along with the angle information, is taken as an input-output pair. Training is done by minimizing the squared Euclidean loss on the required image:

L = \sum_i \| f_i - \hat{f}_i \|_2^2    (1)

where \hat{f}_i is the output image and f_i is the ground-truth image for input image x_i and viewpoint vector v_i.

For a given image x_i, we make 11 pairs of inputs (x_i, v_{i1}), (x_i, v_{i2}), ..., (x_i, v_{i11}) using 11 different viewpoint vectors v_{i1}, v_{i2}, ..., v_{i11}. As can be seen in Fig. 1, 11 images with different viewpoints are produced in our implementation; however, this number can be changed as appropriate. Now, the original image x_i, along with the 11 produced images \hat{f}_{i1}, \hat{f}_{i2}, ..., \hat{f}_{i11}, is taken as a whole to form a 256 × 256 × 12 block by keeping only the L-channel of each image (a sketch of this assembly follows). This block is then taken as the input to a 3D-GAN network, which constitutes the second part of our solution, described in the next section.
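A minimal sketch of this block assembly, assuming the L-channel refers to CIELAB lightness as computed by scikit-image (the helper name and channel order are ours):

```python
import numpy as np
from skimage.color import rgb2lab

def make_input_block(original, generated_views):
    """Stack the L-channels of the original image and its 11 generated views
    into a 256 x 256 x 12 block."""
    images = [original] + list(generated_views)       # 12 RGB arrays, 256 x 256 x 3
    l_channels = [rgb2lab(img)[:, :, 0] for img in images]
    return np.stack(l_channels, axis=-1).astype(np.float32)

# Example with random stand-in images:
orig = np.random.rand(256, 256, 3)
views = [np.random.rand(256, 256, 3) for _ in range(11)]
block = make_input_block(orig, views)                 # shape (256, 256, 12)
```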
B. 3D model construction step

In this section, we discuss the second part of the network, which takes the output of the first part as input and outputs the required 3D voxel grid.

The network consists of three components: an Encoder (E), a Generator (G), and a Discriminator (D). The encoder takes the 12 × 256 × 256 block (the set of 12 images' L-channels) as input. It consists of five 3D convolutional layers whose numbers of channels, kernel sizes, and strides are, respectively, (64, 11, 4), (128, 5, 2), (256, 5, 2), (512, 5, 2), and (400, 8, 1). Finally, we split the output of the last layer (a 400-dimensional vector) into two halves forming a mean vector and a standard deviation vector, from which a 200-dimensional vector is sampled. This vector is used as the input to the generator. The generator (or decoder) consists of three 3D convolutional layers with channels, kernel sizes, and strides of (128, 10, 2), (64, 4, 2), and (1, 4, 2), respectively. It outputs a 20 × 20 × 20 voxel grid, which is taken as input by the discriminator to check whether it is real or fake by estimating a score against the ground truth. The discriminator consists of three 3D convolutional layers with channels, kernel sizes, and strides of (64, 4, 2), (128, 4, 2), and (2, 2, 1), respectively. It outputs a value between 0 and 1, denoting the probability of the 3D voxel grid being real or fake. We also add a batch normalization layer and a leaky ReLU layer (slope 0.2) between the convolutional layers throughout; a runnable sketch of the three components follows.
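The sketch below mirrors the layer specifications just listed. The paper does not state padding values, how the 400-dimensional output is reduced, or that the generator layers are transposed convolutions, so those details (the paddings, the (1, 8, 8) final encoder kernel, the reparameterized sampling, and the discriminator head) are our reconstructions, chosen so the tensor shapes work out to the stated 20 × 20 × 20 grid.

```python
import torch
import torch.nn as nn

def block3d(cin, cout, k, s, p):
    # Conv3d + batch norm + leaky ReLU(0.2), as described in the text.
    return nn.Sequential(nn.Conv3d(cin, cout, k, stride=s, padding=p),
                         nn.BatchNorm3d(cout), nn.LeakyReLU(0.2))

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            block3d(1, 64, 11, 4, 5), block3d(64, 128, 5, 2, 2),
            block3d(128, 256, 5, 2, 2), block3d(256, 512, 5, 2, 2),
            nn.Conv3d(512, 400, (1, 8, 8)))          # depth axis is already 1 here

    def forward(self, x):                            # x: (N, 1, 12, 256, 256)
        h = self.net(x).flatten(1)                   # (N, 400)
        mu, log_sigma = h[:, :200], h[:, 200:]       # mean and (log) std halves
        z = mu + torch.exp(log_sigma) * torch.randn_like(mu)   # 200-dim sample
        return z, mu, log_sigma

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose3d(200, 128, 10, stride=2, padding=2),
            nn.BatchNorm3d(128), nn.LeakyReLU(0.2),  # -> (128, 6, 6, 6)
            nn.ConvTranspose3d(128, 64, 4, stride=2, padding=2),
            nn.BatchNorm3d(64), nn.LeakyReLU(0.2),   # -> (64, 10, 10, 10)
            nn.ConvTranspose3d(64, 1, 4, stride=2, padding=1),
            nn.Sigmoid())                            # -> (1, 20, 20, 20) occupancies

    def forward(self, z):                            # z: (N, 200)
        return self.net(z.view(-1, 200, 1, 1, 1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(block3d(1, 64, 4, 2, 1),     # 20^3 -> 10^3
                                 block3d(64, 128, 4, 2, 1),   # 10^3 -> 5^3
                                 nn.Conv3d(128, 2, 2))        # -> (2, 4, 4, 4)
        self.head = nn.Sequential(nn.Flatten(),
                                  nn.Linear(2 * 4 * 4 * 4, 1), nn.Sigmoid())

    def forward(self, v):                            # v: (N, 1, 20, 20, 20)
        return self.head(self.net(v))                # real/fake probability
```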
The overall loss function is made up of the following components.

The first is the 3D-GAN loss, which is a minimax loss: it involves updating the two components G and D in alternating turns, minimizing the generator loss and maximizing the discriminator objective. The discriminator aims to maximize the probability of assigning the correct label to the training examples and to the output of G, so we can write the loss function for D as:

L_{Disc} = -\mathbb{E}_x[\log D(x)] - \mathbb{E}_z[\log(1 - D(G(z)))]    (2)

where \mathbb{E} is the mathematical expectation operator, x is the 3D voxel grid input to D, and z is the vector (the output of E) input to G. The generator aims to minimize the probability of its samples being labelled fake by D, so we can write the loss function for G as:

L_{Gen} = \mathbb{E}_z[\log(1 - D(G(z)))]    (3)

The second is the Kullback-Leibler divergence loss, which aims to confine the distribution of the output of the encoder E:

L_{KL} = D_{KL}(\mathcal{N}(\mu, \sigma) \| \mathcal{N}(0, I))    (4)

where D_{KL}(P \| Q) is the divergence between P and Q, \mathcal{N}(\mu, \sigma) denotes the normal distribution with mean \mu and standard deviation \sigma (produced by the encoder), and \mathcal{N}(0, I) denotes the standard normal distribution.

The third loss is the 3D model reconstruction loss, which is the squared Euclidean distance between the generator output and the target shape:

L_{3D} = \| y - \hat{y} \|_2^2    (5)

where y is the target 3D shape of an image and \hat{y} is the 3D output of the generator.
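In code, these three components might look as follows (a sketch under the same assumptions as above; binary cross-entropy realizes Eq. (2), and the KL term uses the closed form for a diagonal Gaussian):

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d_real, d_fake):
    # Eq. (2): -E_x[log D(x)] - E_z[log(1 - D(G(z)))], via binary cross-entropy.
    return (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
            + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))

def generator_gan_loss(d_fake):
    # Eq. (3): E_z[log(1 - D(G(z)))], which the generator minimizes.
    return torch.log(1.0 - d_fake + 1e-8).mean()

def kl_loss(mu, log_sigma):
    # Eq. (4): closed-form KL(N(mu, sigma) || N(0, I)) for a diagonal Gaussian.
    return 0.5 * torch.sum(mu ** 2 + torch.exp(2 * log_sigma)
                           - 2 * log_sigma - 1, dim=1).mean()

def reconstruction_loss(voxels_pred, voxels_target):
    # Eq. (5): squared Euclidean distance between generated and target grids.
    return torch.sum((voxels_target - voxels_pred) ** 2, dim=(1, 2, 3, 4)).mean()
```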

Fig. 2. Second part of the network, showing its three components: Encoder, Generator, and Discriminator.

In each iteration of the training step, the appropriate set of loss functions is used for updating each particular component of the network. We have used the ADAM optimizer for training, setting the parameters to α = 0.00008, β1 = 0.5, and β2 = 0.9. We have trained the algorithm for 1200 epochs, keeping the batch size at 256.
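Below is a sketch of one such training iteration, reusing the modules and loss helpers sketched above. The exact assignment of losses to components is not spelled out in the paper, so the split here (D trained on Eq. (2); E and G trained jointly on Eqs. (3)-(5)) is the usual VAE-GAN recipe rather than a confirmed detail:

```python
import torch

E, G, D = Encoder(), Generator(), Discriminator()    # from the sketch above
adam = lambda m: torch.optim.Adam(m.parameters(), lr=8e-5, betas=(0.5, 0.9))
opt_E, opt_G, opt_D = adam(E), adam(G), adam(D)

block = torch.randn(4, 1, 12, 256, 256)   # stand-in batch of L-channel blocks
real = torch.rand(4, 1, 20, 20, 20)       # stand-in ground-truth voxel grids

z, mu, log_sigma = E(block)
fake = G(z)

# Discriminator update, Eq. (2).
loss_d = discriminator_loss(D(real), D(fake.detach()))
opt_D.zero_grad(); loss_d.backward(); opt_D.step()

# Encoder and generator update, Eqs. (3) + (4) + (5).
loss_eg = (generator_gan_loss(D(fake)) + kl_loss(mu, log_sigma)
           + reconstruction_loss(fake, real))
opt_E.zero_grad(); opt_G.zero_grad()
loss_eg.backward()
opt_E.step(); opt_G.step()
```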
IV. EXPERIMENTS

A. Dataset and System Description

The 3D CAD models are downloaded from the ShapeNet [20] website. These models are rendered as images under different illumination levels, sizes, azimuthal angles, and elevation angles. The images are then resized by center-cropping them to 256 × 256 (a sketch of this step is shown below). These images and their corresponding voxel grids are used as the dataset for the task. We have trained the model on the car dataset.

We have used a system with three Graphics Processing Units (GPUs) for training. One of them is an Nvidia Titan Xp with 12 GB RAM, 12196 MB FB memory, and 256 MB BAR1 memory; the other two are Nvidia GeForce GTX 1080 Ti cards with 11 GB RAM, 11178 MB FB memory, and 256 MB BAR1 memory.
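A minimal sketch of the cropping step (our reading of the description; the function name is hypothetical):

```python
from PIL import Image

def center_crop_256(path):
    """Center-crop a rendered image to 256 x 256."""
    img = Image.open(path).convert("RGB")
    w, h = img.size
    left, top = (w - 256) // 2, (h - 256) // 2
    return img.crop((left, top, left + 256, top + 256))
```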
B. Results

We show the qualitative results for 3D reconstruction from a single image in Fig. 3. In each row, the image on the left is the input image, and the other images show two arbitrary views of the corresponding 3D output. As can be seen, the results never miss a major part of the object, even a thin one, which was not the case with some other methods; this is easily seen in the output for the tractor.

V. CONCLUSION AND FUTURE WORK

In this paper, we present an approach that simplifies the problem of reconstructing a 3D object from a single image to that of reconstructing a 3D object from multiple images captured from different angles. For this task, we use the multi-view 3D CNN network inspired by [2]. The actual 3D generation after this step is done by the GAN: the encoder of this GAN takes the L-channels of several multi-view images as its input, and the decoder consists of three 3D convolutional layers, which output a voxel grid.

The approaches proposed so far, including ours, are likely to work only for a single class of images: the network is trained on a given class of objects and gives the desired output only for images taken from that class. Developing a robust solution that works for all classes of images remains a challenging problem.

ACKNOWLEDGEMENTS

This paper and the project would not have been possible without the support of our mentor, Dr. Pratik Chattopadhyay, Assistant Professor, CSE, IIT (BHU), Varanasi. His valuable discussions, insightful suggestions, and knowledge kept our work on track.

REFERENCES

[1] Jiajun Wu, Chengkai Zhang, Tianfan Xue, Bill Freeman, and Josh Tenenbaum. Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling. In Advances in Neural Information Processing Systems, pages 82–90, 2016.
[2] Maxim Tatarchenko, Alexey Dosovitskiy, and Thomas Brox. Multi-view 3d models from single images with a convolutional network. In European Conference on Computer Vision, pages 322–337. Springer, 2016.
[3] Geoffrey E Hinton, Alex Krizhevsky, and Sida D Wang. Transforming auto-encoders. In International Conference on Artificial Neural Networks, pages 44–51. Springer, 2011.
[4] Tejas D Kulkarni, William F Whitney, Pushmeet Kohli, and Josh Tenenbaum. Deep convolutional inverse graphics network. In Advances in Neural Information Processing Systems, pages 2539–2547, 2015.
[5] Diederik P Kingma and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
[6] Zhenyao Zhu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Multi-view perceptron: a deep model for learning face identity and view representations. In Advances in Neural Information Processing Systems, pages 217–225, 2014.
[7] Hao Su, Fan Wang, Li Yi, and Leonidas Guibas. 3d-assisted image feature synthesis for novel views of an object. arXiv preprint arXiv:1412.0003, 2014.
[8] Jimei Yang, Scott E Reed, Ming-Hsuan Yang, and Honglak Lee. Weakly-supervised disentangling with recurrent transformations for 3d view synthesis. In Advances in Neural Information Processing Systems, pages 1099–1107, 2015.
[9] Christopher B Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, and Silvio Savarese. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In European Conference on Computer Vision, pages 628–644. Springer, 2016.
[10] Haoqiang Fan, Hao Su, and Leonidas J Guibas. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 605–613, 2017.
[11] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
[12] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. Improved training of wasserstein gans. In Advances in Neural Information Processing Systems, pages 5767–5777, 2017.
[13] Wei Fang, Feihong Zhang, Victor S Sheng, and Yewen Ding. A method for improving cnn-based image recognition using dcgan. Computers, Materials & Continua, 57:167–178, 2018.

[14] Xun Huang, Yixuan Li, Omid Poursaeed, John Hopcroft, and Serge Belongie. Stacked generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5077–5086, 2017.
[15] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
[16] Ian Goodfellow. Nips 2016 tutorial: Generative adversarial networks. arXiv preprint arXiv:1701.00160, 2016.
[17] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training gans. In Advances in Neural Information Processing Systems, pages 2234–2242, 2016.
[18] Edward Smith and David Meger. Improved adversarial systems for 3d object generation and reconstruction. arXiv preprint arXiv:1707.09557, 2017.
[19] Jiajun Wu, Yifan Wang, Tianfan Xue, Xingyuan Sun, William T Freeman, and Joshua B Tenenbaum. Marrnet: 3d shape reconstruction via 2.5d sketches. In Advances in Neural Information Processing Systems, 2017.
[20] Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012, 2015.
[21] Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, and Thomas Brox. Learning to generate chairs, tables and cars with convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4):692–705, 2017.
[22] Haibin Huang, Evangelos Kalogerakis, and Benjamin Marlin. Analysis and synthesis of 3d shape families via deep-learned generative models of surfaces. In Computer Graphics Forum, volume 34, pages 25–38. Wiley Online Library, 2015.
[23] Yangyan Li, Hao Su, Charles Ruizhongtai Qi, Noa Fish, Daniel Cohen-Or, and Leonidas J Guibas. Joint embeddings of shapes and images via cnn image purification. ACM Transactions on Graphics (TOG), 34(6):234, 2015.
[24] Rohit Girdhar, David F Fouhey, Mikel Rodriguez, and Abhinav Gupta. Learning a predictable and generative vector representation for objects. In European Conference on Computer Vision, pages 484–499. Springer, 2016.
[25] Hao Su, Charles R Qi, Yangyan Li, and Leonidas J Guibas. Render for cnn: Viewpoint estimation in images using cnns trained with rendered 3d model views. In Proceedings of the IEEE International Conference on Computer Vision, pages 2686–2694, 2015.
[26] Charles R Qi, Hao Su, Matthias Nießner, Angela Dai, Mengyuan Yan, and Leonidas J Guibas. Volumetric and multi-view cnns for object classification on 3d data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5648–5656, 2016.
[27] Jiajun Wu, Tianfan Xue, Joseph J Lim, Yuandong Tian, Joshua B Tenenbaum, Antonio Torralba, and William T Freeman. Single image 3d interpreter network. In European Conference on Computer Vision, pages 365–382. Springer, 2016.
[28] Yu Xiang, Wongun Choi, Yuanqing Lin, and Silvio Savarese. Data-driven 3d voxel patterns for object category recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1903–1911, 2015.
[29] Christopher B Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, and Silvio Savarese. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In European Conference on Computer Vision, pages 628–644. Springer, 2016.
[30] Abhishek Sharma, Oliver Grau, and Mario Fritz. Vconv-dae: Deep volumetric shape learning without object labels. In European Conference on Computer Vision, pages 236–250. Springer, 2016.
[31] Abhishek Kar, Shubham Tulsiani, Joao Carreira, and Jitendra Malik. Category-specific object reconstruction from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1966–1974, 2015.
[32] Chen Kong, Chen-Hsuan Lin, and Simon Lucey. Using locally corresponding cad models for dense 3d reconstructions from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4857–4865, 2017.

Fig. 3. Results showing 3D reconstruction from a single image. In each row, the image on the left is the input image, and the other images show two arbitrary views of the corresponding 3D output.
