Chen Floor-SP Inverse CAD For Floorplans by Sequential Room-Wise Shortest Path ICCV 2019 Paper

This document summarizes a new approach called Floor-SP for reconstructing floorplans from RGBD scans. Floor-SP first segments the scans into rooms, then formulates an optimization problem to reconstruct each room as a polygonal loop by sequentially solving shortest path problems. The objective function considers data terms from neural networks, consistency between adjacent rooms, and model complexity. The approach was evaluated on scans of 527 buildings and outperformed the state-of-the-art method.


Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path

Jiacheng Chen¹, Chen Liu², Jiaye Wu², Yasutaka Furukawa¹
¹Simon Fraser University, ²Washington University in St. Louis
{jca348,furukawa}@sfu.ca, {chenliu,jiaye.wu}@wustl.edu

Figure 1. The proposed system, dubbed Floor-SP, takes aligned panorama RGBD scans as input, finds room segments, solves an optimization problem to reconstruct a floorplan graph as multiple polygonal loops (one for each room), and merges them into a 2D graph via simple post-processing heuristics. The optimization is the technical contribution of the paper, which employs the room-wise coordinate descent strategy and sequentially solves shortest path problems to optimize the room structure.

Abstract

This paper proposes a new approach for automated floorplan reconstruction from RGBD scans, a major milestone in indoor mapping research. The approach, dubbed Floor-SP, formulates a novel optimization problem, where room-wise coordinate descent sequentially solves shortest path problems to optimize the floorplan graph structure. The objective function consists of data terms guided by deep neural networks, consistency terms encouraging adjacent rooms to share corners and walls, and a model complexity term. Unlike most other methods, the approach does not require corner/edge primitive extraction. We have evaluated our system on production-quality RGBD scans of 527 apartments or houses, including many units with non-Manhattan structures. Qualitative and quantitative evaluations demonstrate a significant performance boost over the current state-of-the-art. Please refer to our project website http://jcchen.me/floor-sp/ for code and data.

1. Introduction

Architectural floorplans play a crucial role in designing, understanding, and remodeling indoor spaces. Automated floorplan reconstruction from raw sensor data is a major milestone in indoor mapping research. The core technical challenge lies in the inference of the wall graph structure, whose topology is unknown and varies per example.

Computer Vision has made remarkable progress in the task of graph inference, for instance, human pose estimation [3] and hand tracking [30]. Unfortunately, the success has been limited to cases with a fixed, known topology (e.g., a human has two arms). Inference of graph structure with unknown, varying topology is still an open problem.

A popular approach to graph reconstruction is primitive detection and selection [11, 27, 22], for example, detecting corners, selecting subsets of corners to form edges, and selecting subsets of edges to form regions. The major problem of this bottom-up process is that it cannot recover from a single false negative in an earlier stage (i.e., a missing primitive). The task becomes increasingly difficult as the primitive space grows exponentially with its degrees of freedom, especially for non-Manhattan scenes, which most existing methods do not handle [11, 2, 21, 20].

This paper seeks to make a breakthrough in the domain of floorplan reconstruction with three key ideas.

• First, we start from room segmentation via an instance semantic segmentation technique (we use Mask R-CNN [12]). The room segmentation reduces the floorplan graph inference into the reconstruction of multiple polygonal loops, one for each room. This reduction allows us to formulate floorplan reconstruction as a sound energy optimization over multiple loops guided by room proposals.

• Second, we employ a room-wise coordinate descent strategy in optimizing the objective function. By exploiting the fact that the room topology is a simple loop, our formulation finds the (near-)optimal graph structure by solving a shortest path problem for each room one by one sequentially, while enforcing consistency with the other rooms.

• Third, we utilize deep neural networks in evaluating the data terms of the optimization problem, measuring the discrepancy against the input sensor data. The data term is combined with two ad-hoc terms: 1) a consistency term, encouraging adjacent rooms to share corners and walls at the room boundaries, and 2) a model complexity term, penalizing the number of corners in the graph.

We have evaluated the proposed approach on production-quality RGBD scans of 527 apartments or houses, a few times larger than the current largest database [20]. Our approach makes significant improvements over the current state-of-the-art [20]. We refer to our project website http://jcchen.me/floor-sp/ for code and data.

2. Related Works

We discuss related work in two domains: graph reconstruction and indoor scan datasets.

Graph reconstruction: Graph structure inference has been a popular field of study in Computer Vision, for instance, inferring a human body pose [3] or the semantic relationships of categories [14, 28]. In these problems, the graph topology is defined over the label space, common to all the instances (e.g., a head is always connected to a body). We here focus on graph inference problems in the context of reconstruction, where the topology varies per instance.

Room layout estimation infers a graph of architectural feature lines from a single image, where nodes are room corners and edges are wall boundaries. Most approaches assume a 3D box-room to limit the topological variations in the room layouts visible in 2D images [13, 25, 18, 5]. For a room beyond a box shape, Dynamic Programming (DP) was applied to search for an optimal room structure [8, 9]. DP was similarly used to solve for floorplans by limiting their topology to be a loop [2].

Bottom-up processing is a popular approach for graph reconstruction, where low-level primitives such as corners are detected and then selected to form higher-level primitives such as edges or regions. A DNN-based junction detector was proposed for floorplan image vectorization [21], where a junction indicates incident edge directions in the Manhattan frame. The junction information is utilized in inferring the edges by integer programming (IP). Similarly, Huang et al. [16] use a DNN to detect junctions represented by a set of incident edge directions, and infer edges by heuristics for single-image wireframe reconstruction of man-made scenes.

While many previous works utilize RGBD scans/point clouds for high-quality indoor reconstruction [17, 19, 23, 20], FloorNet [20] is the current state-of-the-art for the floorplan reconstruction task tested on large-scale indoor benchmarks. FloorNet combines DNN and IP in a bottom-up process, but it has three major failure modes. First, as in any bottom-up process, missing corners in the detection phase automatically lead to missing walls and rooms in the final model. Second, false candidate primitives could lead to the reconstruction of extraneous walls and rooms. Third, to enable the usage of powerful IP, FloorNet needs to restrict the solution space to Manhattan scenes.

Structured indoor modeling by Ikehata et al. [17] is the source of inspiration for our work. It starts with room segmentation, then solves shortest path problems to reconstruct room shapes, followed by room merging and room addition. While their system is a sequence of heuristics for indoor modeling, our approach formulates a sound energy minimization problem to recover the floorplan structure.

Indoor scan datasets: Affordable depth sensing hardware enables researchers to build many indoor scan datasets. The ETH3D dataset contains 16 indoor scans for multi-view stereo [24]. The ScanNet dataset [6] and the SceneNN dataset [15] capture a variety of indoor scenes. However, most of their scans contain only one or two rooms and are not suitable for the floorplan reconstruction problem. Matterport3D [4] builds high-quality panorama RGBD image sets for 90 luxurious houses. The 2D-3D-S dataset [1] provides 6 large-scale indoor scans of office spaces by using the same Matterport system. Lastly, a large-scale synthetic dataset, SUNCG [26], offers a variety of indoor scenes.

For the floorplan reconstruction task, FloorNet [20] provides a benchmark with full floorplan annotations and the corresponding RGBD videos from smartphones for 155 residential units. This paper utilizes production-quality panorama RGBD scans for 527 houses or apartments with floorplan annotations.

3. Floor-SP: System Overview

Floor-SP turns aligned panorama RGBD images into a floorplan graph in three phases: room segmentation, room-aware floorplan reconstruction, and loop merging (see Fig. 2). This section provides the system overview with minimal details. The aligned panorama RGBD scans are first converted into a 2D point-density/normal map, which is the input to Floor-SP. Unlike FloorNet [20], we focus on the wall structures; doors/windows, icons, and room semantics can be added given proper wall structures.

Room segmentation: The input panorama scans are converted into a 4-channel 256×256 point-density/normal map in a top-down view (see Sect. 6). We utilize an instance semantic segmentation technique (Mask R-CNN [12]) to find room segments given the 4-channel image. The room segments set up a good foundation for floorplan reconstruction by providing room proposals with rough shapes, but they are still far from a good floorplan graph because 1) a Mask R-CNN segment has a raster representation (i.e., unknown number and placement of corners); and 2) walls are
not consistently shared across rooms.

Figure 2. System overview: (Left) Mask R-CNN finds room segments (raster) from a top-down projection image consisting of point density and mean surface normal, allowing us to reconstruct a floorplan as multiple room loops. (Middle) Room-wise coordinate descent optimizes vectorized room structures one by one by minimizing the sum of data, consistency, and model complexity terms. (Right) Simple graph merging operations combine loops into a floorplan graph structure.

Room-aware floorplan reconstruction: Given a set of room segments and the input point-density/normal map, we formulate an optimization problem that reconstructs a floorplan graph as multiple polygonal loops, one for each room. Deep neural networks derive data terms in the objective. We propose a novel room-wise coordinate descent algorithm that directly optimizes the number and placement of corners by sequentially solving shortest-path problems.

Loop merging: Simple graph merging operations combine multiple polygonal loops into a final floorplan graph.

Room-aware floorplan reconstruction is the technical core of the paper, where Sect. 4 defines the problem formulation, and Sect. 5 presents the optimization algorithm. Room segmentation and loop merging are based on existing techniques, where Sect. 6 provides their algorithmic details and the remaining system specifications.

4. Room-aware floorplan reconstruction

The room segmentation ($R_i$) from Mask R-CNN allows us to reduce the floorplan graph inference into the reconstruction of multiple loops ($L_i$), one for each room. $L_i$ is defined as a sequence of pixels at integer coordinates forming a polygonal curve with a loop topology. Our problem is to minimize the following objective with respect to the set of polygonal loops $\mathcal{L}$:

\[
\sum_{L_i \in \mathcal{L}} E_{\mathrm{data}}(L_i) + E_{\mathrm{consis}}(\mathcal{L}) + \sum_{L_i \in \mathcal{L}} E_{\mathrm{model}}(L_i),
\]

subject to $L_i$ being a loop containing $R_i$ inside. Note that a room has an arbitrary number of corners (i.e., degrees of freedom), which must be optimized by an algorithm.

Data term: $E_{\mathrm{data}}$ is a room-wise unary potential, measuring the discrepancy with the input sensor data over the set of pixels along each loop.

\[
E_{\mathrm{data}}(L_i) = \lambda_1 \sum_{p \in C(L_i)} E^{C}_{\mathrm{data}}(p) + \sum_{p \in E(L_i)} \left[ \lambda_2 E^{E}_{\mathrm{data}}(p) + \lambda_3 E^{I}_{\mathrm{data}}(p) \right].
\]

• $E^{C}_{\mathrm{data}}(p)$ is the penalty of placing a corner at pixel $p$ (see Fig. 3a), and hence is summed over all the corner pixels $C(L_i)$ on $L_i$. The penalty is defined as one minus the pixel-wise corner likelihood. We estimate the corner likelihood map from the input point-density/normal map using Dilated Residual Networks (DRN) [29].

• $E^{E}_{\mathrm{data}}(p)$ is the penalty of placing an edge over a pixel $p$. The term is defined as one minus the pixel-wise edge likelihood (see Fig. 3b), summed over all the edge pixels $E(L_i)$ along $L_i$. We use Bresenham's line algorithm to obtain edge pixels given corners. The same DRN estimates the edge likelihood from the input point-density/normal map.

• $E^{I}_{\mathrm{data}}(p)$ is also a penalty summed over the edge pixels, which enforces $L_i$ not to pass through the room segment $R_i$. The term is a large constant if a pixel belongs to any of the room segments and 0 otherwise.

Consistency term: $E_{\mathrm{consis}}$ is a room-wise higher-order potential, encouraging loops to be consistent at the room boundaries (i.e., sharing corners and edges). We define the penalty to be the number of pixels that are used by the corners (or edges) of all the loops together. For instance, if two corners are close to each other, this term encourages moving them to the same pixel so that the penalty is imposed only once:

\[
E_{\mathrm{consis}}(\mathcal{L}) = \sum_{p} \left[ \lambda_4 \mathbf{1}_C(p, \mathcal{L}) \right] + \sum_{p} \left[ \lambda_5 \mathbf{1}_E(p, \mathcal{L}) \right].
\]

The first term $\mathbf{1}_C(p, \mathcal{L})$ is an indicator function, which becomes 1 if a pixel ($p$) is a corner of at least one loop. Similarly, the second term is an indicator function for edges. See Fig. 3 for the illustration over toy examples.
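To make the data and consistency terms concrete, the following is a minimal Python sketch that evaluates them for candidate loops. It is an illustration under stated assumptions, not the authors' implementation: the function names (`bresenham`, `loop_pixels`, `data_term`, `consistency_term`) are hypothetical, a loop is given as a list of integer `(x, y)` corners, likelihood maps are NumPy arrays indexed as `[y, x]`, edge pixels are assumed to include both end-points, and `inside_penalty` stands in for the "large constant" whose value the text does not specify. The default weights are the values the paper reports for its experiments.

```python
import numpy as np


def bresenham(p0, p1):
    """Integer pixels on the segment p0 -> p1 (inclusive), via Bresenham's line algorithm."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    pixels = []
    while True:
        pixels.append((x0, y0))
        if (x0, y0) == (x1, y1):
            return pixels
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def loop_pixels(corners):
    """Return the corner pixels C(L_i) and edge pixels E(L_i) of a closed loop."""
    edge_set = set()
    for a, b in zip(corners, corners[1:] + corners[:1]):
        edge_set.update(bresenham(a, b))  # assumption: end-points count as edge pixels
    return set(corners), edge_set


def data_term(corners, corner_lik, edge_lik, room_mask,
              lam1=0.2, lam2=0.2, lam3=100.0, inside_penalty=1.0):
    """E_data(L_i): corner, edge, and room-interior penalties along one loop.

    room_mask is a boolean map that is True where a pixel belongs to any room segment.
    inside_penalty is a hypothetical stand-in for the unspecified large constant.
    """
    c_pix, e_pix = loop_pixels(corners)
    e_corner = sum(1.0 - corner_lik[y, x] for x, y in c_pix)
    e_edge = sum(1.0 - edge_lik[y, x] for x, y in e_pix)
    e_inside = sum(inside_penalty for x, y in e_pix if room_mask[y, x])
    return lam1 * e_corner + lam2 * e_edge + lam3 * e_inside


def consistency_term(loops, lam4=0.2, lam5=0.1):
    """E_consis(L): number of distinct corner and edge pixels used by all loops together."""
    all_corners, all_edges = set(), set()
    for corners in loops:
        c_pix, e_pix = loop_pixels(corners)
        all_corners |= c_pix
        all_edges |= e_pix
    return lam4 * len(all_corners) + lam5 * len(all_edges)
```

With two 5×5-pixel squares, `consistency_term` is lower when the squares share a wall than when they are offset by one pixel, which is the behavior the consistency term is meant to reward. The sketch does not check the constraint that each loop contains its room segment.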
Figure 3. Illustration of data and consistency terms. $E^{C}_{\mathrm{data}}$ and $E^{E}_{\mathrm{data}}$ are defined based on corner and edge likelihood maps. Blue pixels indicate lower costs in these toy examples. $E_{\mathrm{consis}}$ counts the number of pixels used by room corners and room edges. When neighboring rooms share corners and edges as shown in (c), $E_{\mathrm{consis}}$ goes down.

Model complexity term: $E_{\mathrm{model}}$ is the model complexity penalty, counting the number of corners in our loops and thereby preferring compact shapes.

\[
E_{\mathrm{model}}(L_i) = \lambda_6 \, \{\#\text{ of corners in } L_i\}.
\]

The $\lambda_{*}$ are scalars defining the relative weights of the penalty terms. We found our system robust to these parameters and use the following setting throughout our experiments: $\lambda_1 = 0.2$, $\lambda_2 = 0.2$, $\lambda_3 = 100.0$, $\lambda_4 = 0.2$, $\lambda_5 = 0.1$, $\lambda_6 = 1.0$.

5. Sequential room-wise shortest path

The inspiration for our optimization strategy comes from a prior work, which solves a shortest path problem and reconstructs a floorplan as a loop [2]. This formulation considers every pixel as a node of a graph, encodes objectives into edge weights, and finds the shortest path as a loop.

Our problem solves for multiple loops over multiple rooms. We devise a room-wise coordinate descent strategy that optimizes room structures one by one sequentially by reducing a room-wise coordinate descent step into a shortest path problem. While the algorithm is robust to the processing order, we visit rooms in increasing order of their areas (i.e., smaller rooms are handled first) so that we get fixed results given the same input. The optimization runs for two rounds in our experiments.

This section explains 1) Shortest path problem reduction; 2) Containment constraint satisfaction; and 3) Two approximation methods for speed-boost.

Shortest path problem reduction: The reduction process is straightforward, as our cost function is the summation of pixel-wise penalties and the number of corners. Without loss of generality, suppose we are optimizing $L_1$ while fixing the other loops. Our optimization problem is equivalent to solving a shortest path problem for $R_1$ with the following weight definition for each edge ($e$) (see the supplementary document for the derivation):

\[
\sum_{p \in C(e)} \frac{\lambda_1}{2} E^{C}_{\mathrm{data}}(p)
+ \sum_{p \in E(e)} \left[ \lambda_2 E^{E}_{\mathrm{data}}(p) + \lambda_3 E^{I}_{\mathrm{data}}(p) \right]
+ \sum_{p \in C(e)} \lambda_4 \left( 1 - \mathbf{1}_C(p, \mathcal{L} \setminus \{L_1\}) \right)
+ \sum_{p \in E(e)} \lambda_5 \left( 1 - \mathbf{1}_E(p, \mathcal{L} \setminus \{L_1\}) \right)
+ \lambda_6 .
\]

With abuse of notation, $C(e)$ denotes the two pixels at the end-points of $e$, $E(e)$ denotes the set of pixels along $e$ obtained by Bresenham's line algorithm, and $\mathcal{L} \setminus \{L_1\}$ denotes the set of loops excluding $L_1$.

Containment constraint satisfaction: Shortest path is a powerful formulation that searches for the optimal number and placement of corners with one caveat: an additional constraint is necessary to avoid a trivial solution (i.e., an empty loop). We use a heuristic similar in spirit to the prior work [2] to implement this constraint: "$L_i$ contains (or goes around) $R_i$". We refer to the supplementary document for the details and summarize the process here.

First, we find corner candidates from the same corner likelihood map used for the data term (see Fig. 4). Second, we look at the edge likelihood map to identify a good pair of corners forming the start-edge of the loop. Third, we draw a start-line that starts from the room mask ($R_i$) and passes through the start-edge perpendicularly at its middle point. Lastly, we remove all the edges that intersect with the start-line to ensure that the path must go around $R_i$.

Note that fixing the start-edge to be part of the loop breaks the local optimality of our coordinate descent step, but works well in practice as it is not difficult to identify one wall segment with high confidence.

Bounding box approximation: We make an approximation in pruning nodes and edges to reduce the computational expense of the shortest path algorithm (SPA). We restrict the domain of SPA, as it is wasteful to run it over an entire image domain to reconstruct one room. Given a room mask $R_i$, we apply the binary dilation 10 times to expand
the mask and find its axis-aligned bounding box with a 5-pixel margin, in which we solve SPA.

Figure 4. We solve a shortest path problem for each room, where cost functions are encoded into edge weights. In order to avoid a trivial solution (i.e., an empty graph) and enforce the path to go around the rough room segment ($R_i$), we first identify a start-edge that is a part of a room shape with high confidence. Next, we draw a (red) start-line perpendicularly to split the domain. We prohibit crossing the start-line, assign a very high penalty for going through $R_i$, then solve for a shortest path that starts and ends at the two end-points of the start-edge.

Dominant direction approximation: Floor-SP goes beyond the conventional Manhattan assumption by allowing multiple Manhattan frames per room. We train the same DRN architecture to estimate the wall direction likelihoods in increments of 10 degrees at every pixel. We perform a simple statistical analysis to extract four Manhattan frames (i.e., eight directions) globally, then assign a subset of them to each room. We allow edges only along the selected dominant directions with some tolerance on discretization errors (see the supplementary document for details).

6. System Details

Input processing: Given a set of panorama RGBD scans where the Z axis is aligned with the gravity direction, we compute the tight axis-aligned bounding box of the points on the horizontal plane. We expand the rectangle by 2.5% in each of the four directions, apply non-uniform scaling into a 256 × 256 pixel grid, and compute the point density and normal in each pixel. The point density is the number of 3D points that fall inside the pixel, which we linearly re-scale to [0.0, 1.0] so that the highest density becomes 1.0. The point normal is the average surface normal vector of the 3D points associated with the pixel.

Room segmentation: We use the publicly available Mask R-CNN implementation [7] with the default hyper-parameters except that we lower the detection threshold from 0.7 to 0.2. Given a segment from Mask R-CNN, we apply the binary erosion operation for 2 iterations with 8-connected neighborhood to obtain room segments ($R_i$).

Room-aware floorplan reconstruction: To estimate pixel-wise likelihoods for corner, edge, and edge direction, we use the official implementation of Dilated Residual Networks [29], which produces 32 × 32 feature maps. In order to produce an output in the same resolution as the input, we add 3 extra layers of residual blocks [10] with transposed convolution of stride 2 to reach the resolution of 256 × 256. For the corner likelihood supervision, we render each ground-truth corner as a 7 × 7 disk. For the edge likelihood and wall-direction supervision, we draw the edge mask and direction information with a width of 5 pixels. The loss is binary cross entropy and the learning rate is 1e-4. Dijkstra's algorithm solves the shortest path problem.

Loop merging: We use simple graph merging operations to convert room loops into the final floorplan graph structure. More concretely, we denote a contiguous set of colinear line segments as a segment group. We repeatedly identify a pair of parallel segment groups within 5 pixels and snap them into a new segment group at the middle point while merging corners. After applying the edge merging to all compatible pairs, we merge corners that are within 3 pixels.

7. Experiments

We have evaluated the proposed system on 527 sets of aligned panorama RGBD scans. The average numbers of 1) input 3D points for the point-density/normal image, 2) corners in the annotations, 3) wall segments in the annotations, and 4) rooms in the annotations are 432,552, 28.87, 35.88, and 7.73, respectively. Out of 4072 rooms, 489 rooms do not follow the primary Manhattan structure of the unit. Fig. 5 shows four examples from our dataset.

Figure 5. Our dataset offers production-level panorama RGBD scans for 527 houses/apartments. We convert each scan into a point density/normal map from a top-down view, which is the input to our system. We annotated floorplan structure as a 2D polygonal graph. Note that for visualizing point-density/normal maps (the middle column), the intensity encodes the point density, and the hue/saturation encodes the 2D horizontal component of the mean surface normal.

The 527 units are split into 433 and 94 for training and testing, respectively. We make the test set more challenging on purpose for evaluations: 48 out of 94 testing units contain challenging non-Manhattan structure, and 199 out of 667 testing rooms follow non-Manhattan geometry.

We have implemented the proposed system in Python while using PyTorch as the DNN library. We have used a workstation equipped with an NVIDIA 1080Ti with 12GB GPU memory. We trained the Mask R-CNN for 70 epochs with a batch size of 1, and the DRNs for 35 epochs with a batch size of 4. The training of each DNN model takes at most a day. At test time, it takes about 5 minutes to process one apartment/house. The bottleneck is the construction of the graph for the shortest path problem (a CPU-intensive step).

7.1. Qualitative evaluations

Fig. 6 compares Floor-SP against the current state-of-the-art FloorNet [20] and the variants of our system. FloorNet follows a bottom-up process, where it first detects corners then uses Integer Programming to find their valid connections. FloorNet suffers from three failure modes: 1) Missing rooms due to missing corners in the first corner detection step; 2) Extraneous rooms coming from extraneous corner detections; and 3) Broken non-Manhattan structures, which become challenging due to the excessively large search space in Integer Programming.

The right three columns show the variants of the proposed Floor-SP. The left variant does not have the consistency term and replaces the DNN-based data term by the ad-hoc cost functions in the prior work [2]. Our overall formulation guarantees a room reconstruction at each detected room segment, producing reasonable results. On adding our DNN-based data term $E_{\mathrm{data}}$ (middle), per-room structure improves significantly. However, inconsistencies at the room boundaries are often noticeable. Lastly, with the addition of the consistency term (right), we see clean floorplan structures with consistent shared room boundaries.

Fig. 7 illustrates the effect of room-wise coordinate descent over multiple rounds. Red ovals indicate challenging structure causing room overlaps or holes, which are resolved after the second round of optimization.

7.2. Quantitative evaluations

We follow FloorNet [20] and define the following four metrics for the quantitative evaluations:

Corner precision/recall: We declare that a corner is successfully reconstructed if there is a ground-truth room corner within 10 pixels. When multiple corners are detected around a single ground-truth corner, we only take the closest one as correct and treat the others as false positives.

Edge precision/recall: We declare that an edge of a graph is successfully reconstructed if its two end-points pass the corner test described above and the corresponding edge belongs to the ground-truth.

Room precision/recall: We declare that a room is successfully reconstructed if 1) it does not overlap with any other room, and 2) there exists a room in the ground-truth with intersection-over-union (IOU) score more than 0.7. Note that this metric does not consider the positioning and sharing of corners and edges.

Room++ precision/recall: We declare that a room is successfully reconstructed in this metric if the room is connected (i.e., sharing edges) to the correct set of successfully reconstructed rooms as in the ground-truth, besides passing the above two room conditions.

Table 1. The main quantitative evaluation results. The colors cyan, orange, magenta represent the top three entries.

Method                                 Corner          Edge            Room            Room++
                                       Prec.  Recall   Prec.  Recall   Prec.  Recall   Prec.  Recall
FloorNet [20]                          95.0   76.6     94.8   76.8     81.2   72.1     42.3   37.5
Ours (w/o Edata, Econsis)              84.4   80.4     82.3   79.8     75.1   61.3     23.3   22.0
Ours (w/o Econsis)                     93.9   82.3     89.2   81.2     83.8   81.7     49.4   48.5
Ours (1st-round coordinate descent)    94.6   82.8     89.4   81.7     83.9   81.8     49.5   48.7
Ours (2nd-round coordinate descent)    95.1   82.2     90.2   81.1     84.7   83.0     51.4   50.4

Table 1 shows the main quantitative evaluations. Precision metrics on low-level primitives (i.e., corners and edges)
are high for FloorNet, because this task does not require high-level structural reasoning and the majority of the corners are easy ones (e.g., Manhattan corners). On the other hand, its recall metrics are low even for low-level primitives, because some room corners do not have enough 3D points due to occlusions, where DNN-based corner detection fails. Floor-SP recovers such challenging corners through the sequential room-wise optimization process.

Figure 6. Qualitative comparisons against FloorNet [20] and the variants of our approach. We select hard non-Manhattan examples here to illustrate the reconstruction challenges in our dataset. For reconstructions by Floor-SP variants, room colors are determined by corresponding room segments from Mask R-CNN. For the ground-truth and FloorNet, colors are based on the room types.

On room-level metrics, Floor-SP is consistently better than FloorNet. Furthermore, the addition of the data and consistency terms improves the room-level metrics. Finally, room-wise coordinate descent adds a further boost to the
performance. The quantitative results and the visualization
of all 94 test examples are in the supplementary document.

Figure 9. Standard corner detection easily makes mistakes (red

disks). Mask R-CNN produces imprecise raster room segments
(white masks) or misses an entire room (right-most example).
Floor-SP uses optimization to solve for the corner placements and
their connections robustly. At the top, the orange polygon shows
our reconstruction of a room in focus. The bottom shows the cor-
responding ground truth.
Figure 7. Multiple rounds of the coordinate descent fix mistakes
at challenging floorplan structure. The top row shows the results on the number of rooms for a complex case, which is again
after the first round of the coordinate descent optimization, and the impossible to recover. Once a single room fails, all the adja-
bottom shows the results after the second round. We also show the cent rooms automatically fail in the Room++ metric, lead-
total amount of energy after each round. Corresponding ground- ing to the zero precision and recall in this example.
truth annotations are found in Fig. 6.
In Fig. 9, we further analyze the robustness of our ap-
proach. Corner detection with non-maximum suppression
always produce noisy results, and room segments generated
by instance segmentation network are also imprecise on de-
tails. Instead of using these primitive detections directly,
Floor-SP formulates an energy minimization problem to
solve for the number and placement of floorplan corners
and is robust to these two types of mistakes. However, when
room instance segmentation makes mistake on the number
of rooms (as in the last example in Fig. 9), our system can-
not recover but produce approximate indoor structures with
wrong room separation. This mistake is also observed in
the two examples in Fig. 8. One future research is to re-
cover from mistake made in room segmentation phase to
Figure 8. Typical failure modes. The top is the ground-truth an- produce more accurate floorplan graph.
notation and the bottom is our result for each example. Our sys- We would like to also note that the input to our system is
tem still makes mistakes for complex scenes and challenging non-
a single point-density/normal image from a top-down view.
Manhattan structures.
We have discarded the 3D information by projecting the
7.3. Discussion points onto a 2D image as described in Sect. 6. We have
not utilized high-resolution panorama RGB images, which
Floor-SP produces near-perfect results for Manhattan
are available in the dataset and could make the system more
structures. The majority of the failures are concentrated on
robust like FloorNet [20].
non-Manhattan cases. Quantitatively, our Room++ metrics
We believe that this paper sets a major milestone in in-
are just slightly above 50. However, we would like to point
door mapping research. The proposed system produces
out that our reconstructions are not terribly bad even in ex-
compelling floorplan reconstruction results on production-
tremely challenging cases with poor Room++ metrics.
quality challenging scenes in large quantities. We publicly
Look at the first example in Fig. 8. Room++ precision
share our code and data in our project website to promote
and recall are both 0 with our reconstruction, while the re-
further research.
construction looks fairly reasonable. The reasons are three-
fold as marked by the numbers. 1) A small non-Manhattan Acknowledgement: This research is partially supported
room has wrong dominant directions in the pre-processing by National Science Foundation under grant IIS 1618685,
step, which makes it impossible for Floor-SP to recover, and NSERC Discovery Grants, NSERC Discovery Grants Ac-
fails the IOU test; 2) Small details such as concave struc- celerator Supplements, and DND/NSERC Discovery Grant
tures are hard to keep and the room fails the IOU test; 3) Supplement. We thank Beike (https://www.ke.com) for the
The room segmentation by Mask R-CNN makes a mistake 3D house scans and annotations.

References

[1] Iro Armeni, Ozan Sener, Amir R Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, and Silvio Savarese. 3d semantic parsing of large-scale indoor spaces. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[2] Ricardo Cabral and Yasutaka Furukawa. Piecewise planar and compact floorplan reconstruction from images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2014.
[3] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[4] Angel X. Chang, Angela Dai, Thomas A. Funkhouser, Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3d: Learning from rgb-d data in indoor environments. In 2017 International Conference on 3D Vision (3DV), 2017.
[5] Yu-Wei Chao, Wongun Choi, Caroline Pantofaru, and Silvio Savarese. Layout estimation of highly cluttered indoor scenes using geometric and semantic cues. In International Conference on Image Analysis and Processing (ICIAP), 2013.
[6] Angela Dai, Angel X Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. Scannet: Richly-annotated 3d reconstructions of indoor scenes. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[7] pytorch-mask-rcnn. https://github.com/multimodallearning/pytorch-mask-rcnn.
[8] Alex Flint, Christopher Mei, David Murray, and Ian Reid. A dynamic programming approach to reconstructing building interiors. In European Conference on Computer Vision (ECCV), 2010.
[9] Alex Flint, David Murray, and Ian Reid. Manhattan scene understanding using monocular, stereo, and 3d features. In IEEE International Conference on Computer Vision (ICCV), 2011.
[10] Huan Fu, Mingming Gong, Chaohui Wang, Kayhan Batmanghelich, and Dacheng Tao. Deep ordinal regression network for monocular depth estimation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[11] Yasutaka Furukawa, Brian Curless, Steven M. Seitz, and Richard Szeliski. Manhattan-world stereo. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[12] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross B. Girshick. Mask r-cnn. In IEEE International Conference on Computer Vision (ICCV), 2017.
[13] Varsha Hedau, Derek Hoiem, and David A. Forsyth. Recovering the spatial layout of cluttered rooms. In IEEE International Conference on Computer Vision (ICCV), 2009.
[14] Hexiang Hu, Guang-Tong Zhou, Zhiwei Deng, Zicheng Liao, and Greg Mori. Learning structured inference neural networks with label relations. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[15] Binh-Son Hua, Quang-Hieu Pham, Duc Thanh Nguyen, Minh-Khoi Tran, Lap-Fai Yu, and Sai-Kit Yeung. Scenenn: A scene meshes dataset with annotations. In 2016 Fourth International Conference on 3D Vision (3DV), 2016.
[16] Kun Huang, Yifan Wang, Zihan Zhou, Tianjiao Ding, Shenghua Gao, and Yi Ma. Learning to parse wireframes in images of man-made environments. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[17] Satoshi Ikehata, Hang Yang, and Yasutaka Furukawa. Structured indoor modeling. In IEEE International Conference on Computer Vision (ICCV), 2015.
[18] Chen-Yu Lee, Vijay Badrinarayanan, Tomasz Malisiewicz, and Andrew Rabinovich. Roomnet: End-to-end room layout estimation. In IEEE International Conference on Computer Vision (ICCV), 2017.
[19] Minglei Li, Peter Wonka, and Liangliang Nan. Manhattan-world urban reconstruction from point clouds. In European Conference on Computer Vision (ECCV), 2016.
[20] Chen Liu, Jiaye Wu, and Yasutaka Furukawa. Floornet: A unified framework for floorplan reconstruction from 3d scans. In European Conference on Computer Vision (ECCV), 2018.
[21] Chen Liu, Jiajun Wu, Pushmeet Kohli, and Yasutaka Furukawa. Raster-to-vector: Revisiting floorplan transformation. In IEEE International Conference on Computer Vision (ICCV), 2017.
[22] Aron Monszpart, Nicolas Mellado, Gabriel J. Brostow, and Niloy Jyoti Mitra. Rapter: rebuilding man-made scenes with regular arrangements of planes. ACM Trans. Graph., 34:103:1–103:12, 2015.
[23] Liangliang Nan and Peter Wonka. Polyfit: Polygonal surface reconstruction from point clouds. In IEEE International Conference on Computer Vision (ICCV), 2017.
[24] Thomas Schöps, Johannes L. Schönberger, Silvano Galliani, Torsten Sattler, Konrad Schindler, Marc Pollefeys, and Andreas Geiger. A multi-view stereo benchmark with high-resolution images and multi-camera videos. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[25] Alexander G Schwing, Tamir Hazan, Marc Pollefeys, and Raquel Urtasun. Efficient structured prediction for 3d indoor scene understanding. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2012.
[26] Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, and Thomas A. Funkhouser. Semantic scene completion from a single depth image. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[27] Jianxiong Xiao and Yasutaka Furukawa. Reconstructing the world’s museums. International Journal of Computer Vision, 110(3):243–258, 2014.
[28] Danfei Xu, Yuke Zhu, Christopher B Choy, and Li Fei-Fei. Scene graph generation by iterative message passing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[29] Fisher Yu, Vladlen Koltun, and Thomas A. Funkhouser. Dilated residual networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[30] Shanxin Yuan, Qi Ye, Bjorn Stenger, Siddhant Jain, and Tae-Kyun Kim. Bighand2.2m benchmark: Hand pose dataset and state of the art analysis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
