Concrete Crack Quantification Using Voxel-Based Reconstruction and Bayesian Data Fusion
Abstract—Concrete cracks are one of the most apparent indicators of possible structural deterioration and need to be periodically inspected. However, for current image-based automated crack inspection techniques, accurate and detailed crack quantification and assessment remain a challenging task. Most of these techniques require high-quality input images, which may be difficult to ensure in practice. Besides, simply merging crack detections from multiple images to generate a large crack map may result in an inaccurate outcome for crack severity assessment. In this article, a novel crack quantification framework is proposed to identify complete crack geometric properties utilizing a set of unordered inspection images. To realize this, cracks in images are detected by an instance segmentation convolutional neural network. Subsequently, the crack segmentations from multiple separate images are systematically aggregated through voxel-based reconstruction and Bayesian data fusion. This framework outputs a crack model that can retrieve accurate geometric properties of each crack segment by recognizing the crack's inherent branching patterns. The capability and performance of the proposed crack quantification framework are validated on cracked concrete specimens in a laboratory setting. Also, a field test on a cracked concrete wall was carried out using images captured by a UAV to demonstrate the efficacy of the proposed framework in practical conditions.

Index Terms—Bayesian data fusion, crack quantification, instance segmentation network, voxel-based reconstruction.

Manuscript received 27 July 2021; revised 31 December 2021; accepted 18 January 2022. Date of publication 1 February 2022; date of current version 9 September 2022. This work was supported in part by the China Postdoctoral Science Foundation under Grant 2021M701805 and Grant 2021TQ0159 and in part by the Major Key Project of PCL under Grant PCL2021A09. Paper no. TII-21-3190. (Corresponding author: Chaobo Zhang.)
Chaobo Zhang and Xiaojun Liang are with the Department of Mathematics and Theories, Peng Cheng Laboratory, Shenzhen 518066, China (e-mail: [email protected]; [email protected]).
Maziar Jamshidi is with the Department of Civil Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada (e-mail: maziar.[email protected]).
Chih-Chen Chang is with the Department of Civil and Environmental Engineering, Hong Kong University of Science and Technology, Hong Kong (e-mail: [email protected]).
Zhiwen Chen and Weihua Gui are with the Department of Mathematics and Theories, Peng Cheng Laboratory, Shenzhen 518066, China, and also with the School of Automation, Central South University, Changsha 410017, China (e-mail: [email protected]; [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TII.2022.3147814.
Digital Object Identifier 10.1109/TII.2022.3147814

I. INTRODUCTION

CONCRETE structures need to be periodically inspected to assess their current functional state, predict their future condition, and make informed decisions on their maintenance and rehabilitation. One important inspection item for concrete structures is to investigate whether cracks have appeared or propagated in their critical components [1]. According to structural inspection manuals, a crack is defined as a linear fracture in concrete, and its type, size, orientation, and location need to be recorded to describe its seriousness [2], [3]. Conventionally, manual visual inspection has been the primary method for gathering information about cracks. It is, however, recognized that manual visual inspection is not only labor-intensive and time-consuming but can also lead to rather subjective and inaccurate results [4]. Hence, there is a need for more reliable and automated methods for detecting and characterizing cracks in concrete structures.

Owing to the prevalence of low-cost and high-quality digital cameras, image-based techniques are widely incorporated into crack inspection tasks. In an image-based technique, in order to obtain geometrical properties (e.g., width and length) of cracks, the pixels that form the cracks need to be extracted with high precision [5]. This process is called segmentation and has been broadly developed and utilized for crack inspection. The crack segmentation output of an inspection image is a 2-D binary mask where each pixel is labeled as either part of a crack or not. Existing crack segmentation techniques can be divided into two groups: methods that are based on manually selected features and more recent methods that utilize deep learning for feature extraction. Techniques in the first group detect crack pixels by applying a threshold or a machine learning classifier to handcrafted image features [6]. The accuracy of these techniques, however, depends on the quality of the features that are manually selected. Therefore, their performance may be hampered for actual images taken under varying environmental conditions [7]. On the other hand, techniques based on deep learning could extract high-level features directly from raw images by using large-scale datasets to train convolutional neural networks
1551-3203 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR. Downloaded on May 04,2023 at 11:31:19 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: CONCRETE CRACK QUANTIFICATION USING VOXEL-BASED RECONSTRUCTION AND BAYESIAN DATA FUSION 7513
(CNNs) [8]. Researchers in several studies (e.g., [9], [10]) have utilized the power of CNNs for crack segmentation by treating cracks as distinct objects in inspection images. Results of those studies demonstrated that CNN-based approaches are less sensitive to, and less dependent on, image conditions than the traditional crack segmentation methods in the first group.

The next stage is to extract the physical properties of the identified cracks from the crack masks. To transform the pixel domain data into the physical domain, a typical approach is to employ the pinhole camera model where the camera must be directed perpendicularly to the concrete surface at a measured distance [11], [12]. This method requires special instruments operating under controlled conditions and is usually limited to concrete surfaces with specific geometric shapes. These conditions could be difficult to meet when images are captured by a nonstationary camera moving at varying distances from structural surfaces. Images taken from a camera mounted on an unmanned aerial vehicle (UAV) are examples of such situations. To eliminate the need for special camera settings, Valença et al. [13], [14] utilized auxiliary equipment, such as predefined pattern boards and terrestrial laser scanners, to perform image rectification and calculate the spatial resolution for crack measurement. In another study, Jahanshahi et al. [15] reconstructed 3-D models from images to carry out perspective correction and measure cracks on flat concrete surfaces. Similarly, Liu et al. [16], [17] proposed a combination of manual feature-based crack segmentation and 3-D scene reconstruction to locate the 3-D positions of the edges of thin cracks. However, their method simply stitches segmentations from multiple images to assess a crack's severity. Such an indiscriminate combination of the segmentation information, without a systematic way to correct for false detections, may render the crack quantification results unreliable. Besides, in inspection videos or images taken by a robotic platform, it is likely that the same scene is captured from many different view angles and distances, generating a large volume of unordered and complex overlapping visual data [18]. In such a situation, the shared information among the entire collected image set can be leveraged to filter false segmentations.

In this article, a novel crack quantification framework is proposed that retrieves detailed crack properties from a set of unordered inspection images by incorporating CNN-based segmentation, voxel-based reconstruction, and Bayesian data fusion. Voxels are volumetric cubes arranged in a 3-D grid and are analogous to pixels in 2-D images. Constructing crack models in the form of voxels rather than a cluster of point clouds makes it possible to apply image processing techniques in 3-D space and significantly reduces the data storage size of crack information [19]. In this article, unlike the previous 3-D reconstruction-based studies, the branching patterns of cracks are also recognized in the quantification process. Given the geometric complexity of a crack pattern, which usually consists of several segments connected at branching points, each crack segment should be analyzed individually to get its length, width, and orientation [20], [21]. Moreover, in the proposed framework, to increase the precision and robustness of the overall quantification process, Bayesian data fusion is utilized to systematically aggregate crack masks from multiple images. Finally, to understand a crack's mode and severity, a crucial piece of information is its relation with the underlying structural component. Therefore, in a comprehensive assessment framework, it is necessary to associate the locations and properties (e.g., orientation) of identified cracks with the structural components to be able to interpret their significance [22]. In this article, a procedure is laid out that employs as-designed models of the inspected structure to assign detected cracks to the underlying members of any shape and type.

The following are the main contributions of this article.
1) A new framework is laid out to retrieve detailed crack properties from a set of inspection images. Compared to the existing methods, the proposed technique can perceive branching patterns of cracks in 3-D voxel space. It can also associate the detected crack properties with the underlying structural components, which is a crucial piece of information when it comes to crack severity assessment.
2) A crack instance segmentation CNN is developed which can predict crack mask quality. The predicted quality score plays a key role in the process of accurate crack width quantification.
3) A Bayesian data fusion technique is proposed whereby crack segmentation outcomes from multiple images are systematically aggregated. The novelty of the proposed method is that accurate segmentations can favorably skew the final outcome. Application of this technique enhances the quantification accuracy of the size, orientation, and location of each crack segment.

The rest of the article is organized as follows. Details of the research methodology are described in Section II. In Section III, experimental results are presented to validate the capability of the proposed techniques. Finally, Section IV concludes this article.

II. METHODOLOGY

The proposed crack quantification technique takes a set of inspection images and the corresponding crack segmentation results as input and outputs a voxel-based crack representation model with detailed crack properties. As shown in Fig. 1, the whole process can be divided into four stages.
1) Crack segmentation in acquired images.
2) Voxel-based crack reconstruction.
3) Bayesian data fusion.
4) Crack measurement and visualization.
Details of each stage are described in the following.

A. Crack Segmentation

To extract the image pixels pertinent to a target crack, a defect instance segmentation architecture called DIS-YOLO [23] was adopted. It was shown that DIS-YOLO outperforms the state-of-the-art models in crack instance segmentation. In this network, a class-specific confidence score (object confidence × class probability) from YOLOv3 [24] is used to give out predictions. This score reflects the confidence of the network in detecting the existence of cracks but does not reflect the quality of the predicted masks very well, and such a misalignment tends
7514 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 18, NO. 11, NOVEMBER 2022
illustrated examples show that the modified DIS-YOLO can predict crack bounding boxes and masks more accurately compared to the original DIS-YOLO. This indicates that additional training information from the mask scoring branch is beneficial for both object detection and mask prediction in DIS-YOLO. Also, the added mask scoring branch provides a crack segmentation confidence that can well reflect the quality of the predicted mask, as shown for all cracks in Fig. 3. The predicted segmentation confidences of the modified DIS-YOLO show slightly larger errors for some very thin and complex cracks [see Fig. 3(e) and (f)], which are still well below those of the original DIS-YOLO.

B. Voxel-Based Reconstruction

Reconstructing 3-D objects from images is a classic computer vision problem that has been a subject of research for decades. Among various image-based 3-D reconstruction techniques, multiview stereo (MVS), which reconstructs the 3-D scene based on stereo correspondence, is the most successful method in terms of robustness and number of applications [27]. MVS algorithms can be classified into four general groups according to their output scene representations, which can be depth maps, point clouds, voxel spaces, and meshes. Among them, the voxel representation, which reconstructs the 3-D scene with a regular grid structure, provides great flexibility for model manipulation [28]. Such a voxel output is suitable for performing 3-D volumetric analysis such as morphological operations, but may not be suitable for accurate crack width measurement since its accuracy is limited by the resolution of the voxel grid. In this article, a novel and robust voxel-based crack reconstruction framework is proposed which can produce a crack model where each voxel encodes accurate crack width information from multiple 2-D images. This process is accomplished through the four steps illustrated in the flowchart shown in Fig. 4.

Fig. 4. Reconstruction process of crack voxel model.

First, a dense 3-D point cloud of the scene is generated using the structure from motion (SFM) and MVS techniques. The SFM process adopts VisualSFM [29] to compute the camera parameters by detecting and matching feature points from multiple images using the scale-invariant feature transform [30]. Once the camera parameters are known, patch-based MVS [31] is performed to generate a dense 3-D point cloud model of the scene. The output point cloud is then scaled to real size using a uniform scale factor obtained from the known dimensions of an object in the scene.

The next step is to extract the element point cloud from the reconstructed scene. This task could be accomplished by 3-D registration of the as-built data with the as-designed model [32]. The as-built data is the reconstructed point cloud from the previous step, which presents the current appearance of a scene. The as-designed model represents the corresponding 3-D model of a scene that is usually stored in CAD or BIM engines. The registration process between the as-built point cloud and the as-designed model consists of a coarse registration for rough alignment followed by a fine registration. The purpose of registration is to find the association between the as-built data and its corresponding as-designed 3-D model object. In this article, the coarse registration is performed manually using fiducial markers with known locations on the as-designed 3-D model [18]. The fine registration is performed using a procedure based on the iterative closest point (ICP) algorithm proposed by Glira et al. [33]. It should be noted that, for both registration steps, only the translation and rotation transformations are computed to update the camera parameters, so as to keep the reconstructed point cloud model in real size.

The third step is to locate the crack points in 3-D space by projecting the crack from 2-D images onto the element surface. To do so, the extracted element point cloud from the previous step is first converted to a triangular mesh surface model. With the known camera parameters, crack skeleton and width points, as the key points defining the shape of the crack, can be projected from the image onto the element's triangular mesh surface. A crack skeleton point is obtained by running a morphological thinning algorithm on the crack segmentation masks, and the closest crack edge point to each skeleton point corresponds to its width point (see Fig. 5) [12].

Finally, the crack voxel model is reconstructed by building a volumetric space and labeling voxels containing any skeleton points as candidate crack voxels. As illustrated in Fig. 5, each voxel of the reconstructed crack model encodes the crack information from multiple images. Specifically, each voxel stores five pieces of information.
1) Image IDs indicating the images containing the skeleton points of the voxel.
2) Crack IDs indicating the crack on the image that the skeleton points of the voxel belong to.
3) Crack segmentation confidence for each crack ID.
4) 3-D coordinates of the skeleton points $(x_s, y_s, z_s)$.
5) 3-D coordinates of the corresponding width points $(x_w, y_w, z_w)$.

Given this information, the next stage infers whether a candidate crack voxel belongs to an actual crack or is a false detection.

C. Data Aggregation

1) Tracking Voxels in Multiple Images: Each voxel can have projections in multiple images. Although for each candidate crack voxel there is at least one image with a detected crack in its corresponding region, the projected regions of the crack voxel on other images may have no crack detected by the segmentation network. By utilizing the shared information among all contributing images, the segmentation outcomes from multiple individual images can be fused to increase the accuracy and robustness of the reconstructed crack voxel model. To realize such data aggregation, a candidate crack voxel first needs to be tracked in multiple visible images. With the estimated camera parameters calculated in the 3-D reconstruction process, the location of a candidate crack voxel in each image can be calculated by projecting the voxel onto the image. Here, to maintain a consistent projected region in each image, the sphere projection model introduced by Yeum et al. [18] is adopted. In each image, the smallest bounding box containing the projected sphere is taken as the projected image region of the voxel.

Due to uncertainty in the crack's location and the camera's view angles, there might be images that fail to show some voxels clearly and should be excluded from the data fusion process. For instance, the candidate crack voxel may be too far away from the optical center of the image (i.e., the crack is not detectable) or it may be occluded by other objects in the scene. To tackle this problem, a visibility checking algorithm is proposed in Algorithm 1 based on reference image selection and a photo-consistency measure. The reference image for a candidate crack voxel is defined as the image with the best crack visibility. By comparing the photo-consistency of the projected region in the reference image with the corresponding regions in other images, their visibility can be evaluated. In this article, the projected image regions are converted to grayscale and resized to a constant size of $n \times n$ ($n$ is set to 30), and the normalized cross-correlation (NCC) coefficient [27] given in (2) is adopted to measure the photo-consistency.

In (2), $I_R(p_x, p_y)$ and $I(p_x, p_y)$ denote the intensity values at pixel location $(p_x, p_y)$ of the projected region in the reference image $I_R$ and the investigated image $I$, respectively, and $\bar{I}_R$ and $\bar{I}$ denote their corresponding mean intensity values. In the proposed algorithm, an NCC coefficient less than a given threshold $\rho$ indicates that the content of the image being investigated is dissimilar to the reference image, and the image should be labeled as invisible.

For each candidate voxel, to initiate the algorithm, a list of potential reference images is generated by searching for the images that satisfy the following constraints: they must include the projected region of the voxel; they must provide crack skeleton points for the voxel; and the angle between the normal of the voxel (taken as the surface normal of the closest triangular mesh face element) and the direction from the voxel to the camera's optical center must be smaller than 90°. The potential reference images in the list are then sorted in descending order according to their crack segmentation confidence scores. Afterward, these images are analyzed sequentially to find a reference image that has more than $\gamma$ visible images ($\gamma$ is set to 3), as shown in Algorithm 1. This constraint guarantees that the reference image itself is not occluded by any obstacles. Here, a preset threshold $\rho = 0.3$ for NCC is selected as the visibility criterion. However, the NCC is not invariant to rotation and may lead to a low similarity score for two images captured from different view angles. To address this issue, the projected region of an image being checked against a potential reference image is sequentially rotated at 30° intervals around its center. After each rotation, the NCC coefficient is calculated to check whether the visibility is larger than the threshold. In the end, if the algorithm successfully finds a reference image, that image and all the visible images are stored and linked to the evaluated candidate crack voxel; otherwise, the voxel is removed from the model.

2) Bayesian Data Fusion: For a set of candidate crack voxels that are successfully tracked in multiple images, a rigorous approach is needed to determine whether they belong to an actual crack or not. Previously, Chen et al. [34] proposed a data fusion approach based on Bayesian inference that provides a systematic framework for decision making. In this article, this promising approach was further improved by including the segmentation confidence of the network, which reflects the reliability of the network in finding a crack object and generating a high-quality mask (see Section II-A). As a general rule, the closer the value of segmentation confidence gets to 1, the more likely it is that the segmentation outcome is correct and accurate. In the proposed data fusion process, the segmentation outcomes are divided into two categories of low confidence (LC) and high confidence (HC) detections using a limiting value $\lambda$. Thus, for a candidate crack voxel $v$ that has $n$ visible images, the outcome $y$ can be presented as the set below:

$$y = \{(i, Y_i) \mid i \in \{1, \ldots, n\},\; Y_i \in \{\mathrm{NO}, \mathrm{LC}, \mathrm{HC}\}\}. \tag{3}$$
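The outcome set in (3) and its use in the fusion step can be sketched as follows. This is a minimal illustration, not the article's implementation: it assumes that NO means no crack was detected in the voxel's projected region, that a confidence at or above $\lambda$ counts as HC, that per-image outcomes are conditionally independent, and that a voxel is kept when the posterior odds exceed 1. None of these details are stated in this excerpt. The numeric values are the ones reported for the field test in Section III-C ($\lambda = 0.6$), reading the second triple of probabilities as conditioned on the voxel not being an actual crack (AC). The function names are hypothetical.

```python
# Values reported for the field test in Section III-C (lambda = 0.6).
P_GIVEN_AC = {"HC": 0.357, "LC": 0.419, "NO": 0.224}      # P(Y | actual crack)
P_GIVEN_NOT_AC = {"HC": 0.020, "LC": 0.042, "NO": 0.938}  # P(Y | not an actual crack)
PRIOR_ODDS = 0.031                                        # P(AC) / P(not AC)
LAMBDA = 0.6                                              # limiting value for LC vs. HC


def categorize(confidence, lam=LAMBDA):
    """Map one visible image's result to NO, LC, or HC.

    Assumption: `confidence` is None when no crack was detected in the
    voxel's projected region, and confidence >= lam counts as HC.
    """
    if confidence is None:
        return "NO"
    return "HC" if confidence >= lam else "LC"


def posterior_odds(outcomes, prior_odds=PRIOR_ODDS):
    """Posterior odds of 'actual crack' given outcomes Y_1..Y_n.

    Assumption: outcomes are conditionally independent, so each image
    contributes one likelihood ratio P(Y_i | AC) / P(Y_i | not AC).
    """
    odds = prior_odds
    for y in outcomes:
        odds *= P_GIVEN_AC[y] / P_GIVEN_NOT_AC[y]
    return odds


def is_actual_crack(confidences, threshold=1.0):
    """Keep the voxel when the posterior odds exceed `threshold` (assumed to be 1)."""
    outcomes = [categorize(c) for c in confidences]
    return posterior_odds(outcomes) > threshold
```

Under these assumptions a single HC detection is not enough to overcome the small prior odds (0.031 × 0.357 / 0.020 ≈ 0.55), whereas two HC detections are, which illustrates how high-confidence segmentations can favorably skew the fused outcome.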
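$$\mathrm{NCC}(I_R, I) = \frac{\sum_{p_x=1}^{n}\sum_{p_y=1}^{n}\left[I_R(p_x, p_y) - \bar{I}_R\right]\cdot\left[I(p_x, p_y) - \bar{I}\right]}{\sqrt{\sum_{p_x=1}^{n}\sum_{p_y=1}^{n}\left[I_R(p_x, p_y) - \bar{I}_R\right]^2 \cdot \sum_{p_x=1}^{n}\sum_{p_y=1}^{n}\left[I(p_x, p_y) - \bar{I}\right]^2}} \tag{2}$$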
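The photo-consistency measure in (2) and the visibility test against the threshold $\rho$ can be sketched in plain Python as below. This is an illustrative sketch only: patches are assumed to be already converted to grayscale and resized to the same $n \times n$ shape, the function names are hypothetical, and the 30° rotation search described in the text is omitted. Returning 0 for a constant-intensity patch (zero denominator) is also an assumption, since that case is not discussed in the article.

```python
from math import sqrt


def ncc(ref_patch, patch):
    """Normalized cross-correlation of two equally sized grayscale patches.

    Each patch is a list of rows of intensity values.
    """
    a = [value for row in ref_patch for value in row]
    b = [value for row in patch for value in row]
    if not a or len(a) != len(b):
        raise ValueError("patches must be non-empty and have the same size")

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)

    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # Assumption: a flat patch carries no texture, so treat it as dissimilar.
    return numerator / denominator if denominator > 0 else 0.0


def is_visible(ref_patch, patch, rho=0.3):
    """An image is labeled invisible when its NCC falls below rho."""
    return ncc(ref_patch, patch) >= rho
```

Because the mean is subtracted and the result is normalized, the score is unaffected by uniform brightness offsets or contrast scaling between the two patches, which is why it is a reasonable choice for images taken under different exposure conditions.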
TABLE I
ACCURACY OF CRACK VOXEL MODEL BEFORE AND AFTER
BAYESIAN DATA FUSION
TABLE II
QUANTIFICATION RESULTS OF CRACK LENGTH AND ORIENTATION
TABLE III
QUANTIFICATION RESULTS OF CRACK WIDTH
Fig. 12. Field test of a concrete wall and its reconstructed crack segment model.
method on average offers a 4.7% lower relative error compared to fixed at 4 mm and 0.8 μm, respectively. Under this experimental
the direct projection method. Particularly, for the narrow cracks setting, the maximum spatial resolution and minimum detectable
at locations number 3 and number 10, the advantage of the crack width can be calculated as 10 pixels/mm and 0.3 mm,
proposed method is more obvious where the relative error is respectively.
decreased by more than 10%. It should be noted that in the Fig. 12(b) shows the as-designed 3-D model and the recon-
direct projection method, all detected cracks are projected onto structed point cloud model of the wall. Similar to most concrete
the triangular surface model. Therefore, false segmentations structures in service, the surface of the wall has dirt, holes, or
can easily generate false crack location and width information other inevitable imperfections, which usually provide enough
in 3-D space, and thus further undermine the accuracy of the distinctive feature points for accurate point cloud reconstruc-
overall crack measurement. Moreover, with the 3-D crack points tion. The final crack segment model is illustrated in Fig. 12(c).
generated as the output of the direct projection method, it would This model is obtained following the implementation details
be difficult to apply image processing techniques to calculate of crack segmentation, crack voxel model reconstruction and
the length and orientation of each crack segment. Bayesian data fusion described in Section III-A. It should be
noted that the conditional probabilities and prior odds ratio
in the Bayesian data fusion process were obtained through a
C. Field Test prior crack inspection test on another wall in the same fac-
A field test was conducted on a concrete wall of a factory tory room. When the limiting parameter λ was set to 0.6,
room, as shown in Fig. 12(a). The wall’s dimensions are 8 m × the terms P (HC|AC), P (LC|AC), P (NO|AC), P (HC|AC),
3 m, and it has several transverse cracks on its surface. Photos P (LC|AC), P (NO|AC), and P (AC)/P (AC) was estimated to
of the wall were taken using the commercial UAV “Mavic 2 be 0.357, 0.419, 0.224, 0.020, 0.042, 0.938, and 0.031, respec-
Enterprise Advanced” equipped with a high-resolution camera. tively. The results in Fig. 12(c) show that major cracks could be
Since the GPS signal for precise indoor navigation is weak, man- recognized successfully. However, a couple of very short cracks
ual control of the UAV was adopted throughout the experiment. were detected as false positives or negatives.
A total of 96 images were taken with a resolution of 8000 × In Table IV, the length, width and orientation of a total of
6000 pixels at a working distance of approximately between 0.5 eight detected crack segments are presented. It can be seen
and 1.5 m. The focal length and pixel pitch of the camera are that the errors in the predicted lengths and orientations for all
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR. Downloaded on May 04,2023 at 11:31:19 UTC from IEEE Xplore. Restrictions apply.
7522 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 18, NO. 11, NOVEMBER 2022
TABLE IV
CRACK QUANTIFICATION RESULTS OF THE FIELD TEST
crack segments are less than 20 mm and 5°, respectively. Also images. Finally, a crack quantification procedure is proposed
in Table IV, the predicted maximum widths for two locations to retrieve the detailed crack properties for each crack segment
of each crack segment are compared with the measured crack given the inherently complex and branching pattern of cracks in
widths at the same locations. In terms of the accuracy of crack 3-D space.
width prediction, the measurement error of most crack locations, The capability and performance of the proposed techniques
including the narrow cracks at locations 4, 10, and 13, can be were validated on cracked concrete specimens using the images
controlled within 0.1 mm. The proposed method could offer acquired from different locations and view angles. Results show
an error of less than 10% for cracks with widths greater than that the proposed Bayesian data fusion approach can signifi-
0.5 mm. This indicates that a promising width measurement cantly improve the accuracy of crack voxel model reconstruction
accuracy could be achieved when the crack width is more and achieve an F1-score of 0.767, which is 33.8% higher than the
than 5 pixels on an image. The above results demonstrate that method without data fusion and 3.4% higher than the standard
the proposed crack quantification framework has a consistent performance for both laboratory and field conditions.
IV. CONCLUSION
In this article, a new crack assessment and 3-D visualization framework is proposed that identifies crack segments and quantifies their various properties using a set of unordered inspection images. The proposed technique begins with an instance segmentation network that generates segmentation masks for the cracks that appeared in all images. The proposed network can output a crack segmentation confidence score for each prediction, which is consistent with the IoU between the predicted and the ground-truth mask. Testing results on the public concrete crack dataset showed that the developed network could achieve the instance segmentation accuracy of AP = 87.3% using the mask-level IoU metric of 0.5, and the semantic segmentation accuracy of IoU = 72.5%. Next, the crack segmentation results of multiple images are aggregated through the 3-D reconstruction and Bayesian data fusion techniques. Specifically, in the proposed data fusion approach, rather than treating crack segmentation outcomes equally, segmentations with higher confidence scores can potentially improve the final result. The above steps result in a voxel-based crack representation model where each voxel encodes crack width information derived from a reference image. Following a proposed image tracking algorithm, for each crack voxel, a reference image is found that has the best quality and most accurate crack segmentation among all visible
Bayesian data fusion approach. It is also illustrated that the obtained crack voxel model can be used to accurately quantify the length and orientation of relatively long crack segments in a complete crack. By quantifying the crack width on reference images, a promising accuracy for crack widths is achieved as well. Compared with the direct projection method where crack width measurements across multiple images are simply averaged, the proposed method could provide a 4.7% lower average error in terms of crack width. In order to further verify the robustness and applicability of the proposed framework, a field test was conducted on a cracked concrete wall with images taken by a commercial UAV. Results show that the calculated crack length and orientation match well with the actual measured values. And for cracks wider than 0.5 mm, the relative crack width error was less than 10%. These results demonstrate the efficacy of the proposed framework to quantify cracks in the field environment.
The results of the proposed crack quantification framework promote its possible adoption in automated inspection systems where often a large volume of unordered and complex visual inspection data is collected and processed. Moreover, compressing the crack information of collected images in the form of voxel models can significantly reduce the need to retain and store a massive amount of inspection images over the lifetime of a piece of infrastructure. In the future, more field tests on different concrete structures, specifically those with very narrow cracks, should be performed to validate and further improve the robustness and practicality of the proposed framework.
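The idea that segmentations with higher confidence scores should weigh more in the fused result can be illustrated with a generic log-odds Bayesian update for a single voxel. This is only a minimal sketch and not the article's formulation: the function name `fuse_crack_probability`, the uniform prior, and the reading of each confidence score as the probability that a per-image label is correct are all assumptions made for illustration.

```python
import math


def fuse_crack_probability(observations, prior=0.5, eps=1e-6):
    """Fuse per-image crack labels for one voxel into a posterior probability.

    observations: iterable of (is_crack, confidence) pairs, one per image in
    which the voxel is visible. `confidence` is ASSUMED to be the probability
    that the image's label for this voxel is correct (hypothetical reading).
    prior: assumed prior probability that the voxel lies on a crack.
    """
    # Work in log-odds so that independent observations simply add up.
    log_odds = math.log(prior / (1.0 - prior))
    for is_crack, confidence in observations:
        # Clamp to avoid log(0) or division by zero for extreme scores.
        c = min(max(confidence, eps), 1.0 - eps)
        # Log-likelihood ratio of a "crack" label with reliability c.
        step = math.log(c / (1.0 - c))
        log_odds += step if is_crack else -step
    # Convert the accumulated log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # One confident "crack" label against one weak "no crack" label.
    print(fuse_crack_probability([(True, 0.9), (False, 0.6)]))
```

Under these assumptions, a label with confidence 0.9 shifts the posterior more than an opposing label with confidence 0.6, whereas two opposing labels of equal confidence cancel out and return the prior.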
ZHANG et al.: CONCRETE CRACK QUANTIFICATION USING VOXEL-BASED RECONSTRUCTION AND BAYESIAN DATA FUSION 7523
7524 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 18, NO. 11, NOVEMBER 2022
Chih-chen Chang received the master's and doctoral degrees in aeronautics and astronautics from Purdue University, West Lafayette, IN, USA, in 1989 and 1993, respectively. He is currently a Professor Emeritus with the Department of Civil and Environmental Engineering, Hong Kong University of Science and Technology, Hong Kong. His research interests include condition monitoring, control and assessment of large-scale structures, and innovative signal and data processing for civil engineering applications.
Xiaojun Liang (Member, IEEE) received the B.Eng. degree in engineering mechanics and aerospace from Tsinghua University, Beijing, China, in 2012, and the Ph.D. degree in mechanical engineering and applied mechanics from the University of Pennsylvania, Philadelphia, PA, USA, in 2017. From 2017 to 2020, he was a Senior R&D Engineer with the intelligent robot division of JD Group. Since 2021, he has been an Assistant Research Fellow with Peng Cheng Laboratory, Shenzhen, China. His research interests include mechanical modeling and analysis, industrial intelligent systems, and industry control optimization based on mechanism and data fusion.
Zhiwen Chen (Member, IEEE) received the B.S. degree in electronic information science and technology and the M.S. degree in electronic information and technology from Central South University, Changsha, China, in 2008 and 2012, respectively, and the Ph.D. degree in electrical engineering and information technology from the University of Duisburg-Essen, Duisburg, Germany, in 2016. He is currently an Associate Professor with Central South University. His research interests include model-based and data-driven fault diagnosis and health monitoring, and data analytics.
Weihua Gui received the B.S. degree in electrical engineering and the M.S. degree in control engineering from Central South University, Changsha, China, in 1976 and 1981, respectively. From 1986 to 1988, he was a Visiting Scholar with the University GH Duisburg, Duisburg, Germany. Since 1991, he has been a Full Professor with Central South University, Changsha, China. Since 2013, he has been an Academician with the Chinese Academy of Engineering, Beijing, China. His current research interests include modeling and optimal control of complex industrial processes, distributed robust control, and fault diagnosis.