Automatic Road Crack Detection Using Random Structured Forests
Yong Shi, Limeng Cui, Zhiquan Qi, Fan Meng and Zhensong Chen

Abstract—The growing threat that cracks pose to road condition has drawn much attention to the construction of intelligent transportation systems. However, as a key part of an intelligent transportation system, automatic road crack detection remains challenging because of the intensity inhomogeneity along cracks, the topological complexity of cracks, the interference of noise with a texture similar to that of cracks, and so on. In this paper, we propose CrackForest, a novel road crack detection framework based on random structured forests, to address these issues. Our contributions are as follows: (1) we apply integral channel features to re-define the tokens that constitute a crack and obtain a better representation of cracks with intensity inhomogeneity; (2) we introduce random structured forests to generate a high-performance crack detector that can identify arbitrarily complex cracks; (3) we propose a new crack descriptor to characterize cracks and discriminate them from noise effectively. In addition, our method is faster and easier to parallelize. Experimental results show that CrackForest achieves state-of-the-art detection precision compared with competing methods.
Index Terms—Road crack detection, structured learning, machine learning, random structured forests, crack descriptor, crack characterization.

[Fig. 1 panels: (a) Original image, (b) Edge detection, (c) Binarization, (d) Crack detection]
Fig. 1. Consider the pavement surface shown in image (a). (b) shows the preliminary detection results after applying random structured forests. Darker color indicates that the pixel is more likely to contain a crack. After eroding and dilating, the result is shown in (c). (d) shows the final result after the classification stage.

Address correspondence to Zhiquan Qi, Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China. E-mail: [email protected].
Yong Shi is with School of Economics and Management, University of Chinese Academy of Sciences and also with Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences.
Limeng Cui is with School of Computer and Control Engineering, University of Chinese Academy of Sciences.
Fan Meng and Zhensong Chen are with School of Economics and Management, University of Chinese Academy of Sciences.

I. INTRODUCTION
A crack is a form of road distress that may reduce road performance and threaten road safety [1]. Governments have made great efforts to build high-quality road networks and are now, more than ever, fully aware of the need for adequate road inspection and maintenance. Crack detection is an essential part of road maintenance systems and has attracted growing attention in recent years. Traditional manual road crack detection approaches are known to be very time-consuming, dangerous, labor-intensive and subjective [2], [3], [4], [5]. Thus, these slow and subjective procedures have gradually been replaced by automatic crack detection, which is developed for fast and reliable crack analysis in intelligent transportation systems [6]. Automated crack detection systems can quantify the quality of road surfaces and assist in prioritizing and planning the maintenance of the road network, thereby helping to preserve roads in good condition and extend their service life.
With the development of image processing techniques, road crack detection and recognition have been widely discussed in the past few decades [7], [8], [9], [10], [11]. In early methods [12], [13], researchers usually use threshold-based approaches to find crack regions, based on the assumption that a real crack pixel is consistently darker than its surroundings. These methods are very sensitive to noise, since only the brightness feature is taken into consideration. Moreover, these approaches operate on individual pixels, and the lack of a global view also makes them unsatisfactory.
Among the current methods [8], [14], [9], [15], [5], [11], most researchers try to suppress the interference of noise by incorporating features such as the gray-level value [11] or the mean and the standard deviation value [5], [16], [6]. In addition, to improve the continuity of the detected cracks, researchers attempt to conduct crack detection from a global view by introducing methods such as Minimal Path Selection (MPS) [17], [18], [19], Minimum Spanning Tree (MST) [20], [21], Crack Fundamental Element (CFE) [22], [23] and so on. These methods can partly eliminate noise and enhance the continuity of detected cracks.
However, these methods do not perform well when dealing with cracks with intensity inhomogeneity or complex topology.
A possible explanation is that the features used capture only rough grey-level information, while some unique characteristics of cracks are not represented and utilized properly. Besides, existing methods ignore local structured information. In fact, cracks in a local image patch are highly interdependent and often contain well-known patterns, such as longitudinal, transverse and diagonal ones. Structured learning has therefore been proposed in recent years to solve similar problems. For example, in [24], researchers apply structured learning to semantic image labeling, where image labels are also interdependent.

[Fig. 2 panels: (a) Most representative token, (b) Mean contour structure]
Fig. 2. Examples of tokens learned from a manually labelled image database. (a) shows the most representative token for each token set. (b) shows the mean contour structure for each token set.

In order to overcome the two shortcomings mentioned above, we propose a novel road crack detection method (called CrackForest) based on random structured forests, which is superior to other state-of-the-art detection techniques such as CrackTree [20], CrackIT [6], FFA [25] and MPS [17], [19]. CrackForest incorporates complementary features from multiple levels to characterize cracks and takes advantage of the structured information in crack patches. Specifically, we first extend the traditional road crack detection feature set by introducing the integral channel features [26] to re-define crack tokens with structured information. After that, we apply random structured forests [27] to exploit such structured information. Random structured forests predict, for each image patch, structured tokens that are aggregated across the image to compute our preliminary crack detection result. In this step, the structured tokens assigned to each image patch are obtained simultaneously. Then, the structured tokens are used to construct the crack descriptor, which consists of two statistical histograms, to characterize cracks with arbitrary topology. With the crack descriptor, a classification method is applied to discriminate cracks from noise. In addition, we propose a quantitative evaluation method for the road crack detection task. Extensive experiments demonstrate the efficiency of CrackForest on a real road crack dataset, and our method shows state-of-the-art precision.

II. RELATED WORK
In this section, we first give a brief review of crack detection; after that, the related crack characterization methods are discussed. Crack characterization exploits the spatial distribution of image tokens composing the detected cracks and thereby transforms the structured tokens into discrete labels.

A. Crack Detection
Numerous papers have been written on road crack detection over the past 30 years. Early works [28], [29], [1], [30] are mainly based on intensity thresholding for its simplicity and efficiency. Most recent work explores crack detection under more challenging conditions and can be divided into five branches: methods based on saliency detection, textured-analysis, wavelet transform, minimal path and machine learning. An assessment of various pavement distress detection methods can be found in [31] and [32].
Saliency Detection: Salient regions are visually more conspicuous due to their contrast with the surroundings. Although existing methods [33], [34] demonstrate their effectiveness in detecting salient regions in the Berkeley database [35], they perform poorly on the completeness and continuity of the detected cracks.
Textured-analysis: Since road surface images are often highly textured, textured-analysis methods [36], [8], [37] are introduced in road crack detection. In order to distinguish the cracks from the background, [36], [8] use the Wigner model, and [37] uses a classification method. These methods use a local binary pattern operator to determine whether each pixel belongs to a crack, and the local neighbor information is not taken into consideration. Therefore, cracks with intensity inhomogeneity cannot be detected precisely.
Wavelet Transform: The wavelet transform is applied to separate distresses from noise [38]. In [4], complex coefficient maps are built by a 2D continuous wavelet transform, and the maximal wavelet coefficient values are then obtained for crack detection. As a result, the differences between crack regions and crack-free regions are accentuated. However, due to the anisotropic characteristic of wavelets, these approaches may not properly handle cracks with low continuity or high curvature.
Minimal Path Selection: Given both endpoints of the curve as user input, minimal path based methods can extract simple open curves in images, an approach first proposed by Kass et al. [39]. In [40], Kaul et al. propose a method that is able to detect the same types of contour-like image structures with less prior knowledge about both the topology and the endpoints of the desired curves. To avoid false detections that take the form of loops, Amhaz et al. [17], [19] propose an improved algorithm that selects endpoints at the local scale and then selects minimal paths at the global scale. It can also detect the width of the crack. In [25], Nguyen et al. propose a method which takes into account intensity and crack form features simultaneously by introducing Free-Form Anisotropy.
Machine Learning: With the increasing size of image data, machine learning based methods [41], [3], [42], [15], [5], [43] have become an important branch in detecting road cracks. In [3], artificial neural network models are used to separate crack pixels from the background by selecting proper thresholds. [41] deals with the detection of poorly contrasted cracks in textured areas using a Markov random field model. In [43], Cord et al. use AdaBoost to distinguish images of road surfaces with defects from defect-free road surfaces based on textural information with patterns. In all these methods, training and classification are conducted on each sub-image; as local methods, they have drawbacks in finding complete crack curves over the whole image.

B. Crack Characterization
Existing methods for crack characterization are mainly based on shape descriptors, crack seeds and assigning a crack type to each image block.
[14] gives a definition of cracks based on mathematical morphology and proposes that a crack be regarded as a succession of saddle points with linear features. But this definition is rather vague. [2], [44] use the direction indices of each pixel and the extensible directions for each direction to characterize cracks. A chromosome representation is applied to encode the different ensembles of directions and their extensible directions. Therefore, a crack can be represented as a long sequence of 0s and 1s.
[42], [31] categorize cracks into five types: longitudinal, transverse, diagonal, block, and alligator. [42] uses a neural network based method to search for patterns of various crack types horizontally and vertically. [31] uses curves and buffers to describe certain regions of a crack. [9] uses longitudinal, transverse, or diagonal crack seeds to identify longitudinal and transverse cracks. Orientation and strength information are taken into consideration by [20], which largely improves the diversity of crack seeds.
In [6], cracks are classified into three types as defined by the Portuguese Distress Catalog. The authors use two block features, the mean and the standard deviation of pixel-normalized intensities, to categorize an image block as longitudinal, transversal or miscellaneous. [5] computes CTA (Conditional Texture Anisotropy) values over the distribution of the mean and the standard deviation values calculated on pixels to distinguish crack pixels from defect-free pixels.
However, these methods have two main drawbacks. On the one hand, new types of crack cannot be generated. By applying the structured tokens, we extend the crack types into thousands of dimensions. On the other hand, these methods perform poorly on cracks with complex topology. To address this issue, we propose a novel crack descriptor to describe cracks with arbitrarily complex topology.

III. AUTOMATIC ROAD CRACK DETECTION
In this section, we introduce our novel crack detection method, which takes advantage of the structured information of cracks. Fig. 3 shows the overall procedure of our proposed method. This framework can be divided into three parts. In the first part, we extend the feature set of traditional crack detection methods by introducing the integral channel features. These features, extracted from multiple levels and orientations, allow us to re-define representative crack tokens with richer structured information. In the second part, random structured forests are introduced to exploit such structured information, and thereby a preliminary result of crack detection can be obtained. In the third part, we propose a new crack descriptor based on the statistical character of tokens. This descriptor can characterize cracks with arbitrary topology. A classification algorithm (KNN, SVM or One-Class SVM) is then applied to discriminate cracks from noise effectively.

Fig. 3. Procedure of the proposed automatic road crack detection method.

A. Structured Tokens
A token (segmentation mask) indicates the crack regions of an image patch. Current block-based methods [38], [6] usually extract small patches and calculate the mean and standard deviation values on these patches to represent an image token. These traditional features are computed on gray-level images and describe the brightness and gradient information. However, local structured information is not taken into consideration. So in the first step, we re-define the tokens by introducing the integral channel features, which incorporate color and gradient information from multiple levels and facets.
1) Learning the Tokens: Assume that we have a set of images $I$ with a corresponding set of binary images $G$ representing the manually labeled crack edges from the sketches. We use a 16 × 16 sliding window to extract image patches $x \in \mathcal{X}$ from the original image. An image patch $x$ that contains a labeled crack edge at its center pixel is regarded as a positive instance, and otherwise as a negative instance. $y \in \mathcal{Y}$ encodes the corresponding local image annotation (crack region or crack-free region), which also indicates the local structured information of the original image. These tokens cover the diversity of various cracks, which are not limited to straight lines, corners, curves, etc.
From Fig. 4, we can see the extracted image patches and their hand-drawn contour tokens. These image patches and tokens will be used to train CrackForest later.
2) Feature Extraction: To describe the above tokens, features are computed on the image patches $x$ extracted from the training images $I$, and are considered to be weak classifiers in the next step.
We use the mean and standard deviation values as features. Two matrices are computed for each original image: the mean matrix Mm with each block's average intensity and the
standard deviation matrix STDm with the corresponding standard deviation value std. Each image patch yields a mean value and a 16 × 16 standard deviation matrix.

Fig. 4. (Top) Example of original image and its ground truth. (Bottom) Example of extracted image patches and their hand-drawn contours. Notice the variety of sketches.

To characterize the cracks more comprehensively, we also apply a set of channel features composed of color, gradient and oriented gradient information. Integral channel features not only perform better than other features, including the histogram of oriented gradients (HOG), but also achieve fast detection and integrate heterogeneous sources of information [26].
Three color, two magnitude and eight orientation channels, for a total of 13 channels, yield 3328 candidate features. Each channel captures a different aspect of information. Self-similarity features are computed for each channel. These features capture the extent to which an image patch contains similar textures based on color or gradient information [45]. Texture information is computed on an $m \times m$ grid over the patch. With $m = 5$, the pairwise differences yield $\binom{5 \cdot 5}{2} = 300$ more features per channel.

B. Structured Learning
In the previous step, we acquired a set of tokens $y$, which indicate the structured information of local patches, and features that describe such tokens. In this step, we cluster these tokens by using a state-of-the-art structured learning framework, random structured forests, to generate an effective crack detector. Random structured forests can exploit the structured information and predict the segmentation mask (token) of a given image patch. Thereby we can obtain the preliminary result of crack detection.
In random structured forests, each decision tree $f_t(x)$ classifies an image patch $x \in \mathcal{X}$ by recursively branching left or right down the tree until a leaf is reached, and the class of that leaf is assigned to patch $x$. The leaf stores the prediction for the input $x$, which is a target label $y \in \mathcal{Y}$ or a distribution over $\mathcal{Y}$. By training such a tree, tokens with the same structure are gathered at one leaf. We use the most representative token in each leaf to represent the token class. The number of token classes equals the number of leaves.
A forest $T$ can be seen as an ensemble of decision trees $f_t$. Each tree $f_t(x)$ gives a prediction for a sample $x \in \mathcal{X}$. The final class prediction of multiple trees is integrated by a majority voting algorithm. A leaf $L(\pi) \in f_t$ assigns a class prediction to the samples that reach it, where $\pi$ stands for the most representative token in the leaf. Each node $N(h, f_t^L, f_t^R) \in f_t$ is associated with a binary split function:

$$h(x, \theta_j) \in \{0, 1\} \tag{1}$$

with feature $\theta_j$ for each node $j$. If $h(x, \theta_j) = 0$, sample $x$ is branched to the left sub-tree $f_t^L$, otherwise to the right sub-tree $f_t^R$.
1) Class prediction: Given a tree $f_t \in T$, the class prediction of an image patch $x \in \mathcal{X}$ can be obtained by recursively branching it forward until a leaf is reached. An intuitive example is shown in Fig. 5. The prediction function $\psi(x \mid f_t) : \mathcal{X} \to \mathcal{Y}$ for node $j$ is:

$$\psi(x \mid N(h, f_t^L, f_t^R)) = \begin{cases} \psi(x \mid f_t^L), & \text{for } h(x, \theta_j) = 0 \\ \psi(x \mid f_t^R), & \text{for } h(x, \theta_j) = 1 \end{cases} \qquad \psi(x \mid L(\pi)) = \pi \tag{2}$$

The final class prediction of $x$ is obtained from the predictions of the individual trees as the one receiving the majority of votes.

Fig. 5. The routing path of an image patch.

2) Randomized training: Each tree is trained individually. For a given node $N_j$ and training set $S_j \subset \mathcal{X} \times \mathcal{Y}$, the goal is to find the optimal feature $\theta_j$ that results in a good split of the data. In other words, the discrepancy of tokens in the same leaf should be as small as possible. We apply information gain to measure this discrepancy and maximize the information gain to choose $\theta_j$. The information gain for node $j$ is defined as follows:

$$I_j = I(S_j, S_j^L, S_j^R) \tag{3}$$

where $S_j = S_j^L \cup S_j^R$, $S_j^L = \{(x, y) \in S_j \mid h(x, \theta_j) = 0\}$ is the set of samples that reach the left sub-tree of the current node and $S_j^R = \{(x, y) \in S_j \mid h(x, \theta_j) = 1\}$ is the set of samples that reach the right sub-tree.
Whether a terminal node should be further split depends on the maximum depth, the minimum size of a node or the entropy of the class distribution. If the node is not split any further, a leaf is grown where the class prediction $\pi$ is set to the most representative token in the training data. Otherwise a node $N(h, f_t^L, f_t^R)$ is grown, where $h$ is a split function regulated by parameter $\theta_j$, maximizing the information gain about the label distribution due to the split $\{S_j^L, S_j^R\}$ of the training data $S$.
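The routing rule of Equations (1) and (2) and the majority vote over trees can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the authors' implementation: the `Leaf` and `Node` classes, the function names, and the choice of a single-feature threshold test as the split function $h$ are all hypothetical, and tokens are represented by integer indices.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    token: int  # pi: index of the most representative token stored in this leaf


@dataclass
class Node:
    feature: int      # assumption: theta_j selects one feature index ...
    threshold: float  # ... and compares it against a threshold
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def split(x: Sequence[float], node: Node) -> int:
    """Binary split h(x, theta_j) in {0, 1} (Equation (1)); threshold test is assumed."""
    return 0 if x[node.feature] < node.threshold else 1


def predict_tree(x: Sequence[float], tree: Tree) -> int:
    """Route x left (h = 0) or right (h = 1) until a leaf is reached (Equation (2))."""
    while isinstance(tree, Node):
        tree = tree.left if split(x, tree) == 0 else tree.right
    return tree.token


def predict_forest(x: Sequence[float], forest: Sequence[Tree]) -> int:
    """Combine the per-tree predictions by majority voting."""
    votes = Counter(predict_tree(x, tree) for tree in forest)
    return votes.most_common(1)[0][0]
```

Calling `predict_forest(x, forest)` with a feature vector `x` and a list of trees returns the token index that most trees agree on; ties are resolved in favor of the token that was voted for first.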
For multi-class classification ($\mathcal{Y} \subset \mathbb{Z}$), the definition of information gain is:

$$I_j = H(S_j) - \sum_{k \in \{L, R\}} \frac{|S_j^k|}{|S_j|} H(S_j^k) \tag{4}$$

where $H(S_j) = -\sum_y p_y \log(p_y)$ denotes the Shannon entropy and $p_y$ stands for the proportion of elements in $S_j$ with label $y$. Alternatively, the Gini impurity $H(S_j) = \sum_y p_y (1 - p_y)$ can also be applied in Equation (4).
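As a worked illustration of Equation (4), the sketch below computes the information gain of a candidate split with either the Shannon entropy or the Gini impurity, and then picks the split with the largest gain. The function names, the threshold-style split and the `feature_pool` argument are assumptions made for this example; labels are assumed to be discrete class indices.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(S) = -sum_y p_y log(p_y) over discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity H(S) = sum_y p_y (1 - p_y)."""
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def information_gain(parent, left, right, impurity=entropy):
    """Equation (4): parent impurity minus the size-weighted child impurities."""
    n = len(parent)
    return impurity(parent) - sum(
        len(child) / n * impurity(child) for child in (left, right) if child
    )


def best_split(samples, labels, feature_pool):
    """Return (gain, feature, threshold) maximizing the gain over a candidate pool.

    Assumption: a split sends x left when x[feature] < threshold.
    """
    best = None
    for f in feature_pool:
        for thr in sorted({x[f] for x in samples}):
            left = [y for x, y in zip(samples, labels) if x[f] < thr]
            right = [y for x, y in zip(samples, labels) if x[f] >= thr]
            if not left or not right:
                continue  # skip degenerate splits that leave one side empty
            gain = information_gain(labels, left, right)
            if best is None or gain > best[0]:
                best = (gain, f, thr)
    return best
```

A split that separates the labels perfectly removes all impurity, so its gain equals the impurity of the parent set, while a split that leaves the label proportions unchanged has zero gain.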
An individual decision tree tends to overfit, which may negatively affect accuracy. To overcome this drawback, random structured forests combine multiple decision trees to assign the final label. Random structured forests have shown promising flexibility and generalization ability and, most importantly, the method is easy to parallelize and extremely fast.
The randomness is embodied by randomly subsampling the data used to train each tree and each node, and randomly subsampling the features used to split each node. In order to maintain the diversity of trees, only a small pool of features is used to select the optimal $\theta_j$ when choosing the split function.
3) Structured mapping: Random structured forests change the discrete output space of traditional decision forests into a structured space $\mathcal{Y}$. Since dealing with structured labels $y \in \mathcal{Y}$ directly may cause significant computing expense, the structured labels $y \in \mathcal{Y}$ at a node are mapped into a set of discrete labels $c \in \mathcal{C}$, where $\mathcal{C} = \{1, \ldots, k\}$. Given the discrete label space $\mathcal{C}$, information gain can be calculated efficiently via Equation (4). We first map the label space $\mathcal{Y}$ into an intermediate space $\mathcal{Z}$:

$$\Pi : \mathcal{Y} \to \mathcal{Z} \tag{5}$$

Define $z = \Pi(y)$ in space $\mathcal{Z}$ as a $\binom{16 \cdot 16}{2} = 32640$ dimensional vector, which encodes every pair of pixels in the segmentation mask $y$. The computational cost of $z$ is significant. Since the dimension of $z$ is very high, we randomly select 256 dimensions of $z$ to train each split function, using a distinct reduced mapping function at each node $j$:

$$\Pi_\phi : \mathcal{Y} \to \mathcal{Z} \tag{6}$$

Then we apply PCA to reduce the 256 dimensions of $z$ to 5 dimensions, with the first dimension being the most significant factor. To obtain the discrete label $c \in \mathcal{C}$ of each structured label $y$, we use the first dimension of each intermediate label $z$ to cluster the labels into two sets. Labels in the same cluster are assigned the same label $c$. With the label $c$, standard information gain can be calculated at each node.
After the random structured forests are trained, the structured labels $y$ are gathered at the leaves of each tree. An image patch is routed through each tree based on the split function until a leaf is reached. The most representative token in the leaf is assigned to the image patch. Fig. 6 shows an intuitive example. We select the token which has the lowest variance with the others as the most representative token.

Fig. 6. Assigning $y$ to each image patch. The image patches have been assigned to the tokens below respectively (both from left to right).

4) Binarization: After the structured mapping, each image patch $x$ is assigned a structured label $y$. Because the patches overlap, the result of detection is a map in which each element indicates the probability that the corresponding position in the original image lies on a crack region. So we use a threshold $\alpha$ to obtain all the possible regions. A high $\alpha$ value may break the continuity of cracks and cause inconspicuous cracks to be missed. Therefore, we choose $0.1 \le \alpha \le 0.2$ in this paper. Fig. 7(a) shows the binarization result when $\alpha = 0.1$.
We apply erosion and dilation operations to the preliminary edge detection results to make the cracks as connected as possible. The inside of the crack is filled and the fragments are connected. Moreover, some of the noise is eliminated. From Fig. 7(b), we can see that small fragments of the detected region have merged together and the continuity of the crack has been improved.

[Fig. 7 panels: (a) Binarization based on threshold, (b) Erosion and Dilation]
Fig. 7. (a) shows the binarization result based on threshold when $\alpha = 0.1$ (removing pixels of low probability according to the given probability map). (b) shows the result after erosion and dilation with a 4-by-4 rectangular structuring element.

C. Crack Type Characterization and Detection
Each image patch is assigned a structured label $y$ (segmentation mask) after structured learning. Although we obtain a preliminary result of crack detection so far, a lot of noise is generated due to the textured background at the same
time. Traditional thresholding methods mark small regions as noise according to their sizes. However, in this way, many inconspicuous cracks may be removed by mistake.
Cracks have a series of unique structural properties that differ from those of noise. Based on this idea, we propose in this section a novel crack descriptor that uses the statistical features of structured tokens. This descriptor consists of two statistical histograms, which can characterize cracks with arbitrary topology. By applying a classification method such as SVM, we can discriminate noise from cracks effectively.
1) Crack Descriptor: Existing crack characterization methods categorize cracks into several types, such as longitudinal, transverse, diagonal, block, and alligator. However, the proposed descriptor, whose two histograms each consist of hundreds of dimensions, greatly broadens the range of representable cracks. What is more, the crack is no longer limited to a few types: we extend the types of crack into thousands of kinds.
We use the 26443 structured tokens obtained in the structured learning procedure to characterize the cracks. The statistical histogram and the neighborhood histogram of these tokens within a crack can be calculated precisely.
Statistical Feature Histogram: After the structured learning procedure, we obtain the token map. Each point in the map indicates the label of the token to which the 16 × 16 image patch around the corresponding position is assigned. The statistical feature histogram in Fig. 8 reflects the composition of the crack comprehensively. Each dimension of this histogram represents the number of occurrences of a certain token.

[Fig. 8 panels: (a) The original image, (b) The detected region, (c) The statistical histogram of the detected region (axes: Token Number vs. Occurrence), (d) The 10 most frequent tokens of the detected region]
Fig. 8. (a) shows the original image. (b) shows one of the detected regions. (c) shows the statistical feature histogram of the detected region. (d) shows what the 10 most frequent tokens look like.

The number of tokens obtained from training is large. After plotting the overall occurrence of each token in Fig. 9(a), we notice a long tail effect in the token distribution. After analyzing the statistics of the tokens that appear, we find that over 90% of all token occurrences are concentrated on 708 specific tokens. The occurrences of most tokens make up only a small percentage of the total. Therefore, we only use these 708 tokens to construct the statistical feature histogram and the statistical neighborhood histogram. Fig. 9(b) shows the occurrence of these tokens.

[Fig. 9 panels: (a) The occurrence of all the tokens in descending order, (b) The occurrence of the 708 most frequent tokens in descending order (axes: Token Number vs. Occurrence (log2))]
Fig. 9. The statistical feature histogram shows the occurrence (in logs) of each token (sorted in descending order of occurrence). (a) shows the statistical feature of all 26443 tokens. (b) only shows the most representative tokens.

Statistical Neighborhood Histogram: The statistical neighborhood histogram captures the neighborhood information of two tokens. We calculate the co-occurrence of each pair of tokens only when they are adjacent. There would be $\binom{708}{2} = 250278$ token pairs without reduction. Furthermore, we also find a long tail effect in this distribution. Over 90% of all token-pair occurrences are concentrated on 956 specific token pairs. Thus, only these token pairs are used in the following section.
2) Crack Detection: With the two histograms for each separated region, we can characterize cracks with arbitrary topology. In this section, we introduce how to discriminate noise from cracks by using the two histograms.
Vectorization: The distributions of occurrence and co-occurrence are scaled to [0, 1]. Hence, each detected region is represented as a long vector with 708 + 956 = 1664 dimensions.
Classification: We consider the crack detection procedure as a classification problem. The crack regions are assigned to class +1 and the crack-free regions are assigned to class −1. By applying KNN (k-Nearest Neighbor), SVM (Support Vector Machine) with a linear kernel and One-Class SVM with a linear kernel, we obtain classification models which can discriminate cracks from noise effectively. The results of our algorithm using SVM are shown in Fig. 10.

IV. EXPERIMENTS
In this section, we analyze the performance of our proposed method. Part of the Matlab code is built on Piotr's Computer Vision Toolbox [46] and the Structured Edge Detection Toolbox [27]. All the experiments are conducted on a desktop with an AMD FX(tm)-4300 Quad-Core Processor and 4G RAM.
In order to evaluate our method, we compare it with the traditional method (Canny [47]) and the state-of-the-art road crack detection methods (CrackTree [20], CrackIT [48], FFA [25] and MPS [17], [19]).
We show results on two datasets measuring accuracy performance. We demonstrate the cross dataset generalization of our
7

approach by testing on each dataset using CrackForest learned on the other.

Unlike other edge detection tasks, crack detection performance is difficult to evaluate. We therefore define two kinds of evaluation indicators for crack detection.

Crack Detection Accuracy: We use precision, recall and F1 score to evaluate the performance of different crack detection algorithms. Precision and recall are computed from the numbers of true positives (TP), false negatives (FN) and false positives (FP):

$$Pr_{pixel} = \frac{TP}{TP + FP} \tag{7}$$

$$Re_{pixel} = \frac{TP}{TP + FN} \tag{8}$$

$$F1_{pixel} = \frac{2 \times Pr_{pixel} \times Re_{pixel}}{Pr_{pixel} + Re_{pixel}} \tag{9}$$

A detected pixel is counted as a true positive if it lies no more than five pixels away from a manually labeled pixel.

The precision, recall and F1 score on detected regions are computed analogously to Equations (7)-(9):

$$Pr_{region} = \frac{TP_r}{TP_r + FP_r} \tag{10}$$

$$Re_{region} = \frac{TP_r}{TP_r + FN_r} \tag{11}$$

$$F1_{region} = \frac{2 \times Pr_{region} \times Re_{region}}{Pr_{region} + Re_{region}} \tag{12}$$

Crack Continuity Assessment: We define the "Continuity Index (CI)" as a degree of continuity. It measures how well the detected regions are connected when they belong to the same crack. Denote by $M$ the number of images in the testing set, by $N_i$ the number of ground truth cracks in the $i$th image, and by $n_{ij}$ the number of true positive regions that cover the $j$th ground truth crack in the $i$th image.

$$CI = \frac{1}{M} \sum_{i=1}^{M} \left( \frac{1}{N_i} \sum_{j=1}^{N_i} \frac{1}{n_{ij}} \right) \tag{13}$$

The continuity is better as CI gets closer to 1.

A. CFD Results

We propose an annotated road crack dataset called CFD. This dataset is composed of 118 images, which can generally reflect urban road surface condition in Beijing, China. Each image has hand labeled ground truth contours. All the images are taken by an iPhone5 with a focal length of 4 mm, an aperture of f/2.4 and an exposure time of 1/134 s. The width of the images ranges from 1 to 3 mm. From Fig. 10, we can see that these images contain noises such as shadows, oil spots and water stains.

We use a 60%/40% training/testing split with the images reduced to 480 × 320. Example detections on CFD are shown in Fig. 10. The first column lists the original images. The corresponding manually labeled cracks are shown in the second column as ground truth. The third column shows the preliminary detection results after applying random structured forests. Darker color indicates that the pixel is more likely to contain a crack. After the binarization, crack pixels with less confidence are removed. The crack descriptor allows us to transform each detected region into a vector. By applying a classification method such as SVM, we can eliminate the noise regions and keep the crack regions effectively. The final detection results are shown in column 5. Our method is robust to noise.

Five methods are compared on this dataset: Canny, CrackIT, CrackTree, FFA and CrackForest. Results are shown in Fig. 11 and summary statistics are in Table I. As can be observed, our method outperforms the alternatives. The traditional edge detection method Canny is not suitable for road crack detection due to its high sensitivity. CrackIT does not perform well on low-resolution and low-contrast images. As a result, it fails to detect most of the crack pixels in the images. The accuracy of CrackTree is acceptable, but it may hallucinate a crack that does not exist. In addition, the width of the crack cannot be observed. As for FFA, it may falsely detect landmarks as defects.

Our method CrackForest performs better than the alternatives. To be specific, CrackForest (SVM) gives both good precision and recall.

B. AigleRN Results

AigleRN dataset [49] contains 38 images with ground truth. We use 60% for training and the rest for testing.

We compare four methods on this dataset: CrackIT, FFA, MPS and CrackForest. Example AigleRN results are shown in Fig. 12 and Table II. Although CrackIT can detect most of the cracks, a lot of noise remains. Besides, the continuity of the detected cracks is not very good. As for FFA, the precision is acceptable, but when it comes to detecting cracks with complex topology, FFA is less competitive. MPS performs well on detecting light cracks, but it may hallucinate a crack that does not exist. CrackForest shows promising results on most of the indicators. To be specific, CrackForest (SVM) still gives both better precision and recall.

C. Cross Dataset Generalization

To study the ability of our approach to generalize across datasets, we ran a final set of experiments. In Table III, we show results on AigleRN using CrackForest trained on CFD and also results on CFD using CrackForest trained on AigleRN. Note that images in the CFD and AigleRN datasets are qualitatively quite different, see Fig. 11 and Fig. 12, respectively.

In Table III, top, results on AigleRN of the AigleRN and CFD trained models are compared. The precision and recall
Fig. 10. Part of the results of road crack detection using our proposed method. Notice that our method can eliminate the influence of oil stains, shadows and
complex background effectively, and can cope with miscellaneous crack topology.

TABLE I
CRACK DETECTION RESULTS EVALUATION ON CFD

Method                        Pr_pixel  Re_pixel  F1_pixel  Pr_region  Re_region  F1_region  CI
Canny                         12.23%    22.15%    15.76%    0.05%      0.22%      0.08%      0.004
CrackTree                     73.22%    76.45%    70.80%    84.35%     85.24%     84.79%     0.22
CrackIT                       67.23%    76.69%    71.64%    93.43%     91.22%     92.31%     0.32
FFA                           78.56%    68.43%    73.15%    91.55%     85.58%     88.46%     0.58
CrackForest (KNN)             80.77%    78.15%    79.44%    90.88%     93.72%     92.28%     0.62
CrackForest (SVM)             82.28%    89.44%    85.71%    95.75%     95.62%     95.68%     0.67
CrackForest (One-Class SVM)   81.25%    86.45%    83.77%    96.73%     92.53%     94.58%     0.65

do not fluctuate much using two datasets for training. Results on CFD of the CFD and AigleRN models, shown in Table III, bottom, are likewise similar.

The experimental results show that CrackForest could serve as a general purpose crack detector without the necessity of retraining.
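
As a rough illustration of the pixel-level indicators in Equations (7)-(9), the following sketch scores a binary detection mask against a binary ground truth mask with the five-pixel tolerance. It is not the authors' evaluation code: the function names `f1_score` and `pixel_scores` are hypothetical, Euclidean distance is an assumed choice of metric, and counting a false negative as a labeled pixel with no detection within the tolerance is an assumed reading of FN.

```python
import numpy as np


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, as in Equation (9)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pixel_scores(pred, truth, tol=5.0):
    """Pixel-level precision, recall and F1 with a distance tolerance.

    pred, truth: 2-D boolean arrays, True marks a crack pixel.
    tol: a detected pixel is a true positive when it lies within `tol`
         pixels (Euclidean distance, an assumption) of a labeled pixel.
    """
    pred_xy = np.argwhere(pred).astype(float)
    truth_xy = np.argwhere(truth).astype(float)
    if len(pred_xy) == 0 or len(truth_xy) == 0:
        return 0.0, 0.0, 0.0

    # Pairwise distances between detected and labeled pixels.
    # Brute force: fine for small masks, too slow for full-size images.
    diff = pred_xy[:, None, :] - truth_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    tp = int((dist.min(axis=1) <= tol).sum())  # detections near a label
    fp = len(pred_xy) - tp                     # detections far from every label
    fn = int((dist.min(axis=0) > tol).sum())   # labels with no detection nearby (assumed)

    precision = tp / (tp + fp)                 # Equation (7)
    recall = tp / (tp + fn)                    # Equation (8)
    return precision, recall, f1_score(precision, recall)
```

With the precision and recall reported for CrackForest (SVM) in Table I, `f1_score(0.8228, 0.8944)` evaluates to about 0.8571, which agrees with the 85.71% listed there.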
Fig. 11. Results of different algorithms on CFD (From top to bottom: original image, ground truth, Canny, CrackIT, CrackTree, FFA and CrackForest)

V. CONCLUSION

In this paper, we propose an effective and fast automatic road crack detection method, which can suppress noises efficiently by learning the inherent structured information of cracks. Our detection framework builds upon representative and discriminative integral channel features and combines this representation with random structured forests. This also allows us to train our framework in a completely supervised manner from a small training set. More importantly, we can characterize cracks and eliminate noises marked as cracks by using the two proposed feature histograms.

Our contributions are as follows: Firstly, to capture the inherent structure of the road crack, we apply integral channel features to enrich the feature set of traditional crack detection. Secondly, the introduction of random decision forests makes it possible to exploit such structured information and predict
Fig. 12. Results of different algorithms on AigleRN (From top to bottom: original image, ground truth, CrackIT, FFA, MPS and CrackForest)

TABLE II
CRACK DETECTION RESULTS EVALUATION ON AIGLERN

Method                        Pr_pixel  Re_pixel  F1_pixel  Pr_region  Re_region  F1_region  CI
CrackIT                       76.85%    74.32%    76.56%    86.52%     76.47%     81.19%     0.35
FFA                           73.22%    87.52%    79.73%    84.35%     85.24%     84.79%     0.67
MPS                           86.66%    90.06%    88.33%    87.52%     90.23%     88.85%     0.77
CrackForest (KNN)             89.47%    82.83%    86.02%    87.45%     92.34%     89.83%     0.76
CrackForest (SVM)             90.28%    86.58%    88.39%    90.32%     86.32%     88.27%     0.87
CrackForest (One-Class SVM)   85.09%    83.67%    84.37%    89.73%     88.29%     89.00%     0.85

TABLE III
CROSS-DATASET GENERALIZATION TEST FOR CRACKFOREST. TRAIN/TEST INDICATES THE TRAINING/TESTING DATASET USED.

TRAIN/TEST          Pr_pixel  Re_pixel  F1_pixel  Pr_region  Re_region  F1_region  CI
AigleRN / AigleRN   90.28%    86.58%    88.39%    90.32%     86.32%     88.27%     0.87
CFD / AigleRN       87.36%    85.02%    86.17%    87.43%     85.52%     86.46%     0.79
CFD / CFD           82.28%    89.44%    85.71%    95.75%     95.62%     95.68%     0.67
AigleRN / CFD       81.27%    87.43%    84.24%    92.37%     94.33%     93.34%     0.65

local segmentation masks of the given image patch. Thirdly, a crack descriptor, which consists of two statistical histograms, is proposed to characterize the structured information of cracks and discriminate cracks from noises. In addition, we also propose an annotated road crack image dataset which can generally reflect the urban road surface condition in China and two indicators to evaluate the performance of crack detection methods.

Experimental results prove the effectiveness of our method in suppressing noises compared to several competing methods. Our approach yields promising processing speed and state-of-the-art accuracy.

Source code is available online: https://github.com/cuilimeng/CrackForest. Our annotated road crack image dataset CFD is also available online: https://github.com/cuilimeng/CrackForest-dataset.
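
The second of the two indicators, the Continuity Index of Equation (13), can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `continuity_index` and its input layout are hypothetical, and the handling of a missed crack ($n_{ij} = 0$) or an image without ground truth cracks, both of which Equation (13) leaves undefined, is an assumed convention in which they contribute 0.

```python
def continuity_index(region_counts):
    """Continuity Index (CI) following Equation (13).

    region_counts: one list per test image; entry j of list i is n_ij, the
    number of true positive regions covering the j-th ground truth crack
    of the i-th image.
    """
    if not region_counts:
        return 0.0

    total = 0.0
    for counts in region_counts:
        if not counts:
            # Assumption: an image with no ground truth cracks contributes 0.
            continue
        # Assumption: a missed crack (n_ij == 0) contributes 0, because
        # 1 / n_ij is undefined in that case.
        per_crack = [1.0 / n if n > 0 else 0.0 for n in counts]
        total += sum(per_crack) / len(counts)  # inner average over N_i cracks

    return total / len(region_counts)          # outer average over M images
```

For example, `continuity_index([[1, 2], [4]])` averages (1/1 + 1/2)/2 = 0.75 for the first image and 1/4 = 0.25 for the second, giving 0.5. A detector that covers every crack with exactly one region reaches the ideal value of 1.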
VI. LIMITATIONS AND FUTURE WORK

In our experiments, CrackForest has proven to be quite promising. However, it does have some limitations:

• Our method has only been applied to static images so far. Video streaming is not taken into consideration. In the future, we will test our method on video datasets.
• The width of the crack is not measured in our method. We will focus on the severity level assessment in future work.

ACKNOWLEDGMENTS

We thank Professor Qin Zou and Professor Manuel Avila for their kind help, and Professor Sylvie Chambon and Dr. Rabih Amhaz for their valuable discussions.
This work is supported by National Natural Science Foundation of China (Grant No. 91546201, 71331005, 71110107026, 61402429).

REFERENCES

[1] H. Oliveira and P. L. Correia, “Automatic road crack segmentation using entropy and image dynamic thresholding,” in European Signal Processing Conference (EUSIPCO), vol. 17, no. 24-28, 2009, pp. 622–626.
[2] H. Cheng, J.-R. Chen, C. Glazier, and Y. Hu, “Novel approach to pavement cracking detection based on fuzzy set theory,” Journal of Computing in Civil Engineering, vol. 13, no. 4, pp. 270–280, 1999.
[3] H. Cheng, J. Wang, Y. Hu, C. Glazier, X. Shi, and X. Chen, “Novel approach to pavement cracking detection based on neural network,” Transportation Research Record: Journal of the Transportation Research Board, vol. 1764, no. 1, pp. 119–127, 2001.
[4] P. Subirats, J. Dumoulin, V. Legeay, and D. Barba, “Automation of pavement surface crack detection using the continuous wavelet transform,” in Image Processing, 2006 IEEE International Conference on. IEEE, 2006, pp. 3037–3040.
[5] T. S. Nguyen, M. Avila, and S. Begot, “Automatic detection and classification of defect on road pavement using anisotropy measure,” in European Signal Processing Conference, 2009, pp. 617–621.
[6] H. Oliveira and P. L. Correia, “Automatic road crack detection and characterization,” Intelligent Transportation Systems, IEEE Transactions on, vol. 14, no. 1, pp. 155–168, 2013.
[7] H. Oh, N. W. Garrick, and L. E. Achenie, “Segmentation algorithm using iterative clipping for processing noisy pavement images,” in Imaging Technologies: Techniques and Applications in Civil Engineering. Second International Conference, 1998.
[8] M. Petrou, J. Kittler, and K. Song, “Automatic surface crack detection on textured materials,” Journal of materials processing technology, vol. 56, no. 1, pp. 158–167, 1996.
[9] Y. Huang and B. Xu, “Automatic inspection of pavement cracking distress,” Journal of Electronic Imaging, vol. 15, no. 1, pp. 013 017–013 017, 2006.
[10] S. Cafiso, A. Di Graziano, and S. Battiato, “Evaluation of pavement surface distress using digital image collection and analysis,” in Seventh International Congress on Advances in Civil Engineering. Citeseer, 2006.
[11] M. Gavilán, D. Balcones, O. Marcos, D. F. Llorca, M. A. Sotelo, I. Parra, M. Ocaña, P. Aliseda, P. Yarza, and A. Amírola, “Adaptive road crack detection system by pavement classification,” Sensors, vol. 11, no. 10, pp. 9628–9657, 2011.
[12] M. S. Kaseko and S. G. Ritchie, “A neural network-based methodology for pavement crack detection and classification,” Transportation Research Part C: Emerging Technologies, vol. 1, no. 4, pp. 275–291, 1993.
[13] Q. Li and X. Liu, “Novel approach to pavement image segmentation based on neighboring difference histogram method,” in Image and Signal Processing, 2008. CISP’08. Congress on, vol. 2. IEEE, 2008, pp. 792–796.
[14] N. Tanaka and K. Uematsu, “A crack detection method in road surface images using morphology.” MVA, vol. 98, pp. 17–19, 1998.
[15] H. Oliveira and P. L. Correia, “Supervised strategies for crack detection in images of road pavement flexible surfaces,” in Proc. European Signal Processing Conf. (EUSIPCO’08), 2008, pp. 25–29.
[16] C. Koch and I. Brilakis, “Pothole detection in asphalt pavement images,” Advanced Engineering Informatics, vol. 25, no. 3, pp. 507–515, 2011.
[17] R. Amhaz, S. Chambon, J. Idier, and V. Baltazart, “A new minimal path selection algorithm for automatic crack detection on pavement images,” in Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014, pp. 788–792.
[18] M. Avila, S. Begot, F. Duculty, and T. S. Nguyen, “2d image based road pavement crack detection by calculating minimal paths and dynamic programming,” in Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014, pp. 783–787.
[19] R. Amhaz, S. Chambon, J. Idier, and V. Baltazart, “Automatic crack detection on 2d pavement images: An algorithm based on minimal path selection,” Intelligent Transportation Systems, IEEE Transactions on, 2016 (To appear).
[20] Q. Zou, Y. Cao, Q. Li, Q. Mao, and S. Wang, “Cracktree: Automatic crack detection from pavement images,” Pattern Recognition Letters, vol. 33, no. 3, pp. 227–238, 2012.
[21] K. Fernandes and L. Ciobanu, “Pavement pathologies classification using graph-based features,” in Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014, pp. 793–797.
[22] Y.-C. J. Tsai, C. Jiang, and Y. Huang, “Multiscale crack fundamental element model for real-world pavement crack classification,” Journal of Computing in Civil Engineering, 2012.
[23] Y. J. Tsai, C. Jiang, and Z. Wang, “Implementation of automatic crack evaluation using crack fundamental element,” in Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014, pp. 773–777.
[24] P. Kontschieder, S. R. Bulo, H. Bischof, and M. Pelillo, “Structured class-labels in random forests for semantic image labelling,” in Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011, pp. 2190–2197.
[25] T. S. Nguyen, S. Begot, F. Duculty, and M. Avila, “Free-form anisotropy: A new method for crack detection on pavement surface images,” in Image Processing (ICIP), 2011 18th IEEE International Conference on. IEEE, 2011, pp. 1069–1072.
[26] P. Dollár, Z. Tu, P. Perona, and S. Belongie, “Integral channel features,” in BMVC, vol. 2, no. 3, 2009, p. 5.
[27] P. Dollár and C. L. Zitnick, “Structured forests for fast edge detection,” in Computer Vision (ICCV), 2013 IEEE International Conference on. IEEE, 2013, pp. 1841–1848.
[28] H.-D. Cheng and M. Miyojim, “Automatic pavement distress detection system,” Information Sciences, vol. 108, no. 1, pp. 219–240, 1998.
[29] A. Ayenu-Prah and N. Attoh-Okine, “Evaluating pavement cracks with bidimensional empirical mode decomposition,” EURASIP Journal on Advances in Signal Processing, vol. 2008, no. 1, p. 861701, 2008.
[30] H. Zhao, G. Qin, and X. Wang, “Improvement of canny algorithm based on pavement edge detection,” in Image and Signal Processing (CISP), 2010 3rd International Congress on, vol. 2. IEEE, 2010, pp. 964–967.
[31] Y.-C. Tsai, V. Kaul, and R. M. Mersereau, “Critical assessment of pavement distress segmentation methods,” Journal of transportation engineering, vol. 136, no. 1, pp. 11–19, 2009.
[32] S. Chambon and J.-M. Moliard, “Automatic road pavement assessment with image processing: Review and comparison,” International Journal of Geophysics, vol. 2011, 2011.
[33] R. Achanta, F. Estrada, P. Wils, and S. Süsstrunk, “Salient region detection and segmentation,” in Computer Vision Systems. Springer, 2008, pp. 66–75.
[34] R. Achanta, S. Hemami, F. Estrada, and S. Susstrunk, “Frequency-tuned salient region detection,” in Computer vision and pattern recognition, 2009. cvpr 2009. ieee conference on. IEEE, 2009, pp. 1597–1604.
[35] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour detection and hierarchical image segmentation,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 33, no. 5, pp. 898–916, 2011.
[36] K. Y. Song, M. Petrou, and J. Kittler, “Texture crack detection,” Machine Vision and Applications, vol. 8, no. 1, pp. 63–75, 1995.
[37] Y. Hu and C.-x. Zhao, “A local binary pattern based methods for pavement crack detection,” Journal of pattern Recognition research, vol. 1, no. 20103, pp. 140–147, 2010.
[38] J. Zhou, P. S. Huang, and F.-P. Chiang, “Wavelet-based pavement distress detection and evaluation,” Optical Engineering, vol. 45, no. 2, pp. 027 007–027 007, 2006.
[39] M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active contour models,” International journal of computer vision, vol. 1, no. 4, pp. 321–331, 1988.
[40] V. Kaul, A. Yezzi, and Y. Tsai, “Detecting curves with unknown endpoints and arbitrary topology using minimal paths,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 34, no. 10, pp. 1952–1965, 2012.
[41] P. Delagnes and D. Barba, “A markov random field for rectilinear structure extraction in pavement distress image analysis,” in Image Processing, 1995. Proceedings., International Conference on, vol. 1. IEEE, 1995, pp. 446–449.
[42] B. J. Lee, H. Lee et al., “Position-invariant neural network for digital pavement crack analysis,” Computer-Aided Civil and Infrastructure Engineering, vol. 19, no. 2, pp. 105–118, 2004.
[43] A. Cord and S. Chambon, “Automatic road defect detection by textural pattern recognition based on adaboost,” Computer-Aided Civil and Infrastructure Engineering, vol. 27, no. 4, pp. 244–259, 2012.
[44] L. Ying and E. Salari, “Beamlet transform-based technique for pavement crack detection and classification,” Computer-Aided Civil and Infrastructure Engineering, vol. 25, no. 8, pp. 572–580, 2010.
[45] J. J. Lim, C. L. Zitnick, and P. Dollár, “Sketch tokens: A learned mid-level representation for contour and object detection,” in Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on. IEEE, 2013, pp. 3158–3165.
[46] P. Dollár, “Piotr’s Computer Vision Matlab Toolbox (PMT),” http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html.
[47] J. Canny, “A computational approach to edge detection,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, no. 6, pp. 679–698, 1986.
[48] H. Oliveira and P. L. Correia, “Crackit—an image processing toolbox for crack detection and characterization,” in Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014, pp. 798–802.
[49] S. Chambon, “AigleRN,” http://www.irit.fr/~Sylvie.Chambon/Crack_Detection_Database.html.

Yong Shi received the Ph.D. degree in management science and computer system from University of Kansas, USA.
He is currently a Professor with Chinese Academy of Sciences, Beijing, China, where he serves as the Director of Research Center on Fictitious Economy and Data Science. He is also a Professor and Distinguished Chair of Information Technology, College of Information Science and Technology, University of Nebraska at Omaha, USA. His research interests are data mining, information overload, optimal system designs, multiple criteria decision making, decision support systems, and information and telecommunications management.
Dr. Shi is the Editor-in-Chief of International Journal of Information Technology and Decision Making and Annals of Data Science.

Limeng Cui is currently a Ph.D. candidate at University of Chinese Academy of Sciences, Beijing, China, with a focus on computer vision and machine learning. Her current work lies in image processing, object detection and scene recognition.

Zhiquan Qi is currently an Assistant Professor with the Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing, China. His current research interests include object detection, object tracking, change detection, and machine learning.

Fan Meng is currently a Ph.D. candidate at University of Chinese Academy of Sciences, Beijing, China. His current research interests include edge detection, image segmentation, road crack detection and proportion learning.

Zhensong Chen is currently a Ph.D. candidate at University of Chinese Academy of Sciences, Beijing, China. His current research is focused on image segmentation and proportion learning.
