Computers and Geosciences 126 (2019) 142–150

Segmentation of digital rock images using deep convolutional autoencoder networks

Sadegh Karimpouli (a), Pejman Tahmasebi (b,*)
(a) Mining Engineering Group, Faculty of Engineering, University of Zanjan, Zanjan, Iran
(b) Department of Petroleum Engineering, University of Wyoming, Laramie, WY, 82071, USA

ARTICLE INFO

Keywords: Segmentation; Digital rock physics (DRP); Mineral identification; Artificial intelligence

ABSTRACT

Segmentation is a critical step in Digital Rock Physics (DRP) because the original images are available only in gray-scale format. Conventional methods often use thresholding to delineate distinct phases and then a watershed algorithm to identify the existing phases. Such methods rely on color contrast, which makes it difficult to differentiate automatically between phases with similar colors and intensities. Recently, deep learning and machine learning research has produced several algorithms that work with images, including Convolutional Neural Networks (CNN). Among them, convolutional autoencoder networks have produced accurate results in different applications when a variety of images is available for training. In this paper, therefore, a convolutional autoencoder algorithm is implemented to enhance the segmentation of digital rock images. The bottleneck for applying CNN algorithms in DRP, however, is the limited number of available rock images. A cross-correlation based simulation was used in this study as an effective data augmentation method to enlarge the dataset. Using the originally available dataset, namely 20 images from Berea sandstone, a training seed comprising manually and semi-manually segmented images was prepared. The dataset was then divided into training, validation and testing groups with fractions of 80, 10 and 10%, respectively. Next, the resulting dataset was given to our stochastic image generator algorithm, and 20000 realizations were produced simultaneously with their segmented images. The implemented CNN algorithm was tested in two versions, with basic and extended architectures. The results show that the extended network reaches a categorical accuracy of 96% on the designated images in the testing group. Finally, a qualitative comparison with conventional multiphase segmentation (multi-thresholding) revealed that our results are more accurate and reliable even if very few rock images are available.

Dr. Sadegh Karimpouli and Dr. Pejman Tahmasebi together conceived the problem, developed the method, performed the computations and contributed to analyzing the results and writing the paper.
* Corresponding author.
E-mail addresses: [email protected] (S. Karimpouli), [email protected] (P. Tahmasebi).
https://doi.org/10.1016/j.cageo.2019.02.003
Received 19 April 2018; Received in revised form 12 October 2018; Accepted 5 February 2019
Available online 06 February 2019
0098-3004/ © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Rock physics provides the relationships between the physical properties of the porous structure of rock and remotely sensed geophysical measurements. Recently, the emergence of high-resolution micro-computed tomography (μCT) images of rock samples has led to a remarkable development in Digital Rock Physics (DRP). In the standard workflow of DRP, segmentation of pores and minerals into separate phases is a vital step. The segmentation methods developed for DRP are extensively reviewed in previous publications (Iassonov et al., 2009; Sezgin, 2004). Among them, the procedures introduced by Visual Science Group (VSG), Stanford University (SU) and Kongju (KJ) are the most effective available frameworks (Andrä et al., 2013). These methods are summarized in the following steps: 1. Detecting boundary pixels using the gradient magnitude of the image and excluding them, 2. Thresholding the remaining pixels as pore and mineral phases, and 3. Expanding all phases to the boundary pixels by a marker-based watershed algorithm (Beucher and Meyer, 1992). Marker detection in minerals with complicated patterns is not straightforward, and there may be no general method to achieve it (Beucher and Meyer, 1992). Although most segmentation methods are based on image processing algorithms that run in an automatic framework, manual control is essential in each step. For instance, thresholding may fail when no color contrast is observed between two separate phases. In other words, there may be two different minerals in the rock image with similar colors. This issue arises from either their close densities or the limited detection power of the imaging instrument. For the sake of simplicity, researchers often consider the rock images as a two-phase
(porosity and mineral) sample (Karimpouli et al., 2018; Fattahi and Karimpouli, 2016; Karimpouli and Fattahi, 2016). This simplification is, however, not applicable in general and strongly affects the subsequent computations of rock physical parameters, particularly P- and S-wave velocities (Andrä et al., 2013).

Artificial Neural Networks (ANN) are a class of machine learning algorithms inspired by the human brain. ANNs learn to perform tasks such as classification or prediction when they are trained on examples. Recent developments in ANN and, in particular, deep learning have offered new possibilities to tackle very complex problems in image analysis. One such algorithm is the Convolutional Neural Network (CNN) (Krizhevsky et al., 2012; Lecun et al., 1998), which uses convolution and pooling functions to extract new features for analyzing visual imagery. These networks, which are considered one of the deep learning derivatives, have shown significant improvements in accuracy and effectiveness compared to conventional networks (Garcia-Garcia et al., 2017). CNN have been used for many applications, including face detection (Li et al., 2015), semantic segmentation (Garcia-Garcia et al., 2017), video analysis in autonomous driving (Badrinarayanan et al., 2015), speech recognition (Y. Zhang et al., 2017b) and medical image analysis and applications (Havaei et al., 2017; Litjens et al., 2017; Wallach et al., 2015). In geosciences, and particularly in rock physics, deep learning methods and CNN are used in different applications such as lithology detection using borehole imaging (P. Y. Zhang et al., 2017a), rock type classification (Cheng and Guo, 2017; Ferreira and Giraldi, 2017), permeability prediction (Srisutthiyakorn, 2016) and reconstruction of rock porous media (Laloy et al., 2017; Mosser et al., 2017). In this work, we aim to use this powerful method for rock image segmentation. Thus, deep learning segmentation methods are first reviewed briefly.

As mentioned earlier, CNN can be used for pixel labeling problems, that is, segmentation. The most significant advantage that allows CNN to surpass the conventional methods is the ability to learn from patterns, features, and textures rather than relying only on color variation. According to a recent review by Garcia-Garcia et al. (2017), the most successful state-of-the-art deep learning based segmentation methods are fully convolutional networks (Shelhamer et al., 2017). In fully convolutional networks, the fully connected layers are replaced by convolutional layers, which produce spatial maps instead of classification scores. Such maps are then deconvolved in an up-sampling procedure to obtain a per-pixel labeled image (i.e. segmented image). SegNet (Badrinarayanan et al., 2015), an encoder-decoder convolutional network, is one of the best currently available networks of this kind; it has been used in different applications and has demonstrated promising results (Garcia-Garcia et al., 2017; Kendall et al., 2015; Nanfack et al., 2017).

One of the major issues in using deep learning methods for DRP is the limited dataset that is available for the training step. Acquiring μCT images is both expensive and time-consuming. Moreover, due to either the lengthy procedure of sample preparation or limited core samples, preparing hundreds (or thousands) of polished thin-section samples for microscope imaging may not be feasible in subsurface applications. Therefore, data augmentation methods should be applied to enlarge the existing dataset. Using such a large dataset leads to more effective training and thus helps avoid overfitting through either faster convergence or better regularization. The common methods for data augmentation are the available image transformation operators, such as rotation, translation, scaling, crops, etc. For example, Mosser et al. (2017) cropped several small images from a large image and used an overlap of 12–64% between the small images to enlarge the dataset. In this paper, however, a more efficient reconstruction method, namely Hybrid pattern- and pixel-based Simulation (HYPPS) introduced by Tahmasebi (2017), is used for data augmentation. HYPPS uses an input image and generates any number of realizations with different structures and any sizes, but similar statistics. HYPPS is a powerful tool which has been used in several applications to reconstruct new scenarios of heterogeneous media such as sandstone and carbonate samples (Karimpouli and Tahmasebi, 2016), coal samples (Karimpouli et al., 2017), small-scale porous media modeling (Tahmasebi, 2018a; Tahmasebi, 2018c; Tahmasebi and Kamrava, 2018) and unconventional plays (Tahmasebi et al., 2017; Tahmasebi et al., 2018; Tahmasebi, 2018b).

In this paper, first, we tackle the problem of limited data in DRP with the HYPPS algorithm as an effective augmentation approach. Then, SegNet is used to overcome the difficulties and drawbacks of the conventional methods for segmentation of digital rock images. This workflow can be considered an opening for deep learning techniques in the vast world of DRP. In the following sections, we introduce the CNN, SegNet and HYPPS algorithms in Section 2. In Section 3, the utilized Berea sandstone is described as a benchmark dataset in DRP studies. Then, we produce a limited number of ground-truth (or segmented) images semi-manually and increase their number using the HYPPS method. Next, the two versions of SegNet are used for segmentation and their respective segmented results are compared. Finally, the results are discussed in Section 4.

2. Basic concepts

The two main algorithms used in this paper are SegNet for image-based segmentation and the HYPPS method for simulation and data augmentation. In this part, a brief description of these algorithms as well as CNN, the core of the image-based neural network, is provided.

2.1. Convolutional Neural Networks

Convolutional Neural Networks are part of a large group of deep learning methods. They attracted attention due to their strong abilities in image classification/recognition (He et al., 2015). CNN are trained through their convolutional layers to recognize various patterns in the input images. Small kernels are the pillars of the convolutional layers; they effectively extract high-level characteristics of the input image. Convolutional layers are followed by a fully connected neural network, which is used to translate the features obtained from the previous layers to the given output phases. The basic layers in a common CNN are as follows:

1. Input layer: Images are considered as input data, which are introduced to the CNN in this layer.
2. Convolutional layer: In this layer, input images or feature maps from the previous layer are convolved with some small filters (or kernels) to generate new feature maps. These convolutions are performed with a shift of 'n' pixels, which is called the stride (Krizhevsky et al., 2012). In fact, the stride controls how the filter convolves around the input image.
3. ReLU (Rectified Linear Units) layer: The purpose of this layer is to introduce nonlinearity to a system that has so far computed only linear operations (multiplications and summations) in the convolutional layers. The ReLU layer applies the function f(x) = max(0, x) to all the values in the input volume; that is, it changes all the negative values to zero.
4. Max-pooling layer: This layer, usually known as down-sampling, is used to summarize data by choosing the local maximum in a sliding window moving across the feature maps with a stride of the same length.
5. Fully connected layer: It is similar to the traditional Multi-Layer Perceptron (MLP) neural networks (Haykin, 1999) and is used to translate the feature maps or patterns obtained in previous layers to a known classification.
6. Soft-max layer: The soft-max or normalized exponential function is another activation function, which produces a categorical
probability distribution such that the total sum of the outputs is equal to one. This layer is placed as the final layer.

2.2. The SegNet architecture

SegNet is a fully convolutional encoder-decoder network (Badrinarayanan et al., 2015). For the encoder part, it uses a general CNN architecture and removes the fully connected layers to produce a low-resolution feature map of the input image. Conversely, in the decoder part, a similar architecture is used to reproduce a high-resolution feature map. This map is fed to a multiphase soft-max layer to classify it into a pixel-wise multiphase segmented output; see Fig. 1 (Garcia-Garcia et al., 2017).

Fig. 1. The general architecture of the SegNet (after Badrinarayanan et al. (2015)).

Each encoder consists of various convolutional layers responsible for generating feature maps. To introduce non-linearity, a ReLU layer is used as an efficient activation function. Then, a max-pooling window with a non-overlapping stride produces sub-sampled feature maps. After several steps of convolutional layers, a low-resolution feature map is obtained, which is up-sampled in the decoder part. The difficulty lies in the decoder learning to deconvolve or decode the low-resolution map. Thus, boundary information of the encoder feature maps is stored before sub-sampling and used in the decoder step. Badrinarayanan et al. (2015) showed that it is sufficient to store only the max-pooling indices, since these record the locations of the maximum feature values in each encoder feature map. These memorized max-pooling indices are used in the decoder part for up-sampling and producing dense feature maps (Fig. 2).

Fig. 2. Max-pooling indices are used to up-sample low-resolution maps in SegNet (after Badrinarayanan et al. (2015)).

The SegNet architecture is similar to DeconvNet (Noh et al., 2015) and U-Net (Ronneberger et al., 2015), but there are some differences. DeconvNet uses fully connected layers, so its number of parameters is very large, leading to more expensive computation and harder training. On the other hand, U-Net transfers the entire encoder feature map to the decoder, which requires extensive memory. There are also some minor architectural differences between SegNet and U-Net, which are described in Badrinarayanan et al. (2015).

In this paper, we use two versions of the SegNet network: 1. Basic SegNet and 2. Extended SegNet. In the basic SegNet, four encoders and four decoders are used. In each encoder or decoder, a 7 × 7 × 64 convolutional/deconvolutional layer with a 2 × 2 max-pooling/unpooling window and stride 2 was applied. This means the 64-channel feature map of each internal step is sub-sampled or up-sampled by a factor of two. The ReLU activation function is used only in the decoder part.

The extended SegNet is similar to the basic one, but with a more massive architecture. In this study, five encoders and five decoders are used, with two convolutional/deconvolutional layers applied in the four end encoders and decoders. Besides, three convolutional/deconvolutional layers are applied in the six central ones (similar to Fig. 1). Therefore, the basic and extended SegNet contain 18 and 38 layers, respectively.

2.3. Data augmentation

As mentioned earlier, successful training of artificial intelligence techniques requires a massive amount of data (Zhao, 2017). This requirement becomes more crucial when the input data are complex, which in this study comprise a set of gray-scale images of the rock sample. Furthermore, depending on the complexity of the problem, the number of input data needed for training varies significantly and, thus, providing such big data requires much cost and time. In this study, however, the necessary data are provided by a stochastic algorithm that uses the input image and can produce images with different structures. Therefore, we first describe the implemented stochastic algorithm and then discuss other issues related to imposing variability on the training data to ensure that a comprehensive dataset is used.

In this study, the cross-correlation simulation algorithm, which belongs to the more inclusive HYPPS technique, is used. The reason for using this algorithm and not the original HYPPS method is discussed later. The implemented method works with an image (or a set of images) and produces various equiprobable realizations. In this study, the output has the same size as the input image. However, there is no required relationship between the sizes of the input and output images; they can be of any size. First, an empty simulation grid is generated for the output image.
Then, starting from a corner of the simulation grid, a pattern with a specific size is selected from the input image and pasted on the simulation grid. Note that the first pattern is inserted on an empty simulation grid and is selected in a completely random fashion. The size of the selected pattern depends on the heterogeneity of the input image. A larger template size can be used if the input image represents very complex patterns. Similarly, a smaller pattern size can be chosen when the input image is homogeneous. However, it should be kept in mind that a larger template often reduces the variability between the produced realizations while generating high-quality images. On the other hand, a smaller template size increases the variability, whereas the final realizations may not be very similar to the input image. Therefore, after defining the appropriate size, a small overlap OL (OL_x × OL_y) with the previously inserted patterns is selected and the cross-correlation of the selected region(s) with the input image is calculated using:

$$\mathcal{C}(i, j; x, y) = \sum_{x=0}^{OL_x - 1} \sum_{y=0}^{OL_y - 1} D_I(x + i, y + j)\, D_T(x, y), \qquad (1)$$

where D_I is the input 2D image, D_T is the visiting data-event at point (x, y) and T represents its size (i.e. template size). The resulting similarity map is used to sort the patterns based on their similarities, and a certain number of the most similar patterns are selected. Finally, one of these patterns is selected and inserted at the visiting point (Tahmasebi, 2017). This process continues until the simulation grid is filled. Some of the realizations produced using an input sandstone image are shown in Fig. 5.

The HYPPS algorithm was originally developed to deal with complex secondary continuous and point data (Tahmasebi, 2017). As such, reproducing the conditioning data can be difficult using a sole pattern strategy. Therefore, the HYPPS method offers some flexibilities such as simulating the point data through a pixel-based method. In this study, however, since no such data are available, the HYPPS method is used in its pattern-based mode.

One of the main differences between artificial intelligence and physics-based modeling is the amount of variability that is necessary. In other words, deep learning methods produce their best outcome when the input data show a large standard deviation so that the new models can be built from a rich database. To leverage the spatial relationship between the input images and to make new patterns, the HYPPS method was modified. Thus, new realizations/images are generated using an ensemble of inputs. In other words, all the available images are searched during the pattern selection phase. Doing so results in more variability and also produces new transition patterns that might not be available in each of the input images individually.

3. Benchmark data

Andrä et al. (2013) introduced several standard digital rock samples such as Berea and Fontainebleau sandstone and Grosmont carbonate¹. These benchmark samples have been frequently used for DRP studies. Among them, we used the Berea sandstone to evaluate our segmentation results. This sample is mainly composed of quartz and some minor minerals such as clay, K-feldspar, ankerite, and zircon; see Fig. 3(a). The acquired image consists of 1024 × 1024 × 1024 voxels with a resolution of 0.74 μm (Fig. 3(b)). Andrä et al. (2013) implemented three segmentation methods, namely VSG, SU, and KJ, to obtain a mono-mineral sample. The results indicate that the porosity ranges from 18.4 to 20.9%. This is mostly because, even with a bimodal image histogram (i.e. two distinct modes of porosity and mineral), the choice of threshold value strongly affects the estimated porosity.

¹ These images and their corresponding segmentations are available on: https://github.com/fkrzikalla/drp-benchmarks.

Fig. 3. (a) Different minerals of Berea sandstone detected in a scanning electron microscopy (SEM) (after (Madonna et al., 2013)) and (b) 3D μCT image of Berea sandstone with a size of 1024 × 1024 × 1024 voxels and a resolution of 0.74 μm.

4. Results

4.1. Semi-automatic segmentation

Fig. 4(a) represents an example of the original grayscale image used in this study. The original image size is 1024 × 1024 pixels, but to avoid the existing streak artifacts around the central Z-axis, especially at the boundary of the image, an image with 512 × 512 pixels was selected from the center of the original image. Then, we resampled the image by a factor of 0.5 to produce an image with 256 × 256 pixels and a resolution of 1.48 μm, mostly for the sake of smaller computational time.

Obtaining a multi-mineral segmented image using several threshold values requires addressing two crucial issues. As illustrated in Fig. 4(b): 1. Grain boundaries are brighter than grain surfaces, so they are misclassified, and 2. Different minerals are assigned to the same class because of their close color values or intensities. Although these minerals have different textures, the above segmentations are insensitive to such features. Based on several trials, the watershed algorithm enhances the segmentation process. This algorithm, however, also failed to differentiate complex structures. Therefore, we decided to manually label each misclassified pixel as an expert supervisor. Fig. 4(c) shows the result of our semi-manual segmentation. Although there are many minor minerals with a similar color in this sample, we decided to categorize all of them as one phase. Therefore, five phases were
distinguished as pore space (Phi), quartz (Qtz), K-feldspar (K-Fld), zircon (Zrc) and other minerals (OM, i.e. mainly clays). Though semi-manual segmentation is an effective approach, it requires a long time and can be very limited in practice. Therefore, only 20 images were segmented in this study.

Although there is no straightforward rule for defining the number of training data, a few thousand images per class are required to properly train a CNN for classification purposes (Ciresan et al., 2012). Thus, data augmentation based on the HYPPS algorithm is used to train the SegNet network. This is discussed in detail below.

Fig. 4. An example of (a) original image, (b) automatic multi-thresholding segmentation and (c) semi-manual segmentation of Berea sandstone.

4.2. Data augmentation

As mentioned, the main obstacle in using CNN for digital rock images is the limited data available for training. Although image transforms are used as a data augmentation method (Garcia-Garcia et al., 2017), the method proposed in Section 2.3 is more efficient. Therefore, the HYPPS method is applied in this study to enrich the dataset. It has been shown that this algorithm can be considered an efficient augmentation method to generate as many images as required in deep learning studies.

To avoid a secondary segmentation, we made changes in the source code to produce the segmented images simultaneously. This means no extra round of segmentation is required, and the segmented image is generated at the same time as each new realization. The input image size is 256 × 256 pixels, and the optimal template and overlap sizes used for the reconstructions are 90 × 90 and 10 × 10, respectively. According to the heterogeneity of this sample, five images are considered. An example of such a simultaneous reconstruction is shown in Fig. 5. In this example, we used the gray-scale image shown in Fig. 4(a) and its corresponding segmented image in Fig. 4(c) as inputs, and three realizations, as well as their segmented images, are produced. These results demonstrate how efficiently and accurately images are generated in this study.

Fig. 5. Three realizations and their corresponding segmented images produced using HYPPS algorithm.

To avoid overfitting, the original 20 images (Section 4.1) are divided into training, validation and testing groups with fractions of 80, 10 and 10%, respectively. Then, 16000 stochastic images are generated using the images in the training group. In a similar way, 4000 other images are produced for the validation and testing phases, each with 2000 images. In the next step, the images produced for the training group are used to train the SegNet, and the network is adjusted using the images in the validation dataset. Finally, the designed network is tested using the unseen images in the testing subset.

4.3. Segmentation using SegNet

The basic and extended SegNet are used for digital rock image segmentation. Detailed specifications of these networks are given in Section 2.2. The augmented dataset generated in the previous section is used with these networks. Fig. 6 shows the categorical accuracy and loss values of the training and validation procedures along the epochs.

Fig. 6. The categorical accuracy and loss of (a) basic and (b) extended SegNet used in this study.

According to Fig. 6, the extended SegNet reaches the required accuracy faster. As can be seen, it takes about 100 epochs of training, while the basic SegNet requires 500 epochs (i.e. five times longer) to regularize the convolutional coefficients and reach the appropriate accuracy (here, more than 90%). Both categorical accuracy
and loss values of the two networks for each of the training, validation and testing phases show a steady convergence. Moreover, the validation and testing results are more accurate in the extended SegNet, which indicates that this network can produce more valid segmentation results for new and unseen images. The strength of the extended SegNet is revealed further by the categorical accuracy convergence of each phase. Fig. 7 shows the categorical accuracy values for each phase individually along the epochs for both networks. Table 1 also summarizes such values for the final epoch.

Fig. 7. Categorical accuracy values of (a) the basic and (b) extended SegNet for each phase.

Table 1
A summary of the basic and extended SegNet performances (categorical accuracy).

                     Basic                            Extended
Phase                Training  Validation  Test       Training  Validation  Test
Phase #1 (Phi)       0.89      0.87        0.87       0.98      0.95        0.95
Phase #2 (Qtz)       0.98      0.96        0.97       0.99      0.90        0.98
Phase #3 (OM)        0.80      0.41        0.60       0.90      0.62        0.64
Phase #4 (K-Fld)     0.89      0.24        0.22       0.91      0.63        0.73
Phase #5 (Zrc)       0.69      0.04        0.00       0.92      0.60        0.88
Overall              0.96      0.92        0.92       0.99      0.97        0.96

The categorical accuracy curves and the values presented in Fig. 7 and Table 1 demonstrate that in all cases the first two phases (Phi and Qtz) reach a reasonable accuracy very fast. Note that these phases make up the majority of the utilized images (Fig. 4), so their categorical accuracy values strongly affect the overall performance (Fig. 6). In other words, even if the other phases are not properly determined, an overall categorical accuracy of more than 90% can still be achieved (Table 1). This stresses that one must not rely only on the overall categorical accuracy or loss values, as they are mostly controlled by the most dominant minerals. Fig. 7(a) shows that the categorical accuracy of the other phases (OM, K-Fld, and Zrc) in the basic SegNet never exceeds 90%, although its overall value is 96% for the training step in the final epoch (Table 1). In addition, the categorical accuracy of these phases on the validation and test data shows that the basic SegNet can barely detect these minerals. For instance, the zircon phase is never distinguished from the other existing minerals. The network performances for labeling the unseen images in the testing data are 87, 97, 60, 22 and 0% for Phi, Qtz, OM, K-Fld, and Zrc, respectively, with an overall categorical accuracy of 92%. As can be seen, the results are more promising in the extended SegNet. This can be verified through
Fig. 8. Four examples of SegNet segmentation.

the categorical accuracy values of all phases after 10000 epochs, which are more than 90% (Fig. 7(b) and Table 1). Although the overall categorical accuracy for the training step in the last epoch is 99%, the network does not perform well for the validation and test images. The extended SegNet, however, segmented the testing dataset with categorical accuracies of 95, 98, 64, 73 and 88% for Phi, Qtz, OM, K-Fld, and Zrc, respectively, and an overall categorical accuracy of 96%. According to these results, the extended SegNet is considered the more reliable network. Fig. 8 illustrates the gray-scale image, the ground truth, and the basic and extended SegNet results for four unseen images. The extended SegNet has clearly been successful in both training and image segmentation, whereas the basic SegNet introduces a large amount of noise in the segmented images.

A key advantage of CNN-based segmentation over conventional methods is that it is built on small convolutional kernels combined in a deep architecture, which enables it to consider both color and texture simultaneously. To demonstrate this capability, the results of segmenting a multiphase image with the conventional and the CNN methods are compared. The multi-thresholding method consists of the following steps:

1. Because of artifacts associated with image acquisition and reconstruction, the image was smoothed using a median filter with a size of two pixels.
2. The gradient image was computed to identify the boundaries between the minerals. The transition areas between separate phases are those with high gradient magnitude; such areas are then masked as the transition regions.
3. Outside the masked transition area, multiphase segmentation is implemented by manually choosing four thresholds to obtain five different phases based on the color contrasts of the minerals.
4. Finally, each phase is extended into the transition area using a watershed algorithm.
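The four multi-thresholding steps listed above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the implementation used in the study: the function names, the 3x3 median window, the gradient limit, and the threshold values are all assumptions, and a simple neighbour-growing loop stands in for the watershed algorithm of step 4.

```python
import numpy as np

def median3x3(img):
    """Median filter over a 3x3 window (the window size is an assumption)."""
    h, w = img.shape
    p = np.pad(img, 1, mode="edge")
    windows = [p[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(windows), axis=0)

def multi_threshold_segment(img, thresholds, grad_limit, max_iter=100):
    """Hypothetical helper: returns integer phase labels 1..len(thresholds)+1."""
    # Step 1: smooth acquisition/reconstruction artifacts.
    smooth = median3x3(np.asarray(img, dtype=float))
    # Step 2: high gradient magnitude marks transition zones between phases.
    gy, gx = np.gradient(smooth)
    transition = np.hypot(gx, gy) > grad_limit
    # Step 3: threshold everything outside the transition mask (0 = unassigned).
    labels = np.digitize(smooth, thresholds) + 1
    labels[transition] = 0
    # Step 4 (stand-in for watershed): grow labels into unassigned pixels
    # from their 4-neighbours; ties go to the larger label.
    h, w = labels.shape
    for _ in range(max_iter):
        empty = labels == 0
        if not empty.any():
            break
        p = np.pad(labels, 1, mode="edge")
        neighbours = np.stack([p[0:h, 1:w + 1], p[2:h + 2, 1:w + 1],
                               p[1:h + 1, 0:w], p[1:h + 1, 2:w + 2]])
        labels = np.where(empty, neighbours.max(axis=0), labels)
    return labels

# Synthetic two-phase image: dark left half, bright right half.
demo = np.hstack([np.full((8, 4), 10.0), np.full((8, 4), 200.0)])
demo_labels = multi_threshold_segment(
    demo, thresholds=(50, 100, 150, 220), grad_limit=10.0)
```

Because every decision in this pipeline depends on gray level alone, two phases with overlapping intensities fall between the same pair of thresholds and cannot be separated, whatever values are chosen.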

Fig. 9. (a) Real, (b) ground truth, (c) SegNet and (d) multiphase thresholding segmentation.


Fig. 9 shows the results of multiphase segmentation. A visual comparison between the SegNet and conventional methods reveals that the SegNet segmentation is more successful, in particular for identifying the OM phase (clay and other minerals). The color of this phase is very similar to that of the Qtz phase, which makes the two difficult to distinguish accurately using color alone. SegNet, however, demonstrated excellent performance. The kernels implemented in the CNN extract the necessary texture and boundaries of the different components of a rock image. In other words, CNN methods are not only sensitive to color fluctuations but also successfully differentiate components with different textures.
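The role of texture can be illustrated with a toy example. The sketch below is not the network used in this study: a fixed, hand-made local standard deviation filter (built from 3x3 box convolutions) stands in for the learned CNN kernels, and the image, the gray levels, and the function names are assumptions chosen for illustration. Two regions share the same mean gray level, so intensity cannot separate them, while the texture response separates them cleanly.

```python
import numpy as np

def box_mean3x3(img):
    """3x3 box (mean) convolution with reflected borders."""
    h, w = img.shape
    p = np.pad(img, 1, mode="reflect")
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def local_std3x3(img):
    """Local standard deviation, a simple hand-made texture descriptor."""
    m = box_mean3x3(img)
    m2 = box_mean3x3(img * img)
    return np.sqrt(np.maximum(m2 - m * m, 0.0))

# Left half: flat gray level 128. Right half: checkerboard of 108/148,
# which also averages 128 but has a strong texture.
rows, cols = np.indices((8, 8))
checker = 128.0 + 20.0 * np.where((rows + cols) % 2 == 0, 1.0, -1.0)
img = np.hstack([np.full((8, 8), 128.0), checker])

intensity = box_mean3x3(img)   # nearly identical in both halves
texture = local_std3x3(img)    # zero in the flat half, large in the other
```

A threshold on `intensity` cannot split the two halves, whereas a threshold on `texture` does. A CNN learns many such kernels from the training data instead of relying on one fixed filter.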
According to the results of this study, automatic CNN-based segmentation produces reliable outputs with higher categorical accuracy. This study shows that a CNN can be trained even with a very small number of images if an efficient data augmentation method is used. The implemented reconstruction method performed successfully in this study by quickly generating a vast number of images (here, about 28,000). It should be noted that the CPU time for producing each realization is less than 15 ms with a Core-i7 CPU and 8 GB RAM. This type of augmentation may seem unnecessary for common applications of CNN in other fields, but it is unavoidable in DRP and possibly in other geoscience applications, mainly because access to large image datasets is limited in DRP and subsurface problems.
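Taking the two reported figures at face value, and assuming the realizations are generated one after another on a single core (an assumption, since the text does not state how the runs were scheduled), the total generation time is bounded by:

```latex
T_{\text{total}} < N \, t_{\text{real}}
  = 28{,}000 \times 15\ \text{ms}
  = 420\ \text{s}
  = 7\ \text{min}
```

Here $N$ is the number of realizations and $t_{\text{real}}$ is the per-realization CPU time; both symbols are introduced only for this estimate.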
5. Conclusion
In this study, we aimed to improve the segmentation process of digital rock images using CNN. Although CNN has been used in a variety of fields, including semantic segmentation, autonomous driving, and medical image processing, its applications in DRP have not yet been investigated significantly because of the limited available data. Most convolutional networks need a large number of images (on the order of 10^4). We approached this issue by using a stochastic reconstruction method by which one can generate a large dataset even from a few images. Manual segmentation is one of the best methods for segmenting small datasets, but it requires a great deal of time. Because we used a small number of images, however, we preferred to perform the segmentation manually. In the next step, the segmented images of every realization generated by the stochastic method can be produced simultaneously during the simulation, with an accuracy similar to that of images prepared by an expert.
Our results showed that the SegNet with a deeper architecture can be trained more effectively by a large dataset and produces more reliable results. The point is that the overall categorical accuracy (or loss) of the network cannot be used on its own to verify the performance of the network in a multiphase segmentation application. Instead, phase-by-phase categorical accuracy should be considered to accurately evaluate the network's performance. Although the overall accuracies of the basic and extended SegNet in our study were close to each other, the phase-by-phase study showed that the extended SegNet is trained more effectively (with a categorical accuracy of 99%). This network also produced valid results for unseen images, with a categorical accuracy of about 96%. A comparison with the results produced by multiphase thresholding also revealed that a substantial improvement is achieved when our reconstruction method is used along with the extended SegNet.
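The difference between overall and phase-by-phase accuracy can be made concrete with a small sketch. It assumes that the categorical accuracy of a phase is the fraction of ground-truth pixels of that phase that receive the correct label; this definition, the function name, and the pixel counts are assumptions made for illustration, not values from this study.

```python
import numpy as np

PHASES = ["Phi", "Qtz", "OM", "K-Fld", "Zrc"]

def phase_accuracies(truth, pred, n_classes):
    """Return (overall accuracy, list of per-phase accuracies).

    A phase that is absent from the ground truth gets NaN.
    """
    truth = np.asarray(truth).ravel()
    pred = np.asarray(pred).ravel()
    overall = float((truth == pred).mean())
    per_phase = []
    for c in range(n_classes):
        mask = truth == c
        acc = float((pred[mask] == c).mean()) if mask.any() else float("nan")
        per_phase.append(acc)
    return overall, per_phase

# Toy labels: 8 Phi pixels, 90 Qtz pixels, and 2 Zrc pixels.
truth = np.array([0] * 8 + [1] * 90 + [4] * 2)
pred = truth.copy()
pred[truth == 4] = 1  # every Zrc pixel is mislabeled as Qtz

overall, per_phase = phase_accuracies(truth, pred, n_classes=len(PHASES))
zrc_accuracy = per_phase[PHASES.index("Zrc")]
```

In this toy case the overall accuracy is 98% although the rare Zrc phase is never recovered, which is why a high overall value alone says little about minor phases.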
The results of this study are not limited to sandstone samples; the proposed framework can be applied to any sample. We anticipate that other related fields facing a similar obstacle, namely a limited number of observations or input images, can benefit from the findings of this paper.
Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.cageo.2019.02.003.

149
S. Karimpouli and P. Tahmasebi Computers and Geosciences 126 (2019) 142–150

segmentation. In: Proceedings of the IEEE International Conference on Computer Tahmasebi, P., 2018c. Packing of discrete and irregular particles. Comput. Geotech 100,
Vision, pp. 1520–1528. https://doi.org/10.1109/ICCV.2015.178. 52–61. https://doi.org/10.1016/J.COMPGEO.2018.03.011.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: convolutional networks for biomedical Tahmasebi, P., Javadpour, F., Sahimi, M., 2017. Data mining and machine learning for
image segmentation. In: Lecture Notes in Computer Science (Including Subseries identifying sweet spots in shale reservoirs. Expert Syst. Appl. 88, 435–447.
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer, Tahmasebi, P., Kamrava, S., 2018. Rapid multiscale modeling of flow in porous media.
Cham, pp. 234–241. https://doi.org/10.1007/978-3-319-24574-4_28. Phys. Rev. E 98, 052901. https://doi.org/10.1103/PhysRevE.98.052901.
Sezgin, M., 2004. Survey over image thresholding techniques and quantitative perfor- Tahmasebi, P., Sahimi, M., Shirangi, M.G., 2018. Rapid Learning-Based and Geologically
mance evaluation. J. Electron. Imag. 13, 146–168. Consistent History Matching. Transp. Porous Media. https://doi.org/10.1007/
Shelhamer, E., Long, J., Darrell, T., 2017. Fully convolutional networks for semantic s11242-018-1005-6.
segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 640–651. https://doi.org/ Wallach, I., Dzamba, M., Heifets, A., 2015. AtomNet: A Deep Convolutional Neural
10.1109/TPAMI.2016.2572683. Network for Bioactivity Prediction in Structure-Based Drug Discovery.
Srisutthiyakorn, N., 2016. Deep-learning methods for predicting permeability from 2D/ Zhang, P.Y., Sun, J.M., Jiang, Y.J., Gao, J.S., 2017a. Deep Learning Method for Lithology
3D binary-segmented images. In: SEG Technical Program Expanded Abstracts 2016. Identification from Borehole Images. https://doi.org/10.3997/2214-4609.
Society of Exploration Geophysicists, pp. 3042–3046. https://doi.org/10.1190/ 201700945.
segam2016-13972613.1. Zhang, Y., Chan, W., Jaitly, N., 2017b. Very deep convolutional networks for end-to-end
Tahmasebi, P., 2017. HYPPS: a hybrid geostatistical modeling algorithm for subsurface speech recognition. In: 2017 IEEE International Conference on Acoustics, Speech and
modeling. Water Resour. Res. 53, 5980–5997. https://doi.org/10.1002/ Signal Processing (ICASSP). IEEE, pp. 4845–4849. https://doi.org/10.1109/ICASSP.
2017WR021078. 2017.7953077.
Tahmasebi, P., 2018a. Accurate modeling and evaluation of microstructures in complex Zhao, W., 2017. Research on the deep learning of the small sample data based on transfer
materials. Phys. Rev. E 97, 023307. https://doi.org/10.1103/PhysRevE.97.023307. learning. In: AIP Conference Proceedings. AIP Publishing LLC, 020018. https://doi.
Tahmasebi, P., 2018b. Nanoscale and multiresolution models for shale samples. Fuel 217, org/10.1063/1.4992835.
218–225. https://doi.org/10.1016/j.fuel.2017.12.107.

150

You might also like