
Bahri Et Al GeodermaReg 2022

This study presents a Digital Soil Mapping (DSM) initiative to assess soil organic carbon (SOC) stocks in Tunisian topsoils at a resolution of 100 m, utilizing a Quantile Regression Forest algorithm and a national database of 1540 SOC profiles. The findings indicate a total SOC stock of 391 Tg C in the top 30 cm of soil, outperforming global DSM products like SoilGrids 2.0 in predictive accuracy. The research emphasizes the importance of local environmental covariates and suggests that enhancing soil profile observations could further improve SOC mapping in Tunisia.


Geoderma Regional 30 (2022) e00561

Contents lists available at ScienceDirect

Geoderma Regional
journal homepage: www.elsevier.com/locate/geodrs

Mapping soil organic carbon stocks in Tunisian topsoils


Haithem Bahri a,*, Damien Raclot b, Meriem Barbouchi c, Philippe Lagacherie b, Mohamed Annabi c

a Agronomic Sciences and Techniques Laboratory (LR16INRAT05), National Research Institute of Rural Engineering, Water and Forests (INRGREF), Carthage University, Rue Hedi Karray, CP1004, Menzah 1, Tunisia
b UMR LISAH, Univ. Montpellier, INRAE, IRD, Institut Agro, 2 place Pierre Viala, F-340760 Montpellier, France
c Agronomic Sciences and Techniques Laboratory (LR16INRAT05), National Institute of Agricultural Research of Tunisia (INRAT), Carthage University, Rue Hedi Karray, CP1004, Menzah 1, Tunisia

ARTICLE INFO

Keywords: Digital soil mapping; Soil organic carbon stocks; Tunisian soils; Quantile regression forest; Machine learning spatial prediction; National approach; Multiple soil classes

ABSTRACT

Better knowledge of the amount and spatial distribution of soil organic carbon (SOC) stock at national level is a key element for monitoring, planning and decision-making regarding soil quality management, agriculture or carbon storage options. The present study proposes for the first time a Digital Soil Mapping (DSM) initiative to map SOC stocks in Tunisian topsoils (0–30 cm) at 100 m resolution, using a Quantile Regression Forest (QRF) algorithm, a range of environmental covariates, and a national database of 1540 SOC stock profiles. Our results provided a revised assessment of the SOC stock on the Tunisian territory at 391 Tg C in the first 30 cm of soil profile, i.e. an average of 2.53 kg m−2. The map of SOC stocks outperformed global DSM products such as SoilGrids 2.0 in both R2 (0.44 vs. 0.15) and RMSE (1.94 vs. 2.52 kg m−2) and can be used as a benchmark against changes of land use and climate. The importance of the environmental covariates tested indicates the major role of bioclimatic data and, to a lesser extent, remote sensing images and topography-related variables. Our study did not show a significant added value of using additional covariates in relation to nationally available variables or the SOC map predicted by SoilGrids 2.0. Finally, our results showed that increasing the quality and quantity of soil profile observations is most likely the best way to improve the future SOC map, starting with the northern region of Tunisia, which has the highest SOC stock predictions and uncertainties in the country. An alternative way would be the exploration of new covariates through sub-national approaches.

1. Introduction

Soil Organic Carbon (SOC) is a key component of functional ecosystems and is crucial for food, soil, water and energy security, as well as for climate change mitigation (Stockmann et al., 2015). It is essential for enhancing soil quality, sustaining and improving food production, maintaining clean water, and reducing increased CO2 in the atmosphere. SOC is recognized as the largest store of terrestrial carbon, containing approximately 2344 Pg of organic carbon (Batjes, 1996; Lal, 2004). Globally, its storage capacity is much larger than the pools of carbon in the atmosphere and vegetation. However, SOC is a dynamic component of terrestrial systems, with both internal changes and external exchanges with the atmosphere and the biosphere (Zhang and McGrath, 2004).

The knowledge of distributed SOC stocks at a national level is a key element for environmental research relating to atmospheric carbon sequestration in soil. It is also a key element for agricultural planning and decision-making, as SOC maps provide users with very useful information to monitor the soil condition, identify degraded areas, set restoration targets, and explore SOC sequestration potentials. There is therefore a real need for methods that enable reliable and updated soil stock evaluations at the national scale.

Digital Soil Mapping (DSM) is an interesting way to map soil properties. The key principle of the DSM approach is to determine one or several soil properties, such as a SOC stock, at any location, using a prediction model based on spatially continuous layers of auxiliary variables called covariates. The prediction model is built by running learning algorithms on available soil profiles with measured SOC stock and independent environmental covariates at the same location. Usually, DSM is conducted using a wide range of environmental variables,

* Corresponding author.
E-mail addresses: [email protected] (H. Bahri), [email protected] (D. Raclot), [email protected] (P. Lagacherie), mohamed.annebi@inrat.ucar.tn (M. Annabi).

https://doi.org/10.1016/j.geodrs.2022.e00561
Received 8 April 2022; Received in revised form 10 June 2022; Accepted 4 July 2022
Available online 7 July 2022
2352-0094/© 2022 Elsevier B.V. All rights reserved.
including soil properties, climate, relief, land use and parent material (McBratney et al., 2003; Minasny et al., 2013). The main DSM techniques initially used to model SOC from auxiliary data were traditional statistics and geostatistical techniques. Traditional statistics generally used non-spatial models such as multiple linear regression (e.g. Meersmans et al., 2008), partial least square regression (e.g. Amare et al., 2013), and linear mixed models (e.g. Doetterl et al., 2013). These statistical methods determine the correlation between SOC and environmental variables. Geostatistical techniques such as ordinary kriging (e.g. Piccini et al., 2014) and geographically weighted kriging (Webster and Oliver, 2001) consider the spatial correlation. Nevertheless, statistical and geostatistical models are based on the three following assumptions: linearity, stationarity and noncollinearity (Chen et al., 2019b). The first two assumptions are no longer satisfied for large study areas, as is the case for national-scale studies. Moreover, it is difficult to limit the number of candidate covariates and thus avoid collinearity in the absence of a priori knowledge of large-scale SOC drivers (Eldeiry and Garcia, 2010). Nowadays, machine learning techniques are increasingly used in DSM applications (Chen et al., 2022). In addition to the availability of open data, one of the main reasons is that they can accommodate non-linearity and multicollinearity, and can overcome overfitting with limited soil observations and auxiliary environmental information (Drake et al., 2006). Compared to other DSM techniques, they are considered to have a greater ability to obtain much more information for unsampled points by investigating nonlinear interactions between the concerned soil property and auxiliary variables (McBratney et al., 2003; Mansuy et al., 2014; Ottoy et al., 2017). Several machine-learning methods, such as support vector machines, artificial neural networks (ANN), k-nearest neighbours or random forest, are widely applied in DSM. Among them, Quantile Regression Forests (QRF) is a quite recent extension of the Random Forest that considers the spread of the response variable, from which prediction intervals are constructed (Vaysse and Lagacherie, 2017). The QRF therefore provides uncertainties associated with predictions while retaining the advantages of random forest. It can fit complex, non-linear relationships, and the correlation between the environmental covariates is not a limiting factor (Vaysse and Lagacherie, 2017; Szatmári and Pásztor, 2019).

At the global scale, SoilGrids (Hengl et al., 2014) provides SOC content and SOC stock estimates at 1000 m resolution using random forest plus kriging. Hengl et al. (2017) published a global SOC stock map at 250 m resolution using numerous machine learning techniques. The SoilGrids 2.0 initiative (Poggio et al., 2021) proposes the most recent maps of several soil properties, including SOC stocks, at 250 m resolution using QRF.

At continental scale, Hengl et al. (2015) applied the random forest technique to generate a SOC stock map for Africa at 250 m resolution. Likewise, Guevara et al. (2018) compared five DSM techniques for mapping SOC at 1000 m resolution across Latin America and found that, overall, machine learning prediction algorithms generated similar results. Higher agreement of machine learning prediction algorithms was found in small countries where environmental conditions and land cover use characteristics tend to be more homogeneous. In Europe, many studies have addressed mapping SOC at the continental scale using various DSM techniques. For example, de Brogniez et al. (2015) applied a generalised additive model to map the topsoil organic carbon content at 500 m and Rial et al. (2017) applied a random forest for the same purpose at 1000 m. In Australia, Viscarra Rossel et al. (2015) mapped the SOC stock at 100 m resolution using a decision tree with piecewise regression on environmental variables combined with geostatistical modelling of residuals.

Several assessments of SOC stocks have been published at a national scale. Among the studies that compared their national predictions to the global SoilGrids initiatives, Mulder et al. (2016) used regression tree modelling to generate 3D SOC distribution in France at 90 m and 500 m resolution and Szatmári et al. (2019) used QRF to map SOC stock in Hungary at 100 m resolution.

So, despite the extensive application of various DSM techniques for SOC prediction, only limited work is available in Tunisia. In fact, only a preliminary study has looked at the spatial heterogeneity of SOC stock at the Tunisian national level. Brahim et al. (2010) established a SOC stock map based on a set of measured SOC stocks at the national level. The results of this study provided reliable information on the Tunisian SOC. However, this study is based on the national pedological map (1/500000) and it assumes that the spatial variability of SOC stock is explained exclusively by soil pedology. Yet it is well known that the spatial variability of SOC stock is highly related to several other environmental factors and human activities. For example, climate and topography affect surface runoff and transport of soil along the surface, modifying the spatial distribution of SOC (Zhang et al., 2011). Land use, fertiliser application and soil tillage also play important roles in influencing the SOC dynamics (Song et al., 2020).

The global SOC stocks map (SoilGrids 2.0) published by Poggio et al. (2021) can be used as a baseline for comparing local finer resolution maps for any country in the world, including Tunisia. However, using freely available soil profile data from throughout the world may cause predictions of soil properties to be biased at the local scale (Mulder et al., 2016). For national applications, it is therefore important to evaluate how a specific national DSM initiative based on a set of local SOC stock measurements and local covariates can provide more reliable predictions than a worldwide initiative such as SoilGrids 2.0.

In this context, this study aims to: i) propose for the first time a national DSM initiative to map SOC stocks in Tunisia, ii) assess how such an initiative can outperform the global SoilGrids 2.0 DSM initiative and iii) discuss ways to improve the prediction of the SOC stock map for Tunisia. The first step was to assess the added value of using national soil profile data instead of global soil profile data to calibrate a DSM approach based on global covariates only (i.e. a comparison of a national calibration to the global calibration used in the SoilGrids initiative). The second step was to examine how the predictive performance of the DSM model can be improved using Tunisia-specific covariates and the same national ground point data. The last step was to test whether the introduction of SoilGrids 2.0 predictions as an additional covariate can improve the national SOC stock predictions.

2. Material and methods

2.1. Study area

Located in North Africa at 33° 47′ 35.38″ N latitude and 9° 33′ 38.76″ E longitude, Tunisia is bounded on the north and east by the Mediterranean Sea, on the southeast by Libya and on the west by Algeria (Fig. 1). The Tunisian territory covers a 163,610 km2 area which is divided into 32% cultivated land, 29% pasture, 29% uncultivated land (wetland, desert, urban areas) and 9% forest, maquis and steppe (FAO, 2015).

The northern region is humid to subhumid, with rainfall between 600 and 1200 mm. This region is characterised by mountains and a small coastal strip. It is still occupied by rainforest. The Dorsal, which is the eastern extension of the Atlas Mountains, runs across Tunisia in a northeast direction and is characterised by low, rolling hills and plains. The central region is semi-arid, with rainfall between 200 and 600 mm. This region is dominated by steppe vegetation. The southern region is arid; it is dominated by desert and the annual rainfall is <200 mm.

Rainfall in Tunisia is highly seasonal, with a severe dry season. Annual rainfall amounts range from <100 mm in the south to over 1000 mm in the extreme north of the country. The availability of arable land decreases from north to south along with the rainfall gradient. However, the topography is increasingly sloping in the northern part of the country, making it difficult to cultivate land where rainfall is relatively abundant.

Bioclimatic, geological and morphological diversity are at the origin of the existence of a mosaic of vegetation (natural vegetation, rainfed

In Eq. (1), i indexes the soil horizons within the 0–30 cm depth range, from 1 to N (N is generally 1 or 2); SOCi is the SOC content (g kg−1 soil) of horizon i; Dai is the bulk density (g cm−3) of horizon i; and ei is the thickness (cm) of horizon i.
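
As a minimal sketch of the calculation in Eq. (1), the function below sums SOC content × bulk density × thickness over the horizons of one profile. The function name and the tuple layout are hypothetical. The factor 0.01 is not shown in Eq. (1); it is an assumption derived from the units stated in the text (g kg−1 × g cm−3 × cm gives 10 g C m−2 per unit product, i.e. 0.01 kg m−2).

```python
def soc_stock_kg_m2(horizons):
    """Topsoil SOC stock (kg m-2) of one profile, following Eq. (1).

    horizons: iterable of (soc_g_per_kg, bulk_density_g_cm3, thickness_cm),
    one tuple per horizon, with thicknesses already limited to 0-30 cm.
    """
    # Assumed unit factor: (g/kg) * (g/cm3) * cm = 1e-3 g C/cm2 = 0.01 kg C/m2
    unit_factor = 0.01
    return sum(soc * da * e * unit_factor for soc, da, e in horizons)


# Hypothetical two-horizon profile: 0-20 cm and 20-30 cm
profile = [(12.0, 1.3, 20.0), (8.0, 1.4, 10.0)]
stock = soc_stock_kg_m2(profile)  # about 4.24 kg m-2
```

The example values are invented for illustration; they are not taken from the national database.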
The bulk density was not available for all soil horizons of the 1540 soil profiles. Bulk density for horizons without measurement was estimated using the pedotransfer function proposed by Brahim et al. (2010) for Tunisian soils (Eq. (2)):

$$Da = 0.9 - 0.08\,\mathrm{SOC} + 0.007\,S + 0.007\,L + 0.05\,\mathrm{pH} \tag{2}$$

where SOC is the soil organic carbon content (g kg−1 soil), S is the sand content (g kg−1 soil), and L is the silt content (g kg−1 soil).
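
For illustration only, Eq. (2) can be written as a small function. The function and argument names are hypothetical, the coefficients are copied from Eq. (2) with the dash after 0.9 read as a minus sign, and the input units are those stated in the text. Both the units and the output (which should fall within plausible bulk densities for mineral soils) should be checked against Brahim et al. (2010) before any reuse.

```python
def bulk_density_ptf(soc, sand, silt, ph):
    """Bulk density Da estimated with the pedotransfer function of Eq. (2).

    Inputs are expressed in the units stated in the text for SOC, S and L;
    this is an assumption to verify against the original publication.
    """
    # Coefficients as printed in Eq. (2); '0.9-0.08 SOC' is read as a subtraction
    return 0.9 - 0.08 * soc + 0.007 * sand + 0.007 * silt + 0.05 * ph
```

In the workflow described above, such a function would only be called for horizons lacking a measured bulk density.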
The significance of spatial correlation in SOC stocks was tested by comparing the experimental semivariogram with a semivariogram envelope obtained by permuting the data values across the locations (number of simulations Nsim = 1000), which simulates a variable without spatial correlation (Mantel test). In this empirical test, spatial correlation was considered significant at the 95% level over the short lag distances at which the experimental semivariogram lies below the envelope.
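
The permutation test described above can be sketched as follows. This is an illustrative NumPy implementation, not the authors' code: the function names, the distance bins and the use of the 2.5th and 97.5th percentiles as envelope bounds are assumptions (some software uses the minimum and maximum of the simulations instead).

```python
import numpy as np


def _pair_bins(coords, bin_edges):
    """Index every pair of points once and assign it to a distance bin."""
    i, j = np.triu_indices(len(coords), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    return i, j, np.digitize(dist, bin_edges) - 1


def _semivariogram(values, i, j, bins, n_bins):
    """Empirical semivariance per bin: mean of 0.5 * squared differences."""
    half_sq = 0.5 * (values[i] - values[j]) ** 2
    keep = (bins >= 0) & (bins < n_bins)  # drop pairs outside the bin range
    sums = np.bincount(bins[keep], weights=half_sq[keep], minlength=n_bins)
    counts = np.bincount(bins[keep], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts  # NaN for empty bins


def permutation_envelope(coords, values, bin_edges, n_sim=1000, seed=0):
    """Observed semivariogram and envelope from shuffled (non-spatial) data."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n_bins = len(bin_edges) - 1
    i, j, bins = _pair_bins(coords, bin_edges)
    observed = _semivariogram(values, i, j, bins, n_bins)
    rng = np.random.default_rng(seed)
    sims = np.array([
        _semivariogram(rng.permutation(values), i, j, bins, n_bins)
        for _ in range(n_sim)
    ])
    lower, upper = np.nanpercentile(sims, [2.5, 97.5], axis=0)
    return observed, lower, upper
```

Lag bins where `observed` falls below `lower` are those where values at nearby locations are more alike than expected without spatial correlation.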

2.3. Environmental covariates

The environmental covariates considered in this study as potential predictors of SOC stocks are described below. They relate to bioclimatic variables, morphometric variables, soil/parent material, land use and other environmental factors such as vegetation through remote sensing indices. The global covariates correspond to the variables available for the whole world, while the national covariates correspond to the variables available only for Tunisia. In this study, all covariate maps were resampled to a common 100 m grid in the Carthage / UTM zone 32N coordinate system (EPSG: 22332).

crops and irrigated crops) and genetically different soils (Thornes, 2002). According to the World Reference Base for Soil Resources, Tunisian soils are classified as Cambisols and Rendzinas, Regosols and Fluvisols, Luvisols, Podzoluvisols, Vertisols, red Mediterranean soils, Kastanozems and Solonetz, Solonchaks and also Lithosols (STUDI-SCOT-SODETEG, 2001).

Fig. 1. Study site location, available measured SOC stocks (black points) and sub-national zoning in three regions (EPSG: 22332).

2.2. Soil organic carbon dataset

The dataset consists of topsoil data from 1540 soil profiles collected across the country between 2000 and 2014. Each of these 1540 soil profiles was divided into soil horizons, each of which was measured at least for texture, pH and SOC content. The latter was determined in the laboratory by the Walkley-Black oxidation method (Walkley and Black, 1934), by far the most commonly used method on the African continent along with dry combustion and the determination of emitted CO2 by a CHN analyzer (Chevallier et al., 2020). These data come from the soil-plant-water laboratory of the National Institute of Agricultural Research of Tunisia (INRAT), the gray literature (theses and master's theses) and scientific articles. The samples were collected in various areas (Fig. 1), with higher spatial density in the north, where soil and vegetation are more spatially variable than in the south, where soils and land use are more homogeneous. The average density of soil profiles is about one per 100 km2.

The SOC stock (kg m−2) was calculated for each profile over the 0–30 cm depth range according to Eq. (1):

$$\mathrm{SOC}_{stock} = \sum_{i=1}^{N} \left(\mathrm{SOC}_i \times Da_i \times e_i\right) \tag{1}$$

2.3.1. Global covariates

Firstly, the 19 bioclimatic variables established by the WorldClim version 2 initiative at 30 arc-second resolution (Fick and Hijmans, 2017, https://www.worldclim.org/data/worldclim21.html) are considered. They are derived from the monthly temperature and precipitation values for the years 1970–2000 in order to generate more biologically meaningful variables. They are named BIO1 to BIO19 (see https://www.worldclim.org/data/bioclim.html for a complete description) and they represent annual trends (e.g., mean annual temperature, annual precipitation), seasonality (e.g., annual range in temperature and precipitation) and extreme or limiting environmental factors (e.g., temperature of the coldest and warmest month, and precipitation of the wet and dry quarters). A set of five morphometric maps at 30 arc-second resolution was also tested as potential predictors of SOC stocks. The maps include elevation, slope, a multiresolution index of valley bottom flatness (MRVBF, Gallant and Dowling, 2003), a multiresolution ridge top flatness index (MRRTF, Gallant and Dowling, 2003) and a topographic position index (TPI, Wilson and Gallant, 2000). They were all derived from the NASA Shuttle Radar Topography Mission (doi: https://doi.org/10.5067/MeaSUREs/SRTM/SRTMGL1N.003). As a covariate related to soil/parent material, the soil classification map proposed by the SoilGrids version 2.0 initiative is considered. This map is an aggregation of the predictions of Hengl et al. (2017) at 250 m resolution, obtained by selecting the Reference Soil Groups of the World Reference Base for Soil Resources with the highest probability of occurrence. It is named WRB_Soil_Map in this paper. Finally, five Sentinel-2 derived variables were also tested as additional covariate candidates for representing the relationships between SOC stocks and the landscape/environment. They correspond to the four bands (B2 for Blue, B3 for Green, B4 for Red and B8 for Near Infra-red) and the Normalized Difference Vegetation Index (NDVI) derived from a global 10 m resolution cloud-free pixel-based composite from Sentinel-2 images. The latter was created from the Sentinel-2 data archive (Level L1C) available on Google Earth Engine for the period January 2017–December 2018 (Corbane et al., 2020, doi:https://doi.
org/10.2905/0BD1DFAB-E311-4046-8911-C54A8750DF79).

2.3.2. National covariates

The national covariates were derived from the Tunisian agriculture map (STUDI-SCOT-SODETEG, 2001) at a 1:20,000 scale. They include a bioclimatic zoning (TUN_Bioclimat, 29 classes) and a set of four layers related to soil/parent material: the soil texture (TUN_Texture, 3 classes), the soil type (TUN_SoilType, 15 classes), the soil salinity (TUN_Salinity, 7 classes) and the parent material (TUN_ParentMat, 37 classes). In addition, a national land use map (TUN_LandUse, 14 classes according to the classification system of the FAO) provided by the Sahara and Sahel Observatory (OSS) was also considered. This map was derived from a K-means classification of multi-date Sentinel-2 images acquired during the humid and dry seasons.

2.4. Topsoil SOC stock spatial modelling

2.4.3. Model performance evaluation and covariates' importance

Three classical indicators were used to evaluate the accuracy of SOC stock predictions, namely mean error (ME), root mean squared error (RMSE) and percentage of explained variance (R2, sometimes called model efficiency coefficient). For the three QRF models tested in this study, 20 replications of a 10-fold cross-validation were performed by random sampling to obtain a median value and its confidence interval for all performance indicators. The lower (0.05 quantile) and upper (0.95 quantile) prediction limits were used to calculate the Prediction Interval Coverage Percentage (PICP, Shrestha and Solomatine, 2006), a fourth indicator which expresses the proportion of observed values that fall within the 90% prediction limits provided by the QRF model.

The RF permutation importance of each covariate (Breiman, 2001) was also evaluated. The basic idea behind permutation importance is that a covariate is considered important if it has a positive effect on the prediction performance, i.e. if randomly permuting its values degrades that performance.
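
The four indicators described in Section 2.4.3 can be sketched as below. This is an illustrative helper, not the authors' implementation: the function name is hypothetical, the sign convention of ME (prediction minus observation) is an assumption because the text does not state it, and R2 is computed as the model efficiency coefficient, 1 − SSE/SST.

```python
import numpy as np


def performance_indicators(obs, pred, q05, q95):
    """ME, RMSE, R2 and PICP (%) for one set of validation predictions.

    obs:  observed SOC stocks
    pred: predicted SOC stocks (e.g. the QRF median)
    q05, q95: lower and upper limits of the 90% prediction interval
    """
    obs, pred, q05, q95 = (np.asarray(a, dtype=float)
                           for a in (obs, pred, q05, q95))
    err = pred - obs  # assumed sign convention for the mean error
    me = err.mean()
    rmse = np.sqrt(np.mean(err ** 2))
    # Model efficiency: share of observed variance explained by predictions
    r2 = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    # Share of observations falling inside their prediction interval
    picp = 100.0 * np.mean((obs >= q05) & (obs <= q95))
    return {"ME": me, "RMSE": rmse, "R2": r2, "PICP": picp}
```

In a repeated cross-validation such as the one described above, this function would be applied to the held-out predictions of each replication and the resulting values summarised by their median. A PICP close to 90% indicates well-calibrated 90% intervals, while a lower value indicates intervals that are too narrow.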

2.4.1. Presentation of quantile regression forests (QRF)

In this study, the QRF (Meinshausen, 2006) based on random forest (RF) modelling (Breiman, 2001) was chosen to predict SOC stocks. This method was also chosen because it was used in the recent SoilGrids 2.0 initiative that produced the baseline for the prediction of the Tunisian SOC stock considered in this study. It was implemented using the R software (R Core Team, 2013) and the ranger package (Wright and Ziegler, 2017), which is a fast implementation of Random Forests that is suitable for multiple model building.

QRF has been largely described in previous papers such as Vaysse and Lagacherie (2017), Loiseau et al. (2019), or Lagacherie et al. (2020). Large excerpts of these papers are used in the following. This prediction model is the result of a large (> 500) ensemble of decision trees, using independent observations. For each tree, QRF randomly samples a number of variables to use for splitting the observations. A bagged version of the training data is used for each of them. For the prediction of a new data point, QRF benefits from the estimation of the full conditional distribution through the RF. From this conditional distribution, it is possible to derive both the predicted value (the median) and the bounds of the 90% prediction interval that assess the associated uncertainty (the 0.05 and 0.95 quantiles).

The RF algorithm has several hyperparameters that must be set by the user. Among them, three parameters may significantly impact the results and therefore should be tuned to improve the predictions (Probst et al., 2018): i) the number of observations drawn randomly for each tree, ii) the number of variables drawn randomly for each split and iii) the minimum number of samples that a node must contain. These parameters were tuned using one of the most established tuning strategies, sequential model-based optimization (Jones et al., 1998; Hutter et al., 2011). This tuning algorithm iteratively uses the results of the hyperparameter values already evaluated and chooses future hyperparameters based on these results. It is implemented in the tuneRanger package (Probst et al., 2018).

2.4.2. The numerical experiment

Three "national" QRF models in relation to the 3 objectives defined in the introduction were tested. These models were trained using the national SOC stock database (i.e. 1540 values, Fig. 1):

• Model_1 was based only on global covariates.
• Model_2 was based on global and national covariates.
• Model_3 considered the SOC stock map predicted by SoilGrids 2.0 (named SoilGrid 2.0_OCS) as an additional covariate to global and national covariates.

Finally, the SOC stock prediction performances of each model were evaluated and compared to that obtained from the global SoilGrids 2.0 initiative.

3. Results

3.1. Description of the national SOC stock database

The SOC stock in topsoils (0–30 cm) ranges from 0.29 to 18.12 kg m−2, with a median and mean value of 3.88 and 4.34 kg m−2 respectively. The highest values are located in the north of the country, whereas the lowest are in the south (Fig. 2a). Comparison of the experimental semivariogram with the semivariogram envelope based on permutations of the data values across the locations (Fig. 2b) shows a significant spatial correlation of the SOC stocks over a distance of approximately 2 km.

3.2. Performance of QRF models

Seventy tuning iterations (i.e., the default value in the tuneRanger package) were used, as this appeared to be a good compromise that ensured a fairly good convergence towards an optimised solution while being acceptable in terms of computing costs. The predictive performance of the three QRF models tested in this study is presented in Table 1. First, the indicators of performance show that all three QRF models significantly improve SOC stock prediction compared to SoilGrids 2.0. This indicates a clear improvement of the national calibration of the QRF models compared to a global calibration, as illustrated by Fig. 3. Secondly, it can be seen that the introduction of national covariates as additional covariates to global ones slightly improves the prediction of the QRF models (i.e., Model_2 vs Model_1). In addition, the introduction of the SOC stocks predicted by SoilGrids 2.0 as an additional covariate does not improve the predictions (i.e., Model_3 vs Model_2). It can also be noted that PICP values below 90% suggest that the uncertainty predictions for the three QRF models were slightly underestimated, and that, in our case study, the use of tuneRanger-based parameterisation did not have a significant impact on the value of the performance indicators for the different models tested.

3.3. SOC stock variability and relevant predictors

The QRF models tested in this study provide maps of SOC stocks in the 0–30 cm topsoil layer, with their related prediction uncertainties, as illustrated by Fig. 4 for Model_2.

For all 3 QRF models, the most important covariates for predicting SOC stocks at national level are bioclimatic variables (Fig. 5). Remote sensing variables and morphometry also play an important role but to a lesser extent. The national covariates introduced in Model_2 show a low importance, which is consistent with the low performance benefit between Model_1 and Model_2. In the same vein, the absence of performance benefit between Model_2 and Model_3 can be explained by the low importance of considering SOC stocks predicted by SoilGrids 2.0 as an
additional covariate in the present national case study.

Fig. 5 also reveals that the input of the soil classification map proposed by SoilGrids 2.0 and the national land-use map produced by OSS as additional covariates was of very low importance for the 3 QRF models developed to predict Tunisian SOC stocks.

Fig. 2. a) Overview of the Tunisian SOC stock (0–30 cm) database (kg m−2); b) experimental semivariogram and envelopes based on permutations.

Table 1
Indicators of performance of the 3 QRF models tested in this study.

QRF model   Covariates                                    ME     RMSE   R2     PICP
Model_1     Global only                                   0.18   1.98   0.41   85.7
Model_2     Global + national                             0.19   1.94   0.44   85.9
Model_3     Global + national + SoilGrids2.0 SOC stocks   0.19   1.94   0.44   85.9

4. Discussion

4.1. SOC stock variability in Tunisian topsoils (0–30 cm)

The SOC stock map (Fig. 4a) shows a significant decreasing gradient from the north to the south of the country. The highest SOC stocks in Tunisia were observed in the northern region (region n° 1 in Fig. 1), with an average SOC stock of about 4.6 kg m−2. This region, which is located north of the dorsal area, includes the sandstone ridges of the Kroumirie Mountains that reach elevations of 900 m and the Mogods mountains. It is recognized as one of the important areas for cereal production in Tunisia and characterised by a high elevation, by a dominance of humid and subhumid climates, and by Mediterranean forests. These results are consistent with the findings of Brahim and Ibrahim (2018), who showed that Tunisian SOC stocks were about 8 kg m−2, 5 kg m−2 and 3.5 kg m−2 for forests, annual crops and fruit trees respectively. They are also consistent with the findings of Annabi et al. (2009), who showed that SOC content is higher in forest soils (24 g kg−1) than in cultivated soils (14 g kg−1) in northern Tunisia. In the center of Tunisia (region n° 2 in Fig. 1) lies a hilly region known as the High Steppes in the west and Low Steppes in the east, with an average SOC stock of about 2.9 kg m−2. This region is characterised by a dominance of semi-arid to arid climates, with a low annual rainfall (200 to 400 mm yr−1). It is mainly devoted to fruit trees, mainly olive trees and irrigated crops. The average SOC stock in the south part of Tunisia (region n° 3 in Fig. 1), which is characterised
Fig. 3. SoilGrids2.0 predictions vs soil profiles measurements of SOC stocks over the 0–30 cm depth range a); Model_2 predictions vs soil profiles measurements of
SOC stocks over the 0–30 cm depth range b). (ME = mean error; RMSE = root mean squared error and R2 = percentage of explained variance, gray lines in b
represents the 90% prediction interval of Model_2).

Fig. 4. Map at 100 m resolution of mean SOC stocks (kg m− 2) as predicted by Model_2 a); Map of related SOC prediction uncertainties calculated as prediction at
95% minus prediction at 5% b).

by an arid to desertic climate (100 to 200 mm year−1) and by natural rangeland, is around 1.8 kg m−2. The coastal eastern region is covered by olive trees, with the presence of some oases. However, the average SOC stock in the western south region is <1 kg m−2 due to the aridity, which is amplified by desertification.
Our estimation with Model_2 leads to a Tunisian SOC stock of 391 Tg C in topsoils (0–30 cm) at the national level (i.e. an average of 2.53 kg m−2). This estimate is close to the Tunisian SOC stock in topsoils (0–30 cm) estimated at 405 Tg C by Brahim et al. (2011) and lower than that estimated at 498 Tg C by Henry et al. (2009) in their map covering the whole African continent. However, our updated estimate is higher than the one (296 Tg C) proposed by Poggio et al. (2021) with the SoilGrids 2.0 initiative or the one (265 Tg C) proposed by Hengl et al. (2015) using the random forest method to predict SOC for the African continent with very limited observations (9 observations) for Tunisia.
Compared to previous predictions of SOC stocks, our prediction considers a larger number of calibration data and a wider set of environmental covariates. In this sense, the present prediction can be considered more reliable and robust.
Considering the quantification of the terrestrial SOC stock in topsoils (0–30 cm) provided by FAO (2017), Tunisian SOC stock represents about 0.5% and 0.05% of the African and the world SOC stocks respectively.

4.2. Importance of covariates in SOC prediction at national scale

Even if importance metrics should be interpreted with care because they may not give reliable feature importances when “potential predictor variables vary in their scale of measurement or their number of categories” (Strobl et al., 2007), the permutation importance remains the recommended method in the vast majority of cases (see e.g. https://explained.ai/rf-importance/index.html#7 for an extended


Fig. 5. Importance of covariates in each QRF Model (permutation method).

discussion on bias on importance metrics). In the present study, bioclimatic variables were by far the most influential predictor variables for SOC stock distribution. This is consistent with the North-South decreasing gradient of SOC distribution, which is very similar to the North-South gradient of both climate and vegetation cover. In terms of importance, bioclimatic variables are followed by remote sensing data and DEM derivatives. The combined effects of climate factors, pedological and lithology processes were often found to be the main drivers of the spatial SOC stocks (Wang et al., 2018). Several studies highlighted that SOC stocks prediction for large areas was primarily controlled by precipitation (Adhikari et al., 2014; Wang et al., 2018). For example, Hobley et al. (2015) highlighted that SOC stock in Australia is positively and significantly correlated (R2 = 0.51) to precipitation. Analysing SOC variation, Gray et al. (2015) estimated that each 100 mm increase in annual precipitation induced a 4% increase in SOC stock. In fact, increasing rainfall supports greater primary production, which results in more SOC accumulating in the soil (Hobley et al., 2015; Zhou et al., 2019). Several studies using Random Forest showed that three types of covariates (climatic, remote sensing and topographic variables) were the most important predictors for SOC stock when a similar density of observations is used as a calibration dataset. For example, elevation and climatic variables were found to be the most important variables of SOC distribution in Canada (McNicol et al., 2019) or in a Mediterranean region in France (Vaysse and Lagacherie, 2017). Climate was also reported to be an influential covariate by Ramifehiarivo et al. (2017) in addition to elevation and NDVI for predicting SOC in the top 30 cm soil layer in Madagascar. Topographic variables have been commonly used as key predictors for digital soil mapping (McBratney et al., 2003; Tsui et al., 2004; Obu et al., 2017; Wang et al., 2017; Wang et al., 2018; Grimm et al., 2008; Zhou et al., 2020). The influence of topographic parameters on SOC distribution can be associated with their effect on soil redistribution through erosion and deposition, on the maintenance of vegetation cover and on soil drainage that affects SOC (Adhikari et al., 2014). In addition, elevation was also found to be the most effective parameter in previous studies of SOC prediction using the random forest method (Hinge et al., 2018). Elevation plays an important role in the development of microclimates (Griffiths et al., 2009), which in turn affect the distribution of plant communities and soil processes (Lozano-García et al., 2016).
Remote sensing data has been successfully applied to digital soil mapping studies at the regional and the global levels (Hengl et al., 2017; Kalambukattu et al., 2018; Zhang et al., 2017). Our results revealed that optical images were effective predictors for determining SOC stock distribution. This result is consistent with previous studies using random forest, which reported, for example, that Sentinel-2-derived predictors are important factors in predicting SOC distribution in the Czech Republic (Gholizadeh et al., 2018), in the Versailles plain (Vaudour et al., 2019) and in the southern part of Central Europe (Zhou et al., 2020). One of the main explanations is that remote sensing images are a relevant proxy for vegetation cover and are therefore relevant for observing the undisturbed vegetation gradient over large areas like a country.
Other environmental variables were also found to improve SOC stock prediction. For example, parent material was found to be the most significant factor for predicting SOC at the regional scale in China (Guo et al., 2015) using a density of observation of 1/0.4 km2 or in eastern Australia (Gray et al., 2015) using a density of observation of 1/526 km2. Furthermore, Yang et al. (2007) found that the humidity index explained most of the spatial variation of SOC in China (1/3000 km2), followed by vegetation cover and soil texture.
This confirms that the relationship of environmental covariates to SOC content is very complex, depending upon environmental conditions, resolution, and the extent of the area under concern (Minasny et al., 2013).

4.3. The use of SoilGrids 2.0 output at a national level

Nowadays, the SoilGrids maps can be seen as a baseline for soil properties, spatialization and especially SOC stock at a global scale (Hengl et al., 2014, 2017; Poggio et al., 2021). These maps have been refined in different studies until the most recent one (SoilGrids 2.0) by Poggio et al. (2021). This SoilGrids 2.0 map was based on the QRF method at 250 m and the use of 60,000 more profiles than the preceding SoilGrids runs using the RF method (Hengl et al., 2017). It therefore benefits from substantial new information for calibration of the new global models, and no synthetic observations (“pseudo-points”) were included.
Despite the improvements, the comparison between SOC stocks observed in Tunisia and SoilGrids 2.0 predictions (Fig. 3a) showed a very weak correlation (R2 = 0.11; RMSE = 2.5 kg m−2). Overall, SoilGrids 2.0 tends to underestimate actual SOC stocks. However, the


most striking point is that SoilGrids 2.0 tends to predict very uniform SOC stocks over Tunisia as its predictions range from 1.1 to 7.9 kg m−2 whereas observed SOC stocks range from 0.29 to 18.12 kg m−2. In particular, the largest SOC stock values (> 5 kg m−2) were strongly underestimated. The poor performance of previous versions of SoilGrids in predicting local or national SOC content or SOC stocks has already been identified. For example, Mulder et al. (2016) observed a low correlation (R2 = 0.11) between local observations in France and SoilGrids predictions by Hengl et al. (2014). Dharumarajan et al. (2021) found similar results to ours when comparing SoilGrids predictions according to Hengl et al. (2017) with SOC stocks observed at regional level in India (R2 = 0.11). Song et al. (2020) also reported that the SoilGrids maps of Hengl et al. (2017) largely underestimated SOC content in China. Like us, Hengl et al. (2017) and Dharumarajan et al. (2021) found predictions with strong smoothing of upper and lower values when they compared the SoilGrids map with local maps from India, Tasmania and California. The better performance of national DSM initiatives compared to global initiatives such as SoilGrids was also shown for the prediction of other variables such as pH (Chen et al., 2019a; Helfenstein et al., 2022). The present study highlights that the new SoilGrids 2.0 version (Poggio et al., 2021) does not provide better SOC stock predictions on a national scale than previous SoilGrids versions. SoilGrids 2.0 generally underestimates SOC stocks in topsoils (0–30 cm), particularly in the northwestern part of Tunisia, as shown in Fig. 6.
Finally, the consideration of SoilGrids 2.0 SOC stock prediction as an additional covariate does not improve the performance of the QRF model (Model_3 vs Model_2). This is probably due to the fact that all the explanatory part of the global variables is already incorporated in the national models, which already rely on the same or very similar global variables to those used by SoilGrids.

4.4. Considerations to improve Tunisian SOC stocks evaluation

An obvious weakness of global approaches like SoilGrids is that they rely on too few calibration points from any given region. For Tunisia, Hengl et al. (2017) and Poggio et al. (2021) used only 9 and 60 soil profiles respectively. These authors, fully aware of this weakness, claimed that SoilGrids is not expected to be as accurate or relevant as locally produced maps and models that make use of considerably greater amounts of local point data and finer local covariates.
By using a calibration database of >1500 observations of Tunisian SOC stocks, the present paper confirms the significant performance gain of QRF models in SOC stock assessment when based on dense national calibration databases. The gain was substantial even when using only global covariates as in the SoilGrids initiatives (Model_1, R2 = 0.41). More surprisingly, the addition of national covariates with much more spatial and semantic details did not result in a significant additional gain in performance (Model_2, R2 = 0.44). While it has already been shown that increasing the number of variables can bring only a small improvement in performance (Zhou et al., 2020), one would have expected that more detailed information on the parent material, for example, would have resulted in a significant improvement in our case study.
So, the challenging question for us is how to improve SOC stock assessment capabilities at Tunisian level? In other words, should we put our next efforts into developing more relevant national covariates or should we put them into acquiring SOC stocks for new soil profiles? To try to answer this question we explored the spatial structure in the residuals of Model_1. Fig. 7 showed that no spatial structure remains in the residuals of Model_1. This means that the entire spatial structure in the available SOC stocks database was captured using global covariates. This likely explains why the addition of local covariates (i.e. Model_2 and Model_3) did not significantly improve the SOC stocks predictions.
All these observations work in favour of focusing first on the observation of new soil profiles in order to densify our calibration database before looking for more relevant environmental covariates. This result is in agreement with Samuel-Rosa et al. (2015), who tested various sets of covariates for mapping soil properties over southern Brazil. We are aware that simply collecting, preparing or downloading better covariates could be a cheaper option, but all our tests with nationally available covariates did not add significant value. It could also be noted that more calibration points could already lead to an improvement of the performance of local covariates considered in this study.
As experimented by Brungard et al. (2021) in the upper Colorado river basin (Western USA), an alternative approach would be to test the QRF approach on sub-divisions of the national area, as a way to improve SOC predictions. In this case, we suggest subdividing the territory according to the bioclimatic context in order to try to highlight the covariates that can explain the variability of the SOC stock over smaller spatial ranges than the bioclimatic variables. For example, priority should be given to areas with the highest SOC concentrations (i.e., the north of Tunisia), where predictions and uncertainties about SOC stocks are higher than in other regions.
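The residual check used in this section (an experimental semivariogram of the Model_1 residuals compared with permutation-based envelopes, Fig. 7) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation: the function names, the distance bins, the number of permutations, the envelope level and the synthetic residuals are all assumptions made for the example.

```python
import numpy as np


def experimental_semivariogram(coords, values, bin_edges):
    """Mean semivariance 0.5 * (z_i - z_j)^2 of all point pairs, per distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)            # all unordered pairs
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semivar = 0.5 * (values[i] - values[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1            # bin index of each pair
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)                     # NaN marks an empty bin
    for b in range(n_bins):
        in_bin = which == b
        if in_bin.any():
            gamma[b] = semivar[in_bin].mean()
    return gamma


def permutation_envelope(coords, values, bin_edges, n_perm=199, level=0.95, seed=0):
    """Envelope of semivariograms obtained by shuffling values across locations.

    Shuffling destroys any spatial structure, so the envelope describes what
    a spatially unstructured variable would look like at these locations.
    """
    rng = np.random.default_rng(seed)
    sims = np.array([
        experimental_semivariogram(coords, rng.permutation(values), bin_edges)
        for _ in range(n_perm)
    ])
    alpha = (1.0 - level) / 2.0
    lower = np.nanquantile(sims, alpha, axis=0)
    upper = np.nanquantile(sims, 1.0 - alpha, axis=0)
    return lower, upper


if __name__ == "__main__":
    # Synthetic stand-in for model residuals (hypothetical data, not the Tunisian profiles).
    rng = np.random.default_rng(42)
    coords = rng.uniform(0.0, 100.0, size=(300, 2))     # e.g. coordinates in km
    residuals = rng.normal(0.0, 1.9, size=300)          # spatially unstructured noise
    edges = np.linspace(0.0, 50.0, 11)                  # ten 5 km lag bins

    gamma = experimental_semivariogram(coords, residuals, edges)
    lower, upper = permutation_envelope(coords, residuals, edges)
    outside = (gamma < lower) | (gamma > upper)
    print("bins outside the envelope:", int(outside.sum()), "of", len(gamma))
```

If the observed semivariance stays inside the envelope in every lag bin, the residuals show no more spatial structure than randomly shuffled values, which is the situation reported for Model_1.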

Fig. 6. Map of differences between Model_2 (present study) and SoilGrids 2.0 SOC stock predictions (kg m−2).

5. Conclusion

In this study, a national DSM initiative based on the Quantile Regression Forest technique and a large profile database of SOC stock measurements was proposed for the first time to predict the SOC stock map in Tunisian topsoils (0–30 cm). It provided an assessment of the SOC stock on the Tunisian territory at 391 Tg C (i.e., an average of 2.53 kg m−2). The performance of the SOC stocks prediction (R2 of 0.44, RMSE of 1.94 kg m−2) was in the same range as most SOC map studies conducted at regional or national scales. Unsurprisingly, the present national initiative greatly improved on the predictions provided by the global SoilGrids 2.0 DSM initiative (R2 of 0.15, RMSE of 2.52 kg m−2), which confirms the added value of locally produced maps and models compared to global estimation. However, neither the addition of nationally available covariates nor the addition of the SOC map predicted by SoilGrids 2.0 significantly improved the predictive performance of the nationally calibrated DSM. In the end, the relative importance of environmental covariates indicated the major role of globally available variables, such as bioclimatic data, remote sensing, DEM and its derivatives, for modelling Tunisian SOC stocks. The main avenues identified for progress in the mapping of Tunisian SOC stocks are, on the one hand, to focus on complementary field surveys, and on the other hand,


Fig. 7. Experimental semivariogram of residuals of Model_1 and envelopes based on permutations.

to explore the interest of new covariates via sub-national approaches, starting with the northern region of Tunisia, which presents the highest SOC stock predictions and uncertainties in the country.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The data that has been used is confidential.

References

Adhikari, K., Hartemink, A.E., Minasny, B., Kheir, R.B., Greve, M.B., Greve, M.H., 2014. Digital mapping of soil organic carbon contents and stocks in Denmark. PLoS One 9 (8), e105519. https://doi.org/10.1371/journal.pone.0105519.
Amare, T., Hergarten, C., Hurni, H., Wolfgramm, B., Yitaferu, B., Selassie, Y.G., 2013. Prediction of soil organic carbon for Ethiopian highlands using soil spectroscopy. Int. Scholar. Res. Notices 2013. https://doi.org/10.1155/2013/720589.
Annabi, M., Bahri, H., Latiri, K., 2009. Statut organique et respiration microbienne des sols du nord de la Tunisie. Biotechnol. Agron. Soc. Environ. 13 (3), 401–408.
Batjes, N.H., 1996. Total carbon and nitrogen in the soils of the world. Eur. J. Soil Sci. 47 (2), 151–163. https://doi.org/10.1111/ejss.12114_2.
Brahim, N., Ibrahim, H., 2018. Effect of land use on organic carbon distribution in a north African region: Tunisia case study. In: Soil Management and Climate Change. Academic Press, pp. 15–24. https://doi.org/10.1016/B978-0-12-812128-3.00002-1.
Brahim, N., Bernoux, M., Blavet, D., Gallali, T., 2010. Tunisian soil organic carbon stocks. Int. J. Soil Sci. 5 (1), 34–40. https://doi.org/10.3923/ijss.2010.34.40.
Brahim, N., Gallali, T., Bernoux, M., 2011. Carbon stock by soils and departments in Tunisia. J. Appl. Sci. 11 (1), 46–55. https://doi.org/10.3923/jas.2011.46.55.
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32. https://doi.org/10.1023/A:1010933404324.
de Brogniez, D., Ballabio, C., Stevens, A., Jones, R.J.A., Montanarella, L., van Wesemael, B., 2015. A map of the topsoil organic carbon content of Europe generated by a generalized additive model. Eur. J. Soil Sci. 66 (1), 121–134. https://doi.org/10.1111/ejss.12193.
Brungard, C., Nauman, T., Duniway, M., Veblen, K., Nehring, K., White, D., Salley, S., Anchang, J., 2021. Regional ensemble modeling reduces uncertainty for digital soil mapping. Geoderma 397, 114998. https://doi.org/10.1016/j.geoderma.2021.114998.
Chen, S., Liang, Z., Webster, R., Zhang, G., Zhou, Y., Teng, H., Hu, B., Arrouays, D., Shi, Z., 2019a. A high-resolution map of soil pH in China made by hybrid modelling of sparse soil data and environmental covariates and its implications for pollution. Sci. Total Environ. 655, 273–283. https://doi.org/10.1016/j.scitotenv.2018.11.230.
Chen, L., Ren, C., Li, L., Wang, Y., Zhang, B., Wang, Z., Li, L., 2019b. A comparative assessment of geostatistical, machine learning, and hybrid approaches for mapping topsoil organic carbon content. ISPRS Int. J. Geo Inf. 8 (4), 174. https://doi.org/10.3390/ijgi8040174.
Chen, S., Arrouays, D., Mulder, V.L., Poggio, L., Minasny, B., Roudier, P., Walter, C., 2022. Digital mapping of GlobalSoilMap soil properties at a broad scale: a review. Geoderma 409, 115567. https://doi.org/10.1016/j.geoderma.2021.115567.
Chevallier, T., Razafimbelo, T.M., Chapuis-Lardy, L., Brossard, M., 2020. Carbone des sols en Afrique. Impacts des usages des sols et des pratiques agricoles. FAO/IRD, Rome/Marseille. https://doi.org/10.4000/books.irdeditions.34867, 268p.
Corbane, C., Politis, P., Kempeneers, P., Simonetti, D., Soille, P., Burger, A., Pesaresi, M., Sabo, F., Syrris, V., Kemper, T., 2020. A global cloud free pixel-based image composite from Sentinel-2 data. In: Data in Brief, 31, p. 105737. ISSN 2352–3409 (online). JRC120356. https://doi.org/10.1016/j.dib.2020.105737.
Dharumarajan, S., Kalaiselvi, B., Suputhra, A., Lalitha, M., Vasundhara, R., Kumar, K.A., Nair, K.M., Hegde, Ragendra, Sign, S.K., Lagacherie, P., 2021. Digital soil mapping of soil organic carbon stocks in Western Ghats. S. India Geoderma Reg. 25, e00387 https://doi.org/10.1016/j.geodrs.2021.e00387.
Doetterl, S., Stevens, A., Van Oost, K., Quine, T.A., Van Wesemael, B., 2013. Spatially-explicit regional-scale prediction of soil organic carbon stocks in cropland using environmental variables and mixed model approaches. Geoderma 204, 31–42. https://doi.org/10.1016/j.geoderma.2013.04.007.
Drake, J.M., Randin, C., Guisan, A., 2006. Modelling ecological niches with support vector machines. J. Appl. Ecol. 43 (3), 424–432. https://doi.org/10.1111/j.1365-2664.2006.01141.x.
Eldeiry, A.A., Garcia, L.A., 2010. Comparison of ordinary kriging, regression kriging, and cokriging techniques to estimate soil salinity using LANDSAT images. J. Irrig. Drain. Eng. 136 (6), 355–364. https://doi.org/10.1061/(ASCE)IR.1943-4774.0000208.
FAO, 2015. Aquastat Profil de pays_Tunisie. Organisation des Nations Unies pour l’alimentation et l’agriculture, Rome, Italie.
FAO, 2017. Global Soil Organic Carbon Map, The Soil Day.
Fick, S.E., Hijmans, R.J., 2017. WorldClim 2: new 1km spatial resolution climate surfaces for global land areas. Int. J. Climatol. 37 (12), 4302–4315. https://doi.org/10.1002/joc.5086.
Gallant, J.G., Dowling, T.I., 2003. A multiresolution index of valley bottom flatness for mapping depositional areas. Water Resour. Res. 39 (12), 1347. https://doi.org/10.1029/2002WR001426.
Gholizadeh, A., Žižala, D., Saberioon, M., Borůvka, L., 2018. Soil organic carbon and texture retrieving and mapping using proximal, airborne and Sentinel-2 spectral imaging. Remote Sens. Environ. 218, 89–103. https://doi.org/10.1016/j.rse.2018.09.015.
Gray, J.M., Bishop, T.F.A., Yang, X., 2015. Pragmatic models for the prediction and digital mapping of soil properties in eastern Australia. Soil Res. 53 (1), 24–42. https://doi.org/10.1071/SR13306.
Griffiths, R.P., Madritch, M.D., Swanson, A.K., 2009. The effects of topography on forest soil characteristics in the Oregon Cascade Mountains (USA): implications for the effects of climate change on soil properties. For. Ecol. Manag. 257 (1), 1–7. https://doi.org/10.1016/j.foreco.2008.08.010.
Grimm, R., Behrens, T., Märker, M., Elsenbeer, H., 2008. Soil organic carbon concentrations and stocks on Barro Colorado Island – digital soil mapping using random forests analysis. Geoderma 146 (1–2), 102–113. https://doi.org/10.1016/j.geoderma.2008.05.008.
Guevara, M., Olmedo, G.F., Stell, E., Yigini, Y., Aguilar Duarte, Y., Arellano Hernández, C., Arévalo, G.E., Arroyo-Cruz, C.E., Bolivar, A., Bunning, S., Cañas, N.B., Cruz-Gaistardo, C.O., Davila, F., Dell Acqua, M., Encina, A., Tacona, H.F., Fontes, F., Hernández Herrera, J.A., Navarro, A.R.I., Loayza, V., Manueles, A.M., Jara, F.M., Olivera, C., Hermosilla, R.O., Pereira, G., Prieto, P., Ramos, I.A., Brina, J.C.R., Rivera, R., Rodríguez-Rodríguez, R., Roopnarine, R., Ibarra, A.R., Riveiro, K.A.R., Schulz, G.A., Spence, A., Vasques, M., Vargas, R., Vargas, R., 2018. No silver bullet for digital soil mapping: country-specific soil organic carbon estimates across Latin America. Soil 4 (3), 173–193. https://doi.org/10.5194/soil-4-173-2018.
Guo, L.J., Zhang, Z.S., Wang, D.D., Li, C.F., Cao, C.G., 2015. Effects of short-term conservation management practices on soil organic carbon fractions and microbial community composition under a rice-wheat rotation system. Biol. Fertil. Soils 51 (1), 65–75. https://doi.org/10.1007/s00374-014-0951-6.
Helfenstein, A., Mulder, V.L., Heuvelink, B.M., Okx, J.P., 2022. Tier 4 maps of soil pH at 25 m resolution for the Netherlands. Geoderma 410, 115659. https://doi.org/10.1016/j.geoderma.2021.115659.
Hengl, T., de Jesus, J.M., MacMillan, R.A., Batjes, N.H., Heuvelink, G.B., Ribeiro, E., Samuel-Rosa, A., Kempen, B., Leenaars, J.G.B., Walsh, M.G., Gonzalez, M.R., 2014. SoilGrids1km – global soil information based on automated mapping. PLoS One 9 (8), e105992. https://doi.org/10.1371/journal.pone.0114788.
Hengl, T., Heuvelink, G.B., Kempen, B., Leenaars, J.G., Walsh, M.G., Shepherd, K.D., Sila, A., MacMillan, R.A., de Jesus, J.M., Tamene, L., Tondoh, J.E., 2015. Mapping soil properties of Africa at 250 m resolution: random forests significantly improve current predictions. PLoS One 10 (6), e0125814. https://doi.org/10.1371/journal.pone.0125814.
Hengl, T., Mendes de Jesus, J., Heuvelink, G.B., Ruiperez Gonzalez, M., Kilibarda, M., Blagotić, A., Shangguan, W., Wright, M.N., Geng, X., Marschallinger, B.B., Guevara, M.A., Vargas, R., MacMillan, R.A., Batjes, N.H., Leenaars, J.G.B.,


Ribeiro, E., Wheeler, I., Mantel, S., Kempen, B., 2017. SoilGrids250m: global gridded soil information based on machine learning. PLoS One 12 (2), e0169748. https://doi.org/10.1371/journal.pone.0169748.
Henry, M., Valentini, R., Bernoux, M., 2009. Soil carbon stocks in ecoregions of Africa. Biogeosci. Discuss. 6 (1), 797–823. https://doi.org/10.5194/bgd-6-797-2009.
Hinge, G., Surampalli, R.Y., Goyal, M.K., 2018. Prediction of soil organic carbon stock using digital mapping approach in humid India. Environ. Earth Sci. 77 (5), 1–10. https://doi.org/10.1007/s12665-018-7374-x.
Hobley, E., Wilson, B., Wilkie, A., Gray, J., Koen, T., 2015. Drivers of soil organic carbon storage and vertical distribution in eastern Australia. Plant Soil 390 (1), 111–127. https://doi.org/10.1007/s11104-015-2380-1.
Hutter, F., Hoos, H.H., Leyton-Brown, K., 2011. Sequential Model-Based Optimization for General Algorithm Configuration, 507–523. Springer Berlin Heidelberg, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25566-3_40.
Jones, D.R., Schonlau, M., Welch, W.J., 1998. Efficient global optimization of expensive black-box functions. J. Glob. Optim. 13, 455–492. https://doi.org/10.1023/A:1008306431147.
Kalambukattu, J.G., Kumar, S., Raj, R.A., 2018. Digital soil mapping in a Himalayan watershed using remote sensing and terrain parameters employing artificial neural network model. Environ. Earth Sci. 77 (5), 1–14. https://doi.org/10.1007/s12665-018-7367-9.
Lagacherie, P., Arrouays, D., Bourennane, H., Gomez, C., Nkuba-kasanda, L., 2020. Analysing the impact of soil spatial sampling on the performances of digital soil mapping models and their evaluation: a numerical experiment on quantile random Forest using clay contents obtained from Vis-NIR-SWIR hyperspectral imagery. Geoderma 375. https://doi.org/10.1016/j.geoderma.2020.114503.
Lal, R., 2004. Soil carbon sequestration impacts on global climate change and food security. Science 304 (5677), 1623–1627. https://doi.org/10.1126/science.1097396.
Loiseau, T., Chen, S., Mulder, V.L., Dobarco, M.R., Richer-de-Forges, A.C., Lehmann, S., Bourennane, H., Saby, N.P.A., Martin, M.P., Vaudour, E., Gomez, C., Lagacherie, P., Arrouays, D., 2019. Satellite data integration for soil clay content modelling at a national scale. Int. J. Appl. Earth Obs. Geoinf. 82, 101905 https://doi.org/10.1016/j.jag.2019.101905.
Lozano-García, B., Parras-Alcántara, L., Brevik, E.C., 2016. Impact of topographic aspect and vegetation (native and reforested areas) on soil organic carbon and nitrogen budgets in Mediterranean natural areas. Sci. Total Environ. 544, 963–970. https://doi.org/10.1016/j.scitotenv.2015.12.022.
Mansuy, N., Thiffault, E., Paré, D., Bernier, P., Guindon, L., Villemaire, P., Vincent, P., Beaudoin, A., 2014. Digital mapping of soil properties in Canadian managed forests at 250 m of resolution using the k-nearest neighbor method. Geoderma 235, 59–73. https://doi.org/10.1016/j.geoderma.2014.06.032.
McBratney, A.B., Santos, M.M., Minasny, B., 2003. On digital soil mapping. Geoderma 117 (1–2), 3–52. https://doi.org/10.1016/S0016-7061(03)00223-4.
McNicol, G., Bulmer, C., D’Amore, D., Sanborn, P., Saunders, S., Giesbrecht, I., Arriola, S.A., Bidlack, A., Butman, D., Buma, B., 2019. Large, climate-sensitive soil carbon stocks mapped with pedology-informed machine learning in the North Pacific coastal temperate rainforest. Environ. Res. Lett. 14 (1), 014004 https://doi.org/10.1088/1748-9326/aaed52.
Meersmans, J., De Ridder, F., Canters, F., De Baets, S., Van Molle, M., 2008. A multiple regression approach to assess the spatial distribution of soil organic carbon (SOC) at the regional scale (Flanders, Belgium). Geoderma 143 (1–2), 1–13. https://doi.org/10.1016/j.geoderma.2007.08.025.
Meinshausen, N., 2006. Quantile regression forests. J. Mach. Learn. Res. 7, 983–999.
Minasny, B., McBratney, A.B., Malone, B.P., Wheeler, I., 2013. Digital mapping of soil carbon. In: Sparks, D.L. (Ed.), Advances in Agronomy. Academic Press, pp. 1–47.
Mulder, V.L., Lacoste, M., Richer-de-Forges, A.C., Martin, M.P., Arrouays, D., 2016. National versus global modelling the 3D distribution of soil organic carbon in mainland France. Geoderma 263, 16–34. https://doi.org/10.1016/j.geoderma.2015.08.035.

carbon on a national scale: towards an improved and updated map of Madagascar. Geoderma Reg. 9, 29–38. https://doi.org/10.1016/j.geodrs.2016.12.002.
Rial, M., Cortizas, A.M., Rodríguez-Lado, L., 2017. Understanding the spatial distribution of factors controlling topsoil organic carbon content in European soils. Sci. Total Environ. 609, 1411–1422. https://doi.org/10.1016/j.scitotenv.2017.08.012.
Samuel-Rosa, A., Heuvelink, G.B.M., Vasques, G.M., Anjos, L.H.C., 2015. Do more detailed environmental covariates deliver more accurate soil maps? Geoderma 243, 214–227. https://doi.org/10.1016/j.geoderma.2014.12.017.
Shrestha, D.L., Solomatine, D.P., 2006. Machine learning approaches for estimation of prediction interval for the model output. Neural Netw. 19, 225–235. https://doi.org/10.1016/j.neunet.2006.01.012.
Song, X.D., Wu, H.Y., Ju, B., Liu, F., Yang, F., Li, D.C., Zhao, Y.G., Yang, J.L., Zhang, G.L., 2020. Pedoclimatic zone-based three-dimensional soil organic carbon mapping in China. Geoderma 363, 114145. https://doi.org/10.1016/j.geoderma.2019.114145.
Stockmann, U., Padarian, J., McBratney, A., Minasny, B., de Brogniez, D., Montanarella, L., Hong, S.Y., Rawlins, B.G., Field, D.J., 2015. Global soil organic carbon assessment. Glob. Food Sec. 6, 9–16. https://doi.org/10.1016/j.gfs.2015.07.001.
Strobl, C., Boulesteix, A.L., Zeileis, A., Hothorn, T., 2007. Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8, 25 (2007). https://doi.org/10.1186/1471-2105-8-25.
STUDI-SCOT-SODETEG, 2001. Étude des cartes agricoles régionales. Ministère de l’agriculture, de l’environnement et des ressources hydrauliques. Report and map.
Szatmári, G., Pásztor, L., 2019. Comparison of various uncertainty modelling approaches based on geostatistics and machine learning algorithms. Geoderma 337, 1329–1340. https://doi.org/10.1016/j.geoderma.2018.09.008.
Szatmári, G., Pirkó, B., Koós, S., Laborczi, A., Bakacsi, Z., Szabó, J., Pásztor, L., 2019. Spatio-temporal assessment of topsoil organic carbon stock change in Hungary. Soil Tillage Res. 195, 104410 https://doi.org/10.1016/j.still.2019.104410.
Thornes, J.B., 2002. The Evolving Context of Mediterranean Desertification. A Mosaic of Processes and Responses, Mediterranean Desertification, pp. 5–11.
Tsui, C.C., Chen, Z.S., Hsieh, C.F., 2004. Relationships between soil properties and slope position in a lowland rain forest of southern Taiwan. Geoderma 123 (1–2), 131–142. https://doi.org/10.1016/j.geoderma.2004.01.031.
Vaudour, E., Gomez, C., Loiseau, T., Baghdadi, N., Loubet, B., Arrouays, D., Ali, L., Lagacherie, P., 2019. The impact of acquisition date on the prediction performance of topsoil organic carbon from Sentinel-2 for croplands. Remote Sens. 11 (18), 2143. https://doi.org/10.3390/rs11182143.
Vaysse, K., Lagacherie, P., 2017. Using quantile regression forest to estimate uncertainty of digital soil mapping products. Geoderma 291, 55–64. https://doi.org/10.1016/j.geoderma.2016.12.017.
Viscarra Rossel, R.A., Chen, C., Grundy, M.J., Searle, R., Clifford, D., Campbell, P.H., 2015. The Australian three-dimensional soil grid: Australia’s contribution to the GlobalSoilMap project. Soil Res. 53 (8), 845–864. https://doi.org/10.1071/SR14366.
Walkley, A., Black, I.A., 1934. An examination of Degtjareff method for determining soil organic matter and a proposed modification of the chromic acid titration method. Soil Sci. 37, 29–37.
Wang, S., Zhuang, Q., Wang, Q., Jin, X., Han, C., 2017. Mapping stocks of soil organic carbon and soil total nitrogen in Liaoning Province of China. Geoderma 305, 250–263. https://doi.org/10.1016/j.geoderma.2017.05.048.
Wang, B., Waters, C., Orgill, S., Gray, J., Cowie, A., Clark, A., Li Liu, D., 2018. High resolution mapping of soil organic carbon stocks using remote sensing variables in the semi-arid rangelands of eastern Australia. Sci. Total Environ. 630, 367–378. https://doi.org/10.1016/j.scitotenv.2018.02.204.
Webster, R., Oliver, M.A., 2001. Geostatistics for Experimental Scientists. John Wiley and Sons ltd, Chichester.
Wilson, J.P., Gallant, J.C., 2000. Primary topographic attributes. In: Wilson, J.P., Gallant, J.C. (Eds.), Terrain Analysis: Principles and Applications. John Wiley & Sons, pp. 51–85.
Obu, J., Lantuit, H., Myers-Smith, I., Heim, B., Wolter, J., Fritz, M., 2017. Effect of terrain Wright, M.N., Ziegler, A., 2017. Ranger: A Fast Implementation of Random Forests for
characteristics on soil organic carbon and total nitrogen stocks in soils of Herschel High Dimensional Data in C ++ and R 77. https://doi.org/10.48550/
Island, Western Canadian Arctic. Permafr. Periglac. Process. 28 (1), 92–107. https:// arXiv.1508.04409.
doi.org/10.1002/ppp.1881. Yang, Y., Mohammat, A., Feng, J., Zhou, R., Fang, J., 2007. Storage, patterns and
Ottoy, S., De Vos, B., Sindayihebura, A., Hermy, M., Van Orshoven, J., 2017. Assessing environmental controls of soil organic carbon in China. Biogeochemistry 84 (2),
soil organic carbon stocks under current and potential forest cover using digital soil 131–141. https://doi.org/10.1007/s10533-007-9109-z.
mapping and spatial generalisation. Ecol. Indic. 77, 139–150. https://doi.org/ Zhang, C., McGrath, D., 2004. Geostatistical and GIS analyses on soil organic carbon
10.1016/j.ecolind.2017.02.010. concentrations in grassland of southeastern Ireland from two different periods.
Piccini, C., Marchetti, A., Francaviglia, R., 2014. Estimation of soil organic matter by Geoderma 119 (3–4), 261–275. https://doi.org/10.1016/j.geoderma.2003.08.004.
geostatistical methods: use of auxiliary information in agricultural and Zhang, C., Tang, Y., Xu, X., Kiely, G., 2011. Towards spatial geochemical modelling: use
environmental assessment. Ecol. Indic. 36, 301–314. https://doi.org/10.1016/j. of geographically weighted regression for mapping soil organic carbon contents in
ecolind.2013.08.009. Ireland. Appl. Geochem. 26 (7), 1239–1248. https://doi.org/10.1016/j.
Poggio, L., De Sousa, L.M., Batjes, N.H., Heuvelink, G., Kempen, B., Ribeiro, E., apgeochem.2011.04.014.
Rossiter, D., 2021. SoilGrids 2.0: producing soil information for the globe with Zhang, G.L., Liu, F., Song, X.D., 2017. Recent progress and future prospect of digital soil
quantified spatial uncertainty. Soil 7 (1), 217–240. https://doi.org/10.5194/soil-7- mapping: a review. J. Integr. Agric. 16 (12), 2871–2885. https://doi.org/10.1016/
217-2021. S2095-3119(17)61762-3.
Probst, P., Wright, M., Boulesteix, A., 2018. Hyperparameters and tuning strategies for Zhou, Y., Hartemink, A.E., Shi, Z., Liang, Z., Lu, Y., 2019. Land use and climate change
random Forest. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 1–19. https://doi. effects on soil organic carbon in north and Northeast China. Sci. Total Environ. 647,
org/10.1002/widm.1301. 1230–1238. https://doi.org/10.1016/j.scitotenv.2018.08.016.
R Core Team, 2013. R: A Language and Environment for Statistical Computing. R Zhou, T., Geng, Y., Chen, J., Pan, J., Haase, D., Lausch, A., 2020. High-resolution digital
Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/. mapping of soil organic carbon and soil total nitrogen using DEM derivatives,
Ramifehiarivo, N., Brossard, M., Grinand, C., Andriamananjara, A., Razafimbelo, T., Sentinel-1 and Sentinel-2 data based on machine learning algorithms. Sci. Total
Rasolohery, A., Razafimahatratra, H., Seyler, F., Ranaivoson, N., Rabenarivo, M., Environ. 729, 138244 https://doi.org/10.1016/j.scitotenv.2020.138244.
Albrecht, A., Razafindrabe, F., Razakamanarivo, H., 2017. Mapping soil organic
