
Lecture 2

Advanced GIS and Remote Sensing Lecture Notes 2

Uploaded by

kassaye hussien
Copyright
© All Rights Reserved

Predicting Environmental Variables

Lecture 2
2.1. Environmental Variables: Concepts
• What are environmental variables?
• Environmental variables are quantitative or descriptive measures of different environmental features.
• Environmental variables can belong to different domains, including:
◦ Biology (distribution of species and biodiversity measures),
◦ Soil science (soil properties and types),
◦ Vegetation science (plant species and communities, land cover types),
◦ Climatology (climatic variables at, beneath and above the surface),
◦ Hydrology (water quantities and conditions), and similar fields.
• Environmental Variables and Remote Sensing
◦ Environmental variables are commonly collected through field sampling (supported by remote sensing), and the samples are then used to produce maps showing their distribution in an area.
◦ Accurate and up-to-date maps of environmental features are a crucial input to spatial planning, decision making, land evaluation and land degradation assessment.
Remote Sensing for Environmental Variables (EVs): General
• Green vegetation
◦ Vegetation biomass
◦ Grassland ecosystems
◦ Forest monitoring
◦ Crop condition monitoring
◦ Crop yield forecasting
• Soil landscape modeling
• Drought monitoring
Green Vegetation Variables
• Parameters or indices of green vegetation:
◦ Biomass
◦ Leaf area index
◦ Fraction of vegetation cover
◦ Vegetation height
◦ Vegetation age
◦ Leaf chlorophyll content
◦ Absorbed photosynthetically active radiation
◦ Photosynthetic efficiency, etc.
Agro-Meteorological Variables
• Parameters for crop growth stage and yield estimation:
◦ Leaf chlorophyll content
◦ Crop biomass
◦ Intercepted solar radiation
◦ Crop ground cover
◦ Canopy leaf area index
◦ Yield
2.2. Predicting Techniques
• Generally, the goal of traditional image classification is to produce discrete categories of cover types or vegetation types, rather than to extract continuous vegetation properties accurately.
• In remote sensing, the main problem is the accurate determination of environmental variables (vegetation or soil variables) from remotely sensed data.
• These variables are, for the most part, continuous (e.g. biomass, leaf area index, species richness, etc.).
• Inferring continuous variables implies that a functional relationship must be established between the predicted variable(s) and the remotely sensed data.
• This is opposed to classification studies, where the goal is to produce discrete categories.
• Several common approaches exist for extracting continuous variables (vegetation and soil variables) from remote sensing.
• Generally, the main types of techniques for predicting continuous variables include:
◦ Traditional statistical methods
◦ Machine learning (ML) methods
◦ Geostatistical methods
◦ Hybrid (ML + geostatistical) methods
2.2.1. Traditional Statistical Methods
• Traditional statistical methods are used to determine the relationship between independent variables (environmental variables or remote sensing bands) and a dependent variable (the targeted vegetation or soil parameter).
• Traditional statistical modeling techniques are mainly based on the General Linear Model, which includes:
◦ Linear regression model
◦ Stepwise regression model
◦ Ordinary least squares regression model
◦ Partial least squares regression model
• Commonly, estimates of the target variables using traditional statistical methods can be grouped into two basic approaches:
◦ Target variable-radiance relationships, and
◦ Vegetation indices approaches
1. Target-radiance relationships
• Ideally, a functional relationship exists between the independent variables (e.g. remotely sensed signals) and the estimated variables (e.g. biomass, leaf area index (LAI), etc.).
• Vegetation variable-radiance relationship approaches are best suited to high spatial resolution satellite sensor data such as Landsat TM, MSS and SPOT, because they require accurate ground measurements of vegetation cover over the same area as the pixel resolution.
• In estimating the relationship between vegetation biophysical characteristics and radiance, simple assumptions are made that allow one to develop a predictive equation in the form of a general linear model.
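As a minimal sketch of such a predictive equation, the snippet below fits a one-predictor linear model, biomass = intercept + slope × reflectance, by ordinary least squares. The sample values, the variable names and the choice of a single red-band predictor are assumptions made for illustration; they are not taken from the lecture.

```python
# Hypothetical plot data, invented for illustration only:
# red-band reflectance of five pixels and biomass measured on the ground.
reflectance = [0.05, 0.08, 0.11, 0.14, 0.17]
biomass = [9.0, 7.5, 6.0, 4.5, 3.0]  # e.g. t/ha


def fit_simple_linear(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_linear(reflectance, biomass)


def predict(x):
    """Apply the fitted predictive equation to a new reflectance value."""
    return intercept + slope * x


# The invented data are exactly linear, so the fit recovers
# slope of about -50 and intercept of about 11.5.
print(intercept, slope, predict(0.10))
```

Real field data are noisy, so the fitted line would only approximate the observations; the same function applies unchanged.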
2. Vegetation indices approaches
• One way of estimating biophysical characteristics through remote sensing is:
◦ To determine empirical or semi-empirical relationships between these parameters and combinations of reflectance from different spectral bands, that is, a vegetation index.
• The vegetation index (VI) approach uses existing vegetation indices to estimate vegetation variables.
◦ VIs have been used to assess the temporal and spatial variations of plant biophysical properties (biomass, leaf area index (LAI), vegetation fraction, etc.) and ecosystem processes (net primary productivity, exchange of energy and matter), which are highly variable in time and space.
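To make "a combination of reflectance from different spectral bands" concrete, the sketch below computes the Normalized Difference Vegetation Index, NDVI = (NIR − Red) / (NIR + Red). NDVI is chosen here only as a familiar example; the lecture does not single it out, and the pixel values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance.

    Returns None when both reflectances are zero, where the index is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical (NIR, red) reflectance pairs: dense canopy, sparse cover, bare surface.
pixels = [(0.50, 0.08), (0.30, 0.20), (0.25, 0.25)]
ndvi_values = [ndvi(nir, red) for nir, red in pixels]

for (nir, red), value in zip(pixels, ndvi_values):
    print(f"NIR={nir:.2f} red={red:.2f} NDVI={value:.3f}")
```

The resulting index values can then serve as the independent variable in an empirical regression against a field-measured variable such as biomass or LAI.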
• More than 12 vegetation indices have been proposed and correlated with vegetation variables.
◦ These indices have been developed to enhance the spectral contribution from green vegetation while minimizing contributions from soil background, sun angle, sensor view angle, senesced vegetation and the atmosphere. An index that is sensitive to the liquid water content of vegetation canopies has also been developed.
◦ The derivative of vegetation reflectance with respect to wavelength (or a related form) is common to all vegetation indices.
[Table: Definition and sources of spectral vegetation indices]
• Most VIs are sensitive to the influence of background reflectance (differences in soil moisture content, chemical composition, roughness and litter content), atmospheric effects and illumination-observation conditions.
◦ The influences of these three factors are complex, intricately coupled, and dependent on surface characteristics (Rondeaux et al. 1996).
• The number and diversity of VIs proposed so far reflect, in part, attempts to reduce the impact of these perturbing processes.
• An ideal VI should be highly sensitive to biophysical parameters while being relatively insensitive to these external influences.
Limitation of Traditional Method
• The traditional statistical models described above are linear in the coefficients and can be solved using least-squares methods.
• Limitations of the linear model for vegetation variables:
◦ This model performs poorly in predicting vegetation variables because the relations between scattered radiation above vegetation canopies and vegetation variables are nonlinear.
◦ VIs do not give the scientist any deep insight into the physical system. It is often difficult to decide what transformations to make, if any.
◦ Generally, the choice is made based on the results of previous studies in similar study areas and on trial and error.
• Although these methods are easy to apply, they assume independent and identically distributed data and demand large samples of field observations, both of which are challenging to satisfy.
• These methods also lack spatial information, making them less stable and unsuitable for delineating local changes.
2.2.2.1 Non-Linear Model
• Regression analysis employs a family of functions called Generalized Linear Models (GLMs), which all assume a relationship between the inputs and outputs that is linear in the coefficients.
◦ Hence, the output from the model fitting process is a set of regression coefficients.
• On the other hand, regression models can also represent non-linear relationships through Generalized Additive Models (GAMs).
◦ Unlike with GLMs, the relationship between the predictors and targets can be fitted either in one step or with iterative data-fitting techniques.
◦ Many studies have acknowledged that a nonlinear form (i.e. nonlinear in the coefficients) is the more realistic and potentially more accurate model.
• Non-linear models (NLMs) are here also referred to as machine learning (ML) methods.
• These methods can accommodate non-linearity and multi-collinearity, and they can help limit over-fitting when field observations are limited and auxiliary environmental information is available.
• The best-known machine learning methods are:
◦ Artificial neural network (ANN),
◦ Support vector machine (SVM),
◦ Random forest (RF) or tree-based model.
Artificial Neural Network (ANN)
• An Artificial Neural Network (ANN) is an information processing model inspired by the biological neuron system; it simulates the human learning process.
• It can learn from examples, follow non-linear paths and process information in parallel throughout its nodes (Navlani, 2019).
• It is a special type of nonlinear regression based on a set of neurons, or computing units, that are linked to each other.
• An artificial neural network is a machine learning system that consists of interconnected networks of simple processing elements.
• It has powerful pattern-recognition capabilities that enable it to learn to represent complex multivariate data patterns.
• Neural networks can perform more accurately than other statistical techniques, particularly when the feature space is complex and the source data have different statistical distributions.
• Recently, the technique has been used to estimate continuous vegetation variables (such as biomass cover, canopy density and others) and soil variables (such as nitrogen, phosphorus and soil carbon contents).
• The ANN structure used is a three-layer network consisting of input, hidden and output layers.
◦ Input neurons receive the input information; the higher the input value, the greater the activation.
◦ The activation value is then passed through the network according to the weights and transfer functions in the graph.
◦ The hidden neurons (or output neurons) sum the incoming activation values and modify the summed value with the transfer function.
◦ The activation value flows through the hidden neurons and stops when it reaches the output nodes.
• As a result, one can use the value from the output neurons to classify the data and to find relationships between the network inputs (auxiliary variables) and outputs (target variables).
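The flow of activation described above can be sketched as a single forward pass through a three-layer network. This is an illustration under stated assumptions: the weights and biases are made-up numbers (in practice they are learned during training, which this sketch omits), the transfer function is assumed to be a sigmoid, and the output neuron is assumed to be linear, as is common for regression.

```python
import math


def sigmoid(z):
    """Sigmoid transfer function (an assumed choice; others are possible)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: input layer -> hidden layer -> single output neuron."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Each hidden neuron sums its weighted inputs, then applies the transfer function.
        summed = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(sigmoid(summed))
    # The output neuron sums the weighted hidden activations (linear output).
    output = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return hidden, output


# Hypothetical network: 2 inputs (auxiliary variables), 2 hidden neurons, 1 output.
hidden_weights = [[0.5, -0.2], [0.3, 0.8]]
hidden_biases = [0.0, -0.1]
output_weights = [1.5, -1.0]
output_bias = 0.2

hidden_activations, prediction = forward(
    [0.4, 0.1], hidden_weights, hidden_biases, output_weights, output_bias
)
print(hidden_activations, prediction)
```

Training would adjust the weights so that the output matches observed target values; only the prediction step is shown here.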
[Figure: Example of ANN structure]
• Advantages of the ANN method:
◦ It is independent of the statistical distribution of the data, i.e. the method does not need explicit assumptions about data distributions.
◦ It allows complex relationships to be modelled.
2.2.2.2 Support Vector Machine (SVM)

• The Support Vector Machine (SVM) is a high-performing machine learning algorithm: a supervised binary classifier based on statistical learning theory and the structural risk minimization principle.
• Support vector machines construct an optimal hyperplane by projecting the data onto a new hyperspace by means of kernel functions. The hyperplane separates the classes with the widest possible margin.
• Kernel functions often used with SVM include:
◦ Polynomial (PL),
◦ Radial basis function (RBF),
◦ Sigmoid (SIG), and
◦ Linear (LN).
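The four kernels listed above can be written out in their commonly used forms, as sketched below. The parameter names (gamma, coef0, degree) and their default values are assumptions that follow a widespread convention; the lecture does not specify them.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """Linear (LN): k(x, y) = x . y"""
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial (PL): k(x, y) = (gamma * x . y + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function (RBF): k(x, y) = exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """Sigmoid (SIG): k(x, y) = tanh(gamma * x . y + coef0)"""
    return math.tanh(gamma * dot(x, y) + coef0)


# Two hypothetical feature vectors, e.g. two covariates at two sample points.
a, b = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(a, b), polynomial_kernel(a, b, degree=2), rbf_kernel(a, b), sigmoid_kernel(a, b))
```

Each kernel returns a similarity value between two observations; the SVM uses these values in place of explicit coordinates in the projected space.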
• The SVM model can predict the category of a new observation (e.g. Activity-2) and can also predict continuous values (e.g. topsoil organic carbon stock density). It can solve regression problems using environmental covariates and soil organic carbon (SOC) data (FAO, 2018).
• The SMO (sequential minimal optimization) algorithm is used to solve the quadratic programming optimization problem step by step. It updates the support vector regression function, as shown in Equation (3.3).
2.2.3. Random Forest (RF)
• Random Forest is a tree-based learner that combines decision trees with the bagging method. Its algorithm works as follows:
◦ First, it builds a "forest", an ensemble of decision trees, most of the time trained with the "bagging" method. Randomness enters because each tree is built from a randomly selected set of data points and features, which are used to decide the best split at each node of the decision tree.
◦ The final prediction is calculated by averaging the predictions from all decision trees.
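The two steps above (random resampling of data points and features, then averaging) can be sketched with a deliberately simplified forest. The simplifications are assumptions of this sketch, not part of the standard algorithm: every tree is a one-split "stump", and each tree uses one randomly chosen feature instead of drawing a feature subset at every node. The sample data and function names are hypothetical.

```python
import random


def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on a single feature by minimising squared error."""
    best = None
    values = sorted(set(row[feature] for row in rows))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((t - left_mean) ** 2 for t in left)
        error += sum((t - right_mean) ** 2 for t in right)
        if best is None or error < best[0]:
            best = (error, feature, threshold, left_mean, right_mean)
    if best is None:
        # The sample has a single distinct value: fall back to predicting the mean.
        mean = sum(targets) / len(targets)
        return (feature, float("inf"), mean, mean)
    return best[1:]


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Build the ensemble: one bootstrap sample and one random feature per tree."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bagging: sample data points with replacement.
        indices = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in indices]
        sample_targets = [targets[i] for i in indices]
        # Random feature selection (simplified to one feature per tree).
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump(sample_rows, sample_targets, feature))
    return forest


def predict(forest, row):
    """Average the predictions of all trees."""
    votes = [left if row[f] <= thr else right for f, thr, left, right in forest]
    return sum(votes) / len(votes)


# Hypothetical training data: (vegetation index, elevation in m) -> biomass.
rows = [(0.2, 100), (0.3, 150), (0.4, 200), (0.6, 300), (0.7, 350), (0.8, 400)]
targets = [2.0, 3.0, 4.0, 8.0, 9.0, 10.0]

forest = fit_forest(rows, targets)
print(predict(forest, (0.2, 100)), predict(forest, (0.8, 400)))
```

Averaging over many trees trained on different resamples is what smooths out the instability of any single tree.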
• Advantages of Random Forest:
◦ One big advantage of random forest is that it can be used for both classification and regression problems, which form the majority of current machine learning systems.
◦ Tree-based models are often easier to interpret when a mix of continuous and discrete variables is used as predictors.
◦ They are fitted by successively splitting a dataset into increasingly homogeneous groupings.
◦ The output from the model fitting process is a decision tree, which can then be applied to predict either individual property values or class types for an entire area of interest.
Limitations of Non-linear Methods?
• Non-linear or non-parametric methods: Artificial Neural Networks, Decision Trees, Expert Systems...
• Non-parametric methods are increasingly used to model and map continuous spatial properties.
◦ These can use more ancillary variables than explicitly spatial methods.
• However, they are usually assessed using non-spatial global error measures, which:
◦ Summarize many data points in a single value
◦ Cannot easily identify where the model is correct
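A small sketch of this limitation, assuming root mean square error (RMSE) as the global error measure: the single global value hides the fact that the model is accurate in one area and poor in another. The validation points and region names are invented for illustration.

```python
import math

# Hypothetical validation points: (region, observed value, predicted value).
validation = [
    ("north", 10.0, 10.5),
    ("north", 12.0, 11.5),
    ("south", 8.0, 12.0),
    ("south", 9.0, 5.0),
]


def rmse(pairs):
    """Root mean square error over (observed, predicted) pairs."""
    return math.sqrt(sum((obs - pred) ** 2 for obs, pred in pairs) / len(pairs))


# Global measure: one number for the whole map.
global_rmse = rmse([(obs, pred) for _, obs, pred in validation])

# Same measure computed per region, which shows where the model does well.
by_region = {}
for region, obs, pred in validation:
    by_region.setdefault(region, []).append((obs, pred))
regional_rmse = {region: rmse(pairs) for region, pairs in by_region.items()}

print(global_rmse, regional_rmse)
```

Here the northern points are predicted closely and the southern ones poorly, yet the global RMSE alone would not reveal that difference.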
