Topic 8 - The Anatomy of A GIS Model: Beyond Mapping II
From Recipes to Models describes basic Binary and Rating model expressions using a
simple Landslide Susceptibility model
Extending Basic Models through Logic Modifications describes logic extensions to a simple
Landslide Susceptibility model, adding criteria that change the model's structure
Evaluating Map-ematical Relationships describes the differences and similarities between
the two basic types of GIS models (Cartographic and Spatial) using the Universal Soil Loss
Equation as an example
______________________________
So what's the difference between a recipe and a model? Both seem to mix a bunch of things
together to create something else. Both result in a synergistic amalgamation that's more than the
sum of the parts. Both start with basic ingredients and describe the processing steps required to
produce the desired result, be it a chocolate cake or a landslide susceptibility map.
In a GIS, the ingredients are base maps and the processing steps are spatial handling operations.
For example, a simple recipe for locating landslide susceptibility involves ingredients such as
terrain steepness, soil type, and vegetation cover; areas that are steep, unstable, and bare are the
most susceptible.
Before computers, identifying areas of high susceptibility required tedious manual map analysis
procedures. A transparency was taped over a contour map of elevation, and areas where contour
lines were spaced closely (steep) were outlined and filled with a dark color. Similar transparent
overlays were interpreted for areas of unstable soils and sparse vegetation from soil and
vegetation base maps. When the three transparencies were overlaid on a strong light source, the
combination was deciphered easily: clear = not susceptible, and dark = susceptible. That basic
recipe has been with us for a long time. Of course, the methods changed as modern drafting aids
replaced the thin parchment, quill pens, and stained glass windows of the 1800s, but the basic
recipe remains the same.
The flowchart in figure 1 depicts an alternative raster-based binary model (only two states of
either Yes or No), which mimics the manual map analysis process and achieves the same result
as the overlay/SQL query. A slope map is created by calculating the change in elevation
throughout the project area (first derivative of the elevation surface).
Figure 1. Binary, ranking and rating models of landslide susceptibility. The location indicated
by the piercing arrow contains 34 percent slope, a fairly stable soil and sparse forest cover.
A Simple Binary model solution codes as 1 all of the susceptible areas on each of the factor
maps (>30 percent slope, unstable soils, bare vegetative cover), whereas the non-susceptible
areas are coded as 0. The product of the three binary maps (SL_HAZARD (binary),
SO_HAZARD (binary), CO_HAZARD (binary)) creates a final map of landslide potential: 1 =
susceptible, and 0 = not susceptible. Only locations susceptible on all three maps retain the
"susceptible" classification (1*1*1 = 1). In the other instances, multiplying 0 times any number
forces the product to 0 (not susceptible). The map-ematical model corresponding to the
flowchart (Simple Binary model) in figure 1 might be expressed (in TMAP modeling language)
as:
SLOPE ELEVATION FOR SLOPES
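Only the first command of the macro appears above. For readers who'd rather see the whole
multiplicative combination in code, here is a minimal sketch in Python/NumPy rather than
TMAP, with made-up grid values (everything below is illustrative, not the author's macro):

import numpy as np

# Hypothetical 3x3 binary factor grids: 1 = susceptible, 0 = not susceptible.
sl_binary = np.array([[1, 1, 0], [1, 0, 0], [1, 1, 1]])  # slope > 30 percent
so_binary = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1]])  # unstable soils
co_binary = np.array([[1, 1, 1], [0, 1, 0], [0, 1, 1]])  # bare vegetative cover

# The product keeps 1 only where all three maps are 1, the map-ematical
# equivalent of stacking transparencies and looking for the darkest areas.
l_hazard = sl_binary * so_binary * co_binary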
In the multiplicative case, the arithmetic combination of the maps yields the original two states:
dark or 1 = susceptible, and clear or 0 = not susceptible (at least one data layer not susceptible).
It's analogous to the "AND" condition of the logical combination in the SQL query. However,
other combinations can be derived. For example, the visual analysis could be extended by
interpreting the various shades of gray on the stack of transparent overlays: clear = not
susceptible, light gray = low susceptibility, medium gray = moderate susceptibility and dark gray
= high susceptibility.
In an analogous map-ematical approach, the computed sum of the three binary maps yields a
similar ranking: 0 = not susceptible, 1 = low susceptibility, 2 = moderate susceptibility and 3 =
high susceptibility (1 + 1 + 1 = 3). That approach is called a Binary Ranking model, because it
develops an ordinal scale of increasing landslide potential: a value of 2 is more susceptible
than a value of 1, but not necessarily twice as susceptible.
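Continuing the hypothetical NumPy sketch above, the ranking variant simply substitutes
addition for multiplication:

# Summing instead of multiplying yields an ordinal 0-3 susceptibility scale.
l_rank = sl_binary + so_binary + co_binary  # 0 = none ... 3 = high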
A rating model is different, because it uses a consistent scale with more than two states to
characterize the relative landslide potential for various conditions on each factor map. For
example, a value of 1 is assigned to the least susceptible steepness condition (e.g., from 0 percent
to 5 percent slope), while a value of 9 is assigned to the most susceptible condition (e.g., >30
percent slope). The intermediate conditions are assigned appropriate values between the
landslide susceptibility extremes of 1 and 9. That calibration results in three maps with relative
susceptibility ratings (SL_HAZARD (rate), SO_HAZARD (rate), CO_HAZARD (rate)) based
on the 1-9 scale of relative landslide susceptibility.
Computing the simple average (Simple Rating model) of the three rate maps determines an
overall landslide potential based on the relative ratings for each factor at each map location. For
example, a particular grid cell might be rated 9, because it's steep, 3 because its soil is fairly
stable, and 3 because its forest cover is sparse. The average landslide susceptibility rating under these
conditions is [(9+3+3)/3] = 5, indicating a moderate landslide potential.
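In the same spirit, a NumPy sketch of the Simple Rating model (again an illustration, with
hypothetical 1-9 rate grids rather than calibrated data):

import numpy as np

# Hypothetical 1-9 rate maps (1 = least susceptible, 9 = most susceptible).
sl_rate = np.array([[9, 7, 2], [9, 3, 1], [8, 9, 9]])
so_rate = np.array([[7, 3, 1], [6, 5, 2], [1, 5, 9]])
co_rate = np.array([[9, 8, 6], [1, 3, 1], [2, 7, 9]])

# Simple average of the three ratings at every grid cell.
l_rating = (sl_rate + so_rate + co_rate) / 3  # e.g., (9 + 3 + 3) / 3 = 5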
A weighted average of the three maps (Weighted Rating model) expresses the relative
importance of each factor to determine overall susceptibility. For example, steepness might be
identified as five times more important than either soils or vegetative cover in estimating
landslide potential. For the example grid cell described previously, the weighted average
computes to [((9*5) + 3 + 3) / 7] = 7.29, which is closer to a high overall rating. The weighted
average is influenced preferentially by the SL_HAZARD (rate) map's high rating, yielding a
much higher overall rating than the simple average.
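The Weighted Rating model is then a one-line change on the rate grids above, using NumPy's
weighted average (weights as in the example; still a sketch, not the author's macro):

# Weighted average: slope weighted 5, soils and cover weighted 1 each.
weights = np.array([5, 1, 1])
stack = np.stack([sl_rate, so_rate, co_rate])
l_weighted = np.average(stack, axis=0, weights=weights)  # (9*5 + 3 + 3) / 7 = 7.29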
All that may be a bit confusing. The four different "recipes" for landslide potential produced
strikingly different results for the example grid cell in figure 1, ranging from not susceptible to high
susceptibility. It's like baking banana bread. Some folks follow the traditional recipe; some add
chopped walnuts or a few cranberries. By the time diced dates and candied cherries are tossed
in, you can't tell the difference between your banana bread and last year's fruitcake.
So back to the main point-what's the difference between a recipe and a model? Merely
semantics? Simply marketing jargon? The real difference between a recipe and a model isn't in
the ingredients or the processing steps themselves. It's in the conceptual fabric of the process,
but more on that later.
The previous section described various renderings of a landslide susceptibility model. It related
the results obtained for an example location using manual, logical combination, binary, ranking,
and rating models. The results ranged from not susceptible to high susceptibility. Two factors in
model expression were at play: the type of model and its calibration.
However, the model structure, which identified the factors considered and how they interact,
remained constant. In the example, the logic was constrained to jointly considering terrain
steepness, soil type, and vegetation cover. One could argue other factors might contribute to
landslide potential. What about depth to bedrock? Or previous surface disturbance? Or slope
length? Or precipitation frequency and intensity? Or gopher population density? Or just about
anything else you might dream up?
That's it. You've got the secret to seat-of-the-pants GISing. First you address the critical factors,
and then extend your attention to other contributing factors. In the abstract it means adding
boxes and arrows to the flowchart to reflect the added logic. In practice it means expanding the
GIS macro code, and most importantly wrestling with the model's calibration.
For example, it's easy to add a fourth row to the landslide flowchart, identifying the additional
criterion of depth to bedrock, and tie it to the other three factors. It's even fairly easy to add the
new lines of code to the GIS macro (Binary Ranking model):
RENUMBER DEPTH_BR
ASSIGNING 0 TO 0 THRU 4
ASSIGNING 1 TO 5 THRU 15
FOR BR_BINARY
COMPUTE SL_BINARY PLUS
SO_BINARY PLUS
CO_BINARY PLUS BR_BINARY
FOR L_HAZARD
In the macro, the RENUMBER operation identifies depth to bedrock greater than 4 meters as
susceptible (coded 1), and the COMPUTE operation sums the four binary maps; for example,
1 + 1 + 1 + 1 = 4 identifies extremely hazardous areas.
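A rough NumPy analogue of the macro (hypothetical names, continuing the earlier sketch's
binary grids; depth_br is assumed to hold depth to bedrock in meters):

depth_br = np.random.randint(0, 16, (3, 3))  # stand-in depth-to-bedrock grid (m)

# RENUMBER: assign 1 to depths of 5 through 15, 0 to depths of 0 through 4.
br_binary = np.where((depth_br >= 5) & (depth_br <= 15), 1, 0)

# COMPUTE: sum the four binary maps; a value of 4 flags extremely hazardous areas.
l_hazard = sl_binary + so_binary + co_binary + br_binary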
Things get a lot tougher when you have to split hairs about precisely what soil depths increase
landslide susceptibility (is >4 meters a good guess?).
The previous discussions focused on the hazard of landslides, but not their risk. Do we really
care about landslides unless there is something valuable in the way? Risk implies the threat a
hazard imposes on something valuable. Common sense suggests that a landslide hazard distant
from important features represents a much smaller threat than a similar hazard adjacent to a
major road or school.
Figure 1. Extends the basic landslide susceptibility model to isolate hazards around roads
(simple proximity mask).
The top portion of figure 1 shows the flowchart and commands for the basic binary landslide
model. The lower portion identifies a risk extension to the basic model that considers proximity to
important features as a risk indicator. In the flowchart, a map of proximity to roads (R_PROX)
is generated that identifies the distance from every location to the nearest road. Increasing map
values indicate locations farther from a road. A binary map of buffers around roads
(R_BUFFERS) is created by renumbering the distance values near roads to 1 and far from roads
to 0. By multiplying this "masking" map by the landslide susceptibility map (L_HAZARD), the
landslide threat is isolated for just the areas around roads (risk).
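A minimal sketch of that proximity mask, using SciPy's Euclidean distance transform as a
stand-in for the proximity operation (grids and threshold are made up for illustration):

import numpy as np
from scipy import ndimage

roads = np.zeros((50, 50), dtype=int)
roads[25, :] = 1  # hypothetical road running across the grid
l_hazard = np.random.randint(0, 2, roads.shape)  # stand-in binary hazard map

# R_PROX: distance (in cells) from every location to the nearest road cell.
r_prox = ndimage.distance_transform_edt(roads == 0)

# R_BUFFERS: renumber near distances to 1 and far distances to 0.
r_buffers = (r_prox <= 3).astype(int)

# Masking: landslide threat isolated to just the areas around roads (risk).
l_risk = r_buffers * l_hazard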
A further extension to the model involves variable-width buffers as a function of slope (figure 2).
The logic in that refinement is that in steep areas the buffer width increases as a landslide poses a
greater threat. The threat diminishes in gently sloped areas, so the buffer width contracts. The
weighted buffer extension calibrates the slope map into an impedance map (FRICTION), which
guides the proximity measurement.
Instead of a constant geographic reach around the roads, the effective buffer varies in width, as a
function of slope, throughout the map area. As before, the buffer can be used as a binary mask to
isolate the hazards within the variable reach of the roads.
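Friction-guided (effective) proximity amounts to a least-cost accumulation outward from the
road cells. Below is a compact, self-contained sketch (Python, not TMAP), assuming one
plausible calibration in which steeper cells cost less to cross so the buffer reaches farther in
steep terrain:

import heapq
import numpy as np

def weighted_proximity(friction, sources):
    # Accumulated least-cost distance from source cells (4-neighbor moves).
    rows, cols = friction.shape
    dist = np.full(friction.shape, np.inf)
    heap = []
    for r, c in sources:
        dist[r, c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Step cost: average friction of the two cells crossed.
                nd = d + 0.5 * (friction[r, c] + friction[nr, nc])
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist

slope = np.random.rand(50, 50) * 40        # stand-in percent-slope surface
friction = 1.0 / (1.0 + slope / 10.0)      # steeper = lower friction (one guess)
road_cells = [(25, c) for c in range(50)]  # the hypothetical road again
r_wprox = weighted_proximity(friction, road_cells)
r_wbuffer = (r_wprox <= 2.0).astype(int)   # variable-width effective buffer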
That iterative refinement characterizes a typical approach to GIS modeling from simple to
increasingly complex. Most applications first mimic manual map analysis procedures and are
then extended to include more advanced spatial analysis tools. For example, a more rigorous
map-ematical approach to the previous extension might use a mathematical function to combine
the effective proximity (R_WPROX) with the relative hazard rating (L_HAZARD) to calculate a
risk index for each location.
For your enjoyment, some additional extensions are suggested below. Can you modify the
flowchart to reflect the changes in model logic? If you have TMAP, can you develop the
additional code? If you're a malleable undergraduate, you have to if you want to pass the course.
But if you're a professional, you need not concern yourself with such details. Just ask the
18-year-old GIS hacker down the hall to do your spatial reasoning.
HAZARD SUBMODEL MODIFICATIONS
- Consideration of other physical factors, such as bedrock type, depth to bedrock, faulting, etc.
- Consideration of disturbance factors, such as construction cuts and fills
- Consideration of environmental factors, such as recent storm frequency, intensity and duration
- Consideration of seasonal factors, such as freezing and thawing cycles in early spring
- Consideration of historical landslide data and earthquake frequency
RISK SUBMODEL MODIFICATIONS
- Consideration of additional important features, such as public, commercial, and residential structures
- Extension to differentially weight the uphill and downhill slopes from a feature to calculate the effective buffer
- Extension to preferentially weight roads based on traffic volume, emergency routes, etc.
- Extension to include an economic valuation of threatened features and potential resource loss
As noted in the two previous sections, GIS applications come in a variety of forms. The
differences aren't as much in the ingredients (maps) or the processing steps (command macros)
as in the conceptual fabric of the process. In the evolving extensions of the landslide
susceptibility model, differences in approach can arise through model logic and/or model
expression. A Simple Binary susceptibility model (only two states of either Yes or No) is
radically different from a Weighted Rating model using a weighted average of relative
susceptibility indices. In mathematical terms, the rating model is more robust, because it
provides a continuum of system responses. Also, it provides a foothold to extend the model even
further.
There are two basic types of GIS models: cartographic and spatial. In short, a cartographic
model focuses on automating manual map analysis techniques and reasoning, and a spatial model
focuses on expressing mathematical relationships. In the landslide example, the logical
combination and the binary map algebra solutions are obviously cartographic models. Both
could be manually solved using file cabinets and transparent overlays: tedious, but feasible for
the infinitely patient. The weighted average rating model, however, smacks of down and dirty
map-ematics and looks like a candidate spatial model. But is it?
As with most dichotomous classifications there is a gray area of overlap between cartographic
and spatial model extremes. If the weights used in rating model averaging are merely
guesstimates, then the application lacks all of the rights, privileges, and responsibilities of an exalted
spatial model. The model may be mathematically expressed, but the logic isn't mathematically
derived, or empirically verified. In short, "Where's the science?"
One way to infuse a sense of science is to perform some data mining. That involves locating a
lot of areas with previous landslides, then pushing a predictive statistical technique through a
stack of potential driving variable maps. For example, you might run a regression on landslide
occurrence (dependent mapped variable) with %slope, %clay, %silt, and %cover (independent
mapped variables). If you get a good fit, then substitute the regression equation for the weighted
average in the rating model. That approach is at the threshold of science, but it presumes your
database contains just the right set of maps over a large area. An alternative is to launch a series
of "controlled" experiments under various conditions (%slope, soil composition, cover density,
etc.) and derive a mathematical model through experiment. That's real science, but it consumes a
lot of time, money, and energy.
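As a toy illustration of the data-mining idea (random stand-in data; a real study would fit
against observed slide locations, likely with logistic rather than ordinary regression):

import numpy as np

# Stand-in driving-variable maps and a 0/1 landslide occurrence map.
slope, clay, silt, cover = np.random.rand(4, 50, 50)
occurrence = (np.random.rand(50, 50) > 0.9).astype(float)

# Flatten each map to a column and fit least squares with an intercept term.
X = np.column_stack([np.ones(slope.size)] +
                    [m.ravel() for m in (slope, clay, silt, cover)])
coef, *_ = np.linalg.lstsq(X, occurrence.ravel(), rcond=None)

# The fitted equation replaces the guesstimated weighted average in the model.
predicted = (coef[0] + coef[1] * slope + coef[2] * clay
             + coef[3] * silt + coef[4] * cover)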
A potential shortcut involves reviewing the scientific literature for an existing mathematical
model and using it. That approach is used in figure 1, a map-ematical evaluation of the Revised
Universal Soil Loss Equation (RUSLE), kind of like landslides from a bug's perspective. The
expected soil loss per acre from an area, such as a farmer's field, is determined from the product
of six factors: the rainfall, the erodibility of the soil, the length and steepness (gradient) of the
ground slope, the crop grown in the soil, and the land practices used. The RUSLE equation and
its variable definitions are shown in figure 1. The many possible numerical values for each
factor require extensive knowledge and preparation. However, a soil conservationist normally
works in a small area, such as a single county, and often needs only one or two rainfall factors
(R), values for only a few soils (K), and only a few cropping/practices systems (C and P). The
remaining terrain data (L and S) are tabulated for individual fields.
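In product form, the standard RUSLE equation is

A = R * K * L * S * C * P

where A is the predicted average annual soil loss per unit area, R is the rainfall-runoff
erosivity factor, K the soil erodibility factor, L and S the slope length and steepness factors,
C the cover-management factor, and P the support practice factor.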
Figure 1. Basic GIS model of the Revised Universal Soil Loss Equation and extensions.
The RUSLE model can be evaluated two ways: aggregated or disaggregated. An aggregated
model uses a spatial database management system (DBMS) to store the six factors for each field,
and then solves the equation through a database query. A map of predicted soil loss by
individual field can be displayed, and the total loss for an entire watershed can be calculated by
summing each of the constituent field losses (loss per acre multiplied by number of acres). That
RUSLE implementation provides several advancements, such as geo-query access, automated
acreage calculations, and graphic display, over the current procedures.
However, it also raises serious questions. Many fields don't fit the assumptions of an aggregated
model. Field boundaries reflect ownership rather than uniformly distributed RUSLE variables.
Just ask any farmer about field variability (particularly if their field's predicted soil loss puts
them out of compliance). A field might have two or more soils, and it might be steep at one end
and flat on the other. Such spatial variation is known to the GIS (e.g., soil and slope maps), but
not used by the aggregated model. A disaggregated model breaks an analysis unit (farmer's field
in this case) into spatially representative subunits. The equation is evaluated for each of the
subunits, and then combined for the parent field.
In a vector system, the subunits are derived by overlaying maps of the six RUSLE factors,
independent of ownership boundaries. In a raster system, each cell in the analysis grid serves as
a subunit. The equation is evaluated for each "composite polyglet" or "grid cell," then
weight-averaged by area for the entire field. If a field contains three different factor conditions, the
predicted soil loss proportionally reflects each subunit's contribution. The aggregated approach
requires the soil conservationist to fudge the parameters for each of the conditions into generally
representative values, and then run the equation for the whole field. Also, the aggregated
approach loses spatial guidance for the actual water drainage; a field might drain into two or
more streams in different proportions.
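A minimal sketch of the disaggregated, cell-by-cell evaluation (Python/NumPy with random
stand-in factor grids; the field-ID grid is hypothetical):

import numpy as np

# Stand-in per-cell RUSLE factor grids and a grid of field-ID labels.
R, K, L, S, C, P = np.random.rand(6, 100, 100)
fields = np.random.randint(0, 5, (100, 100))

# Evaluate the equation at every cell (the "grid cell as subunit" case).
A = R * K * L * S * C * P

# Weight-average by area: with equal-area cells, each field's average is
# just the mean of its constituent cells.
field_loss = {fid: A[fields == fid].mean() for fid in np.unique(fields)}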
Figure 1 shows several extensions to the disaggregated model. Inset 1 depicts the basic spatial
computations for soil loss. Inset 2 uses field boundaries to calculate the average soil loss for
each field based on its subunits. Inset 3 provides additional information not available with the
aggregated approach. Areas of high soil loss (AMAX) are isolated from the overall soil loss map
(A), and then combined with the FIELDS map to locate areas out of compliance. That directs
the farmer's attention to portions of the field which might require different management action.
Inset 4 enables the farmer to reverse calculate the RUSLE equation. In this case, a soil loss
tolerance (T) is established for an area, such as a watershed, and then the combinations of soil
loss factors meeting the standard are derived. Because the climatic and physiographic factors of
R, K, L, and S are beyond a farmer's control, attention is focused on vegetation cover (C) and
control practices (P). In short, the approach generates a map of the set of crop and farming
practices that keep the field within soil loss compliance, good information for decision making.
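Continuing that sketch, the Inset 4 reverse calculation follows directly from the product form:
divide the tolerance by the factors beyond the farmer's control to get the largest allowable
C*P at each cell (T and the candidate values below are hypothetical):

# T: hypothetical soil loss tolerance for the watershed.
T = 0.05

# Largest C*P product that keeps each cell within tolerance.
cp_max = T / (R * K * L * S)

# A candidate crop/practice combination is compliant wherever C*P <= cp_max.
candidate_c, candidate_p = 0.3, 0.5
compliant = (candidate_c * candidate_p) <= cp_max  # boolean compliance map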
OK, what's wrong with the disaggregated approach? Two things: our databases and our science.
For example, our digital maps of elevation may be too coarse to capture the subtle tilts and turns
that water follows. And the science behind the RUSLE equation may be too coarse (modeling
scale) to be applied to quarter-acre polyglets or cells. These limitations, however, tell us what
we need to do: improve our data and redirect our science. From that perspective, GIS is more
of a revolution in spatial reasoning than an evolution of current practice into a graphical form.
________________________
Author's Note: Let me apologize for this brief treatise on an extremely technical subject. How water cascades over
a surface, or penetrates and loosens the ground, is directed by microscopic processes. The application of GIS (or
any other expansive model) by its nature muddles the truth. The case studies presented are intended to illustrate
various GIS modeling approaches and stimulate discussion about alternatives.
From the online book Beyond Mapping II by Joseph K. Berry, www.innovativegis.com/basis/. All rights reserved. Permission to copy
for educational use is granted.