3_Data Models, Raster, Vector

The document discusses the differences between raster and vector data models in GIS, focusing on aspects such as data construction, storage, and display. It emphasizes the importance of understanding spatial data characteristics and the mapping process, including the representation of real-world features through points, lines, and areas. Additionally, it addresses the dynamic nature of the real world and the challenges of accurately modeling complex systems within GIS.

Uploaded by

Sheeraz Ahmed
Data Models, Raster, Vector
Objectives
• Become familiar with the distinction between the raster and vector
models regarding:
• data construction
• data storage
• data display

Source: GOZDYRA/ PENFOLD (1999)


Data - information: “Information is the fish in the data ocean”

• Data are observations we make from monitoring the real world.
Data are collected as facts or evidence that may be processed to
give them meaning and turn them into information.
• There is a clear distinction between data and
information, although the two terms are often used
interchangeably.

Examples of raster (a–d) and vector GIS data layers (e–h)


Primary or secondary data
• There is a wide variety of data sources, though all data fall into one
of two categories: primary or secondary. Primary data are collected
through first-hand observation (surveying, field sensing, remote
sensing).
• Secondary data will have been collected by another individual or
organization, for example a cartographic map. Many secondary data
sources are published and include maps, population census details
and meteorological data.
Data - MAP
• All primary and secondary data have three modes or dimensions:
temporal, thematic and spatial. For all data it should be possible to
identify each of these three modes.
• GIS place great emphasis on the use of the spatial dimension for
turning data into information, which, in turn, assists our
understanding of geographic phenomena.
The mapping process
• establish the purpose the map is to serve;
• define the scale at which the map is to be produced;
• select the features (spatial entities) from the real world which must be portrayed
on the map;
• choose a method for the representation of these features (points, lines and areas);
• generalize these features for representation in two dimensions;
• adopt a map projection for placing these features onto a flat piece of paper;
• apply a spatial referencing system to locate these features relative to each other;
and
• annotate the map with keys, legends and text to facilitate use of the map.
(after Robinson et al., 1995)
GIS as a model of the world
• All Geographical Information Systems are computer representations
of some aspect of the real world. It would be impossible to represent
all the features in which you are interested in a computer, so GIS
present a simplified view of the world.
• A model is ‘a synthesis of data’ (Haggett and Chorley, 1967) which is
used as a ‘means of “getting to grips” with systems whose spatial
scale or complexity might otherwise put them beyond our mental
grasp’ (Hardisty et al., 1993). GIS is used to help build models where it
would be impossible to synthesize the data by any other means.
GIS as a model of the world
• However, in order to model the real world in a GIS, it has to be
reduced to a series of abstract features or basic spatial entities (such
as points, lines and areas) and this is not without problems.
• The reductionist nature of GIS has been severely criticized in some
circles as being too simplistic for modelling complex human systems.
• Before looking at how spatial models are constructed using a GIS it is
necessary to consider the character of the spatial data they use as
their raw material.
Spatial entities

• Traditionally, maps have used symbols to represent real-world
features. Examination of a map will reveal three basic symbol types:
points, lines and areas (Monmonier, 1996).
• Newer symbol types: networks, point clouds
• These were introduced as the basic spatial entities. Each is a simple
two-dimensional model that can be used to represent a feature in the real
world.
• These simple models have been developed by cartographers to allow
them to portray three-dimensional features in two dimensions on a
piece of paper.
Contents
• GIS models are only as good a representation
of the real world as the spatial data used to
construct them. Understanding the main
characteristics of spatial data is an important
first step in evaluating its usefulness for GIS.
• Applied data models in GIS:
• vector model
• raster model
• Correlation data model and
• data attributes
• topology
Source: CRUM, Shannon (1999)
Process flow
• By applying this abstraction process you move from the position of observing the
geographical complexities of the real world to one of simulating them in the
computer.
• Identifying the spatial features from the real world that are of interest in the
context of an application and choosing how to represent them in a conceptual
model.
• Representing the conceptual model by an appropriate spatial data model. This
involves choosing between one of two approaches: raster or vector. In many
cases the GIS software used may dictate this choice.
• Selecting an appropriate spatial data structure (physical model) to store the
model within the computer. The spatial data structure is the physical way in
which entities are coded for the purpose of storage and manipulation.
Dynamic changing
• The real world is not static. Forests grow and are felled; rivers flood and
change their course; and cities expand and decline.
• The dynamic nature of the world poses two problems for the entity-
definition phase of a GIS project.
• The first is how to select the entity type that provides the most appropriate
representation for the feature being modelled. Is it best to represent a forest
as a collection of points (representing the location of individual trees), or as
an area (the boundary of which defines the territory covered by the forest)?
• The second problem is how to represent changes over time. A forest,
originally represented as an area, may decline as trees die or are felled until
it is only a dispersed group of trees that are better represented using points.
Tree or Forest problems
• The definition of entity types for real-world features is also hampered by the fact that
many real-world features simply do not fit into the categories of entities available.
• An area of natural woodland does not have a clear boundary as there is normally a
transition zone where trees are interspersed with vegetation from a neighbouring habitat
type.
• In this case, if we wish to represent the woodland by an area entity, where do we place
the boundary?
• The question is avoided if the data are captured from a paper map where a boundary is
clearly marked, as someone else will have made a decision about the location of the
woodland boundary.
• But is this the true boundary? Vegetation to an ecologist may be a continuous feature
(which could be represented by a surface), whereas vegetation to a forester is better
represented as a series of discrete area entities.
Raster and vector spatial data - data conversion
Gradual Modeling
• Step 1: setting points
• Step 2: connecting two points to a line
• Step 3: joining lines to an area

Source: RUTGERS State University of New Jersey (1999)


Vector
• Based on points (defined by xy-coordinates) in a reference system.
• Connecting two points leads to a directed line segment (= vector)

Source: Center for Innovation in Engineering Education,


Graphic Primitives
• Point: single coordinate pair
• Line: series of coordinate pairs
• Area: closed loop of coordinate pairs

Source: CRUM, Shannon (19991)
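The three graphic primitives can be sketched in a few lines of Python. This is only an illustration: the coordinates are invented, and the helper names (line_length, is_closed, polygon_area) are hypothetical and not taken from any GIS package.

```python
import math

# Hypothetical coordinates in an arbitrary planar reference system (metres).
point = (2.0, 3.0)                                   # point: single coordinate pair
line = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]          # line: series of coordinate pairs
area = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]  # closed loop


def line_length(coords):
    """Sum the straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def is_closed(coords):
    """An area's first and last coordinate pairs must be identical."""
    return len(coords) >= 4 and coords[0] == coords[-1]


def polygon_area(coords):
    """Shoelace formula for the area enclosed by a closed loop."""
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(coords, coords[1:]))
    return abs(twice) / 2.0


print(line_length(line), is_closed(area), polygon_area(area))
```

The only structural difference between a line and an area here is that the area repeats its first coordinate pair at the end.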


Points
• Points are used to represent features that are too small to be represented as areas
at the scale of mapping being used. Examples are a postbox, a tree or a lamp post.
• The data stored for a postbox will include geographic location and details of what
the feature is.
• Latitude and longitude, or a coordinate reference, could be given together with
details that explain that this is a postbox in current use.
• Of course, features that are represented by points are not fully described by a two-
dimensional geographical reference.
• There is always a height component since the postbox is located at some height
above sea level. If three dimensions are important to a GIS application this may also
be recorded, usually by adding a z value representing height to give an (x,y,z)
coordinate.
Lines (Arcs)
• Lines are used to represent features that are linear in nature, for example roads,
powerlines or rivers. It can be difficult for a GIS user to decide when a feature should
be represented by a line. Should a road be represented by a single line along its centre,
or are two lines required, one for each side of the road?
• A line is simply an ordered set of points. It is a string of (x,y) co-ordinates joined
together in order and usually connected with straight lines. Lines may be isolated, such
as geological fault lines, or connected together in networks, such as road, pipeline or
river networks.
• Networks are sometimes regarded as a separate data type but are really an extension
of the line type. Like points, lines are in reality three-dimensional. For instance, a
hydrogeologist may be interested in underground as well as surface drainage. Adding a z
co-ordinate (representing depth or height) to the points making up the line representing a
stream allows an accurate three-dimensional representation of the feature.
AREAS (Polygons)
• Areas are represented by a closed set of lines and are used to define features such as fields,
buildings or lakes. Area entities are often referred to as polygons. As with line features, some
of these polygons exist on the ground, whilst others are imaginary.
• They are often used to represent area features that do not exist as physical features, such as
school catchment zones or administrative areas.
• Two types of polygons can be identified: island polygons and adjacent polygons.
• Island polygons occur in a variety of situations, not just in the case of real islands. For example,
a woodland area may appear as an island within a field, or an industrial estate as an island
within the boundary of an urban area. A special type of island polygon, often referred to as a
nested polygon, is created by contour lines. If you imagine a small conical hill represented by
contour lines, this will be represented in polygon form as a set of concentric rings.
• Adjacent polygons are more common. Here, boundaries are shared between adjacent areas.
Examples include fields, postcode areas and property boundaries.
• A three-dimensional area is a surface. Surfaces can be used to represent topography or
non-topographical variables such as pollutant levels or population densities.
Scale
• Virtually all sources of spatial data, including maps, are smaller than the reality they
represent.
• Scale gives an indication of how much smaller than reality a map is. Scale can be defined
as the ratio of a distance on the map to the corresponding distance on the ground.
• Scale can be expressed in one of three ways: as a ratio scale, a verbal scale or a graphical
scale.
• Graphic scales are frequently used on computer maps. They are useful where changes to
the scale are implemented quickly and interactively by the user. In such cases,
recalculating scale could be time-consuming, and the ratios produced (which may not be
whole numbers) may be difficult to interpret.
• Resolution is defined as the size of the smallest recording unit (or the smallest size of
feature that can be mapped or measured).
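A ratio scale of 1:N means that one unit on the map corresponds to N of the same units on the ground. The conversion can be sketched as below; the function names and the choice of centimetres and metres are assumptions made for the illustration.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Convert a distance measured on the map (cm) to a ground distance (m)."""
    return map_distance_cm * scale_denominator / 100.0


def ratio_scale(map_distance_cm, ground_distance_metres):
    """Return the denominator N of the ratio scale 1:N."""
    return ground_distance_metres * 100.0 / map_distance_cm


# 4 cm on a 1:50 000 map
print(ground_distance_m(4, 50000))
# 2 cm on the map corresponds to 500 m on the ground
print(ratio_scale(2, 500))
```

When a user zooms interactively, the denominator returned by ratio_scale is rarely a round number, which is why a graphic scale bar is often easier to read.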
Cartographic generalization
• 1 Selection. First, the map feature for generalization is
selected. If more than one source is available to the
cartographer this may involve choosing the most
appropriate representation of the feature or a blending of
the two.
• 2 Simplification. Next, a decision will be taken to simplify
the feature. For the example of the river this may involve
the removal of some minor bends. The aim of
generalization will usually be to simplify the image but
maintain the overall trend and impression of the feature.

(Source: Adapted from Robinson et al., 1995)
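The slide does not name a simplification algorithm. As one possible illustration, the sketch below uses the Douglas-Peucker method, a common line-simplification technique that removes minor bends while keeping the overall trend. The river coordinates and the tolerance are invented.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(coords, tolerance):
    """Douglas-Peucker: drop vertices closer than `tolerance` to the trend line."""
    if len(coords) < 3:
        return list(coords)
    # Find the vertex farthest from the line joining the two end points.
    index, dmax = 0, 0.0
    for i in range(1, len(coords) - 1):
        d = point_segment_distance(coords[i], coords[0], coords[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [coords[0], coords[-1]]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(coords[: index + 1], tolerance)
    right = simplify(coords[index:], tolerance)
    return left[:-1] + right


river = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
simplified = simplify(river, tolerance=1.0)
print(len(river), "->", len(simplified), "vertices")
```

A larger tolerance removes more bends, which corresponds to generalizing for a smaller map scale.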


Cartographic generalization
• 3 Displacement. If there are features that are located side by side in
the real world, or that lie on top of one another, the cartographer
may choose to displace them by a small degree so that they are both
visible on the map image. This may have the effect of displacing a
feature several hundred metres depending on the map scale used.
• 4 Smoothing and enhancement. If the source data from which a
cartographer is working are very angular, because they have been
collected from a series of sampling points, a smoothing technique
may be used to apply shape and form to the feature. This will give a
better representation.
Vector Attributes
• Non-spatial data linked to spatial features by means of identification
(ID)

Source: FOOTE/ HUEBNER (1996)
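The ID link between spatial and non-spatial data can be pictured as two tables sharing a key. The sketch below uses plain dictionaries; the IDs, coordinates and attribute names are hypothetical.

```python
# Spatial table: feature ID -> geometry (coordinates are hypothetical).
geometries = {
    101: ("point", [(451200.0, 5412300.0)]),
    102: ("line", [(451000.0, 5412000.0), (451500.0, 5412250.0)]),
}

# Attribute table: feature ID -> non-spatial data.
attributes = {
    101: {"type": "postbox", "in_use": True},
    102: {"type": "road", "name": "Main Street"},
}


def describe(feature_id):
    """Join geometry and attributes through the shared ID."""
    kind, coords = geometries[feature_id]
    return {"id": feature_id, "geometry": kind, "coords": coords,
            **attributes[feature_id]}


print(describe(101))
```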


Nodes
• Graphic element: point
• Combined with coordinates as position information
• Vertices are the intermediate
points along a line.

Source: FOOTE/ HUEBNER (1996)


Arcs
• Consist of two or more coordinate pairs
• Nodes at the start and end point define orientation if recorded.
• Several line
segments can
build up one arc.

Source: FOOTE/ HUEBNER (1996)


Polygons
• Formed by arcs that enclose an area
• Start and end node have equal x,y coordinates.
• Neighboring
polygons share
arcs.

Source: FOOTE/ HUEBNER (1996)


Complex Objects
= Combination of several graphic elements
• Defined by ID, object class, and the sum of its elements
• Data structure consists of one
file storing coordinate
information and another file
storing attribute information

Source: UNIGIS Salzburg (2000)


Networks
= Set of interconnected lines, which form a complex object, rather than
polygons.
• Directed networks:
Flow moves just in one direction.
• Undirected networks:
Flow moves in both directions.

Source: DeMers, Michael N. (1997), p. 198


Graphs
= Sum of nodes and lines (edges)
• Concerned with connectivity, not location
• Rules of planar
enforcement applied
• Graph theory widens
possibilities for
operations on networks.

Source: Münchner Verkehrs- und Tarifverbund (2000)


Regions
= Sum of polygons that share one or more characteristics.
• Regions as unification of polygons
• A region's shape can be contiguous, fragmented or perforated.

Source: DeMers, Michael N. (1997), p. 200


Simple data structure of vector model

“Spaghetti structures” - shape file

• For the representation of line networks, and adjacent and
island polygons, a set of instructions is required which
informs the computer where one polygon, or line, is with
respect to its neighbours.
• Topological data structures contain this information. There
are numerous ways of providing topological structure in a
form that the computer can understand.
Topological structuring of complex
areas
The limitations of simple vector data structures start to emerge
when more complex spatial entities are considered.

There is a considerable range of topological data structures in
use by GIS. All the structures available try to ensure that:
-no node or line segment is duplicated;
-line segments and nodes can be referenced to more than one
polygon;
-all polygons have unique identifiers; and
-island and hole polygons can be adequately represented.
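One way to meet these requirements is an arc-node structure in which every node and arc is stored once and each arc records the polygon on its left and on its right. The sketch below is a minimal, hypothetical example with two adjacent rectangles A and B; the layout and the names are assumptions, not the structure of any particular GIS.

```python
# Nodes: ID -> (x, y). Each node is stored exactly once.
nodes = {1: (0, 0), 2: (4, 0), 3: (4, 3), 4: (0, 3), 5: (8, 0), 6: (8, 3)}

# Arcs: ID -> (from_node, to_node, left_polygon, right_polygon).
# Polygon 0 stands for the outside ("universe") polygon.
arcs = {
    "a": (1, 2, "A", 0),
    "b": (2, 3, "A", "B"),  # shared boundary between A and B, stored once
    "c": (3, 4, "A", 0),
    "d": (4, 1, "A", 0),
    "e": (2, 5, "B", 0),
    "f": (5, 6, "B", 0),
    "g": (6, 3, "B", 0),
}


def arcs_of(polygon):
    """All arcs that bound a polygon, found from the left/right references."""
    return sorted(k for k, (_, _, left, right) in arcs.items()
                  if polygon in (left, right))


def neighbours(polygon):
    """Polygons that share at least one arc with `polygon`."""
    result = set()
    for _, _, left, right in arcs.values():
        if polygon in (left, right):
            result.update({left, right} - {polygon, 0})
    return result


print(arcs_of("B"), neighbours("A"))
```

Because arc b is referenced by both polygons, the shared boundary is never duplicated, and adjacency can be answered without comparing coordinates.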
MODELLING NETWORKS
• A network is a set of interconnected linear features through which
materials, goods and people are transported or along which
communication of information is achieved.
• Network models in GIS are abstract representations of the
components and characteristics of their real-world counterparts.
• They are essentially adaptations of the vector data model and for
this reason raster GIS are generally not very good at network
analysis.
• The vector network model is made up of the same arc (line
segments) and node elements as any other vector data model but
with the addition of special attributes.
• In the network model the arcs become network links representing
the roads, railways and air routes of transport networks; the power
lines, cables and pipelines of the utilities networks; or the rivers
and streams of hydrological systems.
MODELING NETWORKS
• The nodes in turn become network nodes, stops and centres. Network
nodes are simply the endpoints of network links and as such represent
junctions in transport networks, confluences in stream networks, and
switches and valves in utilities networks.
• Stops are locations on the network that may be visited during a journey.
They may be stops on a bus route, pick-up and drop-off points on a
delivery system, or sediment sources in a stream network.
• They are points where goods, people or resources are transferred to
and from some form of transport system. Centres are discrete locations
on a network at which there exists a resource supply or some form of
attraction.
MODELING NETWORKS
• All the data regarding the characteristics of network links, nodes, stops, centres
and turns are stored as attribute information in the vector model database. Two
key characteristics of network features are impedance and supply and demand.
• Impedance is the cost associated with traversing a network link, stopping,
turning or visiting a centre. For example, link impedance may be the time it
takes to travel from one node to another along a network link. If we use the
example of a delivery van travelling along a city street then the impedance
value represents time, fuel used and the driver’s pay. Factors influencing the
impedance value will include traffic volume as determined by time of day and
traffic control systems; direction (for instance, one-way streets); topography
(more fuel is used going uphill); and weather (more fuel is used travelling into a
strong headwind).
MODELING NETWORKS
• Different links have different impedance values
depending on local conditions. Turn impedance is
also important and may be represented by the cost
of making a particular turn.
• Impedance values are, therefore, very important in
determining the outcome of route finding,
allocation and spatial interaction operations.
• Correct topology and connectivity are extremely
important for network analysis. Digital networks
should be good topological representations of the
real-world network they mimic.
• Correct geographical representation in network
analysis is not so important, so long as key attributes
such as impedance are preserved. A classic example of
this is the famous map of the London Underground system.

Link, turn and stop impedances affecting the journey
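Route finding on link impedances can be illustrated with Dijkstra's shortest-path algorithm. This is a minimal sketch: the network, its node names and the impedance values (minutes) are invented, and turn and stop impedances are left out for brevity.

```python
import heapq

# Hypothetical directed network: node -> list of (neighbour, impedance in minutes).
# The link B -> C is one-way: there is no C -> B entry.
links = {
    "A": [("B", 4), ("C", 10)],
    "B": [("A", 4), ("C", 3), ("D", 8)],
    "C": [("A", 10), ("D", 2)],
    "D": [("B", 8), ("C", 2)],
}


def least_cost_route(start, goal):
    """Dijkstra's algorithm: return (total impedance, list of nodes) or None."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, impedance in links[node]:
            if neighbour not in visited:
                heapq.heappush(queue, (cost + impedance, neighbour, path + [neighbour]))
    return None


print(least_cost_route("A", "D"))
```

Only the connectivity and the impedance values enter the calculation; the geographic position of the nodes plays no part, which is why a schematic map can still support correct route finding.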


Dividing Modeling
• The Earth's surface is “divided” into thousands of cells.
• Each cell represents a single value.
• The sum of all
grid values
forms a layer.

Source: RUTGERS State University of New Jersey (1999)
Raster Attributes
• Values depend on the source and on the operation
• Satellite data: values range from 0 to 255
• “0” and “1” as indicators for the presence or absence of features
• Nominal values can be represented by almost any code (letters,
numbers, etc.).

Source: UNIGIS Salzburg (2000)


Surface Modeling
• e.g. Digital Elevation Model (DEM)
• Elevation data linked to a regularly spaced point grid instead of pixels.
• Satellite data etc. can be draped on DEM to highlight relief.

Source: PATTERSON, Mark (2000)
Source: CRUM, Shannon (19992)
Raster Coordinates
• Pixel coordinates determined from grid origin.
• Coordinates as row and column number
• Feature locations become more
precise as the raster's
resolution improves.

Source: BRYAN, Brett (1999)


Resolution
• Raster resolution =
pixel size
• For satellite data: the
ground area seen by the
sensor that is
assigned to one pixel

Source: SATO/ RASTOSKUEV/ SHALINA (1999)


Rasters are:
-Regular square tessellations
-Matrices of values distributed among equal-
sized, square cells
565 573 582 590
575 580 595 600
579 581 597 601
580 600 620 632


Why squares?
-Computer scanners and output devices use
square pixels
-Bit-mapping technology/theory can be
adapted from computer sciences
-1-to-1 mapping to grid coordinate systems!
Cell location specified by:
• Row/column (R/C) address
• Origin is upper left cell (1,1)
• Relative or geographic coordinates can be specified
Registration to “world” coordinates

Unregistered Registered
Registration to “world” coordinates

• Specify coordinates of upper left corner
• Specify ground dimensions of cell, in same units
World File - DRG example
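With the upper-left corner and the cell dimensions known, a row/column address can be converted to world coordinates and back. The sketch below assumes that (x_ul, y_ul) is the outer corner of cell (1,1), as on the slide; world files conventionally reference the centre of the upper-left cell instead, in which case half a cell would have to be accounted for. Function and parameter names are hypothetical.

```python
def cell_centre(row, col, x_ul, y_ul, cell_w, cell_h):
    """World coordinates of the centre of cell (row, col); (1, 1) is the upper-left cell."""
    x = x_ul + (col - 0.5) * cell_w
    y = y_ul - (row - 0.5) * cell_h  # rows count downwards, northings upwards
    return x, y


def cell_at(x, y, x_ul, y_ul, cell_w, cell_h):
    """Inverse lookup: the (row, col) address of the cell containing (x, y)."""
    col = int((x - x_ul) // cell_w) + 1
    row = int((y_ul - y) // cell_h) + 1
    return row, col


# Hypothetical 30 m raster registered at easting 500000, northing 4200000.
print(cell_centre(2, 3, 500000, 4200000, 30, 30))
print(cell_at(500075.0, 4199955.0, 500000, 4200000, 30, 30))
```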
Spatial Resolution
-Defined by area or dimension of each cell
-Spatial resolution = (cell height) x (cell width)
-High resolution: cells represent a small area
-Low resolution: cells represent a larger area
-Often quoted as the size of one cell edge (e.g. “30 m DEM”)
-For a fixed area, file size increases with resolution
30 m vs. ~90 m pixel size
o Measured by cell area, the resolution of
30 m data is 9 times better than that of
90 m data (3 times by cell edge length)

(50 m contours, vector data layer)


Resolution constraint
Cell size should be less than half of the size of the
smallest object to be represented (“minimum
mapping unit”, MMU)
Raster Attributes
Two types:
1. Integer codes assigned to raster cells
E.g. rock type, land use, vegetation
Codes are technically nominal or ordinal data
2. Measured “real” values
Can be integer or “floating-point” (decimal) values;
technically interval or ratio data
E.g. topography, EM spectrum, temperature, rainfall,
concentration of a chemical element
Integer Code Attributes
■ Code is referenced to attribute via a "look-up table"
or "value attribute table" - VAT
■ Commonly many cells with the same code
■ Different attributes must be stored in different raster
layers

VAT
Value   Count   Rock Type
2       21      Marble
5       37      Gneiss
8       6       Granite
Nominal Coded Raster


Mixed Pixel Problem

• Assign each cell to the feature that
covers most of the pixel
Coded Value Raster Types
Single-band: Thematic data
o Black & White: binary (1 bit) (0 = black, 1 = white)
o Panchromatic ("Grayscale") (8 bit): 0 (black) - 255 (white) or
graduated color ramps (e.g. blue to red, light to dark red)
o Colormaps ("Indexed Color") (8 bit): code cells by values that
match prescribed R-G-B combinations in a lookup table
B & W Panchromatic Color Map
Lookup/index table

Figures from: Modeling our World, ESRI press


GEO327G/386G, UT Austin
Single Band
Examples - Black & White (Grayscale)

Black & White -1 bit

Grayscale - 8 bit;
black, white & 254 shades of gray
Single Band
Example Color Map (Indexed Color)

Each pixel contains one of 12 unique values, each
corresponding to a prescribed color (Red, Green & Blue
combination)
Measured, “Real Value” Attributes
Commonly stored as floating point values
Different attributes must be stored in different
layers, e.g. spectral bands in satellite imagery
Compression techniques exist for rasters of integer-
valued cells, but not for floating-point cells (see below)
Multiband
Image Raster Attributes

Figures from: Modeling our World, ESRI press


Multiband Image: 8 bits/band, 3-band RGB

E.g. Austin East 7.5’ Color Infrared Digital Orthophotograph (“CIR DOQ")

2/9/2016 GEO327G/386G, UT Austin


Cell values apply to

Source: Modeling our World, ESRI press


Digital Elevation Model
Airborne Magnetic (TFI) Map
How are rasters projected?
Problem: Square cells must remain square
after projection.
Solution: Resampling (interpolation); add,
remove, reassign cells to conform to new
spatial reference.
Raster File Size
File Size = rows x columns x bit-depth

Bit depth: number of bits used to represent pixel value

“8-bit” data can represent 256 values (2^8)
“16-bit” data (2^16) allows 65,536 values
“32-bit” data (2^32) allows ~4.3 billion values
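The file-size formula can be checked with a short calculation. In this sketch the division by 8 converts bits to bytes, and the bands parameter is an added assumption for multiband images; headers and compression are ignored.

```python
def raster_size_bytes(rows, cols, bit_depth, bands=1):
    """Uncompressed size: rows x columns x bit depth (x bands), in bytes."""
    return rows * cols * bit_depth * bands // 8


def cells_for_extent(width_m, height_m, cell_size_m):
    """Rows and columns needed to cover an extent at a given cell size."""
    cols = -(-width_m // cell_size_m)   # ceiling division
    rows = -(-height_m // cell_size_m)
    return int(rows), int(cols)


# A hypothetical 9 km x 9 km area stored as 16-bit data at 90 m and at 30 m.
for cell in (90, 30):
    rows, cols = cells_for_extent(9000, 9000, cell)
    print(cell, "m:", rows, "x", cols, "cells,", raster_size_bytes(rows, cols, 16), "bytes")
```

Reducing the cell edge from 90 m to 30 m multiplies the number of cells, and therefore the uncompressed file size, by 9.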
File Structure
File Compression
E.g. Run-length encoding

After: 46 characters (28% reduction; ratio of 1.4:1)
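Run-length encoding of a single raster row can be sketched as follows. The row values are invented and do not reproduce the example in the figure.

```python
from itertools import groupby


def rle_encode(row):
    """Group adjacent equal cells of one row into (value, run length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]


def rle_decode(pairs):
    """Expand (value, run length) pairs back into the original row."""
    return [value for value, count in pairs for _ in range(count)]


row = [0, 0, 0, 0, 1, 1, 0, 0, 2, 2, 2, 2]
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

The saving depends on how long the runs are: a row in which neighbouring cells rarely share a value can even grow when encoded this way.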
File Compression
E.g. Block encoding
Raster Compression
• Run-length encoding:
adjacent cells of one row are grouped
• Value-point encoding:
same cell values of a square are grouped
• Quadtrees:
gradual subdivision of
a raster leads to
homogeneous quadrants

Source: FONSECA, Frederico (1998)


Hierarchical Raster
• Pyramids:
Successive stacking of raster layers with higher and lower resolution
• Quadtrees:
Successive subdivision of pixels into four quadrants

Source: UNIGIS Salzburg (2000)
Source: Institut für Photogrammetrie und Fernerkundung, TU Wien (1999)
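The quadtree idea can be sketched with a short recursive function. It assumes a square grid whose side length is a power of two; the grid values and the nested-list representation of the tree are choices made for this illustration.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a cell value for a homogeneous block, else a list of four sub-trees.

    Quadrant order: NW, NE, SW, SE.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(grid[r][c] == first
           for r in range(row, row + size)
           for c in range(col, col + size)):
        return first
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]


def count_leaves(tree):
    """Number of homogeneous blocks stored in the tree."""
    return 1 if not isinstance(tree, list) else sum(count_leaves(t) for t in tree)


grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]
tree = build_quadtree(grid)
print(tree, count_leaves(tree))
```

Only the south-east quadrant is mixed and has to be subdivided further, so the 16 cells are stored as 7 leaves.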


“Lossy” vs. Lossless Compression

Techniques that combine similar attribute
information to reduce file size are “lossy”, e.g.
JPEG and (in its usual mode) MrSID

Lossless formats: TIFF (uncompressed or LZW-compressed), BMP, GRID,
PNG, and GIF (for images of up to 256 colors)


Raster Pyramids
Store reduced-resolution copies of a raster for rapid
display - e.g. ArcGIS, Google, many others
Often combined with image tiling for rapid rendering
of images

Source: ESRI ArcGIS Help file
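A pyramid can be pictured as a list of rasters in which each level has half the resolution of the previous one. The sketch below averages 2 x 2 blocks, which is only one possible resampling choice and is not claimed to be what any particular software does; it assumes even grid dimensions and reuses the 4 x 4 matrix of values shown earlier.

```python
def downsample(grid):
    """Halve the resolution by averaging each 2 x 2 block of cells."""
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
            for c in range(0, len(grid[0]) - 1, 2)
        ]
        for r in range(0, len(grid) - 1, 2)
    ]


def build_pyramid(grid):
    """Return [full resolution, half, quarter, ...] down to a single cell."""
    levels = [grid]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downsample(levels[-1]))
    return levels


dem = [
    [565, 573, 582, 590],
    [575, 580, 595, 600],
    [579, 581, 597, 601],
    [580, 600, 620, 632],
]
pyramid = build_pyramid(dem)
for level in pyramid:
    print(level)
```

When zoomed out, a viewer can draw one of the coarser levels instead of reading every cell of the full-resolution raster.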


Image Tiling
Split raster into small contiguous rectangles
or squares = tiles
Display only the tile required upon zooming
MrSID or ECW (wavelet) compression
- Multi-resolution Seamless Image Database -
commercialized by LizardTech
Compression ratios of 15-20:1 for single band 8-bit
images
Ratios of 2-100:1 (!) for multiband color images
also ECW by ER Mapper Ltd. (now Intergraph/ERDAS)
*** Enormous raster data sets now
manageable on PCs and across web
with this technology ***
Supported Raster Formats
See ArcCatalog > Tools > Options
Each explained in Help
24 supported formats



Voxel Model

• Used to represent volumetric data
• Voxel = 3D raster pixel
• Each voxel is encoded with attribute data.
• Drawbacks similar to those of the raster model,
owing to the discrete representation

Source: Both figures by KAUFMAN/ COHEN/ YAGEL (1993)
TIN
= Triangulated irregular network
• Technique to represent surfaces
• Surface approximation with the help of triangles
• Starting with sample points,
each point is connected with
its nearest neighbors

Source: UNIGIS Salzburg (2000)


Vector or Raster?
Spatially continuous data = raster
Modeling of data with high degree of variability =
raster
Objects with well defined boundaries = vector
Geographic precision & accuracy = vector
Topological dependencies = vector or raster
Raster or Vector?
Raster
■ Simple data structure
■ Ease of analytical operation
■ Format for scanned or sensed data - easy, cheap data entry
But...
■ Less compact
■ Query-based analysis difficult
■ Coarser graphics
■ More difficult to transform & project
Vector
■ Compact data structure
■ Efficient topology
■ Sharper graphics
■ Object-orientation better for some modeling
But...
■ More complex data structure
■ Overlay operations computationally intensive
■ Not good for data with high degree of spatial variability
■ Slow data entry
Self-training
• For a project you are involved in, list your data sources. Review each
one and identify any issues about scale, entity definition,
generalization, projections, spatial referencing and topology that you
think might be relevant.
Weblink
• Federal Geographic Data Committee http://www.fgdc.gov/
• Open Geospatial Consortium http://www.opengeospatial.org/
Key words for exam
• Explain the following terms and phrases:
• 1 Spatial data. 2 Attribute data. 3 Spatial referencing. 4 Spatial entities
• Explain the difference between data and information.
• What are the three basic spatial entities and how are these used to
portray geographical features on paper maps and in GIS?
• Vector topology (make a figure)
• Raster compression
• Compare raster with vector model
