Principles of Geographic Information System
Chapter-V
DATA ENTRY AND PREPARATION
Dr.GOVINDU VANUM
Asst.Professor
Institute of Geo-Information and Earth Observation Sciences
Mekelle University, Mekelle
Mobile: 0914876574
Mail: [email protected]
GIS Data Collection
Chapter Content:
GIS Data Sources
Data Collection Methods
Raster Data Capture
Point Interpolation
Remote Sensing
Scanning Existing Sources
Conversion of Existing Data
Vector Data Capture
Digitizing
In-situ Measurement (Ground Surveying)
GPS
Data Preparation
Objectives:
• Upon completion of this chapter, you will be able to:
List and describe the various GIS data sources,
Discuss the GIS data collection methods,
Point out factors that affect the selection of the methods,
Explain the methods available for raster data capture,
Describe the different interpolation methods,
Explain the methods available for vector data capture,
Describe data preparation techniques.
GIS Data Sources
The first step in using a GIS is data capture: putting the data
into the system.
The collection and preprocessing of spatial data are expensive
and time-consuming.
A wide variety of data sources exist for both spatial and
attribute data.
The most common general sources for spatial data are:
Hard copy maps,
Aerial photographs,
Remotely sensed imagery,
Point data samples from surveys and
Existing digital data files.
The data sources for GIS should be carefully selected for
specific purposes.
Data Collection Methods
There is no single method of collecting data. Rather, there are
several mutually compatible methods that can be used singly or in
combination.
Primary Data Capture (first-hand collection):
Digitizing
Scanning
Census data
GPS collections
Aerial photographs
Remote sensing data
Secondary Data Capture (from others):
Published or released data (originally primary data)
Primary data collected by others are secondary data for anyone who reuses them
Raster Data Capture
I. Spatial Interpolation: is the process of filling in the gaps
between sample observations.
The data you obtain are limited to a sample of
locations within the study area.
You then use that sample to make inferences about the
entire geographic area.
That is, interpolation methods provide estimates
of a surface measure at locations where there are no
observations.
An example is a continuous surface map representing
the distribution of rainfall, where the values between rain
gauges are estimated from a function that considers the rainfall
readings and the distribution of the sample sites.
Most interpolation methods can be divided into two main
types:
Global interpolators: use all the available data to provide
estimates for the points with unknown values.
Local interpolators: use only the data in the vicinity of the
point being estimated.
Interpolation methods may be either deterministic or
stochastic.
Deterministic methods provide no indication of the extent
of possible errors, whereas stochastic methods provide
probabilistic estimates.
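The difference between global and local interpolators can be illustrated with the simplest possible estimators. The sketch below is an assumption made for illustration, not a method from this chapter: the global estimate is the mean of all samples, the local estimate is the mean of the k nearest samples, and the rain gauge coordinates and readings are hypothetical. Practical global methods fit a function (for example a trend surface) rather than a plain mean.

```python
import math

def global_mean(samples):
    """Global interpolator: every sample contributes to every estimate."""
    return sum(value for _, _, value in samples) / len(samples)

def local_mean(samples, x, y, k=2):
    """Local interpolator: only the k samples nearest to (x, y) contribute."""
    nearest = sorted(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))[:k]
    return sum(value for _, _, value in nearest) / len(nearest)

# Hypothetical rain gauges: (x, y, rainfall in mm)
gauges = [(0, 0, 10.0), (1, 0, 12.0), (10, 10, 40.0), (11, 10, 42.0)]

print(global_mean(gauges))           # the same estimate for any location
print(local_mean(gauges, 0.5, 0))    # driven by the two gauges near (0.5, 0)
print(local_mean(gauges, 10.5, 10))  # driven by the two gauges near (10.5, 10)
```

The global estimate ignores where the unknown point lies, while the local estimate changes as the point moves between clusters of gauges.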
Interpolation Methods
a. Inverse Distance Weighting (IDW) Method:
Sample points closer to the cell have a
greater influence on the cell’s estimated value than
sample points that are farther away.
b. Spline Interpolation:
Divides the theme into regions, and uses the samples
found in each region to predict individual cell values
for that region.
Is best for gently varying surfaces, such as elevation,
water table heights, or pollution concentration.
c. Kriging Interpolation:
Is a form of local, stochastic interpolation that uses
geostatistical methods.
IDW and Spline are referred to as deterministic
interpolation methods because they are
directly based on the surrounding measured values or
on specified mathematical formulas.
Kriging is based on statistical models that include
autocorrelation, the statistical relationship among the
measured points.
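The inverse distance weighting idea described above can be sketched in a few lines. This is a minimal illustration, assuming weights of 1 / distance^power with power = 2 (a common default, not stated in this chapter); the function name `idw` and the sample values are hypothetical.

```python
import math

def idw(samples, x, y, power=2.0):
    """Estimate the value at (x, y) as a distance-weighted mean of the samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            return value  # the cell coincides with a sample point
        weight = 1.0 / distance ** power  # closer samples get larger weights
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical samples: (x, y, measured value)
samples = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0)]

print(idw(samples, 2.0, 0.0))  # midway: both samples weigh the same
print(idw(samples, 1.0, 0.0))  # closer to the first sample, so nearer to 10
```

Raising `power` makes the nearest samples dominate even more strongly; lowering it produces a smoother surface.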
II. Remote sensing: is the recording of information about an object
without touching it; remote sensing (RS) data are available in raster format.
Remote sensing on a variety of platforms is perhaps the
most important source of digital data.
III. Scanning Existing Sources: scanning produces an image stored as a
matrix of pixels.
Digital scanners have a fixed maximum resolution:
the highest number of pixels they can identify per inch;
the unit is dots per inch (dpi).
A 400 dpi scanner is a good choice for scanning maps for
use as a background GIS reference layer.
For a color aerial photograph to be used for photo
interpretation and analysis, a color 900 dpi scanner is more
appropriate.
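Scanner resolution translates into a ground distance per pixel once the map scale is known. The following sketch works through that arithmetic; the 1:50,000 map scale is an assumption chosen for illustration, and the function name is hypothetical.

```python
INCH_IN_METRES = 0.0254

def ground_pixel_size_m(dpi, scale_denominator):
    """Ground distance in metres covered by one scanned pixel.

    One pixel spans (1 / dpi) inch on paper; multiplying by the
    scale denominator converts paper distance to ground distance.
    """
    return INCH_IN_METRES / dpi * scale_denominator

# Assumed map scale of 1:50,000
for dpi in (400, 900):
    print(dpi, "dpi:", round(ground_pixel_size_m(dpi, 50_000), 3), "m per pixel")
```

At this assumed scale a 400 dpi scan resolves roughly 3 m on the ground per pixel and a 900 dpi scan roughly 1.4 m, which is one way to judge whether a given resolution suits the intended use.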
IV. Conversion of Existing Data: existing data in vector
format can be converted to raster.
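A vector polygon can be converted to raster by testing whether each cell centre falls inside the polygon. The sketch below is one simple way to do this, assuming a grid whose lower-left corner is at the origin and using a standard ray-casting test; the function names and the example square are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count polygon edges crossed by a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge spans the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(polygon, n_rows, n_cols, cell_size=1.0):
    """Return a grid (row 0 at the top) with 1 where the cell centre is inside."""
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size  # centre of this row of cells
        grid.append([
            1 if point_in_polygon((col + 0.5) * cell_size, y, polygon) else 0
            for col in range(n_cols)
        ])
    return grid

# Hypothetical polygon: a square from (1, 1) to (3, 3) on a 4 x 4 grid
square = [(1, 1), (3, 1), (3, 3), (1, 3)]
for row in rasterize(square, 4, 4):
    print(row)
```

A smaller `cell_size` follows the polygon boundary more closely at the cost of a larger grid.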
Vector Data Capture
i. Digitizing is the process by which features on a map or image are
converted into digital format.
Digitizing converts the features on the map into three basic data
types:
Points – zero-dimensional objects
Lines – one-dimensional objects
Polygons – two-dimensional objects
There are three primary methods for digitizing spatial information:
Manual methods include:
Tablet digitizing
Heads-up digitizing (on-screen digitizing)
The automated method is:
Scanning and vectorization
Tablet Digitizing requires a person to enter coordinate
information through the use of a digitizing tablet and
digitizing mouse.
A digitizing tablet is a hardened surface with a fine
electrical wire grid under the surface and upon which the
map or drawing is placed.
A digitizing mouse/cursor is an electrical device with
cross hairs and multiple buttons to perform data entry
operations.
During digitizing, the user traces the spatial features with
the mouse.
While tracing the features the coordinates of selected
points, e.g. vertices, are sent to the computer and stored.
Problems of Tablet Digitizing:
It is a slow, laborious, and expensive task.
It is tedious, and “operator fatigue” is very common.
Environmental conditions (such as humidity) may cause
the source material to shrink or expand, which distorts
the map being traced.
Heads-up Digitizing is a combination of scanning and
manual digitizing.
The main steps in heads-up digitizing typically include:
Scanning the map – a user can scan the map at a high
resolution.
Registering the map – the user can enter control points on
screen and transform the scanned image to real world coordinates.
Digitizing the map – the user can zoom to specific areas on
screen and trace points, lines, or polygons on the map.
Compared with tablet digitizing, heads-up digitizing is:
More comfortable for the operator.
More accurate (zooming facilities).
Faster (digitizing and editing at the
same time).
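The map registration step in heads-up digitizing can be sketched as fitting an affine transformation from pixel coordinates to real-world coordinates. The example below is a minimal illustration assuming exactly three control points, solved exactly with Cramer's rule; production GIS software typically uses more control points with a least-squares fit and reports the residual error. All function names and coordinates are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution

def fit_affine(pixel_pts, world_pts):
    """Fit X = a*col + b*row + c and Y = d*col + e*row + f."""
    m = [[col, row, 1.0] for col, row in pixel_pts]
    x_params = solve3(m, [x for x, _ in world_pts])
    y_params = solve3(m, [y for _, y in world_pts])
    return x_params, y_params

def to_world(params, col, row):
    """Apply the fitted transformation to one pixel position."""
    (a, b, c), (d, e, f) = params
    return a * col + b * row + c, d * col + e * row + f

# Hypothetical control points: pixel (col, row) and matching world (X, Y) in metres
pixel_pts = [(0, 0), (1000, 0), (0, 1000)]
world_pts = [(500000, 1500000), (505000, 1500000), (500000, 1495000)]

params = fit_affine(pixel_pts, world_pts)
print(to_world(params, 200, 400))
```

In this hypothetical setup each pixel covers 5 m on the ground, and the Y coefficient for rows is negative because image rows increase downward while northings increase upward.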
The Steps for Digitizing a Paper Map are:
Physical preparation
Digitizer set up
Map preparation
Digital preparation
Map registration
Digital collection
Feature collection
Feature correction
Save, save, save…
Automated Digitizing: uses tools that automatically
convert a raster scan to vector lines.
Requires a very clean scan.
Scans can be cleaned using raster cleanup tools.
The vector files usually require cleanup after conversion.
If you start with a clean image, it can save a lot of time.
If your image is not clean, manual digitizing may be faster.
ii. Collection of on-site Data/Ground Surveying: the
primary, and sometimes the ideal, way to obtain spatial data.
iii. The Global Positioning System (GPS): is a satellite-based
positioning system operated by the US Department of
Defense.
GPS provides all-weather,
worldwide, 24-hour position and time information.
Position, time, and attribute information is collected by
walking, riding, driving, and flying around locations of
interest.
GPS speeds up and simplifies the collection of your initial
GIS data.
Data Preparation
Spatial Data Preparation aims to make the required spatial data
fit for use.
Spatial data preparation consists of:
Data checks and repairs
Spatial elements
Associating attributes
Rasterization /Vectorization
Topology generation
Combining datasets
Editing is time-consuming; it includes trimming overshoots of
lines at intersections, deleting duplicate lines, closing gaps in lines,
and generating polygons.
Attribute data are associated with objects either through manual input
or by reading digital attribute files into the GIS.
Clean up Operations (Vector Data)
[Figure: before-cleanup and after-cleanup illustrations for each operation]
Erase duplicates or sliver lines
Erase short objects
Break crossing objects
Dissolve polygons
Extend undershoots
Snap clustered nodes
Erase dangling objects or overshoots
Dissolve pseudo-nodes
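Two of these cleanup operations, snapping clustered nodes and erasing duplicate lines, can be sketched as follows. This is a simplified illustration, not how any particular GIS package implements them: the function names, the tolerance value, and the coordinates are hypothetical, and nodes are snapped to the first node found within the tolerance.

```python
import math

def snap_nodes(nodes, tolerance):
    """Snap each node to the first earlier node lying within `tolerance`."""
    anchors = []
    snapped = []
    for x, y in nodes:
        for ax, ay in anchors:
            if math.hypot(x - ax, y - ay) <= tolerance:
                snapped.append((ax, ay))  # reuse the existing node position
                break
        else:
            anchors.append((x, y))  # no nearby node: this one becomes an anchor
            snapped.append((x, y))
    return snapped

def erase_duplicates(lines):
    """Drop lines that repeat an earlier line, ignoring digitizing direction."""
    seen = set()
    unique = []
    for start, end in lines:
        key = frozenset((start, end))
        if key not in seen:
            seen.add(key)
            unique.append((start, end))
    return unique

# Hypothetical digitized nodes with small positional errors
nodes = [(0, 0), (0.02, 0.01), (5, 5), (4.99, 5.0)]
print(snap_nodes(nodes, tolerance=0.05))

# The second line is the first one digitized in the opposite direction
lines = [((0, 0), (1, 1)), ((1, 1), (0, 0)), ((1, 1), (2, 2))]
print(erase_duplicates(lines))
```

The tolerance has to be chosen with care: too small leaves clusters unsnapped, too large merges nodes that are genuinely distinct.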
Edge-Matching
Some GIS packages have merge or edge-matching functions to
resolve the mismatches that arise when adjacent data sets are merged.
Causes of mismatch
paper map shrinkage/expansion
errors from digitizing/scanning
geo-referencing errors
accuracy of equipment
extrapolation or round-off errors
overlapping map coverage
Linking Spatial and Attribute Data
[Figure: a raster map and a vector map, each linked to its attribute table]
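The link between spatial and attribute data is usually a shared key. The sketch below illustrates the idea under stated assumptions: vector features are joined to attribute records by a feature ID, and raster cell values are looked up in a value attribute table. All IDs, field names, and values are hypothetical.

```python
# Vector map: geometry keyed by a hypothetical feature ID
features = {
    1: {"type": "Polygon", "coords": [(0, 0), (4, 0), (4, 3), (0, 3)]},
    2: {"type": "Point", "coords": [(7, 2)]},
}

# Attribute table: one record per feature ID
attributes = {
    1: {"land_use": "forest", "owner": "state"},
    2: {"land_use": "well", "owner": "private"},
}

def join(features, attributes):
    """Attach the attribute record that shares each feature's ID."""
    return {
        fid: {**geometry, **attributes.get(fid, {})}
        for fid, geometry in features.items()
    }

# Raster map: each cell value is a key into the attribute table
raster = [[1, 1, 2],
          [1, 2, 2]]
value_table = {1: "forest", 2: "grassland"}
labels = [[value_table[value] for value in row] for row in raster]

print(join(features, attributes)[1]["land_use"])
print(labels[0])
```

In the vector case every feature carries its own record, whereas in the raster case all cells with the same value share one record.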