GIS Unit 1-5 Notes

This document provides an introduction to Geographic Information Systems (GIS), explaining its nature, components, and applications. It outlines the capabilities of GIS in handling geospatial data, including data capture, management, manipulation, and presentation. The document also discusses the importance of spatial data quality and the classification of GIS operations.

Uploaded by

sharmahritesh4014


GEOGRAPHIC INFORMATION SYSTEMS Unit 1

Unit-I
Chapter 1: A Gentle Introduction to GIS

Contents

1.1. The nature of GIS

1.2. The real world and representations of it
1.3. Geographic Information and Spatial Database
1.4. Geographic Phenomena
1.5. Computer Representations of Geographic Information
1.6. Organizing and Managing Spatial Data

1.1. The nature of GIS

1.1.1. Some fundamental observations:

Our world is dynamic. Many aspects of our daily lives and our environment are
constantly changing, and not always for the better. Some of these changes appear
to have natural causes (e.g. volcanic eruptions, meteorite impacts), while others
are the result of human modification of the environment (e.g. land use changes or
land reclamation from the sea).

The fundamental problem that we face in many uses of GIS is that of
understanding phenomena that have a spatial or geographic dimension, as well as a
temporal dimension. This means that our object of study has different
characteristics for different locations (the geographic dimension) and also that
these characteristics change over time (the temporal dimension). The El Nino event
is a good example of such a phenomenon, because sea surface temperatures differ
between locations, and sea surface temperatures change from one week to the next.
El Nino is an aberrant pattern in weather and sea water temperature that occurs
with some frequency (every 4 to 9 years) in the Pacific Ocean along the Equator.
It is characterized by less strong western winds across the ocean, less upwelling
of cold, nutrient-rich, deep-sea water near the South American coast, and
therefore by substantially higher sea surface temperatures (see figures below).
It is generally believed that El Nino has a considerable impact on global weather
systems, and that it is the main cause of droughts in Wallacea and Australia, as
well as of excessive rains in Peru and the southern U.S.A.


1.1.2. Defining GIS:

A geographic information system (GIS) is a computer system for capturing,
storing, querying, analyzing, and displaying geospatial data.
Also called geographically referenced data, geospatial data are data that describe
both the locations and the characteristics of spatial features such as roads, land
parcels, and vegetation stands on the Earth's surface.

1.1.3. GIS Systems:

A GIS integrates hardware and software to capture/analyze data, allowing users to
question, understand and visualize data in many different ways to reveal patterns
or trends in the form of maps, globes, charts and reports.
The key word in this technology is geography: this means that some portion of
the data is spatial, in other words, data that is in some way referenced to locations
on the earth.
It also establishes GIS as a technology important to such occupations as market
research analysts, environmental engineers, and urban and regional planners,
which are also listed at the U.S. Department of Labor's website.
This helps users answer questions and solve problems, which is useful because by
viewing and analyzing visual data, the human mind can more easily discern
patterns and relationships. Google Maps is a familiar example of a GIS. GIS have
developed rapidly in their capabilities, and today are widely used all over the
world for a wide range of purposes.
GIS can refer to a number of different technologies, processes, and methods. It is
attached to many operations and has many applications related to engineering,
planning, management, transport/logistics, insurance, telecommunications, and
business. For that reason, GIS and location intelligence applications can be the
foundation for many location-enabled services that rely on analysis and
visualization.

1.1.4. GIS Science:

GIS is a computer-based system that provides the following four sets of capabilities to
handle georeferenced data:
1. Data capture and preparation
2. Data management, including storage and maintenance
3. Data manipulation and analysis
4. Data presentation

1. Data capture and preparation:

In the El Nino case, data capture refers to the collection of sea water temperatures
and wind speed measurements. This is achieved by placing buoys with measuring
equipment at various places in the ocean. Each buoy measures a number of things:
wind speed and direction; air temperature and humidity; and sea water temperature
at the surface and at various depths down to 500 metres.

2. Sea surface temperature (SST) and Wind speed (WS).

A typical buoy is illustrated in Figure 1.2, which shows the placement of various
sensors on the buoy. For monitoring purposes, some 70 buoys were deployed at
strategic places within 10° latitude of the Equator, between the Galapagos Islands
and Papua New Guinea. Figure 1.3 provides a map that illustrates the positions of
these buoys. The buoys have been anchored, so they are stationary. Occasional
malfunctioning is caused by high seas and bad weather or by the buoys becoming
entangled in long-line fishing nets.

All the data that a buoy obtains through its thermometers and other sensors are
transmitted via satellite communication daily.


3. Data Management:

For our example application, data management refers to the storage and
maintenance of the data transmitted by the buoys via satellite communication. This
phase requires a decision to be made on how best to represent our data, both in
terms of their spatial properties and the various attribute values which we need to
store.
We will from here on assume that the acquired data has been put in digital form,
that is, it has been converted into computer-readable format, so that we can begin
our analysis.
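
The representation decision described above can be sketched in code. The following is only an illustration, assuming one record per buoy per day; the class name, the field names and all sample values are hypothetical and do not come from the original buoy data set.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BuoyReading:
    """One daily record sent by a buoy (all field names are hypothetical)."""
    buoy_id: str          # identifier of the buoy
    lon: float            # spatial property: longitude in degrees
    lat: float            # spatial property: latitude in degrees
    day: date             # date of the measurement
    sst_c: float          # attribute: sea surface temperature in degrees Celsius
    wind_speed_ms: float  # attribute: wind speed in metres per second


# Invented sample values, for illustration only.
readings = [
    BuoyReading("B01", -140.0, 0.0, date(1997, 12, 1), 28.0, 4.0),
    BuoyReading("B01", -140.0, 0.0, date(1997, 12, 2), 29.0, 5.0),
    BuoyReading("B02", -110.0, 2.0, date(1997, 12, 1), 27.0, 6.0),
]

# Group the stored records by buoy, as a simple form of data management.
by_buoy = {}
for r in readings:
    by_buoy.setdefault(r.buoy_id, []).append(r)

print({buoy: len(records) for buoy, records in by_buoy.items()})
```

Keeping the spatial properties (lon, lat) and the attribute values (sst_c, wind_speed_ms) in one record is just one possible choice; a real system might store them in separate tables.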

4. Data manipulation and analysis:

Once the data has been collected and organized in a computer system, we can start
analyzing it. Here, let us look at what processes were involved in the eventual
production of the maps of Figure 1.1. Note that the actual production of maps
belongs to the phase of data presentation that we discuss below.


Here, we look at how data generated at the buoys was processed before map
production. A closer look at Figure 1.1 reveals that the data being presented are
based on the monthly averages for SST and WS (for two months), not on single
measurements for a specific date. Moreover, the two lower figures provide

was made with the December averages of several years.

The initial (buoy) data have been generalized from 70 point measurements (one
for each buoy) to cover the complete study area. Clearly, for positions in the study
area for which no data was available, some type of interpolation took place,
probably using data of nearby buoys. This is a typical GIS function: deriving an
estimated value for a property for some location where we have not measured.
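
The two processing steps mentioned above, averaging over a month and estimating values between buoys, can be sketched as follows. The text does not say which interpolation method was used, so inverse distance weighting (IDW) is assumed here purely as a common, simple choice; the function names and sample values are hypothetical.

```python
import math


def monthly_mean(daily_values):
    """Average a list of daily measurements into one monthly value."""
    return sum(daily_values) / len(daily_values)


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at position (x, y).

    samples is a list of (sx, sy, value) tuples, one per measured point.
    Nearby points get a larger weight than distant ones.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a measured point: use the measurement
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Invented values: monthly mean SST at two buoys, 10 units apart.
december_mean = monthly_mean([27.5, 28.0, 28.5])
samples = [(0.0, 0.0, 26.0), (10.0, 0.0, 28.0)]
halfway = idw(samples, 5.0, 0.0)   # position with no measurement
print(december_mean, halfway)
```

A position halfway between two buoys receives equal weights from both, so the estimate lies midway between the two measured values.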

Data presentation:

After the data manipulations discussed above, our data is prepared for producing
output, in this case the maps of Figure 1.1. The data presentation phase deals with
putting it all together into a format that communicates the result of data analysis in
the best possible way. Many issues arise in this phase. Among other things, we
need to consider what the message is that we want to portray, who the audience is,
what kind of presentation medium will be used, which rules of aesthetics apply,
and what techniques are available for representation. These issues may sound a
little abstract, so let us clarify with the El Nino case.
The message we wanted to portray is what the El Nino and La Nina events are,
both in absolute figures and in relative figures, i.e. as differences from a
normal situation.
The audience for this data presentation was clearly the readers of this textbook,
i.e. students of ITC who want to obtain a better understanding of GIS. The
medium was this book (printed matter of A4 size) and possibly a website. The

font size. The rules of aesthetics demanded many things: the maps should be
printed north-up, with clear georeferencing, with intuitive use of symbols, et
cetera. We actually also violated some rules of aesthetics, for instance, by
applying a different scaling factor in latitude (vertically) compared to longitude
(horizontally).
The techniques that we used included the use of a color scheme and isolines, plus
a number of other techniques.

1.1.5. GIS Applications:

i. An urban planner might want to assess the extent of urban fringe growth in her/his
city, and quantify the population growth that some suburbs are witnessing. S/he
might also like to understand why these particular suburbs are growing and others
are not.
ii. A biologist might be interested in the impact of slash-and-burn practices on the
populations of amphibian species in the forests of a mountain range to obtain a
better understanding of long-term threats to those populations.
iii. A natural hazard analyst might like to identify the high-risk areas of annual
monsoon-related flooding by investigating rainfall patterns and terrain
characteristics.
iv. A geological engineer might want to identify the best localities for constructing
buildings in an earthquake-prone area by looking at rock formation characteristics.
v. A mining engineer could be interested in determining which prospective copper
mines should be selected for future exploration, taking into account parameters
such as extent, depth and quality of the ore body, amongst others.
vi. A geo-informatics engineer hired by a telecommunications company may want to

various cost factors such as land prices, undulation of the terrain.

vii. A forest manager might want to optimize timber production using data on soil and
current tree stand distributions, in the presence of a number of operational
constraints, such as the need to preserve species diversity in the area.
viii. A hydrological engineer might want to study a number of water quality parameters
of different sites in a freshwater lake to improve understanding of the current
distribution of Typha reed beds, and why it differs from that of a decade ago.

1.1.6. Components

GIS requires the following components to work with geospatial data:

1. Computer System: The computer system includes the computer and the operating
system to run GIS. Typically the choices are PCs that use the Windows operating
system (e.g., Windows 2000, Windows XP) or workstations that use the UNIX or
Linux operating system. Additional equipment may include monitors for display,
digitizers and scanners for spatial data input, GPS receivers and mobile devices for
fieldwork, and printers and plotters for hard-copy data display.
2. GIS Software: The GIS software includes the program and the user interface for
driving the hardware. Common user interfaces in GIS are menus, graphical icons,
command lines, and scripts.
3. People: People refers to GIS professionals and users who define the purpose and
objectives, and provide the reason and justification for using GIS.
4. Data: Data consist of various kinds of inputs that the system takes to produce
information.
5. Infrastructure (METHOD): The infrastructure refers to the necessary physical,
organizational, administrative, and cultural environments that support GIS
operations. The infrastructure includes requisite skills, data standards, data
clearinghouses, and general organizational patterns.

Spatial data describe the locations of spatial features, which may be discrete or
continuous. Discrete features are individually distinguishable features that do not
exist between observations. Discrete features include points (e.g. wells), lines
(e.g., roads), and areas (e.g., land use types). Continuous features are features that
exist spatially between observations. Examples of continuous features are
elevation and precipitation. A GIS represents these spatial features on the Earth's
surface as map features on a plane surface. This transformation involves two main
issues: the spatial reference system and the data model.

Attribute data describe the characteristics of spatial features. For raster data, each
cell has a value that corresponds to the attribute of the spatial feature at that
location. A cell is tightly bound to its cell value. For vector data, the amount of
attribute data to be associated with a spatial feature can vary significantly. A road
segment may only have the attributes of length and speed limit, whereas a soil
polygon may have dozens of properties, interpretations, and performance data.
How to join spatial and attribute data is therefore important in the case of vector
data.
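
The difference between the two cases can be sketched in code. This is a minimal illustration, not how any particular GIS package stores its data; the feature ids, attribute names and values are invented.

```python
# Raster data: each cell of the grid directly holds its attribute value,
# so the cell and its value are tightly bound.
elevation = [
    [120, 125, 131],
    [118, 122, 128],
]
row, col = 1, 2
cell_value = elevation[row][col]

# Vector data: geometry and attributes are kept separately and linked
# through a shared feature id (hypothetical ids 101 and 102).
roads_geometry = {
    101: [(0.0, 0.0), (3.0, 4.0)],
    102: [(3.0, 4.0), (3.0, 10.0)],
}
roads_attributes = {
    101: {"length_km": 5.0, "speed_limit_kmh": 50},
    102: {"length_km": 6.0, "speed_limit_kmh": 80},
}


def join(geometry, attributes):
    """Combine geometry and attribute records that share a feature id."""
    return {
        fid: {"geometry": geom, **attributes.get(fid, {})}
        for fid, geom in geometry.items()
    }


roads = join(roads_geometry, roads_attributes)
print(cell_value, roads[101]["speed_limit_kmh"])
```

The join step is what the text refers to: for vector data the number of attributes per feature can vary widely, so attributes usually live in their own table and are linked to the geometry by a key.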

1.1.7. Spatial data and Geoinformation:

Key components of spatial data quality include positional accuracy (both horizontal and
vertical), temporal accuracy (that the data is up to date), attribute accuracy (e.g. in
labelling of features or of classifications), lineage (history of the data including sources),
completeness (if the data set represents all related features of reality), and logical
consistency (that the data is logically structured).

These components play an important role in assessment of data quality for several
reasons:

1. Even when source data, such as official topographic maps, have been subject to
stringent quality control, errors are introduced when these data are input to GIS.
2. Unlike a conventional map, which is essentially a single product, a GIS database
normally contains data from different sources of varying quality.
3. Unlike topographic or cadastral databases, natural resource databases contain data
that are inherently uncertain and therefore not suited to conventional quality
control procedures.
4. Most GIS analysis operations will themselves introduce errors.


1.1.8. Classification of GIS Operations:

Spatial data input:
1. Data entry: use existing data, create new data
2. Data editing
3. Geometric transformation
4. Projection and reprojection

Attribute data management:
1. Data entry and verification
2. Database management
3. Attribute data manipulation

Data display:
1. Cartographic symbolization
2. Map design

Data exploration:
1. Attribute data query
2. Spatial data query
3. Geographic visualization

Data analysis:
1. Vector data analysis: buffering, overlay, distance measurement,
spatial statistics, map manipulation
2. Raster data analysis: local, neighborhood, zonal, global,
raster data manipulation
3. Terrain mapping and analysis
4. Viewshed and watershed
5. Spatial interpolation
6. Geocoding and dynamic segmentation
7. Path analysis and network applications

GIS modeling:
1. Binary models
2. Index models
3. Regression models
4. Process models

GIS Systems, GIS Science and GIS applications Revisited:

The discipline that deals with all aspects of the handling of spatial data and
geoinformation.
and software, and also people such as the database creators or administrators,
analysts who work with the software, and the users of the end product. Related
terms include geoinformatics, geomatics, and spatial information science. These
are all similar terms which have much the same meaning, although each approach
has slight differences in the way it deals with problems, some emphasizing
engineering approaches, others computational solutions, and so on.
Project-based GIS applications usually have a clear-cut purpose, and these
applications can be short-lived: the research is carried out by collecting data,
entering data in the GIS, analyzing the data, and producing informative maps. An
example is rapid earthquake damage assessment. Institutional GIS applications, on
the other hand, usually have as their goal the continued administration of spatial
change and the sustained availability of spatial base data. Their needs for
advanced data analysis are usually less, and the complexity of these applications
lies more in the continued provision of trustworthy data to others. They are thus
long-lived applications. Obvious examples are automated cadastral systems.

1.2. The real world and representations of it

1.2.1. Models and modeling:

Modeling is a term used in many different ways and which has many different
meanings. A representation of some part of the real world can be considered a
model because the representation will have certain characteristics in common with
the real world, specifically those which we have identified in our model design.
This then allows us to study and operate on the model itself instead of the real
world in order to test what happens under various conditions, and helps us answer
'what if' questions. We can change the data or alter the parameters of the model,
and investigate the effects of the changes.

Models as representations come in many different flavors. In the GIS
environment, the most familiar model is that of a map. A map is a miniature
representation of some part of the real world. Paper maps are the most common,
but digital maps also exist. Databases are another important class of models. A
database can store a considerable amount of data, and also provides various
functions to operate on the stored data. The collection of stored data represents
some real world phenomena, so it too is a model. Obviously, here we are
especially interested in databases that store spatial data. Digital models (as in a
database or GIS) have enormous advantages over paper models (such as maps).
They are more flexible, and therefore more easily changed for the purpose at hand.
In principle, they allow animations and simulations to be carried out by the
computer system. This has opened up an important toolbox that can help to
improve our understanding of the world.

A real-world model is a representation of a number of phenomena that we can
observe in reality, usually to enable some type of study, administration,
computation and/or simulation. We use the term application models to refer to
models with a specific application, including real-world models and so-called
analytical models.

Data modeling is the design effort of structuring a database. This process involves the identification of the kinds of data
that the database will store, as well as the relationships between these kinds of
data.

Most maps and databases can be considered static models. At any point in time,
they represent a single state of affairs. Usually, developments or changes in the
real world are not easily recognized in these models.

Dynamic models or process models address precisely this issue. They emphasize
changes that have taken place, are taking place or may take place sometime in the
future. Dynamic models are inherently more complicated than static models, and
usually require much more computation. Simulation models are an important class
of dynamic models that allow the simulation of real world processes.

1.2.2. Maps

Maps are perhaps the best known (conventional) models of the real world. Maps
have been used for thousands of years to represent information about the real
world, and continue to be extremely useful for many applications in various
domains.
Their conception and design has developed into a science with a high degree of
sophistication. A disadvantage of the traditional paper map is that it is generally
restricted to two-dimensional static representations, and that it is always displayed
in a fixed scale.
The map scale determines the spatial resolution of the graphic feature
representation. The smaller the scale, the less detail a map can show. The
accuracy of the base data, on the other hand, puts limits to the scale in which a
map can be sensibly drawn. Hence, the selection of a proper map scale is one of
the first and most important steps in map design.
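
The relation between map scale and detail can be made concrete with a little arithmetic. The sketch below assumes a scale written as 1:N, where N is the scale denominator; the function name is hypothetical.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Convert a distance measured on the map (in cm) to metres on the ground.

    At scale 1:N, one unit on the map corresponds to N units in reality.
    """
    return map_distance_cm * scale_denominator / 100.0  # 100 cm per metre


# The same 2 cm on paper covers much more ground on a smaller-scale map.
large_scale = ground_distance_m(2.0, 50_000)    # map at 1:50,000
small_scale = ground_distance_m(2.0, 250_000)   # map at 1:250,000
print(large_scale, small_scale)
```

Because the same length on paper covers five times more ground at 1:250,000 than at 1:50,000, the smaller-scale map has less room to show individual features.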
A map is always a graphic representation at a certain level of detail, which is
determined by the scale. Map sheets have physical boundaries, and features
spanning two map sheets have to be cut into pieces. Cartography, as the science
and art of map making, functions as an interpreter, translating real world
phenomena (primary data) into correct, clear and understandable representations
for our use. Maps also become a data source for other applications, including the
development of other maps.
With the advent of computer systems, analogue cartography developed into digital
cartography, and computers play an integral part in modern cartography.
Alongside this trend, the role of the map has also changed accordingly.
The traditional role of paper maps as a data storage medium is being taken over
by (spatial) databases.
Notwithstanding these developments, paper maps remain important tools for
the display of spatial information for many applications.

Databases
A database is a repository for storing large amounts of data. It comes with a
number of useful functions:

1. A database can be used by multiple users at the same time, i.e. it allows
concurrent use.
2. A database offers a number of techniques for storing data and allows the use of the
most efficient one, i.e. it supports storage optimization.
3. A database allows the imposition of rules on the stored data, rules that will be
automatically checked after each update to the data, i.e. it supports data integrity.
4. A database offers an easy-to-use data manipulation language, which allows the
execution of all sorts of data extraction and data updates, i.e. it has a query
facility.
5. A database will try to execute each query in the data manipulation language in the
most efficient way, i.e. it offers query optimization.

Databases can store almost any kind of data. Modern database systems organize
the stored data in tabular format. A database may have many such tables, each of
which stores data of a certain kind. It is not uncommon for a table to have many
thousands of data rows, sometimes even hundreds of thousands.
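
Two of the database functions listed above, data integrity and the query facility, can be tried out with a small sketch. It uses SQLite through Python's standard sqlite3 module only as a convenient stand-in for a database system; the table name, column names and rows are invented.

```python
import sqlite3

# An in-memory database with one table of land parcels (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE parcel (
        parcel_id INTEGER PRIMARY KEY,
        owner     TEXT NOT NULL,
        area_m2   REAL NOT NULL CHECK (area_m2 > 0)  -- integrity rule
    )
""")
conn.executemany(
    "INSERT INTO parcel VALUES (?, ?, ?)",
    [(1, "A. Owner", 450.0), (2, "B. Owner", 1200.0)],
)

# Data integrity: the rule is checked automatically on each update,
# so a parcel with a negative area is rejected.
try:
    conn.execute("INSERT INTO parcel VALUES (3, 'C. Owner', -5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Query facility: extract data with the data manipulation language (SQL).
large = conn.execute(
    "SELECT parcel_id FROM parcel WHERE area_m2 > 1000 ORDER BY parcel_id"
).fetchall()
print(rejected, large)
```

Each row of the parcel table is one data row in the tabular format described above; how the query is executed internally is left to the database, which is the query optimization mentioned in item 5.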
Spatial databases and Spatial analysis

A GIS must store its data in some way. For this purpose the previous generation of

there has been an increasing trend in GIS applications that used a GIS for spatial
analysis, and used a database for storage. In more recent years, spatial databases
(also known as geo-databases) have emerged.
Besides traditional administrative data, they can store representations of real world
geographic phenomena for use in a GIS. These databases are special because they
use additional techniques different from tables to store these spatial
representations.
A geo-database is not the same thing as a GIS, though both systems share a
number of characteristics. These include the functions listed above for databases in
general: concurrency, storage, integrity, and querying, specifically, but not only,
spatial data.
A GIS, on the other hand, is tailored to operate on spatial data. It supports
analyses that are inherently geographic in nature, such as distance and area
computations and spatial interpolation, and provides ways to combine
representations of geographic phenomena.
GISs, moreover, have built-in tools for map production, of both the paper and the
digital kind, and operate with an understanding of geographic space.
Databases typically lack this kind of understanding.
The phenomena for which we want to store representations in a spatial database
may have point, line, area or image characteristics. Different storage techniques
exist for each of these kinds of spatial data. These geographic phenomena have
various relationships with each other and possess spatial (geometric), thematic and
temporal attributes (they exist in space and time). For data management purposes,
phenomena are classified into thematic data layers.
The purpose of a database is usually indicated by a qualifier such as cadastral,
topographic, land use, or soil database.
Spatial analysis is the generic term for all manipulations of spatial data carried out
to improve one's understanding of the geographic phenomena that the data
represent. It involves questions about how the data in various layers might relate
to each other, and how it varies over space.
The aim of spatial analysis is usually to gain a better understanding of geographic
phenomena through discovering patterns that were previously unknown to us, or to
build arguments on which to base important decisions.
It should be noted that some GIS functions for spatial analysis are simple and
easy-to-use, others are much more sophisticated, and demand higher levels of
analytical and operating skills. Successful spatial analysis requires appropriate
software, hardware, and perhaps most importantly, a competent user.

1.3. Geographic Information and Spatial Database

1.3.1. Models and Representations of the real world:

Model: GISs help us to analyze and understand more about processes and
phenomena in the real world. In practical terms, this refers to the process of
representing key aspects of the real world digitally (inside a computer). These
representations are made up of spatial data, stored in memory in the form of bits
and bytes, on media such as the hard drive of a computer.
This digital representation can then be subjected to various analytical functions
(computations) in the GIS, and the output can be visualized in various ways.

that some part of it can be more easily handled.


Depending on the application domain of the model, it may be necessary to
manipulate the data with specific techniques. To investigate the geology of an
area, we may be interested in obtaining a geological classification.

As highlighted in Figure 2.1, the process of translating the relevant aspects of the real
world into a computer representation of it is a domain of expertise by itself. It might be
achieved through direct observations using sensors, and digitizing (converting) the sensor
output for computer usage. This is the domain of remote sensing, the topic of Principles
of Remote Sensing. It may be done by making use of the output of a previous project,
such as a paper map, and re-digitizing it.
We can use the GIS to create visualizations from the computer representation,
either on-screen, printed on paper, or otherwise. It is crucial to understand the
fundamental differences between these notions. The real world, after all, is a

simulations of the real world.

We have limitations on the amount of data that we can store, limits on the amount
of detail we can capture, and (usually) limits on the time we have available for a
project. It is therefore possible that some facts or relationships that exist in the real
world may not be discovered.
Any geographic phenomenon can usually be represented in various ways; the
choice of which representation is best depends mostly on two issues. Firstly, what
original, raw data (from sensors or otherwise) is available, and secondly, what sort
of data manipulation is required or will be undertaken.

1.4. Geographic Phenomena

1.4.1. Defining geographic phenomena:

A GIS operates under the assumption that the relevant spatial phenomena occur in a two- or
three-dimensional Euclidean space, unless otherwise specified. Euclidean space can be
informally defined as a model of space in which locations are represented by
coordinates, (x,y) in 2D and (x,y,z) in 3D, and in which distance and direction can be
defined with geometric formulas. In the 2D case, this is known as the Euclidean plane,
which is the most common Euclidean space in GIS use.
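
The geometric formulas referred to above can be sketched as follows. The function names are hypothetical, and the direction is expressed here as a bearing measured clockwise from north (the positive y axis), which is one common convention among several.

```python
import math


def distance_2d(p, q):
    """Euclidean distance between two points (x, y) in the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def distance_3d(p, q):
    """Euclidean distance between two points (x, y, z) in 3D space."""
    return math.dist(p, q)


def bearing_deg(p, q):
    """Direction from p to q in degrees, clockwise from north (+y axis)."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


d2 = distance_2d((0.0, 0.0), (3.0, 4.0))
d3 = distance_3d((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))
east = bearing_deg((0.0, 0.0), (1.0, 0.0))
print(d2, d3, east)
```

These formulas treat coordinates as positions on a flat plane; for coordinates given as latitude and longitude on the curved Earth, other formulas are needed.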

We may define a geographic phenomenon as a manifestation of an entity or process of
interest that:
1. Can be named or described,
2. Can be geo-referenced, and
3. Can be assigned a time (interval) at which it is/was present.

For instance, in water management, the objects of study might be river basins,
agro-ecologic units, measurements of actual evapotranspiration, meteorological data,
ground water levels, irrigation levels, water budgets and measurements of total water
use. Note
that all of these can be named or described, georeferenced and provided with a time
interval at which each exists.

In multipurpose cadastral administration, the objects of study are different: houses, land
parcels, streets of various types, land use forms, sewage canals and other forms of urban
infrastructure may all play a role. Again, these can be named or described, georeferenced
and assigned a time interval of existence.

1.4.2. Types of geographic phenomena:

To represent a phenomenon in a GIS, we must state what it is and where it is.
We must provide a description or at least a name on the one
hand, and a georeference on the other hand.
Some phenomena manifest themselves essentially everywhere in the study area,
while others only do so in certain localities. If we define our study area as the
equatorial Pacific Ocean, we can say that sea surface temperature can be
measured anywhere in the study area. Therefore, it is a typical example of a
(geographic) field.
Some common examples of geographic fields are air temperature, barometric
pressure and elevation. These fields are in fact continuous in nature. Examples of
discrete fields are land use and soil classifications.
(Geographic) objects populate the study area, and are usually well distinguished,

undetermined.
The array of buoys described earlier is a good example: there is a fixed
number of buoys, and for each we know exactly where it is located. The buoys are
typical examples of (geographic) objects.


A simple rule-of-thumb is that natural geographic phenomena are usually fields, and
man-made phenomena are usually objects.

1.4.3. Geographic fields

A field is a geographic phenomenon that has a value everywhere in the study
area. We can therefore think of a field as a mathematical function f that associates
a specific value with any position in the study area. Hence if (x,y) is a position in
the study area, then f(x,y) stands for the value of the field f at locality (x,y). Fields
can be discrete or continuous.
In a continuous field, the field values do not change abruptly, but only gradually.

Examples of continuous fields are air temperature, barometric pressure, soil
salinity and elevation. Continuity means that all changes in field values are
gradual. A continuous field can even be differentiable, meaning we can determine
a measure of change in the field value per unit of distance anywhere and in any
direction.
Discrete fields divide the study space in mutually exclusive, bounded parts, with
all locations in one part having the same field value. Typical examples are land
classifications, for instance, using either geological classes, soil type, land use
type, crop type or natural vegetation type.

A discrete field indicating geological units.
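
The idea of a field as a function f(x,y) can be sketched in code. Both functions below are invented for illustration: the elevation surface is an arbitrary smooth formula, and the land use boundary at x = 5 is an arbitrary choice.

```python
def elevation(x, y):
    """Continuous field: a hypothetical smooth surface, in metres."""
    return 100.0 + 2.0 * x + 0.5 * y


def land_use(x, y):
    """Discrete field: the value is constant inside each bounded part."""
    if x < 5.0:
        return "forest"
    return "cropland"


def change_per_unit_x(f, x, y, h=1e-6):
    """Approximate rate of change of a continuous field in the x direction."""
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


# Every position in the study area has a value in both fields.
print(elevation(1.0, 2.0), land_use(1.0, 2.0))
# A continuous field changes gradually, so a rate of change can be measured.
print(change_per_unit_x(elevation, 3.0, 4.0))
# A discrete field jumps abruptly at the boundary between two parts.
print(land_use(4.9, 0.0), land_use(5.1, 0.0))
```

The rate of change is only meaningful for the continuous field; for the discrete field the value simply switches from one class to another at the boundary.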


Data types and values

Since we have now differentiated between continuous and discrete fields, we may also
look at different kinds of data values.
We distinguish the following four data types:

1. Nominal data values are values that provide a name or identifier so that we can
discriminate between different values, but that is about all we can do. Specifically,
we cannot do true computations with these values.
Examples are the names of geological units. This kind of data value is called
categorical data when the values assigned are sorted according to some set of non-
overlapping categories. For example, we might identify the soil type of a given
area to belong to a certain (pre-defined) category.
2. Ordinal data values are data values that can be put in some natural sequence but
that do not allow any other type of computation. Household income, for instance,
could be classified as being either low, average or high. Clearly this is their
natural sequence, but this is all we can say: we cannot say that a high income is
twice as high as an average income.
3. Interval data values are quantitative, in that they allow simple forms of
computation like addition and subtraction. However, interval data has no
arithmetic zero value, and does not support multiplication or division. For
instance, a temperature of 20 °C is not twice as warm as 10 °C, and thus centigrade
temperatures are interval data values, not ratio data values.
4. Ratio data values allow most, if not all, forms of arithmetic computation. Ratio
data have a natural zero value, and multiplication and division of values are
possible operators (distances measured in meters are an example). Continuous
fields can be expected to have ratio data values, and hence we can interpolate
them.
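
The difference between interval and ratio data can be checked with a few lines of Python. This is only a small sketch: the function name celsius_to_kelvin is our own, and the Kelvin scale is used here as an example of a temperature scale with a natural zero.

```python
# Interval vs. ratio data, illustrated with temperature values.

def celsius_to_kelvin(t_celsius: float) -> float:
    """Convert from the Celsius (interval) scale to the Kelvin (ratio) scale."""
    return t_celsius + 273.15

warm_c, cool_c = 20.0, 10.0

# Differences are meaningful on an interval scale.
difference = warm_c - cool_c

# A ratio of Celsius values is misleading because 0 °C is an arbitrary zero.
naive_ratio = warm_c / cool_c

# On the Kelvin scale, which has a natural zero, the ratio is meaningful.
true_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print("difference:", difference)
print("ratio of Celsius values:", naive_ratio)
print("ratio of Kelvin values:", round(true_ratio, 3))
```

The Celsius ratio suggests "twice as warm", while the Kelvin ratio shows that the two temperatures differ by only a few percent.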

1.4.4.Geographic Objects:

The G of GIS can be interpreted as referring to geographic objects, also called
spatial objects, spatial features or geographic individuals. At the planet scale, our
world is only one object with a defined boundary. At the global scale, you can view
our planet as water and land mass, each with its own defined boundaries. On our
planet, there is more water mass than land mass, so it is interesting that our planet
is called Earth rather than Water.

At a more refined scale, electric poles, rivers, roads, tramlines, water
pipes, fire hydrants, lakes, vineyards, agricultural land and forest patches are
examples of geographic objects. Most tangible geographic objects have a
defined, accepted boundary and properties such as name, type and status.
These are called discrete geographic objects.

For example, Lake Geneva has a defined boundary. Its name (a property) is
Lake Geneva. Beyond that boundary, the area is no longer known as Lake Geneva. It is
one of the most famous (status) urban lakes in the world. Therefore, Lake Geneva is a
discrete geographic object.

Collections of geographic objects can be interesting phenomena at a higher

aggregation level: forest plots form forests, groups of parcels form suburbs,
streams, brooks and rivers form a river drainage system, roads form a road
network, and SST buoys form an SST sensor network. It is sometimes useful to
view geographic phenomena at this more aggregated level and look at
characteristics like coverage, connectedness, and capacity.

Typical questions at this level are:
Which part of the road network is within 5 km of a petrol station? (A coverage
question)
What is the shortest route between two cities via the road network? (A
connectedness question)
How many cars can optimally travel from one city to another in an hour? (A
capacity question)
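
As an illustration of the connectedness question, the sketch below finds a shortest route on a tiny road network. It uses Dijkstra's algorithm, which is our choice of method and is not prescribed by these notes; the network, the city names A to D and the distances are made up for the example.

```python
import heapq

def shortest_route(network, start, goal):
    """Dijkstra's algorithm on a dict-of-dicts road network (lengths in km)."""
    best = {start: 0.0}          # shortest known distance to each node
    previous = {}                # predecessor of each node on its best route
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            break
        if dist > best.get(node, float("inf")):
            continue             # stale queue entry, a shorter route was found
        for neighbour, length in network[node].items():
            candidate = dist + length
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    if goal not in best:
        return None, float("inf")
    route = [goal]
    while route[-1] != start:
        route.append(previous[route[-1]])
    return route[::-1], best[goal]

# Hypothetical road network: each city maps to its neighbours and road lengths.
roads = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"A": 4.0, "C": 1.0, "D": 5.0},
    "C": {"A": 2.0, "B": 1.0, "D": 8.0},
    "D": {"B": 5.0, "C": 8.0},
}
route, length = shortest_route(roads, "A", "D")
print(route, length)
```

In this network the shortest route from A to D runs via C and B and is 8 km long.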

A number of geological faults in the study area as Geographic Objects

1.4.5.Boundaries

Where shape and/or size of contiguous areas matter, the notion of boundary
comes into play. Location, shape and size are fully determined if we know an
area's boundary, so the boundary is a good candidate for representing it.
A crisp boundary is one that can be determined with almost arbitrary precision,
dependent only on the data acquisition technique applied. Fuzzy boundaries
contrast with crisp boundaries in that the boundary is not a precise line, but rather
itself an area of transition. As a general rule-of-thumb, crisp boundaries are more
common in man-made phenomena, whereas fuzzy boundaries are more common
with natural phenomena.

A continuous field example, namely the elevation.

1.5. Computer representations of geographic information

In order to represent a continuous phenomenon such as elevation faithfully in
computer memory, we could either:
Try to store as many (location, elevation) observation pairs as possible, or
Try to find a symbolic representation of the elevation field function, as a formula
in x and y like f(x,y) = 300x² + 4001y² + 2015xy, etc.

Both of these approaches have their drawbacks.

The first suffers from the fact that we will never be able to store all elevation
values for all locations; after all, there are infinitely many locations.
The second approach suffers from the fact that we do not know just what this
function should look like, and that it would be extremely difficult to derive such a
function for larger areas.

In GISs, typically a combination of both approaches is taken. We store a finite, but
intelligently chosen set of (sample) locations with their elevation. This gives us the
elevation for those stored locations, but not for others. We can use interpolation schemes
to estimate the unknown values in between the stored field points.
Interpolation is made possible by a principle called spatial autocorrelation. This is a
fundamental principle which refers to the fact that locations that are closer together are
more likely to have similar values than locations that are far apart (commonly referred to
as Tobler's first law of geography).

Fields are usually implemented using the tessellation approach, and objects using the
topological (vector) approach.
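
The interpolation idea described above can be sketched in Python. The example uses inverse distance weighting (IDW), which is only one of several interpolation schemes and is chosen here for illustration; the function name idw, the power parameter and the sample elevations are assumptions of this sketch.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (sx, sy, value) tuples; nearer samples get larger
    weights, which reflects spatial autocorrelation.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value          # the position coincides with a stored sample
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical stored (x, y, elevation) samples.
elevations = [(0.0, 0.0, 100.0), (10.0, 0.0, 200.0), (0.0, 10.0, 150.0)]

print(idw(elevations, 0.0, 0.0))   # at a sample location
print(idw(elevations, 5.0, 0.0))   # halfway between the first two samples
```

The estimate always lies between the smallest and largest sample value, and it approaches a sample's value as the position approaches that sample.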

1.5.1.Regular tessellations:

A tessellation (or tiling) is a partitioning of space into mutually exclusive cells that
together make up the complete study space. With each cell, some (thematic) value
is associated to characterize that part of space.
Three regular tessellation types are illustrated in Figure 2.5. In a regular
tessellation, the cells are the same shape and size. The simplest example is a
rectangular raster of unit squares, represented in a computer in the 2D case as an
array of n×m elements.

The three most common regular tessellation types: square cells, hexagonal cells,
and triangular cells.
In all regular tessellations, the cells are of the same shape and size, and the field
attribute value assigned to a cell is associated with the entire area occupied by the
cell. Such a set of cells is generally called a raster.
A raster is a set of regularly spaced (and contiguous) cells with associated (field)
values. The associated values represent cell values, not point values. This means
that the value for a cell is assumed to be valid for all locations within the cell.
The size of the area that a single raster cell represents is called the raster's
resolution. Sometimes, the word grid is also used, but strictly speaking, a grid
refers to values at the intersections of a network of regularly spaced horizontal and
perpendicular lines (see Figure 2.6). Grids are often used for discrete
measurements that occur at regular intervals. Grid points are often considered
synonymous with raster cells, although they are not.


Grids and Raster respectively.

The location associated with a raster cell is fixed by convention, and may be the cell
centroid (mid-point) or, for instance, its lower-left corner. Values for positions other than
these must be computed through some form of interpolation function, which will use one
or more nearby field values to compute the value at the requested position. This allows us
to represent continuous, even differentiable, functions.

An obvious disadvantage of regular tessellations is that they are not adaptive to the
spatial phenomenon we want to represent. The cell boundaries are both artificial and
fixed: they may or may not coincide with the boundaries of the phenomena of interest.
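
A raster of square cells can be sketched as a small Python class. This is a minimal sketch, not the storage format of any particular GIS package: the class name Raster, its attribute names and the convention that the upper-left corner anchors the raster are assumptions made for the example.

```python
class Raster:
    """A regular tessellation of square cells stored as rows of values."""

    def __init__(self, values, x_min, y_max, cell_size):
        self.values = values        # list of rows; row 0 is the northernmost
        self.x_min = x_min          # x coordinate of the upper-left corner
        self.y_max = y_max          # y coordinate of the upper-left corner
        self.cell_size = cell_size  # cell width, i.e. the resolution

    def cell_index(self, x, y):
        """Return (row, col) of the cell that contains position (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise ValueError("position lies outside the raster")
        return row, col

    def value_at(self, x, y):
        """The cell value is assumed valid for every location within the cell."""
        row, col = self.cell_index(x, y)
        return self.values[row][col]

# A 3 x 4 raster with 10 m cells covering x in [0, 40) and y in (0, 30].
grid = Raster([[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12]], x_min=0.0, y_max=30.0, cell_size=10.0)

print(grid.cell_index(25.0, 28.0), grid.value_at(25.0, 28.0))
```

Every position inside a cell returns the same value, which is exactly the cell-value (not point-value) interpretation described above.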

1.5.2.Irregular tessellations:

Irregular tessellations are more complex than the regular ones, but they are also
more adaptive, which typically leads to a reduction in the amount of memory used
to store the data. A well-known data structure in this family upon which many
more variations have been based is the region quadtree. It is based on a regular
tessellation of square cells, but takes advantage of cases where neighboring cells
have the same field value, so that they can together be represented as one bigger
cell. A simple illustration is provided in Figure 2.7.


The quadtree that represents this raster is constructed by repeatedly splitting up the area
into four quadrants, which are called NW, NE, SE, SW for obvious reasons. This
procedure stops when all the cells in a quadrant have the same field value. The procedure
produces an upside-down, tree-like structure, known as a quadtree.
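
The splitting procedure can be sketched recursively in Python. The sketch assumes a square raster whose side length is a power of two, and it represents the tree with nested dictionaries; both choices, and the function names, are ours and only serve the illustration.

```python
def build_quadtree(raster, row=0, col=0, size=None):
    """Return a single value for a uniform block, else a dict of four quadrants."""
    if size is None:
        size = len(raster)  # assumes a square raster with a power-of-two side
    first = raster[row][col]
    uniform = all(
        raster[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return first        # leaf: all cells in this quadrant share one value
    half = size // 2
    return {
        "NW": build_quadtree(raster, row, col, half),
        "NE": build_quadtree(raster, row, col + half, half),
        "SE": build_quadtree(raster, row + half, col + half, half),
        "SW": build_quadtree(raster, row + half, col, half),
    }

def count_leaves(node):
    """Number of stored values in the quadtree."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1

# Hypothetical 4 x 4 raster; row 0 is the northern edge.
field = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [1, 1, 3, 2],
    [1, 1, 2, 2],
]
tree = build_quadtree(field)
print(tree)
print(count_leaves(tree), "leaves instead of 16 cells")
```

Only the SE quadrant needs a further split here, so the tree stores 7 values where the plain raster stores 16.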

1.5.3.Vector Data Model:

The vector data model uses the geometric objects of point, line, and area to
represent simple spatial features (Figure 2.8). Dimensionality and property
distinguish the three types of geometric objects as well as the features they represent.
A point has 0 dimension and has only the property of location. A point may also
be called a node, vertex, or 0-cell. A point feature is made of a point or a set of
separate points. Wells, benchmarks, and gravel pits are examples of point
features.
A line is one-dimensional and has the property of length. A line has two end
points and points in between to mark the shape of the line. The shape of a line
may be a smooth curve or a connection of straight-line segments. Smooth curves
are typically fitted by mathematical equations such as splines. Straight-line
segments may represent human-made features such as canals and streets, or they
may simply be approximations of curves. A line is also called an edge, link, chain,
or 1-cell. A line feature is made of lines. Roads, streams, and contour lines are
examples of line features.
An area is two-dimensional and has the properties of area (size) and perimeter.
Made of connected lines, an area may be alone or share boundaries with other
areas. An area may contain holes, such as a national forest containing private land
parcels (holes). The existence of holes means that the area has both external and
internal boundaries. An area is also called a polygon, face, zone, or 2-cell. An
area feature is made of polygons. Examples of area features include timber stands,
land parcels, and water bodies.
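
The properties named above (location, length, area and perimeter) can be computed from plain coordinate lists. The following Python sketch uses straight-line segments only and the shoelace formula for area; the function names and the sample well, road and forest are hypothetical.

```python
import math

def line_length(vertices):
    """Length of a line made of straight segments between (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))

def ring_area(ring):
    """Shoelace formula; ring lists each (x, y) vertex once, unclosed."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_area(outer, holes=()):
    """Area inside the external boundary minus the area of the holes."""
    return ring_area(outer) - sum(ring_area(hole) for hole in holes)

def polygon_perimeter(outer, holes=()):
    """Total length of the external and internal boundaries."""
    return sum(line_length(ring + ring[:1]) for ring in [outer, *holes])

well = (3.0, 4.0)                                   # point: location only
road = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]        # line: has length
forest = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
private_parcel = [(4.0, 4.0), (6.0, 4.0), (6.0, 6.0), (4.0, 6.0)]  # a hole

print(line_length(road))
print(polygon_area(forest, [private_parcel]))
print(polygon_perimeter(forest, [private_parcel]))
```

The hole reduces the area of the forest polygon but adds to its perimeter, because the area then has both an external and an internal boundary.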


1.5.4.Topology and Spatial relationships:

Topology refers to the spatial relationships between geographical elements in a
data set that do not change under a continuous transformation.
Topological relationships are built from simple elements into more complex
elements: nodes define line segments, and line segments connect to define lines,
which in turn define polygons. The fundamental issues relating to order,
connectivity and adjacency of geographical elements form the basis of more
sophisticated GIS analyses. These relationships (called topological properties) are
invariant under a continuous transformation, referred to as a topological mapping.
For example, features drawn on a sheet of rubber (as in Figure 2.9) can be made to
change in shape and size by stretching and pulling the sheet.

However, some properties of these features do not change:

Area E is still inside area D,
The neighbourhood relationships between A, B, C, D, and E stay intact, and their
boundaries have the same start and end nodes,
The areas are still bounded by the same boundaries, only the shapes and lengths of
their perimeters have changed.

Figure 2.9 Rubber sheet transformation: the space is transformed, yet many
relationships between the constituents remain unchanged.
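
The invariance of the inside relationship can be tried out in Python. As a stand-in for a rubber sheet transformation, the sketch uses a simple stretch combined with a shear (an invertible linear map), because such a map keeps polygon edges straight; the transformation, the function names and the coordinates are all assumptions of this example.

```python
def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices, unclosed."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):                    # edge crosses the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def ring_area(ring):
    """Shoelace formula for the area enclosed by the ring."""
    total = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]))
    return abs(total) / 2.0

def stretch_and_shear(p):
    """Hypothetical continuous, invertible distortion of the plane."""
    x, y = p
    return (2.0 * x + 0.5 * y, 3.0 * y)

area_d = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
point_e = (5.0, 5.0)      # lies inside D
point_f = (12.0, 5.0)     # lies outside D

new_d = [stretch_and_shear(p) for p in area_d]
new_e = stretch_and_shear(point_e)
new_f = stretch_and_shear(point_f)

print(point_in_polygon(point_e, area_d), point_in_polygon(new_e, new_d))
print(point_in_polygon(point_f, area_d), point_in_polygon(new_f, new_d))
print(ring_area(area_d), ring_area(new_d))
```

The inside and outside answers are the same before and after the transformation, while a geometric property such as the area changes.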

The mathematical properties of the geometric space used for spatial data can be
described as follows:
The space is a three-dimensional Euclidean space where for every point we can
determine its three-dimensional coordinates as a triple (x,y,z) of real numbers. In
this space, we can define features like points, lines, polygons, and volumes as
geometric primitives of the respective dimension. A point is zero-dimensional, a
line one-dimensional, a polygon two-dimensional, and a volume is a three-
dimensional primitive.

The space is a metric space, which means that we can always compute the
distance between two points according to a given distance function. Such a
function is also known as a metric.
The space is a topological space, of which the definition is a bit complicated. In
essence, for every point in the space we can find a neighbourhood around it that
fully belongs to that space as well.
Interior and boundary are properties of spatial features that remain invariant
under topological mappings. This means that under any topological mapping, the
interior and the boundary of a feature remain unbroken and intact.
We can define within the topological space, features that are easy to handle and
that can be used as representations of geographic objects. These features are called
simplices as they are the simplest geometric shapes of some dimension: point (0-
simplex), line segment (1-simplex), triangle (2-simplex), and tetrahedron (3-
simplex).
When we combine various simplices into a single feature, we obtain a simplicial
complex.


1.5.5.Scale and resolution:

Map scale can be defined as the ratio between the distance on a paper map and the
distance of the same stretch in the terrain. A 1:50,000 scale map means that 1 cm
on the map represents 50,000 cm (i.e. 500 m) in the terrain. Large-scale means
that the ratio is large, so typically it means there is much detail, as in a 1:1,000
paper map. Small-scale, in contrast, means a small ratio and hence less detail, as in a
1:2,500,000 paper map. When applied to spatial data, the term resolution is
commonly associated with the cell width of the tessellation applied.
Digital spatial data, as stored in a GIS, is essentially without scale: scale is a ratio
notion associated with visual output, like a map or on-screen display, not with the
data that was used to produce the map.

Map of the Boston area at two different scales: 1:100,000,000 and 1:34,000.
Problems:
Q: What is the length in cm of one km on a 1:25,000 map?
Q: The distance between two towns measures 6 cm on a map. What is the true distance if
the scale is 1:50,000?

Q: The plans of a house show a room to be 4.0 m × 3.6 m. The dimensions of the room
on the plans measure 20 mm × 18 mm. What was the scale used?
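
The three problems above all rest on the same ratio, so they can be worked through with three small helper functions. The function names are our own; the only rule used is that map distance equals ground distance divided by the scale denominator, with both expressed in the same unit.

```python
def map_distance_cm(ground_m, scale_denominator):
    """Distance on the map in cm for a ground distance given in metres."""
    return ground_m * 100.0 / scale_denominator

def ground_distance_m(map_cm, scale_denominator):
    """Distance in the terrain in metres for a map distance given in cm."""
    return map_cm * scale_denominator / 100.0

def scale_denominator(ground_m, map_mm):
    """Denominator N of the scale 1:N from a ground distance and a map distance."""
    return ground_m * 1000.0 / map_mm

# Problem 1: one km on a 1:25,000 map.
print(map_distance_cm(1000.0, 25_000), "cm")

# Problem 2: 6 cm on a 1:50,000 map.
print(ground_distance_m(6.0, 50_000), "m")

# Problem 3: a 4.0 m x 3.6 m room drawn as 20 mm x 18 mm.
print("1 :", scale_denominator(4.0, 20.0), "and 1 :", scale_denominator(3.6, 18.0))
```

One km is drawn as 4 cm at 1:25,000, 6 cm at 1:50,000 corresponds to 3 km, and both dimensions of the room give a plan scale of 1:200.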

Feature Representation

Vector vs. Raster

To work in a GIS environment, real world observations (objects or events that can be
recorded in 2D or 3D space) need to be reduced to spatial entities. These spatial entities
can be represented in a GIS as a vector data model or a raster data model.


Vector and raster representations of a river feature.

Vector

Vector features can be decomposed into three different geometric primitives: points,
polylines and polygons.

Point

Three point objects defined by their X and Y coordinate values.

A point is composed of one coordinate pair representing a specific location in a
coordinate system. Points are the most basic geometric primitives, having no length
or area. By definition a point cannot be seen since it has no area, but this is not
practical if such primitives are to be mapped. So points on a map are represented
using symbols that have both area and shape (e.g. circle, square, plus signs).

Polyline

A simple polyline object defined by connected vertices.

A polyline is composed of a sequence of two or more coordinate pairs called vertices. A
vertex is defined by a coordinate pair just like a point, but what differentiates a vertex
from a point is its explicitly defined relationship with neighboring vertices. A vertex is
connected to at least one other vertex. Like a point, a true line cannot be seen since it has
no area. And like a point, a line is symbolized using shapes that have a color, width and
style (e.g. solid, dashed or dotted). Roads and rivers are commonly stored as
polylines in a GIS.

Polygon

A simple polygon object defined by an area enclosed by connected vertices.

A polygon is composed of one or more lines whose starting and ending coordinate pairs
are the same. Polygons represent both length (i.e. the perimeter of the area) and area.
They also embody the idea of an inside and outside; in fact, the area that a polygon
encloses is explicitly defined in a GIS environment.

Raster

A simple raster object defined by a 10x10 array of cells or pixels.

A raster data model uses an array of cells, or pixels, to represent real-world
objects. Raster datasets are commonly used for representing and managing
imagery, surface temperatures, digital elevation models, and numerous other
entities.


A raster can be thought of as a special case of an area object where the area is
divided into a regular grid of cells. But a regularly spaced array of marked points
may be a better analogy since rasters are stored as an array of values where each
cell is defined by a single coordinate pair inside of most GIS environments.

Implicit in a raster data model is a value associated with each cell or pixel. This is
in contrast to a vector model that may or may not have a value associated with the
geometric primitive.

1.6. Organizing and Managing Spatial Data:

The main principle of data organization applied in a GIS is that of a spatial
data layer. A spatial data layer is either a representation of a continuous or
discrete field, or a collection of objects of the same kind. Usually data is organized
so that similar elements are in a single data layer. For example, all ATMs would
be in one layer, and all road line objects in another.
A data layer contains spatial data as well as attribute data, which further describes
the field or objects in the layer. Attribute data is quite often arranged in tabular
form, maintained in some kind of geodatabase. Data layers can be overlaid with
each other inside the GIS package, so as to study combinations of geographic
phenomena. A GIS can thus be used to study the spatial relationships between different
phenomena, which requires computations that overlay one layer with another.

Several layers can be used in spatial analysis.
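
A cell-by-cell overlay of two raster layers can be sketched as follows. The layers, their values and the function name overlay are made up for the example, and the sketch assumes that both layers cover exactly the same cells; real GIS packages also handle differing extents, resolutions and vector layers.

```python
def overlay(layer_a, layer_b, combine):
    """Combine two raster layers of identical shape, cell by cell."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("layers must cover the same cells")
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]

# Hypothetical layers: a discrete field (land use) and a continuous field (slope).
land_use = [["forest", "forest", "urban"],
            ["crop", "forest", "urban"]]
slope_deg = [[5, 25, 3],
             [2, 30, 12]]

# Which cells are forest on a slope steeper than 20 degrees?
steep_forest = overlay(land_use, slope_deg,
                       lambda use, slope: use == "forest" and slope > 20)
print(steep_forest)
```

The result is itself a new layer, which can be overlaid with further layers in the same way.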


1.6.1. The Temporal Dimension:

Besides having geometric, thematic and topological properties, geographic phenomena
are also dynamic; they can change over time. For an increasing number of applications,
these changes themselves are the key aspect of the phenomenon to study. Spatiotemporal
data models are ways of organizing representations of space and time in GIS. Several
representation techniques have been proposed.

1. Discrete and continuous time-

Discrete time is composed of discrete elements (seconds, minutes, hours, days,
months). In continuous time, no such discrete elements exist, and for any two different
points in time, there is always another point in between. We can also structure time by
events or periods, and derive temporal relationships between events and periods such as
before, overlap and after.

2. Valid and transaction time-

Valid time (world time) is the time period during which a fact is true in the real world.
Transaction time is the time period during which a fact stored in the database was known.

3. Linear, branching and cyclic time-

Time can be considered to be linear, extending from the past to the present and into the
future. This view gives a single time line. In branching time, different time lines from
certain points in time onwards are possible. For some applications, cyclic time, in which
repeating cycles such as seasons or days of the week are recognized, makes more sense
and can be useful.

4. Time Granularity-

When measuring time, we speak of granularity as the precision of a time value in a GIS or
database (e.g. seconds, minutes, hours, years). In a cadastral application, time granularity
might well be a day, as the law requires deeds to be date-marked. In geological mapping
applications, time granularity is more likely to be in the order of thousands or millions of
years.

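5. Absolute and relative time-

Absolute time marks a point on the time line where events happen (e.g. 6 July 1999 at
11:15 p.m.). Relative time is indicated relative to other points in time (e.g. yesterday,
last year and tomorrow, which are all relative to now, or two weeks later, which is
relative to some other arbitrary point in time).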
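
The difference between valid time and transaction time (item 2 above) can be made concrete with a small Python sketch. The record layout, the field names and the land ownership example are hypothetical, and the granularity is one day, as in the cadastral example; real spatiotemporal databases are considerably more elaborate.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class OwnershipFact:
    owner: str
    valid_from: date            # valid time: when the fact holds in the real world
    valid_to: Optional[date]    # None means the period is still open
    known_from: date            # transaction time: when the database held the fact
    known_to: Optional[date]

def covers(start, end, day):
    """True if day lies in the half-open period [start, end)."""
    return start <= day and (end is None or day < end)

def owner_on(facts, day, as_known_on):
    """Owner on `day` according to what the database contained on `as_known_on`."""
    for fact in facts:
        if covers(fact.known_from, fact.known_to, as_known_on) and \
           covers(fact.valid_from, fact.valid_to, day):
            return fact.owner
    return None

# A parcel was sold on 1 June 2020, but the sale was registered on 15 July 2020.
facts = [
    OwnershipFact("Smith", date(2010, 1, 1), None,
                  date(2010, 1, 5), date(2020, 7, 15)),
    OwnershipFact("Smith", date(2010, 1, 1), date(2020, 6, 1),
                  date(2020, 7, 15), None),
    OwnershipFact("Jones", date(2020, 6, 1), None,
                  date(2020, 7, 15), None),
]

print(owner_on(facts, date(2020, 6, 10), as_known_on=date(2020, 6, 30)))
print(owner_on(facts, date(2020, 6, 10), as_known_on=date(2020, 8, 1)))
```

For the same day in the real world, the answer depends on when the database is consulted: before the sale was registered it still reports the old owner, afterwards the new one.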

Sample Questions:

1. Explain spatial database and spatial analysis with example.

2. Write a short note on Map in GIS.
3. Explain different Models and Modeling in GIS.
4. What are important components in data quality assessment?
5. Write a short note on Data representation schemes in GIS.
6. Explain four sets of capabilities to handle georeferenced data in GIS.
7. Define GIS. What are its components?
8. Explain different application areas of GIS in short.
9. Explain feature representation in terms of vector data presentation vs. raster data
presentation.
10. Explain Topology and spatial relationship in GIS with example.
11. Explain Vector data model in GIS with example.
12. Explain quad tree data presentation with example in GIS.
13. Explain different data types and values in GIS.
14. Explain different types of geographic phenomena in Euclidean space.
15. Problems on Map scale.

Geographic Information Systems Unit 2

Unit-II
Data Management and Processing Systems

Contents

2.1 Introduction
2.2 GIS Software.
2.3 Stages of Spatial Data Infrastructure
2.4 Database Management System
2.5 GIS and Spatial Databases

2.1 INTRODUCTION

A Geographic Information System (GIS) is a computer-based tool that analyzes,
stores, manipulates and visualizes geographic information on a map.

In a GIS, geographic information is described explicitly in terms of geographic
coordinates (latitude and longitude or some other coordinates) or implicitly in terms
of a street address, postal code, or forest stand identifier.
A geographic information system contains the ability to translate implicit
geographic data (such as a street address) into an explicit map location.
For many years, analog data sources were used, processing was done manually,
and paper maps were produced. The introduction of modern techniques has led
to an increased use of computers and digital information in all aspects of spatial
data handling.
The software technology used in this domain is centered on geographic
information systems (GIS).
A geographic information system provides a range of capabilities to handle
georeferenced data (georeferencing is the process of associating a physical map or
raster image of a map with spatial locations). These capabilities include data
capture, data management, data manipulation and data presentation.


2.2 GIS SOFTWARE

Geographic Information System (GIS) software is designed to store,
retrieve, manage, display, and analyze all types of geographic and spatial
data.
GIS software lets us produce maps and other graphic displays of
geographic information for analysis and presentation.
A GIS stores data on geographical features and their characteristics. The
features are typically classified as points, lines, or areas, or as raster/vector
images.
The main characteristics of a GIS software package are its analytical
functions, which provide means for deriving new geo-information from
existing spatial and attribute data.
There are many GIS packages available on the market and all of them have
their strengths and weaknesses. Some GISs have traditionally focused more
on support for raster-based functionality, others more on vector-based
spatial objects.
We can conclude that any package that provides support for only raster or
only vector data is not complete GIS software.

1. GRASS GIS (Geographic Resources Analysis Support System): Originally
developed by the U.S. Army Corps of Engineers (1982). This software
is used for geospatial data management and analysis, image processing,
producing graphics and maps, spatial and temporal modeling, and
visualization. It can handle raster, topological vector, image processing, and
graphic data.
2. gvSIG (2004): Written in Java. Runs on Linux, Unix, Mac OS X and
Windows. A desktop application designed for capturing, storing, handling,
analyzing and deploying any kind of referenced geographic information in
order to solve complex management and planning problems. gvSIG is
known for having a user-friendly interface, being able to access the most
common formats, both vector and raster ones. It features a wide range of
tools for working with geographic-like information (query tools, layout
creation, geo-processing, networks, etc.).
3. ILWIS (Integrated Land and Water Information System): ILWIS was initially
developed and distributed by ITC Enschede (International Institute for
Geo-Information Science and Earth Observation) in the Netherlands. It is GIS
and remote sensing software for both vector and raster processing. Its features
include digitizing, editing, analysis and display of data, and production of quality
maps.
4. QGIS (previously known as Quantum GIS): Runs on Linux, Unix, Mac
OS X and Windows. This software allows users to analyze and edit spatial
information, in addition to composing and exporting graphical maps. QGIS
supports both raster and vector layers; vector data is stored as either point,
line, or polygon features. Multiple formats of raster images are supported
and the software can georeference images.
5. JUMP GIS / OpenJUMP: JUMP is a Java-based vector and raster GIS and
programming framework. It reads and writes the ESRI Shapefile format and
reads the database datastores PostGIS, SpatiaLite, Oracle Spatial
and MySQL; it also writes to PostGIS datastores. It reads raster files (world
files supported) such as GeoTIFF, TIFF, JPEG, BMP, PNG, FLT, ASC and JPEG
2000, and writes rasters such as GeoTIFF, TIFF and PNG. It supports full
geometry and attribute editing.
6. MapWindow GIS: It is an application and a set of programmable mapping
components. It has been adopted by the United States Environmental
Protection Agency. MapWindow GIS is distributed as an open-source
application under the Mozilla Public License. It is an
extensible geographic information system, which means that advanced users
or developers can write plug-ins to add functionality and pass
these along to any number of their clients and end users.
7. SAGA GIS (System for Automated Geoscientific Analyses): It is a
geographic information system (GIS) computer program, used to edit
spatial data. It is free and open-source software, developed originally by a
small team at the Department of Physical Geography, University of
Göttingen, Germany, and is now being maintained and extended by an
international developer community. SAGA has a fast-growing set of
geoscientific methods, bundled in exchangeable module libraries. SAGA
GIS is an effective tool with a user-friendly graphical user interface (GUI)
that requires only about 10 MB of disk space and needs no installation.
8. uDig (User-friendly Desktop Internet GIS): It is a GIS software program
produced by a community led by the Canadian consulting company
Refractions Research. It is based on the Eclipse platform and features full
layered open-source GIS. It supports shapefiles, PostGIS, WMS, and
many other data sources natively. uDig is commonly used as a framework
for building other GIS platforms and applications. Such applications
include DIVA-GIS.

The following table presents a comparative study of these open-source GIS packages.

Software Name | Release Date | User Profile          | Programming Language | Supports Raster/Vector
GRASS GIS     | 1982         | Expert - Researcher   | C, Python            | Both
ILWIS         | 1984         | Beginner - Researcher | MS Visual C          | Both
SAGA          | 2001         | Beginner - Researcher | MS Visual C          | Mainly raster
OpenJUMP      | 2002         | Beginner - Researcher | Java                 | Both
QGIS          | 2002         | Beginner - Researcher | C++, Python          | Both
gvSIG         | 2003         | Beginner - Researcher | Java                 | Both
uDig          | 2004         | Beginner - Researcher | Java                 | Mainly vector
MapWindow     | 2005         | Beginner - Researcher | C++, .NET            | Both

Characteristics:

GRASS GIS: It is used for geospatial data management and analysis, image
processing, producing graphics and maps, spatial and temporal modeling, and
visualization. It can handle raster, topological vector, image processing, and
graphic data.

ILWIS: It supports remote sensing features for both vector and raster
processing. Its features include digitizing, editing, analysis and display of
data, and production of quality maps.

SAGA: It is used to edit spatial data and is an effective tool with a user-friendly
graphical user interface (GUI) that requires only about 10 MB of disk space.
No installation is needed.

OpenJUMP: It writes rasters such as GeoTIFF, TIFF and PNG. It supports full
geometry and attribute editing. It reads and writes the ESRI Shapefile format
and reads PostGIS database datastores.

QGIS: This software allows users to analyze and edit spatial information, in
addition to composing and exporting graphical maps. QGIS supports both raster
and vector layers; vector data is stored as either point, line, or polygon
features. Multiple formats of raster images are supported and the software can
georeference images.

gvSIG: It features a wide range of tools for working with geographic-like
information (query tools, layout creation, geo-processing, networks, etc.).

uDig: It supports shapefiles, PostGIS, WMS, and many other data sources
natively. uDig is commonly used as a framework for building other GIS
platforms and applications.

MapWindow: It is an extensible geographic information system. This means that
advanced users or developers can write plug-ins to add functionality and pass
these along to any number of their clients and end users.

2.2.1 GIS ARCHITECTURE AND FUNCTIONALITY


As shown in the figure, the architecture consists of three tiers: the Client tier, which
presents the data to the user; the Middleware tier, consisting of the web
application and GIS servers; and the Server tier, including the database that
stores the spatial and non-spatial data. The roles of the various components of
this architecture include:
1. Client (Presentation) Tier: It supports various web browsers for
presentation purposes. It also includes desktop software for complex
spatial data manipulation and visualization tasks, as well as support for the
increasingly popular mobile devices.
2. Middleware (Application) Tier:

Web server: MapServer is an open-source web server that can render maps by
generating objects from the database and its shapefiles. A scripting language,
such as PHP, provides the interaction between the user and the system, including
data storage in the database. The Open Database Connectivity (ODBC) interface is
available to connect to the database, and MapServer supports numerous OGC standards.
GIS server: This is distinct from web map servers, as it does not provide web
mapping services but rather GIS processing functionality such as
visualization and spatial data analysis.

3. Data (Server) Tier: The geo-database in the data tier efficiently stores,
manages and retrieves data for relevant purposes.

A geographic information system in the wider sense consists of hardware,
software, data, people, and an organization in which it is used. All these elements
enable several functional components that support key GIS functions.

Hardware: It is the computer on which a GIS operates. Today, GIS software runs on
a wide range of hardware types, from centralized computer servers to desktop
computers used in stand-alone or networked configurations.

Software: GIS software provides the functions and tools needed to store, analyze, and
display geographic information. Key software components are
Tools for the input and manipulation of geographic information
A database management system (DBMS)
Tools that support geographic query, analysis, and visualization
A graphical user interface (GUI) for easy access to tools.
Data: The most important component of a GIS is the data. A GIS will integrate spatial
data with other data resources and can even use a DBMS, used by most organizations
to organize and maintain their data, to manage spatial data. In a GIS, data is classified
into three types, namely:
Spatial data: It includes coordinates, boundaries, wells and road networks.
Non-spatial data (attribute or a-spatial data): It includes addresses,
population density, land ownership and soil pH values.
Data relationships (spatial relationships, i.e. topology, and attribute relations).

People: GIS technology is of limited value without the people who manage the
system and develop plans for applying it to real-world problems.

Infrastructure: It refers to the physical, organizational, administrative and cultural
environments that support GIS. The infrastructure includes essential skills, data
standards, and general organizational patterns.

GIS functions consist of the following steps:

1. Spatial data input
   1. Data entry: use existing data, create new data
   2. Data editing
   3. Geometric transformation
   4. Projection and re-projection

2. Database management
   1. Data entry and verification
   2. Database management
   3. Attribute data manipulation

3. Data display
   1. Cartographic symbolization
   2. Map design

4. Data exploration
   1. Attribute data query
   2. Spatial data query
   3. Geographic visualization

5. Data analysis
   1. Vector data analysis: buffering, overlay, distance
      measurement, spatial statistics, map manipulation
   2. Raster data analysis: local, neighborhood, zonal,
      global, raster data manipulation

6. GIS modeling
   1. Binary models
   2. Index models
   3. Regression models
   4. Process models


2.2.2 SPATIAL DATA INFRASTRUCTURE (SDI)

The term spatial data infrastructure was coined in 1993 by the U.S. National Research
Council to denote a framework of technologies, policies, and institutional
arrangements that together facilitate the creation, exchange, and use of geospatial data
and related information resources across an information-sharing community. Such a
framework can be implemented narrowly to enable the sharing of geospatial
information within an organization or more broadly for use at a national, regional, or
global level.
As per Kuhn (2005): "An SDI is a coordinated series of agreements on
technology standards, institutional arrangements, and policies that enable
the discovery and use of geospatial information by users and for purposes
other than those it was created for."
An SDI provides its users with different facilities for finding, viewing,
downloading and processing data. Because the organizations in an SDI are
normally widely distributed over space, computer networks are used as the
means of communication. With the development of the internet, the functional
components of GIS have gradually become available as web-based
applications.

2.3 SPATIAL DATA

Spatial data, also known as geospatial data or geographic information, is the data or
information that identifies the geographic location of features and boundaries on
Earth, such as natural or constructed features, oceans, and more. Spatial data is usually
stored as coordinates and topology, and is data that can be mapped. Spatial data is
often accessed, manipulated or analyzed through Geographic Information Systems
(GIS).

Types of Spatial Data: Basically there are two types, namely raster data and vector data.

Raster Data
Representation: Raster data uses a grid to represent its geographic information.
Points are represented by single cells, lines by sequences of neighboring cells, and
areas by collections of neighboring cells.
Other types of raster data include: Satellite Imagery (remotely sensed satellite
data), Digital Elevation Models (an array of uniformly spaced elevation data), and
Graphic Files (scanned maps, photographs, and images in TIFF, GIF, or JPEG format).
Raster data are good for:
i) representing continuous data (e.g., slope, elevation, chemical concentrations)
ii) representing multiple feature types (e.g., points, lines, and polygons) as
single feature types (cells)
iii) rapid computations ("map algebra") in which raster layers are treated as
elements in mathematical expressions
iv) analysis of multi-layer or multivariate data (e.g., satellite image processing
and analysis)
Focus: The raster data model represents features as a matrix of cells; its primary
focus is location.
Conversion: Conversion of vector to raster data is called rasterization.
Examples: DEM, DRGs, satellite imagery, digital orthophotos

Vector Data
Representation: Vector data uses the simple geometric objects of points, lines, and
areas (polygons) to represent spatial features. Vector data is not made up of a grid
of pixels. Instead, vector graphics are comprised of vertices and paths.
Symbol types: The three basic symbol types for vector data are points, lines and
polygons (areas). Maps have long used symbols to represent real-world features. In
GIS terminology, real-world features are called spatial entities.
Vector data are good for:
i) accurately representing true shape and size
ii) representing non-continuous data (e.g., rivers, political boundaries, road
lines, mountain peaks)
iii) creating aesthetically pleasing maps
Focus: The primary focus of the vector data model is the geographic feature itself.
Conversion: Conversion of raster to vector data is called vectorization.
Examples: TIN, routes, regions
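
As a rough illustration of rasterization (vector to raster conversion), the sketch below marks the grid cells touched by a few points and by one line. The grid size, the cell size and the helper names (`to_cell`, `rasterize_points`, `rasterize_line`) are assumptions made for this example and do not come from any particular GIS package.

```python
def to_cell(x, y, cell_size):
    """Return the (row, col) of the cell containing coordinate (x, y).

    Assumes the grid origin is at (0, 0) and row 0 starts at y = 0.
    """
    return int(y // cell_size), int(x // cell_size)


def rasterize_points(points, rows, cols, cell_size, value=1):
    """Each point becomes a single cell, as described for raster data."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        r, c = to_cell(x, y, cell_size)
        if 0 <= r < rows and 0 <= c < cols:
            grid[r][c] = value
    return grid


def rasterize_line(start, end, rows, cols, cell_size, value=1, steps=100):
    """A line becomes a sequence of neighboring cells.

    The line is sampled at regular intervals (a simple approach chosen for
    this sketch); every cell that contains a sample is marked.
    """
    grid = [[0] * cols for _ in range(rows)]
    (x0, y0), (x1, y1) = start, end
    for i in range(steps + 1):
        t = i / steps
        r, c = to_cell(x0 + t * (x1 - x0), y0 + t * (y1 - y0), cell_size)
        if 0 <= r < rows and 0 <= c < cols:
            grid[r][c] = value
    return grid


point_grid = rasterize_points([(5, 5), (25, 15)], rows=3, cols=3, cell_size=10)
line_grid = rasterize_line((5, 5), (25, 5), rows=3, cols=3, cell_size=10)
```

Note how the exact coordinates are lost after conversion: only the cell that contains each point or line sample is recorded, which is why raster data focuses on location rather than on the feature itself.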

2.3.1 STAGES OF SPATIAL DATA HANDLING:-

1. Data capture and preparation

GIS project planning requires data sources, both spatial and non-spatial, from
different national institutes. Data is captured from different sources. Basically
there are two options: use existing data (sources include the FGDC, the Federal
Geographic Data Committee; the US Geological Survey; and the US Census Bureau) and
create new data (this includes remotely sensed data, field data, and digitization).
The functions for capturing data are closely related to the disciplines of
surveying engineering, photogrammetry, remote sensing, and the processes of
digitizing, i.e. the conversion of analog data into digital representations.

Remote sensing, in particular, is the field that provides photographs and images
as the raw base data from which spatial data sets are derived.
Traditional techniques for obtaining spatial data, typically from paper sources,
included manual digitizing and scanning. Digitizing and scanning in the
availability and sharing of digital (geospatial) data. Data conversion objects of
interest to the system still need to be constructed.

2. Spatial data storage and maintenance:

The way that data is stored plays a central role in the processing and the
eventual understanding of that data. In most of the available systems, spatial
data is organized in layers by theme and/or scale.
For instance, the data may be organized in thematic categories, such as land
use, topography and administrative subdivisions, or according to map scale.
Spatial data are stored in raster or vector format [comparison of raster and
vector format-covered in types of spatial data].
Maintenance of (spatial) data can best be defined as the combined
activities to keep the data set up-to-date and as supportive as possible to
the user community.
It deals with obtaining new data, and entering them into the system, possibly
replacing outdated data. The purpose is to have an up-to-date stored data set
available.

3. Spatial query and analysis

Spatial queries and process models play an important role in this functionality.
Spatial decision support systems (SDSS) are a category of information
systems composed of a database, GIS software, models, and a so-called
knowledge engine, which allow users to deal specifically with locational
problems.
Spatial Analysis of Raster Data: There are four basic types of raster functions
according to which analysis of raster data is done. These are:
Local functions: work on every single cell,
Focal functions: process the data of each cell based on the
information of a specified neighborhood,
Zonal functions: provide operations that work on each group of
cells of identical values,
Global functions: work on a cell based on the data of the entire
grid.
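
The four kinds of raster functions can be sketched on a tiny grid. This is only an illustration: the sample values, the zone labels and the function names (`local_double`, `focal_mean`, `zonal_sum`, `global_share`) are assumptions made for this example.

```python
raster = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

# Zone grid: cells with identical labels form one zone.
zones = [
    ["a", "a", "b"],
    ["a", "b", "b"],
    ["c", "c", "c"],
]


def local_double(grid):
    """Local function: each output cell depends only on the same input cell."""
    return [[v * 2 for v in row] for row in grid]


def focal_mean(grid):
    """Focal function: mean of each cell's 3x3 neighborhood (clipped at edges)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_sum(grid, zone_grid):
    """Zonal function: sum the values of all cells that share a zone label."""
    totals = {}
    for row, zone_row in zip(grid, zone_grid):
        for value, zone in zip(row, zone_row):
            totals[zone] = totals.get(zone, 0) + value
    return totals


def global_share(grid):
    """Global function: each cell expressed as a share of the whole grid's total."""
    total = sum(sum(row) for row in grid)
    return [[v / total for v in row] for row in grid]
```

For example, `zonal_sum(raster, zones)` adds up the cells of zone "a" (1, 2 and 4), zone "b" (3, 5 and 6) and zone "c" (7, 8 and 9) separately.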

Spatial Analysis of Vector Data: Spatial analysis is a fundamental component of a
GIS that allows for an in-depth study of the topological and geometric properties of a
dataset or datasets. An analysis technique enables us to analyze and manipulate the
spatial attributes of a vector feature dataset.

Buffering
Buffers are common vector analysis tools used to address questions of proximity in a
GIS and can be used on points, lines, or polygons
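
A proximity question of the kind a buffer answers can be sketched without building the buffer polygon: a location lies inside the buffer of a point or line exactly when its distance to that feature is no larger than the buffer width. The function names below are assumptions for this example, and coordinates are assumed to be planar (for instance metres in a projected coordinate system).

```python
import math


def distance_point_to_segment(p, a, b):
    """Shortest planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Parameter of the perpendicular foot, clamped onto the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_point_buffer(p, center, radius):
    """True if p lies within the circular buffer around a point feature."""
    return math.hypot(p[0] - center[0], p[1] - center[1]) <= radius


def in_line_buffer(p, a, b, width):
    """True if p lies within the buffer of the given width around segment a-b."""
    return distance_point_to_segment(p, a, b) <= width
```

For instance, `in_line_buffer((5, 3), (0, 0), (10, 0), 5)` asks whether a location 3 units away from a road segment falls inside a 5-unit buffer of that road.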

Geo-processing Operations
Geo-processing is a broad term in GIS. The term can (and should)
be widely applied to any attempt to manipulate GIS data. The following represents the
most common geo-processing tools.
The dissolve operation combines adjacent polygon features in a single feature
dataset based on a single predetermined attribute.
The append operation creates an output polygon layer by combining the
spatial extent of two or more layers.
The select operation creates an output layer based on a user-defined query that
selects particular features from the input layer. The output layer contains only
those features that are selected during the query.
The merge operation combines features within a point, line, or polygon layer
into a single feature with identical attribute information. Often, the original
features will have different values for a given attribute. In this case, the first
attribute encountered is carried over into the attribute table, and the remaining
attributes are lost.


[Figure: geo-processing operations, panel (d) Merge]

4. Spatial data presentation

In data presentation, great care has to be taken because the quality
level of different datasets has to be consistent, and errors should not be
introduced during data presentation.
The presentation of spatial data, whether in print or on-screen, in maps or in
other forms, is closely related to the disciplines of
cartography, printing and publishing. The presentation may either be an
end-product, for example a printed atlas, or an intermediate product, as
in spatial data made available through the internet.

2.4 DATABASE MANAGEMENT SYSTEM

A database consists of an organized collection of data for one or more uses,
typically in digital form. A database management system (DBMS) consists of
software that operates databases, providing storage, access, security, backup
and other facilities.
Designing a database is not an easy task. First of all, one has to consider
carefully what the database purpose is, and who its users will be. Secondly,
one needs to identify the available data sources and define the format in
which the data will be organized within the database. This format is
usually called the database structure.
Lastly, data needs to be entered into the database in a consistent way. It is
important to keep the data up-to-date, and it is therefore wise to set up the
processes for this, and make someone responsible for regular maintenance of
the database.

2.4.1 REASONING FOR USING DBMS

There are various reasons for using a DBMS for data storage and processing.
DBMS supports the storage and manipulation of very large data sets
Some data sets are so big that storing them in text files or spreadsheet files
becomes too awkward for use in practice. The result may be that finding
simple facts takes minutes, and performing simple calculations perhaps
even hours. A DBMS is specifically designed for doing this efficiently.
DBMS supports the concurrent use of the same data set by many users
Many people are usually involved in the data collection, maintenance and
processing. These data sets are often considered to be of a high strategic
value to the owner(s). Moreover, for different users of the database,
different views on the data can be defined.
In this way, users will be under the impression that they operate on their
personal database, and not on one shared by many people. They may all be
using the database at the same time without affecting each other's
activities. This DBMS function is called concurrency control.

DBMS supports the use of a data model
A data model is a language with which one can define a database structure
and manipulate the data stored in it. The most prominent data model is the
relational data model.
Its primitives are tuples (also known as records, or rows) with attribute
values, and relations, being sets of similarly formed tuples.
DBMS provides a high-level, declarative query language
In a database, a query is a computer program that extracts from the database
the data that meets some specified conditions. A query language can be
used to define queries and updates.
DBMS can be instructed to guard over data correctness
We can define rules that ensure that the data entered into the database does not
contain errors. Such rules are generally known as integrity constraints. More
complex integrity constraints are certainly possible, and their definition is
part of the design of a database.

DBMS includes data backup and recovery functions and the control of data
redundancy
Regular back-ups of the dataset and automatic recovery schemes provide an
insurance against loss of data. Storing a fact multiple times, a phenomenon
known as data redundancy, can lead to situations in which stored facts may
contradict each other, causing reduced usefulness of the data. A well-
designed database takes care to store single facts only once.
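
How a DBMS guards data correctness can be sketched with Python's built-in `sqlite3` module. The `Parcel` table with the attributes `Pid`, `Location` and `AreaSize` mirrors the relation used in the query examples of these notes; the sample rows and the helper `try_insert` are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integrity constraints are declared as part of the database design:
# a unique key, mandatory values, and a rule that areas must be positive.
conn.execute("""
    CREATE TABLE Parcel (
        Pid      INTEGER PRIMARY KEY,
        Location TEXT NOT NULL,
        AreaSize REAL NOT NULL CHECK (AreaSize > 0)
    )
""")
conn.execute("INSERT INTO Parcel VALUES (1, 'North', 1200.0)")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Parcel VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


rejected_negative = not try_insert((2, "South", -50.0))   # violates the CHECK rule
rejected_duplicate = not try_insert((1, "East", 300.0))   # key 1 is already stored
accepted_valid = try_insert((3, "West", 800.0))
row_count = conn.execute("SELECT COUNT(*) FROM Parcel").fetchone()[0]
```

The key constraint also limits one form of redundancy: the same parcel cannot be stored twice under the same identifier.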

2.4.2 ALTERNATIVES FOR DATA MANAGEMENT

The decision to use a DBMS will depend, among other things, on how
much data there is, what type of use will be made of it, and how many users
might be involved. The alternative to a DBMS is the traditional
file-based approach.
Data is stored in flat files as records. Records consist of various fields
which are delimited by a space, comma, any special character etc. End of
records and end of files will be marked using any predetermined character
set or special characters in order to identify them. Following are some
problems associated with traditional approaches.
Data Security
Data Redundancy
Data Isolation

Relational Data-Model
A data model is a language that allows the definition of:
The structures that will be used to store the base data,
The integrity constraints that the stored data has to obey at all moments
in time, and
The computer programs used to manipulate the data.
In the relational data model, a database is viewed as a collection of
relations, commonly also known as tables. A table or relation is itself a
collection of tuples (or records). An attribute is a named field of a tuple,
with which each tuple associates a value, the tuple attribute value. A key of
a relation comprises one or more attributes. A value for these attributes
uniquely identifies a tuple.

The definition of relation schemas is an important part of database design.
The relation schemas together make up the database schema. When a
relation is created, we need to indicate what type of tuples it will store
(provide a name for the relation, indicate which attributes it will have, and
set the domain of each attribute).


Querying a relational database:

The three most elementary query operators used for querying a relational database are:
The first query operator is tuple selection: it works like a filter, allowing
tuples that meet the selection condition to pass and disallowing tuples that do
not meet the condition.
The second query operator is attribute projection: it works like a tuple
formatter, passing through all tuples of the input but reshaping each of
them in the same way. The output relation of this operator has as its schema
only the list of attributes given; the operator projects onto these attributes.
The third query operator is the join: it takes two input relations and combines
their tuples according to a join condition.

The most common way of defining queries in a relational database is through the
SQL language. SQL stands for Structured Query Language.
Examples:
SELECT * FROM Parcel WHERE AreaSize > 1000 [tuple selection
from the Parcel relation, using the condition AreaSize > 1000. The * indicates that we
want to extract all attributes of the input relation].
SELECT Pid, Location FROM Parcel [attribute projection from the Parcel relation.
The SELECT-clause indicates that we only want to extract the two attributes Pid and
Location. There is no WHERE-clause in this query].
In a join query, the FROM-clause identifies the two input relations and the
WHERE-clause states the join condition.
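
The query operators can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the `Parcel` relation and its attributes follow the examples above, while the `Owner` relation, its attributes and all sample rows are hypothetical and exist only to make a join possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Parcel (Pid INTEGER PRIMARY KEY, Location TEXT, AreaSize REAL);
    -- Owner is a hypothetical second relation, used only for the join example.
    CREATE TABLE Owner (OwnerName TEXT, Pid INTEGER REFERENCES Parcel(Pid));
    INSERT INTO Parcel VALUES (1, 'North', 1500), (2, 'South', 800), (3, 'East', 2500);
    INSERT INTO Owner VALUES ('Asha', 1), ('Ravi', 3);
""")

# Tuple selection: keep only the tuples that satisfy the condition.
selection = conn.execute(
    "SELECT * FROM Parcel WHERE AreaSize > 1000 ORDER BY Pid"
).fetchall()

# Attribute projection: keep all tuples, but only two attributes of each.
projection = conn.execute(
    "SELECT Pid, Location FROM Parcel ORDER BY Pid"
).fetchall()

# Join: two input relations in FROM, the join condition in WHERE.
joined = conn.execute(
    "SELECT Owner.OwnerName, Parcel.Location "
    "FROM Owner, Parcel "
    "WHERE Owner.Pid = Parcel.Pid "
    "ORDER BY Owner.OwnerName"
).fetchall()
```

The `ORDER BY` clauses are added only so that the results come back in a predictable order; a relation itself has no inherent tuple order.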

2.5 GIS AND SPATIAL DATABASES

GIS software provides support for spatial data and thematic or attribute data.
GISs have traditionally stored spatial data and attribute data separately. This
required the GIS to provide a link between the spatial data (represented with
rasters or vectors), and their non-spatial attribute data.


The strength of GIS technology lies in its built-in understanding of
geographic space and all functions that derive from this, for purposes such as
storage, analysis, and map production. GIS packages themselves can store
tabular data; however, they do not always provide a full-fledged query
language to operate on the tables.

2.5.1 LINKING GIS AND DBMS

GIS and databases interact to enhance individual strengths and minimize their
weaknesses.

GIS
GIS software packages provide support for both spatial and attribute data. They
accommodate spatial data storage using the vector and raster approaches and
attribute data using tables.
The strength of GIS lies in its built-in understanding of geographic space and all
functions that derive from this, for purposes such as storage, analysis, and map
production. GIS packages themselves can store tabular data. However, they do not
always provide a full-fledged query language to operate on the tables.
A GIS has spatial data represented with rasters or vectors, and the attribute data
stored in an external DBMS.
With raster representations, each raster cell stores a characteristic value. This
value can be used to look up attribute data in an accompanying database table.
With vector representations, our spatial objects (whether they are points, lines or
polygons) are automatically given a unique identifier by the system.

DBMS
Database management systems (DBMS) have been based on the notion of tables for data
storage.
DBMSs have a long tradition in handling attribute (i.e., administrative,
non-spatial, tabular, thematic) data in a secure way, for multiple users at the same
time. DBMSs arguably offer much better table functionality, since they are
specifically designed for this purpose. A lot of the data in GIS applications is
attribute data, so it made sense to use a DBMS for it.
The DBMS serves as a centralized data repository for all users, while each user
runs her/his own GIS software, obtaining its data from the DBMS.
The unique identifier given to each vector object is usually just called the
objectID or featureID and is used to link the spatial object (as represented in
vectors) with its attribute data.
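
The two linking mechanisms can be sketched with plain Python dictionaries standing in for the attribute tables of a DBMS. All names and values here (`attribute_table`, `land_cover_codes`, the identifiers 101 and 102, and so on) are made up for this illustration.

```python
# Attribute table as it might be held in an external DBMS, keyed by featureID.
attribute_table = {
    101: {"name": "Parcel A", "land_use": "residential"},
    102: {"name": "Parcel B", "land_use": "forest"},
}

# Vector side: each spatial object carries a unique identifier and a geometry.
features = [
    {"featureID": 101, "geometry": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    {"featureID": 102, "geometry": [(10, 0), (20, 0), (20, 10), (10, 10)]},
]


def attributes_of(feature):
    """Follow the featureID from the spatial object to its attribute record."""
    return attribute_table[feature["featureID"]]


# Raster side: each cell stores a characteristic value that serves as a lookup key.
land_cover_codes = {1: "water", 2: "forest", 3: "urban"}
land_cover_raster = [
    [1, 1, 2],
    [2, 3, 3],
]


def cell_attribute(row, col):
    """Look up the attribute that belongs to the value stored in one cell."""
    return land_cover_codes[land_cover_raster[row][col]]
```

In both cases the GIS holds the geometry and only a key, while the descriptive data stays in the table that the key points to.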

2.5.2 SPATIAL DATABASE FUNCTIONALITY

A spatial database is a database that is optimized for storing and
querying data that represents objects defined in a geometric space.
Most spatial databases allow the representation of simple geometric
objects such as points, lines and polygons.
Some spatial databases handle more complex structures such as 3D objects,
topological coverages, linear networks, and TINs.
While typical databases have developed to manage various numeric and
character types of data, such databases require additional functionality to
process spatial data types efficiently, and developers have often added
geometry or feature data types together with spatial relationship operators
(crosses, touches, disjoint, equal, intersect, within, contain, overlap).
The architecture of a spatial database differs from a standard RDBMS not
only because it can handle geometry data and manage projections, but also
for a larger set of commands that extend standard SQL language (e.g.
distance calculations, buffers, overlay, conversion between coordinate
systems, etc).
Querying a spatial database
A spatial DBMS provides support for geographic coordinate systems and
transformations. It also provides storage of the relationships between features,
including the creation and storage of topological relationships. As a result, one is
able to pose spatial queries such as:
SELECT R.Name
FROM Restaurants AS R,
Hotels AS H
WHERE ST_Intersects(R.Geometry, ST_Buffer(H.Geometry, 2000))

In this case the WHERE clause uses the ST_Intersects function to perform a spatial
join between a 2000 m buffer of the selected hotel and the selected subset of
restaurants. The Geometry column carries the spatial data.
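
The logic of this spatial join can be imitated in plain Python for the simple case where hotels and restaurants are points. A point intersects a 2000 m buffer around another point exactly when the two are at most 2000 m apart. The names and coordinates below are invented, and the coordinates are assumed to be in metres in a projected coordinate system; a real spatial DBMS would evaluate this on its geometry column, usually with the help of a spatial index.

```python
import math

# Hypothetical sample data: name -> (x, y) in metres.
hotels = {"Hotel A": (1000.0, 1000.0)}
restaurants = {
    "Cafe One": (1500.0, 1200.0),
    "Diner Two": (4000.0, 1000.0),
    "Bistro Three": (1000.0, 2900.0),
}


def restaurants_near(hotel_xy, restaurant_points, radius_m=2000.0):
    """Names of the restaurants that fall inside the hotel's buffer."""
    hx, hy = hotel_xy
    return sorted(
        name
        for name, (x, y) in restaurant_points.items()
        if math.hypot(x - hx, y - hy) <= radius_m
    )


nearby = restaurants_near(hotels["Hotel A"], restaurants)
```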
Features of Spatial Databases:
Spatial Measurements: Compute line length, polygon area, the distance
between geometries, etc.
Spatial Functions: Modify existing features to create new ones, for
example by providing a buffer around them, intersecting features, etc.
Spatial Predicates: Allow true/false queries about spatial relationships
between geometries.
Geometry Constructors: Create new geometries, usually by specifying
the vertices (points or nodes) which define the shape.
Observer Functions: Queries which return specific information about a
feature, such as the location of the center of a circle.
Spatial Index: Spatial databases use a spatial index to speed up database
operations.
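
Two of the spatial measurements listed above, line length and polygon area, can be sketched in a few lines for planar coordinates. The function names are assumptions for this example; the area uses the well-known shoelace formula, which is one common way to compute it and not necessarily what a given spatial database does internally.

```python
import math


def line_length(vertices):
    """Length of a polyline given as a list of (x, y) vertices."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(vertices, vertices[1:])
    )


def polygon_area(vertices):
    """Area of a simple polygon (shoelace formula); the ring need not be closed."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


road_length = line_length([(0, 0), (3, 4), (3, 10)])
parcel_area = polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)])
```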

Sample Questions

1.
2. Write a short note on GIS architecture and functionality.
3. Explain SDI in detail.
4. What do you mean by spatial data? Explain stages involved in spatial data
handling.
5. How are databases linked in GIS? Explain reasons for using a database.
6. What are the possible alternatives for data management in GIS?
7. Explain relational database with an appropriate example.
8. Explain how GIS and DBMS can be linked together.
9. What do you mean by spatial databases? Explain spatial database
functionality in GIS.

Geographic Information Systems Unit 3

Unit-III
Spatial Referencing and Positioning

Contents

3.1. Spatial Referencing and Positioning

3.2. Data Entry and Preparation
3.2.1 Spatial Data Input
3.2.2 Data Quality
3.2.3 Data Preparation
3.2.4 Point Data Transformation

3.1. Spatial Referencing and Positioning

3.1.1. Spatial Referencing:

A spatial reference is a series of parameters that define the coordinate system and
other spatial properties for each dataset in the geodatabase. It is typical that all
datasets for the same area (and in the same geodatabase) use a common spatial
reference definition.
A spatial reference includes the following:
The coordinate system
The coordinate precision with which coordinates are stored (often referred to as the
coordinate resolution)
Processing tolerances (such as the cluster tolerance)
The spatial extent covered by the dataset (often referred to as the spatial domain)

3.1.1.1. Geographic coordinate systems

A geographic coordinate system (GCS) uses a three-dimensional spherical
surface to define locations on the earth. A GCS is often incorrectly called a
datum, but a datum is only one part of a GCS. A GCS includes an angular unit
of measure, a prime meridian, and a datum (based on a spheroid). The spheroid
defines the size and shape of the earth model, while the datum connects the
spheroid to the earth's surface.

A point is referenced by its longitude and latitude values. Longitude and
latitude are angles measured from the earth's center to a point on the earth's
surface. The angles often are measured in degrees (or in grads). The following
illustration shows the world as a globe with longitude and latitude values:

In the spherical system, horizontal lines, or east-west lines, are lines of equal
latitude, or parallels. Vertical lines, or north-south lines, are lines of equal
longitude, or meridians. These lines encompass the globe and form a gridded
network called a graticule.

The line of latitude midway between the poles is called the equator. It defines
the line of zero latitude. The line of zero longitude is called the prime meridian.
For most GCSs, the prime meridian is the longitude that passes through
Greenwich, England. The origin of the graticule (0,0) is defined by where the
equator and prime meridian intersect.
Latitude and longitude values are traditionally measured either in decimal
degrees or in degrees, minutes, and seconds (DMS). Latitude values are
measured relative to the equator and range from -90° at the south pole to +90°
at the north pole. Longitude values are measured relative to the prime meridian.
They range from -180° when traveling west to +180° when traveling east. If the
prime meridian is at Greenwich, then Australia, which is south of the equator
and east of Greenwich, has positive longitude values and negative latitude
values.
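
The relationship between DMS and decimal degrees, and the sign convention for hemispheres, can be sketched as follows. The function names and the sample coordinates (an arbitrary location in eastern Australia) are assumptions made for this illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds to signed decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative.
    """
    value = abs(degrees) + minutes / 60 + seconds / 3600
    if hemisphere in ("S", "W"):
        value = -value
    return value


def is_valid_lat_lon(lat, lon):
    """Check that latitude and longitude lie within their defined ranges."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


# A location south of the equator and east of Greenwich:
lat = dms_to_decimal(33, 52, 0, "S")    # negative latitude
lon = dms_to_decimal(151, 12, 36, "E")  # positive longitude
```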

It may be helpful to equate longitude values with x and latitude values with y.
Data defined on a geographic coordinate system is displayed as if a degree is a
linear unit of measure. This method is basically the same as the Plate Carrée
projection. A physical location will usually have different coordinate values in
different geographic coordinate systems.

3.1.1.2. Geographic (datum) transformations

If two datasets are not referenced to the same geographic coordinate system,
you may need to perform a geographic (datum) transformation. This is a well-
defined mathematical method to convert coordinates between two geographic
coordinate systems. As with the coordinate systems, there are several hundred
predefined geographic transformations that you can access. It is very important
to correctly use a geographic transformation if it is required. When neglected,
coordinates can be in the wrong location by up to a few hundred meters.

Sometimes no transformation exists, or you have to use a third GCS like the
World Geodetic System 1984 (WGS84) and combine two transformations.

3.1.1.3. Projected coordinate systems

A projected coordinate system (PCS) is defined on a flat, two-dimensional
surface. Unlike a GCS, a PCS has constant lengths, angles, and areas across the
two dimensions. A PCS is always based on a GCS that is based on a sphere or
spheroid. In addition to the GCS, a PCS includes a map projection, a set of
projection parameters that customize the map projection for a particular
location, and a linear unit of measure.

3.1.2.1 Coordinate system

Coordinate systems enable geographic datasets to use common locations for
integration. A coordinate system is a reference system used to represent the
locations of geographic features, imagery, and observations, such as Global
Positioning System (GPS) locations, within a common geographic framework.
Each coordinate system is defined by the following:
Its measurement framework, which is either geographic (in which spherical
coordinates are measured from the earth's center) or planimetric (in which the
earth's coordinates are projected onto a two-dimensional planar surface)
Units of measurement (typically feet or meters for projected coordinate systems or
decimal degrees for latitude-longitude)
The definition of the map projection for projected coordinate systems
Other measurement system properties such as a spheroid of reference, a datum,
one or more standard parallels, a central meridian, and possible shifts in the x- and
y-directions

Several hundred geographic coordinate systems and a few thousand projected
coordinate systems are available for use. In addition, you can define a custom
coordinate system.

3.1.2.2 Map projections

A map projection is a mathematically described technique for representing the
Earth's curved surface on a flat map. To represent the Earth on a flat paper map
or on a computer screen, the curved horizontal reference surface must be mapped
onto the 2D mapping plane.
Whether you treat the earth as a sphere or a spheroid, you must transform its
three-dimensional surface to create a flat map sheet. This mathematical
transformation is commonly referred to as a map projection. One easy way to
understand how map projections alter spatial properties is to visualize shining a
light through the earth onto a surface, called the projection surface. Imagine the
earth's surface is clear with the graticule drawn on it. Wrap a piece of paper
around the earth. A light at the center of the earth will cast the shadows of the
graticule onto the piece of paper. You can now unwrap the paper and lay it flat.
The shape of the graticule on the flat paper is different from that on the earth.
The map projection has distorted the graticule.
A spheroid cannot be flattened to a plane any more easily than a piece of
orange peel can be flattened; it will tear. Representing the earth's surface in
two dimensions causes distortion in the shape, area, distance, or direction of the
data.

A map projection uses mathematical formulas to relate spherical coordinates on
the globe to flat, planar coordinates.
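
Two simple projection formulas illustrate this relationship. The sketch below implements the Plate Carrée projection (x = R·λ, y = R·φ) and the spherical Mercator projection (x = R·λ, y = R·ln tan(π/4 + φ/2)), with λ the longitude and φ the latitude in radians. It treats the earth as a sphere with an assumed radius of 6,371,000 m; real GIS software typically works with a spheroid and additional projection parameters.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius of a spherical earth model


def plate_carree(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Plate Carrée: longitude and latitude are scaled linearly to x and y."""
    return radius * math.radians(lon_deg), radius * math.radians(lat_deg)


def mercator(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Spherical Mercator: y is stretched increasingly towards the poles."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    return radius * lam, radius * math.log(math.tan(math.pi / 4 + phi / 2))


x_pc, y_pc = plate_carree(10.0, 60.0)
x_merc, y_merc = mercator(10.0, 60.0)
```

Both projections give the same x for a given longitude, but at 60° latitude the Mercator y-value is noticeably larger than the Plate Carrée one, a small demonstration of how different projections distort the graticule differently.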

Projection parameters

Each map projection has a set of parameters that you must define. The parameters
specify the origin and customize a projection for your area of interest.
Linear parameters
False easting is a linear value applied to the origin of the x-coordinates.
False northing is a linear value applied to the origin of the y-coordinates. False
easting and northing values are usually applied to ensure that all x- and y-values
are positive. You can also use the false easting and northing parameters
to reduce the range of the x- or y-coordinate values.
For example, if you know all y-values are greater than 5,000,000 meters, you could
apply a false northing of -5,000,000.
Height defines the point of perspective above
the surface of the sphere or spheroid for the Vertical Near-Side Perspective projection.
Angular parameters
Azimuth defines the centerline of a projection. The rotation angle measures east
from north. It is used with the azimuth cases of the Hotine Oblique Mercator
projection.
Central meridian defines the origin of the x-coordinates.


Longitude of origin defines the origin of the x-coordinates. The central meridian
and longitude of origin parameters are synonymous.
Central parallel defines the origin of the y-coordinates.

Classification of map projections

Map projections can be described in terms of their:

i. class (cylindrical, conical or azimuthal),
ii. Point of secancy (tangent or secant),
iii. aspect (normal, transverse or oblique), and
iv. distortion property (equivalent, equidistant or conformal).

i. CLASS: The three classes of map projections are cylindrical, conical and
azimuthal. The Earth's reference surface projected on a map wrapped around
the globe as a cylinder produces a cylindrical map projection. Projected on a
map formed into a cone gives a conical map projection. When projected
directly onto the mapping plane it produces an azimuthal (or zenithal or
planar) map projection. The figure below shows the surfaces involved in these
three classes of projections.

ii. Point of secancy: The planar, conical, and cylindrical surfaces in the figure above
are all tangent surfaces; they touch the horizontal reference surface in one point
(plane) or along a closed line (cone and cylinder) only. Another class of projections is
obtained if the surfaces are chosen to be secant to (to intersect with) the horizontal
reference surface; illustrations are in the figure below. Then, the reference surface is
intersected along one closed line (plane) or two closed lines (cone and cylinder).
Secant map surfaces are used to reduce or average scale errors because the line(s) of
intersection are not distorted on the map (section 4.3 scale distortions on a map).


Three secant projection classes.

A method to calculate the lines of intersection in a normal conical or
cylindrical projection (i.e. standard parallels) could be by determining the range
in latitude, in degrees, from north to south and dividing this range by six. The
one-sixth rule places the first standard parallel at one-sixth of the range above the
southern boundary and the second standard parallel at one-sixth of the range
below the northern limit (figure below). There are other possible approaches.

A conical projection with a secant projection plane. The lines of intersection
(standard parallels) are selected at one-sixth below and above the limit of the
mapping area.
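
The one-sixth rule is easy to express as a small calculation. The function name and the sample latitudes (a mapping area from 30° to 60° north) are assumptions chosen for this illustration.

```python
def standard_parallels(south_lat, north_lat):
    """Return the two standard parallels suggested by the one-sixth rule.

    south_lat and north_lat are the latitude limits of the mapping area
    in degrees, with south_lat < north_lat.
    """
    sixth = (north_lat - south_lat) / 6
    return south_lat + sixth, north_lat - sixth


first_parallel, second_parallel = standard_parallels(30.0, 60.0)
```

For a mapping area between 30° and 60° latitude the range is 30°, one-sixth of it is 5°, and the standard parallels therefore fall at 35° and 55°.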

iii. aspect: Projections can also be described in terms of the direction of the projection
plane's orientation (whether cylinder, plane or cone) with respect to the globe. This is
called the aspect of a map projection. The three possible aspects are normal,
transverse and oblique. In a normal projection, the main orientation of the projection
surface is parallel to the Earth's axis (as in the figures above for the cylinder and the
cone). A transverse projection has its main orientation perpendicular to the Earth's
axis. Oblique projections are all other, non-parallel and non-perpendicular, cases. The
figure below provides two examples.


A transverse and an oblique map projection.

The terms polar and equatorial are also used. In a polar azimuthal projection the
projection surface is tangent or secant at the pole. In an equatorial azimuthal or
equatorial cylindrical projection, the projection surface is tangent or secant at the
equator.
iv. distortion property: So far, we have not specified how the Earth's reference
surface is projected onto the plane, cone or cylinder. How this is done determines
which kind of distortion properties the map will have compared to the original curved
reference surface. The distortion properties of a map are typically classified according
to what is not distorted on the map:
- In a conformal (orthomorphic) map projection the angles between lines in
the map are identical to the angles between the original lines on the curved
reference surface. This means that angles (with short sides) and shapes (of
small areas) are shown correctly on the map.
- In an equal-area (equivalent) map projection the areas in the map are
identical to the areas on the curved reference surface (taking into account
the map scale), which means that areas are represented correctly on the
map.
- In an equidistant map projection the lengths of particular lines in the map
are the same as the lengths of the original lines on the curved reference
surface (taking into account the map scale).

A particular map projection can have any one of these three properties. No map
projection can be both conformal and equal-area. A projection can only be equidistant
(true to scale) at certain places or in certain directions.
Map and GIS users are mostly confronted in their work with transformations from one
two-dimensional coordinate system to another. This includes the transformation of
polar coordinates delivered by the surveyor into Cartesian map coordinates (section
2.5) or the transformation from one 2D Cartesian (x,y) system of a specific map
projection into another 2D Cartesian (x,y) system of a defined map projection.
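
As an illustration of the first kind of transformation, the sketch below converts a surveyor's polar observation (distance and bearing from a known station) into Cartesian map coordinates. It assumes the bearing is measured clockwise from grid north (the y axis); that convention, the function name and the sample numbers are assumptions made for this example.

```python
import math


def polar_to_cartesian(distance, bearing_deg, x0=0.0, y0=0.0):
    """Convert a polar observation to Cartesian (x, y) coordinates.

    distance    -- horizontal distance from the station, in map units
    bearing_deg -- bearing in degrees, clockwise from grid north (assumed)
    x0, y0      -- Cartesian coordinates of the station
    """
    bearing = math.radians(bearing_deg)
    x = x0 + distance * math.sin(bearing)
    y = y0 + distance * math.cos(bearing)
    return x, y


# A point 100 m due east of a station at (1000, 2000).
x, y = polar_to_cartesian(100.0, 90.0, x0=1000.0, y0=2000.0)
print(round(x, 3), round(y, 3))
```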


Integration of spatial data into one common coordinate system.

Datum transformations are also important, usually for mapping purposes at large and
medium scale. For example, map and GIS users often collect spatial data in the
field using satellite navigation technology and need to represent this data on published
maps based on a local horizontal datum.

3.1.2.Satellite-based Positioning

GPS Segments
The Global Positioning System basically consists of three segments: Space Segment,
Control Segment and User Segment.
i. Space Segment: (the satellites that orbit the Earth and the radio signals that they emit)
The Space Segment contains 24 satellites, in 12-hour near-circular orbits at an altitude of
about 20,000 km, with an orbital inclination of 55°. The constellation ensures at least 4
satellites in view from any point on the earth at any time for 3-D positioning and
navigation on world-wide basis. The three axis controlled, earth-pointing satellites
continuously transmit navigation and system data comprising predicted satellite
ephemeris, clock error etc.
ii. Control Segment: (the ground stations that monitor and maintain the space
segment)
This has a Master Control Station (MCS), a few Monitor Stations (MSs) and an UpLoad
Station (ULS). The MSs are transportable shelters with receivers and computers,
which passively track satellites, accumulating ranging data from navigation signals.
This is transferred to MCS for processing by computer, to provide best estimates of
satellite position, velocity and clock drift relative to system time. The data thus
processed generates refined information of gravity field influencing the satellite
motion, solar pressure parameters, position, clock bias and electronic delay
characteristics of ground stations and other observable system influences. Future
navigation messages are generated from this and loaded into satellite memory once a
day via ULS which has a parabolic antenna, a transmitter and a computer.


Thus, the role of the Control Segment is:

- To estimate satellite [space vehicle (SV)] ephemerides and atomic clock behavior.
- To predict SV positions and clock drifts.
- To upload this data to SVs.
iii. User Segment: (the users, with the hardware and software they need to conduct positioning)
The user equipment consists of an antenna, a receiver, a data-processor with software
and a control/display unit. The GPS receiver measures the pseudorange, phase and
other data using navigation signals from a minimum of four satellites and computes the 3-D
position, velocity and system time. Corrections such as the delay due to ionospheric and
tropospheric refraction, clock errors, etc. are also computed and applied by the user
equipment / processing software.

Principle of positioning:

There are two general operating modes from which GPS-derived positions can be
obtained: absolute and relative (or differential) positioning. Within each of these two
modes, range measurements to the satellites can be performed by tracking either the
phase of the satellite's carrier signal or the ranging code modulated on it.

3.1.2.1.Methods of Observations:

The different methods of observation with GPS include absolute positioning, relative
positioning in translocation mode, relative positioning using the differential GPS
technique, and the kinematic GPS surveying technique.


1. Absolute Positioning:

The working principles of absolute, satellite-based positioning are fairly simple:

1. A satellite, equipped with a clock, at a specific moment sends a radio message that
includes
- the satellite identifier,
- its position in orbit, and
- its clock reading.

2. A receiver on or above the planet, also equipped with a clock, receives the
message slightly later, and reads its own clock.
3. From the time delay observed between the two clock readings, and knowing
the speed of radio transmission through the medium between (satellite) sender
and receiver, the receiver can compute the distance to the sender, also known
as the pseudorange.
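
The distance computation in the last step is a single multiplication. The sketch below assumes the signal travels at the vacuum speed of light (atmospheric delay is ignored), and the clock readings are made-up values for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in vacuum (assumed medium)


def pseudorange(satellite_clock_s, receiver_clock_s):
    """Distance implied by the delay between sending and receiving, in metres."""
    return SPEED_OF_LIGHT * (receiver_clock_s - satellite_clock_s)


# A delay of 70 ms corresponds to roughly 21,000 km.
print(pseudorange(0.0, 0.070) / 1000.0, "km")

# A clock that is off by one microsecond shifts the result by about 300 m.
clock_error_effect_m = pseudorange(0.0, 1e-6)
print(clock_error_effect_m, "m")
```

Because the receiver clock is usually far less stable than the satellite clocks, the computed distance is contaminated by the receiver clock error, which is why it is called a pseudorange rather than a range.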

In the absolute positioning mode, the absolute coordinates of the antenna
position (centred over the survey station) are determined using a single GPS
receiver, by a method similar to the resection method used in plane tabling.
The pseudoranges (the satellite-antenna ranges, contaminated by the receiver
clock bias) from a minimum of four satellites are observed at the given epoch, from
which the four unknown parameters, the 3-D position of the antenna (x, y, z)
and the receiver clock error, can be determined.
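
To make the "four pseudoranges, four unknowns" idea concrete, here is a minimal sketch that solves for (x, y, z) and the receiver clock bias by iterative linearization (Gauss-Newton). It is one common way of solving these equations, not necessarily the method used in any particular receiver. The satellite coordinates, the receiver position and the clock bias are invented test values, and the clock bias is expressed in metres (bias in seconds times the speed of light).

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def solve_position(sats, pseudoranges, iterations=10):
    """Estimate antenna position and clock bias from exactly four pseudoranges."""
    x = y = z = b = 0.0  # start at the Earth's centre with zero clock bias
    for _ in range(iterations):
        jacobian, residuals = [], []
        for (sx, sy, sz), rho in zip(sats, pseudoranges):
            d = math.dist((sx, sy, sz), (x, y, z))
            # Partial derivatives of (range + bias) with respect to x, y, z, b.
            jacobian.append([(x - sx) / d, (y - sy) / d, (z - sz) / d, 1.0])
            residuals.append(rho - (d + b))
        dx, dy, dz, db = solve_linear(jacobian, residuals)
        x, y, z, b = x + dx, y + dy, z + dz, b + db
    return (x, y, z), b


# Invented geometry: four satellites (metres, Earth-centred axes) and a receiver.
sats = [
    (26_571_000.0, 0.0, 0.0),
    (18_000_000.0, 19_000_000.0, 4_000_000.0),
    (17_000_000.0, -8_000_000.0, 18_000_000.0),
    (15_000_000.0, 5_000_000.0, -21_000_000.0),
]
truth = (6_371_000.0, 0.0, 0.0)
clock_bias_m = 30_000.0
rhos = [math.dist(s, truth) + clock_bias_m for s in sats]

position, bias = solve_position(sats, rhos)
print([round(v, 3) for v in position], round(bias, 3))
```

With more than four satellites in view, the same linearization would be solved by least squares instead of an exact 4 x 4 system.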
The accuracy of the position obtained from this method depends upon the
accuracy of the time and position messages received from the satellites. With
selective availability (SA) operational, the accuracy of absolute positioning in
real time was limited to about 100 meters; this has improved to about
10 to 20 meters since SA was switched off.
This can be further improved to the level of a few centimeters by using post-processed
satellite orbit information in the post-processing mode. The accuracy of
absolute positioning with GPS is limited mainly due to the high orbit of the
satellites. However, very few applications require absolute position in real
time.

3.1.2.2.Errors in absolute positioning:

Errors related to the space segment

As a first source of error, the operators of the control segment may
intentionally deteriorate radio signals of the satellites to the general public, to
avoid optimal use of the system by the enemy, for instance in times of global
political tension and war. This selective availability (meaning that the military
forces allied with the control segment will still have access to undisturbed
signals) may cause error that is an order of magnitude larger than all other
error sources combined.

Secondly, the satellite message may contain incorrect information. Assuming
that it will always know its own identifier, the satellite may make two kinds of
error:

Incorrect clock reading: Even atomic clocks can be off by a small margin, and since
Einstein, we know that travelling clocks are slower than resident clocks, due to a so-called
relativistic effect. If one understands that a clock that is off by 0.000001 sec
causes an error of approximately 300 m in the computed pseudorange, it
is clear that these satellite clocks require very strict monitoring.

Incorrect orbit position: The orbit of a satellite around our planet is easy to describe
mathematically if both bodies are considered point masses, but in real life they are not.
For the same reasons that the Geoid is not a simply shaped surface, the Earth's
gravitation field that a satellite experiences in orbit is not simple either. Moreover, it is
disturbed by solar and lunar gravitation, making its flight path slightly erratic and
difficult to forecast exactly.
Both types of error are strictly monitored by the ground control segment, which is
responsible for correcting any errors of this nature, but it does so by applying an
agreed upon tolerance. A control station can obviously compare results of positioning
computations such as those discussed above with its accurately known position, flagging any
satellite whose errors exceed the tolerance until the errors have been corrected and brought
to within the tolerance. This may be done by uploading a correction on the clock or orbit
settings to the satellite.

Errors related to the medium

Thirdly, the medium between sender and receiver may influence the
radio signals. The middle atmospheric layers of strato- and mesosphere are
relatively harmless and of little hindrance to radio waves, but this is not true of
the lower and upper layer. They are, respectively:

The troposphere: the lowest part of the atmosphere, which envelops the
phenomena that we call the weather. It is an obstacle that delays radio waves in
a rather variable way.

The ionosphere: the outermost part of the atmosphere, which starts at an
altitude of 90 km and holds many electrically charged atoms, thereby forming a
protection against various forms of radiation from space, including to some
extent radio waves. The degree of ionization shows a distinct night and day
rhythm, and also depends on solar activity.


The latter is a more severe source of delay to satellite signals, which means
that pseudoranges are estimated to be larger than they actually are. When satellites emit
radio signals at two or more frequencies, an estimate can be computed from
differences in delay incurred for signals of different frequency, and this will allow for
the correction of atmospheric delay, leading to a 10-50% improvement of accuracy. If
this is not the case, or if the receiver is capable of receiving just a single frequency, a
model should be applied to forecast the (especially ionospheric) delay, typically taking
into account the time of day and current latitude of the receiver.
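
One standard way to exploit two frequencies is the so-called ionosphere-free combination. It rests on an assumption that is not stated in these notes: to first order, the ionospheric delay is inversely proportional to the square of the signal frequency. Under that assumption the delay cancels when the two pseudoranges are combined as below. The frequencies are the GPS L1 and L2 values; the range and delay figures are invented.

```python
F_L1 = 1575.42e6  # GPS L1 carrier frequency in Hz
F_L2 = 1227.60e6  # GPS L2 carrier frequency in Hz


def ionosphere_free_range(p1, p2, f1=F_L1, f2=F_L2):
    """Combine pseudoranges measured on two frequencies (metres).

    Assumes the ionospheric delay scales with 1 / f**2 (first-order model),
    so that it cancels in the weighted difference.
    """
    return (f1**2 * p1 - f2**2 * p2) / (f1**2 - f2**2)


# Simulated measurement: a 5 m ionospheric delay on L1 implies a larger one on L2.
true_range = 22_000_000.0
delay_l1 = 5.0
delay_l2 = delay_l1 * (F_L1 / F_L2) ** 2
p1 = true_range + delay_l1
p2 = true_range + delay_l2

corrected = ionosphere_free_range(p1, p2)
print(round(delay_l2, 2), round(corrected - true_range, 6))
```

The combination removes only the frequency-dependent part of the delay; the tropospheric delay affects both frequencies equally and has to be modelled separately.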

Fourth in this list is the error occurring when a radio signal is received via two
or more paths between sender and receiver, some of which typically involve a
bounce off some nearby surface, like a building or rock face. The term
applied to this phenomenon is multi-path; when it occurs the multiple
receptions of the same signal may interfere with each other. Multi-path is an
error source that is difficult to avoid.

Figure: satellite visibility from a receiver; for some satellites, multi-path signal
reception may occur.

All of the above error sources have an influence on the computation of a
pseudorange. Taken together, they are known as the user equivalent
range error.

3.1.2.3. Relative positioning

In relative positioning, also known as differential positioning, one tries to
remove some of the systematic error sources by taking into account
measurements of these errors in a nearby stationary reference receiver with an
accurately known position. By using these systematic error findings at the
reference, the position of the target receiver of interest will become known
much more precisely.
In an optimal setting, reference and target receiver experience identical
conditions and are connected by a direct data link, allowing the target to
receive correctional data from the reference. In practice, relative positioning
allows reference and target receiver to be 70-200 km apart, and they will
essentially experience similar atmospheric signal error. For each satellite in
view, the reference receiver will determine its pseudorange error. After all, its
position is known with high accuracy, so it can solve any pseudorange
equations to determine the error. Subsequently, the target receiver, having
received the error characteristics, will apply the correction for each of the four
satellite signals that it uses for positioning.
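
The correction step can be sketched as follows. The reference receiver compares each measured pseudorange with the geometric distance computed from its known position, and the target subtracts that difference from its own measurement to the same satellite. Satellite identifiers, coordinates and measurements are invented for illustration.

```python
import math


def pseudorange_corrections(ref_position, sat_positions, ref_pseudoranges):
    """Per-satellite error observed at the reference: measured minus geometric range."""
    return {
        sat: ref_pseudoranges[sat] - math.dist(ref_position, position)
        for sat, position in sat_positions.items()
    }


def apply_corrections(target_pseudoranges, corrections):
    """Subtract the reference's error estimate from the target's measurements."""
    return {
        sat: rho - corrections[sat]
        for sat, rho in target_pseudoranges.items()
        if sat in corrections  # only satellites seen by both receivers
    }


# Invented example with one satellite: the reference sees 12 m of error.
ref_position = (0.0, 0.0, 0.0)
sat_positions = {"SV01": (3_000_000.0, 4_000_000.0, 0.0)}  # 5,000 km away
ref_measured = {"SV01": 5_000_012.0}
target_measured = {"SV01": 5_100_012.5}

corrections = pseudorange_corrections(ref_position, sat_positions, ref_measured)
corrected = apply_corrections(target_measured, corrections)
print(corrections["SV01"], corrected["SV01"])
```

Any offset common to all satellites, such as the reference receiver's own clock bias, ends up in the target's clock term and does not disturb the position.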

3.1.2.4. Network positioning

After discussing the advantages of relative positioning, we can move on to the
notion of network positioning: an integrated, systematic network of reference
receivers covering a large area like a continent or even the whole globe.
The organization of such a network can take different shapes, augmenting an
already existing satellite-based system. Here we discuss a general architecture,
consisting of a network of reference stations, strategically positioned in the
area to be covered, each of which is constantly monitoring signals and their
errors for all positioning satellites in view. One or more control centres receive
the reference station data, verify this for correctness, and relay (uplink) this
information to a geostationary satellite. The satellite will retransmit the
correctional data to the area that it covers, so that target receivers, using their
own approximate position, can determine how to correct for satellite signal
error, and consequently obtain much more accurate position fixes.
With network positioning, accuracy in the submetre range can be obtained.
Typically, advanced receivers are required, but the technology also lends itself
to solutions with a single advanced receiver that functions in the direct
neighbourhood as a reference receiver to simple ones.

3.1.2.5. Code versus Phase measurements

Up until this point, we have assumed that the receiver determines the range of a
satellite by measuring time delay on the received ranging code. There exists a
more advanced range determination technique known as carrier phase
measurement. This typically requires more advanced receiver technology, and
longer observation sessions. Carrier phase measurement can currently only be
used with relative positioning, as absolute positioning using this method is not
yet well developed.


The technique aims to determine the number of cycles of the (sine-shaped)
radio signal between sender and receiver. Each cycle corresponds to one
wavelength of the signal, which in the applied L-band frequencies is 19-24 cm.
Since this number of cycles cannot be directly measured, it is determined, in a
long observation session, from the change in carrier phase with time. The phase
changes because the satellite itself is moving along its orbit. From its orbit
parameters and the change in phase over time, the number of cycles can be derived.
With relative positioning techniques, a horizontal accuracy of 2 mm to 2 cm can be
achieved. This degree of accuracy makes it possible to measure tectonic plate
movements, which can be as big as 10 cm per year in some locations on the
planet.
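
The 19-24 cm figure follows directly from the carrier frequencies (wavelength = speed of light / frequency). The short check below uses the GPS L1 and L2 frequencies of 1575.42 MHz and 1227.60 MHz and assumes the vacuum speed of light.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in vacuum


def wavelength_cm(frequency_mhz):
    """Carrier wavelength in centimetres for a frequency given in MHz."""
    return SPEED_OF_LIGHT / (frequency_mhz * 1e6) * 100.0


l1_cm = wavelength_cm(1575.42)  # about 19.0 cm
l2_cm = wavelength_cm(1227.60)  # about 24.4 cm
print(round(l1_cm, 2), round(l2_cm, 2))
```

Since a receiver can resolve a small fraction of one cycle, a wavelength of about 20 cm is consistent with the millimetre-to-centimetre accuracy quoted above.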

3.1.2.6. Positioning technology:

At present, two satellite-based positioning systems are operational (GPS and
GLONASS), and a third is in the implementation phase (Galileo). Respectively, these
are American, Russian and European systems. Any of these, but especially GPS and
Galileo, will be improved over time, and will be augmented with new techniques.

1. GPS

The NAVSTAR Global Positioning System (GPS) was declared operational in
1994, providing Precise Positioning Services (PPS) to US and allied military
forces as well as US government agencies, and Standard Positioning Services
(SPS) to civilians throughout the world. Its space segment nominally consists
of 24 satellites, each of which orbits our planet in 11h58m at an altitude of
20,200 km. There can be any number of satellites active, typically between 21
and 27. The satellites are organized in six orbital planes,
somewhat irregularly spaced, with an angle of inclination of 55-63° with the
equatorial plane, nominally having four satellites each (see Figure 4.28). This
means that a receiver on Earth will have between five and eight (sometimes up
to twelve) satellites in view at any point in time. Software packages exist to
help in planning GPS surveys, identifying expected satellite set-up for any
location and time.
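
The quoted orbital period can be cross-checked against the altitude with Kepler's third law for a circular orbit, T = 2π·sqrt(a³/μ). The sketch below is a rough check only: it assumes a spherical Earth with a mean radius of 6371 km and the standard gravitational parameter μ = 3.986004418 × 10^14 m³/s²; neither value comes from these notes.

```python
import math

MU_EARTH = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2 (assumed)
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (assumed)


def orbital_period_hours(altitude_km):
    """Period of a circular orbit at the given altitude above the surface."""
    semi_major_axis = EARTH_RADIUS_M + altitude_km * 1000.0
    period_s = 2.0 * math.pi * math.sqrt(semi_major_axis**3 / MU_EARTH)
    return period_s / 3600.0


gps_hours = orbital_period_hours(20_200)      # about 11.97 h, i.e. close to 11h58m
glonass_hours = orbital_period_hours(19_130)  # about 11.26 h, i.e. close to 11h16m
print(round(gps_hours, 2), round(glonass_hours, 2))
```

The GLONASS altitude of 19,130 km is taken from the GLONASS description in these notes; the lower orbit gives the shorter period.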
The control segment has a master control station and monitor
stations in a belt around the equator, namely in Hawaii, Kwajalein Atoll in the
Marshall Islands, Diego Garcia (British Indian Ocean Territory) and Ascension
Island (UK, southern Atlantic Ocean).
The NAVSTAR satellites transmit two radio signals, namely the L1 frequency
at 1575.42 MHz and the L2 frequency at 1227.60 MHz. There are also a third
and fourth signal, but they are not important for our discussion here.

The first two signals consist of:

- The carrier waves at the given frequencies,
- A coarse ranging code, known as C/A, modulated on L1,
- An encrypted precision ranging code, known as P(Y), modulated on L1 and L2,
and
- A navigation message modulated on both L1 and L2.
The role of L2 is to provide a second radio signal, thereby allowing (the more
expensive) dual-frequency receivers a way of determining fairly precisely the
actual ionospheric delay on satellite signals received.
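The role of the ranging codes is two-fold:
- To identify the satellite that sent the signal, as each satellite sends unique
codes, and the receiver has a look-up table for these codes, and
- To let the receiver measure the time delay of the received signal, from which
the range to the satellite is determined.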
The navigation message contains the satellite orbit and satellite clock error
information, as well as some general system information. GPS also carries a fifth,
encrypted military signal carrying the M-code. GPS uses WGS84 as its
reference system. It has been refined on several occasions and is now aligned
with the ITRF at the level of a few centimeters worldwide.
GPS has adopted UTC as its time system.

In the civil market, GPS receivers of varying quality are available, their quality
depending on the embedded positioning features: supporting single- or dual-frequency,
supporting only absolute or also relative positioning, performing
code measurements or also carrier phase measurements. Leica and Trimble are
two of the well-known brands in the high-precision, professional surveying
domain; Magellan and Garmin, for instance, operate in the
lower price, higher volume consumer market range, amongst others for
recreational use in outdoor activities. Many of these are single frequency
receivers, doing only code measurements, though some are capable of relative
positioning. This includes the new generation of GPS-enabled mobile phones.

2. GLONASS

What GPS is to the US military, GLONASS is to the Russian military,
specifically the Russian Space Forces. Both systems were primarily designed
on the basis of military requirements. The big difference between the two is
that GPS generated a major interest in civil applications, thus having an
important economic impact. This cannot be said of GLONASS.

The GLONASS space segment consists of nominally 24 satellites, organized in
three orbital planes, with an inclination of 64.8° with the equator. Orbiting
altitude is 19,130 km, with a period of revolution of 11 hours 16 min.
GLONASS uses PZ-90 as its reference system, and like GPS uses UTC as
time reference, though with an offset for Russian daylight.

GLONASS radio signals are somewhat similar to those of GPS, but differ in the
details. Satellites use different identifier schemes, and their navigation messages
use other parameters. They also use different frequencies: GLONASS L1 is at
approximately 1605 MHz (changes are underway), and L2 is at approximately
1248 MHz. Otherwise, the GLONASS system performance is rather
comparable to that of GPS.

3. Galileo

The European Union (EU) judged that it needed its own
satellite-based positioning system, to become independent of the GPS
monopoly and to support its own economic growth by providing services of
high reliability under civilian control.

Galileo is the name of this EU system. The vision is that satellite-based
positioning will become even bigger due to the emergence of mobile phones
equipped with receivers, perhaps with some 400 million users by the year 2015.
Development of the system has experienced substantial delays, and at the time
of writing European ministers insist that Galileo should be up and running by
the end of 2013. The completed system will have 27 satellites, with three in
reserve, orbiting in one of three, equally spaced, circular orbits at an elevation
of 23,222 km, inclined 56° with the equator. This higher inclination, when
compared to that of GPS, has been chosen to provide better positioning
coverage at high latitudes, such as northern Scandinavia where GPS performs
rather poorly.

In June 2004, the EU and the US agreed to make Galileo and GPS compatible
by adoption of interchangeable satellite signal set-ups. The effect of this
agreement is that the Galileo/GPS tandem satellite system will have so many
satellites in the sky (close to 60) that a receiver can almost always find an
optimal constellation in view. This will be especially useful in situations where
in the past bad signal reception happened: in built-up areas and forests, for
instance. It will also bring the implementation of a Global Navigation Satellite
System (GNSS) closer as positional accuracy and reliability will improve. With
such a system, eventually one expects to implement fully automated air and
road traffic. Automatic aircraft landing, for instance, requires horizontal
accuracy in the order of 4 m, and vertical accuracy below 1 m: these
requirements can currently not be achieved reliably.

The Galileo Terrestrial Reference Frame (GTRF) will be a realization of the
ITRS independently set up from that of GPS, so that one system can back up
the other. Positional differences between the WGS84 and the GTRF will be
at worst a few centimeters. The Galileo System Time (GST) will closely follow
International Atomic Time (TAI) with a time offset of less than 50 nsec
for 95% of the time over any period of a year. Information on the actual offset
between GST and TAI, and between GST and UTC (as used in GPS) will be
broadcast in the Galileo satellite signal.


Spatial data can be obtained from various sources. It can be collected from
scratch, using direct spatial data acquisition techniques, or indirectly, by
making use of existing spatial data collected by others.

3.2.Data Entry and Preparation

3.2.1.Spatial Data Input:

Data input is the operation of encoding data for inclusion into a database. The
creation of accurate databases is a very important part of GIS.
Data collection, and the maintenance of databases, remains the most expensive
and time consuming aspect of setting up a major GIS facility. This typically
costs 60-80% of the overall costs of a GIS project.
There are a number of issues which arise when developing a database for a
planning or management project. The first issue is whether the data should be stored in
vector or raster format. Considerations here include:
- The nature of the source data, e.g. whether it is already in raster form
- The predominant use to which it will be put
- The potential losses that may occur in transition
- Storage space (increasingly less important)
- Requirements for data sharing with other systems/software

3.2.1.1.Direct spatial data capture:

Data which is captured directly from the environment is known as primary
data.
One way to obtain spatial data is by direct observation of the relevant
geographic phenomena. This can be done through ground-based field surveys,
or by using remote sensors in satellites or airplanes. Many Earth sciences
have developed their own survey techniques, as ground-based
techniques remain the most important source for reliable data in many cases.
With primary data the core concern in knowing its properties is to know the
process by which it was captured, the parameters of any instruments used and
the rigour with which quality requirements were observed.
Remotely sensed imagery is usually not fit for immediate use, as various
sources of error and distortion may have been present, and the imagery should
first be freed from these. This is the domain of remote sensing.
An image refers to raw data produced by an electronic sensor, which are not
pictorial, but arrays of digital numbers related to some property of an object or
scene, such as the amount of reflected light.


3.2.1.2.Indirect spatial data capture

Any data which is not captured directly from the environment is known as
secondary data.

In contrast to direct methods of data capture described above, spatial data can
also be sourced indirectly. This includes data derived from existing paper maps
through scanning, data digitized from a satellite image,
processed data purchased from data capture firms or international agencies, and
so on. This type of data is known as secondary data.

1. Digitizing

A traditional method of obtaining spatial data is through digitizing existing
paper maps. This can be done using various techniques. Before adopting this
approach, one must be aware that positional errors already in the paper map
will further accumulate, and one must be willing to accept these errors.
There are two forms of digitizing: on-tablet and on-screen manual digitizing. In
on-tablet digitizing, the original map is fitted on a special surface (the tablet),
while in on-screen digitizing, a scanned image of the map (or some other
image) is shown on the computer screen. In both of these forms, an operator
traces the map's features (mostly the lines), storing location coordinates
relative to a number of previously defined control points.
The function of these points is to tie a coordinate system to the
digitized data: the control points on the map have known coordinates,
and by digitizing them we tell the system implicitly where all other digitized
locations are. At least three control points are needed, but preferably more
should be digitized to allow a check on the positional errors made.
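
Three control points are the minimum because a general affine transformation between tablet coordinates and map coordinates has six parameters, and each control point supplies two equations. The sketch below assumes such an affine model (the notes do not name the transformation) and solves it exactly from three points using Cramer's rule; all coordinates are invented.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def fit_affine(tablet_pts, map_pts):
    """Return ((a, b, c), (d, e, f)) with X = a*x + b*y + c and Y = d*x + e*y + f."""
    base = [[x, y, 1.0] for x, y in tablet_pts]
    d = det3(base)
    if abs(d) < 1e-12:
        raise ValueError("control points must not lie on one line")

    def solve(rhs):
        params = []
        for k in range(3):
            m = [row[:] for row in base]
            for i in range(3):
                m[i][k] = rhs[i]  # Cramer's rule: replace column k by the right-hand side
            params.append(det3(m) / d)
        return tuple(params)

    return solve([p[0] for p in map_pts]), solve([p[1] for p in map_pts])


def to_map(params, point):
    """Transform one digitized tablet point into map coordinates."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


# Invented control points: tablet centimetres and their known map coordinates in metres.
tablet = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
known = [(500_000.0, 4_000_000.0), (505_000.0, 4_000_000.0), (500_000.0, 4_005_000.0)]
params = fit_affine(tablet, known)
print(to_map(params, (4.0, 6.0)))
```

With a fourth or fifth control point, the difference between its transformed and its known coordinates gives the check on positional error mentioned above; a least-squares fit over all points would then replace the exact solution.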
Another set of techniques also works from a scanned image of the original
map, but uses the GIS to find features in the image. These techniques are
known as semi-automatic or automatic digitizing, depending on how much
operator interaction is required. If vector data is to be distilled from this
procedure, a process known as vectorization follows the scanning process.

2. Scanning

An office scanner illuminates a document and measures the intensity of the
reflected light with a CCD array. The result is an image as a matrix of pixels,
each of which holds an intensity value. Office scanners have a fixed maximum
resolution, expressed as the highest number of pixels they can identify per inch;
the unit is dots-per-inch (dpi). For manual on-screen digitizing of a
paper map, a resolution of 200-300 dpi is usually sufficient, depending on the
thickness of the thinnest lines. For manual on-screen digitizing of aerial
photographs, higher resolutions are recommended, typically at least 800 dpi.
(Semi-)automatic digitizing requires a resolution that results in scanned lines of
at least three pixels wide to enable the computer to trace the centre of the lines
and thus avoid displacements. For paper maps, a resolution of 300-600 dpi is
usually sufficient. Automatic or semi-automatic tracing from aerial
photographs can only be done in a limited number of cases. Usually, the
information from aerial photos is obtained through visual interpretation.
After scanning, the resulting image can be improved with various image
processing techniques. It is important to understand that scanning does not result
in a structured data set of classified and coded objects. Additional work is
required to recognize features and to associate categories and other thematic
attributes with them.
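
The three-pixel requirement links scan resolution to the thickness of the thinnest line on the map. The helper below makes that relationship explicit; the 0.2 mm line width is an invented example value.

```python
MM_PER_INCH = 25.4


def line_width_pixels(line_width_mm, dpi):
    """Width of a scanned line in pixels at a given resolution."""
    return line_width_mm / MM_PER_INCH * dpi


def min_dpi_for_tracing(line_width_mm, min_pixels=3):
    """Lowest resolution at which the line is at least min_pixels wide."""
    return min_pixels * MM_PER_INCH / line_width_mm


# A hypothetical 0.2 mm line is too thin at 300 dpi but fine at 600 dpi.
print(round(line_width_pixels(0.2, 300), 2))  # about 2.36 pixels
print(round(line_width_pixels(0.2, 600), 2))  # about 4.72 pixels
print(round(min_dpi_for_tracing(0.2)))        # about 381 dpi
```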

3. Vectorization

The process of distilling points, lines and polygons from a scanned image is
called vectorization.
As scanned lines may be several pixels wide, they are often first thinned to
retain only the centreline. The remaining centreline pixels are converted to
series of (x, y) coordinate pairs, defining a polyline.
Subsequently, features are formed and attributes are attached to them.
This process may be entirely automated or performed semi-automatically,
with the assistance of an operator. Pattern recognition methods like Optical
Character Recognition (OCR) for text can be used for the automatic
detection of graphic symbols and text.
Vectorization causes errors such as small spikes along lines, rounded corners,
errors in T- and X-junctions, displaced lines or jagged curves. These errors are
corrected in an automatic or interactive post-processing phase. The phases of
the vectorization process are illustrated in the figure below.


4. Metadata

Metadata is defined as background information that describes all necessary
information about the data itself. It includes:

- Identification information: Data source(s), time of acquisition, etc.
- Data quality information: Positional, attribute and temporal accuracy, lineage, etc.
- Entity and attribute information: Related attributes, units of measure, etc.

In essence, metadata answer who, what, when, where, why, and how questions about
all facets of the data made available. Maintaining metadata is a key part of
maintaining data and information quality in GIS. This is because it can serve different
purposes, from description of the data itself through to providing instructions for data
handling. Depending on the type and amount of metadata provided, it could be used to
determine the data sets that exist for a geographic location, evaluate whether a given
data set meets a specified need, or to process and use a data set.
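
As a small illustration of how such metadata might be used to evaluate whether a data set meets a specified need, the sketch below stores a few of the items listed above in a record and checks one quality requirement. The class, its field names and the sample values are hypothetical; real metadata standards define many more elements.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    """A deliberately small, hypothetical metadata record."""
    source: str                     # identification: data source
    acquired: str                   # identification: time of acquisition (ISO date)
    positional_accuracy_m: float    # data quality: positional accuracy in metres
    lineage: str                    # data quality: how the data was produced
    attribute_units: dict = field(default_factory=dict)  # entity/attribute info


def meets_need(meta, max_positional_error_m):
    """True if the documented positional accuracy is good enough for the task."""
    return meta.positional_accuracy_m <= max_positional_error_m


roads = Metadata(
    source="digitized from a 1:50,000 paper map",
    acquired="2005-06-01",
    positional_accuracy_m=25.0,
    lineage="on-screen manual digitizing of a scanned map sheet",
    attribute_units={"width": "m"},
)
print(meets_need(roads, 50.0), meets_need(roads, 5.0))  # True False
```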

3.2.2.Data Quality:

With the advent of satellite remote sensing, GPS and GIS technology, and the
increasing availability of digital spatial data, resource managers and others who
formerly relied on the surveying and mapping profession to supply high quality map
products are now in a position to produce maps themselves. At the same time, GISs
are being increasingly used for decision support applications, with
increasing reliance on secondary data sourced through data providers or
via the internet, through geo-web services. The implications of using low-quality data
in important decisions are potentially severe. There is also a danger that uninformed
GIS users introduce errors by incorrectly applying geometric and other
transformations to the spatial data held in their database.

3.2.2.1.Accuracy and Positioning:

Accuracy and precision

So far we have used the terms error, accuracy and precision without appropriately
defining them.

Accuracy should not be confused with precision, which is a statement of the
smallest unit of measurement to which data can be recorded. In conventional
surveying and mapping practice, accuracy and precision are closely related.
Instruments with an appropriate precision are employed, and surveying
methods chosen, to meet specified accuracy tolerances.


In GIS, how- Accuracy tolerances ever, the numerical precision of computer

processing and storage usually exceeds the accuracy of the data. This can give
rise to so-called spurious accuracy, for example calculating area sizes to the
nearest m2 from coordinates obtained by digitizing a 1 : 50, 000 map.
Using graphs that display the probability distribution (for which see below) of
a measurement against the true value T , the relationship between accuracy and
precision can be clarified.
An accurate measurement has a mean close to the true value; a precise
measurement has a sufficiently small variance.
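
The distinction can be illustrated with a minimal Python sketch. It assumes that accuracy is summarized by the bias (mean minus true value T) and precision by the variance of repeated measurements; the function name and the sample values are invented for illustration only.

```python
from statistics import mean, pvariance

def bias_and_variance(measurements, true_value):
    """Return (bias, variance) of repeated measurements.

    A bias close to 0 indicates an accurate measurement;
    a small variance indicates a precise one.
    """
    return mean(measurements) - true_value, pvariance(measurements)

T = 100.0  # hypothetical true value
accurate_not_precise = [97.0, 103.0, 99.0, 101.0]    # centred on T, widely spread
precise_not_accurate = [104.9, 105.0, 105.1, 105.0]  # tightly clustered, offset from T

print(bias_and_variance(accurate_not_precise, T))
print(bias_and_variance(precise_not_accurate, T))
```

The first series has almost no bias but a large variance; the second has a small variance but a bias of about 5 units.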

3.2.2.2. Positional accuracy

The surveying and mapping profession has a long tradition of determining and
minimizing errors. This applies particularly to land surveying and
photogrammetry, both of which tend to regard positional and height errors as
undesirable. Cartographers also strive to reduce geometric and attribute errors
in their products, and, in addition, define quality in specifically cartographic
terms, for example quality of linework, layout, and clarity of text.
It must be stressed that all measurements made with surveying and
photogrammetric instruments are subject to error. These include:

Human errors in measurement (e.g. reading errors), generally referred to as
gross errors or blunders. These are usually large errors resulting from
carelessness which could be avoided through careful observation, although it is
never absolutely certain that all blunders have been avoided or eliminated.
Instrumental or systematic errors (e.g. due to misadjustment of instruments).
These lead to errors that vary systematically in sign and/or magnitude, but can
go undetected when the measurement is repeated with the same instrument.
Systematic errors are particularly dangerous because they tend to accumulate.
So-called random errors, caused by natural variations in the quantity being
measured. These are effectively the errors that remain after blunders and
systematic errors have been removed. They are usually small, and dealt with in
least squares adjustment.
Measurement errors are generally described in terms of accuracy. In the case of
spatial data, accuracy may relate not only to the determination of coordinates
(positional error) but also to the measurement of quantitative attribute data. The
accuracy of a single measurement can be defined as:

the closeness of observations, computations or estimates to the true values or the

Root mean square error

Location accuracy is normally measured as a root mean square error (RMSE). The
RMSE is similar to, but not to be confused with, the standard deviation of a statistical
sample. The value of the RMSE is normally calculated from a set of check
measurements (coordinate values from an independent source of higher accuracy for
identical points).
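
As a rough illustration, the sketch below computes an RMSE from check points. The notes do not give the formula, so the common definition (square root of the mean squared positional difference) is assumed here; the function name and argument layout are hypothetical.

```python
import math

def rmse(measured, reference):
    """RMSE of 2D positions.

    measured:  list of (x, y) coordinates from the data set under test
    reference: list of (x, y) coordinates of the same points taken from an
               independent source of higher accuracy (the check measurements)
    """
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((mx - rx) ** 2 + (my - ry) ** 2
                  for (mx, my), (rx, ry) in zip(measured, reference))
    return math.sqrt(squared / len(measured))

# One point displaced by 3 m in x and 4 m in y, one point without error.
print(rmse([(3.0, 4.0), (10.0, 10.0)], [(0.0, 0.0), (10.0, 10.0)]))
```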

3.2.2.3. Attribute accuracy

We can identify two types of attribute accuracies. These relate to the type of data we
are dealing with:

For nominal or categorical data, the accuracy of labelling (for example the type
of land cover, road surface, etc.).

For numerical data, numerical accuracy (such as the concentration of
pollutants in the soil, height of trees in forests, etc.).

It follows that depending on the data type, assessment of attribute accuracy may range
from a simple check on the labelling of features (for example, is a road classified as a
metalled road actually surfaced or not?) to complex statistical procedures for
assessing the accuracy of numerical data, such as the percentage of pollutants present
in the soil.

When spatial data are collected in the field, it is relatively easy to check on the
appropriate feature labels. In the case of remotely sensed data, however, considerable
effort may be required to assess the accuracy of the classification procedures. This is
usually done by means of checks at a number of sample points.

3.2.2.4. Temporal accuracy

As noted, the number of spatial data sets and the volume of archived remotely
sensed data have increased enormously over the last decade. These data can provide useful
temporal information such as changes in land ownership and the monitoring of
environmental processes such as deforestation. Analogous to its positional and
attribute components, the quality of spatial data may also be assessed in terms
of its temporal accuracy. For a static feature this refers to the difference in the
values of its coordinates at two different times. This includes not only the
accuracy and precision of time measurements (for example, the date of a
survey), but also the temporal consistency of different data sets. Because the
positional and attribute components of spatial data may change together or
independently, it is also necessary to consider their temporal validity. For
example, the boundaries of a land parcel may remain fixed over a period of
many years whereas the ownership attribute may change more frequently.

When the accuracy of a classified remotely sensed image is assessed, the field
data collected at the sample points are used to construct an error matrix (also
known as a confusion matrix or misclassification matrix) that can be used to
evaluate the accuracy of the classification. For example, suppose three land
types are identified. For 62 check points that are forest, the classified image
also identifies them as forest. However, two forest check points are classified
in the image as agriculture. Vice versa, five agriculture points are classified
as forest.
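
A minimal sketch of how such a matrix can be evaluated is given below. Only three counts come from the example above (62, 2 and 5); the third class name and all other counts are made up to complete the matrix. The terms producer's and user's accuracy are standard in remote sensing but are not used in these notes.

```python
CLASSES = ["forest", "agriculture", "urban"]  # "urban" is a hypothetical third class

# Rows: class observed in the field. Columns: class assigned in the image.
ERROR_MATRIX = [
    [62, 2, 0],    # forest: 62 correct, 2 classified as agriculture (from the text)
    [5, 48, 3],    # agriculture: 5 classified as forest (from the text), rest invented
    [0, 4, 20],    # urban: entirely invented counts
]

def overall_accuracy(matrix):
    """Share of check points whose image class equals the field class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def producers_accuracy(matrix, k):
    """Share of field points of class k that the image classified correctly."""
    return matrix[k][k] / sum(matrix[k])

def users_accuracy(matrix, k):
    """Share of image points labelled k that really are class k in the field."""
    return matrix[k][k] / sum(row[k] for row in matrix)

print(overall_accuracy(ERROR_MATRIX))
print(producers_accuracy(ERROR_MATRIX, 0), users_accuracy(ERROR_MATRIX, 0))
```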

3.2.2.5. Lineage:

Lineage describes the history of a data set. In the case of published maps, some
lineage information may be provided as part of the metadata, in the form of a
note on the data sources and procedures used in the compilation of the data.
Examples include the date and scale of aerial photography, and the date of field
verification. Especially for digital data sets, however, lineage may be defined
more formally as:

that part of the data quality statement that contains information that describes the
source of observations or materials, data acquisition and compilation methods,
conversions, transformations, analyses and derivations that the data has been

All of these aspects affect other aspects of quality, such as positional accuracy.
Clearly, if no lineage information is available, it is not possible to adequately

3.2.2.6. Completeness

Completeness refers to whether there are data lacking in the database compared
to what exists in the real world. Essentially, it is important to be able to assess
what does and what does not belong to a complete dataset as intended by its

belong within the scope of the data set as it is defined).

Completeness can relate to spatial, temporal, or thematic aspects of a
data set. For example, a data set of property boundaries might be spatially
incomplete because it contains only 10 out of 12 suburbs; it might be
temporally incomplete because it does not include recently subdivided
properties; and it might be thematically overcomplete because it also includes
building footprints.

3.2.2.7. Logical consistency:

For any particular application, (predefined) logical rules concern:

compatibility of data with other data in a data set (e.g. in terms of data format),

the absence of any contradictions within a data set,

topological consistency of the data set, and

the allowed attribute value ranges, as well as combinations of attributes. For
example, attribute values for population, area, and population density must agree for
all entities in the database.

The absence of any inconsistencies does not necessarily imply that the data are
accurate.
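
The population example can be turned into a simple automated check. The sketch below is only an illustration: the record layout, field names, tolerance and sample values are all assumptions, not part of any particular GIS.

```python
import math

def inconsistent_density(records, rel_tol=0.01):
    """Return the names of records whose stored density disagrees with
    population / area by more than the relative tolerance."""
    flagged = []
    for rec in records:
        expected = rec["population"] / rec["area_km2"]
        if not math.isclose(expected, rec["density"], rel_tol=rel_tol):
            flagged.append(rec["name"])
    return flagged

# Hypothetical records; the second one violates the rule.
districts = [
    {"name": "A", "population": 12000, "area_km2": 40.0, "density": 300.0},
    {"name": "B", "population": 5000, "area_km2": 25.0, "density": 250.0},
]
print(inconsistent_density(districts))  # ['B']
```

A data set that passes this check is consistent, but, as noted above, not necessarily accurate: all three values could be wrong in a mutually compatible way.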

3.2.3. Data Preparation:

Spatial data preparation aims to make the acquired spatial data fit for use.
Images may require enhancements and corrections of the classification scheme
of the data. Vector data also may require editing, such as the trimming of
overshoots of lines at intersections, deleting duplicate lines, closing gaps in
lines, and generating polygons. Data may require conversion to either vector format
or raster format to match other data sets which will be used in the analysis.
Additionally, the data preparation process includes associating attribute data
with the spatial features through either manual input or reading digital attribute
files into the GIS/DBMS.

The intended use of the acquired spatial data may require only a subset of the
original data set, as only some of the features are relevant for subsequent
analysis or subsequent map production. In these cases, data and/or
cartographic generalization can be performed on the original data set.

3.2.3.1. Data checks and repairs:

Acquired data sets must be checked for quality in terms of the accuracy,
consistency and completeness parameters discussed above. Often, errors can be
identified automatically, after which manual editing methods can be applied to
correct the errors. Alternatively, some software may identify and automatically
correct certain types of errors. Below, we focus on the geometric, topological,
and attribute components of spatial data.
Clean-up operations are often performed in a standard sequence. For
example, crossing lines are split before dangling lines are erased, and nodes are
created at intersections before polygons are generated.

With polygon data, one usually starts with many polylines, in an unwieldy
format known as spaghetti data (a), that are combined and cleaned in the first
step (b). This results in fewer polylines with more internal vertices. Then,
polygons can be identified (c).
Sometimes, polylines that should connect to form closed boundaries do not,
and therefore must be connected (either manually or automatically); this step is
not indicated in the figure. In a final step, the elementary topology of the
polygons can be derived (d).

Figure: (a) Spaghetti data; (b) Spaghetti data (cleaned)

Rasterization or vectorization

Vectorization produces a vector data set from a raster. We have looked at this
in some sense already: namely in the production of a vector set from a scanned
image. Another form of vectorization takes place when we want to identify
features or patterns in remotely sensed imagery. The keywords here are feature
extraction and pattern recognition, which are dealt with in Principles of
Remote Sensing.
If much or all of the subsequent spatial data analysis is to be carried out on
raster data, one may want to convert vector data sets to raster data. This process
is known as rasterization.
It involves assigning point, line and polygon attribute values to raster cells that
overlap with the respective point, line or polygon. To avoid information loss,
the raster resolution should be carefully chosen on the basis of the geometric
resolution.
A cell size which is too large may result in cells that cover parts of multiple
vector features, and then ambiguity arises as to what value to assign to the cell.
If, on the other hand, the cell size is too small, the file size of the raster may
increase significantly.
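Rasterization also has drawbacks: raster boundaries are only an approximation
of the original vector boundaries, and the rasterized objects have
lost their topological properties.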

Often the reason for rasterization is that it facilitates easier combination
with other data sources also in raster formats, and/or that there are several
analytical techniques which are easier to perform upon raster data.
An alternative to rasterization is to not perform it during the data preparation
phase, but to use GIS rasterization functions on-the-fly, that is when the
computations call for it. This allows keeping the vector data and generating
raster data from them when needed. Obviously, the issue of performance trade-
off must be looked into.
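
The rasterization of a single polygon can be sketched as follows. This is an illustration under stated assumptions, not the method of any specific GIS: a cell receives the polygon's value when its centre lies inside the polygon (the cell-centre rule, one of several possible assignment rules), and the function names and parameters are invented.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(polygon, value, x_min, y_max, cell_size, n_rows, n_cols, nodata=0):
    """Assign `value` to every cell whose centre falls inside the polygon.

    (x_min, y_max) is the top-left corner of the raster; rows run downwards.
    """
    grid = []
    for row in range(n_rows):
        cy = y_max - (row + 0.5) * cell_size
        grid.append([
            value if point_in_polygon(x_min + (col + 0.5) * cell_size, cy, polygon)
            else nodata
            for col in range(n_cols)
        ])
    return grid

square = [(1, 1), (3, 1), (3, 3), (1, 3)]
for line in rasterize(square, 7, x_min=0, y_max=4, cell_size=1, n_rows=4, n_cols=4):
    print(line)
```

Re-running the example with a larger cell size shows the ambiguity described above: with `cell_size=2` no cell centre falls strictly inside the square, so it vanishes from the raster.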

3.2.3.2. Combining data from multiple sources

A GIS project usually involves multiple data sets, so the next step addresses the issue
of how these multiple sets relate to each other. There are four fundamental cases to be
considered in the combination of data from different sources:

They may be about the same area, but differ in accuracy.

They may be about the same area, but differ in choice of representation.

They may be about adjacent areas, and have to be merged into a single data set.

They may be about the same or adjacent areas, but referenced in different
coordinate systems.

Differences in accuracy

Images come at a certain resolution, and paper maps at a certain scale. This typically
results in differences of resolution of acquired data sets, all the more since map
features are sometimes intentionally displaced to improve readability of the map. For
instance, the course of a river will only be approximated roughly on a small-scale
map, and a village on its northern bank should be depicted north of the river,
even if this means it has to be displaced on the map a little bit. The small scale
thus introduces a positional error. If we want to combine a digitized version of
that map with a digitized version of a large-scale map, we must be aware that
features may not be where they seem to be. Analogous examples can be given for
images at different resolutions.

In the figure, the polygons of two digitized maps at different scales are overlaid.
They do not perfectly coincide, and polygon boundaries cross each other. This causes
small, artefact polygons in the overlay known as sliver polygons.

If the map scales involved differ significantly, the polygon boundaries of the
large-scale map should probably take priority, but when the differences are
slight, we need interactive techniques to resolve the issues. There can be good
reasons for having data sets at different scales.
A good example is found in mapping organizations; European organizations
maintain a single source database that contains the base data. This database is
essentially scale-less and contains all data required for even the largest scale
map to be produced. For each map scale that the mapping organization produces,
they derive a separate database from the foundation data. Such a derived database
may be called a cartographic database as the data stored are elements to be
printed on a map, including, for instance, data on where to place name tags, and
what colour to give them. This may mean the organization has one database for
the larger scale ranges and other databases for the smaller scale ranges. They
maintain a multi-scale data environment.

Differences in representation

We have already talked about the various ways to represent spatial data. Sometimes
data is acquired as point samples or observations, other times it is in the form of
polygons with attribute data.
When points need to be translated into a raster, we need to perform something
known as point data transformation.
Some advanced GIS applications require the possibility of representing the
same geographic phenomenon in different ways. These are called
multi-representation systems.
The production of maps at various scales is an example, but there are
numerous others. The commonality is that phenomena must sometimes be
viewed as points, and at other times as polygons.
For example, a small-scale national road network analysis may represent
villages as point objects, but a nation-wide urban population density study
should regard all municipalities as represented by polygons.
The complexity that this requirement entails is that the GIS or the DBMS
must keep track of links between different representations for the same
phenomenon, and must also provide support for decisions as to which
representations to use in which situation.
The links that the system maintains between the various representations of the
same object allow switching between them, and many advanced applications of
their use seem possible.

Merging data sets of adjacent areas

When individual data sets have been prepared as described above, they sometimes
have to be merged into a single data set, in such a way that
the appearance of the integrated geometry is as homogeneous as possible.

Edge matching is the process of joining two or more map sheets, for instance,
after they have separately been digitized.

Merging adjacent data sets can be a major problem. Some GIS functions, such
as line smoothing and data clean-up (removing duplicate lines) may have to be
performed. Figure illustrates a typical situation.
Some GISs have merge or edge-matching functions to solve the problem
arising from merging adjacent data. At the map sheet edges, feature
representations have to be matched in order for them to be combined.
Coordinates of the objects along shared borders are adjusted to match those in
the neighboring data sets. Mismatches may still occur, so a visual check, and
interactive editing is likely to be required.

Other data preparation functions

A range of other data preparation functions exist that support conversion or
adjustment of the acquired data to format requirements that have been defined
for data storage purposes. These include:

Format transformation functions: These convert between data formats of different
systems or representations, e.g. reading a DXF file into a GIS. Although we will not
focus on the technicalities here, the user should be warned that conversions from one
format to another may cause problems. The reason is that not all formats can capture
the same information, and therefore conversions often mean loss of information. If
one obtains a spatial data set in format F, but needs it in format G (for instance
because the locally preferred GIS package requires it), then usually a conversion
function can be found, often within the same GIS software package. The key to
successful conversion is to also find an inverse conversion, back from G to F, and to
ascertain whether the double conversion back to F results in the same data set as the
original. If this is the case, neither conversion causes information loss, and both can
safely be applied.

Graphic element editing: Manual editing of digitized features so as to correct errors,
and to prepare a clean data set for topology building.

Coordinate thinning: A process that is often applied to remove redundant or excess
vertices from line representations, as obtained from digitizing.
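
Coordinate thinning can be sketched in a few lines. The notes do not name an algorithm; the example below uses the Douglas-Peucker method as one widely used choice, with a hypothetical `tolerance` parameter expressed in the same units as the coordinates.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # a and b coincide
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)

def thin(points, tolerance):
    """Douglas-Peucker thinning: drop vertices that deviate from the
    simplified line by no more than `tolerance`."""
    if len(points) < 3:
        return list(points)
    index, d_max = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > d_max:
            index, d_max = i, d
    if d_max <= tolerance:
        return [points[0], points[-1]]
    left = thin(points[:index + 1], tolerance)
    right = thin(points[index:], tolerance)
    return left[:-1] + right  # avoid repeating the shared vertex

digitized = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
print(thin(digitized, tolerance=0.1))  # [(0, 0), (2, 0), (3, 2), (4, 0)]
```

The vertex (1, 0.05) is removed because it lies within 0.1 units of the line from (0, 0) to (2, 0); the other vertices carry real shape information and are kept.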

3.2.4. Point Data Transformation:

We may want to transform our points into other representations in order to
facilitate interpretation and/or integration with other data. Examples include
defining homogeneous areas (polygons) from our point data, or deriving
contour lines. This is generally referred to as interpolation, i.e. the calculation
of a value from surrounding observations. The principle of
spatial autocorrelation plays a central part in the process of interpolation.
In order to predict the value of a point for a given (x, y) location, we could
simply find the nearest known value to that location and assign it. This is
the simplest form of interpolation, known as nearest-neighbour interpolation.

We might instead choose to use the distance that points are away from (x, y) to
weight their importance in our calculation.
In some instances we may be dealing with a data type that limits the type of
interpolation we can do. A fundamental issue in this respect is what kind of
phenomena we are considering: is it a discrete field (such as geological units,
for instance), in which the values are of a qualitative nature and the
data is categorical, or is it a continuous field (like elevation, temperature, or
salinity), in which the values are of a quantitative nature, and represented as
continuous measurements? This distinction matters, because we are limited to
nearest-neighbour interpolation for discrete data.

3.2.4.1. Interpolating discrete data

If we are dealing with discrete (nominal, categorical or ordinal) data, we are
effectively restricted to using nearest-neighbour interpolation.
In a nearest-neighbour interpolation, each location is assigned the value of the
closest measured point. Effectively, this technique constructs zones
around the points of measurement, with each point belonging to a zone
assigned the same value. Effectively, this represents an assignment of an
existing value (or category) to a location.
If the desired output was a polygon layer, we could construct Thiessen
polygons around the points of measurement. The boundaries of such polygons,
by definition, are the locations for which more than one point of measurement
is the closest point.
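
Nearest-neighbour interpolation is short enough to sketch directly. The sample layout `(x, y, value)` and the soil categories below are invented for illustration; when two samples are equally close, this sketch simply returns the first one, which corresponds to a location on a Thiessen polygon boundary.

```python
import math

def nearest_neighbour(x, y, samples):
    """Return the value of the sample closest to (x, y).

    samples: list of (x, y, value) tuples; value may be a category.
    """
    closest = min(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return closest[2]

# Hypothetical categorical measurements.
samples = [(0, 0, "clay"), (10, 0, "sand"), (0, 10, "loam")]
print(nearest_neighbour(2, 3, samples))  # clay
print(nearest_neighbour(8, 1, samples))  # sand
```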

3.2.4.2. Interpolating continuous data:

Interpolation of values from continuous measurements is significantly more complex.

Since the data are continuous, we can make use of measured values for interpolation.
There are many continuous geographic fields: elevation, temperature and groundwater
salinity are just a few examples. Commonly, continuous fields are represented
as rasters, and we will almost by default assume that they are. The main alternative for
continuous field representation is a polyline vector layer, in which the lines are
isolines. We will also address these issues of representation below.

The aim is to use measurements to obtain a representation of the entire field using
point samples. In this section we outline four techniques to do so:

trend surface fitting using regression,

triangulation,

spatial moving averages using inverse distance weighting, and

kriging.

Trend surface fitting using regression:

In trend surface fitting, the assumption is that the entire study area can be
represented by a formula f (x, y) that for a given location with coordinates (x,
y) will give us the approximated value of the field in that location. The key
objective in trend surface fitting is to derive a formula that best describes the
field. Various classes of formula exist, with the simplest being the one that
describes a flat, but tilted plane:
f (x, y) = c1 · x + c2 · y + c3
If we believe (and this judgement must be based on domain expertise) that
the field under consideration can be best approximated by a tilted plane, then
the problem of finding the best plane is the problem of determining best values
for the coefficients c1, c2 and c3.
This is where the point measurements earlier obtained become important.
Statistical techniques known as regression techniques can be used to determine
values for these coefficients c that best fit with the measurements.
A plane will be fitted through the measurements that makes the smallest overall
error with respect to the original measurements.
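
A tilted plane can be fitted with ordinary least squares. The sketch below builds and solves the normal equations in plain Python; the function names are invented, the sample points are generated from a known plane so that the result can be checked, and a real application would normally rely on a statistics or linear algebra library instead.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x

def fit_plane(samples):
    """Least-squares c1, c2, c3 for f(x, y) = c1*x + c2*y + c3.

    samples: list of (x, y, value); the points must not all lie on one line.
    """
    rows = [(x, y, 1.0) for x, y, _ in samples]
    values = [v for _, _, v in samples]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(3)]
    return solve3(ata, atz)

# Points sampled from the plane f(x, y) = 2x - y + 5.
points = [(0, 0, 5), (1, 0, 7), (0, 1, 4), (1, 1, 6), (2, 3, 6)]
c1, c2, c3 = fit_plane(points)
print(round(c1, 6), round(c2, 6), round(c3, 6))
```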

Triangulation:

Another way of interpolating point measurements is by triangulation. As with
Triangulated Irregular Networks (TINs), this technique constructs a triangulation of the
study area from the known measurement points.
Preferably, the triangulation should be a Delaunay triangulation. After having
obtained it, we may define for which values of the field we want to construct
isolines. For instance, for elevation, we might want to have
the 100 m-isoline, the 200 m-isoline, and so on.
For each edge of a triangle, a geometric computation can be performed that
indicates which isolines intersect it, and at what positions they do so. A list of
computed locations, all at the same field value, is used by the GIS to construct
the isoline.
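
The geometric computation for a single triangle edge might look as follows. The sketch assumes that the field varies linearly along the edge, which is the usual assumption in a TIN; the function name and the sample edge are hypothetical.

```python
def edge_crossings(p, q, levels):
    """Find where isolines cross the triangle edge from p to q.

    p, q:   (x, y, z) end points of the edge, z being the field value
    levels: isoline values to test, e.g. [100, 200, 300]
    Returns a list of (level, x, y) positions.
    """
    (x1, y1, z1), (x2, y2, z2) = p, q
    crossings = []
    if z1 == z2:  # flat edge: no single crossing position exists
        return crossings
    for level in levels:
        if min(z1, z2) <= level <= max(z1, z2):
            t = (level - z1) / (z2 - z1)  # fraction of the way from p to q
            crossings.append((level, x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return crossings

# An edge rising from 50 m to 250 m is crossed by the 100 m and 200 m isolines.
print(edge_crossings((0, 0, 50), (10, 0, 250), [100, 200, 300]))
# [(100, 2.5, 0.0), (200, 7.5, 0.0)]
```

Collecting these positions over all edges, grouped by level, yields the vertex lists from which the isolines are drawn.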

Spatial moving averages using inverse distance weighting:

Moving window averaging attempts to directly derive a raster dataset from a
set of sample points. This is why it is sometimes also called gridding.
The principle behind this technique is illustrated in Figure. The cell values for
the output raster are computed one by one. To do so, a window (also
known as a kernel) is defined, and initially placed over the top left raster cell.
Measurement points falling inside the window
contribute to the averaging computation, those outside the window do not.
This is why moving window averaging is said to be a local interpolation
method. After the cell value is computed and assigned to the cell, the window
is moved one cell to the right, and the computations are performed for that cell.
Successively, all cells of the raster are visited in this way.
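
The procedure can be sketched as below. Several details are assumptions made for the sake of a runnable example: the window is circular with radius `window_radius`, each point is weighted by 1 / distance^power with power 2, and a cell without any point inside its window receives `nodata`.

```python
import math

def idw_raster(samples, x_min, y_max, cell_size, n_rows, n_cols,
               window_radius, power=2, nodata=None):
    """Moving window averaging with inverse distance weighting.

    samples: list of (x, y, value); (x_min, y_max) is the top-left raster corner.
    """
    grid = []
    for row in range(n_rows):            # visit rows from top to bottom
        cy = y_max - (row + 0.5) * cell_size
        out_row = []
        for col in range(n_cols):        # move the window one cell to the right
            cx = x_min + (col + 0.5) * cell_size
            weighted_sum = weight_total = 0.0
            exact = None
            for sx, sy, value in samples:
                d = math.hypot(sx - cx, sy - cy)
                if d > window_radius:    # outside the window: no contribution
                    continue
                if d == 0:               # sample sits on the cell centre
                    exact = value
                    break
                w = 1.0 / d ** power
                weighted_sum += w * value
                weight_total += w
            if exact is not None:
                out_row.append(exact)
            elif weight_total > 0:
                out_row.append(weighted_sum / weight_total)
            else:
                out_row.append(nodata)
        grid.append(out_row)
    return grid

pts = [(0.0, 0.0, 10.0), (3.0, 0.0, 40.0)]
# A single cell whose centre is at (1, 0): distances 1 and 2, weights 1 and 0.25.
print(idw_raster(pts, 0.5, 0.5, 1.0, 1, 1, window_radius=5.0))  # [[16.0]]
```

Shrinking `window_radius` to 1.5 excludes the second point and the cell value becomes 10.0, which shows why the method is called local.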

Kriging

Kriging was originally developed by mining geologists attempting to derive
accurate estimates of mineral deposits in a given area from limited sample
measurements.
It is an advanced interpolation technique belonging to the field of geostatistics,
which can deliver good results if applied properly and with enough sample
points.
Kriging is usually used when the variation of an attribute and/or the density of
sample points is such that simple methods of interpolation may give unreliable
predictions.
Kriging is based on the notion that the spatial change of a variable can be
described as a function of the distance between points. It is similar to IDW
interpolation, in that the surrounding values are weighted to derive a value
for an unmeasured location.
However, the kriging method also looks at the overall spatial arrangement of
the measured points and the spatial correlation between their values, to derive
values for an unmeasured location.

The first step in the kriging procedure is to compare successive pairs of point
measurements to generate a semi-variogram.
In the second step, the semi-variogram is used to calculate the weights used in
interpolation. Although kriging is a powerful technique, it
should not be applied without a good understanding of geostatistics, including
the principle of spatial autocorrelation.
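
The first step can be illustrated with an empirical semi-variogram. The notes give no formula, so the sketch assumes the commonly used estimator: for each distance class, half the mean squared difference between the values of all point pairs in that class. The function name, the binning by `lag_width`, and the sample points are illustrative choices.

```python
import math
from collections import defaultdict

def empirical_semivariogram(samples, lag_width):
    """Return {bin index: semivariance}.

    samples:   list of (x, y, value)
    lag_width: width of a distance class; bin k holds pairs whose
               separation lies in [k * lag_width, (k + 1) * lag_width)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            lag = int(math.hypot(xi - xj, yi - yj) // lag_width)
            sums[lag] += (zi - zj) ** 2
            counts[lag] += 1
    return {lag: sums[lag] / (2 * counts[lag]) for lag in sorted(counts)}

# Three hypothetical samples along a line, one unit apart.
line = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0)]
print(empirical_semivariogram(line, lag_width=1))  # {1: 1.25, 2: 4.5}
```

Semivariance that grows with distance, as in this toy example, is the signature of spatial autocorrelation; fitting a model to these values and deriving the kriging weights is beyond this sketch.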

Sample Questions:

1) Write a short note on geographic coordinate system.

2) What is datum transformation and projected coordinate system?
3) What is a map projection? How are map projections classified?
4) Write a short note on satellite based positioning.
5) What are the different types of positioning?
6) What are the errors in absolute positioning related to medium?
7) What are the errors in absolute positioning related to space segment?
8) Explain Relative positioning and Network positioning.
9) Describe the following positioning technologies:
(a) GPS
(b) GLONASS
(c) Galileo
10) Write a short note on Direct spatial data capture.
11) Explain Indirect spatial data capture with digitizing, scanning and vectorization
and Metadata.
12) Explain the following aspects of data quality:
a) Accuracy and Positioning
b) Positional accuracy
c) Attribute accuracy
d) Temporal accuracy
13) Explain how data from multiple sources can be combined.
14) Write a short note on Point Data Transformation.
15) Explain the following interpolation techniques for continuous data:
a) Trend surface fitting
b) Triangulation
c) Kriging

Geographic Information systems Unit 4

Unit-IV
Spatial Data Analysis

Contents

4.1 Spatial Data Analysis

4.2 Overlay Function
4.3 Neighborhood functions
4.4 Analysis: Network analysis, interpolation, terrain modeling
4.5 GIS and Application models
4.6 Error propagation in spatial data processing

4.1. SPATIAL DATA ANALYSIS:-

Spatial analysis or spatial statistics includes any of the formal techniques which
study entities using their topological, geometric, or geographic properties.
Spatial data refers to all types of data objects or elements that are present in a
geographical space or horizon. It enables the global finding and locating of
individuals or devices anywhere in the world. Spatial data is also known as
geospatial data, spatial information or geographic information.

4.1.1- CLASSIFICATION OF ANALYTICAL GIS CAPABILITIES:-

Following are the different ways to classify analytical functions of GIS:

Classification, retrieval, and measurement functions: All functions in
this category are performed on a single (vector or raster) data layer, often using
the associated attribute data.
Classification allows the assignment of features to a class on the basis of
attribute values or attribute ranges (definition of data patterns).
Retrieval functions allow the selective search of data. For example, we might
retrieve all agricultural fields where potato is grown.
Generalization is a function that joins different classes of objects with common
characteristics to a higher level.
Measurement functions allow the calculation of distances, lengths, or areas.

4.1.1.1- Measurement:

Geometric measurement on spatial features includes counting, distance and area size
computations. Measurements are categorized into two types, namely:

i) Measurements on vector data:

The primitives of vector data sets are point, (poly) line and polygon. Related
geometric measurements are location, length, distance and area size. Some
of these are geometric properties of a feature in isolation (location, length, area
size); others (distance) require two features to be identified.
The location property of a vector feature is always stored by the GIS: a single
coordinate pair for a point, or a list of pairs for a polyline or polygon boundary.
Length is a geometric property associated with polylines, by themselves, or in
their function as polygon boundary. It can be computed by the
GIS as the sum of lengths of the constituent line segments.
Area size is associated with polygon features. Again, it can be computed, but
usually is stored with the polygon as an extra attribute value. This speeds up
the computation of other functions that require area size values.
The distance between two point features p and q can be computed with the
Pythagorean distance formula:

$$\mathrm{dist}(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}$$

If one of the features is not a point, or both are not, we must be precise in
defining what we mean by their distance. All these cases can be summarized as
computation of the minimal distance between a location occupied by the first
and a location occupied by the second feature. This means that features that
intersect or meet, or where one contains the other, have a distance of 0.
Another geometric measurement used by the GIS is the minimal bounding box
computation. It applies to polyline and polygons, and determines the minimal
rectangle with sides parallel to the axes of the spatial reference system that
covers the feature.
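
The vector measurements described above can be sketched in plain Python. The function names are illustrative and do not belong to any GIS package; the area computation uses the shoelace formula, which the notes do not mention but which is a standard way to obtain the area of a simple polygon from its vertex list.

```python
import math

def distance(p, q):
    """Pythagorean distance between two points given as (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def polyline_length(vertices):
    """Sum of the lengths of the constituent line segments."""
    return sum(distance(a, b) for a, b in zip(vertices, vertices[1:]))

def polygon_area(ring):
    """Shoelace formula; ring lists the boundary vertices once, in order."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]))
    return abs(twice_area) / 2

def bounding_box(vertices):
    """Minimal axis-parallel rectangle as (x_min, y_min, x_max, y_max)."""
    xs, ys = zip(*vertices)
    return min(xs), min(ys), max(xs), max(ys)

print(distance((0, 0), (3, 4)))                       # 5.0
print(polyline_length([(0, 0), (3, 4), (3, 10)]))     # 11.0
print(polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)])) # 12.0
print(bounding_box([(1, 5), (4, 2), (3, 7)]))         # (1, 2, 4, 7)
```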

ii) Measurements on raster data:

Measurements on raster data layers are simpler because of the regularity of the
cells. The area size of a cell is constant, and is determined by the cell
resolution. Horizontal and vertical resolution may differ, but typically do not.
The location of a cell is computed from the anchor point of the raster, the cell
resolution, and the position of the cell in the raster.

The area size of a selected part of the raster (a group of cells) is calculated as
the number of cells multiplied by the cell area size. The distance between two
raster cells is the standard distance function applied to the locations of their
respective mid-points, taking into account the cell resolution. Where a
raster is used to represent line features as strings of cells through the raster, the
length of a line feature is computed as the sum of distances between
consecutive cells.
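
These raster measurements reduce to a few lines of arithmetic. In the sketch below, cells are addressed as (row, column) pairs and the cell resolution is passed explicitly; the names and the 30 m and 10 m resolutions are examples only.

```python
import math

def selection_area(cells, cell_width, cell_height):
    """Area of a group of cells: number of cells times the cell area."""
    return len(cells) * cell_width * cell_height

def cell_distance(a, b, cell_width, cell_height):
    """Distance between the mid-points of cells a and b, given as (row, col)."""
    (r1, c1), (r2, c2) = a, b
    return math.hypot((c2 - c1) * cell_width, (r2 - r1) * cell_height)

def line_length(cells, cell_width, cell_height):
    """Length of a line stored as a string of consecutive cells."""
    return sum(cell_distance(a, b, cell_width, cell_height)
               for a, b in zip(cells, cells[1:]))

print(selection_area({(0, 0), (0, 1), (1, 1)}, 30, 30))  # 2700
print(cell_distance((0, 0), (3, 4), 10, 10))             # 50.0
print(line_length([(0, 0), (0, 1), (1, 2)], 10, 10))     # one straight and one diagonal step
```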

4.1.1.2- Spatial selection queries

When exploring a spatial data set, the first thing one usually wants is to select certain
features, to (temporarily) restrict the exploration. Such selections can be made on
geometric/spatial grounds, or on the basis of attribute data associated with the spatial
features. We discuss both techniques below.

Interactive spatial selection: In interactive spatial selection, one defines the
selection condition by pointing at or drawing spatial objects on the screen
display, after having indicated the spatial data layer(s) from which to select
features. The interactively defined objects are called the selection objects; they
can be points, lines, or polygons. The GIS then selects the
features in the indicated data layer(s) that overlap (i.e. intersect, meet, contain,
or are contained in) the selection objects.
Spatial selection by attribute conditions: It is also possible to select features
by using selection conditions on feature attributes. These conditions are
formulated in an SQL-like syntax, for example a spatial selection using the
attribute condition AREA < 400000.
Combining attribute conditions: When multiple criteria have to be used for
selection, we need to carefully express all of these in a single composite
condition. The tools for this come from a field of mathematical logic, known as
propositional calculus. Atomic conditions use a predicate symbol, such as <
(less than) or = (equals). Other possibilities are <= (less than or equal), >
(greater than), >= (greater than or equal) and <> (does not equal). Any of these
symbols is combined with an expression on the left and one on the right. For
instance, LandUse <> 80 can be used to select all areas with a land use class
different from 80. Atomic conditions can be combined into composite
conditions using logical connectives like AND, OR, NOT. Examples: Area <
400000 AND LandUse = 80, Area < 400000 OR LandUse = 80, NOT
(LandUse = 80)
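
The three composite conditions above can be tried out on a small attribute table. The sketch below is a plain Python stand-in for an SQL-like selection; the parcel records and the helper `select` are hypothetical and exist only for this example.

```python
# Hypothetical attribute table: one record per feature.
parcels = [
    {"id": 1, "Area": 250000, "LandUse": 80},
    {"id": 2, "Area": 250000, "LandUse": 30},
    {"id": 3, "Area": 900000, "LandUse": 80},
    {"id": 4, "Area": 900000, "LandUse": 30},
]

def select(features, condition):
    """Return the ids of all features for which the condition holds."""
    return [f["id"] for f in features if condition(f)]

# Area < 400000 AND LandUse = 80
and_ids = select(parcels, lambda f: f["Area"] < 400000 and f["LandUse"] == 80)
# Area < 400000 OR LandUse = 80
or_ids = select(parcels, lambda f: f["Area"] < 400000 or f["LandUse"] == 80)
# NOT (LandUse = 80)
not_ids = select(parcels, lambda f: not (f["LandUse"] == 80))

print(and_ids, or_ids, not_ids)
```

AND narrows the selection, OR widens it, and NOT returns the complement of the atomic condition.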
Spatial selection using topological relationships: It consists of two steps:
1. To select one or more features as the selection objects, and
2. To apply a chosen spatial relationship function to determine the
selected features that have that relationship with the selection objects.

Selecting features that are inside selection objects: This type of query uses
the containment relationship between spatial objects. Polygons can contain
polygons, lines or points, and lines can contain lines or points, but no other
containment relationships are possible.
Selecting features that intersect: The intersect operator identifies features that
are not disjoint.
Selecting features adjacent to selection objects: Adjacency is the meet
relationship. It expresses that features share boundaries, and therefore it applies
only to line and polygon features.
Selecting features based on their distance: One may also want to use the
distance function of the GIS as a tool in selecting features. Such selections can
be searches within a given distance from the selection objects, at a given
distance, or even beyond a given distance.
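
A distance-based selection on point features might look like the sketch below. The feature names, coordinates and the `mode` parameter are assumptions for illustration; a real GIS would also handle line and polygon features.

```python
import math

def select_by_distance(features, selection_point, distance, mode="within"):
    """Select point features within or beyond a distance from a selection point."""
    selected = []
    for name, (x, y) in features.items():
        d = math.hypot(x - selection_point[0], y - selection_point[1])
        if (mode == "within" and d <= distance) or (mode == "beyond" and d > distance):
            selected.append(name)
    return selected

# Hypothetical wells with (x, y) coordinates in map units.
wells = {"w1": (0, 0), "w2": (3, 4), "w3": (10, 0)}

print(select_by_distance(wells, (0, 0), 5, mode="within"))
print(select_by_distance(wells, (0, 0), 5, mode="beyond"))
```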

4.1.1.3- Classification

Classification is a technique of explicitly removing detail from an input data
set, in the hope of revealing important patterns (of spatial distribution). In the
process, we produce an output data set, so that the input set can be left intact.
We do so by assigning a characteristic value to each element in the input set, which
is usually a collection of spatial features that can be raster cells or points, lines or
polygons. If the number of characteristic values is small in comparison to the size
of the input set, we have classified the input set.
As an example, consider the distribution of household income in a city, where
household income is the classification parameter. If we know for each ward in
the city the associated average income, we have many different values.
Subsequently, we could define five different categories (or classes) of income,
each with its own range of values.

At times an input data set may itself have been the result of a classification; in
such a case we call the operation a reclassification. For example, we may have a
soil map that shows different soil type units and we would like to show the
suitability of units for a specific crop.
In this case, it is better to assign to the soil units an attribute of
suitability for the crop. Since different soil types may have the same crop
suitability, a reclassification may merge soil units of different type into the
same category of crop suitability.
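
The soil example can be sketched with a simple lookup table. The soil types, suitability classes and unit identifiers below are invented for illustration only.

```python
# Hypothetical classification table: soil type -> crop suitability class.
suitability_table = {
    "clay": "low",
    "sand": "low",
    "loam": "high",
    "silt loam": "high",
    "peat": "medium",
}

# Hypothetical soil units of the input map: unit id -> soil type.
soil_units = {"u1": "clay", "u2": "loam", "u3": "silt loam", "u4": "sand", "u5": "peat"}

def reclassify(units, table):
    """Assign each unit the class that the table gives for its original value."""
    return {unit_id: table[value] for unit_id, value in units.items()}

suitability = reclassify(soil_units, suitability_table)
print(suitability)
print(len(set(soil_units.values())), "soil types reduced to",
      len(set(suitability.values())), "suitability classes")
```

Units u2 and u3 have different soil types but end up in the same suitability class, so neighboring units of this kind could be merged in the output.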

84
Geographic Information systems Unit 4

Types of Classification: There are two types.

1. User-controlled Classification: In user-controlled classification, a user selects the
attribute(s) that will be used as the classification parameter(s) and defines the
classification method. The latter involves declaring the number of classes as well as the
correspondence between the old attribute values and the new classes. This is usually done
via a classification table.
2. Automatic Classification:
User-controlled classifications require a classification table or user interaction.
GIS software can also perform automatic classification, in which a user only
specifies the number of classes in the output data set. The system automatically
determines the class break points. Two main techniques of determining break
points are in use.
Equal interval technique: The minimum and maximum values vMIN and
vMAX of the classification parameter are determined and the (constant)
interval size for each category is calculated as (vMAX - vMIN) / n, where n is
the number of classes chosen by the user. This classification is useful in
revealing the distribution patterns as it determines the number of features in
each category.
Equal frequency technique: This technique is also known as quantile
classification. The objective is to create categories with roughly equal numbers
of features per category. The total number of features is determined first and,
dividing it by the required number of categories, the number of features per
category is calculated. The class break points are then determined by counting off
the features in order of classification parameter value.
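
Both techniques can be sketched as follows. The function names and the sample values are assumptions for the example, and the equal frequency version uses one simple way of counting off features; real packages may place quantile breaks slightly differently.

```python
from bisect import bisect_left

def equal_interval_breaks(values, n):
    """Break points at constant steps of (vMAX - vMIN) / n."""
    v_min, v_max = min(values), max(values)
    step = (v_max - v_min) / n
    return [v_min + i * step for i in range(1, n)]

def equal_frequency_breaks(values, n):
    """Break points found by counting off the sorted values in equal-sized groups."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n - 1] for i in range(1, n)]

def class_counts(values, breaks):
    """Number of values per class; a value equal to a break falls in the lower class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect_left(breaks, v)] += 1
    return counts

incomes = [1, 2, 3, 4, 10, 20, 30, 100]     # hypothetical ward averages

ei = equal_interval_breaks(incomes, 4)
ef = equal_frequency_breaks(incomes, 4)
print(ei, class_counts(incomes, ei))
print(ef, class_counts(incomes, ef))
```

With skewed data such as this, equal intervals leave most features in the first class, while equal frequency puts the same number of features in each class at the cost of very unequal class widths.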

4.2 OVERLAY FUNCTIONS:

An overlay operation combines the geometries and attributes of two feature layers
to create an output layer. The geometry of the output represents the geometric
intersection of features from the input layers. The following are common overlay
operations.

Fig (1.1)
Point-in-polygon: in the vector model, this overlay determines the points lying
inside a specific polygon. In the example, we are looking for all hotels that are
located in the settlement areas. In the resulting layer, the points have the additional
information whether or not they are in the settlement area. In the raster model, the
points in question are visible through the addition of the two input layers.
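
In the vector model, a point-in-polygon test is commonly implemented with the ray casting (even-odd) rule. The sketch below assumes a simple polygon given as a list of (x, y) vertices; the hotel names and coordinates are hypothetical, and points lying exactly on the boundary are not treated specially.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd rule: count how often a ray going right from (x, y) crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

settlement = [(0, 0), (10, 0), (10, 10), (0, 10)]          # hypothetical polygon
hotels = {"Hotel A": (5, 5), "Hotel B": (15, 5), "Hotel C": (2, 9)}

# The output layer keeps the points and adds the in/out information as an attribute.
result = {name: point_in_polygon(x, y, settlement) for name, (x, y) in hotels.items()}
print(result)
```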

Fig (1.2)
Line-in-polygon: the overlay of lines and polygons is more complex. The example
shows the calculation of road sections located in the settlement area. In the vector
model, the topology changes: the original line is cut into shorter segments by the
intersection points. It has to be specified for each segment whether it is inside or
outside the settlement polygon. In the raster model, a simple addition identifies the
interest areas.

Fig (1.3)

Polygon-on-polygon: in the vector model, the most complex case is the
intersection of polygons. The result is a data layer with a whole new topology. The
overlay of polygon boundaries results in a variety of new intersections and polygons for
which the attributes have to be reassigned. In addition, for non-cut areas it has to
be checked whether they contain areas of another information layer, i.e. whether
island polygons were created. Raster overlay is very simple, however. Again, the
cell values of the input layers are combined by calculation.


Overlay Functions are classified into 2 types

4.2.1- VECTOR OVERLAYS

Feature overlays from vector data are created when one vector layer (points, lines,
or polygons) is merged with one or more other vector layers covering the same
area with points, lines, and/or polygons. A resultant new layer is created that
combines the geometry and the attributes of the input layers.

Polygon Overlay Functions: Various GIS software packages offer a variety of
polygon overlay tools.
a) Intersection, where the result includes all those polygon parts that occur in
both input layers and all other parts are excluded.
b) Union, where the result includes all those polygon parts that occur in either A
or B (or both), so is the sum of all the parts of both A and B
c) Subtract, also known as Difference or Erase, where the result includes only
those polygon parts that occur in one layer but not in another.
d) Symmetric Difference, also known as Exclusive or, which includes polygons
that occur in one of the layers but not both. It can be derived as either (A union
B) subtract (A intersect B), or (A subtract B) union (B subtract A). It is roughly
analogous to XOR in logic.
e) Identity covers the extent of one of the two layers, with the geometry and
attributes merged in the area where they overlap. It can be derived as (A subtract
B) union (A intersect B).
f) Clip contains the same overall extent as the intersection, but only retains the
geometry and attributes of one of the input layers.
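
The extents produced by these operators can be illustrated with Python sets, treating each polygon as the set of grid cells it covers. This is only an analogy for the geometry: it ignores attributes and true vector topology, and the two square areas A and B are made up for the example.

```python
# Two overlapping 3 x 3 areas, each described by the (row, col) cells it covers.
A = {(r, c) for r in range(0, 3) for c in range(0, 3)}
B = {(r, c) for r in range(1, 4) for c in range(1, 4)}

intersection = A & B              # parts that occur in both layers
union = A | B                     # parts that occur in either layer
subtract = A - B                  # parts of A that are not in B
symmetric_difference = A ^ B      # parts in exactly one of the layers
identity_extent = (A - B) | (A & B)   # extent of Identity with A as input layer

print(len(intersection), len(union), len(subtract), len(symmetric_difference))

# The derivations given in the list above hold for the set analogy as well.
print(symmetric_difference == (A | B) - (A & B))
print(symmetric_difference == (A - B) | (B - A))
print(identity_extent == A)
```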


4.2.2- RASTER OVERLAYS

Raster overlay involves two or more different sets of data that derive from a
common grid. The separate sets of data are usually given numerical values. These
values then are mathematically merged together to create a new set of values for a
single output layer. In raster overlay by addition, for example, two input
rasters are added together to create an output raster with the values for each cell
summed.

This approach is often used to rank attribute values by suitability or risk and then
add them, to produce an overall rank for each cell. The various layers can also be
assigned a relative importance to create a weighted ranking (the ranks in each layer
are multiplied by that layer's weight value before being summed with the other
layers).
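
A minimal sketch of raster overlay by addition, with optional layer weights, is shown below. The rank rasters and the weights are invented for the example; both rasters are assumed to share the same grid.

```python
def overlay_sum(rasters, weights=None):
    """Cell-by-cell (weighted) sum of rasters that share a common grid."""
    if weights is None:
        weights = [1] * len(rasters)
    rows, cols = len(rasters[0]), len(rasters[0][0])
    return [
        [sum(w * r[i][j] for r, w in zip(rasters, weights)) for j in range(cols)]
        for i in range(rows)
    ]

slope_rank = [[1, 2], [3, 1]]   # hypothetical suitability ranks per cell
soil_rank = [[2, 2], [1, 3]]

print(overlay_sum([slope_rank, soil_rank]))                   # plain addition
print(overlay_sum([slope_rank, soil_rank], weights=[2, 1]))   # slope counts double
```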

4.3- Neighborhood functions

Overlays combine features at the same location; neighborhood functions evaluate
the characteristics of the area surrounding a feature's location and perform a
computation on it.
Interpolation functions predict unknown values using the known values at nearby
locations. This typically occurs for continuous fields, like elevation, when the data
actually stored does not provide the direct answer for the location(s) of interest.
Topographic functions determine characteristics of an area by looking at the
immediate neighborhood as well. Typical examples are slope computations on
digital terrain models (i.e. continuous spatial fields). The slope in a location is
defined as the plane tangent to the topography in that location.
Various computations can be performed, such as:
determination of slope angle,
determination of slope aspect,
determination of slope length.
The principle here is to find out the characteristics of the vicinity, here called
neighborhood, of a location. To perform neighborhood analysis, we must:
a. State which target locations are of interest to us, and define their spatial
extent.
Example: Our target might be a medical clinic.

b. Define how to determine the neighborhood for each target.
Example: Its neighborhood could be defined as an area within 2 km travel
distance, all other clinics within 10 minutes travel time, or all residential
areas for which the clinic is the closest clinic.

c. Define which characteristic(s) must be computed for each neighborhood.
Example: The total population of the area, or the average household income.

4.3.1 Proximity Computation :-

In proximity computations, we use geometric distance to define the
neighborhood of one or more target locations. The most common and useful
technique is buffer zone generation. Another technique based on geometric
distance is Thiessen polygon generation.

Buffer zone generation (or buffering) is one of the best known
neighborhood functions. It determines a spatial envelope (buffer) around
given feature(s). The created buffer may have a fixed width, or a variable
width that depends on characteristics of the area.

Thus, the principle of buffer zone generation is simple: we
select one or more target locations, and then determine the area around them
within a certain distance.
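
On a raster, a fixed-width buffer can be sketched as the set of cells whose mid-points lie within the buffer distance of any target cell. The brute-force approach and the names below are assumptions chosen for clarity, not how a production GIS would implement it.

```python
import math

def buffer_cells(targets, distance, rows, cols, cell_size):
    """Cells whose mid-point lies within `distance` of the mid-point of a target cell."""
    result = set()
    for r in range(rows):
        for c in range(cols):
            if any(math.hypot(r - tr, c - tc) * cell_size <= distance
                   for tr, tc in targets):
                result.add((r, c))
    return result

# One target in the middle of a 5 x 5 raster with 10 m cells, buffered by 10 m.
zone = buffer_cells([(2, 2)], distance=10, rows=5, cols=5, cell_size=10)
print(sorted(zone))
```

With several targets the zones are merged automatically, which corresponds to a buffer with dissolved boundaries.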

Types of buffering:

A) Data Type
Point Buffering: It involves the creation of a circular polygon about the point
of interest. Buffer distance, in this case the radius of the polygon, can either be
the same for each point or be defined by the user using an attribute table or
lookup table. Example: An area covered by a mobile tower.
Polyline Buffering: A polygon is created at a specific width on a side or around
a line or its segment. Example: An area to be affected by construction of a canal.
Polygon Buffering: Buffering can be used to create either an inner buffer (inside
the polygonal surface) or an outer buffer (outside the polygonal
surface). Example: No honking zone around hospitals.

B) Width
Varying: Buffer distance around a feature varies for different segments of the
feature. Example: An area along the banks of a river depending on the
intensity of the adjacent land use.
Single: A buffer zone with just one concerned parameter is created. Example:
Area owned by somebody along a border.
Multiple: Creation of concentric zones around the feature of interest. Example:
An area affected by radiation from a nuclear plant. A zone may also expand
more in one direction than the other. Example: An area inundated if a dam
breaks.
Other: Buffer zones often have dissolved boundaries so that there are no
overlapping areas between the buffer zones. In some cases though, it may also
be useful for boundaries of buffer zones to remain intact, so that each buffer
zone is a separate polygon and the overlapping areas can be identified.


2) Thiessen Polygons:

Thiessen polygons (otherwise known as Voronoi polygons or Voronoi diagrams) are
an essential method for the analysis of proximity and neighborhood. Thiessen
polygons (cf. figure below on the right side) are used to allocate space to the nearest
point feature. Each polygon defines an area around a point, where every location is
nearer to this point than to any other point (2D). Such structures can also be
generated in higher dimensions, where they are called Thiessen polyhedra or
Voronoi polyhedra.

Inputs: A feature layer (Point, Polyline, Polygon)

Outputs: New feature class.

Procedure:
The process goes through following steps:

1. Collects the points from a point layer (vertices if the source is a polyline or
polygon layer)
2. Cleans duplicate points
3. Generates the convex hull
4. Creates a TIN structure
5. Generates perpendicular bisectors for each TIN edge
6. Builds the Thiessen polygons
7. Clips the Thiessen polygons feature class with the convex hull

Application Area:

Thiessen polygons are used to generate soil maps based on irregularly distributed
sample points. The border between two soil types is assumed to lie halfway
between two sample points that exhibit different soil types.

Raster Thiessen Polygons:

In the raster data model, polygon raster zones are created. These zones show the locations
that are closest to a given point (in this case points are represented by raster cells).
Compared to vector data models, raster data models have the advantage that the
metric space can be chosen and weighting factors etc. can be included in the
calculation.
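
A raster version of Thiessen zones can be sketched by assigning every cell to its nearest point, with the metric as a parameter. The point names, the two metrics offered and the tie-breaking rule (the first point listed wins) are assumptions of this example.

```python
def thiessen_raster(points, rows, cols, metric="euclidean"):
    """Label each cell with the name of the nearest point (points: name -> (row, col))."""
    def dist(a, b):
        dr, dc = abs(a[0] - b[0]), abs(a[1] - b[1])
        if metric == "manhattan":
            return dr + dc
        return (dr * dr + dc * dc) ** 0.5

    return [
        [min(points, key=lambda name: dist((r, c), points[name])) for c in range(cols)]
        for r in range(rows)
    ]

samples = {"A": (0, 0), "B": (0, 4)}
zones = thiessen_raster(samples, rows=1, cols=5)
print(zones)   # the middle cell is equally far from A and B and goes to A
```

Swapping the metric, or multiplying each distance by a per-point weight, changes the shape of the zones without changing the overall procedure.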

4.3.2 Computation of diffusion :-

The determination of the neighborhood of one or more target locations may
depend not only on distance but also on direction and on differences in
the terrain in different directions. This typically is the case when the target
location contains a source material that spreads over time, which is referred to
as diffusion.
Diffusion computation involves one or more target locations, which are better
called source locations. They are the locations of the source of whatever spreads.
The computation also involves a local resistance raster, which for each cell provides
a value that indicates how difficult it is for the source material to pass through
that cell.
The value in the cell must be normalized, i.e. valid for a standardized length of
spread path (usually the cell width). From the source
location(s) and the local resistance raster, the GIS will be able to compute a
new raster that indicates how much minimal total resistance the spread has
encountered in reaching each raster cell.
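
One way to compute such a minimal total resistance raster is a shortest-path search (Dijkstra's algorithm) over the grid. The sketch below makes several simplifying assumptions that are not from the text: spread moves only between the four edge-sharing neighbors, and the cost of a step is the average resistance of the two cells involved.

```python
import heapq

def accumulated_resistance(resistance, sources):
    """Minimal total resistance from any source cell to every cell of the raster."""
    rows, cols = len(resistance), len(resistance[0])
    best = [[float("inf")] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        best[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        cost, r, c = heapq.heappop(heap)
        if cost > best[r][c]:
            continue                      # outdated queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                step = (resistance[r][c] + resistance[nr][nc]) / 2
                if cost + step < best[nr][nc]:
                    best[nr][nc] = cost + step
                    heapq.heappush(heap, (cost + step, nr, nc))
    return best

# Hypothetical resistance raster with one hard-to-cross cell in the middle.
resistance = [[1, 1, 1],
              [1, 9, 1],
              [1, 1, 1]]
total = accumulated_resistance(resistance, sources=[(0, 0)])
for row in total:
    print(row)
```

The spread reaches the far corner by going around the high-resistance cell, so the corner ends up with a lower total than the middle cell even though it is farther from the source.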

4.3.3 FLOW COMPUTATION

Diffusion computations determine how a phenomenon spreads over the area, in principle
in all directions, though with varying difficulty or resistance. There are also cases
where a phenomenon does not spread in all directions, but moves or flows along a
given, least-cost path, determined again by local terrain characteristics. The typical
case arises when we want to determine the drainage patterns in a catchment: the
rainwater follows the terrain downhill on its way out of the area.
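
A common way to approximate such flow on an elevation raster is to let water move from each cell to the neighbor with the steepest downward gradient (often called the D8 approach). The sketch below is a simplified illustration with a made-up elevation grid; it assumes square cells of unit size and stops at the first cell that has no lower neighbor.

```python
import math

def flow_path(elevation, start):
    """Follow the steepest downhill neighbor from `start` until no neighbor is lower."""
    rows, cols = len(elevation), len(elevation[0])
    r, c = start
    path = [start]
    while True:
        best_drop, best_cell = 0.0, None
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Drop per unit distance; diagonal neighbors are sqrt(2) away.
                drop = (elevation[r][c] - elevation[nr][nc]) / math.hypot(dr, dc)
                if drop > best_drop:
                    best_drop, best_cell = drop, (nr, nc)
        if best_cell is None:
            return path                   # local minimum or outlet reached
        r, c = best_cell
        path.append(best_cell)

elevation = [[9, 8, 7],
             [8, 6, 5],
             [7, 5, 1]]
print(flow_path(elevation, (0, 0)))
```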

4.3.4- Raster based surface analysis

Surfaces represent phenomena that have values at every point across their
extent. The values at the infinite number of points across the surface are
derived from a limited set of sample values. These may be based on direct
measurement, such as height values for an elevation surface, or temperature
values for a temperature surface; between these measured locations values are
assigned to the surface by interpolation.
Surfaces can be represented using contour or iso-lines, arrays of points, TINs,
and rasters; however, most surface analysis in GIS is done on raster or TIN
data.
Rasters are rectangular arrays of cells (or pixels), each of which stores a value
for the part of the surface it covers. A given cell contains a single value, so the
amount of detail that can be represented for the surface is limited to the size of
the raster cells.
Rasters are the most commonly used surface models in ArcGIS. The simplicity
of the raster data structure makes calculations on rasters (or comparisons
between rasters) faster than for other surface representations. Rasters are
also used to store imagery, scanned maps, and categorical information, such as
land use class, which is often derived from imagery.
Surface analysis involves several kinds of processing, including extracting new
surfaces from existing surfaces, reclassifying surfaces, and combining surfaces.
Certain tools extract or derive information from a surface, a combination of
surfaces, or surfaces and vector data.

Terrain analysis tools

Some of these tools are primarily designed for the analysis of raster terrain
surfaces. These include Slope, Aspect, and Hillshade.
The Slope tool calculates the maximum rate of change from a cell to its neighbors,
which is typically used to indicate the steepness of terrain.

The Aspect tool calculates the direction in which the plane fitted to the slope faces
for each cell. The aspect of a surface typically affects the amount of sunlight it
receives (as does the slope); in northern latitudes, places with a southerly aspect
tend to be warmer and drier than places that have a northerly aspect.

The Hillshade tool calculates the intensity of lighting on a surface given a light
source at a particular location; it can also model which parts of a surface would be
shadowed by other parts.
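
To make slope and aspect concrete, the sketch below estimates both for an interior cell using simple central differences. This is an assumption made for brevity and is not the exact algorithm of any particular software tool; row 0 of the grid is taken to be the northern edge, and aspect is reported as a compass bearing of the downslope direction.

```python
import math

def slope_aspect(z, row, col, cell_size):
    """Slope (degrees) and aspect (compass degrees, 0 = north) of an interior cell."""
    dz_dx = (z[row][col + 1] - z[row][col - 1]) / (2 * cell_size)   # change towards east
    dz_dy = (z[row - 1][col] - z[row + 1][col]) / (2 * cell_size)   # change towards north
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return slope, None                # flat cell: aspect is undefined
    # The surface faces the direction in which it descends most steeply.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360
    return slope, aspect

# Hypothetical terrain rising towards the north, 10 m cells.
z = [[20, 20, 20],
     [10, 10, 10],
     [0, 0, 0]]
print(slope_aspect(z, 1, 1, cell_size=10))
```

The terrain rises 10 m per 10 m towards the north, so the cell has a 45 degree slope and faces south.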

Surfaces are continuous data, such as elevation, rainfall, pollution
concentration, and water tables. This data can be represented as a continuous
surface. When computations are required on continuous data, the most common
technique applied is filtering. The principle of filtering is quite similar to that of
moving window averaging: for each cell, the system performs some
computation, and assigns the result of this computation to the cell in the output.
The difference with moving window averaging is that the moving window
in filtering is itself a little raster, which contains cell values that
are used in the computation for the output cell value. This little raster is a filter,
also known as a kernel, which may be square (such as a 3x3 kernel), but it does
not have to be. The values in the filter are used as weight factors.
Example: Let us consider a 3 × 3 cell filter, in which all values
are equal to 1. The use of this filter means that the nine cells considered are
given equal weight in the computation of the filtering step. Let the input raster
cell values, for the current filtering step, be denoted by rij and the
corresponding filter values by wij. The output value for the cell under
consideration will be computed as the sum of the weighted input values divided
by the sum of weights:

output = Σ (wij · rij) / Σ wij, with both sums taken over all filter cells i, j.
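
The filtering step described above can be sketched as follows. The example assumes a square kernel with an odd side length and a non-zero sum of weights, and it simply copies border cells unchanged; these choices are assumptions for the illustration.

```python
def apply_filter(raster, kernel):
    """Weighted average of each interior cell's window: sum(w * r) / sum(w)."""
    rows, cols = len(raster), len(raster[0])
    size = len(kernel)
    k = size // 2
    weight_sum = sum(sum(row) for row in kernel)
    out = [row[:] for row in raster]      # border cells are copied unchanged
    for r in range(k, rows - k):
        for c in range(k, cols - k):
            total = sum(
                kernel[i][j] * raster[r - k + i][c - k + j]
                for i in range(size)
                for j in range(size)
            )
            out[r][c] = total / weight_sum
    return out

raster = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
ones = [[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]]
print(apply_filter(raster, ones))
```

With all weights equal to 1 the filter reduces to a plain moving-window average; unequal weights let nearby cells count more than distant ones.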


4.4 ANALYSIS: NETWORK ANALYSIS, INTERPOLATION, TERRAIN MODELING :-

A network is a connected set of lines, representing some geographic
phenomenon, typically of the transportation type. Topological properties of
networks are connectivity, adjacency, and incidence. These properties
serve as a basis for analysis. Simple examples of networks in GIS are
streets, power lines, and city centerlines.
Network analysis can be performed on either raster or vector data layers, but
it is more commonly done on the latter, as line features can be associated
with a network, and hence can be assigned typical transportation characteristics
such as capacity and cost per unit.
GIS networks consist of interconnected lines (known as edges) and
intersections (known as junctions) that represent routes upon which people,
goods, etc. can travel. The object traversing the network follows the edges, and
junctions appear where at least two edges intersect. For example, a road network
can have speed limits attached to the edges, and a junction can prevent left
turns.
Networks are either directed, in which only one direction of travel is allowed
along each edge, or undirected, in which any direction of travel is allowed.

Types of networks that can be modeled in GIS:

GIS can be used to analyze movement within networks. These networks include:

1. Utility networks: including water mains, sewage lines, and electrical
circuits. These networks are generally directed.
2. Transportation networks: including roads, railroads, and flight paths.
These networks are generally undirected.
Various classical spatial analysis functions on networks are supported by GIS
software packages.


1. Optimal path finding:

It generates a least-cost path on a network between a pair of predefined locations
using both geometric and attribute data.

Optimal path finding techniques are used when a least-cost path between two
nodes in a network must be found. The two nodes are called origin and
destination, respectively. The aim is to find a sequence of connected lines to
traverse from the origin to the destination at the lowest possible cost.

The cost function can be defined simply as the total length of all lines on the path.
It can also take into account not only the length of the lines, but also their
capacity, maximum transmission (travel) rate and other line characteristics, for
instance to obtain a reasonable approximation of travel time. There can even be
cases in which the nodes visited add to the cost of the path as well. These may be
called turning costs.

Problems related to optimal path finding are ordered optimal path finding and
unordered optimal path finding. Both have an extra requirement that a number of
additional nodes needs to be visited along the path. In ordered optimal path
finding, the sequence in which these extra nodes are visited matters; in unordered
optimal path finding it does not.
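
Basic optimal path finding between an origin and a destination is commonly solved with Dijkstra's algorithm. The sketch below assumes an undirected network with non-negative line costs and no turning costs; the node names and costs are hypothetical.

```python
import heapq

def least_cost_path(edges, origin, destination):
    """Dijkstra's algorithm on an undirected network given as (node, node, cost) lines."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {origin: 0}
    previous = {}
    heap = [(0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue                      # outdated queue entry
        for neighbor, line_cost in graph.get(node, []):
            new_cost = cost + line_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if destination not in best:
        return None, float("inf")         # destination cannot be reached
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]

# Hypothetical road network; costs could be lengths or travel times.
roads = [("A", "B", 4), ("A", "C", 2), ("C", "B", 1), ("B", "D", 5), ("C", "D", 8)]
print(least_cost_path(roads, "A", "D"))
```

The direct-looking route A-B-D costs 9, but the detour through C costs only 8, which is why the cost function rather than the number of lines decides the result.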

2. Network partitioning

It assigns network elements (nodes or line segments) to different locations using
predefined criteria.
In network partitioning, the purpose is to assign lines and/or nodes of the network,
in a mutually exclusive way, to a number of target locations. Typically, the target
locations play the role of service centre for the network. This may be any type
of service: medical treatment, education, water supply. This type
of network partitioning is known as a network allocation problem.
In network allocation, we have a number of target locations that function as
resource centers, and the problem is which part of the network to exclusively
assign to which service centre.
Another problem is trace analysis. Trace analysis is performed when we want to
understand which part of a network is conditionally connected to a chosen node
on the network, known as the trace origin. For a node or line to be conditionally
connected, it means that a path exists from the node/line to the trace origin, and
that the connecting path fulfills the conditions set.

Tracing is the computation that the GIS performs to find the paths from the trace
origin that obey the tracing conditions. It is a rather useful function for many
network-related problems.
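
Trace analysis can be sketched as a breadth-first search that only follows lines satisfying the tracing condition. The pipe network and its "open" attribute below are hypothetical and serve only to illustrate conditional connectivity.

```python
from collections import deque

def trace(edges, origin, condition):
    """Nodes connected to `origin` via lines whose attributes satisfy `condition`."""
    graph = {}
    for a, b, attrs in edges:
        if condition(attrs):              # lines failing the condition block the trace
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    reached = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached

# Hypothetical water pipes; a closed valve interrupts the connection.
pipes = [
    ("P1", "P2", {"open": True}),
    ("P2", "P3", {"open": False}),
    ("P2", "P4", {"open": True}),
    ("P3", "P5", {"open": True}),
]
print(sorted(trace(pipes, "P1", lambda attrs: attrs["open"])))
```

P3 and P5 are physically part of the network but are not conditionally connected to P1, because the only path to them passes through a closed pipe.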

4.5 GIS AND APPLICATION MODELS:-

Modeling and GIS are more or less inseparable, as GIS is itself a tool for modeling
the real world.
The solution to a (spatial) problem usually depends on a (large) number of
parameters. Since these parameters are often interrelated, their interaction is
made more precise in an application model.
Many kinds of application models exist, and they can be classified in many
different ways. Here we identify five characteristics of GIS-based application
models:
1. The purpose of the model: It refers to whether the model is descriptive,
prescriptive or predictive in nature. Descriptive models attempt to answer the
"what is" question. Prescriptive models usually answer the "what should be"
question by determining the best solution from a given set of conditions.
Predictive models focus on the "what is likely to be" question, and predict
outcomes based upon a set of input conditions.

2. The methodology underlying the model: It refers to the operational components
of the model. Stochastic models use statistical or probability functions to
represent random or semi-random behavior of phenomena. In contrast,
deterministic models are based upon a well-defined cause and effect
relationship.
3. The scale at which the model works: It refers to whether the components of the
model are individual or aggregate in nature. Essentially, this is the level
at which the model operates. Individual-based models are based on individual
entities, such as agent-based models (which attempt to model the movement and
development of multiple interacting agents, which might represent
individuals, often using sets of decision rules about what the agent can and
cannot do). Aggregate models deal with grouped data, such as
population census data.

4. Its dimensionality: It refers to whether the model includes spatial,
temporal or spatial and temporal dimensions.
5. Its implementation logic: It refers to how the model uses existing
theory or knowledge to create new knowledge. Deductive
approaches use knowledge of the overall situation in order to predict
outcome conditions. Inductive approaches, on the other hand, are
less straightforward, in that they try to generalize (often based upon
samples of a specific data set) in order to derive more general
models.

4.6 - ERROR PROPAGATION IN SPATIAL DATA PROCESSING :-

A number of sources of error may be present in source data. It is important
to note that the acquisition of base data to a high standard of quality still does
not guarantee that the results of further, complex processing can be treated with
certainty. As the number of processing steps increases, it becomes difficult to
predict the behavior of error propagation.

These various errors may affect the outcome of spatial data manipulations. In
addition, further errors may be introduced during the various processing steps.
One of the most commonly applied operations in geographic information
systems is analysis by overlaying two or more spatial data layers. Each such
layer will contain errors, due to both inherent inaccuracies in the source data
and errors arising from some form of computer processing, for example,
rasterization.
During the process of spatial overlay, all the errors in the individual data layers
contribute to the final error of the output. The amount of error in the output
depends on the type of overlay operation applied. For example, errors in the
results of overlay using the logical operator AND are not the same as those
created using the OR operator. Following are some common sources of error
introduced into GIS analyses.


Sample Questions

1. Explain measurement in spatial data analysis.

2. What are different spatial selection queries?
3. Explain Overlay Functions with its types.
4. Write a short note on Network Analysis.
5. Discuss GIS applications in detail.
6. What is error in spatial data? Explain how errors propagate and how to
quantify them.

Geographic Information Systems Unit 5

Unit-V
Data Visualization

Contents

5.1 GIS and Map?

5.2 Visualization Process
5.3 Visualization Strategies
5.4 Cartographical Toolbox
5.5 How to Map
5.6 Map Cosmetics
5.7 Map Dissemination

5.1 GIS AND MAP?

Definition: A map is a representation or abstraction of geographic reality, and a
tool for presenting geographic information in a way that is visual, digital or tactile.

Geographic reality refers to the object of study; representation and abstraction
refer to models of these phenomena. Maps are the most effective and
efficient means to transfer spatial information. Maps can be used as inputs to
a GIS and play an important role as a component of GIS.
A map puts the data in a spatial context. It also informs about the thematic
attributes of geographic objects located on the map.
Maps can deal with queries related to the basic components of spatial or
geographic data, which are location (geometry), characteristics (thematic
attributes) and time, and their combinations.
Locations are displayed using different objects. Characteristics of a location are
represented using different colors and signs. On-screen maps provide features
with interactive links to the database embedded in them.
A map simplifies reality by providing an abstract view of complicated
details, and at the same time presents the remaining information in a clear
perspective when compared to an aerial view of the same area captured by a
satellite.
The map, on the other hand, focuses on the relevant details by classification,
omission and selection of features. The effectiveness of a map for a given
purpose is related to the map scale. The map scale is the ratio between a
distance on the map and the corresponding distance in reality. Maps that
show the details of a small area are called large-scale maps.
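
The scale ratio can be applied directly to convert a distance measured on the map into a distance in reality. The function name and the sample scales below are chosen only for illustration.

```python
def ground_distance_m(map_distance_cm, scale_denominator):
    """Real-world distance in metres for a map distance in cm at scale 1:denominator."""
    return map_distance_cm * scale_denominator / 100

# 2 cm on a 1:50,000 map versus 2 cm on a 1:10,000 map.
print(ground_distance_m(2, 50000))
print(ground_distance_m(2, 10000))
```

The 1:10,000 map is the larger-scale map of the two: the same 2 cm covers a shorter real distance, so a smaller area is shown in more detail.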


5.1.1.- TOPOGRAPHIC AND THEMATIC MAPS:-

Topographic Map : Represents the Earth's surface as accurately as possible

Topographic maps are used to represent detailed information about a location,
including natural and human-made phenomena. Three-dimensional features are
represented using contour lines.
They may include forest land, infrastructure, land use, relief, hydrology,
geographic names and reference grids. The designs are based on conventions and
a set of symbols using visual variables.
E.g.: use of blue to represent water.

Thematic Map : Represents the distribution of particular themes

Maps designed specifically to highlight the distribution of a particular
phenomenon are called thematic maps. They are the most common type of map
produced in GIS.
A map showing the population of a place is an example of a socio-economic
theme, while a map showing a drainage area is an example of a physical theme.
Designs are based on cartographic rules.
Both types of map are stored in the database as separate layers. Each layer
contains details and the user is able to switch between the layers.

5.2- VISUALIZATION PROCESS :-

Definition: The visualization process is considered to be the translation or
conversion of spatial data from a database into graphics.

Data visualization refers to the techniques or methods of representation involved
in the process of visually communicating data to others. Data can be visually
communicated in many ways, ranging from a simple table of numbers to
complex and highly sophisticated charts and interactive graphics.
Cartographic methods and techniques are applied in the process of
visualization. They allow for optimal design and production of maps, depending
upon the application. To enable the translation from spatial
data into graphics, the data in the database is assumed to be well structured.
Visualization can be created during any phase of the spatial data handling process.
It can be a simple or a complex process. The environment selected for
visualization can vary considerably. It can be a personal computer or a
computer on a network. Visualization should also be tested on the grounds of
effectiveness, which is based on the following factors:
o Generalization : meaningful reduction of content during scale reduction
o Design approach : topographic or thematic
o Type of data : quantitative or qualitative
Tools are available to visualize the data, consisting of functions, rules, habits or
conventions, and algorithms to classify the data or to smooth a polyline.

5.3- VISUALIZATION STRATEGIES : PRESENT OR EXPLORE

Visual Communication
The main function of map is to communicate geographic information. Inform
the map users about the location and nature of geographic phenomena and
spatial pattern
The main goals of cartographers is to produce a map with cartographic tool
which will be able to convey the main purpose of map.

Visual Thinking process

GIS has spread so widely into day-to-day life that even a simple Excel sheet
has mapping capabilities. The visual thinking process is accelerated by
continued developments in hardware and software.
Dynamic presentation and user interaction are possible using media such as
DVD-ROMs and the WWW.

Visual Data Mining

Data has now become abundant in many sectors of the geoinformation world.
Users' expectation of immediate, real-time access to data can now be met
thanks to visual data mining. A Geographical Information System (GIS)
stores data collected from heterogeneous sources in varied formats in the form
of geodatabases representing spatial features.
A new branch of science, currently evolving to deal with the problem of
data abundance in the geo-disciplines, is called visual data mining.

Interaction and Dynamic

Toolboxes have been developed with functionality based on two key words:
interaction and dynamics. A separate discipline called scientific visualization
also has an important impact on cartography. It offers users the possibility of
interacting with the map in real time.
Interaction with the map will stimulate t
add more functionality to the map.

Geovisualization
It refers to the set of tools and techniques supporting the analysis of
geospatial data through the use of interactive visualization. Map-based
scientific visualization is also known as geovisualization.
It covers both the presentation and the exploration functions of the map.
Presentation is described as public visual communication, since it addresses a
wide audience. Exploration is described as private visual thinking, because it
is about an individual manipulating the data. Exploration requires expertise
and knowledge in dealing with data. To explore means to search for spatial,
temporal or spatio-temporal patterns, relationships or trends.
Visual Communication Process

Geovisualization emphasizes knowledge construction combined with
human understanding, allowing data exploration and decision-making
processes.
In the case of patterns, aspects such as the distribution of a phenomenon, the
occurrence of anomalies, and the sequence of appearance and disappearance can
be considered.
In the case of relationships, factors such as change in vegetation indices,
climatic pressure, the location of deprived urban areas and the distance to
educational facilities are considered.
In terms of trends, developments in the distribution and frequency of
landslides are considered.
Paper maps have a dual function. They act as a database of the objects and
communicate information about these objects.

location and visual variable to show features attributes. Omission of
information can also be done deliberately to emphasize the remaining
information.

5.4 CARTOGRAPHICAL TOOLBOX:

5.4.1: What kind of data do I have? : Cartographic data analysis

Cartographic data analysis is done to derive proper symbols for the map.
The first step in the analysis process is to find a common denominator for all
the data, which will be used as the title of the map.
The second step is to describe the individual components; their description
should be included in the legend.

Types of Data

Qualitative (nominal or categorical data)

o Describes different kinds of categories with no ordering in the data.
Relatively easy to map: it is not difficult to find 12 or 15 distinctive
hues for a map, and if the map requires more symbols, pattern or text can be
added to hue to create more symbols.
o E.g. land use types and soil types
Quantitative (ordinal, interval and ratio data)
o Data are classified based on disciplinary insight, such as a soil
classification. Quantitative data receives more attention than qualitative
data. The map reader should easily perceive the progression from low to high
values. Interval and ratio data are measured along an interval or a ratio
scale; ordinal data only ranks the values.
Nominal or categorical data: describes the different categories. It does not
imply any ordering and does not support mathematical or statistical analysis
beyond counting. E.g. land use, soil types
Ordinal data: differentiates data on the basis of a ranking relationship. It
implies a definite ordering and supports only a limited set of statistical
procedures, such as maximum and minimum. E.g. soil erosion can be ranked from
severe to moderate
Interval data: has a known, constant interval between values. It supports
addition, subtraction and most statistics, but not meaningful ratios, because
its zero point is arbitrary. E.g. a temperature reading of 70 degrees F is
warmer than one of 60 degrees F by 10 degrees
Ratio data: adds a further condition, a meaningful or absolute zero value, so
that ratios between values are meaningful. E.g. population density
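
The difference between the four measurement scales can be made concrete by
looking at which operations are meaningful on each. The following is a small
sketch only; all sample values and names (including the extra erosion rank
"slight") are hypothetical.

```python
from statistics import mean, mode

# Hypothetical sample values, one list per measurement scale.
land_use = ["forest", "urban", "forest", "water"]        # nominal
erosion = ["slight", "moderate", "severe", "moderate"]   # ordinal
temperature_f = [60.0, 70.0, 65.0]                       # interval
population_density = [50.0, 200.0, 100.0]                # ratio

# Nominal: only counting is meaningful, e.g. the most frequent category.
most_common_land_use = mode(land_use)

# Ordinal: ranking is meaningful, so minimum and maximum can be found.
EROSION_RANK = {"slight": 0, "moderate": 1, "severe": 2}
worst_erosion = max(erosion, key=EROSION_RANK.get)

# Interval: differences and means are meaningful, ratios are not.
temperature_difference = temperature_f[1] - temperature_f[0]
mean_temperature = mean(temperature_f)

# Ratio: an absolute zero exists, so "four times as dense" is meaningful.
density_ratio = population_density[1] / population_density[0]

print(most_common_land_use, worst_erosion)
print(temperature_difference, mean_temperature, density_ratio)
```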

5.4.2 : How can I map my data : Symbology

Different graphical techniques are used to present geographic information.

The basic elements of maps are points, lines, areas and text. The appearance of
these elements varies depending on the nature of the features.
Points can vary in form or color to represent location, or they can vary in
size to represent aggregated values.
Lines can vary in color to distinguish between, for example, boundaries and
rivers, or in shape to differentiate between railways and roads.
Areas vary in color to distinguish between, for example, different vegetation
stands.

Visual variables :

Variations in symbol appearance are grouped into six categories called
visual variables: size, value (lightness), texture, color, orientation and
shape. Visual variables are used to make one symbol different from another.

5.5. HOW TO MAP

How to map qualitative data?

Readability is influenced by the number of displayed geographic units. Each
unit on the map should get equal importance, and none should stand out above
the others.
The selection and application of color can be of great use, since color
differentiates between the geographic units. If the units are of equal
importance, the colors selected should have equal visual weight or brightness.
Readability also depends on the number of geographic units spread across the
area, which should not become cluttered.
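
One way to obtain colors that differ in hue but not in weight is to keep
lightness and saturation fixed and vary only the hue. The sketch below uses
Python's standard colorsys module; the function name and the default
lightness and saturation are assumptions. Note that HLS lightness only
roughly approximates perceived brightness (a yellow still looks lighter than
a blue), so the result should be checked visually.

```python
import colorsys


def qualitative_palette(n_classes, lightness=0.6, saturation=0.7):
    """Return n_classes hex colors with evenly spaced hues and equal HLS lightness."""
    colors = []
    for i in range(n_classes):
        hue = i / n_classes  # evenly spaced around the color wheel
        red, green, blue = colorsys.hls_to_rgb(hue, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(red * 255), round(green * 255), round(blue * 255)))
    return colors


# Five hypothetical land use classes, one color each.
palette = qualitative_palette(5)
print(palette)
```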

How to map quantitative data?

The map should offer an overview of the geographic distribution of the
phenomenon. The symbols used should have quantitative perception properties,
for which symbols should vary in size.
Mapping counts (absolute values) should be done carefully, because the values
may be influenced by other factors and could yield a misleading map. The aim of
the map should be to give an overview of the distribution.
For example, when making a map showing the total sales figures of a product by
state, the total sales figure is likely to reflect the differences in
population among the states.
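
The sales example can be sketched in a few lines: dividing each count by the
population gives a rate that is comparable between states. The state names
and all figures below are hypothetical, as is the function name.

```python
# Hypothetical total sales (units) and population per state.
sales = {"State A": 50_000, "State B": 12_000, "State C": 9_000}
population = {"State A": 10_000_000, "State B": 1_500_000, "State C": 3_000_000}


def per_capita(counts, totals, per=1000):
    """Normalize raw counts to a rate per `per` inhabitants."""
    return {name: counts[name] * per / totals[name] for name in counts}


sales_per_1000 = per_capita(sales, population)

# The raw count and the normalized rate point to different states.
highest_total = max(sales, key=sales.get)
highest_rate = max(sales_per_1000, key=sales_per_1000.get)
print(sales_per_1000)
print(highest_total, highest_rate)
```

In this made-up data the state with the largest total is simply the most
populous one, while the normalized rate highlights a different state.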

How to map the terrain elevation?

Terrain can be mapped using different methods. One method is to collect
elevation data and mark it on the map as text.
In a contour map, points of equal elevation are connected. The space between
the contours can be filled with different colors following a standard
convention. This technique is known as hypsometric or layer tinting.
Shaded relief can be created using shading to give a three-dimensional effect
on the map. Interactive functions are needed to manipulate maps in
three-dimensional space. These interactive functions include panning, zooming,
rotating and scaling.
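
Hypsometric tinting amounts to looking up the elevation band that a height
falls into and returning the color of that band. The following is a minimal
sketch; the band limits and tint names are assumptions chosen for
illustration, not a standard color convention.

```python
from bisect import bisect_right

# Hypothetical band limits in metres: below 200, 200-499, 500-999,
# 1000-1999, and 2000 or higher.
BAND_LIMITS_M = [200, 500, 1000, 2000]
TINTS = ["green", "yellow", "orange", "brown", "white"]


def hypsometric_tint(elevation_m):
    """Return the tint of the elevation band that contains elevation_m."""
    return TINTS[bisect_right(BAND_LIMITS_M, elevation_m)]


for height in (150, 200, 750, 2500):
    print(height, hypsometric_tint(height))
```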

How to map time series?

Time-dependent data is now part of GIS routines, owing to the increase in the
amount of data captured at different points in time. Studying the changes
caused by real-world processes needs data at different time slices. Mapping
time means mapping change.
E.g. change in boundaries or coastal areas over the years. Urban boundaries
expand and shift as urban areas grow into rural parts.
Symbols are used to make the changes perceivable. Arrows are one such symbol:
they represent change from origin to destination and can indicate the
magnitude of change.

Temporal cartographic technique

Changes can be depicted using symbols of varying sizes or other visual
variables. The following cartographic techniques are used to represent
temporal data:
Single static map
o Graphic variables and symbols are used to indicate change or an
event. E.g. the visual variable value is used to represent the age of
built-up areas
Series of static maps
o Each map in the series represents one moment in time; together the maps show
the process of change. Change is represented by depicting the situation in
successive snapshots. The temporal sequence is represented by a spatial
sequence. The number of images should be limited, because it is difficult for
the human eye to follow a long series
Animated map
o Change is perceived to happen in a single image by displaying several
snapshots one after the other, like successive frames. A temporal
component is added to the map, displaying change in some dimension
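
Whichever technique is chosen, the change between time slices has to be
derived first. The sketch below compares successive snapshots of a built-up
area stored as sets of grid cell identifiers; the years, cell names and
function name are hypothetical.

```python
# Hypothetical snapshots: the set of built-up grid cells in each year.
snapshots = {
    1990: {"A", "B"},
    2000: {"A", "B", "C"},
    2010: {"A", "C", "D", "E"},
}


def changes_between_snapshots(time_slices):
    """List the cells gained and lost between each pair of successive years."""
    years = sorted(time_slices)
    changes = []
    for earlier, later in zip(years, years[1:]):
        gained = sorted(time_slices[later] - time_slices[earlier])
        lost = sorted(time_slices[earlier] - time_slices[later])
        changes.append((earlier, later, gained, lost))
    return changes


for earlier, later, gained, lost in changes_between_snapshots(snapshots):
    print(earlier, later, "gained:", gained, "lost:", lost)
```

The gained and lost cells could then be shown with distinct symbols on a
single static map, or each year could become one frame of a series or an
animation.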

5.6 MAP COSMETICS:

5.6.1 :- Fundamental requirement

The title immediately gives viewers a description of the subject matter of the
map. Each map should have a title informing users about the topic visualized.
Additionally, marginal information such as a scale indicator, a north arrow for
orientation and the map projection should be included in the map.
Information about the publisher of the map and the date of issue forms the
metadata, which helps users to determine the quality of the map.

5.6.2 :- Space constraints

Maps presented on screen often go without marginal information because of
space constraints. On-screen maps are interactive: clicking on the map reveals
more information from the database. Legends and titles are available on demand
too.

5.6.3 :- Text
Text is used to convey information. Text can be combined with the visual
variable color to specify different objects, such as blue for water bodies. The
positioning of text with respect to the object it refers to also matters. Text
elements in the map body are also known as labels. Interactive editing is also
provided to improve the final map.
Point features should have their names to the upper right. Names of line
features are placed in blocks, parallel to the course of the line feature. Area
features have the name inside the feature, with a size dependent on its extent.

Interactive labelling: if the placement does not work well, the label can be
moved immediately
Dynamic labelling: the computer takes over the labelling task using
built-in algorithms

5.6.4 :- Contrast and visual hierarchy

Visual hierarchy is the process of developing a visual plan to introduce a 3D
effect, or depth, to maps using the technique of contrast. Mapmakers create the
visual hierarchy by placing map elements at different visual levels.
The most important element should be at the top of the hierarchy and should
appear closest to the map reader. The least important element should be at the
bottom.
The concept of visual hierarchy is an extension of the figure-ground
relationship in visual perception. The figure is visually more important,
appears closer to the viewer, has form and shape, has more impressive color,
and has meaning. The ground is the background.
The principle of interposition is applied to objects to obtain the depth
effect. Interposition uses the incomplete outline of an object to make it
appear as though it is behind another. E.g. continents on a map look more
important, or occupy a higher level in the visual hierarchy, if the lines of
longitude and latitude stop at the coast.

5.7 MAP DISSEMINATION :

The output medium also plays an important role in map design.

5.7.1 :- On Screen map

Compared to paper maps, on-screen maps have to be smaller. Their content
should be carefully selected. On-screen maps offer a lot of additional
functionality, such as links to databases.
E.g. hiding the legend when it is not needed

5.7.2 :- Multimedia maps

Multimedia can be integrated in the map. An atlas like Encarta World Atlas is a
good example of maps with multimedia embedded in them: hovering the mouse over
a location displays its flag. A map can also be used as an interface to
additional information. A geographic location can be linked with text, sound,
photographs, etc. Maps can also be used to view spatial data products from a
clearinghouse.

5.7.3 :- Maps as visual interfaces

Internet technology is becoming a medium for making data available to users
spread across the earth, for decision making and software application
development.
The World Wide Web is a common medium for presenting and displaying spatial
data. Thanks to the internet and its worldwide spread, maps can be
multifunctional and can function as an interface to additional information.
Geo-webservices work as an intermediary between browsers and data.

5.7.4 :- Static Maps

Many static maps are view-only. Tourist maps are created using static methods.
This form serves the main purpose of giving users a preview of the services
available in an organization. Zooming, panning and hyperlinks can also be added
to static maps.

5.7.5 :- Dynamic Maps

Dynamic maps are also known as clickable maps and also serve as an interface
to spatial data. Clicking on the map displays additional information in the
form of audio or video. The user can also select the layers to be viewed and
visualization parameters such as colors or symbols. Dynamic maps are about
change in one or more of their components.
An animated GIF is an example of a dynamic map.

5.7.6 :- Interactive Maps

Interactive mapping uses GIS to display data on a map. Interactive maps
improve the display when presenting large amounts of complex data. Browser
plug-ins define the interaction options, which include pause, backward play and
forward play.
The internet also provides the facility to display three-dimensional maps using
the Virtual Reality Modeling Language (VRML). Each map should have a title
informing users about the topic visualized.

Sample Questions:

1. Compare Topographic and Thematic Maps.
2. Describe the process of Visualization in detail.
3. Write a short note on Map Dissemination.
4. Explain how to map quantitative and qualitative data.
5. Write a short note on the different types of data.
