
YIRGAALEM SORSA (Msc) FUNDAMENTALS

OF GIS [GeES3082]

Unit one: Introduction to GIS

1. Introduction

A Geographic Information System, abbreviated as GIS, is an emerging technology for storing, managing,
analyzing and modeling geographic data. The term GIS emerged with the Canada Geographic Information
System (CGIS), developed in the mid-1960s for an agricultural agency. GIS evolved from automated
cartography in response to the need to manage and analyze growing quantities of spatial data. The field of
geographic information systems is concerned with the description, explanation, and prediction of patterns
and processes at geographic scales. GIS is a science, a technology, a discipline, and an applied
problem-solving methodology.
1.1. Definitions and concepts of GIS

A Geographic Information System (GIS) is a tool for making and using spatial information.
It uses the power of the computer to pose and answer geographic questions.

There is no single clear-cut definition of GIS. Different people define it according to its capabilities and
the purpose for which it is applied. Some definitions are:
• GIS is a computer system for the input, manipulation, storage and output of digital spatial data
within a particular organization (Clark, 1986).
• GIS is a powerful tool set for collecting, storing, retrieving, transforming and displaying
spatial data from the real world (Burrough, 1989).
• “An organized collection of computer hardware, software, geographic data, and personnel
designed to efficiently capture, store, update, manipulate, analyze, and display all forms of
geographically referenced data.” (Understanding GIS, 1997)

Elements of GIS
GIS combines three words: Geographical, Information and Systems.
Geographical – the ‘spatial key’, or location of features, is central to data handling, analysis
and reporting, which sets GIS apart from other database management systems.
This part of GIS explains "spatially" where things are, such as the location of nations, states,
counties, cities, schools, roads, rivers and lakes. Spatially means where on the
earth's surface an object or feature is located. This can be as simple as the latitude and longitude of a
feature. The geographic feature or object can be anything of interest.

Department of Geography and Environmental Studies

Information – without data and information GIS has no role to play, and good-quality data are
critical if the results of analysis are to be reliable.
GIS information is the "data" or "attribute" information about the specific features we are interested in,
such as the name of a feature, what the feature is, where it is located, and any other relevant details.
An example could be the name of a city, where it is located, how big it is (its area), its current
population, its population in the past, and any other information that is important.
Geographical Information is different from other kinds of information and therefore requires special
methods to be analyzed. Here are some of the characteristics that make geographical information special:
• Multidimensional – at least two coordinates must be specified to define a location
• Voluminous – a geographic database can easily reach a terabyte in size
• Different representations are possible, and the choice of representation can strongly influence
the ease of analysis and the end results
• Requires projection to flat surface
• Requires unique analysis methods
• Analyses require data integration
• Data updates are expensive and time consuming
Systems – at a basic level they are computer-based systems, but it is important to remember that GIS are
rarely personal technology, so an understanding of how organizations manage data and use information is
critical to understanding and achieving effective use of GIS.
The system in GIS is the computer and the software that is written to help people analyze the data, look at
the data and combine it in various ways to show relationships or to create geographic models. A GIS can
be made up of a variety of software and hardware tools, as long as they are integrated to provide a
functional geographic data processing tool.

1.2. Components of GIS (Exit exam part)

GIS is a technology that integrates powerful database capabilities with the unique visual perspective of
the map. It involves information about the real world represented by points, lines, areas and images, at
any scale from local to global. GIS operates on two kinds of data: spatial and non-spatial
(attribute) data. Generally speaking, GIS has five components: hardware, software, data, methods
and people. GIS is therefore not only about software, nor only about computers.

Some definitions of GIS focus on the hardware, software, data and analysis components. However, no
GIS exists in isolation from its organizational context, and there must always be people to plan,
implement and operate the system, as well as to make decisions based on its output.


A GIS is an organised collection of:

The Hardware components

The central processing unit (CPU) forms the backbone of the GIS hardware. Other components include the
scanner, digitizer board, printer, plotter, and storage devices. All of these components are connected to
the CPU and are discussed below.

The hardware of a GIS is composed of input devices, processing and storage devices, and output devices.

Input devices
Digital data input depends on the type of data to be used. Imagery can be input from analogue
images by means of image scanners. Digital airborne and space-borne systems already use
charge-coupled device (CCD) sensors to supply the data in digital form.


Processing and storage devices


Processing and storage devices consist of the central processing unit (CPU), the main memory, the
external storage devices and the user interface. The CPU's arithmetic unit performs algebraic and logical
operations on the data. Its control unit regulates the data transfer between the arithmetic unit and the
main memory. The main memory (random access memory, or RAM) holds the machine programs and
accepts data with short access times, using caching if required.
Output devices
Output devices include the ports to printers. The following graphic output facilities are specific to GIS.
Vector devices are flat-bed plotters and drum plotters. Flat-bed plotters, operated with a pen or a light
beam, have an accuracy of about 0.05 mm at a speed of about 30 m/min. Drum plotters are less accurate
but faster (300–900 m/min). They are used for verification plots.
Raster devices permit the output of halftones in a pixel or a screened manner. They are able to print RGB
or CMYK colors in different saturations. They can combine vector and raster data in raster form. To print
halftones, the dithering technique is used, in which printer pixels are combined into halftone cells. For
example, a 600 dpi output has a 150 dpi halftone cell for a 4-bit radiometric resolution.
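The halftone figures quoted above can be checked with a little arithmetic. The sketch below assumes a square halftone cell containing 2^bits printer dots (so 4 bits gives 16 dots, a 4 x 4 cell); the function name is hypothetical and the square-cell model is a simplification, not a description of any particular printer.

```python
import math


def halftone_cell_resolution(printer_dpi: int, bits: int) -> float:
    """Return halftone cells per inch for a printer resolution and bit depth.

    Assumption: each cell is a square block of 2**bits printer dots,
    e.g. 4 bits -> 16 dots -> a 4 x 4 cell.
    """
    dots_per_cell = 2 ** bits
    cell_side = math.isqrt(dots_per_cell)
    if cell_side * cell_side != dots_per_cell:
        raise ValueError("bits must give a square number of dots per cell")
    return printer_dpi / cell_side


# 600 dpi printer, 4-bit radiometric resolution
print(halftone_cell_resolution(600, 4))
```

With a 600 dpi device and 4 bits, each cell is 4 dots wide, so 600 / 4 gives the 150 cells per inch stated in the text.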
Main Components of hardware
i. Scanner – an input device that converts a picture in analogue format into a digital image for further
processing. The output of a scanner can be stored in many formats, e.g. TIFF, BMP, JPG.
ii. Digitizer – an input device used for vectorisation (the process of converting raster data into vector
format) of map objects. Features on either a paper map or a digital map can be traced selectively
using a digitizer.
iii. Printers and plotters – the most common output devices in a GIS hardware setup.
iv. Storage devices – hardware designed to store information. Computers use two types of storage
device: primary storage and secondary storage. A primary storage device holds data for short periods
of time, for example computer RAM. A secondary storage device holds information until it is deleted or
overwritten, for example a floppy disk drive or a hard disk drive.

Software components
Software that is used to create, manage, analyze and visualize geographic data, i.e. data with a reference
to a place on earth, is usually denoted by the umbrella term ‘GIS software’. Typical applications for GIS
software include the evaluation of places for the location of new stores, the management of power and gas
lines, the creation of maps, the analysis of past crimes for crime prevention, route calculations for
transport tasks, the management of forests, parks and infrastructure such as roads and waterways, as well
as applications in risk analysis of natural hazards, and emergency planning and response.
For this multitude of applications different types of GIS functions are required and different categories of
GIS software exist, which provide a particular set of functions needed to fulfill certain data management
tasks. We will first explain important GIS software concepts, then list the typical tasks accomplished with
GIS software, describe different GIS software categories, and finally provide information on software
producers and projects.
Types of software required for GIS:
A. Basic computer software
The operation of a computer is based on its operating system, which ensures that all parts of the computer
work together. The most common are Microsoft’s operating systems for PCs. In MS-DOS (Microsoft
Disk Operating System) the computer is operated through typed text commands, which permits the
administering of files by name. More modern are the Windows operating systems, such as Windows 3.1,
Windows 95, Windows 98, Windows NT, Windows 2000, Windows ME and Windows XP, which use graphic
symbols (icons). Windows acts as a graphical user interface (GUI) and is a network-compatible system.

B. GIS application software


Building on an operating system, augmented by additional programming tools and standards, various
vendors (ESRI, Intergraph, Siemens and many others) have developed GIS software packages.
GIS software provides the tools to manage, analyze, and effectively display and disseminate spatial data
and spatial information. The main functions of GIS software are analytical functions that provide the means
for deriving new geo-information from existing spatial and attribute data. Websites of some of the major
international GIS vendors are:
Vendor           Country      Website
Bentley          USA          www.bentley.com
Caris            Canada       www.caris.com
ER Mapper        USA          www.ermapper.com
Erdas            USA          www.erdas.com
ESRI             USA          www.esri.com
Genasys          Australia    www.genasys.com
GE-Smallworld    UK           www.smallworld-us.com
Idrisi           USA          www.idrisi.clarku.edu
Intergraph       USA          www.intergraph.com
MapInfo          USA          www.mapinfo.com
PCI Geomatics    Canada       www.pci.on.ca
Sicad            Germany      www.sicad.de

Data:
Perhaps the most important component of a GIS is the data. A GIS can integrate spatial data with other
existing data resources, often stored in a corporate DBMS. The integration of spatial data (often
proprietary to the GIS software), and tabular data stored in a DBMS is a key functionality afforded by
GIS.
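The integration of spatial and tabular data described above can be sketched as a simple key-based join. Everything in this example is hypothetical: the feature identifiers, coordinates, attribute names and the `join_attributes` helper are illustrative only, and a real GIS would perform the join inside its DBMS rather than in Python dictionaries.

```python
# Hypothetical spatial layer: feature id -> geometry (x, y)
spatial_layer = {
    101: {"geometry": (38.5, 7.0)},
    102: {"geometry": (38.8, 9.0)},
}

# Hypothetical attribute table, as it might be exported from a DBMS
attribute_table = [
    {"feature_id": 101, "name": "Town A", "population": 120000},
    {"feature_id": 102, "name": "Town B", "population": 450000},
]


def join_attributes(layer, table, key="feature_id"):
    """Attach each table row to the spatial feature that shares its key."""
    joined = {fid: dict(feature) for fid, feature in layer.items()}
    for row in table:
        fid = row[key]
        if fid in joined:  # rows without a matching feature are skipped
            joined[fid].update({k: v for k, v in row.items() if k != key})
    return joined


features = join_attributes(spatial_layer, attribute_table)
print(features[101])
```

After the join, each feature carries both its location and its attributes, which is what allows a question such as "where are the towns with more than 200,000 people?" to be answered.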
Like all useful data, geographic data is expected to possess desirable properties of accuracy, timeliness,
comprehensiveness, acceptable cost etc. Other general issues relating to geographic data include spatial
extent (the area covered), scale (the detail in the system), the large volume (both attribute data and
graphic data can make large storage demands), diversity (data of interest plus background data),
collection cost (despite technological advances, field collection of data can still be very labor intensive),
etc. Scale is important not only for graphic representation in map form but also as it impacts on other
issues such as map coverage extent, data volume and data collection.
The major sources of geographic information are maps, aerial photographs, remotely sensed imagery and
digital datasets available from various vendors. Today, in most developed countries, mapping agencies
place declining emphasis on the production of printed maps, as geographic information collection shifts
to remote sensing or to the use of GPS for field data collection. Increasingly, GPS and GIS are
integrated for field data collection.
Methods:
A successful GIS operates according to a well-designed implementation plan and business rules, which
are the models and operating practices unique to each organization.
• Procedures include how the data will be retrieved, input into the system, stored, managed,
transformed, analyzed, and finally presented as output.
• The procedures are the steps taken to answer the questions that need to be resolved. The ability of a
GIS to perform spatial analysis and answer these questions is what differentiates this type of
system from other information systems.

As in all organizations dealing with sophisticated technology, new tools can only be used effectively if
they are properly integrated into the entire business strategy and operation. To do this properly requires
not only the necessary investments in hardware and software, but also in the retraining and/or hiring of
personnel to utilize the new technology in the proper organizational context.


People: People refers to the users, who can be considered the component of GIS that actually makes the
GIS work. Effective use of GIS requires an organization that supports the various GIS activities.
People in GIS hold a wide range of positions, including GIS managers, database administrators,
application specialists, systems analysts, and programmers. They are responsible for maintaining the
geographic database and providing technical support. People also need to be educated to make decisions on
what type of system to use. People associated with a GIS can be categorized into viewers, general users,
and GIS specialists.

Network
The use of the WWW to give access to maps dates from 1993. The recent histories of GIS and the
Internet have been heavily intertwined: GIS has turned out to be a compelling application that has
prompted many people to take advantage of the Web. At the same time, GIS has benefited greatly from
adopting the Internet paradigm and the momentum that the Web has generated. Internet GIS applications
range from disseminating information, to selling goods and services, to generating revenue directly
through subscription services, to helping members of the public participate in important local, regional,
and national debates.

No.  Element of GIS   Details
1    Hardware         Types of computers: modest personal computers, high-performance
                      workstations, minicomputers
                      Input devices: scanners, digitizers, keyboard, graphic monitors
                      Output devices: plotter, printer
2    Software         Input modules, analysis modules, output visualization
3    Data             Attribute and spatial data; remote sensing data; global database
4    Method           Analysis, modeling and others
5    People           Trained professionals responsible for data entry, analysis…

1.3. GIS Subsystems

A GIS is a computer-based system that provides the following four subsystems to handle geo-referenced
data.
A. Data Input Subsystem
A Data Input subsystem allows the user to capture, collect, and transform spatial and thematic data into
digital form. The data inputs are usually derived from a combination of hard copy maps, aerial
photographs, remotely sensed images, reports, survey documents, etc.
B. Data management (Data Storage, Editing and Retrieval Subsystem)
The second necessary component for a GIS is the data storage and retrieval subsystem. The Data Storage
and retrieval subsystem organizes the data, spatial and attribute, in a form, which permits it to be quickly
retrieved by the user for analysis, and permits rapid and accurate updates to be made to the database. This
component usually involves use of a database management system (DBMS) for maintaining attribute
data. Spatial data is usually encoded and maintained in a proprietary file format.
• Organizing data for analysis: Most GIS software organizes spatial data thematically,
categorizing it into vertical layers.
• Editing and updating of data: Perhaps the primary function of the data storage and retrieval
subsystem is the editing and updating of data.
• Data retrieval and querying: The ability to retrieve data is based on the unique structure of the
DBMS, and command interfaces are commonly provided with the software.
C. Data Manipulation and Analysis Subsystem
The Data Manipulation and Analysis subsystem allows the user to define and execute spatial and
attribute procedures to generate derived information. This subsystem is commonly thought of as the
heart of a GIS, and usually distinguishes it from other database information systems and computer-aided
drafting (CAD) systems.
• Manipulation and Transformations of Spatial Data
The maintenance and transformation of spatial data concerns the ability to input, manipulate, and
transform data once it has been created. Some specific functions are:
– Coordinate thinning: the reduction of the number of coordinate pairs (X and Y) in arcs.


– Geometric Transformations
– Map Projection Transformations
– Edge Matching
– Interactive Graphic Editing
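Coordinate thinning, the first function in the list above, can be illustrated with a short sketch. The notes do not say which algorithm a GIS uses; the example below swaps in the widely used Douglas-Peucker method as one possible technique. The function names, the sample arc and the tolerance value are all hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along the segment, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def thin_arc(points, tolerance):
    """Douglas-Peucker thinning: drop vertices closer than tolerance to the trend line."""
    if len(points) < 3:
        return list(points)
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest vertex and thin the two halves recursively
    left = thin_arc(points[: index + 1], tolerance)
    right = thin_arc(points[index:], tolerance)
    return left[:-1] + right


arc = [(0, 0), (1, 0.05), (2, -0.04), (3, 0.02), (4, 3), (5, 3.01), (6, 3)]
thinned = thin_arc(arc, tolerance=0.1)
print(len(arc), "->", len(thinned), "vertices:", thinned)
```

The nearly collinear vertices are removed while the corners that define the shape of the arc are kept, which reduces storage without visibly changing the line at the chosen tolerance.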
• Analytical Functions in a GIS
The primitive analytical functions that must be provided by any GIS are:
– Retrieval, Reclassification, and Generalization
– Topological Overlay Techniques
– Neighborhood Operations
– Connectivity Functions
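Two of the analytical functions listed above, reclassification and overlay, can be sketched on small grids. This is a simplified raster analogue, assuming two hypothetical 3 x 3 layers that cover the same area cell for cell; topological overlay of vector polygons is considerably more involved. The layer values, the 10-degree slope threshold and the function names are illustrative only.

```python
# Hypothetical co-registered 3 x 3 layers covering the same area
slope_deg = [
    [2, 5, 12],
    [3, 8, 20],
    [1, 4, 15],
]
land_cover = [
    ["forest", "grass", "grass"],
    ["grass", "grass", "urban"],
    ["water", "grass", "forest"],
]


def reclassify(raster, rule):
    """Apply a reclassification rule to every cell of a raster."""
    return [[rule(cell) for cell in row] for row in raster]


def overlay_and(layer_a, layer_b):
    """Cell-by-cell overlay: 1 where both input layers are 1, otherwise 0."""
    return [
        [int(a and b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


gentle = reclassify(slope_deg, lambda s: int(s < 10))           # 1 = gentle slope
grass = reclassify(land_cover, lambda c: int(c == "grass"))     # 1 = grassland
suitable = overlay_and(gentle, grass)

for row in suitable:
    print(row)
```

Reclassification turns each input layer into a yes/no layer, and the overlay combines them so that only cells meeting both conditions remain marked.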
D. Data Output and Display Subsystem.
The Data Output subsystem allows the user to generate graphic displays, normally maps, and tabular
reports representing derived information products. This subsystem conveys the results of analysis to the
people who make decisions about resources. Wall maps and other graphics can be generated, allowing the
viewer to visualize and thereby understand the results of analyses or simulations of potential events.

1.4. Capabilities of GIS

So far GIS has been described in two ways:
• Through formal definitions, and
• Through the technology's ability to carry out spatial operations and link data sets together.


However, there is another way to describe GIS: by listing the types of questions the technology can (or
should be able to) answer. These concern locations, conditions, trends, patterns and modeling, and
include both spatial and non-spatial questions. There are five types of questions that a GIS can
answer:
i. Query for location: what is at………?
The first of these questions seeks to find out what exists at a particular location. A location can be
described in many ways, using, for example, a place name, postcode, or geographic reference such as
longitude/latitude or x/y coordinates. Mapped data primarily indicates where objects are located, but
cannot explain why. For example, an aerial photo may show that corn is growing vigorously in certain
sections of a field, but cannot explain why it does not grow well in other areas.
ii. Query for Condition: where is it…………?
The second question is the converse of the first and requires spatial data to answer. Frequently a GIS user
wants to discover where the mapped data meet certain conditions. That is, instead of
identifying what exists at a given location, one may wish to find the location(s) where certain conditions
are satisfied (e.g., an unforested section at least 2000 square meters in size, within 100 meters of a road,
and with soils suitable for supporting buildings).
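The condition query in the example above can be sketched as a filter over a table of candidate sites. The `Parcel` record, its field names and the sample values are hypothetical; in practice the area and the distance to the nearest road would be computed by the GIS from the geometry rather than stored by hand.

```python
from dataclasses import dataclass


@dataclass
class Parcel:
    parcel_id: str
    forested: bool
    area_m2: float
    distance_to_road_m: float
    soil_suitable: bool


# Hypothetical candidate sites
parcels = [
    Parcel("P1", forested=False, area_m2=2500, distance_to_road_m=60, soil_suitable=True),
    Parcel("P2", forested=True, area_m2=4000, distance_to_road_m=30, soil_suitable=True),
    Parcel("P3", forested=False, area_m2=1500, distance_to_road_m=20, soil_suitable=True),
    Parcel("P4", forested=False, area_m2=3000, distance_to_road_m=250, soil_suitable=True),
    Parcel("P5", forested=False, area_m2=2000, distance_to_road_m=100, soil_suitable=False),
]


def find_sites(candidates, min_area_m2=2000, max_road_distance_m=100):
    """Return the ids of parcels that satisfy every condition of the query."""
    return [
        p.parcel_id
        for p in candidates
        if not p.forested
        and p.area_m2 >= min_area_m2
        and p.distance_to_road_m <= max_road_distance_m
        and p.soil_suitable
    ]


print(find_sites(parcels))
```

Each parcel other than P1 fails exactly one condition, which shows how the query narrows the candidates by combining several criteria at once.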
iii. Trend analysis: what has changed since…………..?
The third question might involve both the first two and seeks to find the differences (e.g. in land use or
elevation) over time. This can help to address temporal changes of earth’s phenomena.
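A trend question of this kind can be sketched by comparing two land-use grids of the same area cell by cell. The grids, class names and the two dates implied by the variable names are hypothetical; the sketch assumes both layers are already aligned so that each cell covers the same ground in both.

```python
from collections import Counter

# Hypothetical land-use grids for the same area at two dates
land_use_before = [
    ["forest", "forest", "farm"],
    ["forest", "farm", "farm"],
]
land_use_after = [
    ["forest", "farm", "farm"],
    ["urban", "farm", "urban"],
]


def change_matrix(before, after):
    """Count cells by (old class, new class) wherever the class changed."""
    changes = Counter()
    for row_before, row_after in zip(before, after):
        for old, new in zip(row_before, row_after):
            if old != new:
                changes[(old, new)] += 1
    return changes


changes = change_matrix(land_use_before, land_use_after)
for (old, new), cells in sorted(changes.items()):
    print(f"{old} -> {new}: {cells} cell(s)")
```

The result answers "what has changed since…?" both in kind (which classes were converted) and in amount (how many cells).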
iv. Pattern analysis: what spatial patterns exist…………..?
This question is more sophisticated. One might ask it to determine whether landslides
mostly occur near streams. It might be just as important to know how many anomalies
do not fit the pattern and where they are located.
v. Modeling: what if……………..?
"What if…" questions are posed to determine what happens, for example, if a new road is added to a
network or if a toxic substance seeps into the local groundwater supply. Answering this type of question
requires both geographic and other information (as well as specific models). GIS permits such spatial
operations.
In addition to all these capabilities, GIS can also handle non-spatial questions. For instance,
"What's the average number of people working with GIS in each location?" is a non-spatial question: the
answer does not require the stored values of latitude and longitude, nor does it describe where the
places are in relation to each other.
1.5. Applications and Purposes of GIS


A day in our life with GIS illustrates the unprecedented frequency with which, directly or indirectly, we
interact with digital machines. Today, more and more individuals and organizations find themselves using
GIS to answer the fundamental question: where?
Why Study GIS?
• an estimated 80% of local government activities are geographically based
– plats, zoning, public works (streets, water supply, sewers), garbage collection, land
ownership and valuation, public safety (fire and police)
• a significant portion of state government has a geographical component
– natural resource management
– highways and transportation
• businesses use GIS for a very wide array of applications
– retail site selection & customer analysis
– logistics: vehicle tracking & routing
– natural resource exploration (petroleum, etc.)
– precision agriculture
– civil engineering and construction
• Military and defense
– Battlefield management
– Satellite imagery interpretation
• scientific research employs GIS
– geography, geology, botany
– anthropology, sociology, economics, political science
– Epidemiology, criminology
Generally, the application of geospatial sciences has spread very fast and wide over the past few decades.
There is, quite simply, a huge range of GIS applications. These
include topographic base mapping, socio-economic and environmental modeling, global (and
interplanetary!) modeling, and education. Applications generally set out to fulfill the five Ms of GIS:
mapping, measurement, monitoring, modeling, management.
General and specialized GIS systems have been designed for a variety of purposes:
• For environmental management and conservation.
• For defense and intelligence purposes.
• For governmental administration.
• For resource management in agriculture and forestry.
• For geophysical exploration.


• For cadastral management.


• For telecommunications.
• For utility management.
• For business applications.
• For construction projects.

Unit two: Spatial data and Geographic Information Systems (Exit exam part)

2.1 Spatial Data
2.1.1 Geographic phenomenon defined
2.1.2 Spatial Data Types
2.1.3 Geographic fields, geographic objects, and boundaries
2.2 Spatial Data Models
2.2.1 Vector Data Formats
2.2.2 Raster Data Formats
2.3 Advantages and Disadvantages of Vector and Raster Data
2.4 Attribute Data Models
2.5 Spatial Data Base Management
2.6 Topology
2.7 Data Accuracy and Quality

Spatial Data
Spatial data is also known as geospatial (coordinate) data or geographic information. It is the data or
information that identifies the geographic location of features and boundaries on Earth, such as natural or
constructed features, parcels, roads, buildings and more. In other words, it describes the absolute and
relative location of geographic or spatial features.
Geographic phenomena and features are infinite in number and have complex relationships with each other.
To represent geographic entities (also called geographic phenomena), one has to refer back to the
principle of interpolation. In addition, an entity or geographic feature occupies a position in space,
about which data describing the attributes of the entity and its geographic location are recorded.
A geographical entity is defined in terms of:
Location (spatial reference)
Dimensions
Attribute
Time
It is common in spatial analysis to refer to places as (spatial) objects.
Spatial data is data that can be mapped; it is usually stored as coordinates and topology. Spatial data use
Cartesian coordinate systems. A two-dimensional Cartesian coordinate system defines x and y axes in a
plane. A three-dimensional Cartesian system adds a z axis, orthogonal to both the x and y axes. The
origin is defined with zero values at the intersection of the orthogonal axes. Spatial data is often accessed,
manipulated or analyzed through GIS.

Example of Orthogonal Cartesian plane

• A coordinate system is a reference system used to measure horizontal and vertical distances on a
planimetric (flat surface) map.
• It is defined by:
– a map projection,
– a spheroid of reference,
– a datum,
– one or more standard parallels and a central meridian.
There are two kinds of coordinate system: the geographic coordinate system (GCS) and the projected
coordinate system (PCS).
Geographic Coordinate System
• A reference system that uses latitude and longitude to define the location of points on the surface of a
sphere or spheroid. Coordinates can be written as:
– decimal degrees (DD), e.g. -92.5
– degrees/minutes/seconds (DMS), e.g. 92° 30’ 00” W
• A geographic coordinate system uses a three-dimensional spherical surface to define locations
on the earth.
• Georeferencing is crucial to making aerial and satellite imagery, usually raster images, useful for
mapping, as it explains how other data, such as GPS points, relate to the imagery.
• Essential information may be contained in data or images that were produced at a different
point in time.
• It may be desirable to combine or compare such data with data that is currently available. Comparison
can be used to analyze changes in the features under study over a period of time.


• Different maps may use different projection systems. Georeferencing tools contain methods to
combine and overlay these maps with minimum distortion.
• Using georeferencing methods, data obtained from surveying tools such as total stations may be
given a point of reference from topographic maps that are already available.
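The two ways of writing geographic coordinates given under the geographic coordinate system, decimal degrees and degrees/minutes/seconds, are related by simple arithmetic: DD = degrees + minutes/60 + seconds/3600, with west and south taken as negative. The following sketch shows the conversion in both directions; the function names are hypothetical.

```python
def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere letter to decimal degrees."""
    dd = degrees + minutes / 60 + seconds / 3600
    # West longitudes and south latitudes are negative by convention
    return -dd if hemisphere in ("W", "S") else dd


def dd_to_dms(dd, is_longitude):
    """Convert decimal degrees to (degrees, minutes, seconds, hemisphere)."""
    if is_longitude:
        hemisphere = "E" if dd >= 0 else "W"
    else:
        hemisphere = "N" if dd >= 0 else "S"
    value = abs(dd)
    degrees = int(value)
    minutes_full = (value - degrees) * 60
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60
    return degrees, minutes, seconds, hemisphere


print(dms_to_dd(92, 30, 0, "W"))          # the DMS example from the text
print(dd_to_dms(-92.5, is_longitude=True))
```

The first call reproduces the pair of values quoted in the notes: 92° 30’ 00” W corresponds to -92.5 decimal degrees. For arbitrary inputs the seconds value may carry small floating-point rounding error and would normally be rounded for display.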
Projected Coordinate System
• x,y coordinates referred to as “eastings” & “northings”
• Units can be in meters, feet, inches
• The projection name, type, and other parameters are defined by a grid mapping variable.
• It is the result of different types of map projection
In GIS, spatial data should be projected using the appropriate map projection.
2.1.2. Spatial Data types

Zero Dimensional Object Types


Point – stored as a pair of coordinates in lat/long or some other reference system
• A point feature is a zero-dimensional cartographic object.
• It specifies the geometric location and no other meaningful measurement.
• The size of the point symbol may vary, but the area of the symbol is meaningless.
Line – an ordered sequence of points connected by straight lines. Line features are one-dimensional
features, despite occupying two-dimensional space.
• A line segment is the direct connection between two points.
• A line feature is typically represented as a sequence of vectors.
• An arc is the locus of points that are defined by a mathematical function to form a curve.
• A link or edge is the connection between two nodes.
Area – an ordered ring of points connected by straight lines to form a polygon
• An area is a two-dimensional, bounded and continuous object.
• An interior area is an area not including its boundary.
• A simple polygon consists of an interior area and an outer ring. The boundary does not intersect
itself.
The term area typically refers to vector polygons, but also relates to pixels and grid cells.
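The point, line and area types described above map naturally onto simple data structures: a point is one coordinate pair, a line is an ordered list of pairs, and an area is a ring of pairs. The sketch below computes the length of a line and the area of a simple polygon using the shoelace formula. It assumes planar (projected) coordinates in consistent units; the function names and sample coordinates are hypothetical.

```python
import math


def line_length(points):
    """Length of a line feature: sum of its straight segments."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(ring):
    """Area of a simple polygon from its outer ring (shoelace formula).

    The ring may be given open or closed; it is closed here if needed.
    """
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]
    twice_area = sum(
        x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:])
    )
    return abs(twice_area) / 2


point = (3.0, 4.0)                              # zero-dimensional: location only
road = [(0, 0), (3, 4), (3, 10)]                # one-dimensional: has length
parcel = [(0, 0), (4, 0), (4, 3), (0, 3)]       # two-dimensional: has area

print(line_length(road))
print(polygon_area(parcel))
```

A point has neither length nor area, a line has length only, and a polygon has both a perimeter and an area, which mirrors the dimensional distinction made in the text.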

2.1.3. Geographic fields, geographic objects, and boundaries


Geographic features have two basic characteristics. The first is that of a discrete object, which belongs
to a generic class with basic connectedness and interdependence as a single data set; for example, land
use as a class has separate entities of residential, commercial, industrial, agricultural, etc. The class
is a set of geographic entities derived from a common set of criteria, thus sharing spatial character
and structure, e.g., ownership parcels, intersections, street segments, etc. These geographic
features can be counted and, if we like, listed in a table with their attributes. Objects are
distinguished by their dimensions, and naturally fall into categories of points, lines, or areas
when we represent them in GIS.
GEOGRAPHIC ENTITIES
Geographic entities are also called geographic phenomena. GIS supports their study because it
represents phenomena digitally in a computer. An entity or geographic feature occupies a position in space,
about which data describing the attributes of the entity and its geographic location are recorded. It
belongs to a discrete generic class with basic connectedness and interdependence as a single data set; for
example, land use as a class has separate entities of residential, commercial, industrial, agricultural,
etc. The class is a set of geographic entities derived from a common set of criteria, thus sharing spatial
character and structure, e.g., ownership parcels, intersections, street segments, etc.
Types of Geographic entity
The fundamental observation is that some phenomena manifest themselves essentially everywhere in the
study area while others only occur in certain localities. Therefore, geographic entities based on the
manifestations within the area of consideration; can be broadly classified into two types.
i. Geographic fields
Geographic fields are geographic phenomena at which every point in the study area a value can be
determined. They manifest themselves essentially everywhere in the study area. The usual examples of
geographic fields are: temp, pressure, elevation, etc. These fields are actually continuous in nature and are
characterized by their fuzzy boundary nature.
ii. Geographic objects
In contrast to geographic fields, many other phenomena do not manifest themselves everywhere in the
study area, but only in certain localities. These entities populate the study area, are usually
distinguishable one from the other, and are characterized by discrete boundaries. The space between
them is potentially empty. Examples include buildings, roads, parcels, rivers, etc. Their position in
space can be determined by a combination of:
• Location (where is it?)
• Shape (what form is it?)
• Size (how big is it?)
• Orientation (in which direction is it facing?)


A continuous field of elevation, for example, varies much more smoothly in a landscape that has
been worn down by glaciations or flattened by blowing sand than one recently created by cooling
lava. Cliffs are places in continuous fields where elevation changes suddenly, rather than
smoothly. Population density is a kind of continuous field, defined everywhere as the number of
people per unit area, though the definition breaks down if the field is examined so closely that
the individual people become visible. Continuous fields can also be created from classifications
of land, into categories of land use, or soil type. Such fields change suddenly at the boundaries
between different classes. Other types of fields can be defined by continuous variation along
lines, rather than across space. Traffic density, for example, can be defined everywhere on a road
network, and flow volume can be defined everywhere on a river.

Other geographic phenomena do not have clear-cut geographic boundaries. They manifest
themselves essentially everywhere in the study area. These fields are continuous in
nature and are characterized by fuzzy boundaries. The usual examples of geographic
fields are temperature, pressure, elevation, etc.

The Continuous Field View: in this view, there is no clear-cut boundary between
geographic objects and space. Geographic space potentially contains an infinite amount of
information if the value of the variable is defined at every point. Since there are an infinite
number of points in any defined geographic area, it is impossible to represent them all in the
computer because, as noted earlier in this topic, the space in computers is limited.

Discrete Object View: The second conceptual geographic data model is the discrete object view,
which is based on the assumption that geographic objects have well-defined boundaries. In this
view, although objects in geographic space are discrete and finite in number (such as mountains,
buildings, roads, etc.), each object can still require an infinite amount of information for a full
description. For example, although Lake Tana has a definite boundary, it contains an infinite amount of
information if it is mapped in infinite detail. Thus it is not possible to include and study all
of this information in the computer representation.

Continuous fields and discrete objects define two conceptual views of geographic phenomena,
but they do not solve the problem of digital representation. Therefore, continuous fields and
discrete objects are no more than conceptualizations, or ways in which we think about
geographic phenomena in GIS; they are not designed to deal with the limitations of computers.
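
The gap between the two conceptual views and a finite computer representation can be illustrated with a minimal Python sketch. The `elevation` function and the sample objects below are hypothetical and chosen only for illustration: a field has a value at every location, so it has to be sampled at a finite set of points, while discrete objects are stored as a finite list.

```python
# Hypothetical continuous field: a value exists at every (x, y) location.
def elevation(x, y):
    """Illustrative elevation surface in metres (not real data)."""
    return 1800.0 + 2.0 * x + 0.5 * y

# A computer cannot store infinitely many points, so the field is
# sampled on a finite grid of locations (here every 10 map units).
samples = {(x, y): elevation(x, y)
           for x in range(0, 30, 10)
           for y in range(0, 30, 10)}

# Discrete objects: a finite list of things with well-defined boundaries,
# separated by potentially empty space.
objects = [
    {"name": "building", "type": "polygon"},
    {"name": "road", "type": "line"},
    {"name": "well", "type": "point"},
]

print(len(samples), "field samples;", len(objects), "discrete objects")
```

Neither structure captures the phenomenon in infinite detail; both are finite approximations, which is why a digital data model (raster or vector) is still needed.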

2.1 Spatial Data Models (Exist exam part)

Features on the earth's surface are represented in a GIS by their location. Such data are commonly stored
in one or more classes. Many GIS software packages design the database around a particular level of
classification: features such as roads, rivers, or vegetation types are grouped into so-called layers
or coverages. The layers are connected by a common identifier and can be combined
(overlaid) with each other in various ways to create new layers. Raster and vector are the two methods of
representing geographic data in digital computers in the GIS data model. The choice of data structure affects
both data volume and processing efficiency.

3.4.1.1. Vector data model: It is a discrete data model. In this model geographic features are represented
discretely in the form of points, lines, and polygons. Each object in the real world is
first classified into a geometric type: in the 2-D case point, line, or polygon (Figure ).
Points (e.g., wells, soil pits, and retail stores) are recorded as single coordinate pairs, lines (e.g.,
roads, streams, and geologic faults) as a series of ordered coordinate pairs (also called polylines), and
polygons (e.g., census tracts, soil areas, and oil license zones) as one or more line segments that close to
form a polygon area. The coordinates that define the geometry of each object may have 2, 3, or 4
dimensions: 2 (x, y: row and column, or latitude and longitude), 3 (x, y, z: the addition of a height value),
or 4 (x, y, z, m: the addition of another value to represent time or some other property, perhaps the offset
of road signs from a road centerline, or an attribute).
Vector data: objects represented in vector data structures are determined by an x, y location in
coordinate space. Vector data sets are composed of single points, lines, polylines or arcs (connected
strings of points), and polygons (series of coordinates that define an enclosed region).
arc = a series of line segments bounded by nodes at its end points, with vertices in between.
topology = the spatial relationships (such as connectivity and adjacency) among the points, lines, and
polygons that a vector GIS uses to represent map features.

Points      Coordinates
1           (2, 4)
2           (6, 8)

Polyline    Coordinates
1           (2, 3), (3, 6), (6, 10)

Polygon     Coordinates
1           (1, 4), (4, 5), (5, 6), (6, 8), (9, 1)
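
As a minimal sketch of how such coordinate lists can be used, the Python code below stores the example point, polyline, and polygon as plain tuples and lists and derives a length and an area from them. The function names are illustrative assumptions and do not belong to any particular GIS package; the polygon ring is assumed to close back to its first vertex.

```python
import math

# Coordinate lists taken from the example above.
point = (2, 4)
polyline = [(2, 3), (3, 6), (6, 10)]
polygon = [(1, 4), (4, 5), (5, 6), (6, 8), (9, 1)]  # closes back to (1, 4)

def polyline_length(coords):
    """Sum of straight-line distances between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def polygon_area(ring):
    """Shoelace formula; the ring is treated as implicitly closed."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

print(round(polyline_length(polyline), 3))
print(polygon_area(polygon))
```

Because the geometry is stored explicitly as coordinates, measurements such as length and area follow directly from the vertices rather than from counting cells.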

3.4.1.2. Raster Data Model: It is a continuous data model. Geographic features are represented by a matrix
of pixels (cells), each containing a value that represents the conditions in the area covered by that cell.
Examples include aerial photographs and satellite images.

The size of each cell is generally determined by some type of header record for the file that describes
the coordinate of the origin (row 1, column 1) of the file and the x,y dimensions of the cells in the file.
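
The role of the header record can be sketched in Python as follows. This is an illustration under stated assumptions, not the layout of any real raster format: the `RasterHeader` fields are hypothetical, the origin is taken to be the upper-left corner of the grid, cells are square, and rows and columns are counted from 0.

```python
from dataclasses import dataclass

@dataclass
class RasterHeader:
    """Hypothetical header record; field names are illustrative."""
    origin_x: float   # x of the upper-left corner of the grid
    origin_y: float   # y of the upper-left corner of the grid
    cell_size: float  # cell width and height in map units
    n_rows: int
    n_cols: int

def cell_center(h, row, col):
    """Map coordinates of the centre of cell (row, col), 0-based."""
    x = h.origin_x + (col + 0.5) * h.cell_size
    y = h.origin_y - (row + 0.5) * h.cell_size  # rows increase downward
    return x, y

def cell_of(h, x, y):
    """Row and column of the cell containing map location (x, y)."""
    col = int((x - h.origin_x) // h.cell_size)
    row = int((h.origin_y - y) // h.cell_size)
    return row, col

header = RasterHeader(origin_x=500000.0, origin_y=800000.0,
                      cell_size=30.0, n_rows=100, n_cols=100)
grid = [[0] * header.n_cols for _ in range(header.n_rows)]  # one value per cell

print(cell_center(header, 0, 0))
print(cell_of(header, 500045.0, 799950.0))
```

Only the origin and cell size need to be stored; the location of every other cell is implied by its row and column.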

Raster Data Characteristics


Raster data forces all features to be represented in grid cells with specific dimensions. Some areas are
fairly well represented by a raster data structure, provided the raster structure aligns with the
orientation and the size of the features.
Irregular features are not well represented as to their true size and shape.
Linear features are represented by turning on the cells they pass through, so they take on a jagged
appearance. This is both visually distracting and inaccurate.
Resolution: we can increase the accuracy of the spatial representation of features by increasing the
resolution of the raster data (making the cells smaller). This is done, however, at the expense of
increased storage space and processing costs on the computer.
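
The trade-off between resolution and storage can be made concrete with a small Python sketch. The study area size and the assumption of one uncompressed byte per cell are hypothetical and serve only to show the trend.

```python
import math

def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover a rectangular area."""
    return math.ceil(width_m / cell_size_m) * math.ceil(height_m / cell_size_m)

# Hypothetical 10 km x 10 km study area at three cell sizes.
for size in (100, 50, 10):
    n = cell_count(10_000, 10_000, size)
    # Assume one byte per cell, uncompressed (an illustrative assumption).
    print(f"{size:>3} m cells: {n:>9,} cells, about {n / 1e6:.2f} MB")
```

Halving the cell size quadruples the number of cells, so storage and processing costs grow much faster than the gain in spatial detail.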
Raster Advantages
A. Simple data structure for storage (rows and columns)
B. Easy to analyze and compare with other rasters (differencing operations)
C. Same form as imagery
D. Modeling applications are much easier to program and implement
Raster Disadvantages
1. Spatial inaccuracy
2. Limited resolution
3. Large storage requirements for simple data (note that complex landscapes can actually take more
storage with vector data than raster).
4. General perception that maps are based on vector data (points, lines, polygons) because this is how we
tend to visualize features on the earth.

Vector Advantages
A. Closer to what we expect of map data (points, lines, polygons)
B. Generally better resolution than raster on detailed landscapes
C. Generally better spatial accuracy
D. Advantage of topology (can represent connectivity and interrelationships)

Vector Disadvantages
1. Difficult to manage on a computer
2. Slow to process complex data sets on low-end computers
3. More costly to use as a result of the two points above
The combination of the spatial and attribute data along with the creation, editing, data retrieval, and
output capabilities is what makes up the sum total of a GIS.

2.3. Attribute Data Model


Attributes refer to descriptive information. The range of attributes in geographic information is vast.
Some attributes are physical or environmental in nature (e.g., atmospheric temperature or elevation),
while others are social or economic (e.g., population or income). There are five main types of
attributes: nominal, ordinal, interval, ratio, and cyclic.
The attribute data model is the way descriptive information is stored in the GIS. Attribute data are
stored in an attribute database, which is a tabular data model whose structure depends on the GIS software used.
These are usually data tables that contain information about the spatial components of the GIS themes.
The values can be numeric and/or character data such as timber type, timber volume, road size, well depth,
etc. The attributes are related back to the spatial features by use of unique identifiers that are stored both
with the attribute tables and the features in each spatial data layer.

The database allows us to manipulate information in many ways: from simple listing of attributes,
sorting features by some attributes, grouping by attributes, or selecting and singling out groups by
attributes.
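
These manipulations can be sketched in a few lines of Python. The attribute table below is hypothetical (three forest stands with made-up values), and plain dictionaries stand in for database rows.

```python
from collections import defaultdict

# Hypothetical attribute table: one row (dict) per spatial feature.
stands = [
    {"id": 1, "timber_type": "pine", "volume": 120.0},
    {"id": 2, "timber_type": "cedar", "volume": 80.0},
    {"id": 3, "timber_type": "pine", "volume": 200.0},
]

# Sorting features by an attribute (largest volume first).
by_volume = sorted(stands, key=lambda row: row["volume"], reverse=True)

# Grouping features by an attribute.
groups = defaultdict(list)
for row in stands:
    groups[row["timber_type"]].append(row["id"])

# Selecting the features that satisfy a condition.
large_pine = [r["id"] for r in stands
              if r["timber_type"] == "pine" and r["volume"] > 150]

print([r["id"] for r in by_volume])
print(dict(groups))
print(large_pine)
```

Because each row carries the feature's unique identifier, the result of any of these operations can be traced back to the corresponding spatial features.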

Attribute data describe characteristics of the spatial features. These characteristics can be quantitative
and/or qualitative in nature. Attribute data are often referred to as tabular data. For example, the coordinate
location of a forestry stand would be spatial data, while the characteristics of that forestry stand, e.g.
cover group, dominant species, crown closure, height, etc., would be attribute data. Other data types, in
particular image and multimedia data, are becoming more prevalent with changing technology.
Depending on the specific content of the data, image data may be considered either spatial, e.g.
photographs, animation, movies, etc., or attribute, e.g. sound, descriptions, narrations, etc.
Attribute data are used to record the non-spatial characteristics of an entity. Attributes are also called
items or variables. Attributes may be envisioned as a list of characteristics that help describe and define
the features we wish to represent in a GIS. Color, depth, weight, owner, component vegetation type, and
land use are examples of variables that may be used as attributes. Attributes have values, e.g. color may
be blue, black or brown, weight may range from 0.0 to 500, and land use may be urban, agriculture, or undeveloped.
Attributes are often presented in tables arranged in rows and columns. Each row
corresponds to an individual spatial object and each column corresponds to an attribute.
Types of Attribute data
Attributes of different types may be grouped together to describe the non-spatial properties of each object
in the database. These attribute data may take many forms, but all attribute data can be categorized as
nominal, ordinal, or interval/ratio attributes.
Nominal: Geographic features that have names only, so you can’t compare their descriptive information
to any other. Place names, color, vegetation type, city name, owner of a parcel, and soil series are all
examples of nominal attributes. Each serves only to identify the particular instance of a class of entities
and to distinguish it from other members of the same class. Nominal attributes include numbers, letters,
and even colors. Even though a nominal attribute can be numeric, it makes no sense to apply arithmetic
operations to it: adding two nominal attributes, such as two drivers’ license numbers, creates nonsense.
There is no implied order, size, or quantitative information contained in nominal attributes.
Nominal attributes may also be images, audio recordings, or other descriptive information. Just as the
color or type attributes provide nominal information for an entity, an image also provides descriptive
information. Examples of nominal descriptions you might find on a map include Addis Ababa,
Lake Tana, etc.

Ordinal: Geographic features that you can compare by rank. You could have short, medium, and tall
trees; dirt roads, paved roads, highways, and superhighways; or large, medium, and small chemical spills.
An ordinal attribute may be descriptive such as small, medium or large or they may be numeric such as an
erosion class which takes values from 1 through 10. The order reflects only ranks, and does not specify
the form of the scale. An object with an ordinal attribute that has a value of four has a higher rank for that
attribute than an object with a value of two. However, we cannot infer that the attribute value is twice as
large, because we cannot assume the scale is linear. Averaging makes no sense either, but the median, or
the value such that half of the attributes are higher-ranked and half are lower-ranked, is an effective
substitute for the average for ordinal data as it gives a useful central value.
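
As a brief illustration, the median of a set of ordinal values can be found with Python's standard `statistics` module. The erosion classes below are hypothetical values on the 1 to 10 scale mentioned above.

```python
import statistics

# Hypothetical erosion classes (ordinal ranks from 1 to 10) for seven parcels.
erosion_classes = [2, 9, 1, 4, 10, 2, 7]

# median_low always returns one of the observed ranks, so ranks are never
# averaged, even when the number of values is even.
central_class = statistics.median_low(erosion_classes)
print(central_class)
```

Half of the remaining parcels rank below the returned class and half rank above it, which is all that can be claimed for ordinal data.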
Interval/ratio
Attributes are interval if the differences between values make sense. Interval/ratio attributes are used for
numeric items where both order and absolute difference in magnitudes are reflected in the numbers. The
scale of Celsius temperature is interval, because it makes sense to say that 30 and 20 are as different as 20
and 10. Attributes are ratio if the ratios between values make sense. Weight is ratio, because it makes
sense to say that a person of 100 kg is twice as heavy as a person of 50 kg; but Celsius temperature is only
interval, because 20 is not twice as hot as 10 (and this argument applies to all scales that are based on
similarly arbitrary zero points, including longitude).
These data are often recorded as real numbers, most often on a linear scale. Area, length, weight, value,
height, and depth are a few examples of attributes that are represented by interval/ratio variables.
Interval: Geographic features that have detailed increments (intervals) that you can measure. One
limiting characteristic of interval data is that, although you can get very accurate measurements, you can’t
form ratios because the starting point is arbitrary. For example, if the soil in land parcel A is 15 degrees
centigrade and the soil in parcel B is 30 degrees, you can say that soil B is 15 degrees warmer than soil A,
but you can’t say that it’s twice as warm because 0 degrees centigrade is an arbitrary starting place and the
temperature values can thus be negative.
Ratio: Geographic data that have measurable units, like interval data, but also allow you to make the ratio
comparisons that interval data won’t.
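
The soil temperature example can be checked numerically. The short Python sketch below compares the misleading Celsius ratio with the ratio on the Kelvin scale, which has a true zero; the variable names are illustrative.

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins."""
    return c + 273.15

soil_a_c, soil_b_c = 15.0, 30.0

# Differences are meaningful on an interval scale.
difference = soil_b_c - soil_a_c

# The Celsius ratio depends on the arbitrary zero point, so it is misleading.
naive_ratio = soil_b_c / soil_a_c

# Kelvin has a true zero, so this ratio is physically meaningful.
kelvin_ratio = celsius_to_kelvin(soil_b_c) / celsius_to_kelvin(soil_a_c)

print(difference, naive_ratio, round(kelvin_ratio, 3))
```

On the Kelvin scale soil B is only about 5 percent warmer than soil A, not twice as warm, while the 15 degree difference is the same on both scales.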
The computer represents even nominal data (names) with numbers in the GIS database. Try to avoid
using mathematical techniques that force you to multiply the numbers that represent nominal categories
by ordinal, interval, or ratio numbers. Attempting to multiply a nominal category (urban) by a ratio
category (meter) often yields downright silly results (in this case, urban meter).

Topological Relationships

Topological features are essentially simple features structured using topological rules. Topology is the
mathematics and science of geometrical relationships. Topological relationships are non-metric
(qualitative) properties of geographic objects that remain constant when the geographic space of objects is
distorted. For example, when a map is stretched properties such as distance and angle change, whereas
topological properties such as adjacency and containment do not.
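
The difference between metric and topological properties can be demonstrated with a minimal Python sketch. The parcel, well, and gate coordinates are hypothetical, and the map is "stretched" by tripling every x coordinate.

```python
import math

def point_in_polygon(pt, ring):
    """Ray-casting test: True if pt lies inside the implicitly closed ring."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def stretch(pt, fx=3.0, fy=1.0):
    """Distort the map by stretching it along the x axis."""
    return (pt[0] * fx, pt[1] * fy)

parcel = [(0, 0), (4, 0), (4, 4), (0, 4)]
well, gate = (1, 1), (3, 1)

d_before = math.dist(well, gate)
d_after = math.dist(stretch(well), stretch(gate))
inside_before = point_in_polygon(well, parcel)
inside_after = point_in_polygon(stretch(well), [stretch(p) for p in parcel])

print(d_before, d_after)            # the distance changes
print(inside_before, inside_after)  # containment does not
```

The distance between the two points triples, but the well is inside the parcel both before and after the distortion.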
Topology can be defined as the organization of spatial relationships between features in a GIS. In
layman's terms, topology is the way a GIS “knows”:
1) Where a feature is in relation to other features,
2) What parts of different features are shared (points, lines, nodes), and
3) How features share connectivity (gives us ability to move between features in network applications).
The topological data structure logically determines exactly how and where points and lines
connect on a map by means of nodes (topological junctions). The order of connectivity defines
the shape of an arc or polygon. The computer stores this information in various tables of the
database structure. By storing information in a logical and ordered relationship missing
information, e.g., a line segment of a polygon, is readily apparent. A GIS manipulates, analyzes,
and uses topological data in determining data relationships.
The database tells us that the line is the “left side” of one polygon and the “right side” of the adjacent
polygon.
The three aspects of topology that are important in representing spatial relationships are:
1) adjacency- shared boundary
2) connectivity- shared node in arc-node topology
3) containment- accounts for polygons within polygons “islands”
The software in our GIS creates a database that keeps track of the relationships as lists of shared features.
A simple map may be composed of land cover polygons. The polygons are composed of “chains”
(we'll call them arcs to be consistent with ARC/INFO). Some of the arcs are shared by polygons, some are
not. The database structure is designed to keep a list of all arcs and how they relate to the formation of
each polygon.

Network analysis uses topological modeling for determining shortest paths and alternate routes.
For example, a GIS for emergency service dispatch may use topological models to quickly
ascertain optional routes for emergency vehicles. Automobile commuters perform a similar
mental task by altering their route to avoid accidents and traffic congestion. Likewise an
electrical utility GIS could rapidly determine different circuit paths to route electricity when
service is interrupted by equipment damage. Similarly, political redistricting planners could use
certain algorithms to determine logical relationships between population groups and areas for
district boundaries.
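
A shortest-path search over connected nodes can be sketched as follows. The network and its travel costs are hypothetical, and Dijkstra's algorithm is used here as one common choice; the source does not say which algorithm a particular GIS applies.

```python
import heapq

# Hypothetical street network: node -> {neighbouring node: travel cost}.
network = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns (total cost, list of nodes)."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []

print(shortest_path(network, "A", "D"))

# Simulate a blocked street (B-D) and ask for an alternate route.
blocked = {n: {m: w for m, w in nbrs.items() if {n, m} != {"B", "D"}}
           for n, nbrs in network.items()}
print(shortest_path(blocked, "A", "D"))
```

Removing one connection and searching again is the same operation an emergency dispatcher or a utility performs when a street or circuit becomes unavailable.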

To see how topology is represented or modeled, it is useful to consider an example of how
connections are coded into a database. This involves recording more than just the absolute
location of points, lines, and regions. The first step is to record the location of all "nodes," that
is, the endpoints and intersections of lines and boundaries.

Based upon these nodes, "arcs" are defined. These arcs have endpoints, but they are also assigned a
direction indicated by the arrowheads. The starting point of the vector is referred to as the "from node"
and the destination the "to node." The orientation of a given vector can be assigned in either direction,
as long as this direction is recorded and stored in the database.

By keeping track of the orientation of arcs, it is possible to use this information to establish
routes from node to node or place to place. Thus, if one wants to move from node 3 to node 1,
we can locate the necessary connections in the database.

Now, "polygons" are defined by arcs. To define a given polygon, trace around its area in a
clockwise direction recording the component arcs and their orientations. If an arc has to be
followed in its reverse orientation to make the tracing, it is assigned a negative sign in the
database.

Finally, for each arc, one records which polygon lies to the left and right side of its direction of
orientation. If an arc is on the edge of the study area, it is bounded by the "universe."
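
The node, arc, and polygon tables described above can be sketched in Python. The map is hypothetical (a square split into a west polygon "A" and an east polygon "B"), and the class and function names are illustrative rather than those of any GIS package.

```python
from dataclasses import dataclass

UNIVERSE = "universe"  # the area outside the study region

@dataclass
class Arc:
    from_node: int
    to_node: int
    left_poly: str
    right_poly: str

# Two nodes on the shared boundary; all three arcs run from node 1 to node 2.
nodes = {1: (1.0, 0.0), 2: (1.0, 2.0)}
arcs = {
    1: Arc(1, 2, left_poly="A", right_poly="B"),       # shared boundary
    2: Arc(1, 2, left_poly=UNIVERSE, right_poly="A"),  # west outline
    3: Arc(1, 2, left_poly="B", right_poly=UNIVERSE),  # east outline
}

# Polygons traced clockwise; a negative sign means the arc is followed
# against its stored orientation.
polygons = {"A": [2, -1], "B": [1, -3]}

def is_consistent(poly_id, signed_arcs):
    """A clockwise trace keeps the polygon on the right and must close."""
    start = position = None
    for signed in signed_arcs:
        arc = arcs[abs(signed)]
        if signed > 0:
            begin, end, right = arc.from_node, arc.to_node, arc.right_poly
        else:
            begin, end, right = arc.to_node, arc.from_node, arc.left_poly
        if right != poly_id or (position is not None and begin != position):
            return False
        if start is None:
            start = begin
        position = end
    return position == start

def adjacent_pairs():
    """Polygons are adjacent when they lie on opposite sides of an arc."""
    return {tuple(sorted((a.left_poly, a.right_poly)))
            for a in arcs.values()
            if UNIVERSE not in (a.left_poly, a.right_poly)}

print({p: is_consistent(p, chain) for p, chain in polygons.items()})
print(adjacent_pairs())
```

Adjacency is read directly from the left and right polygon columns of the arc table, without any coordinate computation.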

Spatial Data Base Management


Conversion of real geographic variation into a digital model is done through a data model. A data model
is a set of constructs for describing and representing selected aspects of the real world in a computer. A
data model represents the link between the real-world domain of geographic data and the computer
representation of these features. Geographic data are organized in a geographic database. A database is a
self-contained, long-term organization of data for flexible and secure use. It consists of the data and of a
database management system, the software to manage the data.
A GIS database is composed of all of the geographic/spatial information (maps, imagery) and
associated attribute information (tables, reports) that are linked in such a way that we can extract either
the spatial or attribute information by requests based on the location or characteristics of the data features
either singly or as related to other features.

Types of Database System
A database is a comprehensive collection of related data stored in logical files and collectively processed,
usually in tabular form. Database Management Systems (DBMS) have been developed to manipulate data in a
database, i.e. to import, store, sort, and retrieve them. Today, most systems use a relational database
structure, but other systems exist and may be well suited to particular types of data. There are three basic
types with which we should be familiar in GIS: the hierarchical, network, and relational database
structures. Apart from these, flat-file databases and object-oriented systems have also
been used for GIS.
Table 4.1 Database systems and characteristics

Type: File-system-based
Description: Uses files and directories to organize information. Examples: Gopher information servers
(not typically considered a DBMS).
Characteristics: Simple; can use generalized software (word processors, file managers). Inefficient; as the
number of files within a directory increases, search speed decreases. Few capabilities; no sorting or query
capabilities aside from sorting file names.

Type: Hierarchical
Description: Stores data in a hierarchical system. Examples: IBM IMS database software, levels of
administration (country, province, district), satellite images in Hierarchical Data Format (HDF).
Characteristics: Efficient storage for data that have a clear hierarchy. Tools that store data in
hierarchically organized files are commonly used for image data. Relatively rigid; requires a detailed
planning process.

Type: Network
Description: Stores data in interconnected units with few constraints on the type and number of
connections. Examples: numerous point locations with multiple plant or animal species.
Characteristics: Fewer constraints than hierarchical databases. Links are defined as part of the database
structure. Networks can become chaotic unless planned carefully.

Type: Relational
Description: Stores data in tables that can be linked by key fields. Examples: Structured Query Language
(SQL) databases such as Oracle, Sybase and SQL Server; PC databases such as dBase and FoxPro.
Characteristics: Widely used, mature technology. Efficient query. Standard interfaces (i.e. SQL).
Restricted range of data structures; may not handle image or extensive text well (although some databases
allow extensions).

Type: Object-oriented
Description: Stores data in objects, each of which contains a defined set of methods for accessing and
manipulating the data. Examples: POSTGRES database.
Characteristics: New, developing technology. Wide range of structures; extensible to handle many different
types of objects. Not as efficient as relational DBMS for query.

The database software that we are using in the lab has a relational database structure. This means that we
cross-reference feature attributes to their spatial definitions based on some common attribute stored in
the data tables for both the attributes and the graphics. We can select one or more graphic features by use of a
query of some characteristic of interest in the feature attribute table of the database. And, since the
reference from graphics to attributes works both ways, we can select one or more spatial graphic
features on our computer screen and have the software give us the associated attributes. The relational
qualities of the database go even further than these simple examples. We frequently attach other external
tables to our original data sets and relate them to the existing data by common attributes.
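
The two-way link between graphics and attributes can be sketched with Python's built-in `sqlite3` module. All table names, column names, and values below are hypothetical, and the geometry is stored as simple text purely for illustration; an actual GIS manages geometry in its own way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, geometry_wkt TEXT);
    CREATE TABLE parcel_attributes (parcel_id INTEGER, land_use TEXT);
    CREATE TABLE owners (parcel_id INTEGER, owner_name TEXT);  -- external table
    INSERT INTO parcels VALUES (1, 'POLYGON((0 0, 4 0, 4 4, 0 4, 0 0))'),
                               (2, 'POLYGON((4 0, 8 0, 8 4, 4 4, 4 0))');
    INSERT INTO parcel_attributes VALUES (1, 'residential'), (2, 'commercial');
    INSERT INTO owners VALUES (1, 'Owner X'), (2, 'Owner Y');
""")

# Attributes -> graphics: find the geometry of all commercial parcels.
commercial = conn.execute(
    "SELECT p.parcel_id, p.geometry_wkt FROM parcels p "
    "JOIN parcel_attributes a ON a.parcel_id = p.parcel_id "
    "WHERE a.land_use = ?", ("commercial",)).fetchall()

# Graphics -> attributes: a feature picked on screen (parcel 1) returns its
# attributes, including those from the attached external table.
picked = conn.execute(
    "SELECT a.land_use, o.owner_name FROM parcel_attributes a "
    "JOIN owners o ON o.parcel_id = a.parcel_id "
    "WHERE a.parcel_id = ?", (1,)).fetchone()

print(commercial)
print(picked)
```

The common attribute `parcel_id` is the only thing the three tables share, which is what allows an external table to be attached without changing the spatial data.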

For example, a query of the following form retrieves the ordered boundary points of a country from linked tables:
SELECT Boundary.id_contour, x, y
FROM Country, Boundary, Contour, Point
WHERE name = 'France'
AND Country.id_boundary = Boundary.id_boundary
AND Boundary.id_contour = Contour.id_contour
AND Contour.id_point = Point.id_point
ORDER BY Boundary.id_contour, point_num

The relational database has several advantages for geographic and cartographic
representation. First, the conceptual model of the database is distinct from the physical model (how the
database is stored and managed on computer hardware). Second, separate tables help
maintain the integrity of the potential meaning of database elements. Most relational databases now use
structured query language (SQL) for constructing queries involving tables of a single database, or with
tables in other databases, even on other computers. Third, the clarity of the relationships aids people
using the database with previous experiences of the database. Reliable processing is critical for queries of
geographic information and online maps. Fourth, it is possible to define multiple views of the same data
in different database tables (e.g., listing entries by street address or alphabetically by name). Most
geographic information and maps only scratch the surface of what databases can be used for, but the two
most common uses of databases for geographic information and maps are as follows:
• Databases store measurements and observations of things and events.
• Databases store the symbols, values, and other graphic elements that help maps communicate.
Uses of DBMS
• Reduced time wastage
• Data independence and efficient access
• Data integrity and security
• Uniform administration

Raster Data Formats


• Images (.tif, .jpg, .img, .sid, .jp2)
• Grids – ESRI proprietary (folders)
• DEMs (.dem, folders)
• ASCII (.txt, .asc)
Vector Data Formats
• CAD – subclasses (.dxf)
• Coverage – supports subclasses (folders)
• E00 – coverage exchange format (.e00)
• Shapefile – industry standard (.shp)
• Geodatabase – (.mdb)
Types of Geodatabase:
File Geodatabase
The file geodatabase is a geodatabase type first released in ArcGIS version 9.2. It is the latest
file-based format from ESRI. Its goals are to:
• Provide a widely available, simple, and scalable geodatabase solution for all users.
• Provide a portable geodatabase that works across operating systems.
• Scale up to handle very large datasets.
• Provide excellent performance and scalability, for example, to support individual datasets containing
well over 300 million features and datasets that can scale beyond 500 GB per file with very fast
performance.
Personal Geodatabase

Personal geodatabases have been used in ArcGIS since their initial release in Version 8.0 and
have used the Microsoft Access data file structure (the .mdb file) and Jet Engine. They support
geodatabases that are limited in size to 2 GB or less. However, the effective database size is
smaller, somewhere between 250 and 500 MB before the database performance starts to slow
down.
The major datasets that can be stored within the geodatabase are the following:
• Feature datasets: Feature datasets exist in the geodatabase to define a scope for a
particular spatial reference. All feature classes that participate in topological relationships
with one another, for example, a geometric network or a topology, must have the same
spatial reference.
• Topologies: Many vector datasets have features that could share boundaries or corners. If
you create a topology in the dataset, you can set up rules defining how features share
their geometry.
• Geometric networks: Some vector datasets, particularly those used to model
communications, material or energy flow, or transportation networks, need to support
connectivity tracing and network connectivity rules.
• Relationship classes: Relationship classes define relationships between objects in the
geodatabase. These relationships can be simple one-to-one relationships, such as you
might create between a feature and a row in a table, or more complex one-to-many (or
many-to-many) relationships between features and table rows.
• Object classes: An object class is a table in a geodatabase with which you can associate
behavior. Object classes keep descriptive information about objects that are related to
geographic features, but are not features on a map.

GIS Data quality


• The International Organization for Standardization (ISO) considers quality to be “the totality of
characteristics of a product that bear on its ability to satisfy a stated and implied need”.
• The extent to which errors and other shortcomings of a data set affect decision making depends
on the purpose for which the data is to be used
• GIS is mostly used to depict and analyse the earth’s surface in its infinite complexity through an
abstract and simplified (finite) model.

• Errors are inevitable in the model, and this creates a need to describe the quality of the data.
Why is quality important in GIS?
• Even when source data, such as official topographic maps, have been subject to severe quality
control, errors are introduced when these data are input to a GIS.
• Unlike a conventional map, which is essentially a single product, a GIS database normally contains
data from different sources of varying quality.
• Unlike topographic or cadastral databases, natural resource databases contain data that are inherently
uncertain and therefore not suited to conventional quality control procedures.
• Most GIS analysis operations will themselves introduce errors.
Elements of GIS Data Quality
• The major elements of GIS data quality are grouped into internal and external quality.
A. Internal Data Quality (the difference between reality and the final output itself)
• The infinite complexity of reality is addressed by designing the ‘nominal ground’, i.e., the
desired spatial and attribute representations and their accuracies.
• The final product will differ from the nominal ground by the degree and extent of errors
introduced in the process of data capture.

• The types of internal data quality:

Attribute accuracy: The assessment of attribute accuracy may range from a simple check on the labeling
of features (for example, is a road classified as a metalled road actually surfaced or not?) to complex
statistical procedures for assessing the accuracy of numerical data.

B. External Data Quality

Lineage describes the history of a data set.

For digital data sets, lineage is defined as: “that part of the data quality statement that contains information that
describes the source of observations or materials, data acquisition and compilation methods, conversions,
transformations, analyses and derivations that the data has been subjected to, and the assumptions and
criteria applied at any stage of its life.”

3: Data Sources and Processes

3.1 Sources of Data


3.2 Data Input Techniques
3.3 Data Organization and Storage
3.3.1 Organizing data for analysis
3.3.2 Spatial data layers – Vertical Data Organization
3.4 Data Editing and Updating

3.5. Data querying and retrieval

3.1 Sources of Data

1. Spatial Data Acquisition and Management


1.1. Sources and Methods of GIS Data Acquisition

Possibly the most important component of a GIS is the data. Geographic data and related tabular
data can be collected in-house or purchased from a commercial data provider. A GIS can
integrate spatial data with other data resources, and it can manage spatial data with the same
database management system (DBMS) that most organizations already use to organize and maintain
their other data. These data come from primary and secondary sources.

Primary data sources: Spatial data obtained from scratch, using direct spatial data
acquisition techniques, are called primary data. These data are collected in digital format
specifically for use in a GIS project. Typical examples of primary GIS sources include raster
SPOT, IKONOS and Landsat Earth-observation satellite images, and vector building-survey
measurements captured using a total station.

Secondary sources are digital and analog datasets that were originally captured for another
purpose and need to be converted into a suitable digital format for use in a GIS project. They are
obtained indirectly, by making use of spatial data collected earlier, possibly by others.

Both primary and secondary data can be either hardcopy or digital in nature. As noted above, a GIS
can integrate spatial data with other data resources and can manage them in the Database
Management System (DBMS) that most organizations already use to maintain their data. The
diagram below shows the sources and process of data capture in GIS.

Figure 3.1. Sources of GIS Data


Sources and methods of attribute data acquisition

Attribute data about a given geographic feature can be collected from various sources. The
attributes may be, for example, text, numbers (tables), audio, video or photographs, in either digital or
analogue form. Analogue data have to be converted into digital form and imported into
the GIS, using the attribute import tools of the GIS software or by typing them in directly at the
keyboard. Digital data, in turn, have to be formatted so that they are compatible with
the GIS software. The main sources of attribute data include:

 Primary sources, such as interviews and measurements of variables of interest, which are used to
get new attribute data;
 Conventional documents in registers and files;
 Compilations in scientific reports;
 Socioeconomic and statistical files: National censuses are the major source of
socioeconomic data. These socioeconomic surveys provide data on a large number of
themes that can be linked with the spatial data at various levels of aggregation. Area
codes and names are used to link the attribute data with the spatial data.
 Geophysical and environmental data files: Many global organizations are developing GIS
databases on various geophysical and environmental themes, e.g., the Digital Chart of the
World (DCW) and FAO soil maps. These global datasets can be used as a base for small-scale
studies.

Spatial Data Acquisition Techniques

The sources of geospatial data are probably more numerous and more varied than those of most
other kinds of information. Data such as the core datasets mentioned
above can be imported or input to a GIS from various sources, including:

A. Terrestrial Surveys: Large-scale data are acquired through terrestrial surveys. Survey
measurements, expressed in coordinates or other units, are used for the
collection. With the increasing use of modern equipment, these surveys produce digital
files that can be imported directly into a GIS. Some of the tools and methods for collecting
data through surveys are as follows:
Satellite Data: Earth resources satellites have become a source of a huge amount of
data for GIS applications. Satellites carry scanners with sensors sensitive to the
radiation emitted or reflected by the earth’s surface. The sensors measure radiation
sequentially from patches or grid cells that are later put in their proper spatial relationship,
simulating a map. Data accuracy depends on the resolution of the grid cells and on the
correction techniques applied to both the radiation values and the geometry.

GPS Data: The development of the Global Positioning System (GPS) has made high-accuracy
spatial data easier and quicker to obtain. GPS provides unequalled accuracy
and flexibility of positioning for navigation, surveying and GIS data capture. The data
from GPS are used to increase the accuracy of existing georeferencing methods or for
point and linear surveys. The data are recorded on the basis of a global reference system
and can be transformed to local reference systems.
B. Digital Data from elsewhere: GIS data in digital map form may be obtained from organizations,
individuals, the internet, and so on. Nowadays digital data are produced on a large
scale by different vendors and distributed directly or over the network. These data
can be loaded directly into our GIS database from external hard drives or via the
network. For example, the following web links provide different kinds of data, either free of
charge or for a fee.
Internet data sources:
 http://www.pecad.fas.usda.gov/cropexplorer/global_reservoir/index.cfm
 http://www.arcgis.com/home/
 http://earlywarning.usgs.gov/fews/africa/index.php
 http://www.maplibrary.org/stacks/Africa/Ethiopia/index.php/
 http://faostat.fao.org/site/377/default.aspx#ancor/
 http://www.cru.uea.ac.uk/cru/data/precip/
 http://srtm.csi.cgiar.org/SELECTION/inputCoord.asp/
 http://www.census.gov/ipc/www/idb/informationGateway.php/

C. Data Captured from Existing Maps: Maps have been used since the earliest times to
portray information about the earth’s surface. Maps provide spatial and non-spatial
information of geographic phenomena. The features of the earth are represented as points,
lines or areas. Attention needs to be given to map properties such as scale, resolution,
accuracy, precision and time of map production when using them as a data source for
GIS. Methods for encoding such analog data include:

 Map scanning: Photos, maps, or typed documents can be scanned and then saved
in a format that is readable by the GIS software. Tabletop flatbed scanners are the
most common, but there are also drum scanners, which are very useful for very
large images. Scanning produces a digital image of the map by moving an electronic
detector across the map surface. The accuracy of the data largely depends upon the
quality of the original map. After scanning, the resulting image can be improved
with various image processing techniques. These may include corrections of color,
brightness and contrast, removal of noise, filling of holes or smoothing
of lines. It is important to understand that a scanned image is not a structured data
set of classified and coded objects.
 Digitizing is the transformation of information from analog format, such as a paper
map, to digital format, so that it can be stored and displayed with a computer.
Digitizing basically means tracing the features of a paper map or drawing, either on screen
with a digitizing tool in the GIS software or on a digitizing board or table with a special
mouse called a puck.

The digitizing process

Manual digitizing involves placing a map on a digitizing surface or displaying a map on screen
and tracing the location of feature boundaries. Coordinate data are sampled by manually
positioning the puck or cursor over each target point and collecting coordinate locations. This
step is repeated for every point to be captured, and in this manner the locations and shapes of all
required map features are defined. Features that are viewed as points are represented by digitizing a
single location. Lines are represented by digitizing an ordered set of points, and polygons by
digitizing an ordered set of lines. A line has a starting point, often called the starting node, a set of
vertices defining its shape, and an ending node. Hence lines may be viewed as a series of straight
line segments connecting vertices and nodes.

Common Errors in Digitizing: While digitizing data, the following errors may be encountered
and must be corrected before the data are used for analysis.
 Slivers – boundaries of adjacent polygons overlap
 Gaps – boundaries of polygons that supposedly share a common border don’t touch due
to “double digitizing”
 Attribute errors – attribute data entered incorrectly
 Overshoot – digitized line extends too far
 Undershoot – digitized line too short
 Dangling Arc – only one node at an endpoint
 Dangling Node – only one arc attached
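
As an illustration of how such errors can be found automatically, the sketch below counts how many arcs are attached to each end node and reports the nodes with only one arc attached. It is a minimal sketch, not the method of any particular GIS package: the function name `find_dangling_nodes` and the sample arcs are hypothetical, and nodes are assumed to match only when their coordinates are exactly equal (real software normally applies a snapping tolerance).

```python
from collections import Counter


def find_dangling_nodes(arcs):
    """Return the end nodes that are attached to exactly one arc.

    Each arc is an ordered list of (x, y) tuples: the first and last
    tuples are its nodes, the tuples in between are its vertices.
    """
    node_use = Counter()
    for arc in arcs:
        node_use[arc[0]] += 1    # starting node
        node_use[arc[-1]] += 1   # ending node
    return sorted(node for node, count in node_use.items() if count == 1)


# Hypothetical digitized arcs: all three meet at node (2, 0),
# so only their free ends are reported as dangling.
arcs = [
    [(0, 0), (1, 0), (2, 0)],
    [(2, 0), (2, 3)],
    [(2, 0), (4, 0)],
]
dangling = find_dangling_nodes(arcs)
```

A dangling node is not always an error (a dead-end road legitimately has one), so the result is a list of candidates to inspect rather than a list of mistakes.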

3.3. Organizing Geographic Data for analysis

Preprocessing - In preprocessing, the assembled spatial data are converted to forms that can be ingested
by the GIS to produce data layers of spatial objects and their associated information.

► Data layers (shape file etc.)

► Feature types: points, lines, polygons

► Object types: geometric or thematic

► Manipulation can apply to three aspects of GIS data:

 Geometric attributes (vector and raster)

 Non-geometric attributes

 Relationships (topology)

► The major spatial data processing operations for vector data are:

 Format transformation

 Data clean-up and checking

 Combining multiple data sources

 Geometric transformation

 Transformations between data models

 Spatial interpolation

► For raster data:

 Geometric and radiometric processing, etc.

Format transformation

 Spatial data files must be transformed into the data structures and file formats used internally by
a GIS software package.

 Bringing all data into the same format makes it possible to apply the same rules for validation,
editing and relationships.

Data Checks and Repairs

 Before using the data for analysis, we have to check whether the digitizing errors described
above are present in our data, and repair any that are found.

 Acquired data sets must be checked for consistency and completeness

Combining multiple data sources

 A GIS project usually involves multiple data sets, so these data sets must be related to each
other.

 There are three fundamental cases to be considered if we compare data sets pairwise:

 they may be about the same area, but differ in accuracy,

 they may be about the same area, but differ in choice of representation (due to scale
difference), and

 they may be about adjacent areas, and have to be merged into a single data set.

Geometric transformations

 Different data layers should be registered to a common coordinate system.

 Bringing the layers into the same coordinate system makes it possible to overlay them.

Transformations between data models.

 If much or all of the subsequent spatial data analysis is to be carried out on raster data, one may
want to convert vector data sets to raster data. This process is known as rasterization.

 It involves assigning point, line and polygon attribute values to raster cells that overlap with the
respective point, line or polygon.

 There is an inverse operation, called vectorization, that produces a vector data set from a raster.
We have looked at this in some sense already, namely in the production of a vector data set from a
scanned image by digitizing.

 Another form of vectorization takes place when we want to identify features or patterns in
remotely sensed imagery.

 The keywords here are feature extraction and pattern recognition, but these subjects are dealt
with in Principles of Remote Sensing.
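
Rasterization of point features can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function `rasterize_points`, its parameters and the sample wells are hypothetical, rows are counted from the top of the grid, and when several points fall in one cell the last one simply overwrites the others.

```python
def rasterize_points(points, x_min, y_max, cell_size, n_rows, n_cols, nodata=None):
    """Assign each point's attribute value to the raster cell that contains it."""
    grid = [[nodata] * n_cols for _ in range(n_rows)]
    for x, y, value in points:
        col = int((x - x_min) // cell_size)   # columns run west to east
        row = int((y_max - y) // cell_size)   # rows run north to south
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] = value            # later points overwrite earlier ones
    return grid


# Hypothetical wells given as (x, y, attribute) on a 3 x 3 grid of 10 m cells.
wells = [(5.0, 25.0, "A"), (15.0, 5.0, "B"), (29.0, 29.0, "C")]
grid = rasterize_points(wells, x_min=0.0, y_max=30.0, cell_size=10.0,
                        n_rows=3, n_cols=3)
```

Lines and polygons need more work, because every cell that a line crosses or a polygon covers must be found, but the principle of copying the attribute value into the overlapping cells is the same.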

Interpolation

 Spatial interpolation is a process of using points with known values to estimate values at other
points.

 If mapping precipitation, for example, and there is no weather reporting station within the grid
cell, an estimate is based on nearby weather stations.

 The process of transforming point based data into a full coverage grid map is called interpolation.

 Basic assumption in spatial interpolation is that the value to be estimated at a point is more
influenced by nearby control points than those that are farther away.

 It can be used to predict unknown values for any geographic point data, such as elevation,
rainfall, chemical concentrations, and noise levels.

 Interpolation methods are either global or local.

 A global method uses every control point available to make the estimate of the unknown
value, e.g. trend surface analysis.

 A local method uses a sample of control points for estimation, e.g. inverse distance
weighting (IDW).

 Kriging methods rely on the notion of spatial autocorrelation.
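
The basic assumption of spatial interpolation, that nearby control points influence an estimate more than distant ones, can be made concrete with inverse distance weighting (IDW). The sketch below is a minimal illustration, not a production implementation: the function `idw`, the power of 2 and the station values are assumptions chosen for the example. With `k=None` every control point is used; with `k` set, only the `k` nearest control points are sampled.

```python
import math


def idw(x, y, control_points, power=2.0, k=None):
    """Estimate the value at (x, y) from (cx, cy, value) control points."""
    dists = []
    for cx, cy, value in control_points:
        d = math.hypot(x - cx, y - cy)
        if d == 0.0:
            return value                 # exactly on a control point
        dists.append((d, value))
    dists.sort(key=lambda pair: pair[0])
    if k is not None:
        dists = dists[:k]                # keep only the k nearest points
    weights = [1.0 / d ** power for d, _ in dists]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, dists))
    return weighted_sum / sum(weights)


# Hypothetical rainfall stations at the corners of a 10 x 10 square.
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0),
            (0.0, 10.0, 30.0), (10.0, 10.0, 40.0)]
centre_estimate = idw(5.0, 5.0, stations)        # all stations equally distant
local_estimate = idw(1.0, 1.0, stations, k=2)    # dominated by the nearest station
```

At the centre of the square all four stations are equally far away, so the estimate is their plain average; close to one station the estimate is pulled strongly toward that station's value.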

1.2. Basic Vector Operations and Analysis

A spatial analysis may require generation of new data from the original set. For example,
assume that you wish to find the relationship between noise level and proximity to a freeway,
within one mile of the freeway, for all parcels zoned residential. To answer the question, you must
combine, or overlay, the parcel map and the freeway map, and then derive the areas within one mile
of the freeway. This process may require the delineation of new geographic entities and the
generation of a new data table showing combinations of several factors. There are several
vector-based operations and analyses in GIS. Some of them are discussed below.

A. Querying data
The scope of spatial analysis ranges from a simple query about a spatial phenomenon to
complicated combinations of attribute queries, spatial queries, and alterations of the original data.
Spatial queries require spatial information to be extracted from the database and processed. There
are two major types of queries in GIS: attribute and spatial.

Attribute Queries

Attribute queries require the processing of attribute data exclusive of spatial information. An
example is identifying commercial land use parcels in order to compute the average value of this
land use type. The selection is based only on an attribute item; therefore, no spatial information
is required. Attribute queries make it possible:

– To select features using attribute data (e.g. using SQL)
– To map the results or present them in conventional database form
– To produce maps of subsets of the data or choropleth maps
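
The commercial-parcel example can be sketched with SQL run from Python's built-in `sqlite3` module. This is only an illustration of an attribute query, not how any particular GIS stores its tables: the table `parcels`, its columns and its values are hypothetical.

```python
import sqlite3

# Hypothetical attribute table of land parcels held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parcels (parcel_id INTEGER, land_use TEXT, land_value REAL)"
)
conn.executemany(
    "INSERT INTO parcels VALUES (?, ?, ?)",
    [
        (1, "commercial", 120000.0),
        (2, "residential", 80000.0),
        (3, "commercial", 180000.0),
        (4, "industrial", 95000.0),
    ],
)

# Select features using attribute data only; no geometry is involved.
commercial_ids = [
    row[0]
    for row in conn.execute(
        "SELECT parcel_id FROM parcels "
        "WHERE land_use = 'commercial' ORDER BY parcel_id"
    )
]

# Compute the average value of the selected land use type.
average_value = conn.execute(
    "SELECT AVG(land_value) FROM parcels WHERE land_use = 'commercial'"
).fetchone()[0]
```

The list of selected parcel identifiers is what a GIS would use to highlight the matching features on the map.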

Attribute queries in ArcGIS use SQL (Structured Query Language, a language for querying
databases) to define the selection. The SQL query window helps you to express your request
correctly. In the top box, select the attribute you would like to query. Then specify the condition
this attribute should meet. To do this you need to choose the right comparison operator (=, >, <)
and a value. You can click Get Unique Values to pick from the values that exist in the database.
This is particularly handy when you want to select a string (a name), as it avoids typos and
automatically applies the correct formatting for strings.

Boolean expressions and their meaning

A Boolean expression is an expression that evaluates to one of two values: True or False. The most
common operators are:
 = equal to
 < less than
 <= less than or equal to
 > greater than
 >= greater than or equal to
 <> not equal to (greater than or less than)
 LIKE works like = for strings, but accepts wildcards that stand in for unknown letters
 AND the feature must satisfy all of the conditions (it is part of all the sets)
 OR the feature must satisfy at least one of the conditions (it is part of at least one of the sets)
 NOT the feature must not satisfy the condition (it is not part of the set)
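
The operators can be tried out on a small table. The sketch below again uses Python's built-in `sqlite3` module; the table `towns`, its place names and its figures are hypothetical, and the helper `select` is introduced only for this example. Note that the wildcard for any run of characters is `%` in SQLite; other databases may use a different symbol.

```python
import sqlite3

# Hypothetical attribute table; names and figures are invented for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE towns (name TEXT, region TEXT, population INTEGER)")
db.executemany(
    "INSERT INTO towns VALUES (?, ?, ?)",
    [
        ("Northfield", "highland", 12000),
        ("Northgate", "lowland", 45000),
        ("Southbank", "lowland", 8000),
        ("Eastmere", "highland", 30000),
    ],
)


def select(where):
    """Return the names of the towns for which the Boolean expression is True."""
    query = "SELECT name FROM towns WHERE " + where + " ORDER BY name"
    return [row[0] for row in db.execute(query)]


starts_with_north = select("name LIKE 'North%'")                        # wildcard match
big_lowland = select("region = 'lowland' AND population >= 10000")      # both conditions
highland_or_small = select("region = 'highland' OR population < 10000") # at least one
not_lowland = select("NOT region = 'lowland'")                          # negation
also_not_lowland = select("region <> 'lowland'")                        # not equal to
```

Building the query by string concatenation is acceptable here because the expressions are fixed in the script; text typed by a user should be passed as query parameters instead.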

Spatial Data Query or Selection by Location:

This procedure retrieves data from a map by working with the map features themselves. Features
can be selected with a cursor or a graphic, or, as described here, on the basis of their
location relative to the locations of other features. You would perform this query if, for example,
you wanted to identify all of the fields along a particular road. To perform this type of query you
need at least two layers: a target layer, in which features will be selected, and a source
layer, which is used to determine the selection based on its topological relationship to the
target. In addition to creating a new selection, this tool also allows you to add features to, or
remove them from, the current selection.

There are a variety of selection methods available to select the point, line, or polygon features in
one layer that are near to, or which overlap features in the same or in another layer. These are
listed below:

Descriptions for different selection methods in Select by Location Function


 Are crossed by the outline of: This method selects the features that are overlapped by the
features of another layer.
 Intersect: This method is similar to the Are crossed by the outline of method but
also selects any features bordered by the reference features.
 Are within a distance of: This method selects features near or adjacent to features in the
same layer or in a different layer.
 Are completely within: This method selects features in one layer that lie completely inside
the features of another layer.
 Have their center in: This method selects the features in one layer that have their centers
in the features of another layer.
 Completely contain: You can select polygons in one layer that completely contain the
features in another layer.
 Share a line segment with: This method selects features that share line segments,
vertices, or nodes with other features.
 Are identical to: This method selects any feature having the same geometry as a feature
of another layer. The feature types must be the same; for example, you use polygons to
select polygons, lines to select lines, and points to select points.
 Contain: This method selects features in one layer that contain the features of another.
This method differs from the Completely contain method in that the boundaries of the
features can touch.
 Are contained by: This method selects features in one layer that are contained by the
features in another.
 Touch the boundary of: If you are selecting features using a layer containing lines, this
method selects lines and polygons that share line segments, vertices, or endpoints (nodes)
with the lines in the layer.
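
A distance-based selection by location can be sketched without any GIS library. The example below selects target features that lie within a given distance of a source line, in the spirit of the fields-along-a-road example. It is a simplified illustration: the function names, the road and the fields are hypothetical, and each field is reduced to a single representative point rather than a full polygon.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the straight segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                        # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def select_within_distance(targets, source_line, distance):
    """Names of target points lying within `distance` of the source line."""
    selected = []
    for name, point in targets.items():
        nearest = min(
            point_segment_distance(point, a, b)
            for a, b in zip(source_line, source_line[1:])
        )
        if nearest <= distance:
            selected.append(name)
    return sorted(selected)


# Source layer: one road with two segments. Target layer: four field points.
road = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
fields = {
    "field_1": (5.0, 1.0),
    "field_2": (5.0, 6.0),
    "field_3": (11.5, 8.0),
    "field_4": (20.0, 20.0),
}
near_road = select_within_distance(fields, road, distance=2.0)
```

Here the fields play the role of the target layer and the road the role of the source layer.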

1.2. Editing and Updating Spatial data

ArcGIS allows you to create and edit several kinds of data. You can edit feature data stored in
shapefiles and geodatabases, as well as various tabular formats. You can also edit shared edges
and coincident geometry using topologies and geometric networks

Editing Existing features

Merging: Merge combines selected features of the same layer into one feature.

Deleting feature

You can select a feature and delete it to remove it from the map and database.
Splitting Feature

You can use the Cut Polygons tool on the Editor toolbar to perform this task. You can cut
multiple polygons this way, but the cut line is based on a sketch you draw manually.

Reshape Feature

The Reshape Feature tool lets you reshape a polygon by constructing a sketch over a selected
feature. The feature takes the shape of the sketch from the first place the sketch intersects the
feature to the last.

Updating Features

Digitizing is the process of converting features on a paper map into digital format. You might
want to digitize features into a new layer and add the layer to an existing map document or create
a completely new set of layers for an area for which no digital data is available.

Displaying XY coordinates in ArcMap

In addition to data sources such as shapefiles, you can also add tabular data that contain
geographic locations in the form of x,y coordinates to your map. The x,y coordinates describe
discrete locations on the earth's surface, such as the locations of sample points in a city or the
points where you collected samples. To add a table of x,y coordinates to your map, the
table must contain two fields, one for the x-coordinate and one for the y-coordinate. You can
easily collect x,y coordinate data using a GPS device.
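
What happens when such a table is turned into point features can be sketched in plain Python. This is not the ArcMap procedure itself, only an illustration of the idea: the function `table_to_points`, the field names `x` and `y`, and the sample coordinates are hypothetical.

```python
import csv
import io

# Hypothetical table of sample sites; the third row lacks an x-coordinate.
table_text = """site,x,y
S1,38.47,7.05
S2,38.50,7.10
S3,,7.20
"""


def table_to_points(text, x_field="x", y_field="y"):
    """Turn table rows into point features; rows without valid x,y are skipped."""
    points, skipped = [], []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            x, y = float(row[x_field]), float(row[y_field])
        except (TypeError, ValueError):
            skipped.append(row)          # missing or non-numeric coordinate
            continue
        attributes = {
            key: value for key, value in row.items()
            if key not in (x_field, y_field)
        }
        points.append({"geometry": (x, y), "attributes": attributes})
    return points, skipped


points, skipped = table_to_points(table_text)
```

Each point keeps the remaining columns of its row as attributes, which is why the table needs only the two coordinate fields to be placed on the map.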

B. Overlay analysis
Spatial analysis is one of the most important functions of a GIS and includes many analysis
methods. One of the basic methods is overlay analysis. During vector overlay, map
features and their associated attributes are integrated to produce new composite maps. Logical
rules can be applied to define how the maps are combined. Vector overlay can be performed on
different types of map feature: polygon-on-polygon overlay, line-in-polygon overlay and
point-on-polygon overlay. During the overlay process, the attribute data associated with each
feature type are merged, and the resulting table contains all the attribute data. How overlay is
applied depends upon the modeling approach; a user might need to carry out a series of overlay
procedures to arrive at a conclusion. Overlay can be further divided into the following
operations:
 Union: Union is an operation that creates a new feature or set of features by combining all
the areas from two input layers. The union overlay option is useful if you wish to combine
multiple data layers into a single layer. We use Union when we want to overlay two
polygon layers so that the resulting output layer has the combined attribute data of the
polygons in the two inputs, and contains all the polygons from the inputs, whether or not
they overlap. In this way, we can produce a new layer combining the features and
attributes of two polygon layers. In other words, the features of an input polygon (polygon1)
are combined with the overlay polygon (polygon2) to produce a new output polygon (polygon3)
that contains the attributes and full extent of both polygon1 and polygon2.
 Intersect: Intersect merges only the parts that share common space (where the two
themes overlap). It is an operation that overlays two spatial data layers and finds the
areas common to both, while discarding the areas unique to either. We use Intersect when we
want to overlay a layer with the polygons in another layer so that the resulting output layer has
the combined attribute data of the features from the two inputs, and only contains features
that fall within the spatial extent of the overlay polygons. In this way, we can find those
features that overlap and stamp the attributes of the overlay polygons in the second layer
onto the features in the first layer.
 Clip: Clip removes the portions of features that lie outside the features of another layer. The
clipping layer must always be a polygon layer, but the clipped layer may contain points, lines, or
polygons. Only the outside boundary of the clipping layer is used; internal boundaries have no
effect on the output layer. The attributes of the features in the output layer are the
same as those of the features in the layer being clipped. In other words, a clip polygon
(polygon1) is used like a cookie cutter on the input polygon (polygon2) to produce a new
polygon (polygon3) that keeps the attributes of the input. Clip can remove a
selected part of one theme using another theme, selected features, or a graphic. In effect,
it is an overlay operation that uses one part of a theme to select part of another by
extraction (cutting and removal).
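
The difference between the three operations lies in which areas are kept and whose attributes are carried over, and that can be shown with a deliberately simplified sketch. Instead of true polygons, each layer below is a dictionary that maps unit cells to an attribute value; real vector overlay computes the geometric intersection of polygon boundaries, which this example does not attempt. The function names and the soil and district layers are hypothetical.

```python
def union(layer_a, layer_b):
    """Keep every cell of either layer, with the attributes of both (None if absent)."""
    cells = set(layer_a) | set(layer_b)
    return {cell: (layer_a.get(cell), layer_b.get(cell)) for cell in cells}


def intersect(layer_a, layer_b):
    """Keep only the cells common to both layers, with the attributes of both."""
    common = set(layer_a) & set(layer_b)
    return {cell: (layer_a[cell], layer_b[cell]) for cell in common}


def clip(input_layer, clip_layer):
    """Keep the input cells inside the clip layer, with the input attributes only."""
    return {cell: value for cell, value in input_layer.items() if cell in clip_layer}


# Two hypothetical layers that overlap in cells (0, 1) and (1, 0).
soil = {(0, 0): "clay", (0, 1): "clay", (1, 0): "sand"}
district = {(0, 1): "D1", (1, 0): "D2", (1, 1): "D2"}

soil_union = union(soil, district)
soil_intersect = intersect(soil, district)
soil_clipped = clip(soil, district)
```

Union covers all four cells, Intersect only the two shared ones with both attributes attached, and Clip the same two cells but with the soil attribute alone, which mirrors the cookie-cutter behaviour.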

Using overlay analysis, you can generate new kinds of map, for example a topography-soil map
from a topography map and a soil map, or compile soil distribution statistics by administrative
area, as shown in Figure 4.1.

Figure 4.1. Using overlay process to create new map and conduct new statistics
 Mask and Replace: Mask is a type of clip operation in which a designated section or set
of features from one theme is used as a “window” for selecting parts of a second theme.
Replace (also called cover in some GISs) is in some ways another type of clip, in that it
transfers selected features from one theme to another, covering those in the second
theme. Replace is ideal for updating features spatially without having to go through
elaborate recoding and overlay operations. Replace is essentially a convenient
selected-feature overlay option.

Figure 4.2: Mask and replace

C. Proximity analysis
In proximity computation we use geometric distance to define the neighborhood of one or more
target locations. The most common and useful technique is buffer zone generation. Another
technique based on geometric distance that will also be discussed is Thiessen polygon
generation.

Buffer zone generation:

The principle of buffer zone generation is simple: we select one or more target locations and then
determine the area around them within a certain distance. In Figure (a) a number of main and
minor roads were selected as targets and 75m and 25m (respectively) buffers were computed
from them. Buffers have many uses, mostly dealing with distance from selected features. For
example, questions such as “What are the effects on urban areas if the road is extended by 100 m?”
or “What are the effects of a 5 km buffer zone around a national park to prevent grazing?” can be
answered.

In some case studies, zoned buffers must be determined, for instance in assessing the effects
of traffic noise. Most GISs support this type of zoned buffer computation. An illustration is
provided in Figure 4.3.

Figure 4.3: buffer zone generation

In vector-based buffer generation, the buffers themselves become polygon features, usually in a
separate data layer, that can be used in further spatial analysis. Buffer generation on rasters is
a fairly simple function. The target location or locations are always represented by a selection of
the raster’s cells, and geometric distance is defined using the cell resolution as the unit. The
distance function applied is the Pythagorean distance between the cell centers. The distance from a
non-target cell to the target is the minimal distance one can find between that non-target cell and
any target cell.
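
The raster procedure just described can be sketched directly. The example below is a brute-force illustration that compares every cell with every target cell, which is fine for a small grid but not how an efficient GIS would do it. The function names, the 25 m cell size and the break values are assumptions made for the example, and the mask must contain at least one target cell.

```python
import math


def distance_to_targets(target_mask, cell_size=1.0):
    """Minimal Pythagorean distance from each cell centre to any target cell."""
    targets = [
        (r, c)
        for r, row in enumerate(target_mask)
        for c, is_target in enumerate(row)
        if is_target
    ]
    return [
        [
            min(math.hypot(r - tr, c - tc) for tr, tc in targets) * cell_size
            for c in range(len(row))
        ]
        for r, row in enumerate(target_mask)
    ]


def buffer_cells(distance_grid, radius):
    """True for the cells that lie within `radius` of a target."""
    return [[d <= radius for d in row] for row in distance_grid]


def zoned_buffer(distance_grid, breaks):
    """Zone number (1, 2, ...) of each cell; 0 for cells beyond the last break."""
    def zone(d):
        for index, limit in enumerate(breaks, start=1):
            if d <= limit:
                return index
        return 0
    return [[zone(d) for d in row] for row in distance_grid]


# One target cell in a 3 x 4 raster with hypothetical 25 m cells.
mask = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
dist = distance_to_targets(mask, cell_size=25.0)
inside_30m = buffer_cells(dist, radius=30.0)
zones = zoned_buffer(dist, breaks=[25.0, 50.0])
```

The zoned version corresponds to the zoned buffers mentioned above: zone 1 holds the cells within 25 m of the target, zone 2 those between 25 m and 50 m.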

D. Network analysis
A completely different set of analytic functions in GIS consists of computations on networks. A
network is a connected set of lines, representing some geographic phenomenon, typically of the
transportation type. The ‘goods’ transported can be almost anything: people, cars and other
vehicles along a road network, commercial goods along a logistic network, phone calls along a
telephone network, or water pollution along a stream/river network.

Network analysis can be done using either raster or vector data layers, but it is more
commonly done on the latter, as line features can be associated with a network naturally and can
be given typical transportation characteristics like capacity and cost per unit. One crucial
characteristic of any network is whether the network lines are considered directed or not.
Directed networks associate with each line a direction of transportation; undirected networks do
not. In the latter, the ‘goods’ can be transported along a line in both directions. We discuss here
vector network analysis, and assume that the network is a set of connected line features that
intersect only at the lines’ nodes, not at internal vertices.

For many applications of network analysis, a planar network, i.e., one that is embeddable in a
two-dimensional plane, will do the job. Many networks are naturally planar, like stream/river
networks. A large-scale traffic network, on the other hand, is not planar: motorways have
multi-level crossings and are constructed with underpasses and overpasses. Planar networks are
easier to deal with computationally, as they have simpler topological rules. Not all GISs
accommodate non-planar networks, or they can do so only using trickery. Such trickery may involve
splitting overpassing lines at the intersection vertex; without further attention, the network will
then allow a turn onto another line at this new intersection node, which in reality would be
impossible. This is a good illustration of geometry not fully determining the network’s behavior.
Additional application-specific rules are usually required to define what can and cannot happen in
the network. Most GISs provide rule-based tools that allow the definition of these extra
application rules. Various classical spatial analysis functions on networks are supported by GIS
software packages.
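
The effect of treating the same lines as directed or undirected can be seen in a small least-cost path computation. The sketch below uses Dijkstra's algorithm, which is assumed here as a typical example of a classical network function rather than taken from the text; the function names, node labels and costs are hypothetical.

```python
import heapq


def build_graph(lines, directed=True):
    """Build an adjacency list from (start_node, end_node, cost) lines."""
    graph = {}
    for start, end, cost in lines:
        graph.setdefault(start, []).append((end, cost))
        graph.setdefault(end, [])
        if not directed:
            graph[end].append((start, cost))   # travel allowed both ways
    return graph


def least_cost(graph, origin, destination):
    """Total cost of the cheapest route, or None if there is no route."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            return cost
        if cost > best.get(node, float("inf")):
            continue                            # outdated queue entry
        for neighbour, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return None


# Hypothetical one-way streets forming a loop, plus a dead-end line to D.
lines = [("A", "B", 4.0), ("B", "C", 3.0), ("C", "A", 2.0), ("A", "D", 1.0)]
one_way = build_graph(lines, directed=True)
two_way = build_graph(lines, directed=False)

cost_directed = least_cost(one_way, "A", "C")      # must go A -> B -> C
cost_undirected = least_cost(two_way, "A", "C")    # may use the C-A line backwards
cost_from_dead_end = least_cost(one_way, "D", "A") # no line leaves D
```

With the same geometry, the directed network forces the long way round and offers no route out of the dead end, while the undirected network allows the direct connection.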

1.3. Basic Raster Analysis

In raster overlay, the pixel or grid cell values in each map are combined using arithmetic and
Boolean operators to produce a new value in the composite map. The maps can be treated as
arithmetic variables in complex algebraic functions. This method is often described as
map algebra. A raster GIS thus provides the ability to combine map layers mathematically. This is
particularly important for modeling, in which various maps are combined using various
mathematical functions. Conditional operators are among the basic mathematical functions that are
supported in GIS.
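
Map algebra can be illustrated with small rasters held as nested lists. This is a minimal sketch under stated assumptions: the helper `combine`, the layer names and all cell values are hypothetical, and both rasters are assumed to have the same number of rows and columns.

```python
def combine(raster_a, raster_b, op):
    """Apply `op` cell by cell to two rasters of the same size."""
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(raster_a, raster_b)
    ]


# Hypothetical 2 x 3 input layers.
slope_ok = [[1, 1, 0], [0, 1, 1]]             # 1 = slope is acceptable
soil_ok = [[1, 0, 0], [1, 1, 0]]              # 1 = soil is acceptable
rain = [[100, 120, 90], [80, 110, 95]]        # rainfall
irrigation = [[0, 10, 30], [40, 0, 5]]        # added water

# Boolean operator: a cell is suitable only if both conditions hold.
suitable = combine(slope_ok, soil_ok, lambda a, b: int(a and b))

# Arithmetic operator: total water available in each cell.
total_water = combine(rain, irrigation, lambda a, b: a + b)

# Conditional operator: class 1 where total water reaches 120, otherwise 0.
wet_class = [[1 if value >= 120 else 0 for value in row] for row in total_water]
```

Each output cell depends only on the input cells at the same position, which is what makes raster overlay so simple compared with vector overlay.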
A. Neighborhood operations
Whereas overlays combine features at the same location, neighbourhood functions evaluate the
characteristics of an area surrounding a feature’s location. This allows us to look at buffer
zones around features, and at spreading effects if features are a source of something that
spreads (e.g., water springs, volcanic eruptions, sources of pollution).
i. Local operation: the output value of a cell is computed as a function of the value of
the same cell in input raster datasets (regardless of the values of neighboring cells).

ii. Neighborhood (focal) operation: the output value of a cell is computed as a function
of the values of the same cell and neighboring cells in the input datasets.

iii. Zonal operation: the output value for each location depends on the value of the cell
at the location and the association that location has within a cartographic zone.

iv. Global operation: the output value at each cell location is potentially a function of
all the cells in the input raster datasets.
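The four kinds of operation listed above can be compared side by side in a small sketch. The raster, the zone labels, and the particular functions chosen (multiply by ten, 3x3 mean, zone sum, deviation from the overall mean) are assumptions made for this example only.

```python
# A 3x3 value raster and a matching raster of zone labels (both invented).
raster = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
zones = [
    ["a", "a", "b"],
    ["a", "b", "b"],
    ["c", "c", "b"],
]
rows, cols = len(raster), len(raster[0])

# Local: each output cell depends only on the same input cell.
local = [[v * 10 for v in row] for row in raster]


# Focal: mean of the 3x3 window around each cell (window clipped at the edges).
def focal_mean(r, c):
    window = [raster[i][j]
              for i in range(max(0, r - 1), min(rows, r + 2))
              for j in range(max(0, c - 1), min(cols, c + 2))]
    return sum(window) / len(window)


focal = [[focal_mean(r, c) for c in range(cols)] for r in range(rows)]

# Zonal: each cell receives the sum of all cells that share its zone.
zone_sum = {}
for r in range(rows):
    for c in range(cols):
        zone_sum[zones[r][c]] = zone_sum.get(zones[r][c], 0) + raster[r][c]
zonal = [[zone_sum[zones[r][c]] for c in range(cols)] for r in range(rows)]

# Global: each cell is compared with a statistic of the whole raster.
overall_mean = sum(sum(row) for row in raster) / (rows * cols)
global_dev = [[v - overall_mean for v in row] for row in raster]

print(zonal)
```

The difference between the four lies only in which input cells are allowed to influence one output cell: the same cell, its window, its zone, or the entire raster.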


Unit Five: Spatial Data Analysis


The heart of GIS is the analytical capability of the system. Spatial analysis helps in identifying trends in
the data, creating new relationships from the data, viewing complex relationships between data sets, and
making better decisions. Simple analyses such as statistical summaries (maximum, minimum, mean, and
sum) and analysis of interrelationships between various geographically related variables can be carried
out in a GIS environment.
Spatial analysis is in many ways the crux of GIS because it includes all of the transformations,
manipulations, and methods that can be applied to geographic data to add value to them, to support
decisions, and to reveal patterns and anomalies that are not immediately obvious – in other words, spatial
analysis is the process by which we turn raw data into useful information, in pursuit of scientific
discovery, or more effective decision making. If GIS is a method of communicating information about the
Earth’s surface from one person to another, then the transformations of spatial analysis are ways in which
the sender tries to inform the receiver, by adding greater informative content and value, and by revealing
things that the receiver might not otherwise see.
Spatial analysis is the crux of GIS, the means of adding value to geographic data, and of turning data
into useful information. Spatial analysis can reveal things that might otherwise be invisible – it can
make what is implicit explicit.
The range of GIS analysis includes the spatial data analysis discussed above and some of the following
(of course, there are different classification approaches).
CLASSIFICATION OF ANALYTIC GIS CAPABILITIES
There are many ways to classify the analytic functions of a GIS. The classification used for this unit is
essentially the one put forward by Aronoff. It makes the following distinctions in function classes:
i. Measurement, and retrieval functions
Perhaps the initial GIS analysis that any user undertakes is the retrieval and/or reclassification of data.
These functions allow the data to be explored without making fundamental changes, and are therefore often
used at the beginning of data analysis. Measurement functions include computing distances between features
or along their perimeters, and computing the area size of 2D features or the volume of 3D features.
Counting, to understand the frequency of features, is also included. Spatial queries retrieve features
selectively, using user-defined, logical conditions. Classification means the (re)assignment of a thematic,
characteristic value to features in a data layer. All functions in this category are performed on a single
(vector or raster) data layer, often using the associated attribute data.

8.2.1 Measurement


Reclassification involves the selection and presentation of a selected layer of data based on the classes or
values of a specific attribute, e.g. cover group. It involves looking at an attribute, or a series of attributes,
for a single data layer and classifying the data layer based on the range of values of the attribute.
Accordingly, features adjacent to one another that have a common value, e.g. cover group, but differ in
other characteristics, e.g. tree height or species, will be treated and appear as one class. In raster-based
GIS software, numerical values are often used to indicate classes. Reclassification is an attribute
generalization technique. Typically this function makes use of polygon patterning techniques such as
crosshatching and/or color shading for graphic representation.
Geometric measurement on spatial features includes counting, distance and area size
computations. For the sake of simplicity, this section discusses such measurements in a
planar spatial reference system. We limit ourselves to geometric measurements, and do
not include attribute data measurement, which is typically performed in a database query
language. Measurements on vector data are more advanced, thus, also more complex,
than those on raster data. We discuss each group as follows.
a. Measurements on vector data
The primitives of vector data sets are point, (poly)line and polygon. Related geometric measurements are
location, length, distance and area size. Some of these are geometric properties of a feature in isolation
(location, length, area size); others (distance) require two features to be identified. The location property
of a vector feature is always stored by the GIS: a single coordinate pair for a point, or a list of pairs for a
polyline or polygon boundary. Occasionally, there is a need to obtain the location of the centroid of a
polygon; some GISs store these also, others compute them ‘on-the-fly’. Length is a geometric property
associated with polylines, by themselves, or in their function as polygon boundary. It can obviously be
computed by the GIS— as the sum of lengths of the constituent line segments—but it quite often is also
stored with the polyline. Area size is associated with polygon features. Again, it can be computed, but
usually is stored with the polygon as an extra attribute value. This speeds up the computation of other
functions that require area size values. Where these values are stored, the above measurements require
no computation, only a lookup in stored data.
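Measuring distance between two features is another important function. If both features are points, say p
and q, the computation in a Cartesian spatial reference system is given by the well-known Pythagorean
distance function:

$$\mathrm{dist}(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}$$

where (x_p, y_p) and (x_q, y_q) are the coordinates of p and q.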
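The vector measurements discussed in this subsection can be sketched as follows. This is a minimal illustration in planar coordinates; the function names are invented here, and the shoelace formula used for area is a standard technique that the text itself does not name.

```python
import math


def distance(p, q):
    """Pythagorean (Euclidean) distance between two points given as (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def polyline_length(vertices):
    """Length of a polyline: the sum of its constituent segment lengths."""
    return sum(distance(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a simple polygon using the shoelace formula.

    ring is a list of boundary vertices in order, first vertex not repeated.
    """
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


print(distance((0, 0), (3, 4)))
print(polyline_length([(0, 0), (3, 4), (3, 10)]))
print(polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)]))
```

A GIS that stores length and area as attributes would perform computations like these once, when the feature is created or edited, and look the values up afterwards.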

b. Measurements on raster data


Measurements on raster data layers are simpler because of the regularity of the cells. The area size of a
cell is constant, and is determined by the cell resolution. Horizontal and vertical resolution may differ, but
typically do not. Together with the location of a so-called anchor point, this is the only geometric
information stored with the raster data, so all other measurements are computed by the GIS. The anchor
point is fixed by convention to be the lower left (or sometimes upper left) location of the raster. The
location of an individual cell derives from the raster’s anchor point, the cell resolution, and the position
of the cell in the raster. Again, there are two conventions: the cell’s location can be its lower left corner,
or the cell’s midpoint. These conventions are set by the software in use, and it becomes more important to
be aware of them when working with low-resolution data. The area size of a selected part of the raster (a
group of cells) is calculated as the number of cells multiplied by the cell area size. The distance between
two raster cells is the standard distance function applied to the locations of their respective midpoints,
taking into account the cell resolution. Where a raster is used to represent line features as strings of cells
through the raster, the length of a line feature is computed as the sum of distances between consecutive cells.
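The raster measurements above follow directly from the anchor point and the cell resolution. The sketch below assumes a lower-left anchor, square cells, row 0 at the bottom, and the midpoint convention for cell location; the coordinates and the 30 m resolution are hypothetical values chosen for the example.

```python
import math

# Assumed raster definition (hypothetical values).
ANCHOR_X, ANCHOR_Y = 500000.0, 700000.0   # lower-left corner, in metres
CELL_SIZE = 30.0                          # square cells, in metres


def cell_midpoint(row, col):
    """Midpoint of a cell; row 0 is the bottom row, column 0 the leftmost."""
    x = ANCHOR_X + (col + 0.5) * CELL_SIZE
    y = ANCHOR_Y + (row + 0.5) * CELL_SIZE
    return (x, y)


def area_of_cells(n_cells):
    """Area of a group of cells: number of cells times the cell area."""
    return n_cells * CELL_SIZE * CELL_SIZE


def cell_distance(cell_a, cell_b):
    """Distance between the midpoints of two cells given as (row, col)."""
    (xa, ya), (xb, yb) = cell_midpoint(*cell_a), cell_midpoint(*cell_b)
    return math.hypot(xa - xb, ya - yb)


def line_length(cells):
    """Length of a line stored as a string of consecutive cells."""
    return sum(cell_distance(a, b) for a, b in zip(cells, cells[1:]))


print(cell_midpoint(0, 0))
print(area_of_cells(10))
print(cell_distance((0, 0), (3, 4)))
```

With an upper-left anchor or the corner convention for cell location, only `cell_midpoint` would need to change.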

8.2.2 Spatial selection queries


When exploring a spatial data set, the first thing one usually wants is to select certain features, to
(temporarily) restrict the exploration. Such selections can be made on geometric/spatial grounds, or on the
basis of attribute data associated with the spatial features. We discuss both techniques below.
i. Interactive spatial selection
In interactive spatial selection, one defines the selection condition by pointing at or drawing spatial
objects on the screen display, after having indicated the spatial data layer(s) from which to select features.
The interactively defined objects are called the selection objects; they can be points, lines, or polygons.
The GIS then selects the features in the indicated data layer(s) that overlap with (i.e., intersect, meet,
contain, or are contained in) the selection objects. These become the selected objects. As we have seen,
spatial data is usually associated with its attribute data (stored in tables) through a key/foreign key link.
Selection of features leads, via these links, to selection of the corresponding records. Vice versa,
selection of records may lead to selection of features. Interactive spatial selection answers questions
like “What is at . . . ?”
The selection object is a circle and the selected objects are the red polygons; they overlap with the
selection object.
All city wards that overlap with the selection object are selected (left), and their corresponding attribute
records are highlighted (right, only part of the table is shown).


Figure: Selected objects and their attributes


Spatial selection by attribute conditions
An attribute query applies an SQL query to a database, and the results are presented in table form. The
query can be used to join several tables or to return a subset of columns or rows from the original data in
the database. For example, a tourist may want to identify hotels in Bishoftu that have 20 or more bedrooms
and a minimum price of 200 Birr or less. The attribute query provides the answer from a relational
database table that has already been created. The expression may look like this:
["No_Beds" >= 20 AND "Min_price_" <= 200]. This kind of expression is known as a Boolean expression.
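The hotel query can be tried out with Python's built-in `sqlite3` module. Only the field names `No_Beds` and `Min_price_` come from the expression above; the table name `hotels`, the `Name` and `Town` columns, and all the records are hypothetical and exist only to make the sketch runnable.

```python
import sqlite3

# An in-memory database with an invented hotel table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hotels (Name TEXT, Town TEXT, No_Beds INTEGER, Min_price_ REAL)"
)
conn.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?, ?)",
    [
        ("Hotel A", "Bishoftu", 35, 150.0),
        ("Hotel B", "Bishoftu", 12, 120.0),   # too few bedrooms
        ("Hotel C", "Bishoftu", 50, 450.0),   # too expensive
        ("Hotel D", "Adama", 40, 180.0),      # wrong town
    ],
)

# The Boolean expression from the text forms the WHERE clause.
rows = conn.execute(
    'SELECT Name FROM hotels '
    'WHERE Town = ? AND "No_Beds" >= 20 AND "Min_price_" <= 200 '
    'ORDER BY Name',
    ("Bishoftu",),
).fetchall()
selected = [name for (name,) in rows]
print(selected)
conn.close()
```

In a GIS, the records returned by such a query are linked back to their features, so the matching hotels would also be highlighted on the map.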

Selecting features based on their distance


One may also want to use the distance function of the GIS as a tool in selecting features.
Such selections can search for features within a given distance from the selection objects, at a given
distance, or even beyond a given distance. There is a whole range of applications of this type of selection:


• Which clinics are within 2 kilometres of a selected school? (Information needed for the school
emergency plan.)
• Which roads are within 200 metres of a medical clinic? (These roads must have a high road
maintenance priority.)
The figure below illustrates a spatial selection using distance. Here, we executed the selection of the
second example above. Our selection objects were all clinics, and we selected the roads that pass within
200 metres of a clinic.
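The second example (roads within 200 metres of a clinic) can be sketched as below. The clinic and road coordinates are invented, roads are simplified to polylines in planar coordinates measured in metres, and the function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:               # degenerate segment: a single point
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_polyline_distance(p, vertices):
    """Distance from a point to the nearest segment of a polyline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(vertices, vertices[1:]))


# Selection objects (clinics) and the layer to select from (roads).
clinics = {"Clinic 1": (100.0, 100.0), "Clinic 2": (900.0, 50.0)}
roads = {
    "Road A": [(0.0, 0.0), (500.0, 0.0)],
    "Road B": [(0.0, 400.0), (1000.0, 400.0)],
    "Road C": [(1000.0, 0.0), (1000.0, 300.0)],
}


def roads_within(distance_m):
    """Names of roads that pass within distance_m of at least one clinic."""
    return sorted(
        name for name, line in roads.items()
        if any(point_polyline_distance(c, line) <= distance_m
               for c in clinics.values())
    )


selected_roads = roads_within(200.0)
print(selected_roads)
```

Replacing `<=` with `>` in `roads_within`, together with `all` instead of `any`, would select the roads beyond the given distance from every clinic.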

8.2.3 Overlay functions


In this section, we look at techniques of combining two spatial data layers and producing a third one from
them. The binary operators that we discuss are known as spatial overlay operators. We will first discuss
vector forms, and then raster overlay operators. Standard overlay operators take two input data layers, and
assume they are georeferenced in the same system, and overlap in study area. If either condition is not
met, the use of an overlay operator is senseless. The principle of spatial overlay is to compare the
characteristics of the same location in both data layers, and to produce a new characteristic for each
location in the output data layer. Which characteristic to produce is determined by a rule that the user can
choose. In raster data, as we shall see, these comparisons are carried out between pairs of cells, one from
each input raster. In vector data, the same principle of comparing locations pairwise applies, but the
underlying computations rely on determining the spatial intersections of features, one from each input
vector layer, pairwise.

ii. Overlay functions


This group forms the core computational activity of many GIS applications. Data layers are combined and
new information is derived, usually by creating features in a new layer. The computations are simpler for
raster data layers than for vector layers, but both can be used. The principle of overlay is to combine
features that occupy the same location. Many GISs support overlays through an algebraic language,
expressing an overlay function as a formula in which the data layers are the arguments. Different layers
can be combined using arithmetic, relational, and conditional operators and many different functions.

Vector overlay operators


In the vector domain, the overlaying of data layers is computationally more demanding than in the raster
domain. We will discuss here only overlays of polygon data layers, but remark that most of the ideas
carry over to overlaying with point or line data layers.
Two polygon layers A and B produce a new polygon layer (with associated attribute table) that contains
all intersections of polygons from A and B. The standard overlay operator for two layers of polygons is
the polygon intersection operator. It is fundamental, as many other overlay operators proposed in the
literature or implemented in systems can be defined in terms of it. The result of this operator is the
collection of all possible polygon intersections; the attribute table of the result is a join of the two
input attribute tables.
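A heavily simplified sketch of the polygon intersection operator is given below. It assumes that every polygon is an axis-aligned rectangle stored as (xmin, ymin, xmax, ymax), which is not how a real GIS stores polygons; the restriction only keeps the geometry short enough to read. The layers and attribute values are invented.

```python
# Each feature is (attribute value, rectangle), with the rectangle given as
# (xmin, ymin, xmax, ymax). Rectangles stand in for general polygons here.
layer_a = [("forest", (0, 0, 4, 4)), ("grass", (4, 0, 8, 4))]
layer_b = [("alluvial", (0, 0, 6, 2)), ("shale", (0, 2, 8, 4))]


def intersect(r1, r2):
    """Intersection of two rectangles, or None if they share no area."""
    xmin, ymin = max(r1[0], r2[0]), max(r1[1], r2[1])
    xmax, ymax = min(r1[2], r2[2]), min(r1[3], r2[3])
    if xmin < xmax and ymin < ymax:
        return (xmin, ymin, xmax, ymax)
    return None


def polygon_intersection(a, b):
    """All pairwise intersections, each carrying the attributes of both inputs."""
    result = []
    for attr_a, rect_a in a:
        for attr_b, rect_b in b:
            rect = intersect(rect_a, rect_b)
            if rect is not None:
                result.append((attr_a, attr_b, rect))
    return result


overlay = polygon_intersection(layer_a, layer_b)
for feature in overlay:
    print(feature)
```

Each output feature carries one attribute from each input layer, which mirrors the way the output attribute table combines the two input tables.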
Overlays using a decision table
Conditional expressions are powerful tools in cases where multiple criteria must be taken into account. A
small example may illustrate this. Consider a suitability study in which a land use classification and a
geological classification must be used. The respective rasters are illustrated in the figure below on the left.
Domain expertise dictates that some combinations of land use and geology result in suitable areas,
whereas other combinations do not. In our example, forests on alluvial terrain and grassland on shale are
considered suitable combinations, while the others are not.

Figure: The use of a decision table in raster overlay
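The decision table from this example can be written down directly as a lookup. The two suitable combinations come from the text; the raster contents and the extra land use class "lake" are invented to make the sketch complete.

```python
# Input rasters (invented values), cell-aligned with each other.
land_use = [
    ["forest", "forest", "grass"],
    ["grass", "forest", "lake"],
    ["grass", "grass", "lake"],
]
geology = [
    ["alluvial", "shale", "shale"],
    ["alluvial", "alluvial", "shale"],
    ["shale", "alluvial", "alluvial"],
]

# Decision table: the combinations that domain expertise considers suitable.
SUITABLE = {("forest", "alluvial"), ("grass", "shale")}

# Cell-by-cell overlay driven by the decision table.
suitability = [
    ["suitable" if (lu, geo) in SUITABLE else "unsuitable"
     for lu, geo in zip(lu_row, geo_row)]
    for lu_row, geo_row in zip(land_use, geology)
]

for row in suitability:
    print(row)
```

Adding a further suitable combination only requires adding one pair to `SUITABLE`; the overlay itself stays unchanged.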

8.2.4 Neighbourhood functions


In our section on overlay operators, the guiding principle was to compare or combine the characteristic
value of a location from two data layers, and to do so for all locations. This is what raster overlay
operations, for instance, gave us: cell-by-cell calculations, with the results stored in a new raster. There
is another guiding principle in spatial analysis that can be equally useful. The principle here is to find
out the characteristics of the vicinity, here called the neighbourhood, of a location. After all, many
suitability questions, for instance, depend not only on what is at the location, but also on what is near
the location. Thus, the GIS must allow us ‘to look around locally’. To perform neighbourhood analysis, we must:
1. State which target locations are of interest to us, and what is their spatial extent,
2. Define how to determine the neighbourhood for each target,
3. Define which characteristic(s) must be computed for each neighbourhood.

For instance, our target can be a medical clinic. Its neighbourhood can be defined as:
• An area within 2 km distance, as the crow flies, or
• An area within 2 km travel distance, or
• All roads within 500 m travel distance, or
• All other clinics within 10 minutes travel time, or
• All residential areas for which the clinic is the closest clinic.
Then, in the third step we indicate what characteristics to find out about the neighbourhood. This could
simply be its spatial extent, but it might also be statistical information like:
• How many people live in the area,
• What is their average household income, or
• Are any high-risk industries located in the neighbourhood?
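The three steps can be followed in a short sketch that uses the first neighbourhood definition (2 km as the crow flies). The clinic location and the household records are hypothetical, and coordinates are assumed to be planar and in metres.

```python
import math

# Step 1: the target location of interest.
clinic = (0.0, 0.0)

# Step 2: the neighbourhood is everything within 2 km, as the crow flies.
RADIUS_M = 2000.0

# Invented household records: (x, y, number of people, household income).
households = [
    (500.0, 500.0, 4, 3000.0),
    (1500.0, 1000.0, 6, 5000.0),
    (1900.0, 1900.0, 3, 8000.0),    # about 2.7 km away, outside the neighbourhood
    (-1200.0, 0.0, 5, 4000.0),
]

in_neighbourhood = [
    h for h in households
    if math.hypot(h[0] - clinic[0], h[1] - clinic[1]) <= RADIUS_M
]

# Step 3: the characteristics to compute for the neighbourhood.
population = sum(h[2] for h in in_neighbourhood)
average_income = sum(h[3] for h in in_neighbourhood) / len(in_neighbourhood)

print(population, average_income)
```

A neighbourhood defined by travel distance or travel time would need a road network instead of straight-line distance, but steps 1 and 3 would stay the same.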

iii. Neighbourhood functions


Whereas overlays combine features at the same location, neighbourhood functions evaluate the
characteristics of an area surrounding a feature’s location. This allows us to look at buffer zones around
features, and at spreading effects if features are a source of something that spreads (e.g., water springs,
volcanic eruptions, sources of pollution).

8.2.5 Proximity computation

In proximity computations, we use geometric distance to define the neighbourhood of one or more target
locations. The most common and useful technique is buffer zone generation.
Buffer zone generation
Buffering involves the ability to create distance buffers around selected features, be it points, lines, or
areas. Buffers are created as polygons because they represent an area around a feature. Buffering is also
referred to as corridor or zone generation with the raster data model.
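Buffer zone generation in the raster data model can be sketched as marking every cell whose midpoint lies within a given distance of a source cell. The grid, the 100 m cell size, and the function name `raster_buffer` are assumptions for this example; the brute-force search is chosen for readability, not speed.

```python
import math

CELL_SIZE = 100.0   # hypothetical cell resolution, in metres

# Source raster: 1 marks the feature to be buffered (a single cell here).
source = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]


def raster_buffer(grid, distance, cell_size):
    """Return a raster with 1 for cells within distance of any source cell."""
    rows, cols = len(grid), len(grid[0])
    sources = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c]]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Midpoint-to-midpoint distance, converted from cells to metres.
            if any(math.hypot(r - sr, c - sc) * cell_size <= distance
                   for sr, sc in sources):
                out[r][c] = 1
    return out


buffer_150 = raster_buffer(source, 150.0, CELL_SIZE)
for row in buffer_150:
    print(row)
```

With a 150 m distance the buffer covers the source cell and its eight neighbours, since the diagonal neighbours lie about 141 m away.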
8.2.6 Network analysis


A completely different set of analytic functions in GIS consists of computations on networks. A network
is a connected set of lines, representing some geographic phenomenon, typically of the transportation
type. The ‘goods’ transported can be almost anything: people, cars and other vehicles along a road
network, commercial goods along a logistic network, phone calls along a telephone network, or water
pollution along a stream/river network. Network analysis can be done using either raster or vector data
layers, but it is more commonly done in the latter, as line features can be associated with a network
naturally, and can be given typical transportation characteristics like capacity and cost per unit. One
crucial characteristic of any network is whether the network lines are considered directed or not. Directed
networks associate with each line a direction of transportation; undirected networks do not. In the latter,
the ‘goods’ can be transported along a line in both directions. We discuss here vector network analysis,
and assume that the network is a set of connected line features that intersect only at the lines’ nodes, not
at internal vertices.
For many applications of network analysis, a planar network, i.e., one that can be embedded in a
two-dimensional plane, will do the job. Many networks are naturally planar, like stream/river networks. A
large-scale traffic network, on the other hand, is not planar: motorways have multi-level crossings and are
constructed with underpasses and overpasses. Planar networks are easier to deal with computationally, as
they have simpler topological rules. Not all GISs accommodate non-planar networks, or they can do so only
using trickery. Such trickery may involve splitting overpassing lines at the intersection vertex; the
network will then allow a turn onto another line at this new intersection node, which in reality would be
impossible. The above is a good illustration of geometry not fully determining the network’s behaviour.
Additional application-specific rules are usually required to define what can and cannot happen in the
network. Most GISs provide rule-based tools that allow the definition of these extra application rules.
Various classical spatial analysis functions on networks are supported by GIS software packages.
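One classical network function is finding the least-cost path between two nodes. The sketch below uses Dijkstra's algorithm on a small directed network; the choice of algorithm, the node names, and the costs are assumptions made for illustration, since the text does not prescribe a particular method.

```python
import heapq

# Directed network: each node maps to a list of (neighbour, cost) pairs.
# A line that can be travelled both ways would be entered in both directions.
network = {
    "A": [("B", 4.0), ("C", 2.0)],
    "B": [("D", 5.0)],
    "C": [("B", 1.0), ("D", 8.0)],
    "D": [],
}


def shortest_path(graph, start, goal):
    """Return (total cost, list of nodes) of the least-cost path, or None."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, edge_cost in graph[node]:
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + edge_cost, neighbour, path + [neighbour])
                )
    return None   # the goal cannot be reached from the start


result = shortest_path(network, "A", "D")
print(result)
```

Because the network is directed, no path exists from D back to A. An overpass can be represented in the same structure by giving the two crossing lines no node in common, so that no turn between them is possible.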

4: Data Visualization

4.1 GIS and Map


4.2 The visualization process
4.3 Mapping data
4.4 Mapping qualitative data
4.5 Mapping quantitative data
4.6 Mapping terrain elevation

Data Output


GIS output represents the pinnacle (end) of many GIS projects. Since the purpose of information systems
is to produce results, this aspect of GIS is vitally important to many managers, technicians, and
scientists. Maps are a very effective way of summarizing and communicating the results of GIS
operations to a wide audience. The importance of map output is further highlighted by the fact that many
consumers of geographic information only interact with GIS through their use of map products. Analysis
outputs can also take the form of statistical summaries, graphs, and charts.
What are the outputs of GIS and the methods?
Data output methods

Methods                         Devices
1. Hard copy                    Printer, plotter
2. Soft copy                    Computer screen, database
3. Output of digital dataset    CD-ROM, computer network

5: Web Technology for GIS and Mapping


5.1 Principles of internet and the web
5.2 Principles of Open standards and web GIS

