Gis Exist Course
OF GIS [GeES3082]
1. Introduction
A Geographic Information System (GIS) is a tool for making and using spatial information.
It uses the power of computers to pose and answer geographic questions.
There is no clear-cut definition of GIS. Different people define it according to its capabilities and the
purpose for which it is applied. Some definitions are:
GIS is a computer system for the input, manipulation, storage and output of digital spatial data
within a particular organization (Clark, 1986)
GIS is a powerful set of tools for collecting, storing, and retrieving, as well as transforming and
displaying, spatial data from the real world (Burrough, 1989).
Elements of GIS
GIS is the combination of three words: Geographical, Information, and Systems.
Geographical – the ‘spatial key’, or location of features, is central to data handling, analysis
and reporting, which sets GIS apart from other database management systems.
This is the part of GIS that explains "spatially" where things are such as the location of nations, states,
counties, cities, schools, roads, rivers, lakes, and the list can go on and on. Spatially means where on the
earth's surface an object or feature is located. This can be as simple as the latitude and longitude of a
feature. The geographic feature or object can be anything of interest.
Department of Geography and Environmental Studies
YIRGAALEM SORSA (Msc) FUNDAMENTALS
OF GIS [GeES3082]
Information – without data and information GIS can have no role to play and good quality data are
critical if the results of analysis are to be reliable.
GIS information is the "data" or "attribute" information about specific features that we are interested in
such as the name of the feature, what the feature is, the location of the feature, and any other information
that is important. An example could be the name of a city, where it is located, how big it is in square feet
(area), its population, its population in the past, and any other information that is important.
Geographical Information is different from other kinds of information and therefore requires special
methods to be analyzed. Here are some of the characteristics that make geographical information special:
• Multidimensional – at least two coordinates must be specified to define a location
• Voluminous – a geographic database can easily reach a terabyte in size
• Different representations – the same feature can be represented in several ways, and the choice
can strongly influence the ease of analysis and the end results
• Requires projection to flat surface
• Requires unique analysis methods
• Analyses require data integration
• Data updates are expensive and time consuming
Systems – at a basic level they are computer-based systems, but it is important to remember that GIS are
rarely personal technology, so an understanding of how organizations manage data and use information is
critical to understanding and achieving effective use of GIS.
The system in GIS is the computer and the software that is written to help people analyze the data, look at
the data and combine it in various ways to show relationships or to create geographic models. A GIS can
be made up of a variety of software and hardware tools, as long as they are integrated to provide a
functional geographic data processing tool.
GIS is a technology that integrates powerful database capabilities with the unique visual perspective of
a map. It involves information about the real world that is represented by points, lines, areas and images,
at any scale ranging from local to global. GIS operates on two data elements: spatial and non-spatial
(attribute) data. Generally speaking, GIS has five components: hardware, software, data, methods
and people. Is GIS about software only? Is it about computers only?
Some definitions of GIS focus on the hardware, software, data and analysis components. However, no
GIS exists in isolation from its organizational context, and there must always be people to plan,
implement and operate the system, as well as to make decisions based on the output.
The central processing unit (CPU) forms the backbone of the GIS hardware. Other components include a
scanner, digitizer board, printer, plotter, and storage devices. All these components should be connected
to the CPU. The hardware components mentioned above are discussed below.
The hardware of a GIS is composed of input devices, processing and storage devices, and output devices.
Input devices
Digital data input depends on the type of data to be utilized. Imagery can be input from analogue
images through the use of image scanners. Digital airborne and space-borne systems already use
charge-coupled device (CCD) sensors to supply the data in digital form.
Software components
Software that is used to create, manage, analyze and visualize geographic data, i.e. data with a reference
to a place on earth, is usually denoted by the umbrella term ‘GIS software’. Typical applications for GIS
software include the evaluation of places for the location of new stores, the management of power and gas
lines, the creation of maps, the analysis of past crimes for crime prevention, route calculations for
transport tasks, the management of forests, parks and infrastructure, such as roads and water ways, as well
as applications in risk analysis of natural hazards, and emergency planning and response.
For this multitude of applications different types of GIS functions are required and different categories of
GIS software exist, which provide a particular set of functions needed to fulfill certain data management
tasks. We will first explain important GIS software concepts, then list the typical tasks accomplished with
GIS software, describe different GIS software categories, and finally provide information on software
producers and projects.
Types of software required for GIS:
A. Basic computer software
The operation of a computer is based on its operating system, which ensures that all parts of the computer
work together. Most common are Microsoft’s operating systems for PCs. In MS-DOS (Microsoft
Disk Operating System) operation is controlled by text commands, which permits the administering of files
by name. More modern are Windows operating systems such as Windows 3.1, Windows 95, Windows 98,
Windows NT, Windows 2000, Windows ME and Windows XP, which use graphic symbols (icons).
Windows acts as a graphical user interface (GUI). Windows is now a network-compatible system.
Data:
Perhaps the most important component of a GIS is the data. A GIS can integrate spatial data with other
existing data resources, often stored in a corporate DBMS. The integration of spatial data (often
proprietary to the GIS software), and tabular data stored in a DBMS is a key functionality afforded by
GIS.
Like all useful data, geographic data is expected to possess desirable properties of accuracy, timeliness,
comprehensiveness, acceptable cost etc. Other general issues relating to geographic data include spatial
extent (the area covered), scale (the detail in the system), the large volume (both attribute data and
graphic data can make large storage demands), diversity (data of interest plus background data),
collection cost (despite technological advances, field collection of data can still be very labor intensive),
etc. Scale is important not only for graphic representation in map form but also as it impacts on other
issues such as map coverage extent, data volume and data collection.
Major sources of geographic information: maps, aerial photographs, remotely sensed imagery and digital
datasets available from various vendors. Today, in most developed countries there is a declining emphasis
on production of printed maps by mapping agencies as geographic information collection is shifting to
either remote sensing or to the use of GPS for field data collection. Increasingly there is integration of
GPS and GIS for field data collection.
Methods:
A successful GIS operates according to a well-designed implementation plan and business rules, which
are the models and operating practices unique to each organization.
Procedures include how the data will be retrieved, input into the system, stored, managed,
transformed, analyzed, and finally presented in a final output.
The procedures are the steps taken to answer the question that needs to be resolved. The ability of a
GIS to perform spatial analysis and answer these questions is what differentiates this type of
system from other information systems.
As in all organizations dealing with sophisticated technology, new tools can only be used effectively if
they are properly integrated into the entire business strategy and operation. To do this properly requires
not only the necessary investments in hardware and software, but also in the retraining and/or hiring of
personnel to utilize the new technology in the proper organizational context.
People: People refers to the users, who can be considered the component of a GIS that actually makes the
GIS work. Effective use of GIS requires an organization to support various GIS activities.
People in GIS usually include a plethora of positions including GIS managers, database administrators,
application specialists, systems analysts, and programmers. They are responsible for maintenance of the
geographic database and provide technical support. People also need to be educated to make decisions on
what type of system to use. People associated with a GIS can be categorized into: viewers, general users,
and GIS specialists.
Network
The use of the WWW to give access to maps dates from 1993. The recent histories of GIS and the
Internet have been heavily intertwined; GIS has turned out to be a compelling application that has
prompted many people to take advantage of the Web. At the same time, GIS has benefited greatly from
adopting the Internet paradigm and the momentum that the Web has generated. Applications range from using
GIS on the Internet to disseminate information, to selling goods and services, to direct revenue generation
through subscription services, to helping members of the public participate in important local, regional,
and national debates.
A GIS is a computer-based system that provides the following four subsystems to handle geo-referenced
data.
A. Data Input Subsystem
A Data Input subsystem allows the user to capture, collect, and transform spatial and thematic data into
digital form. The data inputs are usually derived from a combination of hard copy maps, aerial
photographs, remotely sensed images, reports, survey documents, etc.
B. Data management (Data Storage, Editing and Retrieval Subsystem)
The second necessary component for a GIS is the data storage and retrieval subsystem. The Data Storage
and retrieval subsystem organizes the data, spatial and attribute, in a form, which permits it to be quickly
retrieved by the user for analysis, and permits rapid and accurate updates to be made to the database. This
component usually involves use of a database management system (DBMS) for maintaining attribute
data. Spatial data is usually encoded and maintained in a proprietary file format.
Organizing Data for Analysis: Most GIS software organizes spatial data in a thematic approach
that categorizes data in vertical layers.
Editing and Updating of Data: Perhaps the primary function in the data storage and retrieval
subsystem involves the editing and updating of data.
Data Retrieval and Querying: The ability to retrieve data is based on the unique structure of the
DBMS, and command interfaces are commonly provided with the software.
C. Data Manipulation and Analysis Subsystem
The Data Manipulation and Analysis subsystem allows the user to define and execute spatial and
attribute procedures to generate derived information. This subsystem is commonly thought of as the
heart of a GIS, and usually distinguishes it from other database information systems and computer-aided
drafting (CAD) systems.
Manipulation and Transformations of Spatial Data
The maintenance and transformation of spatial data concerns the ability to input, manipulate, and
transform data once it has been created. Some specific functions are:
Coordinate thinning: involves the reduction of the coordinate pairs (X and Y) from arcs.
Geometric Transformations
Map Projection Transformations
Edge Matching
Interactive Graphic Editing
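Coordinate thinning can be illustrated with a minimal sketch. The rule used here is an assumption chosen for illustration, not a method prescribed by these notes: a vertex is dropped when it lies closer than a tolerance to the last vertex that was kept, and the two end nodes of the arc are always retained. The function name thin_arc, the sample arc and the tolerance value are all hypothetical.

```python
import math

def thin_arc(coords, tolerance):
    """Drop vertices closer than `tolerance` to the last kept vertex.

    The first and last coordinate pairs (the end nodes of the arc) are
    always kept. This is an illustrative rule, not a standard algorithm
    taken from the notes.
    """
    if len(coords) <= 2:
        return list(coords)
    kept = [coords[0]]
    for x, y in coords[1:-1]:
        last_x, last_y = kept[-1]
        if math.hypot(x - last_x, y - last_y) >= tolerance:
            kept.append((x, y))
    kept.append(coords[-1])
    return kept

# Hypothetical arc with two nearly redundant vertices.
arc = [(0.0, 0.0), (0.2, 0.1), (1.0, 0.0), (1.1, 0.1), (2.0, 0.0), (3.0, 0.0)]
thinned = thin_arc(arc, tolerance=0.5)
print(thinned)
```

A larger tolerance removes more coordinate pairs and so reduces data volume, at the cost of a coarser representation of the arc.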
Analytical Functions in a GIS
The primitive analytical functions that must be provided by any GIS are:
Retrieval, Reclassification, and Generalization
Topological Overlay Techniques
Neighborhood Operations
Connectivity Functions
D. Data Output and Display Subsystem.
The Data Output subsystem allows the user to generate graphic displays, normally maps, and tabular
reports representing derived information products. This subsystem conveys the results of analysis to the
people who make decisions about resources. Wall maps and other graphics can be generated, allowing the
viewer to visualize and thereby understand the results of analyses or simulations of potential events.
However, there is another way to describe GIS: by listing the types of questions (capabilities) the
technology can (or should be able to) answer. These cover locations, conditions, trends, patterns,
modeling, and both spatial and non-spatial questions. There are five types of questions that a GIS can
answer:
i. Query for location: what is at………?
The first of these questions seeks to find out what exists at a particular location. A location can be
described in many ways, using, for example, a place name, postcode, or geographic reference such as
longitude/latitude or x/y coordinates. Mapped data primarily indicates where objects are located, but
cannot explain why. For example, an aerial photo may show that corn is growing vigorously in certain
sections of a field, but cannot explain why it does not grow well in other areas.
ii. Query for Condition: where is it…………?
The second question is the converse of the first and requires spatial data to answer. Frequently a GIS user
wants to discover whether the mapped data will meet certain conditions. That means instead of
identifying what exists at a given location, one may wish to find location(s) where certain conditions are
satisfied (e.g., an unforested section of at least 2000 square meters in size, within 100 meters of a road,
and with soils suitable for supporting buildings).
iii. Trend analysis: what has changed since…………..?
The third question might involve both the first two and seeks to find the differences (e.g. in land use or
elevation) over time. This can help to address temporal changes of earth’s phenomena.
iv. Pattern analysis: what spatial patterns exist…………..?
This question is more sophisticated. One might ask this question to determine whether landslides are
mostly occurring near streams. It might be just as important to know how many anomalies there are that
do not fit the pattern, and where they are located.
v. Modeling: what if……………..?
"What if…" questions are posed to determine what happens, for example, if a new road is added to a
network or if a toxic substance seeps into the local ground water supply. Answering this type of question
requires both geographic and other information (as well as specific models). GIS permits spatial
operation.
In addition to all these capabilities, GIS can also handle non-spatial questions. For instance,
"What's the average number of people working with GIS in each location?" is a non-spatial question: the
answer does not require the stored values of latitude and longitude, nor does it describe where the
places are in relation to each other.
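The difference between a location query, a condition query and a non-spatial question can be sketched with a small in-memory table. Everything in this sketch is hypothetical: the parcels, their coordinates and attribute values, and the function names what_is_at and where_is. A real GIS would use spatial indexes and true geometries rather than single points.

```python
from statistics import mean

# Hypothetical features: a point location (x, y in meters) plus attributes.
parcels = [
    {"id": 1, "x": 100.0, "y": 200.0, "land_use": "forest", "area_m2": 5000, "staff": 4},
    {"id": 2, "x": 150.0, "y": 220.0, "land_use": "open", "area_m2": 2500, "staff": 2},
    {"id": 3, "x": 900.0, "y": 950.0, "land_use": "open", "area_m2": 1200, "staff": 6},
]

def what_is_at(features, x, y, radius):
    """Query for location: which features lie within `radius` of (x, y)?"""
    return [f["id"] for f in features
            if (f["x"] - x) ** 2 + (f["y"] - y) ** 2 <= radius ** 2]

def where_is(features, land_use, min_area_m2):
    """Query for condition: where are the features that meet the conditions?"""
    return [(f["x"], f["y"]) for f in features
            if f["land_use"] == land_use and f["area_m2"] >= min_area_m2]

near = what_is_at(parcels, 120.0, 210.0, radius=50.0)
sites = where_is(parcels, "open", 2000)

# Non-spatial question: the answer uses no coordinates at all.
avg_staff = mean(f["staff"] for f in parcels)
```

The first function starts from a location and returns features; the second starts from conditions and returns locations; the average uses only attribute values.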
1.4. Applications and Purposes of GIS
Daily life with GIS illustrates the unprecedented frequency with which, directly or indirectly, we
interact with digital machines. Today, more and more individuals and organizations find themselves using
GIS to answer the fundamental question, where?
Why Study GIS?
• 80% of local government activities are estimated to be geographically based
– plats, zoning, public works (streets, water supply, sewers), garbage collection, land
ownership and valuation, public safety (fire and police)
• a significant portion of state government has a geographical component
– natural resource management
– highways and transportation
• businesses use GIS for a very wide array of applications
– retail site selection & customer analysis
– logistics: vehicle tracking & routing
– natural resource exploration (petroleum, etc.)
– precision agriculture
– civil engineering and construction
• Military and defense
– Battlefield management
– Satellite imagery interpretation
• scientific research employs GIS
– geography, geology, botany
– anthropology, sociology, economics, political science
– Epidemiology, criminology
Generally, the application of geospatial sciences has spread very fast and wide over the past few decades.
There is, quite simply, a huge range of applications of GIS, which are explained in this section. These
include topographic base mapping, socio-economic and environmental modeling, global (and
interplanetary!) modeling, and education. Applications generally set out to fulfill the five Ms of GIS:
mapping, measurement, monitoring, modeling, management.
General and specialized GIS systems have been designed for a variety of purposes:
• For environmental management and conservation.
• For defense and intelligence purposes.
• For governmental administration.
• For resource management in agriculture and forestry.
• For geophysical exploration.
Spatial Data
Spatial data is also known as geospatial (coordinate) data or geographic information. It is the data or
information that identifies the geographic location of features and boundaries on Earth, such as natural or
constructed features, parcels, roads, buildings and more. In other words, it describes the absolute and
relative location of geographic or spatial features.
Geographic phenomena and features are infinite and have complex relationships with each other.
To represent geographic entities (also called geographic phenomena), you have to refer back to the
principle of interpolation. In addition, an entity or geographic feature occupies a position in space about
which data describing the attributes of the entity and its geographic location are recorded.
A geographical entity is defined in terms of:
Location (spatial reference)
Dimensions
Attribute
Time
It is common in spatial analysis to refer to places as (spatial) objects.
Spatial data is usually stored as coordinates and topology, and is data that can be mapped. Spatial data use
Cartesian coordinate systems. Two-dimensional Cartesian coordinate systems define x and y axes in a
plane. The three dimensional Cartesian system defines a z axis, orthogonal to both the x and y axes. An
origin is defined with zero values at the intersection of the orthogonal axes. Spatial data is often accessed,
manipulated or analyzed through GIS.
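Because the axes are orthogonal, the straight-line distance between two points follows directly from their coordinates. The sketch below is a minimal illustration; the function name distance and the sample points are hypothetical, and it assumes all coordinates are in the same linear unit.

```python
import math

def distance(p, q):
    """Straight-line distance between two points with the same number of axes."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# 2-D case: x and y, measured from the origin (0, 0).
d2 = distance((0.0, 0.0), (3.0, 4.0))

# 3-D case: the z axis adds a third squared term.
d3 = distance((0.0, 0.0, 0.0), (2.0, 3.0, 6.0))
```

The same function serves both cases because the z axis simply contributes one more squared difference to the sum.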
Different maps may use different projection systems. Georeferencing tools contain methods to
combine and overlay these maps with minimum distortion.
Using georeferencing methods, data obtained from surveying tools like total stations may be
given a point of reference from topographic maps already available.
Projected Coordinate System
• x,y coordinates referred to as “eastings” & “northings”
• Units can be in meters, feet, inches
• The projection name, type, and other parameters are defined by a grid mapping variable.
• It is the result of different types of map projection
In GIS, spatial data should be projected using the appropriate map projection.
2.1.2. Spatial Data types
set of geographic entities derived from a common set of criteria, thus sharing spatial character
and structure, e.g., ownership parcels, intersections, street segments, etc. These geographic
features can be counted and, if we like, listed in a table with their attributes. Objects are therefore
distinguished by their dimensions, and naturally fall into categories of points, lines, or areas
when we represent them in GIS.
GEOGRAPHIC ENTITIES
Geographic entities can also be called geographic phenomena. GIS supports such study because it
represents phenomena digitally in a computer. An entity or geographic feature occupies position in space
about which data describing the attributes of the entity and its geographic location are recorded. It is a
discrete generic class with basic connectedness and interdependence as a single data set, i.e., land use as a
class has separate entities of residential, commercial, industrial, agricultural, etc. The class is a set of
geographic entities derived from a common set of criteria, thus sharing spatial character and structure,
e.g., ownership parcels, intersections, street segments, etc.
Types of Geographic entity
The fundamental observation is that some phenomena manifest themselves essentially everywhere in the
study area while others only occur in certain localities. Therefore, geographic entities can be broadly
classified into two types, based on how they manifest themselves within the area of consideration.
i. Geographic fields
Geographic fields are geographic phenomena for which a value can be determined at every point in the
study area. They manifest themselves essentially everywhere in the study area. The usual examples of
geographic fields are temperature, pressure, elevation, etc. These fields are continuous in nature and are
characterized by their fuzzy boundaries.
ii. Geographic objects
In contrast to the geographic fields discussed above, many other phenomena do not
manifest themselves everywhere in the study area, but only in certain localities. These entities populate
the study area, are usually distinguishable one from the other, and can be characterized by their
discrete boundaries. The space between them is potentially empty. Examples include buildings, roads,
parcels, rivers, etc. Their position in space can be determined by a combination of:
Location (where is it?)
Shape (what form is it?)
Size (how big is it?)
Orientation (in which direction is it facing?)
A continuous field of elevation, for example, varies much more smoothly in a landscape that has
been worn down by glaciations or flattened by blowing sand than one recently created by cooling
lava. Cliffs are places in continuous fields where elevation changes suddenly, rather than
smoothly. Population density is a kind of continuous field, defined everywhere as the number of
people per unit area, though the definition breaks down if the field is examined so closely that
the individual people become visible. Continuous fields can also be created from classifications
of land, into categories of land use, or soil type. Such fields change suddenly at the boundaries
between different classes. Other types of fields can be defined by continuous variation along
lines, rather than across space. Traffic density, for example, can be defined everywhere on a road
network, and flow volume can be defined everywhere on a river.
Other geographic phenomena do not have clear-cut geographic boundaries. They manifest
themselves essentially everywhere in the study area. These fields are continuous in
nature and are characterized by their fuzzy boundaries. The usual examples of geographic
fields are temperature, pressure, elevation, etc.
Continuous Field View: In this view, there is no clear-cut boundary between
geographic objects and space. The geographic space potentially contains an infinite amount of
information if it defines the value of the variable at every point. Since there are an infinite
number of points in any defined geographic area, it is impossible to represent them all in the computer,
because, as noted in the discussion above, storage space in computers is limited.
Discrete Object View: The second conceptual geographic data model is the discrete object view,
which is based on the assumption that geographic objects have well-defined boundaries. In this
view, although objects in geographic space are discrete (finite, such as mountains, buildings, roads,
etc.), the objects can also require an infinite amount of information for full description. For
example, although Lake Tana has a definite boundary, it contains an infinite amount of
information if it is mapped in infinite detail. Thus it is not possible to include and study all
information in the computer representation.
Continuous fields and discrete objects define two conceptual views of geographic phenomena,
but they do not solve the problem of digital representation. Therefore, continuous fields and
discrete objects are no more than conceptualizations, or ways in which we think about
geographic phenomena in GIS; they are not designed to deal with the limitations of computers.
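One way to see the gap between the conceptual view and the digital representation is to sample a field at a finite set of points. The sketch below assumes a made-up elevation function and a regular grid of cell centres; the names elevation and sample_field and all parameter values are hypothetical, chosen only to illustrate the idea.

```python
def elevation(x, y):
    """Hypothetical continuous field: a value exists at every (x, y)."""
    return 100.0 + 2.0 * x + 0.5 * y

def sample_field(field, x_min, y_min, cell_size, n_rows, n_cols):
    """Evaluate `field` at the centre of each cell of a regular grid."""
    grid = []
    for row in range(n_rows):
        y = y_min + (row + 0.5) * cell_size
        grid.append([field(x_min + (col + 0.5) * cell_size, y)
                     for col in range(n_cols)])
    return grid

# Six stored values stand in for infinitely many points of the field.
grid = sample_field(elevation, x_min=0.0, y_min=0.0,
                    cell_size=10.0, n_rows=2, n_cols=3)
```

Whatever happens between the sampled points is lost, which is why the choice of cell size matters for any stored representation of a field.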
Features on the earth's surface are represented in GIS by their location. Such data is commonly stored
on the basis of one or more classes. Many GIS software packages design their databases around a
particular level of classification: features such as roads, rivers, or vegetation types are grouped into
so-called layers or coverages. The layers are connected by a common identification. The layers can be
combined (overlaid) with each other in various ways to create new layers. Raster and vector are the two
methods of representing geographic data in digital computers in the GIS data model. The choice of data
structure affects both data volume and processing efficiency.
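Combining layers to create a new layer can be sketched with two small grids that cover the same area. The layer names, cell values and the overlay function below are hypothetical, and the sketch assumes both layers share the same extent and cell size, which real data would first have to be brought into.

```python
# Two hypothetical layers on the same grid.
# 1 marks cells where the condition holds, 0 where it does not.
suitable_soil = [
    [1, 1, 0],
    [0, 1, 1],
]
near_road = [
    [1, 0, 0],
    [1, 1, 0],
]

def overlay(layer_a, layer_b, combine):
    """Combine two layers cell by cell into a new layer."""
    if len(layer_a) != len(layer_b) or any(
        len(ra) != len(rb) for ra, rb in zip(layer_a, layer_b)
    ):
        raise ValueError("layers must have the same number of rows and columns")
    return [[combine(a, b) for a, b in zip(ra, rb)]
            for ra, rb in zip(layer_a, layer_b)]

# Cells that satisfy both conditions, and cells that satisfy at least one.
both = overlay(suitable_soil, near_road, lambda a, b: a & b)
either = overlay(suitable_soil, near_road, lambda a, b: a | b)
```

Passing a different combine function gives a different derived layer from the same two inputs.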
3.4.1.1. Vector data model: It is a discrete data model. In this model geographic features are represented
discretely in the form of point, line, and polygon. In the vector data model each object in the real world is
first classified into a geometric type: in the 2-D case point, line, or polygon (Figure ).
Points (e.g., wells, soil pits (depth), and retail stores) are recorded as single coordinate pairs, lines (e.g.,
roads, streams, and geologic faults) as a series of ordered coordinate pairs (also called polylines), and
polygons (e.g., census tracts, soil areas, and oil license zones) as one or more line segments that close to
form a polygon area. The coordinates that define the geometry of each object may have 2, 3, or 4
dimensions: 2 (x, y: row and column, or latitude and longitude), 3 (x, y, z: the addition of a height value),
or 4 (x, y, z, m: the addition of another value to represent time or some other property – perhaps the offset
of road signs from a road centerline, or an attribute).
Vector data - objects represented in vector data structures are determined by an x, y location in
coordinate space. Vector data sets are composed of single points, lines, polylines or arcs (connected
string of points), and polygons (series of coordinates that define an enclosed region).
arc= series of line segments bounded by nodes at end points and vertices.
topology= the way a vector GIS uses points, lines, and polygons to represent map features.
Figure: examples of vector features and their coordinates.
Points      Coordinates
1           (2, 4)
2           (6, 8)
Polyline    Coordinates
1           (2, 3), (3, 6), (6, 10)
Polygon     Coordinates
1           (1, 4), (4, 5), (5, 6), (6, 8), (9, 1)
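The coordinate lists above are enough to derive simple measurements. The sketch below stores the point, polyline and polygon from the example as plain Python tuples and lists, then computes the polyline length and the polygon area with the shoelace formula. The function names are hypothetical, and the polygon ring is assumed to close back to its first vertex.

```python
import math

point = (2, 4)
polyline = [(2, 3), (3, 6), (6, 10)]
# The ring is assumed to close back to the first vertex.
polygon = [(1, 4), (4, 5), (5, 6), (6, 8), (9, 1)]

def polyline_length(coords):
    """Sum of the lengths of the segments between consecutive vertices."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(coords, coords[1:]))

def polygon_area(ring):
    """Shoelace formula for a simple polygon; the ring is closed implicitly."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

length = polyline_length(polyline)
area = polygon_area(polygon)
```

The results are in the units of the coordinates: map units for the length and square map units for the area.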
3.4.1.2. Raster data model: It is a continuous data model. Geographical features are made up of a matrix
of pixels (cells), each containing a value that represents the conditions covered by that cell. Examples
include aerial photographs and satellite images.
The size of each cell is generally determined by some type of header record for the file that describes
the coordinate of the origin (row 1, column 1) of the file and the x,y dimensions of the cells in the file.
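How a header record ties cells to map coordinates can be sketched as follows. The header field names and values are hypothetical and do not correspond to any particular file format. The sketch assumes the origin is the upper-left corner of cell (row 1, column 1), with rows numbered downward (decreasing y) and columns numbered to the right (increasing x).

```python
# Hypothetical header record; the field names are illustrative only.
header = {
    "origin_x": 500000.0,   # x of the upper-left corner of cell (row 1, column 1)
    "origin_y": 4200000.0,  # y of the same corner
    "cell_width": 30.0,     # x dimension of a cell
    "cell_height": 30.0,    # y dimension of a cell
}
cells = [
    [12, 15, 11],
    [14, 18, 13],
]

def cell_center(header, row, col):
    """Map a 1-based (row, col) to the x, y of the cell centre."""
    x = header["origin_x"] + (col - 0.5) * header["cell_width"]
    y = header["origin_y"] - (row - 0.5) * header["cell_height"]
    return x, y

def cell_at(header, x, y):
    """Map x, y to the 1-based (row, col) of the cell that contains it."""
    col = int((x - header["origin_x"]) // header["cell_width"]) + 1
    row = int((header["origin_y"] - y) // header["cell_height"]) + 1
    return row, col

center = cell_center(header, 2, 3)
row, col = cell_at(header, *center)
value = cells[row - 1][col - 1]
```

Only the origin and the cell dimensions are stored; the position of every other cell is computed from its row and column.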
Raster Disadvantages
1. Spatial inaccuracy
2. Resolution
3. Large storage requirements for simple data (note that complex landscapes can actually take more
storage with vector data than raster).
4. General perception that maps are based on vector data (points, lines polygons) because this is how we
tend to visualize features on the earth.
Vector Advantages
A. Closer to what we expect of map data (points, lines, polygons)
B. Generally better resolution than raster on detailed landscapes
C. Generally better spatial accuracy
D. Advantage of topology (can represent connectivity and interrelationships)
Vector Disadvantages
1. Difficult to manage on a computer
2. Slow to process complex data sets on low-end computers
3. More costly to use, given the two points above
The combination of the spatial and attribute data along with the creation, editing, data retrieval, and
output capabilities is what makes up the sum total of a GIS.
The database allows us to manipulate information in many ways: from simple listing of attributes,
sorting features by some attributes, grouping by attributes, or selecting and singling out groups by
attributes.
Attribute data describes characteristics of the spatial features. These characteristics can be quantitative
and/or qualitative in nature. Attribute data is often referred to as tabular data. For example, the coordinate
location of a forestry stand would be spatial data, while the characteristics of that forestry stand, e.g.
cover group, dominant species, crown closure, height, etc., would be attribute data. Other data types, in
particular image and multimedia data, are becoming more prevalent with changing technology.
Depending on the specific content of the data, such data may be considered either spatial, e.g.
photographs, animation, movies, etc., or attribute, e.g. sound, descriptions, narrations, etc.
Attribute data are used to record the non-spatial characteristics of an entity. Attributes are also called
items or variables. Attributes may be envisioned as a list of characteristics that help describe and define
the features we wish to represent in a GIS. Color, depth, weight, owner, component vegetation type, or
land use are examples of variables that may be used as attributes. Attributes have values, e.g. color may
be blue, black or brown, weight from 0.0 to 500, or land use may be urban, agriculture, or undeveloped.
Attributes are often presented in tables, with attributes arranged in rows and columns. Each row
corresponds to an individual spatial object and each column corresponds to an attribute.
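The row-and-column layout, and the listing, sorting, grouping and selecting operations mentioned earlier, can be sketched with a small table held as a list of dictionaries. The forest stands, attribute names and values below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical attribute table: one row per spatial object,
# one key (column) per attribute.
stands = [
    {"id": 1, "species": "pine", "height_m": 18.0, "land_use": "forest"},
    {"id": 2, "species": "oak", "height_m": 12.5, "land_use": "forest"},
    {"id": 3, "species": "pine", "height_m": 22.0, "land_use": "park"},
]

# Listing one attribute for every object.
species_list = [row["species"] for row in stands]

# Sorting features by an attribute (tallest first).
by_height = sorted(stands, key=lambda row: row["height_m"], reverse=True)

# Grouping object ids by an attribute.
groups = defaultdict(list)
for row in stands:
    groups[row["species"]].append(row["id"])

# Selecting a subset by attribute value.
tall = [row["id"] for row in stands if row["height_m"] > 15.0]
```

In a GIS the id column would also link each row to the geometry of the corresponding spatial object.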
Types of Attribute data
Attributes of different types may be grouped together to describe the non-spatial properties of each object
in the database. These attribute data may take many forms but all attribute data can be categorized as
nominal, ordinal, or interval/ratio attributes.
Nominal: Geographic features that have names only. So, you can’t compare their descriptive information
to any other. Place names, Color, vegetation types, city name, owner of the parcel or soil series are all
examples of nominal attributes. Each serves only to identify the particular instance of a class of entities
and to distinguish it from other members of the same class. Nominal attributes include numbers, letters,
and even colors. Even though a nominal attribute. can be numeric it makes no sense to apply arithmetic
operations to it: adding two nominal attributes, such as two drivers’ license numbers, creates nonsense.
There is no implied order, size, or quantitative information contained in the nominal attributes.
Nominal attributes may also be images, audio recordings, or other descriptive information. Just as the
color or type attributes provide nominal information for an entity, an image also provides descriptive
information. Examples of nominal descriptions you might find on a map include the Addis Ababa,
LakeTana, etc.
Ordinal: Geographic features that you can compare by rank. You could have short, medium, and tall
trees; dirt roads, paved roads, highways, and superhighways; or large, medium, and small chemical spills.
An ordinal attribute may be descriptive, such as small, medium, or large, or it may be numeric, such as an
erosion class which takes values from 1 through 10. The order reflects only ranks, and does not specify
the form of the scale. An object with an ordinal attribute that has a value of four has a higher rank for that
attribute than an object with a value of two. However, we cannot infer that the attribute value is twice as
large, because we cannot assume the scale is linear. Averaging makes no sense either, but the median, or
the value such that half of the attributes are higher-ranked and half are lower-ranked, is an effective
substitute for the average for ordinal data as it gives a useful central value.
Interval/ratio
Attributes are interval if the differences between values make sense. Interval/ratio attributes are used for
numeric items where both order and absolute difference in magnitudes are reflected in the numbers. The
scale of Celsius temperature is interval, because it makes sense to say that 30 and 20 are as different as 20
and 10. Attributes are ratio if the ratios between values make sense. Weight is ratio, because it makes
sense to say that a person of 100 kg is twice as heavy as a person of 50 kg; but Celsius temperature is only
interval, because 20 is not twice as hot as 10 (and this argument applies to all scales that are based on
similarly arbitrary zero points, including longitude).
These data are often recorded as real numbers, most often on a linear scale. Area, length, weight, value,
height, and depth are a few examples of attributes which are represented by interval/ratio variables.
Interval: Geographic features that have detailed increments (intervals) that you can measure. One
limiting characteristic of interval data is that, although you can get very accurate measurements, you can’t
form ratios because the starting point is arbitrary. For example, if the soil in land parcel A is 15 degrees
centigrade and the soil in parcel B is 30 degrees, you can say that soil B is 15 degrees warmer than soil A,
but you can’t say that it’s twice as warm because 0 degrees centigrade is an arbitrary starting place and the
temperature values can thus be negative.
Ratio: Geographic data that have measurable units, like interval data, but also allow you to make the ratio
comparisons that interval data won’t.
The computer represents even nominal data (names) with numbers in the GIS database. Try to avoid
using mathematical techniques that force you to multiply the numbers that represent nominal categories
by ordinal, interval, or ratio numbers. Attempting to multiply a nominal category (urban) by a ratio
category (meter) often yields downright silly results (in this case, urban meter).
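The differences between ordinal, interval, and ratio attributes can be checked with a few lines of arithmetic. The sketch below is only an illustration: the erosion classes are made-up values, and the soil temperatures and body weights reuse the numbers from the examples above. The conversion to kelvin (adding 273.15) is introduced here to show what a scale with a true zero does to the temperature ratio.

```python
from statistics import median

# Ordinal: hypothetical erosion classes (1 = least severe, 10 = most severe).
erosion_class = [2, 4, 4, 7, 9]
central_class = median(erosion_class)  # the median is a valid central value for ranks

# Interval: differences in Celsius are meaningful, ratios are not.
soil_a_c, soil_b_c = 15.0, 30.0
difference = soil_b_c - soil_a_c   # soil B is 15 degrees warmer
naive_ratio = soil_b_c / soil_a_c  # 2.0, but "twice as warm" is not a valid claim

# On the kelvin scale, which has a true zero, the ratio is far from 2.
kelvin_ratio = (soil_b_c + 273.15) / (soil_a_c + 273.15)

# Ratio: weight has a true zero, so "twice as heavy" is a valid statement.
weight_ratio = 100.0 / 50.0
```

The Celsius ratio of 2.0 disappears once the arbitrary zero point is removed, which is why interval data support differences but not ratios.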
Topological Relationships
Topological features are essentially simple features structured using topological rules. Topology is the
mathematics and science of geometrical relationships. Topological relationships are non-metric
(qualitative) properties of geographic objects that remain constant when the geographic space of objects is
distorted. For example, when a map is stretched, properties such as distance and angle change, whereas
topological properties such as adjacency and containment do not.
Topology can be defined as the organization of spatial relationships between features in a GIS. In
layman's terms, topology is the way a GIS “knows”:
1) Where a feature is in relation to other features,
2) What parts of different features are shared (points, lines, nodes), and
3) How features share connectivity (gives us ability to move between features in network applications).
The topological data structure logically determines exactly how and where points and lines
connect on a map by means of nodes (topological junctions). The order of connectivity defines
the shape of an arc or polygon. The computer stores this information in various tables of the
database structure. By storing information in a logical and ordered relationship, missing
information, e.g., a line segment of a polygon, is readily apparent. A GIS manipulates, analyzes,
and uses topological data in determining data relationships.
The database tells us that the line is the “left side” of one polygon and the “right side” of the adjacent
polygon.
The three aspects of topology that are important in representing spatial relationships are:
1) adjacency- shared boundary
2) connectivity- shared node in arc-node topology
3) containment- accounts for polygons within polygons “islands”
The software in our GIS creates a database that keeps track of the relationships as lists of shared features.
A simple map may be composed of land cover polygons. The polygons are composed of “chains”
(we'll call them arcs to be consistent with ARC/INFO). Some of the arcs are shared by polygons, some are
not. The database structure is designed to keep a list of all arcs and how they relate to the formation of
each polygon.
Network analysis uses topological modeling for determining shortest paths and alternate routes.
For example, a GIS for emergency service dispatch may use topological models to quickly
ascertain optional routes for emergency vehicles. Automobile commuters perform a similar
mental task by altering their route to avoid accidents and traffic congestion. Likewise an
electrical utility GIS could rapidly determine different circuit paths to route electricity when
service is interrupted by equipment damage. Similarly, political redistricting planners could use
certain algorithms to determine logical relationships between population groups and areas for
district boundaries.
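The shortest-path and alternate-route idea can be sketched with Dijkstra's algorithm, one common way of searching a network (the notes do not say which algorithm a particular GIS uses). The street network, junction names, and travel times below are hypothetical.

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm on a dict of the form {node: {neighbor: cost}}."""
    queue = [(0.0, start, [start])]  # (cost so far, current node, path taken)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None  # the goal cannot be reached

# Hypothetical street network: junctions A to D, travel times in minutes.
streets = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}
best = shortest_path(streets, "A", "D")

# An accident closes the street between B and D: search again without it.
blocked = {
    node: {nbr: w for nbr, w in nbrs.items() if {node, nbr} != {"B", "D"}}
    for node, nbrs in streets.items()
}
detour = shortest_path(blocked, "A", "D")
```

Removing one connection from the network and searching again is the same mental task the commuter performs when avoiding an accident.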
To see how topology is represented or modeled, it is useful to consider an example of how
connections are coded into a database. This involves recording more than just the absolute
location of points, lines, and regions. The first step is to record the location of all "nodes," that
is, endpoints and intersections of lines and boundaries.
Based upon these nodes, "arcs" are defined. These arcs have endpoints, but they are also assigned a
direction indicated by the arrowheads. The starting point of the vector is referred to as the "from node"
and the destination the "to node." The orientation of a given vector can be assigned in either direction,
as long as this direction is recorded and stored in the database.
By keeping track of the orientation of arcs, it is possible to use this information to establish
routes from node to node or place to place. Thus, if one wants to move from node 3 to node 1,
we can locate the necessary connections in the database.
Now, "polygons" are defined by arcs. To define a given polygon, trace around its area in a
clockwise direction recording the component arcs and their orientations. If an arc has to be
followed in its reverse orientation to make the tracing, it is assigned a negative sign in the
database.
Finally, for each arc, one records which polygon lies to the left and right side of its direction of
orientation. If an arc is on the edge of the study area, it is bounded by the "universe."
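The tables described above can be sketched in Python as follows. The example is hypothetical: a rectangular study area is split into two polygons, A (west) and B (east), by a shared arc running from node 1 (top) to node 2 (bottom). The arc numbers and the helper function names are invented for this illustration; the sketch relies on the geometric fact that a polygon traced clockwise always lies to the right of the direction of travel.

```python
# Arc table: arc id -> (from node, to node, left polygon, right polygon).
# "universe" is the area outside the study region.
arcs = {
    1: (1, 2, "A", "universe"),  # outer boundary of A, running west around it
    2: (1, 2, "universe", "B"),  # outer boundary of B, running east around it
    3: (1, 2, "B", "A"),         # shared boundary, running north to south
}

# Polygon table: arcs traced clockwise; a negative id means the arc is
# followed against its stored orientation.
polygons = {"A": [3, -1], "B": [2, -3]}

def is_closed(arc_list):
    """Each arc must start where the previous one ended, and the ring must close."""
    ends = []
    for signed in arc_list:
        from_node, to_node, _, _ = arcs[abs(signed)]
        ends.append((from_node, to_node) if signed > 0 else (to_node, from_node))
    return all(ends[i][1] == ends[(i + 1) % len(ends)][0] for i in range(len(ends)))

def lies_on_right(polygon):
    """A clockwise trace keeps the polygon on the right of the travel direction."""
    for signed in polygons[polygon]:
        _, _, left, right = arcs[abs(signed)]
        if (right if signed > 0 else left) != polygon:
            return False
    return True

def neighbors(polygon):
    """Adjacency: whatever lies on the other side of each arc bounding the polygon."""
    found = set()
    for _, _, left, right in arcs.values():
        if left == polygon:
            found.add(right)
        elif right == polygon:
            found.add(left)
    return found
```

Because adjacency is read directly from the left and right polygon columns, no coordinate geometry is needed to find out that A and B are neighbors.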
the spatial or attribute information by requests based on the location or characteristics of the data features
either singly or as related to other features.
Types of Database System
A database is a comprehensive collection of related data stored in logical files and collectively processed,
usually in tabular form. Database Management Systems (DBMS) have been developed to manipulate data
in a database, i.e. to import, store, sort, and retrieve them. Today, most systems use a relational database
structure, but other systems exist and may be well suited to particular types of data. There are three basic
types with which we should be familiar in GIS: the hierarchical data structure, network systems, and the
relational database structure. Apart from these, flat-file databases and object-oriented systems have also
been used for GIS. The storage structures of database systems are: a) hierarchical database system,
b) network database system, and c) relational database.
Table 4.1 Database systems and characteristics
Type: File-system-based
Description: Use files and directories to organize information. Examples: Gopher information servers
(not typically considered as a DBMS)
Characteristics: Simple – can use generalized software (word processors, file managers). Inefficient – as
the number of files increases within a directory, search speed decreases. Few capacities – no sorting or
query capacities aside from sorting file names
Type: Hierarchical
Description: Store data in hierarchical system. Examples: IBM IMS database software, levels of
administration (country, province, district), satellite images in Hierarchical Data Format (HDF)
Characteristics: Efficient storage for data that have a clear hierarchy. Tools that store data in
hierarchically organized files are commonly used for image data. Relatively rigid, requires a detailed
planning process
Type: Network
Description: Store data in interconnected units with few constraints on the type and number
Characteristics: Fewer constraints than a hierarchical
The database software that we are using in lab has a relational database structure. This means that we
cross-reference feature attributes to their spatial definitions based on some common attribute stored in
the data table for the attributes and graphics. We can select one or more graphic features by use of a
query of some characteristic of interest in the feature attribute table of the database. And, since the
reference from graphics to attributes works both ways, we can select one or more spatial graphic
features on our computer screen and have the software give us the associated attributes. The relational
qualities of the database go even further than these simple examples. We frequently attach other external
tables to our original data sets and relate them to the existing data by common attributes.
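A minimal sketch of this two-way referencing, using Python's built-in sqlite3 module rather than the lab software: the table names, field names, and records are invented for the example. The parcels table stands in for a feature attribute table, and the soils table for an external table related to it through the common attribute soil_code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, land_use TEXT, soil_code TEXT);
    CREATE TABLE soils (soil_code TEXT PRIMARY KEY, soil_name TEXT, drainage TEXT);
    INSERT INTO parcels VALUES (1, 'urban', 'S1'), (2, 'agriculture', 'S2'),
                               (3, 'agriculture', 'S1');
    INSERT INTO soils VALUES ('S1', 'Vertisol', 'poor'), ('S2', 'Nitisol', 'good');
""")

# Attributes to graphics: which features satisfy a characteristic of interest?
agriculture_ids = [row[0] for row in conn.execute(
    "SELECT parcel_id FROM parcels WHERE land_use = 'agriculture' ORDER BY parcel_id")]

# Graphics to attributes: the user picked parcel 3 on screen; fetch its record
# together with the related row of the external soils table.
picked = conn.execute(
    "SELECT p.land_use, s.soil_name, s.drainage "
    "FROM parcels p JOIN soils s ON p.soil_code = s.soil_code "
    "WHERE p.parcel_id = ?", (3,)).fetchone()
```

In a real GIS the list of selected ids would be used to highlight the matching features on the map.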
-- Retrieve the ordered boundary points of France by joining the four tables
select Boundary.id_contour, x, y
from Country, Boundary, Contour, Point
where name = 'France'
and Country.id_boundary = Boundary.id_boundary
and Boundary.id_contour = Contour.id_contour
and Contour.id_point = Point.id_point
order by Boundary.id_contour, point_num
The relational database has several advantages for geographic and cartographic representation. First, the
conceptual model of the database is distinct from the physical model (how the database is stored and
managed on computer hardware). Second, separate tables help maintain the integrity of the potential
meaning of database elements. Most relational databases now use structured query language (SQL) for
constructing queries involving tables of a single database, or tables in other databases, even on other
computers. Third, the clarity of the relationships helps people who have previous experience of the
database to use it. Reliable processing is critical for queries of geographic information and online maps.
Fourth, it is possible to define multiple views of the same data in different database tables (e.g., listing
entries by street address or alphabetically by name). Most geographic information and maps only scratch
the surface of what databases can be used for, but the two most common uses of databases for geographic
information and maps are as follows:
• Databases store measurements and observations of things and events.
• Databases store the symbols, values, and other graphic elements that help maps communicate.
Uses of DBMS
Reduce time wastage
Data independence and efficient access
Data integrity and security
Uniform administration
example, to support individual datasets containing well over 300 million features and datasets that can
scale beyond 500 GB per file with very fast performance.
Personal Geodatabase
Personal geodatabases have been used in ArcGIS since their initial release in Version 8.0 and
have used the Microsoft Access data file structure (the .mdb file) and Jet Engine. They support
geodatabases that are limited in size to 2 GB or less. However, the effective database size is
smaller, somewhere between 250 and 500 MB before the database performance starts to slow
down.
The major datasets that can be stored within the geodatabase are the following:
Feature datasets: Feature datasets exist in the geodatabase to define a scope for a
particular spatial reference. All feature classes that participate in topological relationships
with one another, for example, a geometric network or a topology, must have the same
spatial reference.
Topologies: Many vector datasets have features that could share boundaries or corners. If
you create a topology in the dataset, you can set up rules defining how features share
their geometry.
Geometric networks: Some vector datasets, particularly those used to model
communications, material or energy flow, or transportation networks, need to support
connectivity tracing and network connectivity rules.
Relationship classes: Relationship classes define relationships between objects in the
geodatabase. These relationships can be simple one-to-one relationships, such as you
might create between a feature and a row in a table, or more complex one-to-many (or
many-to-many) relationships between features and table rows.
Object classes: An object class is a table in a geodatabase with which you can associate
behavior. Object classes keep descriptive information about objects that are related to
geographic features, but are not features on a map.
a simple check on the labeling of features—for example, is a road classified as a metalled road actually
surfaced or not?—
For digital data sets, defined as: “that part of the data quality statement that contains information that
describes the source of observations or materials, data acquisition and compilation methods, conversions,
transformations, analyses and derivations that the data has been subjected to, and the assumptions and
criteria applied at any stage of its life.”
Possibly the most important component of a GIS is the data. Geographic data and related tabular
data can be collected in-house or purchased from a commercial data provider. A GIS will
integrate spatial data with other data resources and can even use a DBMS, used by most
organizations to organize and maintain their data, to manage spatial data. We will get these data
from primary and secondary sources.
Primary data sources: Spatial data obtained from scratch, using direct spatial data
acquisition techniques, are called primary data. These data are collected in digital format
specifically for use in a GIS project. Typical examples of primary GIS sources include raster
SPOT, IKONOS, and Landsat Earth satellite images, and vector building-survey measurements
captured using a total survey station.
Secondary sources are digital and analog datasets that were originally captured for another
purpose and need to be converted into a suitable digital format for use in a GIS project. They are
obtained indirectly, by making use of spatial data collected earlier, possibly by others.
The above data can be either hardcopy or digital in nature. A GIS will integrate spatial data with
other data resources and can even use a Database Management System (DBMS), used by most
organizations to maintain their data, to manage spatial data. The
diagram below shows the sources and process of data capture in GIS.
Attribute data about a given geographic feature can be collected from various sources. The types of
attribute could be, e.g., text, numbers (tables), audio, video, and photographs in either digital or
analogue form. Analogue data have to be converted into digital form and imported into
the GIS using the attribute importing techniques of GIS software or through direct keyboard
entry. Digital data, in turn, have to be formatted in order to be compatible with, and usable
in, GIS software. The main sources of attribute data include:
The sources for geo-spatial data are probably more numerous and of greater variety than for most
other kinds of information. At the outset, the data, such as those identified as core datasets mentioned
above, can be imported or input to GIS from various sources, which may be:
A. Terrestrial Surveys: Large-scale data are acquired through terrestrial surveys. Survey
measurements and tools expressed in coordinates or other units are used for the
collections. With the increasing use of modern equipment, these surveys lead to digital
files that can be directly imported into GIS. Some of the tools and method for collecting
data through survey are as follows:
Satellite Data: Earth Resources Satellites have become a source of a huge amount of
data for GIS applications. Satellites carry scanners with sensors sensitive to the
radiation emitted or reflected by the earth’s surface. The sensors measure radiation
sequentially from patches or grid cells that are later put in their proper spatial relationship,
simulating a map. Data accuracy depends upon the resolution of the grid cells and on the
correction techniques applied to both the radiation values and the geometry.
GPS Data: The development of a global positioning system (GPS) has made high-
accuracy spatial data easier to obtain in little time. A GPS provides unequalled accuracy
and flexibility of positioning for navigation, surveying and GIS data capture. The data
from GPS are used for increasing the accuracy of existing georeferencing methods or for
point and linear surveys. The data are recorded on the basis of a global reference system
that can be transformed to local reference systems.
B. Digital Data from elsewhere: GIS data may be obtained in digital map form from organizations,
individuals, the internet, and so on. Nowadays digital data are produced on a large
scale by different vendors and distributed directly or through the network. These data
can be loaded directly into our GIS database from external hard drives or via the
network. For example, the following web links provide different kinds of data, some free
and some for a fee.
Internet data sources:
http://www.pecad.fas.usda.gov/cropexplorer/global_reservoir/index.cfm
http://www.arcgis.com/home/
http://earlywarning.usgs.gov/fews/africa/index.php
http://www.maplibrary.org/stacks/Africa/Ethiopia/index.php/
http://faostat.fao.org/site/377/default.aspx#ancor/
http://www.cru.uea.ac.uk/cru/data/precip/
http://srtm.csi.cgiar.org/SELECTION/inputCoord.asp/
http://www.census.gov/ipc/www/idb/informationGateway.php/
C. Data Captured from Existing Maps: Maps have been used since the earliest times to
portray information about the earth’s surface. Maps provide spatial and non-spatial
information of geographic phenomena. The features of the earth are represented as points,
lines or areas. Attention needs to be given to map properties such as scale, resolution,
accuracy, precision and time of map production when using them as a data source for
GIS. The methods for encoding such analog data include:
Map scanning: Photos, maps, or typed documents can be scanned and then saved
in a format that is readable in the GIS software. Most common are the tabletop
flatbed scanners, but there are also drum scanners, which are very useful for very
Manual digitizing involves placing a map on a digitizing surface or displaying a map on screen
and tracing the location of feature boundaries. Coordinate data are sampled by manually
positioning the puck or cursor over each target point and collecting coordinate locations. This
step is repeated for every point to be captured and in this manner the locations and shapes of all
required map features defined. Features that are viewed as points are represented by digitizing a
single location. Lines are represented by digitizing an ordered set of points and polygons by
digitizing an ordered set of lines. Lines have a starting point often called a starting node, a set of
vertices defining the shape and an ending node. Hence lines may be viewed as a series of straight
line segments connecting vertices and nodes.
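The idea of a line as an ordered set of points can be sketched as follows. The coordinates are hypothetical map units, and the polygon boundary is simplified here to a single closed ring of points (first point repeated at the end) rather than an ordered set of separate lines.

```python
import math

# A digitized line: starting node, intermediate vertices, ending node.
road = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0), (6.0, 8.0)]
start_node, end_node = road[0], road[-1]
vertices = road[1:-1]

def length(points):
    """Sum of the straight segments that connect consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def is_closed_ring(points):
    """A polygon boundary must return to the point where it started."""
    return len(points) >= 4 and points[0] == points[-1]

# A digitized polygon boundary (simplified to one ring of points).
field = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]
```

Because the order of the points is stored, the same list yields both the shape of the feature and measurements such as its length.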
Common Errors in Digitizing: While digitizing data the following errors may be encountered
and thus must be corrected before using the data for analysis.
Slivers – boundaries of adjacent polygons overlap
Gaps – boundaries of polygons that supposedly share a common border don’t touch due
to “double digitizing”
Attribute errors – attribute data entered incorrectly
Overshoot – digitized line extends too far
Preprocessing - In preprocessing, the assembled spatial data are converted to forms that can be ingested
by the GIS to produce data layers of spatial objects and their associated information.
Non-geometric attributes
Relationships (topology)
Format transformation
► Geometric transformation
► spatial interpolation
► For Raster:
► Format transformation
- Spatial data files must be transformed into the data structures and file formats used internally by
a GIS software package
► Converting the data to the same format makes it possible to apply the same rules for validation,
editing, and relationships
before using the data for our analysis purpose we have to check whether these errors are found in
our data or not, and repair, if errors are found.
A GIS project usually involves multiple data sets, so these data sets must be related to each
other.
There are three fundamental cases to be considered if we compare data sets pairwise:
they may be about the same area, but differ in choice of representation (due to scale
difference), and
they may be about adjacent areas, and have to be merged into a single data set.
D. Geometric transformations,
If much or all of the subsequent spatial data analysis is to be carried out on raster data, one may
want to convert vector data sets to raster data. This process is known as rasterization.
It involves assigning point, line and polygon attribute values to raster cells that overlap with the
respective point, line or polygon.
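A minimal rasterization sketch for a single polygon is given below. It uses a simplified rule that is an assumption of this example, not necessarily the rule used by any particular GIS package: a cell receives the polygon's attribute value when the cell center falls inside the polygon (tested by ray casting). The polygon, the value 7, and the 4 x 4 grid are invented.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize(ring, value, n_rows, n_cols, cell_size, nodata=0):
    """Assign value to every cell whose center lies inside the polygon."""
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size  # row 0 is the top row of the grid
        grid.append([
            value if point_in_polygon((col + 0.5) * cell_size, y, ring) else nodata
            for col in range(n_cols)
        ])
    return grid

# Hypothetical forest stand with attribute value 7 on a 4 x 4 grid of unit cells.
forest = [(1, 1), (3, 1), (3, 3), (1, 3)]
grid = rasterize(forest, 7, n_rows=4, n_cols=4, cell_size=1.0)
```

A smaller cell size follows the polygon outline more closely, at the cost of a larger grid.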
There is an inverse operation, called vectorization, that produces a vector data set from a raster.
We have looked at this in some sense already, namely in the production of a vector set from a
scanned image (digitization).
Another form of vectorization takes place when we want to identify features or patterns in
remotely sensed imagery.
The keywords here are feature extraction and pattern recognition, but these subjects are dealt
with in Principles of Remote Sensing.
Interpolation
Spatial interpolation is a process of using points with known values to estimate values at other
points.
If mapping precipitation, for example, and there is no weather reporting station within the grid
cell, an estimate is based on nearby weather stations.
The process of transforming point based data into a full coverage grid map is called interpolation.
Basic assumption in spatial interpolation is that the value to be estimated at a point is more
influenced by nearby control points than those that are farther away.
It can be used to predict unknown values for any geographic point data, such as elevation,
rainfall, chemical concentrations, and noise levels.
Global methods use every available control point to estimate the unknown value, e.g. IDW
applied to all control points (IDW is more often run as a local method on a sample of nearby points).
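Inverse distance weighting (IDW) follows directly from the assumption that nearby control points matter more than distant ones: each known value is weighted by the inverse of its distance to the estimation point, raised to a power. The sketch below uses every control point and a power of 2, which is a common default but a choice made for this example; the station coordinates and rainfall values are hypothetical.

```python
import math

def idw(x, y, control_points, power=2):
    """Inverse distance weighted estimate at (x, y).

    control_points is a list of (xi, yi, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in control_points:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0:
            return value  # the point coincides with a control point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical weather stations: (x, y, rainfall).
stations = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 30.0)]

# Turning the points into a full-coverage grid: estimate at every cell center.
surface = [[idw(col + 0.5, row + 0.5, stations) for col in range(10)]
           for row in range(10)]
```

Every estimate lies between the smallest and largest known value, which is a general property of IDW because it is a weighted average.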
A spatial analysis may require generation of new data from the original set. For example,
assume that you wish to find a relationship between noise level and proximity to freeways within
one mile of the freeway, for all parcels zoned residential. To answer the question, you must
combine, or overlay, the parcel map and the freeway map. Then derive the areas within one mile
of the freeway. This process may require the delineation of new geographic entities and the
generation of a new data table showing combinations of several factors. There are several vector
based operations and analyses in GIS. Some of them are discussed below.
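The freeway example can be sketched as a combined attribute and distance test. This is a simplification made for illustration: each parcel is reduced to a centroid point, the freeway is a single line, and distances are in miles. A full GIS would buffer the freeway and overlay the buffer on the parcel polygons. All coordinates and records are hypothetical.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.dist(p, a)
    # Position of the closest point along the segment, clamped to its ends.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def distance_to_line(p, line):
    """Shortest distance from point p to a line made of several segments."""
    return min(distance_to_segment(p, a, b) for a, b in zip(line, line[1:]))

freeway = [(0.0, 0.0), (10.0, 0.0)]
parcels = [
    {"id": 1, "zone": "residential", "centroid": (2.0, 0.5)},
    {"id": 2, "zone": "residential", "centroid": (5.0, 3.0)},
    {"id": 3, "zone": "commercial", "centroid": (7.0, 0.2)},
    {"id": 4, "zone": "residential", "centroid": (10.5, 0.0)},
]

# Residential parcels whose centroid lies within one mile of the freeway.
selected = [p["id"] for p in parcels
            if p["zone"] == "residential"
            and distance_to_line(p["centroid"], freeway) <= 1.0]
```

Parcel 3 is close enough but has the wrong zoning, and parcel 2 has the right zoning but is too far away, so only parcels 1 and 4 remain.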
A. Querying data
The scope of spatial analysis ranges from a simple query about the spatial phenomenon to
complicated combinations of attribute queries, spatial queries, and alterations of original data.
Spatial queries require the extraction and processing of spatial information from the database. There
are two major types of queries in GIS: attribute and spatial.
Attribute Queries
Attribute queries require the processing of attribute data exclusive of spatial information. For
example: identifying commercial land use parcels in order to compute the average value of this
land use type. The selection is based only on an attribute item; therefore, no spatial information
is required.
Attribute queries in ArcGIS use SQL (Structured Query Language), a language for querying
databases, to define your selection. The SQL query window helps you to express your request
correctly. In the top box, select the attribute you would like to select from. Then specify which
condition this attribute should meet. To do this you need to select the right comparison operator
(=, >, <) and the condition value. You can click to get unique values and choose from the values
that exist in the database. This is particularly handy when you want to select a string (a name),
as it avoids typos and automatically retrieves the correct formatting for strings.
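A Boolean expression is an expression that evaluates to one of two values: True or False. The most
common operators are:
= equal to
< less than
<= less than or equal to
> greater than
>= greater than or equal to
<> not equal to (greater than or less than)
LIKE matches strings as = does, but accepts wildcards in place of unknown letters
AND is true when all of the conditions are true (the feature is part of all the sets)
OR is true when at least one of the conditions is true (the feature is part of at least one set)
NOT is true when the condition is false (the feature is not part of the set)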
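The operators above can be tried out on a small table. The sketch below uses Python's built-in sqlite3 module as a stand-in for the ArcGIS query window, so details such as the wildcard character for LIKE (% here) may differ in other databases. The table, field names, and records are hypothetical, and the helper function ids exists only in this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER, name TEXT, land_use TEXT, value REAL)")
conn.executemany("INSERT INTO parcels VALUES (?, ?, ?, ?)", [
    (1, "Addis Market", "commercial", 500.0),
    (2, "Adama Depot", "commercial", 300.0),
    (3, "Hawassa Farm", "agriculture", 120.0),
    (4, "Addis Homes", "residential", 250.0),
])

def ids(where):
    """Return the ids of the rows that satisfy a WHERE clause (helper for this sketch)."""
    query = f"SELECT id FROM parcels WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(query)]

commercial = ids("land_use = 'commercial'")
not_commercial = ids("land_use <> 'commercial'")
high_value = ids("value >= 250")
addis = ids("name LIKE 'Addis%'")  # % stands for any run of characters
both = ids("land_use = 'commercial' AND value > 400")
either = ids("land_use = 'agriculture' OR value > 400")
negated = ids("NOT land_use = 'commercial'")

# The example from the text: average value of commercial land use parcels.
average_commercial = conn.execute(
    "SELECT AVG(value) FROM parcels WHERE land_use = 'commercial'").fetchone()[0]
```

Note that AND narrows a selection while OR widens it, which matches the "part of all the sets" and "part of at least one set" descriptions.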
Spatial Queries
This procedure retrieves data from a map by working with map features. Features can be
selected with a cursor or graphic. This type of query aims to select features based on their
location relative to the locations of other features. You would perform this query if, for example,
you wanted to identify all of the fields along a particular road. To perform this type of query you
need at least two layers: a target layer, the layer in which features will be selected, and a source
layer, the layer that is used to determine the selection based on its topological relationship to the
target. In addition to selecting features, this tool also allows you to add features to, or remove
features from, your map.
There are a variety of selection methods available to select the point, line, or polygon features in
one layer that are near to, or which overlap features in the same or in another layer. These are
listed below:
Are within a distance of: This method selects features near or adjacent to features in the
same layer or in a different layer.
Are completely within: This method selects features in one layer that fall completely inside
the polygons of another layer.
Have their center in: This method selects the features in one layer that have their center
in the features of another layer.
Completely contain: You can select polygons in one layer that completely contain the
features in another layer.
Share a line segment with: This method selects features that share line segments,
vertices, or nodes with other features.
Are identical to: This method selects any feature having the same geometry as a feature
of another layer. The feature types must be the same—for example, you use polygons to
select polygons, lines to select lines, and points to select points.
Contain: This method selects features in one layer that contain the features of another.
This method differs from the Completely contain method in that the boundaries of the
features can touch.
Are contained by: This method selects features in one layer that are contained by the
features in another.
Touch the boundary of: If you are selecting features using a layer containing lines, this
method selects lines and polygons that share line segments, vertices, or endpoints (nodes)
with the lines in the layer.
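To make two of these selection methods concrete, here is a rough Python sketch, not the ArcGIS implementation. It assumes a standard ray-casting point-in-polygon test, approximates a feature's center by the mean of its vertices, and uses hypothetical layers (`district` as the source layer, `fields` as the target layer). The "completely within" check tests only the vertices, which is sufficient here because the source polygon is convex.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon.

    polygon is a list of (x, y) vertices; the ring is closed implicitly.
    Points lying exactly on the boundary are not treated specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def vertex_mean(polygon):
    """Approximate a feature's center as the mean of its vertices (assumption)."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Source layer: one hypothetical (convex) district polygon.
district = [(0, 0), (10, 0), (10, 10), (0, 10)]

# Target layer: hypothetical field polygons.
fields = {
    "field_a": [(1, 1), (3, 1), (3, 3), (1, 3)],      # entirely inside
    "field_b": [(8, 8), (14, 8), (14, 12), (8, 12)],  # center (11, 10) is outside
    "field_c": [(7, 4), (11, 4), (11, 6), (7, 6)],    # center (9, 5) is inside
}

# "Have their center in"
center_in = [name for name, poly in fields.items()
             if point_in_polygon(vertex_mean(poly), district)]

# "Are completely within" (vertex test, valid for a convex source polygon)
completely_within = [name for name, poly in fields.items()
                     if all(point_in_polygon(v, district) for v in poly)]

print(center_in)
print(completely_within)
```

The two methods give different selections for `field_c`: its center falls inside the district, but part of the field extends beyond the district boundary.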
ArcGIS allows you to create and edit several kinds of data. You can edit feature data stored in
shapefiles and geodatabases, as well as various tabular formats. You can also edit shared edges
and coincident geometry using topologies and geometric networks.
Merging: Merge combines selected features of the same layer into one feature.
Deleting feature
You can select a feature and delete it to remove it from the map and database.
Splitting Feature
You can use the Cut Polygons tool on the Editor toolbar to perform this task. You can cut
multiple polygons this way, but the cut line is based on a sketch you draw manually.
Reshape Feature
The Reshape Feature tool lets you reshape a polygon by constructing a sketch over a selected
feature. The feature takes the shape of the sketch from the first place the sketch intersects the
feature to the last.
Updating Features
Digitizing is the process of converting features on a paper map into digital format. You might
want to digitize features into a new layer and add the layer to an existing map document or create
a completely new set of layers for an area for which no digital data is available.
In addition to data sources such as a shapefile, you can also add tabular data that contains
geographic locations in the form of x,y coordinates to your map. The x,y coordinates describe
discrete locations on the earth's surface, such as the location of sample points in a city or the
points where you collected samples. To add a table of x,y coordinates to your map, the
table must contain two fields, one for the x-coordinate and one for the y-coordinate. You can
easily collect x,y coordinate data using a GPS device.
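The idea of turning a table of x,y coordinates into point features can be sketched in a few lines of Python. This is only an illustration of the principle, not what ArcGIS does internally. The table content, the field names (`site_id`, `x`, `y`, `note`) and the coordinate values are made up.

```python
import csv
import io

# Hypothetical table, e.g. exported from a GPS device (values are invented).
table_text = """site_id,x,y,note
S1,38.47,7.05,well
S2,38.52,7.01,school
S3,,6.98,missing x
"""


def table_to_points(text, x_field="x", y_field="y"):
    """Turn rows with valid x,y values into simple point features."""
    points, skipped = [], []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            x, y = float(row[x_field]), float(row[y_field])
        except ValueError:
            skipped.append(row)  # a coordinate is empty or not numeric
            continue
        attributes = {k: v for k, v in row.items() if k not in (x_field, y_field)}
        points.append({"geometry": (x, y), "attributes": attributes})
    return points, skipped


points, skipped = table_to_points(table_text)
print(len(points), "points created;", len(skipped), "row(s) skipped")
```

Rows without a usable coordinate pair cannot be placed on the map, so the sketch sets them aside instead of guessing a location.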
B. Overlay analysis
Spatial analysis is one of the most important functions of a GIS and includes many methods for
the analysis. One of the basic analysis methods is overlay analysis. During vector overlay, map
features and associated attributes are integrated to produce new
composite maps. Logical rules can be applied to define how the
maps are combined. Vector overlay can be performed on
produce a new polygon g3 with the attributes being kept. Clip is an option that removes a
selected part of one theme using another theme, selected features, or a graphic. In effect,
it is an overlay operation that uses one part of a theme to select part of another by
extraction (cutting and removal).
Using overlay analysis, you can generate new kinds of map, for example a topography-soil map
from a topography map and a soil map, or compute soil distribution statistics by
administrative area, as shown in Figure 4.1.
Figure 4.1. Using overlay process to create new map and conduct new statistics
Mask and Replace: Mask is a type of clip operation in which a designated section or set
of features from one theme is used as a “window” for selecting parts of a second theme.
Replace (also called cover in some GISs) is in some ways another type of clip, in that it
transfers selected features from one theme to another, covering those in the second
theme. Replace is ideal for updating features spatially without having to go through
elaborate recoding and overlay operations. Replace is essentially a convenient
selected-feature overlay option.
C. Proximity analysis
In proximity computation we use geometric distance to define the neighborhood of one or more
target locations. The most common and useful technique is buffer zone generation. Another
technique based on geometric distance that will also be discussed is Thiessen polygon
generation.
The principle of buffer zone generation is simple: we select one or more target locations and then
determine the area around them within a certain distance. In Figure (a) a number of main and
minor roads were selected as targets and 75m and 25m (respectively) buffers were computed
from them. Buffers have many uses, mostly dealing with distance from selected features. For
example, questions such as what are the effects on urban areas if the road is extended by 100 m
or what are the effects of a 5 km buffer zone around a national park to prevent grazing can be
answered.
In some case studies, zoned buffers must be determined, for instance in assessment of the effects
of traffic noise. Most GISs support this type of zoned buffer computation. An illustration is
provided in Figure 4.3.
In vector-based buffer generation, the buffers themselves become polygon features, usually in a
separate data layer, that can be used in further spatial analysis. Buffer generation on rasters is
a fairly simple function. The target location or locations are always represented by a selection of
the raster’s cells, and geometric distance is defined using cell resolution as the unit. The distance
function applied is the Pythagorean distance between the cell centers. The distance from a
non-target cell to the target is the minimal distance one can find between that non-target cell and
any target cell.
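The raster buffer rule described above can be sketched as follows. This is a brute-force illustration under stated assumptions: cells are addressed as (row, column), the cell size is the same in both directions, and the grid size, target cells and buffer radius are invented for the example.

```python
import math


def distance_raster(targets, n_rows, n_cols, cell_size):
    """Distance from every cell center to the nearest target cell center."""
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Minimal Pythagorean distance to any target cell
            row.append(min(
                math.hypot((r - tr) * cell_size, (c - tc) * cell_size)
                for tr, tc in targets
            ))
        result.append(row)
    return result


def buffer_raster(targets, n_rows, n_cols, cell_size, radius):
    """1 where a cell lies within `radius` of a target cell, otherwise 0."""
    dist = distance_raster(targets, n_rows, n_cols, cell_size)
    return [[1 if d <= radius else 0 for d in row] for row in dist]


# Hypothetical 4 x 4 raster with 10 m cells and two target cells.
targets = [(0, 0), (3, 3)]
buf = buffer_raster(targets, 4, 4, cell_size=10.0, radius=15.0)
for row in buf:
    print(row)
```

With a 15 m radius, the diagonal neighbours of each target (about 14.1 m away) fall inside the buffer, while cells two steps away (20 m) do not.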
D. Network analysis
A completely different set of analytic functions in GIS consists of computations on networks. A
network is a connected set of lines, representing some geographic phenomenon, typically of the
transportation type. The ‘goods’ transported can be almost anything: people, cars and other
vehicles along a road network, commercial goods along a logistic network, phone calls along a
telephone network, or water pollution along a stream/river network.
Network analysis can be done using either raster or vector data layers, but it is more
commonly done in the latter, as line features can be associated with a network naturally, and can
be given typical transportation characteristics like capacity and cost per unit. One crucial
characteristic of any network is whether the network lines are considered directed or not.
Directed networks associate with each line a direction of transportation; undirected networks do
not. In the latter, the ‘goods’ can be transported along a line in both directions. We discuss here
vector network analysis, and assume that the network is a set of connected line features that
intersect only at the lines’ nodes, not at internal vertices.
For many applications of network analysis, a planar network, i.e., one that is embeddable in a
two-dimensional plane, will do the job. Many networks are naturally planar, like stream/river
networks. A large-scale traffic network, on the other hand, is not planar: motorways have
multi-level crossings and are constructed with underpasses and overpasses. Planar networks are
easier to deal with computationally, as they have simpler topological rules. Not all GISs
accommodate non-planar networks, or they can do so only using trickery. Such trickery may involve
splitting overpassing lines at the intersection vertex; the network will then allow a turn onto
another line at this new intersection node, which in reality would be impossible. The above is a
good illustration of geometry not fully determining the network’s behavior. Additional
application-specific rules are usually required to define what can and cannot happen in the
network. Most GISs provide rule-based tools that allow the definition of these extra application
rules. Various classical spatial analysis functions on networks are supported by GIS software packages.
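One classical network function is finding the least-cost path between two nodes. The sketch below uses Dijkstra's algorithm, chosen here for illustration; the notes do not say which algorithm a particular GIS uses. The network, node names and costs are hypothetical. The `directed` flag shows how the directed/undirected distinction changes the answer.

```python
import heapq


def shortest_path(edges, start, goal, directed=True):
    """Dijkstra's algorithm on a network given as (from, to, cost) triples."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, [])
        if not directed:
            graph[b].append((a, cost))  # line can be travelled both ways

    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                previous[neighbour] = node
                heapq.heappush(queue, (new_dist, neighbour))

    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]


# Hypothetical street segments: (from node, to node, travel cost).
streets = [("A", "B", 4), ("B", "C", 3), ("A", "C", 9), ("C", "D", 2), ("D", "A", 1)]

path_directed, cost_directed = shortest_path(streets, "A", "D", directed=True)
path_undirected, cost_undirected = shortest_path(streets, "A", "D", directed=False)
print(path_directed, cost_directed)
print(path_undirected, cost_undirected)
```

In the directed case the segment from D to A is one-way, so the route from A to D has to go through B and C. In the undirected case the same segment can be used in reverse and the route is direct.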
In raster overlay, the pixel or grid cell values in each map are combined using arithmetic and
Boolean operators to produce a new value in the composite map. The maps can be treated as
arithmetic variables on which complex algebraic functions are performed. This method is often
described as map algebra. A raster GIS thus provides the ability to combine map layers
mathematically. This is particularly important for modeling, in which various maps are combined
using various mathematical functions. Conditional operators are among the basic functions that
are supported in GIS.
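A minimal map algebra sketch in Python, assuming two small rasters that share the same grid. The layer names (`slope`, `rainfall`), their values and the thresholds are hypothetical. The sum in the arithmetic example has no physical meaning and only shows the mechanics.

```python
# Two hypothetical input rasters on the same grid (same rows, columns, cell size).
slope = [[2, 8, 15],
         [5, 12, 30],
         [1, 3, 9]]
rainfall = [[900, 1100, 1300],
            [700, 1000, 1200],
            [600, 800, 1500]]


def overlay(raster_a, raster_b, func):
    """Apply func cell by cell to two aligned rasters."""
    return [[func(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(raster_a, raster_b)]


# Arithmetic operator: cell-by-cell sum of the two layers
total = overlay(slope, rainfall, lambda s, r: s + r)

# Boolean operators: 1 where slope < 10 AND rainfall >= 800, else 0
suitable = overlay(slope, rainfall, lambda s, r: int(s < 10 and r >= 800))

# Conditional operator: IF slope > 10 THEN 0 ELSE rainfall
masked = overlay(slope, rainfall, lambda s, r: 0 if s > 10 else r)

for name, grid in (("total", total), ("suitable", suitable), ("masked", masked)):
    print(name, grid)
```

Each output cell depends only on the input cells at the same position, which is why the layers must be aligned before they are combined.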
A. Neighborhood operations
Whereas overlays combine features at the same location, neighbourhood functions evaluate the
characteristics of an area surrounding a feature’s location. This allows us to look at buffer zones
around features, and at spreading effects if features are a source of something that spreads, e.g.,
water springs, volcanic eruptions, sources of pollution.
i. Local operation: the output value of a cell is computed as a function of the value of
the same cell in input raster datasets (regardless of the values of neighboring cells).
ii. Neighborhood (focal) operation: the output value of a cell is computed as a function
of the values of the same cell and neighboring cells in the input datasets.
iii. Zonal operation: the output value for each location depends on the value of the cell
at the location and the association that location has within a cartographic zone.
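The three kinds of operation can be compared on a tiny example. The sketch below assumes a 3 x 3 window clipped at the raster edge for the focal operation and a sum as the zonal statistic; both are illustrative choices, and the rasters `values` and `zones` are made up.

```python
values = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
zones = [["a", "a", "b"],
         ["a", "b", "b"],
         ["c", "c", "b"]]


def local_op(raster, func):
    """Local: each output cell depends only on the same input cell."""
    return [[func(v) for v in row] for row in raster]


def focal_mean(raster):
    """Focal: mean of each cell and its neighbours in a 3 x 3 window (clipped at edges)."""
    n_rows, n_cols = len(raster), len(raster[0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [raster[i][j]
                      for i in range(max(0, r - 1), min(n_rows, r + 2))
                      for j in range(max(0, c - 1), min(n_cols, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_sum(raster, zone_raster):
    """Zonal: every cell receives the sum of the values in its zone."""
    totals = {}
    for value_row, zone_row in zip(raster, zone_raster):
        for v, z in zip(value_row, zone_row):
            totals[z] = totals.get(z, 0) + v
    return [[totals[z] for z in zone_row] for zone_row in zone_raster]


doubled = local_op(values, lambda v: v * 2)
smoothed = focal_mean(values)
zone_totals = zonal_sum(values, zones)
print(doubled)
print(smoothed)
print(zone_totals)
```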
8.2.1 Measurement
Reclassification involves the selection and presentation of a selected layer of data based on the classes or
values of a specific attribute e.g. cover group. It involves looking at an attribute, or a series of attributes,
for a single data layer and classifying the data layer based on the range of values of the attribute.
Accordingly, features adjacent to one another that have a common value, e.g. cover group, but differ in
other characteristics, e.g. tree height, species, will be treated and appear as one class. In raster based GIS
software, numerical values are often used to indicate classes. Reclassification is an attribute generalization
technique. Typically this function makes use of polygon patterning techniques such as crosshatching
and/or color shading for graphic representation.
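For raster data, reclassification by value range can be sketched as below. The attribute (tree height in metres), the class ranges and the class numbers are hypothetical. The sketch assumes that each range includes its lower bound and excludes its upper bound.

```python
def reclassify(raster, class_breaks):
    """Replace each value with the class whose [low, high) range contains it.

    class_breaks is a list of (low, high, new_class) tuples; values that
    fall in no range become None.
    """
    def classify(value):
        for low, high, new_class in class_breaks:
            if low <= value < high:
                return new_class
        return None

    return [[classify(v) for v in row] for row in raster]


# Hypothetical tree-height raster (metres) and class ranges.
tree_height = [[2, 7, 12],
               [18, 25, 4],
               [9, 15, 30]]
breaks = [(0, 5, 1), (5, 15, 2), (15, 100, 3)]

height_class = reclassify(tree_height, breaks)
for row in height_class:
    print(row)
```

Cells with different original values end up in the same class, which is the generalization effect described above.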
Geometric measurement on spatial features includes counting, distance and area size
computations. For the sake of simplicity, this section discusses such measurements in a
planar spatial reference system. We limit ourselves to geometric measurements, and do
not include attribute data measurement, which is typically performed in a database query
language. Measurements on vector data are more advanced, thus, also more complex,
than those on raster data. We discuss each group as follows.
a. Measurements on vector data
The primitives of vector data sets are point, (poly)line and polygon. Related geometric measurements are
location, length, distance and area size. Some of these are geometric properties of a feature in isolation
(location, length, area size); others (distance) require two features to be identified. The location property
of a vector feature is always stored by the GIS: a single coordinate pair for a point, or a list of pairs for a
polyline or polygon boundary. Occasionally, there is a need to obtain the location of the centroid of a
polygon; some GISs store these also, others compute them ‘on-the-fly’. Length is a geometric property
associated with polylines, by themselves, or in their function as polygon boundary. It can obviously be
computed by the GIS— as the sum of lengths of the constituent line segments—but it quite often is also
stored with the polyline. Area size is associated with polygon features. Again, it can be computed, but
usually is stored with the polygon as an extra attribute value. This speeds up the computation of other
functions that require area size values. We see that all of the above measurements do not require
computation, but only a look up in stored data.
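Where these values are not stored, they can be computed from the coordinates. The sketch below assumes planar coordinates. Length is the sum of segment lengths, as described above. For area it uses the shoelace formula, which is a common choice but an assumption here, since the notes do not name a method. The `road` and `parcel` geometries are made up.

```python
import math


def distance(p, q):
    """Pythagorean distance between two points in a planar reference system."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def polyline_length(vertices):
    """Length as the sum of the lengths of the constituent line segments."""
    return sum(distance(a, b) for a, b in zip(vertices, vertices[1:]))


def polygon_area(ring):
    """Area of a simple polygon by the shoelace formula (ring closed implicitly)."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


# Hypothetical features.
road = [(0, 0), (3, 4), (3, 10)]
parcel = [(0, 0), (4, 0), (4, 3), (0, 3)]

road_length = polyline_length(road)
parcel_area = polygon_area(parcel)
parcel_perimeter = polyline_length(parcel + parcel[:1])  # close the ring
print(road_length, parcel_area, parcel_perimeter)
```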
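Measuring distance between two features is another important function. If both features are points, say p
and q, the computation in a Cartesian spatial reference system is given by the well-known Pythagorean
distance function, where (x_p, y_p) and (x_q, y_q) are the coordinates of p and q:
$$\mathrm{dist}(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2}$$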
b. Measurements on raster data
Measurements on raster data layers are simpler because of the regularity of the cells. The area size of a
cell is constant, and is determined by the cell resolution. Horizontal and vertical resolution may differ, but
typically do not. Together with the location of a so called anchor point, this is the only geometric
information stored with the raster data, so all other measurements by the GIS are computed. The anchor
point is fixed by convention to be the lower left (or sometimes upper left) location of the raster. Location
of an individual cell derives from the raster’s anchor point, the cell resolution, and the position of the cell
in the raster. Again, there are two conventions: the cell’s location can be its lower left corner, or the cell’s
midpoint. These conventions are set by the software in use, and in case of low resolution data they
become more important to be aware of. The area size of a selected part of the raster (a group of cells) is
calculated as the number of cells multiplied by the cell area size. The distance between two raster cells
is the standard distance function applied to the locations of their respective mid-points, obviously taking
into account the cell resolution. Where a raster is used to represent line features as strings of cells through
the raster, the length of a line feature is computed as the sum of distances between consecutive cells.
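These raster measurements can be sketched in Python. The sketch assumes a lower-left anchor point, square cells, and rows numbered from the top of the raster, which is only one of the conventions mentioned above. The anchor coordinates, the cell size and the cell positions are hypothetical.

```python
import math


def cell_midpoint(row, col, anchor, cell_size, n_rows):
    """Midpoint of a cell, assuming a lower-left anchor and row 0 at the top."""
    x = anchor[0] + (col + 0.5) * cell_size
    y = anchor[1] + (n_rows - row - 0.5) * cell_size
    return (x, y)


def area_of_cells(cells, cell_size):
    """Area of a group of cells: number of cells times the cell area."""
    return len(cells) * cell_size * cell_size


def cell_distance(cell_a, cell_b, cell_size):
    """Distance between the midpoints of two cells given as (row, col)."""
    return math.hypot((cell_a[0] - cell_b[0]) * cell_size,
                      (cell_a[1] - cell_b[1]) * cell_size)


def line_length(cells, cell_size):
    """Length of a line stored as a string of cells: sum over consecutive cells."""
    return sum(cell_distance(a, b, cell_size) for a, b in zip(cells, cells[1:]))


# Hypothetical raster: 4 rows, 30 m cells, anchored at (1000, 2000).
anchor, cell_size, n_rows = (1000, 2000), 30, 4

top_left = cell_midpoint(0, 0, anchor, cell_size, n_rows)
patch_area = area_of_cells([(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)], cell_size)
stream = [(3, 0), (2, 1), (2, 2), (1, 3)]
stream_length = line_length(stream, cell_size)
print(top_left, patch_area, round(stream_length, 2))
```

Diagonal steps between cells are longer than horizontal or vertical steps, so the length of a raster line depends on how its cells are connected.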
Which clinics are within 2 kilometres of a selected school? (Information needed for the school
emergency plan.)
Which roads are within 200 metres of a medical clinic? (These roads must have a high road
maintenance priority.)
The figure below illustrates a spatial selection using distance. Here, we executed the selection of the
second example above. Our selection objects were all clinics, and we selected the roads that pass by a
clinic within 200 metres.
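A selection of this kind can be sketched with a point-to-line distance test. The sketch assumes planar coordinates in metres, clinics stored as points, and roads stored as polylines. All names and coordinates are invented for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(p[0] - ax, p[1] - ay)  # degenerate segment
    # Position of the projection of p along the segment, clamped to [0, 1]
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (ax + t * dx), p[1] - (ay + t * dy))


def point_polyline_distance(p, vertices):
    """Shortest distance from point p to any segment of a polyline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(vertices, vertices[1:]))


# Hypothetical selection objects (clinics) and target features (roads).
clinics = {"clinic_1": (100, 100), "clinic_2": (900, 500)}
roads = {
    "road_a": [(0, 0), (1000, 0)],         # 100 m from clinic_1
    "road_b": [(0, 400), (500, 400)],      # 300 m from clinic_1, over 400 m from clinic_2
    "road_c": [(1000, 300), (1000, 1000)], # 100 m from clinic_2
}

selected = [name for name, line in roads.items()
            if any(point_polyline_distance(c, line) <= 200 for c in clinics.values())]
print(selected)
```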
expressing an overlay function as a formula in which the data layers are the arguments. Different layers
can be combined using arithmetic, relational, and conditional operators and many different functions.
principle in spatial analysis that can be equally useful. The principle here is to find out the characteristics
of the vicinity, here called neighbourhood, of a location. After all, many suitability questions, for
instance, depend not only on what is at the location, but also on what is near the location. Thus, the GIS
must allow us ‘to look around locally’. To perform neighbourhood analysis, we must:
1. State which target locations are of interest to us, and what is their spatial extent,
2. Define how to determine the neighbourhood for each target,
3. Define which characteristic(s) must be computed for each neighbourhood.
For instance, our target can be a medical clinic. Its neighbourhood can be defined as:
An area within 2 km distance, as the crow flies, or
An area within 2 km travel distance, or
All roads within 500 m travel distance, or
All other clinics within 10 minutes travel time, or
All residential areas, for which the clinic is the closest clinic.
Then, in the third step we indicate what characteristics to find out about the neighbourhood. This could
simply be its spatial extent, but it might also be statistical information like:
How many people live in the area,
What is their average household income, or
Are any high-risk industries located in the neighbourhood?
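The three steps can be followed in a short Python sketch. It uses the simplest neighbourhood definition from the list above, a 2 km distance as the crow flies. The clinic location, the settlements and their population and income figures are all hypothetical.

```python
import math

# Step 1: target location (hypothetical clinic, planar coordinates in metres).
clinic = (5000, 5000)

# Hypothetical settlements: location, population, mean household income.
settlements = [
    {"name": "s1", "xy": (5500, 5200), "population": 1200, "income": 300},
    {"name": "s2", "xy": (6800, 5000), "population": 800, "income": 500},
    {"name": "s3", "xy": (9000, 9000), "population": 5000, "income": 400},
]


# Step 2: neighbourhood = everything within 2 km as the crow flies.
def in_neighbourhood(xy, target, radius=2000):
    return math.hypot(xy[0] - target[0], xy[1] - target[1]) <= radius


nearby = [s for s in settlements if in_neighbourhood(s["xy"], clinic)]

# Step 3: characteristics of the neighbourhood.
total_population = sum(s["population"] for s in nearby)
mean_income = (
    sum(s["income"] * s["population"] for s in nearby) / total_population
    if total_population else None
)
print([s["name"] for s in nearby], total_population, mean_income)
```

Replacing straight-line distance with travel distance or travel time would change only step 2; steps 1 and 3 stay the same.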
4: Data Visualization
Data Output
GIS output represents the pinnacle (end) of many GIS projects. Since the purpose of information systems
is to produce results, this aspect of GIS is vitally important to many managers, technicians, and
scientists. Maps are a very effective way of summarizing and communicating the results of GIS
operations to a wide audience. The importance of map output is further highlighted by the fact that many
consumers of geographic information only interact with GIS through their use of map products. Analysis
outputs can also take the form of statistical summaries, graphs, and charts.
What are the outputs of GIS and the methods?
Data output methods
Methods Devices