(GIS'23) Lecture 3 - GIS Data Representation & Modeling
Systems
Lecture 3
Representation of GIS Data
Prepared by
Dr. Naglaa Fathy
Image source: Westfield State University
Nature of Geospatial Data
• Geographical (Geospatial) data link place, time and attributes.
➢Place is represented directly by geographical coordinates, or indirectly by
other means, e.g. place names, street numbers.
➢Time is represented by dates or in a relative manner, such as past, present
and future.
➢Attributes describe the characteristics of the spatial features.
• For example,
“The temperature at noon on December 2nd 2004 at latitude 34 degrees
45 minutes north, longitude 120 degrees 0 minutes west, was 18 degrees
Celsius”.
It ties location and time to the attribute of atmospheric temperature.
Types of Attribute Data
The distinction between types of attribute data is important because it
provides guidance about the proper use of different statistical, analytical, and
cartographic (map drawing) operations.
Image Source: What is the difference between ordinal, interval and ratio variables? Why should I care? - FAQ 1089 - GraphPad
Types of Attribute Data
Nominal data
• Describe different categories of data such as:
➢ Land cover type (grass, asphalt, trees, bare ground, water)
➢ Race (White, Black or African American, American Indian or Alaska Native, Asian)
➢ Marital status
➢ Mode of transportation (car, bus, subway, railroad...)
➢ Type of heating fuel (gas, fuel oil, coal, electricity...)
➢ etc.
• Nominal attributes include numbers, letters, and even colors.
• Even though a nominal attribute can be numeric (e.g., car license number,
country code, etc.), applying arithmetic operations to it is meaningless.
Types of Attribute Data
Ordinal data
• Differentiate data by a ranking relationship such as:
➢ Socioeconomic status ("low income", "middle income", "high income"),
➢ Education level ("high school", "BS", "MS", "PhD"),
➢ Income level ("less than 50K", "50K-100K", "over 100K"),
➢ Satisfaction rating ("extremely dislike", "dislike", "neutral", "like", "extremely like").
Interval data
• Have known intervals (differences) between values. For example,
➢ we can determine that 30 ºF is 5 ºF warmer than 25 ºF.
Ratio data
• Same as interval data, except that the ratio between values is meaningful because
the scale has a defined zero point:
➢ A person of 100 kg is twice as heavy as a person of 50 kg; but
➢ Celsius temperature is only interval, because 20 ºC is not twice as hot as 10 ºC.
➢ Other examples of ratio data are length, weight, annual sales, population density, etc.
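The four attribute types can be contrasted in a minimal Python sketch. All sample values below are made up for illustration; the point is only which operations make sense at each level of measurement.

```python
from collections import Counter

# Nominal: only counting and the mode are meaningful (sample values are made up).
land_cover = ["grass", "water", "trees", "grass", "asphalt"]
most_common_cover = Counter(land_cover).most_common(1)[0][0]

# Ordinal: ranking is meaningful, so a median can be taken once an order is declared.
EDUCATION_ORDER = ["high school", "BS", "MS", "PhD"]
education = ["MS", "high school", "PhD", "BS", "MS"]
ranked = sorted(education, key=EDUCATION_ORDER.index)
median_education = ranked[len(ranked) // 2]

# Interval: differences are meaningful, ratios are not.
celsius = [10.0, 20.0]
difference = celsius[1] - celsius[0]   # 20 ºC is 10 degrees warmer than 10 ºC
naive_ratio = celsius[1] / celsius[0]  # 2.0, but 20 ºC is not "twice as hot"

# Ratio: Kelvin has a true zero, so the ratio of temperatures is meaningful.
kelvin = [c + 273.15 for c in celsius]
true_ratio = kelvin[1] / kelvin[0]     # roughly 1.035

print(most_common_cover, median_education, difference, naive_ratio, round(true_ratio, 3))
```

Converting the same two temperatures to Kelvin shows why the Celsius ratio is misleading: the physically meaningful ratio is close to 1.04, not 2.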
Spatial Data Models
• In order to visualize natural phenomena, one must first determine
how to best represent geographic space.
• Spatial Data models are a set of rules and/or constructs used to
describe and represent aspects of the real world in a computer.
• There are two primary spatial data models:
➢ Vector data models
use points, lines and polygons, much as paper maps do.
➢ Raster data models
divide the world into arrays of cells (or pixels) and assign attributes to the cells.
Vector Data Model
• Also called the discrete object model, it uses discrete objects with well-defined
boundaries to represent spatial features on the Earth’s surface.
• Vector data can be prepared in three basic steps:
i. Representing the shape and location of spatial features using points and their x-,
y-coordinates (human-oriented conceptual model)
ii. Data structure, i.e., structuring the properties and spatial relationships of these
geometric objects in a logical framework (implementation-oriented logical model)
iii. Storing vector data in digital data files so that they can be accessed, interpreted,
and processed by the computer (physical model)
Vector Data Model - i. Representation of Spatial features
• A spatial feature refers to a geographic entity (object) encoded using
the vector data model.
• A feature class refers to a set of features of the same geometric type.
• The vector data model uses the geometric objects of point, line, and
polygon to represent spatial features:
Vector Data Model - i. Representation of Spatial features
❑Point
• A point has zero dimension and has only the property of location.
• Examples of point features are wells, trees, buildings, and retail stores.
❑Line
• A line is one-dimensional and has the property of length, in addition to location.
• A line has two end points and may have additional points in between to mark the shape of the line.
• The shape of a line may be a connection of straight-line segments, or a smooth curve generated
using a mathematical function.
• Examples of line features are roads, boundaries, small streams, etc.
❑Polygon
• A polygon is two-dimensional and has the properties of area (size) and perimeter, in addition to
location.
• Made of connected, closed, nonintersecting lines.
• A polygon may stand alone or share boundaries with other polygons.
• Examples of polygon features include vegetated areas (e.g., forests), urban areas (e.g., cities), and seas.
Vector Data Model - i. Representation of Spatial features
• To indicate the feature location:
➢ a point is represented by a pair of x- and y-coordinates.
➢ a line is represented by a series of x- and y-coordinates.
➢ a polygon is represented by a series of x- and y-coordinates,
such that the coordinates of the start and end points are the same.

Point number   x, y coordinates
1              (2, 8)
2              (3, 3)
3              (12, 7)
4              (9, 4)

Line number    x, y coordinates
1              (1, 6), (4, 8), (8, 6), (13, 8)
2              (1, 3), (4, 4), (9, 2), (13, 5)

Polygon number x, y coordinates
1              (2, 5), (3, 8), … (2, 5)
2              (6, 4), (8, 4), … (6, 4)
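The coordinate tables on this slide map directly onto simple Python containers. The sketch below uses the point and line coordinates from the slide; the polygon ring is a hypothetical one, because the slide elides the middle vertices of its polygons. The helper names are illustrative only.

```python
import math

# Points and lines as listed on the slide.
points = {1: (2, 8), 2: (3, 3), 3: (12, 7), 4: (9, 4)}
lines = {
    1: [(1, 6), (4, 8), (8, 6), (13, 8)],
    2: [(1, 3), (4, 4), (9, 2), (13, 5)],
}
# Hypothetical polygon ring (not from the slide): first and last vertex are the same.
polygon = [(2, 5), (3, 8), (6, 7), (5, 4), (2, 5)]


def line_length(coords):
    """Sum of the straight segment lengths between consecutive vertices."""
    return sum(math.dist(p, q) for p, q in zip(coords, coords[1:]))


def is_closed(coords):
    """A polygon ring needs at least 4 entries and identical start and end points."""
    return len(coords) >= 4 and coords[0] == coords[-1]


def polygon_area(ring):
    """Shoelace formula on a closed ring; the sign is dropped."""
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2


print(line_length(lines[1]), is_closed(polygon), polygon_area(polygon))
```

A point carries only a location, a line additionally has a length, and a polygon has both an area and a perimeter, which matches the dimensionality described earlier.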
Vector Data Model - i. Representation of Spatial features
• The representation of spatial features on paper maps is not always
straightforward because it can depend on map scale.
• For example, a city on a 1:1,000,000 scale map may appear as a point, but the
same city may appear as a polygon on a 1:24,000 scale map.
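A short calculation makes the scale effect concrete. The sketch assumes a hypothetical city about 2 km across; the function name is illustrative.

```python
def map_length_cm(ground_length_m, scale_denominator):
    """Length on the map, in centimetres, of a ground distance given in metres."""
    return ground_length_m * 100 / scale_denominator


city_width_m = 2_000  # assumed city width, for illustration only

small_scale = map_length_cm(city_width_m, 1_000_000)  # 0.2 cm on a 1:1,000,000 map
large_scale = map_length_cm(city_width_m, 24_000)     # about 8.3 cm on a 1:24,000 map

print(small_scale, round(large_scale, 1))
```

At 1:1,000,000 the city is only about 2 mm wide on paper, so a point symbol is adequate; at 1:24,000 it spans several centimetres and is better drawn as a polygon.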
Vector Data Model - ii. Features Data Structure
• The structure of GIS features may be simple (Spaghetti) or topological.
❑Spaghetti Structure
• It is called spaghetti because, like a plate of cooked spaghetti, lines (strands of spaghetti)
and polygons (spaghetti hoops) can overlap and there are no relationships
between any of the objects.
• Polygons that lie adjacent to each other must each be defined by their own
set of X, Y coordinate pairs, even if the adjacent polygons share the exact same
boundary information. This creates some redundancy within the data model.
• Spaghetti feature datasets are useful in GIS applications because they are easy to
create and store, and because they can be retrieved and rendered on screen very
quickly.
• On the other hand, spatial relationships are not explicitly encoded within the
spaghetti model; instead, they are implied by their location. Therefore, operations
like shortest-path network analysis and polygon adjacency cannot be performed
without additional calculations.
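The redundancy and the "additional calculations" can be seen in a minimal sketch. The two squares below are hypothetical; each stores its own full ring, so the shared edge appears twice, and adjacency has to be computed by comparing coordinates.

```python
# Two hypothetical adjacent squares in a spaghetti structure.
# Each polygon stores its own complete ring, so the edge x = 10 is stored twice.
square_a = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
square_b = [(10, 0), (20, 0), (20, 10), (10, 10), (10, 0)]
square_c = [(30, 0), (40, 0), (40, 10), (30, 10), (30, 0)]


def edges(ring):
    """Undirected edges of a ring, so direction of digitizing does not matter."""
    return {frozenset((p, q)) for p, q in zip(ring, ring[1:])}


def share_boundary(ring_1, ring_2):
    """Adjacency is not stored anywhere; it is derived by comparing geometry.

    Simplification: only edges with identical end points are detected.
    """
    return bool(edges(ring_1) & edges(ring_2))


print(share_boundary(square_a, square_b), share_boundary(square_a, square_c))
```

Real data rarely has perfectly identical vertices along a shared boundary, which is one reason spaghetti data can contain slivers and gaps that topology is designed to prevent.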
Vector Data Model - ii. Features Data Structure
[Figure: Spaghetti features. An original map and the same map expressed in Cartesian coordinates.]
Vector Data Model - ii. Features Data Structure
❑Topological Structure
• Topology is a set of rules that model the relationships between neighboring
points, lines, and polygons and determines how they share geometry.
• For example, consider two adjacent polygons:
✓ In the spaghetti model, the shared boundary of two neighboring polygons is defined as
two separate, identical lines.
✓ The inclusion of topology into the data model allows for a single line to represent this
shared boundary with an explicit reference to denote which side of the line (right or left)
belongs with which polygon.
• Topological relationships remain constant when the geographic space of
objects is distorted.
✓ For example, when a map is stretched, properties such as distance and angle change,
whereas topological properties such as adjacency and containment do not.
• Topology is important in GIS for data validation (e.g., detecting and correcting
digitizing errors), query optimization, etc.
Vector Data Model - ii. Features Data Structure
Topological Structure (cont.)
• The fundamental elements of a topological feature are nodes and arcs:
➢ A node is a topological junction representing a common X, Y coordinate pair
between intersecting lines and/or polygons.
➢ An arc is a line that directly connects two nodes.
Vector Data Model - ii. Features Data Structure
Topological Principles: 1. Connectivity (Arc-Node Topology)
• In the arc-node data structure, an arc is defined by two
endpoints: the from-node indicating where the arc begins
and the to-node indicating where it ends.
• Arc-node topology is supported through an arc-node list
that identifies the from- and to-nodes for each arc.
• Connected arcs are determined by searching through the
list for common node numbers.
• In the following example:
o it is possible to determine that arcs 1, 2, and 3 all intersect
because they share node 11.
o The computer can determine that it is possible to travel
along arc 1 and turn onto arc 3 because they share a
common node (11),
o but it's not possible to turn directly from arc 1 onto arc 5
because they don't share a common node.
Image source: Coverage topology—ArcMap | Documentation (arcgis.com)
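The search through the arc-node list can be sketched as follows. The node numbers are hypothetical, since the figure is not reproduced here; they are chosen only to agree with the text, where arcs 1, 2, and 3 share node 11 and arcs 1 and 5 share no node.

```python
# Arc-node list: arc number -> (from-node, to-node).
# Node numbers are assumptions, consistent with the statements on the slide.
ARC_NODE_LIST = {
    1: (10, 11),
    2: (11, 12),
    3: (11, 13),
    4: (13, 15),
    5: (13, 14),
}


def arcs_at_node(node, arc_node_list=ARC_NODE_LIST):
    """All arcs that start or end at the given node."""
    return sorted(arc for arc, ends in arc_node_list.items() if node in ends)


def can_turn(arc_a, arc_b, arc_node_list=ARC_NODE_LIST):
    """Two arcs are connected when their node pairs have a node in common."""
    return bool(set(arc_node_list[arc_a]) & set(arc_node_list[arc_b]))


print(arcs_at_node(11), can_turn(1, 3), can_turn(1, 5))
```

No coordinates are needed for these queries; connectivity is answered from the node numbers alone.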
Vector Data Model - ii. Features Data Structure
Topological Principles: 2. Area Definition (Polygon-Arc Topology)
• In the example, polygon F is made up of
arcs 8, 9, 10, and 7 (the 0 before the 7
indicates that this arc creates an island in
the polygon).
• Each arc appears in two polygons (in the
example, arc 6 appears in the list for
polygons B and C).
• Since the polygon is simply the list of arcs
defining its boundary, arc coordinates are
stored only once, thereby reducing the
amount of data and ensuring that the
boundaries of adjacent polygons don't
overlap.
Image source: Coverage topology—ArcMap | Documentation (arcgis.com)
Vector Data Model - ii. Features Data Structure
Topological Principles: 2. Area Definition (Polygon-Arc Topology)
• An area is represented in the vector model by one or more boundaries defining
a polygon.
• For example, consider a lake with an island in the middle. The lake has two
boundaries: one that defines its outer edge and the island that defines its inner
boundary (or a hole).
• The arc-node structure represents polygons as an ordered list of arcs rather
than a closed loop of x,y coordinates. This is called polygon-arc topology.
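A polygon-arc list with an island can be read with a few lines of Python. The sketch uses the arc list of polygon F from the earlier example (8, 9, 10, 0, 7) and assumes, as described there, that a 0 marks the start of an island ring. The function name is illustrative.

```python
# Polygon-arc list: polygon -> ordered arcs, with 0 separating the outer
# boundary from an island (hole). Polygon F is taken from the example above.
POLYGON_ARCS = {"F": [8, 9, 10, 0, 7]}


def split_rings(arc_list):
    """Return (outer_boundary_arcs, list_of_island_arc_lists)."""
    rings, current = [], []
    for arc in arc_list:
        if arc == 0:          # a 0 closes the current ring and starts an island
            rings.append(current)
            current = []
        else:
            current.append(arc)
    rings.append(current)
    return rings[0], rings[1:]


outer, islands = split_rings(POLYGON_ARCS["F"])
print(outer, islands)
```

The polygon record holds only arc numbers; the coordinates stay in the arc records, so each boundary is stored once.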
Vector Data Model - ii. Features Data Structure
Topological Principles: 3. Contiguity (Left-Right Topology)
• In the example, polygon B is on the left of
arc 6, and polygon C is on the right. Thus,
we know that polygons B and C are
adjacent.
• Polygon A is outside the boundary of the
area. This polygon is called the external, or
universe, polygon.
• A universe polygon represents the world
outside the study area.
• The universe polygon ensures that each arc
always has a left and right side defined.
Image source: Coverage topology—ArcMap | Documentation (arcgis.com)
Vector Data Model - ii. Features Data Structure
Topological Principles: 3. Contiguity (Left-Right Topology)
• Contiguity is the topological concept that allows the vector data model to
determine adjacency.
➢ Two geographic features that share a boundary are called adjacent.
Vector Data Model - ii. Features Data Structure
A complete example of a topological data structure
[Figure: map with nodes N1 to N6, arcs a1 to a6, and polygons A to D]

Arc-Node topology
Arc   Start Node   End Node   Left Polygon   Right Polygon
a1    N1           N2         D              A
a2    N2           N3         D              B
a3    N3           N1         D              A
a4    N4           N1         A              A
a5    N3           N2         A              B
a6    N6           N6         B              C

Polygon topology
Polygon   Arcs
A         a1, a5, a3
B         a2, a5, 0, a6
C         a6
D         Outside map

Node topology
Node   Arcs
N1     a1, a3, a4
N2     a1, a2, a5
N3     a2, a3, a5
N4     a4
N5     0
N6     a6

Arc coordinates
Arc   Start X, Y   Intermediate X, Y        End X, Y
a1    40,60        70,60                    70,50
a2    70,50        70,10; 10,10             10,25
a3    10,25        10,60                    40,60
a4    40,60        30,50                    30,40
a5    10,25        20,27; 30,30; 50,32      70,50
a6    55,27        55,15; 40,15; 45,27      55,27
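The tables of this example are enough to answer adjacency questions and to rebuild a polygon boundary. The sketch below copies the arc-node table and the arc coordinates from the slide; the function names and the rule used to orient each arc (match its end point to the end of the ring built so far) are assumptions made for illustration.

```python
# Arc -> (start node, end node, left polygon, right polygon), from the slide.
ARCS = {
    "a1": ("N1", "N2", "D", "A"),
    "a2": ("N2", "N3", "D", "B"),
    "a3": ("N3", "N1", "D", "A"),
    "a4": ("N4", "N1", "A", "A"),
    "a5": ("N3", "N2", "A", "B"),
    "a6": ("N6", "N6", "B", "C"),
}
# Arc -> start, intermediate, and end coordinates, from the slide.
ARC_COORDS = {
    "a1": [(40, 60), (70, 60), (70, 50)],
    "a2": [(70, 50), (70, 10), (10, 10), (10, 25)],
    "a3": [(10, 25), (10, 60), (40, 60)],
    "a4": [(40, 60), (30, 50), (30, 40)],
    "a5": [(10, 25), (20, 27), (30, 30), (50, 32), (70, 50)],
    "a6": [(55, 27), (55, 15), (40, 15), (45, 27), (55, 27)],
}


def adjacent_polygons(arcs):
    """Contiguity: polygons on the left and right of the same arc are adjacent."""
    return {frozenset((left, right))
            for _start, _end, left, right in arcs.values() if left != right}


def build_ring(arc_ids, coords):
    """Chain arcs into one ring, reversing an arc when it runs the other way."""
    ring = list(coords[arc_ids[0]])
    for arc_id in arc_ids[1:]:
        pts = coords[arc_id]
        if pts[0] == ring[-1]:
            ring.extend(pts[1:])
        elif pts[-1] == ring[-1]:
            ring.extend(reversed(pts[:-1]))
        else:
            raise ValueError(f"arc {arc_id} does not connect to the ring")
    return ring


pairs = adjacent_polygons(ARCS)
ring_a = build_ring(["a1", "a5", "a3"], ARC_COORDS)  # polygon A from the polygon table
print(sorted(sorted(p) for p in pairs))
print(ring_a)
```

Arc a4 has polygon A on both sides, so it contributes no adjacency. Arc a5 is stored once but is traversed in reverse when it is used for polygon A.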
Vector Data Model - Topology Structure
• Topological Features
Vector Data Model - iii. Physical Data models
• The vector data model:
➢ may be georelational or object based,
➢ may or may not involve topology, and
➢ may include simple or composite features.
Vector Data Models - Georelational data model
• The georelational data model stores geometries and attributes separately
in a split system: geometries (“geo”) in graphic files and attributes
(“relational”) in a relational database.
• Typically, a georelational data model uses the feature identification
number (ID) to link the two components.
• Two examples of data structures for the georelational data model:
➢ The coverage (topological data structure), and
➢ The shapefile (nontopological data structure)
o It is a standard, nontopological data format used in ESRI products.
o Although the shapefile treats a point as a pair of x-, y-coordinates, a line as
a series of points, and a polygon as a series of line segments, no files
describe the spatial relationships between these geometric objects.
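The split storage of the georelational model can be sketched with two separate structures linked by a feature ID. The data, field names, and function name below are hypothetical; real coverages and shapefiles use binary files and database tables rather than Python dictionaries.

```python
# "Geo" part: feature ID -> coordinates (hypothetical point features).
geometries = {
    1: [(2, 8)],
    2: [(3, 3)],
}
# "Relational" part: attribute rows, as they might come from a database table.
attribute_rows = [
    {"id": 1, "name": "Well A", "depth_m": 35},
    {"id": 2, "name": "Well B", "depth_m": 50},
]


def join_by_id(geometries, attribute_rows):
    """Link the two components through the shared feature ID."""
    attrs_by_id = {row["id"]: row for row in attribute_rows}
    features = []
    for feature_id, coords in geometries.items():
        feature = {"id": feature_id, "geometry": coords}
        feature.update(attrs_by_id.get(feature_id, {}))
        features.append(feature)
    return features


features = join_by_id(geometries, attribute_rows)
print(features)
```

Because the two parts live in different places, the ID is the only thing that keeps a geometry and its attributes together.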
Vector Data Models - Object-based data model
• The object-based data model treats spatial data as objects.
• It differs from the georelational data model in two important aspects:
o The object-based data model stores both the spatial and attribute data of spatial
features in a single system.
o The object-based data model allows a spatial feature (object) to be associated
with a set of properties and methods (the operations to be performed on these
objects).
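In an object-based model, geometry, attributes, and behaviour sit in one object. The class below is a hypothetical illustration in plain Python, not the geodatabase API; coordinates are assumed to be in kilometres.

```python
import math
from dataclasses import dataclass


@dataclass
class Road:
    """A spatial feature that stores geometry and attributes in a single object."""
    name: str
    coords: list            # vertices as (x, y) pairs, assumed to be in kilometres
    speed_limit_kmh: float

    def length_km(self):
        """Method: length computed from the object's own geometry."""
        return sum(math.dist(p, q) for p, q in zip(self.coords, self.coords[1:]))

    def travel_time_min(self):
        """Method: combines a spatial property with an attribute."""
        return 60 * self.length_km() / self.speed_limit_kmh


road = Road("Example Rd", [(0, 0), (3, 4), (3, 10)], 60)
print(road.length_km(), road.travel_time_min())
```

No ID-based join is needed here, in contrast with the georelational model, because the attributes travel with the geometry.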
Vector Data Models - Object-based data model
The Geodatabase
• The geodatabase is an example of an object-based data model. It was developed
by Esri, and it supports topology.
• The geodatabase organizes vector data sets into feature classes and feature
datasets.
➢ A feature class stores spatial data of the same geometry type.
➢ A feature dataset stores feature classes that share the same coordinate system and
area extent.
• The geodatabase defines topology as relationship rules and lets the user
choose the rules, if any, to be implemented in a feature dataset.
➢ The geodatabase offers 25 topology rules by feature type:

Polygon: must not overlap, must not have gaps, must not overlap with, must be
covered by feature class of, must cover each other, must be covered by, boundary
must be covered by, area boundary must be covered by boundary of, and contains point
Line: must not overlap, must not intersect, must not have dangles, must not have
pseudo-nodes, must not intersect or touch interior, must not overlap with, must be
covered by feature class of, must be covered by boundary of, endpoint must be
covered by, must not self overlap, must not self intersect, and must be single part
Point: must be covered by boundary of, must be properly inside polygons, must be
covered by endpoint of, and must be covered by line
Vector Data Models - Composite Features (TIN Model)
[Figure 3.14 from Intro to GIS book: a TIN with nodes 1 to 11 and triangles A to N]
A TIN uses a series of nonoverlapping triangles to approximate the terrain.

Triangle Table
Id#   node#       area   slope   ……
A     1, 6, 7     ……     ……      ……
B     1, 7, 8
C     1, 2, 8
D     2, 8, 9
E     2, 3, 9
F     3, 4, 9
G     4, 9, 10
H     4, 5, 10
I     5, 10, 11
J     5, 6, 11
K     6, 7, 11
L     7, 8, 9
M     7, 9, 10
N     7, 10, 11

X-Y Coordinates
node#   coordinates
1       x1, y1
2       x2, y2
3       x3, y3
...     ...
11      x11, y11

Z Coordinates
node#   z_value
1       z1
2       z2
3       z3
...     ...
11      z11
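The area and slope columns of the triangle table can be derived from the node coordinates. The sketch below uses one hypothetical triangle, labelled T, with made-up x, y, z values, because the slide lists the coordinates only symbolically. Slope is taken as the angle between the triangle's plane and the horizontal.

```python
import math

# Hypothetical node table: node# -> (x, y, z). Values are made up.
NODES = {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0), 3: (0.0, 10.0, 10.0)}
# Hypothetical triangle table entry: Id# -> node numbers.
TRIANGLES = {"T": (1, 2, 3)}


def triangle_metrics(node_ids, nodes):
    """Return (planimetric area, slope in degrees) of one TIN triangle."""
    p1, p2, p3 = (nodes[n] for n in node_ids)
    u = [b - a for a, b in zip(p1, p2)]   # edge vector p1 -> p2
    v = [b - a for a, b in zip(p1, p3)]   # edge vector p1 -> p3
    # Normal vector of the triangle's plane (cross product u x v).
    nx = u[1] * v[2] - u[2] * v[1]
    ny = u[2] * v[0] - u[0] * v[2]
    nz = u[0] * v[1] - u[1] * v[0]
    planimetric_area = abs(nz) / 2        # area of the triangle projected on the x-y plane
    slope_deg = math.degrees(math.atan2(math.hypot(nx, ny), abs(nz)))
    return planimetric_area, slope_deg


area, slope = triangle_metrics(TRIANGLES["T"], NODES)
print(area, slope)
```

Because every triangle is planar, a single slope value describes the whole facet, which is why such values can be stored once per triangle in the table.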
Vector Data Models - Summary
[Figure: summary diagram of vector data models. Surviving labels: Nontopological, Shapefile, Geodatabase.]