GIS Data Management
GE 118: Introduction to GIS
Engr. Meriam M. Santillan
Caraga State University
Simple list
Ordered sequential files
Indexed files
Simple List
Simplest file structure
Unordered/unstructured
Records are kept in the order in which they were entered
Ordered Sequential Files
Simple lists whose records are arranged in
some order (e.g., alphabetical order)
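The practical difference between a simple list and an ordered sequential file shows up at search time. A minimal Python sketch, using made-up sample records: the unordered list has to be scanned record by record, while the sorted list can be searched by repeated halving with the standard `bisect` module. The function names are illustrative only.

```python
from bisect import bisect_left

# Hypothetical records, in the order they happened to be entered (simple list).
simple_list = ["Surigao", "Butuan", "Tandag", "Bislig", "Cabadbaran"]

def linear_search(records, target):
    """Scan every record in turn; return its position, or -1 if absent."""
    for position, record in enumerate(records):
        if record == target:
            return position
    return -1

# The same records arranged alphabetically (ordered sequential file).
ordered_file = sorted(simple_list)

def binary_search(sorted_records, target):
    """Halve the search range at each step; requires sorted input."""
    position = bisect_left(sorted_records, target)
    if position < len(sorted_records) and sorted_records[position] == target:
        return position
    return -1
```

The linear scan may touch every record, whereas the binary search needs only about log2(n) comparisons, which is why ordering pays off as the file grows.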
Indexed Files
An index to the file is needed for more
efficient searches that find entries
matching given criteria
Can be developed as direct files or inverted
(indirect) files
Direct Indexed Files
The records themselves are used to provide
access to other pertinent information
Indirect Indexed Files
Index is based on possible search criteria,
not on the entities themselves
Attributes are the primary search criteria and
the entities rely on them for selection
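An index built on search criteria rather than on the entities can be sketched as a dictionary from attribute value to entity ids. This is only an illustration: the parcel records, attribute names, and the helper `build_inverted_index` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical entity records: entity id -> attribute values.
parcels = {
    101: {"land_use": "residential", "soil": "clay"},
    102: {"land_use": "agricultural", "soil": "loam"},
    103: {"land_use": "residential", "soil": "loam"},
}

def build_inverted_index(records, attribute):
    """Map each value of `attribute` to the ids of the entities that have it."""
    index = defaultdict(list)
    for entity_id, attributes in records.items():
        index[attributes[attribute]].append(entity_id)
    return dict(index)

land_use_index = build_inverted_index(parcels, "land_use")

# Selecting entities by attribute is now a single dictionary lookup.
residential_ids = land_use_index["residential"]
```

The entities are reached through the attribute value, which matches the idea that attributes are the primary search criteria.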
Database
An integrated set of data on a particular
subject
Collection of interrelated data stored
together with controlled redundancy to
serve one or more applications in an
optimal fashion
Requires a more elaborate structure,
called a database structure or
database management system
Significance of Database
Most GIS activities consist of storing entity and
attribute data so that we can retrieve any
combination of these objects.
Each graphical feature must be stored explicitly with
its attributes so that the two can be searched
together quickly.
Advantages of Database over
File-based datasets
Collecting data at a single location reduces
redundancy and duplication
Lower maintenance cost due to better organization
and decreased data duplication
Multiple applications can use the same data and can
evolve separately over time
Advantages of Database over
File-based datasets
User knowledge can be transferred between
applications more easily because the database
remains constant
Facilitated data sharing, with a corporate view
provided to data managers and users
Security and standards for data and data access
can be established and enforced
Database Management System
A software application designed to organize
the efficient and effective storage of, and
access to, data
A suite of software programs designed to
store, retrieve and manipulate data within a
database
Types of Database Structure
1. Hierarchical Data Structures
2. Network Systems
3. Relational Database Structures
Hierarchical Data Structure
‘one-to-many’ or ‘parent-child’ relationship
Implies that each element (parent) has a direct
relationship to a number of child elements
Each child can have the same direct relationship
with its own children, and so on.
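A parent-child structure can be sketched as nested dictionaries, where every search starts at the root and follows one branch downward. The place names and the helper `find_path` are hypothetical and serve only to illustrate the one-to-many links.

```python
# Hypothetical hierarchy: each parent holds a dict of its children.
hierarchy = {
    "Region": {
        "Province A": {"City 1": {}, "City 2": {}},
        "Province B": {"City 3": {}},
    }
}

def find_path(tree, target, path=()):
    """Depth-first search from the root; return the parent-to-child path to target."""
    for name, children in tree.items():
        current = path + (name,)
        if name == target:
            return current
        found = find_path(children, target, current)
        if found:
            return found
    return None
```

For example, `find_path(hierarchy, "City 3")` walks Region, then Province B, then City 3; there is no direct link between City 3 and City 1 except back through their parents.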
Hierarchical Data Structure
Advantages:
Simple and straightforward data access since parent
and children are directly linked
Easy to search since structure is well defined
Relatively easy to expand by adding new branches
and formulating new decision rules
Hierarchical Data Structure
Disadvantages:
Confined to queries along one branch only
Difficult to restructure to allow other possible
search criteria
Creates large index files
Redundant entries for searching
Network Systems
‘many-to-many’ relationship
Each data item can be linked directly to any
other item in the database using pointers,
without requiring a parent-child relationship.
Network Systems
Advantages:
Less rigid compared to hierarchical structure
Can handle many-to-many relationships
Allows much greater flexibility
Reduced redundancy of data
Network Systems
Disadvantages:
In very complex GIS, the number of
pointers can become large, thus requiring
a lot of storage space
Linkages between data must still be
explicitly defined using pointers
Numerous possible linkages can become
extremely tangled, resulting in confusion
and incorrect linkages
Not recommended for novice users
Relational Database
Management Systems
(RDBMS)
Data are stored as records (rows) of
attribute values, called tuples
Tuples with the same set of attributes are
grouped into tables, called relations
Each column represents data for a single
attribute for the entire dataset
Relational Database
Management Systems
(RDBMS)
Primary key – a column (or set of columns) whose
values uniquely identify each row, and which serves
as the search criterion for the table
Foreign key – a column in a second table that refers
to the primary key of the first, linking the two tables
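The two kinds of key can be sketched with Python's built-in `sqlite3` module. The `owners` and `parcels` tables, their columns, and the helper `insert_parcel` are hypothetical; SQLite is used only because it ships with Python, not because the course prescribes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE parcels ("
    " parcel_id INTEGER PRIMARY KEY,"
    " area_sqm REAL,"
    " owner_id INTEGER NOT NULL REFERENCES owners(owner_id))"
)

conn.execute("INSERT INTO owners VALUES (1, 'Owner A')")
conn.execute("INSERT INTO parcels VALUES (10, 250.0, 1)")

def insert_parcel(parcel_id, area_sqm, owner_id):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO parcels VALUES (?, ?, ?)", (parcel_id, area_sqm, owner_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A second parcel with `parcel_id` 10 would be rejected because the primary key must be unique, and a parcel pointing at a non-existent `owner_id` would be rejected because the foreign key has nothing to refer to.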
Relational Database
Management Systems
(RDBMS)
Normal forms – set of rules to indicate the
forms that the tables should take
First Normal Form
Table must contain columns and
rows
Because the columns are to be
used as search keys, there should
be only a single value in each
row-column location (cell)
Second Normal Form
Requires that every column that is
not part of the primary key be fully
dependent on the whole primary key
Simplifies the tables
Reduces redundancy by imposing the
restriction that each column be
searchable only through the primary key
Third Normal Form
States that every column that is not
part of the primary key must depend
directly on the primary key, and not on
another non-key column, whereas the
primary key does not depend on the
non-key columns
Primary key must be used to find other
columns
But the other columns are not needed to
search for values in the primary key
column
Idea is to reduce redundancy
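How the normal forms reduce redundancy can be sketched in plain Python. The flat table below is invented for illustration: the owner's address is repeated on every parcel row and really depends on `owner_id`, not on the parcel's own key. The helper `normalize` is hypothetical.

```python
# Hypothetical flat table: the owner's address is repeated for every parcel,
# and it depends on owner_id (a non-key column) rather than on parcel_id.
flat_parcels = [
    {"parcel_id": 10, "area_sqm": 250.0, "owner_id": 1, "owner_address": "Street X"},
    {"parcel_id": 11, "area_sqm": 400.0, "owner_id": 1, "owner_address": "Street X"},
    {"parcel_id": 12, "area_sqm": 120.0, "owner_id": 2, "owner_address": "Street Y"},
]

def normalize(rows):
    """Split the flat table so each non-key column depends only on its table's key."""
    parcels = {}  # keyed by parcel_id
    owners = {}   # keyed by owner_id
    for row in rows:
        parcels[row["parcel_id"]] = {
            "area_sqm": row["area_sqm"],
            "owner_id": row["owner_id"],
        }
        owners[row["owner_id"]] = {"owner_address": row["owner_address"]}
    return parcels, owners

parcels, owners = normalize(flat_parcels)
```

After the split each address is stored once, and `owner_id` in the parcels table acts as the link back to it.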
Relational Database
Management Systems
(RDBMS)
Advantages:
Allow us to collect data in reasonably simple tables,
keeping organization also simple
Capable of doing relational joins, as long as there is
at least one column common to the tables to be
joined
Allows greatest flexibility, both in design and
querying
Data Storage in a DBMS
Object classes/layers are stored in database
tables
Each layer is stored as a single database
table in a database management system
Rows contain objects, while columns contain
attributes/properties of the objects
Data Storage in a DBMS
Geographic database tables have a geometry
column (or shape column), which non-geographic
tables don’t have
Basic Database
Functions/Operations
Join
Tables are joined together using common row/column
values or keys
After joining two or more tables, a new table is created
which contains all the values of the joined tables
Database tables can be joined together to create new
relations, or views of the database.
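A join on a common key can be sketched with Python's built-in `sqlite3` module. The table names, columns, and sample rows are hypothetical, and the view `parcel_land_use` stands in for the "new relation" that the join produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, land_use_code TEXT);
    CREATE TABLE land_use (land_use_code TEXT PRIMARY KEY, description TEXT);
    INSERT INTO parcels VALUES (10, 'R'), (11, 'A'), (12, 'R');
    INSERT INTO land_use VALUES ('R', 'Residential'), ('A', 'Agricultural');

    -- The join result is exposed as a view: a new relation built from both tables.
    CREATE VIEW parcel_land_use AS
    SELECT p.parcel_id, p.land_use_code, l.description
    FROM parcels AS p
    JOIN land_use AS l ON p.land_use_code = l.land_use_code;
""")

joined_rows = conn.execute(
    "SELECT parcel_id, description FROM parcel_land_use ORDER BY parcel_id"
).fetchall()
```

Each parcel row now carries the description from the land-use table, matched through the shared `land_use_code` column.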
Basic Database
Functions/Operations
Link
Tables are linked using common row/column values or
keys
Unlike joining, linking tables does not result in a new
table. The original tables are retained, but accessing one
also gives the user access to the table linked to it
Database Design
Involves three stages: conceptual, logical,
and physical
Involves six practical steps (see Figure)
Stages of Database Design
Conceptual Model: User View; Object Types and Relationships; Geographic Representation
Logical Model: Geographic Database Types; Geographic Database Structure
Physical Model: Database Schema
Conceptual Model
Steps involved are:
1. Model the user’s view
2. Define objects and their relationships
3. Select geographic representation
Model the User’s View
Identifying organizational functions,
determining data requirements of these
functions, organizing data into groups for
data management
May be presented using a report with tables
Define Objects and Their
Relationships
Specification of object types/classes and
functions, and their relationships
May be presented using diagrams
Select Geographic
Representation
Choosing between the types of discrete objects
(point, line, or polygon) or field to represent the
data
Selection has a critical impact on the database
use
Although it is possible to switch between
representations later on, it would be
computationally expensive and would lead to
information loss
Logical Model
Steps involved are:
1. Match to geographic database types
2. Organize geographic database structure
Match to Geographic Database
Types
Matching of object types to be studied to
specific data types supported by the GIS
Organize Geographic Database
Structure
Defining topological associations, specifying
rules and relationships, and assigning
coordinate systems
Physical Model
Step involved is:
1. Define database schema
Definition of the actual physical database
schema that will hold the data values
Usually created using the DBMS software’s data
definition language (e.g., the DDL statements of SQL)
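A schema definition written in SQL's data definition language might look like the sketch below, run here through Python's built-in `sqlite3` module. The `roads` layer and its columns are hypothetical. Plain SQLite has no spatial type, so the geometry column is assumed to hold well-known text (WKT); a spatially enabled DBMS would use a true geometry type instead.

```python
import sqlite3

# Data definition language (DDL) statements for a hypothetical "roads" layer.
# Plain SQLite has no spatial type, so geometry is held as WKT text here.
SCHEMA = """
CREATE TABLE roads (
    road_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    lanes     INTEGER,
    geometry  TEXT NOT NULL  -- e.g. 'LINESTRING(0 0, 10 5)'
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO roads VALUES (1, 'Road 1', 2, 'LINESTRING(0 0, 10 5)')")

# List the column names that the schema defined.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(roads)")]
```

The schema fixes the table's columns and types before any data values are loaded, which is the role of the physical model.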
Database
Organization/Structuring
Necessary for efficient query, analysis, and
mapping
Structuring Techniques
1. Topology Creation
2. Indexing
Topology Creation
Topology can be created for vector data using either
batch or interactive techniques
Batch topology – for CAD, survey, simple feature, and
other unstructured vector data; an iterative process
Interactive topology – performed dynamically at the
time objects are added to the database
Indexing
Can help speed up certain types of queries
Three main indexing methods in GIS are grid
indexes, quadtrees, and R-trees.
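Of the three methods, the grid index is the simplest to sketch. The Python below is a minimal illustration under stated assumptions: a fixed cell size of 10 map units, point features only, and hypothetical helper names (`cell_of`, `build_grid_index`, `query_box`). It is not how any particular GIS implements its index.

```python
from collections import defaultdict

CELL_SIZE = 10.0  # assumed cell width and height, in map units

def cell_of(x, y):
    """Return the (column, row) of the grid cell containing point (x, y)."""
    return (int(x // CELL_SIZE), int(y // CELL_SIZE))

def build_grid_index(points):
    """Map each grid cell to the ids of the points that fall inside it."""
    index = defaultdict(list)
    for point_id, (x, y) in points.items():
        index[cell_of(x, y)].append(point_id)
    return index

def query_box(index, points, xmin, ymin, xmax, ymax):
    """Find points in a box by checking only the cells the box overlaps."""
    col_min, row_min = cell_of(xmin, ymin)
    col_max, row_max = cell_of(xmax, ymax)
    hits = []
    for col in range(col_min, col_max + 1):
        for row in range(row_min, row_max + 1):
            for point_id in index.get((col, row), []):
                x, y = points[point_id]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(point_id)
    return sorted(hits)

# Hypothetical point features: id -> (x, y)
wells = {1: (3.0, 4.0), 2: (12.0, 7.0), 3: (25.0, 31.0), 4: (14.0, 18.0)}
grid = build_grid_index(wells)
```

A box query only inspects the cells it overlaps instead of every feature, which is where the speed-up comes from. Quadtrees and R-trees follow the same idea but adapt the partitioning to the data.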