1. Attribute Data Models
GIS Database Management System
GE517
Engr. Ablao
Introduction
GIS involves both spatial and attribute data.
Spatial – geometry of map features
Attribute – characteristics of the map features
Attribute data are normally stored in tables.
Record or tuple – row
Field or item – column
Attribute value – intersection of row and column
Data models relate spatial & attribute data.
GE 517 Geographic Information System 8/20/2010
2. Spatial data (left) are linked to attribute data (right) by the label ID.
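A minimal Python sketch of the label ID link described above. The label IDs, coordinates, and attribute names are made up for illustration and do not come from the figure.

```python
# Spatial data: geometry keyed by label ID (coordinates are illustrative).
spatial = {
    1: [(0, 0), (4, 0), (4, 3), (0, 3)],
    2: [(4, 0), (8, 0), (8, 3), (4, 3)],
}

# Attribute data: one record (row) per label ID; dictionary keys are fields (columns).
attributes = {
    1: {"land_use": "Residential", "hectares": 1.2},
    2: {"land_use": "Commercial", "hectares": 1.5},
}

def describe(label_id):
    """Return the geometry and the attribute record that share a label ID."""
    return spatial[label_id], attributes[label_id]

geometry, record = describe(2)
print(len(geometry), record["land_use"])
```

The label ID is the only thing the two sides share, so either side can be edited without touching the other as long as the ID stays stable.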
File Structures (File-based datasets)
Simple list
Simplest file structure
Unordered/unstructured
Arrangement is by whichever comes first
Ordered sequential files
Simple lists that are arranged according to some order (e.g., alphabetical order)
Indexed files
An index to the directory is needed for more efficient
searches involving finding entries given certain criteria
Can be developed as direct files or inverted files
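To make the difference between a simple list and an ordered sequential file concrete, here is a small Python sketch. The names and the two helper functions are illustrative assumptions, not part of the lecture.

```python
from bisect import bisect_left

simple_list = ["Fidel", "Gloria", "Dado", "Cory"]   # stored in order of arrival
ordered_file = sorted(simple_list)                   # alphabetical order

def linear_search(records, name):
    """Scan an unordered list record by record; return the position or -1."""
    for position, value in enumerate(records):
        if value == name:
            return position
    return -1

def binary_search(records, name):
    """Search a sorted list by halving the range; return the position or -1."""
    position = bisect_left(records, name)
    if position < len(records) and records[position] == name:
        return position
    return -1

print(linear_search(simple_list, "Dado"))
print(binary_search(ordered_file, "Gloria"))
```

The unordered list may need to inspect every record, while the ordered file needs only about log2(n) comparisons. This is the efficiency gain that ordering buys before any index is added.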
3. Indexed Files
Direct Indexed Files
Records are used to provide access to other pertinent information
Indirect (Inverted) Indexed Files
Index is based on possible search criteria, not on
the entities themselves
Attributes are the primary search criteria and the
entities rely on them for selection
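An index built on search criteria rather than on the entities can be sketched in Python as follows. The parcel IDs and zoning values follow the parcel example used later in these slides; the function name is an assumption made for illustration.

```python
from collections import defaultdict

# Entity records keyed by parcel ID.
records = {
    "P101": {"zoning": "Residential"},
    "P102": {"zoning": "Commercial"},
    "P103": {"zoning": "Commercial"},
    "P104": {"zoning": "Residential"},
}

def build_inverted_index(table, field):
    """Map each attribute value to the list of entities that carry it."""
    index = defaultdict(list)
    for key, record in table.items():
        index[record[field]].append(key)
    return dict(index)

zoning_index = build_inverted_index(records, "zoning")
print(zoning_index["Commercial"])
```

A query such as "all commercial parcels" now reads one index entry instead of scanning every record.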
Flat file database
Contains all data in a large file
Software could only operate on one file at a time
Format is very inflexible with respect to the modification of the
database structure
4. Flat file database
Database
An integrated set of data on a particular subject
Collection of interrelated data stored together
with controlled redundancy to serve one or
more applications in an optimal fashion
Requires more elaborate structure called a
database structure or database management
system
A DBMS manages attribute data in separate tables
5. Significance of Database
Most GIS activities consist of storing entity and attribute data
so that we can retrieve any combination of these objects.
Each graphical feature must be stored explicitly with its
attributes so that their combined search becomes faster.
Advantages of Database over File-based datasets
Collecting data at a single location reduces redundancy and
duplication
Lower maintenance cost due to better organization and
decreased data duplication
Multiple applications can use the same data and can evolve
separately over time
6. Advantages of Database over File-based datasets
User knowledge can be transferred between applications
more easily because the database remains constant
Facilitated data sharing, with a corporate view provided to
data managers and users
Security and standards for data and data access can be
established and enforced
Types of Database Structure
1. Hierarchical Data Structures
2. Network Systems
3. Relational Database Structures
7. Hierarchical Data Structure
‘one-to-many’ or ‘parent-child’ relationship
Implies that each element has a direct relationship to a number of symbolic children
Each child is capable of having the same direct relationship with
his/her own offspring, and so on.
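The parent-child structure can be sketched in Python as a nested tree. The node names (a country with regions and provinces) are hypothetical and chosen only to illustrate the one-to-many links.

```python
# Each node has one parent (implicitly) and any number of children.
hierarchy = {
    "name": "Country",
    "children": [
        {"name": "Region A", "children": [
            {"name": "Province A1", "children": []},
            {"name": "Province A2", "children": []},
        ]},
        {"name": "Region B", "children": [
            {"name": "Province B1", "children": []},
        ]},
    ],
}

def find_path(node, target, path=()):
    """Depth-first search returning the branch from the root down to target."""
    path = path + (node["name"],)
    if node["name"] == target:
        return path
    for child in node["children"]:
        found = find_path(child, target, path)
        if found:
            return found
    return None

print(find_path(hierarchy, "Province B1"))
```

Every lookup walks down from the root along a single branch, which is why access is simple but queries that cut across branches are awkward in this structure.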
Hierarchical database
8. Hierarchical Data Structure
Advantages:
Simple and straightforward data access since parent and
children are directly linked
Easy to search since structure is well defined
Relatively easy to expand by adding new branches and
formulating new decision rules
Hierarchical Data Structure
Disadvantages:
Confined to queries along one branch only
Difficult restructuring to allow other possible search criteria
Creates large index files
Redundant entries for searching
9. Network Systems
‘many-to-many’ relationship
Each data item is linked directly to any other item in the database using pointers, without the parent-child relationship.
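A rough Python sketch of pointer-based, many-to-many linkage. The lots and owners are hypothetical, and dictionary keys stand in for the pointers a network database would store.

```python
# Each record stores explicit pointers (here, keys) to the records it links to.
lots = {
    "lot_1": {"owners": ["Ana", "Ben"]},
    "lot_2": {"owners": ["Ana"]},
}
owners = {
    "Ana": {"lots": ["lot_1", "lot_2"]},
    "Ben": {"lots": ["lot_1"]},
}

def co_owners(owner):
    """Follow pointers owner -> lots -> owners to find people sharing a lot."""
    found = set()
    for lot_id in owners[owner]["lots"]:
        found.update(lots[lot_id]["owners"])
    found.discard(owner)
    return sorted(found)

print(co_owners("Ben"))
```

Note that every link is written twice, once on each side. Keeping both directions consistent is the bookkeeping burden that grows with the number of pointers.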
Network database
10. Network Systems
Network Systems
Advantages:
Less rigid compared to hierarchical structure
Can handle many-to-many relationships
Allows much greater flexibility
Reduced redundancy of data
11. Network Systems
Disadvantages:
In a very complex GIS, the number of pointers can become large, thus requiring a lot of storage space
Linkages between data must still be explicitly
defined using pointers
Numerous possible linkages can become extremely tangled, resulting in confusion and incorrect linkages
Not recommended for novice users
Relational Database Management Systems (RDBMS)
Data are stored as ordered records or rows of attribute values
called tuples
Tuples are grouped with corresponding data rows in a form called
relations
Each column represents data for a single attribute for the entire
dataset
12. Relational Database Management Systems (RDBMS)
A key represents one or more attributes whose
values can uniquely identify a record in a table.
A key common to two tables can establish a connection between records in the tables.
Primary key – a column (or set of columns) whose values uniquely identify each record; it serves as the search criterion when tables are connected
Foreign key – column in the second table to which the primary key is linked
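The key concepts above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table and column names are assumptions, and the parcel P999 with zone code 9 is a deliberately invalid, hypothetical record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# zone_code is the primary key of zone and a foreign key in parcel.
conn.execute("CREATE TABLE zone (zone_code INTEGER PRIMARY KEY, zoning TEXT)")
conn.execute("""
    CREATE TABLE parcel (
        pin       TEXT PRIMARY KEY,
        hectares  REAL,
        zone_code INTEGER REFERENCES zone(zone_code)
    )
""")
conn.executemany("INSERT INTO zone VALUES (?, ?)",
                 [(1, "Residential"), (2, "Commercial")])
conn.executemany("INSERT INTO parcel VALUES (?, ?, ?)",
                 [("P101", 1.2, 1), ("P102", 1.5, 2)])

# A parcel pointing at a zone code that does not exist is rejected.
try:
    conn.execute("INSERT INTO parcel VALUES ('P999', 1.0, 9)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

The primary key guarantees one row per zone code, and the foreign key guarantees that every parcel refers to a zone that actually exists.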
Relational database
13. Relational Database Management Systems (RDBMS)
Advantages:
Allows us to collect data in reasonably simple tables, keeping organization also simple
Capable of doing relational joins, as long as there is at least
one column common to the tables to be joined
Allows greatest flexibility, both in design and querying
Normalization of relational database
Normalization is a process of decomposition,
taking a table with all the attribute data and
breaking it down to small tables while
maintaining the necessary linkages between them.
Normalization is designed to avoid redundant
data in tables, to ensure that attribute data in
separate tables can be maintained and updated
separately and can be linked when necessary, and
to facilitate a distributed database.
Normalization slows down data access.
14. Unnormalized table
PIN   Owner   Address               Sale date   Hectares  Zone code  Zoning
P101  Gloria  101 Pampanga St.      01-20-2001  1.2       1          Residential
      Erap    202 San Juan St.
P102  Fidel   303 Pangasinan St.    06-30-1992  1.5       2          Commercial
      Cory    404 Tarlac St.
P103  Ferdie  505 Ilocos Norte St.  06-30-1965  2.1       2          Commercial
P104  Dado    606 Pampanga St.      06-30-1961  0.8       1          Residential
First Normal Form
PIN   Owner   Address               Sale date   Hectares  Zone code  Zoning
P101  Gloria  101 Pampanga St.      01-20-2001  1.2       1          Residential
P101  Erap    202 San Juan St.      01-20-2001  1.2       1          Residential
P102  Fidel   303 Pangasinan St.    06-30-1992  1.5       2          Commercial
P102  Cory    404 Tarlac St.        06-30-1992  1.5       2          Commercial
P103  Ferdie  505 Ilocos Norte St.  06-30-1965  2.1       2          Commercial
P104  Dado    606 Pampanga St.      06-30-1961  0.8       1          Residential
15. Second Normal Form
Normalized Form
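Since the figures for these two slides are not reproduced here, the following Python sketch shows one plausible decomposition of the first-normal-form parcel table. The split into parcel, owner, and zone tables is an assumption for illustration and may differ from the tables in the original figures.

```python
# Rows of the first-normal-form table:
# (PIN, owner, address, sale date, hectares, zone code, zoning)
first_normal_form = [
    ("P101", "Gloria", "101 Pampanga St.", "01-20-2001", 1.2, 1, "Residential"),
    ("P101", "Erap", "202 San Juan St.", "01-20-2001", 1.2, 1, "Residential"),
    ("P102", "Fidel", "303 Pangasinan St.", "06-30-1992", 1.5, 2, "Commercial"),
    ("P102", "Cory", "404 Tarlac St.", "06-30-1992", 1.5, 2, "Commercial"),
    ("P103", "Ferdie", "505 Ilocos Norte St.", "06-30-1965", 2.1, 2, "Commercial"),
    ("P104", "Dado", "606 Pampanga St.", "06-30-1961", 0.8, 1, "Residential"),
]

parcels, owners, zones = {}, [], {}
for pin, owner, address, sale_date, hectares, zone_code, zoning in first_normal_form:
    parcels[pin] = (sale_date, hectares, zone_code)  # one row per parcel
    owners.append((pin, owner, address))             # one row per parcel-owner pair
    zones[zone_code] = zoning                        # one row per zone code

print(len(parcels), len(owners), len(zones))
```

The sale date and area of P101 are now stored once instead of twice, and the zoning label is stored once per zone code. The PIN and zone code columns are the keys that let the small tables be linked again when needed.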
16. Data Storage in a DBMS
Object classes/layers are stored in database tables
Each layer is stored as a single database table in a database
management system
Rows contain objects, while columns contain
attributes/properties of the objects
Basic Database Functions/Operations
Join
Tables are joined together using common row/column values or keys
After joining two or more tables, a new table is created which
contains all the values of the joined tables
Database tables can be joined together to create new relations,
or views of the database.
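A join that produces a new table can be sketched with Python's sqlite3 module. The table names and the choice of zone_code as the common key are assumptions that follow the parcel example in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcel (pin TEXT, hectares REAL, zone_code INTEGER)")
conn.execute("CREATE TABLE zone (zone_code INTEGER, zoning TEXT)")
conn.executemany("INSERT INTO parcel VALUES (?, ?, ?)",
                 [("P101", 1.2, 1), ("P102", 1.5, 2),
                  ("P103", 2.1, 2), ("P104", 0.8, 1)])
conn.executemany("INSERT INTO zone VALUES (?, ?)",
                 [(1, "Residential"), (2, "Commercial")])

# The join matches rows on the common key and stores the result as a new table.
conn.execute("""
    CREATE TABLE parcel_zone AS
    SELECT parcel.pin, parcel.hectares, zone.zoning
    FROM parcel JOIN zone ON parcel.zone_code = zone.zone_code
""")

rows = conn.execute(
    "SELECT pin, hectares, zoning FROM parcel_zone ORDER BY pin").fetchall()
for row in rows:
    print(row)
```

Replacing CREATE TABLE with CREATE VIEW would give a view of the database instead of a stored copy.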
17. Basic Database Functions/Operations
Link
Tables are linked using common row/column values or keys
Unlike in joining, linking tables does not result in a new table. The
original tables are retained, but accessing one enables the user to also
access a table linked to it
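Linking can be sketched in Python as a lookup that follows the common key on demand. The table contents follow the parcel example in these slides; the function name is an assumption.

```python
# Two separate tables; neither is modified and no combined table is built.
parcel = {
    "P101": {"hectares": 1.2, "zone_code": 1},
    "P102": {"hectares": 1.5, "zone_code": 2},
}
zone = {
    1: {"zoning": "Residential"},
    2: {"zoning": "Commercial"},
}

def linked_zone(pin):
    """Follow the shared key from a parcel record to its zone record."""
    return zone[parcel[pin]["zone_code"]]

print(linked_zone("P102")["zoning"])
```

Because the link is resolved at access time, an update to the zone table is immediately visible from every parcel that refers to it.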
Database Design
Involves three stages: conceptual, logical, and physical
Involves six practical steps (see Figure)
18. Stages of Database Design
[Figure: six design steps grouped into three stages]
Conceptual Model: User View, Objects and Relationships, Geographic Representation
Logical Model: Geographic Database Types, Geographic Database Structure
Physical Model: Database Schema
Conceptual Model
Steps involved are:
1. Model the user’s view
Identifying organizational functions, determining data
requirements of these functions, organizing data into groups for
data management
May be presented using a report with tables
19. Conceptual Model
2. Define objects and their relationships
Specification of object types/classes and functions, and their
relationships
May be presented using diagrams
20. Conceptual Model
3. Select geographic representation
Choosing between the types of discrete objects
(point, line, or polygon) or field to represent the
data
Selection has a critical impact on the database use
Although it is possible to switch between
representations later on, it would be
computationally expensive and would lead to
information loss
Logical Model
Steps involved are:
1. Match to geographic database types
Matching of object types to be studied to specific data
types supported by the GIS
2. Organize geographic database structure
Defining topological associations, specifying rules and
relationships, and assigning coordinate systems
21. Physical Model
Step involved is:
Define database schema
Definition of the actual physical database schema that will hold the
data values
Usually created using the DBMS software’s data definition
language (e.g., SQL)
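As an illustration of defining a schema with a data definition language, the sketch below issues a SQL CREATE TABLE statement through Python's sqlite3 module and then reads the schema back. The table and column names are assumptions based on the parcel example; a production GIS database would also define a geometry column through its spatial extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: column names and types are fixed before any data is loaded.
conn.execute("""
    CREATE TABLE parcel (
        pin       TEXT PRIMARY KEY,
        sale_date TEXT,
        hectares  REAL,
        zone_code INTEGER
    )
""")

# PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
columns = [(name, col_type)
           for _, name, col_type, *_ in conn.execute("PRAGMA table_info(parcel)")]
print(columns)
```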
Attribute data entry
Field definition
Attribute data entry
Attribute data verification
Creation of new attribute data
22. Field definition
Definition of (a) field name, (b) data type, (c) data width, and (d)
number of decimal places.
Data type may be (a) numeric (integer or floating-point), (b)
string, (c) Boolean, or (d) date.
Consider measurement scale of data.
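The four parts of a field definition can be captured in a small Python structure. The class, the field name HECTARES, and the chosen width and decimal places are hypothetical and only illustrate how the definition constrains stored values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FieldDefinition:
    name: str
    data_type: str      # "numeric", "string", "boolean", or "date"
    width: int          # total number of characters reserved for the value
    decimals: int = 0   # digits after the decimal point (numeric fields only)

def format_value(field, value):
    """Render a value the way a fixed-width attribute table would store it."""
    if field.data_type == "numeric":
        return f"{value:{field.width}.{field.decimals}f}"
    return str(value)[:field.width].ljust(field.width)

hectares = FieldDefinition("HECTARES", "numeric", width=6, decimals=2)
owner = FieldDefinition("OWNER", "string", width=4)

print(repr(format_value(hectares, 1.5)))
print(repr(format_value(owner, "Gloria")))
```

The second call shows why width matters: a string longer than the defined width is truncated, so the width should be chosen from the longest expected value.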
Attribute data entry
Akin to digitizing for spatial data entry
Attribute data need to be entered by typing
Given: map with 2,000 polygons and 10 fields
Time: at 10 seconds per value, it takes 55 hours, 33 minutes, and 20 seconds (about 2.3 days) to enter 20,000 values
Best to determine if an organization has attribute data in digital format (e.g., NSO)
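The data entry time estimate above can be checked with a few lines of Python. The variable names are illustrative; the inputs are the ones given on the slide.

```python
polygons, fields, seconds_per_value = 2_000, 10, 10

values = polygons * fields                    # number of attribute values to type
total_seconds = values * seconds_per_value    # total typing time in seconds

hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)
days = total_seconds / 86_400                 # continuous 24-hour days

print(values, hours, minutes, seconds, round(days, 1))
```

The 2.3 days assume nonstop typing around the clock; at eight working hours per day the same workload is closer to seven working days.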
23. Attribute data verification
In this step:
Ensure attribute data are properly linked to spatial data
Verify the accuracy of attribute data
May be difficult due to observation errors, out-of-date data,
and data entry errors
To check for errors:
Table may be printed for manual verification
Computer programs may be written to automate task
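A minimal sketch of such a verification program in Python. The IDs, the sample rows, and the rule that an area must be positive are assumptions chosen to show the two checks: linkage to spatial data and plausibility of values.

```python
spatial_ids = {1, 2, 3, 4}          # label IDs present in the spatial data
attribute_rows = [
    {"id": 1, "hectares": 1.2},
    {"id": 2, "hectares": -1.5},    # negative area: likely a data entry error
    {"id": 5, "hectares": 0.8},     # no matching spatial feature
]

def verify(spatial_ids, rows):
    """Report linkage problems and implausible values in the attribute table."""
    attribute_ids = {row["id"] for row in rows}
    return {
        "missing_attributes": sorted(spatial_ids - attribute_ids),
        "orphan_attributes": sorted(attribute_ids - spatial_ids),
        "invalid_hectares": [row["id"] for row in rows if row["hectares"] <= 0],
    }

report = verify(spatial_ids, attribute_rows)
print(report)
```

Automated checks like these catch linkage and range errors; out-of-date or mis-observed values still need comparison against an independent source.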
Creation of new attribute data
Attribute data classification
Example: Elevation
High = {Higher than 600 meters}
Medium = {Between 200 and 600 meters}
Low = {Lower than 200 meters}
Attribute data computation
Example: Soil erosion potential = rainfall parameter × soil parameter ×
topographic parameter × land cover parameter × management parameter
Example: Agricultural harvest = area × potential yield
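Both ways of creating new attribute data can be sketched in Python. The function names and sample inputs are assumptions; the class limits and formulas are the ones on the slide, with the boundary values 200 and 600 meters assigned to Medium.

```python
def classify_elevation(meters):
    """Classification: derive a category field from a numeric field."""
    if meters > 600:
        return "High"
    if meters >= 200:
        return "Medium"
    return "Low"

def agricultural_harvest(area, potential_yield):
    """Computation: harvest = area x potential yield."""
    return area * potential_yield

def soil_erosion_potential(rainfall, soil, topographic, land_cover, management):
    """Computation: product of the five parameters."""
    return rainfall * soil * topographic * land_cover * management

print(classify_elevation(750), classify_elevation(200), classify_elevation(50))
print(agricultural_harvest(2.0, 3.5))
```

In a GIS attribute table, each function would be applied row by row to fill a new field from existing fields.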