ISO 19157 Geographic information - Data quality provides a structure for organizing comprehensive data quality assessment measures. What it doesn't provide is a priority of data quality elements for a specific dataset and jurisdiction. Over the past year, the Colorado Address Data Quality subgroup has developed a prioritized list of data quality measures for addressed locations, in an effort to establish common criteria and a scorecard. These will provide a means to describe the data compiled from multiple jurisdictions with varying origins in an objective manner so users of the data can determine their fitness for use. It also provides feedback for local jurisdictions to increase their level of quality according to their need and discretion.
In addition, the State of Colorado, in coordination with the US Postal Service, the US Census Bureau, and state and local agencies, will begin to provide feedback to local jurisdictions on possible discrepancies in comparison to Master Street Address Guides (MSAGs), the Coding Accuracy Support System (CASS), the Statewide Colorado Voter Registration and Election System (SCORE), the Colorado Motorist Insurance Identification Database (MIDB), and other datasets that contain addresses. These comparisons are particularly helpful in identifying possible omissions but also in confirming and completing georeferenced address data content. This presentation will describe the value of these comparisons and progress in developing and measuring data quality using common criteria and objective measures.
2013 GISCO Track, Quality Assessment and Improvement for Addressed Locations in Colorado by Nathan Lowry
1. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Quality Assessment and Improvement for Addressed Locations in Colorado
GIS in the Rockies
October 9, 2013
Nathan Lowry, Colorado OIT
2. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Abstract:
Quality Assessment and Improvement for Addressed Locations in Colorado
ISO 19157 Geographic information ‐ Data quality provides a structure for organizing
comprehensive data quality assessment measures. What it doesn't provide is a priority of data
quality elements for a specific dataset and jurisdiction. Over the past year, the Colorado
Address Data Quality subgroup has developed a prioritized list of data quality measures for
addressed locations, in an effort to establish common criteria and a scorecard. These will
provide a means to describe the data compiled from multiple jurisdictions with varying origins
in an objective manner so users of the data can determine their fitness for use. It also provides
feedback for local jurisdictions to increase their level of quality according to their need and
discretion.
In addition, the State of Colorado in coordination with the US Postal Service, the US Census
Bureau, and state and local agencies will begin to provide feedback to local jurisdictions on
possible discrepancies in comparison to Master Street Address Guides (MSAGs), the Coding
Accuracy Support System (CASS), the Statewide Colorado Voter Registration and Election
System (SCORE), the Colorado Motorist Insurance Identification Database (MIDB), and other
datasets that contain addresses. These comparisons are particularly helpful in identifying
possible omissions but also in confirming and completing georeferenced address data
content. This presentation will describe the value of these comparisons and progress in
developing and measuring data quality using common criteria and objective measures.
3. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Status of Address Points Received Per County
August 2013
[Map of Colorado counties, symbolized by address point status; county labels omitted.]
Legend:
Sharing: Public, State
Counties with Address Points: Not Developed, In Development, Pending Agreement, Pending Receipt, Received
4. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
What are we trying to map?
Most often: buildings
– Residences and workplaces
Sometimes: building complexes
– High‐rises, apartment complexes, campuses
Accesses to buildings:
– Main entrances, service entrances
– Driveways, access roads
Sometimes other structures:
– Communications, Electrical, Natural Gas, Water, Heating and Cooling, Sanitary Sewer and Storm Drainage utilities, Signage, etc.
Sometimes, land only
– Parcels, park lands, event sites, etc.
Occasionally, a location w/o reference to property
– Traffic incident locations, other abstract locations
5. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Why? Multi‐purpose:
1. Increase accuracy of broadband mapping availability per Census block as portrayed by the Colorado Broadband Data Mapping and Development Program
2. For administrative accuracy
– Identify the taxation of, registration of, services provided to Colorado residences correctly
– Enumerate or estimate the right number of people within a given boundary
• County, Municipal or Service district boundary, intra‐district school population balancing, business service area boundaries, voter precincts, etc.
3. Assist in the notification, evacuation, and recovery of personnel and property from facilities in response to natural and man‐made emergencies
6. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Why now?
Social consciousness and value of georeferenced address locations has significantly increased within the last decade (NSGIC, URISA, NENA, etc.)
NTIA (Broadband), Census (GSS‐I), (NG)9‐1‐1, USPS, and many other communities with significant interests – esp. State of Colorado
Cost is low ‐ Low floor for “getting in the door”…
Value is high ‐ Implementation is cross‐functional
– “It is the ‘key’ for so many other data sets” – Paul Tessar, NCR April meeting
– After crime stats, the number two data download for the City and County of Denver
And frankly, for a majority, we’re already doing it
7. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Common Data Model
Why?
– Allows local and state‐wide querying, analysis, and integration …
– Accommodates information exchanges
• Hierarchical ‐ City to County, County to Region, Region to State
• Among neighboring jurisdictions (e.g. County to County, etc.)
– Allows profiles to provide data in standard forms for specific objectives
• NENA CLDXF for NG‐911
• USPS Pub‐28 for CASS
• ArcGIS Geocoding (for quality comparisons, etc.)
– It’s more efficient (less work) and assures more quality (less loss)
Without a Common Data Model: (x inputs) × (y outputs) = z translations
With a Common Data Model: (x inputs) + (y outputs) = z translations
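The two translation counts on this slide can be compared with a short sketch. The function names and the sample numbers (64 county sources, 3 output profiles) are assumptions chosen for illustration, not figures from the presentation.

```python
def translations_point_to_point(inputs: int, outputs: int) -> int:
    """Every input format needs its own translator to every output format."""
    return inputs * outputs


def translations_with_common_model(inputs: int, outputs: int) -> int:
    """Each input maps once into the common model; each output maps once out of it."""
    return inputs + outputs


if __name__ == "__main__":
    # Hypothetical counts: 64 county sources, 3 output profiles.
    x, y = 64, 3
    print(translations_point_to_point(x, y))
    print(translations_with_common_model(x, y))
```

The gap widens as either side grows, which is the efficiency argument the slide makes for a shared model.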
8. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Common Data Model
What?
– An implementation of the United States Thoroughfare, Landmark, and Postal Address Data Standard (FGDC‐STD‐016‐2011)
– Leans on:
• National Emergency Number Association (NENA) 02‐014 GIS Data Collection and Maintenance, 02‐010 Standard Data Formats for 9‐1‐1 Data Exchange & GIS Mapping
• NENA draft Civic Location Data EXchange Format (CLDXF) and GIS Data Model for Next Generation (NG)‐911
• Census Optimal Address Data Submission Guidelines
9. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
FGDC‐STD‐016‐2011
United States Thoroughfare, Landmark, and Postal Address Data Standard
Of Greatest Significance:
1. Everything* is ‘fully explicit’ (fully spelled‐out)
‒ No abbreviations allowed; No Ambiguity
*The only exception is two‐letter state postal codes (e.g. “CO” = Colorado)
2. You will express exactly how each address will be parsed
‒ Parsing is no longer subject to interpretation
‒ The break‐down is stored in the data for each record
3. Each Address must be assigned a Unique Identifier (UID)
‒ Multiple representations of the same address can be “tied together” if and only if (iff) addresses are assigned UIDs.
These are big changes that few have yet implemented
• Our common data model is designed to accommodate both:
‒ your current state and
‒ this “to be” state
10. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
[Diagram: address data model. Labels recovered from the figure, grouped only where the grouping is evident from the label order:]
Address
AddressPoint: Parcel Centroid, Building Centroid, Main Entrance, Driveway Entrance, Centerline Reference, … Lat/Long
Site
UID, StreetAddress, Landmark, AddressNumber, StreetName, ParcedAddress (Complex), IsMailingAddress (B), HasSubAddress (B), …
AddressRange: PointAddressRange, LineAddressRange, …
SubAddress
MailingAddress (only): Place, PO Box, Zipcode, Etc.
AddressArea: Building Footprint, Unit Area, Parcel …
AddressReferenceSystem
AddressVolume: ExtrudedVolume, BIMVolume
AddressExchange (Input/Output Table)*
Columns: Lat, Long, UID, StreetAddress, Parsed Address
Record 1
Lat: 12345.12
Long: ‐12345.12
UID: 080211239 912593
StreetAddress: 123 Clark Street, Antonito, CO
Parsed Address: 123 | Clark | Street | Antonito | Conejos | CO
Record 2
Lat: 12348.57
Long: ‐12346.28
UID: 080211239 912593
StreetAddress: 123 Clark Street, Antonito, CO
Parsed Address: 123 | Clark | Street | Antonito | Conejos | CO
Record 3
StreetAddress: 456 Jones Ave, Antonito, CO (remaining values not recoverable from the slide)
*In advance of XSD/XML implementation routines as described by FGDC‐STD‐016‐2011 Part 5: Address Data Exchange
12. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Common Data Model
Field Name | Data Type | Length | Description
PlaceID | Long Integer | | Unique identifier assigned to each value in the dPlace domain.
AddressUID | Long Integer | | AddressID as defined in FGDC‐STD‐016‐2011. Uniquely identifiable integer assigned by each AddressAuthority. Used in this data model to help uniquely identify Addresses.
AddressUUID | GUID | | A Universally (aka Globally) Unique Identifier, usually a 16‐byte binary value. AddressID as defined in FGDC‐STD‐016‐2011. Uniquely identifiable GUID assigned by each AddressAuthority. Used in this data model to help uniquely identify Addresses.
LandElementName | Text | 255 | The name of a relatively permanent feature of the ... landscape that has recognizable identity within a particular cultural context. Modified from LandmarkName as defined in FGDC‐STD‐016‐2011.
NumberPrefix | Text | 15 | AddressNumberPrefix as defined in FGDC‐STD‐016‐2011. The portion of the CompleteAddressNumber which precedes the AddressNumber itself.
AddressNumber | Long Integer | | AddressNumber as defined in FGDC‐STD‐016‐2011. The numeric identifier for a land parcel, house, building, or other location along a thoroughfare or within a com
NumberSuffix | Text | 15 | AddressNumberSuffix as defined in FGDC‐STD‐016‐2011. The portion of the CompleteAddressNumber which follows the AddressNumber itself.
StreetPreModifier | Text | 20 | StreetNamePreModifier as defined in FGDC‐STD‐016‐2011. A word or phrase in a CompleteStreetName that precedes and modifies the StreetName, but is separated from it, ... or is placed outside the StreetName ... [to] sort ... [a] list of street names.
StreetPreDirectional | Text | 20 | StreetNamePreDirectional as defined in FGDC‐STD‐016‐2011. A word preceding the StreetName that indicates the direction or position of the thoroughfare relative to an arbitrary starting point or line, or the sector where it is located.
StreetPreType | Text | 35 | StreetNamePreType as defined in FGDC‐STD‐016‐2011. A word or phrase that precedes the StreetName and identifies a type of thoroughfare in a CompleteStreetName.
StreetSeparator | Text | 10 | SeparatorElement as defined in FGDC‐STD‐016‐2011. A ... [prepositional] phrase ... used as a separator between a StreetPreType and a StreetName, as in "Avenue of the Americas".
StreetName | Text | 75 | StreetName as defined in FGDC‐STD‐016‐2011. The portion of the CompleteStreetName that identifies the particular thoroughfare ... .
StreetPostType | Text | 35 | StreetNamePostType as defined in FGDC‐STD‐016‐2011. A word or phrase that follows the StreetName and identifies a type of thoroughfare in a CompleteStreetName.
StreetPostDirectional | Text | 20 | StreetNamePostDirectional as defined in FGDC‐STD‐016‐2011. A word following the StreetName that indicates the direction or position of the thoroughfare relative to an arbitrary starting point or line, or the sector where it is located.
StreetPostModifier | Text | 20 | StreetNamePostModifier as defined in FGDC‐STD‐016‐2011. A word or phrase in a CompleteStreetName that follows and modifies the StreetName, but is separated from it by a StreetNamePostType or a StreetNamePostDirectional or both.
AddressLocDesc | Text | 255 | LocationDescription as defined in FGDC‐STD‐016‐2011. A text description providing more detail on how to identify or find the addressed feature.
PlaceName | Text | 100 | PlaceName as defined in FGDC‐STD‐016‐2011. The name of an area, sector, or development; incorporated municipality ...; county ...; or region within which the address is physically located; or a name
PlaceNameType | Text | 35 | PlaceNameType as defined in FGDC‐STD‐016‐2011. The type of PlaceName used in an Address.
CountyName | Text | 25 | The county or county equivalent where the address is physically located as defined in FGDC‐STD‐016‐2011 and the NENA NG9‐1‐1 US CLDXF Standard. A county (or its equivalent) is the primary legal division of a state or territory.
StateName | Text | 30 | State names in ANSI INCITS 38:2009. The US states and state equivalents: the fifty US states, the District of Columbia, and all U.S. territories and outlying possessions. A state (or equivalent) is "a primary governmental division of the United States."
ZIPCode | Long Integer | | Zone Improvement Plan Code. A system of 5‐digit codes that identifies the individual Post Office or metropolitan area delivery station associated with an address. See USPS, "Quick Service Guide 800: Glossary of Postal Terms and Abbreviations in the DMM."
ZIPCodePlusFour | Short Integer | | ZipPlus4 in FGDC‐STD‐016‐2011. A 4‐digit extension of the 5‐digit Zip Code (preceded by a hyphen) that, in conjunction with the Zip Code, identifies a specific range of USPS delivery addresses. Adapted from USPS, "Quick Service Guide 800: Glossary ... ."
Country | Text | 50 | CountryName as defined in FGDC‐STD‐016‐2011. The name of the country in which the address is located. A country is "an independent, self‐governing, political entity." See ISO 3166‐1.
ParcelID | Text | 20 | Foreign Key. Parcel Identifier as defined in FGDC‐STD‐016‐2011. The primary permanent identifier ... for a parcel that includes the land or feature identified by an address.
TransSegID | Text | 30 | Foreign Key. AddressTransportationFeatureID as defined in FGDC‐STD‐016‐2011. The unique identifier assigned to the particular feature that represents an address within a transportation base model.
Feature | Text | 30 | The physical feature of the landscape that is being represented by this geometry.
Type | Text | 35 | A description of the placement of this geometry on the landscape to represent the physical feature.
LBSCStructCode | Short Integer | | Land Based Classification Standards (LBCS) Structure Code for the Address as defined by the American Planning Association (APA). See http://www.planning.org/lbcs/standards/
NAICSCode | Long Integer | | North American Industry Classification System (NAICS) code for the Address (if it is a business) as defined by the Office of Management and Budget (OMB). See http://www.census.gov/eos/www/naics/index.html
Longitude | Double | | AddressLongitude as defined in FGDC‐STD‐016‐2011. The longitude of the address location, in decimal degrees [using the North American Datum of 1983 (NAD83)].
Latitude | Double | | AddressLatitude as defined in FGDC‐STD‐016‐2011. The latitude of the address location, in decimal degrees [using the North American Datum of 1983 (NAD83)].
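As an illustration of how the text lengths in this field list might be checked before loading, here is a minimal sketch. The dictionary holds a subset of the lengths listed above; the function name and the record layout (a plain dict keyed by field name) are assumptions, not part of the presented data model.

```python
# Maximum text lengths for a subset of the fields in the Common Data Model.
TEXT_LENGTHS = {
    "StreetPreDirectional": 20,
    "StreetPreType": 35,
    "StreetSeparator": 10,
    "StreetName": 75,
    "StreetPostType": 35,
    "StreetPostDirectional": 20,
    "PlaceName": 100,
    "CountyName": 25,
    "StateName": 30,
}


def length_violations(record: dict) -> list:
    """Return (field, actual_length, max_length) for each over-long text value."""
    problems = []
    for field, max_len in TEXT_LENGTHS.items():
        value = record.get(field)
        if value is not None and len(str(value)) > max_len:
            problems.append((field, len(str(value)), max_len))
    return problems
```

A record that passes returns an empty list; anything else names the field that would be truncated on load.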
13. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Common Data Model
Quantity over Quality, Speed over Structure
Only one dataset per area processed (eg. Regional)
Retains 60-80% of the attribute data from sources
Converted many values to target data types
Address numbers and number suffixes separated
Domain values not (yet) used or enforced
Multiple representations of fields for common records are not
represented (only first instance)
Not all fields that can be are populated
Some source fields may have been misinterpreted
Some source fields not identified to any target field
Cases (Mixed case instead of UPPER) not standardized
Land use domain values not (yet) translated
Documentation incomplete
We’ve worked further, but the work is not yet complete
14. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Publication
1. Address Data received at OIT – Dec 2012
2. Conversion and Loading at OIT – Feb 2013
• Conversion to Interim Data Model
• Loaded into state‐wide database
3. Publication as Interim Services – Apr‐Aug 2013
(Access Controlled to State Agencies)
• ArcGIS Server Services:
‐ Mapping, Feature Access, Geodata
‐ SQL Server, ArcSDE
• OGC Services:
‐ WFS (Mapping and Geodata), WMS, KML Network Links
4. Publication to Internet ‐ (Publicly Accessible Data only)
• Conversion to Common Data Model in state‐wide database
• Mapping Services, data.colorado.gov, etc.
5. Address Locator, CASS, etc.
15. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Source
Source Field Name
Source Data Type
Source Type
Target
Target Field Name
Target Data Type
Target Type
Target Field Calculation
CusterAddressPoints.shp
Shape
Geometry
Point Shapefile
AddressPoint
Shape
LGID
APSAID
Longitude
Lattitude
IsPrincipal
Feature
Type
LGID
AddressSAID
FullAddress
Geometry
Long Integer
Long Integer
Double (15*,9)
Double (15*,9)
Text(10)
Text(30)
Text(35)
Long Integer
Long Integer
Text (255)
Point Feature Class
=
=14001
Assign value (increment)
Y Coordinate of Point; GCS: NAD 83
Y Coordinate of Point; GCS: NAD 83
X Coordinate of Point; GCS: NAD 83
="Yes"
Object Table
=14001
Assign value (increment)
=[LandmarkElement.LandmarkElementName]&" "&[StreetName.ExpressStAdd]
ParcelID
LGID
AddressSAID
SubAddressSAID
APSAID
AddressJoinID
LGID
AddressSAID
LandmarkID
LandmarkID
LandElementID
LandElementName
LandEleSequence
LGID
AddressSAID
StreetAddID
Text (20)
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Text (255)
Byte
Long Integer
Long Integer
Long Integer
Long Integer
ExpressStAdd
IsPrincipal
OfficialStatus
AddressAuthority
IsAnomaly
IsMailableAddress
StreetAddID
StreetNameID
NumberAddID
AddressNumber
NumberSuffix
NumberAddID
StreetNameID
ExpressedStreetName
LGID
AddressSAID
StreetAddID
LandmarkID
LastLineID
CountyName
StateName
Country
LastLineID
LastLineEleID
PlaceName
PlaceNameType
IsPrincipal
LastLineEleSequence
LGID
AddressSAID
StreetAddID
LandmarkID
SubAddressSAID
LocationDescription
LGID
AddressSAID
SubAddressSAID
SubAddressEleID
SubAddressType
SubAddressIdentifier
SubAddEleSequence
Text (255)
Text (10)
Text (100)
Text (50)
Text (10)
Text (10)
Long Integer
Long Integer
Long Integer
Long Integer
Text (15)
Long Integer
Long Integer
Text(255)
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Text (25)
Text (30)
Text (50)
Long Integer
Long Integer
Text (100)
Text (10)
Text (10)
Byte
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Text (255)
Long Integer
Long Integer
Long Integer
Long Integer
Text (25)
Text (25)
Byte
Address
Schedule
Text (10)
AddressesMayHaveAddressPoints
Landmark
LandmarkElement
BusinessNm
Text (50)
StreetAddress
FullAddr; FullAddr2
Text (100), Text (100)
NumberedAddress
Address
Address
Text (8)
Text (8)
Roadname; Route
Text (65), Text (50)
StreetName
LastLine
LastLineElement
Roadname
SubAddress
SubAddressElement
Relationship Class
Object Table
Object Table
Object Table
Object Table
Object Table
Object Table
Object Table
Object Table
Object Table
=[Schedule]
=14001
=[Address.AddressSAID]
=[SubAddress.SubAddressSAID]
=[AddressPoint.APSAID]
Assign value (increment)
=14001
=[Address.AddressSAID]
Assign value (increment)
=[Landmark.LandmarkID]
Assign value (increment)
[BusinessNm]
=1
=14001
=[Address.AddressSAID]
Assign value (increment)
=[FullAddr]
="Yes" when "FullAddr"; "No" when "FullAddr2"
="Unknown"
"Custer County Planning and Zoning"
="Unknown"
="Unknown"
=[StreetAddress.StreetAddID]
=[StreetName.StreetNameID]
Assign value (increment)
=[Address]
Right([CusterAddressPoints.Address],InStr([CusterAddressPoints.Address],“ “))
=[NumberedAddress.NumberAddID]
Assign value (increment)
=[Roadname]; = [Route]
=14001
=[Address.AddressSAID]
=[StreetAddress.StreetAddID]
=[Landmark.LandmarkID]
Assign value (increment)
"Custer"
"Colorado"
"United States"
=[LastLine.LastLineID]
Assign value (increment)
IS NULL; = “Westcliffe” ; = “Silver Cliff”
IS NULL; ="Incorporated Municipality"
= "Yes"
=1
=14001
=[Address.AddressSAID]
=[StreetAddress.StreetAddID]
=[Landmark.LandmarkID]
Assign value (increment)
Null
=14001
=[Address.AddressSAID]
=[SubAddress.SubAddressSAID]
Assign value (increment)
="Apartment"
=[Apartment]
=1
16. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Colorado State Address Dataset ‐ Common Data Model Crosswalk
Source
Source Field Name
AddressPoints122112.shp
Source Data Type
Source Type
Target
AddressPoint
Address
PARCEL_NUM
SCHEDULE_N
AddressJoinAddressPoints
StreetAddress
Address
NumberedAddress
STREETNO
STREETNO
StreetName
STREETDIR
STREETNAME
STREETSUF
LastLine
LastLineElement
LOCCITY
SubAddress
SubAddressElement
STREETALP
URL
Sequence
Target Field Name
PlaceID
Target Data Type
Long Integer
APSAID
Longitude
Lattitude
MetadataID
PlaceID
AddressSAID
FullAddress
ParcelID
Long Integer
Double (15*,9)
Double (15*,9)
Long Integer
Long Integer
Long Integer
Text (255)
Text (20)
MetadataID
PlaceID
AddressSAID
SubAddressSAID
APSAID
AddressJoinID
PlaceID
AddressSAID
StreetAddID
ExpressStAdd
StreetAddID
StreetNameID
NumberAddID
AddressNumber
NumberSuffix
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Text (255)
Long Integer
Long Integer
Long Integer
Long Integer
Text (15)
NumberAddID
StreetNameID
StreetPreDirectional
StreetName
StreetPostType
PlaceID
AddressSAID
StreetAddID
LastLineID
CountyName
StateName
Country
LastLineID
LastLineEleID
PlaceName
IsPrincipal
LastLineEleSequence
PlaceID
AddressSAID
StreetAddID
LandmarkID
MailingID
SubAddressSAID
PlaceID
AddressSAID
SubAddressSAID
SubAddressEleID
SubAddressType
SubAddressIdentifier
SubAddEleSequence
Long Integer
Long Integer
Text (20)
Text (75)
Text (35)
Text (35)
Long Integer
Long Integer
Long Integer
Long Integer
Text (25)
Text (30)
Text (50)
Long Integer
Long Integer
Text (100)
Text (10)
Byte
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Long Integer
Text (25)
Text (25)
Byte
Target Type
Object Table
Object Table
Target Field Calculation
=20
Assign value (increment)
X Coordinate of Point; GCS: NAD 83
Y Coordinate of Point; GCS: NAD 83
Null (for now)
=20
Assign value (increment)
=[PARCEL_NUM]
Object Table
Object Table
Object Table
Object Table
Object Table
Object Table
Object Table
Object Table
Null (for now)
=20
=[Address.AddressSAID]
=[SubAddress.SubAddressSAID]
=[AddressPoint.APSAID]
Assign value (increment)
=20
=[Address.AddressSAID]
Assign value (increment)
=[Address]
=[StreetAddress.StreetAddID]
=[StreetName.StreetNameID]
Assign value (increment)
=[STREETNO]
Right([STREETNO],InStr([STREETNO],“ “))
=[NumberedAddress.NumberAddID]
Assign value (increment)
=[STREETDIR]
=[STREETNAME]
=[STREETSUF]
=20
=[Address.AddressSAID]
=[StreetAddress.StreetAddID]
Assign value (increment)
"Eagle"
"Colorado"
"United States"
=[LastLine.LastLineID]
Assign value (increment)
=[LOCCITY]
= "Yes"
=1
=20
=[Address.AddressSAID]
=[StreetAddress.StreetAddID]
=[Landmark.LandmarkID]
=[MailAddress.MailingID]
Assign value (increment)
=20
=[Address.AddressSAID]
=[SubAddress.SubAddressSAID]
Assign value (increment)
"Unit"
=[STREETALP]
=1
17. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Custer County
AddressNumber and AddressSuffix
If IsNumeric([CusterAddressPoints.Address]) = 0 Then
    If [CusterAddressPoints.Address] = "" Or [CusterAddressPoints.Address] = "???" Then
        [NumberedAddress.AddressNumber] IS NULL
    ElseIf InStr([CusterAddressPoints.Address], " ") = 0 Then
        ' No space in the value: split number and suffix at the hyphen
        [NumberedAddress.AddressNumber] = CLng(Left([CusterAddressPoints.Address], InStr([CusterAddressPoints.Address], "-") - 1))
        [NumberedAddress.NumberSuffix] = Mid([CusterAddressPoints.Address], InStr([CusterAddressPoints.Address], "-") + 1)
    Else
        ' Split number and suffix at the first space
        [NumberedAddress.AddressNumber] = CLng(Left([CusterAddressPoints.Address], InStr([CusterAddressPoints.Address], " ") - 1))
        [NumberedAddress.NumberSuffix] = Mid([CusterAddressPoints.Address], InStr([CusterAddressPoints.Address], " ") + 1)
    End If
Else
    [NumberedAddress.AddressNumber] = [CusterAddressPoints.Address]
End If
StreetName and PlaceName
If InStr([CusterAddressPoints.Roadname], "Westcliffe") > 0 Then
    ' Street name is the text that precedes the place name
    [StreetName.StreetName] = Trim(Left([CusterAddressPoints.Roadname], InStr([CusterAddressPoints.Roadname], "Westcliffe") - 1))
    [LastLineElement.PlaceName] = "Westcliffe"
ElseIf InStr([CusterAddressPoints.Roadname], "Silver Cliff") > 0 Then
    [StreetName.StreetName] = Trim(Left([CusterAddressPoints.Roadname], InStr([CusterAddressPoints.Roadname], "Silver Cliff") - 1))
    [LastLineElement.PlaceName] = "Silver Cliff"
Else
    [StreetName.StreetName] = [CusterAddressPoints.Roadname]
End If
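Since the conversion is said elsewhere in the deck to be scripted in SQL, VBScript, or Python, the AddressNumber and NumberSuffix rule above could be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the script used by OIT: the function name is hypothetical, and returning `(None, None)` for values that do not start with digits is a design choice made here so that such records can be reviewed manually.

```python
import re
from typing import Optional, Tuple

# Leading digits, then an optional suffix separated by one space or hyphen.
_ADDRESS_RE = re.compile(r"(\d+)(?:[ -](.+))?")


def split_address_number(raw: Optional[str]) -> Tuple[Optional[int], Optional[str]]:
    """Split a source address value into (AddressNumber, NumberSuffix)."""
    text = (raw or "").strip()
    if text in ("", "???"):
        return None, None  # blank or placeholder: address number unknown
    match = _ADDRESS_RE.fullmatch(text)
    if match is None:
        return None, None  # does not start with digits: leave for manual review
    number, suffix = match.groups()
    return int(number), suffix
```

For example, `split_address_number("123 A")` yields the integer 123 and the suffix "A", while a purely numeric value yields the number with no suffix.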
18. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
“Errata”
Archuleta
– Lats and Longs
– Side from Boolean to Text (TranSegSide domain)
Bent
– Address Numbers are text (not numeric)
Chaffee
– Address Numbers are text (not numeric)
– ZIPCodes are text (not numeric)
Custer
– Address Number Suffixes must be separated from Address Numbers
Eagle
– Address Number Suffixes must be separated from Address Numbers
El Paso ‐ Teller
– ZIPCodes are text (not numeric)
– ZIPCode+4’s are text (not numeric)
Fremont
– Address Numbers are text (not numeric)
Garfield
– Address Numbers are text (not numeric)
– ZIPCodes are text (not numeric)
– Some Intersection Addresses
Grand
– Address Numbers are text (not numeric)
– ZIPCodes are text (not numeric)
– Some Range Addresses
Huerfano
– Single attribute for expressed street address
Kit Carson
– Address Numbers are text (not numeric)
– Some address numbers appear to be street names (e.g. 1st 1st Street)
– ZIPCodes are text (not numeric)
– Two Range Addresses
La Plata
– AddressUIDs are text (not numeric)
– Address Numbers are text (not numeric)
– ZIPCodes are text (not numeric)
– Complex LocationDescription [TWNRNG_TXT, SECT_TXT, ALQPARTS, BLOCK, LOT, LANDTRACT]
Logan
– Address Numbers are double (not integer)
– ZIPCodes are text (not numeric)
– A dozen or so Address Number Suffixes must be separated from Address Numbers
– One Range Address
Mesa
– Address Numbers are text (not numeric)
– ZIPCodes are text (not numeric)
– 57 ZIPCodes have values of “816XX” or “815XX”
Moffat
– Address Numbers are text (not numeric)
– ZIPCodes are text (not numeric)
– Several Range Addresses
– One Address Number Suffix must be separated from its Address Number
Montezuma
– Address Numbers are text (not numeric)
– ZIPCodes are text (not numeric)
– One Address Number Suffix must be separated from its Address Number
Park
– AddressUIDs are text (not numeric)
– AddressUIDs are not unique (many unpopulated)
– AddressUIDs are text and larger than an integer field (must be 17 characters long)
– Address Numbers are double (not integer)
– ZIPCodes are text (not numeric)
– Many SubAddress Identifiers must be separated from Address Numbers
– Many Range Addresses
– Many ZipCode values are ‘ ‘
Pitkin
– Address Numbers are double (not integer)
– ZIPCodes are text (not numeric)
– About 6 Address Number Suffixes must be separated from Address Numbers
– GlobalID is text (not GUID)
Pueblo
– AddressUIDs are text (not numeric) and not unique
Routt
– Community Names
– One record with SubAddress length over 25 characters
– LandmarkName, FeatureType, and Location all contain a mix of functional descriptions and landmark names
• FeatureType should be functional values
• LandmarkName should be proper names of facilities
• Location should be necessary directions to arrive at the address correctly and successfully.
San Luis Valley
– Address Numbers are text (not integer)
Summit
– Address Numbers are text (not integer)
– Address Number Suffixes must be separated from Address Numbers
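Several of the items above concern ZIP Codes stored as text, including placeholder values such as “816XX” and blank strings. A minimal sketch of a conversion check follows; the function name is hypothetical, and the choice to map anything that is not exactly five digits to `None` is an assumption made for illustration. Storing ZIP Codes as integers drops leading zeros in general, which does not affect Colorado because its ZIP Codes begin with 8.

```python
import re
from typing import Optional


def zip_to_int(value: Optional[str]) -> Optional[int]:
    """Convert a text ZIP Code to an integer, or None if it is not five digits."""
    text = (value or "").strip()
    if re.fullmatch(r"\d{5}", text):
        return int(text)
    return None  # placeholder, blank, or malformed value: flag for feedback
```

Counting the `None` results per source dataset would give one simple, objective measure to report back to each jurisdiction.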
19. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Land Use types to LBCS Structure Codes
Bldg_Type | Frequency | LBCSStructCode | LBCSStructDesc
RES PARTIAL EXEMPT | 3 | 1000 | Residential buildings
SINGLE FAMILY RESIDENTIAL | 2 | 1100 | Single‐family buildings
SINGLE FAMILY | 1 | 1100 | Single‐family buildings
DETACHED SINGLE FAMILY | 125848 | 1110 | Detached units
LOW‐RISE CONDOMINUM | 15858 | 1110 | Detached units
DETACHED CONDOMINUM | 143 | 1110 | Detached units
MIXED USE RESIDENTIAL W/COMMERCIAL OR INDUST | 128 | 1110 | Detached units
RES POLITICAL SUB EXEMPT | 79 | 1110 | Detached units
LOW‐RISE CONDOMINIUM | 56 | 1110 | Detached units
RES RELIGIOUS EXEMPT | 44 | 1110 | Detached units
RES CHARITABLE EXEMPT | 29 | 1110 | Detached units
RES STATE EXEMPT | 25 | 1110 | Detached units
DETACTED SINGLE FAMILY | 12 | 1110 | Detached units
DETACHED SINGLE FAMILY Mail dlvry at 1783 pe | 1 | 1110 | Detached units
MIXED USE ‐ RESIDENTIAL / RELIGIOUS EXEMPT | 1 | 1110 | Detached units
MIXED USE RESIDENTIAL W/ COMMERCIAL OR INDUS | 1 | 1110 | Detached units
RES COUNTY EXEMPT | 1 | 1110 | Detached units
RESIDENTIAL‐AGRICULTURAL | 1 | 1110 | Detached units
ATTACHED TOWNHOUSE | 19400 | 1120 | Attached units
ATTACHED CONDOMINUM | 7201 | 1120 | Attached units
NON‐CONFORMING RESIDENCE‐SINGLE FAMILY | 122 | 1120 | Attached units
CONDOMINIUM | 65 | 1120 | Attached units
LOW‐RISE APARTMENTS | 22 | 1120 | Attached units
ATTACHED CONDOMINIUM | 14 | 1120 | Attached units
ATTACHED RESIDENTIAL | 4 | 1120 | Attached units
DUPLEX TWO FAMILY RESIDENTIAL | 3 | 1121 | Duplex structures
MID‐RISE CONDOMINUM | 2988 | 1140 | Townhouses
HIGH‐RISE CONDOMINUM | 384 | 1140 | Townhouses
MOBILE HOME | 2063 | 1150 | Manufactured housing
MOBILE HOME PARK | 42 | 1150 | Manufactured housing
MOBILE HOME SITE‐UNIMPROVED | 4 | 1150 | Manufactured housing
MOBILE HOME SITE | 2 | 1150 | Manufactured housing
MOBILE HOME (STRUCTURE ONLY) | 1 | 1150 | Manufactured housing
2 & 3 FAMILY UNIT | 1836 | 1200 | Multifamily structures
APARTMENTS | 1544 | 1200 | Multifamily structures
APARTMENT CONVERSION | 997 | 1200 | Multifamily structures
4‐8 UNIT APARTMENTS | 612 | 1200 | Multifamily structures
MULTI‐UNITS(9 AND UP) | 42 | 1200 | Multifamily structures
CONDOMINUM CONVERSION | 12 | 1200 | Multifamily structures
2‐4 FAMILY RESIDENTIAL | 1 | 1200 | Multifamily structures
APARTMENT | 1 | 1200 | Multifamily structures
MILITARY HOUSING | 149 | 1310 | Barracks
RESIDENTIAL DORMATORIES/NURSING HOMES | 57 | 1320 | Dormitories
RESIDENTIAL SF ASSISTED LIVING | 44 | 1320 | Dormitories
RESIDENTIAL DORMNATORIES/NURSING HOMES | 6 | 1320 | Dormitories
RES PRIVATE SCHOOL EXEMPT | 3 | 1320 | Dormitories
RESIDENTIAL DORMATORIES/ NURSING HOMES | 1 | 1320 | Dormitories
HOTEL‐MOTELS | 75 | 1330 | Hotels, motels, and tourist courts
20. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Automated Processing
1. Means of Transfer
• data.colorado.gov – (Socrata w/Mondara)
• OpenColorado.org (a CKAN implementation)
• (SSH protocol by October 2013)
– For those who specifically must control access to data
2. Conversion to Common Data Model
– Correlate local address datasets to Common Data Model
– Conversion via scripting (SQL, VBScript, or Python)
3. Loading of data into state‐wide database
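The conversion step could look roughly like the sketch below, in Python (one of the scripting options listed). The local column names, the common-model field names, and the function name are all hypothetical; the slides do not show the actual Common Data Model or any conversion script.

```python
# Hypothetical mapping from one jurisdiction's column names to
# common-model field names. None of these names come from the slides.
FIELD_MAP = {
    "HOUSE_NO": "AddressNumber",
    "ST_NAME": "StreetName",
    "ST_TYPE": "StreetNamePostType",
    "ZIP": "ZipCode",
}


def to_common_model(record, field_map=FIELD_MAP):
    """Return a new dict keyed by common-model field names.

    Source fields missing from the record become None; source fields
    that are not in the mapping are dropped.
    """
    return {target: record.get(source) for source, target in field_map.items()}


local_row = {"HOUSE_NO": "601", "ST_NAME": "COLFAX", "ST_TYPE": "AVE", "PARCEL": "X1"}
common_row = to_common_model(local_row)
print(common_row)
```

Keeping one mapping per jurisdiction means the loading step only ever sees common-model field names.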
21. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality
Two Tracks:
1. Develop criteria and measure quality
• Develop quality measures in relation to ISO standards
• Draw from measures in standards and practice
2. Compare for potential corrective actions
• Master Street Address Guide (MSAG) and ALI
• US Postal Service Address Quality Improvement DBs
• Statewide Voter Registration System (SCORE)
• Motorist Insurance Identification Database (MIIDB)
Present criteria and comparisons by 4Q CY 2013
Address Working Group meeting
23. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality ‐ Status
Reviewed ISO Geographic information data quality elements
Developed quality measurement concepts related to ISO data quality
– Added database integrity concepts (e.g. database normalization, referential
integrity, etc.) not well addressed in geographic quality standards to
data quality spreadsheet
Itemized:
– Tests in Chapter 4 Address Data Quality, FGDC‐STD‐016‐2011US…Address Data
Standard
– Requirements for data quality from NENA 02‐014 GIS Data Collection and
Maintenance and NENA 71‐501 Synchronizing GIS Databases with MSAG and
ALI.
Compared datasets to identify completeness and currency, including:
– Colorado State Address Dataset vs. Master Street Address Guide (MSAG)
– Colorado State Address Dataset vs. USPS Coding Accuracy Support System
(CASS)
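A comparison of address points against MSAG ranges might be sketched as follows. This is an illustration only: the tuple layouts and function names are assumptions, street names are assumed to be standardized already, and the slides do not describe how the State actually ran the comparison.

```python
def uncovered_addresses(addresses, msag_ranges):
    """Return (number, street) pairs that fall inside no MSAG range.

    addresses:   list of (address_number, street_name)
    msag_ranges: list of (street_name, low, high)
    """
    missing = []
    for number, street in addresses:
        covered = any(
            street == rng_street and low <= number <= high
            for rng_street, low, high in msag_ranges
        )
        if not covered:
            missing.append((number, street))
    return missing


def empty_ranges(addresses, msag_ranges):
    """Return MSAG ranges that contain no address point at all."""
    return [
        (rng_street, low, high)
        for rng_street, low, high in msag_ranges
        if not any(
            street == rng_street and low <= number <= high
            for number, street in addresses
        )
    ]


addresses = [(100, "MAIN ST"), (250, "MAIN ST"), (12, "ELM ST")]
msag = [("MAIN ST", 100, 198), ("OAK AVE", 1, 99)]
print(uncovered_addresses(addresses, msag))  # candidates for MSAG review
print(empty_ranges(addresses, msag))         # candidates for address omissions
```

Either list is only a set of candidates for review: a range with no address points may simply be a road without addressed structures.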
24. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality Measurement Concepts
QualityElement
completeness
QualitySubelement
commission
omission
logical consistency
MeasurementConcept
ComparisonwithImagery(recent)
Duplicate features
PermittingandLicensing
DateofAddition
ParcelIDRelationship
CorrelationwithMSAG/ALI
FieldVerification
CorrelationwithMSAG/ALI
FieldVerification
CheckotherDatabasesinProcess
CoordinationwithinProcesses
KnowledgefromOthersespExperts
ComparisonwithBuildingFootprints
ConsistencywithPolicyonAddressAssignment
conceptual consistency
Database Normalization
Entity Integrity
Referential integrity
Domain integrity
User‐defined integrity constraints
domain consistency
format consistency
topological consistency
Definition
Notes
Building Permits
Business Licenses
If ParcelID is unknown‐ it is manually placed
Sampling
Sampling
InsertionwithinAddressRelatedProcesses
Parcel vs. Address based software
Prior to 2003 in Loveland ‐ inconsistency
Third normal form (3NF), and most often
Sequence
Many‐to‐one
Every non‐key attribute field must provid
1
Every table must have a primary key
Every field value in a table must exist as a value in another field in the database. Speci
Every element from a relation should respectthe range of values that the element can
Consistency (inconsistency) with assignment e.g. no zipcodes for addresses that don't h
AdditionstoDomains
domain list doesn't exist
domain value that doesn't exist
not correct domain
duplicate domain values
commission
missing domain values
omission
field datatype
e.g. only numeric values in a text field
like address numbers being numeric
field length
field precision/scale
e.g. lat/long values without sufficient significant digits
order of fields
Logical/PhysicalDataModelComparison
Fields in the right FeatureClass
FeatureClass in the right FeatureDataset
duplicate fields
commission
must spec
missing fields
omission
in referen respectivespecific
want‐to‐have missing fields
sensitive fields (not for public consumption) should be related but not contained in the
Compliance to industry standards (FGDC, NENA, etc.)
e.g. pre‐parsed addresses as per standard(s) e.g. no abbreviations, etc.
Comparison of expressed complex elements with composed complex elements
AddressNumberFishbonesMeasure
LeftRightParity
Esp. if generated from geometry. If not, i
Sequence of address assignment
The identification of parity inconsistency
Extent of address ranges
Inherently topological (with centerline rangeEsp. if gen new address without range upd
MSAG Ranges must be equal to or fall within centerline ranges
ALI
positional accuracy
absolute accuracy
relative accuracy
Nathan to write a white‐paper on sampling and positional accuracy (and completeness) measurement
gridded data position accuracy
temporal accuracy
accuracy of a time measurement
create date
inherited date
transaction date
date last updated
active
matched/unmatched
inactive
proposed/retired
"forensic" addressin effective date
rollback?
parent‐child temporal records
temporal consistency
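The database integrity concepts listed under conceptual consistency (entity, referential, and domain integrity) can be illustrated with a small sketch. The field names, the sample rows, and the status domain are hypothetical; in practice these rules would usually be enforced by database constraints rather than checked in a script.

```python
from collections import Counter


def entity_integrity_violations(rows, key):
    """Primary-key check: return key values that are missing or repeated."""
    counts = Counter(row.get(key) for row in rows)
    return {value for value, n in counts.items() if value is None or n > 1}


def referential_integrity_violations(rows, foreign_key, parent_keys):
    """Return rows whose foreign-key value has no match in the parent table."""
    return [row for row in rows if row.get(foreign_key) not in parent_keys]


def domain_violations(rows, field, domain):
    """Return rows whose field value is outside the allowed domain."""
    return [row for row in rows if row.get(field) not in domain]


# Hypothetical sample data, for illustration only.
addresses = [
    {"AddressID": 1, "ParcelID": "P1", "Status": "active"},
    {"AddressID": 2, "ParcelID": "P9", "Status": "active"},
    {"AddressID": 2, "ParcelID": "P2", "Status": "pending"},
]
parcel_ids = {"P1", "P2"}
status_domain = {"active", "inactive", "proposed", "retired"}

print(entity_integrity_violations(addresses, "AddressID"))
print(referential_integrity_violations(addresses, "ParcelID", parcel_ids))
print(domain_violations(addresses, "Status", status_domain))
```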
25. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Potential Tests:
FGDC-STD-016-2011US…Address Data Standard
QualityElement
completeness
completeness
completeness
completeness
completeness
completeness
completeness
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
logical consistency
positional accuracy
temporal accuracy
temporal accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
QualitySubelement
commission
commission
omission
omission
omission
omission
omission
conceptual consistency
conceptual consistency
conceptual consistency
conceptual consistency
conceptual consistency
conceptual consistency
conceptual consistency
format consistency
format consistency (relationship)
format consistency (relationship)
format consistency (relationship)
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
topological consistency
domain consistency
relative accuracy
temporal consistency
temporal consistency
non‐quantitative attribute correctness
non‐quantitative attribute correctness
non‐quantitative attribute correctness
quantitative attribute accuracy
quantitative attribute accuracy
quantitative attribute accuracy
quantitative attribute accuracy
quantitative attribute accuracy
Reference
4.5.27
4.5.35
4.5.1
4.5.29
4.5.7
4.5.37
4.5.13
4.5.6
4.5.8
4.5.15
4.5.20
4.5.22
4.5.23
4.5.24
4.5.14
4.5.12
4.5.25
4.5.33
4.5.5
4.5.5.1
4.5.5.2
4.5.5.3
4.5.5.4
4.5.5.5
4.5.9
4.5.10
4.5.11
4.5.16
4.5.19
4.5.30
4.5.31
4.5.34
4.5.2
4.5.4
4.5.32
4.5.21
4.5.3
4.5.28
4.5.17
4.5.26
4.5.36
4.5.38
4.5.18
MeasureName
MeasureDescription
RelatedElementUniquenessMeasure
This measure checks the uniqueness of the values related to a given element, in either the
UniquenessMeasure
This measure tests the uniqueness of a simple or complex value.
AddressCompletenessMeasure
This measure compares the number of addressable objects with the address information
RelatedNotNullMeasure
This measure checks the completeness of data related to another part of the address.
AddressNumberRangeCompletenessMeasure
Check for a low and high value in each Two Number Address Range or Four Number
XYCoordinateCompletenessMeasure
This measure checks for coordinate pairs with one member missing. The query produces
CompleteElementSequenceNumberMeasure
This measure requires assembling a complex element in order by Element Sequence
AddressNumberParityMeasure
Test agreement of the odd/even status of the numeric value of an address number with the
AddressNumberRangeParityConsistencyMeasure
Test agreement of the odd/even status of the numeric value of low and high address
DeliveryAddressTypeSubaddressMeasure
This measure checks for null Complete Subaddress values where the Delivery Address
LeftRightOddEvenParityMeasure
This measure tests the association of odd and even values in each Two Number Address
LowHighAddressSequenceMeasure
This measure confirms that the value of the low address is less than or equal to the high
OfficialStatusAddressAuthorityConsistencyMeasure
This measure tests logical agreement of the Official Status with the Address Authority.
OverlappingRangesMeasure
This measure checks the sequence of numbers where one non‐zero Two Number Address
DataTypeMeasure
This measure uses pattern matching to test for data types. It is common for delimited text
CheckAttachedPairsMeasure
This measure describes how to check Attached Element attributes set to "attached" for
PatternSequenceMeasure
This measure tests the sequence of values in each complex element for conformance to
SubaddressComponentOrderMeasure
This measure tests Subaddress Elements against the component parts in the order
AddressNumberFishbonesMeasure
This measure generates lines between addressed locations and the corresponding locations
Addresses without fishbones
This may show an address with a Complete Street Name value that doesn't match anything
Addresses with fishbones that touch other fishbones
Address Number values may have been assigned out of order. Another possibility,
Addresses with fishbones that cross centerlines
There may be inconsistencies in the Complete Street Name values recorded in the
Addresses with long fishbones
These may indicate variations in street names that need to be resolved, especially when a
Addresses with suspected bowtie fishbones
These frequently indicate address ranges that inappropriately begin with zero (0).
AddressRangeDirectionalityMeasure*
This measure derives Address Range Directionality values, allowing update to and/or
AddressReferenceSystemAxesPointOfBeginningMeasure
This measure checks for a common point to describe the intersection of the Address
AddressReferenceSystemRulesMeasure
Address Reference System layers are essential for both address assignment and quality
DuplicateStreetNameMeasure
In many Address Reference Systems distantly disconnected street segments with the
IntersectionValidityMeasure
Check intersection addresses for streets that do not intersect in geometry.
SegmentDirectionalityConsistencyMeasure
Check consistency of street segment directionality, which affects the use of Two Number
SpatialDomainMeasure
This measure tests values of some simple elements constrained by domains based on
TabularDomainMeasure
This measure tests each value for a simple element for agreement with the corresponding
AddressElevationMeasure
This measure checks each elevation in an address point collection against polygons
AddressLifecycleStatusDateConsistencyMeasure
This measure tests the agreement of the Address Lifecycle Status with the development
StartEndDateOrderMeasure
Test the logical ordering of the start and end dates.
LocationDescriptionFieldCheckMeasure
This measure describes checking the location description in the field.
AddressLeftRightMeasure
This measure checks stored values describing left and right against those found by
RelatedElementValueMeasure
This measure checks the logical consistency of data related to another part of the address.
ElementSequenceNumberMeasure
Element Sequence Number values must begin at 1 and increment by 1. This measure
RangeDomainMeasure*
This measure tests each Address Number for agreement with ranges. Address Number
USNGCoordinateSpatialMeasure
This measure tests the agreement between the location of the addressed object and the
XYCoordinateSpatialMeasure
This measure compares the coordinate location of the addressed object with the
FutureDateMeasure
This measure produces a list of dates that are in the future.
TestOn
AddressPtCollection;
AddressPtCollection;
AddressPtCollection
AddressPtCollection
StCenterlineCollection;
AddressPtCollection
AddressPtCollection;
AddressPtCollection
StCenterlineCollection;
AddressPtCollection
StCenterlineCollection
StCenterlineCollection
AddressPtCollection
StCenterlineCollection
AddressPtCollection;
AddressPtCollection;
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
StCenterlineCollection;
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection
AddressPtCollection;
AddressPtCollection;
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
StCenterlineCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
TestAgainst
AddressPtCollection;
AddressPtCollection;
AddressPtCollection
AddressPtCollection
StCenterlineCollection;
AddressPtCollection
AddressPtCollection;
StCenterlineCollection
StCenterlineCollection;
AddressPtCollection
StCenterlineCollection
StCenterlineCollection
AddressPtCollection
StCenterlineCollection;
AddressPtCollection;
AddressPtCollection;
AddressPtCollection
AddressPtCollection
StCenterlineCollection
Available t
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection;
AddressReferenceSystem
AddressReferenceSystem
StCenterlineCollection
StCenterlineCollection
StCenterlineCollection
AddressPtCollection;
AddressPtCollection;
AddressPtCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
StCenterlineCollection
StCenterlineCollection
AddressPtCollection
StCenterlineCollection
AddressPtCollection
AddressPtCollection
AddressPtCollection
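Three of the simpler measures in the table above (LowHighAddressSequenceMeasure, AddressNumberRangeParityConsistencyMeasure, and FutureDateMeasure) can be sketched in Python from their one-line descriptions. These are illustrations only, not the test procedures given in the FGDC standard; the function names and argument layouts are assumptions.

```python
from datetime import date


def low_high_sequence_ok(low, high):
    """LowHighAddressSequenceMeasure idea: low must not exceed high."""
    return low <= high


def range_parity_consistent(low, high):
    """AddressNumberRangeParityConsistencyMeasure idea: the low and high
    values of one range are both odd or both even."""
    return low % 2 == high % 2


def future_dates(dates, today=None):
    """FutureDateMeasure idea: list the dates that lie after today."""
    today = today or date.today()
    return [d for d in dates if d > today]


# Hypothetical ranges: (low, high)
ranges = [(100, 198), (201, 299), (300, 251), (400, 499)]
for low, high in ranges:
    print(low, high, low_high_sequence_ok(low, high), range_parity_consistent(low, high))

print(future_dates([date(2013, 1, 1), date(2099, 1, 1)], today=date(2013, 10, 9)))
```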
26. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality Requirements:
NENA 02-014 GIS Data Collection and Maintenance
QualityElement
positional accuracy
positional accuracy
positional accuracy
positional accuracy
positional accuracy
positional accuracy
positional accuracy
QualitySubelement
relative accuracy
relative accuracy
relative accuracy
relative accuracy
relative accuracy
relative accuracy
relative accuracy
completeness
commission; omission
thematic accuracy non‐quantitative attribute correctness
temporal accuracy temporal validity
Reference
2.1
2.2
2.2
3.1
3.1
3.1
3.1
4
4.1
4.1.1
4.1.1
4.1.2
4.1.3
4.1.4
4.2
logical consistency format consistency
4.3
4.3
thematic accuracy non‐quantitative attribute correctness
4.3 (i)
completeness
commission
4.3 (ii)
logical consistency topological consistency
4.3.1 (iii)
logical consistency topological consistency
4.3.2 (iii)
thematic accuracy non‐quantitative attribute correctness
4.3.3 (iii)
thematic accuracy non‐quantitative attribute correctness
4.3.1 (iv)
logical consistency topological consistency
4.3.2 (iv)
logical consistency topological consistency
4.3.3 (iv)
logical consistency conceptual consistency (referential integ 4.3.4 (i)
logical consistency topological consistency
4.3.4 (ii)
thematic accuracy non‐quantitative attribute correctness
4.3.4 (iii)
logical consistency topological consistency
4.3.4 (iv)
logical consistency topological consistency
4.3.5 (v)
5.1
temporal accuracy temporal validity
5.1
temporal accuracy temporal validity
5.1
temporal accuracy temporal validity
5.1
5.1
temporal accuracy temporal validity
5.2
temporal accuracy temporal validity
5.2
5.2
positional accuracy relative accuracy
5.2 (1)
thematic accuracy quantitative accuracy
temporal accuracy temporal validity
5.2 (2)
5.2 (3)
temporal accuracy temporal validity
5.2 (4)
temporal accuracy temporal validity
5.2 (5)
thematic accuracy non‐quantitative attribute correctness
5.2 (7)
temporal accuracy temporal validity
5.2 (8)
temporal accuracy temporal validity
5.2 (9)
thematic accuracy quantitative accuracy
5.2 (10)
thematic accuracy non‐quantitative attribute correctness
5.2 (11)
metadata
5.2 (12)
logical consistency topological consistency
5.2
temporal accuracy temporal validity
5.3
temporal accuracy temporal validity
5.3
completeness
commission; omission
5.3
thematic accuracy non‐quantitative attribute correctness
temporal accuracy temporal validity
5.3
Requirement
RequirementDescription
Shall meet NMAS for 1:5000
The overall accuracy of GIS vector data shall meet National Map Accuracy Standards at 1:5000
Source 1:24000 or less
Source map data standards are … 1:24,000 or better shall be the standard for GIS vector data
Source ortho 1:2400 or less
Digital Orthoimagery data or raster data standard shall be 1:2400 or better
GPS data collected 10 feet horizontal accuracy at 95% confidence GPS data shall be collected with accuracy of 10 feet (3.048 meters) or less 95% of the time
Minimum 30 positions collected at 1 second intervals
For point features it is recommended that a minimum of 30 positions be collected at 1
At least 4 satellites
One should always acquire at least 4 satellites.
Differentially corrected
ALWAYS do differential corrections, either real‐time or post processed.
Annual validation(s):
… attributes and spatial features of the GIS data shall be validated at a minimum of once a
Validate against Automatic Location Information (ALI) and the Master Street Address Guide (MSAG)
… compar[e] the GIS data and either the entire ALI Data Base or a data base of daily service order changes that have “passed edit entry,”
Compare GIS data to ALI
…ensure … that each ALI address is represented in the GIS data layer. … IF ALI data is not
Compare GIS data to MSAG
…compare GIS data with MSAG road ranges, road names and Emergency Service Numbers
Compare GIS data to tax assessment information
Compare address points and road centerline address ranges with tax assessment
Compare GIS data to utility meter address information
…ensure that the GIS data … includes site and street address information that corresponds to
Compare GIS data to building permit issuances
… verify address ranges on a GIS centerline dataset … [with] points located on building with
Compare GIS data to orthoimagery or satellite imagery
… for positional accuracy validation
Assure that the GIS data includes Sites, Roads, Road Names and At a minimum, the digital mapping system shall include the following GIS data and spatial
Assure that the following audits are performed for each of thes audits with accompanying metadata:
Check for valid attribute values
for Sites, Roads, Road Names
Assure no duplication of any feature
for Sites, Roads, Road Names
Assure parity of addresses is consistent (right side odd, left ev for Sites and Roads
Assure line segment connectivity
for Roads (vital for network and routing analysis)
Assure road names in MSAG/ALI agree with road names in GIS for Road Names
Assure site addresses match ALI Data Base
for Sites
Eliminate overlapping address ranges within the same ESN
for Roads (and perhaps Sites)
Assure address ranges includes all ranges in MSAG
for Roads
Assure that relationships between ESZs, ESNs, and ESAs are coEmergency Service Zones (ESZs), Emergency Service Numbers (ESNs), and Emergency
Remove empty (null/sliver) polygons
Emergency Service Zones (ESZs)
Assure ESZ and ESN information matches to MSAG/ALI
Emergency Service Zones (ESZs)
Assure coincident geometry within ESZ layer
Eliminate gap and overlapping polygons in Emergency Service Zones (ESZs)
Assure coincident geometry between ESZs and jurisdictional bESZ Boundaries should be joined to jurisdictional boundaries where appropriate (e.g. roads, r
Consistently apply a program to identify and correct errors
… a consistently applied program to identify and correct errors.
Provide timely updates to telecommunicators
Timeliness of the update of the GIS data is key to maintaining an accurate map data layer …
Provide timely updates to PSAPs
The updated GIS data layer shall be provided to the PSAPs in a timely manner.
Provide GIS updates within five business days of address receipt … GIS updates be processed as part of the Enhanced 9‐1‐1 GIS data within five business days
Personnel must be qualified and trained to maintain GIS data
… personnel [must be] qualified and trained to maintain GIS data [and] must understand the
Update road centerlines as structures are constructed or demolis As structures are constructed or demolished, the road centerline layer must be updated to
Update one‐way and closed roads as they occur
Maintenance of the one‐way and closed roads within the map’s road centerline layer
To geocode accurately, maintain in road centerline layers:
In order to geocode accurately, the road centerline layer requires maintenance of:
coordinate locations
coordinate locations
name changes
name changes
new roads
new roads
changed addressing start/end points
changed addressing start/end points
the turn‐table (one‐way status)
the turn‐table (one‐way status)
road classifications for symbology and routing (including overpa road classifications for symbology and routing (including overpass/underpass/no‐turn attrib
address range changes
address range changes
municipality annexations
municipality annexations
speed limit or impedance field
speed limit or impedance field
municipal route number field; and
municipal route number field; and
source data field,
source data field.
Where possible, use road centerlines as boundary lines for ESZs It is recommended that the ESZ boundary be joined to the road centerline where the road for
Add or delete address points as buildings are constructed or demAs buildings are constructed or demolished, SITE points need to be added or deleted.
Modify address points when sites require a change in address.
Existing sites may require a change of address.
Every ALI address is represented in GIS
… every address in the ALI Data Base matches to an address in the GIS data layer. (see also 4.1
Receive notices of change of address from:
1) the addressing authority, 2) as a discrepancy between the GIS data and a service order chan
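The GPS requirement quoted above (accuracy of 10 feet or less 95% of the time) lends itself to a simple check against a set of measured horizontal errors. A minimal sketch, assuming the errors have already been measured against better-quality reference positions; the function name and input format are hypothetical and not part of NENA 02‐014.

```python
def meets_gps_requirement(errors_ft, threshold_ft=10.0, required_share=0.95):
    """Return True when at least `required_share` of the horizontal errors
    (in feet) are at or below `threshold_ft`."""
    if not errors_ft:
        raise ValueError("at least one error measurement is required")
    within = sum(1 for error in errors_ft if error <= threshold_ft)
    return within / len(errors_ft) >= required_share


# Hypothetical check points: 19 within tolerance, 1 outside.
sample_errors = [3.2] * 19 + [12.0]
print(meets_gps_requirement(sample_errors))
```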
27. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality Requirements:
NENA 71-501 Synchronizing GIS Databases with MSAG and ALI
QualityElement
QualitySubelement
temporal accuracy accuracy of a time measurement
thematic accuracy non‐quantitative attribute correctness
thematic accuracy non‐quantitative attribute correctness
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
completeness
thematic accuracy
completeness
completeness
non‐quantitative attribute correctness
quantitative attribute accuracy
non‐quantitative attribute correctness
non‐quantitative attribute correctness
commission
non‐quantitative attribute correctness
omission
omission
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
thematic accuracy
logical consistency
non‐quantitative attribute correctness
non‐quantitative attribute correctness
non‐quantitative attribute correctness
quantitative attribute accuracy
non‐quantitative attribute correctness
non‐quantitative attribute correctness
classification correctness
classification correctness
domain consistency
thematic accuracy non‐quantitative attribute correctness
completeness
commission
Reference Requirement
RequirementDescription
1 Synchronization should be performed by qualified staff
The synchronization process of the GIS data is most reliably accomplished by qualified, traine
3 Develop a process to identify and quickly correct errors
...develop a process that will consistently identify errors or discrepancies in the data and quic
3 Perform an analysis of discrepancies, estimate time, and then correct errors The amount of time to correct the data and eliminate errors cannot be estimated until an
3 Temporal accuracy and speedy updating is essential
All GIS, MSAG, and ALI data must be continuously updated with the newest information and
3 Consolidate and standardize (independently) both MSAG and GIS data
an agency specific workflow can be implemented to consolidate and standardize the MSAG
3 Compare MSAG and GIS data for accuracy and completeness:
Once the MSAG and GIS databases are standardized, they need to be compared for accuracy
3 Prepare and standardize data, make initial corrections, synchronize, prepare dThe … synchronization process … can be broken down into Data Preparation, Data
3 GIS and MSAG must match 98% of the time before being used for ERDB or LoST… a minimum match rate of 98% be set prior to using the GIS data in the Emergency Routing
3.1 Standardize and quality review GIS street centerline and MSAG data before co Standardization and quality control processing must take place on the GIS street centerline
3.1.1 Compare MSAG and GIS data and identify:
A ... comparison of ... GIS street centerline data and the MSAG will identify …
3.1.1 Different road naming conventions
Different road naming conventions
3.1.1 Inaccurate address ranges
Inaccurate address ranges
3.1.1 Improper MSAG Community designations
Improper MSAG Community designations
3.1.1 Improper Postal Community designations
Improper Postal Community designations
3.1.1 Improper Exchange designations
Improper Exchange designations
3.1.1 Incorrect ESN assignments
Incorrect ESN assignments
3.1.1 Incomplete or missing records
Incomplete or missing records
3.1.1 Road segments w/o addressed structures found in GIS but not in MSAG
Roads may be in the GIS that are not in the MSAG because the GIS roads do not have
3.1.1 Standardize GIS street centerlines and the MSAG as follows:
Standardization of the GIS road centerline data and the MSAG data should incorporate the
3.1.1 Use only the eight cardinal/bi‐cardinal directions and their abbreviations
N, S, E, W, NE, NW, SE, or SW are the only prefix and suffix directional abbreviations which
3.1.1 Avoid all punctuation
All punctuation should be avoided.
3.1.1 Eliminate special characters
Remove special characters (dash, underscore, apostrophe, quotes or any other special
3.1.1 Use only whole numbers in house number fields
Use only whole numbers in the house number fields (fractional house numbers belong in
3.1.1 Spell‐out street names as assigned by addressing authorities
Use complete spelling of the legal street name assigned by the addressing authority (e.g.
3.1.1 Spell‐out Postal or Community names
Spell out the complete MSAG and Postal Community name.
3.1.1 Abbreviate directions only when they are not part of the street name
Prefix directional is only abbreviated when not part of the actual street name (North Dr wou
3.1.1
Post directional abbreviated when they are not the actual street name. (Lone Pine Dr South
3.1.1 Only abbreviate USPS Pub 28 Appx C1‐listed suffixes
Standardize street suffix according USPS Publication No. 28 – Appendix C1
3.1.1 Standardization of address information is for data interoperability and exchan … standardization must take place on the 9‐1‐1 databases to ensure interoperability and to
3.1.1 Encourage best practice address standardization in address authorities
...educate the local addressing authorities that standardization will improve quality, lower
3.1.1 All MSAG, ALI, and GIS road naming conventions must be consistent
The street naming conventions should be consistent in the GIS street centerline, the MSAG
3.1.1 Standardization of address information must occur both with MSAG and GIS The standardization process should take place in both the MSAG and the GIS databases.
3.1.1 Agree to the number of changes that can occur per unit of time
Since the number of changes to the databases may be quite high, all involved parties must
3.1.1 Request the MSAG
Request the MSAG from your Data Base Management System provider.
3.1.1 Load MSAG into worksheet or database
Load the MSAG into a worksheet or database format, with each field being in a separate colum
3.1.1 Save the file
Save the MSAG file (e.g. Initial MSAG).
3.1.1 Save a copy of the file
Save another copy of the MSAG under a different name (e.g. Copy of MSAG).
3.1.1 Open the copy
Open the copy of the MSAG (e.g. Copy of MSAG).
3.1.1 Only remove records from the copy
Do not delete any records out of the original MSAG, only removing certain records from the "C
3.1.1 Sort by MSAG Community, delete FX records, and flag unpopulated MSAG CSort the data by MSAG COMMUNITY and delete any FX Records in the “Copy of MSAG”. Make n
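Two of the requirements above, removing punctuation and special characters and reaching a 98% match rate, can be illustrated with a short sketch. It rests on stated assumptions: special characters are deleted outright rather than replaced with spaces, and the match rate is computed as the share of MSAG street names found in the GIS data. NENA 71‐501 may define both differently, and the function names are hypothetical.

```python
import re


def standardize_street(name):
    """Uppercase a street name, drop punctuation and special characters,
    and collapse repeated whitespace."""
    cleaned = re.sub(r"[^A-Za-z0-9 ]", "", name)
    return " ".join(cleaned.upper().split())


def match_rate(msag_streets, gis_streets):
    """Share of MSAG street names that also appear in the GIS street names,
    after both sides are standardized."""
    if not msag_streets:
        raise ValueError("MSAG street list is empty")
    gis = {standardize_street(street) for street in gis_streets}
    matched = sum(1 for street in msag_streets if standardize_street(street) in gis)
    return matched / len(msag_streets)


msag = ["Main St.", "Elm St", "Oak Ave", "Pine-Rd"]
gis = ["MAIN ST", "ELM ST", "OAK AVE"]
rate = match_rate(msag, gis)
print(rate, rate >= 0.98)
```

Directional and suffix abbreviation are left out on purpose, because those rules depend on whether the word is part of the legal street name.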
28. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
USPS Address
Quality Improvement Processes
1. Locatable Address Conversion System (LACS)
– 911 address conversions from rural to street addresses
2. Coding Accuracy Support System (CASS)
– Correction or Addition of ZIPCode+4
– Validation of Postal Place Names and States
– Street Names, Street Types, and Directionals
• Conflicts identify that corrections may be needed for either address
authorities/9‐1‐1 or postal service, etc.
3. Delivery Point Validation (DPV)
– IsMailableAddress = “Yes”
4. Address Element Correction (AEC)
– To be determined…
29. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
USPS Comparisons:
Coding Accuracy Support System
(CASS)
30. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality: USPS State
Street segments with more than one name
AEC II
– May resolve addresses with multiple Street
Names?
– May identify addresses without mail delivery (e.g.
P.O. Boxes)
CASS error codes 412, 413, 491 are not clearly
understood in all cases
31. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality Elements
ISO 19157 Geographic information ‐ Data quality defines comprehensive definitions and testing guidance to measure
data quality:
completeness: presence and absence of features, their attributes and relationships
• commission: excess data present in a dataset
• omission: data absent from a dataset
logical consistency: degree of adherence to logical rules of data structure, attribution and relationships (data structure can be
conceptual, logical or physical)
• conceptual consistency: adherence to rules of the conceptual schema
• domain consistency: adherence of values to the value domains
• format consistency: degree to which data is stored in accordance with the physical structure of the dataset
• topological consistency: correctness of the explicitly encoded topological characteristics of a dataset
positional accuracy: accuracy of the position of features
• absolute (or external) accuracy: closeness of reported coordinate values to values accepted as or being true
• relative (or internal) accuracy: closeness of the relative positions of features in a dataset to their respective relative positions accepted
as or being true
• gridded data position accuracy: closeness of gridded data position values to values accepted as or being true.
temporal accuracy: accuracy of the temporal attributes and temporal relationships of features
• accuracy of a time measurement: correctness of the temporal references of an item (reporting of error in time measurement)
• temporal consistency: correctness of ordered events or sequences, if reported
• temporal validity: validity of data with respect to time
thematic accuracy: accuracy of quantitative attributes and the correctness of non‐quantitative attributes and of the classifications of
features and their relationships.
• classification correctness: comparison of the classes assigned to features or their attributes to a universe of discourse (e.g. ground
truth or reference dataset)
• non‐quantitative attribute correctness: correctness of non‐quantitative attributes
• quantitative attribute accuracy: accuracy of quantitative attributes
32. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Size
Sample Size and Confidence Interval Tutorial
The confidence interval (commonly referred to as the margin of error or error rate) is the plus-or-minus figure you hear mentioned relative to surveys or opinion polls. For example, if you use a confidence interval of 4 and 47 percent of your sample picks an answer, you can be "sure" that if you had asked the question of the entire relevant population, between 43% (47-4) and 51% (47+4) would have picked that answer. Most researchers prefer a confidence interval of less than 4 percentage points.
The confidence level tells you how sure you can be. Expressed as a percentage, it represents how often the true percentage of the population who would pick an answer lies within the confidence interval. The 95% confidence level means you can be 95% certain; the 99% confidence level means you can be 99% certain. Most researchers use the 95% confidence level.
When you put the confidence level and the confidence interval together, you can say (for example) that you are 95% sure that the true percentage of the population is between 43% and 51%.
The wider the confidence interval (higher margin of error) you are willing to accept, the more certain you can be that the whole population answers would be within that range. For example, if you asked a sample of 1000 people in a city which brand of cola they preferred, and 60% said Brand A, you can be very certain that between 40 and 80% (80% confidence interval) of all the people in the city actually do prefer that brand. However, you cannot be so sure that between 59 and 61% (99% confidence interval) of the people in the city prefer the brand.
http://williamgodden.com/tutorial.pdf
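The plus-or-minus figure described in the tutorial can be estimated with the usual normal-approximation formula for a proportion. The sketch below is illustrative only: the function name, the z-scores, and the sample size of 600 are assumptions, not values from the tutorial, and it assumes a simple random sample from a large population.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the confidence interval, in percentage points.

    p: observed proportion (0..1)
    n: sample size
    z: z-score for the confidence level (1.96 for about 95%, 2.576 for about 99%)
    Assumes a simple random sample drawn from a large population.
    """
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)

# Hypothetical poll: 47% of 600 respondents pick an answer.
moe = margin_of_error(0.47, 600)
print(f"47% +/- {moe:.1f} points at the 95% confidence level")
```

With these assumed inputs the margin comes out at roughly 4 points, matching the 43% to 51% range used in the tutorial text. Raising `z` to 2.576 widens the interval for the same sample.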
33. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Size
http://williamgodden.com/samplesizeformula.pdf
34. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Size
http://williamgodden.com/samplesizeformula.pdf
35. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Size
With a confidence interval of 3 percentage points and a 95 % confidence level:
Place (PlacePreMod PlaceName PlacePostMod)                 AddressPopulation2012  InFiniteSampleSize2012  FiniteSampleSize2012
Arapahoe County                                            209531                 1067                    1062
Archuleta County                                           7805                   1067                    939
Ouray County                                               3379                   1067                    811
Park County                                                15414                  1067                    998
Prowers County                                             12822                  1067                    985
Pueblo County                                              84672                  1067                    1054
Rio Grande County                                          3532                   1067                    820
Routt County                                               14170                  1067                    992
Summit County                                              20485                  1067                    1014
Douglas County                                             121194                 1067                    1058
Eagle County                                               34736                  1067                    1035
Garfield County                                            20269                  1067                    1014
Gilpin County                                              3548                   1067                    821
Grand County                                               21051                  1067                    1016
Lake County                                                7702                   1067                    937
Larimer County                                             156186                 1067                    1060
City of Commerce City                                      24674                  1067                    1023
City of Fort Collins                                       80913                  1067                    1053
Las Animas County Emergency Telephone Service Authority    10059                  1067                    965
San Luis Valley Emergency Telephone Service Authority      18254                  1067                    1008
Lincoln County                                             3169                   1067                    798
Logan County                                               16600                  1067                    1003
Denver County 1 School District                            272600                 1067                    1063
Moffat County                                              6175                   1067                    910
North Central All‐Hazards Region                           1333483                1067                    1066
Montezuma County                                           15819                  1067                    1000
Southern Ute Indian Reservation                            588                    1067                    379
West Region GIS Group                                      52808                  1067                    1046

Places listed on the slide whose values cannot be matched from the slide text:
Otero County, Baca County, Bent County, Chaffee County, Cheyenne County, Crowley County, Phillips County, San Juan County, Custer County, Clear Creek County, City And County of Broomfield, Sedgwick County, Weld County, City And County of Denver, Dolores County, Yuma County, City of Aspen, City of Centennial, Huerfano County, Jackson County, Kiowa County, Kit Carson County, La Plata County, City of Grand Junction, Jefferson County, Fremont County Regional GIS Authority, El Paso‐Teller County Enhanced 911 Authority, Morgan County

Values on the slide whose place cannot be matched from the slide text (AddressPopulation2012, InFiniteSampleSize2012, FiniteSampleSize2012):
2171      1067  716
3290      1067  806
5215      1067  886
5808      1067  902
6485      1067  916
8427      1067  947
22565     1067  1019
28164     1067  1028
30174     1067  1031
42772     1067  1041
74767     1067  1052
100399    1067  1056
215521    1067  1062
254898    1067  1063
258910    1067  1063
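The sample sizes in the table are consistent with the standard sample-size formula for a proportion followed by a finite population correction. The sketch below assumes that formula, with z = 1.96 for the 95% confidence level, a worst-case proportion p = 0.5, and rounding to the nearest whole number. The function names are illustrative and the slides do not state these details in text.

```python
def infinite_sample_size(margin, z=1.96, p=0.5):
    """Sample size for an effectively unlimited population.

    margin: confidence interval as a fraction (0.03 = 3 percentage points)
    z: z-score for the confidence level (1.96 for about 95%)
    p: assumed proportion; 0.5 gives the largest (most conservative) size
    """
    return z * z * p * (1.0 - p) / (margin * margin)

def finite_sample_size(population, margin, z=1.96, p=0.5):
    """Apply the finite population correction to the infinite-population size."""
    n0 = infinite_sample_size(margin, z, p)
    return n0 / (1.0 + (n0 - 1.0) / population)

# Address populations taken from three rows of the table.
rows = [
    ("Arapahoe County", 209531),
    ("Ouray County", 3379),
    ("Southern Ute Indian Reservation", 588),
]
for place, addresses in rows:
    n_inf = round(infinite_sample_size(0.03))
    n_fin = round(finite_sample_size(addresses, 0.03))
    print(place, addresses, n_inf, n_fin)
```

Under these assumptions the three rows come out as 1067 with finite sizes of 1062, 811 and 379, the same figures shown in the table. The correction only matters for small populations: above roughly 100,000 addresses the finite size is within a dozen of 1067.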
36. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Method
1. Randomly select 5 address points
2. Select road segments associated with address points
3. Select adjacent connected road segments
4. Select the address points associated with the selected road segments
5. Repeat steps 3 & 4 until sample size is exceeded
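The five steps can be sketched as a breadth-first expansion along the road network. This is a minimal illustration, not the tooling used for the project: the two dictionaries stand in for a hypothetical schema in which every address point carries the ID of its road segment and every segment lists its connected neighbors.

```python
import random

def cluster_sample(address_to_segment, segment_neighbors, target_size,
                   seeds=5, rng=None):
    """Grow a sample of address points outward along connected road segments.

    address_to_segment: dict of address ID -> road segment ID (hypothetical schema)
    segment_neighbors: dict of segment ID -> iterable of adjacent connected segment IDs
    target_size: required sample size; growth stops once it is exceeded
    """
    rng = rng or random.Random()
    # Index address points by the road segment they are associated with.
    by_segment = {}
    for addr, seg in address_to_segment.items():
        by_segment.setdefault(seg, set()).add(addr)

    # Step 1: randomly select the seed address points.
    candidates = sorted(address_to_segment)
    sample = set(rng.sample(candidates, min(seeds, len(candidates))))
    # Step 2: select the road segments associated with those points.
    segments = {address_to_segment[a] for a in sample}
    frontier = set(segments)

    # Step 5: repeat steps 3 and 4 until the sample size is exceeded
    # (or the connected network has nothing left to add).
    while len(sample) <= target_size and frontier:
        # Step 3: select adjacent connected road segments not yet selected.
        frontier = {n for s in frontier
                    for n in segment_neighbors.get(s, ())} - segments
        segments |= frontier
        # Step 4: select the address points on all selected road segments.
        for seg in segments:
            sample |= by_segment.get(seg, set())
    return sample
```

Because whole segments are added at a time, the result usually overshoots the target slightly, which matches the "until sample size is exceeded" wording. Passing a seeded `random.Random` makes a run repeatable.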
37. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Method
1. Randomly select 5 address points
38. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Method
2. Select road segments associated with address points
39. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Method
3. Select adjacent connected road segments
40. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Method
4. Select the address points associated with the selected road segments
41. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Method
5. Repeat steps 3 & 4 until sample size is exceeded
43. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Sampling Method
Issues:
Sample selection – why start with 5?
– Constrained for ease of collection
Bias – how random must the sample be?
– Confidence interval is set at 3 percentage points (high)
– Will a larger sample size mitigate bias (how much)?
– If so, are the actual confidence interval and confidence level lower?
Does the sample adequately capture the structure of the population?
– Distribution of the population is assumed to be very similar to the distribution in the dataset
– Distribution of address points correlates more highly to the distribution of road centerlines than to random points on a plane
44. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality Elements
ISO 19157 Geographic information ‐ Data quality defines comprehensive definitions and testing guidance to measure data quality:
completeness: presence and absence of features, their attributes and relationships
• commission: excess data present in a dataset
• omission: data absent from a dataset
logical consistency: degree of adherence to logical rules of data structure, attribution and relationships (data structure can be conceptual, logical or physical)
• conceptual consistency: adherence to rules of the conceptual schema
• domain consistency: adherence of values to the value domains
• format consistency: degree to which data is stored in accordance with the physical structure of the dataset
• topological consistency: correctness of the explicitly encoded topological characteristics of a dataset
positional accuracy: accuracy of the position of features
• absolute (or external) accuracy: closeness of reported coordinate values to values accepted as or being true
• relative (or internal) accuracy: closeness of the relative positions of features in a dataset to their respective relative positions accepted as or being true
• gridded data position accuracy: closeness of gridded data position values to values accepted as or being true
temporal accuracy: accuracy of the temporal attributes and temporal relationships of features
• accuracy of a time measurement: correctness of the temporal references of an item (reporting of error in time measurement)
• temporal consistency: correctness of ordered events or sequences, if reported
• temporal validity: validity of data with respect to time
thematic accuracy: accuracy of quantitative attributes and the correctness of non‐quantitative attributes and of the classifications of features and their relationships
• classification correctness: comparison of the classes assigned to features or their attributes to a universe of discourse (e.g. ground truth or reference dataset)
• non‐quantitative attribute correctness: correctness of non‐quantitative attributes
• quantitative attribute accuracy: accuracy of quantitative attributes
45. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Completeness
Omissions ‐ agreed upon by many as the principal threat
Field collection will help measure the completeness of the population through statistics
Comparisons with other data sets will aid in finding and resolving these omissions:
• Master Street Address Guide (MSAG) and ALI
• US Postal Service Address Quality Improvement DBs
• Statewide Voter Registration System (SCORE)
• Motorist Insurance Identification Database (MIIDB)
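A comparison of this kind can be sketched as a set difference on normalized address strings. The example below is only an illustration: the normalization is deliberately crude, the function names are hypothetical, and the addresses are made-up values rather than records from MSAG, USPS, SCORE or MIIDB. Real comparisons would need proper address parsing and standardization first.

```python
def normalize(address):
    """Crude comparison key: uppercase, drop periods and commas, collapse spaces."""
    cleaned = address.upper().replace(".", "").replace(",", " ")
    return " ".join(cleaned.split())

def compare_addresses(dataset, reference):
    """Return (possible_omissions, possible_commissions) as sorted lists of keys.

    possible omissions: present in the reference list, absent from the dataset
    possible commissions: present in the dataset, absent from the reference list
    """
    ours = {normalize(a) for a in dataset}
    theirs = {normalize(a) for a in reference}
    return sorted(theirs - ours), sorted(ours - theirs)

# Made-up sample values for illustration only.
dataset = ["100 Example Rd", "123 Main St."]
reference = ["100 example rd", "125 MAIN ST"]
omissions, commissions = compare_addresses(dataset, reference)
print("possible omissions:", omissions)
print("possible commissions:", commissions)
```

Both lists are only candidates for review. A reference address missing from the dataset may be a true omission, a formatting difference, or an error in the reference list itself.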
46. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Positional Accuracy
Positional accuracy will be measured from sampling using the National Standard for Spatial Data Accuracy (NSSDA)
Primary entrances are assumed to be the “target”
– Even for structure or parcel “centroids”, positional accuracy will be relative (does it matter if the point is 20 feet from the door if you can see the door?)
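The NSSDA statistic reports horizontal accuracy at the 95% confidence level as 1.7308 times the radial root-mean-square error, a factor that applies when the error in x and y is about equal. The sketch below assumes that case. The function name and the coordinate pairs are illustrative, with the reference coordinates standing in for field-collected primary entrance locations.

```python
import math

def nssda_horizontal_accuracy(dataset_xy, reference_xy):
    """Return (RMSE_r, horizontal accuracy at 95% confidence).

    dataset_xy: (x, y) coordinates from the address dataset
    reference_xy: matching (x, y) coordinates from an independent source
    Uses Accuracy_r = 1.7308 * RMSE_r, which assumes RMSE_x is about equal to RMSE_y.
    Results are in the units of the input coordinates.
    """
    if len(dataset_xy) != len(reference_xy) or not dataset_xy:
        raise ValueError("need matching, non-empty coordinate lists")
    squared = [(x - rx) ** 2 + (y - ry) ** 2
               for (x, y), (rx, ry) in zip(dataset_xy, reference_xy)]
    rmse_r = math.sqrt(sum(squared) / len(squared))
    return rmse_r, 1.7308 * rmse_r

# Two made-up check points, each displaced 3 units east and 4 units north.
rmse, accuracy = nssda_horizontal_accuracy(
    [(3.0, 4.0), (13.0, 14.0)],
    [(0.0, 0.0), (10.0, 10.0)],
)
print(f"RMSE_r = {rmse:.2f}, NSSDA accuracy (95%) = {accuracy:.2f}")
```

Two points are used here only to keep the example short. The NSSDA calls for at least 20 well-distributed check points.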
47. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Temporal Accuracy
What temporal information is being reported (e.g. attributes or metadata), if at all?
– Date data provided to State of Colorado
– Standard dates provided in metadata (CSDGM)
– Date (and time) information in attributes
An inventory of data will be taken
A scale from more (>) to less (<) temporal information will be created
48. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Thematic Accuracy
Increasing Value (toward higher scores)
CompleteStreetNumber   CompleteStreetName   Score
Complete               Complete             100
Correct*               Complete             80
Complete               Correct*             70
Correct*               Correct*             60
Populated              Correct*             50
Correct*               Populated            40
Populated              Populated            30
Null                   Populated            20
Populated              Null                 10
Null                   Null                 0
*Correct means that most people can likely correctly deduce the complete street name from the value provided
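The scoring scale can be applied as a simple lookup. The sketch below assumes the ten scores pair with the CompleteStreetNumber and CompleteStreetName states in the order the slide lists them, which is an interpretation of the slide layout rather than something stated in the text. The function names are hypothetical, and combinations the slide does not list return no score.

```python
# (CompleteStreetNumber state, CompleteStreetName state) -> score.
# Row pairing is assumed from the order of values on the slide.
SCORES = {
    ("Complete", "Complete"): 100,
    ("Correct", "Complete"): 80,
    ("Complete", "Correct"): 70,
    ("Correct", "Correct"): 60,
    ("Populated", "Correct"): 50,
    ("Correct", "Populated"): 40,
    ("Populated", "Populated"): 30,
    ("Null", "Populated"): 20,
    ("Populated", "Null"): 10,
    ("Null", "Null"): 0,
}

def thematic_score(number_state, name_state):
    """Look up the score; return None for combinations the slide does not list."""
    return SCORES.get((number_state, name_state))

def mean_score(records):
    """Average score over the (number_state, name_state) pairs that have a score."""
    scored = [s for s in (thematic_score(*r) for r in records) if s is not None]
    return sum(scored) / len(scored) if scored else None

sampled = [("Complete", "Complete"), ("Correct", "Populated"), ("Null", "Null")]
print(mean_score(sampled))
```

Classifying each sampled record as Complete, Correct, Populated or Null still requires a reviewer's judgment. The lookup only turns those classifications into a comparable number.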
49. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Data Quality – Logical Consistency
Fishbones!
53. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Credits
A very special thanks to:
– Rick Smajter, City of Durango
– Robb Menzies, Denver Public Schools
– Matt Goetsch, City of Montrose
– Cindy Jones, Park County
– Heather Lassner, City of Loveland
– Bob Bush, Fremont County GIS Authority
– Mary Kunkel, (formerly of) El Paso-Teller E911
– Pete Magee, San Luis Valley GIS/GPS Authority
– Mike Sexson and Kris Schley, State of Colorado Integrated Document Solutions (IDS)
55. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Colorado State Address Dataset Websites
Governor's Office of Information Technology Colorado Broadband Data
and Development Program Colorado State Address Dataset (public site)
56. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Colorado State Address Dataset Websites
Colorado Address Working Group (access controlled)
57. GOVERNOR’S OFFICE OF INFORMATION TECHNOLOGY
Colorado State Address Dataset
Nathan Lowry, Colorado OIT
October 9, 2013
Nathan Lowry, GIS Outreach Coordinator
State of Colorado, Governor's Office of Information Technology
601 East 18th Avenue, Suite 220, Desk D‐23, Denver, CO 80203‐1494
303.764.7801 nathan.lowry@state.co.us, http://www.colorado.gov/oit
How am I doing? Please contact my manager Jon Gottsegen (Jon.Gottsegen@state.co.us) for comments or questions.