Leverage and Delegation in Developing an Information Model for Geology
1. Leverage and Delegation in Developing an
Information Model for Geology
Simon Cox
Research Scientist
14 December 2007
2. Outline
• GeoSciML scope, its community of interest
• Methodology & platform: geospatial standards
• Delegation within GeoSciML
• Extensions from GeoSciML
• Conclusions
3. GeoSciML
• A language for exchange of geoscience information
• UML logical model
• XML document format
• Scope: interpreted and observed geology
• MappedFeature, GeologicUnit, GeologicStructure, Fossil, Geologic timescale, Borehole, Observation, etc
i.e. information required to maintain geologic maps
• More detail: see poster IN53A-0949
6. Immediate governance arrangements
• IUGS Commission for Geoscience Information
• Active participants:
• GSC, USGS, BGS, BRGM, SGU, GA, GSV, GSJ, APAT + CSIRO
• Documentation + discussion:
• https://www.seegrid.csiro.au/twiki/bin/view/CGIModel/GeoSciML
• Model and schema:
• https://www.seegrid.csiro.au/subversion/GeoSciML/
7. Framework
• Geoscience is largely geospatial
→ Use geospatial information standards for basic framework
• ISO 19100 standards
• UML for model design
• Standard treatments for geometry, time, fields, coordinate systems
• Meta-model for “features”
• XML encoding rule – “Geography Markup Language”
• OGC information and service models
• Standard treatment for Observations & Measurements
• Standard http interfaces – WMS, WFS, SOS
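As an illustration of what a "standard http interface" means in practice, the following is a minimal sketch of building a WFS 1.1.0 GetFeature request with key-value-pair encoding. The endpoint URL is hypothetical, and the feature type name `gsml:MappedFeature` is assumed from the class name on the GeoSciML slide; a real service may advertise a different name in its capabilities document.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; it is not taken from the slides.
WFS_ENDPOINT = "https://example.org/geoserver/wfs"


def build_getfeature_url(endpoint: str, type_name: str, max_features: int = 10) -> str:
    """Build a WFS 1.1.0 GetFeature request using key-value-pair encoding."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,          # WFS 1.1.0 parameter name
        "maxFeatures": str(max_features),
    }
    return f"{endpoint}?{urlencode(params)}"


# Assumed feature type name, following the MappedFeature class in the model.
url = build_getfeature_url(WFS_ENDPOINT, "gsml:MappedFeature")
query = parse_qs(urlsplit(url).query)
print(url)
```

Because the request is generic, the same client code works against any provider; only the endpoint and the feature type name change.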
8. E.g. MappedFeature
• ISO 19109
Feature Model
• ISO 19107
Geometry
• ISO 19115
Metadata
• OGC 07-002
Sampling Model
9. Internal delegation
• GeoSciML provides data structure
• E.g. LithostratigraphicUnit is a kind of GeologicFeature with the
properties “preferredAge”, “classifier”, “beddingPattern” etc
• Property values are scoped to an explicit scale
• i.e. timescale, stratigraphic index, units of measure may use a
localized scale or dictionary
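The split between a fixed structure and a late-bound vocabulary can be sketched as follows. This is an illustration only: the `TermValue` class, the registry, and the list of allowed rank terms are hypothetical, and only the scheme identifier is taken from the instance example in this deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermValue:
    """A term plus the identifier of the vocabulary that scopes it."""
    value: str
    code_space: str


# Hypothetical local registry: vocabulary identifier -> allowed terms.
# The terms listed here are invented for illustration.
VOCABULARIES = {
    "urn:cgi:classifierScheme:GSV:Rank": {"Group", "Formation", "Member"},
}


def is_valid(term: TermValue, vocabularies: dict[str, set[str]]) -> bool:
    """Check a term against whichever vocabulary its code_space names."""
    allowed = vocabularies.get(term.code_space)
    return allowed is not None and term.value in allowed


rank = TermValue("Formation", "urn:cgi:classifierScheme:GSV:Rank")
unknown = TermValue("Formation", "urn:example:other:Rank")
print(is_valid(rank, VOCABULARIES), is_valid(unknown, VOCABULARIES))
```

The structure (`TermValue`) never changes; which terms are acceptable depends entirely on the vocabulary the data provider binds to it.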
10. E.g. localized vocabularies within standard structures
<gsml:ChronostratigraphicUnit>
  <gml:name>Castlemaine Group - Lancefieldian</gml:name>
  <gml:name>Ocl</gml:name>
  <gsml:observationMethod>
    <gsml:CGI_TermValue>
      <gsml:value
          codeSpace="urn:cgi:classifierScheme:GSV:ObservationMethods">
        published description</gsml:value>
    </gsml:CGI_TermValue>
  </gsml:observationMethod>
  <gsml:purpose>instance</gsml:purpose>
  <gsml:rank codeSpace="urn:cgi:classifierScheme:GSV:Rank">
    Formation</gsml:rank>
  ...
</gsml:ChronostratigraphicUnit>
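A consumer of an instance like the one above might list which values are scoped by which vocabulary. The sketch below does this with Python's standard library. The slide fragment omits namespace declarations, so they are added here: the GML namespace is the usual `http://www.opengis.net/gml`, while the `gsml` namespace URI is a placeholder assumption, and the elided part of the fragment is simply left out.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
GSML_NS = "urn:example:gsml"  # placeholder; the real GeoSciML namespace is not given on the slide

FRAGMENT = f"""
<gsml:ChronostratigraphicUnit xmlns:gsml="{GSML_NS}" xmlns:gml="{GML_NS}">
  <gml:name>Castlemaine Group - Lancefieldian</gml:name>
  <gml:name>Ocl</gml:name>
  <gsml:observationMethod>
    <gsml:CGI_TermValue>
      <gsml:value codeSpace="urn:cgi:classifierScheme:GSV:ObservationMethods">
        published description</gsml:value>
    </gsml:CGI_TermValue>
  </gsml:observationMethod>
  <gsml:purpose>instance</gsml:purpose>
  <gsml:rank codeSpace="urn:cgi:classifierScheme:GSV:Rank">
    Formation</gsml:rank>
</gsml:ChronostratigraphicUnit>
""".strip()


def scoped_values(xml_text: str) -> list[tuple[str, str, str]]:
    """Return (local element name, codeSpace, text) for each element with a codeSpace."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():  # document order
        code_space = element.get("codeSpace")
        if code_space is not None:
            local_name = element.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
            found.append((local_name, code_space, (element.text or "").strip()))
    return found


for name, code_space, text in scoped_values(FRAGMENT):
    print(f"{name}: {text!r} scoped by {code_space}")
```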
11. Extensibility
• Related communities are already building specializations on
top of GeoSciML
• GroundWaterML – see IN53C-03
• “Aquifer” specializes GeologicUnit
• GeochronML – see IN53C-02
• Specialized specimens and
observation-procedures
13. Key points
• GeoSciML both uses, and contributes to, a distributed governance
framework for geoscience information
• UML/XML framework allows delegated governance
• UML packages (XML namespaces) reflect system boundaries and discrete governance arrangements
• Markup conventions support late-binding of selected elements (esp.
vocabularies and scales)
• Understand the scope and reach of your community
• Only maintain the elements that are:
a. important to you
b. not governed by someone else
• Enable extensions to your model
• Publish re-usable components in http repository
• e.g. XMI of UML model; XML Schema
• Maintain your components in an orderly way
• Don’t cause surprises!
14. Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: enquiries@csiro.au Web: www.csiro.au
Thank you
Exploration & Mining
Simon Cox
Research Scientist
Phone: 08 6436 8639
Email: Simon.Cox@csiro.au
Web: www.seegrid.csiro.au
Editor's notes
AB: GeoSciML is an information model and XML encoding developed by a group of primarily geologic survey organizations under the auspices of the IUGS CGI. The scope of the core model broadly corresponds with information traditionally portrayed on a geologic map, viz. interpreted geology, some observations, the map legend and accompanying memoir. The development of GeoSciML has followed the methodology specified for an Application Schema defined by OGC and ISO 19100 series standards. This requires agreement within a community concerning their domain model, its formal representation using UML, documentation as a Feature Type Catalogue, with an XML Schema implementation generated from the model by applying a rule-based transformation. The framework and technology support a modular governance process. Standard datatypes and GI components (geometry, the feature and coverage metamodels, metadata) are imported from the ISO framework. The observation and sampling model (including boreholes) is imported from OGC. The scale used for most scalar literal values (terms, codes, measures) allows for localization where necessary. Wildcards and abstract base classes provide explicit extensibility points. Link attributes appear in a regular way in the encodings, allowing reference to external resources using URIs. The encoding is compatible with generic GI data-service interfaces (WFS, WMS, SOS). For maximum interoperability within a community, the interfaces may be specialised through domain-specified constraints (e.g. feature-types, scale and vocabulary bindings, query-models). Formalization using UML and XML allows use of standard validation and processing tools. Use of upper-level elements defined for generic GI application reduces the development effort and governance responsibility, while maximising cross-domain interoperability.
On the other hand, enabling specialization to be delegated in a controlled manner is essential to adoption across a range of subdisciplines and jurisdictions. The GeoSciML design team is responsible only for the part of the model that is unique to geology but for which general agreement can be reached within the domain. This paper is presented on behalf of the Interoperability Working Group of the IUGS Commission for Geoscience Information (CGI) - follow web-link for details of the membership.
Design in pictures – but using a formal notation: Unified Modeling Language UML
Automatic transformation into an XML Schema
XML document format is for data transfer uses
Geology model, not geological-map model
Maps are views of the world, projected or sampled on a particular plane etc.
A core part of the model: Geologic Unit
Some “simple” attributes, plus some complex properties (associations).
Specializations as LithologicUnit, ChronostratigraphicUnit, DeformationUnit (maybe more to come).
GeoSciML also supports encoding of Observations.
These include using the generic O&M pattern for outcrops, samples, etc.
But also some important domain-specific specializations – e.g. Borehole
Borehole is a specialized kind of SamplingCurve (from O&M – more later)
FlightLine, ShipsTrack, Trajectory, Traverse are other specializations of SamplingCurve.
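The specialization pattern described here can be sketched as a small class hierarchy. This is not the O&M or GeoSciML model itself: the attribute names (`path`, `driller`) and the sample data are hypothetical, chosen only to show that code written against the generic type also handles its domain specializations.

```python
from dataclasses import dataclass

Point = tuple[float, float, float]


@dataclass
class SamplingCurve:
    """Generic one-dimensional sampling feature: an ordered path of positions."""
    name: str
    path: list[Point]

    def vertex_count(self) -> int:
        return len(self.path)


@dataclass
class Borehole(SamplingCurve):
    """Domain specialization; the extra attribute is hypothetical."""
    driller: str = "unknown"


@dataclass
class Traverse(SamplingCurve):
    """Another specialization that adds nothing beyond its type."""


def summarize(curve: SamplingCurve) -> str:
    """Works for any SamplingCurve, including specializations defined elsewhere."""
    return f"{type(curve).__name__} {curve.name}: {curve.vertex_count()} vertices"


hole = Borehole("BH-1", [(0.0, 0.0, 0.0), (0.0, 0.0, -50.0)], driller="ACME")
walk = Traverse("T-7", [(0.0, 0.0, 0.0), (10.0, 5.0, 0.0), (20.0, 5.0, 0.0)])
print(summarize(hole))
print(summarize(walk))
```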
The IUGS CGI has a working group on Interoperability
The active participants are primarily from geologic surveys + CSIRO Australia
These guys are responsible for the design and maintenance of GeoSciML.
GeoSciML leverages the geospatial data standards framework from ISO/TC 211 (i.e. the ISO 19100 series) and Open Geospatial Consortium
ISO supplies
The conceptual schema language to formalize the design
Some standard treatments for cross-domain components like geometry, coordinate systems
A meta-model for “features” (named, typed things)
A rule for encoding the model in XML
OGC supplies
A standard treatment for Observations and Sampling (submitted to ISO)
Standard http service interfaces (POX & SOAP)
How does the use of this framework show up?
Use of standard UML stereotypes (implying specific encoding patterns)
Reference to standard external components
E.g. Geometry GM_Object (from ISO 19107), metadata MD_Metadata (from ISO 19115), SamplingFeatures (from OGC O&M)
GeoSciML model/schema defines the data structures – to quite a high degree of detail.
But the data values in many cases can be scoped “at run time” to a specific vocabulary, scale, etc
i.e. the governance of scales, vocabularies, etc is delegated to the data provider
Though for maximum interoperability it is recommended to use published, well-governed vocabularies etc.
In this GeoSciML XML data-instance, the values of the observationMethod and rank are both taken from classifierSchemes governed by GSV.
Note that these classifier schemes are designated using a URN.
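The URNs in the example are colon-delimited, so a client can split one to find out who governs a scheme. The sketch below assumes a five-part layout (`urn`, namespace, resource type, authority, name) inferred from the two identifiers in the instance; the field names are this sketch's own labels, not terms from a published URN specification.

```python
from typing import NamedTuple


class CgiUrn(NamedTuple):
    """Parts of a scheme URN; field names are illustrative labels."""
    namespace: str
    resource_type: str
    authority: str
    name: str


def parse_cgi_urn(urn: str) -> CgiUrn:
    """Split a five-part, colon-delimited URN into its components."""
    parts = urn.split(":")
    if len(parts) != 5 or parts[0].lower() != "urn":
        raise ValueError(f"unexpected URN layout: {urn!r}")
    return CgiUrn(*parts[1:])


parsed = parse_cgi_urn("urn:cgi:classifierScheme:GSV:ObservationMethods")
print(parsed.authority, parsed.name)
```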
As well as internal extensibility, GeoSciML is designed to be extended or specialized by sub-domains within, and related to, geosciences
For example, two papers in this afternoon’s session describe languages explicitly derived from GeoSciML.
The pattern used to accomplish this follows exactly the same method as the basic GeoSciML design
i.e. specialization of, and reference to, externally governed components (the blue classes are in the GWML domain).
(N.B. this is enforced within the development environment by the use of “controlled packages” in a variety of Subversion code repositories).
Standards build on standards
Don’t re-invent unnecessarily - it’s easier (and more interoperable) to borrow elements already managed by someone else
Allow others to borrow yours
But this imposes an obligation on you to maintain an orderly governance process.