Data Modeling for XML & JSON
Donna Burbank
Global Data Strategy Ltd.
Lessons in Data Modeling DATAVERSITY Series
Dec 6th, 2016
Global Data Strategy, Ltd. 2016
Donna is a recognized industry expert in
information management with over 20
years of experience in data strategy,
information management, data modeling,
metadata management, and enterprise
architecture.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting
company that specialises in the alignment
of business drivers with data-centric
technology. Her previous roles in data modeling
& metadata include:
• Metadata consultant (US, Europe, Asia,
Africa)
• Product Manager PLATINUM Metadata
Repository
• Director of Product Management,
ER/Studio
• VP of Product Marketing, Erwin
• Data modeling & data strategy
implementation & consulting
• Author of 2 books on data modeling &
contributor to 1 book on metadata
management, plus numerous articles
• OMG committee member of the
Information Management Metamodel
(IMM)
As an active contributor to the data
management community, she is a long
time DAMA International member and is
the President of the DAMA Rocky
Mountain chapter. She has worked with
dozens of Fortune 500 companies
worldwide in the Americas, Europe, Asia,
and Africa and speaks regularly at industry
conferences. She has co-authored two
books: Data Modeling for the
Business and Data Modeling Made Simple
with ERwin Data Modeler and is a regular
contributor to industry publications such
as DATAVERSITY, EM360, & TDAN. She can
be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
Donna Burbank
Follow on Twitter @donnaburbank
Today’s hashtag: #LessonsDM
Lessons in Data Modeling Series
• July 28th Why a Data Model is an Important Part of your Data Strategy
• August 25th Data Modeling for Big Data
• September 22nd UML for Data Modeling – When Does it Make Sense?
• October 27th Data Modeling & Metadata Management
• December 6th Data Modeling for XML and JSON
This Year’s Line Up
Agenda
• Overview of XML and JSON
• Data Modeling & Metadata for XML & JSON
• Integrating XML & JSON with Databases (Relational & NoSQL)
• RDF & the Semantic Web
• Summary & Questions
What we’ll cover today
Assumption
• An assumption for today is that the majority of attendees are familiar with relational databases &
Entity-Relationship (E/R) modeling.
• E.g. Data Modelers, Data Architects, SQL Developers, BI Developers, etc.
• The examples are given with that bias, i.e. a comparison with the relational database world.
From Data Modeling for the Business by
Hoberman, Burbank, Bradley, Technics
Publications, 2009
What is XML?
• XML (Extensible Markup Language) is used to store and transport data.
• Some design principles of XML:
• Simplicity: ease of usage, interoperability & understanding
• Modular design: do one thing well
• Extensible: Ability to easily modify the structure & content
• Self-descriptive: ease of understanding
• Machine readable
• Human readable
• Embedded descriptive tags
• XML is designed for data availability, sharing & transport.
• It requires complementary technology to do anything else, i.e. someone must write a piece of
software to send, receive, store, or display it, for example:
• HTML: Format & presentation of the data
• Web Service: Transport of the data (e.g. SOAP)
• Database: Store & integrate with other data sources
XML and JSON Assist with Data Exchange
• XML and JSON can be used to assist with data exchange (B2B, B2C, etc.)
• Companies
• Government Agencies
• Research Organizations
• Etc.
Purchase Order
Emergence & the Growth of Data Exchange
In philosophy, systems theory, science, and art, emergence is
the way complex systems and patterns arise out of a
multiplicity of relatively simple interactions.
- Wikipedia
XML uses a Hierarchical Structure
• XML uses a hierarchical, nested tree structure
• An XML tree starts at a root element and branches from the root to child elements.
• All elements can have sub elements (child elements)
<?xml version="1.0"?>
<shipto>
<name>John Smith</name>
<address>123 Main ST</address>
<city>Boise</city>
<country>USA</country>
</shipto>
Root element: <shipto>
Child elements: <name>, <address>, <city>, <country>
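The root/child structure above can be walked programmatically. The sketch below is a minimal illustration using Python's standard `xml.etree.ElementTree` module; the variable names are assumptions made for this example and are not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The "Ship To" document from the slide.
SHIPTO_XML = """<?xml version="1.0"?>
<shipto>
  <name>John Smith</name>
  <address>123 Main ST</address>
  <city>Boise</city>
  <country>USA</country>
</shipto>"""

# Parsing returns the root element of the tree.
root = ET.fromstring(SHIPTO_XML)

# Iterating over an element yields its direct child elements in document order.
children = [(child.tag, child.text) for child in root]

print("root element:", root.tag)
for tag, text in children:
    print("  child element:", tag, "=", text)
```

Each child could itself contain further child elements, which is what makes the structure a tree rather than a flat record.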
XML is Extensible
• XML is extensible, in that elements can easily be added as needed.
• If the <state> element is added below, older applications that read only the original elements will generally still work.
<?xml version="1.0"?>
<shipto>
<name>John Smith</name>
<address>123 Main ST</address>
<city>Boise</city>
<country>USA</country>
</shipto>
<?xml version="1.0"?>
<shipto>
<name>John Smith</name>
<address>123 Main ST</address>
<city>Boise</city>
<state>ID</state>
<country>USA</country>
</shipto>
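A small sketch of why the older application keeps working: a reader that looks elements up by name simply never asks for <state>. The `read_label` function is a hypothetical "version 1" consumer invented for this illustration. A reader that depends on element position, or that validates strictly against the original schema, would not be this tolerant.

```python
import xml.etree.ElementTree as ET

# Original document and the extended one with the added <state> element.
V1 = ("<shipto><name>John Smith</name><address>123 Main ST</address>"
      "<city>Boise</city><country>USA</country></shipto>")
V2 = ("<shipto><name>John Smith</name><address>123 Main ST</address>"
      "<city>Boise</city><state>ID</state><country>USA</country></shipto>")

def read_label(xml_text):
    """Hypothetical 'version 1' reader: it only asks for the elements it knows."""
    root = ET.fromstring(xml_text)
    known_fields = ["name", "address", "city", "country"]
    return ", ".join(root.findtext(field, default="") for field in known_fields)

label_v1 = read_label(V1)
label_v2 = read_label(V2)  # the unknown <state> element is ignored
print(label_v1)
print(label_v2)
```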
XML is Self-Describing
• XML is self-describing (sort of) with the use of element tags
• Human-readable format
• Tags describe the content of the element (sort of)
<?xml version="1.0"?>
<shipto>
<name>John Smith</name>
<address>123 Main ST</address>
<city>Boise</city>
<country>USA</country>
</shipto>
From reading the tags, it’s pretty clear that we’re talking about a “Ship To” address that contains the name, address, city & country.
But it doesn’t provide full metadata, e.g.:
• What’s the data type?
• What’s the business definition?
• Is <name> a required field?
XML Metadata – the XML Schema
• Similar to DDL, an XML Schema (XSD) defines the structure & format of data
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="orderid" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
XSD Metadata
Order Shipment Data
Ship to: John Smith, 123 Main ST, Boise, USA
<?xml version="1.0"?>
<shipto>
<name>John Smith</name>
<address>123 Main ST</address>
<city>Boise</city>
<country>USA</country>
</shipto>
XML Data
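Because an XSD is itself an XML document, the metadata it carries can be read like any other XML. The sketch below lists the elements declared inside the `shipto` sequence using only the Python standard library. It inspects the schema and does not validate data against it; validation would need a separate XSD-aware library, which is not shown here. The trimmed schema text and variable names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # namespace used by the xs: prefix

# The shipto part of the schema from the slide, trimmed for brevity.
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shipto">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="address" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(XSD_TEXT)
sequence = schema.find(f".//{XS}sequence")

# For elements in a sequence, XSD treats a missing minOccurs attribute as 1.
declared = [
    (el.get("name"), el.get("type"), el.get("minOccurs", "1"))
    for el in sequence
]

for name, xsd_type, min_occurs in declared:
    print(f"{name}: type={xsd_type}, minOccurs={min_occurs}")
```

The output is the structural metadata only (name, type, occurrence); nothing here supplies a business definition for any field.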
Graphical Models of XML Schemas
• XML Schemas can be shown graphically as well as via text.
* Source: Altova
XML Metadata – the XML Schema
• Although the XML Schema does provide some physical structural metadata, a schema like the one
shown does not give a full metadata description, e.g.
• Is the name field required?
• What’s the business definition for each field?
• Are there code values and/or reference data that can be used?
• Can a complex data type be used?
• Etc.
Levels of Data Modeling
Level (what it models): Purpose
• Conceptual (Business Concepts): Communication & Definition of Business Terms & Rules
• Logical (Data Entities): Clarification & Detail of Business Rules & Data Structures
• Physical (Physical Tables): Technical Implementation with a Physical Database or Structure
Audience: Business Stakeholders, Data Architecture, Business Analysts, DBAs, Developers
XML Schema defines some physical metadata, but limited or no business metadata
Metadata & Context
From Data Modeling for the Business by Hoberman, Burbank,
Bradley, Technics Publications, 2009
Is this Customer a:
• Premier Customer
• Lapsed Customer
• High Risk Customer?
Can a Customer have
more than one Account?
Is the Ship To Address
related to the Customer
or the Account?
What are the valid state
codes for the Ship To
Address?
XML Assists with Data Exchange
• XML and JSON can be used to assist with data exchange (B2B, B2C, etc.)
• Remember modularity, simplicity, etc.
Purchase Order
Dude, all that other stuff
isn’t my job. I’m just
sending the PO!
Integrating XML with Relational Databases
• XML is often used in conjunction with relational databases for permanent storage and integration
with other operational, reporting, and reference data.
Purchase Order
Oracle SQL Server
Integrating XML with Relational Databases
• XML can be translated into relational databases, and vice-versa
XML Schema DDL
* Source: Altova
Integrating XML with Relational Databases
• XML can be translated into relational databases, and vice-versa
XML Model Diagram Relational Model Diagram
* Source: Altova
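The translation in both directions can be sketched in a few lines. This example uses SQLite (from the Python standard library) as a stand-in for the Oracle and SQL Server databases named on the earlier slide; the table name, column types, and the `NOT NULL` constraint are assumptions chosen for illustration, not something generated by the Altova tooling shown.

```python
import sqlite3
import xml.etree.ElementTree as ET

SHIPTO_XML = ("<shipto><name>John Smith</name><address>123 Main ST</address>"
              "<city>Boise</city><country>USA</country></shipto>")
COLUMNS = ["name", "address", "city", "country"]

# XML -> relational: each child element becomes a column value.
root = ET.fromstring(SHIPTO_XML)
row = tuple(root.findtext(col) for col in COLUMNS)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipto (name TEXT NOT NULL, address TEXT, city TEXT, country TEXT)"
)
conn.execute(
    "INSERT INTO shipto (name, address, city, country) VALUES (?, ?, ?, ?)", row
)
stored = conn.execute("SELECT name, city, country FROM shipto").fetchall()

# Relational -> XML: each column becomes a child element again.
record = conn.execute("SELECT name, address, city, country FROM shipto").fetchone()
rebuilt = ET.Element("shipto")
for col, value in zip(COLUMNS, record):
    ET.SubElement(rebuilt, col).text = value
rebuilt_xml = ET.tostring(rebuilt, encoding="unicode")

print(stored)
print(rebuilt_xml)
```

A flat document like this maps to one table; nested or repeating elements would typically need child tables with foreign keys.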
What is JSON?
• JSON (JavaScript Object Notation) is a minimal, readable format for structuring data. It
is used primarily to transmit data between a server and web application, as an alternative to XML.
• It is similar to XML in that it is:
• "self describing" & human readable
• hierarchical
• simple & interoperable
• It differs from XML in that it:
• can be parsed with a standard JavaScript function (JSON.parse)
• uses arrays
• can be simpler & shorter to read & write.
{"employees":[
{"firstName":"Shannon", "lastName":"Kempe"},
{"firstName":"Anita", "lastName":"Kress"},
{"firstName":"Tony", "lastName":"Shaw"}
]}
<employees>
<employee>
<firstName>Shannon</firstName>
<lastName>Kempe</lastName>
</employee>
<employee>
<firstName>Anita</firstName>
<lastName>Kress</lastName>
</employee>
<employee>
<firstName>Tony</firstName>
<lastName>Shaw</lastName>
</employee>
</employees>
JSON XML
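The two representations above carry the same data, which a short script can demonstrate by parsing the JSON and emitting the equivalent XML. This is a minimal sketch using Python's standard `json` and `xml.etree.ElementTree` modules; the variable names are assumptions for this example.

```python
import json
import xml.etree.ElementTree as ET

EMPLOYEES_JSON = """{"employees":[
  {"firstName":"Shannon", "lastName":"Kempe"},
  {"firstName":"Anita", "lastName":"Kress"},
  {"firstName":"Tony", "lastName":"Shaw"}
]}"""

# JSON arrays become Python lists; JSON objects become dicts.
data = json.loads(EMPLOYEES_JSON)
last_names = [employee["lastName"] for employee in data["employees"]]

# Rebuild the same hierarchy as XML: one <employee> element per array item.
root = ET.Element("employees")
for employee in data["employees"]:
    node = ET.SubElement(root, "employee")
    for key, value in employee.items():
        ET.SubElement(node, key).text = value
employees_xml = ET.tostring(root, encoding="unicode")

print(last_names)
print(employees_xml)
```

Note that XML has no array construct, so the repeated <employee> element name has to be chosen by whoever does the conversion.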
JSON Metadata – The JSON Schema
• The JSON schema offers a richer set of metadata.
{
"id": 127849,
"brand": "Super Cooler",
"price": 12.50,
"tags": ["camping", "sports"]
}
Example Product in the API
Data
• Can the ID contain letters?
• What is a brand?
• Is a price required?
• Etc.
Context Needed
(i.e. Metadata)
For example, assume we have a JSON based product catalog. This catalog has a product which has an id, a brand,
a price, and an optional set of tags.
{
"$schema": "http://json-schema.org/draft-04/schema#",
"title": "Product",
"description": "A retail product from Acme's online catalog",
"type": "object",
"properties": {
"id": {
"description": "The unique identifier for a product",
"type": "integer"
},
"brand": {
"description": "The brand name of the product as shown in the online catalogue",
"type": "string"
},
"price": {
"type": "number"
},
"tags": {
"type": "array",
"items": {
"type": "string"
},
"minItems": 1
}
},
"required": ["id", "brand", "price"]
}
JSON Schema
Metadata
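To show how the schema answers the "context needed" questions, here is a deliberately small checker that understands only the keywords used above (`type`, `properties`, `required`, `items`, `minItems`). It is a teaching sketch, not a JSON Schema implementation; real projects would use a dedicated validator library. The `check` function and the sample "bad" product are assumptions made for this example.

```python
import json

PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "brand": {"type": "string"},
        "price": {"type": "number"},
        "tags": {"type": "array", "items": {"type": "string"}, "minItems": 1},
    },
    "required": ["id", "brand", "price"],
}

# Map schema type names to Python checks (bool is excluded from numbers).
TYPE_CHECKS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}

def check(value, schema, path="$"):
    """Return a list of problems; an empty list means the value passed."""
    expected = schema.get("type")
    if expected and not TYPE_CHECKS[expected](value):
        return [f"{path}: expected {expected}"]
    errors = []
    if expected == "object":
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required property is missing")
        for name, subschema in schema.get("properties", {}).items():
            if name in value:
                errors += check(value[name], subschema, f"{path}.{name}")
    if expected == "array":
        if len(value) < schema.get("minItems", 0):
            errors.append(f"{path}: fewer than {schema['minItems']} items")
        for index, item in enumerate(value):
            errors += check(item, schema.get("items", {}), f"{path}[{index}]")
    return errors

good = json.loads('{"id": 127849, "brand": "Super Cooler", "price": 12.50, '
                  '"tags": ["camping", "sports"]}')
bad = json.loads('{"id": "A-1", "price": 12.50, "tags": []}')

good_errors = check(good, PRODUCT_SCHEMA)
bad_errors = check(bad, PRODUCT_SCHEMA)
print(good_errors)
print(bad_errors)
```

The schema settles "Can the ID contain letters?" (no, it is an integer) and "Is a price required?" (yes), while "What is a brand?" is answered only by the free-text description.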
Integrating JSON with Document Databases
• JSON is often used with document databases, such as MongoDB, which stores records as
JSON-like documents
• Document databases are popular ways to store unstructured information in a flexible way (e.g.
multimedia, social media posts, etc. )
• Each Collection can contain numerous Documents which could all contain
different fields.
{type: "Artifact",
medium: "Ceramic",
country: "China"
}
{type: "Book",
title: "Ancient China",
country: "China"
}
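The idea of a collection whose documents carry different fields can be imitated with a plain Python list of dicts. This sketch does not use MongoDB or its driver; the `find` helper is a hypothetical stand-in for a document query, written only to show that a shared field can be queried while other fields vary per document.

```python
# A "collection" of documents; each document has its own set of fields.
collection = [
    {"type": "Artifact", "medium": "Ceramic", "country": "China"},
    {"type": "Book", "title": "Ancient China", "country": "China"},
]

def find(docs, **criteria):
    """Hypothetical query helper: return documents matching every criterion."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

from_china = find(collection, country="China")  # matches both documents
books = find(collection, type="Book")           # matches only the book

# No schema is enforced, so the set of fields in use has to be discovered.
fields_in_use = sorted({field for doc in collection for field in doc})

print(len(from_china), "documents from China")
print(books)
print(fields_in_use)
```

Discovering `fields_in_use` after the fact is exactly the modeling gap the slides point to: the flexibility is real, but the metadata has to be documented somewhere else.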
The Semantic Web & RDF
• The RDF (Resource Description Framework) model from the World Wide Web Consortium (W3C) provides a
way to link resources on the web (people, places, things). It provides a common framework for applications to
share information without losing meaning.
• Search Engines
• Exchanging data between datasets
• Sharing information with applications / APIs
• Building social networks
• Etc.
• The goal is to move from a web of documents to a web of data.
• The Framework is a simple way to express relationships between resources.
• IRIs (Internationalized Resource Identifiers, a generalization of URIs) identify resources
• Simple triples relate objects together in the format: <subject> <predicate> <object>
• These relationships create a connected Graph
• There are several serialization formats, for example:
• Turtle, a human-friendly format
• RDF/XML, a common one
• JSON-LD
• Schemas define the vocabularies used to describe the objects
• Dublin Core and Schema.org are two common ones
Subject: ACME Publishing
Predicate: Is Publisher Of
Object: RDF is Easy
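The subject/predicate/object pattern can be sketched without any RDF library by storing triples as tuples. The `http://example.org/` namespace, the identifier names, and the second (title) triple are hypothetical, added only so the example has something to query; the output format imitates simple N-Triples lines and does not handle escaping or datatypes.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

EX = "http://example.org/"  # hypothetical namespace for this illustration

graph = [
    # The triple from the slide: ACME Publishing is publisher of "RDF is Easy".
    Triple(EX + "ACMEPublishing", EX + "isPublisherOf", EX + "RDFIsEasy"),
    # Hypothetical extra triple giving the book a literal title.
    Triple(EX + "RDFIsEasy", EX + "title", "RDF is Easy"),
]

def objects(triples, subject, predicate):
    """Follow one edge of the graph: everything `subject` points to via `predicate`."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]

def term(value):
    """Write IRIs in angle brackets and anything else as a quoted literal."""
    return f"<{value}>" if value.startswith("http") else f'"{value}"'

def to_lines(triples):
    return [f"{term(t.subject)} {term(t.predicate)} {term(t.object)} ."
            for t in triples]

published = objects(graph, EX + "ACMEPublishing", EX + "isPublisherOf")
titles = objects(graph, published[0], EX + "title")
lines = to_lines(graph)

print(published)
print(titles)
print("\n".join(lines))
```

Because the object of one triple is the subject of another, following edges is how the separate statements become a connected graph.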
Creating a Web of Data
@type: Place
Sheraton San Diego Hotel & Marina
1380 Harbor Island Drive
San Diego, California 92101 USA
"@context": "http://schema.org",
"location": {
"@type": "Place",
"name": "Sheraton San Diego Hotel & Marina",
"address": {
"@type": "PostalAddress",
"streetAddress": "1380 Harbor Island Drive",
"addressLocality": "San Diego",
"addressRegion": "CA",
"postalCode": "92101"
},
"telephone" : "+1-877-734-2726",
"image": "http://edw2016.dataversity.net/uploads/ConfSiteAssets/72/image/sheraton.jpg",
"url":"http://edw2016.dataversity.net/travel.cfm"
},
"@context": "http://schema.org",
"location": {
"@type": "Place",
"name": "Sheraton San Diego Hotel & Marina",
"address": {
"@type": "PostalAddress",
"streetAddress": "1380 Harbor Island Drive",
"addressLocality": "San Diego",
"addressRegion": "CA",
"postalCode": "92101"
},
"telephone" : "+1-877-734-2726",
"image": "http://mysite.com/edw16photo.jpg",
"url":"http://mysite.com/myphotos"
},
* Script provided by: Eric Franzon, eric@smartdataconsultants.com
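Since JSON-LD is ordinary JSON, any JSON parser can read it. The sketch below wraps the fragment above in an enclosing object (an assumption, since the slide shows only part of a larger script) and collects every node that declares an `@type`. It treats the document as plain JSON; a real JSON-LD processor would also resolve the `@context` into full vocabulary IRIs, which is not attempted here.

```python
import json

# The location fragment from the slide, wrapped in braces so it parses on its own.
EVENT_JSONLD = """{
  "@context": "http://schema.org",
  "location": {
    "@type": "Place",
    "name": "Sheraton San Diego Hotel & Marina",
    "address": {
      "@type": "PostalAddress",
      "streetAddress": "1380 Harbor Island Drive",
      "addressLocality": "San Diego",
      "addressRegion": "CA",
      "postalCode": "92101"
    },
    "telephone": "+1-877-734-2726"
  }
}"""

def typed_nodes(node):
    """Walk the document and collect every object that declares an @type."""
    found = []
    if isinstance(node, dict):
        if "@type" in node:
            found.append(node)
        for value in node.values():
            found += typed_nodes(value)
    elif isinstance(node, list):
        for item in node:
            found += typed_nodes(item)
    return found

doc = json.loads(EVENT_JSONLD)
types = [node["@type"] for node in typed_nodes(doc)]
city = doc["location"]["address"]["addressLocality"]

print(types)
print(city)
```

The `@type` values are what tie the data to shared Schema.org definitions, so two sites describing the same place use the same property names.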
Dublin Core Metadata Initiative
• The Dublin Core Metadata Initiative provides a common metadata standard for resources such as
media, library books, etc.
• It defines standards for information such as:
http://dublincore.org
Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
• Resources can be described using:
• Text
• HTML
• XML
• RDF XML
Sample Metadata
Format="video/mpeg; 5 minutes"
Language="en"
Publisher="Kats Online, LLC"
Title="My Favorite Cat Video"
Subject="Cats"
Description="A short video of a black cat playing with string."
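As one way to express the sample metadata in XML, the sketch below writes each value as an element in the Dublin Core element namespace (`http://purl.org/dc/elements/1.1/`). The `<metadata>` wrapper element is an assumption for this illustration; Dublin Core defines the terms, not a required container.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC)           # serialize with the usual dc: prefix

# The sample metadata from the slide, keyed by Dublin Core element name.
sample = {
    "title": "My Favorite Cat Video",
    "subject": "Cats",
    "description": "A short video of a black cat playing with string.",
    "publisher": "Kats Online, LLC",
    "language": "en",
    "format": "video/mpeg; 5 minutes",
}

record = ET.Element("metadata")  # hypothetical wrapper element
for name, value in sample.items():
    # Namespaced tags are written as {namespace}localname in ElementTree.
    ET.SubElement(record, f"{{{DC}}}{name}").text = value

record_xml = ET.tostring(record, encoding="unicode")
print(record_xml)
```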
Schema.org
• Schema.org is a vocabulary that webmasters can use to mark up Web pages for the Semantic
Web, so that search engines understand what the pages are about.
• Created by a group of search providers (e.g. Google, Microsoft, Yahoo and Yandex).
• Vocabularies are developed by an open community process
• Through GitHub (https://github.com/schemaorg/schemaorg)
• Using the public-schemaorg@w3.org mailing list
• The schemas are a set of 'types', each associated with a set of properties. The types are arranged
in a hierarchy. There are currently over 570 types, including:
• Creative works
• Organization
• Person
• Place, LocalBusiness, Restaurant
• Product, Offer, AggregateOffer
• Etc.
• There are also extensions for particular industries such as:
• auto.schema.org
• health-lifesci.schema.org
• Resources can be described using:
• JSON-LD
• RDFa
• Etc.
There are Many Other Common Schemas & Vocabularies
• The Dublin Core and Schema.org are two popular schemas, but many more exist for particular
subject areas, industries, etc.
• The Linked Open Vocabularies site (LOV) provides a helpful listing
http://lov.okfn.org/dataset/lov/
Dublin Core
Schema.org
Friend of a Friend
Summary
• XML and JSON are used for transport and interoperability of data
• They offer a variety of benefits
• Simplicity: ease of usage, interoperability & understanding
• Modular design: do one thing well
• Extensible: Ability to easily modify the structure & content
• Self-descriptive: ease of understanding
• Integration with Databases allows for broader enterprise sharing & storage
• Translation to Relational databases
• Storage for Document databases
• Graphical Models can be used across technologies for an intuitive way to visualize hierarchies &
relationships
• The Semantic Web is a powerful way to support the internet as a “web of data”
About Global Data Strategy, Ltd
• Global Data Strategy is an international information management consulting company that specializes
in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technology solution.
• Clear & Relevant: We provide clear explanations using real-world examples.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of
technical expertise in the industry.
Data-Driven Business Transformation
Business Strategy
Aligned With
Data Strategy
Visit www.globaldatastrategy.com for more information
Contact Info
• Email: donna.burbank@globaldatastrategy.com
• Twitter: @donnaburbank
@GlobalDataStrat
• Website: www.globaldatastrategy.com
• Company LinkedIn: https://www.linkedin.com/company/global-data-strategy-ltd
• Personal LinkedIn: https://www.linkedin.com/in/donnaburbank
DATAVERSITY Training Center
• Learn the basics of Metadata Management and practical tips on how to apply metadata
management in the real world. This online course hosted by DATAVERSITY provides a series of six
courses including:
• What is Metadata
• The Business Value of Metadata
• Sources of Metadata
• Metamodels and Metadata Standards
• Metadata Architecture, Integration, and Storage
• Metadata Strategy and Implementation
• Purchase all six courses for $399 or individually at $79 each.
Register here
• Other courses available on Data Governance & Data Quality
Online Training Courses
New Metadata Management Course
Visit: http://training.dataversity.net/lms/
Lessons in Data Modeling Series - 2017
• January 26th How Data Modeling Fits into an Overall Enterprise Architecture
• February 23rd Data Modeling & Business Intelligence
• March 23rd Conceptual Data Models - How to Get the Attention of Business Users
(for a Technical Audience)
• April 27th The Evolving Role of the Data Architect – What Does it Mean for Your Career?
• May 25th Data Modeling & Metadata Management
• June 22nd Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –
how do they fit together?
• July 27th Data Modeling & Metadata for Graph Databases
• August 24th Data Modeling & Data Integration
• September 28th Data Modeling & MDM
• October 26th Agile & Data Modeling – How can they work together?
• December 5th Data Modeling, Data Governance, & Data Quality
Next Year’s Line Up
Questions?
Thoughts? Ideas?

 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

LDM Slides: Data Modeling for XML and JSON

  • 1. Data Modeling for XML & JSON Donna Burbank Global Data Strategy Ltd. Lessons in Data Modeling DATAVERSITY Series Dec 6th, 2016
  • 2. Global Data Strategy, Ltd. 2016. Donna Burbank: Donna is a recognized industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specialises in the alignment of business drivers with data-centric technology. She has previously served in a number of roles related to data modeling & metadata: • Metadata consultant (US, Europe, Asia, Africa) • Product Manager, PLATINUM Metadata Repository • Director of Product Management, ER/Studio • VP of Product Marketing, Erwin • Data modeling & data strategy implementation & consulting • Author of 2 books on data modeling & contributor to 1 book on metadata management, plus numerous articles • OMG committee member of the Information Management Metamodel (IMM). As an active contributor to the data management community, she is a long-time DAMA International member and is the President of the DAMA Rocky Mountain chapter. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications such as DATAVERSITY, EM360, & TDAN. She can be reached at donna.burbank@globaldatastrategy.com. Donna is based in Boulder, Colorado, USA. Follow on Twitter @donnaburbank. Today’s hashtag: #LessonsDM
  • 3. Lessons in Data Modeling Series: This Year’s Line Up • July 28th: Why a Data Model is an Important Part of your Data Strategy • August 25th: Data Modeling for Big Data • September 22nd: UML for Data Modeling – When Does it Make Sense? • October 27th: Data Modeling & Metadata Management • December 6th: Data Modeling for XML and JSON
  • 4. Agenda: What we’ll cover today • Overview of XML and JSON • Data Modeling & Metadata for XML & JSON • Integrating XML & JSON with Databases (Relational & NoSQL) • RDF & the Semantic Web • Summary & Questions
  • 5. Assumption • An assumption for today is that the majority of attendees are familiar with relational databases & Entity-Relationship (E/R) modeling, e.g. Data Modelers, Data Architects, SQL Developers, BI Developers, etc. • The examples are given with that bias, i.e. as a comparison with the relational database world. From Data Modeling for the Business by Hoberman, Burbank, Bradley, Technics Publications, 2009
  • 6. What is XML? • XML (Extensible Markup Language) is used to store and transport data. • Some design principles of XML: • Simplicity: ease of usage, interoperability & understanding • Modular design: do one thing well • Extensible: ability to easily modify the structure & content • Self-descriptive: ease of understanding • Machine readable • Human readable • Embedded descriptive tags • XML is designed for data availability, sharing & transport. • It requires complementary technology to do anything else, i.e. someone must write a piece of software to send, receive, store, or display it, for example: • HTML: format & presentation of the data • Web Service: transport of the data (e.g. SOAP) • Database: store & integrate with other data sources
  • 7. XML and JSON Assist with Data Exchange • XML and JSON can be used to assist with data exchange (B2B, B2C, etc.) between: • Companies • Government Agencies • Research Organizations • Etc. [Figure: a Purchase Order being exchanged]
  • 8. Emergence & the Growth of Data Exchange • “In philosophy, systems theory, science, and art, emergence is the way complex systems and patterns arise out of a multiplicity of relatively simple interactions.” (Wikipedia)
  • 9. XML uses a Hierarchical Structure • XML uses a hierarchical, nested tree structure. • An XML tree starts at a root element and branches from the root to child elements. • All elements can have sub-elements (child elements). Example, where <shipto> is the root element and <name>, <address>, <city>, and <country> are child elements: <?xml version="1.0"?> <shipto> <name>John Smith</name> <address>123 Main ST</address> <city>Boise</city> <country>USA</country> </shipto>
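The root-and-children structure on this slide can be walked programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The helper name `child_values` is hypothetical and not part of the original deck.

```python
import xml.etree.ElementTree as ET

# The <shipto> document from the slide.
SHIPTO_XML = """<?xml version="1.0"?>
<shipto>
  <name>John Smith</name>
  <address>123 Main ST</address>
  <city>Boise</city>
  <country>USA</country>
</shipto>"""


def child_values(xml_text):
    """Return the root tag and a dict of {child tag: child text}.

    Hypothetical helper for illustration; it assumes a flat document
    with one level of children, as on the slide.
    """
    root = ET.fromstring(xml_text)
    return root.tag, {child.tag: child.text for child in root}


root_tag, fields = child_values(SHIPTO_XML)
print(root_tag)
for tag, text in fields.items():
    print(f"  {tag}: {text}")
```

Iterating over the root element yields its direct children in document order, which mirrors the tree picture on the slide.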
  • 10. XML is Extensible • XML is extensible, in that elements can easily be added as needed. • If the <state> element is added as shown below, older applications written against the original version will generally still work, provided they ignore elements they do not recognize. Original: <?xml version="1.0"?> <shipto> <name>John Smith</name> <address>123 Main ST</address> <city>Boise</city> <country>USA</country> </shipto> With <state> added: <?xml version="1.0"?> <shipto> <name>John Smith</name> <address>123 Main ST</address> <city>Boise</city> <state>ID</state> <country>USA</country> </shipto>
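The extensibility claim can be illustrated with a small sketch. Here `read_v1_fields` is a hypothetical stand-in for an "older application" that looks up only the four original elements by name. The result holds under that assumption only: a consumer that depends on element position, or that validates against a strict schema, could still break.

```python
import xml.etree.ElementTree as ET

ORIGINAL = (
    "<shipto><name>John Smith</name><address>123 Main ST</address>"
    "<city>Boise</city><country>USA</country></shipto>"
)
EXTENDED = (
    "<shipto><name>John Smith</name><address>123 Main ST</address>"
    "<city>Boise</city><state>ID</state><country>USA</country></shipto>"
)


def read_v1_fields(xml_text):
    """Hypothetical older reader: fetches the original four elements by name."""
    root = ET.fromstring(xml_text)
    return {tag: root.findtext(tag) for tag in ("name", "address", "city", "country")}


# The added <state> element is simply never asked for, so both reads match.
old_view = read_v1_fields(ORIGINAL)
new_view = read_v1_fields(EXTENDED)
print(old_view == new_view)
```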
  • 11. XML is Self-Describing • XML is self-describing (sort of) with the use of element tags. • Human-readable format • Tags describe the content of the element (sort of). Example: <?xml version="1.0"?> <shipto> <name>John Smith</name> <address>123 Main ST</address> <city>Boise</city> <country>USA</country> </shipto> From reading the tags, it’s pretty clear that we’re talking about a “Ship To” address that contains the name, address, city & country. But it doesn’t provide full metadata, e.g.: • What’s the data type? • What’s the business definition? • Is <name> a required field?
  • 12. XML Metadata – the XML Schema • Similar to DDL, an XML Schema (XSD) defines the structure & format of data. XSD metadata: <?xml version="1.0" encoding="UTF-8" ?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="shiporder"> <xs:complexType> <xs:sequence> <xs:element name="orderperson" type="xs:string"/> <xs:element name="shipto"> <xs:complexType> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="address" type="xs:string"/> <xs:element name="city" type="xs:string"/> <xs:element name="country" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> <xs:attribute name="orderid" type="xs:string" use="required"/> </xs:complexType> </xs:element> </xs:schema> XML data: <?xml version="1.0"?> <shipto> <name>John Smith</name> <address>123 Main ST</address> <city>Boise</city> <country>USA</country> </shipto> [Figure: order shipment data as displayed, “Ship to: John Smith, 123 Main ST, Boise, USA”]
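Because an XSD is itself an XML document, its structural metadata can be read with an ordinary XML parser. The sketch below lists the declared elements, their types, and any required attributes, in the spirit of reading column definitions from DDL. It does not validate data against the schema, which would need a schema-aware library. The XML declaration is omitted from the string for brevity, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # namespace prefix in ElementTree form

# The XSD from the slide (XML declaration omitted).
SHIPORDER_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="address" type="xs:string"/>
              <xs:element name="city" type="xs:string"/>
              <xs:element name="country" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def declared_elements(xsd_text):
    """Return (name, type) for every xs:element; type is None for complex types."""
    root = ET.fromstring(xsd_text)
    return [(el.get("name"), el.get("type")) for el in root.iter(XS + "element")]


def required_attributes(xsd_text):
    """Return names of xs:attribute declarations marked use="required"."""
    root = ET.fromstring(xsd_text)
    return [a.get("name") for a in root.iter(XS + "attribute") if a.get("use") == "required"]


elements = declared_elements(SHIPORDER_XSD)
required_attrs = required_attributes(SHIPORDER_XSD)
for name, type_name in elements:
    print(name, type_name or "(complex type)")
print("required attributes:", required_attrs)
```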
  • 13. Graphical Models of XML Schemas • XML Schemas can be shown graphically as well as via text. (Diagram source: Altova)
  • 14. XML Metadata – the XML Schema • Although the XML Schema does provide some physical structural metadata, it does not provide a full metadata description, e.g. • Is the name field required? • What’s the business definition for each field? • Are there code values and/or reference data that can be used? • Can a complex data type be used? • Etc.
  • 15. Levels of Data Modeling • Conceptual (Business Concepts): Purpose: Communication & Definition of Business Terms & Rules • Logical (Data Entities): Purpose: Clarification & Detail of Business Rules & Data Structures • Physical (Physical Tables): Purpose: Technical Implementation with a Physical Database or Structure • Audience (from conceptual to physical): Business Stakeholders, Data Architecture, Business Analysts, DBAs, Developers • XML Schema defines some physical metadata, but limited or no business metadata.
  • 16. Metadata & Context • Is this Customer a Premier Customer, a Lapsed Customer, or a High Risk Customer? • Can a Customer have more than one Account? • Is the Ship To Address related to the Customer or the Account? • What are the valid state codes for the Ship To Address? From Data Modeling for the Business by Hoberman, Burbank, Bradley, Technics Publications, 2009
  • 17. XML Assists with Data Exchange • XML and JSON can be used to assist with data exchange (B2B, B2C, etc.) • Remember modularity, simplicity, etc. [Figure: a Purchase Order being sent, with the caption “Dude, all that other stuff isn’t my job. I’m just sending the PO!”]
  • 18. Integrating XML with Relational Databases • XML is often used in conjunction with relational databases for permanent storage and integration with other operational, reporting, and reference data. [Figure: a Purchase Order flowing into Oracle and SQL Server databases]
  • 19. Integrating XML with Relational Databases • XML can be translated into relational databases, and vice versa. [Figure: an XML Schema alongside the equivalent DDL. Source: Altova]
  • 20. Integrating XML with Relational Databases • XML can be translated into relational databases, and vice versa. [Figure: an XML model diagram alongside the equivalent relational model diagram. Source: Altova]
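One direction of that translation, XML into a relational table, can be sketched with Python's standard `sqlite3` module. This is an illustrative assumption rather than the method shown in the slides, which rely on a modeling tool. The table layout, the all-TEXT column types, and the function name `load_shipto` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

SHIPTO_XML = (
    "<shipto><name>John Smith</name><address>123 Main ST</address>"
    "<city>Boise</city><country>USA</country></shipto>"
)

# Assumed mapping: each child element of <shipto> becomes one TEXT column.
COLUMNS = ("name", "address", "city", "country")


def load_shipto(conn, xml_text):
    """Insert one <shipto> document as a row in a 'shipto' table."""
    root = ET.fromstring(xml_text)
    row = tuple(root.findtext(col) for col in COLUMNS)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shipto "
        "(name TEXT, address TEXT, city TEXT, country TEXT)"
    )
    conn.execute(
        "INSERT INTO shipto (name, address, city, country) VALUES (?, ?, ?, ?)",
        row,
    )
    return row


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load_shipto(conn, SHIPTO_XML)
rows = conn.execute("SELECT name, city FROM shipto").fetchall()
print(rows)
```

Once the data is in a table, it can be joined with other operational or reference data using ordinary SQL.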
  • 21. What is JSON? • JSON (JavaScript Object Notation) is a minimal, readable format for structuring data. It is used primarily to transmit data between a server and a web application, as an alternative to XML. • It is similar to XML in that it is: • "self-describing" & human readable • hierarchical • simple & interoperable • It differs from XML in that it: • can be parsed using standard JavaScript • uses arrays • can be simpler & shorter to read & write. JSON: {"employees":[ {"firstName":"Shannon", "lastName":"Kempe"}, {"firstName":"Anita", "lastName":"Kress"}, {"firstName":"Tony", "lastName":"Shaw"} ]} XML: <employees> <employee> <firstName>Shannon</firstName> <lastName>Kempe</lastName> </employee> <employee> <firstName>Anita</firstName> <lastName>Kress</lastName> </employee> <employee> <firstName>Tony</firstName> <lastName>Shaw</lastName> </employee> </employees>
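To show that the two snippets on this slide carry the same information, the sketch below parses both with Python's standard library and compares the results. The variable names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

EMPLOYEES_JSON = """
{"employees": [
  {"firstName": "Shannon", "lastName": "Kempe"},
  {"firstName": "Anita", "lastName": "Kress"},
  {"firstName": "Tony", "lastName": "Shaw"}
]}
"""

EMPLOYEES_XML = """
<employees>
  <employee><firstName>Shannon</firstName><lastName>Kempe</lastName></employee>
  <employee><firstName>Anita</firstName><lastName>Kress</lastName></employee>
  <employee><firstName>Tony</firstName><lastName>Shaw</lastName></employee>
</employees>
"""

# JSON maps directly onto native lists and dicts.
from_json = json.loads(EMPLOYEES_JSON)["employees"]

# XML needs an explicit walk: one dict per <employee>, one key per child tag.
from_xml = [
    {field.tag: field.text for field in employee}
    for employee in ET.fromstring(EMPLOYEES_XML.strip())
]

print(from_json == from_xml)
```

The JSON array becomes a list in one call, while the XML version needs a small mapping step, which is one practical sense in which JSON "uses arrays".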
  • 22. JSON Metadata – The JSON Schema • The JSON schema offers a richer set of metadata. • For example, assume we have a JSON-based product catalog. This catalog has a product which has an id, a brand, a price, and an optional set of tags. Example product in the API data: { "id": 127849, "brand": "Super Cooler", "price": 12.50, "tags": ["camping", "sports"] } Context needed (i.e. metadata): • Can the ID contain letters? • What is a brand? • Is a price required? • Etc. JSON Schema metadata: { "$schema": "http://json-schema.org/draft-04/schema#", "title": "Product", "description": "A retail product from Acme's online catalog", "type": "object", "properties": { "id": { "description": "The unique identifier for a product", "type": "integer" }, "brand": { "description": "The brand name of the product as shown in the online catalogue", "type": "string" }, "price": { "type": "number" }, "tags": { "type": "array", "items": { "type": "string" }, "minItems": 1 } }, "required": ["id", "brand", "price"] }
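The schema answers two of the slide's questions directly: the id is an integer, so it cannot contain letters, and price is required. The sketch below checks only those two kinds of rule, `required` and `type`, using plain Python. It is not a JSON Schema validator and ignores keywords such as `minItems` and `items`. The names `TYPE_MAP` and `check_product` are hypothetical.

```python
# Subset of the slide's schema, written as a Python dict (descriptions omitted).
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "brand": {"type": "string"},
        "price": {"type": "number"},
        "tags": {"type": "array", "items": {"type": "string"}, "minItems": 1},
    },
    "required": ["id", "brand", "price"],
}

# Simplified mapping from schema type names to Python types.
# Caveat: Python's bool is a subclass of int, so True would pass as "integer" here.
TYPE_MAP = {
    "integer": int,
    "number": (int, float),
    "string": str,
    "array": list,
    "object": dict,
}


def check_product(instance, schema):
    """Return a list of error messages for 'required' and 'type' rules only."""
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing required property: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in instance and not isinstance(instance[key], TYPE_MAP[rule["type"]]):
            errors.append(f"{key}: expected {rule['type']}")
    return errors


good = {"id": 127849, "brand": "Super Cooler", "price": 12.50, "tags": ["camping", "sports"]}
bad = {"id": "ABC-1", "brand": "Super Cooler"}  # letters in id, no price

good_errors = check_product(good, PRODUCT_SCHEMA)
bad_errors = check_product(bad, PRODUCT_SCHEMA)
print(good_errors)
print(bad_errors)
```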
  • 23. Integrating JSON with Document Databases • JSON is often used with document databases, such as MongoDB, which stores records as JSON documents. • Document databases are a popular way to store unstructured information flexibly (e.g. multimedia, social media posts, etc.) • Each Collection can contain numerous Documents, which could all contain different fields. Example documents: {type: "Artifact", medium: "Ceramic", country: "China"} {type: "Book", title: "Ancient China", country: "China"}
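The idea that documents in one collection can carry different fields can be sketched without a database, using a plain Python list of dicts as a stand-in for a collection. The `find` helper is hypothetical and only loosely imitates a document-store query. It is not the MongoDB API.

```python
# In-memory stand-in for a collection: two documents with different fields.
collection = [
    {"type": "Artifact", "medium": "Ceramic", "country": "China"},
    {"type": "Book", "title": "Ancient China", "country": "China"},
]


def find(docs, **criteria):
    """Return documents whose fields equal every given criterion.

    A document that lacks a queried field simply does not match.
    """
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


from_china = find(collection, country="China")
ceramics = find(collection, medium="Ceramic")
all_fields = sorted({field for doc in collection for field in doc})

print(len(from_china), "documents from China")
print("ceramic items:", [doc["type"] for doc in ceramics])
print("fields seen across the collection:", all_fields)
```

No schema is declared up front. The set of fields is discovered from the documents themselves, which is the flexibility the slide describes.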
  • 24. The Semantic Web & RDF • The RDF (Resource Description Framework) model from the World Wide Web Consortium (W3C) provides a way to link resources on the web (people, places, things). It provides a common framework for applications to share information without losing meaning, for example: • Search engines • Exchanging data between datasets • Sharing information with applications / APIs • Building social networks • Etc. • The goal is to move from a web of documents to a web of data. • The Framework is a simple way to express relationships between resources. • IRIs (Internationalized Resource Identifiers), a generalization of URIs, identify resources. • Simple triples relate resources to one another in the format: <subject> <predicate> <object> • These relationships create a connected Graph. • There are several serialization formats, for example: • Turtle, a human-friendly format • RDF/XML, a common one • JSON-LD • Schemas define the vocabularies used to describe the objects. • Dublin Core and Schema.org are two common ones. Example triple: subject “ACME Publishing”, predicate “Is Publisher Of”, object “RDF is Easy”.
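The subject, predicate, object pattern can be sketched as plain Python tuples. The first triple comes from the slide. The second is a made-up addition so that the graph has more than one edge. Real RDF identifies resources with IRIs rather than bare strings, and the helper `objects_of` is hypothetical.

```python
# Each triple is (subject, predicate, object).
triples = [
    ("ACME Publishing", "isPublisherOf", "RDF is Easy"),  # from the slide
    ("RDF is Easy", "hasSubject", "Semantic Web"),        # hypothetical, for illustration
]


def objects_of(graph, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def neighbours(graph, subject):
    """Return (predicate, object) pairs leaving `subject`: one hop in the graph."""
    return [(p, o) for s, p, o in graph if s == subject]


published = objects_of(triples, "ACME Publishing", "isPublisherOf")
print(published)

# Following a second hop shows how separate triples join into a connected graph.
second_hop = [neighbours(triples, title) for title in published]
print(second_hop)
```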
  • 25. Creating a Web of Data • Rendered result, @type: Place: Sheraton San Diego Hotel & Marina, 1380 Harbor Island Drive, San Diego, California 92101, USA. First snippet: "@context": "http://schema.org", "location": { "@type": "Place", "name": "Sheraton San Diego Hotel & Marina", "address": { "@type": "PostalAddress", "streetAddress": "1380 Harbor Island Drive", "addressLocality": "San Diego", "addressRegion": "CA", "postalCode": "92101" }, "telephone" : "+1-877-734-2726", "image": "http://edw2016.dataversity.net/uploads/ConfSiteAssets/72/image/sheraton.jpg", "url":"http://edw2016.dataversity.net/travel.cfm" }, Second snippet, describing the same Place from a different site: "@context": "http://schema.org", "location": { "@type": "Place", "name": "Sheraton San Diego Hotel & Marina", "address": { "@type": "PostalAddress", "streetAddress": "1380 Harbor Island Drive", "addressLocality": "San Diego", "addressRegion": "CA", "postalCode": "92101" }, "telephone" : "+1-877-734-2726", "image": "http://mysite.com/edw16photo.jpg", "url":"http://mysite.com/myphotos" }, (Script provided by: Eric Franzon, eric@smartdataconsultants.com)
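Since JSON-LD is ordinary JSON, the markup on this slide can be read with a standard JSON parser. The sketch below wraps the slide's fragment in enclosing braces, which is an assumption, since the slide shows only part of a larger script. It then collects every `@type` it finds. It does not perform JSON-LD processing such as context expansion, and `collect_types` is a hypothetical helper.

```python
import json

# The slide's fragment, wrapped in braces so that it parses as a JSON object.
EVENT_JSONLD = """
{
  "@context": "http://schema.org",
  "location": {
    "@type": "Place",
    "name": "Sheraton San Diego Hotel & Marina",
    "address": {
      "@type": "PostalAddress",
      "streetAddress": "1380 Harbor Island Drive",
      "addressLocality": "San Diego",
      "addressRegion": "CA",
      "postalCode": "92101"
    },
    "telephone": "+1-877-734-2726"
  }
}
"""


def collect_types(node):
    """Walk nested dicts and lists, returning every "@type" value in document order."""
    found = []
    if isinstance(node, dict):
        if "@type" in node:
            found.append(node["@type"])
        for value in node.values():
            found.extend(collect_types(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_types(item))
    return found


data = json.loads(EVENT_JSONLD)
types_found = collect_types(data)
print(types_found)
print(data["location"]["address"]["addressLocality"])
```

Two pages that use the same Schema.org types and the same name for the hotel can be recognized as describing the same Place, which is the "web of data" idea on this slide.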
  • 26. Dublin Core Metadata Initiative (http://dublincore.org) • The Dublin Core Metadata Initiative provides a common metadata standard for resources such as media, library books, etc. • It defines standards for information such as: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights. • Resources can be described using: • Text • HTML • XML • RDF/XML. Sample metadata: Format="video/mpeg; 5 minutes" Language="en" Publisher="Kats Online, LLC" Title="My Favorite Cat Video" Subject="Cats" Description="A short video of a black cat playing with string."
  • 27. Schema.org • Schema.org is a vocabulary that webmasters can use to mark up Web pages for the Semantic Web, so that search engines understand what the pages are about. • Created by a group of search providers (e.g. Google, Microsoft, Yahoo and Yandex). • Vocabularies are developed by an open community process: • Through GitHub (https://github.com/schemaorg/schemaorg) • Using the public-schemaorg@w3.org mailing list • The schemas are a set of 'types', each associated with a set of properties. The types are arranged in a hierarchy. There are currently over 570 types, including: • Creative works • Organization • Person • Place, LocalBusiness, Restaurant • Product, Offer, AggregateOffer • Etc. • There are also extensions for particular industries such as: • auto.schema.org • health-lifesci.schema.org • Resources can be described using: • JSON-LD • RDFa • Etc.
  • 28. There are Many Other Common Schemas & Vocabularies • The Dublin Core and Schema.org are two popular schemas, but many more exist for particular subject areas, industries, etc. • The Linked Open Vocabularies site (LOV) provides a helpful listing: http://lov.okfn.org/dataset/lov/ [Figure: LOV listing highlighting Dublin Core, Schema.org, and Friend of a Friend]
  • 29. Summary • XML and JSON are used for transport and interoperability of data. • They offer a variety of benefits: • Simplicity: ease of usage, interoperability & understanding • Modular design: do one thing well • Extensible: ability to easily modify the structure & content • Self-descriptive: ease of understanding • Integration with databases allows for broader enterprise sharing & storage: • Translation to relational databases • Storage for document databases • Graphical models can be used across technologies as an intuitive way to visualize hierarchies & relationships. • The Semantic Web is a powerful way to support the internet as a “web of data”.
  • 30. About Global Data Strategy, Ltd: Data-Driven Business Transformation • Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. • Our passion is data, and helping organizations enrich their business opportunities through data and information. • Our core values center around providing solutions that are: • Business-Driven: We put the needs of your business first, before we look at any technology solution. • Clear & Relevant: We provide clear explanations using real-world examples. • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography. • High Quality & Technically Precise: We pride ourselves on excellence of execution, with years of technical expertise in the industry. • Business Strategy Aligned With Data Strategy • Visit www.globaldatastrategy.com for more information
  • 31. Contact Info • Email: donna.burbank@globaldatastrategy.com • Twitter: @donnaburbank @GlobalDataStrat • Website: www.globaldatastrategy.com • Company LinkedIn: https://www.linkedin.com/company/global-data-strategy-ltd • Personal LinkedIn: https://www.linkedin.com/in/donnaburbank
  • 32. DATAVERSITY Training Center: Online Training Courses • New Metadata Management Course: Learn the basics of Metadata Management and practical tips on how to apply metadata management in the real world. This online series hosted by DATAVERSITY consists of six courses: • What is Metadata • The Business Value of Metadata • Sources of Metadata • Metamodels and Metadata Standards • Metadata Architecture, Integration, and Storage • Metadata Strategy and Implementation • Purchase all six courses for $399 or individually at $79 each. • Other courses available on Data Governance & Data Quality • Visit: http://training.dataversity.net/lms/
  • 33. Lessons in Data Modeling Series - 2017: Next Year’s Line Up • January 26th: How Data Modeling Fits into an Overall Enterprise Architecture • February 23rd: Data Modeling & Business Intelligence • March 23rd: Conceptual Data Models - How to Get the Attention of Business Users (for a Technical Audience) • April 27th: The Evolving Role of the Data Architect – What Does it Mean for Your Career? • May 25th: Data Modeling & Metadata Management • June 22nd: Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling – How Do They Fit Together? • July 27th: Data Modeling & Metadata for Graph Databases • August 24th: Data Modeling & Data Integration • September 28th: Data Modeling & MDM • October 26th: Agile & Data Modeling – How Can They Work Together? • December 5th: Data Modeling, Data Governance, & Data Quality
  • 34. Questions? Thoughts? Ideas?