1. www.LatentView.com
PMML Tutorial
Ramesh Hariharan
12-Feb-2009
www.latentview.com/blog
This presentation is solely for the use of LatentView. No part of this
presentation may be circulated, quoted, or reproduced for distribution without
prior written approval from LatentView.
2. www.LatentView.com
Agenda
• PMML Overview
• Constructing a PMML
• XSD Overview
• Reading the PMML Specification
• Next Steps…
LatentView Analytics Pvt. Ltd (Confidential)
4. www.LatentView.com
PMML Overview
PMML – Predictive Model Markup Language
Used for model scoring
An XML document
Owned by the Data Mining Group (DMG), a consortium led by SPSS, SAS, IBM, Microsoft, Oracle and others
Version 3.2 is current as of this presentation (February 2009)
Advantages of PMML
• Portability of models
• Metadata standardization
• Model once, score anywhere (MOSA ☺)
Drawbacks of PMML
• Least common denominator
• Potential loss of precision
• Lack of support for complex transformations
• Lack of support from tools
Some of the Model Types Supported
Association Rules, Clustering, General Regression, Naïve Bayes, Neural Networks, Support Vector
Machines
Capabilities of PMML
Model Composition – model sequencing & model selection
Built-in and User-defined functions
Usual data types – date, numbers, category
Model Verification – sample results for testing
Output field – create output tables based on the models
Extension Mechanisms
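To make the overview concrete, here is a minimal sketch of what a PMML 3.2 document can look like. The field names (age, spend), the model name and the coefficients are hypothetical and chosen only for illustration; a real file would normally be exported from a trained model.

```xml
<?xml version="1.0"?>
<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2">
  <!-- Header: general information about the file -->
  <Header copyright="Example" description="Hypothetical linear regression"/>
  <!-- DataDictionary: metadata for every field the model uses -->
  <DataDictionary numberOfFields="2">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <!-- The model itself: spend = 10.0 + 0.5 * age -->
  <RegressionModel modelName="spend_model" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="age" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
```

The three top-level parts are the Header, the DataDictionary and the model element. A scoring engine that reads this file needs nothing else from the tool that built the model, which is the portability advantage listed above.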
5. www.LatentView.com
PMML in the Decision Management Architecture
[Architecture diagram. Recoverable labels: Client Managers (business rules formulation, create rules); Business Rules Management; Operational Systems (Sales & Marketing, Risk Management, Other Applications); Customer Requests; Scores and Decisions; Enterprise Decision Engine (Business Rules, Decision Models); Model Repository; LatentView Analysts (Analytic Modeling, Model Development); Analytics Data Backbone; Enterprise Data (Product Data, Channel Data, Customer Data, Payment History Data, Interaction Data).]
10. www.LatentView.com
XSD Overview
XSD – XML Schema Definition
The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a
DTD.
An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear in a document
• defines which elements are child elements
• defines the order of child elements
• defines the number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
11. www.LatentView.com
A First Example
Look at this simple XML document called "note.xml":
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
Look at the XML Schema for the same document:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.w3schools.com"
    xmlns="http://www.w3schools.com" elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
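Because the schema declares a targetNamespace, an instance document has to place its elements in that namespace before a validator will match it against the schema. The sketch below shows one way to do that; the schema file name note.xsd is an assumption, since the slides do not name the file.

```xml
<?xml version="1.0"?>
<!-- xmlns puts <note> and its children in the schema's target namespace. -->
<!-- xsi:schemaLocation pairs that namespace with a schema file (note.xsd is assumed). -->
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```

The xsi:schemaLocation attribute is only a hint to the validator; many tools also let you supply the schema file explicitly.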
12. www.LatentView.com
Simple Elements
The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
where xxx is the name of the element and yyy is its data type.
XML Schema has a lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
And here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
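The overview slide mentions that a schema can define default and fixed values; a simple element declaration is where that usually happens. The sketch below illustrates both, using hypothetical element names (color, currency) that do not come from the slides.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- If <color/> appears in the document but is empty, its value is taken to be "red". -->
  <xs:element name="color" type="xs:string" default="red"/>
  <!-- <currency> may only hold the value "USD"; any other content is invalid. -->
  <xs:element name="currency" type="xs:string" fixed="USD"/>
</xs:schema>
```

An element declaration can carry either default or fixed, but not both.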
13. www.LatentView.com
XSD Attributes
Simple elements cannot have attributes. If an element has attributes, it is considered to be of a complex
type. But the attribute itself is always declared as a simple type.
<xs:attribute name="xxx" type="yyy"/>
where xxx is the name of the attribute and yyy specifies the data type of the attribute. XML Schema has a
lot of built-in data types. The most common types are:
• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example
<lastname lang="EN">Smith</lastname>
<xs:attribute name="lang" type="xs:string"/>
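An attribute declaration does not stand alone: it sits inside the complex type of the element that carries it. As a sketch of one common arrangement, the lastname element above (text content plus a lang attribute) could be declared with simpleContent as follows. Making the attribute required is an illustrative choice, not something the slides specify; attributes are optional unless use="required" is given.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="lastname">
    <xs:complexType>
      <!-- simpleContent: the element holds only text, extended with attributes -->
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="lang" type="xs:string" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```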
14. www.LatentView.com
Simple Elements: Restrictions
Restrictions are used to define acceptable values for XML elements or attributes. Restrictions on
XML elements are called facets.
Restrictions on Values
<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Restrictions on a set of Values
<xs:element name="car" type="carType"/>
<xs:simpleType name="carType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Golf"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>
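Enumeration restrictions matter when reading the PMML specification, because the PMML schema uses them to limit attribute values such as optype. The fragment below is a simplified sketch written from memory of how the PMML schema defines this, not a copy of the official XSD; the real DataField declaration has more attributes and child elements.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- The allowed operational types for a field -->
  <xs:simpleType name="OPTYPE">
    <xs:restriction base="xs:string">
      <xs:enumeration value="categorical"/>
      <xs:enumeration value="ordinal"/>
      <xs:enumeration value="continuous"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- Simplified: an element whose optype attribute must be one of the values above -->
  <xs:element name="DataField">
    <xs:complexType>
      <xs:attribute name="name" type="xs:string" use="required"/>
      <xs:attribute name="optype" type="OPTYPE" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, <DataField name="age" optype="continuous"/> is valid, while optype="numeric" would be rejected by a validator.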
16. www.LatentView.com
More Complex Elements
You can also base a complex element on an existing complex element and add some elements, like this:
<xs:element name="employee" type="fullpersoninfo"/>
<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
<xs:complexType name="fullpersoninfo">
  <xs:complexContent>
    <xs:extension base="personinfo">
      <xs:sequence>
        <xs:element name="address" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
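For illustration, an instance that conforms to the extended type might look like the sketch below; the values are hypothetical. The elements inherited from personinfo come first, followed by the elements added by the extension, in the declared order.

```xml
<?xml version="1.0"?>
<employee>
  <!-- inherited from personinfo -->
  <firstname>John</firstname>
  <lastname>Smith</lastname>
  <!-- added by fullpersoninfo -->
  <address>12 Example Street</address>
  <city>Chennai</city>
  <country>India</country>
</employee>
```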
17. www.LatentView.com
XSD Indicators
Indicators control how elements are used in a document.
There are seven indicators:
Order indicators:
• all
• choice
• sequence
Occurrence indicators:
• maxOccurs
• minOccurs
Group indicators:
• group name
• attributeGroup name
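The sketch below combines several of these indicators in one schema. All element and group names (person, namegroup, child_name, employee, member) are hypothetical and serve only to show where each indicator goes.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Group indicator: a named, reusable set of elements -->
  <xs:group name="namegroup">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:group>
  <xs:element name="person">
    <xs:complexType>
      <!-- Order indicator: children must appear in this order -->
      <xs:sequence>
        <xs:group ref="namegroup"/>
        <!-- Occurrence indicators: zero to ten child_name elements -->
        <xs:element name="child_name" type="xs:string"
                    minOccurs="0" maxOccurs="10"/>
        <!-- Order indicator: exactly one of employee or member -->
        <xs:choice>
          <xs:element name="employee" type="xs:string"/>
          <xs:element name="member" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

When minOccurs and maxOccurs are omitted, both default to 1, so an element declared without them must appear exactly once.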
18. www.LatentView.com
Complex Type: Example
Let's have a look at this XML document called "shiporder.xml":
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="shiporder.xsd">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>
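One possible schema for this document, saved as the shiporder.xsd file the instance refers to, is sketched below. The type choices (for example xs:positiveInteger for quantity) are reasonable assumptions based on the sample values rather than anything stated on the slide.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="address" type="xs:string"/>
              <xs:element name="city" type="xs:string"/>
              <xs:element name="country" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <!-- One or more items per order -->
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <!-- note is optional: the second item omits it -->
              <xs:element name="note" type="xs:string" minOccurs="0"/>
              <xs:element name="quantity" type="xs:positiveInteger"/>
              <xs:element name="price" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- Attributes are declared after the content model -->
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The schema has no targetNamespace, which is why the instance points to it with xsi:noNamespaceSchemaLocation.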
24. www.LatentView.com
PMML Transformations
PMML defines various kinds of simple data transformations:
Normalization: map values to numbers, the input can be continuous or discrete.
Discretization: map continuous values to discrete values.
Value mapping: map discrete values to discrete values.
Functions: derive a value by applying a function to one or more parameters.
Aggregation: summarize or collect groups of values, e.g., compute average.
Value Mapping
<DerivedField name="ETHNICGROUPCODE_02" optype="ordinal" dataType="integer">
  <MapValues outputColumn="derived" defaultValue="0" mapMissingTo="0">
    <FieldColumnPair field="ETHNICGROUPCODE" column="original"/>
    <InlineTable>
      <row>
        <original>02</original>
        <derived>1</derived>
      </row>
    </InlineTable>
  </MapValues>
</DerivedField>
Built-in Function
<DerivedField name="I1EXACTAGE_dr" optype="continuous" dataType="double">
  <Apply function="sum">
    <FieldRef field="I1EXACTAGE"/>
    <FieldRef field="I1ESTIMATEDAGE"/>
  </Apply>
</DerivedField>
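The two examples above cover value mapping and functions. As a complementary sketch, discretization and normalization of the same I1EXACTAGE field might be written as follows. The derived field names, the bin boundaries and the 0 to 100 scaling range are hypothetical and chosen only to show the element structure.

```xml
<TransformationDictionary>
  <!-- Discretization: continuous age to a category -->
  <DerivedField name="AGE_BAND" optype="categorical" dataType="string">
    <Discretize field="I1EXACTAGE" defaultValue="unknown">
      <DiscretizeBin binValue="under30">
        <!-- closedOpen: 0 <= age < 30 -->
        <Interval closure="closedOpen" leftMargin="0" rightMargin="30"/>
      </DiscretizeBin>
      <DiscretizeBin binValue="30plus">
        <!-- no rightMargin: the interval is unbounded above -->
        <Interval closure="closedOpen" leftMargin="30"/>
      </DiscretizeBin>
    </Discretize>
  </DerivedField>
  <!-- Normalization: piecewise-linear map of age 0..100 onto 0..1 -->
  <DerivedField name="AGE_NORM" optype="continuous" dataType="double">
    <NormContinuous field="I1EXACTAGE">
      <LinearNorm orig="0" norm="0"/>
      <LinearNorm orig="100" norm="1"/>
    </NormContinuous>
  </DerivedField>
</TransformationDictionary>
```

With these definitions an age of 25 would map to the band "under30" and a normalized value of 0.25.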
28. www.LatentView.com
Next Steps
Create PMML files from your models – one each for logistic regression, clustering and
decision tree models
Build the PMML manually, and validate it against the PMML schema using an XML editor such as
XMLFox (a PMML file that is valid against the schema may still not be logically valid)
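To illustrate the difference between schema validity and logical validity, the sketch below is a PMML file that, as far as I understand the 3.2 schema, has the right structure for an XML editor to accept, yet is logically wrong: the model uses a field named income that the DataDictionary never declares. All names and numbers are hypothetical, and pmml-3-2.xsd is assumed to be a local copy of the schema.

```xml
<?xml version="1.0"?>
<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.dmg.org/PMML-3_2 pmml-3-2.xsd">
  <Header copyright="Example"/>
  <DataDictionary numberOfFields="2">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <!-- Logical error: "income" is not declared in the DataDictionary -->
      <MiningField name="income"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="10.0">
      <NumericPredictor name="income" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
```

Catching this kind of error requires a PMML-aware scoring engine or a manual review, not just schema validation.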
29. www.LatentView.com
Thank You!
Chennai: JVL Plaza, Ground Floor, 626 Anna Salai, Teynampet, Chennai – 600 018. Phone: +91-44-4509 4039/40
New York: 80, Broad Street, 5th Floor, New York, NY 10004. Phone: +1-212-837-7874