1. Mobile Development Meets
Semantic Technology
David S. Read
CTO, Blue Slate Solutions
Semantic Technology and Business Conference
June, 2012
© Blue Slate Solutions 2012
2. Introductions
• David Read, CISSP, GSEC, SCJP
– Blue Slate CTO and Chief Solution Architect
– 25+ years tech leadership experience
– Strategy and technology focus
– Passionate about semantic technology
and data mining
• Attendees
– Business?
– Technology?
– Production Use of Semantic
Technologies?
– Mobile Part of Business Vision?
Source: http://crystalwashington.com/wp-content/uploads/2011/07/boringconference.jpg
4. Agenda
• Premise and Goal
• Mobile constraints
– not focused on the UI
• Semantic technology components
• Leveraging semantic flexibility
5. Premise
6. Multi-tier Architectures – Flexible at Design Time
[Diagram: four-tier architecture with the responsibilities of each tier]
• Presentation Tier (WS, .Net, iOS, Android, JSP; App1 to App5; Native/Proprietary Domain)
– View
– Layout
– Security
• Business Tier (WF, RE; Object Domain)
– Aggregation
– Business Rules
– SDO
– Work Flow
– Transactions
– Security
• Integration Tier (O-R, O-XML, O-Unstructured; ESB; Native/Proprietary Domain)
– Data abstraction
– DAO
– Transactions
– Security
– e.g. Hibernate, XSL, PL/SQL
• Data Tier (DB, Unstructured; WF, RE, Data)
– Persistence
– Transactions
– Categorization
– Indexing
– Security
7. Goal
Use
semantic technology
to mitigate mobile
platform constraints
at runtime
9. Application Memory
• Limits on runtime application and heap
– Android devices
• 16, 24 or 32 MB per application
• Manifest: android:largeHeap="true" (ouch!)
– iOS devices
• Kill the application between 16 and 40 MB
– BlackBerry devices
• (COD files) 8 MB application and 8 MB for resources
– Windows CE/Mobile
• Older versions (5 and 6) limited to 32 MB; version 7 peaks at ~90 MB
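A minimal sketch of how an application might check its remaining heap before deciding how much data to load. It relies only on the standard `java.lang.Runtime` calls; the `HeapBudget` class, the 8 MB estimate and the 50% safety fraction are illustrative assumptions, not figures from this talk.

```java
final class HeapBudget {
    /** Bytes the runtime may still allocate before reaching its heap ceiling. */
    static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    /** True when the estimate fits within the given fraction (0..1) of the remaining heap. */
    static boolean fitsInHeap(long estimatedBytes, double fraction) {
        return estimatedBytes <= (long) (availableBytes() * fraction);
    }

    public static void main(String[] args) {
        // Hypothetical in-memory size of a fully inferred model
        long fullModelBytes = 8L * 1024 * 1024;
        System.out.println(fitsInHeap(fullModelBytes, 0.5)
                ? "load full data set"
                : "request partial data set");
    }
}
```

A check like this lets the same code base choose between a large and a small payload per device rather than per platform.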
10. CPU
Source: http://www.passmark.com/forum/showthread.php?t=3381
11. Bandwidth
10baseT
Sources: http://www.diffen.com/difference/3G_vs_4G
and http://www.hostile.org/coredump/bandwidth.html
12. Connectivity
• Verizon, AT&T and Sprint Coverage
• Cellular coverage has gaps
• 4G is far from the norm
Sources: http://www.verizonwireless.com/b2c/CoverageLocatorController,
http://www.wireless.att.com/coverageviewer/#,
http://coverage.sprintpcs.com/IMPACT.jsp?covType=sprint&id9=vanity:coverage
13. Battery Life
• Talk and standby time are common
(but not meaningful) measures
• Surfing the web, playing games, checking email, all
require significant power: radio, CPU and screen
Source: http://www2012.wwwconference.org/proceedings/proceedings/p41.pdf
15. Data
• Facts from remote and local sources
• May be structured or unstructured
• More value as it is federated
• Standard semantic representation - RDF
Subject Predicate Object
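A minimal sketch of the subject-predicate-object structure in plain Java, with no semantic library involved. The `Triple` class, the `example.org` IRIs and the simplified N-Triples-style rendering (no escaping, no datatypes) are assumptions made for illustration.

```java
final class TripleDemo {
    /** One RDF statement: subject and predicate are IRIs, object is an IRI or a literal. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;
        final boolean objectIsLiteral;

        Triple(String subject, String predicate, String object, boolean objectIsLiteral) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.objectIsLiteral = objectIsLiteral;
        }

        /** Renders the statement in a simplified N-Triples style. */
        String toNTriples() {
            String obj = objectIsLiteral ? "\"" + object + "\"" : "<" + object + ">";
            return "<" + subject + "> <" + predicate + "> " + obj + " .";
        }
    }

    public static void main(String[] args) {
        // Hypothetical IRIs loosely modeled on a fuel purchase record
        Triple gallons = new Triple("http://example.org/fuelPurchase0381",
                "http://example.org/gallons", "6.908", true);
        Triple station = new Triple("http://example.org/fuelPurchase0381",
                "http://example.org/purchaseStation", "http://example.org/Hess1", false);
        System.out.println(gallons.toNTriples());
        System.out.println(station.toNTriples());
    }
}
```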
16. Ontology
• Classification and rules (inferencing)
• Transform at any tier
– Extrapolate redundant information
• Standard representation - RDFS and OWL
Source: http://data.gov.uk/resources/payments
17. Reasoner
• Software which applies the ontology to data
• Can assert new data
• Well understood technology
– Available on a broad range of platforms, including mobile
Reasoner
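To make "applies the ontology to data" and "can assert new data" concrete, here is a toy forward-chaining sketch in plain Java. It is not how Jena or any production reasoner is implemented; it handles only two RDFS-style rules, and the `veh:HybridCar`, `veh:Car` and `veh:Vehicle` class names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TinyReasoner {
    static final String TYPE = "rdf:type";
    static final String SUBCLASS_OF = "rdfs:subClassOf";

    /** A statement as an immutable list: [subject, predicate, object]. */
    static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    /**
     * Applies two rules until no new statements appear:
     * (x type C) + (C subClassOf D) gives (x type D), and
     * (B subClassOf C) + (C subClassOf D) gives (B subClassOf D).
     */
    static Set<List<String>> infer(Set<List<String>> asserted) {
        Set<List<String>> model = new HashSet<>(asserted);
        boolean changed = true;
        while (changed) {
            Set<List<String>> derived = new HashSet<>();
            for (List<String> a : model) {
                boolean chainable = a.get(1).equals(TYPE) || a.get(1).equals(SUBCLASS_OF);
                for (List<String> b : model) {
                    if (chainable && b.get(1).equals(SUBCLASS_OF) && a.get(2).equals(b.get(0))) {
                        derived.add(triple(a.get(0), a.get(1), b.get(2)));
                    }
                }
            }
            changed = model.addAll(derived); // false once nothing new was derived
        }
        return model;
    }

    public static void main(String[] args) {
        Set<List<String>> data = Set.of(
                triple("veh:car2", TYPE, "veh:HybridCar"),
                triple("veh:HybridCar", SUBCLASS_OF, "veh:Car"),
                triple("veh:Car", SUBCLASS_OF, "veh:Vehicle"));
        Set<List<String>> model = infer(data);
        System.out.println(model.contains(triple("veh:car2", TYPE, "veh:Vehicle")));
    }
}
```

With no subclass statements in the input, `infer` simply returns the data unchanged, which mirrors a reasoner that is given data but no ontology.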
18. Query Processing
• CRUD operations on data
• Standard semantic query language - SPARQL
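[Diagram: a SPARQL processor querying a local or remote graph]
Graph: ObjectA Relation1 Constant; ObjectA Relation2 ObjectB; ObjectA Relation2 ObjectC; ObjectB Relation3 ObjectD; ObjectC Relation4 ObjectE
Query variables: ?theRelation ?theObject
Results: Relation3 ObjectD; Relation4 ObjectE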
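The matching that a query processor performs on this graph can be sketched in plain Java. The query shape used here (start at ObjectA, follow Relation2, then return every relation and object of the node reached) is an assumption inferred from the results on the slide, and the `PatternQueryDemo` class is purely illustrative; a real SPARQL engine is far more general.

```java
import java.util.ArrayList;
import java.util.List;

final class PatternQueryDemo {
    /** The example graph as [subject, predicate, object] rows. */
    static final String[][] GRAPH = {
        {"ObjectA", "Relation1", "Constant"},
        {"ObjectA", "Relation2", "ObjectB"},
        {"ObjectA", "Relation2", "ObjectC"},
        {"ObjectB", "Relation3", "ObjectD"},
        {"ObjectC", "Relation4", "ObjectE"},
    };

    /** Evaluates the pattern: start via ?x . ?x ?theRelation ?theObject */
    static List<String> query(String start, String via) {
        List<String> rows = new ArrayList<>();
        for (String[] first : GRAPH) {
            if (!first[0].equals(start) || !first[1].equals(via)) {
                continue; // first pattern does not match this statement
            }
            for (String[] second : GRAPH) {
                if (second[0].equals(first[2])) { // join on ?x
                    rows.add(second[1] + " " + second[2]);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : query("ObjectA", "Relation2")) {
            System.out.println(row);
        }
    }
}
```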
20. Levers to Minimize Power Consumption
Remote Data Access
Local Computation
21. Use Efficient Representations
• Bandwidth is variable
• Radio communication costs battery significantly
• Little entropy per bit in most non-binary formats
– XML-based information often contains less data than
markup (far less than 1 bit entropy per byte)
Format            Size (KB)   % of RDF/XML
RDF/XML           693         100
Turtle            258         37
Zipped RDF/XML    33          5
Zipped Turtle     27          4
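The effect of compression on repetitive semantic data can be checked with the standard `java.util.zip` classes. This sketch uses gzip rather than the zip format named in the table, and the generated Turtle-like statements are made up, so the ratio it prints will differ from the measured figures above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class CompressionDemo {
    /** Compresses a payload before it is sent over the radio. */
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores the original payload on the receiving side. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds n repetitive Turtle-like statements (hypothetical sample data). */
    static String sampleTurtle(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(String.format("veh:fuelPurchase%04d a veh:FuelPurchase; veh:gallons 6.908 .%n", i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleTurtle(500).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes: " + raw.length + ", gzip bytes: " + packed.length);
    }
}
```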
22. Reasoning Locally to Minimize Power Consumption
• CPU on phone is often underutilized
• Application behavior may have user-specific
configurations controlling data relationships
• Reasoning at the server produces the same result as reasoning at the client
– Same syntax
[Diagram: Network Access versus Process Execution?]
23. Reasoning Locally – An example
• Application to report on fuel consumption statistics
• Small ontology sets up relationships between
vehicles, fuel purchases and gas stations
• Data for all fuel purchases
• Classify cars and gas stations, infer additional
information (e.g. distance, MPG)
26. What Just Happened?
[Diagram: numbered data flow (steps 1 to 8) between the mobile device and the remote cloud]
Mobile Device: Local Data, Local Ontology, Reasoner, Augmented Local Data, Local Query, Query Engine, Federated Results
Remote (Cloud): DBpedia Data, DBpedia Endpoint
27. How Does That Work on a Mobile Device?
• Semantic Reasoner and Query Libraries
– Several available (commercial and open source)
– This demonstration uses Java-based (Android) open source libraries
• Principles are the same for any semantic library
– Load an ontology and instances into working storage
• Might exist locally or be loaded via network
– Run the reasoner to create a model
– Query the resulting model to obtain result sets
• Inferred data can be persisted to create a new local
data set
28. A Mobile Reasoner and Query Processor
• Jena
– Java-based semantic library
• Started by HP Labs as open source project in 2000
• Apache project (incubator) since 2010
– Interfaces to reasoner, triple store
– Built-in reasoner (RDFS, OWL DL)
– http://incubator.apache.org/jena/
– ARQ
• Jena’s SPARQL processor
• Android ports
– Androjena and ARQoid
– http://code.google.com/p/androjena/
30. Reasoner Has No Expectations of an Ontology
• The process and libraries we discussed do not
require an ontology to be provided in order to
create a model
• If only data is provided then the reasoner creates a
model containing the data
• Allows us to tune behavior from the server side
without separate code bases
– Dynamically balance network bandwidth usage versus
CPU
– Server load, bandwidth limitations, mobile device
limitations, …
31. Simple Example of Real-time Tuning
In our fuel consumption application…
[Chart: relative load on Network, Local CPU and Server CPU when the server sends Partial Data and Ontology versus Full Data]
32. Fuel Data and Ontology Overview
• Using raw fuel purchases as our representative
data set
– SPARQL endpoint: http://semantic.monead.com/vehicleinfo/mileage
– Fully inferred data set: http://monead.com/semantic/data/HybridMileageOntologyAll.Inferenced.xml
• Sizing
Information        RDF/XML Size (KB)   Turtle Size (KB)   %
Ontology           6                   2                  1
Minimal Data       200                 68                 27
Fully Inferred     692                 257                100
33. Fuel Data and Ontology Details
veh:fuelPurchase0381 a veh:FuelPurchase;
veh:vehicle pveh:car2; veh:date "2011-08-14";
veh:gallons 6.908; veh:usDollarsPerGallon 3.779;
veh:totalUsDollarsCharged 26.11;
veh:reportedMpg 45.4; veh:odometerMiles 106883.;
veh:purchaseStation veh:Hess1.
veh:Stoughton-MA a veh:Place;
owl:sameAs
<http://dbpedia.org/resource/Stoughton,_Massachusetts>;
veh:placeName "Stoughton, MA".
veh:Sunoco19 a veh:GasStation;
veh:stationName "Sunoco";
veh:brandInfo
<http://dbpedia.org/resource/Sunoco>;
veh:location veh:Glenville.
34. Use Case R-1: Slow Network
[Diagram: Mobile Device connected to Server at 1.5 MBPS]
Component      Full Data Set (Sec)   Partial Data Set (Sec)   Difference, Full - Partial (Sec)
Data           171                   44                       127
Ontology       0                     2                        -2
Radio Total    171                   46                       125
Rendering      0                     15                       -15
Total          171                   61                       110
35. Use Case R-2: Faster Network
[Diagram: Mobile Device connected to Server at 15 MBPS]
Component      Full Data Set (Sec)   Partial Data Set (Sec)   Difference, Full - Partial (Sec)
Data           17                    4                        13
Ontology       0                     1                        -1
Radio Total    17                    5                        12
Rendering      0                     15                       -15
Total          17                    20                       -3
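The comparison behind use cases R-1 and R-2 can be written as a small decision function. The timings below are the measured values from the two tables; the `TransferPlan` class and its strategy labels are illustrative assumptions, not part of the demonstrated application.

```java
final class TransferPlan {
    static final String FULL = "FULL_DATA";
    static final String PARTIAL = "PARTIAL_DATA_AND_ONTOLOGY";

    /** Radio time (data plus ontology) plus time spent on the device after download. */
    static int totalSeconds(int dataSec, int ontologySec, int renderingSec) {
        return dataSec + ontologySec + renderingSec;
    }

    /** Picks whichever strategy finishes sooner; ties go to the full data set. */
    static String choose(int fullTotalSec, int partialTotalSec) {
        return partialTotalSec < fullTotalSec ? PARTIAL : FULL;
    }

    public static void main(String[] args) {
        // R-1, slow network: 171 s versus 44 + 2 + 15 = 61 s
        System.out.println(choose(totalSeconds(171, 0, 0), totalSeconds(44, 2, 15)));
        // R-2, faster network: 17 s versus 4 + 1 + 15 = 20 s
        System.out.println(choose(totalSeconds(17, 0, 0), totalSeconds(4, 1, 15)));
    }
}
```

On the slow network the partial data set with local reasoning wins by 110 seconds; on the faster network the fully inferred data set wins by 3 seconds.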
36. Create a Reasoner Instance
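// Package names as used by Jena 2.x
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.ontology.OntModelSpec;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;

// Wrap an empty in-memory model with the built-in OWL reasoner
Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
Model infModel = ModelFactory.createInfModel(reasoner,
    ModelFactory.createDefaultModel());
// Expose the inference model through the ontology API
OntModel model = ModelFactory.createOntologyModel(
    OntModelSpec.OWL_DL_MEM, infModel);
model.setStrictMode(false); // relax the ontology API's type checking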
37. Process the Ontology (Run the Reasoner)
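import java.io.Reader;
import java.io.StringReader;

// Read the ontology text (Turtle syntax, no base URI) into the reasoner-backed model
Reader ontologyReader = new StringReader("Your Ontology");
model.read(ontologyReader, null, "Turtle");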
38. Client Behavior is Unchanged for Either Use Case
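[Diagram: numbered flow (steps 1 to 5) in which the Client requests data, the Server replies with either the Full Data Set or the Partial Data Set and Ontology, and on the Client the data passes through the Reasoner and the Query Engine to produce Results]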
40. SPARQL Results Not Concerned with Data Source
• A SPARQL query can access a local model, remote
data and/or remote SPARQL endpoint
• The results are processed in the same manner
– Another tuning opportunity
• Dynamically shift between server and client
– Network/Radio and CPU (federation)
– CPU (sorting, filtering)
41. Query Execution Behavior – 2 Approaches
• Server can send queries to client device
• Client decides which query to use based on
metadata from server or its own measurements
42. Query Execution Behavior – Server Decides
[Diagram: numbered flow (steps 1 to 4) between the Server (query selection, Queries, Data) and the Client (Request, Query Engine, Results)]
43. Query Execution Behavior – Client Decides
[Diagram: numbered flow (steps 1 to 4) between the Server (Data) and the Client (query selection, Queries, Query Engine, Results)]
44. Use Case Q-1 (Local Model)
• Query executes against local model
• If the query requires a lot of CPU (sorting, filtering), it may be
better to move its execution to the server
[Diagram: on the Client, Data and Queries feed the Query Engine, which produces Results]
45. Set Up a SPARQL Query Against Local Model
QueryExecution qe;
String query = "Your SPARQL Query";
// Bind the query to the in-memory model built by the reasoner
qe = QueryExecutionFactory.create(query, model);
46. Use Case Q-2 (Single SPARQL Endpoint)
• Query executes against one SPARQL endpoint
• Have client device execute this directly
• One consideration: significant latency
[Diagram: the Client (Queries, Query Engine, Results) queries the Server (SPARQL Endpoint, Query Engine, Data)]
47. Use Case Q-3 (Federated SPARQL Endpoints)
• Query executes against several SPARQL
endpoints
• Requires communications with all the endpoints
and integration of the results
[Diagram: Client with Queries, Query Engine and Results]
48. Use Case Q-3 (Federated SPARQL Endpoints) - Alt
• If the federation is significantly complex, it makes sense to proxy the query
[Diagram: Client with Queries, Query Engine and Results]
49. Set Up a SPARQL Query Against Remote Endpoint
QueryExecution qe;
String query = "Your SPARQL Query";
String queryUri = "Some SPARQL Endpoint URI";
String queryDefaultGraphUri = "An Optional Graph Uri";
// Pass the default graph only when one was supplied
if (queryDefaultGraphUri.length() > 0) {
    qe = QueryExecutionFactory.sparqlService(
        queryUri, query, queryDefaultGraphUri);
} else {
    qe = QueryExecutionFactory.sparqlService(queryUri, query);
}
50. Use Case Q-4 (Raw Data Sources)
• Query executes against semantic data sources
• Server aggregates results since the entire graph
typically needs to be brought across the network
[Diagram: the Client (Queries, Query Engine, Results) calls a Server (Query Engine, SPARQL Endpoint), which in turn reads the remote Data sources]
51. Set Up a SPARQL Query Against Remote Graph
/* e.g. query contains FROM or SERVICE */
QueryExecution qe;
String query = "Your SPARQL Query";
// No model argument: the data sources are named inside the query itself
qe = QueryExecutionFactory.create(query);
52. Query Processing Behavior is Unchanged
for Any Use Case
• The query execution syntax differs for local versus
remote data sources
• The result set, however, is processed in the same
manner for any select query
• Results in a small amount of code to execute the
correct query form (the three preceding code
snippets) but all downstream code is consistent
53. Retrieve the SPARQL Results
ResultSet results = qe.execSelect();
// Variable names from the SELECT clause serve as column names
List<String> columnNames = results.getResultVars();
while (results.hasNext()) {
    QuerySolution solution = results.next();
    for (String var : columnNames) {
        String value = null; // stays null when the variable is unbound
        if (solution.get(var) != null) {
            if (solution.get(var).isLiteral()) {
                value = solution.getLiteral(var).toString();
            } else {
                value = solution.getResource(var).getURI();
            }
        }
        // use value here, e.g. add it to the current display row
    }
}
qe.close();
54. Caching
• Caching is a good option with mobile devices
– Cache the data received, assertions reasoned and
results obtained
• Typically have access to “private” storage space,
often located on removable storage (SD card)
• Doesn’t interfere with basic phone data (local,
phone) storage
• Not as limited as native storage
– For BlackBerry this is the only reasonable way to break
out of the 8MB application data limitation
• A mature set of caching libraries does most of the
interesting work for you
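As a rough illustration of caching query results, the sketch below keeps a bounded, least-recently-used map from query text to result using only `java.util.LinkedHashMap`. It is in-memory only; the `QueryResultCache` name, the capacity of 2 and the string results are assumptions, and persisting entries to private or removable storage is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryResultCache<V> extends LinkedHashMap<String, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    QueryResultCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.maxEntries = maxEntries;
    }

    /** Called by put(); returning true evicts the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        QueryResultCache<String> cache = new QueryResultCache<>(2);
        cache.put("query1", "results1");
        cache.put("query2", "results2");
        cache.get("query1");             // query1 becomes the most recently used
        cache.put("query3", "results3"); // evicts query2
        System.out.println(cache.keySet());
    }
}
```

A hit in such a cache avoids both the radio cost of a remote query and the CPU cost of re-running the reasoner.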
55. Summary
Semantic technology, by virtue of inferencing,
platform independence, consistent syntax and
standard protocols, enables dynamic intra-tier
tuning without significant coding and
configuration.
56. Thank You
• I appreciate your taking the time
to attend this session
• Contact and Business
– David.Read@blueslate.net
– www.blueslate.net
• Reference Information
– Semantic technology thoughts and work
• http://monead.com/semantic
– Sparql Droid
• https://play.google.com/store/apps/details?id=com.monead.semantic.android.sparql&hl=en