Ontopia Tutorial TMRA 2010-09-29 Lars Marius Garshol & Geir Ove Grønmo
Agenda About you who are you? About Ontopia The product The future Participating in the project
Some background About Ontopia
Brief history 1999-2000 private hobby project for Geir Ove 2000-2009 commercial software sold by Ontopia AS lots of international customers in diverse fields 2009- open source project
The project Open source hosted at Google Code Contributors Lars Marius Garshol, Bouvet Geir Ove Grønmo, Bouvet Thomas Neidhart, SpaceApps Lars Heuer, Semagia Hannes Niederhausen, TMLab Stig Lau, Bouvet Baard H. Rehn-Johansen, Bouvet Peter-Paul Kruijssen, Morpheus Quintin Siebers, Morpheus Matthias Fischer, HTW Berlin
Recent work Ontopia/Liferay integration Matthias Fischer & LMG Various fixes and optimizations everyone Tropics (RESTful web service interface) SpaceApps Porting build system to Maven2 Morpheus
Architecture and modules Product overview
The big picture Auto-class. A.N.other A.N.other Other CMSs A.N.other A.N.other DB2TM Portlet support OKP TMSync Engine CMSintegration Data  integration Escenic Taxon.import Ontopoly Web service
The engine Core API TMAPI 2.0 support Import/export RDF conversion TMSync Fulltext search Event API tolog query language tolog update language Engine
The backends In-memory no persistent storage thread-safe no setup RDBMS transactions persistent thread-safe uses caching clustering Remote uses web service read-only unofficial Engine Memory RDBMS Remote
DB2TM Upconversion to TMs from RDBMS via JDBC or from CSV Uses XML mapping can call out to Java Supports sync either full rescan or change table TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
TMRAP Web service interface via SOAP via plain HTTP Requests get-topic get-topic-page get-tolog delete-topic ... TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
Navigator framework Servlet-based API manage topic maps load/scan/delete/create JSP tag library XSLT-like based on tolog JSTL integration TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
Automated classification Undocumented experimental Extracts text autodetects format Word, PDF, XML, HTML Processes text detects language stemming, stop-words Extracts keywords ranked by importance uses existing topics supports compound terms TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
Vizigator Viz Ontopoly Graphical visualization VizDesktop Swing app to configure filter/style/... Vizlet Java applet for web uses configuration loads via TMRAP uses “Remote” backend TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
Ontopoly Viz Ontopoly Generic editor web-based, AJAX meta-ontology in TM Ontology designer create types and fields control user interface build views incremental dev Instance editor guided by ontology TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
Typical deployment Viewing application Engine Users DB Backend Ontopoly Frameworks Editors DB TMRAP DB2TM HTTP DB External application Application server
APIs The engine
Core APIs net.ontopia.topicmaps.core.* Fairly direct mapping from TMDM TopicIF AssociationIF TopicMapIF ... Set/get methods reflect TMDM properties
TopicIF Interface, not a class getTopicNames() addTopicName(TopicNameIF) removeTopicName(TopicNameIF) getOccurrences() + add + remove getSubjectIdentifiers() + add + remove getItemIdentifiers() + add + remove getSubjectLocators() + add + remove getRoles() getRolesByType(TopicIF)
Core interfaces TopicMapStoreIF TopicMapIF TopicIF AssociationIF TopicNameIF OccurrenceIF AssociationRoleIF VariantNameIF
How to get a TopicMapIF Create one directly new net...impl.basic.InMemoryTopicMapStore() Load one from file using an importer (next slide) Connect to an RDBMS covered later Use a topic map repository covered later
TopicMapReaderIF
import net.ontopia.topicmaps.core.TopicMapIF;
import net.ontopia.topicmaps.core.TopicMapReaderIF;
import net.ontopia.topicmaps.utils.ImportExportUtils;

public class TopicCounter {
  public static void main(String[] argv) throws Exception {
    // the reader is chosen from the file name given on the command line
    TopicMapReaderIF reader = ImportExportUtils.getReader(argv[0]);
    TopicMapIF tm = reader.read();
    System.out.println("TM contains " + tm.getTopics().size() + " topics");
  }
}

[larsga@c716c5ac1 tmp]$ java TopicCounter ~/data/bilder/privat/metadata.xtm
TM contains 17035 topics
[larsga@c716c5ac1 tmp]$
Supported syntaxes
The utility classes A set of classes outside the core interfaces that perform common tasks a number of these utilities are obsolete now that tolog is here They are all built on top of the core interfaces Some important utilities ImportExportUtils				creates readers and writers MergeUtils						merges topics and topic maps PSI								contains important PSIs DeletionUtils					cascading delete of topics DuplicateSuppressionUtils	removes duplicates TopicStringifiers				find names for topics
Topic Maps repository Uses a set of topic map sources to build a set of topic maps topic maps can be looked up by ID Many kinds of sources scan directory for files matching pattern download from URL connect to RDBMS ... Configurable using an XML file tm-sources.xml Used by Navigator Framework
Event API
Allows clients to receive notification of changes
Must implement TopicMapListenerIF

  static class TestListener extends AbstractTopicMapListener {
    public void objectAdded(TMObjectIF snapshot) {
      System.out.println("Topic added: " + snapshot.getObjectId());
    }
    public void objectModified(TMObjectIF snapshot) {
      System.out.println("Topic modified: " + snapshot.getObjectId());
    }
    public void objectRemoved(TMObjectIF snapshot) {
      System.out.println("Topic removed: " + snapshot.getObjectId());
    }
  }
Using the API
    // register to listen for events
    TestListener listener = new TestListener();
    TopicMapEvents.addTopicListener(ref, listener);
    // get the store through the reference so the listener is registered
    ref.createStore(false);

    // let's add a topic
    System.out.println("Off we go");
    TopicMapBuilderIF builder = tm.getBuilder();
    TopicIF newbie = builder.makeTopic(tm);
    System.out.println("Let's name this topic");
    builder.makeTopicName(newbie, "Newbie topic");

    // then let's remove it
    System.out.println("And now, the exit");
    DeletionUtils.remove(newbie);
    System.out.println("Goodbye, short-lived topic");

[larsga@dhcp-98 tmp]$ java EventTest bkclean.xtm
Off we go
Topic added: 3409
Let's name this topic
Topic modified: 3409
Topic modified: 3409
And now, the exit
Topic removed: 3409
Goodbye, short-lived topic
For more information See the Engine Developer's Guide http://www.ontopia.net/doc/current/doc/engine/devguide.html
Persistence & transactions RDBMS backend
RDBMS backend Stores Topic Maps in an RDBMS generic schema access via JDBC Provides full ACID transactions concurrency ... Supports several databases Oracle, MySQL, PostgreSQL, MS SQL Server, hsql Clustering support
Core API implementation Implements same API as in-memory impl theoretically, a switch requires only config change Lazy loading of objects objects loaded from DB as needed Considerable internal caching for performance reasons Separate objects for separate transactions in order to provide isolation Shared cache between transactions
Configuration A Java property file Specifies database type JDBC URL username + password cache settings clustering settings ...
jdbcspy A built-in SQL profiler Useful for identifying cause of performance issues
tolog The Query Engine
tolog A logic-based query language a mix of Prolog and SQL effectively equivalent to Datalog Two parts queries (data retrieval) updates (data modification) Developed by Ontopia not an ISO standard eventually to be replaced by TMQL
tolog The recommended way to interact with the data API programming is slow and cumbersome tolog queries perform better Available via Java API Web service API Forms interface in Omnigator tolog queries return API objects
Finding all operas by a composer
    Collection<TopicIF> operas = new ArrayList<TopicIF>();
    TopicIF composer = getTopicById("puccini");
    TopicIF composed_by = getTopicById("composed-by");
    TopicIF work = getTopicById("work");
    TopicIF creator = getTopicById("composer");

    // walk every role the composer plays as "composer"
    for (AssociationRoleIF role1 : composer.getRolesByType(creator)) {
      AssociationIF assoc = role1.getAssociation();
      if (assoc.getType() != composed_by)
        continue;

      // collect the players of the "work" role in the same association
      for (AssociationRoleIF role2 : assoc.getRoles()) {
        if (role2.getType() != work)
          continue;
        operas.add(role2.getPlayer());
      }
    }
Finding all operas by a composer
  composed-by(puccini : composer, $O : work)?
  composed-by($C : composer, tosca : work)?
  composed-by($C : composer, $O : work)?
  composed-by(puccini : composer, tosca : work)?
Features Access all aspects of a topic map Generic queries independent of ontology AND, OR, NOT, OPTIONAL Count Sort LIMIT/OFFSET Reusable inference rules
Chaining predicates (AND)
Predicates can be chained
  born-in($PERSON : person, $PLACE : place),
  located-in($PLACE : containee, italy : container)?
The comma between the predicates means AND
This query finds all the people born in Italy
It first builds a two-column table of all born-in associations
Then, those rows where the place is not located-in Italy are removed
(Note that when the PLACE variable is reused above that means that the birthplace and the location must be the same topic in each match)
Any number of predicates can be chained
Their order is insignificant
Actually, the optimizer reorders the predicates
It will start with located-in because it has a topic constant
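The evaluation described on this slide can be sketched in plain Java. This is only an illustration of the join semantics, not how the tolog engine is implemented: the class name, the method names, and the sample associations are all made up for the example, and it follows the naive order from the slide rather than the optimizer's reordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChainedPredicates {
    // Hypothetical born-in associations as {person, place} rows.
    static final List<String[]> BORN_IN = Arrays.asList(
        new String[] {"puccini", "lucca"},
        new String[] {"wagner", "leipzig"});

    // Hypothetical located-in associations as {containee, container} rows.
    static final List<String[]> LOCATED_IN = Arrays.asList(
        new String[] {"lucca", "italy"},
        new String[] {"leipzig", "germany"});

    // True if some located-in row connects the place to the container.
    static boolean locatedIn(String place, String container) {
        for (String[] row : LOCATED_IN) {
            if (row[0].equals(place) && row[1].equals(container)) {
                return true;
            }
        }
        return false;
    }

    // born-in($PERSON, $PLACE), located-in($PLACE, container)?
    // Start from the full born-in table and keep only the rows whose
    // $PLACE value also satisfies the second predicate.
    static List<String> peopleBornIn(String container) {
        List<String> people = new ArrayList<>();
        for (String[] row : BORN_IN) {
            if (locatedIn(row[1], container)) {
                people.add(row[0]);
            }
        }
        return people;
    }

    public static void main(String[] args) {
        System.out.println(peopleBornIn("italy"));
    }
}
```

Reusing `row[1]` in both steps is what the shared $PLACE variable expresses in the query.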
Thinking in predicates
function(arg1, arg2, arg3) -> result
Predicates, however, are in a sense bidirectional, because of the way the pattern matching works
predicate(topic : role1, $VAR : role2)
predicate($VAR : role1, topic : role2)
The order of the roles is, on the other hand, insignificant
predicate(topic : role1, $VAR : role2)
predicate($VAR : role2, topic : role1)
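A small Java sketch may make the "bidirectional" point concrete. It is an illustration only: `match`, the `null`-means-variable convention, and the sample data are assumptions for this example and have nothing to do with the engine's internals. The same matcher answers "which works did this composer write?" and "who composed this work?", depending on which argument is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BidirectionalPredicate {
    // Hypothetical composed-by associations as {composer, work} rows.
    static final List<String[]> COMPOSED_BY = Arrays.asList(
        new String[] {"puccini", "tosca"},
        new String[] {"puccini", "turandot"},
        new String[] {"verdi", "aida"});

    // A null argument plays the part of a tolog variable ($C or $O);
    // a non-null argument is a topic constant that the row must match.
    static List<String[]> match(String composer, String work) {
        List<String[]> rows = new ArrayList<>();
        for (String[] row : COMPOSED_BY) {
            boolean composerOk = composer == null || composer.equals(row[0]);
            boolean workOk = work == null || work.equals(row[1]);
            if (composerOk && workOk) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // composed-by(puccini : composer, $O : work)?
        for (String[] row : match("puccini", null)) {
            System.out.println(row[1]);
        }
        // composed-by($C : composer, tosca : work)?
        for (String[] row : match(null, "tosca")) {
            System.out.println(row[0]);
        }
    }
}
```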
The instance-of predicate
instance-of (instance,class)
NOTE: the order of the arguments is significant
Like players, instance and class may be specified in two ways:
using a variable ($name)
using a topic reference
e.g. instance-of ( $A, city )
instance-of makes use of the superclass-subclass associations in the topic map
this means that composers will be considered musicians, and musicians will be considered persons
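The effect of the superclass-subclass associations can be sketched as a walk up the type hierarchy. This is a simplified illustration, assuming one direct type per topic, one superclass per class, and no cycles; the maps and topic IDs are hypothetical, and the real predicate works on the topic map itself.

```java
import java.util.HashMap;
import java.util.Map;

public class InstanceOfSketch {
    // Hypothetical superclass-subclass associations: subclass -> superclass.
    static final Map<String, String> SUPERCLASS = new HashMap<>();
    // Hypothetical direct types: instance -> class.
    static final Map<String, String> DIRECT_TYPE = new HashMap<>();

    static {
        SUPERCLASS.put("composer", "musician");
        SUPERCLASS.put("musician", "person");
        DIRECT_TYPE.put("puccini", "composer");
    }

    // instance-of(instance, cls): true if cls is the direct type of the
    // instance or any superclass of that type.
    static boolean instanceOf(String instance, String cls) {
        String current = DIRECT_TYPE.get(instance);
        while (current != null) {
            if (current.equals(cls)) {
                return true;
            }
            current = SUPERCLASS.get(current); // climb one level
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(instanceOf("puccini", "person"));
        System.out.println(instanceOf("puccini", "city"));
    }
}
```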
All non-hidden photos
select $PHOTO from
  instance-of($PHOTO, op:Photo),
  not(ph:hide($PHOTO : ph:hidden)),
  not(ph:taken-at($PHOTO : op:Image, $PLACE : op:Place),
      ph:hide($PLACE : ph:hidden)),
  not(ph:taken-during($PHOTO : op:Image, $EVENT : op:Event),
      ph:hide($EVENT : ph:hidden)),
  not(ph:depicted-in($PHOTO : ph:depiction, $PERSON : ph:depicted),
      ph:hide($PERSON : ph:hidden))?
Demo Show running queries in Omnigator Also show query tracing Shakespeare /* #OPTION: optimizer.reorder = false */
tologspy tolog query profiler shares code with jdbcspy
Using the query engine API
1. get a QueryProcessorIF object
2. run a query in the QueryProcessorIF and get a QueryResultIF
3. loop over the results and use them
4. close the result object
5. go back to step 2, or do something else
the API lets you write code without worrying about that, however
the two implementations behave identically
Advanced options
the processor returns a ParsedQueryIF object, which can be executed
parameters can be passed to the query on each execution
It is possible to make declarations and use them across executions
QueryWrapper Designed to make all of this easier QueryWrapper qw = new QueryWrapper(tm); TopicIF topic = qw.queryForTopic(...); List topics = qw.queryForList(...); List<Person> people =  qw.queryForList(..., mapper);
tolog updates Greatly simplifies TM modification Also means you can do modification without API programming useful with RDBMS topic maps useful with TMs in running web servers By performing a sequence of updates, just about any change can be made Potentially allows much more powerful architecture
DELETE
Static form
  delete lmg
Dynamic form
  delete $person from instance-of($person, person)
Delete a value
  delete subject-identifier(topic, "http://ex.org/tst")
MERGE
Static form
  MERGE topic1, topic2
Dynamic form
  MERGE $p1, $p2 FROM
    instance-of($p1, person),
    instance-of($p2, person),
    email($p1, $email),
    email($p2, $email)
INSERT
Static form
  INSERT lmg isa person; - "Lars Marius Garshol" .
Dynamic form
  INSERT
    tmcl:belongs-to-schema(tmcl:container : theschema, tmcl:containee: $c)
  FROM instance-of($c, tmcl:constraint)
INSERT again
  INSERT
    ?y $psi .
    event-in-year(event: $e, year: ?y)
  FROM
    start-date($e, $date),
    str:substring($y, $date, 4),
    str:concat($psi, "http://psi.semagia.com/iso8601", $y)
UPDATE
Static form
  UPDATE value(@3421, "New name")
Dynamic form
  UPDATE value($TN, "Ontopia")
  FROM topic-name(oks, $TN)
More information Look at sample queries in Omnigator tolog tutorial http://www.ontopia.net/doc/current/doc/query/tutorial.html tolog built-in predicate reference http://www.ontopia.net/doc/current/doc/query/predicate-reference.html
Conversion from RDBMS data DB2TM
DB2TM Upconversion of relational data either from CSV files or over JDBC Based on an XML file describing the mapping very highly configurable Support for all of Topic Maps (except variants) value transformations synchronization
Standard use case Pull in data from external source turn it into Topic Maps following some ontology Enrich it usually manually, but not necessarily Resync from source at intervals
DB2TM example Ontopia + = United Nations Bouvet <relation name="organizations.csv" columns="id name url">   <topic type="ex:organization"> 
   <item-identifier>#org${id}</item-identifier> 
   <topic-name>${name}</topic-name> 
   <occurrence type="ex:homepage">${url}</occurrence> 
 </topic> </relation>
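To make the `${column}` references in the mapping concrete, here is a rough Java sketch of expanding such a template against one row of organizations.csv. It only illustrates the substitution idea and is not DB2TM's own code: the `expand` helper and the sample row (including the URL) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColumnTemplate {
    // Matches ${name} and captures the column name.
    private static final Pattern REF = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces every ${column} reference with that column's value in the row.
    static String expand(String template, Map<String, String> row) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No column named " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the data from being
            // interpreted as regex replacement syntax
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // One made-up row with the columns "id name url".
        Map<String, String> row = new HashMap<>();
        row.put("id", "3");
        row.put("name", "Bouvet");
        row.put("url", "http://example.org/bouvet");

        System.out.println(expand("#org${id}", row)); // item identifier
        System.out.println(expand("${name}", row));   // topic name
        System.out.println(expand("${url}", row));    // homepage occurrence
    }
}
```

Because the item identifier is derived from the row's id column, a later row in another relation can refer to the same topic by building the same `#org...` identifier.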
Creating associations
  <relation name="people.csv" columns="id given family employer phone">
    <topic id="employer">
      <item-identifier>#org${employer}</item-identifier>
    </topic>
    <topic type="ex:person">
      <item-identifier>#person${id}</item-identifier>
      <topic-name>${given} ${family}</topic-name>
      <occurrence type="ex:phone">${phone}</occurrence>
      <player atype="ex:employed-by" rtype="ex:employee">
        <other rtype="ex:employer" player="#employer"/>
      </player>
    </topic>
  </relation>
Value transformations
  <relation name="SCHEMATA" columns="SCHEMA_NAME">
    <function-column name='SCHEMA_ID'
                     method='net.ontopia.topicmaps.db2tm.Functions.makePSI'>
      <param>${SCHEMA_NAME}</param>
    </function-column>
    <topic type="mysql:schema">
      <item-identifier>#${SCHEMA_ID}</item-identifier>
      <topic-name>${SCHEMA_NAME}</topic-name>
    </topic>
  </relation>
Running DB2TM java net.ontopia.topicmaps.db2tm.Execute command-line tool also works with RDBMS topic maps net.ontopia.topicmaps.db2tm.DB2TM API class to run transformations methods "add" and "sync"
More information DB2TM User's Guide http://www.ontopia.net/doc/current/doc/db2tm/user-guide.html
Synchronizing with other sources TMSync
TMSync Configurable module for synchronizing one TM against another define subset of source TM to sync (using tolog) define subset of target TM to sync (using tolog) the module handles the rest Can also be used with non-TM sources create a non-updating conversion from the source to some TM format then use TMSync to sync against the converted TM instead of directly against the source
How TMSync works Define which part of the target topic map you want, Define which part of the source topic map it is the master for, and The algorithm does the rest
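The idea of one topic map being the master for a defined subset of another can be sketched as two set differences. This is a deliberately simplified illustration, not TMSync's actual algorithm: statements are modelled as plain strings, and the two input sets stand in for the results of the tolog queries that select the source and target subsets.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class SyncSketch {
    // Statements the source subset has but the target subset lacks.
    static Set<String> toAdd(Set<String> source, Set<String> target) {
        Set<String> result = new TreeSet<>(source);
        result.removeAll(target);
        return result;
    }

    // Statements in the target subset that the source no longer contains.
    static Set<String> toRemove(Set<String> source, Set<String> target) {
        Set<String> result = new TreeSet<>(target);
        result.removeAll(source);
        return result;
    }

    public static void main(String[] args) {
        // Made-up statements for the selected subsets of each topic map.
        Set<String> source = new TreeSet<>(Arrays.asList(
            "unit-a name 'Service Unit A'", "unit-b name 'Service Unit B'"));
        Set<String> target = new TreeSet<>(Arrays.asList(
            "unit-a name 'Service Unit A'", "unit-c name 'Service Unit C'"));

        System.out.println("add:    " + toAdd(source, target));
        System.out.println("remove: " + toRemove(source, target));
    }
}
```

Anything in the target that falls outside the selected subset is never compared, so manual enrichment there survives a resync.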
If the source is not a topic map TMSync convert.xslt Simply do a normal one-time conversion let TMSync do the update for you In other words, TMSync reduces the update problem to a conversion problem source.xml
The City of Bergen use case Norge.no Service Unit Person LOS City of Bergen LOS
Web service interface TMRAP
TMRAP basics Abstract interface that is, independent of any particular technology coarse-grained operations, to reduce network traffic Protocol bindings exist plain HTTP binding SOAP binding Supports many syntaxes XTM 1.0 LTM TM/XML custom tolog result-set syntax
get-topic Retrieves a single topic from the remote server topic map may optionally be specified syntax likewise Main use to build client-side fragments into a bigger topic map to present information about a topic on a different server
get-topic Parameters identifier: a set of URIs (subject identifiers of wanted topic) subject: a set of URIs (subject locators of wanted topic) item: a set of URIs (item identifiers of wanted topic) topicmap: identifier for topic map being queried syntax: string identifying desired Topic Maps syntax in response view: string identifying TM-Views view used to define fragment Response topic map fragment representing topic in requested syntax default is XTM fragment with all URI identifiers, names, occurrences, and associations in default view types and scopes on these constructs are only identified by one <*Ref xlink:href=“...”/> XTM element the same goes for associated topics
get-topic-page Returns link information about a topic that is, where does the server present this topic mainly useful for realizing the portal integration scenario result information contains metadata about server setup
get-topic-page
Parameters
  identifier: a set of URIs (subject identifiers of wanted topic)
  subject: a set of URIs (subject locators of wanted topic)
  item: a set of URIs (item identifiers of wanted topic)
  topicmap: identifier for topic map being queried
  syntax: string identifying desired Topic Maps syntax in response
Response is a topic map fragment
  [oks : tmrap:server = "OKS Samplers local installation"]
  [opera : tmrap:topicmap = "The Italian Opera Topic Map"]
    {opera, tmrap:handle, [[opera.xtm]]}
  tmrap:contained-in(oks : tmrap:container, opera : tmrap:containee)
  tmrap:contained-in(opera : tmrap:container, view : tmrap:containee)
  tmrap:contained-in(opera : tmrap:container, edit : tmrap:containee)
  [view : tmrap:view-page %"http://localhost:8080/omnigator/models/..."]
  [edit : tmrap:edit-page %"http://localhost:8080/ontopoly/enter.ted?..."]
  [russia = "Russia" @"http://www.topicmaps.org/xtm/1.0/country.xtm#RU"]
get-tolog Returns query results main use is to extract larger chunks of the topic map to the client for presentation more flexible than get-topic can achieve more with less network traffic
get-tolog Parameters tolog: tolog query topicmap: identifier for topic map being queried syntax: string identifying desired syntax of response view: string identifying TM-Views view used to define fragment Response if syntax is "tolog", an XML representation of the query result useful if the order of results matters otherwise, a topic map fragment containing multiple topics is returned as for get-topic
add-fragment Adds information to topic map on the server does this by merging in a fragment Parameters fragment: topic map fragment topicmap: identifier for topic map being added to syntax: string identifying syntax of request fragment Result fragment imported into named topic map
update-topic Can be used to update a topic add-fragment only adds information update sets the topic to exactly the uploaded information Parameters topicmap: the topic map to update fragment: fragment containing the new topic syntax: syntax of the uploaded fragment identifier: a set of URIs (subject identifiers of wanted topic) subject: a set of URIs (subject locators of wanted topic) item: a set of URIs (item identifiers of wanted topic) Update happens using TMSync
delete-topic Removes a topic from the server Parameters identifier: a set of URIs (subject identifiers of wanted topic) subject: a set of URIs (subject locators of wanted topic) item: a set of URIs (item identifiers of wanted topic) topicmap: identifier for topic map being queried Result deletes the identified topic includes all names, occurrences, and associations
tolog-update Runs a tolog update statement Parameters topicmap: topic map to update statement: tolog statement to run Runs the statement & commits the change
HTTP binding basics The mapping requires a base URL e.g. http://localhost:8080/tmrap/ This is used to send requests http://localhost:8080/tmrap/method?param1=value1&... GET is used for requests that do not cause state changes POST for requests that do Responses returned in response body
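As a rough sketch of the binding, the Java below builds a request URL from a base URL, a method name, and parameters, and picks the HTTP verb. The class and helper names are invented for this example, and treating every request whose name starts with "get-" as read-only is an assumption derived from the request names in this tutorial, not something a specification is quoted for here. Parameter values are percent-encoded, which turns the "#" in a PSI into %23.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class TmrapRequest {
    // Assumption: get-* requests do not change state, everything else does.
    static String httpVerb(String method) {
        return method.startsWith("get-") ? "GET" : "POST";
    }

    // Builds base + method + "?name=value&..." with encoded values.
    static String url(String base, String method, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base).append(method);
        char separator = '?';
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                sb.append(separator)
                  .append(e.getKey())
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
                separator = '&';
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the Java platform
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("identifier", "http://www.topicmaps.org/xtm/1.0/country.xtm#RU");
        String method = "get-topic";
        System.out.println(httpVerb(method) + " "
            + url("http://localhost:8080/tmrap/", method, params));
    }
}
```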
Exercise #1: Retrieve a topic Use the get-topic request to retrieve a topic from the server base URL is http://localhost:8080/tmrap/ find the identifying URI in Omnigator just print the retrieved fragment to get a look at it Note: you must escape the “#” character in URIs otherwise it is interpreted as the anchor and not transmitted at all escape sequence: %23 Note: you must specify the topic map ID otherwise results will only be returned from loaded topic maps in other words: if the topic map isn’t loaded, you get no results
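Solution #1 (in Python)
import urllib.request

BASE = "http://localhost:8080/tmrap/tmrap/"
# the "#" in the PSI is already escaped as %23
psi = "http://www.topicmaps.org/xtm/1.0/country.xtm%23RU"

inf = urllib.request.urlopen(BASE + "get-topic?identifier=" + psi)
print(inf.read().decode("utf-8"))  # assumes the fragment is UTF-8 encoded
inf.close()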
Solution #1 (response) 
<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"  
          xmlns:xlink="http://www.w3.org/1999/xlink"> 
  <topic id="id458"> 
    <instanceOf> 
      <subjectIndicatorRef xlink:href="http://psi.ontopia.net/geography/#country"/> 
    </instanceOf> 
    <subjectIdentity> 
      <subjectIndicatorRef xlink:href="http://www.topicmaps.org/xtm/1.0/country.xtm#RU"/> 
      <topicRef xlink:href="file:/.../WEB-INF/topicmaps/geography.xtmm#russia"/> 
    </subjectIdentity> 
    <baseName> 
      <baseNameString>Russia</baseNameString> 
    </baseName> 
  </topic>
Processing XTM with XSLT This is possible, but unpleasant the main problem is that the XML is phrased in terms of Topic Maps, not in domain terms this means that all the XPath will talk about “topic”, “association”, ... and not “person”, “works-for” etc The structure is also complicated this makes queries complicated for example, the XPath to traverse an association looks like this: //xtm:association   [xtm:member[xtm:roleSpec / xtm:topicRef / @xlink:href = '#employer']              [xtm:topicRef / @xlink:href = concat('#', $company)]]   [xtm:instanceOf / xtm:topicRef / @xlink:href = '#employed-by']
TM/XML Non-standard XML syntax for Topic Maps defined by Ontopia (presented at TMRA’05) implemented in the OKS XSLT-friendly much easier to process with XSLT than XTM can be understood by developers who do not understand Topic Maps dynamic domain-specific syntaxes instead of generic syntax predictable (can generate XML Schema from TM ontology)
TM/XML example
<topicmap ... reifier="tmtopic">
  <topicmap id="tmtopic">
    <iso:topic-name><tm:value>TM/XML example</tm:value>
    </iso:topic-name>
    <dc:description>An example of the use of TM/XML.</dc:description>
  </topicmap>
  <person id="lmg">
    <iso:topic-name><tm:value>Lars Marius Garshol</tm:value>
      <tm:variant scope="core:sort">garshol, lars marius</tm:variant>
    </iso:topic-name>
    <homepage datatype="http://www.w3.org/2001/XMLSchema#anyURI"
       >http://www.garshol.priv.no</homepage>
    <created-by role="creator" topicref="tmtopic" otherrole="work"/>
    <presentation role="presenter">
      <presented topicref="tmxml"/>
      <event topicref="tmra05"/>
    </presentation>
  </person>
</topicmap>
tmphoto Category Person Photo Event Location http://www.garshol.priv.no/tmphoto/ A topic map to organize my personal photos contains ~15,000 photos A web gallery runs on Ontopia on www.garshol.priv.no
tmtools http://www.garshol.priv.no/tmtools/ Organization An index of Topic Maps tools organized as shown on the right Again, web application for browsing screenshots below Person Software product Platform Category Technology
The person page Boring! No content.
And in tmphoto...
get-illustration A web service in tmphoto receives the PSI of a person then automatically picks a suitable photo of that person Based on vote score for photos, categories (portrait), other people in photo ... The service returns a topic map fragment with links to the person page and a few different sizes of the selected photo http://www.garshol.priv.no/blog/183.html
get-illustration Hmmm. Scores, categories, people in photo, ... Do you have a photo of http://psi.ontopedia.net/Benjamin_Bock ? http://www.garshol.priv.no/tmphoto/get-illustration?identifier=http://psi.on.... tmphoto tmtools Topic map fragment
Voila...
Points to note No hard-wiring of links just add identifiers when creating people topics photos appear automatically if a better photo is added later, it’s replaced automatically No copying of data no duplication, no extra maintenance Very loose binding nothing application-specific Highly extensible once the identifiers are in place we can easily pull in more content from other sources
My blog Has more content about people 			(tmphoto & tmtools), events 			(tmphoto), tools 			(tmtools), technologies 	(tmtools) Should be available in those applications
Solution My blog posts are tagged but the tags are topics, which can have PSIs these PSIs are used in tmphoto and tmtools, too The get-topic-page request lets tmphoto & tmtools ask the blog for links to relevant posts given identifiers for a topic, returns links to pages about that topic http://www.garshol.priv.no/blog/145.html
get-topic-page Do you have pages about http://psi.ontopedia.net/TMRA_2008 ? http://www.garshol.priv.no/blog/get-topic-page?identifier=http://psi.on.... Blog tmphoto Topic map fragment Topics linking to individual blog posts
In tmphoto
Making web applications Navigator Framework
Ontopia Navigator Framework Java API for interacting with TM repository JSP tag library based on tolog kind of like XSLT in JSP with tolog instead of XPath has JSTL integration Undocumented parts web presentation components some wrapped as JSP tags want to build proper portlets from them
How it works (diagram: several browsers each request a JSP page from a web server with a JSP container, e.g. Apache Tomcat; the JSP pages use the tag libraries, which reach the topic map through the Topic Map Engine)
The two tag libraries
makes up nearly the entire framework
used to extract information from topic maps
lets you execute tolog queries to extract information from the topic map
looping and control flow structures
template
used to create template pages
separates layout and structure from content
not Topic Maps-aware
optional, but recommended
collected from the tm-sources.xml configuration file
each topic map has its own id (usually the file name)
Each page also holds a set of variable bindings
each variable holds a collection of objects
objects can be topics, base names, locators, strings, ...
Tags access variables
some tags set the values of variables, while others use them
Tells the page which tag library to include and binds it to a prefix
Prefixes are used to qualify the tags (and avoid name collisions)
Use the <tolog:context> tag around the entire page
The "topicmap" attribute specifies the ID of the current topic map
The first time you access the page in your browser the page gets compiled
If you modify the page then it will be recompiled the next time it is accessed
Navigator tag library example
<%-- assume variable 'composer' is already set --%>
<p><b>Operas:</b><br/>
<tolog:foreach query="composed-by(%composer% : composer, $OPERA : opera),
                      { premiere-date($OPERA, $DATE) }?">
  <li>
    <a href="opera.jsp?id=<tolog:id var="OPERA"/>"
      ><tolog:out var="OPERA"/></a>
    <tolog:if var="DATE">
      <tolog:out var="DATE"/>
    </tolog:if>
  </li>
</tolog:foreach></p>
Elmer Preview
Possible configuration Application directories webapps myApp/ *.jsp omnigator/ WEB-INF/ config/ *.xml i18n/ topicmaps/ *.xtm, *.ltm web.xml
The navigator configuration files
where to find the other files, plus plug-ins
tm-sources.xml
tells the navigator where to find topic maps
log4j.properties
configuration of the log4j logging
More details in the "Configuration Guide" document
... Automated classification
What is automated classification? Create parts of a topic map automatically using the text in existing content as the source not necessarily 100% automatic; user may help out A hard task natural language processing is very complex result is never perfect However, it’s possible to achieve some results
Why automate classification? Creating a topic map requires intellectual effort that is, it requires work by humans Human effort = cost added value must be sufficient to justify the cost in some cases either the cost is too high, or the value added is too limited The purpose of automation is to lower the cost this increases the number of cases where the use of Topic Maps is justified
Automatable tasks Project Person Department Worked on Worked on Jane  Doe worked on employed in XYZ Project IT group Ontology hard depends on requirements one time only Instance data hard usually exists in other sources Document keywords easier frequent operation usually no other sources
Two kinds of categorization Broad: Environment, Crisis management Narrow: Water, Norway, drought, Drought Act, Cloud seeding, Morecambe Bay Broad categorization categories are broadly defined include many different subjects Narrow categorization uses very specific keywords each keyword is a single subject
What it does Extract keywords from content goal is to use these for classification Not entity recognition we only care about identifying what the content is about Uses statistical approach no attempt at full formal parsing of the text
Steps of operation Identify format then, extract the text Identify language then, remove stop words stem remaining words Classify can use terms from preexisting Topic Maps exploits knowledge of the language Return proposed keywords
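A toy version of the statistical part of these steps can be written in a few lines of Java. It is only a sketch under strong assumptions: the stop-word list is a tiny made-up sample, stemming, format and language detection are left out, and the score is simply term frequency scaled so that the top term gets 1.0. The actual classifier's scoring is not described in this tutorial.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class KeywordSketch {
    // A tiny, hypothetical English stop-word list.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "and", "are", "for", "in", "is", "of", "the", "to"));

    // Returns terms ordered by descending score; the top term scores 1.0.
    static Map<String, Double> rank(String text) {
        Map<String, Integer> counts = new HashMap<>();
        int max = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue; // skip split artifacts and stop words
            }
            max = Math.max(max, counts.merge(token, 1, Integer::sum));
        }

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        Map<String, Double> ranked = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            ranked.put(e.getKey(), e.getValue() / (double) max);
        }
        return ranked;
    }

    public static void main(String[] args) {
        System.out.println(rank("Maps of maps and the metadata"));
    }
}
```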
Example of keyword extraction topic maps			1.0 metadata			0.57 subject-based class.	0.42 Core metadata		0.42 faceted classification	0.34 taxonomy			0.22 monolingual thesauri	0.19 controlled vocabulary	0.19 Dublin Core			0.16 thesauri			0.16 Dublin				0.15 keywords			0.15
Example #2 Automated classification		1.0	5 Topic Maps				0.51	14 XSLT					0.38	11 compound keywords		0.29	2 keywords				0.26	20 Lars					0.23	1 Marius					0.23	1 Garshol				0.22	1 ...
So how could this be used? To help users classify new documents in a CMS interface suggest appropriate keywords, screened by user before approval Automate classification of incoming documents this means lower quality, but also lower cost Get an overview of interesting terms in a document corpus classify all documents, extract the most interesting terms this can be used as the starting point for building an ontology (keyword extraction only)
Example user interface The user creates an article this screen then used to add keywords user adjusts the proposals from the classifier
Interfaces java net.ontopia.topicmaps.classify.Chew <topicmapuri> <inputfile> produces textual output only net.ontopia.topicmaps.classify.SimpleClassifier classify(uri, topicmap) -> TermDatabase classify(uri) -> TermDatabase
Supported formats and languages XML (any schema) HTML (non-XML) PDF Word (.doc, .docx) PowerPoint (.ppt, .pptx) Plain text English Norwegian
Visualization of Topic Maps Vizigator
The Vizigator Graphical visualization of Topic Maps Two parts VizDesktop: Swing desktop app for configuration Vizlet: Java applet for web deployment Configuration stored in XTM file
The uses of visualization Not really suitable for navigation doesn't work for all kinds of data Great for seeing the big picture
Without configuration
With configuration
VizDesktop
The Vizigator The Vizigator uses TMRAP the Vizlet runs in the browser (on the client) a fragment of the topic map is downloaded from the server the fragment is grown as needed Server TMRAP
Embedding the Vizlet
  Set up TMRAP service
  Add ontopia-vizlet.jar
  Add necessary HTML

<applet code="net.ontopia.topicmaps.viz.Vizlet.class"
        archive="ontopia-vizlet.jar">
  <param name="tmrap"           value="/omnigator/plugins/viz/">
  <param name="config"          value="/omnigator/plugins/viz/config.jsp?tm=<%= tmid %>">
  <param name="tmid"            value="<%= tmid %>">
  <param name="idtype"          value="<%= idtype %>">
  <param name="idvalue"         value="<%= idvalue %>">
  <param name="propTarget"      value="VizletProp">
  <param name="controlsVisible" value="true">
  <param name="locality"        value="1">
  <param name="max-locality"    value="5">
</applet>
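The `locality` and `max-locality` parameters suggest that the displayed fragment is bounded by a number of association hops from the focus topic. The sketch below shows one plausible reading of that idea as a breadth-first expansion over an in-memory graph. It is not the Vizlet's implementation: the class `FragmentSketch`, the adjacency-map representation, and the interpretation of locality as a hop count are assumptions, and the real Vizlet fetches its data from the server via TMRAP rather than from a local map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FragmentSketch {
    // Returns every node reachable from `start` in at most `locality` hops.
    // `graph` maps a topic id to the ids of the topics it is associated with.
    static Set<String> fragment(Map<String, List<String>> graph, String start, int locality) {
        Set<String> seen = new LinkedHashSet<>();
        seen.add(start);
        List<String> frontier = Collections.singletonList(start);
        for (int hop = 0; hop < locality && !frontier.isEmpty(); hop++) {
            List<String> next = new ArrayList<>();
            for (String node : frontier) {
                for (String neighbour : graph.getOrDefault(node, Collections.emptyList())) {
                    if (seen.add(neighbour)) {
                        next.add(neighbour); // first visit: expand it on the next hop
                    }
                }
            }
            frontier = next;
        }
        return seen;
    }
}
```

Growing the fragment "as needed" then amounts to calling the same expansion again with a larger hop count, or from a newly selected focus topic.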
Topic Maps debugger Omnigator
Omnigator Generic Topic Maps browser very useful for seeing what's in a topic map the second-oldest part of Ontopia Contains other features beyond simple browsing statistics management console merging tolog querying/updates export
Ontology designer and editor Ontopoly
Ontopoly A generic Topic Maps editor, in two parts ontology editor: used to create the ontology and schema instance editor: used to enter instances based on ontology Features works with both XTM files and topic maps stored in RDBMS backend supports access control to administrative functions, ontology, and instance editors existing topic maps can be imported parts of the ontology can be marked as read-only, or hidden
Ontology designer Create ontology based on topic, association, name, occurrence, and role types Supports iterative ontology development modify and prototype the ontology until it's right Supports ontology annotation add fields to topic types, for example Supports views define restricted views of certain topic types
Instance editor Configured by the ontology editor shows topics as defined by the ontology Has several ways to pick associations drop-down list by search from hierarchy Avoids conflicts pages viewed by one user are locked to others
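The conflict-avoidance rule (a page viewed by one user is locked to others) can be pictured as a small lock table keyed by topic. The sketch below is a hypothetical illustration, not Ontopoly's code: the class `LockSketch` and its method names are invented for this example, and it omits the lock expiry a real web editor would presumably need for users who leave without releasing a page.

```java
import java.util.HashMap;
import java.util.Map;

class LockSketch {
    // topic id -> user currently holding the lock
    private final Map<String, String> owners = new HashMap<>();

    // Returns true if `user` now holds (or already held) the lock on `topicId`.
    synchronized boolean tryLock(String topicId, String user) {
        String owner = owners.putIfAbsent(topicId, user);
        return owner == null || owner.equals(user);
    }

    // Releases the lock only if `user` is the current owner.
    synchronized void release(String topicId, String user) {
        owners.remove(topicId, user);
    }
}
```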
Ontopoly is embeddable The Ontopoly instance editor can be embedded basically, the main panel can be inserted into another web application uses an iframe Requires only ID of topic being edited can also be restricted to a specific view Makes it possible to build easier-to-use editors so users don't have to learn all of Ontopoly
Adding content features CMS integrations
CMS integration The best way to add content functionality to Ontopia the world doesn’t need another CMS better to reuse those which already exist So far two integrations exist Escenic OfficeNet Knowledge Portal more are being worked on
Implementation
  A CMS event listener
    the listener creates topics for new CMS articles, folders, etc
    the mapping is basically the design of the ontology used by this listener
  Presentation integration
    it must be possible to list all topics attached to an article
    conversely, it must be possible to list all articles attached to a topic
    how close the integration needs to be here will vary, as will the difficulty of the integration
  User interface integration
    it needs to be possible to attach topics to an article from within the normal CMS user interface
    this can be quite tricky
  Search integration
    the Topic Maps search needs to also search content in the CMS
    can be achieved by writing a tolog plug-in
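As a rough sketch of the first two points, the listener below records a topic for each new article and keeps an index in both directions, so that topics can be listed for an article and articles for a topic. Everything here is hypothetical: `CmsListenerSketch`, its method names, and the in-memory maps stand in for a real CMS event API and a real topic map, neither of which is shown in this tutorial.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class CmsListenerSketch {
    // article id -> ids of the topics attached to it, plus the reverse index
    private final Map<String, Set<String>> topicsByArticle = new HashMap<>();
    private final Map<String, Set<String>> articlesByTopic = new HashMap<>();

    // Would be called by the CMS when an article is created (hypothetical callback).
    void articleCreated(String articleId) {
        topicsByArticle.putIfAbsent(articleId, new TreeSet<>());
    }

    // Attaches a topic to an article and updates both indexes.
    void attach(String articleId, String topicId) {
        topicsByArticle.computeIfAbsent(articleId, k -> new TreeSet<>()).add(topicId);
        articlesByTopic.computeIfAbsent(topicId, k -> new TreeSet<>()).add(articleId);
    }

    // Presentation integration, direction 1: topics attached to an article.
    Set<String> topicsFor(String articleId) {
        return topicsByArticle.getOrDefault(articleId, Collections.emptySet());
    }

    // Presentation integration, direction 2: articles attached to a topic.
    Set<String> articlesFor(String topicId) {
        return articlesByTopic.getOrDefault(topicId, Collections.emptySet());
    }
}
```

In a real integration the two maps would be replaced by associations in the topic map, which can be traversed in either direction.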
Articles as topics is about Elections New city council appointed Goal: associate articles with topics mainly to say what they are about typically also want to include other metadata Need to create topics for the articles to do this in fact, a general CMS-to-TM mapping is needed must decide what metadata and structures to include
Mapping issues
  Article topics
    what topic type to use?
    title becomes name? (do you know the title?)
    include author? include last modified? include workflow state?
    should all articles be mapped?
  Folders/directories/sections/...
    should these be mapped, too?
    one topic type for all folders/.../.../...?
    if so, use associations to connect articles to folders
    use associations to reproduce hierarchical folder structure
  Multimedia objects
    should these be included?
    what topic type? what name? ...
Two styles of mappings
  Articles as articles
    Topic represents only the article
    Topic type is some subclass of “article”
    “Is about” association connects article into topic map
    Fields are presentational
      title, abstract, body
  Articles as concepts
    Topic represents some real-world subject (like a person)
      article is just the default content about that subject
    Type is the type of the subject (person)
    Semantic associations to the rest of the topic map
      works in department, has competence, ...
    Fields can be semantic
      name, phone no, email, ...
Article as article Article about building of a new school Is about association to “Primary schools” Topic type is “article”
Article as concept Article about a sports hall Article really represents the hall Topic type is “Location” Associations to events in the location, category “Sports”
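The difference between the two mapping styles shows up mainly in the topic type and in the kind of associations the article topic carries. The sketch below restates the two examples as data. The `Topic` class, the topic names, and the association type names `is-about` and `in-category` are assumptions made for illustration and are not part of any Ontopia ontology.

```java
import java.util.ArrayList;
import java.util.List;

class MappingStylesSketch {
    // Minimal stand-in for a topic: a type, a name, and typed associations.
    static class Topic {
        final String type;
        final String name;
        // Each entry is {association type, name of the other topic}.
        final List<String[]> associations = new ArrayList<>();

        Topic(String type, String name) {
            this.type = type;
            this.name = name;
        }

        Topic associate(String associationType, String otherTopic) {
            associations.add(new String[] {associationType, otherTopic});
            return this;
        }
    }

    // Article as article: the topic stands for the text itself.
    static Topic articleAsArticle() {
        return new Topic("article", "Building of a new school")
            .associate("is-about", "Primary schools");
    }

    // Article as concept: the topic stands for the real-world sports hall.
    static Topic articleAsConcept() {
        return new Topic("Location", "Sports hall")
            .associate("in-category", "Sports");
    }
}
```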
Ontopia/Liferay An integration with the Liferay CMS and portal is in progress presented Friday 1130-1150 in Schiller 2
Two projects Real-life usage
The project A new citizen’s portal for the city administration strategic decision to make portal main interface for interaction with citizens as many services as possible are to be moved online Big project started in late 2004, to continue at least into 2008 ~5 million Euro spent by launch date 1.7 million Euro budgeted for 2007 Topic Maps development is a fraction of this (less than 25%) Many companies involved Bouvet/Ontopia Avenir KPMG Karabin Escenic
Simplified original ontology Service catalog Escenic (CMS) LOS Form Article nearly everything Category Service Subject Department Borough External resource Employee Payroll++

  • 14. Automated classification Undocumented experimental Extracts text autodetects format Word, PDF, XML, HTML Processes text detects language stemming, stop-words Extracts keywords ranked by importance uses existing topics supports compound terms TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
  • 15. Vizigator Viz Ontopoly Graphical visualization VizDesktop Swing app to configure filter/style/... Vizlet Java applet for web uses configuration loads via TMRAP uses “Remote” backend TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
  • 16. Ontopoly Viz Ontopoly Generic editor web-based, AJAX meta-ontology in TM Ontology designer create types and fields control user interface build views incremental dev Instance editor guided by ontology TMRAP Nav DB2TM Classify Engine Memory RDBMS Remote
  • 17. Typical deployment Viewing application Engine Users DB Backend Ontopoly Frameworks Editors DB TMRAP DB2TM HTTP DB External application Application server
  • 19. Core APIs net.ontopia.topicmaps.core.* Fairly direct mapping from TMDM TopicIF AssociationIF TopicMapIF ... Set/get methods reflect TMDM properties
  • 20. TopicIF Interface, not a class getTopicNames() addTopicName(TopicNameIF) removeTopicName(TopicNameIF) getOccurrences() + add + remove getSubjectIdentifiers() + add + remove getItemIdentifiers() + add + remove getSubjectLocators() + add + remove getRoles() getRolesByType(TopicIF)
  • 21. Core interfaces TopicMapStoreIF TopicMapIF TopicIF AssociationIF TopicNameIF OccurrenceIF AssociationRoleIF VariantNameIF
  • 22. How to get a TopicMapIF Create one directly new net...impl.basic.InMemoryTopicMapStore() Load one from file using an importer (next slide) Connect to an RDBMS covered later Use a topic map repository covered later
  • 23. TopicMapReaderIF
      import net.ontopia.topicmaps.core.TopicMapIF;
      import net.ontopia.topicmaps.core.TopicMapReaderIF;
      import net.ontopia.topicmaps.utils.ImportExportUtils;

      public class TopicCounter {
        public static void main(String[] argv) throws Exception {
          TopicMapReaderIF reader = ImportExportUtils.getReader(argv[0]);
          TopicMapIF tm = reader.read();
          System.out.println("TM contains " + tm.getTopics().size() + " topics");
        }
      }

      [larsga@c716c5ac1 tmp]$ java TopicCounter ~/data/bilder/privat/metadata.xtm
      TM contains 17035 topics
      [larsga@c716c5ac1 tmp]$
  • 25. The utility classes A set of classes outside the core interfaces that perform common tasks a number of these utilities are obsolete now that tolog is here They are all built on top of the core interfaces Some important utilities ImportExportUtils creates readers and writers MergeUtils merges topics and topic maps PSI contains important PSIs DeletionUtils cascading delete of topics DuplicateSuppressionUtils removes duplicates TopicStringifiers find names for topics
  • 26. Topic Maps repository Uses a set of topic map sources to build a set of topic maps topic maps can be looked up by ID Many kinds of sources scan directory for files matching pattern download from URL connect to RDBMS ... Configurable using an XML file tm-sources.xml Used by Navigator Framework
  • 27. Event API Allows clients to receive notification of changes Must implement TopicMapListenerIF
      static class TestListener extends AbstractTopicMapListener {
        public void objectAdded(TMObjectIF snapshot) {
          System.out.println("Topic added: " + snapshot.getObjectId());
        }
        public void objectModified(TMObjectIF snapshot) {
          System.out.println("Topic modified: " + snapshot.getObjectId());
        }
        public void objectRemoved(TMObjectIF snapshot) {
          System.out.println("Topic removed: " + snapshot.getObjectId());
        }
      }
  • 28. Using the API
      // register to listen for events
      TestListener listener = new TestListener();
      TopicMapEvents.addTopicListener(ref, listener);

      // get the store through the reference so the listener is registered
      ref.createStore(false);

      // let's add a topic
      System.out.println("Off we go");
      TopicMapBuilderIF builder = tm.getBuilder();
      TopicIF newbie = builder.makeTopic(tm);
      System.out.println("Let's name this topic");
      builder.makeTopicName(newbie, "Newbie topic");

      // then let's remove it
      System.out.println("And now, the exit");
      DeletionUtils.remove(newbie);
      System.out.println("Goodbye, short-lived topic");

      [larsga@dhcp-98 tmp]$ java EventTest bkclean.xtm
      Off we go
      Topic added: 3409
      Let's name this topic
      Topic modified: 3409
      Topic modified: 3409
      And now, the exit
      Topic removed: 3409
      Goodbye, short-lived topic
  • 29. For more information See the Engine Developer's Guide http://www.ontopia.net/doc/current/doc/engine/devguide.html
  • 30. Persistence & transactions RDBMS backend
  • 31. RDBMS backend Stores Topic Maps in an RDBMS generic schema access via JDBC Provides full ACID transactions concurrency ... Supports several databases Oracle, MySQL, PostgreSQL, MS SQL Server, hsql Clustering support
  • 32. Core API implementation Implements same API as in-memory impl theoretically, a switch requires only config change Lazy loading of objects objects loaded from DB as needed Considerable internal caching for performance reasons Separate objects for separate transactions in order to provide isolation Shared cache between transactions
  • 33. Configuration A Java property file Specifies database type JDBC URL username + password cache settings clustering settings ...
  • 34. jdbcspy A built-in SQL profiler Useful for identifying cause of performance issues
  • 35. tolog The Query Engine
  • 36. tolog A logic-based query language a mix of Prolog and SQL effectively equivalent to Datalog Two parts queries (data retrieval) updates (data modification) Developed by Ontopia not an ISO standard eventually to be replaced by TMQL
  • 37. tolog The recommended way to interact with the data API programming is slow and cumbersome tolog queries perform better Available via Java API Web service API Forms interface in Omnigator tolog queries return API objects
  • 38. Finding all operas by a composer
      Collection operas = new ArrayList();
      TopicIF composer = getTopicById("puccini");
      TopicIF composed_by = getTopicById("composed-by");
      TopicIF work = getTopicById("work");
      TopicIF creator = getTopicById("composer");

      for (AssociationRoleIF role1 : composer.getRolesByType(creator)) {
        AssociationIF assoc = role1.getAssociation();
        if (assoc.getType() != composed_by)
          continue;
        for (AssociationRoleIF role2 : assoc.getRoles()) {
          if (role2.getType() != work)
            continue;
          operas.add(role2.getPlayer());
        }
      }
  • 39. Finding all operas by a composer
      composed-by(puccini : composer, $O : work)?
      composed-by($C : composer, tosca : work)?
      composed-by($C : composer, $O : work)?
      composed-by(puccini : composer, tosca : work)?
  • 40. Features Access all aspects of a topic map Generic queries independent of ontology AND, OR, NOT, OPTIONAL Count Sort LIMIT/OFFSET Reusable inference rules
  • 41. Chaining predicates (AND) Predicates can be chained:
      born-in($PERSON : person, $PLACE : place), located-in($PLACE : containee, italy : container)?
    The comma between the predicates means AND. This query finds all the people born in Italy. It first builds a two-column table of all born-in associations. Then, those rows where the place is not located-in Italy are removed. (Note that when the PLACE variable is reused above, that means that the birthplace and the location must be the same topic in each match.) Any number of predicates can be chained. Their order is insignificant. Actually, the optimizer reorders the predicates: it will start with located-in because it has a topic constant.
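The evaluation described on this slide can be imitated in a few lines of Python. This is only a sketch of the semantics, not how the tolog engine is implemented; the two tables and their contents are invented for illustration.

```python
# Toy data, invented for illustration: (person, place) and (containee, container) pairs.
born_in = [("puccini", "lucca"), ("verdi", "le-roncole"), ("wagner", "leipzig")]
located_in = [("lucca", "italy"), ("le-roncole", "italy"), ("leipzig", "germany")]


def people_born_in(country):
    """Mimic born-in($PERSON, $PLACE), located-in($PLACE, country)."""
    # Start from the predicate that has a constant, as the slide says the optimizer does.
    places = {place for place, container in located_in if container == country}
    # Reusing $PLACE means both predicates must bind it to the same topic,
    # so only born-in rows whose place survived the first step are kept.
    return [(person, place) for person, place in born_in if place in places]


print(people_born_in("italy"))
```

Swapping the two steps gives the same rows, which is why the order of the predicates in the query does not matter.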
  • 44. Predicates, however, are in a sense bidirectional, because of the way the pattern matching works
  • 45. predicate(topic : role1, $VAR : role2)
  • 46. predicate($VAR : role1, topic : role2)
  • 47. The order of the roles is, on the other hand, insignificant
  • 48. predicate(topic : role1, $VAR : role2)
  • 52. NOTE: the order of the arguments is significant
  • 53. Like players, instance and class may be specified in two ways:
  • 55. using a topic reference
  • 56. e.g. instance-of ( $A, city )
  • 57. instance-of makes use of the superclass-subclass associations in the topic map
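To make the point about superclass-subclass associations concrete, here is a small Python sketch of an instance-of check that walks up the class hierarchy. The ontology is hypothetical and each class is assumed to have at most one superclass; it illustrates the idea only and is not the engine's code.

```python
# Hypothetical ontology: subclass -> superclass, and topic -> direct type.
superclass_of = {"capital": "city", "city": "place"}
direct_type = {"oslo": "capital", "bergen": "city", "norway": "country"}


def instance_of(topic, cls):
    """True if topic is an instance of cls, directly or through a superclass."""
    current = direct_type.get(topic)
    seen = set()  # guards against cycles in the hierarchy
    while current is not None and current not in seen:
        if current == cls:
            return True
        seen.add(current)
        current = superclass_of.get(current)
    return False


print(instance_of("oslo", "city"))
```

With this data, a query like instance-of($A, city) would match both bergen (a city) and oslo (a capital, which is a subclass of city).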
  • 59. All non-hidden photos
      select $PHOTO from
        instance-of($PHOTO, op:Photo),
        not(ph:hide($PHOTO : ph:hidden)),
        not(ph:taken-at($PHOTO : op:Image, $PLACE : op:Place),
            ph:hide($PLACE : ph:hidden)),
        not(ph:taken-during($PHOTO : op:Image, $EVENT : op:Event),
            ph:hide($EVENT : ph:hidden)),
        not(ph:depicted-in($PHOTO : ph:depiction, $PERSON : ph:depicted),
            ph:hide($PERSON : ph:hidden))?
  • 60. Demo Show running queries in Omnigator Also show query tracing Shakespeare /* #OPTION: optimizer.reorder = false */
  • 61. tologspy tolog query profiler shares code with jdbcspy
  • 63. the API lets you write code without worrying about that, however
  • 66. the processor returns a ParsedQueryIF object, which can be executed
  • 67. parameters can be passed to the query on each execution
  • 69. QueryWrapper Designed to make all of this easier
      QueryWrapper qw = new QueryWrapper(tm);
      TopicIF topic = qw.queryForTopic(...);
      List topics = qw.queryForList(...);
      List<Person> people = qw.queryForList(..., mapper);
  • 70. tolog updates Greatly simplifies TM modification Also means you can do modification without API programming useful with RDBMS topic maps useful with TMs in running web servers By performing a sequence of updates, just about any change can be made Potentially allows much more powerful architecture
  • 71. DELETE
      Static form:    delete lmg
      Dynamic form:   delete $person from instance-of($person, person)
      Delete a value: delete subject-identifier(topic, "http://ex.org/tst")
  • 72. MERGE
      Static form:  MERGE topic1, topic2
      Dynamic form: MERGE $p1, $p2 FROM
                      instance-of($p1, person), instance-of($p2, person),
                      email($p1, $email), email($p2, $email)
  • 73. INSERT
      Static form:  INSERT lmg isa person; - "Lars Marius Garshol" .
      Dynamic form: INSERT tmcl:belongs-to-schema(tmcl:container : theschema, tmcl:containee: $c)
                    FROM instance-of($c, tmcl:constraint)
  • 74. INSERT again
      INSERT ?y $psi . event-in-year(event: $e, year: ?y)
      FROM start-date($e, $date),
           str:substring($y, $date, 4),
           str:concat($psi, "http://psi.semagia.com/iso8601", $y)
  • 75. UPDATE
      Static form:  UPDATE value(@3421, "New name")
      Dynamic form: UPDATE value($TN, "Ontopia") FROM topic-name(oks, $TN)
  • 76. More information Look at sample queries in Omnigator tolog tutorial http://www.ontopia.net/doc/current/doc/query/tutorial.html tolog built-in predicate reference http://www.ontopia.net/doc/current/doc/query/predicate-reference.html
  • 77. Conversion from RDBMS data DB2TM
  • 78. DB2TM Upconversion of relational data either from CSV files or over JDBC Based on an XML file describing the mapping very highly configurable Support for all of Topic Maps (except variants) value transformations synchronization
  • 79. Standard use case Pull in data from external source turn it into Topic Maps following some ontology Enrich it usually manually, but not necessarily Resync from source at intervals
  • 80. DB2TM example Ontopia + = United Nations Bouvet
      <relation name="organizations.csv" columns="id name url">
        <topic type="ex:organization">
          <item-identifier>#org${id}</item-identifier>
          <topic-name>${name}</topic-name>
          <occurrence type="ex:homepage">${url}</occurrence>
        </topic>
      </relation>
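What the mapping above does to each row can be sketched in plain Python. This is an illustration of the idea only: it does not use DB2TM, the sample CSV contents and example.org URLs are made up, and topics are represented as plain dictionaries.

```python
import csv
import io

# Hypothetical contents for organizations.csv; the columns follow the mapping above.
SAMPLE = "1,Ontopia,http://example.org/ontopia\n2,Bouvet,http://example.org/bouvet\n"


def rows_to_topics(text, columns=("id", "name", "url")):
    """Mimic the <relation> mapping: one ex:organization topic per CSV row."""
    topics = []
    for row in csv.reader(io.StringIO(text)):
        rec = dict(zip(columns, row))
        topics.append({
            "type": "ex:organization",
            "item-identifier": "#org" + rec["id"],   # #org${id}
            "topic-name": rec["name"],               # ${name}
            "occurrences": {"ex:homepage": rec["url"]},
        })
    return topics


for topic in rows_to_topics(SAMPLE):
    print(topic["item-identifier"], topic["topic-name"])
```

Because the item identifier is derived from the row's id column, a later run over the same file produces the same identifiers, which is what makes re-synchronization possible.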
  • 81. Creating associations
      <relation name="people.csv" columns="id given family employer phone">
        <topic id="employer">
          <item-identifier>#org${employer}</item-identifier>
        </topic>
        <topic type="ex:person">
          <item-identifier>#person${id}</item-identifier>
          <topic-name>${given} ${family}</topic-name>
          <occurrence type="ex:phone">${phone}</occurrence>
          <player atype="ex:employed-by" rtype="ex:employee">
            <other rtype="ex:employer" player="#employer"/>
          </player>
        </topic>
      </relation>
  • 82. Value transformations
      <relation name="SCHEMATA" columns="SCHEMA_NAME">
        <function-column name='SCHEMA_ID'
                         method='net.ontopia.topicmaps.db2tm.Functions.makePSI'>
          <param>${SCHEMA_NAME}</param>
        </function-column>
        <topic type="mysql:schema">
          <item-identifier>#${SCHEMA_ID}</item-identifier>
          <topic-name>${SCHEMA_NAME}</topic-name>
        </topic>
      </relation>
  • 83. Running DB2TM java net.ontopia.topicmaps.db2tm.Execute command-line tool also works with RDBMS topic maps net.ontopia.topicmaps.db2tm.DB2TM API class to run transformations methods "add" and "sync"
  • 84. More information DB2TM User's Guide http://www.ontopia.net/doc/current/doc/db2tm/user-guide.html
  • 85. Synchronizing with other sources TMSync
  • 86. TMSync Configurable module for synchronizing one TM against another define subset of source TM to sync (using tolog) define subset of target TM to sync (using tolog) the module handles the rest Can also be used with non-TM sources create a non-updating conversion from the source to some TM format then use TMSync to sync against the converted TM instead of directly against the source
  • 87. How TMSync works Define which part of the target topic map you want, Define which part of the source topic map it is the master for, and The algorithm does the rest
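One way to picture "the algorithm does the rest" is as a set difference between the selected part of the source and the selected part of the target. The sketch below is an assumption about the general idea, not the TMSync implementation; statements are modelled as plain tuples and the data is invented.

```python
def sync(source_subset, target_subset):
    """Return (to_add, to_remove) that make the target subset equal the source subset.

    Both arguments are sets of hashable statements, e.g. (topic, property, value).
    """
    to_add = source_subset - target_subset
    to_remove = target_subset - source_subset
    return to_add, to_remove


# Invented example: the source is the master for lmg's name and employer.
source = {("lmg", "name", "Lars Marius Garshol"), ("lmg", "employer", "bouvet")}
target = {("lmg", "name", "Lars Marius Garshol"), ("lmg", "employer", "ontopia")}

to_add, to_remove = sync(source, target)
target = (target - to_remove) | to_add
print(sorted(to_add), sorted(to_remove))
```

Anything in the target topic map outside the selected subset is left untouched, which is why manual enrichment can survive a resync.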
  • 88. If the source is not a topic map TMSync convert.xslt Simply do a normal one-time conversion let TMSync do the update for you In other words, TMSync reduces the update problem to a conversion problem source.xml
  • 89. The City of Bergen use case Norge.no Service Unit Person LOS City of Bergen LOS
  • 91. TMRAP basics Abstract interface that is, independent of any particular technology coarse-grained operations, to reduce network traffic Protocol bindings exist plain HTTP binding SOAP binding Supports many syntaxes XTM 1.0 LTM TM/XML custom tolog result-set syntax
  • 92. get-topic Retrieves a single topic from the remote server topic map may optionally be specified syntax likewise Main use to build client-side fragments into a bigger topic map to present information about a topic on a different server
  • 93. get-topic Parameters identifier: a set of URIs (subject identifiers of wanted topic) subject: a set of URIs (subject locators of wanted topic) item: a set of URIs (item identifiers of wanted topic) topicmap: identifier for topic map being queried syntax: string identifying desired Topic Maps syntax in response view: string identifying TM-Views view used to define fragment Response topic map fragment representing topic in requested syntax default is XTM fragment with all URI identifiers, names, occurrences, and associations in default view types and scopes on these constructs are only identified by one <*Ref xlink:href="..."/> XTM element the same goes for associated topics
  • 94. get-topic-page Returns link information about a topic that is, where does the server present this topic mainly useful for realizing the portal integration scenario result information contains metadata about server setup
  • 95. get-topic-page Parameters identifier: a set of URIs (subject identifiers of wanted topic) subject: a set of URIs (subject locators of wanted topic) item: a set of URIs (item identifiers of wanted topic) topicmap: identifier for topic map being queried syntax: string identifying desired Topic Maps syntax in response Response is a topic map fragment
      [oks : tmrap:server = "OKS Samplers local installation"]
      [opera : tmrap:topicmap = "The Italian Opera Topic Map"]
      {opera, tmrap:handle, [[opera.xtm]]}
      tmrap:contained-in(oks : tmrap:container, opera : tmrap:containee)
      tmrap:contained-in(opera : tmrap:container, view : tmrap:containee)
      tmrap:contained-in(opera : tmrap:container, edit : tmrap:containee)
      [view : tmrap:view-page %"http://localhost:8080/omnigator/models/..."]
      [edit : tmrap:edit-page %"http://localhost:8080/ontopoly/enter.ted?..."]
      [russia = "Russia" @"http://www.topicmaps.org/xtm/1.0/country.xtm#RU"]
  • 96. get-tolog Returns query results main use is to extract larger chunks of the topic map to the client for presentation more flexible than get-topic can achieve more with less network traffic
  • 97. get-tolog Parameters tolog: tolog query topicmap: identifier for topic map being queried syntax: string identifying desired syntax of response view: string identifying TM-Views view used to define fragment Response if syntax is “tolog” an XML representation of the query result useful if order of results matters otherwise, a topic map fragment containing multiple topics is returned as for get-topic
  • 98. add-fragment Adds information to topic map on the server does this by merging in a fragment Parameters fragment: topic map fragment topicmap: identifier for topic map being added to syntax: string identifying syntax of request fragment Result fragment imported into named topic map
  • 99. update-topic Can be used to update a topic add-fragment only adds information update sets the topic to exactly the uploaded information Parameters topicmap: the topic map to update fragment: fragment containing the new topic syntax: syntax of the uploaded fragment identifier: a set of URIs (subject identifiers of wanted topic) subject: a set of URIs (subject locators of wanted topic) item: a set of URIs (item identifiers of wanted topic) Update happens using TMSync
  • 100. delete-topic Removes a topic from the server Parameters identifier: a set of URIs (subject identifiers of wanted topic) subject: a set of URIs (subject locators of wanted topic) item: a set of URIs (item identifiers of wanted topic) topicmap: identifier for topic map being queried Result deletes the identified topic includes all names, occurrences, and associations
  • 101. tolog-update Runs a tolog update statement Parameters topicmap: topic map to update statement: tolog statement to run Runs the statement & commits the change
  • 102. HTTP binding basics The mapping requires a base URL e.g. http://localhost:8080/tmrap/ This is used to send requests http://localhost:8080/tmrap/method?param1=value1&... GET is used for requests that do not cause state changes POST for requests that do Responses returned in response body
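The URL scheme on this slide can be sketched as a small helper that picks the HTTP verb and builds the request URL. The grouping of requests into read-only and state-changing is inferred from the request descriptions in this deck, and the function name build_request is made up for this example; nothing is sent over the network.

```python
from urllib.parse import urlencode

# Request names come from the slides; the split below is an assumption based on
# whether each request changes state on the server.
READ_ONLY = {"get-topic", "get-topic-page", "get-tolog"}
STATE_CHANGING = {"add-fragment", "update-topic", "delete-topic", "tolog-update"}


def build_request(base, method, **params):
    """Return (http_verb, url) for a TMRAP request over the plain HTTP binding."""
    if method in READ_ONLY:
        verb = "GET"
    elif method in STATE_CHANGING:
        verb = "POST"
    else:
        raise ValueError("unknown TMRAP request: " + method)
    # urlencode percent-escapes the values, so '#' becomes %23.
    return verb, base + method + "?" + urlencode(params)


verb, url = build_request(
    "http://localhost:8080/tmrap/",
    "get-topic",
    identifier="http://www.topicmaps.org/xtm/1.0/country.xtm#RU",
)
print(verb, url)
```

Letting a library do the escaping avoids the pitfall of an unescaped "#", which a client would treat as a fragment marker and never transmit.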
  • 103. Exercise #1: Retrieve a topic Use the get-topic request to retrieve a topic from the server base URL is http://localhost:8080/tmrap/ find the identifying URI in Omnigator just print the retrieved fragment to get a look at it Note: you must escape the “#” character in URIs otherwise it is interpreted as the anchor and not transmitted at all escape sequence: %23 Note: you must specify the topic map ID otherwise results will only be returned from loaded topic maps in other words: if the topic map isn’t loaded, you get no results
  • 104. Solution #1 (in Python)
      import urllib.request

      BASE = "http://localhost:8080/tmrap/tmrap/"
      # "#" is escaped as %23 so that it is sent to the server
      psi = "http://www.topicmaps.org/xtm/1.0/country.xtm%23RU"

      inf = urllib.request.urlopen(BASE + "get-topic?identifier=" + psi)
      print(inf.read().decode("utf-8"))
      inf.close()
  • 105. Solution #1 (response)
      <topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
                xmlns:xlink="http://www.w3.org/1999/xlink">
        <topic id="id458">
          <instanceOf>
            <subjectIndicatorRef xlink:href="http://psi.ontopia.net/geography/#country"/>
          </instanceOf>
          <subjectIdentity>
            <subjectIndicatorRef xlink:href="http://www.topicmaps.org/xtm/1.0/country.xtm#RU"/>
            <topicRef xlink:href="file:/.../WEB-INF/topicmaps/geography.xtmm#russia"/>
          </subjectIdentity>
          <baseName>
            <baseNameString>Russia</baseNameString>
          </baseName>
        </topic>
      </topicMap>
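Instead of just printing the fragment, a client can pull out the parts it needs with Python's standard XML parser. The sketch below works on a trimmed copy of the response above; the function name summarize is made up for this example.

```python
import xml.etree.ElementTree as ET

XTM = "{http://www.topicmaps.org/xtm/1.0/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Trimmed copy of the response above.
FRAGMENT = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
          xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="id458">
    <subjectIdentity>
      <subjectIndicatorRef
          xlink:href="http://www.topicmaps.org/xtm/1.0/country.xtm#RU"/>
    </subjectIdentity>
    <baseName><baseNameString>Russia</baseNameString></baseName>
  </topic>
</topicMap>"""


def summarize(xtm_text):
    """Return (topic id, names, subject identifiers) for each topic in the fragment."""
    root = ET.fromstring(xtm_text)
    result = []
    for topic in root.iter(XTM + "topic"):
        names = [n.text for n in topic.iter(XTM + "baseNameString")]
        path = XTM + "subjectIdentity/" + XTM + "subjectIndicatorRef"
        psis = [ref.get(XLINK_HREF) for ref in topic.findall(path)]
        result.append((topic.get("id"), names, psis))
    return result


print(summarize(FRAGMENT))
```

Note that every element name has to be namespace-qualified and the code speaks only of topics and base names, never of countries, which is the generic-syntax awkwardness the following slides discuss.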
  • 106. Processing XTM with XSLT This is possible, but unpleasant the main problem is that the XML is phrased in terms of Topic Maps, not in domain terms this means that all the XPath will talk about “topic”, “association”, ... and not “person”, “works-for” etc The structure is also complicated this makes queries complicated for example, the XPath to traverse an association looks like this: //xtm:association [xtm:member[xtm:roleSpec / xtm:topicRef / @xlink:href = '#employer'] [xtm:topicRef / @xlink:href = concat('#', $company)]] [xtm:instanceOf / xtm:topicRef / @xlink:href = '#employed-by']
  • 107. TM/XML Non-standard XML syntax for Topic Maps defined by Ontopia (presented at TMRA’05) implemented in the OKS XSLT-friendly much easier to process with XSLT than XTM can be understood by developers who do not understand Topic Maps dynamic domain-specific syntaxes instead of generic syntax predictable (can generate XML Schema from TM ontology)
  • 108. TM/XML example <topicmap ... reifier="tmtopic"> <topicmap id="tmtopic"> <iso:topic-name><tm:value>TM/XML example</tm:value> </iso:topic-name> <dc:description>An example of the use of TM/XML.</dc:description> </topicmap> <person id="lmg"> <iso:topic-name><tm:value>Lars Marius Garshol</tm:value> <tm:variant scope="core:sort">garshol, lars marius</tm:variant> </iso:topic-name> <homepage datatype="http://www.w3.org/2001/XMLSchema#anyURI" >http://www.garshol.priv.no</homepage> <created-by role="creator" topicref="tmtopic" otherrole="work"/> <presentation role="presenter"> <presented topicref="tmxml"/> <event topicref="tmra05"/> </presentation> </person> </topicmap>
  • 109. tmphoto Category Person Photo Event Location http://www.garshol.priv.no/tmphoto/ A topic map to organize my personal photos contains ~15,000 photos A web gallery runs on Ontopia on www.garshol.priv.no
  • 110. tmtools http://www.garshol.priv.no/tmtools/ Organization An index of Topic Maps tools organized as shown on the right Again, web application for browsing screenshots below Person Software product Platform Category Technology
  • 111. The person page Boring! No content.
  • 113. get-illustration A web service in tmphoto receives the PSI of a person then automatically picks a suitable photo of that person Based on vote score for photos, categories (portrait), other people in photo ... The service returns a topic map fragment with links to the person page and a few different sizes of the selected photo http://www.garshol.priv.no/blog/183.html
  • 114. get-illustration Hmmm. Scores, categories, people in photo, ... Do you have a photo of http://psi.ontopedia.net/Benjamin_Bock ? http://www.garshol.priv.no/tmphoto/get-illustration?identifier=http://psi.on.... tmphoto tmtools Topic map fragment
  • 116. Points to note No hard-wiring of links just add identifiers when creating people topics photos appear automatically if a better photo is added later, it’s replaced automatically No copying of data no duplication, no extra maintenance Very loose binding nothing application-specific Highly extensible once the identifiers are in place we can easily pull in more content from other sources
  • 117. My blog Has more content about people (tmphoto & tmtools), events (tmphoto), tools (tmtools), technologies (tmtools) Should be available in those applications
  • 118. Solution My blog posts are tagged but the tags are topics, which can have PSIs these PSIs are used in tmphoto and tmtools, too The get-topic-page request lets tmphoto & tmtools ask the blog for links to relevant posts given identifiers for a topic, returns links to pages about that topic http://www.garshol.priv.no/blog/145.html
  • 119. get-topic-page Do you have pages about http://psi.ontopedia.net/TMRA_2008 ? http://www.garshol.priv.no/blog/get-topic-page?identifier=http://psi.on.... Blog tmphoto Topic map fragment Topics linking to individual blog posts
  • 121. Making web applications Navigator Framework
  • 122. Ontopia Navigator Framework Java API for interacting with TM repository JSP tag library based on tolog kind of like XSLT in JSP with tolog instead of XPath has JSTL integration Undocumented parts web presentation components some wrapped as JSP tags want to build proper portlets from them
  • 123. How it works Web server with JSP container, e.g. Apache Tomcat JSP page Browser Topic Map Engine Tag Libraries JSP page Browser JSP page Browser JSP page Browser Topic Map
  • 125. makes up nearly the entire framework
  • 126. used to extract information from topic maps
  • 127. lets you execute tolog queries to extract information from the topic map
  • 128. looping and control flow structures
  • 130. used to create template pages
  • 131. separates layout and structure from content
  • 134. collected from the tm-sources.xml configuration file
  • 135. each topic map has its own id (usually the file name)
  • 136. Each page also holds a set of variable bindings
  • 137. each variable holds a collection of objects
  • 138. objects can be topics, base names, locators, strings, ...
  • 141. Tells the page which tag library to include and binds it to a prefix
  • 142. Prefixes are used to qualify the tags (and avoid name collisions)
  • 143. Use the <tolog:context> tag around the entire page
  • 144. The "topicmap" attribute specifies the ID of the current topic map
  • 145. The first time you access the page in your browser the page gets compiled
  • 147. Navigator tag library example
      <%-- assume variable 'composer' is already set --%>
      <p><b>Operas:</b><br/>
      <tolog:foreach query="composed-by(%composer% : composer, $OPERA : opera),
                            { premiere-date($OPERA, $DATE) }?">
        <li>
          <a href="opera.jsp?id=<tolog:id var="OPERA"/>"><tolog:out var="OPERA"/></a>
          <tolog:if var="DATE">
            <tolog:out var="DATE"/>
          </tolog:if>
        </li>
      </tolog:foreach></p>
  • 152. Possible configuration Application directories webapps myApp/ *.jsp omnigator/ WEB-INF/ config/ *.xml i18n/ topicmaps/ *.xtm, *.ltm web.xml
  • 154. where to find the other files, plus plug-ins
  • 156. tells the navigator where to find topic maps
  • 158. configuration of the log4j logging
  • 161. What is automated classification? Create parts of a topic map automatically using the text in existing content as the source not necessarily 100% automatic; user may help out A hard task natural language processing is very complex result is never perfect However, it’s possible to achieve some results
  • 162. Why automate classification? Creating a topic map requires intellectual effort that is, it requires work by humans Human effort = cost added value must be sufficient to justify the cost in some cases either the cost is too high, or the value added is too limited The purpose of automation is to lower the cost this increases the number of cases where the use of Topic Maps is justified
  • 163. Automatable tasks Project Person Department Worked on Worked on Jane Doe worked on employed in XYZ Project IT group Ontology hard depends on requirements one time only Instance data hard usually exists in other sources Document keywords easier frequent operation usually no other sources
  • 164. Two kinds of categorization Broad: Environment, Crisis management Narrow: Water, Norway, drought, Drought Act, Cloud seeding, Morecambe Bay Broad categorization categories are broadly defined include many different subjects Narrow categorization uses very specific keywords each keyword is a single subject
  • 165. What it does Extract keywords from content goal is to use these for classification Not entity recognition we only care about identifying what the content is about Uses statistical approach no attempt at full formal parsing of the text
  • 166. Steps of operation Identify format then, extract the text Identify language then, remove stop words stem remaining words Classify can use terms from preexisting Topic Maps exploits knowledge of the language Return proposed keywords
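The middle steps (stop-word removal, stemming, ranking) can be sketched with a plain frequency count. This is a deliberately naive stand-in, not the classifier's algorithm: the stop-word list and the plural-stripping "stemmer" are toy assumptions, and format detection, language detection, compound terms and the use of existing topics are all left out.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one is language-specific and much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "in", "to", "for"}


def stem(word):
    """Very crude stemmer: strips a plural 's'. A stand-in for a real stemmer."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def keywords(text, top=3):
    """Rank stemmed, non-stop-word terms by frequency, scaled so the best scores 1.0."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(stem(w) for w in words if w not in STOP_WORDS)
    if not counts:
        return []
    best = max(counts.values())
    return [(term, count / best) for term, count in counts.most_common(top)]


text = "Topic maps describe topics. A topic has names. Names are strings."
print(keywords(text, top=2))
```

Scaling by the best count gives the top keyword a score of 1.0 in this sketch; how the real classifier arrives at its scores is not described in the slides.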
  • 167. Example of keyword extraction: topic maps 1.0; metadata 0.57; subject-based class. 0.42; Core metadata 0.42; faceted classification 0.34; taxonomy 0.22; monolingual thesauri 0.19; controlled vocabulary 0.19; Dublin Core 0.16; thesauri 0.16; Dublin 0.15; keywords 0.15
  • 168. Example #2: Automated classification 1.0 5; Topic Maps 0.51 14; XSLT 0.38 11; compound keywords 0.29 2; keywords 0.26 20; Lars 0.23 1; Marius 0.23 1; Garshol 0.22 1; ...
  • 169. So how could this be used? To help users classify new documents in a CMS interface suggest appropriate keywords, screened by user before approval Automate classification of incoming documents this means lower quality, but also lower cost Get an overview of interesting terms in a document corpus classify all documents, extract the most interesting terms this can be used as the starting point for building an ontology (keyword extraction only)
  • 170. Example user interface The user creates an article this screen then used to add keywords user adjusts the proposals from the classifier
  • 171. Interfaces java net.ontopia.topicmaps.classify.Chew <topicmapuri> <inputfile> produces textual output only net.ontopia.topicmaps.classify.SimpleClassifier classify(uri, topicmap) -> TermDatabase classify(uri) -> TermDatabase
  • 172. Supported formats and languages XML (any schema) HTML (non-XML) PDF Word (.doc, .docx) PowerPoint (.ppt, .pptx) Plain text English Norwegian
  • 173. Visualization of Topic Maps Vizigator
  • 174. The Vizigator Graphical visualization of Topic Maps Two parts VizDesktop: Swing desktop app for configuration Vizlet: Java applet for web deployment Configuration stored in XTM file
  • 175. The uses of visualization Not really suitable for navigation doesn't work for all kinds of data Great for seeing the big picture
  • 179. The Vizigator The Vizigator uses TMRAP the Vizlet runs in the browser (on the client) a fragment of the topic map is downloaded from the server the fragment is grown as needed Server TMRAP
  • 180. Embedding the Vizlet Set up TMRAP service Add ontopia-vizlet.jar Add necessary HTML <applet code="net.ontopia.topicmaps.viz.Vizlet.class" archive="ontopia-vizlet.jar"> <param name="tmrap" value="/omnigator/plugins/viz/"> <param name="config" value="/omnigator/plugins/viz/config.jsp?tm=<%= tmid %>"> <param name="tmid" value="<%= tmid %>"> <param name="idtype" value="<%= idtype %>"> <param name="idvalue" value="<%= idvalue %>"> <param name="propTarget" value="VizletProp"> <param name="controlsVisible" value="true"> <param name="locality" value="1"> <param name="max-locality" value="5"> </applet>
  • 181. Topic Maps debugger Omnigator
  • 182. Omnigator Generic Topic Maps browser very useful for seeing what's in a topic map the second-oldest part of Ontopia Contains other features beyond simple browsing statistics management console merging tolog querying/updates export
  • 183. Ontology designer and editor Ontopoly
  • 184. Ontopoly A generic Topic Maps editor, in two parts ontology editor: used to create the ontology and schema instance editor: used to enter instances based on ontology Features works with both XTM files and topic maps stored in RDBMS backend supports access control to administrative functions, ontology, and instance editors existing topic maps can be imported parts of the ontology can be marked as read-only, or hidden
  • 185. Ontology designer Create ontology based on topic, association, name, occurrence, and role types Supports iterative ontology development modify and prototype the ontology until it's right Supports ontology annotation add fields to topic types, for example Supports views define restricted views of certain topic types
  • 186. Instance editor Configured by the ontology editor shows topics as defined by the ontology Has several ways to pick associations drop-down list by search from hierarchy Avoids conflicts pages viewed by one user are locked to others
  • 187. Ontopoly is embeddable The Ontopoly instance editor can be embedded basically, the main panel can be inserted into another web application uses an iframe Requires only ID of topic being edited can also be restricted to a specific view Makes it possible to build easier-to-use editors so users don't have to learn all of Ontopoly
  • 188. Adding content features CMS integrations
  • 189. CMS integration The best way to add content functionality to Ontopia the world doesn’t need another CMS better to reuse those which already exist So far two integrations exist Escenic OfficeNet Knowledge Portal more are being worked on
  • 190. Implementation A CMS event listener the listener creates topics for new CMS articles, folders, etc the mapping is basically the design of the ontology used by this listener Presentation integration it must be possible to list all topics attached to an article conversely, it must be possible to list all articles attached to a topic how close the integration needs to be here will vary, as will the difficulty of the integration User interface integration it needs to be possible to attach topics to an article from within the normal CMS user interface this can be quite tricky Search integration the Topic Maps search needs to also search content in the CMS can be achieved by writing a tolog plug-in
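As a rough sketch of the listener and the two presentation lookups described above (topics for an article, articles for a topic), assuming a plain in-memory index instead of a real topic map. Nothing here is Ontopia or CMS API; `ArticleTopicIndex` and its callback names are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the topic map side of a CMS integration.
class ArticleTopicIndex {
    // article id -> name of the topic created for that article
    private final Map<String, String> articleTopics = new LinkedHashMap<>();
    // article id -> subjects the article is about
    private final Map<String, Set<String>> topicsByArticle = new LinkedHashMap<>();
    // subject -> ids of the articles attached to it
    private final Map<String, Set<String>> articlesByTopic = new LinkedHashMap<>();

    /** Assumed CMS callback: a new article was created, so create a topic for it. */
    void onArticleCreated(String articleId, String title) {
        articleTopics.put(articleId, title);
    }

    /** Assumed CMS callback: an editor attached a subject to an article. */
    void onTopicAttached(String articleId, String subject) {
        if (!articleTopics.containsKey(articleId)) {
            throw new IllegalArgumentException("Unknown article: " + articleId);
        }
        topicsByArticle.computeIfAbsent(articleId, k -> new LinkedHashSet<>()).add(subject);
        articlesByTopic.computeIfAbsent(subject, k -> new LinkedHashSet<>()).add(articleId);
    }

    /** All subjects attached to an article. */
    Set<String> topicsFor(String articleId) {
        return topicsByArticle.getOrDefault(articleId, Collections.emptySet());
    }

    /** All articles attached to a subject. */
    Set<String> articlesFor(String subject) {
        return articlesByTopic.getOrDefault(subject, Collections.emptySet());
    }
}
```

Keeping both directions in step inside one callback is the point of the sketch; in a real integration the association in the topic map would serve both lookups.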
  • 191. Articles as topics [Diagram: the article “New city council appointed” is about the topic “Elections”] Goal: associate articles with topics mainly to say what they are about typically also want to include other metadata Need to create topics for the articles to do this in fact, a general CMS-to-TM mapping is needed must decide what metadata and structures to include
  • 192. Mapping issues Article topics what topic type to use? title becomes name? (do you know the title?) include author? include last modified? include workflow state? should all articles be mapped? Folders/directories/sections/... should these be mapped, too? one topic type for all folders/.../.../...? if so, use associations to connect articles to folders use associations to reproduce hierarchical folder structure Multimedia objects should these be included? what topic type? what name? ...
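One of the options above, reproducing the folder hierarchy with associations, might look like this for a single CMS path. This is a sketch under assumptions: the `Contains` association type and the use of paths as identifiers are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class FolderMapping {
    /** Hypothetical binary association: parent folder contains child. */
    static final class Contains {
        final String parent;
        final String child;

        Contains(String parent, String child) {
            this.parent = parent;
            this.child = child;
        }

        @Override
        public String toString() {
            return parent + " contains " + child;
        }
    }

    /** Turns a path such as /news/politics/a1 into one association per level. */
    static List<Contains> associationsFor(String path) {
        List<Contains> result = new ArrayList<>();
        String parent = null;
        StringBuilder current = new StringBuilder();
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty segment before the leading slash
            }
            current.append('/').append(segment);
            if (parent != null) {
                result.add(new Contains(parent, current.toString()));
            }
            parent = current.toString();
        }
        return result;
    }
}
```

Whether the last segment is an article or a folder, and which topic types they get, is exactly the kind of decision the slide lists as open.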
  • 193. Two styles of mappings Articles as articles Topic represents only the article Topic type is some subclass of “article” “Is about” association connects article into topic map Fields are presentational title, abstract, body Articles as concepts Topic represents some real-world subject (like a person) article is just the default content about that subject Type is the type of the subject (person) Semantic associations to the rest of the topic map works in department, has competence, ... Fields can be semantic name, phone no, email, ...
  • 194. Article as article Article about building of a new school Is about association to “Primary schools” Topic type is “article”
  • 196. events in the location
  • 202. Ontopia/Liferay An integration with the Liferay CMS and portal is in progress presented Friday 1130-1150 in Schiller 2
  • 204. The project A new citizen’s portal for the city administration strategic decision to make portal main interface for interaction with citizens as many services as possible are to be moved online Big project started in late 2004, to continue at least into 2008 ~5 million Euro spent by launch date 1.7 million Euro budgeted for 2007 Topic Maps development is a fraction of this (less than 25%) Many companies involved Bouvet/Ontopia Avenir KPMG Karabin Escenic
  • 205. Simplified original ontology Service catalog Escenic (CMS) LOS Form Article nearly everything Category Service Subject Department Borough External resource Employee Payroll++
  • 206. Data flow Ontopoly Ontopia Escenic LOS Integration TMSync DB2TM Fellesdata Payroll (Agresso) Dexter/Extens Service Catalog
  • 207. Conceptual architecture Data sources Oracle Portal Application Ontopia Escenic Oracle Database
  • 210. NRK/Skole Norwegian National Broadcasting (NRK) media resources from the archives published for use in schools integrated with the National Curriculum In production delayed by copyright wrangling Technologies OKS Polopoly CMS MySQL database Resin application server
  • 211. Curriculum-based browsing (1) Curriculum Social studies High school
  • 213. Curriculum-based browsing (3) Feminist movement in the 70s and 80s Changes to the family in the 70s The prime minister’s husband Children choosing careers Gay partnerships in 1993
  • 214. One video (prime minister’s husband) Metadata Subject Person Related resources Description
  • 215. Conceptual architecture Polopoly HTTP Ontopia MediaDB Grep DB2TM TMSync RDBMS backend MySQL Editors
  • 216. Implementation Domain model in Java Plain old Java objects built on Ontopia’s Java API tolog JSP for presentation using JSTL on top of the domain model Subversion for the source code Maven2 to build and deploy Unit tests
  • 217. What we’d like to see The future
  • 218. The big picture Auto-class. A.N.other A.N.other Other CMSs A.N.other A.N.other DB2TM Portlet support OKP XML2TM Engine CMSintegration Data integration Escenic Taxon.import Ontopoly Web service
  • 219. CMS integrations The more of these, the better Candidate CMSs Liferay (being worked on at Bouvet) Alfresco Magnolia Inspera JSR-170 Java Content Repository CMIS (OASIS web service standard)
  • 220. Portlet toolkit Subversion contains a number of “portlets” basically, Java objects doing presentation tasks some have JSP wrappers as well Examples display tree view list of topics filterable by facets show related topics get-topic-page via TMRAP component Not ready for prime-time yet undocumented incomplete
  • 221. Ontopoly plug-ins Plugins for getting more data from externals TMSync import plugin DB2TM plugin Subj3ct.com plugin adapted RDF2TM plugin classify plugin ... Plugins for ontology fragments menu editor, for example
  • 222. TMCL Now implementable We’d like to see an object model for TMCL (supporting changes) a validator based on the object model Ontopoly import/export from TMCL (initially) refactor Ontopoly API to make it more portable Ontopoly ported to use TMCL natively (eventually)
  • 223. Things we’d like to remove OSL support Ontopia Schema Language Web editor framework unfortunately, still used by some major customers Fulltext search the old APIs for this are not really of any use
  • 224. Management interface Import topic maps (to file or RDBMS)
  • 225. What do you think? Suggestions? Questions? Plans? Ideas?
  • 226. Setting up the developer environment Getting started
  • 227. If you are using Ontopia... ...simply download the zip, then unzip, set the classpath, start the server, ... ...and you’re good to go
  • 228. If you are developing Ontopia... You must have: Java 1.5 (not 1.6 or 1.7 or ...); Ant 1.6 (or later); Ivy 2.0 (or later); Subversion. Then check out the source from Subversion: svn checkout http://ontopia.googlecode.com/svn/trunk/ ontopia-read-only; ant bootstrap; ant dist.jar.ontopia; ant test; ant dist.ontopia
  • 229. Beware This is fun, because you can play around with anything you want e.g., my build has a faster TopicIF.getRolesByType you can track changes as they happen in svn However, you’re on your own if it fails it’s kind of hard to say why maybe it’s your changes, maybe not For production use, official releases are best
  • 231. Our goal To provide the best toolkit for building Topic Maps-based applications We want it to be actively maintained, bug-free, scalable, easy to use, well documented, stable, reliable
  • 232. Our philosophy We want Ontopia to provide as much useful more-or-less generic functionality as possible New contributions are generally welcome as long as they meet the quality requirements, and they don’t cause problems for others
  • 233. The sandbox There’s a lot of Ontopia-related code which does not meet those requirements some of it can be very useful, someone may pick it up and improve it The sandbox is for these pieces some are in Ontopia’s Subversion repository, others are maintained externally To be “promoted” into Ontopia a module needs an active maintainer, to be generally useful, and to meet certain quality requirements
  • 234. Communications Join the mailing list(s)! http://groups.google.com/group/ontopia http://groups.google.com/group/ontopia-dev Google Code page http://code.google.com/p/ontopia/ note the “updates” feed! Blog http://ontopia.wordpress.com Twitter http://twitter.com/ontopia
  • 235. Committers These are the people who run the project they can actually commit to Subversion they can vote on decisions to be made etc Everyone else can use the software as much as they want, report and comment on issues, discuss on the mailing list, and submit patches for inclusion
  • 236. How to become a committer Participate in the project! that is, get involved first let people get to know you, show some commitment Once you’ve gotten some way into the project you can ask to become a committer best if you have provided some patches first Unless you’re going to commit changes there’s no need to be a committer
  • 237. Finding a task to work on Report bugs! they exist. if you find any, please report them. Look at the open issues there is always testing/discussion to be done Look for issues marked “newbie” http://code.google.com/p/ontopia/issues/list?q=label:Newbie Look at what’s in the sandbox most of these modules need work Scratch an itch if there’s something you want fixed/changed/added...
  • 238. How to fix a bug First figure out why you think it fails Then write a test case based on your assumption make sure the test case fails (test before you fix) Then fix the bug follow the coding guidelines (see wiki) Then run the test suite verify that you’ve fixed the bug verify that you haven’t broken anything Then submit the patch
  • 239. The test suite Lots of *.test packages in the source tree 3795 test cases as of right now test data in ontopia/src/test-data some tests are generators based on files some of the test files come from cxtm-tests.sf.net Run with: ant test; java net.ontopia.test.TestRunner src/test-data/config/tests.xml test-group
  • 240. Source tree structure net.ontopia. utils various utilities test various test support code infoset LocatorIF code + cruft persistence OR-mapper for RDBMS backend product cruft xml various XML-related utilities topicmaps next slides
  • 241. Source tree structure net.ontopia.topicmaps. core core engine API impl engine backends + utils utils utilities (see next slide) cmdlineutils command-line tools entry TM repository nav + nav2 navigator framework query tolog engine viz classify db2tm webed cruft
  • 242. Source tree structure net.ontopia.topicmaps.utils * various utility classes ltm LTM reader and writer ctm CTM reader rdf RDF converter (both ways) tmrap TMRAP implementation
  • 244. The engine The core API corresponds closely to the TMDM TopicMapIF, TopicIF, TopicNameIF, ... Compile with: ant init compile.ontopia (.class files go into ontopia/build/classes); ant dist.ontopia.jar (makes a jar)
  • 245. The importers Main class implements TopicMapReaderIF usually, this lets you set up configuration, etc then uses other classes to do the real work XTM importers use an XML parser main work done in XTM(2)ContentHandler some extra code for validation and format detection CTM/LTM importers use Antlr-based parsers real code in ctm.g/ltm.g All importers work via the core API
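The split described for the XTM importers (a reader class that sets things up, with the real work done in a SAX content handler) can be sketched with the JDK's own SAX parser. This is not Ontopia code: `SketchTopicMapReader` and `TopicIdHandler` are hypothetical, and the handler only collects topic ids, whereas the real importers build the topic map through the core API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: does the "real work" by reacting to SAX events.
class TopicIdHandler extends DefaultHandler {
    final List<String> topicIds = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default parser is not namespace-aware, so qName is the raw element name.
        if ("topic".equals(qName)) {
            topicIds.add(atts.getValue("id"));
        }
    }
}

// Hypothetical reader: holds configuration and delegates parsing to the handler.
class SketchTopicMapReader {
    private final String xml;

    SketchTopicMapReader(String xml) {
        this.xml = xml;
    }

    /** Parses the XML and returns the ids of all topic elements, in document order. */
    List<String> read() {
        TopicIdHandler handler = new TopicIdHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse topic map", e);
        }
        return handler.topicIds;
    }
}
```

For example, `new SketchTopicMapReader("<topicMap><topic id=\"puccini\"/></topicMap>").read()` yields a list containing `puccini`.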
  • 246. Find an issue in the issue tracker (Picking one with “Newbie” might be good, but isn’t necessary) Get set up check out the source code build the code run the test suite Then dig in we’ll help you with any questions you have At the end, submit a patch to the issue tracker remember to use the test suite!