From research to business: the Web of linked data
- 1. From research to business:
the Web of linked data
Irene Celino – Semantic Web Practice
CEFRIEL – ICT Institute, Politecnico di Milano
email: irene.celino@cefriel.it – web: http://swa.cefriel.it
From research to business: the Web of linked data
Enterprise X.0/Econom Workshops @ BIS 2009 – Poznan, 29th April 2009 - © CEFRIEL 2009
- 2. Agenda
The problem of integration
Web as a platform
Linked data
How do we produce linked data today?
The case of Service-Finder
How do we manage linked data today?
The case of Urban Computing in LarKC
What’s next?
What’s already going on
Business view
Scientific & technical view
Irene Celino – From research to business: the Web of linked data Poznan, 29th April 2009 – © CEFRIEL 2009
- 3. The problem of integration
When do we have an integration problem?
Very large amounts of data that grow and evolve
continuously
problem of scale
Numerous and different data typologies (documents,
media, email, Web results, contacts, etc.)
problem of data
heterogeneity
Numerous and different
information systems (DB,
legacy systems, ERP, etc.)
problem of system
heterogeneity
- 4. When 1 + 1 > 2 ?
Data integration always gives an added value
Getting a global high-level view
Sharing knowledge
Business opportunities
Business Intelligence
Still, there is the technological problem:
How to reconcile data heterogeneity?
Who has taken advantage of integration?
Can (Semantic) Web be of help?
- 5. Lesson learned from Web 2.0
Participation politics and “wisdom of
the crowds”
Great success of mash-ups
Mash-ups: applications made up of light
integration of artifacts provided by third
parties (often API or REST services)
New integration paradigm to application
development
Publication and access via Web
Storing our information on the Web is
becoming easier and easier
Accessing our information on the Web (e.g.
by retrieving it with search engines) is
becoming more and more frequent
- 6. The Web as integration platform
What if we integrate on the Web?
Web as a platform
Data prosumer (producer + consumer)
“Web of Data”
From current “Web of Documents” to a Web of data
Not only information retrieval, but also data retrieval
Exposing your data on the Web
Converting/translating to a suitable format
“Wrapping” the data source
Triplify
Virtuoso
D2R
SPASQL
R2O
Relational.OWL
Talis
DartGrid
SPOON
SquirrelRDF
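As a minimal sketch of what "wrapping" a data source can mean, the following Python snippet turns rows of a relational-style table into N-Triples lines. The base URI, the table name and the column names are hypothetical, and the mapping (one subject per row, one predicate per column) is only the simplest possible convention; it is not the approach of any of the tools listed above.

```python
def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(base_uri, table, key, rows):
    """Map each row to one subject URI and each non-key column to a predicate."""
    lines = []
    for row in rows:
        subject = f"<{base_uri}{table}/{row[key]}>"
        for column, value in row.items():
            if column == key or value is None:
                continue  # the key is already in the subject; NULLs yield no triple
            predicate = f"<{base_uri}schema/{table}#{column}>"
            lines.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return lines


# Hypothetical data: a tiny "customer" table.
customers = [
    {"id": 1, "name": "Alice", "city": "Milano"},
    {"id": 2, "name": "Bob", "city": None},
]
triples = rows_to_ntriples("http://example.org/", "customer", "id", customers)
for line in triples:
    print(line)
```

Skipping NULL values instead of emitting an empty literal is a design choice of this sketch: in a Web of Data, a missing statement is simply unknown.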
- 7. Linked data and data cloud
Linked Data
The realization of the “Web of Data” (and of the Semantic Web)
Tim Berners-Lee: http://www.w3.org/DesignIssues/LinkedData
Linking Open Data Initiative
A community publishing and linking data on the Web
http://linkeddata.org/
Data cloud
Today everybody talks
about cloud computing
However, it is often not only a
computation or storage
issue, but also one of data
and knowledge management
- 8. Challenges for linked data
Automatic linked data creation and linkage
Automatic generation of linked data and smart mechanisms to
identify “contact points” between different data sources and to
seamlessly link them
Distributed querying
Querying distributed data over different Web sources
regardless of the “physical position” of data and getting
aggregated results
Distributed reasoning
Applying inference techniques to
distributed data, preserving
consistency and correctness of
the reasoning
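To make the distributed querying challenge concrete, here is a small Python sketch, under the assumption that each source can be asked for triples matching a pattern. The two sources are in-memory stand-ins for remote endpoints, and all identifiers (the `ex:` names) are hypothetical. The point is only that answers are aggregated regardless of where each triple physically lives.

```python
def match(triple, pattern):
    """A pattern position set to None acts as a wildcard."""
    return all(p is None or p == t for t, p in zip(triple, pattern))


def federated_query(sources, pattern):
    """Ask every source for matching triples and merge the answers.

    Returns a dict mapping each matching triple to the sorted list of
    source names that asserted it, so provenance is not lost.
    """
    results = {}
    for name, triples in sources.items():
        for triple in triples:
            if match(triple, pattern):
                results.setdefault(triple, set()).add(name)
    return {t: sorted(names) for t, names in results.items()}


# Hypothetical in-memory stand-ins for two remote data sources.
sources = {
    "geo": [("ex:Milano", "ex:locatedIn", "ex:Italy"),
            ("ex:Varese", "ex:locatedIn", "ex:Italy")],
    "events": [("ex:Milano", "ex:hosts", "ex:SomeEvent"),
               ("ex:Milano", "ex:locatedIn", "ex:Italy")],
}
answers = federated_query(sources, ("ex:Milano", None, None))
```

A real system would also have to deal with sources that are slow or unreachable, which this sketch ignores.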
- 9. Service-Finder
http://demo.service-finder.eu
There’s a lot of information already on the Web:
how can we turn it into linked data?
- 10. Context: SOA onto the Web
Service Oriented Architectures (SOAs) along with Web
Services technologies are widely seen as the most
promising foundation for realizing service interchange in
business-to-business settings.
However, it is envisioned that SOAs and
Web Services will increasingly move out
of these settings and out onto the Web.
Web size
Google: 1,000,000,000,000 URIs (08/2008)
NetCraft: 62,000,000 active hosts
Service Web size
Google: filetype:asmx inurl:wsdl (818)
Service-Finder: > 25,000
[ http://developer.ebay.com/ ]
[ http://aws.amazon.com/ ]
- 11. The rise and fall of public UDDI registries
One of the essential building blocks for creating applications that use the vast
quantities of services available on the Web is making it easier to discover
and select the right services
UDDI was initially proposed as a component of the Web Services usage process
enabling registering and discovering services, but in the end UDDI did not reach its
expected potential
The critical problem in this new Web-oriented environment is one of scale,
because services appear, disappear and change at a rate much higher than in
business-to-business settings
UDDI Business Registry Shutdown.
"With the approval of UDDI v3.02 as an OASIS Standard in 2005, and the momentum
UDDI has achieved in market adoption, IBM, Microsoft and SAP have evaluated the status
of the UDDI Business Registry and determined that the goals for the project have been
achieved. Given this, the UDDI Business Registry will be discontinued as of 12 January
2006."
[from “Registering for UDDI” 2005-12-17 ] [see http://xml.coverpages.org/uddi.html ]
- 12. Pitfalls of public UDDI registries
1. UDDI is centered around programmatic access to the registry, and
only a few, mostly technically focused, user interfaces are
available.
2. The information in public UDDI registries was often outdated. The
value of a service entry in a public UDDI registry is minimal if the
service itself no longer exists.
3. There are no means for community feedback. In practice, the
only way to provide feedback is to contact the provider
at the email address listed in the service description.
4. A WSDL definition and a short description are not sufficient for a
service consumer to select a service. To decide whether a service
is applicable, the consumer needs to become
familiar with pricing, terms and conditions, and service level
agreements, to name just a few.
- 13. Overcoming UDDI limitations
1. Easy to use GUI – It is important that early adopters of Web
Services technology, who learn about it for the first time,
are able to start exploring it in a few simple steps
2. Search Engine style – The Web is unpredictable and services can
appear and disappear (just like websites), but one can put
in place a mechanism (periodic crawling and availability checks)
that removes services which are no longer
available
3. Architecture of participation – Learn from Web 2.0 (e.g.,
wikis, blogs, etc.) in enabling community contribution
4. More useful info – Include all the information a user needs to
decide whether a service is applicable; e.g., pricing,
terms and conditions, service level agreements, etc.
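The availability check mentioned in point 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Service-Finder implementation: the registry layout, the `max_failures` threshold and the probe function are all hypothetical. A real probe could issue an HTTP request with a timeout; here a fake one keeps the example self-contained.

```python
def prune_unavailable(registry, is_available, max_failures=3):
    """Run one availability pass over the registry.

    registry maps a service URL to its count of consecutive failed checks.
    Services that fail max_failures checks in a row are dropped, so a
    single temporary outage does not remove a service.
    """
    kept = {}
    for url, failures in registry.items():
        failures = 0 if is_available(url) else failures + 1
        if failures < max_failures:
            kept[url] = failures
    return kept


# Hypothetical probe standing in for a real network check.
def fake_probe(url):
    return not url.endswith("/dead.wsdl")


registry = {
    "http://example.org/sms.wsdl": 0,
    "http://example.org/dead.wsdl": 2,  # already failed twice
}
registry = prune_unavailable(registry, fake_probe)
```

Running such a pass periodically gives the "search engine style" behaviour: entries disappear from the index shortly after the services themselves disappear.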
- 14. project idea
Service-Finder aims at developing a platform for service discovery in which
Web Services are embedded in a Web 2.0 environment
http://demo.service-finder.eu
[Diagram: Realizing Web Service Discovery at Web Scale]
Automatic Semantic Annotation: combining smart-machine and smart-data
Semantic Search: Conceptual Indexing, Semantic Matching
Web 2.0: User clustering, User-Resource correlation
Semantics: Knowledge Representation & Reasoning
Semantic Web Services: as a means to realize a Service Oriented Architecture
Web Services: as a basic tool to implement a Service Oriented Architecture
- 15. key objectives
Create a Semantic Search Engine for Web Services
Aggregates information from heterogeneous sources:
WSDL, wikis, blogs and also users’ feedback and behaviour
Create a Web Service Crawler to identify Web Services and their
relevant information
Automatically generate Semantic Service Descriptions
by analyzing heterogeneous sources
Allow efficient and effective search of collected and
generated data
Provide a Web 2.0 portal
To support users in searching and browsing for Web Services
To give recommendations to users
To track user behaviour for improving accuracy of service search
and user recommendations
- 16. Realizing Service-Finder
[Timeline: Jan 2008, June 2008, Dec 2008, Today, Dec 2009]
- 17. Use cases for Service-Finder
To gather requirements we imagined several use cases
A system administrator at a bank who is looking for
an SMS messaging service that sends him an SMS
in case of failures in the bank's on-line payment
system
A business and technology consultant working on an
e-health project who needs to make it possible for
general practitioners to send and receive faxes directly
from their patient record application using an on-line
service
A web developer who, after using a service listed on
Service-Finder, decides to edit the information on
the portal in order to improve it for other community
users
- 18. Requirements for Service-Finder
From these use cases we identified more than 60
requirements, which we grouped into three main
categories:
Search related: search for text, search for tag, search for
concept, disambiguation, facet-browsing, ranking, sorting,
comparing, etc.
Web Service information related:
Service details: interface, how the service can be used, its
payment modalities, its terms and clauses, user-added
information such as ratings, comments and tags, measured values
of service levels such as availability (uptime) or performance
(response time), and the service level declared by the provider.
Provider info: name of the provider and its references,
user-added information such as ratings, comments and tags
User Community related: rating, commenting, tagging,
editing, writing wiki entries, registration, recommendations
- 19. Architecture and Components
- 20. Key innovations of Service-Finder
Research Activities
Automatic Service Annotation
To automatically create Web Service descriptions by analyzing
WSDL and related information
• coping with contradictions
• using a community process to verify results
User and Service Clustering
To investigate and implement techniques for:
• clustering users according to their behaviour
• clustering services according to their usage by users
belonging to the same clusters
Research and Engineering Activities
Conceptual Indexing and Matching
To apply semantic technologies in the Web Service discovery
domain
To adapt them to the new forms of input descriptions:
• Automatic annotations, clusters, contexts
Integration Activities
Service-Finder Portal
To provide a Web 2.0 portal
• demonstrating the developed technologies
• fostering community participation
- 21. Beyond state of the art
Feature: Architecture for lightweight semantic service discovery
State of the art: Approaches based on a registration process or an editorial team
Improvement: Enables service discovery to scale with the upcoming increase of publicly available services
Feature: Largest and most accurate set of publicly available services
State of the art: Specialized portals containing only a subset of services
Improvement: Focused crawler able to identify services and related information
Feature: Automatic metadata creation for Web Services
State of the art: Innovative; under-researched
Improvement: Metadata generation from Web 2.0 data and services
Feature: Integration of formal and informal (textual) knowledge
State of the art: Indexed textual descriptions
Improvement: Hybrid match-making algorithm
Feature: Automatic creation of both user and service clusters
State of the art: Only general-purpose clustering techniques exist
Improvement: Specialized clustering algorithms that jointly cluster users and services
Feature: Innovative interface that combines Web 2.0 features and service related features
State of the art: Current Web 2.0 portals do not include semantic metadata.
Improvement: Techniques that enable handling of semantic metadata in Web 2.0 portals
- 22. Expected Impacts
Service-Finder provides core mechanisms to cope with
changing environments:
It uses Web principles such as openness and robustness;
It uses explicit and implicit user interaction for the construction,
improvement and validation of rich service descriptions; and
It exploits Semantic Web technologies as a means to organize
the data on available services internally.
It simplifies the service publishing process by removing the
burden of any registration and brings service discovery
even to non-technical people.
Publishers increase their productivity, by being able to provide
complex services without the need to register them explicitly.
Creators become able to design more communicative forms of
content by integrating third party services.
Organizations can automate their processes by quickly finding
adequate services.
- 23. Exploitation Prospects
The results of the Service-Finder project have the
potential to revolutionize this market and to outperform
existing solutions
Using Service-Finder for public services
Unique chance
The market for public services is growing (xignite, cdyne, …)
Missing alternatives
UDDI (shut down in 2006)
Google (no reliable filter / no additional information)
Portals (rely on an editorial process, <=400 services)
Service-Finder can also be applied within organizations
The number of services in organizations is increasing
As on the Internet, repositories in big companies can quickly become
outdated
IT managers like minimally invasive technology
- 24. So what? Service-Finder and linked data
Even if I didn’t explicitly talk about linked data, that is
exactly the result of Service-Finder
We take information about services from the Web, we
translate it into structured information describing services
with respect to domain-specific ontologies, and we give this information
back to the community, which can further enrich it
Is this linked data? Not yet, but:
RDFa annotation in SF portal pages coming soon
Services to query the knowledge base coming soon
Possibly a “dump” of SF knowledge base could be easily
published on the Web as linked data
- 25. Urban Computing in LarKC
http://wiki.larkc.eu/UrbanComputing
There are lots of data sources about cities on
the Web: how can we query and reason on them?
- 26. Context: Cities are alive
Cities come to life, grow,
evolve like living beings
The state of a city changes
continuously, influenced by
a lot of factors
human factors: people
moving in the city or
extending it
natural factors:
precipitation or climate
change
[source http://www.citysense.com]
- 27. Today Cities’ Challenges
Our cities face many challenges
• How can we redevelop existing neighbourhoods and
business districts to improve the quality of life?
• How can we create more choices in housing,
accommodating diverse lifestyles and all income levels?
• How can we reduce traffic congestion yet stay connected?
• How can we include citizens in planning their communities
rather than limiting input to only those affected by the next
project?
• How can we fund schools, bridges, roads, and clean water
while meeting short-term costs of increased security?
[ source http://www.uli.org/]
- 28. Urban Computing to address challenges
- 29. Urban Computing
A definition:
The integration of computing, sensing, and actuation technologies into
everyday urban settings and lifestyles.
[source IEEE Pervasive Computing, July-September 2007 (Vol. 6, No. 3)]
Urban settings include, for example, streets, squares, pubs,
shops, buses, and cafés - any space in the semipublic realms of
our towns and cities
Only in the last few years have researchers paid much attention
to technologies in these spaces
Pervasive computing has largely been applied
either in relatively homogeneous rural areas, where researchers have
added sensors in places such as forests, vineyards, and glaciers
or, on the other hand, in small-scale, well-defined patches of the built
environment such as smart houses or rooms
Urban settings are challenging for experimentation and
deployment, and they remain little explored
- 30. Availability of Data
Some years ago, due to the lack of data, solving Urban
Computing problems with ICT looked like a Sci-Fi idea
Nowadays, a large amount of the required information can be
made available on the Web at almost no cost. We are running a
survey and we have collected more than 50 sources of data:
maps with streets and paths (Google Maps, Yahoo! Maps…),
events scheduled (EVDB, Upcoming…),
multimedia data with information about location (Flickr…)
relevant places (schools, bus stops, airports...)
traffic information (accidents, problems of public transportation...)
city life (job ads, pollution, health care...)
We are running a survey (please contribute), see
http://wiki.larkc.eu/UrbanComputing/ShowUsABetterWay
http://wiki.larkc.eu/UrbanComputing/OtherDataSources
- 31. Are Data Mashups the solution?
[source: http://pipes.yahoo.com/pipes/ ]
[source: http://www.popfly.com/ ]
[source: http://editor.googlemashups.com ]
[source: http://openkapow.com/ ]
IBM Lotus Mashups [source: http://www-01.ibm.com/software/lotus/products/mashups/ ]
- 32. Data Mashups offer powerful visualizations
http://maps.google.it/
http://maps.yahoo.com/
Google Charts API: http://code.google.com/apis/chart/
MIT Simile Timeline & Timeplot: http://simile.mit.edu/timeline/ http://simile.mit.edu/timeplot/
- 33. Data Mashups offer simple programming
abstractions
- 34. Not everything boils down to plumbing
- 35. The LarKC project
Visit http://www.larkc.eu!
[Source: Fensel, D., van Harmelen, F.: Unifying reasoning and search to web scale. IEEE Internet Computing 11(2) (2007)]
- 36. Sustainable mobility as an example
Urban Computing raises a set of different issues, from technological to social ones.
Our experience in the field makes us believe that sustainable mobility is an exemplary
case from which we can elicit generalizable requirements.
Mobility demand has been growing steadily for decades and it will continue in the future.
For many years, the primary way of dealing with this increasing demand has been the
increase of the roadway network capacity, by building new roads or adding new lanes to
existing ones.
However, financial and ecological considerations are posing increasingly severe
constraints on this process.
Hence, there is a need for additional intelligent approaches designed to meet the
demand while more efficiently utilizing the existing infrastructure and resources.
• How can we redevelop existing neighbourhoods and business districts to improve
the quality of life?
• How can we create more choices in housing, accommodating diverse lifestyles and
all income levels?
• How can we reduce traffic congestion yet stay connected?
• How can we include citizens in planning their communities rather than limiting
input to only those affected by the next project?
• How can we fund schools, bridges, roads, and clean water while meeting short-term
costs of increased security?
- 37. A Challenging Use Case 1/2 (planning)
Actors:
Carlo: a citizen living in Varese. The next day, he has to be at the Lombardy Region premises in Milano at 11.00.
UCS: a fictitious Urban Computing System of the Milano area
Ways to Milano:
Private Car
FS railways
Le Nord railways
[Maps of Varese and Milano: ©2009 Google – Map Data @2009 Teleatlas – Terms of Usage]
- 38. A Challenging Use Case 2/2 (traveling)
Actors:
Carlo: a citizen living in Varese. The next day, he has to be at the Lombardy Region premises in Milano at 11.00.
UCS: a fictitious Urban Computing System of the Milano area
Ways to Milano:
Private Car
FS railways
Le Nord railways
[Maps of Varese and Milano: ©2009 Google – Map Data @2009 Teleatlas – Terms of Usage]
- 39. Requirements for LarKC
Urban Computing (and Mobility Management) encompass
sensing, actuation and computing requirements.
Much previous work in the area of Pervasive and Ubiquitous
Computing investigated requirements in sensing, actuation,
and several aspects of computation (from hardware to
software, from networks to devices)
In this work we focus on reasoning requirements
for LarKC that are also of general interest for the entire
community working on the complex relationship of the
Internet with space, places, people and content.
Below we illustrate what coping with the following means:
representational, reasoning, and defaults heterogeneity
scale
time-dependency
noisy, uncertain and inconsistent data
- 40. Coping with representational heterogeneity
It is an obvious requirement
data always come in different formats (syntactic and
structural heterogeneity)
legacy data not in semantic formats will always exist!
the problem of merging and aligning ontologies is a
structural problem of knowledge engineering and it must
always be considered when developing an application of
semantic technologies.
- 41. Coping with reasoning heterogeneity
It means the system allows for multiple reasoning
paradigms; e.g.
precise and consistent inference for telling that at a
given junction all vehicles, except public transportation ones,
must go straight
approximate reasoning when calculating the probability of a
traffic jam given the current traffic conditions and the past
history
[ source http://senseable.mit.edu/ ]
- 42. Coping with defaults heterogeneity 1/2
Open World Assumption vs. Closed World Assumption
While for an entire city we cannot assume complete
knowledge, for the timetable of a bus station we can
[source: http://gizmodo.com/photogallery/trafficsky/1003143552 ]
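The difference between the two assumptions can be shown with a short Python sketch. All facts below (bus number, departure times, street names) are hypothetical and only serve the illustration: under the Closed World Assumption a missing statement counts as false, whereas under the Open World Assumption it is merely unknown.

```python
def ask_closed_world(facts, statement):
    """Closed World Assumption: whatever is not stated is false."""
    return statement in facts


def ask_open_world(facts, negated_facts, statement):
    """Open World Assumption: unstated facts are unknown, not false."""
    if statement in facts:
        return True
    if statement in negated_facts:
        return False
    return None  # unknown


# Hypothetical bus timetable: assumed complete, so the CWA is safe.
timetable = {("bus 90", "departs", "10:15"), ("bus 90", "departs", "10:45")}

# Hypothetical knowledge about the city: incomplete, so the OWA is needed.
city_facts = {("Via Roma", "status", "closed")}
city_negated = set()

no_1030_bus = not ask_closed_world(timetable, ("bus 90", "departs", "10:30"))
corso_closed = ask_open_world(city_facts, city_negated,
                              ("Corso Como", "status", "closed"))
```

Here `no_1030_bus` is a safe conclusion, while the status of the second street stays undecided until some source says something about it.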
- 43. Coping with defaults heterogeneity 2/2
Unique Name Assumption
A square with several stations for buses and subway can be
considered a unique point for multimodal travel planning,
but not when the problem is giving directions in that square to
a pedestrian
©2009 Google – Map Data @2009 Teleatlas – Terms of Usage ©2009 Google – Imagery @2009 Teleatlas – Terms of Usage
- 44. Coping with scale
The advent of Pervasive Computing and Web 2.0
technologies led to a constantly growing amount of data
about urban environments
Although we encounter large-scale data that are not
manageable as a whole, it does not necessarily mean that we have to
deal with all of the data simultaneously.
Usually, only a very limited amount of data is relevant for a
single query or processing step in a specific application.
For example, when Carlo is driving to Milano,
only part of the Milano map data is relevant.
the local parking information may become relevant because of
the known relation between bad weather conditions and
re-planning of the destination parking lot.
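One simple way to keep only the relevant part of a large data set is a spatial selection step before any reasoning takes place. The Python sketch below uses a square box around the current position; the item names, the rough coordinates and the box size are assumptions made for the example, and a real selector would more likely follow the planned route than use a plain box.

```python
def within_box(point, center, half_size):
    """True if point (lat, lon) lies in a square box around center."""
    return (abs(point[0] - center[0]) <= half_size
            and abs(point[1] - center[1]) <= half_size)


def select_relevant(items, center, half_size):
    """Keep only the items located near the current position."""
    return [item for item in items
            if within_box(item["pos"], center, half_size)]


# Hypothetical map items with approximate (lat, lon) positions.
map_items = [
    {"name": "parking A", "pos": (45.47, 9.19)},   # roughly in Milano
    {"name": "parking B", "pos": (45.81, 8.83)},   # roughly in Varese
]
carlo = (45.46, 9.18)  # Carlo's assumed current position
nearby = select_relevant(map_items, carlo, half_size=0.05)
```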
- 45. Coping with time-dependency
Knowledge and data can change over time.
For instance, in Urban Computing names of streets, landmarks, kinds
of events, etc. change very slowly, whereas the number of cars that
go through a traffic detector in five minutes changes very fast.
This means that the system must have the notion of
"observation period", defined as the period when the
system is subject to querying.
Moreover the system, within a given observation period,
must consider the following four different types of
knowledge and data:
Invariable knowledge
Invariable data
Periodically changing data that change according to a temporal
law, which can be purely periodic or probabilistic
Event-driven changing data that are updated as a consequence of
some external event.
- 46. Invariable knowledge and data
Invariable knowledge
it includes obvious terminological
knowledge
such as: an address is made up of a
street name, a house number, a city name
and a ZIP code
and less obvious nomological knowledge
that describes how the world is
expected
to be
e.g., given traffic lights are switched off or
certain streets are closed during the night
to evolve
e.g., traffic jams appear more often when
it rains or when important sport events
take place
Invariable data
do not change in the observation period,
e.g. the names and lengths of the roads.
©2009 Google – Imagery @2009 Teleatlas – Terms of Usage
- 47. Changing data
Periodically changing data change
according to a temporal law that can
be
Pure periodic law, e.g. every night at
10pm Milano overpasses close.
Probabilistic law, e.g. traffic jams appear
on the west side of Milano due to bad
weather or when San Siro stadium hosts
a soccer match.
Event driven changing data are
updated as a consequence of some
external event. They can be further
characterized by the mean time
between changes:
Slow, e.g. roads closed for scheduled
works
Medium, e.g. roads closed for accidents
or congestion due to traffic
Fast, e.g. the intensity of traffic for each
street in a city
©2009 Google – Imagery @2009 Teleatlas – Terms of Usage
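The classification above suggests that a system can treat each kind of data with a different refresh policy. The following Python sketch encodes that idea as a maximum age per kind. The numeric thresholds are purely illustrative assumptions, and treating periodically changing data with a fixed maximum age is a simplification of the "temporal law" described in the slides.

```python
from enum import Enum


class Change(Enum):
    INVARIABLE = "invariable"
    PERIODIC = "periodic"
    EVENT_SLOW = "event-slow"      # e.g. roads closed for scheduled works
    EVENT_MEDIUM = "event-medium"  # e.g. roads closed for accidents
    EVENT_FAST = "event-fast"      # e.g. traffic intensity per street


# Assumed maximum age in seconds before a cached value must be refreshed.
MAX_AGE = {
    Change.INVARIABLE: None,       # never refreshed within the observation period
    Change.PERIODIC: 24 * 3600,
    Change.EVENT_SLOW: 6 * 3600,
    Change.EVENT_MEDIUM: 15 * 60,
    Change.EVENT_FAST: 5 * 60,
}


def needs_refresh(kind, fetched_at, now):
    """Tell whether a value fetched at fetched_at is too old at time now."""
    max_age = MAX_AGE[kind]
    return max_age is not None and now - fetched_at > max_age
```

With these assumptions, a traffic intensity reading older than five minutes is refetched, while street names are read once per observation period.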
- 48. Coping with noisy, uncertain and inconsistent data
Traffic data are a very good example of such data.
Different sensors observing the same road area give apparently
inconsistent information.
a traffic camera may say that the road is empty
whereas an inductive loop traffic detector may report that 100 vehicles went
over it
The two pieces of information may be consistent if one considers that a traffic
camera transmits one image per second with a delay of 15-30 seconds,
whereas a traffic detector reports the number of vehicles that went over it
in 5 minutes and the information may arrive 5-10 minutes later.
Moreover, a single reading coming from a sensor at a given moment
may have no certain meaning.
an inductive loop traffic detector tells you that 0 cars went over it
Is the road empty?
Is the traffic completely stuck?
Did somebody park a car above the sensor?
Is the sensor broken?
Combining information from multiple sensors in a given time
window can be the only reasonable way to reduce the uncertainty.
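A minimal Python sketch of this idea follows. It first aligns camera frames to the loop detector's counting window by subtracting an assumed transmission delay, and then uses the frames to interpret a zero count. The function names, the data layout and the returned labels are hypothetical, and the rule set is deliberately crude: it narrows down the possibilities, it does not decide between a jam, a parked car and a broken sensor.

```python
def frames_in_window(frames, start, end, delay):
    """Select camera frames captured in [start, end].

    frames is a list of (arrival_time, vehicles_seen); the capture time
    is assumed to be arrival_time minus the transmission delay.
    """
    return [seen for arrival, seen in frames if start <= arrival - delay <= end]


def interpret_loop_count(loop_count, camera_vehicles_seen):
    """Interpret a loop detector count using camera frames of the same window."""
    if loop_count > 0:
        return "flowing"
    if not camera_vehicles_seen:
        return "unknown"                 # no second source: cannot decide
    if max(camera_vehicles_seen) == 0:
        return "empty"
    if min(camera_vehicles_seen) > 0:
        return "stuck or sensor fault"   # vehicles visible but none counted
    return "uncertain"


# Hypothetical frames: (arrival time in seconds, vehicles visible).
frames = [(20, 0), (140, 0), (320, 0), (700, 3)]
window = frames_in_window(frames, start=0, end=300, delay=20)
verdict = interpret_loop_count(0, window)
```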
- 49. Towards requirements satisfaction in LarKC
The Large Knowledge Collider: a platform for infinitely scalable
reasoning on the data-web
[Figure: Pipeline]
- 50. The first Data Mashup within_________
[Figure labels: Mobile Data Mashup Environment; REST request; JSON response; Interface; SPARQL query; SPARQL result; Pipeline Config.; LarKC platform; Request data; Data. Data sources: People, Traffic, Events, Monuments]
PROBLEM:
Which Milano monuments or events or friends can I quickly get to from here?
http://www.larkc.eu
- 51. A roadmap towards LarKC Urban use case
Data
Known: street topology, monuments/events/friends location, traffic
situation (current data stream + historical time series)
Inferred: traffic predictions, residual street capacity
Formulating the query for LarKC
Basic: shortest path from A to B
Extended: shortest path from A to monuments/events/friends
Advanced: considering traffic predictions and residual street capacity
Configuring the pipeline
Basic configuration
Combining a SPARQL processor and a Graph Processor
Using AllegroGraph GeoExtension as a selector
Extended configuration
DBpedia, EVDB, GoogleLatitude selector
Advanced configuration:
traffic predictions based on recurrent neural networks,
residual street capacity based on data stream analysis
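The basic and extended queries above can be illustrated with a plain shortest-path computation over a street graph. This is a sketch only: it uses Dijkstra's algorithm from the Python standard library, not the SPARQL processor, Graph Processor or AllegroGraph selector named on the slide, and the toy graph, node names and function names are invented for the example.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra: cost of the cheapest path from source to each reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(queue, (nd, neighbour))
    return dist


def reachable_targets(graph, source, targets):
    """Extended query: targets (monuments, events, friends) sorted by cost."""
    dist = shortest_distances(graph, source)
    return sorted(((t, dist[t]) for t in targets if t in dist),
                  key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical street graph: node -> [(neighbour, street length)].
    streets = {
        "A": [("B", 2), ("C", 5)],
        "B": [("A", 2), ("C", 1), ("D", 4)],
        "C": [("A", 5), ("B", 1), ("D", 1)],
        "D": [("B", 4), ("C", 1)],
    }
    print(shortest_distances(streets, "A")["D"])       # basic: A to D
    print(reachable_targets(streets, "A", {"C", "D"}))  # extended
```

For the advanced case, the same code could run with edge weights set to predicted travel times instead of street lengths, so that traffic predictions and residual capacity change which targets count as quickly reachable.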
- 52. LarKC Early Adopters Workshop
The public launch of the first open source LarKC platform
release will take place at the forthcoming European Semantic
Web Conference (ESWC 2009)
Register for the event!
More information at:
http://earlyadopters.larkc.eu/
We are developing the Urban
Baby LarKC as a showcase of
the potential of the platform
Everybody will be invited to run
experiments over LarKC
The Large Knowledge Collider: a platform for massive distributed
incomplete reasoning
http://www.larkc.eu
- 53. The next Web of
open, linked data
Just research? What’s going on?
Why should I care?
- 54. Freebase
“an open, shared database of the world’s information”
Source: Freebase - http://www.freebase.com (2009)
- 55. OpenCalais
Source: Thomson Reuters - http://www.opencalais.com/ (2009)
- 56. What’s next? Business point of view
Organizations today produce lots of data…
…and they have the problem of managing and making
sense of them!
More and more often they turn to Business Intelligence
and related technologies to understand and decide
But it also happens that, in order to fully understand
what’s going on and to make informed decisions, the data
within the organization should be integrated or enhanced
with external knowledge
This could definitely be a job for linked data
technology!
- 57. Linked data seen by the Web inventor
“Stop hugging
your data”
Sir Tim Berners-Lee, 2009
Don’t let
considerations
about security or
data ownership
represent an
obstacle to
innovation and
opportunities
www.flickr.com/photos/_-amy-_/3167333250/
- 58. What’s next? Technological point of view
How do Business Intelligence and similar techniques change
when their basic assumptions are no longer valid?
Dynamically changing data sources (and data
themselves…)
Inconsistency typical of the Web (everything & the
opposite of everything)
Partial information
More information than expected or than needed
Linked data pose new challenges for existing
technologies!
- 59. If I didn’t convince you…
http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html
- 60. Thanks for your attention! Any questions?
Contacts: Irene Celino – Semantic Web Practice
CEFRIEL – ICT Institute, Politecnico di Milano
email: irene.celino@cefriel.it – web: http://swa.cefriel.it
phone: +39-02-23954266 – fax: +39-02-23954466
Slides available at: http://www.slideshare.net/iricelino