1. News the New Way
Semantics in the Driver’s Seat
Philip Dudchuk & Daniel Hladky
SemTechBiz, San Francisco, June 5, 2012
2. Philip Dudchuk
Head of Semantic Platform,
RIA Novosti
Daniel Hladky
Deputy Director, W3C Russia
Member of the Board, Ontos AG
3. 1941
Founded during World War II, RIA Novosti was
initially a news agency reporting on the situation at the
front
4. First news websites looked
like simple feeds
5. Boom of platforms in the late 2000s
6. Metadata rules the world of news
• News metadata gets the right content to the right departments of the
customer (big media)
• Metadata locates reported events (local newspapers)
• Metadata enables vertical products focused on selected areas
(banking, automotive, government)
8. 2011: Need for a common Semantic Publishing Platform
• Build and manage a common news ontology and
vocabularies for all products and news websites
• Generate metadata for both news items and articles on
websites
• Aggregate content and metadata for further use in end-
user applications (websites and mobile apps)
9. Evolution of the Publishing Process
11. Managing the Triple Store
Triple Store updates
• Editorial meetings
• Statistics about ‘heuristic’ entities
• Adding an entity directly from CMS
Linguistic Information in the Triple Store
• Morphology
• Disambiguation rules & attributes
13. Impact 1: Broadcasting News with Semantic Metadata
Filtering news content by triple queries at the customer’s end
(via API):
• content about any oil & gas company
• content about any employee of any public body in a
given region of Russia
• content about any event going to happen in my city
Common metadata for newswire and web content makes it possible to
blend free and paid content into new products (news archive)
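
The kind of filtering described on this slide can be illustrated with a minimal sketch. It assumes news metadata is available as plain subject-predicate-object triples; the identifiers, predicate names, and the helper functions match and news_about_industry are all hypothetical and do not reflect the actual RIA Novosti API or ontology.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical metadata; all identifiers are invented for illustration.
TRIPLES = [
    Triple("news:101", "mentions", "org:AcmeOil"),
    Triple("news:102", "mentions", "org:CityBank"),
    Triple("news:103", "mentions", "org:NorthGas"),
    Triple("org:AcmeOil", "industry", "oil_and_gas"),
    Triple("org:NorthGas", "industry", "oil_and_gas"),
    Triple("org:CityBank", "industry", "banking"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def news_about_industry(triples, industry):
    """Join two patterns: ?org industry X, then ?news mentions ?org."""
    orgs = {t.subject for t in match(triples, predicate="industry", obj=industry)}
    mentions = match(triples, predicate="mentions")
    return sorted({t.subject for t in mentions if t.obj in orgs})


print(news_about_industry(TRIPLES, "oil_and_gas"))
```

A production setup would more likely send such a pattern to the triple store as a query; the in-memory join here only shows the shape of the request "content about any oil & gas company".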
14. Impact 2: Adaptive Content of Websites
My ria.ru
• Locating the user and filtering the content by region
• Gathering user interests and filtering content by
entities and topics
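
As a rough sketch of the two filtering steps above, the example below keeps items from the user's region and ranks them by overlap with the user's interests. The NewsItem and UserProfile classes, the personalize function, and the sample data are assumptions made for illustration, not the logic actually used on ria.ru.

```python
from dataclasses import dataclass, field


@dataclass
class NewsItem:
    title: str
    region: str
    tags: set = field(default_factory=set)  # entities and topics


@dataclass
class UserProfile:
    region: str
    interests: set = field(default_factory=set)


def personalize(items, user):
    """Keep items from the user's region, ranked by overlap with interests."""
    local = [item for item in items if item.region == user.region]
    # sorted() is stable, so items with equal overlap keep their input order.
    return sorted(local, key=lambda item: len(item.tags & user.interests), reverse=True)


# Hypothetical sample data.
ITEMS = [
    NewsItem("Bridge repairs begin", "Kazan", {"transport"}),
    NewsItem("Local derby result", "Kazan", {"football", "sport"}),
    NewsItem("Metro extension opens", "Moscow", {"transport"}),
]
USER = UserProfile("Kazan", {"football"})

for item in personalize(ITEMS, USER):
    print(item.title)
```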
15. Impact 3: Non-traditional Aggregations and Analytics
Putting together news metadata with external content
• summer forest fires
• juvenile delinquency in towns and regions
• election fraud cases
16. [Map with numbered markers]
Combination of crowd-sourced geo data about forest
fires and local reports by RIA Novosti
17. A case study: country image analysis
18. Country image analysis
• Searching news content related to Russia across more
than 3,000 foreign sources
• Processing search results, tagging and aggregating
content with its metadata
• Producing statistics about reaction to subjects
connected to Russia (events, people, organizations)
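
One simple way such statistics could be computed is sketched below. It assumes each tagged search result already carries a source, an entity, and a tone label; the slides do not say how tone is assigned or how any index is defined, so the data, the labels, and the functions tone_counts, negative_share, and top_negative_sources are purely illustrative.

```python
from collections import Counter

# Hypothetical tagged search results: (source, entity, tone).
TAGGED = [
    ("Source A", "Entity X", "negative"),
    ("Source A", "Entity X", "neutral"),
    ("Source B", "Entity X", "negative"),
    ("Source B", "Entity Y", "positive"),
    ("Source A", "Entity X", "negative"),
]


def tone_counts(tagged):
    """Count publications per (entity, tone) pair."""
    return Counter((entity, tone) for _source, entity, tone in tagged)


def negative_share(tagged, entity):
    """Fraction of an entity's publications tagged negative (0.0 if none)."""
    tones = [tone for _source, e, tone in tagged if e == entity]
    return tones.count("negative") / len(tones) if tones else 0.0


def top_negative_sources(tagged, entity, n=3):
    """Sources with the most negative publications about an entity."""
    counts = Counter(
        source for source, e, tone in tagged
        if e == entity and tone == "negative"
    )
    return counts.most_common(n)


print(negative_share(TAGGED, "Entity X"))
print(top_negative_sources(TAGGED, "Entity X"))
```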
19. Negativity Index
Tymoshenko’s case in Ukraine,
threat to boycott Euro 2012
‘Pussy Riot’ punks
arrested
Top sources with the largest number of negative publications on
the involvement of Russian politicians and businessmen in Yulia
Tymoshenko’s case
20. US media on Russia’s reaction to
the events in Syria
The New York Times The Financial Times
The Washington Post
Syria’s media on the same topic
21. Further Challenges
• Processing the content from social media to create
adaptive social applications
• Semantic metadata for pictures and video (image & voice
recognition)
• Making RIA content & metadata API public
• Creating a LOD cloud bubble out of RIA ontology and
vocabularies
22. Thank you!
@philip_dudchuk
@daniel_hladky