Interlinking multimedia for the analysis of media coverage of political debates

Max Kemman & Henri Beunders
NOTaS meeting

www.polimedia.nl
Main goal
• Aimed at Humanities researchers
• Using CLARIN standard




25-6-2012         PoliMedia - NOTaS meeting   2
Main research question
 What choices do different media make in their coverage
 of people and topics when reporting on debates in the
 Dutch parliament, from the first televised evening news
 in 1956 until 1995?




Historical research use case
• How did the European Monetary Union (EMU)
  come to be in the 1990s?
• What events led to the creation of the EMU?
• How was this all represented by the media at
  that time?




Current approach


[Figure: two combinations of sources, images not preserved]
• First combination = too much work
• Second combination = limited material and different systems
PoliMedia approach

• PoliMedia Portal
      – Browse: debate and date
      – Search: debate and person
• Staten Generaal Digitaal (KB), 1818-1995
• Newspapers (KB), 1950-1995
• Television (Sound and Vision), 1956-1995
• Radio (KB), 1950-1984
Why PoliMedia?
Better insight into the relations between media
items




Data sets
• Primary data set:
            • The Dutch parliamentary debates (Handelingen der
              Staten-Generaal, the Dutch Hansard)
            • Available at the KB in raw format
            • Made CLARIN compliant in the War in Parliament project
               – chronological structure of consecutive speakers in a debate
• Secondary data set:
            1. NISV Academia set (OAI protocol)
            2. KB - newspapers (SRU protocol)
            3. KB - radio bulletins (SRU protocol)


Current status of technical work
1. Extract structure of debates
2. Find named entities in debate texts: people,
   organizations, locations.
3. Find links between debates and media.




1. Debate dataset structure
[Figure: annotated debate transcript marking its parts: debate metadata, topic, speaker, speech, segment]
Debate metadata schema
[Figure: RDF graph of the debate schema]
• Nodes: debate, speech, speech segment
• Example values: 2011-12-14, "Stemmingen over…", "Natuur en milieu"
• Properties: poli:hasPubDate, poli:hasDesc, poli:MediaType (Dbpedia: transcript),
  poli:hasSpeech, poli:hasNextSpeech, poli:hasSpeechSegment, poli:hasNextSegment,
  sem:hasActor, poli:mentions (people, locations, organizations), poli:coveredIn (media)
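A minimal sketch of how the schema on this slide could be held in memory as subject/predicate/object triples. Only the poli: and sem: property names come from the slide; the "ex:" identifiers are hypothetical, and which node each property attaches to is one reading of the diagram, not a confirmed part of the PoliMedia model.

```python
# Hypothetical identifiers ("ex:...") for illustration only.
# The property names are taken from the slide; the edges are an assumed reading.
triples = [
    ("ex:debate1", "poli:hasPubDate", "2011-12-14"),
    ("ex:debate1", "poli:hasSpeech", "ex:speech1"),
    ("ex:speech1", "poli:hasNextSpeech", "ex:speech2"),
    ("ex:speech1", "poli:hasSpeechSegment", "ex:segment1"),
    ("ex:segment1", "poli:hasNextSegment", "ex:segment2"),
    ("ex:segment1", "sem:hasActor", "ex:speaker1"),
]


def objects(subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def speeches_in_order(debate):
    """Follow poli:hasSpeech, then the chain of poli:hasNextSpeech links."""
    order = []
    seen = set()
    current = objects(debate, "poli:hasSpeech")
    while current and current[0] not in seen:
        speech = current[0]
        order.append(speech)
        seen.add(speech)  # guards against accidental cycles in the data
        current = objects(speech, "poli:hasNextSpeech")
    return order


print(speeches_in_order("ex:debate1"))
```

The linked-list style (hasNextSpeech, hasNextSegment) preserves the chronological order of speakers without needing explicit sequence numbers.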
2. Named Entity Recognition in debates
• Fietstas: web services for processing textual
  content
      – http://fietstas.science.uva.nl/
• Lists of named entities (NEs) that appear in
  specific documents or sets of documents
• Works well with the Dutch language (unlike other
  popular services such as DBpedia Spotlight)


Named Entity Recognition

[Figure: each debate file yields the debate file plus a file of named entities]
• debate1.xml → debate1.xml + ner1.xml
• debate2.xml → debate2.xml + ner2.xml
• debate3.xml → debate3.xml + ner3.xml
Named Entity Recognition
• Persons
• Organizations
• Locations
• Miscellaneous
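As an illustration of the four categories, the sketch below groups recognized entities into per-category lists. It does not call Fietstas; the input format (pairs of entity text and category label) is an assumption, and the example entities are illustrative only.

```python
CATEGORIES = ("Persons", "Organizations", "Locations", "Miscellaneous")


def group_entities(entities):
    """Group (text, category) pairs by category, dropping duplicates.

    Unknown category labels fall back to "Miscellaneous".
    """
    grouped = {category: [] for category in CATEGORIES}
    for text, category in entities:
        key = category if category in grouped else "Miscellaneous"
        if text not in grouped[key]:
            grouped[key].append(text)
    return grouped


# Illustrative input; in practice this would come from the NER output files.
found = [
    ("Princen", "Persons"),
    ("Van Mierlo", "Persons"),
    ("Princen", "Persons"),
    ("KB", "Organizations"),
]
print(group_entities(found))
```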




3. Find links to newspapers and radio bulletins
We use the dates, topics, named entities and
speakers of the debates to query the media
archives.

Media document harvesting:
• SRU protocol (Search/Retrieve via URL)
• http://www.loc.gov/standards/sru/
• JSRU is a Java implementation of the SRU
  protocol at the KB
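To make the harvesting step concrete, here is a minimal sketch of building an SRU searchRetrieve request URL. The parameter names (operation, version, query, startRecord, maximumRecords) are from the SRU standard; the base URL is a placeholder, not the actual KB endpoint, and the version number is an assumption.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, start_record=1, maximum_records=10):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",  # assumed version; check what the server supports
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint for illustration only.
url = sru_search_url("http://example.org/sru", '"Princen" AND "Van Mierlo"')
print(url)
```

Paging through a result set would repeat the request with a larger startRecord until all records are retrieved.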

Automatic Query Construction
• Inputs:
      – Persons, Locations and Organizations mentioned inside topics of the debate
      – Speakers
• Debate structure: Debate Metadata
      – Topic 1: Speaker 1 / Content, Speaker 2 / Content, Speaker 3 / Content
      – Topic 2: Speaker 1 / Content
• TopicList = PersonsInTopic, LocationsInTopic, Org.InTopic
• Speaker n = ActorFromSegment, TimeFrame
• Query = TopicList + Speaker n
• Example query: give all the newspaper issues in the collection DDD_krantnr
  where the date value is between 01-01-1940 and 31-12-1945
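The query construction on this slide can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: the function name, the max_terms cap, and the `date within "start end"` clause (including the index name `date`) are assumed and would need checking against the target archive's CQL support.

```python
def build_query(persons, locations, organizations, start, end,
                speaker=None, max_terms=3):
    """Combine topic entities, an optional speaker and a time frame into CQL.

    Entities are AND-ed together; max_terms caps the number of topic terms
    so the query does not become too specific to match anything.
    """
    topic_terms = (list(persons) + list(locations) + list(organizations))[:max_terms]
    terms = ([speaker] if speaker else []) + topic_terms
    quoted = ['"' + term.replace('"', "") + '"' for term in terms]
    # Assumed date clause syntax; the index name "date" is hypothetical.
    return " AND ".join(quoted + [f'date within "{start} {end}"'])


query = build_query(
    persons=["Princen", "Van Mierlo"],
    locations=[],
    organizations=[],
    start="21-12-1994",
    end="21-01-1995",
)
print(query)
```

AND-ing every entity narrows results quickly, which is why the sketch limits the number of topic terms.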
Newspaper metadata
[Figure: RDF graph of a newspaper article instance]
• poli:hasTitle: SCHUTJASSEN
• poli:hasPubDate: 1951-11-08
• poli:PublishedIn: De Heerenveensche koerier
• poli:MediaType: Dbpedia: Newspaper article
• poli:Mentions
Radio bulletin metadata



[Figure: RDF graph of a radio bulletin article instance]
• poli:hasTitle: ANP Nieuwsbericht - 06-05-1946 - 10
• poli:hasPubDate: 1946/05/06
• Dbpedia: Radio bulletin
The date of a debate and a media article
• We use the dates, topics, named entities and speakers of the
  debates to query the media archives.
• A news item always appears on the same day as the debate or later.
• How much time should we allow between debate and media item?
• Current choice: 1 month.
• Example: results 1-26 of 26 for “Princen” AND “Van Mierlo”,
  with a timeframe of one month:
      – 26 articles in the period between 21/12 and 21/01
      – 7 on the day of the debate, only 1 article 1 month later
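The one-month window can be sketched with the standard library. The helper names are hypothetical, and clamping to the last day of a shorter month (for example 31 January to 28 February) is an assumed design choice that the slides do not specify.

```python
import calendar
from datetime import date


def add_one_month(d):
    """Return the same day in the next month, clamped to that month's length."""
    year = d.year + (1 if d.month == 12 else 0)
    month = 1 if d.month == 12 else d.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def in_window(debate_date, item_date):
    """True if the media item is dated on the debate day or up to a month later."""
    return debate_date <= item_date <= add_one_month(debate_date)


debate = date(1994, 12, 21)
print(add_one_month(debate))                 # end of the window
print(in_window(debate, date(1994, 12, 21))) # item on the debate day
print(in_window(debate, date(1994, 12, 20))) # item before the debate
```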
Debate → Newspaper example




• Dates between 21.12.1994 (debate date) and 21.01.1995
• Queries:
      o Small number of topics (to avoid overspecialization)
      o Shorter timespan (fast media cycle)
Overview
[Figure: PersonsInTopic, LocationsInTopic, Org.InTopic, TimeFrame and ActorFromSegment all feed into the Query]
PoliMedia+
• Elections in September
• 300 influential political Twitter accounts
What can you do with this?
• PoliMedia offers better insight into the relation between
  politics and media
• What can speech and language technologists
  do with it?




Contact
                 www.polimedia.nl
               kemman@eshcc.eur.nl
Acknowledgements
• Rest of the team
      – Laura Hollink (VU), Geert-Jan Houben, Damir Juric (TU
        Delft), Johan Oomen, Jaap Blom (NISV), Martijn
        Kleppe (EUR)
      – KB
• War in Parliament
• CLARIN
      – Arjan van Hessen

  • 2. Main goal • Aimed at Humanities researchers • Using CLARIN standard 25-6-2012 PoliMedia - NOTaS meeting 2
  • 3. Main research question What choices do different media make in the coverage of people and topics while reporting on debates in the Dutch parliament since the first televised evening news in 1956 until 1995?
  • 4. Historical research use case • How did the European Monetary Union (EMU) come to be in the 1990s? • What events led to the creation of the EMU? • How was this all represented by the media at that time?
  • 5. Current approach (diagram of two source combinations; images not preserved) • First combination: too much work • Second combination: limited material and different systems
  • 6. PoliMedia approach (diagram) • PoliMedia Portal: browse by debate and date; search by debate and person • Staten Generaal Digitaal (KB), 1818-1995 • Newspapers (KB), 1950-1995 • Television (Sound and Vision), 1956-1995 • Radio (KB), 1950-1984
  • 7. Why PoliMedia? Better insight into the relations between media items
  • 8. Data sets • Primary data set: • The Dutch parliamentary debates (Handelingen der Staten-Generaal (Dutch Hansard)) • Available at the KB in raw format • Made CLARIN compliant in the War In Parliament project – chronological structure of consecutive speakers in a debate • Secondary data sets: 1. NISV Academia set (OAI protocol) 2. KB - newspapers (SRU protocol) 3. KB - radio bulletins (SRU protocol)
  • 9. Current status of technical work 1. Extract structure of debates 2. Find named entities in debate texts: people, organizations, locations. 3. Find links between debates and media.
  • 10. 1. Debate dataset structure Debate metadata Topic Speaker Speech Segment
  • 11. Debate metadata schema (graph, flattened) • Nodes: debate, speech, speech segment • Properties: poli:hasPubDate, poli:hasDesc, poli:hasSpeech, poli:hasNextSpeech, poli:hasSpeechSegment, poli:hasNextSegment, sem:hasActor, poli:MediaType (Dbpedia: transcript), poli:coveredIn (media), poli:mentions (people, locations, organizations) • Example values: 2011-12-14, “Stemmingen over…”, “Natuur en milieu”
  • 12. 2. Named Entity Recognition in debates • Fietstas: web services for processing textual content – http://fietstas.science.uva.nl/ • Produces lists of named entities (NEs) that appear in specific documents or sets of documents • Works well with the Dutch language (unlike other popular services such as DBpedia Spotlight)
  • 13. Named Entity Recognition (diagram) • debate1.xml → ner1.xml • debate2.xml → ner2.xml • debate3.xml → ner3.xml
  • 14. Named Entity Recognition • Persons • Organizations • Locations • Miscellaneous
  • 15. 3. Find links to newspapers and radio bulletins We use the dates, topics, named entities and speakers of the debates to query the media archives. Media document harvesting: • SRU protocol (Search/Retrieve via URL) • http://www.loc.gov/standards/sru/ • JSRU is a Java implementation of the SRU protocol at the KB
  • 16. Automatic Query Construction • Inputs: persons, locations and organizations mentioned inside topics of the debate; speakers • Debate metadata structure: Topic 1 (Speaker 1 / Content, Speaker 2 / Content, Speaker 3 / Content), Topic 2 (Speaker 1 / Content) • TopicList = PersonsInTopic + LocationsInTopic + Org.InTopic • Query = TopicList + ActorFromSegment + TimeFrame • Example query: give all the newspaper issues in the collection DDD_krantnr where the date value is between 01-01-1940 and 31-12-1945
  • 17. Newspaper metadata (article instance) • poli:hasPubDate: 1951-11-08 • poli:hasTitle: SCHUTJASSEN • poli:PublishedIn: De Heerenveensche koerier • poli:MediaType: Dbpedia: Newspaper article • poli:Mentions
  • 18. Radio bulletin metadata (article instance) • poli:hasPubDate: 1946/05/06 • poli:hasTitle: ANP Nieuwsbericht - 06-05-1946 - 10 • Dbpedia: Radio bulletin
  • 19. The date of a debate and a media article • We use the dates, topics, named entities and speakers of the debates to query the media archives. • A news item always appears on the same day as the debate or later. • How much time should we allow between debate and media item? • Current choice: 1 month. • Example: results 1-26 of 26 for “Princen” AND “Van Mierlo”, with a timeframe of one month: 26 articles in the period between 21/12 and 21/01; 7 on the day of the debate, only 1 article 1 month later.
  • 20. Debate → Newspaper example Dates between 21.12.1994 (debate date) and 21.01.1995 • Queries: o Small number of topics (to avoid overspecialization) o Shorter timespan (fast media cycle)
  • 21. Overview (diagram) PersonsInTopic, LocationsInTopic, Org.InTopic, ActorFromSegment and TimeFrame → Query
  • 22. PoliMedia+ • Elections in September • 300 influential political Twitter accounts
  • 23. What can you do with this? • PoliMedia allows better insight into the relation between politics and media • What can speech and language technologists do with it?
  • 24. Contact www.polimedia.nl kemman@eshcc.eur.nl Acknowledgements • Rest of the team – Laura Hollink (VU), Geert-Jan Houben, Damir Juric (TU Delft), Johan Oomen, Jaap Blom (NISV), Martijn Kleppe (EUR) • KB • War in Parliament • CLARIN – Arjan van Hessen
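
The query construction and one-month timeframe from slides 15, 16 and 19 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the endpoint `SRU_BASE`, the `date` index name and the function names are assumptions made for the example. Only the standard SRU parameters (`operation`, `version`, `query`, `maximumRecords`) and the CQL `within` relation come from the SRU/CQL specifications.

```python
import calendar
from datetime import date
from urllib.parse import urlencode

# Hypothetical endpoint for illustration; not the KB's actual JSRU address.
SRU_BASE = "https://sru.example.org/sru"


def one_month_after(d: date) -> date:
    """Return the same day in the next month, clamped to that month's last day."""
    year = d.year + d.month // 12
    month = d.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def build_cql(terms: list[str], start: date, end: date) -> str:
    """AND the quoted terms together and restrict them to a date window.

    The `date` index name is an assumption; an archive may expect a
    different index or date syntax.
    """
    quoted = " AND ".join(f'"{t}"' for t in terms)
    window = f'date within "{start:%d-%m-%Y} {end:%d-%m-%Y}"'
    return f"({quoted}) AND {window}"


def build_sru_url(cql: str, max_records: int = 20, base: str = SRU_BASE) -> str:
    """Wrap a CQL query in a standard SRU searchRetrieve request URL."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql,
        "maximumRecords": max_records,
    }
    return f"{base}?{urlencode(params)}"


# Example from slides 19-20: debate on 21.12.1994, names "Princen" and "Van Mierlo".
debate_date = date(1994, 12, 21)
cql = build_cql(["Princen", "Van Mierlo"], debate_date, one_month_after(debate_date))
url = build_sru_url(cql)
print(cql)
print(url)
```

Keeping the timeframe as a separate function makes it easy to try a shorter window, as slide 20 suggests for the fast media cycle.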

Editor's notes

  1. Limited: not everything is in it, but more importantly no mark-up or pages
  2. Searching and browsing multimedia databases in a single interface. Offering better insight into the relations between media items. Allowing researchers to create their own interface on top of the infrastructure.