SlideShare una empresa de Scribd logo
1 de 36
Why Is It Difficult to Detect Outbreaks
in Twitter?
Avaré Stewart, Nattiya Kanhabua, Sara Romano
Ernesto Diaz-Aviles, Wolf Siberski, and Wolfgang Nejdl
L3S Research Center / Leibniz Universität Hannover, Germany
SIGIR 2013 Workshop on Health Search and Discovery
1 August 2013, Dublin, Ireland
Motivation
• Numerous works use Twitter to infer the existence
and magnitude of real-world events in real-time
– Earthquake [Sakaki et al., 2010]
– Predicting financial time series [Ruiz et al., 2012]
– Influenza epidemics [Culotta, 2010; Lampos et al.,
2011; Paul et al., 2011]
Early Warnings
Health related tweets
• User status updates or news related to
public health are common in Twitter
– I have the mumps...am I alone?
– my baby girl has a Gastroenteritis so great!! Please
do not give it to meee
– #Cholera breaks out in #Dadaab refugee camp in
#Kenya http://t.co/....
– As many as 16 people have been found infected with
Anthrax in Shahjadpur upazila of the Sirajganj district
in Bangladesh.
Matching Tweets
[Kanhabua et al., CIKM’12]
Matching Tweets
[Kanhabua et al., CIKM’12]
Twitter vs. Official Source
M-Eco System
Medical Ecosystem: Personalized Event-based Surveillance
http://www.meco-project.eu/
Data Collection
• Official outbreak reports
– ~3,000 ProMED-mail reports from 2011
– WHO reports have very small coverage
• Twitter data
– ~1,200 health-related terms (i.e., infectious
diseases, their synonyms, pathogens and symptoms)
– Over 112 millions of tweets from 2011
• Series of NLP tools including
– OpenNLP (tokenization, sentence splitting, POS
tagging)
– OpenCalais (named entity recognition)
– HeidelTime (temporal expression extraction)
Ground Truths
[Kanhabua et al., TAIA’ 12]
Event Extraction
• An event is a sentence containing two entities
– (1) medical condition and (2) geographic expression
– A minimum requirement by domain experts
• A victim and the time of an event can be identified
from the sentence itself, or its surrounding context
• Output: a set of event candidates
Reported by World Health Organization (WHO) on
29 July 2012 about an ongoing Ebola outbreak
in Uganda since the beginning of July 2012
[Kanhabua et al., TAIA’ 12]
Message Filtering: Challenges
• Ambiguity
– having several meanings
– used in different contexts
• Incompleteness
– missing or under-reported events
– data processing errors
Message Filtering: Challenges
• Ambiguity
– having several meanings
– used in different contexts
• Incompleteness
– missing or under-reported events
– data processing errors
Category Example tweet
Literature A two hour train journey, Love In the Time of Cholera ...
Music Dengue Fever’s “Uku,” Mixed by Paul Dreux Smith
Universal Audio...
Marketing Exclusive distributor of high quality #HIV/AIDS Blood &
Urine and #Hepatitis #Self -testers.
General Identification of genotype 4 Hepatitis E virus binding
proteins on swine liver cells: Hepatitis E virus...
Negative i dont have sniffles and no real coughing..well its
coughing but not like an influenza cough.
Joke Thought I had Bieber Fever. Ends up I just had a combo
of the mumps, mono, measles & the hershey squ...
Challenge I. Noisy/evolving
• Evolving data
– Relevant features changes over time
Challenge I. Noisy/evolving
Approach for Noisy Data
• MedISys1
– providing a list of negative keywords
created by medical experts
• Urban Dictionary2
– a Web-based dictionary of slang, ethnic
culture words or phrases
1
http://medusa.jrc.it/medisys/homeedition/en/home.html
2
http://www.urbandictionary.com/
Approach for Noisy Data
1
http://medusa.jrc.it/medisys/homeedition/en/home.html
2
http://www.urbandictionary.com/
[Kanhabua and Nejdl, WOW’ 13]
[Kanhabua and Nejdl, WOW’ 13]
Approach for Feature Changes
Signal Generation: Challenges
• Temporal Dynamics
– seasonal infectious diseases
– rare and spontaneous outbreaks
• Location Dynamics
– frequency and duration
– levels of prevalence or severity
Signal Generation: Challenges
• Temporal Dynamics
– seasonal infectious diseases
– rare and spontaneous outbreaks
• Location Dynamics
– frequency and duration
– levels of prevalence or severity
[Rortais et al., 2010 in Journal of Food Research International]
Signal Generation: Challenges
• Temporal Dynamics
– seasonal infectious diseases
– rare and spontaneous outbreaks
• Location Dynamics
– frequency and duration
– levels of prevalence or severity
Signal Generation: Challenges
[Emch et al., 2008 in International Journal of Health Geographics]
Outbreak Categorization
Outbreak Categorization
How to generate a reliable
signal for low aggregate counts?
Approach
[Kanhabua and Nejdl, WOW’ 13]
Temporal Diversity
• Refined Jaccard Index (RDJ-index)
– average Jaccard similarity of all object pairs
• Note: lower RDJ corresponds to higher diversity
• Problem: “All-Pair comparison”
• Solution: Estimation algorithms with probabilistic
error bound guarantees
[Deng et al., CIKM’
∑<−
=
ji
ji OOJS
nn
RDJ ),(
)1(
2
nji ≤<≤1
∩ UU
Jaccard similarity
Temporal Diversity
• Refined Jaccard Index (RDJ-index)
– average Jaccard similarity of all object pairs
• Note: lower RDJ corresponds to higher diversity
• Problem: “All-Pair comparison”
• Solution: Estimation algorithms with probabilistic
error bound guarantees
[Deng et al., CIKM’
∑<−
=
ji
ji OOJS
nn
RDJ ),(
)1(
2
nji ≤<≤1
∩ UU
Jaccard similarity
(1) Top-k terms
(2) Entities
Threat Assessment: Challenge
• Overwhelming with the large number of tweets
Approach
• Personalized Tweet Ranking for Epidemic
Intelligence
– Learning to rank and recommender systems
– User's context as implicit criteria for
recommendation
[Diaz-Aviles et al., WWW’ 12,
Diaz-Aviles et al., ICWSM’ 12]
Approach
Signal Search Prototype
Future Work
• Real-Time Analysis of Big and Fast
Social Web Streams
– Scalable, efficient methods for filtering and
generating signals in real-time
– Effective methods for aggregating and
visualizing information in a meaningful way
Thank you!
kanhabua@L3S.de
References
• [Culotta, 2010] A. Culotta. Towards detecting influenza epidemics by analyzing twitter messages. In
Proceedings of the First Workshop on Social Media Analytics (SOMA’2010), 2010.
• [Diaz-Aviles et al., 2012a] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl. Towards
personalized learning to rank for epidemic intelligence based on social media streams. In Proceedings of
the 21st World Wide Web Conference (WWW ‘2012), 2012.
• [Diaz-Aviles et al., 2012b] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl. Epidemic
intelligence for the crowd, by the crowd. In Proceedings of International AAAI Conference on Weblogs
and Social Media (ICWSM’2012), 2012.
• [Kanhabua et al., 2012a] N. Kanhabua, Sara Romano, and A. Stewart, Identifying Relevant Temporal
Expressions for Real-world Events, In SIGIR 2012 Workshop on Time-aware Information Access
(TAIA'2012), 2012.
• [Kanhabua et al., 2012b] N. Kanhabua, Sara Romano, and A. Stewart and W. Nejdl. Supporting
Temporal Analytics for Health Related Events in Microblogs. In Proceedings of CIKM'2012, 2012.
• [Kanhabua and Nejdl 2013] N. Kanhabua and W. Nejdl. Understanding the Diversity of Tweets in the
Time of Outbreaks. In Proceedings of the First International Web Observatory Workshop (WOW'2013) at
WWW'2013, 2013.
• [Lampos et al., 2011] V. Lampos and N. Cristianini. Nowcasting events from the social web with
statistical learning. ACM TIST, 3, 2011.
• [Paul et al., 2011] M. J. Paul and M. Dredze. You are what you tweet: Analyzing twitter for public health.
In Proceedings of International AAAI Conference on Weblogs and Social Media (ICWSM’2011), 2011.
• [Ruiz et al., 2012] E. J. Ruiz, V. Hristidis, C. Castillo, A. Gionis, and A. Jaimes. Correlating financial time
series with micro-blogging activity. In Proceedings of WSDM’2012, 2012.
• [Sakaki et al., 2010] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes twitter users: real-time
event detection by social sensors. In Proceedings of WWW’2010, 2010.

Más contenido relacionado

Similar a Why Is It Difficult to Detect Outbreaks in Twitter?

Awareness of dengue fever among Youth vivek.pptx
Awareness of dengue fever among Youth vivek.pptxAwareness of dengue fever among Youth vivek.pptx
Awareness of dengue fever among Youth vivek.pptxvivek784476
 
emerging and re-emerging vector borne diseases
emerging and re-emerging vector borne diseasesemerging and re-emerging vector borne diseases
emerging and re-emerging vector borne diseasesAnil kumar
 
Dengue Virus Disease: Recent Updates on Vaccine Development
Dengue Virus Disease: Recent Updates on Vaccine DevelopmentDengue Virus Disease: Recent Updates on Vaccine Development
Dengue Virus Disease: Recent Updates on Vaccine DevelopmentRSIS International
 
The concept of HIV AIDS and nutrition for community development and social w...
The concept of HIV AIDS  and nutrition for community development and social w...The concept of HIV AIDS  and nutrition for community development and social w...
The concept of HIV AIDS and nutrition for community development and social w...RiberatusPhilipo
 
Real Time Pcr Detection Of Hiv Viral Load
Real Time Pcr Detection Of Hiv Viral LoadReal Time Pcr Detection Of Hiv Viral Load
Real Time Pcr Detection Of Hiv Viral LoadAlison Reed
 
Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...Jay Brown
 
Inv.epidemics
Inv.epidemicsInv.epidemics
Inv.epidemicsNc Das
 
BRENDER-Economic considerations in risk management-ID1485-IDRC2014_b
BRENDER-Economic considerations in risk management-ID1485-IDRC2014_bBRENDER-Economic considerations in risk management-ID1485-IDRC2014_b
BRENDER-Economic considerations in risk management-ID1485-IDRC2014_bGlobal Risk Forum GRFDavos
 
Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...
Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...
Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...Institut national du cancer
 
2014 2015-update-on-ebola-virus-dr-bryna-warshawsky (1)
2014 2015-update-on-ebola-virus-dr-bryna-warshawsky  (1)2014 2015-update-on-ebola-virus-dr-bryna-warshawsky  (1)
2014 2015-update-on-ebola-virus-dr-bryna-warshawsky (1)Elyas Mohammed
 
Aids: Then and Now Webquest
Aids: Then and Now WebquestAids: Then and Now Webquest
Aids: Then and Now Webquestechavey
 
Epidemiology Introduction
Epidemiology Introduction Epidemiology Introduction
Epidemiology Introduction KULDEEP VYAS
 
A World United Against Infectious Diseases: Connecting Organizations for Regi...
A World United Against Infectious Diseases: Connecting Organizations for Regi...A World United Against Infectious Diseases: Connecting Organizations for Regi...
A World United Against Infectious Diseases: Connecting Organizations for Regi...The Rockefeller Foundation
 
The patient perspective nat caplyme final
The patient perspective nat caplyme finalThe patient perspective nat caplyme final
The patient perspective nat caplyme finalnatcaplyme
 
The patient perspective_v3_final
The patient perspective_v3_finalThe patient perspective_v3_final
The patient perspective_v3_finalDonald Freeman
 

Similar a Why Is It Difficult to Detect Outbreaks in Twitter? (20)

Awareness of dengue fever among Youth vivek.pptx
Awareness of dengue fever among Youth vivek.pptxAwareness of dengue fever among Youth vivek.pptx
Awareness of dengue fever among Youth vivek.pptx
 
I so p 9.10.2017
I so p 9.10.2017I so p 9.10.2017
I so p 9.10.2017
 
emerging and re-emerging vector borne diseases
emerging and re-emerging vector borne diseasesemerging and re-emerging vector borne diseases
emerging and re-emerging vector borne diseases
 
Dengue Virus Disease: Recent Updates on Vaccine Development
Dengue Virus Disease: Recent Updates on Vaccine DevelopmentDengue Virus Disease: Recent Updates on Vaccine Development
Dengue Virus Disease: Recent Updates on Vaccine Development
 
The concept of HIV AIDS and nutrition for community development and social w...
The concept of HIV AIDS  and nutrition for community development and social w...The concept of HIV AIDS  and nutrition for community development and social w...
The concept of HIV AIDS and nutrition for community development and social w...
 
Real Time Pcr Detection Of Hiv Viral Load
Real Time Pcr Detection Of Hiv Viral LoadReal Time Pcr Detection Of Hiv Viral Load
Real Time Pcr Detection Of Hiv Viral Load
 
Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...Harnessing the Power of Infectious Disease Information with a Relational Data...
Harnessing the Power of Infectious Disease Information with a Relational Data...
 
Inv.epidemics
Inv.epidemicsInv.epidemics
Inv.epidemics
 
سل 2.pptx
سل 2.pptxسل 2.pptx
سل 2.pptx
 
Obumneke amadi
Obumneke amadi  Obumneke amadi
Obumneke amadi
 
BRENDER-Economic considerations in risk management-ID1485-IDRC2014_b
BRENDER-Economic considerations in risk management-ID1485-IDRC2014_bBRENDER-Economic considerations in risk management-ID1485-IDRC2014_b
BRENDER-Economic considerations in risk management-ID1485-IDRC2014_b
 
Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...
Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...
Colloque RI 2014 : Intervention de Sylvie DOLBEAULT, MD, MPH (Institut Curie,...
 
2014 2015-update-on-ebola-virus-dr-bryna-warshawsky (1)
2014 2015-update-on-ebola-virus-dr-bryna-warshawsky  (1)2014 2015-update-on-ebola-virus-dr-bryna-warshawsky  (1)
2014 2015-update-on-ebola-virus-dr-bryna-warshawsky (1)
 
One
OneOne
One
 
Aids: Then and Now Webquest
Aids: Then and Now WebquestAids: Then and Now Webquest
Aids: Then and Now Webquest
 
Keynote address: 'No time to lose: of infectious diseases, science, politics ...
Keynote address: 'No time to lose: of infectious diseases, science, politics ...Keynote address: 'No time to lose: of infectious diseases, science, politics ...
Keynote address: 'No time to lose: of infectious diseases, science, politics ...
 
Epidemiology Introduction
Epidemiology Introduction Epidemiology Introduction
Epidemiology Introduction
 
A World United Against Infectious Diseases: Connecting Organizations for Regi...
A World United Against Infectious Diseases: Connecting Organizations for Regi...A World United Against Infectious Diseases: Connecting Organizations for Regi...
A World United Against Infectious Diseases: Connecting Organizations for Regi...
 
The patient perspective nat caplyme final
The patient perspective nat caplyme finalThe patient perspective nat caplyme final
The patient perspective nat caplyme final
 
The patient perspective_v3_final
The patient perspective_v3_finalThe patient perspective_v3_final
The patient perspective_v3_final
 

Más de Nattiya Kanhabua

Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataNattiya Kanhabua
 
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...Nattiya Kanhabua
 
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result DiversificationLeveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result DiversificationNattiya Kanhabua
 
On the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in WikipediaOn the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in WikipediaNattiya Kanhabua
 
Ranking Related News Predictions
Ranking Related News PredictionsRanking Related News Predictions
Ranking Related News PredictionsNattiya Kanhabua
 
Temporal summarization of event related updates
Temporal summarization of event related updatesTemporal summarization of event related updates
Temporal summarization of event related updatesNattiya Kanhabua
 
Temporal Web Dynamics: Implications from Search Perspective
Temporal Web Dynamics: Implications from Search PerspectiveTemporal Web Dynamics: Implications from Search Perspective
Temporal Web Dynamics: Implications from Search PerspectiveNattiya Kanhabua
 
Temporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information RetrievalTemporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information RetrievalNattiya Kanhabua
 
Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?Nattiya Kanhabua
 
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...Nattiya Kanhabua
 
Searching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current ApproachesSearching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current ApproachesNattiya Kanhabua
 
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...Nattiya Kanhabua
 
Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Nattiya Kanhabua
 
Determining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search ResultsDetermining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search ResultsNattiya Kanhabua
 
Supporting Exploration and Serendipity in Information Retrieval
Supporting Exploration and Serendipity in Information RetrievalSupporting Exploration and Serendipity in Information Retrieval
Supporting Exploration and Serendipity in Information RetrievalNattiya Kanhabua
 
Time-aware Approaches to Information Retrieval
Time-aware Approaches to Information RetrievalTime-aware Approaches to Information Retrieval
Time-aware Approaches to Information RetrievalNattiya Kanhabua
 
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)Nattiya Kanhabua
 
Estimating Query Difficulty for News Prediction Retrieval (poster presentation)
Estimating Query Difficulty for News Prediction Retrieval (poster presentation)Estimating Query Difficulty for News Prediction Retrieval (poster presentation)
Estimating Query Difficulty for News Prediction Retrieval (poster presentation)Nattiya Kanhabua
 
Exploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document ArchivesExploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document ArchivesNattiya Kanhabua
 
Identifying Relevant Temporal Expressions for Real-world Events
Identifying Relevant Temporal Expressions for Real-world EventsIdentifying Relevant Temporal Expressions for Real-world Events
Identifying Relevant Temporal Expressions for Real-world EventsNattiya Kanhabua
 

Más de Nattiya Kanhabua (20)

Search, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving DataSearch, Exploration and Analytics of Evolving Data
Search, Exploration and Analytics of Evolving Data
 
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
Towards Concise Preservation by Managed Forgetting: Research Issues and Case ...
 
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result DiversificationLeveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification
 
On the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in WikipediaOn the Value of Temporal Anchor Texts in Wikipedia
On the Value of Temporal Anchor Texts in Wikipedia
 
Ranking Related News Predictions
Ranking Related News PredictionsRanking Related News Predictions
Ranking Related News Predictions
 
Temporal summarization of event related updates
Temporal summarization of event related updatesTemporal summarization of event related updates
Temporal summarization of event related updates
 
Temporal Web Dynamics: Implications from Search Perspective
Temporal Web Dynamics: Implications from Search PerspectiveTemporal Web Dynamics: Implications from Search Perspective
Temporal Web Dynamics: Implications from Search Perspective
 
Temporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information RetrievalTemporal Web Dynamics and Implications for Information Retrieval
Temporal Web Dynamics and Implications for Information Retrieval
 
Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?Preservation and Forgetting: Friends or Foes?
Preservation and Forgetting: Friends or Foes?
 
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
Concise Preservation by Combining Managed Forgetting and Contextualized Remem...
 
Searching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current ApproachesSearching the Temporal Web: Challenges and Current Approaches
Searching the Temporal Web: Challenges and Current Approaches
 
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
Improving Temporal Language Models For Determining Time of Non-Timestamped Do...
 
Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...
 
Determining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search ResultsDetermining Time of Queries for Re-ranking Search Results
Determining Time of Queries for Re-ranking Search Results
 
Supporting Exploration and Serendipity in Information Retrieval
Supporting Exploration and Serendipity in Information RetrievalSupporting Exploration and Serendipity in Information Retrieval
Supporting Exploration and Serendipity in Information Retrieval
 
Time-aware Approaches to Information Retrieval
Time-aware Approaches to Information RetrievalTime-aware Approaches to Information Retrieval
Time-aware Approaches to Information Retrieval
 
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
Learning to Rank Search Results for Time-Sensitive Queries (poster presentation)
 
Estimating Query Difficulty for News Prediction Retrieval (poster presentation)
Estimating Query Difficulty for News Prediction Retrieval (poster presentation)Estimating Query Difficulty for News Prediction Retrieval (poster presentation)
Estimating Query Difficulty for News Prediction Retrieval (poster presentation)
 
Exploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document ArchivesExploiting Time-based Synonyms in Searching Document Archives
Exploiting Time-based Synonyms in Searching Document Archives
 
Identifying Relevant Temporal Expressions for Real-world Events
Identifying Relevant Temporal Expressions for Real-world EventsIdentifying Relevant Temporal Expressions for Real-world Events
Identifying Relevant Temporal Expressions for Real-world Events
 

Último

Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AITatiana Gurgel
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardsticksaastr
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyPooja Nehwal
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...Sheetaleventcompany
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Kayode Fayemi
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxraffaeleoman
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Delhi Call girls
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesPooja Nehwal
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxmohammadalnahdi22
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceDelhi Call girls
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Hasting Chen
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaKayode Fayemi
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubssamaasim06
 
Mathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMoumonDas2
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024eCommerce Institute
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Vipesco
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...henrik385807
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024eCommerce Institute
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )Pooja Nehwal
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfhenrik385807
 

Último (20)

Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AI
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubs
 
Mathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptx
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
 

Why Is It Difficult to Detect Outbreaks in Twitter?

  • 1. Why Is It Difficult to Detect Outbreaks in Twitter? Avaré Stewart, Nattiya Kanhabua, Sara Romano Ernesto Diaz-Aviles, Wolf Siberski, and Wolfgang Nejdl L3S Research Center / Leibniz Universität Hannover, Germany SIGIR 2013 Workshop on Health Search and Discovery 1 August 2013, Dublin, Ireland
  • 2. Motivation • Numerous works use Twitter to infer the existence and magnitude of real-world events in real-time – Earthquake [Sakaki et al., 2010] – Predicting financial time series [Ruiz et al., 2012] – Influenza epidemics [Culotta, 2010; Lampos et al., 2011; Paul et al., 2011]
  • 4. Health related tweets • User status updates or news related to public health are common in Twitter – I have the mumps...am I alone? – my baby girl has a Gastroenteritis so great!! Please do not give it to meee – #Cholera breaks out in #Dadaab refugee camp in #Kenya http://t.co/.... – As many as 16 people have been found infected with Anthrax in Shahjadpur upazila of the Sirajganj district in Bangladesh.
  • 8. M-Eco System Medical Ecosystem: Personalized Event-based Surveillance http://www.meco-project.eu/
  • 9. Data Collection • Official outbreak reports – ~3,000 ProMED-mail reports from 2011 – WHO reports have very small coverage • Twitter data – ~1,200 health-related terms (i.e., infectious diseases, their synonyms, pathogens and symptoms) – Over 112 millions of tweets from 2011 • Series of NLP tools including – OpenNLP (tokenization, sentence splitting, POS tagging) – OpenCalais (named entity recognition) – HeidelTime (temporal expression extraction)
  • 10. Ground Truths [Kanhabua et al., TAIA’ 12]
  • 11. Event Extraction • An event is a sentence containing two entities – (1) medical condition and (2) geographic expression – A minimum requirement by domain experts • A victim and the time of an event can be identified from the sentence itself, or its surrounding context • Output: a set of event candidates Reported by World Health Organization (WHO) on 29 July 2012 about an ongoing Ebola outbreak in Uganda since the beginning of July 2012 [Kanhabua et al., TAIA’ 12]
  • 12. Message Filtering: Challenges • Ambiguity – having several meanings – used in different contexts • Incompleteness – missing or under-reported events – data processing errors
  • 13. Message Filtering: Challenges • Ambiguity – having several meanings – used in different contexts • Incompleteness – missing or under-reported events – data processing errors Category Example tweet Literature A two hour train journey, Love In the Time of Cholera ... Music Dengue Fever’s “Uku,” Mixed by Paul Dreux Smith Universal Audio... Marketing Exclusive distributor of high quality #HIV/AIDS Blood & Urine and #Hepatitis #Self -testers. General Identification of genotype 4 Hepatitis E virus binding proteins on swine liver cells: Hepatitis E virus... Negative i dont have sniffles and no real coughing..well its coughing but not like an influenza cough. Joke Thought I had Bieber Fever. Ends up I just had a combo of the mumps, mono, measles & the hershey squ...
  • 14. Challenge I. Noisy/evolving • Evolving data – Relevant features changes over time
  • 16. Approach for Noisy Data • MedISys1 – providing a list of negative keywords created by medical experts • Urban Dictionary2 – a Web-based dictionary of slang, ethnic culture words or phrases 1 http://medusa.jrc.it/medisys/homeedition/en/home.html 2 http://www.urbandictionary.com/
  • 17. Approach for Noisy Data 1 http://medusa.jrc.it/medisys/homeedition/en/home.html 2 http://www.urbandictionary.com/
  • 18. [Kanhabua and Nejdl, WOW’ 13]
  • 19. [Kanhabua and Nejdl, WOW’ 13]
  • 21. Signal Generation: Challenges • Temporal Dynamics – seasonal infectious diseases – rare and spontaneous outbreaks • Location Dynamics – frequency and duration – levels of prevalence or severity
  • 22. Signal Generation: Challenges • Temporal Dynamics – seasonal infectious diseases – rare and spontaneous outbreaks • Location Dynamics – frequency and duration – levels of prevalence or severity [Rortais et al., 2010 in Journal of Food Research International]
  • 23. Signal Generation: Challenges • Temporal Dynamics – seasonal infectious diseases – rare and spontaneous outbreaks • Location Dynamics – frequency and duration – levels of prevalence or severity
  • 24. Signal Generation: Challenges [Emch et al., 2008 in International Journal of Health Geographics]
  • 26. Outbreak Categorization How to generate a reliable signal for low aggregate counts?
  • 28. Temporal Diversity • Refined Jaccard Index (RDJ-index) – average Jaccard similarity of all object pairs • Note: lower RDJ corresponds to higher diversity • Problem: “All-Pair comparison” • Solution: Estimation algorithms with probabilistic error bound guarantees [Deng et al., CIKM’ ∑<− = ji ji OOJS nn RDJ ),( )1( 2 nji ≤<≤1 ∩ UU Jaccard similarity
  • 29. Temporal Diversity • Refined Jaccard Index (RDJ-index) – average Jaccard similarity of all object pairs • Note: lower RDJ corresponds to higher diversity • Problem: “All-Pair comparison” • Solution: Estimation algorithms with probabilistic error bound guarantees [Deng et al., CIKM’ ∑<− = ji ji OOJS nn RDJ ),( )1( 2 nji ≤<≤1 ∩ UU Jaccard similarity (1) Top-k terms (2) Entities
  • 30. Threat Assessment: Challenge • Overwhelming with the large number of tweets
  • 31. Approach • Personalized Tweet Ranking for Epidemic Intelligence – Learning to rank and recommender systems – User's context as implicit criteria for recommendation [Diaz-Aviles et al., WWW’ 12, Diaz-Aviles et al., ICWSM’ 12]
  • 34. Future Work • Real-Time Analysis of Big and Fast Social Web Streams – Scalable, efficient methods for filtering and generating signals in real-time – Effective methods for aggregating and visualizing information in a meaningful way
  • 36. References • [Culotta, 2010] A. Culotta. Towards detecting influenza epidemics by analyzing twitter messages. In Proceedings of the First Workshop on Social Media Analytics (SOMA’2010), 2010. • [Diaz-Aviles et al., 2012a] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl. Towards personalized learning to rank for epidemic intelligence based on social media streams. In Proceedings of the 21st World Wide Web Conference (WWW ‘2012), 2012. • [Diaz-Aviles et al., 2012b] E. Diaz-Aviles, A. Stewart, E. Velasco, K. Denecke, and W. Nejdl. Epidemic intelligence for the crowd, by the crowd. In Proceedings of International AAAI Conference on Weblogs and Social Media (ICWSM’2012), 2012. • [Kanhabua et al., 2012a] N. Kanhabua, Sara Romano, and A. Stewart, Identifying Relevant Temporal Expressions for Real-world Events, In SIGIR 2012 Workshop on Time-aware Information Access (TAIA'2012), 2012. • [Kanhabua et al., 2012b] N. Kanhabua, Sara Romano, and A. Stewart and W. Nejdl. Supporting Temporal Analytics for Health Related Events in Microblogs. In Proceedings of CIKM'2012, 2012. • [Kanhabua and Nejdl 2013] N. Kanhabua and W. Nejdl. Understanding the Diversity of Tweets in the Time of Outbreaks. In Proceedings of the First International Web Observatory Workshop (WOW'2013) at WWW'2013, 2013. • [Lampos et al., 2011] V. Lampos and N. Cristianini. Nowcasting events from the social web with statistical learning. ACM TIST, 3, 2011. • [Paul et al., 2011] M. J. Paul and M. Dredze. You are what you tweet: Analyzing twitter for public health. In Proceedings of International AAAI Conference on Weblogs and Social Media (ICWSM’2011), 2011. • [Ruiz et al., 2012] E. J. Ruiz, V. Hristidis, C. Castillo, A. Gionis, and A. Jaimes. Correlating financial time series with micro-blogging activity. In Proceedings of WSDM’2012, 2012. • [Sakaki et al., 2010] T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes twitter users: real-time event detection by social sensors. In Proceedings of WWW’2010, 2010.

Notas del editor

  1. To exploit this timeliness potential, we present an event-based Epidemic Intelligence (EI) system, which has emerged as a type of intelligence gathering aimed to detect events of interest to the public health from unstructured text on the Web In the medical domain, there has been a surge in detecting health related tweets for early warning Allow a rapid response from authorities [Diaz-Aviles et al., 2012]
  2. Note that, there are existing EI systems, such as, the Bio- Caster Global Health Monitor1 or HealthMap 2. However, they differ from our proposed system in the level of analysis, information sources, language coverage and visualization. Frequencies of cases reported to RKI and number of tweets mentioning the name of the disease: EHEC. Pearson correlation coefficient = 0.864. The monitor of Twitter allowed M-Eco to generate the first signals on Friday, May 20th, 2011.
  3. We study and propose solutions to three main research challenges in gathering epidemic intelligence from social media streams: 1) dynamic classification to enable message filtering, 2) producing reliable warning signals (temporal anomalyies found) based on observe term frequency changes in these messages, using biosurveillance algorithms, and 3) providing suitable information and recommendations to domain experts, for better assessment of the potential outbreak threats associated with the generated signals. Part I. Ground truth creation Official outbreak reports World Health Organization1 ProMED-mail2 Part II. Creating Twitter time series medical condition disease name, synonyms, pathogens, symptoms location geographic expressions, geo-location, or user profile 3 levels: country, continent, latitude
  4. M-Eco strives to detect a large variety of infectious diseases, so we make use of a list of 1,258 terms consisting of infectious diseases, their synonyms, pathogens and symptoms, which are provided by the domain experts in two languages, namely English and German, for an initial filtering step. All documents and tweets are annotated with locations, medical conditions and temporal expressions using a series of language processing tools, including OpenNLP2 for tokenization, sentence splitting and part-of-speech tagging, HeidelTime [34] for temporal expression extraction
  5. Hash-tags co-occurring with #EHEC during May 23 and June 19, 2011, the main period of the outbreak. The hash-tags are classified as entities of type Medical Condition, Location, or Complementary Context, hash-tags out of these categories are discarded.
  6. Hash-tags co-occurring with #EHEC during May 23 and June 19, 2011, the main period of the outbreak. The hash-tags are classified as entities of type Medical Condition, Location, or Complementary Context, hash-tags out of these categories are discarded.
  7. Our approach builds upon [18] and extends it by: 1) incorporating the use of an orthogonal vector, which is learned by a Support Vector Machine (SVM), as a description of the feature change; and 2) computing a novelty score that lets the system identify those tweets that contribute to the feature change, so that their true labels can be obtained.
  8. In order to detect outbreak events for early warning, we exploit different state-of-the-art Biosurveillance algorithms as anomaly detectors in disease-related Twitter messages: \textbf{C1}, \textbf{C2}, \textbf{C3}, F-Statistic (\textbf{FS}), Experimental Weighted Moving Average (\textbf{EWMA}) and Farrington (\textbf{FA})~\cite{basseville1993detection, farrington_1996}. Traditional bio-surveillance systems usually exploit information from official sources, e.g., laboratory results, mortality rates, or the number of reported patients suffering from a disease outbreak. In recent years, researchers in the medical domain have begun to leverage real-time, social Web data, such as, tweets.
  9. In order to detect outbreak events for early warning, we exploit different state-of-the-art Biosurveillance algorithms as anomaly detectors in disease-related Twitter messages: \textbf{C1}, \textbf{C2}, \textbf{C3}, F-Statistic (\textbf{FS}), Experimental Weighted Moving Average (\textbf{EWMA}) and Farrington (\textbf{FA})~\cite{basseville1993detection, farrington_1996}. Traditional bio-surveillance systems usually exploit information from official sources, e.g., laboratory results, mortality rates, or the number of reported patients suffering from a disease outbreak. In recent years, researchers in the medical domain have begun to leverage real-time, social Web data, such as, tweets.
  10. Identified topics show similar trends during the known time periods of real-world outbreaks Diversity reflects how the language (i.e., terms and locations) are used differently Div(entity) highly correlates with topic dynamics for some diseases, i.e., mumps, ebola, botulism and ehec Div(term) shows correlation with topic dynamics for cholera, anthrax and rubella
  11. Algorithms: SampleDJ, TrackDJ (claims and proofs in [Deng et al., 2012])
  12. Algorithms: SampleDJ, TrackDJ (claims and proofs in [Deng et al., 2012])
  13. (1) Rely upon abundant user interactions and/or the availability of explicit feedback (e.g., ratings, likes, dislikes) (2) Within M-Eco, we use the tweets from signals in developing techniques to provide a personalized short list of tweets that meets the context of the investigation. In this section, we review one of them; namely Personalized Tweet Ranking for Epidemic Intelligence (PTR4EI) [13, 14] and discuss the evaluation conducted during a major EHEC outbreak in Germany. where t is a discrete Time interval, MCu the set of Medical Conditions, and Lu the set of Locations of user interest. More precisely, we expand the user&amp;apos;s context, Cu, using latent topics computed with LDA [5] on: 1) an indexed collection of tweets for epidemic intelligence; and 2) the hash-tags that co-occur with this context.
  14. (1) Rely upon abundant user interactions and/or the availability of explicit feedback (e.g., ratings, likes, dislikes) (2) Within M-Eco, we use the tweets from signals in developing techniques to provide a personalized short list of tweets that meets the context of the investigation. In this section, we review one of them; namely Personalized Tweet Ranking for Epidemic Intelligence (PTR4EI) [13, 14] and discuss the evaluation conducted during a major EHEC outbreak in Germany. where t is a discrete Time interval, MCu the set of Medical Conditions, and Lu the set of Locations of user interest. More precisely, we expand the user&amp;apos;s context, Cu, using latent topics computed with LDA [5] on: 1) an indexed collection of tweets for epidemic intelligence; and 2) the hash-tags that co-occur with this context.