SlideShare a Scribd company logo
1 of 7
Download to read offline
Twitter is noisy
but you can find some diamonds
Pew Internet report:


“75% of online news consumers say they get
  news forwarded through email or posts on
  social networking sites
and 52% say they share links to news with
  others via those means.”
Twitter Lists
• Filtering main friends timeline is a bad idea
• Twitter Lists: manually created set of users who
  often post on a certain topic
• For example:
   – @huffingtonpost/apple-news
   – @IndieFlix/film-people-to-follow
   – @alisohani/bigdata-analytics
• A Twitter user can be included into different lists.
• Me for example:
  http://twitter.com/mariagrineva/lists/memberships
What kind of noise?
• People tweet on other topics too, including
  personal stuff



• Global news widely spread, often really
  annoying: IPad launch, ash clouds, Christmas,
  Michael Jackson
Our Approach
• Identifying niche topic of Twitter list
  automatically, at real-time
• Improve the niche topic with respect to the
  Global Twitter Stream
  – If there is a burst related to Apple, IPad => check
    maybe all Twitter is talking about that
Filtering = Classification
• Traditional approaches to filter news use only
  textual features
• We use both textual and social features for
  classification
  – Twitter lists is a community of interconnected
    users => see who is the center and who is an
    outsider
What is done
• Method for identification list’s topic
  signature with respect to Global Twitter
  Stream
• Social features identification
• Evaluation framework

More Related Content

Viewers also liked

Analytics for the Real-Time Web
Analytics for the Real-Time WebAnalytics for the Real-Time Web
Analytics for the Real-Time Webmaria.grineva
 
Semantic Data Search and Analysis Using Web-based User-Generated Knowledge Bases
Semantic Data Search and Analysis Using Web-based User-Generated Knowledge BasesSemantic Data Search and Analysis Using Web-based User-Generated Knowledge Bases
Semantic Data Search and Analysis Using Web-based User-Generated Knowledge Basesmaria.grineva
 
XQuery Triggers in Native XML Database Sedna
XQuery Triggers in Native XML Database SednaXQuery Triggers in Native XML Database Sedna
XQuery Triggers in Native XML Database Sednamaria.grineva
 
Architecture of Native XML Database Sedna
Architecture of Native XML Database SednaArchitecture of Native XML Database Sedna
Architecture of Native XML Database Sednamaria.grineva
 
Tqr 2013 probes proxies
Tqr 2013 probes proxies Tqr 2013 probes proxies
Tqr 2013 probes proxies An Jacobs
 
Using Social Media to Build Your Business
Using Social Media to Build Your BusinessUsing Social Media to Build Your Business
Using Social Media to Build Your BusinessStuart Ayling
 
Marketing Success in 7 Steps Guaranteed
Marketing Success in 7 Steps GuaranteedMarketing Success in 7 Steps Guaranteed
Marketing Success in 7 Steps GuaranteedStuart Ayling
 
Defining the User Experience of Emotion & Content
Defining the User Experience of Emotion & ContentDefining the User Experience of Emotion & Content
Defining the User Experience of Emotion & ContentJason Wishard
 
Analytics for the Real-Time Web
Analytics for the Real-Time WebAnalytics for the Real-Time Web
Analytics for the Real-Time Webmaria.grineva
 

Viewers also liked (12)

Analytics for the Real-Time Web
Analytics for the Real-Time WebAnalytics for the Real-Time Web
Analytics for the Real-Time Web
 
Semantic Data Search and Analysis Using Web-based User-Generated Knowledge Bases
Semantic Data Search and Analysis Using Web-based User-Generated Knowledge BasesSemantic Data Search and Analysis Using Web-based User-Generated Knowledge Bases
Semantic Data Search and Analysis Using Web-based User-Generated Knowledge Bases
 
XQuery Triggers in Native XML Database Sedna
XQuery Triggers in Native XML Database SednaXQuery Triggers in Native XML Database Sedna
XQuery Triggers in Native XML Database Sedna
 
Architecture of Native XML Database Sedna
Architecture of Native XML Database SednaArchitecture of Native XML Database Sedna
Architecture of Native XML Database Sedna
 
Tqr 2013 probes proxies
Tqr 2013 probes proxies Tqr 2013 probes proxies
Tqr 2013 probes proxies
 
Строителство Градът
Строителство ГрадътСтроителство Градът
Строителство Градът
 
Using Social Media to Build Your Business
Using Social Media to Build Your BusinessUsing Social Media to Build Your Business
Using Social Media to Build Your Business
 
Marketing Success in 7 Steps Guaranteed
Marketing Success in 7 Steps GuaranteedMarketing Success in 7 Steps Guaranteed
Marketing Success in 7 Steps Guaranteed
 
Development and Testing of a Conceptual Framework for Asking about Intoxication
Development and Testing of a Conceptual Framework for Asking about IntoxicationDevelopment and Testing of a Conceptual Framework for Asking about Intoxication
Development and Testing of a Conceptual Framework for Asking about Intoxication
 
The Social Research Group
The Social Research GroupThe Social Research Group
The Social Research Group
 
Defining the User Experience of Emotion & Content
Defining the User Experience of Emotion & ContentDefining the User Experience of Emotion & Content
Defining the User Experience of Emotion & Content
 
Analytics for the Real-Time Web
Analytics for the Real-Time WebAnalytics for the Real-Time Web
Analytics for the Real-Time Web
 

Recently uploaded

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Recently uploaded (20)

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

Filtering Twitter

  • 1. Twitter is noisy but you can find some diamonds
  • 2. Pew Internet report: “75% of online news consumers say they get news forwarded through email or posts on social networking sites and 52% say they share links to news with others via those means.”
  • 3. Twitter Lists • Filtering main friends timeline is a bad idea • Twitter Lists: manually created set of users who often post on a certain topic • For example: – @huffingtonpost/apple-news – @IndieFlix/film-people-to-follow – @alisohani/bigdata-analytics • A Twitter user can be included into different lists. • Me for example: http://twitter.com/mariagrineva/lists/memberships
  • 4. What kind of noise? • People tweet on other topics too, including personal stuff • Global news widely spread, often really annoying: IPad launch, ash clouds, Christmas, Michael Jackson
  • 5. Our Approach • Identifying niche topic of Twitter list automatically, at real-time • Improve the niche topic with respect to the Global Twitter Stream – If there is a burst related to Apple, IPad => check maybe all Twitter is talking about that
  • 6. Filtering = Classification • Traditional approaches to filter news use only textual features • We use both textual and social features for classification – Twitter lists is a community of interconnected users => see who is the center and who is an outsider
  • 7. What is done • Method for identification list’s topic signature with respect to Global Twitter Stream • Social features identification • Evaluation framework