SCRAPING THE SOCIAL GRAPH
      CRISIS MONITORING WITH SOCIAL MEDIA



               Georgetown University
                jongos@gmail.com
                     @jongos
About Ushahidi
Ushahidi is a free, open-source platform used for crowdsourcing and visualizing data geospatially. It was born out of the 2008 election unrest, when founders Juliana Rotich, Erik Hersman, Ory Okolloh and David Kobia wanted to give Kenyan citizens a way to send SMS reports of incidents and learn what was occurring around them. This was one of the earliest uses of crowdsourcing for crisis response.

Notable Uses
Ushahidi has been deployed in major global crisis scenarios, allowing organizations to draw situational awareness from the crowd. To date it has been downloaded over 15,000 times.

Some of the more notable deployments include, most recently, Egypt, as well as the Haiti earthquakes, the fires in Russia and the Queensland floods in Australia.

The Challenge
As the amount of data aggregated by Ushahidi users grows, they face a common problem. How do they effectively manage this realtime data? How can we help them discover credible and actionable information in the deluge of reports they'll get from the public? The SwiftRiver initiative was created to begin to answer some of these questions for Ushahidi deployers.
USHAHIDI HAITI
OILSPILL CRISIS MAP
UCHAGUZI
RUSSIAN FIRES “HELP MAP”
PAKREPORT
TUBESTRIKE CROWDMAP
PRAGUEWATCH
HARASSMAP
U-SHAHID
CHRISTCHURCH
SINSAI.INFO
“It’s not information overload. It’s filter failure.”
                                            - Clay Shirky
PLATFORM GOALS
Consider the context, relevance defined by the user

Offer an opt-in global database of trust and authority

Algorithms augment, but do not define, human decision making

Work across media channels (Twitter, Email, Feeds, SMS)

Be accessible (offline/online/mobile)

Index massive amounts of the mobile/social web
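
As a rough illustration of the "work across media channels" goal, the sketch below maps items from different channels onto one common record so that later filtering does not need to care where an item came from. The `ContentItem` class, the `normalize` function and every payload field name are hypothetical and chosen for this example; they are not SwiftRiver's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ContentItem:
    """Channel-independent record (hypothetical shape, for illustration only)."""
    channel: str
    source: str
    text: str
    received_at: datetime


# Hypothetical payload field names per channel: (source field, text field).
FIELD_MAP = {
    "twitter": ("screen_name", "text"),
    "email": ("from", "body"),
    "feed": ("feed_url", "summary"),
    "sms": ("msisdn", "message"),
}


def normalize(channel: str, raw: dict) -> ContentItem:
    """Map a channel-specific payload onto the common ContentItem record."""
    if channel not in FIELD_MAP:
        raise ValueError(f"unsupported channel: {channel}")
    source_key, text_key = FIELD_MAP[channel]
    # Collapse runs of whitespace so the same text compares equal across channels.
    text = " ".join(str(raw[text_key]).split())
    received_at = datetime.fromtimestamp(raw["timestamp"], tz=timezone.utc)
    return ContentItem(channel, str(raw[source_key]), text, received_at)


item = normalize("sms", {"msisdn": "+000000000", "message": "  road   blocked ", "timestamp": 0})
```

Keeping the per-channel differences in one lookup table means adding a channel is a one-line change, which fits the goal of indexing many sources without rewriting the filters downstream.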
KNC AWARD & RIVER ID

final component of the veracity algorithm

needs to be able to scale massively

changing the backend (Hadoop & MongoDB)

research by data scientists

use-cases at scale and iterative improvements
THIS IS A DATA PROBLEM
PROGRESS

7,000+ downloads in 6 months

7,000+ API Users

100,000+ Lines of code

5 APIs and 2 Apps

Data Items Processed - 70,000,000 (liberal extrapolation)
Sweeper - User Interface
NETWORK DYNAMICS




Good crowdsourcing campaigns build upon the existing ties
between people and their networks. There's a natural multiplier,
where the people in the original network become
nodes for new networks and so on.
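
The multiplier can be made concrete with a small sketch that counts how many new people a campaign reaches at each hop away from its original participants. The graph, the names in it and the `reach_by_hop` function are invented for illustration; this is a plain breadth-first traversal, not a method taken from the slides.

```python
def reach_by_hop(graph: dict, seeds: list) -> list:
    """Return how many people are newly reached at each hop from the seeds."""
    seen = set(seeds)
    frontier = list(seen)
    counts = [len(seen)]  # hop 0: the original network
    while frontier:
        next_frontier = []
        for person in frontier:
            for contact in graph.get(person, ()):
                if contact not in seen:
                    seen.add(contact)
                    next_frontier.append(contact)
        if next_frontier:
            counts.append(len(next_frontier))
        frontier = next_frontier
    return counts


# Hypothetical contact graph: each person lists who they pass the campaign on to.
graph = {
    "ana": ["ben", "cai"],
    "ben": ["dee", "eli"],
    "cai": ["eli", "fay"],
}

hops = reach_by_hop(graph, ["ana"])  # [1, 2, 3]
```

Here one seed reaches two people directly and three more through them; overlapping contacts (both "ben" and "cai" know "eli") are counted once, which is why real reach grows more slowly than a naive multiplication would suggest.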
EARNING TRUST
❖ Participation is permission
❖ Consent is not carte blanche
❖ Clarity is critical
❖ Trust is Earned or Burned
❖ Transparency is hard to teach
PRIVACY
❖ Protection of data is different from the protection of people/identity
❖ Standards like HTTPS and SSL/TLS
❖ Encryption
❖ Anonymity is not a given (Tor Project)
❖ The usual fail-points are still threats (weak passwords, compromised servers, careless employees)
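
One way to act on the difference between protecting data and protecting identity is to store a keyed pseudonym instead of a contributor's raw phone number or handle. The sketch below uses Python's standard `hmac` and `hashlib` modules; the `pseudonymize` function and the key handling are assumptions made for this example, not something described in the slides.

```python
import hashlib
import hmac


def pseudonymize(contributor_id: str, secret_key: bytes) -> str:
    """Return a stable keyed pseudonym for a contributor identifier."""
    # Normalize so the same contributor always maps to the same pseudonym.
    normalized = contributor_id.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Illustrative key only; a real deployment would load it from protected storage.
key = b"example-key-do-not-reuse"
alias = pseudonymize(" Reporter@Example.org ", key)
same_alias = pseudonymize("reporter@example.org", key)
```

Reports from the same person stay linkable because the pseudonym is stable, but this is pseudonymity, not anonymity: anyone who obtains the key can recompute pseudonyms for guessed identifiers, so a compromised server or careless key handling still exposes contributors.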
VALIDATION
❖ Verify factual occurrences (location, time, date)
❖ Verify contributor identity (who?)
❖ Verify contributor credentials

Everything beyond these three points is an educated guess. Anyone looking to game the campaign will only be effective if they are able to compromise these three checks.
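
The three checks can be recorded per report with a sketch like the one below. It assumes a report carries coordinates, a timestamp and a contributor identifier, and that the deployment keeps its own sets of known and credentialed contributors; the `Report` class, the `check_report` function and all parameter names are hypothetical. Passing these checks only shows that a report is plausible, so human verification is still needed.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Report:
    lat: float
    lon: float
    occurred_at: datetime
    contributor_id: str


def check_report(report, bbox, window, known_contributors, credentialed):
    """Return the outcome of the three validation checks for one report."""
    south, west, north, east = bbox
    start, end = window
    in_area = south <= report.lat <= north and west <= report.lon <= east
    in_window = start <= report.occurred_at <= end
    return {
        "occurrence": in_area and in_window,  # location, time, date
        "identity": report.contributor_id in known_contributors,
        "credentials": report.contributor_id in credentialed,
    }


# Made-up values for illustration.
window = (datetime(2011, 1, 1, tzinfo=timezone.utc), datetime(2011, 1, 31, tzinfo=timezone.utc))
report = Report(5.0, 5.0, datetime(2011, 1, 15, tzinfo=timezone.utc), "c-17")
result = check_report(report, (0.0, 0.0, 10.0, 10.0), window, {"c-17", "c-42"}, {"c-42"})
```

Returning the three outcomes separately, rather than a single pass/fail flag, lets a human reviewer see which check failed and decide how much weight to give the report.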
MOTIVATION
❖ Ease of participation
❖ Low risk of failure or shame
❖ Social Capital
❖ Repute & Accolade
❖ Barter
❖ Strategic Spending ($)
❖ Data Sharing
❖ Altruism & Charity
THANKS!
Knight News Challenge
   jg@swiftly.org
     @swiftriver
