SlideShare una empresa de Scribd logo
1 de 27
Twitter: A source of Big Data in a
NSO.
Experimental Indicators in Sentiment and Mobility
The ESS Big Data Workshop 2016
Objective
To share the experience of INEGI in the use of Twitter as a
big data source.
Types of Big Data Sources
• Meters and smart sensors. Traffic cameras, GPS devices, power
consume meters, IoT, smartwatches, smartphones, etc.
• Social interactions. Conversations and publications on social
networks like Twitter, Facebook, FourSquare, etc.
• Business transactions. Credit cards movements, scanned data,
cell phone records, etc.
• Electronic files. Documents which are available in electronic
formats such as PDF files, websites, videos, audio, images, photos,
etc.
• Broadcast media. Digital video and audio streamed on real-time
Types of Big Data Sources
• Meters and smart sensors. Traffic cameras, GPS devices,
power consume meters, IoT, smartwatches, smartphones, etc.
• Social interactions. Conversations and publications on social
networks like Twitter, Facebook, FourSquare, etc.
• Business transactions. Credit cards movements, electronic
cash registers, cell phone records, etc.
• Electronic files. Documents which are available in electronic
formats such as PDF files, websites, videos, audio, digital media
broadcasting
• Broadcast media. Digital video and audio streamed on real-time
Process Followed by INEGI (until now)
Production process
Study Case
• Initial objective of INEGI’s Big Data Project: To generate
experimental indicators using Big Data techniques with
social media data, to complement statistical information
obtained from traditional methods and sources.
• Initial Goal: To obtain indicators of subjective wellbeing
from social media data sources.
Why did we choose Twitter as a data
source?
• It’s a widely adopted social network where you can find content
written by common people
• Tweets are public, so we can use them without concerns about
privacy
• There is a free API which allows to get up to 1% of the tweets
that are being produced on real time
• ( https://dev.twitter.com/streaming )
In the beginning…
February 2014 – October 2015
24/7
Collection / Analysis infrastructure
Since October 2015 - …
(EMC / VMWare)
LOGSTASH
Location Query
Free Access
https://twittercommunity.com/t/stream-filter-sample-size/30865
Apache Spark
Clean & Sentiment
Analysis
Tweets
Daily
Processing
(4 a.m.)
300 K
Geo-Tweets
Minimal
Representation
~260 Millions
of Geo-Tweets
~130 Millions inside Mexico
~ 2 Years and 8 Months ~ 24/7
Multivariate Stratification
of blocks in Apache Spark
%Acceso a Internet, %Pc, %Telefono Celular, %Automovil
Software Stack (2016)
Tweet Structure
Tweets Map
Tweets Map
We found that
• The JSON structure is easy to process (APACHE SPARK)
• The content is text that we can examine to make the
sentiment analysis
• Geographical coordinates can be used to filter the tweets
and obtain only those of interest (warning: not all the
tweets are geo referenced)
• We can make mobility analysis, based in Tweets’ location
and time (a serendipity)
Exploring Big Data Base
Supervised Training
Manual Tagging
5000 people (TecMilenio students),
100 tweets tagged by each one,
each tweet was tagged nine times,
about 40,000 different tagged tweets,
interpreted accordingly to regional
idioms
http://cienciadedatos.inegi.org.mx/animotuitero/
They’re not enchiladas!
Training and modeling Collaborative
Research
Python
Implementation
Sentiment Visualization (2015)
(Monthly Indicator)
http://www.inegi.org.mx/inegi/contenidos/investigacion/experimentales/animotuitero/default.aspx
{JSON:File}
Sentiment Visualization (Late 2016)
(Daily Indicator)
C#
{RESTful:API}
{NoSQL}
Mobility (Late 2016)
Integration of other sources
Big Spatial Join
(5 M Economic Units with +60 M Tweets)
https://github.com/syoummer/SpatialSpark
Some applications
• Tourism
• Migration
• Use of roads
• Regional influence of big cities
• Mobility patterns
• Business activity patterns
• Subjective wellbeing
• Inequities impact
• Impact analysis of relevant
news
• Mental health
• Misogynist/discriminatory
language use
• SDG indicators?
Collaboration
• International
– UNECE
• ICHEC
– UNSD
– LAMBDoop
– University of Pensylvania
• National
– KioNetworks
– Dattlas
– TecMilenio
– INFOTEC
– Centro Geo
– CIDE
– CIMAT
– Sectur
• Internal
– INEGI General Directorates
Questions?
Abel.Coronado@inegi.org.mx
M.Sc. Abel Coronado
@abxda
Conociendo México
01 800 111 46 34
www.inegi.org.mx
atencion.usuarios@inegi.org.mx
@inegi_informa INEGI Informa

Más contenido relacionado

La actualidad más candente

The Critical Role of IoT Data Integration to develop Big Data Applications (f...
The Critical Role of IoT Data Integration to develop Big Data Applications (f...The Critical Role of IoT Data Integration to develop Big Data Applications (f...
The Critical Role of IoT Data Integration to develop Big Data Applications (f...Rainer Sternfeld
 
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionGetting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionSplunk
 
2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledgeChristopher Williams
 
Exploring the Great Olympian Graph
Exploring the Great Olympian GraphExploring the Great Olympian Graph
Exploring the Great Olympian GraphNeo4j
 
From Developer to Data Scientist
From Developer to Data ScientistFrom Developer to Data Scientist
From Developer to Data ScientistGaines Kergosien
 
Schneller Nutzen mit Neo4j: das Beispiel Panama Papers
Schneller Nutzen mit Neo4j: das Beispiel Panama PapersSchneller Nutzen mit Neo4j: das Beispiel Panama Papers
Schneller Nutzen mit Neo4j: das Beispiel Panama PapersNeo4j
 
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...South London Geek Nights
 
Applying large scale text analytics with graph databases
Applying large scale text analytics with graph databasesApplying large scale text analytics with graph databases
Applying large scale text analytics with graph databasesMarissa Kobylenski
 
A Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsA Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsOscar Corcho
 
Hardcore Data Science - in Practice
Hardcore Data Science - in PracticeHardcore Data Science - in Practice
Hardcore Data Science - in PracticeMikio L. Braun
 
Network and IT Operations
Network and IT OperationsNetwork and IT Operations
Network and IT OperationsNeo4j
 
Big data - An Introduction
Big data - An IntroductionBig data - An Introduction
Big data - An IntroductionSpotle.ai
 
What you need to know to start an AI company?
What you need to know to start an AI company?What you need to know to start an AI company?
What you need to know to start an AI company?Mo Patel
 
Data analytics using the cloud challenges and opportunities for india
Data analytics using the cloud   challenges and opportunities for india Data analytics using the cloud   challenges and opportunities for india
Data analytics using the cloud challenges and opportunities for india Ajay Ohri
 
Skillshare - Let's talk about R in Data Journalism
Skillshare - Let's talk about R in Data JournalismSkillshare - Let's talk about R in Data Journalism
Skillshare - Let's talk about R in Data JournalismSchool of Data
 
Introduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & ApplicationsIntroduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & ApplicationsNguyen Cao
 
Autodiscovery or The long tail of open data
Autodiscovery or The long tail of open dataAutodiscovery or The long tail of open data
Autodiscovery or The long tail of open dataConnected Data World
 

La actualidad más candente (20)

The Critical Role of IoT Data Integration to develop Big Data Applications (f...
The Critical Role of IoT Data Integration to develop Big Data Applications (f...The Critical Role of IoT Data Integration to develop Big Data Applications (f...
The Critical Role of IoT Data Integration to develop Big Data Applications (f...
 
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout SessionGetting Started with Splunk Breakout Session
Getting Started with Splunk Breakout Session
 
R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 
2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge2017-01-08-scaling tribalknowledge
2017-01-08-scaling tribalknowledge
 
Exploring the Great Olympian Graph
Exploring the Great Olympian GraphExploring the Great Olympian Graph
Exploring the Great Olympian Graph
 
From Developer to Data Scientist
From Developer to Data ScientistFrom Developer to Data Scientist
From Developer to Data Scientist
 
Schneller Nutzen mit Neo4j: das Beispiel Panama Papers
Schneller Nutzen mit Neo4j: das Beispiel Panama PapersSchneller Nutzen mit Neo4j: das Beispiel Panama Papers
Schneller Nutzen mit Neo4j: das Beispiel Panama Papers
 
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
NoSQL: what does it mean, how did we get here, and why should I care? - Hugo ...
 
Applying large scale text analytics with graph databases
Applying large scale text analytics with graph databasesApplying large scale text analytics with graph databases
Applying large scale text analytics with graph databases
 
A Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's DatasetsA Linked Data Dataset for Madrid Transport Authority's Datasets
A Linked Data Dataset for Madrid Transport Authority's Datasets
 
Hardcore Data Science - in Practice
Hardcore Data Science - in PracticeHardcore Data Science - in Practice
Hardcore Data Science - in Practice
 
It takes a village (to raise a ML model)
It takes a village (to raise a ML model)It takes a village (to raise a ML model)
It takes a village (to raise a ML model)
 
Network and IT Operations
Network and IT OperationsNetwork and IT Operations
Network and IT Operations
 
Big data - An Introduction
Big data - An IntroductionBig data - An Introduction
Big data - An Introduction
 
What you need to know to start an AI company?
What you need to know to start an AI company?What you need to know to start an AI company?
What you need to know to start an AI company?
 
Data analytics using the cloud challenges and opportunities for india
Data analytics using the cloud   challenges and opportunities for india Data analytics using the cloud   challenges and opportunities for india
Data analytics using the cloud challenges and opportunities for india
 
Skillshare - Let's talk about R in Data Journalism
Skillshare - Let's talk about R in Data JournalismSkillshare - Let's talk about R in Data Journalism
Skillshare - Let's talk about R in Data Journalism
 
Resume (kaushik shakkari)
Resume (kaushik shakkari)Resume (kaushik shakkari)
Resume (kaushik shakkari)
 
Introduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & ApplicationsIntroduction to Big Data Technologies & Applications
Introduction to Big Data Technologies & Applications
 
Autodiscovery or The long tail of open data
Autodiscovery or The long tail of open dataAutodiscovery or The long tail of open data
Autodiscovery or The long tail of open data
 

Destacado

Ejemplos de Proyectos de Ciencia de Datos y Big Data en el INEGI
Ejemplos de Proyectos de Ciencia de Datos y Big Data en el INEGIEjemplos de Proyectos de Ciencia de Datos y Big Data en el INEGI
Ejemplos de Proyectos de Ciencia de Datos y Big Data en el INEGIAbel Alejandro Coronado Iruegas
 
Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...
Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...
Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...Abel Alejandro Coronado Iruegas
 
Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014
Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014
Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014Abel Alejandro Coronado Iruegas
 
Revelando los secretos de twitter, Festival de Software Libre 2014
Revelando los secretos de twitter, Festival de Software Libre 2014Revelando los secretos de twitter, Festival de Software Libre 2014
Revelando los secretos de twitter, Festival de Software Libre 2014Abel Alejandro Coronado Iruegas
 

Destacado (19)

Ejemplos de Proyectos de Ciencia de Datos y Big Data en el INEGI
Ejemplos de Proyectos de Ciencia de Datos y Big Data en el INEGIEjemplos de Proyectos de Ciencia de Datos y Big Data en el INEGI
Ejemplos de Proyectos de Ciencia de Datos y Big Data en el INEGI
 
Big data taller inegi sedesol
Big data taller inegi sedesolBig data taller inegi sedesol
Big data taller inegi sedesol
 
PresentacionParaINFOTEC
PresentacionParaINFOTECPresentacionParaINFOTEC
PresentacionParaINFOTEC
 
Scala 1
Scala 1Scala 1
Scala 1
 
Realidades y Sueños de Big Data en México
Realidades y Sueños de Big Data en MéxicoRealidades y Sueños de Big Data en México
Realidades y Sueños de Big Data en México
 
Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...
Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...
Revelando los secretos de las redes sociales, Universidad Autónoma de Aguasca...
 
Taller de Big Data y Ciencia de Datos en COLMEX dia 2
Taller de Big Data y Ciencia de Datos en COLMEX dia 2Taller de Big Data y Ciencia de Datos en COLMEX dia 2
Taller de Big Data y Ciencia de Datos en COLMEX dia 2
 
Big data lead colmex
Big data lead colmexBig data lead colmex
Big data lead colmex
 
Taller de Big Data y Ciencia de Datos en COLMEX dia 1
Taller de Big Data y Ciencia de Datos en COLMEX dia 1 Taller de Big Data y Ciencia de Datos en COLMEX dia 1
Taller de Big Data y Ciencia de Datos en COLMEX dia 1
 
Big data big opportunities
Big data big opportunitiesBig data big opportunities
Big data big opportunities
 
Que es big data huejutla uaeh
Que es big data huejutla uaehQue es big data huejutla uaeh
Que es big data huejutla uaeh
 
Explorando Big Data y Ciencia de Datos con GPUs
Explorando Big Data y Ciencia de Datos con GPUsExplorando Big Data y Ciencia de Datos con GPUs
Explorando Big Data y Ciencia de Datos con GPUs
 
¿Qué es big data?
¿Qué es big data?¿Qué es big data?
¿Qué es big data?
 
Revelando los secretos de twitter en México sg virtual
Revelando los secretos de twitter en México sg virtualRevelando los secretos de twitter en México sg virtual
Revelando los secretos de twitter en México sg virtual
 
Anatomía de un proyecto de Big Data
Anatomía de un proyecto de Big DataAnatomía de un proyecto de Big Data
Anatomía de un proyecto de Big Data
 
Geo Big Data 2015
Geo Big Data 2015 Geo Big Data 2015
Geo Big Data 2015
 
Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014
Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014
Big Data, Revelando los secretos de twitter, CIMAT Zacatecas 2014
 
Revelando los secretos de las redes sociales
Revelando los secretos de las redes socialesRevelando los secretos de las redes sociales
Revelando los secretos de las redes sociales
 
Revelando los secretos de twitter, Festival de Software Libre 2014
Revelando los secretos de twitter, Festival de Software Libre 2014Revelando los secretos de twitter, Festival de Software Libre 2014
Revelando los secretos de twitter, Festival de Software Libre 2014
 

Similar a INEGI ESS big data workshop

Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...Enrico Motta
 
Information Management Trends 2009
Information Management Trends 2009Information Management Trends 2009
Information Management Trends 2009Christopher Eagle
 
Big data analytics with Apache Hadoop
Big data analytics with Apache  HadoopBig data analytics with Apache  Hadoop
Big data analytics with Apache HadoopSuman Saurabh
 
Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...
Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...
Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...Paolo Nesi
 
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Artificial Intelligence Institute at UofSC
 
SFX, Metalib, mobile services and social networks
SFX, Metalib, mobile services and social networksSFX, Metalib, mobile services and social networks
SFX, Metalib, mobile services and social networksIan Clark
 
MAX: Realtime messaging and activity stream engine
MAX: Realtime messaging and activity stream engineMAX: Realtime messaging and activity stream engine
MAX: Realtime messaging and activity stream enginesneridagh
 
Data sciences and marketing analytics
Data sciences and marketing analyticsData sciences and marketing analytics
Data sciences and marketing analyticsMJ Xavier
 
Itsme_MBD@DomusAcademy04/2009
Itsme_MBD@DomusAcademy04/2009Itsme_MBD@DomusAcademy04/2009
Itsme_MBD@DomusAcademy04/2009Itsmecommunity
 
Social Media Analytics for Official Statistics
Social Media Analytics for Official StatisticsSocial Media Analytics for Official Statistics
Social Media Analytics for Official StatisticsIsmail Fahmi
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us? Andrea Volpini
 
FinalPPT-StJoseph (3).pptx
FinalPPT-StJoseph (3).pptxFinalPPT-StJoseph (3).pptx
FinalPPT-StJoseph (3).pptxssuser046cf5
 
SuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-finalSuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-finalstelligence
 
(Some of) Wikipedia's Open Data
(Some of) Wikipedia's Open Data(Some of) Wikipedia's Open Data
(Some of) Wikipedia's Open Datanuria_ruiz
 
Ontology Building vs Data Harvesting and Cleaning for Smart-city Services
Ontology Building vs Data Harvesting and Cleaning for Smart-city ServicesOntology Building vs Data Harvesting and Cleaning for Smart-city Services
Ontology Building vs Data Harvesting and Cleaning for Smart-city ServicesPaolo Nesi
 
Wherever Your Patrons Are: Mobile Services for Libraries
Wherever Your Patrons Are: Mobile Services for LibrariesWherever Your Patrons Are: Mobile Services for Libraries
Wherever Your Patrons Are: Mobile Services for LibrariesMeredith Farkas
 
Michael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform World
Michael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform WorldMichael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform World
Michael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform WorldRadiocamp 2011
 
Presentation of context: Web Annotations (& Pundit) during the StoM Project (...
Presentation of context: Web Annotations (& Pundit) during the StoM Project (...Presentation of context: Web Annotations (& Pundit) during the StoM Project (...
Presentation of context: Web Annotations (& Pundit) during the StoM Project (...Net7
 
Python report on twitter sentiment analysis
Python report on twitter sentiment analysisPython report on twitter sentiment analysis
Python report on twitter sentiment analysisAntaraBhattacharya12
 

Similar a INEGI ESS big data workshop (20)

Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
 
Information Management Trends 2009
Information Management Trends 2009Information Management Trends 2009
Information Management Trends 2009
 
Big data analytics with Apache Hadoop
Big data analytics with Apache  HadoopBig data analytics with Apache  Hadoop
Big data analytics with Apache Hadoop
 
Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...
Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...
Snap4City: Smart City IOT/IOE Platform scalable Smart aNalytic APplication bu...
 
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
 
SFX, Metalib, mobile services and social networks
SFX, Metalib, mobile services and social networksSFX, Metalib, mobile services and social networks
SFX, Metalib, mobile services and social networks
 
MAX: Realtime messaging and activity stream engine
MAX: Realtime messaging and activity stream engineMAX: Realtime messaging and activity stream engine
MAX: Realtime messaging and activity stream engine
 
Data sciences and marketing analytics
Data sciences and marketing analyticsData sciences and marketing analytics
Data sciences and marketing analytics
 
Itsme_MBD@DomusAcademy04/2009
Itsme_MBD@DomusAcademy04/2009Itsme_MBD@DomusAcademy04/2009
Itsme_MBD@DomusAcademy04/2009
 
Social Media Analytics for Official Statistics
Social Media Analytics for Official StatisticsSocial Media Analytics for Official Statistics
Social Media Analytics for Official Statistics
 
What do we want computers to do for us?
What do we want computers to do for us? What do we want computers to do for us?
What do we want computers to do for us?
 
FinalPPT-StJoseph (3).pptx
FinalPPT-StJoseph (3).pptxFinalPPT-StJoseph (3).pptx
FinalPPT-StJoseph (3).pptx
 
SuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-finalSuanIct-Bigdata desktop-final
SuanIct-Bigdata desktop-final
 
(Some of) Wikipedia's Open Data
(Some of) Wikipedia's Open Data(Some of) Wikipedia's Open Data
(Some of) Wikipedia's Open Data
 
Ontology Building vs Data Harvesting and Cleaning for Smart-city Services
Ontology Building vs Data Harvesting and Cleaning for Smart-city ServicesOntology Building vs Data Harvesting and Cleaning for Smart-city Services
Ontology Building vs Data Harvesting and Cleaning for Smart-city Services
 
Wherever Your Patrons Are: Mobile Services for Libraries
Wherever Your Patrons Are: Mobile Services for LibrariesWherever Your Patrons Are: Mobile Services for Libraries
Wherever Your Patrons Are: Mobile Services for Libraries
 
Michael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform World
Michael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform WorldMichael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform World
Michael Yoch (NPR) - The NPR API: Powering "Radio" in a Multiplatform World
 
Transmission Media for Networking
Transmission Media for NetworkingTransmission Media for Networking
Transmission Media for Networking
 
Presentation of context: Web Annotations (& Pundit) during the StoM Project (...
Presentation of context: Web Annotations (& Pundit) during the StoM Project (...Presentation of context: Web Annotations (& Pundit) during the StoM Project (...
Presentation of context: Web Annotations (& Pundit) during the StoM Project (...
 
Python report on twitter sentiment analysis
Python report on twitter sentiment analysisPython report on twitter sentiment analysis
Python report on twitter sentiment analysis
 

Más de Abel Alejandro Coronado Iruegas (12)

Mobility Master Class.pdf
Mobility Master Class.pdfMobility Master Class.pdf
Mobility Master Class.pdf
 
Live UAEMex Cubo de Datos Geoespaciales de Mexico
Live UAEMex Cubo de Datos Geoespaciales de MexicoLive UAEMex Cubo de Datos Geoespaciales de Mexico
Live UAEMex Cubo de Datos Geoespaciales de Mexico
 
Cubo de datos uaemex
Cubo de datos uaemexCubo de datos uaemex
Cubo de datos uaemex
 
Geo Big Data 4 Datalab
Geo Big Data 4 DatalabGeo Big Data 4 Datalab
Geo Big Data 4 Datalab
 
Catedra INEGI Big Data en IBERO
Catedra INEGI Big Data en IBEROCatedra INEGI Big Data en IBERO
Catedra INEGI Big Data en IBERO
 
Integrating eo with official statistics using machine learning in mexico geo ...
Integrating eo with official statistics using machine learning in mexico geo ...Integrating eo with official statistics using machine learning in mexico geo ...
Integrating eo with official statistics using machine learning in mexico geo ...
 
Machine learning and Satellite Images
Machine learning and Satellite ImagesMachine learning and Satellite Images
Machine learning and Satellite Images
 
El Cubo de Datos Geoespaciales de Mexico
El Cubo de Datos Geoespaciales de MexicoEl Cubo de Datos Geoespaciales de Mexico
El Cubo de Datos Geoespaciales de Mexico
 
No Sql
No SqlNo Sql
No Sql
 
Cubo de Datos Geoespaciales de Mexico
Cubo de Datos Geoespaciales de MexicoCubo de Datos Geoespaciales de Mexico
Cubo de Datos Geoespaciales de Mexico
 
Congreso UAA 2018 Animo Tuitero 2 0
Congreso UAA 2018 Animo Tuitero 2 0Congreso UAA 2018 Animo Tuitero 2 0
Congreso UAA 2018 Animo Tuitero 2 0
 
Analisis del Sentimiento en el Estado de Animo de los Tuiteros en Mexico
Analisis del Sentimiento en el Estado de Animo de los Tuiteros en MexicoAnalisis del Sentimiento en el Estado de Animo de los Tuiteros en Mexico
Analisis del Sentimiento en el Estado de Animo de los Tuiteros en Mexico
 

Último

Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 

Último (20)

Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 

INEGI ESS big data workshop

  • 1. Twitter: A source of Big Data in a NSO. Experimental Indicators in Sentiment and Mobility The ESS Big Data Workshop 2016
  • 2. Objective To share the experience of INEGI in the use of Twitter as a big data source.
  • 3. Types of Big Data Sources • Meters and smart sensors. Traffic cameras, GPS devices, power consume meters, IoT, smartwatches, smartphones, etc. • Social interactions. Conversations and publications on social networks like Twitter, Facebook, FourSquare, etc. • Business transactions. Credit cards movements, scanned data, cell phone records, etc. • Electronic files. Documents which are available in electronic formats such as PDF files, websites, videos, audio, images, photos, etc. • Broadcast media. Digital video and audio streamed on real-time
  • 4. Types of Big Data Sources • Meters and smart sensors. Traffic cameras, GPS devices, power consume meters, IoT, smartwatches, smartphones, etc. • Social interactions. Conversations and publications on social networks like Twitter, Facebook, FourSquare, etc. • Business transactions. Credit cards movements, electronic cash registers, cell phone records, etc. • Electronic files. Documents which are available in electronic formats such as PDF files, websites, videos, audio, digital media broadcasting • Broadcast media. Digital video and audio streamed on real-time
  • 5. Process Followed by INEGI (until now)
  • 7. Study Case • Initial objective of INEGI’s Big Data Project: To generate experimental indicators using Big Data techniques with social media data, to complement statistical information obtained from traditional methods and sources. • Initial Goal: To obtain indicators of subjective wellbeing from social media data sources.
  • 8. Why did we choose Twitter as a data source? • It’s a widely adopted social network where you can find content written by common people • Tweets are public, so we can use them without concerns about privacy • There is a free API which allows to get up to 1% of the tweets that are being produced on real time • ( https://dev.twitter.com/streaming )
  • 9. In the beginning… February 2014 – October 2015 24/7
  • 10. Collection / Analysis infrastructure Since October 2015 - … (EMC / VMWare) LOGSTASH Location Query Free Access https://twittercommunity.com/t/stream-filter-sample-size/30865 Apache Spark Clean & Sentiment Analysis Tweets Daily Processing (4 a.m.) 300 K Geo-Tweets Minimal Representation ~260 Millions of Geo-Tweets ~130 Millions inside Mexico ~ 2 Years and 8 Months ~ 24/7
  • 11. Multivariate Stratification of blocks in Apache Spark %Acceso a Internet, %Pc, %Telefono Celular, %Automovil
  • 16. We found that • The JSON structure is easy to process (APACHE SPARK) • The content is text that we can examine to make the sentiment analysis • Geographical coordinates can be used to filter the tweets and obtain only those of interest (warning: not all the tweets are geo referenced) • We can make mobility analysis, based in Tweets’ location and time (a serendipity)
  • 18. Supervised Training Manual Tagging 5000 people (TecMilenio students), 100 tweets tagged by each one, each tweet was tagged nine times, about 40,000 different tagged tweets, interpreted accordingly to regional idioms http://cienciadedatos.inegi.org.mx/animotuitero/ They’re not enchiladas!
  • 19. Training and modeling Collaborative Research Python Implementation
  • 20. Sentiment Visualization (2015) (Monthly Indicator) http://www.inegi.org.mx/inegi/contenidos/investigacion/experimentales/animotuitero/default.aspx {JSON:File}
  • 21. Sentiment Visualization (Late 2016) (Daily Indicator) C# {RESTful:API} {NoSQL}
  • 23. Integration of other sources Big Spatial Join (5 M Economic Units with +60 M Tweets) https://github.com/syoummer/SpatialSpark
  • 24. Some applications • Tourism • Migration • Use of roads • Regional influence of big cities • Mobility patterns • Business activity patterns • Subjective wellbeing • Inequities impact • Impact analysis of relevant news • Mental health • Misogynist/discriminatory language use • SDG indicators?
  • 25. Collaboration • International – UNECE • ICHEC – UNSD – LAMBDoop – University of Pensylvania • National – KioNetworks – Dattlas – TecMilenio – INFOTEC – Centro Geo – CIDE – CIMAT – Sectur • Internal – INEGI General Directorates
  • 27. Conociendo México 01 800 111 46 34 www.inegi.org.mx atencion.usuarios@inegi.org.mx @inegi_informa INEGI Informa