Digital Trace Data for Demographic Research
Or How I Learned to Love Online Advertising
Ingmar Weber
@ingmarweber
June 12, 2019
Lecture at BIGSSS CSS 2019
Amazing Collaborators! (Alphabetical)
• Natalie Adler, Musa Al-Asad, Francesco Billari,
Antoine Dubois, Masoomali Fatehkia, Harsh
Gandhi, Manuel Garcia, Kiran Garimella, Krishna
Gummadi, Karri Haranko, Ridhi Kashyap, Yelena
Mejova, Alfredo Morales, Fabrizio Natale, Joao
Palotti, Tejas Rafaliya, Francesco Rampazzo, Marzia
Rango, Vedran Sekara, Vatsala Singh, Spyridon
Spyratos, Bogdan State, Reham Tamime, Michele
Vespe, Jeffrey Villaveces, Agnese Vitali, Emilio
Zagheni, …
What is Demography?
Demography is the statistical study of populations.
According to IP address 70.67.193.176, user Pbsouthwood and other
contributors to https://en.wikipedia.org/wiki/Demography
The Population Equation
Change in population = Inputs – Outputs
Inputs = Births + In-migration
Outputs = Deaths + Out-migration
• ∆P = (B + I) − (D + O)
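The balancing equation above can be written as a small function. This is a minimal sketch; the function name and the numbers are made up for illustration and do not come from the lecture.

```python
def population_change(births, in_migration, deaths, out_migration):
    """Return the change in population: (B + I) - (D + O)."""
    inputs = births + in_migration      # Inputs = Births + In-migration
    outputs = deaths + out_migration    # Outputs = Deaths + Out-migration
    return inputs - outputs


# Hypothetical yearly counts for a small region (illustration only).
delta_p = population_change(births=1200, in_migration=300,
                            deaths=900, out_migration=450)
print(delta_p)  # 150
```

The point of the slides that follow is that each of the four terms is hard to measure well in many countries, so the result is only as good as its weakest input.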
Fertility, Mortality and Migration
Quant: How much? Where? When?
• Births
- Birth registry: India: ~75%, Kenya: ~65%, Liberia: ~25% (2017)
• Deaths
- “Global Burden of Disease” (Murray and Lopez, 1997):
“Medically certified information is available for less than 30% of
the estimated 50.5 million deaths that occur each year
worldwide.”
• Migration
- “The size of the irregular migrant stock of the EU-27 in 2008
was measured to be between 1.9 and 3.8 million, a decline from
between 2.4 and 5.4 million in the EU-25 in 2005” (Kovacheva
and Vogel, 2009).
Qual: Why? How?
• Births
- Effect of religiosity, available childcare, …
• Deaths
- Ikigai: “reason to get up in the morning”
• Migration
- Push/pull factors, assimilation, …
Opportunities for New Methods
• Filling data gaps
– New data on migration, fertility, employment, …
• Explaining behavior
– Richer data, including networks and long-term history
• Predictive modeling
– Multi-modal forecasting
• Take a global perspective on things
– Facebook, Google, satellites know (almost) no borders
Goal is to augment, not replace, traditional approaches
Big Data is not a panacea
Rest of the Talk: Data-Centric
• Online advertising audience estimates
- Migration stocks, migrant assimilation
- Male mean-age-at-childbirth
- Ethics, limitations and challenges
• More non-obvious data sources
- Google Correlate, Followerwonk
- Even more non-obvious data sources
• Thoughts on interdisciplinary work
Facebook’s Audience Estimates
LinkedIn’s Audience Estimates
Female-to-male ratio LI users: 0.94
Female-to-male ratio LI users w/ AI: 0.27
Twitter’s Audience Estimates
Snapchat’s Audience Estimates
Female-to-male ratio SC users: 1.25
Female-to-male ratio SC users w/ STEM: 0.45
Google’s Impression Estimates
What they are actively researching or planning:
Baby and children’s products
Low-Cost Urban Census
http://fb-doha.qcri.org/
MIGRATION MONITORING
Expats Across US States
[Figure: labels "2014" and "2017"]
Expats Across Countries
[Figure: labels "2015" and "2017", with regression line]
Age-Specific Selection Biases
Bias Reduction via Model-Fitting
Mean out-of-sample absolute percentage error 37%,
down from 56% without origin-age bias correction
Adjusted R^2 = .70
Does not use GDP, language, internet penetration, …
z = age-gender group
i = country of birth
j = US state of residence
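The error figures quoted above are mean absolute percentage errors. As a rough sketch, and assuming the standard definition (the slides do not spell it out), the metric can be computed as below. The function name and the toy values are hypothetical and are not the study's data.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return the mean of |actual - predicted| / |actual|, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        total += abs(a - p) / abs(a)    # relative error for one (z, i, j) cell
    return 100.0 * total / len(actual)


# Toy values (illustration only): ground-truth counts vs. model predictions.
actual = [100.0, 200.0, 400.0]
predicted = [150.0, 180.0, 300.0]
mape = mean_absolute_percentage_error(actual, predicted)
print(round(mape, 2))  # 28.33
```

For the figure to be "out-of-sample", the predictions would have to come from a model that never saw the corresponding ground-truth values during fitting.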
QUANTIFYING MIGRANT ASSIMILATION
Do Refugees Share German Interests?
What interests to consider? Everybody likes “Music” and “Technology”.
How to interpret the score? High/low compared to European migrants?
Germans in DEU
FB Interests:
Football (90%)
Max Planck (70%)
Sauerkraut (40%)
…
Arabs in MENA
FB Interests:
Quran (80%)
Ibn Al-Haytham (60%)
Falafel (60%)
…
Arabs in DEU
FB Interests:
?
Obtaining an Assimilation Score
Migrant Group                    Assim. Score
Austrian migrants                .900
Spanish migrants                 .864
French migrants                  .803
Turkish-speaking migrants        .746
Arabic-speaking migrants         .643
A: Women, non-uni, 45-64         .461
A: Men, uni, 18-24               .677
• Experimental methodology: take with a ton, not just a grain of salt
• Needs to be validated externally
• Goals include finding “bridging” interests/patterns
REAL-TIME MIGRATION MONITOR
Visualization of Venezuelan Exodus
Trends across time
Trends across space
Socio-economic insights
Report Hosted on “R4V: Operational Portal Refugee Situations”
STUDYING MALE FERTILITY RATES
Parenthood on Facebook
Mean Age at Child-Bearing
• Goal: fill data gaps on “mean age at child-bearing”
Out-of-Sample Predictions
Male MAC predictions for countries w/o ground truth
LIMITATIONS AND CHALLENGES
Ethical Challenges
• Privacy
– Was possible to obtain PII until early 2018 [Venkatadri
et al., 2018]
– Audience estimates for “custom audiences” no longer
supported
– The k in k-anonymity has been increased
• Vulnerable populations
– Was possible to exclude minorities from ads
– Was possible to target based on likely diseases
– Still targetable through proxy interests
We only use aggregate, anonymous data without
interacting with any user
Limitations: Selection Bias
Aren’t you just studying FB/LI/… vs. the “real
world”?
• If we understand the selection bias, we can
model it and de-bias the estimates
– Non-response biases in surveys
– Usual signal in a prediction model
– Non-random fake/duplicate accounts could
become problematic depending on domain
• Even if “only” LI, still real-world implications
– LI used for hiring and to find keynote speakers
Limitations: Black Box
Who knows how FB’s classifier labels “expats” or
SC’s classifier labels “math enthusiasts”?
• Use as signal, not as ground truth
– Empirically, highly predictive of “proper” definition
– Unified definition can be a plus
• Incentives are in the right place
– Companies try to provide value to advertisers and,
hence, are incentivized to have correct labels
• Inconsistencies over time problematic
– In March 2019 FB changed its “expat” classifier
Limitations: No Longitudinal Data
None of the services provide information on
running a hypothetical ad campaign in the past
• No historical data sets of audience estimates
exist
– Hard to do causal inference (natural experiments)
• Similar to Twitter streaming API
– The best time to start collecting data is 20 years
ago. The second best time is today.
Limitations: What about Myspace?
Services come and go and FB et al. might
become obsolete
• Only useful for understanding and modeling
processes with current relevance
Usage patterns change over time
• FB of 2009 unlike FB of 2019.
• Users might become more privacy concerned.
• Re-validate and re-train your model over time.
MORE NON-OBVIOUS DATA SOURCES
Google Correlate and Fertility
Discover search terms correlated with different fertility rates across US states
https://www.google.com/trends/correlate/search?e=id:f7PU4mFDWV-&t=all
Remove terms with no conceivable link to sex, pregnancy or maternity
Predicting Spatial Variability
• Performance of the regression models using leave-one-out cross-validation. SMAPE is in [%], RMSE values are multiplied by 1,000.
Use the previous terms to build models predicting state-level fertility rates
All these models make predictions based on linear combinations of search intensity
Goal: apply these spatial models across time
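The evaluation described above (a linear model on search intensity, scored with leave-one-out cross-validation) can be sketched as follows. This is a simplified illustration and not the study's code. It uses a single predictor instead of a combination of several terms, the toy numbers are invented, and SMAPE follows one common variant, since the slides do not say which definition was used.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_predictions(xs, ys):
    """Predict each point with a model fitted on all the other points."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


def rmse(actual, predicted):
    """Root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def smape(actual, predicted):
    """Symmetric MAPE in percent: mean of |p - a| / ((|a| + |p|) / 2)."""
    terms = [abs(p - a) / ((abs(a) + abs(p)) / 2)
             for a, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical state-level data: search intensity vs. fertility rate.
search_intensity = [0.2, 0.5, 0.9, 1.3, 1.8, 2.1]
fertility_rate = [0.051, 0.055, 0.062, 0.066, 0.075, 0.078]

held_out = leave_one_out_predictions(search_intensity, fertility_rate)
print("RMSE x 1,000:", round(1000 * rmse(fertility_rate, held_out), 3))
print("SMAPE [%]:", round(smape(fertility_rate, held_out), 3))
```

Each state is predicted by a model that never saw it, which is what makes the reported errors out-of-sample.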
Learning Across Space, Predicting Across Time
• Temporal trend when applying the “teen” model
across time. Values are rescaled to a maximum of 1.0.
Pearson r correlation across 2010-2015 when using
the spatial model to predict trends across time.
Followerwonk and Gender Roles
                       (mother|mom) of …   (father|dad) of …   Total
… (girls|daughters)    1,257               303                 1,560
… (sons|boys)          941                 545                 1,486
Total                  2,198               848
Location: (us|usa|united states)
https://followerwonk.com/bio/?q=(father|dad)%20of%20(sons|boys)&l=(us|usa|united%20states)
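One possible way to read the bio counts above is to compare how often fathers, relative to mothers, mention sons versus daughters. The sketch below computes the father share per row and an odds ratio from the four counts on the slide. The choice of statistic is an assumption for illustration; the slide reports only the raw counts.

```python
# Counts of Twitter bios from the slide (US location filter).
counts = {
    "daughters": {"mother": 1257, "father": 303},
    "sons": {"mother": 941, "father": 545},
}


def father_share(row):
    """Fraction of matching bios in a row that were written by fathers."""
    return row["father"] / (row["mother"] + row["father"])


share_daughters = father_share(counts["daughters"])
share_sons = father_share(counts["sons"])

# Odds of a bio being a father's (vs. a mother's) for sons, relative to daughters.
odds_sons = counts["sons"]["father"] / counts["sons"]["mother"]
odds_daughters = counts["daughters"]["father"] / counts["daughters"]["mother"]
odds_ratio = odds_sons / odds_daughters

print(f"father share, daughters: {share_daughters:.3f}")
print(f"father share, sons:      {share_sons:.3f}")
print(f"odds ratio:              {odds_ratio:.2f}")
```

An odds ratio above 1 means father bios are relatively more common among bios mentioning sons than among those mentioning daughters. It says nothing about why, and the selection biases discussed earlier apply here too.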
More Creative Data Sources
Online genealogy
- see how marriage mobility has changed
Online obituaries
- monitor patients discharged from hospital
Google Street View
- parked cars tell income and political orientation
https://sites.google.com/site/digitaldemography/
Creative Offline Data
Closing Thoughts
• Receptive collaborators (re selection bias)
• Publication venue (re career considerations)
• Data for Development (re practical use)
Again, amazing collaborators!
• Natalie Adler, Musa Al-Asad, Francesco Billari,
Antoine Dubois, Masoomali Fatehkia, Harsh
Gandhi, Manuel Garcia, Kiran Garimella, Krishna
Gummadi, Karri Haranko, Ridhi Kashyap, Yelena
Mejova, Alfredo Morales, Fabrizio Natale, Joao
Palotti, Tejas Rafaliya, Francesco Rampazzo, Marzia
Rango, Vedran Sekara, Vatsala Singh, Spyridon
Spyratos, Bogdan State, Reham Tamime, Michele
Vespe, Jeffrey Villaveces, Agnese Vitali, Emilio
Zagheni, …
Thanks!
Interested in a collaboration? Get in touch:
iweber@hbku.edu.qa
References and full texts: https://ingmarweber.de/publications/

Más contenido relacionado

La actualidad más candente

Journalist Involvement in Comment Sections
Journalist Involvement  in Comment SectionsJournalist Involvement  in Comment Sections
Journalist Involvement in Comment SectionsGenaro Bardy
 
Mapping Online Publics on Twitter
Mapping Online Publics on TwitterMapping Online Publics on Twitter
Mapping Online Publics on TwitterAxel Bruns
 
High-skilled Immigrants in the Massachusetts Civilian Labor Force
High-skilled Immigrants in the Massachusetts Civilian Labor ForceHigh-skilled Immigrants in the Massachusetts Civilian Labor Force
High-skilled Immigrants in the Massachusetts Civilian Labor ForceInstituto Diáspora Brasil (IDB)
 
Social Media Research and Practice in the Health Domain - Tutorial, Part II
Social Media Research and Practice in the Health Domain - Tutorial, Part IISocial Media Research and Practice in the Health Domain - Tutorial, Part II
Social Media Research and Practice in the Health Domain - Tutorial, Part IIIngmar Weber
 
GlobalPulse_SAS_MethodsPaper2011
GlobalPulse_SAS_MethodsPaper2011GlobalPulse_SAS_MethodsPaper2011
GlobalPulse_SAS_MethodsPaper2011UN Global Pulse
 
Global Pulse: Mining Indonesian Tweets to Understand Food Price Crises copy
Global Pulse: Mining Indonesian Tweets to Understand Food Price Crises copyGlobal Pulse: Mining Indonesian Tweets to Understand Food Price Crises copy
Global Pulse: Mining Indonesian Tweets to Understand Food Price Crises copyUN Global Pulse
 
Statistical Analysis on the Usage of Internet
Statistical Analysis on the Usage of InternetStatistical Analysis on the Usage of Internet
Statistical Analysis on the Usage of Internettheijes
 
Taking Collaborations to Scale to Improve Population Health
Taking Collaborations to Scale to Improve Population HealthTaking Collaborations to Scale to Improve Population Health
Taking Collaborations to Scale to Improve Population HealthPractical Playbook
 
Data Breach Research Plan 72415 FINAL
Data Breach Research Plan 72415 FINALData Breach Research Plan 72415 FINAL
Data Breach Research Plan 72415 FINALJoseph White MPA CPM
 
Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...
Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...
Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...UN Global Pulse
 
GWI Social Q3 2014
GWI Social Q3 2014GWI Social Q3 2014
GWI Social Q3 2014Webrazzi
 
Online Search And Society: Could Your Best Friend Be Your Worst Enemy?
Online Search And Society: Could Your Best Friend Be Your Worst Enemy?Online Search And Society: Could Your Best Friend Be Your Worst Enemy?
Online Search And Society: Could Your Best Friend Be Your Worst Enemy?Rachel Noonan
 
Profiling Big Data sources to assess their selectivity
Profiling Big Data sources to assess their selectivityProfiling Big Data sources to assess their selectivity
Profiling Big Data sources to assess their selectivityPiet J.H. Daas
 
Mapping Online Publics: New Methods for Twitter Research
Mapping Online Publics: New Methods for Twitter ResearchMapping Online Publics: New Methods for Twitter Research
Mapping Online Publics: New Methods for Twitter ResearchAxel Bruns
 
S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...
S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...
S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...Peiyun Zhang
 
Data science week_2_visualization
Data science week_2_visualizationData science week_2_visualization
Data science week_2_visualizationKeiko Ono
 
Big data for development
Big data for development Big data for development
Big data for development Junaid Qadir
 
Detecting fake news_with_weak_social_supervision
Detecting fake news_with_weak_social_supervisionDetecting fake news_with_weak_social_supervision
Detecting fake news_with_weak_social_supervisionSuresh S
 

La actualidad más candente (20)

Journalist Involvement in Comment Sections
Journalist Involvement  in Comment SectionsJournalist Involvement  in Comment Sections
Journalist Involvement in Comment Sections
 
Mapping Online Publics on Twitter
Mapping Online Publics on TwitterMapping Online Publics on Twitter
Mapping Online Publics on Twitter
 
High-skilled Immigrants in the Massachusetts Civilian Labor Force
High-skilled Immigrants in the Massachusetts Civilian Labor ForceHigh-skilled Immigrants in the Massachusetts Civilian Labor Force
High-skilled Immigrants in the Massachusetts Civilian Labor Force
 
Social Media Research and Practice in the Health Domain - Tutorial, Part II
Social Media Research and Practice in the Health Domain - Tutorial, Part IISocial Media Research and Practice in the Health Domain - Tutorial, Part II
Social Media Research and Practice in the Health Domain - Tutorial, Part II
 
GlobalPulse_SAS_MethodsPaper2011
GlobalPulse_SAS_MethodsPaper2011GlobalPulse_SAS_MethodsPaper2011
GlobalPulse_SAS_MethodsPaper2011
 
Global Pulse: Mining Indonesian Tweets to Understand Food Price Crises copy
Global Pulse: Mining Indonesian Tweets to Understand Food Price Crises copyGlobal Pulse: Mining Indonesian Tweets to Understand Food Price Crises copy
Global Pulse: Mining Indonesian Tweets to Understand Food Price Crises copy
 
Statistical Analysis on the Usage of Internet
Statistical Analysis on the Usage of InternetStatistical Analysis on the Usage of Internet
Statistical Analysis on the Usage of Internet
 
Taking Collaborations to Scale to Improve Population Health
Taking Collaborations to Scale to Improve Population HealthTaking Collaborations to Scale to Improve Population Health
Taking Collaborations to Scale to Improve Population Health
 
Data Breach Research Plan 72415 FINAL
Data Breach Research Plan 72415 FINALData Breach Research Plan 72415 FINAL
Data Breach Research Plan 72415 FINAL
 
We Got a Map for That
We Got a Map for ThatWe Got a Map for That
We Got a Map for That
 
Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...
Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...
Analyzing Attitudes Towards Contraception & Teenage Pregnancy Using Social Da...
 
GWI Social Q3 2014
GWI Social Q3 2014GWI Social Q3 2014
GWI Social Q3 2014
 
Online Search And Society: Could Your Best Friend Be Your Worst Enemy?
Online Search And Society: Could Your Best Friend Be Your Worst Enemy?Online Search And Society: Could Your Best Friend Be Your Worst Enemy?
Online Search And Society: Could Your Best Friend Be Your Worst Enemy?
 
Profiling Big Data sources to assess their selectivity
Profiling Big Data sources to assess their selectivityProfiling Big Data sources to assess their selectivity
Profiling Big Data sources to assess their selectivity
 
Mapping Online Publics: New Methods for Twitter Research
Mapping Online Publics: New Methods for Twitter ResearchMapping Online Publics: New Methods for Twitter Research
Mapping Online Publics: New Methods for Twitter Research
 
S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...
S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...
S16_MAP_Using Data Science to Model Relationships Between Educational Levels ...
 
Data science week_2_visualization
Data science week_2_visualizationData science week_2_visualization
Data science week_2_visualization
 
Big data for development
Big data for development Big data for development
Big data for development
 
Detecting fake news_with_weak_social_supervision
Detecting fake news_with_weak_social_supervisionDetecting fake news_with_weak_social_supervision
Detecting fake news_with_weak_social_supervision
 
Social media in the public sector south korea twitter
Social media in the public sector south korea twitterSocial media in the public sector south korea twitter
Social media in the public sector south korea twitter
 

Similar a Digital Trace Data for Demographic Research

Digital Demography - Keynote at SocInfo'18
Digital Demography - Keynote at SocInfo'18Digital Demography - Keynote at SocInfo'18
Digital Demography - Keynote at SocInfo'18Ingmar Weber
 
Digital & Social Media: 3-step action plan for clinical trials volunteer recr...
Digital & Social Media: 3-step action plan for clinical trials volunteer recr...Digital & Social Media: 3-step action plan for clinical trials volunteer recr...
Digital & Social Media: 3-step action plan for clinical trials volunteer recr...Aimee Edgeworth
 
Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)
Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)
Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)Esmeralda Casas-Silva, Ph.D.
 
Digital Gender Gaps Seen Through Social Media
Digital Gender Gaps Seen Through Social MediaDigital Gender Gaps Seen Through Social Media
Digital Gender Gaps Seen Through Social MediaIngmar Weber
 
Analytics Academy 2015 Presentation Slides
Analytics Academy 2015 Presentation SlidesAnalytics Academy 2015 Presentation Slides
Analytics Academy 2015 Presentation SlidesHarvardComms
 
Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...
Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...
Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...WorkSmart Integrated Marketing
 
Mobile Advertising 101: Beyond Geofencing
Mobile Advertising 101: Beyond GeofencingMobile Advertising 101: Beyond Geofencing
Mobile Advertising 101: Beyond GeofencingGil Rogers
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Saurabh Mishra
 
Capturing Social and Clinical Knowledge for personalised care
Capturing Social and Clinical Knowledge for personalised careCapturing Social and Clinical Knowledge for personalised care
Capturing Social and Clinical Knowledge for personalised careVanessa Lopez
 
Online marketing strategy for audiologists
Online marketing strategy for audiologistsOnline marketing strategy for audiologists
Online marketing strategy for audiologistsGeoffrey Cooling
 
SOCI 11- Day One - Monday Afternoon - June 13, 2016
SOCI 11- Day One - Monday Afternoon - June 13, 2016SOCI 11- Day One - Monday Afternoon - June 13, 2016
SOCI 11- Day One - Monday Afternoon - June 13, 2016Michael Kerr
 
Accessing the Already Connected Consumer
Accessing the Already Connected ConsumerAccessing the Already Connected Consumer
Accessing the Already Connected ConsumerRay Poynter
 
Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...
Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...
Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...Mary Loftus
 
Let’s Get Personal! Ways to Harness Your Data to Improve Personalization
Let’s Get Personal! Ways to Harness Your Data to Improve PersonalizationLet’s Get Personal! Ways to Harness Your Data to Improve Personalization
Let’s Get Personal! Ways to Harness Your Data to Improve PersonalizationHobsons
 
America's Backbone: Education and our Youth
America's Backbone: Education and our YouthAmerica's Backbone: Education and our Youth
America's Backbone: Education and our YouthSahr Saffa
 
Grant Writing and Reporting
Grant Writing and ReportingGrant Writing and Reporting
Grant Writing and ReportingHealthy City
 

Similar a Digital Trace Data for Demographic Research (20)

Digital Demography - Keynote at SocInfo'18
Digital Demography - Keynote at SocInfo'18Digital Demography - Keynote at SocInfo'18
Digital Demography - Keynote at SocInfo'18
 
Digital & Social Media: 3-step action plan for clinical trials volunteer recr...
Digital & Social Media: 3-step action plan for clinical trials volunteer recr...Digital & Social Media: 3-step action plan for clinical trials volunteer recr...
Digital & Social Media: 3-step action plan for clinical trials volunteer recr...
 
Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)
Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)
Recruitment for Hard-to-Reach Populations: LGBTQ Youth (E. Fordyce)
 
Digital Gender Gaps Seen Through Social Media
Digital Gender Gaps Seen Through Social MediaDigital Gender Gaps Seen Through Social Media
Digital Gender Gaps Seen Through Social Media
 
Analytics Academy 2015 Presentation Slides
Analytics Academy 2015 Presentation SlidesAnalytics Academy 2015 Presentation Slides
Analytics Academy 2015 Presentation Slides
 
Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...
Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...
Marketing in higher education - Surviving in rough terrain - by Toddy Gibby o...
 
NEGAP 2011: Sizing Up A Monumental Task: Building your Recruitment Funnel and...
NEGAP 2011: Sizing Up A Monumental Task: Building your Recruitment Funnel and...NEGAP 2011: Sizing Up A Monumental Task: Building your Recruitment Funnel and...
NEGAP 2011: Sizing Up A Monumental Task: Building your Recruitment Funnel and...
 
Mobile Advertising 101: Beyond Geofencing
Mobile Advertising 101: Beyond GeofencingMobile Advertising 101: Beyond Geofencing
Mobile Advertising 101: Beyond Geofencing
 
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
Breakout 3. AI for Sustainable Development and Human Rights: Inclusion, Diver...
 
Capturing Social and Clinical Knowledge for personalised care
Capturing Social and Clinical Knowledge for personalised careCapturing Social and Clinical Knowledge for personalised care
Capturing Social and Clinical Knowledge for personalised care
 
Maura Tuohy
Maura TuohyMaura Tuohy
Maura Tuohy
 
Online marketing strategy for audiologists
Online marketing strategy for audiologistsOnline marketing strategy for audiologists
Online marketing strategy for audiologists
 
SOCI 11- Day One - Monday Afternoon - June 13, 2016
SOCI 11- Day One - Monday Afternoon - June 13, 2016SOCI 11- Day One - Monday Afternoon - June 13, 2016
SOCI 11- Day One - Monday Afternoon - June 13, 2016
 
Accessing the Already Connected Consumer
Accessing the Already Connected ConsumerAccessing the Already Connected Consumer
Accessing the Already Connected Consumer
 
Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...
Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...
Ways of seeing learning - 2017v1.0 - NUI Galway University of Limerick postgr...
 
Let’s Get Personal! Ways to Harness Your Data to Improve Personalization
Let’s Get Personal! Ways to Harness Your Data to Improve PersonalizationLet’s Get Personal! Ways to Harness Your Data to Improve Personalization
Let’s Get Personal! Ways to Harness Your Data to Improve Personalization
 
1710 track1 bagirov
1710 track1 bagirov1710 track1 bagirov
1710 track1 bagirov
 
Saffas 05
Saffas 05Saffas 05
Saffas 05
 
America's Backbone: Education and our Youth
America's Backbone: Education and our YouthAmerica's Backbone: Education and our Youth
America's Backbone: Education and our Youth
 
Grant Writing and Reporting
Grant Writing and ReportingGrant Writing and Reporting
Grant Writing and Reporting
 

Más de Ingmar Weber

Different Hashtags, Different Opinions - Twitter Polarization in Egypt
Different Hashtags, Different Opinions - Twitter Polarization in EgyptDifferent Hashtags, Different Opinions - Twitter Polarization in Egypt
Different Hashtags, Different Opinions - Twitter Polarization in EgyptIngmar Weber
 
Data on Polarization, Peace, and Propaganda
Data on Polarization, Peace, and PropagandaData on Polarization, Peace, and Propaganda
Data on Polarization, Peace, and PropagandaIngmar Weber
 
Using Advertising Platforms for Social Good
Using Advertising Platforms for Social GoodUsing Advertising Platforms for Social Good
Using Advertising Platforms for Social GoodIngmar Weber
 
Monitoring migration using social media data an introduction
Monitoring migration using social media data   an introductionMonitoring migration using social media data   an introduction
Monitoring migration using social media data an introductionIngmar Weber
 
Not so-obvious social media analysis to study current affairs
Not so-obvious social media analysis to study current affairsNot so-obvious social media analysis to study current affairs
Not so-obvious social media analysis to study current affairsIngmar Weber
 
Digital advertising data for migration research
Digital advertising data for migration researchDigital advertising data for migration research
Digital advertising data for migration researchIngmar Weber
 
Advertising Data for Good
Advertising Data for GoodAdvertising Data for Good
Advertising Data for GoodIngmar Weber
 
Using advertising data to model migration, poverty and digital gender gaps
Using advertising data to model migration, poverty and digital gender gapsUsing advertising data to model migration, poverty and digital gender gaps
Using advertising data to model migration, poverty and digital gender gapsIngmar Weber
 
Correlated Impulses: Using Facebook Interests to Improve Predictions of Crime...
Correlated Impulses: Using Facebook Interests to Improve Predictions of Crime...Correlated Impulses: Using Facebook Interests to Improve Predictions of Crime...
Correlated Impulses: Using Facebook Interests to Improve Predictions of Crime...Ingmar Weber
 
Tapping into advertising platforms to monitor ict usage and more
Tapping into advertising platforms to monitor ict usage and moreTapping into advertising platforms to monitor ict usage and more
Tapping into advertising platforms to monitor ict usage and moreIngmar Weber
 
Tracking Digital Gender Gaps
Tracking Digital Gender GapsTracking Digital Gender Gaps
Tracking Digital Gender GapsIngmar Weber
 
Estimating Migration and Quantifying Migrant Assimilation Using Internet Adve...
Estimating Migration and Quantifying Migrant Assimilation Using Internet Adve...Estimating Migration and Quantifying Migrant Assimilation Using Internet Adve...
Estimating Migration and Quantifying Migrant Assimilation Using Internet Adve...Ingmar Weber
 
Using internet advertising data for studying international migration
Using internet advertising data for studying international migrationUsing internet advertising data for studying international migration
Using internet advertising data for studying international migrationIngmar Weber
 
Social media analysis for better policy making
Social media analysis for better policy makingSocial media analysis for better policy making
Social media analysis for better policy makingIngmar Weber
 
Matching Methods and Natural Experiments - Examples of Causal Inference from ...
Matching Methods and Natural Experiments - Examples of Causal Inference from ...Matching Methods and Natural Experiments - Examples of Causal Inference from ...
Matching Methods and Natural Experiments - Examples of Causal Inference from ...Ingmar Weber
 
A Warm Welcome Matters! The Link Between Social Feedback and Weight Loss in /...
A Warm Welcome Matters! The Link Between Social Feedback and Weight Loss in /...A Warm Welcome Matters! The Link Between Social Feedback and Weight Loss in /...
A Warm Welcome Matters! The Link Between Social Feedback and Weight Loss in /...Ingmar Weber
 

Más de Ingmar Weber (16)

Different Hashtags, Different Opinions - Twitter Polarization in Egypt
Different Hashtags, Different Opinions - Twitter Polarization in EgyptDifferent Hashtags, Different Opinions - Twitter Polarization in Egypt
Different Hashtags, Different Opinions - Twitter Polarization in Egypt
 
Data on Polarization, Peace, and Propaganda
Data on Polarization, Peace, and PropagandaData on Polarization, Peace, and Propaganda
Data on Polarization, Peace, and Propaganda
 
Using Advertising Platforms for Social Good
Using Advertising Platforms for Social GoodUsing Advertising Platforms for Social Good
Using Advertising Platforms for Social Good
 
Monitoring migration using social media data an introduction
Digital Trace Data for Demographic Research

  • 1. Digital Trace Data for Demographic Research Ingmar Weber @ingmarweber June 12, 2019 Lecture at BIGSSS CSS 2019 Or How I learned to Love Online Advertising
  • 2. Amazing Collaborators! (Alphab.) • Natalie Adler, Musa Al-Asad, Francesco Billari, Antoine Dubois, Masoomali Fatehkia, Harsh Gandhi, Manuel Garcia, Kiran Garimella, Krishna Gummadi, Karri Haranko, Ridhi Kashyap, Yelena Mejova, Alfredo Morales, Fabrizio Natale, Joao Palotti, Tejas Rafaliya, Francesco Rampazzo, Marzia Rango, Vedran Sekara, Vatsala Singh, Spyridon Spyratos, Bogdan State, Reham Tamime, Michele Vespe, Jeffrey Villaveces, Agnese Vitali, Emilio Zagheni, …
  • 3. What is Demography? Demography is the statistical study of populations. According to IP address 70.67.193.176, user Pbsouthwood and other contributors to https://en.wikipedia.org/wiki/Demography
  • 4. The Population Equation Change in population = Inputs – Outputs Inputs = Births + In-migration Outputs = Deaths + Out-Migration • ∆P = (B + I) − (D + O) Fertility, Mortality and Migration
  • 5. Quant: How much? Where? When? • Births - Birth registry: India: ~75%, Kenya: ~65%, Liberia: ~25% (2017) • Deaths - “Global Burden of Disease” (Murray and Lopez, 1997): “Medically certified information is available for less than 30% of the estimated 50.5 million deaths that occur each year worldwide.” • Migration - “The size of the irregular migrant stock of the EU-27 in 2008 was measured to be between 1.9 and 3.8 million, a decline from between 2.4 and 5.4 million in the EU-25 in 2005” (Kovacheva and Vogel, 2009).
  • 6. Qual: Why? How? • Births - Effect of religiosity, available childcare, … • Deaths - Ikigai: “reason to get up in the morning” • Migration - Push/pull factors, assimilation, …
  • 7. Opportunities for New methods • Filling data gaps – New data on migration, fertility, employment, … • Explaining behavior – Richer data, including networks and long-term history • Predictive modeling – Multi-modal forecasting • Take a global perspective on things – Facebook, Google, satellites know (almost) no borders Goal is to augment, not replace, traditional approaches Big Data is not a cure-all panacea
  • 8. Rest of the Talk: Data-Centric • Online advertising audience estimates - Migration stocks, migrant assimilation - Male mean-age-at-childbirth - Ethics, limitations and challenges • More non-obvious data sources - Google Correlate, Followerwonk - Even more non-obvious data sources • Thoughts on interdisciplinary work
  • 10. LinkedIn’s Audience Estimates Female-to-male ratio, LI users: 0.94. Female-to-male ratio, LI users w/ AI: 0.27.
  • 12. Snapchat’s Audience Estimates Female-to-male ratio, SC users: 1.25. Female-to-male ratio, SC users w/ STEM: 0.45.
  • 13. Google’s Impression Estimates What they are actively researching or planning: Baby and children’s products
  • 16. Expats Across US States 2014 2017
  • 19. Bias Reduction via Model-Fitting Mean out-of-sample absolute percentage error 37%, down from 56% without origin-age bias correction Adjusted R^2 = .70 Does not use GDP, language, internet penetration, … z = age-gender group i = country of birth j = US state of residence
  • 21. Do Refugees Share German Interests? What interests to consider? Everybody likes “Music” and “Technology”. How to interpret the score? High/low compared to European migrants? Germans in DEU FB Interests: Football (90%) Max Planck (70%) Sauerkraut (40%) … Arabs in MENA FB Interests: Quran (80%) Ibn Al-Haytham (60%) Falafel (60%) … Arabs in DEU FB Interests: ?
  • 22. Obtaining an Assimilation Score
    Migrant Group                 Assim. Score
    Austrian migrants             .900
    Spanish migrants              .864
    French migrants               .803
    Turkish-speaking migrants     .746
    Arabic-speaking migrants      .643
    A: Women, non-uni, 45-64      .461
    A: Men, uni, 18-24            .677
    • Experimental methodology: take with a ton, not just a grain of salt
    • Needs to be validated externally
    • Goals include finding “bridging” interests/patterns
  • 25. Trends across time Trends across space Socio-economic insights Report Hosted on “R4V: Operational Portal Refugee Situations”
  • 28. Mean Age at Child-Bearing • Goal: fill data gaps on “mean age at child-bearing”
  • 29. Out-of-Sample Predictions Male MAC predictions for countries w/o ground truth
  • 31. Ethical Challenges • Privacy – Was possible to obtain PII until early 2018 [Venkatadri et al., 2018] – Audience estimates for “custom audiences” no longer supported – The k in k-anonymity has been increased • Vulnerable populations – Was possible to exclude minorities from ads – Was possible to target based on likely diseases – Still targetable through proxy interests We only use aggregate, anonymous data without interacting with any user
  • 32. Limitations: Selection Bias Aren’t you just studying FB/LI/… vs. the “real world”? • If we understand the selection bias, we can model it and de-bias the estimates – Non-response biases in surveys – Usual signal in a prediction model – Non-random fake/duplicate accounts could become problematic depending on domain • Even if “only” LI, still real world implications – LI used for hiring and to find keynote speakers
  • 33. Limitations: Black Box Who knows how FB’s classifier labels “expats” or SC’s classifier labels “math enthusiasts”? • Use as signal, not as ground truth – Empirically, highly predictive of “proper” definition – Unified definition can be a plus • Incentives are in the right place – Companies try to provide values to advertisers and, hence, are incentivized to have correct labels • Inconsistencies over time problematic – In March 2019 FB changed its “expat” classifier
  • 34. Limitations: No Longitudinal Data None of the services provide information on running a hypothetical ad campaign in the past • No historical data sets of audience estimates exist – Hard to do causal inference (natural experiments) • Similar to Twitter streaming API – The best time to start collecting data is 20 years ago. The second best time is today.
  • 35. Limitations: What about Myspace? Services come and go and FB et al. might become obsolete • Only useful for understanding and modeling processes with current relevance Usage patterns change over time • FB of 2009 unlike FB of 2019. • Users might become more privacy concerned. • Re-validate and re-train your model over time.
  • 37. Google Correlate and Fertility Discover search terms correlated with different fertility rates across US states https://www.google.com/trends/correlate/search?e=id:f7PU4mFDWV-&t=all Remove terms with no conceivable link to sex, pregnancy or maternity
  • 38. Predicting Spatial Variability • Performance of the regression models using leave-one-out cross-validation. SMAPE is in [%], RMSE values are multiplied by 1,000. Use the previous terms to build models predicting state-level fertility rates All these models make predictions based on linear combinations of search intensity Goal: apply these spatial models across time
  • 39. Learning Across Space, Predicting Across Time • Temporal trend when applying the “teen” model across time. Values are rescaled to a maximum of 1.0. Pearson r correlation across 2010-2015 when using the spatial model to predict trends across time.
  • 40. Followerwonk and Gender Roles
                            (mother|mom) of …   (father|dad) of …   Total
    … (girls|daughters)     1,257               303                 1,560
    … (sons|boys)           941                 545                 1,486
    Total                   2,198               848
    Location: (us|usa|united states)
    https://followerwonk.com/bio/?q=(father|dad)%20of%20(sons|boys)&l=(us|usa|united%20states)
  • 41. More Creative Data Sources Online genealogy - see how marriage mobility has changed Online obituaries - monitor patients discharged from hospital Google Street View - parked cars tell income and political orientation https://sites.google.com/site/digitaldemography/
  • 43. Closing Thoughts • Receptive collaborators (re selection bias) • Publication venue (re career considerations) • Data for Development (re practical use)
  • 44. Again, amazing collaborators! • Natalie Adler, Musa Al-Asad, Francesco Billari, Antoine Dubois, Masoomali Fatehkia, Harsh Gandhi, Manuel Garcia, Kiran Garimella, Krishna Gummadi, Karri Haranko, Ridhi Kashyap, Yelena Mejova, Alfredo Morales, Fabrizio Natale, Joao Palotti, Tejas Rafaliya, Francesco Rampazzo, Marzia Rango, Vedran Sekara, Vatsala Singh, Spyridon Spyratos, Bogdan State, Reham Tamime, Michele Vespe, Jeffrey Villaveces, Agnese Vitali, Emilio Zagheni, …
  • 45. Thanks! Interested in a collaboration? Get in touch: iweber@hbku.edu.qa References and full texts: https://ingmarweber.de/publications/
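
Slide 38 evaluates linear models of state-level fertility with leave-one-out cross-validation and reports SMAPE and RMSE. The following is a minimal sketch of that evaluation loop, not the authors' code. It assumes a single predictor (the slides use linear combinations of several search terms), uses one common SMAPE variant (the slides do not say which), and runs on made-up numbers. All function and variable names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out_predictions(xs, ys):
    """Predict each observation from a model fitted on all the others."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        preds.append(intercept + slope * xs[i])
    return preds


def smape(actual, predicted):
    """Symmetric mean absolute percentage error in percent (assumed variant)."""
    terms = [abs(p - a) / ((abs(a) + abs(p)) / 2)
             for a, p in zip(actual, predicted)]
    return 100 * sum(terms) / len(terms)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up illustration data: one search-intensity value and one
# fertility rate per (hypothetical) state.
search_intensity = [0.2, 0.4, 0.5, 0.7, 0.9]
fertility_rate = [0.058, 0.061, 0.063, 0.066, 0.070]

held_out = leave_one_out_predictions(search_intensity, fertility_rate)
print("SMAPE [%]:", smape(fertility_rate, held_out))
print("RMSE x 1,000:", 1000 * rmse(fertility_rate, held_out))
```

Because every prediction comes from a model that never saw the held-out state, the two error figures describe out-of-sample performance, which is what matters when a spatial model is later applied across time.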