"Big Data for Development: Opportunities & Challenges” - UN Global Pulse
Download at: http://www.unglobalpulse.org/BigDataforDevWhitePaper
TABLE OF CONTENTS
Section I: Opportunities
• DATA INTENT AND CAPACITY
• SOCIAL SCIENCE AND POLICY APPLICATIONS
Section II: Challenges
• DATA CHALLENGES
• ANALYTICAL CHALLENGES
Section III: Applications
• WHAT NEW DATA STREAMS BRING TO THE TABLE
• MAKING BIG DATA WORK FOR DEVELOPMENT
Section I: Opportunity
The Data Revolution
Big data: the three V’s of the digital data deluge
• Exponential growth in volume
• Increasing velocity of data flow
• Bewildering variety of new data types
Real-time operations in the private sector
• Real-time analysis, real-time decision-making, real-time customer feedback
What Do We Mean by Real-Time?
Global Pulse Definition: “Information about a phenomenon available quickly enough to maintain an accurate reflection of its current state, such that effective action may be taken in response.”
Timeframe for intervention is relative to context:
• Malnutrition: months
• Starvation: weeks
• Cholera: days
• Earthquake: hours
Section I: Opportunity
Relevance to the Developing World
• As of 2010: 4 billion of the world’s 5 billion mobile phones are in developing countries
• Mobile services: money transfers, job search, commerce, market prices, social media
Mobile banking in East Africa: Kenya 11,000 new users/day, Tanzania 15,000, Uganda 18,000
Facebook in Senegal: 100,000 new users per month
Section I: Opportunity
Intent in an Age of Growing Volatility
• Drivers of Volatility: financial shocks, climate change, hyperconnectivity
2011 OECD Report: “[d]isruptive shocks to the global economy are likely to become more frequent and cause greater economic and societal hardship. The economic spill-over effect of events like the financial crisis or a potential pandemic will grow due to the increasing interconnectivity of the global economy and speed with which people, goods and data travel”.
• Early Warning Today: local impacts are invisible or impossible to track as they happen.
• Growing Intent: policy makers are recognizing both the costs of volatility and the need for greater agility.
Section I: Opportunity
Data Mining and Data Science
• The availability of real-time digital data is increasing every second.
• Slowly but surely, intent to leverage it as a public good is growing.
• Yet there must also be capacity to understand it, and to use it to change outcomes.
“Data is the new oil. Like oil, it must be refined before it can be used.” - Andreas Weigend
Section I: Opportunity
Big Data for Development: Getting Started
Illustration: Coping strategies of a hypothetical household facing rising commodity prices and unemployment
OFFLINE BEHAVIORS
• Buy cheaper foods
• Work longer hours
• Reduce energy use
• Draw down savings
• Sell assets
• Borrow from relatives
DIGITAL SIGNATURES
• Depletion of airtime credit
• Smaller mobile airtime purchases
• Failure to repay microloans via mobile financial services
• Changes in calling patterns
• Inbound money transfers
• Searches for jobs, health
• Sales of livestock via mobile trading network
• Venting frustrations on social media
Section I: Opportunity
Big Data for Development: Getting Started
A Loose BD4D Taxonomy:
1. Data Exhaust. Mobile usage, purchases, search, app usage.
2. Online Information. News stories, blogs, Twitter, Facebook, obituaries, job postings, e-commerce.
3. Physical Sensors. Satellite imagery, video, traffic sensors, etc.
4. Crowdsourced Reports. Information actively generated by citizens through mobile phone-based surveys, hotlines, online maps, etc.
Section I: Opportunity
Capacity: Big Data Analytics
Data Analytics and “Reality Mining”
1. Stream Analytics: Continuous analysis over real-time streaming data (social media, calling patterns, online prices, search)
2. Data Mining: Online digestion of semi-structured and unstructured historical data (news items, blog posts)
3. Real-Time Correlation: Integrating fast streams with historical records to provide context to new data
Data Visualization Matters!
Figure: A word cloud of this whitepaper
Figure: Global legal timber trade: Top 5 exporters and costs
Section I: Opportunity
Social Science and Policy Applications
A growing body of evidence:
• Mining mobile location data to detect job loss and migration
• Mining mobile usage to detect mental illness
• Mining Twitter for misuse of antibiotics and other medications
• Mining Facebook for evidence of drinking problems among college students
• Remote sensing of nighttime light emissions for a real-time estimation of GDP
• Crowdsourcing citizen SMS reports to estimate earthquake damage
Tracking Health-Related Behaviour Change: Mining Twitter messages
• H1N1 epidemic in the US
• Cholera in Haiti
Tracking Health-Related Behaviour Change: Mining Google searches
Volume of real-time searches for symptoms predicts the official number of dengue cases in Brazil
Section II: Challenges
Data Privacy
1. Digital Data Privacy as a Human Right
• Data acquisition
• Storage
• Retention
• Use
• Presentation
2. Privacy Risks in Big Data
• Awareness of consent to collect
• Reuse of public content
• Re-identification
Section II: Challenges
Data Access
Private sector barriers to sharing Big Data:
• Legal constraints
• Reputational risk
• Competitive advantage
• Culture of secrecy
• Lack of incentives
• Technical complexity
• Level of effort
Data Philanthropy!
Section II: Challenges
Analysis
Getting the picture right with user-generated data
• Falsification, deliberate distortion
• Sensor network distribution
• Perceptions vs. facts: Flu Trends detects ILI, not Influenza.
• Sentiment Analysis: sarcasm, irony, hyperbole, humor, and the elusiveness of intent.
• Expressed vs. actual intentions
• Text mining: context and significance
Figure: Map of tweets in Jakarta
Section II: Challenges
Analysis
Interpreting behavioral data
• Selection bias: income, education, age, gender, technical aptitude, service provider
• Media coverage drives behaviour change
• Apophenia: correlation is not causality
Section II: Challenges
Analysis
Detecting and defining anomalies in human ecosystems
• Establishing a baseline: how stringent is your model?
• Sensitivity vs. specificity: false positives undermine credibility; false negatives reduce relevance.
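The baseline and sensitivity/specificity trade-off on this slide can be illustrated with a minimal sketch. It is not Global Pulse's method: the z-score rule, the function names `flag_anomalies` and `sensitivity_specificity`, and all numbers are hypothetical, chosen only to show how a more stringent threshold trades missed events for fewer false alarms.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, observed, threshold):
    """Flag observations whose z-score against the baseline exceeds threshold."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; z-scores are undefined")
    return [abs(x - mu) / sigma > threshold for x in observed]


def sensitivity_specificity(flags, truth):
    """Return (sensitivity, specificity) of flags against known outcomes.

    Assumes truth contains at least one real event and one non-event.
    """
    tp = sum(f and t for f, t in zip(flags, truth))
    fn = sum(t and not f for f, t in zip(flags, truth))
    tn = sum(not f and not t for f, t in zip(flags, truth))
    fp = sum(f and not t for f, t in zip(flags, truth))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical weekly indicator: a quiet baseline, then six new observations.
baseline = [10, 11, 9, 10, 10, 11, 9, 10]
observed = [10, 12, 15, 11, 9, 16]
truth = [False, False, True, False, False, True]  # weeks with a real event

loose = sensitivity_specificity(flag_anomalies(baseline, observed, 2.0), truth)
strict = sensitivity_specificity(flag_anomalies(baseline, observed, 7.0), truth)
print(loose)   # (1.0, 0.75): every event caught, one false alarm
print(strict)  # (0.5, 1.0): no false alarms, one event missed
```

The loose threshold catches both events but raises a false alarm; the strict one stays silent on noise but misses an event, which is the credibility vs. relevance tension named above.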
Section III: Application
What New Data Streams Bring to the Table
Know your data!
• Big Data is... just data. However...
• News organizations have developed verification methodologies
• Perceptual data is useful for detecting events
• False perceptions drive population behavior
• Selection bias can be an advantage: in developing countries, online inflation may precede offline inflation
Section III: Application
What New Data Streams Bring to the Table
Applications of Big Data for Development
“Even if all you have got is a contemporaneous correlation, you’ve got a 6-week lead on the reported values. The hope is that as you take the economic pulse in real time, you will be able to respond to anomalies more quickly.” - Hal Varian, Chief Economist, Google
• Sometimes correlation suffices: proxy indicators
• Accuracy vs. speed, cost, scale
• Real-time data saves lives
Figure: USGS Twitter Earthquake Detector
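As a rough illustration of a proxy indicator, the sketch below checks how well a real-time series tracks an official series at different lead times using a plain Pearson correlation. The function names `pearson` and `best_lead` and the data are hypothetical, and the code assumes neither series is constant; it shows correlation only and says nothing about causation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def best_lead(proxy, official, max_lead):
    """Return (lead, r): the lead at which proxy[t] best tracks official[t + lead]."""
    best = (0, pearson(proxy, official))
    for lead in range(1, max_lead + 1):
        r = pearson(proxy[:-lead], official[lead:])
        if r > best[1]:
            best = (lead, r)
    return best


# Hypothetical data: the official figure repeats the proxy two periods later.
proxy = [1, 3, 2, 5, 4, 6, 8, 7, 9, 10]
official = [2, 1, 1, 3, 2, 5, 4, 6, 8, 7]

lead, r = best_lead(proxy, official, max_lead=4)
print(lead, round(r, 3))  # 2 1.0
```

A high correlation at a positive lead suggests the proxy moves before the official figure is published, which is the sense in which a proxy can buy response time.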
Section III: Application
What New Data Streams Bring to the Table
Global Pulse research: real-time proxy indicators
Tweets about the price of rice vs. official food prices in Indonesia
Section III: Application
What New Data Streams Bring to the Table
Global Pulse research: real-time proxy indicators
Correlation of mood changes and emerging topics in social media with official unemployment figures in the US and Ireland
Section III: Application
What New Data Streams Bring to the Table
A threefold opportunity for development
1. Early warning: Faster detection of anomalies at the onset of a crisis allows more agile responses to prevent harm.
2. Real-time awareness: A fine-grained and current representation of reality informs better design and targeting of programmes and policies.
3. Real-time feedback: Continuous monitoring for behaviour changes following programme implementation enables a more adaptive approach to development, in which rapid adjustments may be made until results are achieved.
Section III: Application
Making Big Data Work for Development
Contextualization is key
1. Data context: Indicators should not be interpreted in isolation. Monitor for constellations of anomalies, triangulating across data sources.
2. Cultural context: Local knowledge of what is “normal” in a given population is a prerequisite for recognizing anomalies. Cultural practices and norms vary widely the world over and these differences certainly extend to the use of digital services. There is a deeply ethnographic dimension to using Big Data for development.
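One simple way to read "constellations of anomalies" is to raise an alert only when several independent sources are anomalous in the same period. The sketch below assumes each source has already been reduced to per-period boolean flags; the function name `constellation_alerts`, the source names, and the `min_sources` threshold are hypothetical.

```python
def constellation_alerts(flags_by_source, min_sources):
    """Alert in each period where at least min_sources sources are anomalous.

    flags_by_source maps a source name to a list of booleans, one per period,
    with all lists aligned to the same periods.
    """
    periods = zip(*flags_by_source.values())
    return [sum(period) >= min_sources for period in periods]


# Hypothetical per-period anomaly flags from three data sources.
flags = {
    "airtime_topups": [False, True, False, True],
    "food_price_tweets": [False, False, False, True],
    "job_searches": [False, True, True, True],
}

print(constellation_alerts(flags, min_sources=2))
# [False, True, False, True]
```

Requiring agreement across sources is one way to triangulate; choosing which sources count, and what "normal" looks like for each, still depends on the local knowledge described in point 2.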
Section III: Application
Making Big Data Work for Development
Becoming sophisticated users of information
Example: FEMA tracking 2011 US tornado impacts through Twitter
1. “We aren’t making widgets”: Navigating the tradeoff between speed and accuracy.
2. Focus on changing outcomes. How can we leverage the real-time nature of the data to save lives?
“Disasters are like horseshoes, hand grenades and thermal nuclear devices, you just need to be close, preferably more than less.” - Craig Fugate, Administrator, US Federal Emergency Management Agency
Conclusion
How can Big Data fulfill its potential as a public good?
1. Institutional and financial support from public sector actors
2. Creating incentives for corporations to share data
3. Creating opportunities for academic researchers to collaborate
4. Developing new models, technologies and policies for safe and responsible sharing and reuse of data for the public good
5. New types of partnerships
UN Global Pulse
www.unglobalpulse.org
@unglobalpulse
Image credit: Aaron Koblin, 24 hours of AT&T phone calls and Internet traffic flowing through New York City