2. Objectives. Reporting and collecting data: collecting; balancing structured vs. unstructured (and semi-structured) data; formats; level of detail; tools for collecting, storing and visualizing. Learning objectives: (1) explain the difference between structured, semi-structured and unstructured data; (2) design a simple structured data reporting format; (3) understand the options for collecting, parsing, storing, analyzing and sharing structured data.
3. External data. Information is power. Relevance & targeting; aggregation, sharing, authorizing; rapid feedback; alerting; early analytics; getting information.
17. Why is SMS popular? Low cost; needs little signal (1 bar is enough); uses little power; works on existing equipment; scales from a few users to nationwide.
18. Ways of sharing information. UNSTRUCTURED: “Hello everyone we have meeting tomorrow”; “We are dealing with cholera outbreak will call you later”. Simple, easy, flexible. SEMI-STRUCTURED: “at Ratchaburi, we are seeing Cholera URGENT”; “H5N1 Birds:200 should we call PHD?”. Simple; requires FEATURE EXTRACTION and some training. STRUCTURED: “H5N1, Birds:200, Lab: No, FollowUp: no”, e.g. the TURTLE standard. Complex for human entry; hard to learn and to get right.
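The structured example above lends itself to a small parser. What follows is a minimal Python sketch, assuming a comma-separated message whose first field is the disease code and whose remaining fields are `Key: value` pairs. The function name `parse_report`, the lower-cased keys and the yes/no-to-boolean conversion are illustrative assumptions; none of them is taken from the TURTLE standard.

```python
def parse_report(message):
    """Parse a message like 'H5N1, Birds:200, Lab: No, FollowUp: no'.

    Assumed layout (hypothetical, for illustration only): the first
    comma-separated field is the disease code; every later field is
    'Key: value'.
    """
    fields = [f.strip() for f in message.split(",")]
    if not fields[0]:
        raise ValueError("empty report")
    report = {"disease": fields[0]}
    for field in fields[1:]:
        key, sep, value = field.partition(":")
        if not sep:
            # A field without a colon does not fit the assumed format.
            raise ValueError(f"expected 'Key: value', got {field!r}")
        key, value = key.strip().lower(), value.strip()
        if value.lower() in ("yes", "no"):
            report[key] = value.lower() == "yes"   # yes/no -> bool
        elif value.isdigit():
            report[key] = int(value)               # counts -> int
        else:
            report[key] = value                    # anything else stays text
    return report


print(parse_report("H5N1, Birds:200, Lab: No, FollowUp: no"))
```

The strictness of such a parser illustrates the trade-off on the slide: the format is easy for software to consume, but a person typing on a phone has to get every separator right.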
19. Feature Extraction. Unstructured data (SMS messages, voice/radio calls, pictures and videos, sensor readings) → feature extraction → structured data (data records, relationships, metadata). Features: time, place, person, event, organization, trustworthiness. Extracted by: algorithms & databases, experts, crowds, sensors. Techniques: image recognition, closed captioning, face recognition, calibration.
22. Feature Extraction Example: Places. “At Stung Treng, seeing Cholera” → Lat; Long. 1) FIND the feature. 2) Associate metadata: Stung Treng = Lat; Long. OPEN DATABASES tend to be the richest sources of local data and provide a strong platform. Geocoding sources: Google and Yahoo geocoders, OpenStreetMap, your own database (e.g. PCODES). Humans are excellent at extracting features! Unless you need real-time, large-volume geocoding, crowdsourcing is an excellent option.
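The two steps on this slide (find the feature, then associate metadata) can be sketched in Python with a tiny lookup table. This is only an illustration under stated assumptions: `GAZETTEER` and `extract_place` are hypothetical names, the coordinates are approximate placeholder values, and a real deployment would query one of the geocoding sources named above or its own place-code database.

```python
import re

# Hypothetical gazetteer: lower-cased place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "stung treng": (13.53, 105.97),
    "ratchaburi": (13.54, 99.82),
}


def extract_place(message, gazetteer=GAZETTEER):
    """Return (name, (lat, lon)) for a known place mentioned in message, else None."""
    text = message.lower()
    # Step 1: FIND the feature. Longer names are tried first so that
    # multi-word places are preferred over shorter names.
    for name in sorted(gazetteer, key=len, reverse=True):
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            # Step 2: associate metadata (the coordinates) with the feature.
            return name, gazetteer[name]
    return None


print(extract_place("At Stung Treng, seeing Cholera"))
print(extract_place("Hello everyone we have meeting tomorrow"))
```

An exact-match lookup like this misses misspellings and local name variants, which is one reason the slide recommends human or crowdsourced extraction when volume and latency allow it.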
23. A Haitian with a need sends an SMS to the 4636 shortcode. The SMS goes through Nuntium and then on to the Emergency Information System. A Haitian volunteer or staff member translates, tags and geocodes it. The organized information is then dispatched for response or added to reports.
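The workflow above can be modelled as a record that gains structure at each step. The Python sketch below is a hypothetical illustration only: the `Report` class, its field names, the `annotate` and `route` helpers and the sample values are all assumptions made for this example, and they do not describe the actual Nuntium or Emergency Information System data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Report:
    raw_text: str                                   # SMS as received
    translation: Optional[str] = None               # added by a volunteer
    tags: List[str] = field(default_factory=list)   # added by a volunteer
    location: Optional[Tuple[float, float]] = None  # (lat, lon) from geocoding

    def is_organized(self):
        """True once the message has been translated, tagged and geocoded."""
        return (
            self.translation is not None
            and bool(self.tags)
            and self.location is not None
        )


def annotate(report, translation, tags, location):
    """Record a volunteer's work on one message: translate, tag, geocode."""
    report.translation = translation
    report.tags = list(tags)
    report.location = location
    return report


def route(report):
    """Pick the next step; the status names are hypothetical."""
    return "ready_for_dispatch" if report.is_organized() else "needs_volunteer"


# Placeholder values, not real message data.
incoming = Report(raw_text="<SMS text as received>")
print(route(incoming))
annotate(incoming, "<English translation>", ["example-tag"], (0.0, 0.0))
print(route(incoming))
```

Keeping the raw text alongside the added fields means the original message is never lost, even if a translation or tag later turns out to be wrong.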
34. International number: easy to get started; high scale; spotty coverage; blocked in some countries; cost per message.
35. Logical vs. physical data paths. Pattern: digitize existing reporting protocols. Antipattern: implement in digital form the physical path of paper. Complex, expensive, unreliable, less secure.
45. Summary. Data collection is only one link in the chain. Simplicity and usability are key. Unstructured and semi-structured data are a good balance of data quality and usability.
46. Thank You! Eduardo Jezierski (Engineering), not a doctor!! edjez@instedd.org, @edjez