Turning Data Into Narrative

Daniel X. O'Neil

Turning Data Into Narrative: Strategies for finding and sharing stories embedded within sets of data.

Strategies for finding…
• Search is your friend
• Advanced search is your best friend
• Don’t default to FOIA
• Don’t deal with Public Information Officers
• The hidden web still exists
• Data is often more structured than you think
• It takes an abundance of data types to tell a story
@juggernautco

FOIA is not your friend.
• The Internet is your friend.
• Example: Dallas crime reports
• Here’s their statement about getting data from them on their public Web site:
– Open Records requests must be made in writing. They may be:
– 1. Hand-carried to the Records Section, Dallas Police Headquarters, 1400 S. Lamar Street, Dallas, TX
– 2. Faxed to 214-671-4636
– 3. E-mailed to openrecordunit@dpd.ci.dallas.tx.us
– 4. Mailed by US Postage to Dallas Police Open Records, 1400 S. Lamar Street, Dallas, TX 75215

…and sharing stories…
• Knowing more than anyone else is still the only way to do this
• Surfacing from the hidden Web is doing everybody a favor
• Information is not knowledge. Publishing data without context is not super-useful
• Most data is boring. Why? Because data is made by people, and most people are boring most of the time

Ten Databases
• Building permits
• Business licenses
• Historic preservation list
• Sanborn maps (1929 and 1950)
• County assessor
• County recorder of deeds
• Original photography
• Google search for news coverage
• New York Times archive
• Walgreens surplus property

…embedded within sets of data
• It’s got to be the other way around
• We’ve got to embed our data into our stories rather than find stories embedded in our data
• I don’t want to search for anything
• I’d rather know everything
• Every object should have a page on the Internet (so let’s get to work)

We need a machine.
• A generic context engine
• To evenly distribute information
• And tell me what the information means
• I know: that sounds like a “reporter”
• But people used to think that “search engine” sounded a lot like “librarian”, too
• We need humans and machines

It’s easy.
• Find dataset
• Review dataset
• Describe what the data means
• Find another dataset
• Describe what the other dataset means
• Describe what the first dataset means in the context of the second dataset
• Repeat
• Let’s do this thing.

Editor's Notes

I’m Dan O’Neil, and I run the Smart Chicago Collaborative, an organization devoted to improving lives in Chicago through technology. Among other things, I work with Chicago city government, developers, and community groups to use civic data in new and useful ways. As a co-founder of EveryBlock, I’m also a previous Knight News Challenge grantee. I certainly wouldn’t be doing any of this today if it weren’t for the vision of the Knight Foundation.
The main charge to the panelists is to talk about “Strategies for finding and sharing stories embedded within sets of data.” Let’s take that piece by piece. I’ve been responsible for data acquisition for quite some time, and I’ve found a goodly amount of data in my day. These are the main upshots I’ve got to share that are not already widely propagated.
One way that I think I differ from the main reporter/journalism mode of finding data is that I prefer Searching to Asking. Search is your friend, and advanced search is your best friend. I think that the instinct is to make Freedom of Information Act requests and go through traditional routes like calling Public Information Officers. That can waste a lot of time. Here’s an example in Dallas: if you use their default process, you’re in for a pretty traditional experience. Requests in writing, wait for an answer.
And if you use the default search for crime records, you get this screen. It has records going back to 2005. You fill out the form and you get your answers back. Pretty typical experience.
What you wouldn’t be able to tell, unless you searched the Dallas Police Web site more deeply, is this. The Dallas Police publishes an amazing cache of crime data in flat files. All of it, with no search, no letters or emails, going back 12 years. Why anyone would make any FOIA request, or why the Dallas Police would want anyone to do that, is beyond me. And this data has some of the most amazing crime details (the police narrative) that you can find in crime data anywhere. This is hidden in plain sight.
Data is often more structured than you think. Over the weekend I participated in the Knight-Mozilla-MIT "Story & Algorithm" Hack Day run by Dan Sinker. I met a couple of Boston developers and we executed on a project I’ve had for about 7 years. Like many of you here, I’m not smart enough to actually make things, so I have to rely on the kindness of developers. What we made was “Condition of Anonymity”, a Web site that automatically pulls the reason that anonymity was granted to an anonymous source by a reporter for the New York Times. We often think about data as the stuff inside spreadsheets and published in flat files to FTP servers, but there is a whole world of semi-structured data like this hidden in plain sight, inside plain text. We used the NYT Search API to review every article in the NYT back to January 1, 2000 for the phrase “condition of anonymity”, then used a natural language processing toolkit to find what I call the “because clauses”. There’s some gold in there. It takes an abundance of data types to tell a story. This story feels like a Walt Whitman poem to me.
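A minimal sketch of the “because clause” idea, for illustration only. It swaps in a plain regular expression for the natural language processing toolkit mentioned above, does not call the NYT Search API, and runs on made-up sentences. The names `BECAUSE_CLAUSE` and `extract_because_clauses` are hypothetical.

```python
import re

# Assumed pattern: capture whatever follows "because" right after the phrase
# "condition of anonymity", stopping at the first period or semicolon.
BECAUSE_CLAUSE = re.compile(
    r"condition of anonymity\s+because\s+(?P<reason>[^.;]+)",
    re.IGNORECASE,
)


def extract_because_clauses(text: str) -> list[str]:
    """Return the stated reasons for anonymity found in a block of text."""
    return [m.group("reason").strip() for m in BECAUSE_CLAUSE.finditer(text)]


# Made-up sentences for illustration, not real article text.
sample = (
    "An official spoke on condition of anonymity because he was not "
    "authorized to discuss the matter. A second aide, who requested "
    "anonymity, declined to comment. A third source spoke on the condition "
    "of anonymity because the talks were private; no date was set."
)
reasons = extract_because_clauses(sample)
for reason in reasons:
    print(reason)
```

A regex like this only catches the most direct phrasing; sentences that put other words between the phrase and “because” would need the kind of language tooling the project actually used.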
Lastly, I highly recommend the Data Journalism Handbook, which was created, in part, by many people in this room. It’s a really excellent resource.
I am not a journalist. But in my own time, I have published a pretty extensive set of stories based on data, and I have some insights, maybe. The first one is that there aren’t any shortcuts. You still have to know more than anybody else about a subject in order to tell good stories. I’ve got an example to share. Next is kind of a gimme, which is that you shouldn’t mix up information and knowledge. The analysis is where it’s at. The most amazing insight I can share is that data is boring. I’ve had a long time to consider why that is true, and I think I have the answer. The reason is that people are boring. We forget that data is made by people. And most people are boring most of the time. Every object should have a page on the Internet (so let’s get to work).
Here’s kind of a master example. I live near this building. It had been empty for a very long time. Then construction started. The construction was heralded by a building permit. But, of course, the building permit was boring. So I looked further.
I searched ten different databases and, lo and behold, more data made it less boring. Why? Because almost all people are interesting some of the time. So if you look hard enough, you’ll find those stories. I found a business license for a 3-day pop-up store. So this place has been empty for decades, but was open for three days. And I missed it. It used to be a bank, and I found out from the NYT archive (in PDF format, the hidden Web) that there was a bank run at this location in 1937. Again, not boring.
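One way to picture looking up a single building across several databases is sketched below. This is an assumption-laden illustration, not the process used for the talk: the record layout, the database names, and the helpers `normalize_address` and `records_for_address` are all hypothetical, and the sample rows are simplified stand-ins.

```python
import re


def normalize_address(address: str) -> str:
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = re.sub(r"[.,]", "", address.lower())
    return " ".join(cleaned.split())


def records_for_address(address, databases):
    """Collect matching records per database, skipping databases with no match."""
    target = normalize_address(address)
    found = {}
    for name, records in databases.items():
        matches = [r for r in records if normalize_address(r["address"]) == target]
        if matches:
            found[name] = matches
    return found


# Stand-in rows; each database spells the same address a little differently.
databases = {
    "building_permits": [{"address": "1601 N. Milwaukee", "note": "construction permit"}],
    "business_licenses": [
        {"address": "1601 N Milwaukee", "note": "3-day pop-up store"},
        {"address": "200 S. Main St.", "note": "unrelated license"},
    ],
    "county_assessor": [{"address": "200 S. Main St.", "note": "unrelated parcel"}],
    "nyt_archive": [{"address": "1601 n. milwaukee", "note": "1937 bank run"}],
}
dossier = records_for_address("1601 N. Milwaukee", databases)
for source, rows in dossier.items():
    print(source, [row["note"] for row in rows])
```

Real address matching is messier than this (unit numbers, street suffixes, renamed streets), so the normalization step is where most of the work would go.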
Here’s an example of two things: finding data in unstructured text and finding interesting data. This is an Advanced Search in Google for the word “jimmied” in the Dallas crime data published by EveryBlock. So that site becomes a public, searchable instance of a previously hidden data set. Apparently police have used the word “jimmied” to describe an action taken by suspected criminals 2,430 times. All sorts of things are jimmied, apparently. It’s not boring.
Lastly, I want to encourage a different way to think about data. We’ve got to embed our data into our stories rather than find stories embedded in our data. I don’t want to search for anything. I’d rather know everything. Every object should have a page on the Internet, just like 1601 N. Milwaukee.
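If the narratives are available as flat files rather than through a search engine, the same question can be asked locally. The sketch below assumes the narratives have already been loaded into a list of strings; the functions `count_term` and `tally_objects` are hypothetical, and the sample narratives are invented for illustration.

```python
from collections import Counter
import re


def count_term(narratives, term):
    """Count narratives that contain `term` as a whole word, ignoring case."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return sum(1 for text in narratives if pattern.search(text))


def tally_objects(narratives, verb):
    """Tally the word that follows `verb` plus an article (the, a, an)."""
    pattern = re.compile(
        rf"\b{re.escape(verb)}\s+(?:the|a|an)\s+(\w+)", re.IGNORECASE
    )
    counts = Counter()
    for text in narratives:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


# Invented narratives, not real police records.
narratives = [
    "UNKNOWN SUSPECT JIMMIED THE DOOR AND TOOK PROPERTY.",
    "Suspect jimmied a window to enter the residence.",
    "Suspect broke the lock on the gate.",
    "Suspect jimmied the door of the vehicle.",
]
print(count_term(narratives, "jimmied"))
print(tally_objects(narratives, "jimmied"))
```

The tally is a crude way to see what kinds of things get jimmied; it only looks at the single word after the article.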
This machine can be described as a generic context engine, to evenly distribute information and tell me what the information means. I know: that sounds like a “reporter”. But people used to think that “search engine” sounded a lot like “librarian”, too. We need humans and machines.
• Find dataset
• Review dataset
• Describe what the data means
• Find another dataset
• Describe what the other dataset means
• Describe what the first dataset means in the context of the second dataset
• Repeat
• Let’s do this thing.
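The loop above can be sketched in a few lines, under heavy assumptions: each dataset is a list of rows (dicts), “describing” a dataset means counting values in one shared column, and “context” means putting the two sets of counts side by side. The column name `block` and the functions `describe` and `describe_in_context` are hypothetical.

```python
from collections import Counter


def describe(rows, key):
    """Summarize a dataset as counts of each value in one column."""
    return Counter(row[key] for row in rows)


def describe_in_context(first, second, key):
    """Pair each value's count in the first dataset with its count in the second."""
    a, b = describe(first, key), describe(second, key)
    return {value: (a[value], b[value]) for value in sorted(set(a) | set(b))}


# Invented rows standing in for two datasets that share a "block" column.
permits = [{"block": "A"}, {"block": "A"}, {"block": "B"}]
licenses = [{"block": "B"}, {"block": "C"}]

context = describe_in_context(permits, licenses, "block")
for block, (permit_count, license_count) in context.items():
    print(block, permit_count, license_count)
```

The counts are only the mechanical half of the step; saying what a block with permits but no licenses means is still the human part.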
