7. "Over the next decade, cities will continue to grow larger at a rapid pace. At the same time, new technologies will unlock massive streams of data about cities and their residents. As these forces collide, they will turn every city into a unique civic laboratory—a place where technology is adapted in novel ways to meet local needs."
December 2010. Institute for the Future's 2020 Forecast – The Future of Cities, Information and Inclusion.
8. Cities: Worldwide, city leaders and managers need cost-effective & smart solutions
Population Growth:
- 221 cities globally with more than 1 million citizens
- China will move 300 million people to cities by 2020
- 90% of these cities are in emerging markets
- In 2008, more people lived in cities (3.3 billion); by 2030, 5 billion
- Cities are more efficient and have less environmental impact
Cost of City Services: Aging infrastructure, resource constraints & waste
- Washington DC's water system has elements that date to the Civil War
- Inefficiency, leaks and waste rival maintenance and expansion costs
- Legacy infrastructure in megacities like NYC is too cost-prohibitive to replace
Source: Gartner – Is Smart Cities the Next Big Market? March 2011
11. Int. No. 29: Accessibility to Public Data Sets
“...requires that all public data sets maintained by City agencies shall be made available on the Internet through a single web portal, formatted to enable viewing by web browsers and mobile devices and also in their raw or unprocessed form.”
13. Why NYC?
• City Population: 8.4 M (NYC estimate)
• Metro NYC Population: 18.9 million (2010 Census)
• City Density: 10,630/km² (2010 Census)
• Metro NYC Density: 1,085.7/km² (2010 Census)
• 50 million visitors a year
• DoITT Annual Budget ~$325M
• Gross Metropolitan Product: USD $1.28 Trillion (Greyhill Advisors)
• De facto Capital of the World
• Fastest growing Tech Industry - “New Tech City” (Center for an Urban Future)
• Second only to Silicon Valley for most startups
• Emphasis on public-private partnerships
17. 0. Huge Open Data
1. Extract Metadata
2. Derive ExtraMetadata (Semantics + Statistics + Algorithm + Crowd)
3. Do Federated Queries on both the Metadata AND the Data
Crowdknowing
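Step 3 above, querying both the metadata and the data across several sources, can be illustrated with a small sketch. It assumes each source is an in-memory catalog of datasets, each with a metadata dictionary and a list of row dictionaries. The `Dataset` class, the `federated_search` function, and the sample catalogs are hypothetical and are not taken from the NYCFacets implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for one published dataset."""
    name: str
    metadata: dict                             # e.g. title, agency, columns
    rows: list = field(default_factory=list)   # list of dict records


def federated_search(sources, keyword):
    """Search every source for `keyword` in both metadata and data.

    Returns a list of (source_name, dataset_name, where) tuples, where
    `where` is "metadata" or "data".
    """
    keyword = keyword.lower()
    hits = []
    for source_name, datasets in sources.items():
        for ds in datasets:
            # Metadata match: any metadata value mentions the keyword.
            if any(keyword in str(v).lower() for v in ds.metadata.values()):
                hits.append((source_name, ds.name, "metadata"))
            # Data match: any cell in any row mentions the keyword.
            if any(keyword in str(cell).lower()
                   for row in ds.rows for cell in row.values()):
                hits.append((source_name, ds.name, "data"))
    return hits


# Made-up sample catalogs, for illustration only.
sources = {
    "catalog_a": [
        Dataset("parks", {"title": "City Parks"},
                [{"name": "Central Park", "borough": "Manhattan"}]),
    ],
    "catalog_b": [
        Dataset("wifi", {"title": "Manhattan WiFi Hotspots"},
                [{"location": "Bryant Park", "provider": "example"}]),
    ],
}

results = federated_search(sources, "manhattan")
```

Here the keyword is found in the data of one catalog and in the metadata of the other, which is the case a metadata-only search would miss.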
18. Crowdknowing
Human-powered, Machine-accelerated, Collective Knowledge Systems
Human-powered: Curation, Comments, Microcontributions, Feedback, Bug Reports, Likes, Shares, Profile, Votes, Subscribes, Tagging, etc.
Machine-accelerated: Ontology, Inferencing, Semantic Mapping, Query Federation, Statistics, Pattern Recognition, Multivariate Analysis & Forecasting, Automated linking, Feeds, Notifications, etc.
22. ExtraMetadata?
• Derived using “Semantics, Statistics, Algorithm & the Crowd”
• “Supercharacterize” each dataset by sampling not just the schema, but the underlying data as well
• Score each dataset - Pediacities Rank
• Virtuous Feedback Loop around the Data: micro-conversations/contributions
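As a rough illustration of characterizing a dataset from a sample of its data rather than from its schema alone, the sketch below samples rows and guesses a type for each column. It assumes rows arrive as dictionaries of strings (as they would from a CSV file). The `infer_type` heuristic and the `characterize` function are hypothetical and are not the method used by NYCFacets.

```python
import random


def infer_type(values):
    """Guess a column type from sampled string values (simple heuristic)."""
    non_null = [v for v in values if v not in (None, "")]
    if not non_null:
        return "empty"
    # Try the strictest type first, then fall back.
    for caster, label in ((int, "integer"), (float, "number")):
        try:
            for v in non_null:
                caster(v)
            return label
        except (TypeError, ValueError):
            continue
    return "text"


def characterize(rows, sample_size=100, seed=0):
    """Sample up to `sample_size` rows and infer a type for each column."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rows if len(rows) <= sample_size else rng.sample(rows, sample_size)
    columns = sorted({key for row in sample for key in row})
    return {col: infer_type([row.get(col) for row in sample])
            for col in columns}


# Made-up rows, for illustration only.
rows = [
    {"zip": "10001", "trees": "12", "avg_height_m": "4.5", "species": "oak"},
    {"zip": "10002", "trees": "7", "avg_height_m": "", "species": "maple"},
]
profile = characterize(rows)
```

A heuristic like this labels ZIP codes as integers, which is one reason a feedback loop with human contributors is useful for correcting machine-derived metadata.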
23. ExtraMetadata
Top Level ExtraMetadata
• Number of Rows
• Pediacities Rank
• Freshness Score
• Sparseness Score
• Social Score
• Views Score
• Download Score
• Rating Score
• Simple Visualization
Detail ExtraMetadata
• Top Values
• Descriptive statistics
• Nulls/Non-nulls
• Smallest Value
• Largest Value
• “Uniqueness”
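Several of the detail items listed above can be computed directly from a column of values. The sketch below is one possible reading, not the definitions used by Pediacities. It treats "uniqueness" as distinct values divided by non-null values and sparseness as the fraction of empty cells, and it assumes each column holds values of a single comparable type. The function names and sample rows are hypothetical.

```python
from collections import Counter


def column_profile(values, top_n=3):
    """Compute detail statistics for one column of raw values."""
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "nulls": len(values) - len(non_null),
        "non_nulls": len(non_null),
        "top_values": counts.most_common(top_n),
        "smallest": min(non_null) if non_null else None,
        "largest": max(non_null) if non_null else None,
        # Assumed definition: distinct values / non-null values.
        "uniqueness": len(counts) / len(non_null) if non_null else 0.0,
    }


def sparseness_score(rows, columns):
    """Fraction of cells that are null or empty across the given columns."""
    total = len(rows) * len(columns)
    if total == 0:
        return 0.0
    empty = sum(1 for row in rows for col in columns
                if row.get(col) in (None, ""))
    return empty / total


# Made-up rows, for illustration only.
rows = [
    {"borough": "Brooklyn", "complaints": 12},
    {"borough": "Queens", "complaints": 7},
    {"borough": "Brooklyn", "complaints": None},
    {"borough": "", "complaints": 3},
]
borough = column_profile([r["borough"] for r in rows])
complaints = column_profile([r["complaints"] for r in rows])
sparseness = sparseness_score(rows, ["borough", "complaints"])
```

The slides do not say how the component scores are combined into the Pediacities Rank, so that step is left out of the sketch.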
32. In time for 4.0
• More Datasources!
• Not just Metadata! Data too!
• Federated Queries!
• SPARQL support
• Collaborative Ontology Modeling
• Feeds / Subscriptions / Notifications
• Microcontributions
• Gamification
• combine NYCDataWeb and NYCFacets
• Support both Web 2.0 & Web 3.0 APIs
33. [Figure: Linked Open Data cloud diagram showing interlinked datasets, color-coded by domain: Media, Geographic, Publications, User-generated content, Government, Cross-domain, Life sciences. As of September 2011]
34. .NYC - the First Linked Open Data City
[Figure: detail of the Linked Open Data cloud diagram with a .NYC node added]
35. We need your help & feedback
A Smart Data Exchange for All Data NYC
Find out more at
http://nyc.pediacities.com/facets
@jqnatividad @samimirzabaig @pediacities @ontodia
36. CREDITS
• Flickr User Weston Price, Paleo-Caveman-Omnivore-LowCarb-Meat-Diet-Info (http://www.flickr.com/photos/paleo-atkins-meat-diet-info/with/6718805047/)
• Flickr User Gao Yi (http://www.flickr.com/photos/gaoyi/178514677/)
• Senator Arlen Specter being confronted at a Town Hall meeting after passage of Healthcare Reform Act (Bradley C Bower-AP)
• Several pictures taken from NYC.gov/NYCEDC properties, Tumblr and Flickr accounts