Crowdsourcing Citizen Science Data Quality
1. Crowdsourcing Citizen Science Data Quality with a Human-Computer Learning Network
Wiggins, Gerbracht, Lagoze, Yu, Wong, & Kelling
7 December, 2012 ~ Lake Tahoe, NV
Workshop on Human Computation for Science and Computational Sustainability
3. eBird
• Online checklist program for bird abundance & distribution
• Data (mostly) from recreational birders; used widely
• Over 100 million records & growing
[Figure: eBird observations per month]
6. The eBird HCLN
S Kelling, C Lagoze, W-K Wong, J Yu, T Damoulas, J Gerbracht, D Fink, C Gomes. 2012. eBird: A Human/Computer Learning
Network to Improve Biodiversity Conservation and Research. Artificial Intelligence.
7. Emergent Filters
Kelling, S., J. Yu, J. Gerbracht, and W. K. Wong. 2011. Emergent Filters: Automated Data Verification in a Large-scale Citizen Science
Project. Proceedings of the IEEE eScience Conference.
8. Modeling Expertise
[Figure: graphical model. Nodes: environmental covariates X_i, occupancy Z_i (latent), detection Y_it, detection covariates W_it. Edge labels: o_i, d_it. Plates: t = 1, …, T_i and i = 1, …, N.]
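The structure on this slide (latent occupancy Z_i tied to environmental covariates X_i, observed detections Y_it tied to detection covariates W_it) can be illustrated with a minimal sketch. It assumes the standard site-occupancy formulation: Z_i is Bernoulli with probability o_i, and Y_it is Bernoulli with probability Z_i · d_it, so there are no false positives. The logistic links, the weight vectors `alpha` and `beta`, and the function names are hypothetical choices for illustration, not taken from the slides.

```python
import math
import random


def sigmoid(x):
    """Logistic link mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_site(x_i, w_i, alpha, beta, rng):
    """Simulate one site i with T_i = len(w_i) visits.

    x_i   : environmental covariates X_i (list of floats)
    w_i   : detection covariates W_it, one list of floats per visit t
    alpha : hypothetical occupancy weights (assumption, not from the slides)
    beta  : hypothetical detection weights (assumption, not from the slides)
    """
    # Occupancy probability o_i from the environmental covariates.
    o_i = sigmoid(sum(a * x for a, x in zip(alpha, x_i)))
    z_i = 1 if rng.random() < o_i else 0  # latent occupancy Z_i

    y_i = []
    for w_it in w_i:
        # Detection probability d_it from the per-visit covariates.
        d_it = sigmoid(sum(b * w for b, w in zip(beta, w_it)))
        # Assumed: a species is only detected where it is present.
        y_it = 1 if (z_i == 1 and rng.random() < d_it) else 0
        y_i.append(y_it)
    return z_i, y_i


def site_likelihood(y_i, o_i, d_i):
    """Marginal probability of detection history y_i, summing out Z_i."""
    # Case Z_i = 1: each visit detects with probability d_it.
    p_occupied = o_i
    for y, d in zip(y_i, d_i):
        p_occupied *= d if y == 1 else (1.0 - d)
    # Case Z_i = 0: only an all-zero history is possible.
    p_empty = (1.0 - o_i) if not any(y_i) else 0.0
    return p_occupied + p_empty
```

For example, with two visits, o_i = 0.5 and d_it = 0.5, an all-zero history has marginal likelihood 0.5 · 0.25 + 0.5 = 0.625: the site may be occupied but missed on both visits, or simply unoccupied. That ambiguity is why repeated visits per site are needed to separate occupancy from detection.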
10. Average Detection Probabilities
[Figure: chart by species, y-axis from -0.05 to 0.20; panels: Common birds, Hard-to-detect birds]
Yu, J., W. K. Wong, and R. A. Hutchinson. 2010. Modeling Experts and Novices in Citizen Science Data
for Species Distribution Modeling. IEEE 10th International Conference on Data Mining (ICDM),
17. Future Work
• Preliminary studies integrated into eBird for better data quality on multiple levels
• Resulting human-computer learning network will use eBird data in new ways
• Evaluation of motivation, learning, and skills related to expertise ranking & birding routes