1. STEWARDS – A decade of
Increasing the Impact of ARS
Watershed Research Programs
E.J. Sadler, J.L. Steiner, J.L. Hatfield, D.E. James,
B.C. Vandenberg, and T. Tsegaye
2. Outline
• Background about STEWARDS
• Expected impact back in 2008
• Unmeasurables
• Measurables
– Citations
– Downloads of data
– Downloads of papers
• Observations and Conclusions
3. What is STEWARDS?
• Repository for CEAP Watershed data
• Developed by a volunteer team
– Steiner, J. L., Sadler, E. J., Chen, J-S., Wilson, G., James, D., Vandenberg, B., Ross, J.,
Oster, T. L., and Cole, K. J. STEWARDS: Overview of development and challenges. J.
Soil & Water Cons. 63(6):569-576. 2008.
– Sadler, E. J., Steiner, J. L., Chen, J-S., Wilson, G., Ross, J., Oster, T., James, D.,
Vandenberg, B., Cole, K., and Hatfield, J.L. STEWARDS User perspective, operation,
and application. J. Soil & Water Cons. 63(6):577-589. 2008.
– Steiner, J. L., Sadler, E. J., Hatfield, J. L., Wilson, G., James, D. E., Vandenberg, B. C.,
Ross, J. D., Oster, T. L., and Cole, K. J. Data management to enhance long-term
watershed research: Context and STEWARDS case study. J. Ecohydrology. 2:391-398.
2009.
– Steiner, J.L., Sadler, E.J., Wilson, G., Hatfield, J.L., James, D., Vandenberg, B., Chen,
J.-S., Oster, T., Ross, J.D., and Cole, K. STEWARDS watershed data system: system
design and implementation. Trans ASABE 52(5):1523-1533. 2009.
4. Compliance impacts
• US GAO-OIG 2004 report 04-382
– “Watershed Management: Better Coordination of Data
Collection Efforts Needed to Support Key Decisions”
• The Water Resources Development Act of 2007
– “… to provide public access to water resources and water
quality data …”
• OSTP Policy Memorandum 22 February 2013
– “Expanding Public Access to the Results of Federally
Funded Research”
5. Policy Impacts
• Essentially undocumented, perhaps undocumentable
• Personal knowledge at times
• Inferences about regional interests
– Columbia pesticide downloads for re-registration of atrazine
– West Lafayette nutrient downloads for Lake Erie
– University Park phosphorus
– Beltsville nutrients for Chesapeake Bay issues
6. Expectations of Science
• Increasing calls for public access, metadata,
clearinghouses, and self-describing data
– Budapest Open Access Initiative – 2002
– Berlin Declaration on Open Access to Knowledge in the Sciences and
Humanities – 2003
– Bethesda Statement on Open Access Publishing – 2003
– OECD 2004 “Communiqué on Science, Technology and Innovation for
the 21st Century”
– Creative Commons (Science Commons in 2005, since resorbed)
– Jim Gray on eScience: A transformed scientific method, in The fourth
paradigm: Data-intensive scientific discovery, 2009.
7. Impact narratives, ca 2008
• Impacts already seen at that time
– Improved scientific credibility by documenting QA/QC procedures
– Increased collaborative opportunities for individual scientists, watershed
teams, and the ARS water resources program
– Increased learning opportunities for participants at watersheds
– Increased demands on scientists for provision of open data
– Better accountability at the agency level for investment in long-term
watershed research
• Anticipated impacts at that time
– Increased scientific productivity at watersheds
– Increased credit to scientists for contribution to open data systems
8. Value of STEWARDS
• Citations of the articles documenting STEWARDS
Paper                             Google Scholar data   SCOPUS data
Steiner et al 2008 JSWC           21                    18
Sadler et al 2008 JSWC            14                    16
Steiner et al 2009 Trans ASABE    14                    13
Steiner et al 2009 Ecohydrology   9                     NA
9. Value of STEWARDS
• Citations of articles documenting data
Series  Year  STEWARDS  # papers  SCOPUS mean cites  SCOPUS mean for J-Y  Google Scholar
Ames JEQ 1999 Yes 7 73.6 37.2 103.1
Boise WRR 2001 No 10 35.0 55.1 50.6
Oxford ACS 2004 Yes 15 10.8
Tifton WRR 2007 Yes 5 28.3 41.4 43.4
Tucson WRR 2008 No 19 30.4 31.7 50.3
U Park WRR 2011 Yes 4 10.7 27.9 12.0
El Reno JEQ 2014 Yes 11 6.5 7.4 8.1
Columbia JEQ 2015 Yes 9 8.5 5.9 12.0
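One way to read the two SCOPUS columns side by side is as a ratio: mean citations per paper in the series divided by the mean for the same journal and year. The sketch below is only an illustration of that arithmetic, assuming "Mean for J-Y" is the comparison baseline; the Oxford ACS row is left out because its SCOPUS values are not recoverable from the slide.

```python
# Values copied from the table above. The ratio itself is an assumption about
# how the two SCOPUS columns might be compared; it is not stated on the slide.
rows = [
    # (series, year, in_stewards, scopus_mean_cites, scopus_mean_for_jy)
    ("Ames JEQ", 1999, True, 73.6, 37.2),
    ("Boise WRR", 2001, False, 35.0, 55.1),
    ("Tifton WRR", 2007, True, 28.3, 41.4),
    ("Tucson WRR", 2008, False, 30.4, 31.7),
    ("U Park WRR", 2011, True, 10.7, 27.9),
    ("El Reno JEQ", 2014, True, 6.5, 7.4),
    ("Columbia JEQ", 2015, True, 8.5, 5.9),
]

# Ratio > 1 means the series was cited more than the journal-year average.
ratios = {
    f"{series} {year}": round(cites / jy_mean, 2)
    for series, year, _in_stewards, cites, jy_mean in rows
}

above_average = [name for name, ratio in ratios.items() if ratio > 1]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```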
10. Value of STEWARDS
• Downloads of papers documenting data
– JEQ provides this for 2014-2015 special sections
• Open Access matters! 3-4x as many downloads of OA as of non-OA
– Tucson has kept monthly download counts for both data and
papers and may serve as a comprehensive model of metrics
Site, year, journal    Acc downloads      Mean downloads      Mean, J
Tucson, 2008, WRR      ~20k in 2 years    ~1000 in 2 years    n/a
El Reno, 2014, JEQ     5,060              460                 360
Columbia, 2015, JEQ    7,030              781                 360
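The mean-downloads column appears to be the accumulated count divided by the number of papers in each special section (11 for El Reno and 9 for Columbia, per the "# papers" column of the citation table). A minimal sketch of that arithmetic, assuming this reading is correct:

```python
# Assumption: mean downloads = accumulated downloads / number of papers.
# Paper counts are taken from the "# papers" column of the citation table.
sections = {
    "El Reno, 2014, JEQ": {"acc_downloads": 5060, "papers": 11},
    "Columbia, 2015, JEQ": {"acc_downloads": 7030, "papers": 9},
}


def mean_downloads(acc_downloads: int, papers: int) -> int:
    """Return accumulated downloads per paper, rounded to the nearest integer."""
    return round(acc_downloads / papers)


means = {site: mean_downloads(**row) for site, row in sections.items()}

for site, mean in means.items():
    print(f"{site}: {mean} downloads per paper")
```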
11. Value of STEWARDS
• Downloads of data – in only the last 5 years
Year                         Records downloaded
2013 Total - partial year     1,098,738
2014 Total                      602,136
2015 Total                    4,554,629
2016 Total                    4,807,983
2017 Total                    1,431,487
2018 - as of 07/25            2,570,094
Grand Total                  15,065,067
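The grand total can be checked directly from the yearly rows. The short sketch below sums the figures from the table and picks out the peak year, which also shows how uneven the counts are from one year to the next.

```python
# Yearly record-download counts copied from the table above.
records_downloaded = {
    "2013 (partial year)": 1_098_738,
    "2014": 602_136,
    "2015": 4_554_629,
    "2016": 4_807_983,
    "2017": 1_431_487,
    "2018 (as of 07/25)": 2_570_094,
}

grand_total = sum(records_downloaded.values())
peak_year = max(records_downloaded, key=records_downloaded.get)
low_year = min(records_downloaded, key=records_downloaded.get)

print(f"Grand total: {grand_total:,}")
print(f"Peak year: {peak_year}, lowest year: {low_year}")
```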
12. Value of WQX Portal for
STEWARDS Water Quality data
• Second, MUCH more visible web presence
• Same WQ data but different format and metadata
• ARS data alongside EPA STORET, USGS NWIS, and BioData
Year     Rows of Data    Download Count
2015     375,203,161      1,890
2016      49,153,151      2,537
2017     129,409,275     51,403
2018      22,635,077      2,750
Total    576,400,664     58,580
13. Trends Toward More Valuable Data
• Toward data mining and fusion of many sources
• Toward self-describing data
• Can’t predict future use – places challenges on
documentation
» Benchmark has been whether someone could reproduce the study
» That’s really not good enough for datasets
14. Toward multiple destinations
• NOT
– one size fits all
– do it once and it is then static
• A better model
– suitable local database that works locally
– local expertise in exporting to the requisite transfer standards
• In other words, separate the storage, reporting, retrieval,
and transmit functions, so that they don’t place constraints
on the other processes.
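The separation described above can be sketched as a local store that knows nothing about transfer formats, plus a set of independent exporters. Everything here is hypothetical: the class and function names, the record fields (`site`, `analyte`, `value`), and the sample values are made up for illustration and are not the STEWARDS design.

```python
import csv
import io
import json
from typing import Callable, Dict, List

Record = Dict[str, object]


class LocalStore:
    """Hypothetical local database: storage and retrieval only."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def add(self, record: Record) -> None:
        self._records.append(dict(record))

    def query(self, **criteria: object) -> List[Record]:
        # Return records whose fields match every given criterion.
        return [
            r for r in self._records
            if all(r.get(key) == value for key, value in criteria.items())
        ]


def to_csv(records: List[Record]) -> str:
    """One transfer format; it never touches the store's internals."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=sorted(records[0]), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records: List[Record]) -> str:
    """A second transfer format, added without changing the store."""
    return json.dumps(records, sort_keys=True)


# New destinations are registered here; storage code stays unchanged.
EXPORTERS: Dict[str, Callable[[List[Record]], str]] = {
    "csv": to_csv,
    "json": to_json,
}


def export(store: LocalStore, fmt: str, **criteria: object) -> str:
    return EXPORTERS[fmt](store.query(**criteria))


# Made-up sample records, for illustration only.
store = LocalStore()
store.add({"site": "A", "analyte": "nitrate", "value": 1.2})
store.add({"site": "B", "analyte": "nitrate", "value": 0.7})

csv_text = export(store, "csv", site="A")
json_text = export(store, "json", analyte="nitrate")
```

Adding a new destination then means writing one more exporter function and registering it, with no change to how the data are stored or queried.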
16. Toward more data provenance –
the audit trail back to the raw data
• NEON uses engineering units at raw level, post-processes calibrations
to user units – actually separates the measurement from the
calibration process.
• Change tracking of data and a formal way to document derived
products – another separation of processes.
• Versioning, retention of prior versions of data. Ability to recreate
status at a point in the past. DOI for each query.
• Versioning of all post-processing algorithms into derived units or
transport formats. Checksums to confirm data are unchanged, or at
least changed as documented.
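A minimal sketch of these ideas, under stated assumptions: raw readings are stored untouched, calibrations are versioned objects applied afterward, and a SHA-256 checksum ties each derived product back to the raw data. The linear gain/offset calibration, the names, and the sample readings are hypothetical and do not describe NEON's or STEWARDS' actual processing.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Calibration:
    """A versioned calibration (assumed linear here for simplicity)."""
    version: str
    gain: float
    offset: float

    def apply(self, raw: float) -> float:
        return self.gain * raw + self.offset


def checksum(values: List[float]) -> str:
    """SHA-256 of a canonical JSON encoding of the values."""
    payload = json.dumps(list(values)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def derive(raw: List[float], cal: Calibration) -> Dict[str, object]:
    """Build a derived product that records its own provenance."""
    return {
        "calibration_version": cal.version,
        "raw_sha256": checksum(raw),
        "values": [cal.apply(x) for x in raw],
    }


# Made-up raw readings in engineering units, for illustration only.
raw_readings = [0.50, 1.25, 2.00]
raw_digest = checksum(raw_readings)

# Prior calibration versions are retained, so past products can be recreated.
calibrations = {
    "v1": Calibration("v1", gain=2.0, offset=0.0),
    "v2": Calibration("v2", gain=2.0, offset=0.5),
}

product_v1 = derive(raw_readings, calibrations["v1"])
product_v2 = derive(raw_readings, calibrations["v2"])

# The raw data are unchanged if the stored digest still matches.
raw_unchanged = checksum(raw_readings) == product_v1["raw_sha256"]
```

Because the raw readings are never overwritten, both `product_v1` and `product_v2` carry the same `raw_sha256`, and either can be regenerated at any later date from the raw data plus the named calibration version.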
17. Conclusions
• It is beastly difficult to measure impact of data
• Use often appears to go uncited
• Download numbers may not reflect use
• Downloads, and sometimes citations, are
episodic for reasons not always clear
• Regional issues of concern affect interest in the data
• Visibility and accessibility matter