1. What is eScience, and where does it go
from here?
eScience 2019, 25 September 2019
Daniel S. Katz
(d.katz@ieee.org, http://danielskatz.org, @danielskatz)
Assistant Director for Scientific
Software & Applications, NCSA
Research Associate Professor,
CS, ECE, iSchool
2. e-Science in 2000
• “In November 2000 the Director General of UK Research Councils, Dr John Taylor,
announced £98M funding for a new UK e-Science programme.
• “In the future, e-Science will refer to the large scale science that will increasingly be carried
out through distributed global collaborations enabled by the Internet. Typically, a feature of
such collaborative scientific enterprises is that they will require access to very large data
collections, very large scale computing resources and high performance visualisation back
to the individual user scientists.
• “The World Wide Web gave us access to information on Web pages written in html
anywhere on the Internet. A much more powerful infrastructure is needed to support
e-Science. Besides information stored in Web pages, scientists will need easy access to
expensive remote facilities, to computing resources - either as dedicated Teraflop computers
or cheap collections of PCs - and to information stored in dedicated databases.
• “The Grid is an architecture proposed to bring all these issues together and make a reality of
such a vision for e-Science.”
https://web.archive.org/web/20040818222850/http://www.rcuk.ac.uk/escience_old/firstphase.shtml
“e-Science is about global collaboration in key
areas of science and the next generation of
infrastructure that will enable it” -- Dr John Taylor
3. e-Science in 2005
• Next generation of scientific research and experiments will be carried out by
communities of researchers from organizations that span national boundaries
• Activities will involve geographically distributed and heterogeneous resources such
as computational systems, scientific instruments, databases, sensors, software
components, networks, and people
• Such large-scale and enhanced scientific endeavors, popularly termed
e-Science, are carried out via collaborations on a global scale
• “Grid computing has emerged as one of the key computing paradigms that enable
the creation and management of Internet-based utility computing infrastructure,
called Cyberinfrastructure, for realization of e-Science and e-Business at the global
level”
https://web.archive.org/web/20160911013712/http://www.cloudbus.org/escience/cfp.html
4. eScience in 2012
• When planning the 2012 IEEE eScience conference:
• We decided that the key distinguishing element of eScience was
joint use of advances in infrastructure and advances in the use of
that infrastructure to make advances in scholarly research (which
we called science)
• We used this to organize the call for papers and the sessions
• Some papers focused more on
advances in the infrastructure
• Some more on advances in using it
• All papers had to have some
combination of these advances that
advanced research
5. Naming ourselves
• We called this eScience because of its tie to eInfrastructure, sometimes called
cyberinfrastructure
6. Digressing to defining cyberinfrastructure
• For an expansive view of what is infrastructure, we can use Craig
Stewart’s definition of cyberinfrastructure:
• “Computing systems, data storage systems, advanced
instruments and data repositories, visualization environments,
and people, all linked together by software and high performance
networks to improve research productivity and enable
breakthroughs not otherwise possible”
• Note that this contains the same goal of advancing research as
eScience.
https://doi.org/10.1145/1878335.1878347
7. Naming ourselves
• We called this eScience because of its tie to eInfrastructure, sometimes called
cyberinfrastructure
• But as with other e* and i* and cyber* things, the eScience name feels a little dated
• We need a new name
• Is eScience just Science now?
• Probably not
• Coupling of advances in digital infrastructure with advances in how the infrastructure is used
still differs from much of research
• Plus, science in English is more limited than Wissenschaft in German
• Use research or scholarship or scholarly research instead?
• Thinking of the inherent research & infrastructure symbiosis (cf. lichen), maybe:
• Research and Infrastructure Development Symbiosis (RaIDS)
8. Coupling advances in infrastructure and research
• Sometimes a collaboration
• Though not always a single team
• Or even at a single time; can be a gap between paired advances
• Useful to examine these collaborations and interactions
• Specifically, the human aspects
9. Coupled advances
• For this type of progress to be made:
• Infrastructure developers have to be aware of infrastructure users’
potential needs
• Infrastructure users have to be aware of infrastructure developers’
possible offerings
• This requires communication
• After some advance in both that leads to progress in research, a
match may be declared, where the infrastructure advance needs to be
given credit for the research progress
• I think we as a community need to talk about these aspects,
communication and credit
• Due to limited time, I will mostly focus on communications
• Overall challenge: how can we improve them to improve eScience?
10. Push & Pull
• Developers make advances in infrastructure, intended to enable
better/more advanced scholarship
• Need to decide what advances to make
• Based on what they think will be used
• Based on what they think is possible
• Need to communicate this to researchers
• Advances in scholarship depend on advances in infrastructure
• Researchers need to decide what infrastructure advances to try/invest in
• Based on which are robust
• Based on which are sustainable
• Based on what others are using
• Based on what they think is possible
• Need to communicate their needs to infrastructure developers
• Idea: is our role to work between these two groups?
12. Problem 2: Thinking of what is possible is hard
https://www.youtube.com/watch?v=jWTGsUyv8IE
https://www.chron.com/neighborhood/bayarea/news/article/When-Boris-Yeltsin-went-grocery-shopping-in-Clear-5759129
“‘Even the Politburo doesn't have this choice. Not even Mr. Gorbachev,’ he
said. When he was told through his interpreter that there were thousands of
items in the store for sale he didn't believe it. He had even thought that the
store was staged, a show for him.”
https://www.pinterest.com/pin/439593613604653533/
13. Problem 3: Too much …
https://www.etsy.com/listing/99710612/monkey-see-monkey-do-grown-up-t-shirt
14. Push and pull today
• Today, eScience (RaIDS) highlights collaborations that have
succeeded, e.g.
• Pegasus & Montage: https://pegasus.isi.edu/application-showcase/montage/
• Charm++ & NAMD: “Chapter 5: NAMD: Scalable Molecular Dynamics Based
on the Charm++ Parallel Runtime System”
• through papers and workshops
• Idea: Can we improve these papers by requiring them to state the
advances on which they are based?
• How can RaIDS promote communication that encourages
additional successful collaborations?
15. Push and pull solutions through communication
• Idea: Can RaIDS bring in elements of matchmaking?
• Facilitated discussions, aka charrettes, ideas labs, sandpits
• Community road map and white paper processes (e.g. NASA Earth Science,
HEP software)
• Decadal surveys (e.g., astronomy)
• Idea: Can RaIDS promote/improve/standardize catalogs?
• XSEDE services: https://www.xsede.org/ecosystem/services
• ELIXIR’s bio.tools
• Idea: Can RaIDS promote infrastructure likely to be useful, e.g., via a
technology showcase, and research needs, e.g., via a research
challenges showcase?
16. Quick foray into credit
• I don’t want to take this bus
• We need to better tie research advances to
the infrastructure (computing, data, software,
etc.) advances that enable them
• In our current system, the best way is joint authorship
for synchronous collaborations
• eScience is a good platform for this now
• And via citation for asynchronous collaborations
• New standards emerging for data and software citation, ideas for computing
(instrument) citations
• Idea: RaIDS should encourage these via appropriate guidance to authors
and reviewers
17. Recap
• We aren’t a grid conference nor one on global-scale work, nor do we
focus only on science; we need a new name
• I suggest Research and Infrastructure Development Symbiosis
(RaIDS), but other ideas are equally welcome
• Ideas for future years of RaIDS
• Focus on role of attendees as people between infrastructure & applications
• Improve papers by requiring them to state advances on which they are based
• Bring in elements of matchmaking intended to lead to new collaborations
• Promote/improve/standardize catalogs
• Promote infrastructure likely to be useful, e.g., a technology showcase, and
research needs, e.g., a research challenges showcase
• Encourage infrastructure citation via appropriate guidance to authors and
reviewers