SlideShare una empresa de Scribd logo
1 de 137
Descargar para leer sin conexión
Sandra	
  Gesing	
  
sandra.gesing@nd.edu	
  
	
  
cHiPSet	
  Training	
  School	
  2016	
  
22	
  September	
  2016	
  
Science	
  Gateways	
  –	
  	
  
Leveraging	
  Modeling	
  and	
  
SimulaDons	
  in	
  HPC	
  Infrastructures	
  
via	
  Increased	
  Usability	
  	
  	
  
University	
  of	
  Notre	
  Dame	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  2	
  
• 	
  In	
  the	
  middle	
  of	
  nowhere	
  of	
  northern	
  Indiana	
  	
  
	
  (1.5	
  h	
  from	
  Chicago)	
  
• 	
  4	
  undergraduate	
  colleges	
  	
  
• 	
  ~35	
  research	
  insDtutes	
  and	
  centers	
  
• 	
  ~12,000	
  students	
  
Modeling	
  and	
  SimulaDons	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  3	
  
•  Genomics	
  
•  Proteomics	
  
•  Metabolomics	
  
•  Immunomics	
  
•  System	
  biology	
  
•  Molecular	
  simulaDons	
  
•  Docking	
  
•  Epidemiology	
  
•  …	
  
Black	
  Swallowtail	
  –	
  	
  
larvae	
  and	
  buVerfly	
  
The	
  Genomics	
  Boom	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  4	
  
February	
  16,	
  2001	
  
	
  biotech	
  company	
  Celera	
  	
  
February	
  15,	
  2001	
  
The	
  Human	
  Genome	
  Project	
  	
  
The	
  Genomics	
  Boom	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  5	
  
Craig	
  Venter	
  (le[)	
  and	
  Francis	
  Collins	
  (right)	
  
Big	
  Data	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  6	
  
• 	
  Explosion	
  in	
  the	
  quanDty,	
  variety	
  and	
  complexity	
  of	
  
	
  data	
  	
  
• 	
  QuesDons	
  can	
  be	
  answered	
  impossible	
  to	
  even	
  ask	
  
	
  about	
  10	
  years	
  ago	
  
• 	
  Costs	
  far	
  reduced	
  (e.g.,	
  Human	
  Genome	
  project,	
  15	
  
	
  years,	
  ~$2	
  billion;	
  today	
  ~3	
  days,	
  $1000)	
  
Big	
  Data	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  7	
  
hVp://www.genome.gov/images/content/cost_per_genome_oct2015.jpg	
  
Modeling	
  and	
  SimulaDons	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  8	
  
Workflows	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  9	
  
12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa
Slide	
  copied	
  from:	
  Stuart	
  Owen	
  „Workflows	
  with	
  Taverna“	
  
A	
  sequence	
  of	
  connected	
  steps	
  in	
  a	
  defined	
  order	
  	
  
based	
  on	
  their	
  control	
  and	
  data	
  dependencies	
  
Workflow	
  Systems	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  10	
  
• 	
  Different	
  workflow	
  concepts	
  
• 	
  Different	
  workflow	
  languages	
  
• 	
  Different	
  workflow	
  constructs	
  	
  
	
  
	
  
Taverna	
  
Workflow	
  Editors	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  11	
  
• 	
  Different	
  technologies	
  (workbenches,	
  web-­‐based)	
  	
  
• 	
  Different	
  look-­‐and-­‐feel	
  
	
  
State	
  of	
  the	
  Art	
  	
  	
  
Data	
  and	
  compute-­‐	
  
intensive	
  problems	
  
High-­‐speed	
  networks	
  
Users	
  generally	
  not	
  
IT	
  specialists	
  Tools	
  and	
  workflow	
  
engines	
  
Web-­‐based	
  	
  
agile	
  frameworks	
   Distributed	
  data	
  and	
  	
  
compuDng	
  infrastructures	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  12	
  
Challenge	
  for	
  Developers	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  13	
  
Data	
  and	
  compute-­‐	
  
intensive	
  problems	
  
High-­‐speed	
  networks	
  Tools	
  and	
  workflow	
  
engines	
  
Web-­‐based	
  	
  
agile	
  frameworks	
   Distributed	
  data	
  and	
  	
  
compuDng	
  infrastructures	
  
Users	
  generally	
  not	
  
IT	
  specialists	
  
Need	
  for	
  intuiDve	
  and	
  self-­‐explanatory	
  user	
  
interfaces!	
  
Challenge	
  for	
  Developers	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  14	
  
Data	
  and	
  compute-­‐	
  
intensive	
  problems	
  
High-­‐speed	
  networks	
  Tools	
  and	
  workflow	
  
engines	
  
Web-­‐based	
  	
  
agile	
  frameworks	
   Distributed	
  data	
  and	
  	
  
compuDng	
  infrastructures	
  
Users	
  generally	
  not	
  
IT	
  specialists	
  
Usability	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  15	
  
	
  
	
  
“A[er	
  all,	
  usability	
  really	
  just	
  means	
  that	
  making	
  sure	
  
that	
  something	
  works	
  well:	
  that	
  a	
  person	
  …	
  can	
  use	
  the	
  
thing	
  -­‐	
  whether	
  it's	
  a	
  Web	
  site,	
  a	
  fighter	
  jet,	
  or	
  a	
  
revolving	
  door	
  -­‐	
  for	
  its	
  intended	
  purpose	
  without	
  geqng	
  
hopelessly	
  frustrated.”	
  	
  
(Steve	
  Krug	
  in	
  “Don't	
  make	
  me	
  	
  
think!:	
  A	
  Common	
  Sense	
  Approach	
  
to	
  Web	
  Usability”,	
  2005)	
  
Reusability	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  16	
  
“The	
  key	
  to	
  producDvity	
  is	
  reusability.	
  The	
  easiest	
  way	
  to	
  	
  
produce	
  code	
  is	
  obviously	
  to	
  have	
  it	
  already!"	
  	
  
(John	
  R.	
  Bourne	
  in	
  “Object-­‐oriented	
  Engineering:	
  Building	
  Engineering	
  	
  
Systems	
  Using	
  Smalltalk-­‐80”,	
  1992)	
  
Reproducibility	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  17	
  
“The	
  closeness	
  of	
  agreement	
  between	
  independent	
  
results	
  obtained	
  with	
  the	
  same	
  method	
  on	
  idenDcal	
  
test	
  material	
  but	
  under	
  different	
  condiDons	
  
(different	
  operators,	
  different	
  apparatus,	
  different	
  
laboratories	
  and/or	
  a[er	
  different	
  intervals	
  of	
  Dme)
…”	
  
(IUPAC	
  (InternaDonal	
  Union	
  of	
  Pure	
  and	
  Applied	
  Chemistry	
  iupac.org)	
  GoldBook)	
  
Reproducibility	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  18	
  
“The	
  closeness	
  of	
  agreement	
  between	
  independent	
  
results	
  obtained	
  with	
  the	
  same	
  method	
  on	
  idenDcal	
  
test	
  material	
  but	
  under	
  different	
  condiDons	
  
(different	
  operators,	
  different	
  apparatus,	
  different	
  
laboratories	
  and/or	
  a[er	
  different	
  intervals	
  of	
  Dme)
…”	
  
(IUPAC	
  (InternaDonal	
  Union	
  of	
  Pure	
  and	
  Applied	
  Chemistry	
  iupac.org)	
  GoldBook)	
  
Reusability	
  vs.	
  Reproducibility	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  19	
  
Efficiency	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  20	
  
•  Time	
  
•  ComputaDonal	
  resources	
  
•  Money	
  
	
  
Science	
  Gateways	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  21	
  
science gateway /sī′ əәns gāt′ wā′/ n.
1. an online community space for science and engineering research and
education.
2. a Web-based resource for accessing data, software, computing services, and
equipment specific to the needs of a science or engineering discipline.
 Why	
  are	
  Science	
  Gateways	
  Important?	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  22	
  
•  Increased	
  complexity	
  of	
  
–  today’s	
  research	
  quesDons	
  
–  hardware	
  and	
  so[ware	
  
–  skills	
  required	
  
•  Greater	
  need	
  for	
  
openness	
  and	
  
reproducibility	
  
–  Science	
  increasingly	
  driving	
  
policy	
  quesDons	
  
•  Opportunity	
  to	
  integrate	
  
research	
  with	
  teaching	
  
–  BeVer	
  workforce	
  
preparaDon	
  
We	
  need	
  interfaces	
  
that	
  provide	
  	
  
broad	
  access	
  to	
  	
  
advanced	
  resources	
  	
  
and	
  	
  
allow	
  all	
  to	
  tackle	
  
today’s	
  challenging	
  
science	
  ques9ons.	
  
Science	
  Gateways	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  23	
  
Science	
  Gateways	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  24	
  
Science	
  Gateways	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  25	
  
It’s	
  a	
  
Science	
  
Gateway	
  
It’s	
  a	
  
Research	
  
Portal	
  
It’s	
  a	
  
Collaboratory	
  
It’s	
  a	
  
Cyberinfrastructure	
  
It’s	
  
e-­‐Science	
  
eResearch	
  
It’s	
  a	
  
Virtual	
  	
  
Lab	
  
Frameworks	
  and	
  APIs	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  26	
  
Re-­‐invenDng	
  is	
  not	
  always	
  necessary..	
  
Frameworks	
  and	
  APIs	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  27	
  
...	
  and	
  users	
  should	
  get	
  more	
  features	
  easily...	
  
Frameworks	
  and	
  APIs	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  28	
  
...	
  but	
  the	
  model	
  should	
  fit	
  to	
  the	
  demands	
  of	
  the	
  
community	
  
BioinformaDc	
  Infrastructure	
  Survey	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  29	
  
QuesDons	
  around	
  frustraDon	
  and	
  limitaDons	
  of	
  using	
  
• 	
  BioinformaDc	
  so[ware	
  
• 	
  BioinformaDc	
  resources	
  
• 	
  HPC	
  and	
  Cloud	
  infrastructures	
  
and	
  about	
  challenges	
  to	
  train	
  students	
  in	
  bioinformaDcs	
  
	
  
Answers	
  o[en	
  address	
  
•  Hurdles	
  to	
  use	
  bioinformaDc	
  resources	
  because	
  of	
  
commandline	
  access	
  or	
  not	
  available	
  so[ware	
  
•  Quality	
  of	
  documentaDon	
  of	
  so[ware	
  
•  Need	
  for	
  parsers	
  and	
  converters	
  for	
  diverse	
  data	
  formats	
  
•  Long	
  waiDng	
  Dme	
  for	
  support	
  or	
  even	
  lack	
  of	
  support	
  
	
  
BioinformaDc	
  Infrastructure	
  Survey	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  30	
  
• 	
  Nick	
  Loman	
  
	
  (Birmingham,	
  UK)	
  	
  
•  Thomas	
  Connor	
  	
  
	
  (Cardiff,	
  UK)	
  
	
  
• 	
  October	
  2015	
  
• 	
  272	
  answers	
  
	
  
hVps://drive.google.com/drive/folders/0B7KZv1TRi06fLUJCU1BYM3JScjg	
  
	
  
BioinformaDc	
  Infrastructure	
  Survey	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  31	
  
BioinformaDc	
  Infrastructure	
  Survey	
  
0" 20" 40" 60" 80" 100" 120"
Cloud"
Ins0tu0on2wide"resource"
Local"resource"
Personal"computer"
Where	
  do	
  bioinformaDcians	
  do	
  most	
  of	
  their	
  work	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  32	
  
BioinformaDc	
  Infrastructure	
  Survey	
  
0" 20" 40" 60" 80" 100" 120"
Cloud"
Ins0tu0on2wide"resource"
Local"resource"
Personal"computer"
0.00%$ 10.00%$20.00%$30.00%$40.00%$50.00%$60.00%$70.00%$80.00%$90.00%$
Best$for$job$
Good$documenta>on$
Word$of$mouth$recommenda>on$
Used$in$similar$analysis$
Quickest$
Already$installed$on$server$
Other$
Graphical$interface$
Where	
  do	
  bioinformaDcians	
  do	
  most	
  of	
  their	
  work	
  
Why	
  do	
  bioinformaDcians	
  use	
  the	
  so[ware	
  they	
  use	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  33	
  
BioinformaDc	
  Infrastructure	
  Survey	
  
0" 20" 40" 60" 80" 100" 120"
Cloud"
Ins0tu0on2wide"resource"
Local"resource"
Personal"computer"
0.00%$ 10.00%$20.00%$30.00%$40.00%$50.00%$60.00%$70.00%$80.00%$90.00%$
Best$for$job$
Good$documenta>on$
Word$of$mouth$recommenda>on$
Used$in$similar$analysis$
Quickest$
Already$installed$on$server$
Other$
Graphical$interface$
Where	
  do	
  bioinformaDcians	
  do	
  most	
  of	
  their	
  work	
  
Why	
  do	
  bioinformaDcians	
  use	
  the	
  so[ware	
  they	
  use	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  34	
  
A	
  Typical	
  Life	
  Cycle	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  35	
  
Early	
  
adopters	
  
Publicity	
  
Wider	
  
adopDon	
  
Funding	
  
ends	
  
ScienDsts	
  
disillusioned	
  
New	
  
project	
  
prototype	
  
Gateways	
  enable	
  research,	
  but	
  are	
  not	
  research	
  projects	
  themselves…	
  	
  
Sustainability	
  is	
  a	
  problem…	
  
Science	
  Gateways	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  36	
  
A	
  new	
  era…	
  
•  Novel	
  developments	
  of	
  web-­‐based	
  agile	
  
frameworks	
  
•  Infrastructure	
  providers	
  report	
  that	
  science	
  
gateways	
  are	
  more	
  used	
  than	
  commandlines	
  
	
  
Science	
  Gateways	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  37	
  
A	
  new	
  era…	
  
	
  
Gateways
Login
hVps://www.xsede.org/	
  
Science	
  Gateways	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  38	
  
A	
  new	
  era…	
  
•  Novel	
  developments	
  of	
  web-­‐based	
  agile	
  
frameworks	
  
•  Infrastructure	
  providers	
  report	
  that	
  science	
  
gateways	
  are	
  more	
  used	
  than	
  commandlines	
  
But	
  also	
  always	
  new	
  challenges…	
  
•  Novel	
  infrastructures	
  
•  Novel	
  data	
  sources	
  like	
  NGS	
  sequencing	
  
machines,	
  telescopes	
  such	
  as	
  the	
  Square	
  
Kilometre	
  Array	
  (SKA)	
  (will	
  create	
  data	
  rates	
  in	
  
exa-­‐scale	
  size)	
  	
  
è Support	
  of	
  developers	
  necessary	
  	
  
Science	
  Gateways	
  Community	
  InsDtute	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  39	
  
hVp://sciencegateways.org	
  
•  Diverse	
  experDse	
  on	
  
demand	
  
•  Longer	
  term	
  support	
  
engagements	
  
•  So[ware	
  and	
  visibility	
  for	
  
gateways	
  
•  InformaDon	
  exchange	
  in	
  a	
  
community	
  environment	
  
•  Student	
  opportuniDes	
  
and	
  more	
  stable	
  	
  
career	
  paths	
  
	
  
Science	
  Gateway	
  Survey	
  2014	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  40	
  
•  29,000-­‐person	
  survey	
  	
  
•  4957	
  responses	
  from	
  across	
  domains	
  
Science	
  Gateway	
  Survey	
  2014	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  41	
  
n	
  of	
  applicaDon	
  types=7,805,	
  	
  
by	
  2,756	
  creators	
  (out	
  of	
  2,819);	
  
mean=2.8	
  applicaDon	
  types	
  per	
  
applicaDon	
  creator	
  
Science	
  Gateway	
  Survey	
  2014	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  42	
  
34% 36%
20%
17%
31%
26%
42%
16%
30%
18%
45% 44%
14% 15%
0%
5%
10%
15%
20%
25%
30%
35%
40%
45%
50%
Usability
Consultant
Graphic
Designer
Community
Liaison/
Evangelist
Project
Manager
Professional
Software
Developer
Security
Expert
Quality
Assurance
and	
  Testing
Expert
Wished	
  we	
  had	
  this
Yes,	
  we	
  had	
  this
n=2,756	
  respondents	
  or	
  98%	
  of	
  applicaDon	
  creators	
  
Science	
  Gateway	
  Survey	
  2014	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  43	
  
What	
  services	
  	
  
would	
  be	
  helpful?	
  
Science	
  Gateway	
  Technologies	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  44	
  
• 	
  Content	
  management	
  systems	
  (Drupal)	
  
• 	
  Libraries	
  for	
  implementaDon	
  (Django)	
  
• 	
  Portal	
  frameworks	
  (Liferay)	
  
• 	
  Science	
  gateway	
  frameworks	
  (WS-­‐PGRADE,	
  Galaxy)	
  
• 	
  StaDc	
  layout	
  
• 	
  Layout	
  extendable	
  
• 	
  Workflow-­‐enabled	
  
• 	
  APIs	
  for	
  implementaDon	
  (Apache	
  Airavata,	
  Agave)	
  
	
  
	
  
Drupal	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  45	
  
VectorBase	
  -­‐	
  Example	
  for	
  Drupal	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  46	
  
VectorBase	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  47	
  
VectorBase	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  48	
  
Django	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  49	
  
VecNet	
  –	
  Example	
  for	
  Django	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  50	
  
VecNet	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  51	
  
VecNet	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  52	
  
VecNet	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  53	
  
Liferay	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  54	
  
Portal	
  framework	
  
• 	
  AuthenDcaDon	
  (e.g.,	
  OpenSSO,	
  CAS)	
  
• 	
  AuthorizaDon	
  	
  
• 	
  Standards	
  compliant	
  	
  
• 	
  JSR168/286	
  
• 	
  Web	
  services	
  
• 	
  Web	
  2.0	
  websites	
  
• 	
  Web	
  Publishing	
  and	
  Shared	
  Workspaces	
  
• 	
  CollaboraDon	
  
• 	
  Social	
  Networking	
  
	
  
WS-­‐PGRADE	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  55	
  
User	
  Interface	
  
WS-­‐PGRADE	
  
Liferay	
  
DCI	
  Resources	
  	
  
Middleware	
  Layer	
  
High-­‐Level	
  
Middleware	
  
Service	
  Layer	
  
gUSE	
  
MoSGrid	
  as	
  WS-­‐PGRADE	
  Example	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  56	
  
Molecular	
  SimulaDon	
  Grid	
  
• 	
  	
  Science	
  gateway	
  integrated	
  with	
  underlying	
  
	
  compute	
  and	
  data	
  management	
  infrastructure	
  	
  	
  
• 	
  	
  Distributed	
  workflow	
  management	
  
• 	
  	
  Data	
  repository	
  
• 	
  	
  Metadata	
  management	
  
MoSGrid	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  57	
  
MoSGrid	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  58	
  
MoSGrid	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  59	
  
MoSGrid	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  60	
  
MoSGrid	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  61	
  
MoSGrid	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  62	
  
MoSGrid	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  63	
  
MoSGrid	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  64	
  
Molecular	
  Dynamics	
  
• 	
  	
  Study	
  and	
  simulaDon	
  of	
  molecular	
  moDon	
  
Quantum	
  Chemistry	
  
• 	
  	
  Study	
  and	
  simulaDon	
  of	
  molecular	
  electronic	
  
	
  behavior	
  relaDve	
  to	
  their	
  chemical	
  reacDvity	
  
Docking	
  
• 	
  	
  Main	
  focus	
  on	
  evaluaDon	
  of	
  ligand-­‐receptor	
  
	
  interacDons	
  (e.g.,	
  for	
  drug	
  design)	
  
MoSGrid	
  -­‐	
  Metadata	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  65	
  
•  Molecular	
  SimulaDon	
  Markup	
  Language	
  (MSML)	
  	
  
•  CML	
  compliant	
  
•  Template	
  for	
  each	
  and	
  every	
  workflow	
  
•  Molecular	
  input	
  
•  Domain	
  specific	
  tools	
  
•  Job	
  configuraDon	
  
•  OpDmized	
  structures,	
  trajectories,	
  energies,	
  …	
  
•  SemanDc	
  search	
  (Apache	
  Lucene)	
  
MoSGrid	
  -­‐	
  Metadata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  66	
  
MoSGrid	
  -­‐	
  Metadata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  67	
  
MoSGrid	
  –	
  VisualizaDon	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  68	
  
TesDng	
  of	
  ChemDoodle	
  and	
  MolCAD	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
   	
   	
  	
  
	
   	
   	
  	
  web.chemdoodle.com	
   molcad.de	
  
MoSGrid	
  –	
  Basic	
  Workflow	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  69	
  
Job	
  
DefiniHon	
  
ApplicaHon	
  
Input	
  
ExecuHon	
  
Meta-­‐	
  
processing	
  
Job	
  
Submission	
  
ApplicaHon	
  
Output	
  
Post-­‐	
  
processing	
  
Output	
  
Portal	
  
User-­‐	
  
Input	
  
Grid	
  
Resource	
  
MoSGrid	
  –	
  QC	
  Portlet	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  70	
  
•  Specialised	
  interface	
  for	
  quantum	
  chemistry	
  
so[ware	
  (Gaussian,	
  NWChem,	
  ORCA)	
  
•  Basic	
  workflows	
  
•  Easy	
  GeneraDon	
  or	
  Uploading	
  of	
  Input	
  Files	
  
•  Parsing	
  of	
  result	
  files	
  
MoSGrid	
  –	
  MD	
  Portlet	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  71	
  
MoSGrid	
  –	
  Docking	
  Portlet	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  72	
  
MoSGrid	
  –	
  Docking	
  Portlet	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  73	
  
Galaxy	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  74	
  
Galaxy	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  75	
  
RNA-­‐Seq	
  Analysis	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  76	
  
Apache	
  Airavata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  77	
  
•  Airavata	
  is	
  a	
  general	
  purpose	
  distributed	
  system	
  
so[ware	
  framework	
  build	
  on	
  micro-­‐service	
  and	
  
component	
  based	
  architecture	
  principles	
  	
  
•  Airavata	
  provides	
  capabiliDes	
  to	
  compose,	
  
manage,	
  execute	
  and	
  monitor	
  large	
  scale	
  
applicaDons	
  and	
  workflows	
  on	
  distributed	
  
compuDng	
  resources	
  
•  Airavata	
  supports	
  execuDons	
  on	
  local	
  clusters,	
  
naDonal	
  grids,	
  academic	
  and	
  commercial	
  clouds	
  	
  
•  Airavata	
  is	
  inherently	
  mulD-­‐tenanted	
  
Apache	
  Airavata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  78	
  
Apache	
  Airavata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  79	
  
•  External	
  clients	
  interact	
  with	
  Airavata	
  API	
  (based	
  on	
  
Apache	
  Thri[)	
  
•  Internally,	
  components	
  interact	
  with	
  each	
  other	
  through	
  
Component	
  Programming	
  Interfaces	
  (thri[-­‐based	
  CPIs)	
  
Apache	
  Airavata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  80	
  
Clean	
  way	
  to	
  define	
  IDLs	
  with	
  
richer	
  data	
  structures	
  
SciGap	
  –	
  Example	
  for	
  Apache	
  Airavata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  81	
  
Science	
  Gateway	
  Pla•orm	
  as	
  a	
  Service	
  
Apache	
  Airavata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  82	
  
Science	
  Gateway	
  Pla•orm	
  as	
  a	
  Service	
  (SciGaP)	
  
User	
  IdenDty	
  
Management	
  
InformaDon,	
  Monitoring	
  &	
  
AudiDng	
  
ApplicaDon	
  Programmer	
  Interface	
  
CIPRES	
  
Science	
  Gateways	
  
Neuro	
  
Science	
  
Ultrascan	
   BioVLAB	
  GAAMP	
  
DES	
  
SimWG	
  
Param	
  
Chem	
  
Graphical	
  Interfaces	
   Admin	
  Dashboards	
  
XSEDE	
   OSG	
  
Future	
  
Grid	
  
Data	
  
Nets	
  
Campus	
  
Clusters	
  
Academic	
  &	
  
Commercial	
  
Clouds	
  
InternaDonal	
  
Grids	
  
Data	
  &	
  Provenance	
  
Management	
  
Scalable	
   Secure	
  Load	
  Balanced	
   Configurable	
   Fault	
  Tolerant	
   Maintainable	
   Performance	
  
Job	
  &	
  Workflow	
  
Management	
  
Apache	
  Airavata	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  83	
  
Community	
  Hangout	
  	
  
Mailing	
  lists:	
  
architecture@airavata.apache.org	
  
dev@airavata.apache.org	
  
users@airavata.apache.org	
  	
  
Extend	
  Airavata	
  from	
  your	
  
project	
  or	
  extend	
  your	
  project	
  
from	
  Airavata	
  
Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  84	
  
Agave	
  is	
  a	
  Science-­‐as-­‐a-­‐Service	
  web	
  API	
  pla•orm	
  
Run	
  scienHfic	
  codes	
  	
  
•  your	
  own	
  or	
  community	
  provided	
  codes	
  
...on	
  HPC,	
  HTC,	
  or	
  cloud	
  resources	
  	
  
•  your	
  own,	
  shared,	
  or	
  commercial	
  systems	
  
...and	
  manage	
  your	
  data	
  	
  
•  reliable,	
  mulD-­‐protocol,	
  async	
  data	
  movement	
  
...from	
  the	
  web	
  	
  
•  webhooks,	
  rest,	
  json,	
  cors,	
  oauth2	
  
...and	
  remember	
  how	
  you	
  did	
  it	
  	
  
•  deep	
  provenance,	
  history,	
  and	
  reproducibility	
  built	
  in	
  
	
  
Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  85	
  
•  MulDtenant	
  
•  Hosted	
  idenDty	
  
management	
  
•  Supports	
  mulDple	
  IdP	
  
•  OAuth2/OIDC	
  server	
  
•  API	
  Management	
  
•  Hosted	
  or	
  on	
  premise	
  
•  VerDcal	
  SSO	
  
•  AnalyDcs	
  and	
  reporDng	
  
•  Developer	
  resources	
  
•  MulDple	
  SDK	
  &	
  CLI	
  
•  Reference	
  gateway	
  
•  White	
  labeled	
  
•  100%	
  open	
  source	
  
Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  86	
  
Used	
  to	
  power	
  web	
  &	
  mobile	
  
applicaDons	
  
Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  87	
  
Used	
  to	
  extend	
  exisDng	
  
processes	
  
Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  88	
  
(Re)Introducing	
  the	
  	
  
Micro	
  App	
  Paradigm	
  
Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  89	
  
Agave	
  Delivers	
  Process-­‐as-­‐a-­‐
Service	
  
Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  90	
  
iPlant	
  –	
  Example	
  for	
  Agave	
  API	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  91	
  
Agave	
  API	
  -­‐	
  Tutorials	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  92	
  
Agave	
  API	
  -­‐	
  Tutorials	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  93	
  
CollaboraDon	
  on	
  Science	
  Gateways	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  94	
  
Crucial	
  Topics	
  
•  Close	
  collaboraDon	
  with	
  user	
  communiDes	
  
•  Knowledge	
  about	
  available	
  technical	
  soluDons	
  
	
  
Sounds	
  easy	
  but…	
  
•  Requirements	
  of	
  user	
  communiDes	
  o[en	
  not	
  so	
  
clear	
  
•  Technologies	
  someDmes	
  sDll	
  under	
  development	
  
for	
  certain	
  building	
  blocks	
  
è Slow	
  uptake	
  of	
  soluDons	
  	
  
è Larger	
  effort	
  for	
  creaDng	
  science	
  gateways	
  
New	
  Science	
  Gateways	
  -­‐	
  Checklist	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  95	
  
DISCUSSION	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
OrganizaDonal	
  
Aspects	
  
Technical	
  
Aspects	
  
Domain-­‐Specific	
  
Aspects	
  
Developers	
   Domain	
  Experts	
  
New	
  Science	
  Gateways	
  -­‐	
  Checklist	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  96	
  
Domain-­‐specific	
  aspects:	
  
•  Goal,	
  target	
  area	
  and	
  target	
  users	
  	
  
•  Visions/demands	
  on	
  the	
  layout	
  
•  PrioriDes	
  of	
  features	
  and	
  opDons,	
  e.g.,	
  a	
  list	
  
from	
  must-­‐have	
  to	
  great-­‐to-­‐have	
  opDons	
  
•  IntegraDon	
  of	
  exisDng	
  applicaDons	
  or	
  
development	
  of	
  applicaDons	
  
•  Technologies	
  of	
  the	
  applicaDons	
  
•  VisualizaDon	
  
•  Security	
  demands	
  
•  Workflows	
  
New	
  Science	
  Gateways	
  -­‐	
  Checklist	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  97	
  
OrganizaDonal	
  aspects:	
  
•  Time	
  constraints	
  for	
  the	
  development,	
  
agreement	
  on	
  a	
  (maybe	
  even	
  rough)	
  project	
  
plan	
  with	
  milestones	
  	
  
•  Agreement	
  on	
  alpha-­‐	
  or	
  beta-­‐tester	
  
•  Regular	
  meeDngs	
  	
  
New	
  Science	
  Gateways	
  -­‐	
  Checklist	
  	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  98	
  
Technical	
  aspects:	
  
•  Experience	
  with	
  exisDng	
  frameworks	
  and	
  
programming	
  languages	
  
•  Available	
  infrastructure	
  including	
  security	
  
infrastructure	
  and	
  resources	
  
•  Available	
  support	
  of	
  suitable	
  technologies	
  
•  Scalability	
  of	
  suitable	
  technologies	
  
•  Effort	
  for	
  extending	
  exisDng	
  technologies	
  
compared	
  to	
  novel	
  developments	
  	
  
•  Synergy	
  effects	
  with	
  other	
  science	
  gateway	
  
projects	
  
Challenges	
  	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  99	
  
A	
  world-­‐wide	
  research	
  compuDng	
  infrastructure	
  
•  Transparent	
  service	
  selecDon	
  
•  e.g.,	
  Docker	
  could	
  be	
  part	
  of	
  the	
  soluDon	
  
•  Access	
  to	
  data	
  irrespecDve	
  of	
  locaDon	
  
•  OpDons	
  to	
  share	
  data	
  efficiently	
  
•  Appropriate	
  privacy	
  and	
  security	
  measures	
  
•  OpDmized	
  usage	
  of	
  resources	
  
•  e.g.,	
  opDmized	
  usage	
  of	
  cloud	
  compuDng	
  and	
  their	
  
business	
  models	
  
Researchers	
  
Sandra	
  Gesing 	
   	
  	
  	
  	
  	
  	
  	
  	
   	
   	
   	
  	
  	
  	
  	
   	
  	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  
~7	
  million	
  researchers	
  world	
  wide	
  
	
  
hVp://chartsbin.com/view/1124	
  
High-­‐Speed	
  Network	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  101	
  
Challenges	
  	
  
IntegraDon	
  of	
  data	
  sources	
  and	
  instruments	
  
•  Different	
  data	
  formats	
  
•  Different	
  interfaces	
  
•  Different	
  hardwares	
  and	
  technologies	
  
	
  
…	
  from	
  small	
  ones	
  to	
  the	
  big	
  ones…	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  102	
  
Challenges	
  	
  
So[ware	
  searchability,	
  reproducibility	
  and	
  reusability	
  
•  Science	
  gateways	
  step	
  in	
  the	
  right	
  direcDon	
  but	
  …	
  
much	
  more	
  work	
  necessary	
  on	
  searchibility…	
  Not	
  only	
  
finding	
  any	
  data	
  for	
  a	
  research	
  area	
  but	
  finding	
  the	
  right	
  
data	
  
•  Metadata	
  approaches	
  
•  DicDonaries	
  
•  More	
  involvement	
  of	
  	
  
librarians	
  
	
  
	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  103	
  
Challenges	
  	
  
So[ware	
  searchability,	
  reproducibility	
  and	
  reusability	
  
•  Science	
  gateways	
  step	
  in	
  the	
  right	
  direcDon	
  but	
  …	
  
much	
  more	
  work	
  necessary	
  on	
  reproducibility	
  and	
  
reusability…	
  	
  
•  studies	
  in	
  medicine	
  and	
  pharmacology:	
  11%	
  or	
  6%	
  of	
  the	
  
analysed	
  research	
  was	
  reproducible	
  	
  
•  myExperiment:	
  only	
  20%	
  of	
  workflows	
  reusable	
  because	
  
of	
  dependencies	
  on	
  hardware,	
  local	
  or	
  distributed	
  data,	
  
so[ware	
  versions	
  	
  
	
  
	
  	
  
	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  104	
  
Challenges	
  	
  
So[ware	
  searchability,	
  reproducibility	
  and	
  reusability	
  
•  Science	
  gateways	
  and	
  workflow	
  systems	
  step	
  in	
  the	
  
right	
  direcDon	
  but	
  …	
  
much	
  more	
  work	
  necessary	
  on	
  reproducibility	
  and	
  
reusability…	
  	
  
	
  
•  ContainerizaDon	
  approaches	
  
•  MigraDon	
  approaches	
  
•  CombinaDon	
  of	
  both	
  
	
  	
  
	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  105	
  
Challenges	
  –	
  Novel	
  and	
  Old...	
  	
  	
  	
  
…	
  require	
  novel	
  soluDons!	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  106	
  
Projects	
  -­‐	
  OSF	
  
	
  
•  Big	
  Data	
  
•  Reproducibility	
  
	
  
Open	
  Access	
  to	
  Data	
  and	
  Projects	
  could	
  solve	
  
parts	
  of	
  the	
  problems…	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  107	
  
Workflow	
  Enhancements	
  
•  Logical	
  level:	
  Meta-­‐workflows	
  
Herres-­‐Pawlis,	
  S.,	
  Hoffmann,	
  A.,	
  Rösener,	
  T.,	
  Krüger,	
  J.,	
  Grunzke,	
  R.,	
  and	
  Gesing,	
  S.	
  “MulD-­‐layer	
  Meta-­‐
metaworkflows	
  for	
  the	
  EvaluaDon	
  of	
  Solvent	
  and	
  Dispersion	
  Effects	
  in	
  TransiDon	
  Metal	
  Systems	
  Using	
  the	
  
MoSGrid	
  Science	
  Gateways”Science	
  Gateways	
  (IWSG),	
  2015	
  7th	
  InternaDonal	
  Workshop	
  on,	
  pp.47-­‐52,	
  3-­‐5	
  June	
  
2015,	
  IEEE	
  Xplore,	
  doi:	
  10.1109/IWSG.2015.13	
  	
  
•  System	
  level:	
  CombinaDon	
  of	
  strengths	
  of	
  workflow	
  
systems	
  
Hazekamp,	
  N.,	
  Sarro,	
  J.,	
  Choudhury,	
  O.,	
  Gesing,	
  S.,	
  ScoV	
  Emrich	
  and	
  Thain,	
  D.	
  “Scaling	
  Up	
  BioinformaDcs	
  
Workflows	
  with	
  Dynamic	
  Job	
  Expansion:	
  A	
  Case	
  Study	
  Using	
  Galaxy	
  and	
  Makeflow”,	
  e-­‐Science	
  (e-­‐Science),	
  2015	
  
IEEE	
  11th	
  InternaDonal	
  Conference	
  on,	
  pp.332-­‐341,	
  Aug.	
  31	
  2015-­‐Sept.	
  4	
  2015	
  
•  PredicDon:	
  Model	
  for	
  opDmizaDon	
  of	
  tasks	
  and	
  
threads	
  
Choudhury,	
  O.,	
  Rajan,	
  D.,	
  Hazekamp,	
  N.,	
  Gesing,	
  S.,	
  Thain,	
  D.,	
  and	
  Emrich,	
  S.	
  “Balancing	
  Thread-­‐level	
  and	
  Task-­‐
level	
  Parallelism	
  for	
  Data-­‐Intensive	
  Workloads	
  on	
  Clusters	
  and	
  Clouds”,	
  Cluster	
  CompuDng	
  (CLUSTER),	
  2015	
  IEEE	
  
InternaDonal	
  Conference	
  on,	
  pp.390-­‐393,	
  8-­‐11	
  Sept.	
  2015,	
  doi:10.1109/CLUSTER.2015.60	
  
	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  108	
  
Science	
  Case:	
  PolymerisaDon	
  catalysts	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  109	
  
TranslaDon	
  into	
  Workflows	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  110	
  
TranslaDon	
  into	
  Workflows	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  111	
  
Meta-­‐Workflows	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  112	
  
TranslaDon	
  into	
  Meta-­‐Workflows	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  113	
  
Scaling	
  Up	
  Workflows	
  	
  
#	
  Machines	
   #	
  Cores	
  
Data	
  
ParHHoning	
  
Save	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  114	
  
Scaling	
  Up	
  Workflows	
  	
  
Galaxy	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  115	
  
Genome	
  Sequencing	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  116	
  
•  Finding	
  precise	
  order	
  of	
  nucleoDdes	
  within	
  a	
  DNA	
  
molecule	
  
•  A	
  (adenine),	
  G	
  (guanine),	
  C	
  (cytosine),	
  and	
  T	
  (thymine)	
  
(Human	
  genome	
  over	
  3	
  billion	
  of	
  nucleoDdes)	
  
	
  
Genome	
  Sequencing	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  117	
  
Let’s	
  imagine	
  a	
  party	
  game.	
  The	
  game	
  is	
  a	
  guessing	
  
game.	
  Here	
  is	
  how	
  it	
  is	
  played:	
  
You	
  are	
  thinking	
  of	
  a	
  number	
  and	
  the	
  group	
  has	
  to	
  guess	
  
it.	
  The	
  tricky	
  part	
  is	
  that	
  the	
  number	
  is	
  200-­‐digits	
  in	
  
length.	
  You	
  are	
  reading	
  the	
  digits	
  of	
  the	
  number	
  in	
  your	
  
head	
  without	
  making	
  a	
  sound.	
  	
  Every	
  so	
  o[en	
  a	
  person	
  
interrupts	
  you,	
  and	
  you	
  tell	
  them	
  the	
  single	
  digit	
  you	
  
were	
  just	
  thinking	
  and	
  where	
  it	
  is	
  in	
  the	
  sequence	
  of	
  
200.	
  Each	
  Dme	
  you	
  are	
  interrupted,	
  you	
  have	
  to	
  start	
  
again.	
  You	
  leave	
  a[er	
  a	
  few	
  hours	
  and	
  the	
  group	
  has	
  to	
  
figure	
  out	
  the	
  200-­‐digit	
  number.	
  They	
  have	
  to	
  piece	
  
together	
  the	
  informaDon	
  you	
  gave	
  them,	
  for	
  example	
  
the	
  25th	
  number	
  was	
  5,	
  the	
  40th	
  number	
  was	
  0,	
  and	
  so	
  
on.	
  Using	
  the	
  informaDon	
  from	
  their	
  interrupDons,	
  they	
  
can	
  repeat	
  the	
  number	
  they	
  gave	
  you.	
  
Scaling	
  Up	
  Workflows	
  	
  
Simple	
  Workflow	
  in	
  Galaxy	
  
Problem:	
  As	
  Size	
  increases	
  so	
  does	
  Time	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  118	
  
Scaling	
  Up	
  Workflows	
  	
  
Workflow	
  with	
  Parallelism	
  added	
  in	
  Galaxy	
  
Problem:	
  Tools	
  must	
  be	
  updated	
  every	
  
change	
  in	
  Parallelism/Relies	
  on	
  ScienDst	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  119	
  
Scaling	
  Up	
  Workflows	
  	
  
Workflow	
  Dynamically	
  Expanded	
  behind	
  Galaxy	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  120	
  
Scaling	
  Up	
  Workflows	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  121	
  
Scaling	
  Up	
  Workflows	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  122	
  
Scaling	
  Up	
  Workflows	
  	
  
Makeflow	
  
•  Task	
  Structure	
  
INPUTS	
  :	
  OUTPUTS	
  
	
  COMMAND	
  
	
  
•  Directed	
  Acyclic	
  Graph	
  (DAG)	
  
•  ProgrammaDcally	
  Generated	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  123	
  
Scaling	
  Up	
  Workflows	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  124	
  
Scaling	
  Up	
  Workflows	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  125	
  
Scaling	
  Up	
  Workflows	
  	
  
Job	
  Sandbox	
  –	
  Log	
  file	
  creaDon	
  for	
  cleanup	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  126	
  
Scaling	
  Up	
  Workflows	
  	
  
Dynamic	
  Job	
  Expansion	
  	
  
	
  
•  Work	
  Queue:	
  we	
  uDlized	
  100s	
  of	
  cores	
  from	
  a	
  	
  
Condor	
  Pool	
  
•  Cleaning	
  Sandbox	
  using	
  knowledge	
  of	
  intermediates	
  
	
  and	
  logging	
  
•  Explored	
  methods	
  to	
  transmit	
  needed	
  environments	
  	
  
such	
  as	
  	
  
executables	
  and	
  Java	
  
	
  
61.5X	
  speed-­‐up	
  on	
  32	
  GB	
  dataset	
  uDlizing	
  these	
  methods	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  127	
  
Thread-­‐level	
  and	
  Task-­‐level	
  Parallelism	
  	
  
•  Develop predictive performance models for an
application domain
•  Achieve acceptable performance the first time
•  Optimize resource utilization
•  Execution time
•  Memory usage
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  128	
  
Thread-­‐level	
  and	
  Task-­‐level	
  Parallelism	
  	
  
•  WorkQueue master-worker framework
•  Sun Grid Engine (SGE) batch system
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  129	
  
Thread-­‐level	
  and	
  Task-­‐level	
  Parallelism	
  	
  
1.  ApplicaDon-­‐level	
  model	
  for	
  Dme:
𝑇(𝑅, 𝑄, 𝑁)=   𝛽1​ 𝑅 𝑄/𝑁 +   𝛽2	
  
2.  ApplicaDon-­‐level	
  model	
  for	
  memory:	
  	
  
𝑀(𝑅, 𝑁)=  γ1R  +γ2N	
  
3.  System-­‐level	
  model	
  for	
  Dme:	
  	
  
𝑇𝑇𝑜𝑡𝑎𝑙= 𝜂1​ 𝑄 𝐾/𝐷 + 𝜂2(​ 𝑄/𝐵 +​ 𝑅 𝐾𝑁/𝐵𝐶 )+ 𝜂3T(R,​ 𝑄/𝐾 , 𝑁)∗​ 𝐾 𝑁/𝑀𝐶 +	
   𝜂4​ 𝑂/𝐵 + 𝜂5​ 𝑂 𝐾/𝐷   	
  
4.  System-­‐level	
  model	
  for	
  memory:	
  	
  
𝑀𝑀𝑎𝑠𝑡𝑒𝑟(𝑅, 𝑄)=ϕ1R  +ϕ2Q	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  130	
  
Thread-­‐level	
  and	
  Task-­‐level	
  Parallelism	
  	
  
7	
  data	
  
points	
  
(R)	
  
7	
  data	
  
points	
  
(Q)	
  
7	
  data	
  
points	
  
(N)	
  
343	
  data	
  
points	
  
Data	
  
CollecHon	
  
Training	
  
data	
  
Regression	
  	
  
Model	
  
Training	
  
Accuracy	
  
Test	
  
MAPE	
  
TesHng	
  
Regression	
  
Coefficient
s	
  
TesDng	
  
data	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  131	
  
Thread-­‐level	
  and	
  Task-­‐level	
  Parallelism	
  	
  
Avg.  MAPE
=  3.1
MAPE	
  =	
  Mean	
  Absolute	
  Percentage	
  Error	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  132	
  
Thread-­‐level	
  and	
  Task-­‐level	
  Parallelism	
  	
  
For the given dataset, K* = 90, N* = 4
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  133	
  
Result	
  	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
   	
   	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  134	
  
# Cores/
Task
#Tasks
Predicted
Time
(min)
Speedup
Estimated
EC2
Cost ($)
Estimated
Azure
Cost ($)
1 360 70 6.6 50.4 64.8
2 180 38 12.3 25.2 32.4
4 90 24 19.5 18.9 32.4
8 45 27 17.3 18.9 32.4
InformaDon	
  on	
  Science	
  Gateways	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  135	
  
• 	
  Science	
  Gateway	
  InsDtute	
  
	
  hVp://sciencegateways.org	
  
• 	
  Science	
  Gateway	
  Workshops	
  	
  
	
  Europe:	
  IWSG	
  -­‐	
  hVp://iwsg.info	
  
USA:	
  GCE	
  -­‐	
  hVp://sciencegateways.org	
  
Australasia:	
  IWSG-­‐A	
  -­‐	
  hVp://iwsg.info	
  
• 	
  IEEE	
  Technical	
  Area	
  on	
  Science	
  Gateways	
  	
  
	
   	
  hVp://ieeesciencegateways.org	
  
• 	
  XSEDE	
  Science	
  Gateways	
  
	
  hVps://www.xsede.org/gateways-­‐overview	
  
• 	
  CRC	
  Science	
  Gateways	
  
	
  hVps://crc.nd.edu/index.php/research/gateways	
  
	
  
Exercises	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  136	
  
QuesDons	
  and	
  exercises	
  at	
  
hVp://bit.ly/2dlkySW	
  
	
  
Data	
  at	
  	
  
hVp://bit.ly/2cTwKaN	
  
Sandra	
  Gesing 	
   	
   	
  	
  	
  	
  	
  Science	
  Gateways 	
   	
   	
   	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  137	
  
sandra.gesing@nd.edu	
  

Más contenido relacionado

Destacado

Profile VTM GROUP
Profile VTM GROUPProfile VTM GROUP
Profile VTM GROUPsusan Hang
 
Administrador de Infraestructuras Ferroviarias (ADIF)
Administrador de Infraestructuras Ferroviarias (ADIF)Administrador de Infraestructuras Ferroviarias (ADIF)
Administrador de Infraestructuras Ferroviarias (ADIF)Editorial CEP
 
Prevencion de riesgos laborales
Prevencion de riesgos laboralesPrevencion de riesgos laborales
Prevencion de riesgos laboralesLUGUENSI
 
final senior thesis 2015
final senior thesis 2015final senior thesis 2015
final senior thesis 2015Romina Mollo
 
Planificación de una parada presentación 7º encuentro directores de proyecto ...
Planificación de una parada presentación 7º encuentro directores de proyecto ...Planificación de una parada presentación 7º encuentro directores de proyecto ...
Planificación de una parada presentación 7º encuentro directores de proyecto ...Manuel Parra Palacios, MSc, PMP
 
3. Manipulación manual de cargas
3. Manipulación manual de cargas3. Manipulación manual de cargas
3. Manipulación manual de cargasTVPerú
 
Nathan Hudson resume
Nathan Hudson resumeNathan Hudson resume
Nathan Hudson resumeNathan Hudson
 
Riesgos en el sector Construcción. Movimiento de Tierras
Riesgos en el sector Construcción. Movimiento de TierrasRiesgos en el sector Construcción. Movimiento de Tierras
Riesgos en el sector Construcción. Movimiento de TierrasSandra Echavarri Jaudenes
 
Seguridad en excavaciones y zanjas
Seguridad en excavaciones y zanjasSeguridad en excavaciones y zanjas
Seguridad en excavaciones y zanjasYanet Caldas
 
Presentacion curso project
Presentacion curso projectPresentacion curso project
Presentacion curso projectjuanyfrancis
 

Destacado (19)

Cuci
CuciCuci
Cuci
 
Catabolismo1
Catabolismo1Catabolismo1
Catabolismo1
 
Sgci iwsg-a-10-10-16
Sgci iwsg-a-10-10-16Sgci iwsg-a-10-10-16
Sgci iwsg-a-10-10-16
 
1ª supercopa dos campeões
1ª supercopa dos campeões1ª supercopa dos campeões
1ª supercopa dos campeões
 
Apresentação.
Apresentação. Apresentação.
Apresentação.
 
Profile VTM GROUP
Profile VTM GROUPProfile VTM GROUP
Profile VTM GROUP
 
Administrador de Infraestructuras Ferroviarias (ADIF)
Administrador de Infraestructuras Ferroviarias (ADIF)Administrador de Infraestructuras Ferroviarias (ADIF)
Administrador de Infraestructuras Ferroviarias (ADIF)
 
Sistemas medio
Sistemas medioSistemas medio
Sistemas medio
 
Prevencion de riesgos laborales
Prevencion de riesgos laboralesPrevencion de riesgos laborales
Prevencion de riesgos laborales
 
final senior thesis 2015
final senior thesis 2015final senior thesis 2015
final senior thesis 2015
 
Riscos
RiscosRiscos
Riscos
 
Planificación de una parada presentación 7º encuentro directores de proyecto ...
Planificación de una parada presentación 7º encuentro directores de proyecto ...Planificación de una parada presentación 7º encuentro directores de proyecto ...
Planificación de una parada presentación 7º encuentro directores de proyecto ...
 
3. Manipulación manual de cargas
3. Manipulación manual de cargas3. Manipulación manual de cargas
3. Manipulación manual de cargas
 
Slideshare
SlideshareSlideshare
Slideshare
 
Nathan Hudson resume
Nathan Hudson resumeNathan Hudson resume
Nathan Hudson resume
 
Hart's Concept of Law
Hart's Concept of LawHart's Concept of Law
Hart's Concept of Law
 
Riesgos en el sector Construcción. Movimiento de Tierras
Riesgos en el sector Construcción. Movimiento de TierrasRiesgos en el sector Construcción. Movimiento de Tierras
Riesgos en el sector Construcción. Movimiento de Tierras
 
Seguridad en excavaciones y zanjas
Seguridad en excavaciones y zanjasSeguridad en excavaciones y zanjas
Seguridad en excavaciones y zanjas
 
Presentacion curso project
Presentacion curso projectPresentacion curso project
Presentacion curso project
 

Similar a Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability

Science Gateways for Life Sciences – Balancing Usability and Re-Usability
Science Gateways for Life Sciences – Balancing Usability and Re-Usability Science Gateways for Life Sciences – Balancing Usability and Re-Usability
Science Gateways for Life Sciences – Balancing Usability and Re-Usability Sandra Gesing
 
SGCI - S2I2: Science Gateways Community Institute
SGCI - S2I2: Science Gateways Community InstituteSGCI - S2I2: Science Gateways Community Institute
SGCI - S2I2: Science Gateways Community InstituteSandra Gesing
 
Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2Dan Taylor
 
Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402vrij
 
New Trends and Directions in Data Science - MIT Information Quality Conferenc...
New Trends and Directions in Data Science - MIT Information Quality Conferenc...New Trends and Directions in Data Science - MIT Information Quality Conferenc...
New Trends and Directions in Data Science - MIT Information Quality Conferenc...Mario Faria
 
Open Science and Executable Papers
Open Science and Executable PapersOpen Science and Executable Papers
Open Science and Executable PapersJose Enrique Ruiz
 
g-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking Functionalityg-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking FunctionalityNicholas Loulloudes
 
Mexico talk foster march 2012
Mexico talk foster march 2012Mexico talk foster march 2012
Mexico talk foster march 2012Ian Foster
 
About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...Nane Kratzke
 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptxIan Foster
 
GSA Presentation - MILLER 251-4.pdf
GSA Presentation - MILLER 251-4.pdfGSA Presentation - MILLER 251-4.pdf
GSA Presentation - MILLER 251-4.pdfRaoul Miller
 
Data-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudData-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudOla Spjuth
 
BeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN sessionBeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN sessionNick Jones
 
Digital Science: Towards the executable paper
Digital Science: Towards the executable paperDigital Science: Towards the executable paper
Digital Science: Towards the executable paperJose Enrique Ruiz
 
Science Engagement: A Non-Technical Approach to the Technical Divide
Science Engagement: A Non-Technical Approach to the Technical DivideScience Engagement: A Non-Technical Approach to the Technical Divide
Science Engagement: A Non-Technical Approach to the Technical DivideCybera Inc.
 

Similar a Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability (20)

Science Gateways for Life Sciences – Balancing Usability and Re-Usability
Science Gateways for Life Sciences – Balancing Usability and Re-Usability Science Gateways for Life Sciences – Balancing Usability and Re-Usability
Science Gateways for Life Sciences – Balancing Usability and Re-Usability
 
SGCI - S2I2: Science Gateways Community Institute
SGCI - S2I2: Science Gateways Community InstituteSGCI - S2I2: Science Gateways Community Institute
SGCI - S2I2: Science Gateways Community Institute
 
Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2
 
Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402Gridforum David De Roure Newe Science 20080402
Gridforum David De Roure Newe Science 20080402
 
New Trends and Directions in Data Science - MIT Information Quality Conferenc...
New Trends and Directions in Data Science - MIT Information Quality Conferenc...New Trends and Directions in Data Science - MIT Information Quality Conferenc...
New Trends and Directions in Data Science - MIT Information Quality Conferenc...
 
Open Science and Executable Papers
Open Science and Executable PapersOpen Science and Executable Papers
Open Science and Executable Papers
 
Sgci nsf-2-22-17
Sgci nsf-2-22-17Sgci nsf-2-22-17
Sgci nsf-2-22-17
 
g-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking Functionalityg-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking Functionality
 
Mexico talk foster march 2012
Mexico talk foster march 2012Mexico talk foster march 2012
Mexico talk foster march 2012
 
About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...About an Immune System Understanding for Cloud-native Applications - Biology ...
About an Immune System Understanding for Cloud-native Applications - Biology ...
 
Foster CRA March 2022.pptx
Foster CRA March 2022.pptxFoster CRA March 2022.pptx
Foster CRA March 2022.pptx
 
Sgci data west 12-15-16
Sgci data west 12-15-16Sgci data west 12-15-16
Sgci data west 12-15-16
 
GSA Presentation - MILLER 251-4.pdf
GSA Presentation - MILLER 251-4.pdfGSA Presentation - MILLER 251-4.pdf
GSA Presentation - MILLER 251-4.pdf
 
Summary of 3DPAS
Summary of 3DPASSummary of 3DPAS
Summary of 3DPAS
 
Sgci data west 12-15-16
Sgci data west 12-15-16Sgci data west 12-15-16
Sgci data west 12-15-16
 
Data-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudData-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and Cloud
 
BeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN sessionBeSTGRID OpenGridForum 29 GIN session
BeSTGRID OpenGridForum 29 GIN session
 
Digital Science: Towards the executable paper
Digital Science: Towards the executable paperDigital Science: Towards the executable paper
Digital Science: Towards the executable paper
 
Science Engagement: A Non-Technical Approach to the Technical Divide
Science Engagement: A Non-Technical Approach to the Technical DivideScience Engagement: A Non-Technical Approach to the Technical Divide
Science Engagement: A Non-Technical Approach to the Technical Divide
 
Knowledge Graphs for Scholarly Communication
Knowledge Graphs for Scholarly CommunicationKnowledge Graphs for Scholarly Communication
Knowledge Graphs for Scholarly Communication
 

Más de Sandra Gesing

The Reasons Why the Science Gateways Community Needs an Institute
The Reasons Why the Science Gateways Community Needs an InstituteThe Reasons Why the Science Gateways Community Needs an Institute
The Reasons Why the Science Gateways Community Needs an InstituteSandra Gesing
 
Bridging Gaps and Broadening Participation in Today's and Future Research Com...
Bridging Gaps and Broadening Participation inToday's and Future Research Com...Bridging Gaps and Broadening Participation inToday's and Future Research Com...
Bridging Gaps and Broadening Participation in Today's and Future Research Com...Sandra Gesing
 
SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...
SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...
SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...Sandra Gesing
 
Sustainability of HPC Research Computing: Fostering career paths for facilit...
Sustainability of HPC Research Computing:  Fostering career paths for facilit...Sustainability of HPC Research Computing:  Fostering career paths for facilit...
Sustainability of HPC Research Computing: Fostering career paths for facilit...Sandra Gesing
 
URSSI - SGCI - PresQT: Research Software and Science Gateways: Addressing Su...
URSSI - SGCI - PresQT: Research Software and Science Gateways:  Addressing Su...URSSI - SGCI - PresQT: Research Software and Science Gateways:  Addressing Su...
URSSI - SGCI - PresQT: Research Software and Science Gateways: Addressing Su...Sandra Gesing
 
SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...
SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...
SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...Sandra Gesing
 
The Conceptualization of URSSI - How You Can Engage
The Conceptualization of URSSI - How You Can EngageThe Conceptualization of URSSI - How You Can Engage
The Conceptualization of URSSI - How You Can EngageSandra Gesing
 
SGCI-URSSI-Sustainability in Research Computing
SGCI-URSSI-Sustainability in Research ComputingSGCI-URSSI-Sustainability in Research Computing
SGCI-URSSI-Sustainability in Research ComputingSandra Gesing
 
SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...
SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...
SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...Sandra Gesing
 
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...Sandra Gesing
 
SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...Sandra Gesing
 
SGCI - Science Gateways: An Overview
SGCI - Science Gateways: An OverviewSGCI - Science Gateways: An Overview
SGCI - Science Gateways: An OverviewSandra Gesing
 
SGCI - Science Gateways Community Institute: Software Registry
SGCI - Science Gateways Community Institute: Software RegistrySGCI - Science Gateways Community Institute: Software Registry
SGCI - Science Gateways Community Institute: Software RegistrySandra Gesing
 
SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...
SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...
SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...Sandra Gesing
 
SGCI - RDA - Sustainability of Collaborative Platforms
SGCI - RDA - Sustainability of Collaborative PlatformsSGCI - RDA - Sustainability of Collaborative Platforms
SGCI - RDA - Sustainability of Collaborative PlatformsSandra Gesing
 
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...Sandra Gesing
 
SGCI - The Science Gateways Community Institute: Going Beyond Borders
SGCI - The Science Gateways Community Institute: Going Beyond BordersSGCI - The Science Gateways Community Institute: Going Beyond Borders
SGCI - The Science Gateways Community Institute: Going Beyond BordersSandra Gesing
 
SGCI Science Gateways: Ushering in a New Era of Sustainability
SGCI Science Gateways: Ushering in a New Era of Sustainability SGCI Science Gateways: Ushering in a New Era of Sustainability
SGCI Science Gateways: Ushering in a New Era of Sustainability Sandra Gesing
 
SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...
SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...
SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...Sandra Gesing
 
SGCI Science Gateways: Addressing Data Management Challenges
SGCI Science Gateways: Addressing Data Management ChallengesSGCI Science Gateways: Addressing Data Management Challenges
SGCI Science Gateways: Addressing Data Management ChallengesSandra Gesing
 

Más de Sandra Gesing (20)

The Reasons Why the Science Gateways Community Needs an Institute
The Reasons Why the Science Gateways Community Needs an InstituteThe Reasons Why the Science Gateways Community Needs an Institute
The Reasons Why the Science Gateways Community Needs an Institute
 
Bridging Gaps and Broadening Participation in Today's and Future Research Com...
Bridging Gaps and Broadening Participation inToday's and Future Research Com...Bridging Gaps and Broadening Participation inToday's and Future Research Com...
Bridging Gaps and Broadening Participation in Today's and Future Research Com...
 
SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...
SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...
SGCI and URSSI: Xpert Network – Exchanging Best Practices in Supporting Compu...
 
Sustainability of HPC Research Computing: Fostering career paths for facilit...
Sustainability of HPC Research Computing:  Fostering career paths for facilit...Sustainability of HPC Research Computing:  Fostering career paths for facilit...
Sustainability of HPC Research Computing: Fostering career paths for facilit...
 
URSSI - SGCI - PresQT: Research Software and Science Gateways: Addressing Su...
URSSI - SGCI - PresQT: Research Software and Science Gateways:  Addressing Su...URSSI - SGCI - PresQT: Research Software and Science Gateways:  Addressing Su...
URSSI - SGCI - PresQT: Research Software and Science Gateways: Addressing Su...
 
SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...
SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...
SGCI - URSSI - Science Gateways for Electronics, Photonics and Magnetics: Ach...
 
The Conceptualization of URSSI - How You Can Engage
The Conceptualization of URSSI - How You Can EngageThe Conceptualization of URSSI - How You Can Engage
The Conceptualization of URSSI - How You Can Engage
 
SGCI-URSSI-Sustainability in Research Computing
SGCI-URSSI-Sustainability in Research ComputingSGCI-URSSI-Sustainability in Research Computing
SGCI-URSSI-Sustainability in Research Computing
 
SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...
SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...
SGCI - URSSI - Research Software Engineers, Science Gateway Developers and Cy...
 
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
SGCI - Science Gateways - Technology-Enhanced Research Under Consideration of...
 
SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...SGCI - The Science Gateways Community Institute: International Collaboration ...
SGCI - The Science Gateways Community Institute: International Collaboration ...
 
SGCI - Science Gateways: An Overview
SGCI - Science Gateways: An OverviewSGCI - Science Gateways: An Overview
SGCI - Science Gateways: An Overview
 
SGCI - Science Gateways Community Institute: Software Registry
SGCI - Science Gateways Community Institute: Software RegistrySGCI - Science Gateways Community Institute: Software Registry
SGCI - Science Gateways Community Institute: Software Registry
 
SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...
SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...
SGCI - Science Gateways Bootcamp: Strategies for Developing, Operating and Su...
 
SGCI - RDA - Sustainability of Collaborative Platforms
SGCI - RDA - Sustainability of Collaborative PlatformsSGCI - RDA - Sustainability of Collaborative Platforms
SGCI - RDA - Sustainability of Collaborative Platforms
 
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
SGCI - Science Gateways Community Institute: Subsidized Services and Consulta...
 
SGCI - The Science Gateways Community Institute: Going Beyond Borders
SGCI - The Science Gateways Community Institute: Going Beyond BordersSGCI - The Science Gateways Community Institute: Going Beyond Borders
SGCI - The Science Gateways Community Institute: Going Beyond Borders
 
SGCI Science Gateways: Ushering in a New Era of Sustainability
SGCI Science Gateways: Ushering in a New Era of Sustainability SGCI Science Gateways: Ushering in a New Era of Sustainability
SGCI Science Gateways: Ushering in a New Era of Sustainability
 
SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...
SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...
SGCI Science Gateways: Software sustainability via on-campus teams - Webinar ...
 
SGCI Science Gateways: Addressing Data Management Challenges
SGCI Science Gateways: Addressing Data Management ChallengesSGCI Science Gateways: Addressing Data Management Challenges
SGCI Science Gateways: Addressing Data Management Challenges
 

Último

W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 

Último (20)

Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 

Science Gateways – Leveraging Modeling and Simulations in HPC Infrastructures via Increased Usability

  • 1. Sandra  Gesing   sandra.gesing@nd.edu     cHiPSet  Training  School  2016   22  September  2016   Science  Gateways  –     Leveraging  Modeling  and   SimulaDons  in  HPC  Infrastructures   via  Increased  Usability      
  • 2. University  of  Notre  Dame   Sandra  Gesing                                                  2   •   In  the  middle  of  nowhere  of  northern  Indiana      (1.5  h  from  Chicago)   •   4  undergraduate  colleges     •   ~35  research  insDtutes  and  centers   •   ~12,000  students  
  • 3. Modeling  and  SimulaDons   Sandra  Gesing              Science  Gateways                                  3   •  Genomics   •  Proteomics   •  Metabolomics   •  Immunomics   •  System  biology   •  Molecular  simulaDons   •  Docking   •  Epidemiology   •  …   Black  Swallowtail  –     larvae  and  buVerfly  
  • 4. The  Genomics  Boom   Sandra  Gesing                                                  4   February  16,  2001    biotech  company  Celera     February  15,  2001   The  Human  Genome  Project    
  • 5. The  Genomics  Boom   Sandra  Gesing                                                  5   Craig  Venter  (le[)  and  Francis  Collins  (right)  
  • 6. Big  Data   Sandra  Gesing                                                                  6   •   Explosion  in  the  quanDty,  variety  and  complexity  of    data     •   QuesDons  can  be  answered  impossible  to  even  ask    about  10  years  ago   •   Costs  far  reduced  (e.g.,  Human  Genome  project,  15    years,  ~$2  billion;  today  ~3  days,  $1000)  
  • 7. Big  Data   Sandra  Gesing                                                                  7   hVp://www.genome.gov/images/content/cost_per_genome_oct2015.jpg  
  • 8. Modeling  and  SimulaDons   Sandra  Gesing                                              8  
  • 9. Workflows   Sandra  Gesing              Science  Gateways                            9   12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt 12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt 12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct 12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt 12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt 12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt 12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg 12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga 12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc 12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa 12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa Slide  copied  from:  Stuart  Owen  „Workflows  with  Taverna“   A  sequence  of  connected  steps  in  a  defined  order     based  on  their  control  and  data  dependencies  
  • 10. Workflow  Systems   Sandra  Gesing              Science  Gateways                              10   •   Different  workflow  concepts   •   Different  workflow  languages   •   Different  workflow  constructs         Taverna  
  • 11. Workflow  Editors   Sandra  Gesing              Science  Gateways                              11   •   Different  technologies  (workbenches,  web-­‐based)     •   Different  look-­‐and-­‐feel    
  • 12. State  of  the  Art       Data  and  compute-­‐   intensive  problems   High-­‐speed  networks   Users  generally  not   IT  specialists  Tools  and  workflow   engines   Web-­‐based     agile  frameworks   Distributed  data  and     compuDng  infrastructures   Sandra  Gesing                                          12  
  • 13. Challenge  for  Developers     Sandra  Gesing                                              13   Data  and  compute-­‐   intensive  problems   High-­‐speed  networks  Tools  and  workflow   engines   Web-­‐based     agile  frameworks   Distributed  data  and     compuDng  infrastructures   Users  generally  not   IT  specialists   Need  for  intuiDve  and  self-­‐explanatory  user   interfaces!  
  • 14. Challenge  for  Developers     Sandra  Gesing                                              14   Data  and  compute-­‐   intensive  problems   High-­‐speed  networks  Tools  and  workflow   engines   Web-­‐based     agile  frameworks   Distributed  data  and     compuDng  infrastructures   Users  generally  not   IT  specialists  
  • 15. Usability   Sandra  Gesing                                              15       “A[er  all,  usability  really  just  means  that  making  sure   that  something  works  well:  that  a  person  …  can  use  the   thing  -­‐  whether  it's  a  Web  site,  a  fighter  jet,  or  a   revolving  door  -­‐  for  its  intended  purpose  without  geqng   hopelessly  frustrated.”     (Steve  Krug  in  “Don't  make  me     think!:  A  Common  Sense  Approach   to  Web  Usability”,  2005)  
  • 16. Reusability   Sandra  Gesing                                                                                                    16   “The  key  to  producDvity  is  reusability.  The  easiest  way  to     produce  code  is  obviously  to  have  it  already!"     (John  R.  Bourne  in  “Object-­‐oriented  Engineering:  Building  Engineering     Systems  Using  Smalltalk-­‐80”,  1992)  
  • 17. Reproducibility   Sandra  Gesing                                                              17   “The  closeness  of  agreement  between  independent   results  obtained  with  the  same  method  on  idenDcal   test  material  but  under  different  condiDons   (different  operators,  different  apparatus,  different   laboratories  and/or  a[er  different  intervals  of  Dme) …”   (IUPAC  (InternaDonal  Union  of  Pure  and  Applied  Chemistry  iupac.org)  GoldBook)  
  • 18. Reproducibility   Sandra  Gesing                                                              18   “The  closeness  of  agreement  between  independent   results  obtained  with  the  same  method  on  idenDcal   test  material  but  under  different  condiDons   (different  operators,  different  apparatus,  different   laboratories  and/or  a[er  different  intervals  of  Dme) …”   (IUPAC  (InternaDonal  Union  of  Pure  and  Applied  Chemistry  iupac.org)  GoldBook)  
  • 19. Reusability  vs.  Reproducibility   Sandra  Gesing                                                              19  
  • 20. Efficiency   Sandra  Gesing                                              20   •  Time   •  ComputaDonal  resources   •  Money    
  • 21. Science  Gateways   Sandra  Gesing              Science  Gateways                              21   science gateway /sī′ əәns gāt′ wā′/ n. 1. an online community space for science and engineering research and education. 2. a Web-based resource for accessing data, software, computing services, and equipment specific to the needs of a science or engineering discipline.
  • 22.  Why  are  Science  Gateways  Important?   Sandra  Gesing              Science  Gateways                              22   •  Increased  complexity  of   –  today’s  research  quesDons   –  hardware  and  so[ware   –  skills  required   •  Greater  need  for   openness  and   reproducibility   –  Science  increasingly  driving   policy  quesDons   •  Opportunity  to  integrate   research  with  teaching   –  BeVer  workforce   preparaDon   We  need  interfaces   that  provide     broad  access  to     advanced  resources     and     allow  all  to  tackle   today’s  challenging   science  ques9ons.  
  • 23. Science  Gateways   Sandra  Gesing                                          23  
  • 24. Science  Gateways   Sandra  Gesing              Science  Gateways                              24  
  • 25. Science  Gateways   Sandra  Gesing              Science  Gateways                              25   It’s  a   Science   Gateway   It’s  a   Research   Portal   It’s  a   Collaboratory   It’s  a   Cyberinfrastructure   It’s   e-­‐Science   eResearch   It’s  a   Virtual     Lab  
  • 26. Frameworks  and  APIs   Sandra  Gesing                                          26   Re-­‐invenDng  is  not  always  necessary..  
  • 27. Frameworks  and  APIs   Sandra  Gesing                                              27   ...  and  users  should  get  more  features  easily...  
  • 28. Frameworks  and  APIs   Sandra  Gesing                                              28   ...  but  the  model  should  fit  to  the  demands  of  the   community  
  • 29. BioinformaDc  Infrastructure  Survey   Sandra  Gesing                                                              29   QuesDons  around  frustraDon  and  limitaDons  of  using   •   BioinformaDc  so[ware   •   BioinformaDc  resources   •   HPC  and  Cloud  infrastructures   and  about  challenges  to  train  students  in  bioinformaDcs     Answers  o[en  address   •  Hurdles  to  use  bioinformaDc  resources  because  of   commandline  access  or  not  available  so[ware   •  Quality  of  documentaDon  of  so[ware   •  Need  for  parsers  and  converters  for  diverse  data  formats   •  Long  waiDng  Dme  for  support  or  even  lack  of  support    
  • 30. BioinformaDc  Infrastructure  Survey   Sandra  Gesing                                                              30   •   Nick  Loman    (Birmingham,  UK)     •  Thomas  Connor      (Cardiff,  UK)     •   October  2015   •   272  answers     hVps://drive.google.com/drive/folders/0B7KZv1TRi06fLUJCU1BYM3JScjg    
  • 31. BioinformaDc  Infrastructure  Survey   Sandra  Gesing                                                              31  
  • 32. BioinformaDc  Infrastructure  Survey   0" 20" 40" 60" 80" 100" 120" Cloud" Ins0tu0on2wide"resource" Local"resource" Personal"computer" Where  do  bioinformaDcians  do  most  of  their  work   Sandra  Gesing                                                              32  
  • 33. BioinformaDc  Infrastructure  Survey   0" 20" 40" 60" 80" 100" 120" Cloud" Ins0tu0on2wide"resource" Local"resource" Personal"computer" 0.00%$ 10.00%$20.00%$30.00%$40.00%$50.00%$60.00%$70.00%$80.00%$90.00%$ Best$for$job$ Good$documenta>on$ Word$of$mouth$recommenda>on$ Used$in$similar$analysis$ Quickest$ Already$installed$on$server$ Other$ Graphical$interface$ Where  do  bioinformaDcians  do  most  of  their  work   Why  do  bioinformaDcians  use  the  so[ware  they  use   Sandra  Gesing                                                              33  
  • 34. BioinformaDc  Infrastructure  Survey   0" 20" 40" 60" 80" 100" 120" Cloud" Ins0tu0on2wide"resource" Local"resource" Personal"computer" 0.00%$ 10.00%$20.00%$30.00%$40.00%$50.00%$60.00%$70.00%$80.00%$90.00%$ Best$for$job$ Good$documenta>on$ Word$of$mouth$recommenda>on$ Used$in$similar$analysis$ Quickest$ Already$installed$on$server$ Other$ Graphical$interface$ Where  do  bioinformaDcians  do  most  of  their  work   Why  do  bioinformaDcians  use  the  so[ware  they  use   Sandra  Gesing                                                              34  
  • 35. A  Typical  Life  Cycle   Sandra  Gesing              Science  Gateways                              35   Early   adopters   Publicity   Wider   adopDon   Funding   ends   ScienDsts   disillusioned   New   project   prototype   Gateways  enable  research,  but  are  not  research  projects  themselves…     Sustainability  is  a  problem…  
  • 36. Science  Gateways       Sandra  Gesing              Science  Gateways                              36   A  new  era…   •  Novel  developments  of  web-­‐based  agile   frameworks   •  Infrastructure  providers  report  that  science   gateways  are  more  used  than  commandlines    
  • 37. Science  Gateways       Sandra  Gesing              Science  Gateways                              37   A  new  era…     Gateways Login hVps://www.xsede.org/  
  • 38. Science  Gateways       Sandra  Gesing              Science  Gateways                              38   A  new  era…   •  Novel  developments  of  web-­‐based  agile   frameworks   •  Infrastructure  providers  report  that  science   gateways  are  more  used  than  commandlines   But  also  always  new  challenges…   •  Novel  infrastructures   •  Novel  data  sources  like  NGS  sequencing   machines,  telescopes  such  as  the  Square   Kilometre  Array  (SKA)  (will  create  data  rates  in   exa-­‐scale  size)     è Support  of  developers  necessary    
  • 39. Science  Gateways  Community  InsDtute   Sandra  Gesing              Science  Gateways                              39   hVp://sciencegateways.org   •  Diverse  experDse  on   demand   •  Longer  term  support   engagements   •  So[ware  and  visibility  for   gateways   •  InformaDon  exchange  in  a   community  environment   •  Student  opportuniDes   and  more  stable     career  paths    
  • 40. Science  Gateway  Survey  2014     Sandra  Gesing              Science  Gateways                              40   •  29,000-­‐person  survey     •  4957  responses  from  across  domains  
  • 41. Science  Gateway  Survey  2014     Sandra  Gesing              Science  Gateways                              41   n  of  applicaDon  types=7,805,     by  2,756  creators  (out  of  2,819);   mean=2.8  applicaDon  types  per   applicaDon  creator  
  • 42. Science  Gateway  Survey  2014     Sandra  Gesing              Science  Gateways                              42   34% 36% 20% 17% 31% 26% 42% 16% 30% 18% 45% 44% 14% 15% 0% 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% Usability Consultant Graphic Designer Community Liaison/ Evangelist Project Manager Professional Software Developer Security Expert Quality Assurance and  Testing Expert Wished  we  had  this Yes,  we  had  this n=2,756  respondents  or  98%  of  applicaDon  creators  
  • 43. Science  Gateway  Survey  2014     Sandra  Gesing              Science  Gateways                              43   What  services     would  be  helpful?  
  • 44. Science  Gateway  Technologies   Sandra  Gesing              Science  Gateways                              44   •   Content  management  systems  (Drupal)   •   Libraries  for  implementaDon  (Django)   •   Portal  frameworks  (Liferay)   •   Science  gateway  frameworks  (WS-­‐PGRADE,  Galaxy)   •   StaDc  layout   •   Layout  extendable   •   Workflow-­‐enabled   •   APIs  for  implementaDon  (Apache  Airavata,  Agave)      
  • 45. Drupal     Sandra  Gesing              Science  Gateways                              45  
  • 46. VectorBase  -­‐  Example  for  Drupal     Sandra  Gesing              Science  Gateways                              46  
  • 47. VectorBase   Sandra  Gesing              Science  Gateways                              47  
  • 48. VectorBase       Sandra  Gesing              Science  Gateways                              48  
  • 49. Django   Sandra  Gesing              Science  Gateways                              49  
  • 50. VecNet  –  Example  for  Django   Sandra  Gesing              Science  Gateways                              50  
  • 51. VecNet     Sandra  Gesing              Science  Gateways                              51  
  • 52. VecNet     Sandra  Gesing              Science  Gateways                              52  
  • 53. VecNet     Sandra  Gesing              Science  Gateways                              53  
  • 54. Liferay   Sandra  Gesing              Science  Gateways                              54   Portal  framework   •   AuthenDcaDon  (e.g.,  OpenSSO,  CAS)   •   AuthorizaDon     •   Standards  compliant     •   JSR168/286   •   Web  services   •   Web  2.0  websites   •   Web  Publishing  and  Shared  Workspaces   •   CollaboraDon   •   Social  Networking    
  • 55. WS-­‐PGRADE   Sandra  Gesing              Science  Gateways                              55   User  Interface   WS-­‐PGRADE   Liferay   DCI  Resources     Middleware  Layer   High-­‐Level   Middleware   Service  Layer   gUSE  
  • 56. MoSGrid  as  WS-­‐PGRADE  Example   Sandra  Gesing              Science  Gateways                              56   Molecular  SimulaDon  Grid   •     Science  gateway  integrated  with  underlying    compute  and  data  management  infrastructure       •     Distributed  workflow  management   •     Data  repository   •     Metadata  management  
  • 57. MoSGrid     Sandra  Gesing              Science  Gateways                              57  
  • 58. MoSGrid     Sandra  Gesing              Science  Gateways                              58  
  • 59. MoSGrid     Sandra  Gesing              Science  Gateways                              59  
  • 60. MoSGrid     Sandra  Gesing              Science  Gateways                              60  
  • 61. MoSGrid     Sandra  Gesing              Science  Gateways                              61  
  • 62. MoSGrid     Sandra  Gesing              Science  Gateways                              62  
  • 63. MoSGrid     Sandra  Gesing              Science  Gateways                              63  
  • 64. MoSGrid   Sandra  Gesing              Science  Gateways                              64   Molecular  Dynamics   •     Study  and  simulaDon  of  molecular  moDon   Quantum  Chemistry   •     Study  and  simulaDon  of  molecular  electronic    behavior  relaDve  to  their  chemical  reacDvity   Docking   •     Main  focus  on  evaluaDon  of  ligand-­‐receptor    interacDons  (e.g.,  for  drug  design)  
  • 65. MoSGrid  -­‐  Metadata   Sandra  Gesing              Science  Gateways                              65   •  Molecular  SimulaDon  Markup  Language  (MSML)     •  CML  compliant   •  Template  for  each  and  every  workflow   •  Molecular  input   •  Domain  specific  tools   •  Job  configuraDon   •  OpDmized  structures,  trajectories,  energies,  …   •  SemanDc  search  (Apache  Lucene)  
  • 66. MoSGrid  -­‐  Metadata     Sandra  Gesing              Science  Gateways                              66  
  • 67. MoSGrid  -­‐  Metadata     Sandra  Gesing              Science  Gateways                              67  
  • 68. MoSGrid  –  VisualizaDon   Sandra  Gesing              Science  Gateways                              68   TesDng  of  ChemDoodle  and  MolCAD                                web.chemdoodle.com   molcad.de  
  • 69. MoSGrid  –  Basic  Workflow   Sandra  Gesing              Science  Gateways                              69   Job   DefiniHon   ApplicaHon   Input   ExecuHon   Meta-­‐   processing   Job   Submission   ApplicaHon   Output   Post-­‐   processing   Output   Portal   User-­‐   Input   Grid   Resource  
  • 70. MoSGrid  –  QC  Portlet   Sandra  Gesing              Science  Gateways                              70   •  Specialised  interface  for  quantum  chemistry   so[ware  (Gaussian,  NWChem,  ORCA)   •  Basic  workflows   •  Easy  GeneraDon  or  Uploading  of  Input  Files   •  Parsing  of  result  files  
  • 71. MoSGrid  –  MD  Portlet   Sandra  Gesing              Science  Gateways                              71  
  • 72. MoSGrid  –  Docking  Portlet   Sandra  Gesing              Science  Gateways                              72  
  • 73. MoSGrid  –  Docking  Portlet   Sandra  Gesing              Science  Gateways                              73  
  • 74. Galaxy   Sandra  Gesing              Science  Gateways                              74  
  • 75. Galaxy   Sandra  Gesing              Science  Gateways                              75  
  • 76. RNA-­‐Seq  Analysis   Sandra  Gesing              Science  Gateways                              76  
  • 77. Apache  Airavata     Sandra  Gesing              Science  Gateways                              77   •  Airavata  is  a  general  purpose  distributed  system   so[ware  framework  build  on  micro-­‐service  and   component  based  architecture  principles     •  Airavata  provides  capabiliDes  to  compose,   manage,  execute  and  monitor  large  scale   applicaDons  and  workflows  on  distributed   compuDng  resources   •  Airavata  supports  execuDons  on  local  clusters,   naDonal  grids,  academic  and  commercial  clouds     •  Airavata  is  inherently  mulD-­‐tenanted  
  • 78. Apache  Airavata     Sandra  Gesing              Science  Gateways                              78  
  • 79. Apache  Airavata     Sandra  Gesing              Science  Gateways                              79   •  External  clients  interact  with  Airavata  API  (based  on   Apache  Thri[)   •  Internally,  components  interact  with  each  other  through   Component  Programming  Interfaces  (thri[-­‐based  CPIs)  
  • 80. Apache  Airavata     Sandra  Gesing              Science  Gateways                              80   Clean  way  to  define  IDLs  with   richer  data  structures  
  • 81. SciGap  –  Example  for  Apache  Airavata     Sandra  Gesing              Science  Gateways                              81   Science  Gateway  Pla•orm  as  a  Service  
  • 82. Apache  Airavata     Sandra  Gesing              Science  Gateways                              82   Science  Gateway  Pla•orm  as  a  Service  (SciGaP)   User  IdenDty   Management   InformaDon,  Monitoring  &   AudiDng   ApplicaDon  Programmer  Interface   CIPRES   Science  Gateways   Neuro   Science   Ultrascan   BioVLAB  GAAMP   DES   SimWG   Param   Chem   Graphical  Interfaces   Admin  Dashboards   XSEDE   OSG   Future   Grid   Data   Nets   Campus   Clusters   Academic  &   Commercial   Clouds   InternaDonal   Grids   Data  &  Provenance   Management   Scalable   Secure  Load  Balanced   Configurable   Fault  Tolerant   Maintainable   Performance   Job  &  Workflow   Management  
  • 83. Apache  Airavata     Sandra  Gesing              Science  Gateways                              83   Community  Hangout     Mailing  lists:   architecture@airavata.apache.org   dev@airavata.apache.org   users@airavata.apache.org     Extend  Airavata  from  your   project  or  extend  your  project   from  Airavata  
  • 84. Agave  API     Sandra  Gesing              Science  Gateways                              84   Agave  is  a  Science-­‐as-­‐a-­‐Service  web  API  pla•orm   Run  scienHfic  codes     •  your  own  or  community  provided  codes   ...on  HPC,  HTC,  or  cloud  resources     •  your  own,  shared,  or  commercial  systems   ...and  manage  your  data     •  reliable,  mulD-­‐protocol,  async  data  movement   ...from  the  web     •  webhooks,  rest,  json,  cors,  oauth2   ...and  remember  how  you  did  it     •  deep  provenance,  history,  and  reproducibility  built  in    
  • 85. Agave  API     Sandra  Gesing              Science  Gateways                              85   •  MulDtenant   •  Hosted  idenDty   management   •  Supports  mulDple  IdP   •  OAuth2/OIDC  server   •  API  Management   •  Hosted  or  on  premise   •  VerDcal  SSO   •  AnalyDcs  and  reporDng   •  Developer  resources   •  MulDple  SDK  &  CLI   •  Reference  gateway   •  White  labeled   •  100%  open  source  
  • 86. Agave  API     Sandra  Gesing              Science  Gateways                              86   Used  to  power  web  &  mobile   applicaDons  
  • 87. Agave  API     Sandra  Gesing              Science  Gateways                              87   Used  to  extend  exisDng   processes  
  • 88. Agave  API     Sandra  Gesing              Science  Gateways                              88   (Re)Introducing  the     Micro  App  Paradigm  
  • 89. Agave  API     Sandra  Gesing              Science  Gateways                              89   Agave  Delivers  Process-­‐as-­‐a-­‐ Service  
  • 90. Agave  API     Sandra  Gesing              Science  Gateways                              90  
  • 91. iPlant  –  Example  for  Agave  API     Sandra  Gesing              Science  Gateways                              91  
  • 92. Agave  API  -­‐  Tutorials     Sandra  Gesing              Science  Gateways                              92  
  • 93. Agave  API  -­‐  Tutorials   Sandra  Gesing              Science  Gateways                              93  
  • 94. CollaboraDon  on  Science  Gateways       Sandra  Gesing              Science  Gateways                              94   Crucial  Topics   •  Close  collaboraDon  with  user  communiDes   •  Knowledge  about  available  technical  soluDons     Sounds  easy  but…   •  Requirements  of  user  communiDes  o[en  not  so   clear   •  Technologies  someDmes  sDll  under  development   for  certain  building  blocks   è Slow  uptake  of  soluDons     è Larger  effort  for  creaDng  science  gateways  
  • 95. New  Science  Gateways  -­‐  Checklist       Sandra  Gesing              Science  Gateways                              95   DISCUSSION                       OrganizaDonal   Aspects   Technical   Aspects   Domain-­‐Specific   Aspects   Developers   Domain  Experts  
  • 96. New  Science  Gateways  -­‐  Checklist       Sandra  Gesing              Science  Gateways                              96   Domain-­‐specific  aspects:   •  Goal,  target  area  and  target  users     •  Visions/demands  on  the  layout   •  PrioriDes  of  features  and  opDons,  e.g.,  a  list   from  must-­‐have  to  great-­‐to-­‐have  opDons   •  IntegraDon  of  exisDng  applicaDons  or   development  of  applicaDons   •  Technologies  of  the  applicaDons   •  VisualizaDon   •  Security  demands   •  Workflows  
  • 97. New  Science  Gateways  -­‐  Checklist       Sandra  Gesing              Science  Gateways                          97   OrganizaDonal  aspects:   •  Time  constraints  for  the  development,   agreement  on  a  (maybe  even  rough)  project   plan  with  milestones     •  Agreement  on  alpha-­‐  or  beta-­‐tester   •  Regular  meeDngs    
  • 98. New  Science  Gateways  -­‐  Checklist       Sandra  Gesing              Science  Gateways                          98   Technical  aspects:   •  Experience  with  exisDng  frameworks  and   programming  languages   •  Available  infrastructure  including  security   infrastructure  and  resources   •  Available  support  of  suitable  technologies   •  Scalability  of  suitable  technologies   •  Effort  for  extending  exisDng  technologies   compared  to  novel  developments     •  Synergy  effects  with  other  science  gateway   projects  
  • 99. Challenges     Sandra  Gesing                                                              99   A  world-­‐wide  research  compuDng  infrastructure   •  Transparent  service  selecDon   •  e.g.,  Docker  could  be  part  of  the  soluDon   •  Access  to  data  irrespecDve  of  locaDon   •  OpDons  to  share  data  efficiently   •  Appropriate  privacy  and  security  measures   •  OpDmized  usage  of  resources   •  e.g.,  opDmized  usage  of  cloud  compuDng  and  their   business  models  
  • 100. Researchers   Sandra  Gesing                                                               ~7  million  researchers  world  wide     hVp://chartsbin.com/view/1124  
  • 101. High-­‐Speed  Network   Sandra  Gesing                                          101  
  • 102. Challenges     IntegraDon  of  data  sources  and  instruments   •  Different  data  formats   •  Different  interfaces   •  Different  hardwares  and  technologies     …  from  small  ones  to  the  big  ones…   Sandra  Gesing                                          102  
  • 103. Challenges     So[ware  searchability,  reproducibility  and  reusability   •  Science  gateways  step  in  the  right  direcDon  but  …   much  more  work  necessary  on  searchibility…  Not  only   finding  any  data  for  a  research  area  but  finding  the  right   data   •  Metadata  approaches   •  DicDonaries   •  More  involvement  of     librarians       Sandra  Gesing                                          103  
  • 104. Challenges     So[ware  searchability,  reproducibility  and  reusability   •  Science  gateways  step  in  the  right  direcDon  but  …   much  more  work  necessary  on  reproducibility  and   reusability…     •  studies  in  medicine  and  pharmacology:  11%  or  6%  of  the   analysed  research  was  reproducible     •  myExperiment:  only  20%  of  workflows  reusable  because   of  dependencies  on  hardware,  local  or  distributed  data,   so[ware  versions             Sandra  Gesing                                          104  
  • 105. Challenges     So[ware  searchability,  reproducibility  and  reusability   •  Science  gateways  and  workflow  systems  step  in  the   right  direcDon  but  …   much  more  work  necessary  on  reproducibility  and   reusability…       •  ContainerizaDon  approaches   •  MigraDon  approaches   •  CombinaDon  of  both         Sandra  Gesing                                          105  
  • 106. Challenges  –  Novel  and  Old...         …  require  novel  soluDons!   Sandra  Gesing                                          106  
  • 107. Projects  -­‐  OSF     •  Big  Data   •  Reproducibility     Open  Access  to  Data  and  Projects  could  solve   parts  of  the  problems…   Sandra  Gesing                                          107  
  • 108. Workflow  Enhancements   •  Logical  level:  Meta-­‐workflows   Herres-­‐Pawlis,  S.,  Hoffmann,  A.,  Rösener,  T.,  Krüger,  J.,  Grunzke,  R.,  and  Gesing,  S.  “MulD-­‐layer  Meta-­‐ metaworkflows  for  the  EvaluaDon  of  Solvent  and  Dispersion  Effects  in  TransiDon  Metal  Systems  Using  the   MoSGrid  Science  Gateways”Science  Gateways  (IWSG),  2015  7th  InternaDonal  Workshop  on,  pp.47-­‐52,  3-­‐5  June   2015,  IEEE  Xplore,  doi:  10.1109/IWSG.2015.13     •  System  level:  CombinaDon  of  strengths  of  workflow   systems   Hazekamp,  N.,  Sarro,  J.,  Choudhury,  O.,  Gesing,  S.,  ScoV  Emrich  and  Thain,  D.  “Scaling  Up  BioinformaDcs   Workflows  with  Dynamic  Job  Expansion:  A  Case  Study  Using  Galaxy  and  Makeflow”,  e-­‐Science  (e-­‐Science),  2015   IEEE  11th  InternaDonal  Conference  on,  pp.332-­‐341,  Aug.  31  2015-­‐Sept.  4  2015   •  PredicDon:  Model  for  opDmizaDon  of  tasks  and   threads   Choudhury,  O.,  Rajan,  D.,  Hazekamp,  N.,  Gesing,  S.,  Thain,  D.,  and  Emrich,  S.  “Balancing  Thread-­‐level  and  Task-­‐ level  Parallelism  for  Data-­‐Intensive  Workloads  on  Clusters  and  Clouds”,  Cluster  CompuDng  (CLUSTER),  2015  IEEE   InternaDonal  Conference  on,  pp.390-­‐393,  8-­‐11  Sept.  2015,  doi:10.1109/CLUSTER.2015.60     Sandra  Gesing                                          108  
  • 109. Science  Case:  PolymerisaDon  catalysts   Sandra  Gesing                                          109  
  • 110. TranslaDon  into  Workflows   Sandra  Gesing                                          110  
  • 111. TranslaDon  into  Workflows   Sandra  Gesing                                          111  
  • 112. Meta-­‐Workflows   Sandra  Gesing                                          112  
  • 113. TranslaDon  into  Meta-­‐Workflows   Sandra  Gesing                                          113  
  • 114. Scaling  Up  Workflows     #  Machines   #  Cores   Data   ParHHoning   Save   Sandra  Gesing                                          114  
  • 115. Scaling  Up  Workflows     Galaxy   Sandra  Gesing                                          115  
  • 116. Genome  Sequencing   Sandra  Gesing                                          116   •  Finding  precise  order  of  nucleoDdes  within  a  DNA   molecule   •  A  (adenine),  G  (guanine),  C  (cytosine),  and  T  (thymine)   (Human  genome  over  3  billion  of  nucleoDdes)    
  • 117. Genome  Sequencing   Sandra  Gesing                                          117   Let’s  imagine  a  party  game.  The  game  is  a  guessing   game.  Here  is  how  it  is  played:   You  are  thinking  of  a  number  and  the  group  has  to  guess   it.  The  tricky  part  is  that  the  number  is  200-­‐digits  in   length.  You  are  reading  the  digits  of  the  number  in  your   head  without  making  a  sound.    Every  so  o[en  a  person   interrupts  you,  and  you  tell  them  the  single  digit  you   were  just  thinking  and  where  it  is  in  the  sequence  of   200.  Each  Dme  you  are  interrupted,  you  have  to  start   again.  You  leave  a[er  a  few  hours  and  the  group  has  to   figure  out  the  200-­‐digit  number.  They  have  to  piece   together  the  informaDon  you  gave  them,  for  example   the  25th  number  was  5,  the  40th  number  was  0,  and  so   on.  Using  the  informaDon  from  their  interrupDons,  they   can  repeat  the  number  they  gave  you.  
  • 118. Scaling  Up  Workflows     Simple  Workflow  in  Galaxy   Problem:  As  Size  increases  so  does  Time   Sandra  Gesing                                          118  
  • 119. Scaling  Up  Workflows     Workflow  with  Parallelism  added  in  Galaxy   Problem:  Tools  must  be  updated  every   change  in  Parallelism/Relies  on  ScienDst   Sandra  Gesing                                          119  
  • 120. Scaling  Up  Workflows     Workflow  Dynamically  Expanded  behind  Galaxy   Sandra  Gesing                                          120  
  • 121. Scaling  Up  Workflows     Sandra  Gesing                                          121  
  • 122. Scaling  Up  Workflows     Sandra  Gesing                                          122  
  • 123. Scaling  Up  Workflows     Makeflow   •  Task  Structure   INPUTS  :  OUTPUTS    COMMAND     •  Directed  Acyclic  Graph  (DAG)   •  ProgrammaDcally  Generated   Sandra  Gesing                                          123  
  • 124. Scaling  Up  Workflows     Sandra  Gesing                                          124  
  • 125. Scaling  Up  Workflows     Sandra  Gesing                                          125  
  • 126. Scaling  Up  Workflows     Job  Sandbox  –  Log  file  creaDon  for  cleanup   Sandra  Gesing                                          126  
  • 127. Scaling  Up  Workflows     Dynamic  Job  Expansion       •  Work  Queue:  we  uDlized  100s  of  cores  from  a     Condor  Pool   •  Cleaning  Sandbox  using  knowledge  of  intermediates    and  logging   •  Explored  methods  to  transmit  needed  environments     such  as     executables  and  Java     61.5X  speed-­‐up  on  32  GB  dataset  uDlizing  these  methods   Sandra  Gesing                                          127  
  • 128. Thread-­‐level  and  Task-­‐level  Parallelism     •  Develop predictive performance models for an application domain •  Achieve acceptable performance the first time •  Optimize resource utilization •  Execution time •  Memory usage Sandra  Gesing                                          128  
  • 129. Thread-­‐level  and  Task-­‐level  Parallelism     •  WorkQueue master-worker framework •  Sun Grid Engine (SGE) batch system Sandra  Gesing                                          129  
  • 130. Thread-­‐level  and  Task-­‐level  Parallelism     1.  ApplicaDon-­‐level  model  for  Dme: 𝑇(𝑅, 𝑄, 𝑁)=   𝛽1​ 𝑅 𝑄/𝑁 +   𝛽2   2.  ApplicaDon-­‐level  model  for  memory:     𝑀(𝑅, 𝑁)=  γ1R  +γ2N   3.  System-­‐level  model  for  Dme:     𝑇𝑇𝑜𝑡𝑎𝑙= 𝜂1​ 𝑄 𝐾/𝐷 + 𝜂2(​ 𝑄/𝐵 +​ 𝑅 𝐾𝑁/𝐵𝐶 )+ 𝜂3T(R,​ 𝑄/𝐾 , 𝑁)∗​ 𝐾 𝑁/𝑀𝐶 +   𝜂4​ 𝑂/𝐵 + 𝜂5​ 𝑂 𝐾/𝐷      4.  System-­‐level  model  for  memory:     𝑀𝑀𝑎𝑠𝑡𝑒𝑟(𝑅, 𝑄)=ϕ1R  +ϕ2Q   Sandra  Gesing                                          130  
  • 131. Thread-­‐level  and  Task-­‐level  Parallelism     7  data   points   (R)   7  data   points   (Q)   7  data   points   (N)   343  data   points   Data   CollecHon   Training   data   Regression     Model   Training   Accuracy   Test   MAPE   TesHng   Regression   Coefficient s   TesDng   data   Sandra  Gesing                                          131  
  • 132. Thread-­‐level  and  Task-­‐level  Parallelism     Avg.  MAPE =  3.1 MAPE  =  Mean  Absolute  Percentage  Error   Sandra  Gesing                                          132  
  • 133. Thread-­‐level  and  Task-­‐level  Parallelism     For the given dataset, K* = 90, N* = 4 Sandra  Gesing                                          133  
  • 134. Result     Sandra  Gesing                                          134   # Cores/ Task #Tasks Predicted Time (min) Speedup Estimated EC2 Cost ($) Estimated Azure Cost ($) 1 360 70 6.6 50.4 64.8 2 180 38 12.3 25.2 32.4 4 90 24 19.5 18.9 32.4 8 45 27 17.3 18.9 32.4
  • 135. InformaDon  on  Science  Gateways   Sandra  Gesing              Science  Gateways                          135   •   Science  Gateway  InsDtute    hVp://sciencegateways.org   •   Science  Gateway  Workshops      Europe:  IWSG  -­‐  hVp://iwsg.info   USA:  GCE  -­‐  hVp://sciencegateways.org   Australasia:  IWSG-­‐A  -­‐  hVp://iwsg.info   •   IEEE  Technical  Area  on  Science  Gateways        hVp://ieeesciencegateways.org   •   XSEDE  Science  Gateways    hVps://www.xsede.org/gateways-­‐overview   •   CRC  Science  Gateways    hVps://crc.nd.edu/index.php/research/gateways    
  • 136. Exercises   Sandra  Gesing              Science  Gateways                          136   QuesDons  and  exercises  at   hVp://bit.ly/2dlkySW     Data  at     hVp://bit.ly/2cTwKaN  
  • 137. Sandra  Gesing              Science  Gateways                          137   sandra.gesing@nd.edu