
Pwrake: Distributed Workflow Engine for e-Science - RubyConfX

e-Science is scientific research enabled by widely distributed computational resources shared in collaboration among several institutes. One of the issues in making use of e-Science infrastructure is defining complex workflows (compositions of many tasks and their dependencies). We propose to employ Rake as a workflow definition language. In contrast to a Makefile, Rake is an internal DSL and takes advantage of Ruby's scripting power, which is required to define complex scientific workflows. To execute Rake workflows on distributed computing resources, we develop Pwrake, a Parallel Workflow extension for Rake. Pwrake is designed to work on Gfarm, a wide-area distributed file system. Gfarm provides a unified filesystem and consistent file timestamps among distributed computers, as well as high parallel I/O performance. We demonstrate the power of Rake as a workflow language and the scalable performance of distributed computing using Pwrake and Gfarm.
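Consistent file timestamps matter because Make- and Rake-style tools decide whether a task can be skipped by comparing modification times of its input and output files. The following is a minimal sketch of that check in plain Ruby. The helper name `out_of_date?` and the file names are assumptions made for illustration; this is not code from Rake or Pwrake.

```ruby
require "tmpdir"

# Hypothetical helper (not part of Rake or Pwrake).
# True when `target` must be (re)built: it does not exist yet, or at
# least one source file has a newer modification time than the target.
def out_of_date?(target, sources)
  return true unless File.exist?(target)
  target_time = File.mtime(target)
  sources.any? { |src| File.mtime(src) > target_time }
end

Dir.mktmpdir do |dir|
  src = File.join(dir, "file1")
  out = File.join(dir, "file2")

  File.write(src, "input")
  puts out_of_date?(out, [src])    # true: file2 does not exist yet

  File.write(out, "output")
  now = Time.now
  File.utime(now, now - 60, src)   # file1 modified one minute before file2
  File.utime(now, now, out)
  puts out_of_date?(out, [src])    # false: file2 is newer than file1

  File.utime(now, now + 60, src)   # file1 modified after file2
  puts out_of_date?(out, [src])    # true: file2 is stale
end
```

If the clocks behind the timestamps disagree between machines, a comparison like this can wrongly skip or repeat a task, which is one reason a shared file system with consistent timestamps is useful for distributed execution.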



  1. Masahiro Tanaka, University of Tsukuba. RubyConf X, 2010-11-14
  2. • Masahiro Tanaka • NArray author • majored in Astronomy • Research fellow in Computer Science ◦ at Center for Computational Sciences, University of Tsukuba ◦ since 2009
  3. • Research Fields ◦ Computer Science: High Performance Computing, Computational Informatics ◦ Computational Science: Particle Physics, Astrophysics, Material Science, Life Science, Biology, Environmental Science
  4. • FIRST ◦ 512 cores + BladeGRAPE ◦ 36 TFLOPS • PACS-CS ◦ 2,560 cores ◦ 14.4 TFLOPS • T2K Tsukuba ◦ 10,368 cores ◦ 95 TFLOPS
  5. • Conference on supercomputing • > 10,000 participants
  6. We are here. SC10 venue: Ernest N. Morial Convention Center, exhibit Nov 15-18
  7. • CCS booth at SC09
  8. • Pwrake: a Distributed Workflow Engine for e-Science
  9. (no text)
  10. LHC: Particle Accelerator. ALMA: Radio Observatory
  11. http://www.sinet.ad.jp/case-examples/tsukuba : Sharing QCD Simulation data
  12. • Computationally intensive science that is carried out in highly distributed network environments, or • Science that uses immense data sets that require grid computing ◦ (Wikipedia).
  13. • The term was created by John Taylor, ◦ Director General of the United Kingdom's Office of Science and Technology ◦ in 1999
  14. • is a key issue for e-Science.
  15. • Single-core performance no longer increases.
  16. (no text)
  17. (no text)
  18. • Parallelize your program • Scalability is a key issue
  19. • P: parallelizable fraction • 1-P: sequential fraction • N: number of processors • Speed-up formula: $S(N) = \frac{1}{(1 - P) + P/N}$; for $P < 1$ the speedup approaches the upper bound $\frac{1}{1 - P}$ as $N$ grows. Plot: speedup vs. number of processors.
  20. • MapReduce • MPI • OpenMP • thread • Parallel programming languages • process
  21. • Independent processes can be parallelized • Without parallel programming • Workflow System is required. Diagram: input1 → program → output1; input2 → program → output2; input3 → program → output3; input4 → program → output4; ...
  22. (no text)
  23. • Description of procedures • It is like building a program. Diagram: a.c → [cc -c -o a.o a.c] → a.o; b.c → [cc -c -o b.o b.c] → b.o; a.o, b.o → [cc -o prog …] → prog
  24. • Task: Ellipse Node • File: Rectangle Node • Dependency: Edge • DAG ◦ Directed Acyclic Graph. Diagram: Input File → Task → Output File, with edges marking dependencies
  25. • Montage ◦ software for producing a custom mosaic image from multiple shots of images. ◦ http://montage.ipac.caltech.edu/
  26. • Tasks: ◦ Projection ◦ Brightness correction ◦ Coadding • 1 image : 1 process. Diagram: input images pass through mProjectPP, mDiff, mFitplane, mBgModel, mBackground and mAdd to give the final image; plane labels a1x+b1y+c1=0, a2x+b2y+c2=0 and m1 = a'1x+b'1y+c'1, m2 = a'2x+b'2y+c'2
  27. Diagram: flow from input files to output file
  28. ◦ invoke task based on dependency ◦ assign a task to an available computer ◦ parallel execution for independent tasks. Diagram: a workflow definition feeds the Workflow System, which starts multiple processes
  29. • for Grid Computing • DAGMan • Pegasus • Triana • ICENI • Taverna • GrADS • GridFlow • UNICORE • Globus workflow • Askalon • Karajan • Kepler. From “A Taxonomy of Scientific Workflow Systems for Grid Computing”, Jia Yu and Rajkumar Buyya (2005)
  30. • Define DAG in XML ◦ Humans cannot write complex XML by hand. ◦ Need to write a program to generate XML. DAG XML (excerpt):
      <adag xmlns="http://www.griphyn.org/chimera/DAX"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://www.griphyn.org/chimera/DAX http://www.griphyn.org/chimera/dax-1.8.xsd"
            count="1" index="0" name="test">
        <filename file="2mass-atlas-981204n-j0160056.fits" link="input"/>
        …
        <job id="ID000001" name="mProject" version="3.0" level="11" dv-name="mProject1" dv-version="1.0">
          <argument>
            <filename file="2mass-atlas-981204n-j0160056.fits"/>
            <filename file="p2mass-atlas-981204n-j0160056.fits"/>
            <filename file="templateTMP_AAAaaa01.hdr"/>
          </argument>
          <uses file="2mass-atlas-981204n-j0160056.fits" link="input" dontRegister="false" dontTransfer="false"/>
          <uses file="p2mass-atlas-981204n-j0160056.fits" link="output" dontRegister="true" dontTransfer="true" temporaryHint="tmp"/>
          <uses file="p2mass-atlas-981204n-j0160056_area.fits" link="output" dontRegister="true" dontTransfer="true" temporaryHint="tmp"/>
          <uses file="templateTMP_AAAaaa01.hdr" link="input" dontRegister="false" dontTransfer="false"/>
        </job>
        …
        <child ref="ID003006">
          <parent ref="ID000001"/>
          <parent ref="ID000006"/>
        </child>
  31. • DSL to define task dependency • Rule ◦ define multiple tasks at once ◦ avoid redundancy • Skip finished tasks ◦ based on file timestamps
  32. • Grid Explorer: Grid and Cluster shell ◦ http://www.logos.ic.i.u-tokyo.ac.jp/gxp/ ◦ written in Python. • GXP Make ◦ GNU Make-based workflow system ◦ Distributed & Parallel execution
  33. • Makefile ◦ same input files ◦ same tasks ◦ executed repeatedly • Scientific Workflows have different aspects.
  34. • Same workflow for different files ◦ a “rule” may help, but is not enough. Diagram: file set 1 and file set 2 each pass through the same workflow, producing result 1 and result 2
  35. • Task dependencies rely on: ◦ not only file names ◦ but also parameters, e.g. geometry. Diagram: files A, B, C, D connected through tasks
  36. • Entire workflow is unknown at first • Result of a task affects ◦ Output files ◦ Subsequent tasks. Diagram: tasks connecting file1, file2 and file3
  37. • create Makefile during Make execution • tricky way
      Makefile:
          Makefile.sub: prerequisite
              awk -f hoge.awk $< > $@
          target: Makefile.sub
              make -f $<
      Makefile.sub (created by the first rule, invoked by the second):
          target1: source1
          target2: source2
          …
  38. • Scientific workflow requires a powerful and flexible definition language. • You probably know the solution. ◦ What is it?
  39. (no text)
  40. • Build tool • Internal DSL • Programming power of Ruby
  41. Diagram: file1 → program → file2
      file "file2" => "file1" do
        sh "program file1 > file2"
      end
  42. for x in LIST
        file x[1] => x[0] do |t|
          sh "your_program …"
        end
      end
  43. • How do you write it with Rake? Diagram: tasks connecting file1, file2 and file3
  44. • No task depends on Task B
      task :A do
        task :B do
          puts "B"
        end
      end
      task :default => :A
  45. • Rake::Task#invoke • Invoke Task B immediately after definition
      task :A do
        b = task :B do
          puts "B"
        end
        b.invoke
      end
      task :default => :A
  46. • multitask ◦ Rake built-in feature ◦ Parallelize prerequisite tasks of multitask ◦ Ruby thread • Problem ◦ No control over the number of threads. ◦ All the prerequisite tasks are invoked at the same time.
  47. • http://drake.rubyforge.org/ • Specify the number of threads • All independent tasks are automatically parallelized. ◦ multitask is not necessary
  48. Diagram: Find task. Available tasks in the task graph (A to F) go into a queue to the workers (worker thread 1, worker thread 2), and a second queue returns from the workers ...
  49. • Remote process execution • Dynamic task definition ◦ dRake does not allow the “invoke” method. • Performance issue
  50. • Need Powerful Scientific Workflow tool • Existing ◦ Rake: Powerful for writing workflow ◦ dRake: Parallel execution • Missing ◦ Remote Process Invocation ◦ Scalability
  51. Diagram: Rake reads the Rakefile; the Pwrake extension starts multiple processes that work on files in the Gfarm filesystem
  52. (no text)
  53. • Parallel + distributed • Workflow • extension for Rake • repository: ◦ http://github.com/masa16/pwrake
  54. • Same syntax as Rake. • Parallelize task, file ◦ no multitask • Replace “sh” method ◦ invoke process through SSH • Scalability
  55. • Why SSH ◦ Secure ◦ Probably SSH port is available • SSH class for Pwrake ◦ Original implementation ◦ Performance issues
  56. • Worker thread in Ruby • Ruby thread uses single-core ◦ GVL • sh process uses multi-core.
      for x in LIST
        file x[1] => x[0] do |t|
          sh "your_program …"   # here uses multi-core
        end
      end
  57. Chart comparing dRake and Pwrake, annotated “x10 faster”
  58. Our approach: • use Distributed Filesystem ◦ file sharing ◦ consistent file timestamp ◦ I/O performance
  59. Diagram: with a Network File System, all CPUs access file1, file2 and file3 on one storage, so storage I/O becomes the bottleneck; with a Distributed File System, the files are spread over several storages and processed in parallel
  60. (no text)
  61. • Wide-area distributed file system • Global namespace to federate storages • Main developer: Prof. Osamu Tatebe • Open source development ◦ http://datafarm.apgrid.org/ Diagram: the Gfarm File System presents one namespace (/, /dir1 with file1 and file2, /dir2 with file3 and file4) over the local storages of computer nodes connected through the Internet
  62. • use Local I/O for performance • assign task based on File locality • implement as a function of Pwrake. Diagram: file1 to file4 are stored on the local storages of the file system nodes; a job for a file runs fast on the node that stores the file and slowly on another node
  63. • Locality-aware task assignment for Gfarm. Diagram: Task1, Task2, Task3, … are pushed with a hostname into the AffinityQueue, which holds a queue for host1, host2, host3, …; worker thread1, thread2, thread3, … pop with their hostname
  64. Chart: 1 node (4 cores), 2 nodes (8 cores), 4 nodes (16 cores), 8 nodes (32 cores). NFS does not scale
  65. Chart: 1 node (4 cores), 2 nodes (8 cores), 4 nodes (16 cores), 8 nodes (32 cores). Gfarm scales; 20% speedup using locality
  66. • Montage workflow
  67. • Geographically distributed workflow • Fault tolerance
  68. • Rake ◦ is powerful enough to be used as a scientific workflow definition language. • Pwrake ◦ Parallel and Distributed Workflow extension for Rake • Gfarm ◦ for scalable I/O performance
  69. • Pwrake site ◦ https://github.com/masa16/pwrake • Questions?
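The locality-aware assignment on slide 63 (push a task with a hostname, pop with the worker's hostname, one queue per host) can be sketched roughly as below. This is an illustration, not Pwrake's implementation: the class body, the method signatures, and in particular the fallback of taking work from the longest other queue when a host has no local task are all assumptions.

```ruby
# Hypothetical sketch of a per-host task queue; names and policy are
# illustrative and do not come from the Pwrake source.
class AffinityQueue
  def initialize
    @queues = Hash.new { |hash, host| hash[host] = [] } # one FIFO per hostname
    @lock = Mutex.new
  end

  # Queue a task under the host that stores its input file.
  def push(task, host)
    @lock.synchronize { @queues[host] << task }
    self
  end

  # Return a task local to `host` if one is queued. Otherwise take a task
  # from the longest other queue (assumed fallback), or nil if all are empty.
  def pop(host)
    @lock.synchronize do
      local = @queues[host]
      return local.shift unless local.empty?
      _, busiest = @queues.max_by { |_, queue| queue.size }
      busiest && busiest.shift
    end
  end
end

queue = AffinityQueue.new
queue.push("task1", "host1").push("task2", "host2").push("task3", "host2")

p queue.pop("host1") # "task1": local to host1
p queue.pop("host1") # "task2": host1 is empty, so work comes from host2
p queue.pop("host3") # "task3": host3 never had local work
p queue.pop("host1") # nil: every queue is empty
```

A real worker would block while waiting for work instead of receiving `nil`; the non-blocking form keeps the sketch short and easy to check.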
