© 2004 The BioTeam
http://bioteam.net
cdwan@bioteam.net
High Performance and
Research Computing
(For life scientists seeking a brief
introduction to high performance and
research computing)
The BioTeam
BioTeam™ Inc.
• Objective & vendor neutral informatics and ‘bio-IT’ consulting
• Composed of scientists who learned to bridge the gap between life
science informatics and high performance IT
• “iNquiry” bioinformatics cluster solution
• Staff
Michael Athanas, Bill Van Etten, Chris Dagdigian, Stan Gloss, Chris Dwan
http://bioteam.net
Day Two Rough Outline
• History of Computing
• How Computers Work
• Architectures for High Performance Computing
• Parallel Computing
• Cluster Computing
• Building your own HPC Environment
I can’t teach Unix today, Sorry
• Also can’t teach programming today.
• Really can’t teach an editor, meaningfully.
• The Unix command line and a basic facility with
programming are required to take advantage of
HPC resources.
What I Want You To Remember
• It is important to map your solution to the computer
architecture.
• Automatic tools for performance tuning will only get
you so far.
• Good programs (and bad ones) can be written in
any language
• High Throughput vs. High Performance
Language and Word Choice
• Technical Language is very specific
• Words overlap between disciplines
Please ask questions.
Computing - Pre-History
• 1588 - English defeat the Spanish Armada (in part) using
pre-computed firing tables
• 1820’s - Charles Babbage invents a mechanical calculator
• 1940’s - Bletchley Park, England breaks German codes
Computing - History
• 1947: Transistor invented
• 1950s
– First electronic, stored program computers
– FORTRAN Programming Language
• 1965: Moore’s Law stated
• 1968: Knuth publishes volume 1 of TAOCP
• 1972: C Programming Language
• 1976: Cray-1 Supercomputer
• 1984: DNS introduced to the internet
• 1989: WWW invented
• 1991: First version of Linux
• 1993: First Beowulf cluster
• 1994: Java Programming Language
Von Neumann Model (1946)
[Diagram: processor(s) connected to a single memory that contains both program and data]
Alan Turing (1912-1954)
• Any computing language of a certain
“power” can solve any “computable”
problem
– Store values in memory
– Add one to values
– Conditional execution based on a value in
memory
• Proof using the “Turing machine”
Some problems may be more easily stated or
understood in one language or another.
1965: Moore’s Law
Donald Knuth
• Professor Emeritus, Stanford University
The Art of Computer Programming
1968: Volume One - Fundamental Algorithms
1969: Volume Two - Seminumerical Algorithms
1973: Volume Three - Searching and Sorting
Literate Programming:
“The main idea is to regard a program as a communication to human
beings rather than as a set of instructions to a computer”
Cray Supercomputers
• 1976: Cray 1
– Cray 1 (XMP, YMP, C90, J90,
T90)
• 1985:
– Cray 2
• 1993:
– Cray 3 (one machine delivered)
• …
• Present:
– X1, XT3, XD1, SX-6
Clusters
• 1993:
– Beowulf: Custom interconnects
(switched ethernet too expensive)
• 2000:
– Commercial cluster sales (Linux
Networx)
• 2003:
– 7 of the top 10 supercomputers are
clusters
– 40% of the top 500
supercomputers are clusters
• 2004:
– Apple “workgroup cluster”
“Big Mac”
• Virginia Tech:
– (2003) 3rd fastest supercomputer
in the world: $5.4 Million
– Ordered from Apple’s web sales
page.
“Virginia Tech has aspirations …
…This is one of those spires
that one can build that will be
visible from afar.”
-Hassan Aref
Dean of Engineering, Virginia Tech
2004 - Apple Server Products
• Xserve
– Dual G5
– Up to 8GB RAM
• XRAID
– 5.6 TB Storage per unit
– XSAN to combine up to 64TB
• Apple Workgroup Cluster
– Packaged with iNquiry
2004: Cray X1
• Scales to 4,096 CPUs
• 4 CPUs per Node
• Scales to 32TB RAM,
globally addressable
• 34.1 GB/sec per CPU
memory bandwidth
2004: Orion MultiSystems
• 10 to 96 CPUs in a desk
side box.
• 1.2GHz chips, but lots of
them.
• Pre-packaged cluster
• Powered by a single,
standard 15A wall socket
Other Observations, 2004
• Major computer manufacturers find their profits in
selling sub $500 consumer electronics and ink.
• Style (see-through cases, round cables, etc) is the
determining characteristic in workstation purchases
How Computers Work
Context switching
• At any one time, only one process is actually executing on
one CPU.
• Switching between processes requires time, and is driven by
interrupts.
• On a switch, the process state (registers, allocated memory) must be captured and written off to memory
[Diagram: one job on the CPU; other jobs wait their turn]
OS and Interrupts
• OS switches between processes from time to time
• Also performs “housekeeping” tasks
• Interrupts force OS context switches:
– I/O
– Power fault
– Disk is ready to send data
– …
Memory
• CPU / Registers
– Physically part of the CPU
– Immediately accessible by the
machine code
– ~128 registers on modern chips
• Cache
– 1 - 2 MB of very fast memory, also
built into the chip.
• RAM
– 1 - 8 GB (Cough cough)
Memory Timings (2004)
• CPU / Registers:
– 10⁻⁹ seconds per instruction
• Cache:
– Low Latency
• Memory
– Latency: 10² cycles (~300 cycles)
– Streaming: 0.8 GB / sec (~1 byte / cycle)
• Disk
– Latency: 10⁻³ seconds (10⁶ cycles)
– Streaming: 100 MB / sec (10⁻¹ bytes / cycle)
• Tape
– Seconds to minutes
Memory to the Program
• One large address space: 0 – 2³² (or 2⁶⁴, or some other
value) relative to the program
• When memory is used, the “stack” increases
• The program’s “memory footprint” is the amount of memory
allocated. Larger footprints can step out of cache and even
out of RAM.
• “Segmentation violation” means “you tried to access memory
that’s not yours”
32 vs 64 bits
• Number of bits in a single memory “word”
• Affects
– Precision of calculations
– Maximum memory address space
– Compatibility of files
– Memory data bandwidth
– Marketing
Potential Limits
• Largest integer: 2³² or 2⁶⁴
• Largest file: 2 GB
– a signed 32-bit file offset tops out at 2³¹ bytes
– not usually a problem anymore, but it crops up at really
annoying times
• Smallest / largest floating point number
• Number of files on disk (inodes)
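A small C sketch (mine, not from the deck) of the first limit: a 32-bit signed integer tops out at 2³¹ − 1 and silently wraps past it.

  #include <stdio.h>
  #include <limits.h>

  int main(void) {
      int big = INT_MAX;        /* 2^31 - 1 when int is 32 bits */
      printf("largest int: %d\n", big);
      /* Signed overflow is formally undefined behavior; in practice
         you will typically see a wrap to a large negative number. */
      printf("one more:    %d\n", big + 1);
      return 0;
  }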
System Monitoring Examples
• ps
• top
• iostat
Abstract: Convenience vs. Time
Hardware: Voltages, Clocks, Transistors
Microcode
Assembly Language
Operating System
User Interface
Compiled vs. Interpreted
• Script:
– interpreted one line at a time
– sh, csh, bash, tcsh, Perl, tcl, Ruby, …
– Much faster development (to a point)
– Can be slow
– Program is relatively generic / portable (cough cough)
• Compiled Language:
– Code is translated into Assembly by a “compiler”
– FORTRAN, PASCAL, C, C++, …
– Can optimize for the specific architecture at compile time
– Intellectual property is hidden in the compiled code
Performance Measurement
• Wall Clock Time: How long you waited
• User Time: CPU time spent executing your program's
own code
• System Time: CPU time the operating system spent
working on your program's behalf
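A sketch in C of the distinction (mine, not the deck's): clock() counts CPU time charged to the process, while time() follows the wall clock. On Unix, running a program under time(1) reports all three numbers for you.

  #include <stdio.h>
  #include <time.h>
  #include <unistd.h>

  int main(void) {
      time_t wall_start = time(NULL);     /* wall clock */
      clock_t cpu_start = clock();        /* CPU time charged to us */

      volatile double x = 0;              /* burn some CPU... */
      for (long i = 0; i < 100000000L; i++)
          x += i;

      sleep(2);                           /* ...then wait: wall time grows,
                                             CPU time does not */

      printf("wall time: %ld sec\n", (long)(time(NULL) - wall_start));
      printf("cpu time:  %.2f sec\n",
             (double)(clock() - cpu_start) / CLOCKS_PER_SEC);
      return 0;
  }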
Example Program
• Script in PERL
• Program in C
• Compile with optimization
• Remove IO
• Example memory allocation “bug”
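The live demo is not reproduced here; as a stand-in, a minimal C sketch of the kind of memory allocation "bug" the slide refers to: a loop that allocates on every pass and never frees, so the footprint grows out of cache and then out of RAM.

  #include <stdlib.h>
  #include <string.h>

  int main(void) {
      for (;;) {
          /* Leak a megabyte per pass.  Watch the process grow with
             top(1) until the machine starts swapping (or the OS
             kills the process). */
          char *buf = malloc(1 << 20);
          if (buf == NULL)
              return 1;                 /* finally out of memory */
          memset(buf, 0, 1 << 20);      /* touch it so it is really paged in */
      }
  }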
High Performance Computing
What is “Super” computing?
• Faster than what most other people are doing.
• $10⁶+ investment
• Custom / innovative design
It Depends
http://www.top500.org
Just make the system faster
• Increase memory bandwidth
• Increase memory size
• Decrease clock time
Clearly limited
Superscalar Architectures
• More than one ALU
• More than one CPU
• Instructions can happen in parallel
• Most modern CPUs are superscalar to some level
Pipelining
• Break each instruction into a series of steps
• Build a pipeline of the steps (as much as possible)
y = x + y;
z = y - z;
1) add: load inst
2) add: load data (x, y)    sub: load inst
3) add: calc +              sub: load data (y, z)    next: load inst …
4) add: store y             sub: stall
5)                          sub: calc -
6)                          sub: store z
Branch Prediction
if (x == 0) { a++; }
else { b--; }
1) Load inst: if
2) Load data: x             Load inst: which one?
• Always yes
• Branch Prediction
• Do both (superscalar processing pipeline)
• Profile code and insert hints for runtime
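One concrete form of the last strategy (my example, not the deck's): GCC's __builtin_expect lets a programmer, or a profiling tool, record which way a branch usually goes so the compiler can keep the common path hot.

  #include <stdio.h>

  int a, b;

  void step(int x) {
      /* The hint says x == 0 is the common case, so the compiler lays
         out the a++ path as the straight-line (predicted) path. */
      if (__builtin_expect(x == 0, 1)) {
          a++;              /* common path */
      } else {
          b--;              /* rare path */
      }
  }

  int main(void) {
      for (int i = 0; i < 1000; i++)
          step(i % 100 == 0);           /* true only 1 time in 100 */
      printf("a = %d, b = %d\n", a, b);
      return 0;
  }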
Vector Processing
• SIMD - Single Instruction, Multiple Data
Vector A + Vector B = Vector C
Vector Processing
• Cray 1: 64 x 64 bit registers
• Could be software as well:
– Scalar: read the next instruction and decode it; get this
number; get that number; add them; put the result here
(repeat for every element)
– Vector: get the 10 numbers here, add them to the numbers
there, and put the results here
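In C the contrast looks like the loop below (my sketch, not the deck's): because no iteration depends on another, a vectorizing compiler (e.g. gcc -O3) can perform several of the additions per instruction.

  #include <stdio.h>
  #define N 10

  int main(void) {
      double a[N], b[N], c[N];
      int i;

      for (i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

      /* One logical operation over whole vectors: "get the N numbers
         here, add them to the numbers there, put the results here". */
      for (i = 0; i < N; i++)
          c[i] = a[i] + b[i];

      for (i = 0; i < N; i++)
          printf("c[%d] = %g\n", i, c[i]);
      return 0;
  }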
Parallel Computing is Not New
“Would you rather have 4 strong oxen, or
1024 chickens?”
Seymour Cray (Cray Research Inc.)
“Quantity has a quality all its own.”
Russian Saying
Amdahl’s Law
• Gene Amdahl (Architect for the IBM 360)
• Parallel Time = Wₛ + (Wₚ / N) + C(N)
• N = Number of processors
• Wₚ = Parallel Fraction
• Wₛ = Serial Fraction
• C(N) = Cost of setting up a job over N machines
• Assumption: Wₚ + Wₛ = W
Amdahl measured parallel fraction for several IBM codes of the day
and found it to be approx. 1/2. This meant the maximum
speedup on those codes would be a factor of 2.
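A back-of-the-envelope sketch in C (mine, not the deck's) that plugs that measurement into the formula, ignoring the setup cost C(N):

  #include <stdio.h>

  /* Speedup predicted by Amdahl's law with C(N) dropped:
     ws = serial fraction, wp = parallel fraction, ws + wp = 1. */
  double amdahl_speedup(double ws, double wp, int n) {
      return 1.0 / (ws + wp / n);
  }

  int main(void) {
      for (int n = 1; n <= 1024; n *= 4)
          printf("N = %4d  speedup = %.2f\n",
                 n, amdahl_speedup(0.5, 0.5, n));
      /* With a parallel fraction of 1/2, the speedup creeps toward
         2.0 no matter how many processors are added. */
      return 0;
  }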
Embarrassingly Parallel
• Large numbers of nearly identical jobs
• Identical analysis on large pools of input
data
• Easily subdivided, mostly independent
tasks (large code builds, raytracing,
rendering, bioinformatic sequence
analysis)
• User writes serial code and executes in
batches.
• Nearly always a speedup
Traditional Parallelism
• A single process in which several
parallel elements (threads) must
communicate to solve a single problem.
• Parallel programming is difficult, and far
from automatic
• Users must explicitly use parallel
programming tools
• Speedup is not guaranteed
Parallel vs. Serial codes
Entirely parallelizable
X[0] = 7 + 5
X[1] = 2 + 3
X[2] = 4 + 5
X[3] = 6 + 8
Loop dependencies
X[0] = 0
X[1] = X[0] + 1
X[2] = X[1] + 2
X[3] = X[2] + 3
Can be reduced to:
X[n] = (1 + n)(n/2)
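A small C rendering of the two right-hand cases (mine, not the deck's):

  #include <stdio.h>

  int main(void) {
      int x[4];

      /* Loop-dependent version: each element needs the one before
         it, so the iterations cannot run in parallel as written. */
      x[0] = 0;
      for (int n = 1; n < 4; n++)
          x[n] = x[n - 1] + n;

      /* Closed form from the slide: every element is independent,
         so this version parallelizes trivially. */
      for (int n = 0; n < 4; n++)
          printf("x[%d] = %d   closed form = %d\n",
                 n, x[n], (1 + n) * n / 2);
      return 0;
  }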
Loop Unrolling
• Simple, obvious things can be done automatically
(and already are)
• If the loop iterations are independent of one another, the
compiler can safely reorder or unroll the loop. These two
loops are equivalent:
for (n = 0; n < 10; n++) { a[n]++; }
for (n = 1; n <= 10; n++) { a[n-1]++; }
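For contrast, actual unrolling replicates the loop body so the counter is tested and incremented half as often; a sketch (mine, not the deck's):

  #include <stdio.h>

  int main(void) {
      int a[10] = {0};
      int n;

      /* Unrolled by a factor of two: same net effect as the
         one-at-a-time loop above (10 divides evenly by 2). */
      for (n = 0; n < 10; n += 2) {
          a[n]++;
          a[n + 1]++;
      }

      for (n = 0; n < 10; n++)
          printf("a[%d] = %d\n", n, a[n]);
      return 0;
  }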
Parallel Processing Architectures
• Cycle Stealing / Network of Workstations
• Single System Image
– Log in and use the machine.
– Parallelism can be hidden.
• Message Passing vs. Shared Memory Architectures
• Portal Architecture (cluster or compute farm)
– Log in and submit jobs to be run in parallel.
– Parallelism is explicit
– Can use message passing.
Network of Workstations / Cycle Stealing
[Diagram: workstations on a public network, with "my job" running on one idle machine]
Network of Workstations / Cycle Stealing
[Diagram: the same network, now with "my job" and "also my job" spread across two idle machines]
Labs of Workstations
• Offer to improve the lab
machines by installing
hardware you need.
• Do not make the users suffer.
• Accept that this is a part time
resource (return is much less
than number of CPUs)
• Unless the owner of the lab
buys into the distributed
computing idea, there will be
trouble.
Cycle Stealing
• Variation in hosts
• Data motion
• Need small granularity in your problem
• Condor (U. Wisconsin)
• *@Home projects (e.g., SETI@home)
• United Devices
Shared Memory Multiprocessor
• SGI Origin, others.
• Limited scalability
• Remarkably expensive
NUMA
• Non-Uniform Memory Architecture: memory is globally
shared, but each CPU reaches its local memory faster
than remote memory
Message Passing
• Start up multiple instances of the same program
• Each figures out which one it is
• Can send messages between them.
• Requires a parallel programmer
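A minimal MPI sketch in C of that pattern (illustrative, not from the deck): every instance runs the same program, asks which rank it is, and the nonzero ranks each send a message to rank 0.

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv) {
      int rank, size;

      MPI_Init(&argc, &argv);                /* start all the instances  */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which one am I?          */
      MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many of us are there */

      if (rank == 0) {
          int i, who;
          for (i = 1; i < size; i++) {
              MPI_Recv(&who, 1, MPI_INT, i, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              printf("rank 0 heard from rank %d\n", who);
          }
      } else {
          MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
      }

      MPI_Finalize();
      return 0;
  }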
Supercomputing
• Really exploiting and tuning code for a particular
supercomputer requires a lot of hard work.
Cluster Computing
• A cost effective way to achieve large speedups
• Throughput rather than High Performance.
It’s all about power and money
Portal Architecture
[Diagram: users on the public network log in to a head node; the head node dispatches the jobs of user 1 across compute nodes on a private network, optionally split into cluster partitions]
Data motion
[Diagram: the same portal architecture; all data must flow through the head node to reach the compute nodes]
• Data I/O is a huge bottleneck for many types of computation
Portal Architecture
[Diagram: jobs of user 1 and jobs of user 2 share the compute nodes, both scheduled through the head node]
Distributed Resource Managers
(DRM)
• Maintain a queue of jobs
• Schedule jobs onto compute nodes
• Several Options, mostly identical:
– Sun GridEngine (SGE)
– Portable Batch System (PBS)
– Load Sharing Facility (LSF - Platform Computing)
Job Scheduling and Priority
• First In, First Out (FIFO)
• Fairshare
– Try to maintain a goal level of usage on the cluster.
– Going above that level lowers your priority
– Not using the system for a while raises priority
• Job Priority is a social / political issue.
Job Scheduling
• Sadly, even though users and managers understand share-tree
scheduling when the method is explained to them, they tend to
forget the details when they notice their jobs pending in the
wait list. Users who have been told to expect a 50% entitlement
to cluster resources get frustrated when they launch their jobs
and don't get to take over half of the cluster instantly.
Explaining that the 50% entitlement is a goal the scheduler
works to meet "as averaged over time…" falls upon deaf ears.
Heavy users get upset to learn that their current entitlement is
being "penalized" because their past usage greatly exceeded
their allotted share. Cluster admins then spend far too much
time attempting to "prove" to the user community that it is not
being shortchanged.
Stages of Cluster Use
1. I just need to get this one set of data processed.
2. This is a task that I will perform frequently.
3. I am the bane of my local administrator. I have my own
little cluster, plus a bunch of workstations in my lab. I wish I
had administrative access to the big cluster.
4. I have a pipeline of data which will always be subject to the
same analysis, and I run all my jobs on some large (set of)
central resource(s)
Example SGE usage
Parallel Programming
Why is the program slow?
• Who cares?
• Something about the way it was run
• Something about the system on which it was run
• Something about the program itself
Solution Strategies
High Performance vs. High Throughput
“Premature optimization is the root of all
evil in computer programming”
Donald Knuth
Photoshop Example
• Steve and Phil’s Photoshop
Demo
– 8 minutes
• On 8 Xserves
– 1 minute?
• No!
High Throughput
• 8 X Photoshop Demos on 8
Xserves
– 8 minutes?
• Yes!
• With Some Effort
High Performance
• Not 8 X Work in 1 X Time
• But 1 X Work in 1/8 X Time
• Partition the Problem?
• Limited by
–Application
–Data Parallelism
High Performance
• Sharpen, Blur, Diffuse,
Rotate, etc.
• Divide Task by Step?
• No!
– Steps are order dependent
– Can’t merge results
Divide by Step
[Diagram: the Photoshop sequence submitted through a distributed resource manager on the local area network, one step (sharpen, diffuse, rotate, blur, etc.) per node]
• Divide steps
• Perform steps
• Merge results
High Performance
• Divide by Image?
• 1/8th Image on Each of 8
Xserves
– 1 minute?
• Plausible, but…
– new work
– duplicated work
Divide by Image
[Diagram: the distributed resource manager hands 1/8th of the image to each of the 8 nodes]
• Divide image
• Perform steps
• Merge results
High Performance
• Divide Task by Layer?
• 1 of 8 Layers on Each of 8
Xserves
– 1 minute?
• Probably Yes!
– If each layer computes in the
same time
Divide by Layer
[Diagram: layers 1 through 8 rendered on the 8 nodes, one layer each]
• Render layers
• Merge results
High Performance
• Same Work in Less Time
• Application Specific
• Data Specific
• Use Specific
• Unless?
Tightly Coupled
[Diagram: Photoshop partitioned across nodes joined by a high speed / low latency switching fabric plus a private ethernet network, reachable from the "public" local area network]
• App partitions
• Pass messages
• Share memory
• PVM, MPI
Building Your Own HPC
Environment
Do It Yourself (DIY)
• It is possible to build a high performance computing
environment in your basement.
• Home users can now run Linux without having to be
computer experts.
• Labs and corporations can run cluster computers
without having to employ full time computer support
staff.
DIY: Physical Considerations
• Power
– Resting draw vs Loaded draw.
– Two Phase vs. Three Phase
• Cooling
– Air cooling the “Big Mac” would
have required 60MPH winds
• Space
• Physical Security
• Noise
DIY: Administration
• ~1 FTE to maintain and provide access to a large
cluster
• Also plan on some portion of a software developer
to assist with research (distinct from system
administration)
Cluster Size Thresholds
• 1-16 nodes
– Scrape by with poor practices
– Make a grad student do it.
• 32 nodes
– Physical management issues loom large
– Split out fileserver functions
• 128 nodes:
– Network gets interesting
– Power / cooling become major issues
• 1,024 nodes:
– Call the newspaper
Should I build my own cluster?
• Install takes time:
– Expert: ~1 day to configure
– Novice: Weeks to never.
• Systems support is ongoing.
– Well managed: ~1 FTE / 1,000 compute nodes.
• No need to share, custom configurations in real
time.
Infx Cluster Checklist
• User Model
• Applications
• Compilers
• Phys. Environment
• Know Bottlenecks
• Network & Topology
• Storage
• Maintenance
• Administration & Monitoring
• DRM
• Common User Env and File
System
User model & Use Cases
• Single User, Few, Many Users
• Groups of users
• Are some more equal than others
• Batch/bulk vs. singleton jobs
• High-throughput or high-performance
Application Requirements
• Many short running processes
• Few long running processes
• Ave/Max RAM requirement
• CPU and/or IO bound
• Single/Multi-threaded
• MPI/PVM/Linda parallel aware
• Will it run under *nix
Compilers
• GNU tools are great, but consider commercial compiler
options if you:
– are a performance freak
– are writing SMP or parallel apps
– write serious scientific programs in C(++)/Fortran
Phys/Environmental Constraints
• Available Power
• Available Cooling
• Density (Blades/1U/2U/Tower)
• DIY Staging space
• Raised floor or ceiling drops
• Height & width surprises
• Fire code
• Organizational standards
This could be you…
Know your bottlenecks
• I/O bound
– Sequence analysis is limited by the speed of your
disks and fileserver
• CPU bound
– Chemical and protein structure modeling are
generally CPU bound
• RAM bound
– Some apps (eg Gaussian) are bound by speed of
memory access
• Network bound
– Network IO/IPC
Network & Interconnects
• Bandwidth for IO/IPC
• Parallel networks?
• High speed interconnect(s)?
– Enough PCI slots?
• Network topology effects:
– Scaling and growth
– Wire management
– Access to external networks
High Speed Interconnects
• When low latency message passing is critical
– Massively parallel applications
• Not generally needed in BioClusters (yet)
• Can add 50% or more to cost of each server
• No magic, must be planned for
– Applications, APIs, code, compilers, PCI slots, cable
management & rack space
• Commercial products
– Myrinet (www.myricom.com)
– Dolphin SCI (www.dolphinics.com)
Storage
• Most BioClusters are I/O bound
• Not an area to pinch pennies
• NAS vs. SAN vs. Local Disk
• Pick the right RAID levels
• Heterogeneous storage
• Plan for data staging and caching
SAN vs. NAS vs. Local Disk
• SAN generally inappropriate
• NAS or Hybrid NAS/SAN is best
– multiple clients with concurrent read/write
access to the same file system or volume
• Local Disk
– Diskless nodes inappropriate
– Expensive SCSI disks are unnecessary
– Large, cheap IDE drives allow for clever data
caching
– Best way to avoid thrashing a fileserver
RAID Levels
• RAID 0: Stripe
– Fast / risky
• RAID 1: Mirror
– Great, if you can afford double the disk
• RAID 5:
– Able to lose any individual disk and still keep data
• Striped Banks of RAID 5:
– Scalable, stable solution
Stage Data to Local Disk
• A $250K NAS server can be brought to its knees by a few large BLAST
searches
• Active clusters will tax even the fastest arrays. This is mostly
unavoidable.
• Plan for data staging
– Move data from fileserver to cheap local disk on cluster compute
nodes
– No magic; users and software developers need to do this explicitly in
their workflows and pipelines
Maintenance Philosophy
• Compute nodes must be
– Anonymous, Interchangeable, Disposable
• 3 Possible states
– Running / Online
– Faulted / Re-imaging
– Failed / Offline & marked for replacement
• Administration must be
– Scalable, Automated, Remote
The Three “R”s of a cluster: Reboot, Reimage, Replace
Monitoring and Reporting
• Many high quality free tools
– Ganglia, Big Brother, RRDTool, MRTG
– sar, ntop, etc.
• Commercial tools too
• Log files are the poor man’s trending tool
– System, daemon, DRM
OS Installation and Updates
• SystemImager – www.systemimager.org
– “SystemImager is software that automates UNIX
installs, software distribution, and production
deployment.”
– Unattended disk partitioning and UNIX installation
– Incremental updates of active systems
– Based on open standards and tools
• RSYNC, SSH, DHCP, TFTP
– Totally free, open source
– Merging with IBM LUI project
• Also: Apple NetBoot
Common DRM suites
• Open Source and/or freely licensed
– OpenPBS
– Sun GridEngine
• Commercially available
– PBS Pro
– Platform LSF
DRM: My $.02
• At this time Platform LSF is still technically the best
choice for serious production BioClusters
– Lowest administrative/operational burden
– Fault tolerance features are unmatched
• 2nd choice(s)
– GridEngine, if you can support it internally
Go with your local expertise
A Hybrid Approach
• Systems group
– Specify supported configurations
– Maintain a central machine room
– Configure scheduler to give owners priority on their own
nodes.
• Researchers
– Estimate their computational load
– Include a line item for the required number of nodes
What can I do today?
• CS:
– Take biology coursework
– Accept that biology is really, really
complex and difficult.
• Bio:
– Take CS coursework
– Accept that computer engineering /
software development is tricky.
• Administrators:
– Decide to build a “spire, which will be
visible from afar”
• All:
– Attend Journal Clubs, symposia, etc.
– Get a bigger monitor
The future
• All scientists will write computer programs
• “Computational Biology” will sound just as redundant as
“Computational Physics”
• Most labs will have a small cluster and some local expertise,
plus collaborations with supercomputing centers
• Grid / Web Services technology will enable cool things.
Más contenido relacionado

La actualidad más candente

High Performance Computing using MPI
High Performance Computing using MPIHigh Performance Computing using MPI
High Performance Computing using MPIAnkit Mahato
 
High Performance Computing: an Introduction for the Society of Actuaries
High Performance Computing: an Introduction for the Society of ActuariesHigh Performance Computing: an Introduction for the Society of Actuaries
High Performance Computing: an Introduction for the Society of ActuariesAdam DeConinck
 
Introduction to High-Performance Computing
Introduction to High-Performance ComputingIntroduction to High-Performance Computing
Introduction to High-Performance ComputingUmarudin Zaenuri
 
High performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspectiveHigh performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspectiveJason Shih
 
High performance computing
High performance computingHigh performance computing
High performance computingGuy Tel-Zur
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)Brian Brazil
 
Overview of HPC.pptx
Overview of HPC.pptxOverview of HPC.pptx
Overview of HPC.pptxsundariprabhu
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101Huy Vo
 
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable,  Robust Kafka ReplicatoruReplicator: Uber Engineering’s Scalable,  Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable, Robust Kafka ReplicatorMichael Hongliang Xu
 
Intro to Machine Learning for GPUs
Intro to Machine Learning for GPUsIntro to Machine Learning for GPUs
Intro to Machine Learning for GPUsSri Ambati
 
Overview of kubernetes network functions
Overview of kubernetes network functionsOverview of kubernetes network functions
Overview of kubernetes network functionsHungWei Chiu
 
Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.Yurii Bychenok
 
Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013Jun Rao
 
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...InfluxData
 
Task Scheduling Using Firefly algorithm with cloudsim
Task Scheduling Using Firefly algorithm with cloudsimTask Scheduling Using Firefly algorithm with cloudsim
Task Scheduling Using Firefly algorithm with cloudsimAqilIzzuddin
 

La actualidad más candente (20)

High Performance Computing using MPI
High Performance Computing using MPIHigh Performance Computing using MPI
High Performance Computing using MPI
 
High Performance Computing: an Introduction for the Society of Actuaries
High Performance Computing: an Introduction for the Society of ActuariesHigh Performance Computing: an Introduction for the Society of Actuaries
High Performance Computing: an Introduction for the Society of Actuaries
 
Introduction to High-Performance Computing
Introduction to High-Performance ComputingIntroduction to High-Performance Computing
Introduction to High-Performance Computing
 
High performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspectiveHigh performance computing - building blocks, production & perspective
High performance computing - building blocks, production & perspective
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)An Introduction to Prometheus (GrafanaCon 2016)
An Introduction to Prometheus (GrafanaCon 2016)
 
Overview of HPC.pptx
Overview of HPC.pptxOverview of HPC.pptx
Overview of HPC.pptx
 
Introduction to helm
Introduction to helmIntroduction to helm
Introduction to helm
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable,  Robust Kafka ReplicatoruReplicator: Uber Engineering’s Scalable,  Robust Kafka Replicator
uReplicator: Uber Engineering’s Scalable, Robust Kafka Replicator
 
Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)
 
Intro to Machine Learning for GPUs
Intro to Machine Learning for GPUsIntro to Machine Learning for GPUs
Intro to Machine Learning for GPUs
 
Overview of kubernetes network functions
Overview of kubernetes network functionsOverview of kubernetes network functions
Overview of kubernetes network functions
 
Network Service Mesh
Network Service MeshNetwork Service Mesh
Network Service Mesh
 
Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.Pulumi. Modern Infrastructure as Code.
Pulumi. Modern Infrastructure as Code.
 
Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013
 
Apache Flink Hands On
Apache Flink Hands OnApache Flink Hands On
Apache Flink Hands On
 
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
 
Task Scheduling Using Firefly algorithm with cloudsim
Task Scheduling Using Firefly algorithm with cloudsimTask Scheduling Using Firefly algorithm with cloudsim
Task Scheduling Using Firefly algorithm with cloudsim
 
TPU paper slide
TPU paper slideTPU paper slide
TPU paper slide
 

Similar a Introduction to HPC

CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchTom Connor
 
Denver devops : enabling DevOps with data virtualization
Denver devops : enabling DevOps with data virtualizationDenver devops : enabling DevOps with data virtualization
Denver devops : enabling DevOps with data virtualizationKyle Hailey
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintrothomasrconnor
 
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardUsing Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardDocker, Inc.
 
Desktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omicsDesktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omicsDavid Wallom
 
The Exascale Computing Project and the future of HPC
The Exascale Computing Project and the future of HPCThe Exascale Computing Project and the future of HPC
The Exascale Computing Project and the future of HPCinside-BigData.com
 
The Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentThe Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentFrederick Reiss
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores inside-BigData.com
 
Why 2015 is the Year of Copy Data - What are the requirements?
Why 2015 is the Year of Copy Data - What are the requirements?Why 2015 is the Year of Copy Data - What are the requirements?
Why 2015 is the Year of Copy Data - What are the requirements?Storage Switzerland
 
How to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experienceHow to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experienceDocker, Inc.
 
2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The Trenches2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The TrenchesGeorge Ang
 
"The Cutting Edge Can Hurt You"
"The Cutting Edge Can Hurt You""The Cutting Edge Can Hurt You"
"The Cutting Edge Can Hurt You"Chris Dwan
 
Reducing Downtime Using Incremental Backups X-Platform TTS
Reducing Downtime Using Incremental Backups X-Platform TTSReducing Downtime Using Incremental Backups X-Platform TTS
Reducing Downtime Using Incremental Backups X-Platform TTSEnkitec
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationAmir Hossein Sorouri
 
Docker: Containers for Data Science
Docker: Containers for Data ScienceDocker: Containers for Data Science
Docker: Containers for Data ScienceAlessandro Adamo
 

Similar a Introduction to HPC (20)

CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB Launch
 
Denver devops : enabling DevOps with data virtualization
Denver devops : enabling DevOps with data virtualizationDenver devops : enabling DevOps with data virtualization
Denver devops : enabling DevOps with data virtualization
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintro
 
Climb bath
Climb bathClimb bath
Climb bath
 
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah BardUsing Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard
 
Desktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omicsDesktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omics
 
The Exascale Computing Project and the future of HPC
The Exascale Computing Project and the future of HPCThe Exascale Computing Project and the future of HPC
The Exascale Computing Project and the future of HPC
 
The Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentThe Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter Deployment
 
Cavity Data
Cavity DataCavity Data
Cavity Data
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores 
 
ch1.ppt
ch1.pptch1.ppt
ch1.ppt
 
Why 2015 is the Year of Copy Data - What are the requirements?
Why 2015 is the Year of Copy Data - What are the requirements?Why 2015 is the Year of Copy Data - What are the requirements?
Why 2015 is the Year of Copy Data - What are the requirements?
 
How to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experienceHow to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experience
 
2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The Trenches2010 AIRI Petabyte Challenge - View From The Trenches
2010 AIRI Petabyte Challenge - View From The Trenches
 
"The Cutting Edge Can Hurt You"
"The Cutting Edge Can Hurt You""The Cutting Edge Can Hurt You"
"The Cutting Edge Can Hurt You"
 
Reducing Downtime Using Incremental Backups X-Platform TTS
Reducing Downtime Using Incremental Backups X-Platform TTSReducing Downtime Using Incremental Backups X-Platform TTS
Reducing Downtime Using Incremental Backups X-Platform TTS
 
Hands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestrationHands on kubernetes_container_orchestration
Hands on kubernetes_container_orchestration
 
Available HPC Resources at CSUC
Available HPC Resources at CSUCAvailable HPC Resources at CSUC
Available HPC Resources at CSUC
 
Docker: Containers for Data Science
Docker: Containers for Data ScienceDocker: Containers for Data Science
Docker: Containers for Data Science
 

Más de Chris Dwan

Somerville Police Staffing Final Report.pdf
Somerville Police Staffing Final Report.pdfSomerville Police Staffing Final Report.pdf
Somerville Police Staffing Final Report.pdfChris Dwan
 
2023 Ward 2 community meeting.pdf
2023 Ward 2 community meeting.pdf2023 Ward 2 community meeting.pdf
2023 Ward 2 community meeting.pdfChris Dwan
 
One Size Does Not Fit All
One Size Does Not Fit AllOne Size Does Not Fit All
One Size Does Not Fit AllChris Dwan
 
Somerville FY23 Proposed Budget
Somerville FY23 Proposed BudgetSomerville FY23 Proposed Budget
Somerville FY23 Proposed BudgetChris Dwan
 
Production Bioinformatics, emphasis on Production
Production Bioinformatics, emphasis on ProductionProduction Bioinformatics, emphasis on Production
Production Bioinformatics, emphasis on ProductionChris Dwan
 
#Defund thepolice
#Defund thepolice#Defund thepolice
#Defund thepoliceChris Dwan
 
2009 cluster user training
2009 cluster user training2009 cluster user training
2009 cluster user trainingChris Dwan
 
No Free Lunch: Metadata in the life sciences
No Free Lunch:  Metadata in the life sciencesNo Free Lunch:  Metadata in the life sciences
No Free Lunch: Metadata in the life sciencesChris Dwan
 
Somerville ufc memo tree hearing
Somerville ufc memo   tree hearingSomerville ufc memo   tree hearing
Somerville ufc memo tree hearingChris Dwan
 
2011 career-fair
2011 career-fair2011 career-fair
2011 career-fairChris Dwan
 
Advocacy in the Enterprise (what works, what doesn't)
Advocacy in the Enterprise (what works, what doesn't)Advocacy in the Enterprise (what works, what doesn't)
Advocacy in the Enterprise (what works, what doesn't)Chris Dwan
 
Intro bioinformatics
Intro bioinformaticsIntro bioinformatics
Intro bioinformaticsChris Dwan
 
Proposed tree protection ordinance
Proposed tree protection ordinanceProposed tree protection ordinance
Proposed tree protection ordinanceChris Dwan
 
Tree Ordinance Change Matrix
Tree Ordinance Change MatrixTree Ordinance Change Matrix
Tree Ordinance Change MatrixChris Dwan
 
Tree protection overhaul
Tree protection overhaulTree protection overhaul
Tree protection overhaulChris Dwan
 
Response from newport
Response from newportResponse from newport
Response from newportChris Dwan
 
Sacramento underpass bid_docs
Sacramento underpass bid_docsSacramento underpass bid_docs
Sacramento underpass bid_docsChris Dwan
 
2019 BioIt World - Post cloud legacy edition
2019 BioIt World - Post cloud legacy edition2019 BioIt World - Post cloud legacy edition
2019 BioIt World - Post cloud legacy editionChris Dwan
 
Somerville tree stat 2019 02 12
Somerville tree stat 2019 02 12Somerville tree stat 2019 02 12
Somerville tree stat 2019 02 12Chris Dwan
 
Ivaloo harrison kent
Ivaloo harrison kentIvaloo harrison kent
Ivaloo harrison kentChris Dwan
 

Más de Chris Dwan (20)

Somerville Police Staffing Final Report.pdf
Somerville Police Staffing Final Report.pdfSomerville Police Staffing Final Report.pdf
Somerville Police Staffing Final Report.pdf
 
2023 Ward 2 community meeting.pdf
2023 Ward 2 community meeting.pdf2023 Ward 2 community meeting.pdf
2023 Ward 2 community meeting.pdf
 
One Size Does Not Fit All
One Size Does Not Fit AllOne Size Does Not Fit All
One Size Does Not Fit All
 
Somerville FY23 Proposed Budget
Somerville FY23 Proposed BudgetSomerville FY23 Proposed Budget
Somerville FY23 Proposed Budget
 
Production Bioinformatics, emphasis on Production
Production Bioinformatics, emphasis on ProductionProduction Bioinformatics, emphasis on Production
Production Bioinformatics, emphasis on Production
 
#Defund thepolice
#Defund thepolice#Defund thepolice
#Defund thepolice
 
2009 cluster user training
2009 cluster user training2009 cluster user training
2009 cluster user training
 
No Free Lunch: Metadata in the life sciences
No Free Lunch:  Metadata in the life sciencesNo Free Lunch:  Metadata in the life sciences
No Free Lunch: Metadata in the life sciences
 
Somerville ufc memo tree hearing
Somerville ufc memo   tree hearingSomerville ufc memo   tree hearing
Somerville ufc memo tree hearing
 
2011 career-fair
2011 career-fair2011 career-fair
2011 career-fair
 
Advocacy in the Enterprise (what works, what doesn't)
Advocacy in the Enterprise (what works, what doesn't)Advocacy in the Enterprise (what works, what doesn't)
Advocacy in the Enterprise (what works, what doesn't)
 
Intro bioinformatics
Intro bioinformaticsIntro bioinformatics
Intro bioinformatics
 
Proposed tree protection ordinance
Proposed tree protection ordinanceProposed tree protection ordinance
Proposed tree protection ordinance
 
Tree Ordinance Change Matrix
Tree Ordinance Change MatrixTree Ordinance Change Matrix
Tree Ordinance Change Matrix
 
Tree protection overhaul
Tree protection overhaulTree protection overhaul
Tree protection overhaul
 
Response from newport
Response from newportResponse from newport
Response from newport
 
Sacramento underpass bid_docs
Sacramento underpass bid_docsSacramento underpass bid_docs
Sacramento underpass bid_docs
 
2019 BioIt World - Post cloud legacy edition
2019 BioIt World - Post cloud legacy edition2019 BioIt World - Post cloud legacy edition
2019 BioIt World - Post cloud legacy edition
 
Somerville tree stat 2019 02 12
Somerville tree stat 2019 02 12Somerville tree stat 2019 02 12
Somerville tree stat 2019 02 12
 
Ivaloo harrison kent
Ivaloo harrison kentIvaloo harrison kent
Ivaloo harrison kent
 

Último

Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 

Último (20)

Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 

Introduction to HPC

  • 14. Cray Supercomputers
    • 1976: Cray 1 (and its successors: XMP, YMP, C90, J90, T90)
    • 1985: Cray 2
    • 1993: Cray 3 (one machine delivered)
    • …
    • Present: X1, XT3, XD1, SX-6
  • 15. Clusters
    • 1993: Beowulf: custom interconnects (switched Ethernet was too expensive)
    • 2000: commercial cluster sales (Linux Networx)
    • 2003: 7 of the top 10 supercomputers are clusters; 40% of the top 500 supercomputers are clusters
    • 2004: Apple “workgroup cluster”
  • 16. “Big Mac”
    • Virginia Tech (2003): 3rd fastest supercomputer in the world, for $5.4 million
    • Ordered from Apple’s web sales page
    “Virginia Tech has aspirations … This is one of those spires that one can build that will be visible from afar.”
    -Hassan Aref, Dean of Engineering, Virginia Tech
  • 19. 2004 - Apple Server Products
    • Xserve: dual G5, up to 8 GB RAM
    • Xserve RAID: 5.6 TB storage per unit; Xsan to combine up to 64 TB
    • Apple Workgroup Cluster: packaged with iNquiry
  • 20. 2004: Cray X1
    • Scales to 4,096 CPUs
    • 4 CPUs per node
    • Scales to 32 TB RAM, globally addressable
    • 34.1 GB/sec per-CPU memory bandwidth
  • 21. 2004: Orion MultiSystems
    • 10 to 96 CPUs in a deskside box
    • 1.2 GHz chips, but lots of them
    • Pre-packaged cluster
    • Powered by a single, standard 15A wall socket
  • 22. Other Observations, 2004
    • Major computer manufacturers find their profits in selling sub-$500 consumer electronics and ink.
    • Style (see-through cases, round cables, etc.) is the determining characteristic in workstation purchases.
  • 23. How Computers Work
  • 24. Context Switching
    • At any one time, only one process is actually executing on one CPU.
    • Switching between processes takes time and is driven by interrupts.
    • A switch must capture the running process’s state (registers and allocated memory) and write it off to memory before another job takes the CPU.
    (Diagram: one job on the CPU, other jobs waiting)
  • 25. OS and Interrupts
    • The OS switches between processes from time to time
    • It also performs “housekeeping” tasks
    • Interrupts force OS context switches:
      – I/O
      – Power fault
      – Disk is ready to send data
      – …
  • 26. Memory
    • CPU / Registers
      – Physically part of the CPU
      – Immediately accessible by the machine code
      – ~128 registers on modern chips
    • Cache
      – 1 - 2 MB of very fast memory, also built into the chip
    • RAM
      – 1 - 8 GB (cough cough)
  • 27. Memory Timings (2004)
    • CPU / Registers: ~10^-9 seconds per instruction
    • Cache: low latency
    • Memory: latency ~10^2 cycles (~300 cycles); streaming ~0.8 GB/sec (~1 byte/cycle)
    • Disk: latency ~10^-3 seconds (~10^6 cycles); streaming ~100 MB/sec (~10^-1 bytes/cycle)
    • Tape: seconds to minutes
    (A small demonstration of why this hierarchy matters follows below.)
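To make the hierarchy above concrete, here is a minimal sketch (ours, not from the original deck) that sums the same array twice: once walking sequentially, once jumping by a large stride. Both versions do the same number of additions, but the strided walk defeats the cache and usually runs several times slower. Compile with something like gcc -O2 stride.c (the file name is our own):

    /* Cache-friendliness demo: sequential vs. strided walk over one array. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1 << 24)      /* 16M ints, far larger than any 2004-era cache */
    #define STRIDE 4096      /* jump far enough to miss the cache every time */

    int main(void) {
        int *a = malloc(N * sizeof(int));
        if (a == NULL) return 1;
        for (long i = 0; i < N; i++) a[i] = 1;

        /* Sequential: the cache prefetches the next lines for us. */
        clock_t t0 = clock();
        long sum1 = 0;
        for (long i = 0; i < N; i++) sum1 += a[i];
        double seq = (double)(clock() - t0) / CLOCKS_PER_SEC;

        /* Strided: same total work, but nearly every access is a cache miss. */
        t0 = clock();
        long sum2 = 0;
        for (long s = 0; s < STRIDE; s++)
            for (long i = s; i < N; i += STRIDE) sum2 += a[i];
        double str = (double)(clock() - t0) / CLOCKS_PER_SEC;

        printf("sequential: %f s   strided: %f s   (sums %ld %ld)\n",
               seq, str, sum1, sum2);
        free(a);
        return 0;
    }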
  • 28. Memory to the Program
    • One large address space: 0 to 2^32 (or 2^64, or some other value), relative to the program
    • When memory is used, the “stack” increases
    • The program’s “memory footprint” is the amount of memory allocated. Larger footprints can step out of cache and even out of RAM.
    • “Segmentation violation” means “you tried to access memory that’s not yours” (illustrated below)
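As an illustration of that “segmentation violation” (ours, not from the deck), the shortest program that earns one simply dereferences an address the OS never allocated to it:

    /* Touching memory that is not ours: the kernel answers with SIGSEGV. */
    #include <stdio.h>

    int main(void) {
        volatile int *p = (volatile int *)0;  /* address 0 is never ours     */
        printf("about to touch memory that is not ours...\n");
        *p = 42;                              /* segmentation violation here */
        return 0;                             /* never reached */
    }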
  • 29. 32 vs. 64 Bits
    • Number of bits in a single memory “word”
    • Affects:
      – Precision of calculations
      – Maximum memory address space
      – Compatibility of files
      – Memory data bandwidth
      – Marketing
  • 30. Potential Limits
    • Largest integer: 2^32 or 2^64
    • Largest file: 2 GB (not usually a problem anymore, but it crops up at really annoying times)
    • Smallest / largest floating point number
    • Number of files on disk (inodes)
    (A sketch that prints these limits follows below.)
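A minimal sketch (ours) that prints these limits for whatever machine it is compiled on; limits.h is the standard C header that defines them:

    /* Print the word-size limits discussed above for this machine. */
    #include <stdio.h>
    #include <limits.h>

    int main(void) {
        printf("bits in a pointer : %zu\n", sizeof(void *) * CHAR_BIT);
        printf("largest int       : %d\n", INT_MAX);
        printf("largest long      : %ld\n", LONG_MAX);

        /* The classic 32-bit surprise: signed overflow wraps in practice
         * (formally it is undefined behavior in C). */
        int x = INT_MAX;
        printf("INT_MAX + 1       : %d\n", x + 1);
        return 0;
    }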
  • 31. System Monitoring Examples
    • ps
    • top
    • iostat
  • 32. Abstraction: Convenience vs. Time
    (Layers, from the hardware up:)
    – Hardware: voltages, clocks, transistors
    – Microcode
    – Assembly Language
    – Operating System
    – User Interface
  • 33. Compiled vs. Interpreted
    • Script: interpreted one line at a time
      – sh, csh, bash, tcsh, Perl, tcl, Ruby, …
      – Much faster development (to a point)
      – Can be slow
      – Program is relatively generic / portable (cough cough)
    • Compiled language: code is translated into assembly by a “compiler”
      – FORTRAN, PASCAL, C, C++, …
      – Can optimize for the specific architecture at compile time
      – Intellectual property is hidden in the compiled code
  • 34. Performance Measurement
    • Wall clock time: how long you waited
    • User time: CPU time spent running your own code
    • System time: CPU time the kernel spent working on your job’s behalf
    (A sketch contrasting wall-clock and CPU time follows below.)
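A minimal sketch (ours, not from the deck) contrasting wall-clock time with CPU time: sleep() burns wall-clock time but almost no CPU, while the busy loop burns both. Running any program under the Unix time command reports the same kinds of numbers as real, user, and sys:

    /* Wall-clock vs. CPU time: waiting is not computing. */
    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    int main(void) {
        time_t  wall0 = time(NULL);
        clock_t cpu0  = clock();

        sleep(2);                        /* waiting: wall clock advances, CPU idle */
        volatile long sum = 0;
        for (long i = 0; i < 200000000L; i++)
            sum += i;                    /* computing: both clocks advance */

        printf("wall clock: %ld s   CPU: %.2f s\n",
               (long)(time(NULL) - wall0),
               (double)(clock() - cpu0) / CLOCKS_PER_SEC);
        return 0;
    }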
  • 35. Example Program
    • Script in Perl
    • Program in C
    • Compile with optimization
    • Remove I/O
    • Example memory allocation “bug” (one plausible version is sketched below)
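The original demo is not reproduced here, so as a stand-in, here is a minimal sketch of the kind of memory allocation “bug” the slide most likely refers to (this particular bug is our assumption): allocating inside a loop and never freeing, so the footprint grows until the process falls out of cache, then out of RAM:

    /* A leaky loop: the footprint grows by 4 KB per iteration. */
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        for (long i = 0; i < 1000000L; i++) {
            char *buf = malloc(4096);   /* a fresh 4 KB block every time   */
            if (buf == NULL) return 1;  /* eventually malloc gives up      */
            memset(buf, 0, 4096);
            /* BUG: free(buf) is missing, so ~4 GB are requested in total */
        }
        return 0;
    }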
  • 36. High Performance Computing
  • 37. What Is “Super” Computing?
    • Faster than what most other people are doing
    • A $10^6+ investment
    • Custom / innovative design
    It depends. http://www.top500.org
  • 38. Just Make the System Faster
    • Increase memory bandwidth
    • Increase memory size
    • Decrease clock time
    Clearly limited.
  • 39. Superscalar Architectures
    • More than one ALU
    • More than one CPU
    • Instructions can happen in parallel
    • Most modern CPUs are superscalar to some level
  • 40. Pipelining
    • Break each instruction into a series of steps
    • Build a pipeline of the steps (as much as possible)
    Example: Y = x + y; Z = y - z;
      Cycle 1: Load inst: add
      Cycle 2: Load data: X, Y | Load inst: sub
      Cycle 3: Calc: +         | Load data: Y, Z | Load inst: …
      Cycle 4: Store: Y        | Stall
      Cycle 5:                 | Calc: -
      Cycle 6:                 | Store: Z
  • 41. Branch Prediction
    if (x == 0) { a++; }
    else        { b--; }
    Load inst: “if”; load data: “x”; load instruction: which one?
    Strategies:
    • Always assume yes
    • Branch prediction
    • Do both (superscalar processing pipeline)
    • Profile code and insert hints for runtime
  • 42. Vector Processing
    • SIMD: Single Instruction, Multiple Data
    (Diagram: Vector A + Vector B = Vector C)
  • 43. Vector Processing (continued)
    • Cray 1: 64 x 64-bit registers
    • Could be done in software as well. Instead of, per element:
      – read the next instruction and decode it
      – get this number
      – get that number
      – add them
      – put the result here
    …one vector instruction says: get the 10 numbers here, add them to the numbers there, and put the results here.
    (A C sketch of a vectorizable loop follows below.)
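A minimal sketch (ours) of a loop in exactly this SIMD style: one operation applied element-wise across whole vectors, with no dependency between iterations, so vector hardware or a vectorizing compiler can execute many additions per cycle:

    /* Element-wise vector add: same operation, independent data. */
    #include <stdio.h>

    #define N 64   /* the Cray 1's vector registers held 64 words */

    int main(void) {
        double a[N], b[N], c[N];
        for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

        /* No loop-carried dependency, so the whole loop maps onto
         * vector hardware (or SSE/AltiVec units on commodity chips). */
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        printf("c[0]=%g  c[%d]=%g\n", c[0], N - 1, c[N - 1]);
        return 0;
    }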
  • 44. Parallel Computing is Not New
  • 45. “Would you rather have 4 strong oxen, or 1024 chickens?”
    -Seymour Cray (Cray Research Inc.)
    “Quantity has a quality all its own.”
    -Russian saying
  • 46. Amdahl’s Law
    • Gene Amdahl (architect of the IBM 360)
    • Parallel time: T(N) = Ws + (Wp / N) + C(N)
      – N: number of processors
      – Wp: parallel fraction
      – Ws: serial fraction
      – C(N): cost of setting up a job over N machines
      – Assumption: Wp + Ws = W
    • Amdahl measured the parallel fraction for several IBM codes of the day and found it to be approx. 1/2. This meant the maximum speedup on those codes would be a factor of 2.
    (A worked evaluation of the formula follows below.)
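A worked evaluation of the slide’s formula (ours): with Amdahl’s measured split Ws = Wp = 0.5 and, as a simplifying assumption of our own, zero setup cost C(N) = 0, the speedup T(1)/T(N) climbs toward 2 and never passes it:

    /* Amdahl's Law: T(N) = Ws + Wp/N, speedup = T(1)/T(N). */
    #include <stdio.h>

    double t(double ws, double wp, int n) { return ws + wp / n; }

    int main(void) {
        double ws = 0.5, wp = 0.5;   /* Amdahl's measured 50/50 split */
        for (int n = 1; n <= 1024; n *= 4)
            printf("N=%5d  speedup=%.3f\n", n, t(ws, wp, 1) / t(ws, wp, n));
        return 0;
    }

With these numbers the output approaches 2.000 as N grows: doubling the node count past a few dozen buys almost nothing once the serial half dominates.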
  • 47. Embarrassingly Parallel
    • Large numbers of nearly identical jobs
    • Identical analysis on large pools of input data
    • Easily subdivided, mostly independent tasks (large code builds, raytracing, rendering, bioinformatic sequence analysis)
    • User writes serial code and executes in batches
    • Nearly always a speedup
  • 48. Traditional Parallelism
    • A single process in which several parallel elements (threads) must communicate to solve a single problem
    • Parallel programming is difficult, and far from automatic
    • Users must explicitly use parallel programming tools
    • Speedup is not guaranteed
  • 49. Parallel vs. Serial Codes
    Entirely parallelizable:
      X[0] = 7 + 5
      X[1] = 2 + 3
      X[2] = 4 + 5
      X[3] = 6 + 8
    Loop dependencies:
      X[0] = 0
      X[1] = X[0] + 1
      X[2] = X[1] + 2
      X[3] = X[2] + 3
    …which can be reduced to the closed form X[n] = n(n + 1)/2.
  • 50. Loop Unrolling
    • Simple, obvious things can be done automatically (and already are)
    • If the contents of the loop are invariant with the iterator, we can safely unroll the loop:
      for (n = 0; n < 10; n++) { a[n]++; }
    becomes, unrolled by two (fewer branch tests for the same work):
      for (n = 0; n < 10; n += 2) { a[n]++; a[n + 1]++; }
  • 51. Parallel Processing Architectures
    • Cycle stealing / network of workstations
    • Single system image
      – Log in and use the machine
      – Parallelism can be hidden
    • Message passing vs. shared memory architectures
    • Portal architecture (cluster or compute farm)
      – Log in and submit jobs to be run in parallel
      – Parallelism is explicit
      – Can use message passing
  • 52. Network of Workstations / Cycle Stealing
    (Diagram: workstations on a public network; “My Job” runs on one of them)
  • 53. Network of Workstations / Cycle Stealing
    (Same diagram: “My Job” and “Also My Job” spread across the workstations)
  • 54. Labs of Workstations
    • Offer to improve the lab machines by installing hardware you need
    • Do not make the users suffer
    • Accept that this is a part-time resource (return is much less than the number of CPUs)
    • Unless the owner of the lab buys into the distributed computing idea, there will be trouble
  • 55. Cycle Stealing
    • Variation in hosts
    • Data motion
    • Need small granularity in your problem
    • Examples: Condor (U. Wisconsin), the “*@Home” projects, United Devices
  • 56. Shared Memory Multiprocessor
    • SGI Origin, others
    • Limited scalability
    • Remarkably expensive
  • 57. NUMA: Non Uniform Memory Architecture
  • 58. Message Passing
    • Start up multiple instances of the same program
    • Each figures out which one it is
    • The instances can send messages among themselves
    • Requires a parallel programmer
    (A minimal MPI sketch follows below.)
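A minimal sketch (ours, not from the deck) of this pattern using MPI, the most common message-passing library: every copy of the program starts identically, learns its own rank, and then rank 1 sends a number to rank 0. With a typical MPI installation it would be built and run with mpicc msg.c and mpirun -np 2 ./a.out (the file name is our own):

    /* Message passing in miniature: N identical instances, one message. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);                  /* start up               */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which instance am I?   */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* how many of us exist?  */

        if (rank == 1) {
            int payload = 42;
            MPI_Send(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0 && size > 1) {
            int payload;
            MPI_Recv(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 0 of %d received %d from rank 1\n", size, payload);
        }
        MPI_Finalize();
        return 0;
    }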
  • 59. Supercomputing
    • Really exploiting and tuning code for a particular supercomputer requires a lot of hard work.
  • 60. Cluster Computing
    • A cost-effective way to achieve large speedups
    • Throughput rather than high performance
    It’s all about power and money.
  • 62. Portal Architecture
    (Diagram: head node on the public network; compute nodes on a private network; cluster partitions running the jobs of user 1)
  • 63. Data Motion
    (Same diagram.) Data I/O is a huge bottleneck for many types of computation.
  • 64. Portal Architecture
    (Same diagram, now with the jobs of user 1 and user 2 on separate partitions)
  • 65. Distributed Resource Managers (DRMs)
    • Maintain a queue of jobs
    • Schedule jobs onto compute nodes
    • Several options, mostly identical:
      – Sun GridEngine (SGE)
      – Portable Batch System (PBS)
      – Load Sharing Facility (LSF, Platform Computing)
  • 66. Job Scheduling and Priority
    • First In, First Out (FIFO)
    • Fairshare
      – Try to maintain a goal level of usage on the cluster
      – Going above that level lowers your priority
      – Not using the system for a while raises priority
    • Job priority is a social / political issue
  • 67. Job Scheduling
    Sadly, even though users and managers understand share-tree scheduling when the method is explained to them, they tend to forget those details as soon as they notice their jobs pending in the wait list. Users who have been told to expect a 50% entitlement to cluster resources get frustrated when they launch their jobs and don’t get to take over half of the cluster instantly. Explaining that the 50% entitlement is a goal the scheduler works to meet “as averaged over time” falls on deaf ears. Heavy users get upset to learn that their current entitlement is being “penalized” because their past usage greatly exceeded their allotted share. Cluster admins then spend far too much time attempting to “prove” to the user community that they are not being shortchanged.
  • 68. Stages of Cluster Use
    1. I just need to get this one set of data processed.
    2. This is a task that I will perform frequently.
    3. I am the bane of my local administrator. I have my own little cluster, plus a bunch of workstations in my lab. I wish I had administrative access to the big cluster.
    4. I have a pipeline of data which will always be subject to the same analysis, and I run all my jobs on some large (set of) central resource(s).
  • 69. Example SGE Usage
    (The original slide was a live demo; a stand-in sketch follows below.)
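As a stand-in for the demo, a brief sketch of everyday SGE commands: qsub, qstat, and the -cwd and -t flags are real SGE tools, while the job script name is our invention:

    $ qsub -cwd blast_job.sh       # submit a job script from the current directory
    $ qstat                        # watch it move from pending (qw) to running (r)
    $ qsub -t 1-100 blast_job.sh   # submit 100 copies at once as an array job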
  • 70. Parallel Programming
  • 71. Why Is the Program Slow?
    • Who cares?
    • Something about the way it was run
    • Something about the system on which it was run
    • Something about the program itself
  • 72. Solution Strategies: High Performance vs. High Throughput
  • 73. “Premature optimization is the root of all evil in computer programming.”
    -Donald Knuth
  • 74. Photoshop Example
    • Steve and Phil’s Photoshop demo: 8 minutes
    • On 8 Xserves: 1 minute?
    • No!
  • 75. High Throughput
    • 8 x Photoshop demos on 8 Xserves: 8 minutes?
    • Yes! (with some effort)
  • 76. High Performance
    • Not 8 x work in 1 x time
    • But 1 x work in 1/8 x time
    • Partition the problem?
    • Limited by:
      – The application
      – Data parallelism
  • 77. High Performance
    • Sharpen, blur, diffuse, rotate, etc.
    • Divide the task by step?
    • No!
      – Steps are order dependent
      – Can’t merge results
  • 78. Divide by Step
    (Diagram: a DRM on the LAN farms the steps of the Photoshop sequence (Sharpen, Blur, Diffuse, Rotate, etc.) out to nodes: divide steps, perform steps, merge results)
  • 79. High Performance
    • Divide by image?
    • 1/8th of the image on each of 8 Xserves: 1 minute?
    • Plausible, but…
      – new work
      – duplicated work
  • 80. Divide by Image
    (Diagram: the DRM sends 1/8 of the image to each node: divide image, perform steps, merge results)
  • 81. High Performance
    • Divide the task by layer?
    • 1 of 8 layers on each of 8 Xserves: 1 minute?
    • Probably yes, if each layer computes in the same time
  • 82. Divide by Layer
    (Diagram: the DRM sends one of the 8 layers to each node: render layers, merge results)
  • 83. High Performance
    • Same work in less time
    • Application specific
    • Data specific
    • Use specific
    • Unless?
  • 84. Tightly Coupled
    (Diagram: nodes joined by a high-speed / low-latency switching fabric plus a private Ethernet network, attached to the “public” LAN)
    • The application partitions itself
    • Passes messages
    • Shares memory
    • PVM, MPI
  • 85. Building Your Own HPC Environment
  • 86. Do It Yourself (DIY)
    • It is possible to build a high performance computing environment in your basement.
    • Home users can now run Linux without having to be computer experts.
    • Labs and corporations can run cluster computers without having to employ full-time computer support staff.
  • 87. DIY: Physical Considerations
    • Power
      – Resting draw vs. loaded draw
      – Two-phase vs. three-phase
    • Cooling (air cooling the “Big Mac” would have required 60 MPH winds)
    • Space
    • Physical security
    • Noise
  • 88. DIY: Administration
    • ~1 FTE to maintain and provide access to a large cluster
    • Also plan on some portion of a software developer to assist with research (distinct from system administration)
  • 89. Cluster Size Thresholds
    • 1-16 nodes
      – Scrape by with poor practices
      – Make a grad student do it
    • 32 nodes
      – Physical management issues loom large
      – Split out fileserver functions
    • 128 nodes
      – Network gets interesting
      – Power / cooling become major issues
    • 1,024 nodes
      – Call the newspaper
  • 90. Should I Build My Own Cluster?
    • Install takes time
      – Expert: ~1 day to configure
      – Novice: weeks to never
    • Systems support is ongoing
      – Well managed: ~1 FTE per 1,000 compute nodes
    • No need to share; custom configurations in real time
  • 91. Infx Cluster Checklist
    • User model
    • Applications
    • Compilers
    • Physical environment
    • Know your bottlenecks
    • Network & topology
    • Storage
    • Maintenance
    • Administration & monitoring
    • DRM
    • Common user environment and file system
  • 92. User Model & Use Cases
    • A single user, a few users, or many users?
    • Groups of users
    • Are some more equal than others?
    • Batch/bulk vs. singleton jobs
    • High-throughput or high-performance?
  • 93. Application Requirements
    • Many short-running processes, or a few long-running ones?
    • Average / max RAM requirement
    • CPU- and/or I/O-bound?
    • Single- or multi-threaded?
    • MPI/PVM/Linda parallel-aware?
    • Will it run under *nix?
  • 94. Compilers
    • GNU tools are great, but consider commercial compiler options if you:
      – are a performance freak
      – are writing SMP or parallel apps
      – maintain serious scientific programs in C(++)/Fortran
  • 95. Physical / Environmental Constraints
    • Available power
    • Available cooling
    • Density (blades / 1U / 2U / tower)
    • DIY staging space
    • Raised floor or ceiling drops
    • Height & width surprises
    • Fire code
    • Organizational standards
  • 96. This could be you…
  • 97. Know Your Bottlenecks
    • I/O bound: sequence analysis is limited by the speed of your disks and fileserver
    • CPU bound: chemical and protein structure modeling are generally CPU bound
    • RAM bound: some apps (e.g., Gaussian) are bound by the speed of memory access
    • Network bound: network I/O and IPC
  • 98. Network & Interconnects
    • Bandwidth for I/O and IPC
    • Parallel networks?
    • High-speed interconnect(s)? Enough PCI slots?
    • Network topology affects:
      – Scaling and growth
      – Wire management
      – Access to external networks
  • 99. High Speed Interconnects
    • For when low-latency message passing is critical (massively parallel applications)
    • Not generally needed in BioClusters (yet)
    • Can add 50% or more to the cost of each server
    • No magic; must be planned for: applications, APIs, code, compilers, PCI slots, cable management & rack space
    • Commercial products: Myrinet (www.myricom.com), Dolphin SCI (www.dolphinics.com)
  • 100. Storage
    • Most BioClusters are I/O bound
    • Not an area to pinch pennies
    • NAS vs. SAN vs. local disk
    • Pick the right RAID levels
    • Heterogeneous storage
    • Plan for data staging and caching
  • 101. SAN vs. NAS vs. Local Disk
    • SAN: generally inappropriate
    • NAS or a hybrid NAS/SAN is best: multiple clients get concurrent read/write access to the same file system or volume
    • Local disk
      – Diskless nodes are inappropriate
      – Expensive SCSI disks are unnecessary
      – Large, cheap IDE drives allow for clever data caching
      – The best way to avoid thrashing a fileserver
  • 102. RAID Levels
    • RAID 0 (stripe): fast / risky
    • RAID 1 (mirror): great, if you can afford double the disk
    • RAID 5: able to lose any individual disk and still keep the data
    • Striped banks of RAID 5: a scalable, stable solution
  • 103. Stage Data to Local Disk
    • A $250K NAS server can be brought to its knees by a few large BLAST searches
    • Active clusters will tax even the fastest arrays; this is mostly unavoidable
    • Plan for data staging
      – Move data from the fileserver to cheap local disk on the cluster compute nodes
      – No magic; users and software developers need to do this explicitly in their workflows and pipelines
  • 104. Maintenance Philosophy
    • Compute nodes must be anonymous, interchangeable, disposable
    • 3 possible states:
      – Running / online
      – Faulted / re-imaging
      – Failed / offline & marked for replacement
    • Administration must be scalable, automated, remote
    The three “R”s of a cluster: Reboot, Reimage, Replace.
  • 105. Monitoring and Reporting
    • Many high-quality free tools: Ganglia, Big Brother, RRDtool, MRTG, sar, ntop, etc.
    • Commercial tools too
    • Log files are the poor man’s trending tool (system, daemon, DRM)
  • 106. OS Installation and Updates
    • SystemImager (www.systemimager.org)
      – “SystemImager is software that automates UNIX installs, software distribution, and production deployment.”
      – Unattended disk partitioning and UNIX installation
      – Incremental updates of active systems
      – Based on open standards and tools: rsync, SSH, DHCP, TFTP
      – Totally free, open source
      – Merging with the IBM LUI project
    • Also: Apple NetBoot
  • 107. Common DRM Suites
    • Open source and/or freely licensed: OpenPBS, Sun GridEngine
    • Commercially available: PBS Pro, Platform LSF
  • 108. DRM: My $.02
    • At this time, Platform LSF is still technically the best choice for serious production BioClusters
      – Lowest administrative/operational burden
      – Fault tolerance features are unmatched
    • 2nd choice: GridEngine, if you can support it internally
    Go with your local expertise.
  • 109. A Hybrid Approach
    • Systems group:
      – Specify supported configurations
      – Maintain a central machine room
      – Configure the scheduler to give owners priority on their own nodes
    • Researchers:
      – Estimate their computational load
      – Include a line item for the required number of nodes
  • 110. What Can I Do Today?
    • CS folks: take biology coursework; accept that biology is really, really complex and difficult.
    • Biologists: take CS coursework; accept that computer engineering / software development is tricky.
    • Administrators: decide to build a “spire, which will be visible from afar.”
    • All: attend journal clubs, symposia, etc. Get a bigger monitor.
  • 111. The Future
    • All scientists will write computer programs
    • “Computational biology” will sound just as redundant as “computational physics”
    • Most labs will have a small cluster and some local expertise, plus collaborations with supercomputing centers
    • Grid / web services technology will enable cool things