Clemson University's high-performance computing storage centers on the Palmetto cluster, with 1972 nodes, 22928 cores, and 6 petabytes of tape-backed archive storage. Clemson is connecting research clusters across campus over multiple 10-gigabit Ethernet links and a 100-gigabit Science DMZ, and is expanding access to research data through WebDAV and the Internet2 Innovation Platform. Performance testing of OrangeFS on Dell R720 servers over a 100Gb Clemson–USC link showed file writes of 37 Gb/s and perfSONAR throughput of roughly 70 Gb/s after tuning.
1. Clemson HPC Storage
Dell Panel, SC13
Boyd Wilson, Software CTO, Clemson University
2. Outline
• Palmetto Cluster
• Wide Area Storage Across the Innovation Platform
• Collective Cluster (Real-Time Data Aggregation and Analytics Cluster)
• Performance Numbers
• Research DMZ/Network
3. Palmetto Storage
Primary research cluster at Clemson
• 1972 nodes
• 22928 cores
• 998400 CUDA cores
• 396 TF (only benchmarked newest GPU nodes)
• ~120+ TF additional, not benchmarked
• Condominium model
• Home storage: SAM-QFS backed by SL8500 (6PB)
• Scratch: OrangeFS
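As a quick sanity check on the headline numbers above, a few lines of arithmetic derive per-node and per-GPU averages (the raw figures come straight from the slides; the derived averages are my own arithmetic, not slide content):

```python
# Headline Palmetto figures from the slides.
nodes = 1972
cores = 22928
cuda_cores = 998400
k20_gpus = 400  # from the later slide: 400 Nvidia K20 across the IB nodes

# Derived averages (illustrative arithmetic only).
cores_per_node = cores / nodes        # average CPU cores per node
cuda_per_gpu = cuda_cores / k20_gpus  # CUDA cores per K20

print(round(cores_per_node, 1))  # → 11.6
print(cuda_per_gpu)              # → 2496.0, matching the K20's 2496 CUDA cores
```

The per-GPU result (2496) matches the published CUDA core count of the Tesla K20, which suggests the slide's 998400 total simply multiplies out the 400 GPUs.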
4. Palmetto Storage
MX Nodes: 1622 nodes, 96 TF, 10G MX; 96 IB nodes with FDR
FDR IB Nodes: 200 nodes, 400 Nvidia K20, 396 TF, FDR IB, 10G Eth
Scratch
• 32 R510
• 16 R720
• 512TB OrangeFS (v2.8.8)
Home/Archive
• SAM-QFS over NFS
• 120TB disk
• 6PB tape
SAM-QFS home and archive on SL8500, served over NFS
5. Palmetto Scratch Next Steps
MX Nodes: 1622 nodes, 96 TF, 10G IPoMX
FDR IB Nodes: 200 nodes, 400 Nvidia K20 GPU, 396 TF, FDR IPoIB
Scratch
• 32 Dell R720
• 520TB scratch OrangeFS
• WebDAV to OrangeFS
• Hadoop over OrangeFS with MyHadoop
Data access: multiple 10G Eth for WebDAV campus data access; multiple 10G Eth / 100G Science DMZ for Innovation Platform data access
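WebDAV access to scratch, as planned above, is ordinary HTTP with a few extra verbs, so campus clients need nothing beyond a stock HTTP library. A minimal sketch using Python's standard library — the host name and paths are made-up placeholders, not Clemson's actual endpoints:

```python
import urllib.request

# Hypothetical WebDAV endpoint fronting OrangeFS scratch (placeholder URL).
BASE = "https://webdav.example.edu/scratch"

def list_request(path):
    """Build a WebDAV PROPFIND request to list a collection (Depth: 1)."""
    req = urllib.request.Request(f"{BASE}/{path}", method="PROPFIND")
    req.add_header("Depth", "1")
    return req

def upload_request(path, data: bytes):
    """Build an HTTP PUT request to write a file into scratch."""
    return urllib.request.Request(f"{BASE}/{path}", data=data, method="PUT")

# Requests are only constructed here; sending one would need a live server,
# e.g. urllib.request.urlopen(list_request("datasets")).
req = list_request("datasets")
print(req.get_method(), req.full_url)
```

Because WebDAV rides on HTTP, the same 10G Ethernet paths and campus firewalls that already pass web traffic can carry scratch access without a cluster-specific client.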
6. Clemson – USC 100Gb Tests
12 Dell R720 OrangeFS servers with OrangeFS clients
• File write: 37 Gb/s
• Server hardware problems and network packet loss during tests
• perfSONAR: 49 Gb/s initial
• Later retest: ~70 Gb/s with tuning
• Additional file testing planned (initial testing systems had to move to production)
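To put the numbers above in context, a short script converts the measured rates into bytes per second and link utilization (the 37/49/70 Gb/s figures are from the slide; the conversions are mine):

```python
LINK_GBPS = 100  # Clemson–USC link capacity, gigabits per second

def to_gigabytes_per_s(gbps):
    """Convert gigabits/s to gigabytes/s (8 bits per byte)."""
    return gbps / 8

results = [("file write", 37), ("perfSONAR initial", 49), ("perfSONAR tuned", 70)]
for label, rate in results:
    util = rate / LINK_GBPS
    print(f"{label}: {to_gigabytes_per_s(rate):.3f} GB/s, {util:.0%} of the 100G link")
# → file write: 4.625 GB/s, 37% of the 100G link
# → perfSONAR initial: 6.125 GB/s, 49% of the 100G link
# → perfSONAR tuned: 8.750 GB/s, 70% of the 100G link
```

The ~70 Gb/s tuned perfSONAR result bounds what the file system could hope to reach on that path, which is why the 37 Gb/s file write alongside the reported hardware and packet-loss problems motivated the planned retest.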
8. The "Collective" Cluster
• 12 R720
• 170TB
• D3-based visualization toolkit called SocialTap
• Social media aggregation via GNIP
• Elasticsearch
• Hadoop MapReduce
• OrangeFS
• WebDAV to OrangeFS
Connectivity: link to Palmetto; multiple 10G Eth for WebDAV campus data access; social data input; Science DMZ / Innovation Platform data access
10. MapReduce over OrangeFS
• *25% improvement with OrangeFS running on separate nodes from MapReduce
• 8 Dell R720 servers connected with 10Gb/s Ethernet
• The remote case adds an additional 8 identical servers and does all OrangeFS work remotely; only local work is done on the compute nodes (traditional HPC model)
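The local-vs-remote distinction above is about where the file system runs, not about the programming model, which is identical in both cases. A toy word count in plain Python illustrates the two phases the benchmark exercises (this is an illustration of the MapReduce model, not Clemson's Hadoop/OrangeFS code):

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (word, 1) pairs. In the benchmark, input splits are read
    from OrangeFS, either on the compute nodes or over 10GbE in the remote case."""
    for line in records:
        for word in line.split():
            yield word, 1

def reduce_phase(pairs):
    """Reduce: sum counts per key; results are written back to the file system."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

result = reduce_phase(map_phase(["orange fs", "orange hadoop"]))
print(result)  # → {'orange': 2, 'fs': 1, 'hadoop': 1}
```

Moving OrangeFS to separate servers changes only where the reads and writes in the comments land, which is why the slide can report a 25% improvement without any change to the MapReduce jobs themselves.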
11. MapReduce over OrangeFS
• 16 Dell R720 servers connected with 10Gb/s Ethernet
• Remote clients are Dell R720s with single SAS disks for local data (vs. 12-disk arrays in the previous test)
12. Clemson Research Network
[Network diagram: Internet/I2/NLR and CLight connect through a perimeter firewall to the campus DMZ and, via an ACL firewall and route filter, to the I2 Innovation Platform. perfSONAR nodes sit at the perimeter, at a collaborator site, and inside the Science DMZ and Innovation Platform. A Brocade MLx32 core router (CC-NIE funded) links PalmettoNet over a 100Gig tagged trunk and peer link; Dell Z9000 and Dell S4810 top-of-rack switches connect the storage fabric, with SAM-QFS on Fibre Channel; host firewalls protect end systems.]