Simulation
CAD
CAE
Engineering
Price Modelling
Life Sciences
Automotive
Aerospace
Risk Analysis
Big Data Analytics
High Throughput Computing
High Performance Computing 2013/14
Technology Compass
Technology Compass
Table of contents and Introduction
High Performance Computing............................................4
Performance Turns Into Productivity.......................................................6
Flexible Deployment With xCAT...................................................................8
Service and Customer Care From A to Z ..............................................10
Advanced Cluster Management Made Easy............ 12
Easy-to-use, Complete and Scalable.......................................................14
Cloud Bursting With Bright..........................................................................18
Intelligent HPC Workload Management.................... 30
Moab HPC Suite – Basic Edition.................................................................32
Moab HPC Suite – Enterprise Edition.....................................................34
Moab HPC Suite – Grid Option....................................................................36
Optimizing Accelerators with Moab HPC Suite...............................38
Moab HPC Suite – Application Portal Edition...................................42
Moab HPC Suite – Remote Visualization Edition............................44
Using Moab With Grid Environments....................................................46
Nice EnginFrame...................................................................... 52
A Technical Portal for Remote Visualization.....................................54
Application Highlights....................................................................................56
Desktop Cloud Virtualization.....................................................................58
Remote Visualization.......................................................................................60
Intel Cluster Ready ................................................................. 64
A Quality Standard for HPC Clusters......................................................66
Intel Cluster Ready builds HPC Momentum......................................70
The transtec Benchmarking Center........................................................74
Parallel NFS.................................................................................. 76
The New Standard for HPC Storage........................................................78
What's New in NFS 4.1?...............................................................................80
Panasas HPC Storage.......................................................................................82
NVIDIA GPU Computing........................................................ 94
What is GPU Computing?..............................................................................96
Kepler GK110 – The Next Generation GPU..........................................98
Introducing NVIDIA Parallel Nsight...................................................... 100
A Quick Refresher on CUDA....................................................................... 112
Intel TrueScale InfiniBand and GPUs................................................... 118
Intel Xeon Phi Coprocessor..............................................122
The Architecture.............................................................................................. 124
InfiniBand....................................................................................134
High-speed Interconnects......................................................................... 136
Top 10 Reasons to Use Intel TrueScale InfiniBand...................... 138
Intel MPI Library 4.0 Performance......................................................... 140
Intel Fabric Suite 7.......................................................................................... 144
Parstream....................................................................................150
Big Data Analytics........................................................................................... 152
Glossary........................................................................................160
More than 30 years of experience in
Scientific Computing
1980 marked the beginning of a decade where numerous start-
ups were created, some of which later transformed into big play-
ers in the IT market. Technical innovations brought dramatic
changes to the nascent computer market. In Tübingen, close to
one of Germany’s prime and oldest universities, transtec was
founded.
In the early days, transtec focused on reselling DEC computers
and peripherals, delivering high-performance workstations to
university institutes and research facilities. In 1987, SUN/Sparc
and storage solutions broadened the portfolio, enhanced by
IBM/RS6000 products in 1991. These were the typical worksta-
tions and server systems for high performance computing then,
used by the majority of researchers worldwide.
In the late 90s, transtec was one of the first companies to offer
highly customized HPC cluster solutions based on standard Intel
architecture servers, some of which entered the TOP 500 list of
the world’s fastest computing systems.
Thus, given this background and history, it is fair to say that
transtec looks back upon more than 30 years of experience in sci-
entific computing; our track record shows more than 500 HPC in-
stallations. With this experience, we know exactly what custom-
ers’ demands are and how to meet them. High performance and
ease of management – this is what customers require today. HPC
systems are for sure required to peak-perform, as their name in-
dicates, but that is not enough: they must also be easy to handle.
Unwieldy design and operational complexity must be avoided
or at least hidden from administrators and particularly users of
HPC computer systems.
This brochure focuses on where transtec HPC solutions excel.
transtec HPC solutions use the latest and most innovative technology:
Bright Cluster Manager as the technology leader for unified HPC
cluster management, leading-edge Moab HPC Suite for job and
workload management, Intel Cluster Ready certification as an
independent quality standard for our systems, and Panasas HPC
storage systems for the highest performance and the real ease of
management required of a reliable HPC storage system. Again,
with these components, usability, reliability and ease of manage-
ment are central issues that are addressed, even in a highly
heterogeneous environment. transtec is able to provide custom-
ers with well-designed, extremely powerful solutions for Tesla
GPU computing, as well as thoroughly engineered Intel Xeon
Phi systems. Intel’s InfiniBand Fabric Suite makes managing a
large InfiniBand fabric easier than ever before. transtec masterfully
combines excellent and well-chosen existing components into a
fine-tuned, customer-specific, and thoroughly designed HPC solution.
Your decision for a transtec HPC solution means you opt for
most intensive customer care and best service in HPC. Our ex-
perts will be glad to bring in their expertise and support to assist
you at any stage, from HPC design to daily cluster operations, to
HPC Cloud Services.
Last but not least, transtec HPC Cloud Services provide custom-
ers with the possibility to have their jobs run on dynamically pro-
vided nodes in a dedicated datacenter, professionally managed
and individually customizable. Numerous standard applications
like ANSYS, LS-Dyna, OpenFOAM, as well as lots of codes like Gro-
macs, NAMD, VMD, and others are pre-installed, integrated into
an enterprise-ready cloud management environment, and ready
to run.
Have fun reading the transtec HPC Compass 2013/14!
High Performance Computing – Performance Turns Into Productivity
High Performance Comput-
ing (HPC) has been with us
from the very beginning
of the computer era. High-
performance computers
were built to solve numer-
ous problems which the
“human computers” could
not handle. The term HPC
just hadn’t been coined yet.
More important, some of
the early principles have
changed fundamentally.
Performance Turns Into Productivity
HPC systems in the early days were much different from those
we see today. First, we saw enormous mainframes from large
computer manufacturers, including a proprietary operating
system and job management system. Second, at universities
and research institutes, workstations made inroads and sci-
entists carried out calculations on their dedicated Unix or
VMS workstations. In either case, if you needed more comput-
ing power, you scaled up, i.e. you bought a bigger machine.
Today the term High-Performance Computing has gained a
fundamentally new meaning. HPC is now perceived as a way
to tackle complex mathematical, scientific or engineering
problems. The integration of industry standard, “off-the-shelf”
server hardware into HPC clusters facilitates the construction
of computer networks of such power as no single system
could ever achieve. The new paradigm for parallelization is
scaling out.
Computer-supported simulation of realistic processes (so-
called Computer Aided Engineering – CAE) has established
itself as a third key pillar in the field of science and research
alongside theory and experimentation. It is nowadays incon-
ceivable that an aircraft manufacturer or a Formula One rac-
ing team would operate without using simulation software.
And scientific calculations, such as in the fields of astrophys-
ics, medicine, pharmaceuticals and bio-informatics, will to a
large extent be dependent on supercomputers in the future.
Software manufacturers long ago recognized the benefit of
high-performance computers based on powerful standard
servers and ported their programs to them accordingly.
The main advantage of scale-out supercomputers is just that:
they are infinitely scalable, at least in principle. Since they are
based on standard hardware components, such a supercom-
puter can be extended with more computing power whenever
the computational capacity of the system is no longer sufficient,
simply by adding further nodes of the same kind. A cum-
bersome switch to a different technology can be avoided in
most cases.
“transtec HPC solutions are meant to provide cus-
tomers with unparalleled ease-of-management and
ease-of-use. Apart from that, deciding for a transtec
HPC solution means deciding for the most intensive
customer care and the best service imaginable.”
Dr. Oliver Tennert, Director Technology Management &
HPC Solutions
Variations on the theme: MPP and SMP
Parallel computations exist in two major variants today. Ap-
plications running in parallel on multiple compute nodes are
frequently so-called Massively Parallel Processing (MPP) appli-
cations. MPP indicates that the individual processes can each
utilize exclusive memory areas. This means that such jobs are
predestined to be computed in parallel, distributed across the
nodes in a cluster. The individual processes can thus utilize the
separate units of the respective node – especially the RAM, the
CPU power and the disk I/O.
Communication between the individual processes is imple-
mented in a standardized way through the MPI software inter-
face (Message Passing Interface), which abstracts the underlying
network connections between the nodes from the processes.
However, the MPI standard (current version 2.0) merely requires
source code compatibility, not binary compatibility, so an off-
the-shelf application usually needs specific versions of MPI
libraries in order to run. Examples of MPI implementations are
OpenMPI, MPICH2, MVAPICH2, Intel MPI or – for Windows clusters
– MS-MPI.
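To make the MPP model concrete, here is a minimal sketch using the mpi4py Python bindings on top of one of the MPI libraries named above (the choice of Python and the process counts are purely illustrative, not a prescription):

# Minimal MPP sketch with mpi4py: rank 0 distributes work items,
# all other ranks receive theirs via explicit message passing.
# Run with e.g.: mpirun -np 4 python mpp_example.py
from mpi4py import MPI

comm = MPI.COMM_WORLD        # communicator spanning all processes
rank = comm.Get_rank()       # ID of this process
size = comm.Get_size()       # total number of processes

if rank == 0:
    for dest in range(1, size):
        comm.send({"task_id": dest, "payload": "work item"}, dest=dest, tag=11)
else:
    data = comm.recv(source=0, tag=11)
    print("rank", rank, "of", size, "received task", data["task_id"])

Each rank owns its private memory area; all coordination happens through the explicit send and receive calls, which is exactly what allows such a job to be distributed across the nodes of a cluster.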
If the individual processes engage in a large amount of commu-
nication, the response time of the network (latency) becomes im-
portant. Latency in a Gigabit Ethernet or a 10GE network is typi-
cally around 10 µs. High-speed interconnects such as InfiniBand
reduce latency by a factor of 10, down to as low as 1 µs. Therefore,
high-speed interconnects can greatly speed up total processing.
The other frequently used variant is called SMP applications.
SMP, in this HPC context, stands for Shared Memory Processing.
It involves the use of shared memory areas, the specific imple-
mentation of which is dependent on the choice of the underly-
ing operating system. Consequently, SMP jobs generally only run
on a single node, where they can in turn be multi-threaded and
thus be parallelized across the number of CPUs per node. For
many HPC applications, both the MPP and SMP variant can be
chosen.
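By contrast, a minimal SMP-style sketch (again in Python, purely for illustration) runs several workers on one node that all update a single shared memory area:

# SMP sketch: multiple worker processes on a single node operate
# on one shared array held in that node's RAM.
from multiprocessing import Process, Array

def square_slice(shared, start, end):
    # each worker updates its own slice of the shared array in place
    for i in range(start, end):
        shared[i] = shared[i] * shared[i]

if __name__ == "__main__":
    data = Array("d", range(16))     # shared memory, visible to all workers
    n_workers = 4
    chunk = len(data) // n_workers
    workers = [Process(target=square_slice, args=(data, w * chunk, (w + 1) * chunk))
               for w in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(list(data))                # all updates landed in the same shared area

Because all workers address the same memory, this kind of job cannot be spread across several cluster nodes; it scales only with the number of CPUs and cores within one node.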
Many applications are not inherently suitable for parallel execu-
tion. In such a case, there is no communication between the in-
dividual compute nodes, and therefore no need for a high-speed
network between them; nevertheless, multiple independent computing
jobs, each running sequentially, can be run simultaneously on each
individual node, depending on the number of CPUs.
In order to ensure optimum computing performance for these
applications, it must be examined how many CPUs and cores de-
liver the optimum performance.
We find applications of this sequential type of work typically in
the fields of data analysis or Monte-Carlo simulations.
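A minimal sketch of this embarrassingly parallel pattern (illustrative only) is a Monte-Carlo estimate of pi, where every sample batch is computed independently and no communication between tasks is needed:

# Independent Monte-Carlo batches: no communication, so the tasks
# could just as well be scheduled on separate nodes of a cluster.
import random
from multiprocessing import Pool

def count_hits(n_samples):
    # count random points that fall inside the unit quarter circle
    hits = 0
    for _ in range(n_samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

if __name__ == "__main__":
    batches = [250_000] * 8          # eight independent batches, e.g. one per core
    with Pool(processes=8) as pool:
        results = pool.map(count_hits, batches)
    print("pi estimate:", 4.0 * sum(results) / sum(batches))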
xCAT as a Powerful and Flexible Deployment Tool
xCAT (Extreme Cluster Administration Tool) is an open source
toolkit for the deployment and low-level administration of
HPC cluster environments, small as well as large ones.
xCAT provides simple commands for hardware control, node
discovery, the collection of MAC addresses, and node deploy-
ment with local disks (diskful) or without (diskless).
The cluster configuration is stored in a relational database.
Node groups for different operating system images can be
defined. Also, user-specific scripts can be executed automati-
cally at installation time.
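Because these are ordinary command-line tools, routine tasks can easily be scripted. The following minimal sketch drives two standard xCAT commands, rpower and psh, from Python; the node group name "compute" is an assumed, site-specific example:

# Query power state, power nodes on, then run a command in parallel
# on the xCAT node group "compute" (an illustrative group name).
import subprocess

def run(cmd):
    # run an xCAT command and return its standard output
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

print(run(["rpower", "compute", "stat"]))   # current power state of all nodes
run(["rpower", "compute", "on"])            # power the node group on
print(run(["psh", "compute", "uptime"]))    # parallel remote shell across the group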
xCAT Provides the Following Low-Level Administrative Features
ʎʎ Remote console support
ʎʎ Parallel remote shell and remote copy commands
ʎʎ Plugins for various monitoring tools like Ganglia or Nagios
ʎʎ Hardware control commands for node discovery, collect-
ing MAC addresses, remote power switching and resetting
of nodes
ʎʎ Automatic configuration of syslog, remote shell, DNS,
DHCP, and ntp within the cluster
ʎʎ Extensive documentation and man pages
For cluster monitoring, we install and configure the open
source tool Ganglia or the even more powerful open source
solution Nagios, according to the customer’s preferences and
requirements.
Local Installation or Diskless Installation
We offer a diskful or a diskless installation of the cluster
nodes. A diskless installation means the operating system is
hosted partially within main memory; larger parts may be
provided via NFS or other means. This approach allows for
deploying large numbers of nodes very efficiently, and the
cluster is up and running within a very short timescale.
Also, updating the cluster can be done in a very efficient
way. For this, only the boot image has to be updated, and the
nodes have to be rebooted. After this, the nodes run either a
new kernel or even a new operating system. Moreover, with
this approach, partitioning the cluster can also be very effi-
ciently done, either for testing purposes, or for allocating dif-
ferent cluster partitions for different users or applications.
Development Tools, Middleware, and Applications
According to the application, optimization strategy, or under-
lying architecture, different compilers lead to code results of
very different performance. Moreover, different (mainly com-
mercial) applications require different MPI implementations.
And even when the code is self-developed, developers often
prefer one MPI implementation over another.
According to the customer’s wishes, we install various compil-
ers, MPI middleware, as well as job management systems like
Parastation, Grid Engine, Torque/Maui, or the very powerful
Moab HPC Suite for the high-level cluster management.
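As a simple illustration of how such a job management system is used, the sketch below submits a batch job to a Torque/Maui-managed cluster from Python; qsub accepts a job script on standard input. The resource values and the program name ("my_mpi_app") are placeholders, not part of any specific customer setup:

# Submit a two-node MPI job to a Torque/Maui-managed cluster.
import subprocess

job_script = """#!/bin/bash
#PBS -N example_job
#PBS -l nodes=2:ppn=8
#PBS -l walltime=01:00:00
#PBS -j oe
cd $PBS_O_WORKDIR
mpirun ./my_mpi_app
"""

result = subprocess.run(["qsub"], input=job_script, text=True,
                        capture_output=True, check=True)
print("submitted job:", result.stdout.strip())   # qsub prints the new job ID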
transtec HPC as a Service
You will get a range of applications like LS-Dyna, ANSYS,
Gromacs, NAMD etc. from all kinds of areas pre-installed,
integrated into an enterprise-ready cloud and workload
management system, and ready-to run. Do you miss your
application?
Ask us: HPC@transtec.de
transtec Platform as a Service
You will be provided with dynamically provisioned compute
nodes for running your individual code. The operating
system will be pre-installed according to your require-
ments. Common Linux distributions like RedHat, CentOS,
or SLES are the standard. Do you need another distribu-
tion?
Ask us: HPC@transtec.de
transtec Hosting as a Service
You will be provided with hosting space inside a profes-
sionally managed and secured datacenter where you
can have your machines hosted, managed, maintained,
according to your requirements. Thus, you can build up
your own private cloud. What range of hosting and main-
tenance services do you need?
Tell us: HPC@transtec.de
HPC @ transtec: Services and Customer Care from
A to Z
transtec AG has over 30 years of experience in scientific computing
and is one of the earliest manufacturers of HPC clusters. For nearly
a decade, transtec has delivered highly customized High Perfor-
mance clusters based on standard components to academic and in-
dustry customers across Europe, with all the high quality standards
and the customer-centric approach that transtec is well known for.
Every transtec HPC solution is more than just a rack full of hard-
ware – it is a comprehensive solution with everything the HPC user,
owner, and operator need. In the early stages of any customer's HPC
project, transtec experts provide extensive and detailed consult-
ing to the customer – they benefit from expertise and experience.
Consulting is followed by benchmarking of different systems with
either specifically crafted customer code or generally accepted
benchmarking routines; this aids customers in sizing and devising
the optimal and detailed HPC configuration.

[Services cycle figure: individual presales consulting; application-, customer-, and site-specific sizing of the HPC solution; benchmarking of different systems; burn-in tests of systems; onsite hardware assembly; software & OS installation; application installation; integration into the customer's environment; customer training; maintenance, support & managed services; continual improvement]
Each and every piece of HPC hardware that leaves our factory un-
dergoes a burn-in procedure of 24 hours or more if necessary. We
make sure that any hardware shipped meets our and our custom-
ers’ quality requirements. transtec HPC solutions are turnkey solu-
tions. By default, a transtec HPC cluster has everything installed and
configured – from hardware and operating system to important
middleware components like cluster management or developer
tools and the customer’s production applications. Onsite delivery
means onsite integration into the customer’s production environ-
ment, be it establishing network connectivity to the corporate net-
work, or setting up software and configuration parts.
transtec HPC clusters are ready-to-run systems – we deliver, you
turn the key, the system delivers high performance. Every HPC proj-
ect entails transfer to production: IT operation processes and poli-
cies apply to the new HPC system. Effectively, IT personnel are trained
hands-on, introduced to hardware components and software, with
all operational aspects of configuration management.
transtec services do not stop when the implementation project
ends. Beyond transfer to production, transtec takes care. transtec
offers a variety of support and service options, tailored to the cus-
tomer’s needs. When you are in need of a new installation, a major
reconfiguration or an update of your solution – transtec is able to
support your staff and, if you lack the resources for maintaining
the cluster yourself, maintain the HPC solution for you. From Pro-
fessional Services to Managed Services for daily operations and
required service levels, transtec will be your complete HPC service
and solution provider. transtec’s high standards of performance, re-
liability and dependability assure your productivity and complete
satisfaction.
transtec’s HPC Managed Services offer customers the possibility
of having the complete management and administration of the
HPC cluster handled by transtec service specialists, in an ITIL-
compliant way. Moreover, transtec’s HPC on Demand services give
customers access to HPC resources whenever they need them,
for example, because they do not have the possibility of owning
and running an HPC cluster themselves, due to lacking infrastruc-
ture, know-how, or admin staff.
transtec HPC Cloud Services
Last but not least, transtec's services portfolio evolves as customers'
demands change. Starting this year, transtec is able to provide HPC
Cloud Services. transtec uses a dedicated datacenter to provide
computing power to customers who are in need of more capacity
than they own, which is why this workflow model is sometimes
called computing-on-demand. With these dynamically provided
resources, customers have the possibility to run their jobs on
HPC nodes in a dedicated datacenter, professionally managed and
secured, and individually customizable. Numerous standard ap-
plications like ANSYS, LS-Dyna, OpenFOAM, as well as lots of codes
like Gromacs, NAMD, VMD, and others are pre-installed, integrated
into an enterprise-ready cloud and workload management environ-
ment, and ready to run.
Alternatively, whenever customers are in need of space for hosting
their own HPC equipment because they do not have the space ca-
pacity or cooling and power infrastructure themselves, transtec is
also able to provide Hosting Services to those customers who’d like
to have their equipment professionally hosted, maintained, and
managed. Customers can thus build up their own private cloud!
Are you interested in any of transtec’s broad range of HPC related
services? Write us an email to HPC@transtec.de. We’ll be happy to
hear from you!
Advanced Cluster Management Made Easy
Bright Cluster Manager re-
moves the complexity from
the installation, manage-
ment and use of HPC clus-
ters — on premise or in the
cloud.
With Bright Cluster Man-
ager, you can easily install,
manage and use multiple
clusters simultaneously, in-
cluding compute, Hadoop,
storage, database and work-
station clusters.
The Bright Advantage
Bright Cluster Manager delivers improved productivity, in-
creased uptime, proven scalability and security, while reduc-
ing operating cost:
Rapid Productivity Gains
ʎʎ Short learning curve: intuitive GUI drives it all
ʎʎ Quick installation: one hour from bare metal to compute-
ready
ʎʎ Fast, flexible provisioning: incremental, live, disk-full, disk-less,
over InfiniBand, to virtual machines, auto node discovery
ʎʎ Comprehensive monitoring: on-the-fly graphs, Rackview, mul-
tiple clusters, custom metrics
ʎʎ Powerful automation: thresholds, alerts, actions
ʎʎ Complete GPU support: NVIDIA, AMD, CUDA, OpenCL
ʎʎ Full support for Intel Xeon Phi
ʎʎ On-demand SMP: instant ScaleMP virtual SMP deployment
ʎʎ Fast customization and task automation: powerful cluster
management shell, SOAP and JSON APIs make it easy
ʎʎ Seamless integration with leading workload managers: Slurm,
Open Grid Scheduler, Torque, openlava, Maui, PBS Profes-
sional, Univa Grid Engine, Moab, LSF
ʎʎ Integrated (parallel) application development environment
ʎʎ Easy maintenance: automatically update your cluster from
Linux and Bright Computing repositories
ʎʎ Easily extendable, web-based User Portal
ʎʎ Cloud-readiness at no extra cost, supporting scenarios “Clus-
ter-on-Demand” and “Cluster-Extension”, with data-aware
scheduling
ʎʎ Deploys, provisions, monitors and manages Hadoop clusters
ʎʎ Future-proof: transparent customization minimizes disrup-
tion from staffing changes
Maximum Uptime
ʎʎ Automatic head node failover: prevents system downtime
ʎʎ Powerful cluster automation: drives pre-emptive actions
based on monitoring thresholds
ʎʎ Comprehensive cluster monitoring and health checking: au-
tomatic sidelining of unhealthy nodes to prevent job failure

The cluster installer takes you through the installation process and
offers advanced options such as “Express” and “Remote”.
Scalability from Deskside to TOP500
ʎʎ Off-loadable provisioning: enables maximum scalability
ʎʎ Proven: used on some of the world’s largest clusters
Minimum Overhead / Maximum Performance
ʎʎ Lightweight: single daemon drives all functionality
ʎʎ Optimized: daemon has minimal impact on operating system
and applications
ʎʎ Efficient: single database for all metric and configuration
data
Top Security
ʎʎ Key-signed repositories: controlled, automated security and
other updates
ʎʎ Encryption option: for external and internal communications
ʎʎ Safe: X509v3 certificate-based public-key authentication
ʎʎ Sensible access: role-based access control, complete audit
trail
ʎʎ Protected: firewalls and LDAP
A Unified Approach
Bright Cluster Manager was written from the ground up as a to-
tally integrated and unified cluster management solution. This
fundamental approach provides comprehensive cluster manage-
ment that is easy to use and functionality-rich, yet has minimal
impact on system performance. It has a single, light-weight
daemon, a central database for all monitoring and configura-
tion data, and a single CLI and GUI for all cluster management
functionality. Bright Cluster Manager is extremely easy to use,
scalable, secure and reliable. You can monitor and manage all
aspects of your clusters with virtually no learning curve.
Bright’s approach is in sharp contrast with other cluster man-
agement offerings, all of which take a “toolkit” approach. These
toolkits combine a Linux distribution with many third-party
tools for provisioning, monitoring, alerting, etc. This approach
has critical limitations: these separate tools were not designed
to work together; were often not designed for HPC, nor designed
to scale. Furthermore, each of the tools has its own interface
(mostly command-line based), and each has its own daemon(s)
and database(s). Countless hours of scripting and testing by
highly skilled people are required to get the tools to work for a
specific cluster, and much of it goes undocumented.
Time is wasted, and the cluster is at risk if staff changes occur,
losing the “in-head” knowledge of the custom scripts.
By selecting a cluster node in the tree on the left and the Tasks tab on the right,
you can execute a number of powerful tasks on that node with just a single
mouse click.
Ease of Installation
Bright Cluster Manager is easy to install. Installation and testing
of a fully functional cluster from “bare metal” can be complet-
ed in less than an hour. Configuration choices made during the
installation can be modified afterwards. Multiple installation
modes are available, including unattended and remote modes.
Cluster nodes can be automatically identified based on switch
ports rather than MAC addresses, improving speed and reliabil-
ity of installation, as well as subsequent maintenance. All major
hardware brands are supported: Dell, Cray, Cisco, DDN, IBM, HP,
Supermicro, Acer, Asus and more.
Ease of Use
Bright Cluster Manager is easy to use, with two interface op-
tions: the intuitive Cluster Management Graphical User Interface
(CMGUI) and the powerful Cluster Management Shell (CMSH).
The CMGUI is a standalone desktop application that provides
a single system view for managing all hardware and software
aspects of the cluster through a single point of control. Admin-
istrative functions are streamlined as all tasks are performed
through one intuitive, visual interface. Multiple clusters can be
managed simultaneously. The CMGUI runs on Linux, Windows
and OS X, and can be extended using plugins. The CMSH provides
practically the same functionality as the CMGUI, but via a com-
mand-line interface. The CMSH can be used both interactively
and in batch mode via scripts.
Either way, you now have unprecedented flexibility and control
over your clusters.
Support for Linux and Windows
Bright Cluster Manager is based on Linux and is available with
a choice of pre-integrated, pre-configured and optimized Linux
distributions, including SUSE Linux Enterprise Server, Red Hat
Enterprise Linux, CentOS and Scientific Linux. Dual-boot installa-
tions with Windows HPC Server are supported as well, allowing
nodes to either boot from the Bright-managed Linux head node,
or the Windows-managed head node.
The Overview tab provides instant, high-level insight into the status of
the cluster.
Extensive Development Environment
Bright Cluster Manager provides an extensive HPC development
environment for both serial and parallel applications, including
the following (some are cost options):
ʎʎ Compilers, including full suites from GNU, Intel, AMD and Port-
land Group.
ʎʎ Debuggers and profilers, including the GNU debugger and
profiler, TAU, TotalView, Allinea DDT and Allinea MAP.
ʎʎ GPU libraries, including CUDA and OpenCL.
ʎʎ MPI libraries, including OpenMPI, MPICH, MPICH2, MPICH-MX,
MPICH2-MX, MVAPICH and MVAPICH2; all cross-compiled with
the compilers installed on Bright Cluster Manager, and opti-
mized for high-speed interconnects such as InfiniBand and
10GE.
ʎʎ Mathematical libraries, including ACML, FFTW, Goto-BLAS,
MKL and ScaLAPACK.
ʎʎ Other libraries, including Global Arrays, HDF5, IPP, TBB, Net-
CDF and PETSc.
Bright Cluster Manager also provides Environment Modules to
make it easy to maintain multiple versions of compilers, libraries
and applications for different users on the cluster, without creating
compatibility conflicts. Each Environment Module file contains the
information needed to configure the shell for an application, and
automatically sets these variables correctly for the particular appli-
cation when it is loaded. Bright Cluster Manager includes many pre-
configured module files for many scenarios, such as combinations
of compilers, mathematical and MPI libraries.
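A common pattern, shown here as an illustrative sketch, is to load a module from a script via the modulecmd helper that ships with the classic Environment Modules package; the module name "openmpi/1.6" is an assumed example, and details vary between Modules versions:

# Load an Environment Module from Python: modulecmd emits Python
# code that adjusts os.environ, which is then executed locally.
import os
import subprocess

def module(*args):
    out = subprocess.run(["modulecmd", "python"] + list(args),
                         capture_output=True, text=True, check=True).stdout
    exec(out)                      # applies the environment changes

module("load", "openmpi/1.6")      # hypothetical module name
print(os.environ["PATH"].split(":")[0])   # typically the module's bin dir now leads PATH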
Powerful Image Management and Provisioning
Bright Cluster Manager features sophisticated software image
management and provisioning capability. A virtually unlimited
number of images can be created and assigned to as many dif-
ferent categories of nodes as required. Default or custom Linux
kernels can be assigned to individual images. Incremental
changes to images can be deployed to live nodes without re-
booting or re-installation.
The provisioning system only propagates changes to the images,
minimizing time and impact on system performance and avail-
ability. Provisioning capability can be assigned to any number of
nodes on-the-fly, for maximum flexibility and scalability. Bright
Cluster Manager can also provision over InfiniBand and to ram-
disk or virtual machine.
Comprehensive Monitoring
With Bright Cluster Manager, you can collect, monitor, visual-
ize and analyze a comprehensive set of metrics. Many software
and hardware metrics available to the Linux kernel, and many
hardware management interface metrics (IPMI, DRAC, iLO, etc.)
are sampled.
Cluster metrics, such as GPU, Xeon Phi and CPU temperatures, fan speeds and network statistics can be visualized by simply dragging and dropping them into a graphing
window. Multiple metrics can be combined in one graph and graphs can be zoomed into. A Graphing wizard allows creation of all graphs for a selected combination of
metrics and nodes. Graph layout and color configurations can be tailored to your requirements and stored for re-use.
High performance meets efficiency
Initially, massively parallel systems constitute a challenge to
both administrators and users. They are complex beasts. Anyone
building HPC clusters will need to tame the beast, master the
complexity and present users and administrators with an easy-
to-use, easy-to-manage system landscape.
Leading HPC solution providers such as transtec achieve this
goal. They hide the complexity of HPC under the hood and match
high performance with efficiency and ease-of-use for both users
and administrators. The “P” in “HPC” gains a double meaning:
“Performance” plus “Productivity”.
Cluster and workload management software like Moab HPC
Suite, Bright Cluster Manager or QLogic IFS provide the means
to master and hide the inherent complexity of HPC systems.
For administrators and users, HPC clusters are presented as
single, large machines, with many different tuning parameters.
The software also provides a unified view of existing clusters
whenever unified management is added as a requirement by the
customer at any point in time after the first installation. Thus,
daily routine tasks such as job management, user management,
queue partitioning and management, can be performed easily
with either graphical or web-based tools, without any advanced
scripting skills or technical expertise required from the adminis-
trator or user.
Cloud Bursting With Bright
Bright Cluster Manager unleashes and manages
the unlimited power of the cloud
Create a complete cluster in Amazon EC2, or easily extend your
onsite cluster into the cloud, enabling you to dynamically add
capacity and manage these nodes as part of your onsite cluster.
Both can be achieved in just a few mouse clicks, without the
need for expert knowledge of Linux or cloud computing.
Bright Cluster Manager’s unique data aware scheduling capabil-
ity means that your data is automatically in place in EC2 at the
start of the job, and that the output data is returned as soon as
the job is completed.
With Bright Cluster Manager, every cluster is cloud-ready, at no
extra cost. The same powerful cluster provisioning, monitoring,
scheduling and management capabilities that Bright Cluster
Manager provides to onsite clusters extend into the cloud.
The Bright advantage for cloud bursting
ʎʎ Ease of use: Intuitive GUI virtually eliminates user learning
curve; no need to understand Linux or EC2 to manage system.
Alternatively, cluster management shell provides powerful
scripting capabilities to automate tasks
ʎʎ Complete management solution: Installation/initialization,
provisioning, monitoring, scheduling and management in
one integrated environment
ʎʎ Integrated workload management: Wide selection of work-
load managers included and automatically configured with
local, cloud and mixed queues
ʎʎ Single system view; complete visibility and control: Cloud com-
pute nodes managed as elements of the on-site cluster; visible
from a single console with drill-downs and historic data.
ʎʎ Efficient data management via data aware scheduling: Auto-
matically ensures data is in position at start of computation;
delivers results back when complete.
ʎʎ Secure, automatic gateway: Mirrored LDAP and DNS services
over automatically-created VPN connects local and cloud-
based nodes for secure communication.
ʎʎ Cost savings: More efficient use of cloud resources; support
for spot instances, minimal user intervention.
Two cloud bursting scenarios
Bright Cluster Manager supports two cloud bursting scenarios:
“Cluster-on-Demand” and “Cluster Extension”.
Scenario 1: Cluster-on-Demand
The Cluster-on-Demand scenario is ideal if you do not have a
cluster onsite, or need to set up a totally separate cluster. With
just a few mouse clicks you can instantly create a complete clus-
ter in the public cloud, for any duration of time.
Scenario 2: Cluster Extension
The Cluster Extension scenario is ideal if you have a cluster onsite
but you need more compute power, including GPUs. With Bright
Cluster Manager, you can instantly add EC2-based resources to
your onsite cluster, for any duration of time. Bursting into the
cloud is as easy as adding nodes to an onsite cluster — there are
only a few additional steps after providing the public cloud ac-
count information to Bright Cluster Manager.
The Bright approach to managing and monitoring a cluster in
the cloud provides complete uniformity, as cloud nodes are man-
aged and monitored the same way as local nodes:
ʎʎ Load-balanced provisioning
ʎʎ Software image management
ʎʎ Integrated workload management
ʎʎ Interfaces — GUI, shell, user portal
ʎʎ Monitoring and health checking
ʎʎ Compilers, debuggers, MPI libraries, mathematical libraries
and environment modules
Bright Cluster Manager also provides additional features that
are unique to the cloud.
The Add Cloud Provider Wizard and
the Node Creation Wizard make the
cloud bursting process easy, even for
users with no cloud experience.
Amazon spot instance support
Bright Cluster Manager enables users to take advantage of the
cost savings offered by Amazon’s spot instances. Users can spec-
ify the use of spot instances, and Bright will automatically sched-
ule as available, reducing the cost to compute without the need
to monitor spot prices and manually schedule.
Hardware Virtual Machine (HVM) virtualization
Bright Cluster Manager automatically initializes all Amazon
instance types, including Cluster Compute and Cluster GPU in-
stances that rely on HVM virtualization.
Data aware scheduling
Data aware scheduling ensures that input data is transferred to
the cloud and made accessible just prior to the job starting, and
that the output data is transferred back. There is no need to wait
(and monitor) for the data to load prior to submitting jobs (de-
laying entry into the job queue), nor any risk of starting the job
before the data transfer is complete (crashing the job). Users sub-
mit their jobs, and Bright’s data aware scheduling does the rest.
Bright Cluster Manager provides the choice:
“To Cloud, or Not to Cloud”
Not all workloads are suitable for the cloud. The ideal situation
for most organizations is to have the ability to choose between
onsite clusters for jobs that require low latency communication,
complex I/O or sensitive data; and cloud clusters for many other
types of jobs.
Bright Cluster Manager delivers the best of both worlds: a pow-
erful management solution for local and cloud clusters, with
the ability to easily extend local clusters into the cloud without
compromising provisioning, monitoring or managing the cloud
resources.
One or more cloud nodes can be configured under the Cloud Settings tab.
Examples include CPU, GPU and Xeon Phi temperatures, fan
speeds, switches, hard disk SMART information, system load,
memory utilization, network metrics, storage metrics, power
systems statistics, and workload management metrics. Custom
metrics can also easily be defined.
Metric sampling is done very efficiently — in one process, or out-
of-band where possible. You have full flexibility over how and
when metrics are sampled, and historic data can be consolidat-
ed over time to save disk space.
Cluster Management Automation
Cluster management automation takes pre-emptive actions
when predetermined system thresholds are exceeded, saving
time and preventing hardware damage. Thresholds can be con-
figured on any of the available metrics. The built-in configura-
tion wizard guides you through the steps of defining a rule:
selecting metrics, defining thresholds and specifying actions.
For example, a temperature threshold for GPUs can be estab-
lished that results in the system automatically shutting down an
overheated GPU unit and sending a text message to your mobile
phone. Several predefined actions are available, but any built-in
cluster management command, Linux command or script can be
used as an action.

The parallel shell allows for simultaneous execution of commands or scripts
across node groups or across the entire cluster.

The status of cluster nodes, switches, other hardware, as well as up
to six metrics can be visualized in the Rackview. A zoomout option
is available for clusters with many racks.
Comprehensive GPU Management
Bright Cluster Manager radically reduces the time and effort
of managing GPUs, and fully integrates these devices into the
single view of the overall system. Bright includes powerful GPU
management and monitoring capability that leverages function-
ality in NVIDIA Tesla and AMD GPUs.
You can easily assume maximum control of the GPUs and gain
instant and time-based status insight. Depending on the GPU
make and model, Bright monitors a full range of GPU metrics,
including:
ʎʎ GPU temperature, fan speed, utilization
ʎʎ GPU exclusivity, compute, display, persistence mode
ʎʎ GPU memory utilization, ECC statistics
ʎʎ Unit fan speed, serial number, temperature, power usage, vol-
tages and currents, LED status, firmware.
ʎʎ Board serial, driver version, PCI info.
Beyond metrics, Bright Cluster Manager features built-in support
for GPU computing with CUDA and OpenCL libraries. Switching
between current and previous versions of CUDA and OpenCL has
also been made easy.
Full Support for Intel Xeon Phi
Bright Cluster Manager makes it easy to set up and use the Intel
Xeon Phi coprocessor. Bright includes everything that is needed
to get Phi to work, including a setup wizard in the CMGUI. Bright
ensures that your software environment is set up correctly, so
that the Intel Xeon Phi coprocessor is available for applications
that are able to take advantage of it.
Bright collects and displays a wide range of metrics for Phi, en-
suring that the coprocessor is visible and manageable as a de-
vice type, as well as including Phi as a resource in the workload
management system. Bright’s pre-job health checking ensures
that Phi is functioning properly before directing tasks to the
coprocessor.
Multi-Tasking Via Parallel Shell
The parallel shell allows simultaneous execution of multiple
commands and scripts across the cluster as a whole, or across
easily definable groups of nodes. Output from the executed
commands is displayed in a convenient way with variable levels
of verbosity. Running commands and scripts can be killed eas-
ily if necessary. The parallel shell is available through both the
CMGUI and the CMSH.
Integrated Workload Management
Bright Cluster Manager is integrated with a wide selection
of free and commercial workload managers. This integration
provides a number of benefits:
ʎʎ The selected workload manager gets automatically installed
and configured
The automation configuration wizard guides you through the steps of defining
a rule: selecting metrics, defining thresholds and specifying actions
ʎʎ Many workload manager metrics are monitored
ʎʎ The CMGUI and User Portal provide a user-friendly interface
to the workload manager
ʎʎ The CMSH and the SOAP & JSON APIs provide direct and po-
werful access to a number of workload manager commands
and metrics
ʎʎ Reliable workload manager failover is properly configured
ʎʎ The workload manager is continuously made aware of the
health state of nodes (see section on Health Checking)
ʎʎ The workload manager is used to save power through auto-
power on/off based on workload
ʎʎ The workload manager is used for data-aware scheduling
of jobs to the cloud
The following user-selectable workload managers are tightly in-
tegrated with Bright Cluster Manager:
ʎʎ PBS Professional, Univa Grid Engine, Moab, LSF
ʎʎ Slurm, openlava, Open Grid Scheduler, Torque, Maui
Alternatively, other workload managers, such as LoadLeveler
and Condor, can be installed on top of Bright Cluster Manager.
Integrated SMP Support
Bright Cluster Manager — Advanced Edition dynamically aggre-
gates multiple cluster nodes into a single virtual SMP node, us-
ing ScaleMP’s Versatile SMP™ (vSMP) architecture. Creating and
dismantling a virtual SMP node can be achieved with just a few
clicks within the CMGUI. Virtual SMP nodes can also be launched
and dismantled automatically using the scripting capabilities of
the CMSH.

Creating and dismantling a virtual SMP node can be achieved with just a few
clicks within the GUI or a single command in the cluster management shell.

Example graphs that visualize metrics on a GPU cluster.
In Bright Cluster Manager a virtual SMP node behaves like any
other node, enabling transparent, on-the-fly provisioning, con-
figuration, monitoring and management of virtual SMP nodes as
part of the overall system management.
Maximum Uptime with Head Node Failover
Bright Cluster Manager — Advanced Edition allows two
head nodes to be configured in active-active failover mode. Both
head nodes are on active duty, but if one fails, the other takes
over all tasks, seamlessly.
Maximum Uptime with Health Checking
Bright Cluster Manager — Advanced Edition includes a
powerful cluster health checking framework that maxi-
mizes system uptime. It continually checks multiple health
indicators for all hardware and software components and
proactively initiates corrective actions. It can also automati-
cally perform a series of standard and user-defined tests just
before starting a new job, to ensure successful execution and
prevent the “black hole node syndrome”. Examples
of corrective actions include autonomous bypass of faulty
nodes, automatic job requeuing to avoid queue flushing, and
process “jailing” to allocate, track, trace and flush completed
user processes. The health checking framework ensures the
highest job throughput, the best overall cluster efficiency
and the lowest administration overhead.
Top Cluster Security
Bright Cluster Manager offers an unprecedented level of
security that can easily be tailored to local requirements.
Security features include:
ʎʎ Automated security and other updates from key-signed Linux
and Bright Computing repositories.
ʎʎ Encrypted internal and external communications.
ʎʎ X509v3 certificate based public-key authentication to the
cluster management infrastructure.
ʎʎ Role-based access control and complete audit trail.
ʎʎ Firewalls, LDAP and SSH.
User and Group Management
Users can be added to the cluster through the CMGUI or the
CMSH. Bright Cluster Manager comes with a pre-configured
LDAP database, but an external LDAP service, or alternative au-
thentication system, can be used instead.
Web-Based User Portal
The web-based User Portal provides read-only access to essen-
tial cluster information, including a general overview of the clus-
ter status, node hardware and software properties, workload
manager statistics and user-customizable graphs. The User Por-
tal can easily be customized and expanded using PHP and the
SOAP or JSON APIs.
Workload management queues can be viewed and configured from the GUI,
without the need for workload management expertise.
Multi-Cluster Capability
Bright Cluster Manager — Advanced Edition is ideal for organiza-
tions that need to manage multiple clusters, either in one or in
multiple locations. Capabilities include:
ʎʎ All cluster management and monitoring functionality is
available for all clusters through one GUI.
ʎʎ Selecting any set of configurations in one cluster and ex-
porting them to any or all other clusters with a few mouse
clicks.
ʎʎ Metric visualizations and summaries across clusters.
ʎʎ Making node images available to other clusters.
Fundamentally API-Based
Bright Cluster Manager is fundamentally API-based, which
means that any cluster management command and any piece
of cluster management data — whether it is monitoring data
or configuration data — is available through the API. Both a
SOAP and a JSON API are available and interfaces for various
programming languages, including C++, Python and PHP are
provided.
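As a purely illustrative sketch of what such API-based scripting can look like, the snippet below issues a JSON call over HTTP from Python. The endpoint, service and call names, and the response format are invented placeholders, not Bright Cluster Manager's actual API; the product documentation defines the real call conventions:

# Hypothetical JSON-over-HTTP query against a cluster management
# head node; all names below are placeholders, not a real API.
import json
import urllib.request

ENDPOINT = "https://headnode.example.com:8081/json"     # placeholder URL

request_body = {
    "service": "monitoring",        # placeholder service name
    "call": "getDeviceStatus",      # placeholder call name
    "args": ["node001"],
}

req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(request_body).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as response:
    print(json.load(response))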
Cloud Bursting
Bright Cluster Manager supports two cloud bursting sce-
narios: “Cluster-on-Demand” — running stand-alone clusters
in the cloud; and “Cluster Extension” — adding cloud-based
resources to existing, onsite clusters and managing these
cloud nodes as if they were local. In addition, Bright provides
data aware scheduling to ensure that data is accessible in
the cloud at the start of jobs, and results are promptly trans-
ferred back. Both scenarios can be achieved in a few simple
steps. Every Bright cluster is automatically cloud-ready, at no
extra cost.

“The building blocks for transtec HPC solutions
must be chosen according to our goals of ease-of-
management and ease-of-use. With Bright Cluster
Manager, we are happy to have the technology
leader at hand, meeting these requirements, and
our customers value that.”
Armin Jäger, HPC Solution Engineer
Hadoop Cluster Management
Bright Cluster Manager is the ideal basis for Hadoop clusters.
Bright installs on bare metal, configuring a fully operational
Hadoop cluster in less than one hour. In the process, Bright pre-
pares your Hadoop cluster for use by provisioning the operating
system and the general cluster management and monitoring ca-
pabilities required as on any cluster.
Bright then manages and monitors your Hadoop cluster’s
hardware and system software throughout its life-cycle,
collecting and graphically displaying a full range of Hadoop
metrics from the HDFS, RPC and JVM sub-systems. Bright sig-
nificantly reduces setup time for Cloudera, Hortonworks and
other Hadoop stacks, and increases both uptime and Map-
Reduce job throughput.
This functionality is scheduled to be further enhanced in
upcoming releases of Bright, including dedicated manage-
ment roles and profiles for name nodes, data nodes, as well
as advanced Hadoop health checking and monitoring func-
tionality.
Standard and Advanced Editions
Bright Cluster Manager is available in two editions: Standard
and Advanced. The table on this page lists the differences. You
can easily upgrade from the Standard to the Advanced Edition as
your cluster grows in size or complexity.
The web-based User Portal provides read-only access to essential cluster in-
formation, including a general overview of the cluster status, node hardware
and software properties, workload manager statistics and user-customizable
graphs.
Bright Cluster Manager can manage multiple clusters simultaneously. This
overview shows clusters in Oslo, Abu Dhabi and Houston, all managed through
one GUI.
Documentation and Services
A comprehensive system administrator manual and user man-
ual are included in PDF format. Standard and tailored services
are available, including various levels of support, installation,
training and consultancy.
Feature Standard Advanced
Choice of Linux distributions x x
Intel Cluster Ready x x
Cluster Management GUI x x
Cluster Management Shell x x
Web-Based User Portal x x
SOAP API x x
Node Provisioning x x
Node Identification x x
Cluster Monitoring x x
Cluster Automation x x
User Management x x
Role-based Access Control x x
Parallel Shell x x
Workload Manager Integration x x
Cluster Security x x
Compilers x x
Debuggers & Profilers x x
MPI Libraries x x
Mathematical Libraries x x
Environment Modules x x
Cloud Bursting x x
Hadoop Management & Monitoring x x
NVIDIA CUDA & OpenCL - x
GPU Management & Monitoring - x
Xeon Phi Management & Monitoring - x
ScaleMP Management & Monitoring - x
Redundant Failover Head Nodes - x
Cluster Health Checking - x
Off-loadable Provisioning - x
Multi-Cluster Management - x
Suggested Number of Nodes 4–128 129–10,000+
Standard Support x x
Premium Support Optional Optional
Cluster health checks can be visualized in the Rackview. This screenshot shows
that GPU unit 41 fails a health check called “AllFansRunning”.
Intelligent HPC Workload Management
While all HPC systems face
challenges in workload de-
mand, resource complexity,
and scale, enterprise HPC
systems face more strin-
gent challenges and expec-
tations.
Enterprise HPC systems
must meet mission-critical
and priority HPC workload
demands for commercial
businesses and business-ori-
ented research and academ-
ic organizations. They have
complex SLAs and priorities
to balance. Their HPC work-
loads directly impact the
revenue, product delivery,
and organizational objec-
tives of their organizations.
Enterprise HPC Challenges
Enterprise HPC systems must eliminate job delays and failures.
They are also seeking to improve resource utilization and man-
agement efficiency across multiple heterogeneous systems. To
maximize user productivity, they are required to make it easier for
users to access and use HPC resources, or even to expand to other
clusters or an HPC cloud to better handle workload demand surges.
Intelligent Workload Management
As today’s leading HPC facilities move beyond petascale towards
exascale systems that incorporate increasingly sophisticated and
specialized technologies, equally sophisticated and intelligent
management capabilities are essential. With a proven history of
managing the most advanced, diverse, and data-intensive sys-
tems in the Top500, Moab continues to be the preferred workload
management solution for next generation HPC facilities.
Moab is the “brain” of an HPC system, intelligently optimizing workload
throughput while balancing service levels and priorities.
Moab HPC Suite – Basic Edition
Moab HPC Suite - Basic Edition is an intelligent workload man-
agement solution. It automates the scheduling, managing,
monitoring, and reporting of HPC workloads on massive scale,
multi-technology installations. The patented Moab intelligence
engine uses multi-dimensional policies to accelerate running
workloads across the ideal combination of diverse resources.
These policies balance high utilization and throughput goals
with competing workload priorities and SLA requirements. The
speed and accuracy of the automated scheduling decisions
optimizes workload throughput and resource utilization. This
gets more work accomplished in less time, and in the right prio-
rity order. Moab HPC Suite - Basic Edition optimizes the value
and satisfaction from HPC systems while reducing manage-
ment cost and complexity.
Moab HPC Suite – Enterprise Edition
Moab HPC Suite - Enterprise Edition provides enterprise-ready HPC
workload management that accelerates productivity, automates
workload uptime, and consistently meets SLAs and business prior-
ities for HPC systems and HPC cloud. It uses the battle-tested and
patented Moab intelligence engine to automatically balance the
complex, mission-critical workload priorities of enterprise HPC
systems. Enterprise customers benefit from a single integrated
product that brings together key enterprise HPC capabilities, im-
plementation services, and 24x7 support. This speeds the realiza-
tion of benefits from the HPC system for the business including:
ʎʎ Higher job throughput
ʎʎ Massive scalability for faster response and system expansion
ʎʎ Optimum utilization of 90-99% on a consistent basis
ʎʎ Fast, simple job submission and management to increase
productivity
ʎʎ Reduced cluster management complexity and support costs
across heterogeneous systems
ʎʎ Reduced job failures and auto-recovery from failures
ʎʎ SLAs consistently met for improved user satisfaction
ʎʎ Reduced power usage and costs by 10-30%
Productivity Acceleration
Moab HPC Suite – Enterprise Edition gets more results deliv-
ered faster from HPC resources at lower costs by accelerating
overall system, user and administrator productivity. Moab pro-
vides the scalability, 90-99 percent utilization, and simple job
submission that is required to maximize the productivity of
enterprise HPC systems. Enterprise use cases and capabilities
include:
ʎʎ Massive multi-point scalability to accelerate job response
and throughput, including high throughput computing
ʎʎ Workload-optimized allocation policies and provisioning
to get more results out of existing heterogeneous resources
and reduce costs, including topology-based allocation
ʎʎ Unify workload management across heterogeneous clusters to maximize resource availability and administration efficiency by managing them as one cluster
ʎʎ Optimized, intelligent scheduling packs workloads and
backfills around priority jobs and reservations while balanc-
ing SLAs to efficiently use all available resources
ʎʎ Optimized scheduling and management of accelerators,
both Intel MIC and GPGPUs, for jobs to maximize their utili-
zation and effectiveness
ʎʎ Simplified job submission and management with advanced
job arrays, self-service portal, and templates
ʎʎ Administrator dashboards and reporting tools reduce man-
agement complexity, time, and costs
ʎʎ Workload-aware auto-power management reduces energy
use and costs by 10-30 percent
“With Moab HPC Suite, we can meet very demanding customers’ requirements as regards unified management of heterogeneous cluster environments, grid management, and provide them with flexible and powerful configuration and reporting options. Our customers value that highly.”
Thomas Gebert, HPC Solution Architect
Uptime Automation
Job and resource failures in enterprise HPC systems lead to de-
layed results, missed organizational opportunities, and missed
objectives. Moab HPC Suite – Enterprise Edition intelligently au-
tomates workload and resource uptime in the HPC system to en-
sure that workload completes successfully and reliably, avoiding
these failures. Enterprises can benefit from:
ʎʎ Intelligent resource placement to prevent job failures with
granular resource modeling to meet workload requirements
and avoid at-risk resources
ʎʎ Auto-response to failures and events with configurable actions tied to pre-failure conditions, amber alerts, or other metrics and monitors (a simplified rule sketch follows this list)
ʎʎ Workload-aware future maintenance scheduling that helps
maintain a stable HPC system without disrupting workload
productivity
ʎʎ Real-world expertise for fast time-to-value and system uptime with included implementation, training, and 24x7 remote support services
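To make the idea of configurable auto-responses more concrete, the sketch below shows the kind of logic such a policy expresses: thresholds on node health metrics mapped to pre-failure actions. This is a minimal illustration only; the metric names, thresholds, and actions are assumptions and do not reflect Moab's actual configuration syntax.

```python
# Illustrative sketch: map node health metrics to pre-failure actions, in the
# spirit of configurable auto-response policies. Metric names, thresholds, and
# actions below are assumptions for illustration only.

RULES = [
    # (metric, threshold, comparison, action)
    ("ecc_errors",  1,   "ge", "reserve-node"),   # stop placing new jobs on the node
    ("temperature", 85,  "ge", "notify-admin"),   # amber alert before hardware limits
    ("free_memory", 512, "lt", "drain-node"),     # let running jobs finish, then offline
]

def triggered_actions(metrics):
    """Return the actions triggered by a node's current metric values."""
    actions = []
    for metric, threshold, comparison, action in RULES:
        value = metrics.get(metric)
        if value is None:
            continue
        hit = value >= threshold if comparison == "ge" else value < threshold
        if hit:
            actions.append((action, metric, value))
    return actions

if __name__ == "__main__":
    node_metrics = {"ecc_errors": 2, "temperature": 78, "free_memory": 4096}
    for action, metric, value in triggered_actions(node_metrics):
        print(f"node042: {action} (triggered by {metric}={value})")
```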
Auto SLA Enforcement
Moab HPC Suite – Enterprise Edition uses the powerful Moab
intelligence engine to optimally schedule and dynamically
adjust workload to consistently meet service level agree-
ments (SLAs), guarantees, or business priorities. This auto-
matically ensures that the right workloads are completed at
the optimal times, taking into account the complex mix of departments, priorities, and SLAs that must be balanced. Moab
provides:
ʎʎ Usage accounting and budget enforcement that schedules
resources and reports on usage in line with resource shar-
ing agreements and precise budgets (includes usage limits,
usage reports, auto budget management and dynamic fair-
share policies)
ʎʎ SLA and priority policies that make sure the highest priority workloads are processed first (includes Quality of Service and hierarchical priority weighting); a simplified priority calculation is sketched after this list
ʎʎ Continuous plus future scheduling that ensures priorities and guarantees are proactively met as conditions and workloads change (e.g. future reservations, pre-emption, etc.)
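As a rough illustration of how SLA and priority policies combine, the toy calculation below weights a job's quality-of-service level against how far its group's recent usage has drifted from its fairshare target. The weights and the formula are illustrative assumptions; Moab's actual priority factors and their configuration are far richer than this.

```python
# Toy priority calculation: combine a QoS level with fairshare drift.
# Weights and formula are assumptions for illustration only.

QOS_WEIGHT = 1000        # how strongly the QoS level dominates
FAIRSHARE_WEIGHT = 500   # how strongly under-served groups are boosted

def priority(qos_level, fairshare_target, recent_usage):
    """qos_level: small integer, higher = more important.
    fairshare_target / recent_usage: fractions of the system (0.0 - 1.0)."""
    # Groups that used less than their target get a boost; groups over
    # their target get pushed down.
    drift = fairshare_target - recent_usage
    return qos_level * QOS_WEIGHT + drift * FAIRSHARE_WEIGHT

jobs = [
    ("chem_batch", 1, 0.30, 0.45),  # over its share
    ("crash_sim",  2, 0.40, 0.20),  # high QoS and under-served
    ("post_proc",  1, 0.30, 0.10),
]

for name, qos, target, used in sorted(jobs, key=lambda j: -priority(*j[1:])):
    print(f"{name:10s} priority={priority(qos, target, used):8.1f}")
```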
Grid- and Cloud-Ready Workload Management
The benefits of a traditional HPC environment can be extended
to more efficiently manage and meet workload demand with
the multi-cluster grid and HPC cloud management capabilities in
Moab HPC Suite - Enterprise Edition. It allows you to:
ʎʎ Showback or chargeback for pay-for-use so actual resource
usage is tracked with flexible chargeback rates and report-
ing by user, department, cost center, or cluster
ʎʎ Manage and share workload across multiple remote clusters
to meet growing workload demand or surges by adding on
the Moab HPC Suite – Grid Option.
Moab HPC Suite – Grid Option
The Moab HPC Suite - Grid Option is a powerful grid-workload
management solution that includes scheduling, advanced poli-
cy management, and tools to control all the components of ad-
vanced grids. Unlike other grid solutions, Moab HPC Suite - Grid
Option connects disparate clusters into a logical whole, enabling
grid administrators and grid policies to have sovereignty over all
systems while preserving control at the individual cluster.
Moab HPC Suite - Grid Option has powerful applications that al-
low organizations to consolidate reporting; information gather-
ing; and workload, resource, and data management. Moab HPC
Suite - Grid Option delivers these services in a near-transparent
way: users are unaware they are using grid resources—they
know only that they are getting work done faster and more eas-
ily than ever before.
Moab manages many of the largest clusters and grids in the
world. Moab technologies are used broadly across Fortune 500
companies and manage more than a third of the compute cores
in the top 100 systems of the Top500 supercomputers. Adaptive
Computing is a globally trusted ISV (independent software ven-
dor), and the full scalability and functionality that Moab HPC Suite
with the Grid Option offers in a single integrated solution has
traditionally made it a significantly more cost-effective option
than other tools on the market today.
Components
The Moab HPC Suite - Grid Option is available as an add-on to
Moab HPC Suite - Basic Edition and -Enterprise Edition. It extends
the capabilities and functionality of the Moab HPC Suite compo-
nents to manage grid environments including:
ʎʎ Moab Workload Manager for Grids – a policy-based, multi-dimensional decision engine for workload management and scheduling
ʎʎ Moab Cluster Manager for Grids – a powerful and unified graphical administration tool for monitoring, managing, and reporting across multiple clusters
ʎʎ Moab Viewpoint™ for Grids – a Web-based self-service end-user job-submission and management portal and administrator dashboard
Benefits
ʎʎ Unified management across heterogeneous clusters provides
ability to move quickly from cluster to optimized grid
ʎʎ Policy-driven and predictive scheduling ensures that jobs
start and run in the fastest time possible by selecting optimal
resources
ʎʎ Flexible policy and decision engine adjusts workload pro-
cessing at both grid and cluster level
ʎʎ Grid-wide interface and reporting tools provide view of grid
resources, status and usage charts, and trends over time for
capacity planning, diagnostics, and accounting
ʎʎ Advanced administrative control allows various business
units to access and view grid resources, regardless of physical
or organizational boundaries, or alternatively restricts access
to resources by specific departments or entities
ʎʎ Scalable architecture to support petascale, high-throughput computing and beyond
Grid Control with Automated Tasks, Policies, and Re-
porting
ʎʎ Guarantee that the most-critical work runs first with flexible
global policies that respect local cluster policies but continue
to support grid service-level agreements
ʎʎ Ensure availability of key resources at specific times with ad-
vanced reservations
ʎʎ Tune policies prior to rollout with cluster- and grid-level simu-
lations
ʎʎ Use a global view of all grid operations for self-diagnosis, plan-
ning, reporting, and accounting across all resources, jobs, and
clusters
Moab HPC Suite - Grid Option can be flexibly configured for centralized, cen-
tralized and local, or peer-to-peer grid policies, decisions and rules. It is able to
manage multiple resource managers and multiple Moab instances.
Harnessing the Power of Intel Xeon Phi and GPGPUs
The mathematical acceleration delivered by many-core General
Purpose Graphics Processing Units (GPGPUs) offers significant
performance advantages for many classes of numerically in-
tensive applications. Parallel computing tools such as NVIDIA’s
CUDA and the OpenCL framework have made harnessing the
power of these technologies much more accessible to develop-
ers, resulting in the increasing deployment of hybrid GPGPU-
based systems and introducing significant challenges for ad-
ministrators and workload management systems.
The new Intel Xeon Phi coprocessors offer breakthrough perfor-
mance for highly parallel applications with the benefit of lever-
aging existing code and flexibility in programming models for
faster application development. Moving an application to Intel
Xeon Phi coprocessors has the promising benefit of requiring
substantially less effort and lower power usage. Such a small
investment can reap big performance gains for both data-inten-
sive and numerical or scientific computing applications that can
make the most of the Intel Many Integrated Cores (MIC) technol-
ogy. So it is no surprise that organizations are quickly embracing
these new coprocessors in their HPC systems.
The main goal is to create breakthrough discoveries, products
and research that improve our world. Harnessing and optimiz-
ing the power of these new accelerator technologies is key to
doing this as quickly as possible. Moab HPC Suite helps organi-
zations easily integrate accelerator and coprocessor technolo-
gies into their HPC systems, optimizing their utilization for the
right workloads and problems they are trying to solve.
Moab HPC Suite automatically schedules hybrid systems incor-
porating Intel Xeon Phi coprocessors and GPGPU accelerators,
optimizing their utilization as just another resource type with
policies. Organizations can choose the accelerators that best
meet their different workload needs.
Managing Hybrid Accelerator Systems
Hybrid accelerator systems add a new dimension of manage-
ment complexity when allocating workloads to available re-
sources. In addition to the traditional needs of aligning work-
load placement with software stack dependencies, CPU type,
memory, and interconnect requirements, intelligent workload
management systems now need to consider:
ʎʎ Workload’s ability to exploit Xeon Phi or GPGPU
technology
ʎʎ Additional software dependencies
ʎʎ Current health and usage status of available Xeon Phis
or GPGPUs
ʎʎ Resource configuration for type and number of Xeon Phis
or GPGPUs
Moab HPC Suite auto detects which types of accelerators are
where in the system to reduce the management effort and costs
as these processors are introduced and maintained together in
an HPC system. This gives customers maximum choice and per-
formance in selecting the accelerators that work best for each
of their workloads, whether Xeon Phi, GPGPU or a hybrid mix of
the two.
Moab HPC Suite optimizes accelerator utilization with policies that ensure
the right ones get used for the right user and group jobs at the right time.
Optimizing Intel Xeon Phi and GPGPUs Utilization
System and application diversity is one of the first issues
workload management software must address in optimizing
accelerator utilization. The ability of current codes to use GP-
GPUs and Xeon Phis effectively, ongoing cluster expansion,
and costs usually mean only a portion of a system will be
equipped with one or a mix of both types of accelerators.
Moab HPC Suite has a twofold role in reducing the complexity
of their administration and optimizing their utilization. First,
it must be able to automatically detect new GPGPUs or Intel
Xeon Phi coprocessors in the environment and their availabil-
ity without the cumbersome burden of manual administra-
tor configuration. Second, and most importantly, it must ac-
curately match Xeon Phi-bound or GPGPU-enabled workloads
with the appropriately equipped resources in addition to
managing contention for those limited resources according
to organizational policy. Moab’s powerful allocation and pri-
oritization policies can ensure the right jobs from users and
groups get to the optimal accelerator resources at the right
time. This keeps the accelerators at peak utilization for the
right priority jobs.
Moab’s policies are a powerful ally to administrators in automatically determining the optimal Xeon Phi coprocessor or GPGPU to use, and which ones to avoid, when scheduling jobs. These allocation policies can be based on any of the GPGPU or Xeon Phi metrics, such as memory (to ensure job needs are met), temperature (to avoid hardware errors or failures), utilization, or other metrics; a simplified allocation sketch follows the metric lists below.
Use the following metrics in policies to optimize allocation of
Intel Xeon Phi coprocessors:
ʎʎ Number of Cores
ʎʎ Number of Hardware Threads
ʎʎ Physical Memory
ʎʎ Free Physical Memory
ʎʎ Swap Memory
ʎʎ Free Swap Memory
ʎʎ Max Frequency (in MHz)
Use the following metrics in policies to optimize allocation of
GPGPUs and for management diagnostics:
ʎʎ Error counts; single/double-bit, commands to reset counts
ʎʎ Temperature
ʎʎ Fan speed
ʎʎ Memory; total, used, utilization
ʎʎ Utilization percent
ʎʎ Metrics time stamp
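A minimal sketch of what metric-driven allocation means in practice: filter out accelerators whose temperature or free memory would put a job at risk, then rank the remainder. The device records, thresholds, and scoring below are illustrative assumptions, not Moab policy syntax.

```python
# Illustrative allocation sketch: pick the best accelerator for a job based on
# health and capacity metrics. Device data, thresholds, and scoring are
# assumptions for illustration only.

accelerators = [
    {"id": "gpu0", "type": "gpgpu",    "temp_c": 91, "free_mem_mb": 5000, "util_pct": 10},
    {"id": "gpu1", "type": "gpgpu",    "temp_c": 70, "free_mem_mb": 3000, "util_pct": 40},
    {"id": "mic0", "type": "xeon_phi", "temp_c": 65, "free_mem_mb": 7000, "util_pct": 5},
]

def pick(job_type, needed_mem_mb, max_temp_c=85):
    # 1. Filter: right accelerator type, enough free memory, not running hot.
    candidates = [a for a in accelerators
                  if a["type"] == job_type
                  and a["free_mem_mb"] >= needed_mem_mb
                  and a["temp_c"] <= max_temp_c]
    # 2. Rank: prefer idle devices, break ties with memory headroom.
    candidates.sort(key=lambda a: (a["util_pct"], -a["free_mem_mb"]))
    return candidates[0]["id"] if candidates else None

print(pick("gpgpu", needed_mem_mb=2000))     # gpu1 (gpu0 is excluded as too hot)
print(pick("xeon_phi", needed_mem_mb=4000))  # mic0
```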
Improve GPGPU Job Speed and Success
The GPGPU drivers supplied by vendors today allow multiple
jobs to share a GPGPU without corrupting results, but sharing
GPGPUs and other key GPGPU factors can significantly impact
the speed and success of GPGPU jobs and the final level of ser-
vice delivered to the system users and the organization.
Consider the example of a user estimating the time for a GPGPU
job based on exclusive access to a GPGPU, but the workload man-
ager allowing the GPGPU to be shared when the job is scheduled.
The job will likely exceed its estimated time and be terminated by the workload manager before it completes, leaving a very unsatisfied user and letting down the organization, group, or project they represent.
Moab HPC Suite provides the ability to schedule GPGPU jobs to run exclusively, and so finish in the shortest time possible, for certain types or classes of jobs or users, ensuring job speed and success; the trade-off is illustrated in the sketch below.
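The sketch below illustrates why the sharing decision matters: a job whose walltime was estimated against exclusive access will overrun, and be killed, if the device is shared and effectively slowed down. The linear slowdown model is a deliberately crude assumption, used purely to show the scheduling trade-off.

```python
# Crude illustration of the exclusive-vs-shared trade-off described above.
# The linear slowdown model is an assumption for illustration only.

def will_complete(estimated_hours, walltime_limit_hours, jobs_sharing_gpu):
    # Assume (crudely) that runtime grows linearly with the number of jobs
    # sharing the GPGPU.
    actual_hours = estimated_hours * jobs_sharing_gpu
    return actual_hours <= walltime_limit_hours

# Estimated at 4 hours with exclusive access, submitted with a 5-hour limit.
print(will_complete(4, 5, jobs_sharing_gpu=1))  # True  -> exclusive: job finishes
print(will_complete(4, 5, jobs_sharing_gpu=2))  # False -> shared: killed at the limit
```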
Cluster Sovereignty and Trusted Sharing
ʎʎ Guarantee that shared resources are allocated fairly with
global policies that fully respect local cluster configuration
and needs
ʎʎ Establish trust between resource owners through graphical
usage controls, reports, and accounting across all shared
resources
ʎʎ Maintain cluster sovereignty with granular settings to control where jobs can originate and be processed
ʎʎ Establish resource ownership and enforce appropriate access levels with prioritization, preemption, and access guarantees
Increase User Collaboration and Productivity
ʎʎ Reduce end-user training and job management time with
easy-to-use graphical interfaces
ʎʎ Enable end users to easily submit and manage their jobs
through an optional web portal, minimizing the costs of ca-
tering to a growing base of needy users
ʎʎ Collaborate more effectively with multicluster co-allocation,
allowing key resource reservations for high-priority projects
ʎʎ Leverage saved job templates, allowing users to submit multiple jobs quickly and with minimal changes
ʎʎ Speed job processing with enhanced grid placement options for job arrays; optimal or single cluster placement
The Visual Cluster view in Moab Cluster Manager for Grids makes cluster and grid management easy and efficient.
Process More Work in Less Time to Maximize ROI
ʎʎ Achieve higher, more consistent resource utilization with intelligent scheduling, matching job requests to the best-suited resources, including GPGPUs
ʎʎ Use optimized data staging to ensure that remote data trans-
fers are synchronized with resource availability to minimize
poor utilization
ʎʎ Allow local cluster-level optimizations of most grid workload
Unify Management Across Independent Clusters
ʎʎ Unify management across existing internal, external, and
partner clusters – even if they have different resource man-
agers, databases, operating systems, and hardware
ʎʎ Out-of-the-box support for both local area and wide area
grids
ʎʎ Manage secure access to resources with simple credential
mapping or interface with popular security tool sets
ʎʎ Leverage existing data-migration technologies, such as SCP
or GridFTP
Moab HPC Suite – Application Portal Edition
Remove the Barriers to Harnessing HPC
Organizations of all sizes and types are looking to harness the
potential of high performance computing (HPC) to enable and
accelerate more competitive product design and research. To do
this, you must find a way to extend the power of HPC resourc-
es to designers, researchers and engineers not familiar with or
trained on using the specialized HPC technologies. Simplifying
the user access to and interaction with applications running on
more efficient HPC clusters removes the barriers to discovery
and innovation for all types of departments and organizations.
Moab® HPC Suite – Application Portal Edition provides easy
single-point access to common technical applications, data, HPC
resources, and job submissions to accelerate the research, analysis, and collaboration process for end users while reducing computing costs.
HPC Ease of Use Drives Productivity
Moab HPC Suite – Application Portal Edition improves ease of
use and productivity for end users using HPC resources with sin-
gle-point access to their common technical applications, data,
resources, and job submissions across the design and research
process. It simplifies the specification and running of jobs on HPC resources to a few simple clicks on an application-specific template (a simplified template-expansion sketch follows the list below), with key capabilities including:
ʎʎ Application-centric job submission templates for common
manufacturing, energy, life-sciences, education and other
industry technical applications
ʎʎ Support for batch and interactive applications, such as
simulations or analysis sessions, to accelerate the full proj-
ect or design cycle
ʎʎ No special HPC training required to enable more users to
start accessing HPC resources with intuitive portal
ʎʎ Distributed data management avoids unnecessary remote
file transfers for users, storing data optimally on the network
for easy access and protection, and optimizing file transfer
to/from user desktops if needed
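To illustrate what an application-centric template does behind the scenes, the sketch below turns a handful of form fields into a complete batch-job definition. The field names, the hypothetical solver binary, and the generated script layout are assumptions for illustration; they are not the portal's actual template format, although the embedded #PBS directives are standard TORQUE/PBS ones.

```python
# Illustrative sketch: expand a few portal form fields into a batch job script.
# Field names, the solver binary, and the script layout are assumptions.

TEMPLATE = """#!/bin/bash
#PBS -N {job_name}
#PBS -l nodes={nodes}:ppn={cores_per_node}
#PBS -l walltime={walltime}
cd $PBS_O_WORKDIR
{solver} -input {input_file}
"""

def expand(form):
    """Turn the values a user typed into the portal form into a job script."""
    return TEMPLATE.format(**form)

form = {
    "job_name": "crash_sim_42",
    "nodes": 4,
    "cores_per_node": 16,
    "walltime": "02:00:00",
    "solver": "my_solver",     # hypothetical application binary
    "input_file": "model.inp",
}
print(expand(form))
```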
Accelerate Collaboration While Maintaining Control
More and more projects are done across teams and with partner
and industry organizations. Moab HPC Suite – Application Portal
Edition enables you to accelerate this collaboration between indi-
viduals and teams while maintaining control as additional users,
inside and outside the organization, can easily and securely access project applications, HPC resources, and data to speed project cycles. Key capabilities and benefits you can leverage are:
ʎʎ Encrypted and controllable access, by class of user, ser-
vices, application or resource, for remote and partner users
improves collaboration while protecting infrastructure and
intellectual property
ʎʎ Enable multi-site, globally available HPC services, accessible anytime, anywhere, through many different types of devices via a standard web browser
ʎʎ Reduced client requirements for projects mean new project members can quickly contribute without being limited by their desktop capabilities
Reduce Costs with Optimized Utilization and Sharing
Moab HPC Suite – Application Portal Edition reduces the costs of
technical computing by optimizing resource utilization and the
sharing of resources including HPC compute nodes, application
licenses, and network storage. The patented Moab intelligence
engine and its powerful policies deliver value in key areas of uti-
lization and sharing:
ʎʎ Maximize HPC resource utilization with intelligent, opti-
mized scheduling policies that pack and enable more work
to be done on HPC resources to meet growing and changing
demand
ʎʎ Optimize application license utilization and access by shar-
ing application licenses in a pool, re-using and optimizing us-
age with allocation policies that integrate with license man-
agers for lower costs and better service levels to a broader
set of users
ʎʎ Usage budgeting and priority enforcement with usage accounting and priority policies that ensure resources are shared in line with service level agreements and project priorities
ʎʎ Leverage and fully utilize centralized storage capacity in-
stead of duplicative, expensive, and underutilized individual
workstation storage
Key Intelligent Workload Management Capabilities:
ʎʎ Massive multi-point scalability
ʎʎ Workload-optimized allocation policies and provisioning
ʎʎ Unify workload management across heterogeneous clusters
ʎʎ Optimized, intelligent scheduling
ʎʎ Optimized scheduling and management of accelerators
(both Intel MIC and GPGPUs)
ʎʎ Administrator dashboards and reporting tools
ʎʎ Workload-aware auto power management
ʎʎ Intelligent resource placement to prevent job failures
ʎʎ Auto-response to failures and events
ʎʎ Workload-aware future maintenance scheduling
ʎʎ Usage accounting and budget enforcement
ʎʎ SLA and priority policies
ʎʎ Continuous plus future scheduling
Moab HPC Suite – Remote Visualization Edition
Removing Inefficiencies in Current 3D Visualization Methods
Using 3D and 2D visualization, valuable data is interpreted and
new innovations are discovered. There are numerous inefficien-
cies in how current technical computing methods support users
and organizations in achieving these discoveries in a cost effec-
tive way. High-cost workstations and their software are often
not fully utilized by the limited user(s) that have access to them.
They are also costly to manage, and they do not keep pace for long with evolving workload technology requirements. In addition, the workstation model requires slow transfers of large data sets between machines. This is costly and inefficient in terms of network bandwidth, user productivity, and team collaboration, while posing increased data and security risks. There is a solution that removes
these inefficiencies using new technical cloud computing mod-
els. Moab HPC Suite – Remote Visualization Edition significantly
reduces the costs of visualization technical computing while
improving the productivity, collaboration and security of the de-
sign and research process.
Reduce the Costs of Visualization Technical Computing
Moab HPC Suite – Remote Visualization Edition significantly
reduces the hardware, network and management costs of vi-
sualization technical computing. It enables you to create easy,
centralized access to shared visualization compute, applica-
tion and data resources. These compute, application, and data
resources reside in a technical compute cloud in your data cen-
ter instead of in expensive and underutilized individual work-
stations.
ʎʎ Reduce hardware costs by consolidating expensive individual
technical workstations into centralized visualization servers
for higher utilization by multiple users; reduce additive or up-
grade technical workstation or specialty hardware purchases,
such as GPUs, for individual users.
ʎʎ Reduce management costs by moving from remote user work-
stations that are difficult and expensive to maintain, upgrade,
and back-up to centrally managed visualization servers that
require less admin overhead
ʎʎ Decrease network access costs and congestion, as significantly lower loads of just compressed, visualized pixels are moving to users, not full data sets
ʎʎ Reduce energy usage, as centralized visualization servers consume less energy for the user demand met
ʎʎ Reduce data storage costs by consolidating data into common
storage node(s) instead of under-utilized individual storage
The Application Portal Edition easily extends the power of HPC to your users and projects to drive innovation and competitive advantage.
Consolidation and Grid
Many sites have multiple clusters as a result of having multiple
independent groups or locations, each with demands for HPC,
and frequent additions of newer machines. Each new cluster
increases the overall administrative burden and overhead. Addi-
tionally, many of these systems can sit idle while others are over-
loaded. Because of this systems-management challenge, sites
turn to grids to maximize the efficiency of their clusters. Grids
can be enabled either independently or in conjunction with one
another in three areas:
ʎʎ Reporting Grids: Managers want to have global reporting
across all HPC resources so they can see how users and proj-
ects are really utilizing hardware and so they can effectively
plan capacity. Unfortunately, manually consolidating all of
this information in an intelligible manner for more than just
a couple clusters is a management nightmare. To solve that
problem, sites will create a reporting grid, or share informa-
tion across their clusters for reporting and capacity-planning
purposes.
ʎʎ Management Grids: Managing multiple clusters indepen-
dently can be especially difficult when processes change,
because policies must be manually reconfigured across all
clusters. To ease that difficulty, sites often set up manage-
ment grids that impose a synchronized management layer
across all clusters while still allowing each cluster some level
of autonomy.
ʎʎ Workload-Sharing Grids: Sites with multiple clusters often
have the problem of some clusters sitting idle while other
clusters have large backlogs. Such inequality in cluster utili-
zation wastes expensive resources, and training users to per-
form different workload-submission routines across various
clusters can be difficult and expensive as well. To avoid these
problems, sites often set up workload-sharing grids. These
grids can be as simple as centralizing user submission or as
complex as having each cluster maintain its own user submission routine with an underlying grid-management tool that migrates jobs between clusters (a minimal routing sketch follows this list).
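A minimal sketch of the workload-sharing idea: a central submission point that routes each new job to the cluster with the shortest backlog, which is the simplest form of the grid behaviour described above. The cluster names and the load metric are illustrative assumptions.

```python
# Minimal sketch of workload-sharing across clusters: route each new job to the
# cluster with the smallest backlog. Names and the load metric are illustrative.

clusters = {"clusterA": 120, "clusterB": 15, "clusterC": 60}  # queued core-hours

def route(job_core_hours):
    """Pick the least-loaded cluster and account for the new job's load."""
    target = min(clusters, key=clusters.get)
    clusters[target] += job_core_hours
    return target

for job in [10, 10, 50, 5]:
    print(f"{job:3d} core-hours -> {route(job)}")
```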
Inhibitors to Grid Environments
Three common inhibitors keep sites from enabling grid environ-
ments:
ʎʎ Politics: Because grids combine resources across users,
groups, and projects that were previously independent, grid
implementation can be a political nightmare. To create a
grid in the real world, sites need a tool that allows clusters
to retain some level of sovereignty while participating in the
larger grid.
ʎʎ Multiple Resource Managers: Most sites have a variety of
resource managers used by various groups within the orga-
nization, and each group typically has a large investment in
scripts that are specific to one resource manager and that
cannot be changed. To implement grid computing effective-
ly, sites need a robust tool that has flexibility in integrating
with multiple resource managers.
ʎʎ Credentials: Many sites have different log-in credentials for
each cluster, and those credentials are generally indepen-
dent of one another. For example, one user might be Joe.P on
one cluster and J_Peterson on another. To enable grid envi-
ronments, sites must create a combined user space that can
recognize and combine these different credentials.
Using Moab in a Grid
Sites can use Moab to set up any combination of reporting,
management, and workload-sharing grids. Moab is a grid metas-
cheduler that allows sites to set up grids that work effectively in
the real world. Moab’s feature-rich functionality overcomes the
inhibitors of politics, multiple resource managers, and varying
credentials by providing:
ʎʎ Grid Sovereignty: Moab has multiple features that break
down political barriers by letting sites choose how each clus-
ter shares in the grid. Sites can control what information is
shared between clusters and can specify which workload is
passed between clusters. In fact, sites can even choose to let
each cluster be completely sovereign in making decisions
about grid participation for itself.
ʎʎ Support for Multiple Resource Managers: Moab meta-
schedules across all common resource managers. It fully
integrates with TORQUE and SLURM, the two most-common
open-source resource managers, and also has limited in-
tegration with commercial tools such as PBS Pro and SGE.
Moab’s integration includes the ability to recognize when a
user has a script that requires one of these tools, and Moab
can intelligently ensure that the script is sent to the correct
machine. Moab even has the ability to translate common
scripts across multiple resource managers.
ʎʎ Credential Mapping: Moab can map credentials across clusters to ensure that users and projects are tracked appropriately and to provide consolidated reporting (see the sketch after this list).
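The sketch below shows the essence of credential mapping: a per-cluster table that resolves local account names to one grid-wide identity so usage can be tracked and reported consistently. The names follow the Joe.P / J_Peterson example above; the table format is an illustrative assumption, not Moab's actual mechanism.

```python
# Illustrative credential-mapping sketch: resolve per-cluster account names to
# a single grid-wide identity, as in the Joe.P / J_Peterson example above.

CREDENTIAL_MAP = {
    ("cluster1", "Joe.P"):      "jpeterson",
    ("cluster2", "J_Peterson"): "jpeterson",
    ("cluster1", "a.smith"):    "asmith",
}

def grid_identity(cluster, local_user):
    """Map a local account to its grid-wide identity (fall back to the local name)."""
    return CREDENTIAL_MAP.get((cluster, local_user), local_user)

# Consolidate per-cluster usage records under one identity per real user.
usage = [("cluster1", "Joe.P", 40), ("cluster2", "J_Peterson", 25)]
totals = {}
for cluster, user, core_hours in usage:
    identity = grid_identity(cluster, user)
    totals[identity] = totals.get(identity, 0) + core_hours
print(totals)  # {'jpeterson': 65} -> one consolidated report for the same person
```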
Improve Collaboration, Productivity and Security
With Moab HPC Suite – Remote Visualization Edition, you can im-
prove the productivity, collaboration and security of the design
and research process by only transferring pixels instead of data
to users to do their simulations and analysis. This enables a wid-
er range of users to collaborate and be more productive at any
time, on the same data, from anywhere without any data trans-
fer time lags or security issues. Users also have improved imme-
diate access to specialty applications and resources, like GPUs,
they might need for a project so they are no longer limited by
personal workstation constraints or to a single working location.
ʎʎ Improve workforce collaboration by enabling multiple
users to access and collaborate on the same interactive ap-
plication data at the same time from almost any location or
device, with little or no training needed
ʎʎ Eliminate data transfer time lags for users by keeping data
close to the visualization applications and compute resourc-
es instead of transferring back and forth to remote worksta-
tions; only smaller compressed pixels are transferred
ʎʎ Improve security with centralized data storage, so data no longer gets transferred to where it shouldn’t, gets lost, or gets inappropriately accessed on remote workstations; only pixels get moved
Maximize Utilization and Shared Resource Guarantees
Moab HPC Suite – Remote Visualization Edition maximizes your
resource utilization and scalability while guaranteeing shared
resource access to users and groups with optimized workload
management policies. Moab’s patented policy intelligence en-
gine optimizes the scheduling of the visualization sessions
across the shared resources to maximize standard and specialty
resource utilization, helping them scale to support larger vol-
umes of visualization workloads. These intelligent policies also
guarantee shared resource access to users to ensure a high ser-
vice level as they transition from individual workstations.
ʎʎ Guarantee shared visualization resource access for users
with priority policies and usage budgets that ensure they
receive the appropriate service level. Policies include ses-
sion priorities and reservations, number of users per node,
fairshare usage policies, and usage accounting and budgets
across multiple groups or users.
ʎʎ Maximize resource utilization and scalability by packing visualization workload optimally on shared servers using Moab allocation and scheduling policies (a simplified packing sketch follows this list). These policies can include optimal resource allocation by visualization session characteristics (like CPU, memory, etc.), workload packing, number of users per node, GPU policies, etc.
ʎʎ Improve application license utilization to reduce software
costs and improve access with a common pool of 3D visual-
ization applications shared across multiple users, with usage
optimized by intelligent Moab allocation policies, integrated
with license managers.
ʎʎ Optimized scheduling and management of GPUs and other
accelerators to maximize their utilization and effectiveness
ʎʎ Enable multiple Windows or Linux user sessions on a single
machine managed by Moab to maximize resource and power
utilization.
ʎʎ Enable dynamic OS re-provisioning on resources, managed
by Moab, to optimally meet changing visualization applica-
tion workload demand and maximize resource utilization
and availability to users.
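As a rough illustration of the packing policy mentioned above, the sketch below places interactive visualization sessions on the first shared server that still has enough cores and memory, while honouring a users-per-node cap. The server sizes and limits are illustrative assumptions, not product defaults.

```python
# Rough sketch of packing visualization sessions onto shared servers while
# honouring per-node user limits. Server sizes and limits are assumptions.

servers = [
    {"name": "viz01", "free_cores": 16, "free_mem_gb": 64, "sessions": 0},
    {"name": "viz02", "free_cores": 16, "free_mem_gb": 64, "sessions": 0},
]
MAX_SESSIONS_PER_NODE = 4

def place(cores, mem_gb):
    """First-fit placement of a session; returns the chosen server or None."""
    for s in servers:
        if (s["free_cores"] >= cores and s["free_mem_gb"] >= mem_gb
                and s["sessions"] < MAX_SESSIONS_PER_NODE):
            s["free_cores"] -= cores
            s["free_mem_gb"] -= mem_gb
            s["sessions"] += 1
            return s["name"]
    return None  # no capacity: the session waits or triggers provisioning

for session in [(4, 16), (4, 16), (8, 32), (4, 16)]:
    print(session, "->", place(*session))
```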
The Remote Visualization Edition makes visualization more cost effective and productive with an innovative technical compute cloud.
transtec HPC solutions are designed for maximum flexibility and ease of management. We not only offer our customers the most powerful and flexible cluster management solution out there, but also provide them with customized setup and site-specific configuration. Whether a customer needs a dynamic Linux-Windows dual-boot solution, unified management of different clusters at different sites, or fine-tuning of the Moab scheduler to implement fine-grained policy configuration – transtec not only gives you the framework at hand, but also helps you adapt the system according to your special needs. Needless to say, when customers are in need of special trainings, transtec will be there to provide customers, administrators, or users with specially adjusted Educational Services.
Many years of experience in High Performance Computing have enabled us to develop efficient concepts for installing and deploying HPC clusters. For this, we leverage well-proven 3rd party tools, which we assemble into a total solution and adapt and configure according to the customer’s requirements. We manage everything from the complete installation and configuration of the operating system, and necessary middleware components like job and cluster management systems, up to the customer’s applications.
Moab HPC Suite Use Case | Basic | Enterprise | App. Portal | Remote Visualization

Productivity Acceleration
Massive multifaceted scalability | x | x | x | x
Workload optimized resource allocation | x | x | x | x
Optimized, intelligent scheduling of workload and resources | x | x | x | x
Optimized accelerator scheduling & management (Intel MIC & GPGPU) | x | x | x | x
Simplified user job submission & mgmt. - Application Portal | x | x | x | x
Administrator dashboard and management reporting | x | x | x | x
Unified workload management across heterogeneous cluster(s) | x (single cluster only) | x | x | x
Workload optimized node provisioning | | x | x | x
Visualization workload optimization | | | | x
Workload-aware auto-power management | | x | x | x

Uptime Automation
Intelligent resource placement for failure prevention | x | x | x | x
Workload-aware maintenance scheduling | x | x | x | x
Basic deployment, training and premium (24x7) support service | standard support only | x | x | x
Auto response to failures and events | x (limited, no re-boot/provision) | x | x | x

Auto SLA Enforcement
Resource sharing and usage policies | x | x | x | x
SLA and priority policies | x | x | x | x
Continuous and future reservations scheduling | x | x | x | x
Usage accounting and budget enforcement | | x | x | x

Grid- and Cloud-Ready
Manage and share workload across multiple clusters in wide area grid | x (w/Grid Option) | x (w/Grid Option) | x (w/Grid Option) | x (w/Grid Option)
User self-service: job request & management portal | x | x | x | x
Pay-for-use showback and chargeback | | x | x | x
Workload optimized node provisioning & re-purposing | | x | x | x
Multiple clusters in single domain
NICE EnginFrame – A Technical Computing Portal
InfiniBand high-speed interconnects
InfiniBand (IB) is an efficient I/O technology that provides high-
speed data transfers and ultra-low latencies for computing and
storage over a highly reliable and scalable single fabric. The
InfiniBand industry standard ecosystem creates cost effective
hardware and software solutions that easily scale from genera-
tion to generation.
InfiniBand is a high-bandwidth, low-latency network interconnect
solution that has grown tremendous market share in the High
Performance Computing (HPC) cluster community. InfiniBand was
designed to take the place of today’s data center networking tech-
nology. In the late 1990s, a group of next-generation I/O architects formed an open, community-driven effort to provide a network technology with scalability and stability based on successes from other network designs. Today, InfiniBand is a popular and widely used I/O fabric
among customers within the Top500 supercomputers: major Uni-
versities and Labs; Life Sciences; Biomedical; Oil and Gas (Seismic,
Reservoir, Modeling Applications); Computer Aided Design and En-
gineering; Enterprise Oracle; and Financial Applications.
HPC systems and enterprise grids deliver unprecedented time-to-market and performance advantages to many research and corporate customers, who struggle every day with compute- and data-intensive processes. This often generates or transforms massive amounts of jobs and data that need to be handled and archived efficiently to deliver timely information to users distributed across multiple locations, with different security concerns.
Poor usability of such complex systems often negatively impacts users’ productivity, and ad-hoc data management often increases information entropy and dissipates knowledge and intellectual property.
NICE EnginFrame
A Technical Portal for Remote Visualization
From solving distributed computing issues for our customers, it is easy to understand that a modern, user-friendly web front-end to HPC and grids can drastically improve engineering productivity, if properly designed to address the specific challenges of the Technical Computing market.
Nice EnginFrame overcomes many common issues in the areas
of usability, data management, security and integration, open-
ing the way to a broader, more effective use of the Technical
Computing resources.
The key components of our solutions are:
ʎʎ a flexible and modular Java-based kernel, with clear separati-
on between customizations and core services
ʎʎ powerful data management features, reflecting the typical
needs of engineering applications
ʎʎ comprehensive security options and a fine grained authoriza-
tion system
ʎʎ scheduler abstraction layer to adapt to different workload
and resource managers
ʎʎ responsive and competent support services
End users can typically enjoy the following improvements:
ʎʎ user-friendly, intuitive access to computing resources, using a
standard web browser
ʎʎ application-centric job submission
ʎʎ organized access to job information and data
ʎʎ increased mobility and reduced client requirements
On the other side, the Technical Computing Portal delivers sig-
nificant added-value for system administrators and IT:
ʎʎ reduced training needs to enable users to access the resour-
ces
ʎʎ centralized configuration and immediate deployment of ser-
vices
ʎʎ comprehensive authorization to access services and informa-
tion
ʎʎ reduced support calls and submission errors
Coupled with our Remote Visualization solutions, our custom-
ers quickly deploy end-to-end engineering processes on their
Intranet, Extranet or Internet.
EnginFrame Highlights
ʎʎ Universal and flexible access to your Grid infrastructure: EnginFrame provides easy access over intranet, extranet or the Internet using standard and secure Internet languages and protocols.
ʎʎ Interface to the Grid in your organization: EnginFrame offers pluggable server-side XML services to enable easy integration of the leading workload schedulers in the market. Plug-in modules are available for all major grid solutions, including Platform Computing LSF and OCS, Microsoft Windows HPC Server 2008, Sun GridEngine, UnivaUD UniCluster, IBM LoadLeveler, Altair PBS Pro and OpenPBS, Torque, Condor, and EDG/gLite Globus-based toolkits, providing easy interfaces to the existing computing infrastructure.
ʎʎ Security and Access Control: You can give encrypted and controllable access to remote users and improve collaboration with partners while protecting your infrastructure and Intellectual Property (IP). This enables you to speed up design cycles while working with related departments or external partners and eases communication through a secure infrastructure.
ʎʎ Distributed Data Management: The comprehensive remote file management built into EnginFrame avoids unnecessary file transfers, and enables server-side treatment of data, as well as file transfer from/to the user’s desktop.
ʎʎ Interactive application support: Through the integration
with leading GUI virtualization solutions (including Desktop
Cloud Visualization (DCV), VNC, VirtualGL and NX), EnginFrame
enables you to address all stages of the design process, both
batch and interactive.
ʎʎ SOA-enable your applications and resources: EnginFrame
can automatically publish your computing applications via
standard WebServices (tested both for Java and .NET intero-
perability), enabling immediate integration with other enter-
prise applications.
Business Benefits
Category: Web and Cloud Portal
ʎʎ Simplify user access; minimize the training requirements for HPC users; broaden the user community
ʎʎ Enable access from intranet, internet or extranet
 
Innovation with ai at scale on the edge vt sept 2019 v0
Innovation with ai at scale  on the edge vt sept 2019 v0Innovation with ai at scale  on the edge vt sept 2019 v0
Innovation with ai at scale on the edge vt sept 2019 v0
 

Más de TTEC

Transtec360 brochure
Transtec360 brochureTranstec360 brochure
Transtec360 brochureTTEC
 
Virtual graphic workspace
Virtual graphic workspace Virtual graphic workspace
Virtual graphic workspace TTEC
 
Software defined storage rev. 2.0
Software defined storage rev. 2.0 Software defined storage rev. 2.0
Software defined storage rev. 2.0 TTEC
 
Hpc life science-nl
Hpc life science-nlHpc life science-nl
Hpc life science-nlTTEC
 
transtec vdi in-a-box
transtec vdi in-a-boxtranstec vdi in-a-box
transtec vdi in-a-boxTTEC
 
Nexenta transtec
Nexenta transtecNexenta transtec
Nexenta transtecTTEC
 
Dacoria overview englisch
Dacoria overview englischDacoria overview englisch
Dacoria overview englischTTEC
 
Vmware certified IBM Servers
Vmware certified IBM ServersVmware certified IBM Servers
Vmware certified IBM ServersTTEC
 
Ibm v3700
Ibm v3700Ibm v3700
Ibm v3700TTEC
 
Sansymphony v-r9
Sansymphony v-r9Sansymphony v-r9
Sansymphony v-r9TTEC
 
Microsoft Hyper-V explained
Microsoft Hyper-V explainedMicrosoft Hyper-V explained
Microsoft Hyper-V explainedTTEC
 
ttec NAS powered by Open-E
ttec NAS powered by Open-Ettec NAS powered by Open-E
ttec NAS powered by Open-ETTEC
 
Sandy bridge platform from ttec
Sandy bridge platform from ttecSandy bridge platform from ttec
Sandy bridge platform from ttecTTEC
 
Solution magazine 2012_final_nl
Solution magazine 2012_final_nlSolution magazine 2012_final_nl
Solution magazine 2012_final_nlTTEC
 
Hpc compass transtec_2012
Hpc compass transtec_2012Hpc compass transtec_2012
Hpc compass transtec_2012TTEC
 
2012 07-01e tn ovd lto product overview
2012 07-01e tn ovd lto product overview2012 07-01e tn ovd lto product overview
2012 07-01e tn ovd lto product overviewTTEC
 
Nexenta ttec-2012
Nexenta ttec-2012Nexenta ttec-2012
Nexenta ttec-2012TTEC
 
ttec infortrend ds
ttec infortrend dsttec infortrend ds
ttec infortrend dsTTEC
 
Pvma32xx series-raid
Pvma32xx series-raidPvma32xx series-raid
Pvma32xx series-raidTTEC
 
SANsymphony V
SANsymphony VSANsymphony V
SANsymphony VTTEC
 

Más de TTEC (20)

Transtec360 brochure
Transtec360 brochureTranstec360 brochure
Transtec360 brochure
 
Virtual graphic workspace
Virtual graphic workspace Virtual graphic workspace
Virtual graphic workspace
 
Software defined storage rev. 2.0
Software defined storage rev. 2.0 Software defined storage rev. 2.0
Software defined storage rev. 2.0
 
Hpc life science-nl
Hpc life science-nlHpc life science-nl
Hpc life science-nl
 
transtec vdi in-a-box
transtec vdi in-a-boxtranstec vdi in-a-box
transtec vdi in-a-box
 
Nexenta transtec
Nexenta transtecNexenta transtec
Nexenta transtec
 
Dacoria overview englisch
Dacoria overview englischDacoria overview englisch
Dacoria overview englisch
 
Vmware certified IBM Servers
Vmware certified IBM ServersVmware certified IBM Servers
Vmware certified IBM Servers
 
Ibm v3700
Ibm v3700Ibm v3700
Ibm v3700
 
Sansymphony v-r9
Sansymphony v-r9Sansymphony v-r9
Sansymphony v-r9
 
Microsoft Hyper-V explained
Microsoft Hyper-V explainedMicrosoft Hyper-V explained
Microsoft Hyper-V explained
 
ttec NAS powered by Open-E
ttec NAS powered by Open-Ettec NAS powered by Open-E
ttec NAS powered by Open-E
 
Sandy bridge platform from ttec
Sandy bridge platform from ttecSandy bridge platform from ttec
Sandy bridge platform from ttec
 
Solution magazine 2012_final_nl
Solution magazine 2012_final_nlSolution magazine 2012_final_nl
Solution magazine 2012_final_nl
 
Hpc compass transtec_2012
Hpc compass transtec_2012Hpc compass transtec_2012
Hpc compass transtec_2012
 
2012 07-01e tn ovd lto product overview
2012 07-01e tn ovd lto product overview2012 07-01e tn ovd lto product overview
2012 07-01e tn ovd lto product overview
 
Nexenta ttec-2012
Nexenta ttec-2012Nexenta ttec-2012
Nexenta ttec-2012
 
ttec infortrend ds
ttec infortrend dsttec infortrend ds
ttec infortrend ds
 
Pvma32xx series-raid
Pvma32xx series-raidPvma32xx series-raid
Pvma32xx series-raid
 
SANsymphony V
SANsymphony VSANsymphony V
SANsymphony V
 

Último

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

HPC compass 2013/2014

High-speed Interconnects............................................................................. 136
Top 10 Reasons to Use Intel TrueScale InfiniBand.......................... 138
Intel MPI Library 4.0 Performance......................................................... 140
Intel Fabric Suite 7.......................................................................................... 144
Parstream....................................................................................150
Big Data Analytics........................................................................................... 152
Glossary........................................................................................160
More Than 30 Years of Experience in Scientific Computing

1980 marked the beginning of a decade in which numerous start-ups were created, some of which later transformed into big players in the IT market. Technical innovations brought dramatic changes to the nascent computer market. In Tübingen, close to one of Germany's prime and oldest universities, transtec was founded.

In the early days, transtec focused on reselling DEC computers and peripherals, delivering high-performance workstations to university institutes and research facilities. In 1987, SUN/Sparc and storage solutions broadened the portfolio, enhanced by IBM/RS6000 products in 1991. These were the typical workstations and server systems for high performance computing then, used by the majority of researchers worldwide. In the late 90s, transtec was one of the first companies to offer highly customized HPC cluster solutions based on standard Intel architecture servers, some of which entered the TOP500 list of the world's fastest computing systems.

Given this background and history, it is fair to say that transtec looks back on more than 30 years of experience in scientific computing; our track record shows more than 500 HPC installations. With this experience, we know exactly what customers' demands are and how to meet them. High performance and ease of management: this is what customers require today. HPC systems are certainly required to peak-perform, as their name indicates, but that is not enough: they must also be easy to handle. Unwieldy design and operational complexity must be avoided, or at least hidden from administrators and particularly from users of HPC computer systems.

This brochure focuses on where transtec HPC solutions excel. transtec HPC solutions use the latest and most innovative technology: Bright Cluster Manager as the technology leader for unified HPC cluster management, the leading-edge Moab HPC Suite for job and workload management, Intel Cluster Ready certification as an independent quality standard for our systems, and Panasas HPC storage systems for the high performance and ease of management required of a reliable HPC storage system. With these components, usability, reliability and ease of management are central issues that are addressed, even in a highly heterogeneous environment. transtec also provides customers with well-designed, extremely powerful solutions for Tesla GPU computing, as well as thoroughly engineered Intel Xeon Phi systems. Intel's InfiniBand Fabric Suite makes managing a large InfiniBand fabric easier than ever before. transtec masterfully combines excellent, well-chosen components into a fine-tuned, customer-specific, and thoroughly designed HPC solution.

Your decision for a transtec HPC solution means you opt for the most intensive customer care and the best service in HPC. Our experts will be glad to bring in their expertise and support to assist you at any stage, from HPC design to daily cluster operations to HPC Cloud Services.

Last but not least, transtec HPC Cloud Services provide customers with the possibility to have their jobs run on dynamically provided nodes in a dedicated datacenter, professionally managed and individually customizable. Numerous standard applications like ANSYS, LS-Dyna, OpenFOAM, as well as codes like Gromacs, NAMD, VMD, and others are pre-installed, integrated into an enterprise-ready cloud management environment, and ready to run.

Have fun reading the transtec HPC Compass 2013/14!
High Performance Computing

High Performance Computing (HPC) has been with us from the very beginning of the computer era. High-performance computers were built to solve numerous problems which the "human computers" could not handle. The term HPC just hadn't been coined yet. More importantly, some of the early principles have changed fundamentally.
Performance Turns Into Productivity

HPC systems in the early days were much different from those we see today. First, we saw enormous mainframes from large computer manufacturers, including a proprietary operating system and job management system. Second, at universities and research institutes, workstations made inroads and scientists carried out calculations on their dedicated Unix or VMS workstations. In either case, if you needed more computing power, you scaled up, i.e. you bought a bigger machine.

Today the term High-Performance Computing has gained a fundamentally new meaning. HPC is now perceived as a way to tackle complex mathematical, scientific or engineering problems. The integration of industry-standard, "off-the-shelf" server hardware into HPC clusters facilitates the construction of computer networks of such power that one single system could never achieve. The new paradigm for parallelization is scaling out.

Computer-supported simulation of realistic processes (so-called Computer Aided Engineering, CAE) has established itself as a third key pillar in the field of science and research, alongside theory and experimentation. It is nowadays inconceivable that an aircraft manufacturer or a Formula One racing team would operate without using simulation software. And scientific calculations, such as in the fields of astrophysics, medicine, pharmaceuticals and bio-informatics, will to a large extent be dependent on supercomputers in the future. Software manufacturers long ago recognized the benefit of high-performance computers based on powerful standard servers and ported their programs to them accordingly.

The main advantage of scale-out supercomputers is just that: they are infinitely scalable, at least in principle. Since they are based on standard hardware components, such a supercomputer can be charged with more power whenever the computational capacity of the system is not sufficient any more, simply by adding additional nodes of the same kind. A cumbersome switch to a different technology can be avoided in most cases.

"transtec HPC solutions are meant to provide customers with unparalleled ease-of-management and ease-of-use. Apart from that, deciding for a transtec HPC solution means deciding for the most intensive customer care and the best service imaginable."
Dr. Oliver Tennert, Director Technology Management & HPC Solutions
Variations on the theme: MPP and SMP

Parallel computations exist in two major variants today. Applications running in parallel on multiple compute nodes are frequently so-called Massively Parallel Processing (MPP) applications. MPP indicates that the individual processes can each utilize exclusive memory areas. This means that such jobs are predestined to be computed in parallel, distributed across the nodes in a cluster. The individual processes can thus utilize the separate units of the respective node, especially the RAM, the CPU power and the disk I/O.

Communication between the individual processes is implemented in a standardized way through the MPI software interface (Message Passing Interface), which abstracts the underlying network connections between the nodes from the processes. However, the MPI standard (current version 2.0) merely requires source code compatibility, not binary compatibility, so an off-the-shelf application usually needs specific versions of MPI libraries in order to run. Examples of MPI implementations are OpenMPI, MPICH2, MVAPICH2, Intel MPI or, for Windows clusters, MS-MPI.

If the individual processes engage in a large amount of communication, the response time of the network (latency) becomes important. Latency in a Gigabit Ethernet or a 10GE network is typically around 10 µs. High-speed interconnects such as InfiniBand reduce latency by a factor of 10, down to as low as 1 µs. Therefore, high-speed interconnects can greatly speed up total processing.

The other frequently used variant is called SMP applications. SMP, in this HPC context, stands for Shared Memory Processing. It involves the use of shared memory areas, the specific implementation of which is dependent on the choice of the underlying operating system. Consequently, SMP jobs generally only run on a single node, where they can in turn be multi-threaded and thus be parallelized across the number of CPUs per node. For many HPC applications, both the MPP and the SMP variant can be chosen.

Many applications are not inherently suitable for parallel execution. In such a case, there is no communication between the individual compute nodes, and therefore no need for a high-speed network between them; nevertheless, multiple computing jobs can be run simultaneously and sequentially on each individual node, depending on the number of CPUs. In order to ensure optimum computing performance for these applications, it must be examined how many CPUs and cores deliver the optimum performance. We find applications of this sequential type of work typically in the fields of data analysis or Monte-Carlo simulations.
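The difference between the two models is easiest to see in code. The following small C program is our own illustrative sketch, not taken from any particular application: MPI provides the message-passing, distributed-memory side typical of MPP jobs, while the OpenMP parallel region shows the shared-memory threading used by SMP jobs within a single node. The build and run commands in the comment are examples and may differ per MPI implementation.

```c
/* Hybrid MPI + OpenMP sketch (illustrative only).
 * Build e.g. with: mpicc -fopenmp hybrid.c -o hybrid
 * Run e.g. with:   mpirun -np 4 ./hybrid
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?          */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes in total? */

    /* MPP side: separate processes with separate address spaces,
       typically spread across several cluster nodes. */
    printf("MPI rank %d of %d\n", rank, size);

    /* SMP side: threads inside one process sharing that node's memory. */
    #pragma omp parallel
    {
        printf("  rank %d, thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
```

In practice, many production codes combine exactly these two levels: one MPI rank per node or per socket, with OpenMP threads filling the cores underneath.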
Flexible Deployment With xCAT

xCAT as a Powerful and Flexible Deployment Tool

xCAT (Extreme Cluster Administration Tool) is an open source toolkit for the deployment and low-level administration of HPC cluster environments, small as well as large ones.

xCAT provides simple commands for hardware control, node discovery, the collection of MAC addresses, and node deployment with (diskful) or without (diskless) local installation. The cluster configuration is stored in a relational database. Node groups for different operating system images can be defined. Also, user-specific scripts can be executed automatically at installation time.

xCAT Provides the Following Low-Level Administrative Features
- Remote console support
- Parallel remote shell and remote copy commands
- Plugins for various monitoring tools like Ganglia or Nagios
- Hardware control commands for node discovery, collecting MAC addresses, remote power switching and resetting of nodes
- Automatic configuration of syslog, remote shell, DNS, DHCP, and ntp within the cluster
- Extensive documentation and man pages

For cluster monitoring, we install and configure the open source tool Ganglia or the even more powerful open source solution Nagios, according to the customer's preferences and requirements.
Local Installation or Diskless Installation

We offer a diskful or a diskless installation of the cluster nodes. A diskless installation means the operating system is hosted partially within the main memory; larger parts may or may not be included via NFS or other means. This approach allows for deploying large numbers of nodes very efficiently, and the cluster is up and running within a very small timescale. Updating the cluster can also be done in a very efficient way: only the boot image has to be updated and the nodes rebooted, after which the nodes run either a new kernel or even a new operating system. Moreover, with this approach, partitioning the cluster can be done very efficiently, either for testing purposes, or for allocating different cluster partitions to different users or applications.

Development Tools, Middleware, and Applications

Depending on the application, optimization strategy, or underlying architecture, different compilers lead to code of very different performance. Moreover, different, mainly commercial, applications require different MPI implementations. And even when the code is self-developed, developers often prefer one MPI implementation over another.

According to the customer's wishes, we install various compilers and MPI middleware, as well as job management systems like Parastation, Grid Engine, Torque/Maui, or the very powerful Moab HPC Suite for high-level cluster management.
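How much the choice of compiler and optimization settings matters is easiest to see on a small, memory-bound kernel. The following stand-alone C program is a hypothetical illustration, not a transtec benchmark; the compiler invocations in the comment are merely examples, and measured times will differ per toolchain and platform.

```c
/* STREAM-like "triad" kernel: a = b + s*c over large arrays.
 * Different compilers and flags vectorize and schedule this loop very
 * differently, which is why benchmarking with the target toolchain matters.
 * Example builds (flags are illustrative):
 *   gcc -O3 -march=native triad.c -o triad
 *   icc -O3 -xHost        triad.c -o triad
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 24)   /* 16M doubles per array, about 128 MB each */

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    if (!a || !b || !c) return 1;

    for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    clock_t t0 = clock();
    for (long i = 0; i < N; i++)
        a[i] = b[i] + 3.0 * c[i];           /* the triad itself */
    clock_t t1 = clock();

    printf("a[42] = %f, time = %.3f s\n",
           a[42], (double)(t1 - t0) / CLOCKS_PER_SEC);

    free(a); free(b); free(c);
    return 0;
}
```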
Services and Customer Care from A to Z

transtec HPC as a Service
You will get a range of applications like LS-Dyna, ANSYS, Gromacs, NAMD etc. from all kinds of areas pre-installed, integrated into an enterprise-ready cloud and workload management system, and ready to run. Do you miss your application? Ask us: HPC@transtec.de

transtec Platform as a Service
You will be provided with dynamically provisioned compute nodes for running your individual code. The operating system will be pre-installed according to your requirements. Common Linux distributions like RedHat, CentOS, or SLES are the standard. Do you need another distribution? Ask us: HPC@transtec.de

transtec Hosting as a Service
You will be provided with hosting space inside a professionally managed and secured datacenter where you can have your machines hosted, managed, and maintained according to your requirements. Thus, you can build up your own private cloud. What range of hosting and maintenance services do you need? Tell us: HPC@transtec.de

HPC @ transtec: Services and Customer Care from A to Z

transtec AG has over 30 years of experience in scientific computing and is one of the earliest manufacturers of HPC clusters. For nearly a decade, transtec has delivered highly customized High Performance clusters based on standard components to academic and industry customers across Europe, with all the high quality standards and the customer-centric approach that transtec is well known for.

Every transtec HPC solution is more than just a rack full of hardware: it is a comprehensive solution with everything the HPC user, owner, and operator need. In the early stages of any customer's HPC project, transtec experts provide extensive and detailed consulting to the customer, who benefits from our expertise and experience. Consulting is followed by benchmarking of different systems with either specifically crafted customer code or generally accepted benchmarking routines; this aids customers in sizing and devising the optimal and detailed HPC configuration.

Figure: The transtec services cycle, from individual presales consulting and application-, customer-, and site-specific sizing of the HPC solution, through burn-in tests, benchmarking of different systems, software & OS installation, application installation, onsite hardware assembly, integration into the customer's environment and customer training, to maintenance, support & managed services and continual improvement.
Each and every piece of HPC hardware that leaves our factory undergoes a burn-in procedure of 24 hours, or more if necessary. We make sure that any hardware shipped meets our and our customers' quality requirements.

transtec HPC solutions are turnkey solutions. By default, a transtec HPC cluster has everything installed and configured, from hardware and operating system to important middleware components like cluster management or developer tools, and the customer's production applications. Onsite delivery means onsite integration into the customer's production environment, be it establishing network connectivity to the corporate network, or setting up software and configuration parts.

transtec HPC clusters are ready-to-run systems: we deliver, you turn the key, the system delivers high performance. Every HPC project entails transfer to production: IT operation processes and policies apply to the new HPC system. Effectively, IT personnel are trained hands-on, introduced to hardware components and software, and familiarized with all operational aspects of configuration management.

transtec services do not stop when the implementation project ends. Beyond transfer to production, transtec takes care. transtec offers a variety of support and service options, tailored to the customer's needs. When you are in need of a new installation, a major reconfiguration or an update of your solution, transtec is able to support your staff and, if you lack the resources for maintaining the cluster yourself, maintain the HPC solution for you. From Professional Services to Managed Services for daily operations and required service levels, transtec will be your complete HPC service and solution provider. transtec's high standards of performance, reliability and dependability assure your productivity and complete satisfaction.

transtec's HPC Managed Services offer customers the possibility of having the complete management and administration of the HPC cluster handled by transtec service specialists, in an ITIL-compliant way. Moreover, transtec's HPC on Demand services provide access to HPC resources whenever customers need them, for example because they do not have the possibility of owning and running an HPC cluster themselves, due to lacking infrastructure, know-how, or admin staff.

transtec HPC Cloud Services

Last but not least, transtec's services portfolio evolves as customers' demands change. Starting this year, transtec is able to provide HPC Cloud Services. transtec uses a dedicated datacenter to provide computing power to customers who are in need of more capacity than they own, which is why this workflow model is sometimes called computing-on-demand. These dynamically provided resources give customers the possibility to have their jobs run on HPC nodes in a dedicated datacenter, professionally managed and secured, and individually customizable. Numerous standard applications like ANSYS, LS-Dyna, OpenFOAM, as well as codes like Gromacs, NAMD, VMD, and others are pre-installed, integrated into an enterprise-ready cloud and workload management environment, and ready to run.

Alternatively, whenever customers are in need of space for hosting their own HPC equipment because they do not have the space capacity or cooling and power infrastructure themselves, transtec is also able to provide Hosting Services to those customers who would like to have their equipment professionally hosted, maintained, and managed. Customers can thus build up their own private cloud!

Are you interested in any of transtec's broad range of HPC-related services? Write us an email to HPC@transtec.de. We'll be happy to hear from you!
Advanced Cluster Management Made Easy

Bright Cluster Manager removes the complexity from the installation, management and use of HPC clusters, on premise or in the cloud. With Bright Cluster Manager, you can easily install, manage and use multiple clusters simultaneously, including compute, Hadoop, storage, database and workstation clusters.
Easy-to-use, Complete and Scalable

The Bright Advantage

Bright Cluster Manager delivers improved productivity, increased uptime, proven scalability and security, while reducing operating cost.

Rapid Productivity Gains
- Short learning curve: intuitive GUI drives it all
- Quick installation: one hour from bare metal to compute-ready
- Fast, flexible provisioning: incremental, live, disk-full, disk-less, over InfiniBand, to virtual machines, auto node discovery
- Comprehensive monitoring: on-the-fly graphs, Rackview, multiple clusters, custom metrics
- Powerful automation: thresholds, alerts, actions
- Complete GPU support: NVIDIA, AMD, CUDA, OpenCL
- Full support for Intel Xeon Phi
- On-demand SMP: instant ScaleMP virtual SMP deployment
- Fast customization and task automation: powerful cluster management shell, SOAP and JSON APIs make it easy
- Seamless integration with leading workload managers: Slurm, Open Grid Scheduler, Torque, openlava, Maui, PBS Professional, Univa Grid Engine, Moab, LSF
- Integrated (parallel) application development environment
- Easy maintenance: automatically update your cluster from Linux and Bright Computing repositories
- Easily extendable, web-based User Portal
- Cloud-readiness at no extra cost, supporting the scenarios "Cluster-on-Demand" and "Cluster-Extension", with data-aware scheduling
- Deploys, provisions, monitors and manages Hadoop clusters
- Future-proof: transparent customization minimizes disruption from staffing changes

Figure: The cluster installer takes you through the installation process and offers advanced options such as "Express" and "Remote".
Maximum Uptime
- Automatic head node failover: prevents system downtime
- Powerful cluster automation: drives pre-emptive actions based on monitoring thresholds
- Comprehensive cluster monitoring and health checking: automatic sidelining of unhealthy nodes to prevent job failure

Scalability from Deskside to TOP500
- Off-loadable provisioning: enables maximum scalability
- Proven: used on some of the world's largest clusters

Minimum Overhead / Maximum Performance
- Lightweight: single daemon drives all functionality
- Optimized: daemon has minimal impact on operating system and applications
- Efficient: single database for all metric and configuration data

Top Security
- Key-signed repositories: controlled, automated security and other updates
- Encryption option: for external and internal communications
- Safe: X509v3 certificate-based public-key authentication
- Sensible access: role-based access control, complete audit trail
- Protected: firewalls and LDAP

A Unified Approach

Bright Cluster Manager was written from the ground up as a totally integrated and unified cluster management solution. This fundamental approach provides comprehensive cluster management that is easy to use and functionality-rich, yet has minimal impact on system performance. It has a single, light-weight daemon, a central database for all monitoring and configuration data, and a single CLI and GUI for all cluster management functionality. Bright Cluster Manager is extremely easy to use, scalable, secure and reliable. You can monitor and manage all aspects of your clusters with virtually no learning curve.

Bright's approach is in sharp contrast with other cluster management offerings, all of which take a "toolkit" approach. These toolkits combine a Linux distribution with many third-party tools for provisioning, monitoring, alerting, etc. This approach has critical limitations: these separate tools were not designed to work together, were often not designed for HPC, and were not designed to scale. Furthermore, each of the tools has its own interface (mostly command-line based), and each has its own daemon(s) and database(s). Countless hours of scripting and testing by highly skilled people are required to get the tools to work for a specific cluster, and much of it goes undocumented. Time is wasted, and the cluster is at risk if staff changes occur, losing the "in-head" knowledge of the custom scripts.

Figure: By selecting a cluster node in the tree on the left and the Tasks tab on the right, you can execute a number of powerful tasks on that node with just a single mouse click.
Ease of Installation

Bright Cluster Manager is easy to install. Installation and testing of a fully functional cluster from "bare metal" can be completed in less than an hour. Configuration choices made during the installation can be modified afterwards. Multiple installation modes are available, including unattended and remote modes. Cluster nodes can be automatically identified based on switch ports rather than MAC addresses, improving speed and reliability of installation, as well as subsequent maintenance. All major hardware brands are supported: Dell, Cray, Cisco, DDN, IBM, HP, Supermicro, Acer, Asus and more.

Ease of Use

Bright Cluster Manager is easy to use, with two interface options: the intuitive Cluster Management Graphical User Interface (CMGUI) and the powerful Cluster Management Shell (CMSH). The CMGUI is a standalone desktop application that provides a single system view for managing all hardware and software aspects of the cluster through a single point of control. Administrative functions are streamlined as all tasks are performed through one intuitive, visual interface. Multiple clusters can be managed simultaneously. The CMGUI runs on Linux, Windows and OS X, and can be extended using plugins. The CMSH provides practically the same functionality as the CMGUI, but via a command-line interface. The CMSH can be used both interactively and in batch mode via scripts. Either way, you now have unprecedented flexibility and control over your clusters.

Support for Linux and Windows

Bright Cluster Manager is based on Linux and is available with a choice of pre-integrated, pre-configured and optimized Linux distributions, including SUSE Linux Enterprise Server, Red Hat Enterprise Linux, CentOS and Scientific Linux. Dual-boot installations with Windows HPC Server are supported as well, allowing nodes to boot either from the Bright-managed Linux head node or from the Windows-managed head node.

Figure: The Overview tab provides instant, high-level insight into the status of the cluster.
Extensive Development Environment

Bright Cluster Manager provides an extensive HPC development environment for both serial and parallel applications, including the following (some are cost options):
- Compilers, including full suites from GNU, Intel, AMD and Portland Group.
- Debuggers and profilers, including the GNU debugger and profiler, TAU, TotalView, Allinea DDT and Allinea MAP.
- GPU libraries, including CUDA and OpenCL.
- MPI libraries, including OpenMPI, MPICH, MPICH2, MPICH-MX, MPICH2-MX, MVAPICH and MVAPICH2; all cross-compiled with the compilers installed on Bright Cluster Manager, and optimized for high-speed interconnects such as InfiniBand and 10GE.
- Mathematical libraries, including ACML, FFTW, Goto-BLAS, MKL and ScaLAPACK.
- Other libraries, including Global Arrays, HDF5, IPP, TBB, NetCDF and PETSc.

Bright Cluster Manager also provides Environment Modules to make it easy to maintain multiple versions of compilers, libraries and applications for different users on the cluster, without creating compatibility conflicts. Each Environment Module file contains the information needed to configure the shell for an application, and automatically sets these variables correctly for the particular application when it is loaded. Bright Cluster Manager includes many pre-configured module files for many scenarios, such as combinations of compilers, mathematical and MPI libraries.
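The point of these module files is that application source code does not change; it simply links against whatever optimized build the loaded module points to. As a small, purely illustrative sketch (ours, not taken from the brochure), the following C program uses FFTW, one of the mathematical libraries listed above. It would typically be compiled after loading the corresponding compiler and FFTW modules; exact module names are site-specific and are assumptions here.

```c
/* Tiny FFTW example: forward DFT of one period of a cosine.
 * Build (assuming an FFTW 3 installation is on the search paths,
 * e.g. via the site's environment modules):
 *   gcc fft_demo.c -lfftw3 -lm -o fft_demo
 */
#include <fftw3.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    const int n = 8;
    fftw_complex *in  = fftw_malloc(sizeof(fftw_complex) * n);
    fftw_complex *out = fftw_malloc(sizeof(fftw_complex) * n);

    /* Fill the input with one period of a cosine. */
    for (int i = 0; i < n; i++) {
        in[i][0] = cos(2.0 * M_PI * i / n);  /* real part      */
        in[i][1] = 0.0;                      /* imaginary part */
    }

    /* Plan and execute a forward transform. */
    fftw_plan plan = fftw_plan_dft_1d(n, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    fftw_execute(plan);

    for (int i = 0; i < n; i++)
        printf("bin %d: %6.3f %+6.3fi\n", i, out[i][0], out[i][1]);

    fftw_destroy_plan(plan);
    fftw_free(in);
    fftw_free(out);
    return 0;
}
```

Which FFTW build (compiler, version, interconnect-optimized variant) is picked up at build time is then controlled entirely by the loaded module, not by the source code.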
Powerful Image Management and Provisioning

Bright Cluster Manager features sophisticated software image management and provisioning capability. A virtually unlimited number of images can be created and assigned to as many different categories of nodes as required. Default or custom Linux kernels can be assigned to individual images. Incremental changes to images can be deployed to live nodes without rebooting or re-installation. The provisioning system only propagates changes to the images, minimizing time and impact on system performance and availability. Provisioning capability can be assigned to any number of nodes on the fly, for maximum flexibility and scalability. Bright Cluster Manager can also provision over InfiniBand and to ramdisk or virtual machine.

Comprehensive Monitoring

With Bright Cluster Manager, you can collect, monitor, visualize and analyze a comprehensive set of metrics. Many software and hardware metrics available to the Linux kernel, and many hardware management interface metrics (IPMI, DRAC, iLO, etc.), are sampled. Cluster metrics, such as GPU, Xeon Phi and CPU temperatures, fan speeds and network statistics, can be visualized by simply dragging and dropping them into a graphing window. Multiple metrics can be combined in one graph, and graphs can be zoomed into. A graphing wizard allows creation of all graphs for a selected combination of metrics and nodes. Graph layout and color configurations can be tailored to your requirements and stored for re-use.

Cloud Bursting With Bright

High performance meets efficiency

Initially, massively parallel systems constitute a challenge to both administrators and users. They are complex beasts. Anyone building HPC clusters will need to tame the beast, master the complexity and present users and administrators with an easy-to-use, easy-to-manage system landscape. Leading HPC solution providers such as transtec achieve this goal. They hide the complexity of HPC under the hood and match high performance with efficiency and ease of use for both users and administrators. The "P" in "HPC" gains a double meaning: "Performance" plus "Productivity".

Cluster and workload management software like Moab HPC Suite, Bright Cluster Manager or QLogic IFS provide the means to master and hide the inherent complexity of HPC systems. For administrators and users, HPC clusters are presented as single, large machines, with many different tuning parameters. The software also provides a unified view of existing clusters whenever unified management is added as a requirement by the customer at any point in time after the first installation. Thus, daily routine tasks such as job management, user management, and queue partitioning and management can be performed easily with either graphical or web-based tools, without any advanced scripting skills or technical expertise required from the administrator or user.
Bright Cluster Manager unleashes and manages the unlimited power of the cloud

Create a complete cluster in Amazon EC2, or easily extend your onsite cluster into the cloud, enabling you to dynamically add capacity and manage these nodes as part of your onsite cluster. Both can be achieved in just a few mouse clicks, without the need for expert knowledge of Linux or cloud computing.

Bright Cluster Manager's unique data aware scheduling capability means that your data is automatically in place in EC2 at the start of the job, and that the output data is returned as soon as the job is completed. With Bright Cluster Manager, every cluster is cloud-ready, at no extra cost. The same powerful cluster provisioning, monitoring, scheduling and management capabilities that Bright Cluster Manager provides to onsite clusters extend into the cloud.

The Bright advantage for cloud bursting
- Ease of use: intuitive GUI virtually eliminates the user learning curve; no need to understand Linux or EC2 to manage the system. Alternatively, the cluster management shell provides powerful scripting capabilities to automate tasks.
- Complete management solution: installation/initialization, provisioning, monitoring, scheduling and management in one integrated environment.
- Integrated workload management: wide selection of workload managers included and automatically configured with local, cloud and mixed queues.
- Single system view; complete visibility and control: cloud compute nodes managed as elements of the on-site cluster; visible from a single console with drill-downs and historic data.
- Efficient data management via data aware scheduling: automatically ensures data is in position at the start of computation; delivers results back when complete.
- Secure, automatic gateway: mirrored LDAP and DNS services over an automatically created VPN connect local and cloud-based nodes for secure communication.
- Cost savings: more efficient use of cloud resources; support for spot instances, minimal user intervention.

Two cloud bursting scenarios

Bright Cluster Manager supports two cloud bursting scenarios: "Cluster-on-Demand" and "Cluster Extension".

Scenario 1: Cluster-on-Demand
The Cluster-on-Demand scenario is ideal if you do not have a cluster onsite, or need to set up a totally separate cluster. With just a few mouse clicks you can instantly create a complete cluster in the public cloud, for any duration of time.

Scenario 2: Cluster Extension
The Cluster Extension scenario is ideal if you have a cluster onsite but you need more compute power, including GPUs. With Bright Cluster Manager, you can instantly add EC2-based resources to your onsite cluster, for any duration of time.
Bursting into the cloud is as easy as adding nodes to an onsite cluster: there are only a few additional steps after providing the public cloud account information to Bright Cluster Manager.

The Bright approach to managing and monitoring a cluster in the cloud provides complete uniformity, as cloud nodes are managed and monitored the same way as local nodes:
- Load-balanced provisioning
- Software image management
- Integrated workload management
- Interfaces: GUI, shell, user portal
- Monitoring and health checking
- Compilers, debuggers, MPI libraries, mathematical libraries and environment modules

Bright Cluster Manager also provides additional features that are unique to the cloud.

Figure: The Add Cloud Provider Wizard and the Node Creation Wizard make the cloud bursting process easy, also for users with no cloud experience.
Amazon spot instance support

Bright Cluster Manager enables users to take advantage of the cost savings offered by Amazon's spot instances. Users can specify the use of spot instances, and Bright will automatically schedule as available, reducing the cost to compute without the need to monitor spot prices and manually schedule.

Hardware Virtual Machine (HVM) virtualization

Bright Cluster Manager automatically initializes all Amazon instance types, including Cluster Compute and Cluster GPU instances that rely on HVM virtualization.

Data aware scheduling

Data aware scheduling ensures that input data is transferred to the cloud and made accessible just prior to the job starting, and that the output data is transferred back. There is no need to wait for (and monitor) the data to load prior to submitting jobs, which would delay entry into the job queue, nor any risk of starting the job before the data transfer is complete, which would crash the job. Users submit their jobs, and Bright's data aware scheduling does the rest.

Bright Cluster Manager provides the choice: "To Cloud, or Not to Cloud"

Not all workloads are suitable for the cloud. The ideal situation for most organizations is to have the ability to choose between onsite clusters for jobs that require low-latency communication, complex I/O or sensitive data, and cloud clusters for many other types of jobs. Bright Cluster Manager delivers the best of both worlds: a powerful management solution for local and cloud clusters, with the ability to easily extend local clusters into the cloud without compromising provisioning, monitoring or managing the cloud resources.

Figure: One or more cloud nodes can be configured under the Cloud Settings tab.
Comprehensive Monitoring (continued)

Examples of monitored metrics include CPU, GPU and Xeon Phi temperatures, fan speeds, switches, hard disk SMART information, system load, memory utilization, network metrics, storage metrics, power system statistics, and workload management metrics. Custom metrics can also easily be defined. Metric sampling is done very efficiently, in one process, or out-of-band where possible. You have full flexibility over how and when metrics are sampled, and historic data can be consolidated over time to save disk space.

Figure: The parallel shell allows for simultaneous execution of commands or scripts across node groups or across the entire cluster.
Figure: The status of cluster nodes, switches and other hardware, as well as up to six metrics, can be visualized in the Rackview. A zoom-out option is available for clusters with many racks.
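To illustrate the custom metrics mentioned above: a custom metric in Bright is typically backed by a small script or executable that the management daemon runs periodically and that reports a numeric sample. The exact conventions depend on the Bright version and are documented in its administrator manual, so treat the following C sketch as hypothetical.

```c
/* Hypothetical custom-metric executable: prints one numeric sample
 * (the 1-minute load average) to standard output, the kind of tiny
 * collector a custom metric can be built around.
 *   gcc loadavg_metric.c -o loadavg_metric
 */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/loadavg", "r");
    double load1;

    if (!f || fscanf(f, "%lf", &load1) != 1) {
        /* On error, report nothing rather than a bogus value. */
        if (f) fclose(f);
        return 1;
    }
    fclose(f);

    printf("%.2f\n", load1);
    return 0;
}
```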
  • 23. 23 overheated GPU unit and sending a text message to your mobile phone. Several predefined actions are available, but any built-in cluster management command, Linux command or script can be used as an action. Comprehensive GPU Management Bright Cluster Manager radically reduces the time and effort of managing GPUs, and fully integrates these devices into the single view of the overall system. Bright includes powerful GPU management and monitoring capability that leverages functionality in NVIDIA Tesla and AMD GPUs. You can easily assume maximum control of the GPUs and gain instant and time-based status insight. Depending on the GPU make and model, Bright monitors a full range of GPU metrics, including: ʎʎ GPU temperature, fan speed, utilization ʎʎ GPU exclusivity, compute, display, persistence mode ʎʎ GPU memory utilization, ECC statistics ʎʎ Unit fan speed, serial number, temperature, power usage, voltages and currents, LED status, firmware. ʎʎ Board serial, driver version, PCI info. Beyond metrics, Bright Cluster Manager features built-in support for GPU computing with CUDA and OpenCL libraries. Switching between current and previous versions of CUDA and OpenCL has also been made easy. Full Support for Intel Xeon Phi Bright Cluster Manager makes it easy to set up and use the Intel Xeon Phi coprocessor. Bright includes everything that is needed to get Phi to work, including a setup wizard in the CMGUI. Bright ensures that your software environment is set up correctly, so that the Intel Xeon Phi coprocessor is available for applications that are able to take advantage of it. Bright collects and displays a wide range of metrics for Phi, ensuring that the coprocessor is visible and manageable as a device type, as well as including Phi as a resource in the workload management system. Bright’s pre-job health checking ensures that Phi is functioning properly before directing tasks to the coprocessor. Multi-Tasking Via Parallel Shell The parallel shell allows simultaneous execution of multiple commands and scripts across the cluster as a whole, or across easily definable groups of nodes. Output from the executed commands is displayed in a convenient way with variable levels of verbosity. Running commands and scripts can be killed easily if necessary. The parallel shell is available through both the CMGUI and the CMSH. Integrated Workload Management Bright Cluster Manager is integrated with a wide selection of free and commercial workload managers. This integration provides a number of benefits: ʎʎ The selected workload manager gets automatically installed and configured The automation configuration wizard guides you through the steps of defining a rule: selecting metrics, defining thresholds and specifying actions.
  • 24. 24 ʎʎ Many workload manager metrics are monitored ʎʎ The CMGUI and User Portal provide a user-friendly interface to the workload manager ʎʎ The CMSH and the SOAP & JSON APIs provide direct and powerful access to a number of workload manager commands and metrics ʎʎ Reliable workload manager failover is properly configured ʎʎ The workload manager is continuously made aware of the health state of nodes (see section on Health Checking) ʎʎ The workload manager is used to save power through auto-power on/off based on workload ʎʎ The workload manager is used for data-aware scheduling of jobs to the cloud The following user-selectable workload managers are tightly integrated with Bright Cluster Manager: ʎʎ PBS Professional, Univa Grid Engine, Moab, LSF ʎʎ Slurm, openlava, Open Grid Scheduler, Torque, Maui Alternatively, other workload managers, such as LoadLeveler and Condor, can be installed on top of Bright Cluster Manager. Integrated SMP Support Bright Cluster Manager — Advanced Edition dynamically aggregates multiple cluster nodes into a single virtual SMP node, using ScaleMP’s Versatile SMP™ (vSMP) architecture. Creating and dismantling a virtual SMP node can be achieved with just a few Advanced Cluster Management Made Easy Easy-to-use, complete and scalable Creating and dismantling a virtual SMP node can be achieved with just a few clicks within the GUI or a single command in the cluster management shell. Example graphs that visualize metrics on a GPU cluster.
  • 25. 25 clicks within the CMGUI. Virtual SMP nodes can also be launched and dismantled automatically using the scripting capabilities of the CMSH. In Bright Cluster Manager a virtual SMP node behaves like any other node, enabling transparent, on-the-fly provisioning, configuration, monitoring and management of virtual SMP nodes as part of the overall system management. Maximum Uptime with Head Node Failover Bright Cluster Manager — Advanced Edition allows two head nodes to be configured in active-active failover mode. Both head nodes are on active duty, but if one fails, the other takes over all tasks, seamlessly. Maximum Uptime with Health Checking Bright Cluster Manager — Advanced Edition includes a powerful cluster health checking framework that maximizes system uptime. It continually checks multiple health indicators for all hardware and software components and proactively initiates corrective actions. It can also automatically perform a series of standard and user-defined tests just before starting a new job, to ensure successful execution and prevent the “black hole node syndrome”. Examples of corrective actions include autonomous bypass of faulty nodes, automatic job requeuing to avoid queue flushing, and process “jailing” to allocate, track, trace and flush completed user processes. The health checking framework ensures the highest job throughput, the best overall cluster efficiency and the lowest administration overhead. Top Cluster Security Bright Cluster Manager offers an unprecedented level of security that can easily be tailored to local requirements. Security features include: ʎʎ Automated security and other updates from key-signed Linux and Bright Computing repositories. ʎʎ Encrypted internal and external communications. ʎʎ X509v3 certificate-based public-key authentication to the cluster management infrastructure. ʎʎ Role-based access control and complete audit trail. ʎʎ Firewalls, LDAP and SSH. User and Group Management Users can be added to the cluster through the CMGUI or the CMSH. Bright Cluster Manager comes with a pre-configured LDAP database, but an external LDAP service, or alternative authentication system, can be used instead. Web-Based User Portal The web-based User Portal provides read-only access to essential cluster information, including a general overview of the cluster status, node hardware and software properties, workload manager statistics and user-customizable graphs. The User Portal can easily be customized and expanded using PHP and the SOAP or JSON APIs. Workload management queues can be viewed and configured from the GUI, without the need for workload management expertise.
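Health checks and automation actions of this kind are, in the end, ordinary executables. The sketch below shows the general shape of a custom pre-job health check in Python; it is not one of Bright's built-in checks, and the exit-code convention (zero means healthy) is an assumption that should be confirmed against the cluster manager's health-checking documentation.

```python
#!/usr/bin/env python3
# Sketch of a custom node health check: exit 0 if the node looks healthy,
# non-zero otherwise. How the script is registered and how its exit code is
# interpreted depends on the cluster manager's health-check framework.
import os
import shutil
import sys

def checks():
    # Scratch filesystem should keep some headroom for job I/O.
    usage = shutil.disk_usage("/tmp")
    if usage.free / usage.total < 0.10:
        yield "less than 10% free space on /tmp"
    # The 1-minute load should not wildly exceed the core count.
    load1, _, _ = os.getloadavg()
    if load1 > 2 * os.cpu_count():
        yield f"load average {load1:.1f} too high"

problems = list(checks())
for p in problems:
    print(f"HEALTHCHECK FAIL: {p}", file=sys.stderr)
sys.exit(1 if problems else 0)
```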
  • 26. 26 Multi-Cluster Capability Bright Cluster Manager — Advanced Edition is ideal for organiza- tions that need to manage multiple clusters, either in one or in multiple locations. Capabilities include: ʎʎ All cluster management and monitoring functionality is available for all clusters through one GUI. ʎʎ Selecting any set of configurations in one cluster and ex- porting them to any or all other clusters with a few mouse clicks. ʎʎ Metric visualizations and summaries across clusters. ʎʎ Making node images available to other clusters. Fundamentally API-Based Bright Cluster Manager is fundamentally API-based, which means that any cluster management command and any piece of cluster management data — whether it is monitoring data or configuration data — is available through the API. Both a SOAP and a JSON API are available and interfaces for various programming languages, including C++, Python and PHP are provided. Cloud Bursting Bright Cluster Manager supports two cloud bursting sce- narios: “Cluster-on-Demand” — running stand-alone clusters AdvancedClusterManagement MadeEasy Easy-to-use, complete and scalable “The building blocks for transtec HPC solutions must be chosen according to our goals ease-of- management and ease-of-use. With Bright Cluster Manager, we are happy to have the technology leader at hand, meeting these requirements, and our customers value that.” Armin Jäger HPC Solution Engineer
  • 27. 27 in the cloud; and “Cluster Extension” — adding cloud-based resources to existing, onsite clusters and managing these cloud nodes as if they were local. In addition, Bright provides data aware scheduling to ensure that data is accessible in the cloud at the start of jobs, and results are promptly trans- ferred back. Both scenarios can be achieved in a few simple steps. Every Bright cluster is automatically cloud-ready, at no extra cost. Hadoop Cluster Management Bright Cluster Manager is the ideal basis for Hadoop clusters. Bright installs on bare metal, configuring a fully operational Hadoop cluster in less than one hour. In the process, Bright pre- pares your Hadoop cluster for use by provisioning the operating system and the general cluster management and monitoring ca- pabilities required as on any cluster. Bright then manages and monitors your Hadoop cluster’s hardware and system software throughout its life-cycle, collecting and graphically displaying a full range of Hadoop metrics from the HDFS, RPC and JVM sub-systems. Bright sig- nificantly reduces setup time for Cloudera, Hortonworks and other Hadoop stacks, and increases both uptime and Map- Reduce job throughput. This functionality is scheduled to be further enhanced in upcoming releases of Bright, including dedicated manage- ment roles and profiles for name nodes, data nodes, as well as advanced Hadoop health checking and monitoring func- tionality. Standard and Advanced Editions Bright Cluster Manager is available in two editions: Standard and Advanced. The table on this page lists the differences. You can easily upgrade from the Standard to the Advanced Edition as your cluster grows in size or complexity. The web-based User Portal provides read-only access to essential cluster in- formation, including a general overview of the cluster status, node hardware and software properties, workload manager statistics and user-customizable graphs. Bright Cluster Manager can manage multiple clusters simultaneously. This overview shows clusters in Oslo, Abu Dhabi and Houston, all managed through one GUI.
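Because the product is fundamentally API-based (see above), monitoring and configuration data can also be pulled programmatically. The snippet below only sketches the general shape of a JSON-over-HTTPS call from Python; the host, port, endpoint path and field names are hypothetical placeholders, and the real JSON API is documented in the Bright developer manual.

```python
# Hypothetical example of querying a cluster manager's JSON API over HTTPS.
# URL, port, payload fields and response layout are placeholders only.
import requests

BASE_URL = "https://headnode.example.com:8081"   # placeholder host and port

session = requests.Session()
session.verify = "/path/to/cluster-ca.pem"       # placeholder CA certificate

resp = session.post(
    f"{BASE_URL}/json",                          # placeholder endpoint
    json={"service": "monitoring", "call": "latestMetricData", "node": "node001"},
    timeout=10,
)
resp.raise_for_status()
for sample in resp.json().get("data", []):
    print(sample)
```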
  • 28. 28 Documentation and Services A comprehensive system administrator manual and user manual are included in PDF format. Standard and tailored services are available, including various levels of support, installation, training and consultancy. Advanced Cluster Management Made Easy Easy-to-use, complete and scalable

Feature | Standard | Advanced
Choice of Linux distributions | x | x
Intel Cluster Ready | x | x
Cluster Management GUI | x | x
Cluster Management Shell | x | x
Web-Based User Portal | x | x
SOAP API | x | x
Node Provisioning | x | x
Node Identification | x | x
Cluster Monitoring | x | x
Cluster Automation | x | x
User Management | x | x
Role-based Access Control | x | x
Parallel Shell | x | x
Workload Manager Integration | x | x
Cluster Security | x | x
Compilers | x | x
Debuggers & Profilers | x | x
MPI Libraries | x | x
Mathematical Libraries | x | x
Environment Modules | x | x
Cloud Bursting | x | x
Hadoop Management & Monitoring | x | x
NVIDIA CUDA & OpenCL | - | x
GPU Management & Monitoring | - | x
Xeon Phi Management & Monitoring | - | x
ScaleMP Management & Monitoring | - | x
Redundant Failover Head Nodes | - | x
Cluster Health Checking | - | x
Off-loadable Provisioning | - | x
Multi-Cluster Management | - | x
Suggested Number of Nodes | 4–128 | 129–10,000+
Standard Support | x | x
Premium Support | Optional | Optional

Cluster health checks can be visualized in the Rackview. This screenshot shows that GPU unit 41 fails a health check called “AllFansRunning”.
  • 31. 31 While all HPC systems face challenges in workload demand, resource complexity, and scale, enterprise HPC systems face more stringent challenges and expectations. Enterprise HPC systems must meet mission-critical and priority HPC workload demands for commercial businesses and business-oriented research and academic organizations. They have complex SLAs and priorities to balance. Their HPC workloads directly impact the revenue, product delivery, and organizational objectives of their organizations.
  • 32. 32 Enterprise HPC Challenges Enterprise HPC systems must eliminate job delays and failures. They are also seeking to improve resource utilization and management efficiency across multiple heterogeneous systems. To maximize user productivity, they must make it easier for users to access and use HPC resources, or even expand to other clusters or HPC cloud to better handle workload demand surges. Intelligent Workload Management As today’s leading HPC facilities move beyond petascale towards exascale systems that incorporate increasingly sophisticated and specialized technologies, equally sophisticated and intelligent management capabilities are essential. With a proven history of managing the most advanced, diverse, and data-intensive systems in the Top500, Moab continues to be the preferred workload management solution for next generation HPC facilities. Intelligent HPC Workload Management Moab HPC Suite – Basic Edition Moab is the “brain” of an HPC system, intelligently optimizing workload throughput while balancing service levels and priorities.
  • 33. 33 Moab HPC Suite – Basic Edition Moab HPC Suite – Basic Edition is an intelligent workload management solution. It automates the scheduling, managing, monitoring, and reporting of HPC workloads on massive-scale, multi-technology installations. The patented Moab intelligence engine uses multi-dimensional policies to accelerate running workloads across the ideal combination of diverse resources. These policies balance high utilization and throughput goals with competing workload priorities and SLA requirements. The speed and accuracy of the automated scheduling decisions optimize workload throughput and resource utilization. This gets more work accomplished in less time, and in the right priority order. Moab HPC Suite – Basic Edition optimizes the value and satisfaction from HPC systems while reducing management cost and complexity. Moab HPC Suite – Enterprise Edition Moab HPC Suite – Enterprise Edition provides enterprise-ready HPC workload management that accelerates productivity, automates workload uptime, and consistently meets SLAs and business priorities for HPC systems and HPC cloud. It uses the battle-tested and patented Moab intelligence engine to automatically balance the complex, mission-critical workload priorities of enterprise HPC systems. Enterprise customers benefit from a single integrated product that brings together key enterprise HPC capabilities, implementation services, and 24x7 support. This speeds the realization of benefits from the HPC system for the business, including: ʎʎ Higher job throughput ʎʎ Massive scalability for faster response and system expansion ʎʎ Optimum utilization of 90-99% on a consistent basis ʎʎ Fast, simple job submission and management to increase
  • 34. 34 productivity ʎʎ Reduced cluster management complexity and support costs across heterogeneous systems ʎʎ Reduced job failures and auto-recovery from failures ʎʎ SLAs consistently met for improved user satisfaction ʎʎ Reduced power usage and costs by 10-30% Productivity Acceleration Moab HPC Suite – Enterprise Edition gets more results delivered faster from HPC resources at lower costs by accelerating overall system, user and administrator productivity. Moab provides the scalability, 90-99 percent utilization, and simple job submission that is required to maximize the productivity of enterprise HPC systems. Enterprise use cases and capabilities include: ʎʎ Massive multi-point scalability to accelerate job response and throughput, including high throughput computing ʎʎ Workload-optimized allocation policies and provisioning to get more results out of existing heterogeneous resources and reduce costs, including topology-based allocation ʎʎ Unify workload management across heterogeneous clusters to maximize resource availability and administration efficiency by managing them as one cluster ʎʎ Optimized, intelligent scheduling packs workloads and backfills around priority jobs and reservations while balancing SLAs to efficiently use all available resources ʎʎ Optimized scheduling and management of accelerators, both Intel MIC and GPGPUs, for jobs to maximize their utilization and effectiveness ʎʎ Simplified job submission and management with advanced job arrays, self-service portal, and templates (see the submission sketch below) ʎʎ Administrator dashboards and reporting tools reduce management complexity, time, and costs ʎʎ Workload-aware auto-power management reduces energy use and costs by 10-30 percent Intelligent HPC Workload Management Moab HPC Suite – Enterprise Edition “With Moab HPC Suite, we can meet very demanding customers’ requirements as regards unified management of heterogeneous cluster environments, grid management, and provide them with flexible and powerful configuration and reporting options. Our customers value that highly.” Thomas Gebert HPC Solution Architect
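To picture what simple job submission looks like from a user's terminal, the sketch below composes a small batch script and hands it to Moab's msub command; Torque's qsub accepts the same PBS-style directives. The resource requests, queue name and solver invocation are placeholders, the exact directives accepted depend on the site's configuration, and reading the script from standard input is assumed to work as it does with qsub.

```python
# Sketch: submit a batch job to Moab via msub (or Torque via qsub).
# Resource requests, queue name and script body are placeholders.
import subprocess

job_script = """#!/bin/bash
#PBS -N example_job
#PBS -l nodes=2:ppn=8
#PBS -l walltime=01:00:00
#PBS -q batch

cd $PBS_O_WORKDIR
mpirun ./my_solver input.dat
"""

# msub prints the ID of the newly queued job on success.
result = subprocess.run(["msub"], input=job_script,
                        capture_output=True, text=True, check=True)
print("Submitted job:", result.stdout.strip())
```

In practice users rarely need even this much: the self-service portal and application templates described above generate and submit such scripts on their behalf.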
  • 35. 35 Uptime Automation Job and resource failures in enterprise HPC systems lead to delayed results, missed organizational opportunities, and missed objectives. Moab HPC Suite – Enterprise Edition intelligently automates workload and resource uptime in the HPC system to ensure that workload completes successfully and reliably, avoiding these failures. Enterprises can benefit from: ʎʎ Intelligent resource placement to prevent job failures with granular resource modeling to meet workload requirements and avoid at-risk resources ʎʎ Auto-response to failures and events with configurable actions responding to pre-failure conditions, amber alerts, or other metrics and monitors ʎʎ Workload-aware future maintenance scheduling that helps maintain a stable HPC system without disrupting workload productivity ʎʎ Real-world expertise for fast time-to-value and system uptime with included implementation, training, and 24x7 remote support services Auto SLA Enforcement Moab HPC Suite – Enterprise Edition uses the powerful Moab intelligence engine to optimally schedule and dynamically adjust workload to consistently meet service level agreements (SLAs), guarantees, or business priorities. This automatically ensures that the right workloads are completed at the optimal times, taking into account the complex mix of departments, priorities and SLAs that must be balanced. Moab provides: ʎʎ Usage accounting and budget enforcement that schedules resources and reports on usage in line with resource sharing agreements and precise budgets (includes usage limits,
  • 36. 36 usage reports, auto budget management and dynamic fairshare policies) ʎʎ SLA and priority policies that make sure the highest priority workloads are processed first (includes Quality of Service, hierarchical priority weighting) ʎʎ Continuous plus future scheduling that ensures priorities and guarantees are proactively met as conditions and workloads change (i.e. future reservations, pre-emption, etc.) Grid- and Cloud-Ready Workload Management The benefits of a traditional HPC environment can be extended to more efficiently manage and meet workload demand with the multi-cluster grid and HPC cloud management capabilities in Moab HPC Suite – Enterprise Edition. It allows you to: ʎʎ Showback or chargeback for pay-for-use so actual resource usage is tracked with flexible chargeback rates and reporting by user, department, cost center, or cluster ʎʎ Manage and share workload across multiple remote clusters to meet growing workload demand or surges by adding on the Moab HPC Suite – Grid Option. Moab HPC Suite – Grid Option The Moab HPC Suite – Grid Option is a powerful grid-workload management solution that includes scheduling, advanced policy management, and tools to control all the components of advanced grids. Unlike other grid solutions, Moab HPC Suite – Grid Option connects disparate clusters into a logical whole, enabling grid administrators and grid policies to have sovereignty over all systems while preserving control at the individual cluster level. Moab HPC Suite – Grid Option has powerful applications that allow organizations to consolidate reporting; information gathering; and workload, resource, and data management. Moab HPC Suite – Grid Option delivers these services in a near-transparent way: users are unaware they are using grid resources—they know only that they are getting work done faster and more easily than ever before. Intelligent HPC Workload Management Moab HPC Suite – Grid Option
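The fairshare and priority weighting described above can be pictured with a small calculation. The snippet below is a deliberately simplified illustration of how a weighted job priority could combine a fairshare deficit with queue wait time; it is not Moab's actual priority formula, and the weight names are invented for the example.

```python
# Simplified illustration of weighted job priority (not Moab's actual formula).
# A job's priority rises the longer it waits and falls as its user's recent
# usage exceeds the fairshare target.
FAIRSHARE_WEIGHT = 100.0
QUEUETIME_WEIGHT = 1.0   # points per minute waited

def job_priority(fs_target: float, fs_actual_usage: float, minutes_queued: float) -> float:
    # Positive when the user is below target (deserves a boost), negative when above.
    fairshare_component = FAIRSHARE_WEIGHT * (fs_target - fs_actual_usage)
    queuetime_component = QUEUETIME_WEIGHT * minutes_queued
    return fairshare_component + queuetime_component

# A user at 10% usage against a 25% target, with a job waiting 30 minutes:
print(job_priority(fs_target=0.25, fs_actual_usage=0.10, minutes_queued=30))  # 45.0
```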
  • 37. 37 Moab manages many of the largest clusters and grids in the world. Moab technologies are used broadly across Fortune 500 companies and manage more than a third of the compute cores in the top 100 systems of the Top500 supercomputers. Adaptive Computing is a globally trusted ISV (independent software vendor), and the full scalability and functionality that Moab HPC Suite with the Grid Option offers in a single integrated solution has traditionally made it a significantly more cost-effective option than other tools on the market today. Components The Moab HPC Suite – Grid Option is available as an add-on to Moab HPC Suite – Basic Edition and Enterprise Edition. It extends the capabilities and functionality of the Moab HPC Suite components to manage grid environments, including: ʎʎ Moab Workload Manager for Grids – a policy-based workload management and scheduling multi-dimensional decision engine ʎʎ Moab Cluster Manager for Grids – a powerful and unified graphical administration tool for monitoring, managing and reporting across multiple clusters ʎʎ Moab Viewpoint™ for Grids – a Web-based self-service end-user job-submission and management portal and administrator dashboard Benefits ʎʎ Unified management across heterogeneous clusters provides the ability to move quickly from cluster to optimized grid ʎʎ Policy-driven and predictive scheduling ensures that jobs start and run in the fastest time possible by selecting optimal resources ʎʎ Flexible policy and decision engine adjusts workload processing at both grid and cluster level ʎʎ Grid-wide interface and reporting tools provide a view of grid resources, status and usage charts, and trends over time for capacity planning, diagnostics, and accounting ʎʎ Advanced administrative control allows various business units to access and view grid resources, regardless of physical or organizational boundaries, or alternatively restricts access to resources by specific departments or entities ʎʎ Scalable architecture to support peta-scale, high-throughput computing and beyond Grid Control with Automated Tasks, Policies, and Reporting ʎʎ Guarantee that the most-critical work runs first with flexible global policies that respect local cluster policies but continue to support grid service-level agreements ʎʎ Ensure availability of key resources at specific times with advanced reservations ʎʎ Tune policies prior to rollout with cluster- and grid-level simulations ʎʎ Use a global view of all grid operations for self-diagnosis, planning, reporting, and accounting across all resources, jobs, and clusters Moab HPC Suite – Grid Option can be flexibly configured for centralized, centralized and local, or peer-to-peer grid policies, decisions and rules. It is able to manage multiple resource managers and multiple Moab instances.
  • 38. 38 HarnessingthePowerofIntelXeonPhiandGPGPUs The mathematical acceleration delivered by many-core General Purpose Graphics Processing Units (GPGPUs) offers significant performance advantages for many classes of numerically in- tensive applications. Parallel computing tools such as NVIDIA’s CUDA and the OpenCL framework have made harnessing the power of these technologies much more accessible to develop- ers, resulting in the increasing deployment of hybrid GPGPU- based systems and introducing significant challenges for ad- ministrators and workload management systems. The new Intel Xeon Phi coprocessors offer breakthrough perfor- mance for highly parallel applications with the benefit of lever- aging existing code and flexibility in programming models for faster application development. Moving an application to Intel Xeon Phi coprocessors has the promising benefit of requiring substantially less effort and lower power usage. Such a small investment can reap big performance gains for both data-inten- sive and numerical or scientific computing applications that can make the most of the Intel Many Integrated Cores (MIC) technol- ogy. So it is no surprise that organizations are quickly embracing these new coprocessors in their HPC systems. The main goal is to create breakthrough discoveries, products and research that improve our world. Harnessing and optimiz- ing the power of these new accelerator technologies is key to doing this as quickly as possible. Moab HPC Suite helps organi- zations easily integrate accelerator and coprocessor technolo- gies into their HPC systems, optimizing their utilization for the right workloads and problems they are trying to solve. Intelligent HPC Workload Management OptimizingAcceleratorswithMoabHPCSuite
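To give a feel for why tools like CUDA have lowered the entry barrier, the sketch below uses Numba's CUDA support to run a trivial vector addition on a GPU from Python. It assumes a CUDA-capable device and the numba package, and is unrelated to Moab or Bright themselves; it only illustrates the kind of accelerated workload these systems schedule.

```python
# A minimal GPU kernel via Numba's CUDA support (assumes a CUDA-capable GPU
# and the numba package; illustrative only).
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)            # global thread index
    if i < out.size:
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads = 256
blocks = (n + threads - 1) // threads
vector_add[blocks, threads](a, b, out)   # Numba copies host arrays to/from the GPU
assert np.allclose(out, a + b)
```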
  • 39. 39 Moab HPC Suite automatically schedules hybrid systems incorporating Intel Xeon Phi coprocessors and GPGPU accelerators, optimizing their utilization as just another resource type with policies. Organizations can choose the accelerators that best meet their different workload needs. Managing Hybrid Accelerator Systems Hybrid accelerator systems add a new dimension of management complexity when allocating workloads to available resources. In addition to the traditional needs of aligning workload placement with software stack dependencies, CPU type, memory, and interconnect requirements, intelligent workload management systems now need to consider: ʎʎ Workload’s ability to exploit Xeon Phi or GPGPU technology ʎʎ Additional software dependencies ʎʎ Current health and usage status of available Xeon Phis or GPGPUs ʎʎ Resource configuration for type and number of Xeon Phis or GPGPUs Moab HPC Suite auto-detects which types of accelerators are where in the system to reduce the management effort and costs as these processors are introduced and maintained together in an HPC system. This gives customers maximum choice and performance in selecting the accelerators that work best for each of their workloads, whether Xeon Phi, GPGPU or a hybrid mix of the two. Moab HPC Suite optimizes accelerator utilization with policies that ensure the right ones get used for the right user and group jobs at the right time.
  • 40. 40 Optimizing Intel Xeon Phi and GPGPU Utilization System and application diversity is one of the first issues workload management software must address in optimizing accelerator utilization. The ability of current codes to use GPGPUs and Xeon Phis effectively, ongoing cluster expansion, and costs usually mean only a portion of a system will be equipped with one or a mix of both types of accelerators. Moab HPC Suite has a twofold role in reducing the complexity of their administration and optimizing their utilization. First, it must be able to automatically detect new GPGPUs or Intel Xeon Phi coprocessors in the environment and their availability without the cumbersome burden of manual administrator configuration. Second, and most importantly, it must accurately match Xeon Phi-bound or GPGPU-enabled workloads with the appropriately equipped resources in addition to managing contention for those limited resources according to organizational policy. Moab’s powerful allocation and prioritization policies can ensure the right jobs from users and groups get to the optimal accelerator resources at the right time. This keeps the accelerators at peak utilization for the right priority jobs. Moab’s policies are a powerful ally to administrators in automatically determining the optimal Xeon Phi coprocessor or GPGPU to use and the ones to avoid when scheduling jobs. These allocation policies can be based on any of the GPGPU or Xeon Phi metrics such as memory (to ensure job needs are met), temperature (to avoid hardware errors or failures), utilization, or other metrics. Intelligent HPC Workload Management Optimizing Accelerators with Moab HPC Suite
  • 41. 41 Use the following metrics in policies to optimize allocation of Intel Xeon Phi coprocessors: ʎʎ Number of Cores ʎʎ Number of Hardware Threads ʎʎ Physical Memory ʎʎ Free Physical Memory ʎʎ Swap Memory ʎʎ Free Swap Memory ʎʎ Max Frequency (in MHz) Use the following metrics in policies to optimize allocation of GPGPUs and for management diagnostics: ʎʎ Error counts; single/double-bit, commands to reset counts ʎʎ Temperature ʎʎ Fan speed ʎʎ Memory; total, used, utilization ʎʎ Utilization percent ʎʎ Metrics time stamp Improve GPGPU Job Speed and Success The GPGPU drivers supplied by vendors today allow multiple jobs to share a GPGPU without corrupting results, but sharing GPGPUs and other key GPGPU factors can significantly impact the speed and success of GPGPU jobs and the final level of ser- vice delivered to the system users and the organization. Consider the example of a user estimating the time for a GPGPU job based on exclusive access to a GPGPU, but the workload man- ager allowing the GPGPU to be shared when the job is scheduled. The job will likely exceed the estimated time and be terminated by the workload manager unsuccessfully, leaving a very unsatis- fied user and the organization, group or project they represent. Moab HPC Suite provides the ability to schedule GPGPU jobs to run exclusively and finish in the shortest time possible for cer- tain types, or classes of jobs or users to ensure job speed and success.
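The GPGPU metrics listed above correspond closely to what NVIDIA's management library (NVML) exposes on each node. As a rough illustration (not Moab's or Bright's internal implementation), the following script samples those readings with the pynvml bindings; note that fan speed is not reported on all boards.

```python
# Illustrative only: reads the kind of GPU metrics listed above using NVIDIA's
# NVML bindings (pip package "nvidia-ml-py", module "pynvml"). This is a sketch
# of how such metrics can be sampled on a node, not vendor management code.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        h = pynvml.nvmlDeviceGetHandleByIndex(i)
        temp = pynvml.nvmlDeviceGetTemperature(h, pynvml.NVML_TEMPERATURE_GPU)
        util = pynvml.nvmlDeviceGetUtilizationRates(h)   # .gpu / .memory in percent
        mem = pynvml.nvmlDeviceGetMemoryInfo(h)          # .total / .used in bytes
        try:
            fan = f"{pynvml.nvmlDeviceGetFanSpeed(h)}%"
        except pynvml.NVMLError:
            fan = "n/a"                                  # passively cooled boards
        print(f"GPU{i}: {temp} C, fan {fan}, util {util.gpu}%, "
              f"mem {mem.used // 2**20}/{mem.total // 2**20} MiB")
finally:
    pynvml.nvmlShutdown()
```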
  • 42. 42 Cluster Sovereignty and Trusted Sharing ʎʎ Guarantee that shared resources are allocated fairly with global policies that fully respect local cluster configuration and needs ʎʎ Establish trust between resource owners through graphical usage controls, reports, and accounting across all shared resources ʎʎ Maintain cluster sovereignty with granular settings to control where jobs can originate and be processed ʎʎ Establish resource ownership and enforce appropriate access levels with prioritization, preemption, and access guarantees Increase User Collaboration and Productivity ʎʎ Reduce end-user training and job management time with easy-to-use graphical interfaces ʎʎ Enable end users to easily submit and manage their jobs through an optional web portal, minimizing the costs of catering to a growing base of users ʎʎ Collaborate more effectively with multicluster co-allocation, allowing key resource reservations for high-priority projects ʎʎ Leverage saved job templates, allowing users to submit mul- Intelligent HPC Workload Management Moab HPC Suite – Application Portal Edition The Visual Cluster view in Moab Cluster Manager for Grids makes cluster and grid management easy and efficient.
  • 43. 43 tiple jobs quickly and with minimal changes ʎʎ Speed job processing with enhanced grid placement options for job arrays; optimal or single cluster placement Process More Work in Less Time to Maximize ROI ʎʎ Achieve higher, more consistent resource utilization with intelligent scheduling, matching job requests to the best-suited resources, including GPGPUs ʎʎ Use optimized data staging to ensure that remote data transfers are synchronized with resource availability to minimize poor utilization ʎʎ Allow local cluster-level optimizations of most grid workload Unify Management Across Independent Clusters ʎʎ Unify management across existing internal, external, and partner clusters – even if they have different resource managers, databases, operating systems, and hardware ʎʎ Out-of-the-box support for both local area and wide area grids ʎʎ Manage secure access to resources with simple credential mapping or interface with popular security tool sets ʎʎ Leverage existing data-migration technologies, such as SCP or GridFTP Moab HPC Suite – Application Portal Edition Remove the Barriers to Harnessing HPC Organizations of all sizes and types are looking to harness the potential of high performance computing (HPC) to enable and accelerate more competitive product design and research. To do this, you must find a way to extend the power of HPC resources to designers, researchers and engineers not familiar with or trained on using the specialized HPC technologies. Simplifying user access to and interaction with applications running on more efficient HPC clusters removes the barriers to discovery and innovation for all types of departments and organizations. Moab® HPC Suite – Application Portal Edition provides easy single-point access to common technical applications, data, HPC resources, and job submissions to accelerate the research, analysis and collaboration process for end users while reducing computing costs. HPC Ease of Use Drives Productivity Moab HPC Suite – Application Portal Edition improves ease of use and productivity for end users using HPC resources with single-point access to their common technical applications, data, resources, and job submissions across the design and research process. It simplifies the specification and running of jobs on HPC resources to a few simple clicks on an application-specific template, with key capabilities including: ʎʎ Application-centric job submission templates for common manufacturing, energy, life-sciences, education and other industry technical applications ʎʎ Support for batch and interactive applications, such as simulations or analysis sessions, to accelerate the full project or design cycle ʎʎ No special HPC training required, enabling more users to start accessing HPC resources with an intuitive portal ʎʎ Distributed data management avoids unnecessary remote file transfers for users, storing data optimally on the network for easy access and protection, and optimizing file transfer to/from user desktops if needed Accelerate Collaboration While Maintaining Control More and more projects are done across teams and with partner and industry organizations. Moab HPC Suite – Application Portal Edition enables you to accelerate this collaboration between individuals and teams while maintaining control, as additional users, inside and outside the organization, can easily and securely access project applications, HPC resources and data to speed project cycles.
Key capabilities and benefits you can leverage are: ʎʎ Encrypted and controllable access, by class of user, ser- vices, application or resource, for remote and partner users improves collaboration while protecting infrastructure and intellectual property
  • 44. 44 ʎʎ Enable multi-site, globally available HPC services, accessible anytime, anywhere, through many different types of devices via a standard web browser ʎʎ Reduced client requirements for projects means new project members can quickly contribute without being limited by their desktop capabilities Reduce Costs with Optimized Utilization and Sharing Moab HPC Suite – Application Portal Edition reduces the costs of technical computing by optimizing resource utilization and the sharing of resources including HPC compute nodes, application licenses, and network storage. The patented Moab intelligence engine and its powerful policies deliver value in key areas of utilization and sharing: ʎʎ Maximize HPC resource utilization with intelligent, optimized scheduling policies that pack and enable more work to be done on HPC resources to meet growing and changing demand ʎʎ Optimize application license utilization and access by sharing application licenses in a pool, re-using and optimizing usage with allocation policies that integrate with license managers for lower costs and better service levels to a broader set of users ʎʎ Usage budgeting and priorities enforcement with usage accounting and priority policies that ensure resources are shared in line with service level agreements and project priorities ʎʎ Leverage and fully utilize centralized storage capacity instead of duplicative, expensive, and underutilized individual workstation storage Key Intelligent Workload Management Capabilities: ʎʎ Massive multi-point scalability ʎʎ Workload-optimized allocation policies and provisioning ʎʎ Unify workload management across heterogeneous clusters ʎʎ Optimized, intelligent scheduling Intelligent HPC Workload Management Moab HPC Suite – Remote Visualization Edition
  • 45. 45 ʎʎ Optimized scheduling and management of accelerators (both Intel MIC and GPGPUs) ʎʎ Administrator dashboards and reporting tools ʎʎ Workload-aware auto-power management ʎʎ Intelligent resource placement to prevent job failures ʎʎ Auto-response to failures and events ʎʎ Workload-aware future maintenance scheduling ʎʎ Usage accounting and budget enforcement ʎʎ SLA and priority policies ʎʎ Continuous plus future scheduling Moab HPC Suite – Remote Visualization Edition Removing Inefficiencies in Current 3D Visualization Methods Using 3D and 2D visualization, valuable data is interpreted and new innovations are discovered. There are numerous inefficiencies in how current technical computing methods support users and organizations in achieving these discoveries in a cost-effective way. High-cost workstations and their software are often not fully utilized by the limited user(s) that have access to them. They are also costly to manage, and they don’t keep pace long with workload technology requirements. In addition, the workstation model also requires slow transfers of the large data sets between workstations. This is costly and inefficient in network bandwidth, user productivity, and team collaboration while posing increased data and security risks. There is a solution that removes these inefficiencies using new technical cloud computing models. Moab HPC Suite – Remote Visualization Edition significantly reduces the costs of visualization technical computing while improving the productivity, collaboration and security of the design and research process. Reduce the Costs of Visualization Technical Computing Moab HPC Suite – Remote Visualization Edition significantly reduces the hardware, network and management costs of visualization technical computing. It enables you to create easy, centralized access to shared visualization compute, application and data resources. These compute, application, and data resources reside in a technical compute cloud in your data center instead of in expensive and underutilized individual workstations. ʎʎ Reduce hardware costs by consolidating expensive individual technical workstations into centralized visualization servers for higher utilization by multiple users; reduce additive or upgrade technical workstation or specialty hardware purchases, such as GPUs, for individual users ʎʎ Reduce management costs by moving from remote user workstations that are difficult and expensive to maintain, upgrade, and back up to centrally managed visualization servers that require less admin overhead ʎʎ Decrease network access costs and congestion as significantly lower loads of just compressed, visualized pixels are moving to users, not full data sets ʎʎ Reduce energy usage as centralized visualization servers consume less energy for the user demand met ʎʎ Reduce data storage costs by consolidating data into common storage node(s) instead of under-utilized individual storage The Application Portal Edition easily extends the power of HPC to your users and projects to drive innovation and competitive advantage.
  • 46. 46 Consolidation and Grid Many sites have multiple clusters as a result of having multiple independent groups or locations, each with demands for HPC, and frequent additions of newer machines. Each new cluster increases the overall administrative burden and overhead. Addi- tionally, many of these systems can sit idle while others are over- loaded. Because of this systems-management challenge, sites turn to grids to maximize the efficiency of their clusters. Grids can be enabled either independently or in conjunction with one another in three areas: ʎʎ Reporting Grids Managers want to have global reporting across all HPC resources so they can see how users and proj- ects are really utilizing hardware and so they can effectively plan capacity. Unfortunately, manually consolidating all of this information in an intelligible manner for more than just a couple clusters is a management nightmare. To solve that problem, sites will create a reporting grid, or share informa- tion across their clusters for reporting and capacity-planning purposes. ʎʎ Management Grids Managing multiple clusters indepen- dently can be especially difficult when processes change, because policies must be manually reconfigured across all clusters. To ease that difficulty, sites often set up manage- ment grids that impose a synchronized management layer across all clusters while still allowing each cluster some level of autonomy. ʎʎ Workload-Sharing Grids Sites with multiple clusters often have the problem of some clusters sitting idle while other clusters have large backlogs. Such inequality in cluster utili- zation wastes expensive resources, and training users to per- form different workload-submission routines across various clusters can be difficult and expensive as well. To avoid these problems, sites often set up workload-sharing grids. These grids can be as simple as centralizing user submission or as Intelligent HPC Workload Management Using Moab With Grid Environments
  • 47. 47 complex as having each cluster maintain its own user sub- mission routine with an underlying grid-management tool that migrates jobs between clusters. Inhibitors to Grid Environments Three common inhibitors keep sites from enabling grid environ- ments: ʎʎ Politics Because grids combine resources across users, groups, and projects that were previously independent, grid implementation can be a political nightmare. To create a grid in the real world, sites need a tool that allows clusters to retain some level of sovereignty while participating in the larger grid. ʎʎ Multiple Resource Managers Most sites have a variety of resource managers used by various groups within the orga- nization, and each group typically has a large investment in scripts that are specific to one resource manager and that cannot be changed. To implement grid computing effective- ly, sites need a robust tool that has flexibility in integrating with multiple resource managers. ʎʎ Credentials Many sites have different log-in credentials for each cluster, and those credentials are generally indepen- dent of one another. For example, one user might be Joe.P on one cluster and J_Peterson on another. To enable grid envi- ronments, sites must create a combined user space that can recognize and combine these different credentials. Using Moab in a Grid Sites can use Moab to set up any combination of reporting, management, and workload-sharing grids. Moab is a grid metas- cheduler that allows sites to set up grids that work effectively in the real world. Moab’s feature-rich functionality overcomes the inhibitors of politics, multiple resource managers, and varying credentials by providing: ʎʎ Grid Sovereignty Moab has multiple features that break down political barriers by letting sites choose how each clus- ter shares in the grid. Sites can control what information is shared between clusters and can specify which workload is passed between clusters. In fact, sites can even choose to let each cluster be completely sovereign in making decisions about grid participation for itself. ʎʎ Support for Multiple Resource Managers Moab meta- schedules across all common resource managers. It fully integrates with TORQUE and SLURM, the two most-common open-source resource managers, and also has limited in- tegration with commercial tools such as PBS Pro and SGE. Moab’s integration includes the ability to recognize when a user has a script that requires one of these tools, and Moab can intelligently ensure that the script is sent to the correct machine. Moab even has the ability to translate common scripts across multiple resource managers. ʎʎ Credential Mapping Moab can map credentials across clus- ters to ensure that users and projects are tracked appropri- ately and to provide consolidated reporting.
  • 48. 48 Improve Collaboration, Productivity and Security With Moab HPC Suite – Remote Visualization Edition, you can im- prove the productivity, collaboration and security of the design and research process by only transferring pixels instead of data to users to do their simulations and analysis. This enables a wid- er range of users to collaborate and be more productive at any time, on the same data, from anywhere without any data trans- fer time lags or security issues. Users also have improved imme- diate access to specialty applications and resources, like GPUs, they might need for a project so they are no longer limited by personal workstation constraints or to a single working location. ʎʎ Improve workforce collaboration by enabling multiple users to access and collaborate on the same interactive ap- plication data at the same time from almost any location or device, with little or no training needed ʎʎ Eliminate data transfer time lags for users by keeping data close to the visualization applications and compute resourc- es instead of transferring back and forth to remote worksta- tions; only smaller compressed pixels are transferred ʎʎ Improve security with centralized data storage so data no longer gets transferred to where it shouldn’t, gets lost, or gets inappropriately accessed on remote workstations, only pixels get moved MaximizeUtilizationandSharedResourceGuarantees Moab HPC Suite – Remote Visualization Edition maximizes your resource utilization and scalability while guaranteeing shared resource access to users and groups with optimized workload management policies. Moab’s patented policy intelligence en- gine optimizes the scheduling of the visualization sessions across the shared resources to maximize standard and specialty resource utilization, helping them scale to support larger vol- umes of visualization workloads. These intelligent policies also guarantee shared resource access to users to ensure a high ser- vice level as they transition from individual workstations. Intelligent HPC Workload Management MoabHPCSuite–RemoteVisualizationEdition
  • 49. 49 ʎʎ Guarantee shared visualization resource access for users with priority policies and usage budgets that ensure they receive the appropriate service level. Policies include ses- sion priorities and reservations, number of users per node, fairshare usage policies, and usage accounting and budgets across multiple groups or users. ʎʎ Maximize resource utilization and scalability by packing visualization workload optimally on shared servers using Moab allocation and scheduling policies. These policies can include optimal resource allocation by visualization session characteristics (like CPU, memory, etc.), workload packing, number of users per node, and GPU policies, etc. ʎʎ Improve application license utilization to reduce software costs and improve access with a common pool of 3D visual- ization applications shared across multiple users, with usage optimized by intelligent Moab allocation policies, integrated with license managers. ʎʎ Optimized scheduling and management of GPUs and other accelerators to maximize their utilization and effectiveness ʎʎ Enable multiple Windows or Linux user sessions on a single machine managed by Moab to maximize resource and power utilization. ʎʎ Enable dynamic OS re-provisioning on resources, managed by Moab, to optimally meet changing visualization applica- tion workload demand and maximize resource utilization and availability to users. The Remote Visualization Edition makes visualization more cost effective and productive with an innovative technical compute cloud.
  • 50. transtec HPC solutions are designed for maximum flexibility and ease of management. We not only offer our customers the most powerful and flexible cluster management solution out there, but also provide them with customized setup and site-specific configuration. Whether a customer needs a dynamic Linux-Windows dual-boot solution, unified management of different clusters at different sites, or fine-tuning of the Moab scheduler to implement fine-grained policy configuration – transtec not only gives you the framework at hand, but also helps you adapt the system according to your special needs. Needless to say, when customers are in need of special training, transtec will be there to provide customers, administrators, or users with specially adjusted Educational Services. Many years of experience in High Performance Computing have enabled us to develop efficient concepts for installing and deploying HPC clusters. For this, we leverage well-proven 3rd-party tools, which we assemble into a complete solution and adapt and configure according to the customer’s requirements. We manage everything from the complete installation and configuration of the operating system, through necessary middleware components like job and cluster management systems, up to the customer’s applications. 50 Intelligent HPC Workload Management Moab HPC Suite – Remote Visualization Edition
  • 51. Moab HPC Suite Use Case | Basic | Enterprise | App. Portal | Remote Visualization

Productivity Acceleration
Massive multifaceted scalability | x | x | x | x
Workload optimized resource allocation | x | x | x | x
Optimized, intelligent scheduling of workload and resources | x | x | x | x
Optimized accelerator scheduling & management (Intel MIC & GPGPU) | x | x | x | x
Simplified user job submission & mgmt. – Application Portal | x | x | x | x
Administrator dashboard and management reporting | x | x | x | x
Unified workload management across heterogeneous cluster(s) | x (single cluster only) | x | x | x
Workload optimized node provisioning | - | x | x | x
Visualization Workload Optimization | - | - | - | x
Workload-aware auto-power management | - | x | x | x

Uptime Automation
Intelligent resource placement for failure prevention | x | x | x | x
Workload-aware maintenance scheduling | x | x | x | x
Basic deployment, training and premium (24x7) support service | Standard support only | x | x | x
Auto response to failures and events | x (limited, no re-boot/provision) | x | x | x

Auto SLA Enforcement
Resource sharing and usage policies | x | x | x | x
SLA and priority policies | x | x | x | x
Continuous and future reservations scheduling | x | x | x | x
Usage accounting and budget enforcement | - | x | x | x

Grid- and Cloud-Ready
Manage and share workload across multiple clusters in wide area grid | x w/Grid Option | x w/Grid Option | x w/Grid Option | x w/Grid Option
User self-service: job request & management portal | x | x | x | x
Pay-for-use showback and chargeback | - | x | x | x
Workload optimized node provisioning & re-purposing | - | x | x | x
Multiple clusters in single domain | | | |
51
  • 53. 53 InfiniBand (IB) is an efficient I/O technology that provides high- speed data transfers and ultra-low latencies for computing and storage over a highly reliable and scalable single fabric. The InfiniBand industry standard ecosystem creates cost effective hardware and software solutions that easily scale from genera- tion to generation. InfiniBand is a high-bandwidth, low-latency network interconnect solution that has grown tremendous market share in the High Performance Computing (HPC) cluster community. InfiniBand was designed to take the place of today’s data center networking tech- nology. In the late 1990s, a group of next generation I/O architects formed an open, community driven network technology to provide scalability and stability based on successes from other network designs. Today, InfiniBand is a popular and widely used I/O fabric among customers within the Top500 supercomputers: major Uni- versities and Labs; Life Sciences; Biomedical; Oil and Gas (Seismic, Reservoir, Modeling Applications); Computer Aided Design and En- gineering; Enterprise Oracle; and Financial Applications. HPC systems and enterprise grids deliver unprecedented time-to-market and perfor- mance advantages to many research and corporate cus- tomers, struggling every day with compute and data intensive processes. This of- ten generates or transforms massive amounts of jobs and data, that needs to be handled and archived effi- ciently to deliver timely in- formation to users distrib- uted in multiple locations, with different security con- cerns. Poor usability of such com- plex systems often nega- tively impacts users’ pro- ductivity, and ad-hoc data management often increas- es information entropy and dissipates knowledge and intellectual property.
  • 54. 54 NICE EnginFrame A Technical Portal for Remote Visualization From solving distributed computing issues for our customers, we know that a modern, user-friendly web front-end to HPC and grids can drastically improve engineering productivity, if properly designed to address the specific challenges of the Technical Computing market. NICE EnginFrame overcomes many common issues in the areas of usability, data management, security and integration, opening the way to broader, more effective use of Technical Computing resources. The key components of our solutions are: ʎʎ a flexible and modular Java-based kernel, with clear separation between customizations and core services ʎʎ powerful data management features, reflecting the typical needs of engineering applications ʎʎ comprehensive security options and a fine-grained authorization system ʎʎ a scheduler abstraction layer to adapt to different workload and resource managers ʎʎ responsive and competent support services
  • 55. 55 End users can typically enjoy the following improvements: ʎʎ user-friendly, intuitive access to computing resources, using a standard web browser ʎʎ application-centric job submission ʎʎ organized access to job information and data ʎʎ increased mobility and reduced client requirements On the other hand, the Technical Computing Portal delivers significant added value for system administrators and IT: ʎʎ reduced training needs to enable users to access the resources ʎʎ centralized configuration and immediate deployment of services ʎʎ comprehensive authorization to access services and information ʎʎ reduced support calls and submission errors Coupled with our Remote Visualization solutions, our customers quickly deploy end-to-end engineering processes on their Intranet, Extranet or Internet. EnginFrame Highlights ʎʎ Universal and flexible access to your Grid infrastructure EnginFrame provides easy access over intranet, extranet or the Internet using standard and secure Internet languages and protocols. ʎʎ Interface to the Grid in your organization EnginFrame offers pluggable server-side XML services to enable easy integration of the leading workload schedulers in the market. Plug-in modules are available for all major grid solutions including Platform Computing LSF and OCS, Microsoft Windows HPC Server 2008, Sun GridEngine, UnivaUD UniCluster, IBM LoadLeveler, Altair PBS/Pro and OpenPBS, Torque, Condor, and EDG/gLite Globus-based toolkits, providing easy interfaces to the existing computing infrastructure. ʎʎ Security and Access Control You can give encrypted and controllable access to remote users and improve collaboration with partners while protecting your infrastructure and Intellectual Property (IP). This enables you to speed up design cycles while working with related departments or external partners and eases communication through a secure infrastructure. ʎʎ Distributed Data Management The comprehensive remote file management built into EnginFrame avoids unnecessary file transfers, and enables server-side treatment of data, as well as file transfer from/to the user’s desktop. ʎʎ Interactive application support Through the integration with leading GUI virtualization solutions (including Desktop Cloud Visualization (DCV), VNC, VirtualGL and NX), EnginFrame enables you to address all stages of the design process, both batch and interactive. ʎʎ SOA-enable your applications and resources EnginFrame can automatically publish your computing applications via standard WebServices (tested both for Java and .NET interoperability), enabling immediate integration with other enterprise applications. Business Benefits Category Business Benefits Web and Cloud Portal ʎʎ Simplify user access; minimize the training requirements for HPC users; broaden the user community ʎʎ Enable access from intranet, internet or extranet