In this deck from the 2019 Stanford HPC Conference, Gregory M. Kurtzer from Sylabs Inc presents: Singularity - Container workflows for compute.
"Singularity is a widely adopted container technology specifically designed for compute-based workflows making application and environment reproducibility, portability and security a reality for HPC and AI researchers and resources. Here we will describe a high-level overview of Singularity and demonstrate how to integrate Singularity containers into existing application and resource workflows as well as describe some new trending models that we have been seeing."
Gregory M. Kurtzer is the CEO and founder of Sylabs Inc., the company behind the open source container project Singularity. Sylabs caters to the needs of various compute-based workflows like traditional simulation, data science, real time analytics, and AI use-cases. Greg has spent most of his career enabling massive-scale, compute-focused use cases, creating and leading various open source projects in support of that mission, including the Warewulf cluster management toolkit, CentOS Linux, and most recently, the container system Singularity.
Watch the video: https://youtu.be/cfQJ0JMjYEM
Learn more: https://www.sylabs.io/
and
http://hpcadvisorycouncil.com/events/2019/stanford-workshop/
2. Gregory M. Kurtzer
CEO and Founder, Sylabs Inc.
Previously spent ~20 years at LBNL/DOE as
the HPC Systems Architect.
I’m also known for founding various open
source projects like Warewulf, CentOS Linux,
and most recently Singularity!
INTRODUCTIONS…
3. Host 1 Host 2
APPLICATION CONTAINERIZATION 101
CPU Memory Devices
Kernel
Applications, Libraries, Services
CPU Memory Devices
Kernel
Apps, libs, services Container
SCP, HTTP,
FTP, Archive
An environment can be
built on one host,
encapsulated, and
packaged up into a
container image.
The container image
can be copied to
another host, and
applications can be
executed directly as if
they are running
native.
You can additionally isolate
or integrate the container
environment on the host as
the need necessitates.
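The build, copy, and run cycle described above can be sketched as a short shell script. This is only an illustration: the image name, the Docker source image, and the remote host `user@host2.example.org` are hypothetical placeholders, and the script defaults to a dry run that prints each command instead of executing it.

```shell
#!/bin/sh
# Sketch of the build -> copy -> run cycle.
# IMAGE and REMOTE are hypothetical placeholders, not values from the talk.
IMAGE=ubuntu.sif
REMOTE=user@host2.example.org

# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 on a machine with Singularity, scp, and ssh access to REMOTE.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Host 1: encapsulate an environment into a single-file container image.
run singularity build "$IMAGE" docker://ubuntu:18.04

# Move the image like any other file.
run scp "$IMAGE" "$REMOTE:"

# Host 2: execute a command inside the copied container.
run ssh "$REMOTE" singularity exec "$IMAGE" cat /etc/os-release
```

Any other file transfer method (HTTP, FTP, an archive) could stand in for `scp` here, since the image is a single regular file.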
4. Singularity is differentiated by two
primary categories:
• Container Format: Sylabs created an image format
to encapsulate OCI and Docker based containers,
which is single file based, cryptographically signed,
trusted, and immutable.
• Runtime Engine: Standardizing on existing POSIX
security practices, Singularity improves performance,
integration, ease of use and reduces attack surfaces
while enabling HPC and the growing need for
compute based orchestration.
DESIGNED FOR SECURITY, MOBILITY, AND PERFORMANCE
Runtime
Engine
Environment
Format
5. SINGULARITY IMAGE FORMAT (SIF)
”Building a container can be done in only 52 lines of
code!” – Liz Rice, Container Camp 2016
SIF is a unique, single file,
container encapsulation
format!
SIF is to containers what RPM and DEB are to source code!
Immutable Runtime
Container Image
Global Header
Recipe Definition
Labels
Environment
Writable Overlay
Signature Block
CRYPTOGRAPHICALLY SIGNED
Descriptors
6. SIF encapsulates OCI and Docker containers
into a single file adding benefits such as:
• Guaranteed immutable and reproducible
• Easy to move, share, archive, etc.
• POSIX compatible
• Encapsulates the entire application and environment stack
• Cryptographic signatures and validation
• No layers or dependencies
• No tarballs, SIF is the runtime format
• Encryption (with in-kernel decryption) coming soon
A NEW DELIVERY PARADIGM FOR SOFTWARE
Singularity Container
TRUST
sha256:94ed0..
sha256:94061..
sha256:aa74a...
sha256:becac…
…
Host Operating System
Presentation Layer
Root Owned Container
Daemon
Network Registry
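Because a SIF container is one file with no layers, ordinary file tools are enough to move it and to confirm that a copy is bit-identical. The sketch below uses a small stand-in file so it runs without Singularity installed; the file names are hypothetical. A plain checksum like this only shows that two files match. It does not replace the cryptographic signature check, which establishes who signed the image.

```shell
#!/bin/sh
# Stand-in for a real image so the sketch runs anywhere;
# in practice, point these paths at an actual .sif file.
workdir=$(mktemp -d)
printf 'pretend this is a SIF image\n' > "$workdir/container.sif"

# "Moving" the container is an ordinary file copy.
cp "$workdir/container.sif" "$workdir/container-copy.sif"

# A single checksum covers the whole image because there are no layers.
orig_sum=$(sha256sum "$workdir/container.sif" | cut -d' ' -f1)
copy_sum=$(sha256sum "$workdir/container-copy.sif" | cut -d' ' -f1)

if [ "$orig_sum" = "$copy_sum" ]; then
    echo "copy is bit-identical to the original"
else
    echo "copy differs from the original"
fi
```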
7. SIF PERFORMANCE
Objectives:
1. Measure scaling of Python startup
and import speed with increasing
numbers of concurrent Python
interpreters
2. Compare scaling of a standard
Python installation with an identical
containerized installation
Note: Underlying file system is NFS, max
jobs was 5120 over 320 nodes, graph is
logarithmic on both axes.
DR. WOLFGANG RESCH
HTTPS://GITHUB.COM/WRESCH/PYTHON_IMPORT_PROBLEM
Invocation performance over shared storage
8. Singularity provides absolute trust and
accountability
Execution of containers can be limited to only
valid keys, and/or key fingerprints
If a malicious user is found, keys
are revoked from the Sylabs
KeyStore, limiting exposure
ABSOLUTE TRUST OF ALL WORKLOADS
$ singularity pull container.sif library://user/container
$ singularity verify container.sif
Data integrity checked, authentic and signed by:
Gregory Kurtzer g@sylabs.io, KeyID F4EIAL82E…
$ singularity sign container.sif
$ singularity push container.sif library://user/container
9. EXTREME MOBILITY OF COMPUTE – BYOE
Absolute mobility from laptop, to
HPC, cloud, and all the way out to
the edge.
• Changing the packaging and mobility paradigm for
application and data
• Disrupts the barriers of portability and bridges the
gaps between all available resources
• From private resources, to public clouds and all the
way out to edge and IoT
Local Compute
IoT Edge
NVIDIA DGX
10. Designed for the complicated
integration needs of compute
• Container Runtime:
• Works on all supported Linux Distributions (runtimes and kernels)
• Designed for massive efficiency and performance
• Additional support for alignment between user and kernel space
• Container Image:
• Designed for absolute mobility, user freedom, and reproducibility
• Highly performant on shared and parallel file system deployments
• Can be easily shared and archived, and kept controls compliant; containers are
just data
• Environment:
• Optimized for application workflows like MPI and schedulers
• Allows direct access to GPUs, InfiniBand, FPGAs, file systems, data,
etc.
COMPATIBLE AND INTEGRATION AWARE
11. Data is shared between container and host as fluently as if contained
applications were running natively on the host.
NATIVE HOST INTEGRATION
$ singularity exec ubuntu.sif pwd
$ singularity exec ubuntu.sif python ./python_script_in_pwd.py
$ cat python_script_in_pwd.py | singularity exec docker://python:latest python
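When the data lives outside the directories shared with the container by default, a host path can be bound in explicitly with `--bind`. A minimal sketch, assuming an `ubuntu.sif` image in the current directory (the image name and the `/mnt` destination are placeholders); the container step is skipped if Singularity or the image is missing.

```shell
#!/bin/sh
# Create some host-side data in a scratch directory.
workdir=$(mktemp -d)
echo "hello from the host" > "$workdir/message.txt"

# ubuntu.sif and /mnt are hypothetical placeholders.
if command -v singularity >/dev/null 2>&1 && [ -f ubuntu.sif ]; then
    # --bind SRC:DEST makes the host directory visible inside the container.
    singularity exec --bind "$workdir:/mnt" ubuntu.sif cat /mnt/message.txt
else
    echo "Singularity or ubuntu.sif not found; skipping the container step"
fi
```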
12. Singularity integrates with all batch resource managers, with zero
modifications, by calling the Singularity command directly within the
batch script
BATCH SUPPORT
#!/bin/sh
#SBATCH -N 32
mpirun singularity exec ~/ubuntu.sif mpi_program.exe
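A slightly fuller version of such a batch script might look like the sketch below, which writes the job file and prints the submit command. The tasks-per-node count, time limit, and image path are assumptions to adapt per site. In this model the host's `mpirun` launches the ranks, so the MPI library inside the container generally needs to be compatible with the one on the host.

```shell
#!/bin/sh
# Write a hypothetical Slurm job script. Node count is taken from the slide;
# tasks per node, time limit, and image path are placeholders.
cat > singularity_mpi_job.sh <<'EOF'
#!/bin/sh
#SBATCH -N 32
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00

IMAGE="$HOME/ubuntu.sif"

# The host's mpirun starts one container instance per MPI rank.
mpirun singularity exec "$IMAGE" mpi_program.exe
EOF

echo "Submit with: sbatch singularity_mpi_job.sh"
```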
13. With a PMIx supporting launcher, you can run a fully contained MPI
process directly from a compatible resource manager
MPI AND SLURM
$ srun -n 32 singularity exec ubuntu.sif mpi_program.exe
14. When a container includes a GPU enabled application and libraries,
Singularity (with the “--nv” option) can properly inject the required Nvidia
GPU driver libraries into the container, to match the host’s kernel
GPU / CUDA SUPPORT
$ singularity exec --nv ubuntu.sif gpu_program.exe
$ singularity run --nv docker://tensorflow/tensorflow:latest-gpu
16. IMB NETWORK PERFORMANCE
Benchmarks published by SDSC at UCSD
https://dl.acm.org/citation.cfm?doid=3093338.3106737
IMB SendRecv Run using Singularity and Non-Singularity IMB PingPong Run using Singularity and Non-Singularity
Content published here with explicit permission from the authors
17. OSU NETWORK LATENCY
Benchmarks published by SDSC at UCSD
https://dl.acm.org/citation.cfm?doid=3093338.3106737
Content published here with explicit permission from the authors
18. LS-DYNA PERFORMANCE
Benchmarks published by the Dell EMC HPC Innovation Lab
http://en.community.dell.com/techcenter/high-performance-computing/b/general_hpc/archive/2018/02/19/performance-of-ls-dyna-on-singularity-containers
“The performance difference while running LS-DYNA within Singularity containers remains within 2%, which is within
the run-to-run variability of the application itself.”
19. Designed for the security needs of
compute
• Container Engine:
• Singularity has no root owned daemon processes
• Implements privilege separation over an API to a secure thread
• DoD: Singularity is the only allowed container system
• Audited and certified by EU lab for use on the European Compute Grid
• NSF grant for 3rd party security assessment (in progress, going well!)
• Container:
• Singularity containers are immutable
• Cryptographically signed and verifiable
• Public keys can be managed over standard HKP protocol (or Sylabs key
services)
• Environment Requirements:
• Containers are run as the calling user
• Blocks all privilege escalation from within the container
SECURITY FOCUSED
20. You are always yourself within a Singularity context, and Singularity will
block escalation attempts within the container
Even if you know the root password, even if you have sudo installed,
even if you implement a SUID hack, Singularity will prevent privilege
escalation
SECURITY BLOCKS
$ singularity exec centos.sif whoami
$ singularity exec centos.sif sudo su -
$ singularity exec centos.sif /proc/$$/root/bin/su
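One simple way to observe the "you are always yourself" behavior is to compare the numeric user ID on the host with the one reported inside the container. This is a sketch that assumes a `centos.sif` image in the current directory and a default invocation (no user namespace or root-mapping options); it skips the comparison if Singularity or the image is not available.

```shell
#!/bin/sh
# User ID of the calling user on the host.
host_uid=$(id -u)

# centos.sif is a placeholder image name.
if command -v singularity >/dev/null 2>&1 && [ -f centos.sif ]; then
    # User ID as seen from inside the container.
    container_uid=$(singularity exec centos.sif id -u)
    if [ "$host_uid" = "$container_uid" ]; then
        echo "same user inside and outside the container (uid $host_uid)"
    else
        echo "uid differs: host $host_uid, container $container_uid"
    fi
else
    echo "Singularity or centos.sif not found; skipping the comparison"
fi
```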
21. • System administrators, always in 100% control
• Supports User Namespace (when kernel supports it)
• Linux Capabilities (per user or group ACLs)
• Directly integrates with host’s:
• SELinux
• AppArmor
• Seccomp
• Container execution can be limited by:
• Container owner or group
• Location on file system (trusted paths)
• Whitelist/blacklist by signed container fingerprints
ADDITIONAL SECURITY FEATURES
22. • Backend code updated to Go
• Fully OCI compatible (3.1: `singularity oci …`)
• Integration with enterprise standards:
• OCI: Image support with all container registries
• CNI: Support for all container networking options (port forwarding, NAT, etc.)
• CGroups: Resource limitations
• SIF updates
• Encapsulation of OCI and Docker formats
• Immutable and 100% guaranteed reproducible
• Cryptographically signed and verifiable
• No tarballs or archives: SIF is the runtime container format
• Multi-stage builds, and “disposable” development overlay
• NVIDIA HPCCM (HPC Container Maker) container builder
• Build tool integration: Spack, EasyBuild,… Docker, Img, Buildah, etc…
• Native support for macOS and Windows (coming soon)
• Kubernetes Support (native CRI)
WHAT ELSE IS NEW AND COMING SOON
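To make the multi-stage build item concrete, the sketch below writes a small two-stage definition file: the first stage compiles a program in a full toolchain image, and the second copies only the resulting binary into a minimal image. It assumes a Singularity release that supports the `Stage:` header and `%files from`; the base images, paths, and the tiny Go program are illustrative choices, not taken from the talk.

```shell
#!/bin/sh
# Write a hypothetical two-stage definition file.
cat > multistage.def <<'EOF'
Bootstrap: docker
From: golang:1.12
Stage: build

%post
    # Make the Go toolchain usable inside the build step.
    export PATH=/usr/local/go/bin:$PATH
    export HOME=/root
    mkdir -p /src && cd /src
    printf 'package main\nimport "fmt"\nfunc main() { fmt.Println("hello") }\n' > hello.go
    # Static binary so it runs in the minimal final image.
    CGO_ENABLED=0 go build -o /src/hello hello.go

Bootstrap: docker
From: alpine:3.9
Stage: final

%files from build
    /src/hello /bin/hello

%runscript
    exec /bin/hello
EOF

echo "Build with: sudo singularity build hello.sif multistage.def"
```

The final image then contains the compiled binary but none of the compiler toolchain, which keeps the SIF file small.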
24. BRIDGING THE GAP BETWEEN COMPUTE AND SERVICES
Native integration of Singularity with OCI, Kubernetes, and Nomad to be
completed in Q1 2019.
25. AI workflows typically have a “train” and “execute”
workflow, where the training is the most
computationally intensive
Singularity enables this workflow and
enables large scale distribution and
provides the needed assurance, security
and accountability for scale and
production
Train
Distribute
Build
Inference
ARTIFICIAL INTELLIGENCE MULTISTAGE WORKFLOWS
26. • Parallel training
• Distribution of trained models
• Real time AI / compute
• Data streaming
• Complete validation and trust
• Supporting all tools
• “HPC as a Service”
Singularity is the unifying substrate for
all compute needs
EXPANDING THE WORKFLOW SUPPORT OF THE
ECOSYSTEM
Data
Stream(s)
Kubernetes
Kafka - Stream Splitter and Balancer
Compute
Based Service
Compute
Based Service
Compute
Based Service
Compute
Based Service
Real time collectors, Visualization,
Storage, analytics, etc.
27. TENSORFLOW GPU PERFORMANCE
HPC and AI Solutions Engineering group at Dell EMC
https://www.nextplatform.com/2018/03/19/singularity-containers-for-hpc-deep-learning/
“The performance comparison between a bare metal versus a containerized version of the framework at 32 Tesla V100
is still under 2%, showing negligible performance delta between the two.”
30. Singularity, the container runtime of
choice for HPC, EPC/AI, and
enterprise workloads
As of Singularity 3.0:
• Multi-millions of container runs per day
• Approx 250,000 downloads (not counting redistributors)
• Installed on over 5 million x86 cores, 250k ARM
The same reasons that make Singularity fantastic for HPC
are what make Singularity fantastic for all enterprise compute needs!
MASSIVE ADOPTION AND GROWTH
33. • SingularityPRO:
• Fully supported versions of Singularity
• Code curated, trusted builds, RPM/DEB, simple deployment
• Feature identical to open source
• Stable with long term life
• Per node or site licensed
• Sylabs Cloud Services (SCS):
• KeyStore: Public key service for signed containers
• Container Library: A place to host, develop, sell, reference, and share
containers and AI trained models
• Remote Builder: safely and securely build containers without root, with a
web based development interface or use the native Singularity CLI
• Pipelines: CI/CD configurable pipelines for DevOps workflows (coming soon)
• Professional services, support, training, development, etc.
SYLABS OFFERINGS