Deep Learning Resources at MDC
Alf Wachsmann
Feb 13th, 2019
Agenda
1. Available hardware suitable for DL
2. How to get access
3. How to use it
Available Hardware
Nvidia DGX-1 (maxg01)
- 8x Nvidia Tesla V100
- 512 GB 2,133 MHz DDR4 RDIMM
- 4x 1.92 TB SSD RAID 0 (7 TB usable)
- 2x Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz (20 cores)
- 1x 1 Gb/s Ethernet (will go up to 10 Gb/s soon)
Temporary loaner hardware from DDN:
- 4x Mellanox EDR IB ConnectX-4
- Lustre access to AI200 storage via RDMA from inside Docker container(s)
Mellanox SB7800 InfiniBand EDR 100 Gb/s Switch
DDN AI200 (pre-production loaner from DDN)
- 4 x EDR InfiniBand
- 21 x 2.5” dual port 1.92 TB NVMe SSDs (26 TB usable)
Available Hardware
HPE ProLiant DL380 Gen10 (maxg03, maxg04)
- 3x Nvidia Tesla V100 (maxg04)
- 2x Nvidia Tesla V100 (maxg03)
- 192 GB 2,133 MHz DDR4 RDIMM
- 2x Intel(R) Xeon(R) Gold 6134 CPU v5 @ 3.20GHz (16 cores total)
- 2x 10 Gb/s Ethernet
AG Daumke (maxg02):
- 4x Nvidia Pascal TITAN Xp
- 92 GB 2,133 MHz DDR4 RDIMM
- 2x Intel(R) Xeon(R) Silver 4110 CPU v5 @ 2.10GHz (16 cores total)
- 1x 10 Gb/s Ethernet
Available Hardware
Nvidia Tesla V100:
- 640 Tensor Cores
- 5120 CUDA Cores
- Double-Precision 7 teraFLOPS
- Single-Precision 14 teraFLOPS
- Deep Learning 112 teraFLOPS
- Interconnect Bandwidth: 32 GB/s
- Memory: 16 GB HBM2
- Max Power Consumption: 250 W
- Data center quality
Nvidia Pascal Titan Xp:
- 3840 CUDA Cores
- Memory: 12 GB GDDR5X
- Max Power Consumption: 250 W
- Consumer gaming quality
All data and pictures are from the Nvidia web site
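The node inventory and the per-GPU figures above can be combined into rough cluster-wide totals. The following is only a sketch: it assumes every V100 in every node is the 16 GB variant described on this slide, and it multiplies the vendor's peak "Deep Learning" figure, which real workloads will not reach. The variable names are made up for illustration.

```shell
#!/bin/sh
# V100 counts per node, taken from the hardware slides
maxg01=8   # Nvidia DGX-1
maxg03=2   # HPE ProLiant DL380 Gen10
maxg04=3   # HPE ProLiant DL380 Gen10

# Per-GPU figures from the V100 slide (assumed to apply to every node)
v100_mem_gb=16       # HBM2 memory per GPU
v100_dl_tflops=112   # vendor peak "Deep Learning" teraFLOPS per GPU

v100_total=$((maxg01 + maxg03 + maxg04))
v100_mem_total=$((v100_total * v100_mem_gb))
v100_dl_total=$((v100_total * v100_dl_tflops))

echo "V100 GPUs:             $v100_total"
echo "V100 memory (GB):      $v100_mem_total"
echo "Peak DL teraFLOPS:     $v100_dl_total"
```

GPU memory is not pooled across GPUs, so the per-GPU 16 GB is still the limit that matters when sizing a single model or batch.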
How to get access
- All resources are connected to the Max Cluster
- Should be accessed via the batch system
- Documentation:
https://nagios.mdc-berlin.net/prod/wiki/doku.php?id=public:manuals:hpc:intro-en:usage#getting_access_to_the_gpu_compute_nodes
Alf Wachsmann -Deep Learning Resources at MDC
Example: stardist
Example from Deep Learning Club (Dec 3rd, 2018):
Uwe Schmidt and Martin Weigert from MPI-CBG
"Deep learning based image restoration and cell segmentation for fluorescence microscopy"
Read more about their work and methods:
https://github.com/mpicbg-csbd/stardist
Use containers as an easy solution for trying out software.
Read our (short) documentation about containers on Max Cluster:
https://nagios.mdc-berlin.net/prod/wiki/doku.php?id=public:manuals:hpc:user-guide:05-containers
How to use it: Containers!
https://ngc.nvidia.com/catalog/containers
Nvidia provides a collection of GPU-optimized containers with all necessary software built into them.
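Each container in the catalog is addressed by an image reference of the form registry/repository:tag. As a small sketch, the reference used later in this deck, nvcr.io/nvidia/tensorflow:18.11-py3, can be split into its parts with plain POSIX parameter expansion. The variable names are illustrative only.

```shell
#!/bin/sh
# Image reference from the Singularity example in this deck
ref="nvcr.io/nvidia/tensorflow:18.11-py3"

registry=${ref%%/*}      # everything before the first "/"
rest=${ref#*/}           # everything after the first "/"
repository=${rest%%:*}   # everything before the ":"
tag=${rest##*:}          # everything after the ":"

echo "registry:   $registry"
echo "repository: $repository"
echo "tag:        $tag"

# Singularity addresses Docker registry images with a docker:// prefix
echo "singularity URI: docker://$ref"
```

The tag pins a specific container release, so the same reference gives the same software stack on a laptop and on the cluster.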
Using containers on Max Cluster
Create your own Singularity container, with the stardist software in it, from the TensorFlow Docker container.
N.B.: Building needs sudo/root, i.e. use your own computer to build the container.
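$ cat stardist.singularity
Bootstrap: docker
From: nvcr.io/nvidia/tensorflow:18.11-py3

%post
    # Install a browser, Jupyter, and stardist inside the image
    apt-get -y update && apt-get -y install firefox
    pip install jupyter
    pip install stardist
    # Mount point for the notebooks, writable by any user
    mkdir /notebooks && chmod a+rwx /notebooks

%runscript
    # Start the Jupyter server when the container is run
    jupyter notebook --notebook-dir=/notebooks --ip 0.0.0.0 --allow-root

# Build the image (it will run on CentOS, Ubuntu, etc.)
$ sudo singularity build /tmp/stardist.sif stardist.singularity

# Download the stardist repository for its example notebooks
$ git clone https://github.com/mpicbg-csbd/stardist /home/awachs/Software/stardist

# Run with GPU support (--nv), bind-mounting the examples as /notebooks
$ singularity run --nv -B /home/awachs/Software/stardist/examples:/notebooks -B /tmp:/run /tmp/stardist.sif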
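Before starting the notebook server, it can be worth checking that the GPUs are actually visible inside the container. The sketch below wraps that check in a helper; the function name check_gpu and its exit codes are hypothetical and not part of the original slides. It relies only on `singularity exec --nv` and on `nvidia-smi`, which the `--nv` option makes available from the host driver installation.

```shell
#!/bin/sh
# Hypothetical helper, not from the original slides.
check_gpu() {
    image=$1
    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 2
    fi
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity is not installed on this host" >&2
        return 3
    fi
    # --nv exposes the host's Nvidia driver and GPUs to the container;
    # nvidia-smi then lists every GPU the container can see.
    singularity exec --nv "$image" nvidia-smi
}

# Check the image built above; report instead of aborting if the check fails
check_gpu /tmp/stardist.sif || echo "GPU check did not pass" >&2
```

On a GPU node the output should list the node's GPUs. If the list is empty or the command fails, the notebook would most likely fall back to the CPU.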
Using containers on Max Cluster
- The example above showed interactive use.
- Submitting to the batch system works just as well. Please consult our documentation.
- Important command to know about:
Trends
DL is now big business. More specialized hardware for large scale inference will appear.
Cloud providers will always offer the latest HW.
Intel:
- Xeon v6 (“Cascade Lake”)
- Vector Neural Network Instructions (AVX-512_VNNI) to speed up inference
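Whether a given x86 Linux machine supports these instructions can be read from the CPU flags the kernel reports. The following is a minimal sketch, assuming the flag is listed as avx512_vnni in /proc/cpuinfo. The helper name has_cpu_flag is made up for illustration, and the optional second argument exists only so that the function can be pointed at a different file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the CPU flag is listed in a cpuinfo file.
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # Take the first "flags" line and look for the flag as a whole word
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

if has_cpu_flag avx512_vnni; then
    echo "AVX-512 VNNI: supported"
else
    echo "AVX-512 VNNI: not reported by this CPU"
fi
```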
Nvidia:
- T4 Tensor Core GPU for AI Inference:
  - Turing Tensor Cores: 320
  - NVIDIA CUDA® cores: 2,560
  - Max Power Consumption: 70 W
- Pre-Trained Networks (the ones below are for Medicine):
  - NVIDIA Transfer Learning Toolkit: https://developer.nvidia.com/transfer-learning-toolkit
  - NVIDIA AI-Assisted Annotation SDK: https://developer.nvidia.com/clara/annotation
Google:
- Pre-trained variant calling network: https://github.com/google/deepvariant
