SlideShare una empresa de Scribd logo
1 de 15
Descargar para leer sin conexión
RDMAoE collaboration with KISTI




          Tuesday 6/7/2011
     10:00am-11:00am (50B-2222)

          mbalman@lbl.gov
RDMA for High Performance Data
              Movement

     Network I/O operations are costly:
      −   CPU load
      −   Context switching
      −   Memory latency

     Zero-copy networking
      −   NIC copies data directly to/from application
          memory

     IB transport (HPC applications)

     iWARP (TCP stack / TOE)
RDMA model


    One sided operations

    Get/Put semantics
              
                  Send/receive

    Direct data placement
              
                  RDMA Write
              
                  RDMA Read

    Asyschronous
        −   Work Queue (send queue – receive queue)
        −   Completion Queue
RDMA Programming Model

    Objects
                
                    Queue Pairs (protection domain)
                
                    Send queue (RDMA write, RDMA read)
                
                    Receive queue
                
                    Modify state
                
                    Completetion queue (poll)
                
                    Memory region (MR)

    Functions (verbs)
        −     IB (libmlx4) iWARP (libcxgb3)

    Librdmacm (connection setup)
RDMA/iWARP


    Implicit RDMA support

    Explicit RDMA support


    iWARP
       −    encapsulate RDMA traffic at a high level
       −    Use TCP stack
       −    Without TOE is it beneficial?
Alternative Approaches


    RDMA over Converged Ethernet (RoCE)
       −    Lightweight RDMA transport over Ethernet
              
                  Widely deployed technology
              
                  Support kernel bypass
              
                  OFED 1.5.1 supports RoCE

    SoftRDMAs...
       −    SoftRoCE (OFED 1.5.1 supports softRoCE)
       −    SoftiWARP (new TPC kernel stack)
Hidden Cost


    Memory Registration
       −   RDMA Read/Write

    Connection Setup
       −   Librdmacm


→ Bulk data movement?

    Asynchronous Model
       −   Buffer Management
Challanges in Bulk Transfer


    Application Level Adjustments

    Request Aggregation
        −   Small data files
        −   Does FTP like transfer mechanism is appropriate
            for RDMA?

    File System Overhead
        −   Asynchronous Operations

    Connection Caching / Multiple Connection?
Local Area / Wide Area


    IB RDMA designed for local area
        −   How does RDMA perform in Wide Area?

    iWARP
        −   No promising results - Over TCP (with TOE?)
        −   SoftiWARP ???

    RoCE
        −   Isolated traffic ? / much less CPU usage
        −   softRoCE?
GridFTP over RDMA


    XIO driver for GridFTP
        −   Experimented using Chelsio cards (cxgb3)
        −   10GE
        −   WAN testing in progress!

        −   Local area: 910MBbps – 1175MBps

        −   Much better than GridFTP over TCP
              
                   Much less CPU load (1/2)
FTP100 – FTP over RDMA


    Experimented with Mellonox Cards
       −    Local area – 10GE

       −    iWARP
              
                  Did not perform well compared to TCP
                     −   No significant gain
       −    RoCE tests
              
                  In progress (have some initial results)
              
                  Limited by the disk performance
              
                  Mem2mem:
                     −   Can already saturate the 10GE link
What is Next?

Experiments RDMA model over WAN


    SoftiWARP from IBM Zurich
        −   TCP kernel stack implementing/defining RDMA
            iverbs


    SoftRoCE – OFED 1.5.2-rxe distribution
        −   Multiple connections?
Transfer Applications over RDMA


    Simple Client/Server:
        −   Developing a prototype for transferring climate
            dataset using RDMA protocols
        −   Asysnchronous memory management module


    Application level tuning?
        −   Memory regions (max/min?)
        −   Multiple QPs
Climate Analysis

Climate Applications are Data-Intensive


    Shared data repository:
        −   Data files needs to be downloaded for further
            processing and analysis
        −   Data retrieval is the main bottleneck
        −   Multiple clients (working as VM instances)
               
                   Can not depent on HW support
               
                   SoftRoCE ? softiWARP
What can we do for WAN testing?



    Q&A?



→ https://sdm.lbl.gov/climate100/

Más contenido relacionado

La actualidad más candente

LF_DPDK17_mediated devices: better userland IO
LF_DPDK17_mediated devices: better userland IOLF_DPDK17_mediated devices: better userland IO
LF_DPDK17_mediated devices: better userland IO
LF_DPDK
 
Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal, V...
Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal,  V...Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal,  V...
Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal, V...
slashn
 
Computer network (5)
Computer network (5)Computer network (5)
Computer network (5)
NYversity
 
DB2 Data Sharing Performance for Beginners
DB2 Data Sharing Performance for BeginnersDB2 Data Sharing Performance for Beginners
DB2 Data Sharing Performance for Beginners
Martin Packer
 

La actualidad más candente (20)

zIIP Capacity Planning
zIIP Capacity PlanningzIIP Capacity Planning
zIIP Capacity Planning
 
LF_DPDK17_mediated devices: better userland IO
LF_DPDK17_mediated devices: better userland IOLF_DPDK17_mediated devices: better userland IO
LF_DPDK17_mediated devices: better userland IO
 
Time For D.I.M.E?
Time For D.I.M.E?Time For D.I.M.E?
Time For D.I.M.E?
 
Design installation-commissioning-red raider-cluster-ttu
Design installation-commissioning-red raider-cluster-ttuDesign installation-commissioning-red raider-cluster-ttu
Design installation-commissioning-red raider-cluster-ttu
 
ソフトウェアでのパケット処理あれこれ〜何故我々はロードバランサを自作するに至ったのか〜
ソフトウェアでのパケット処理あれこれ〜何故我々はロードバランサを自作するに至ったのか〜ソフトウェアでのパケット処理あれこれ〜何故我々はロードバランサを自作するに至ったのか〜
ソフトウェアでのパケット処理あれこれ〜何故我々はロードバランサを自作するに至ったのか〜
 
Unifying Network Filtering Rules for the Linux Kernel with eBPF
Unifying Network Filtering Rules for the Linux Kernel with eBPFUnifying Network Filtering Rules for the Linux Kernel with eBPF
Unifying Network Filtering Rules for the Linux Kernel with eBPF
 
Nap extras
Nap extrasNap extras
Nap extras
 
Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal, V...
Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal,  V...Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal,  V...
Slash n: Technical Session 2 - Messaging as a Platform - Shashwat Agarwal, V...
 
zIIP Capacity Planning - May 2018
zIIP Capacity Planning - May 2018zIIP Capacity Planning - May 2018
zIIP Capacity Planning - May 2018
 
Решения WANDL и NorthStar для операторов
Решения WANDL и NorthStar для операторовРешения WANDL и NorthStar для операторов
Решения WANDL и NorthStar для операторов
 
DB2 Through My Eyes
DB2 Through My EyesDB2 Through My Eyes
DB2 Through My Eyes
 
DCTcp
DCTcpDCTcp
DCTcp
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
TRex Traffic Generator - Hanoch Haim
TRex Traffic Generator - Hanoch HaimTRex Traffic Generator - Hanoch Haim
TRex Traffic Generator - Hanoch Haim
 
Virtual net performance
Virtual net performanceVirtual net performance
Virtual net performance
 
Computer network (5)
Computer network (5)Computer network (5)
Computer network (5)
 
Using Q4M - a message queue storage engine for MySQL
Using Q4M - a message queue storage engine for MySQLUsing Q4M - a message queue storage engine for MySQL
Using Q4M - a message queue storage engine for MySQL
 
Munich 2016 - Z011598 Martin Packer - He Picks On CICS
Munich 2016 - Z011598 Martin Packer - He Picks On CICSMunich 2016 - Z011598 Martin Packer - He Picks On CICS
Munich 2016 - Z011598 Martin Packer - He Picks On CICS
 
DB2 Data Sharing Performance for Beginners
DB2 Data Sharing Performance for BeginnersDB2 Data Sharing Performance for Beginners
DB2 Data Sharing Performance for Beginners
 
Paper on RDMA enabled Cluster FileSystem at Intel Developer Forum
Paper on RDMA enabled Cluster FileSystem at Intel Developer ForumPaper on RDMA enabled Cluster FileSystem at Intel Developer Forum
Paper on RDMA enabled Cluster FileSystem at Intel Developer Forum
 

Destacado

【標準】Wi−fi提案書2010 1206
【標準】Wi−fi提案書2010 1206【標準】Wi−fi提案書2010 1206
【標準】Wi−fi提案書2010 1206
Nagayama Shinichi
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014
Sadayuki Furuhashi
 

Destacado (19)

gcp ja night #27 Google Cloud Endpoints with Golang
gcp ja night #27 Google Cloud Endpoints with Golanggcp ja night #27 Google Cloud Endpoints with Golang
gcp ja night #27 Google Cloud Endpoints with Golang
 
Innovation egg 第5回 『クラウド運用の本音』オープニング
Innovation egg 第5回 『クラウド運用の本音』オープニングInnovation egg 第5回 『クラウド運用の本音』オープニング
Innovation egg 第5回 『クラウド運用の本音』オープニング
 
G Suite Overview by Kamene Projects
G Suite Overview by Kamene ProjectsG Suite Overview by Kamene Projects
G Suite Overview by Kamene Projects
 
Gcpug in fukuoka!20150411 #gcpug
Gcpug in fukuoka!20150411 #gcpugGcpug in fukuoka!20150411 #gcpug
Gcpug in fukuoka!20150411 #gcpug
 
【標準】Wi−fi提案書2010 1206
【標準】Wi−fi提案書2010 1206【標準】Wi−fi提案書2010 1206
【標準】Wi−fi提案書2010 1206
 
Gceハンズオン20150411イン福岡
Gceハンズオン20150411イン福岡Gceハンズオン20150411イン福岡
Gceハンズオン20150411イン福岡
 
FPGAX6_hayashi
FPGAX6_hayashiFPGAX6_hayashi
FPGAX6_hayashi
 
[db tech showcase Tokyo 2015] A14:Amazon Redshiftの元となったスケールアウト型カラムナーDB徹底解説 その...
[db tech showcase Tokyo 2015] A14:Amazon Redshiftの元となったスケールアウト型カラムナーDB徹底解説 その...[db tech showcase Tokyo 2015] A14:Amazon Redshiftの元となったスケールアウト型カラムナーDB徹底解説 その...
[db tech showcase Tokyo 2015] A14:Amazon Redshiftの元となったスケールアウト型カラムナーDB徹底解説 その...
 
AWSとGCPを使用したインフラ環境
AWSとGCPを使用したインフラ環境AWSとGCPを使用したインフラ環境
AWSとGCPを使用したインフラ環境
 
AWS Black Belt Online Seminar 10 Years of AWS
AWS Black Belt Online Seminar 10 Years of AWSAWS Black Belt Online Seminar 10 Years of AWS
AWS Black Belt Online Seminar 10 Years of AWS
 
Wowzaを用いた配信基盤 Takusuta tech conf01
Wowzaを用いた配信基盤 Takusuta tech conf01Wowzaを用いた配信基盤 Takusuta tech conf01
Wowzaを用いた配信基盤 Takusuta tech conf01
 
マーケティングとは何か
マーケティングとは何かマーケティングとは何か
マーケティングとは何か
 
Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014Presto - Hadoop Conference Japan 2014
Presto - Hadoop Conference Japan 2014
 
[Aurora事例祭り]毎日新聞ニュースサイトをクラウド化 ~Amazon Aurora 導入事例紹介~
[Aurora事例祭り]毎日新聞ニュースサイトをクラウド化  ~Amazon Aurora 導入事例紹介~[Aurora事例祭り]毎日新聞ニュースサイトをクラウド化  ~Amazon Aurora 導入事例紹介~
[Aurora事例祭り]毎日新聞ニュースサイトをクラウド化 ~Amazon Aurora 導入事例紹介~
 
0528 kanntigai ui_ux
0528 kanntigai ui_ux0528 kanntigai ui_ux
0528 kanntigai ui_ux
 
女子の心をつかむUIデザインポイント - MERY編 -
女子の心をつかむUIデザインポイント - MERY編 -女子の心をつかむUIデザインポイント - MERY編 -
女子の心をつかむUIデザインポイント - MERY編 -
 
A Tour of Google Cloud Platform
A Tour of Google Cloud PlatformA Tour of Google Cloud Platform
A Tour of Google Cloud Platform
 
[Aurora事例祭り]Amazon Aurora を使いこなすためのベストプラクティス
[Aurora事例祭り]Amazon Aurora を使いこなすためのベストプラクティス[Aurora事例祭り]Amazon Aurora を使いこなすためのベストプラクティス
[Aurora事例祭り]Amazon Aurora を使いこなすためのベストプラクティス
 
Can We Assess Creativity?
Can We Assess Creativity?Can We Assess Creativity?
Can We Assess Creativity?
 

Similar a Rdma presentation-kisti-v2

Accelerating Ceph with RDMA and NVMe-oF
Accelerating Ceph with RDMA and NVMe-oFAccelerating Ceph with RDMA and NVMe-oF
Accelerating Ceph with RDMA and NVMe-oF
inside-BigData.com
 
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreAdvanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
inside-BigData.com
 

Similar a Rdma presentation-kisti-v2 (20)

5G Cellular D2D RDMA Clusters
5G Cellular D2D RDMA Clusters5G Cellular D2D RDMA Clusters
5G Cellular D2D RDMA Clusters
 
Ceph Day Beijing - Ceph RDMA Update
Ceph Day Beijing - Ceph RDMA UpdateCeph Day Beijing - Ceph RDMA Update
Ceph Day Beijing - Ceph RDMA Update
 
Ceph Day Beijing - Ceph RDMA Update
Ceph Day Beijing - Ceph RDMA UpdateCeph Day Beijing - Ceph RDMA Update
Ceph Day Beijing - Ceph RDMA Update
 
High perf-networking
High perf-networkingHigh perf-networking
High perf-networking
 
Pristine rina-tnc-2016
Pristine rina-tnc-2016Pristine rina-tnc-2016
Pristine rina-tnc-2016
 
Pristine rina-tnc-2016
Pristine rina-tnc-2016Pristine rina-tnc-2016
Pristine rina-tnc-2016
 
100 M pps on PC.
100 M pps on PC.100 M pps on PC.
100 M pps on PC.
 
Ap nr5000 pt file
Ap nr5000 pt fileAp nr5000 pt file
Ap nr5000 pt file
 
Accelerating Ceph with RDMA and NVMe-oF
Accelerating Ceph with RDMA and NVMe-oFAccelerating Ceph with RDMA and NVMe-oF
Accelerating Ceph with RDMA and NVMe-oF
 
Sharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual MachinesSharing High-Performance Interconnects Across Multiple Virtual Machines
Sharing High-Performance Interconnects Across Multiple Virtual Machines
 
ARM LPC2300/LPC2400 TCP/IP Stack Porting
ARM LPC2300/LPC2400 TCP/IP Stack PortingARM LPC2300/LPC2400 TCP/IP Stack Porting
ARM LPC2300/LPC2400 TCP/IP Stack Porting
 
Update on IRATI technical work after month 6
Update on IRATI technical work after month 6Update on IRATI technical work after month 6
Update on IRATI technical work after month 6
 
UAV Data Link Design for Dependable Real-Time Communications
UAV Data Link Design for Dependable Real-Time CommunicationsUAV Data Link Design for Dependable Real-Time Communications
UAV Data Link Design for Dependable Real-Time Communications
 
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreAdvanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User Group
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
Accelerating Ceph with iWARP RDMA over Ethernet - Brien Porter, Haodong Tang
Accelerating Ceph with iWARP RDMA over Ethernet - Brien Porter, Haodong TangAccelerating Ceph with iWARP RDMA over Ethernet - Brien Porter, Haodong Tang
Accelerating Ceph with iWARP RDMA over Ethernet - Brien Porter, Haodong Tang
 
Network protocol
Network protocolNetwork protocol
Network protocol
 
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
Analytics, Big Data and Nonvolatile Memory Architectures – Why you Should Car...
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 

Más de balmanme

Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
balmanme
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
balmanme
 
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
balmanme
 
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
balmanme
 
Balman stork cw09
Balman stork cw09Balman stork cw09
Balman stork cw09
balmanme
 
Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...
balmanme
 
Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...
balmanme
 
Cybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-posterCybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-poster
balmanme
 
Presentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerPresentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summer
balmanme
 
Lblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarLblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminar
balmanme
 
Presentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopPresentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshop
balmanme
 
Aug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarAug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminar
balmanme
 
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation NetworksAnalyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
balmanme
 

Más de balmanme (20)

Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...Network-aware Data Management for Large Scale Distributed Applications, IBM R...
Network-aware Data Management for Large Scale Distributed Applications, IBM R...
 
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...Network-aware Data Management for High Throughput Flows   Akamai, Cambridge, ...
Network-aware Data Management for High Throughput Flows Akamai, Cambridge, ...
 
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
Hpcwire100gnetworktosupportbigscience 130725203822-phpapp01-1
 
Experiences with High-bandwidth Networks
Experiences with High-bandwidth NetworksExperiences with High-bandwidth Networks
Experiences with High-bandwidth Networks
 
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
A 100 gigabit highway for science: researchers take a 'test drive' on ani tes...
 
Balman stork cw09
Balman stork cw09Balman stork cw09
Balman stork cw09
 
Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...Available technologies: algorithm for flexible bandwidth reservations for dat...
Available technologies: algorithm for flexible bandwidth reservations for dat...
 
Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...Berkeley lab team develops flexible reservation algorithm for advance network...
Berkeley lab team develops flexible reservation algorithm for advance network...
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balman
 
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
Nersc dtn-perf-100121.test_results-nercmeeting-jan21-2010
 
Cybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-posterCybertools stork-2009-cybertools allhandmeeting-poster
Cybertools stork-2009-cybertools allhandmeeting-poster
 
Presentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summerPresentation summerstudent 2009-aug09-lbl-summer
Presentation summerstudent 2009-aug09-lbl-summer
 
Lblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminarLblc sseminar jun09-2009-jun09-lblcsseminar
Lblc sseminar jun09-2009-jun09-lblcsseminar
 
Presentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshopPresentation southernstork 2009-nov-southernworkshop
Presentation southernstork 2009-nov-southernworkshop
 
Balman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet BalmanBalman dissertation Copyright @ 2010 Mehmet Balman
Balman dissertation Copyright @ 2010 Mehmet Balman
 
Aug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminarAug17presentation.v2 2009-aug09-lblc sseminar
Aug17presentation.v2 2009-aug09-lblc sseminar
 
Pdcs2010 balman-presentation
Pdcs2010 balman-presentationPdcs2010 balman-presentation
Pdcs2010 balman-presentation
 
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation NetworksAnalyzing Data Movements and Identifying Techniques for Next-generation Networks
Analyzing Data Movements and Identifying Techniques for Next-generation Networks
 
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
MemzNet: Memory-Mapped Zero-copy Network Channel -- Streaming exascala data o...
 
Opening ndm2012 sc12
Opening ndm2012 sc12Opening ndm2012 sc12
Opening ndm2012 sc12
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Último (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Rdma presentation-kisti-v2

  • 1. RDMAoE collaboration with KISTI Tuesday 6/7/2011 10:00am-11:00am (50B-2222) mbalman@lbl.gov
  • 2. RDMA for High Performance Data Movement  Network I/O operations are costly: − CPU load − Context switching − Memory latency  Zero-copy networking − NIC copies data directly to/from application memory  IB transport (HPC applications)  iWARP (TCP stack / TOE)
  • 3. RDMA model  One sided operations  Get/Put semantics  Send/receive  Direct data placement  RDMA Write  RDMA Read  Asyschronous − Work Queue (send queue – receive queue) − Completion Queue
  • 4. RDMA Programming Model  Objects  Queue Pairs (protection domain)  Send queue (RDMA write, RDMA read)  Receive queue  Modify state  Completetion queue (poll)  Memory region (MR)  Functions (verbs) − IB (libmlx4) iWARP (libcxgb3)  Librdmacm (connection setup)
  • 5. RDMA/iWARP  Implicit RDMA support  Explicit RDMA support  iWARP − encapsulate RDMA traffic at a high level − Use TCP stack − Without TOE is it beneficial?
  • 6. Alternative Approaches  RDMA over Converged Ethernet (RoCE) − Lightweight RDMA transport over Ethernet  Widely deployed technology  Support kernel bypass  OFED 1.5.1 supports RoCE  SoftRDMAs... − SoftRoCE (OFED 1.5.1 supports softRoCE) − SoftiWARP (new TPC kernel stack)
  • 7. Hidden Cost  Memory Registration − RDMA Read/Write  Connection Setup − Librdmacm → Bulk data movement?  Asynchronous Model − Buffer Management
  • 8. Challanges in Bulk Transfer  Application Level Adjustments  Request Aggregation − Small data files − Does FTP like transfer mechanism is appropriate for RDMA?  File System Overhead − Asynchronous Operations  Connection Caching / Multiple Connection?
  • 9. Local Area / Wide Area  IB RDMA designed for local area − How does RDMA perform in Wide Area?  iWARP − No promising results - Over TCP (with TOE?) − SoftiWARP ???  RoCE − Isolated traffic ? / much less CPU usage − softRoCE?
  • 10. GridFTP over RDMA  XIO driver for GridFTP − Experimented using Chelsio cards (cxgb3) − 10GE − WAN testing in progress! − Local area: 910MBbps – 1175MBps − Much better than GridFTP over TCP  Much less CPU load (1/2)
  • 11. FTP100 – FTP over RDMA  Experimented with Mellonox Cards − Local area – 10GE − iWARP  Did not perform well compared to TCP − No significant gain − RoCE tests  In progress (have some initial results)  Limited by the disk performance  Mem2mem: − Can already saturate the 10GE link
  • 12. What is Next? Experiments RDMA model over WAN  SoftiWARP from IBM Zurich − TCP kernel stack implementing/defining RDMA iverbs  SoftRoCE – OFED 1.5.2-rxe distribution − Multiple connections?
  • 13. Transfer Applications over RDMA  Simple Client/Server: − Developing a prototype for transferring climate dataset using RDMA protocols − Asysnchronous memory management module  Application level tuning? − Memory regions (max/min?) − Multiple QPs
  • 14. Climate Analysis Climate Applications are Data-Intensive  Shared data repository: − Data files needs to be downloaded for further processing and analysis − Data retrieval is the main bottleneck − Multiple clients (working as VM instances)  Can not depent on HW support  SoftRoCE ? softiWARP
  • 15. What can we do for WAN testing?  Q&A? → https://sdm.lbl.gov/climate100/