Hadoop Webcast

Boosting Hadoop Performance with
Emulex OneConnect® 10Gb Ethernet Adapters
Agenda

  Digital Content, today and tomorrow
  What is Big Data?
  Information as an Asset
  A Solution to the Problem
  The Moving Bottleneck
  Hadoop on 10GbE
  Testing Configurations and Objectives
  Testing Results
  Comparison Analysis – The Tale of the Tape
  Q&A

  © 2011 Emulex Corporation
Digital Content – Big Data’s Singularity

  [Chart: A Decade of Digital Universe Growth: Storage in Exabytes. Y-axis: exabytes (0 to 10,000); X-axis: 2005, 2010, 2015.]
  Source: IDC's Digital Universe Study, sponsored by EMC, June 2011

  Sources of growth:
   – Consumer participation
   – Photo and video archiving
   – eCommerce
   – Social media
   – Social networking
   – Mobile applications
   – Search engine indexing
   – Web logs
   – Medical records
   – Financial transactions
   – Scientific research
   – Surveillance
What is Big Data?


  Collections of data exceeding the capabilities of traditional
  database management tools…
   – with dynamic, incremental data created around the data preceding it
   – scaling with advances in technology
   – from a growing number of sources
  Think Big Bang theory…
   – but on the order of bytes
  Spawning an entire ecosystem of
  new technologies and services
   – Powerful
   – Dynamic
   – Scalable




Tapping into Information as an Asset


  Organizations actively analyze data rather than just store it




  Increased Velocity                                    Actionable Data
  Larger Volume                                         Competitive Differentiation
  Greater Variety                                       Unlocking Value

A Solution to the Problem – Hadoop


  A powerful, fault-tolerant, self-healing open source
  platform for distributed computing on commodity clusters
  Scales to thousands of compute nodes and efficiently
  manages petabytes of data
  Leverages two key pieces of technology:
   – Hadoop Distributed File System (HDFS)
   – Hadoop MapReduce
  Can be deployed alongside legacy systems
  Enables old and new data to be combined in powerful ways
  Accessed by data-intensive applications
Artem Gavrilov

Senior Architect
Advanced Development Organization
The Moving Bottleneck in Hadoop Clusters


  Hadoop was designed around 1GbE networking for its
   – Ubiquity
   – Availability
   – Cost
  Today’s commodity servers deliver astounding performance
  gains over their predecessors
  Multi-core, multi-threaded processors, fast DDR memory, expanded
  memory space, and faster, larger internal system drives have
  moved the bottleneck to the legacy 1GbE network
  Performance characteristics available on today’s servers:
   –   Processor (4 cores, 8 threads): 25.6GB/s max. memory bandwidth
   –   PCIe 3.0 bus: 8GT/s bit rate
   –   DDR4 memory modules: up to 3,200 MT/s
   –   Storage: SSDs capable of 6Gb/s; SATA drives capable of 600MB/s
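
  To put two of the figures above side by side with the network, the following Python sketch converts them to MB/s and compares each with the 125 MB/s line rate of a 1GbE link. It is illustrative arithmetic only: the names are ours, decimal units are assumed (1 GB = 1000 MB), and protocol overhead is ignored.

```python
# Illustrative arithmetic only; names and decimal units (1 GB = 1000 MB) are assumptions.
GBE1_MBPS = 1_000 / 8  # 1 Gb/s = 1000 Mb/s, divided by 8 bits per byte -> 125 MB/s

COMPONENT_MBPS = {
    "Processor memory bandwidth (25.6GB/s)": 25.6 * 1000,
    "SATA drive interface (600MB/s)": 600.0,
    "1GbE network link": GBE1_MBPS,
}


def times_faster_than_1gbe(mbps: float) -> float:
    """Return how many times faster a component is than a 1GbE link."""
    return mbps / GBE1_MBPS


for name, mbps in COMPONENT_MBPS.items():
    print(f"{name}: {mbps:,.0f} MB/s = {times_faster_than_1gbe(mbps):.1f}x 1GbE")
```

  Even a single SATA interface outruns the 1GbE link several times over, which is the sense in which the bottleneck has moved to the network.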
Hadoop Cluster Hardware – Then and Now




   – Processors: 4 processor generations
   – Memory: DDR2 to DDR3 transition
   – Storage: higher density drives & SSDs
   – Network: no change, still 1GbE
Hadoop on 10GbE


  Network I/O performance must scale with the increase in…
   – Processing power
   – Memory capacity
   – Storage performance
  Network performance is essential to support larger and faster
  systems
  Migrating from a 1GbE to a 10GbE network with Emulex
  OneConnect adapters raised average ingest throughput roughly 4X
  in our tests, and up to nearly 8X in some scenarios
Fine Tuning Hadoop


  Hadoop workloads vary greatly
   – No “one size fits all” approach
   – 200+ cluster-wide and job-specific parameters that can be fine tuned
  With the workload variety comes a disparity in the distribution
  of resource demands, which can be classed as:

  CPU Intensive
   –   Machine learning
   –   Complex data/text mining
   –   Natural language processing
   –   Feature extraction

  I/O Intensive
   –   Indexing
   –   Searching
   –   Grouping
   –   Decoding/decompressing
   –   Data importing/exporting
The Setup


 Servers:
  – HP ML350 G6
      •   Dual quad-core Xeon 2GHz
      •   16 GB DDR3
      •   Broadcom 1GbE BCM5715
      •   Emulex OneConnect 10GbE OCe11102 Ethernet Adapter

 Storage:
  – SATA II 500GB 7200rpm disk drives, 6 per node
  – HP Smart Array G6 RAID Controller (JBOD, no RAID configured)

 OS and Software:
  – Ubuntu 64 bit
  – Hadoop (Cloudera Distribution)

 Cluster Configuration:
  – 15 servers with discrete roles
      •   1 NameNode
      •   11 DataNodes
      •   3 Clients
  – 1GbE and 10GbE switches
The Setup


  [Diagram: cluster topology showing Clients 1 to 3, the NameNode, and DataNodes 1 to 11 attached to a 10Gb switch and a 1Gb switch.]
Test Objective


     Measure HDFS throughput ingesting data into a Hadoop cluster
      –   Examining multiple client configurations
      –   Raising HDFS ‘put’ operations per client
      –   Transferring a constant 5GB file
      –   Replication factor set to three
      –   Duplicated for 1GbE and 10GbE networks

Clients                  1                2                  3
DataNodes                11               11                 11
‘Put’ Operations         1, 2, 4, 6, 8    1, 2, 4, 6, 8      1, 2, 4, 6, 8
Total Operations         1, 2, 4, 6, 8    2, 4, 8, 12, 16    3, 6, 12, 18, 24
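
The test matrix above can be sketched in Python as follows. This is a minimal sketch, not the harness used for the webcast: the slides do not show the actual commands, so the use of `hadoop fs -D dfs.replication=3 -put`, the function names, and the file and directory paths are all assumptions made for illustration. The command builder only constructs argument lists; it does not run anything.

```python
from itertools import product

# Values taken from the Test Objective slide.
CLIENTS = (1, 2, 3)
PUTS_PER_CLIENT = (1, 2, 4, 6, 8)
FILE_SIZE_GB = 5
REPLICATION = 3


def build_test_matrix():
    """Return one row per (clients, 'put' operations per client) combination."""
    rows = []
    for clients, puts in product(CLIENTS, PUTS_PER_CLIENT):
        total_ops = clients * puts
        rows.append({
            "clients": clients,
            "puts_per_client": puts,
            "total_ops": total_ops,
            # Data sent by the clients, before replication.
            "ingested_gb": total_ops * FILE_SIZE_GB,
            # Data written across the cluster once every block has three replicas.
            "written_gb": total_ops * FILE_SIZE_GB * REPLICATION,
        })
    return rows


def put_commands(local_file, dest_dir, puts):
    """Build (but do not run) one 'hadoop fs -put' argument list per operation.

    Assumption: replication is set per command with the generic -D option;
    the original tests may have set it in the cluster configuration instead.
    """
    return [
        ["hadoop", "fs", "-D", f"dfs.replication={REPLICATION}",
         "-put", local_file, f"{dest_dir}/copy_{i}"]
        for i in range(puts)
    ]


if __name__ == "__main__":
    for row in build_test_matrix():
        print(row)
```

To drive concurrent operations on one client, each argument list could be handed to `subprocess.Popen` and the processes waited on together; the hypothetical `copy_<i>` suffix keeps the destinations distinct.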
Test Results – Legacy 1GbE

  [Chart: Data Import – Single Client, Single ‘Put’ Operation. Y-axis: MBps (0 to 1000); X-axis: Time (sec), 0 to 56. Series: 1 Operation.]

  A single client running a single operation makes
  maximal use of the network
  HDFS efficiently transfers data to DataNodes within the
  cluster, averaging 108MBps out of the client server
Test Results – Legacy 1GbE

  [Chart: Data Import – Single Client, Multiple ‘Put’ Operations. Y-axis: MBps (0 to 1000); X-axis: Time (sec), 0 to 56. Series: 1 Operation, 4 Operations, 8 Operations.]

  When more than one ‘put’ operation runs on a
  client, the 1GbE network becomes the bottleneck
  Increasing the number of operations did not increase
  client throughput – restricted by the network connection
Test Results – Legacy 1GbE

  [Chart: Data Import – Multiple Clients, Multiple ‘Put’ Operations. Y-axis: MBps (0 to 1000); X-axis: Time (sec), 0 to 56. Series: 1 Operation, 4 Operations, 8 Operations.]

  Expected to observe throughput scale with
  additional clients
  Combined In and Out traffic averaged 225MBps
Test Results – Legacy 1GbE

  [Chart: Data Import – Multiple Clients, Multiple ‘Put’ Operations. Y-axis: MBps (0 to 1000); X-axis: Time (sec), 0 to 56. Series: 1 Operation, 4 Operations, 8 Operations.]

  As network load increases, 1GbE quickly reaches saturation
  and becomes the system bottleneck
Test Results – Emulex OneConnect 10GbE

  [Chart: Data Import – Single Client, Single ‘Put’ Operation. Y-axis: MBps (0 to 180); X-axis: Time (sec), 0 to 56. Series: 1GbE, 10GbE.]

  Immediate performance improvement of 50%
  compared to 1GbE network
  Data transfer completed in less than three quarters of
  the time
Test Results – Emulex OneConnect 10GbE

  [Chart: Data Import – Single Client, Multiple ‘Put’ Operations. Y-axis: MBps (0 to 1000); X-axis: Time (sec), 0 to 80. Series: 1 Operation, 4 Operations, 8 Operations.]

  Increased network load is met with increased
  throughput
  Achieved transfer rates of 800MBps, nearly 8X the
  observed throughput of the 1GbE configuration
Test Results – Emulex OneConnect 10GbE

  [Chart: Data Import – Multiple Clients, Multiple ‘Put’ Operations. Y-axis: MBps (0 to 1800); X-axis: Time (sec), 0 to 150. Series: 1 Operation, 4 Operations, 8 Operations.]

  Throughput scales with additional clients being
  brought on-line
  The 10GbE network does not limit transfer rates as the
  clients and their operations increase
Tale of the Tape – 1GbE vs 10GbE

  [Chart: Maximum Throughput Achieved. Y-axis: MBps (0 to 1800); X-axis: Time (sec), 1 to 401. Series: 1G, 10G.]

  Clients              3
  DataNodes            11
  ‘Put’ Operations     6
  Total Operations     18
  Data Size            270GB
  1GbE Max MBps        250
  10GbE Max MBps       1,674 (6.7X faster)
Tale of the Tape – 1GbE vs 10GbE

  [Chart: Average Throughput Achieved. Y-axis: MBps (0 to 1000); X-axis: Number of 'put' operations (1, 2, 4, 8, 12, 18). Series: 1G, 10G.]

  ~4X throughput enables more efficient real time analysis

  Clients              3
  DataNodes            11
  ‘Put’ Operations     6
  Total Operations     18
  Data Size            270GB
  1GbE Avg MBps        216
  10GbE Avg MBps       831 (3.85X faster)
Tale of the Tape – 1GbE vs 10GbE

  [Chart: Time to Completion (seconds). Y-axis: Time (sec), 0 to 600; X-axis: Number of 'put' operations (1, 2, 4, 8, 12, 18). Series: 1G, 10G.]

  Load times reduced by 75%, improving batch analysis

  Clients              3
  DataNodes            11
  ‘Put’ Operations     6
  Total Operations     18
  Data Size            270GB
  1GbE Completion      453 sec
  10GbE Completion     115 sec (3.94X faster)
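
The headline ratios on the three "Tale of the Tape" slides follow directly from the figures quoted there. The short Python check below reproduces them; the variable and function names are ours, and the data-size line assumes that 270GB counts all three replicas of each 5GB file.

```python
# Figures copied from the "Tale of the Tape" slides (3 clients, 6 'put' operations each).
MAX_MBPS = {"1GbE": 250, "10GbE": 1674}
AVG_MBPS = {"1GbE": 216, "10GbE": 831}
COMPLETION_SEC = {"1GbE": 453, "10GbE": 115}

TOTAL_OPERATIONS = 18
FILE_SIZE_GB = 5
REPLICATION = 3


def ratio(numerator: float, denominator: float) -> float:
    """Plain ratio, used for both throughput and completion-time comparisons."""
    return numerator / denominator


max_speedup = ratio(MAX_MBPS["10GbE"], MAX_MBPS["1GbE"])
avg_speedup = ratio(AVG_MBPS["10GbE"], AVG_MBPS["1GbE"])
time_speedup = ratio(COMPLETION_SEC["1GbE"], COMPLETION_SEC["10GbE"])
time_reduction_pct = 100 * (1 - COMPLETION_SEC["10GbE"] / COMPLETION_SEC["1GbE"])

# Assumption: the quoted data size includes every replica written to the cluster.
data_size_gb = TOTAL_OPERATIONS * FILE_SIZE_GB * REPLICATION

print(f"Max throughput:  {max_speedup:.1f}X")
print(f"Avg throughput:  {avg_speedup:.2f}X")
print(f"Completion time: {time_speedup:.2f}X faster, {time_reduction_pct:.0f}% shorter")
print(f"Data size:       {data_size_gb}GB")
```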
Key Takeaways

 Hadoop runs faster with 10G
  – Up to 8 times faster in some scenarios
 Fine tuning parameters is important for performance
  – Improvements may not be possible without proper configuration
 Future performance gains are possible
  – Hadoop was designed for 1GbE, but small changes will enable the full
    potential of 10GbE
 Hadoop is better with Emulex OneConnect Ethernet Adapters
  – “It just works” – right out of the box
  – Leverage our expertise to configure your Hadoop installation for
    maximum performance




Questions


 Q: Which 1GbE and 10GbE switches were included in our tests?
 And would we see better performance with a switch that had
 lower latency?
 A: We used several different models of Cisco switches, each with
 different latency attributes. We found that latency didn’t impact
 throughput in a significant way. In one case, when
 moving to a switch with double the latency performance, we
 saw only a roughly 1% increase in throughput.
 Within the construct of our tests, we did not find
 that latency was critical to the performance results.
Questions


 Q: Did we find the network being the bottleneck prior to the disk
 subsystem becoming the bottleneck?
 A: Yes, and it comes through in our graphs. It’s important to note
 that at the beginning of our tests, we encountered some disk
 performance bottlenecks due to configuration issues,
 which shows that it is essential to understand the configuration
 settings for your Hadoop cluster in order to tap the full potential
 of your disks. With commodity disks, the standard performance
 characteristic is 100MBps per disk. Typical environments have
 6 disks per node, totaling 600MBps in performance potential. In
 some cases, you don’t need disk operations to actually happen
 – data is moved from memory to memory – but in most
 cases, data is moved from disk to disk on different machines.
 In those cases, disk performance is important. However, in our
 test cases, the disk performance was not a bottleneck.
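
 The disk arithmetic in this answer can be written out as a small Python sketch that compares a node's aggregate disk potential with the line rate of its network link. It is a simplification under stated assumptions: the function name is ours, the 100MBps per disk figure is the one quoted in the answer, and replication traffic, protocol overhead, and memory-to-memory transfers are ignored.

```python
def node_limits(disks: int, mbps_per_disk: float, link_gbps: float):
    """Return (disk potential MBps, link line rate MBps, which one is lower).

    Simplified model: aggregate disk throughput is disks * per-disk rate,
    and the link rate is its nominal bit rate divided by 8 bits per byte.
    """
    disk_mbps = disks * mbps_per_disk
    link_mbps = link_gbps * 1000 / 8
    limiter = "network" if link_mbps < disk_mbps else "disk"
    return disk_mbps, link_mbps, limiter


# Six commodity disks per node, as described in the answer.
for link_gbps in (1, 10):
    disk_mbps, link_mbps, limiter = node_limits(6, 100, link_gbps)
    print(f"{link_gbps}GbE: disks {disk_mbps:.0f}MBps, link {link_mbps:.0f}MBps, limited by {limiter}")
```

 Under this model a 1GbE link sits well below the 600MBps the disks could deliver, while a 10GbE link sits above it.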
Questions


 Q: How many 1GbE NICs were used? Were multiple 1GbE NICs
 bridged together, or just a single 10GbE NIC?
 A: Our configuration used a single 1GbE NIC with two ports,
 which is the typical commodity server configuration.
 Theoretically, you can install multiple cards and get better
 performance, but it is a more difficult proposition and would
 cost more than a single 10GbE NIC, aside from the fact that
 there likely would not be enough slots on the motherboard to
 accommodate that many cards.
Questions


 Q: What is the maximum throughput of 10GbE?
 A: 10GbE maximum throughput is 1.25GB/s for single direction
 data transfer. When aggregated with receiving data, 2.5GB/s is
 the maximum. Hadoop is not designed to accommodate this
 speed yet. Hopefully, it will be there soon. It’s important to
 mention that most 10GbE solutions today come with two
 ports, which means that you can achieve up to 5GB/s.
 Of course, in order to leverage that
 performance, you have to have a disk sub-system that operates
 close to that level. In the cases we observed where two 10GbE
 ports were used, the nodes had 12 high performance disks. Today, that
 is not necessary because Hadoop does not use the network
 efficiently, so even with 6 disks, you will see a significant
 performance gain.
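
 The three figures in this answer (1.25GB/s, 2.5GB/s and 5GB/s) come from the same conversion, sketched below in Python. The function name and parameters are ours, and the numbers are nominal line rates that ignore Ethernet and TCP/IP overhead.

```python
def link_gbytes_per_sec(gbits_per_sec: float = 10, ports: int = 1,
                        both_directions: bool = False) -> float:
    """Nominal throughput in GB/s for an Ethernet adapter.

    gbits_per_sec:   line rate of one port in one direction
    ports:           number of ports on the adapter
    both_directions: count send and receive together (full duplex)
    """
    per_direction = gbits_per_sec / 8  # 8 bits per byte
    directions = 2 if both_directions else 1
    return per_direction * ports * directions


print(link_gbytes_per_sec())                                  # one port, one direction
print(link_gbytes_per_sec(both_directions=True))              # one port, send plus receive
print(link_gbytes_per_sec(ports=2, both_directions=True))     # two ports, send plus receive
```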
Questions


 Q: Do we have a list of the parameters that need to be tuned within
 Hadoop in order to maximize the performance of our 10GbE
 NICs?
 A: The settings will vary depending on the environment. There
 isn’t a one-size-fits-all approach. Some of these parameters
 have been published in our white paper, and we will review that
 paper to ensure that all of those parameters are addressed.
Questions


 Q: Are these results comparable to other 10GbE NICs or is this
 something unique to the Emulex technology portfolio?
 A: We included multiple cards from our competitors in this
 research project. Emulex cards did offer performance
 advantages over our competition – approximately 10%. The
 important observation was that competitors’ cards were more
 prone to failures – servers stopped responding, system reboots
 needed, etc. Emulex cards were far more reliable across the
 board, which we believe is more important than fractional
 performance gains.
Questions


 Q: If the tests did not saturate the bandwidth of a 1GbE link, is the
 cause of the performance increase with 10GbE attributable to
 the “bursty” nature of the transfer itself?
 A: Hadoop is not optimized for networking, which is why there are
 some odd observations from time to time. There are times
 when even on 1GbE connections, it’s possible to not reach 50%
 of maximum throughput – a by-product of its design. Hadoop
 was designed to run multiple jobs and operations, and in those
 instances these performance issues do not manifest
 themselves.
Questions


 Q: Would a round-robin bonding configuration be possible with
 10GbE, and would there be a performance gain from that?
 A: Theoretically, it is possible. Practically, it is unlikely due to the
 underlying disk system becoming the bottleneck (for the
 moment). If there are SSDs, or more than 6 disks being
 used, there is potential for performance improvement.
Questions


 Q: Have we run tests with SSDs, higher RPM spindles, or larger
 spindle configurations?
 A: Yes, we did, and we encountered some interesting results.
 While we did see improvements of approximately 40%, we
 anticipated much better results with SSDs. The biggest issue
 with SSDs has to do with the way Hadoop interfaces with them
 – it does not tap into the full potential of the disk. Ultimately, we
 landed on throughput being the most important factor for
 performance, not necessarily I/O.
Thank You…





Más contenido relacionado

La actualidad más candente

Track 3, session 3,big data infrastructure by sunil brid
Track 3, session 3,big data infrastructure by sunil bridTrack 3, session 3,big data infrastructure by sunil brid
Track 3, session 3,big data infrastructure by sunil bridEMC Forum India
 
The Rise of Big Data and On-Demand IT
The Rise of Big Data and On-Demand ITThe Rise of Big Data and On-Demand IT
The Rise of Big Data and On-Demand ITInnoTech
 
Brochure : The EMC Big Data Solution
Brochure : The EMC Big Data Solution Brochure : The EMC Big Data Solution
Brochure : The EMC Big Data Solution EMC
 
Sanjay Mirchandani’s KeyNote – EMC Forum India – Mumbai November 17, 2011
Sanjay Mirchandani’s KeyNote – EMC Forum India – Mumbai November 17, 2011Sanjay Mirchandani’s KeyNote – EMC Forum India – Mumbai November 17, 2011
Sanjay Mirchandani’s KeyNote – EMC Forum India – Mumbai November 17, 2011EMC Forum India
 
EMC Big Data Solutions Overview
EMC Big Data Solutions OverviewEMC Big Data Solutions Overview
EMC Big Data Solutions Overviewwalshe1
 
Shams.khawaja
Shams.khawajaShams.khawaja
Shams.khawajaNASAPMC
 
Hp Ncoic Susanne Balle Sept17 Final
Hp Ncoic Susanne Balle Sept17 FinalHp Ncoic Susanne Balle Sept17 Final
Hp Ncoic Susanne Balle Sept17 FinalGovCloud Network
 
Renaissance in vm network connectivity
Renaissance in vm network connectivityRenaissance in vm network connectivity
Renaissance in vm network connectivityIT Brand Pulse
 
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...
Hadoop Analytics + Enterprise Class Storage: One-Stop Solution From EMC for H...EMC
 
Cccc net app_wallacefung
Cccc net app_wallacefungCccc net app_wallacefung
Cccc net app_wallacefungCloud Congress
 
Oracle tech fmw-03-cloud-computing-neum-15.04.2010
Oracle tech fmw-03-cloud-computing-neum-15.04.2010Oracle tech fmw-03-cloud-computing-neum-15.04.2010
Oracle tech fmw-03-cloud-computing-neum-15.04.2010Oracle BH
 
Journey to the Software Defined Data Center: EMA Research Results Revealed
Journey to the Software Defined Data Center: EMA Research Results Revealed Journey to the Software Defined Data Center: EMA Research Results Revealed
Journey to the Software Defined Data Center: EMA Research Results Revealed Enterprise Management Associates
 
Optimizing Cloud Computing with IPv6
Optimizing Cloud Computing with IPv6Optimizing Cloud Computing with IPv6
Optimizing Cloud Computing with IPv6John Rhoton
 
Oracle Cloud Computing Strategy (EMO)
Oracle Cloud Computing Strategy (EMO)Oracle Cloud Computing Strategy (EMO)
Oracle Cloud Computing Strategy (EMO)rachgregs
 
Cloud Networking: Network aspects of the cloud
Cloud Networking: Network aspects of the cloudCloud Networking: Network aspects of the cloud
Cloud Networking: Network aspects of the cloudSAIL
 
Cloud Infrastructure and Services (CIS) - Webinar
Cloud Infrastructure and Services (CIS) - WebinarCloud Infrastructure and Services (CIS) - Webinar
Cloud Infrastructure and Services (CIS) - WebinarEMC
 
Zsl cloud-application migration-8_phased_approach
Zsl cloud-application migration-8_phased_approachZsl cloud-application migration-8_phased_approach
Zsl cloud-application migration-8_phased_approachzslmarketing
 
Oracle cloud strategy
Oracle cloud strategyOracle cloud strategy
Oracle cloud strategyAgora Group
 
Presentatie Cisco NetApp Proact over FlexPod
Presentatie Cisco NetApp Proact over FlexPodPresentatie Cisco NetApp Proact over FlexPod
Presentatie Cisco NetApp Proact over FlexPodProact Netherlands B.V.
 

Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters

  • 1. Hadoop Webcast: Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters
  • 2. Agenda
      – Digital Content, today and tomorrow
      – What is Big Data?
      – Information as an Asset
      – A Solution to the Problem
      – The Moving Bottleneck
      – Hadoop on 10GbE
      – Testing Configurations and Objectives
      – Testing Results
      – Comparison Analysis – The Tale of the Tape
      – Q&A
    © 2011 Emulex Corporation
  • 3. Digital Content – Big Data’s Singularity
    [Chart: A Decade of Digital Universe Growth: Storage in Exabytes; y-axis 0 to 10000, x-axis 2005, 2010, 2015. Source: IDC's Digital Universe Study, sponsored by EMC, June 2011]
    Sources of growth:
      – Consumer participation
      – Photo and video archiving
      – eCommerce
      – Social media
      – Social networking
      – Mobile applications
      – Search engine indexing
      – Web logs
      – Medical records
      – Financial transactions
      – Scientific research
      – Surveillance
  • 4. What is Big Data?
      Collections of data exceeding the capabilities of traditional database management tools…
      – with dynamic, incremental data created around the data preceding it
      – scaling with advances in technology
      – from a growing number of sources
      Think Big Bang theory…
      – but in the order of bytes
      Spawning an entire ecosystem of new technologies and services
      – Powerful
      – Dynamic
      – Scalable
  • 5. Tapping into Information as an Asset
      Organizations actively analyze data rather than just store it
      – Increased Velocity
      – Larger Volume
      – Greater Variety
      – Actionable Data
      – Competitive Differentiation
      – Unlocking Value
  • 6. A Solution to the Problem – Hadoop
      A powerful, fault-tolerant, self-healing open source platform for distributed computing on commodity clusters
      Scales to thousands of compute nodes and efficiently manages petabytes of data
      Leverages two key pieces of technology:
      – Hadoop Distributed File System (HDFS)
      – Hadoop MapReduce
      Capable of being deployed alongside legacy systems
      – Enabling old and new data to be combined in powerful ways
      Accessed by data intensive applications
  • 7. Artem Gavrilov, Senior Architect, Advanced Development Organization
  • 9. The Moving Bottleneck in Hadoop Clusters
      Designed to run on 1GbE performance characteristics
      – Ubiquity
      – Availability
      – Cost
      Today’s commodity servers deliver astounding performance gains over their predecessors
      Multi-core, multi-threaded processors, fast DDR and expanded memory space, and faster, larger internal system drives have moved the bottleneck to the legacy 1GbE network
      Performance characteristics available on today’s servers:
      – Processor (4 cores, 8 threads): 25.6GB/s max. memory bandwidth
      – PCIe 3.0 bus: 8GT/s bit rate
      – DDR4 memory modules: up to 3,200 MT/s
      – Storage: SSDs capable of 6Gb/s; SATA drives capable of 600MB/s
  • 10. Hadoop Cluster Hardware – Then and Now
      – 4 processor generations
      – DDR2 to DDR3 transition
      – Higher density drives and SSDs
      – No change: 1GbE
  • 11. Hadoop on 10GbE
      Network I/O performance must scale with the increase in…
      – Processing power
      – Memory capacity
      – Storage performance
      Network performance is essential to support larger and faster systems
      Migrating from a 1GbE to a 10GbE network using Emulex OneConnect adapters resulted in a massive performance gain
  • 12. Fine Tuning Hadoop
      Hadoop workloads vary greatly
      – No “one size fits all” approach
      – 200+ cluster-wide and job-specific parameters that can be fine tuned
      With the workload variety comes a disparity in the distribution of resource demands, which can be classed as:
      CPU Intensive
      – Machine learning
      – Complex data/text mining
      – Natural language processing
      – Feature extraction
      I/O Intensive
      – Indexing
      – Searching
      – Grouping
      – Decoding/decompressing
      – Data importing/exporting
  • 13. The Setup
      Servers:
      – HP ML350 G6
        • Dual, Quad core Xeon 2GHz
        • 16 GB DDR3
        • Broadcom 1GbE BCM5715
        • Emulex OneConnect 10GbE OCe11102 Ethernet Adapter
      Storage:
      – SATA II 500GB 7200rpm Disk Drives, 6 per node
      – HP Smart Array G6 RAID Controller (JBOD, no RAID configured)
      Cluster Configuration:
      – 15 servers with discrete roles
        • 1 NameNode
        • 11 DataNodes
        • 3 Clients
      – 1GbE and 10GbE Switches
      OS and Software:
      – Ubuntu 64 bit
      – Hadoop (Cloudera Distribution)
  • 14. The Setup (diagram: NameNode, DataNodes 1 to 11, and Clients 1 to 3, connected through a 10Gb switch and a 1Gb switch)
  • 15. Test Objective
      Measure HDFS throughput ingesting data into a Hadoop cluster
      – Examining multiple client configurations
      – Raising HDFS 'put' operations per client
      – Transferring a constant 5GB file
      – Replication factor set to three
      – Duplicated for 1GbE and 10GbE networks

      Clients             1               2                3
      DataNodes           11              11               11
      'Put' Operations    1, 2, 4, 6, 8   1, 2, 4, 6, 8    1, 2, 4, 6, 8
      Total Operations    1, 2, 4, 6, 8   2, 4, 8, 12, 16  3, 6, 12, 18, 24
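The test matrix on slide 15 can be reproduced with a few lines of Python. This is a sketch for illustration only: the constants come from the slide, while the "replicated" column is an assumption about how a replication factor of three multiplies the data written to the cluster. Under that assumption, 18 operations × 5GB × 3 replicas gives 270GB, which is consistent with the data size quoted on the comparison slides, although the deck does not state how that figure was derived.

```python
# Sketch only: the constants mirror slide 15; the replicated-volume column is
# an assumption about how HDFS replication multiplies the data written.
CLIENT_COUNTS = (1, 2, 3)
PUTS_PER_CLIENT = (1, 2, 4, 6, 8)
FILE_GB = 5        # each 'put' transfers one 5GB file
REPLICATION = 3    # HDFS replication factor used in the tests


def build_matrix():
    """Return one row per (clients, puts-per-client) combination."""
    rows = []
    for clients in CLIENT_COUNTS:
        for puts in PUTS_PER_CLIENT:
            total_ops = clients * puts
            logical_gb = total_ops * FILE_GB
            rows.append({
                "clients": clients,
                "puts_per_client": puts,
                "total_ops": total_ops,
                "logical_gb": logical_gb,
                "replicated_gb": logical_gb * REPLICATION,
            })
    return rows


if __name__ == "__main__":
    for row in build_matrix():
        print(row)
```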
  • 16. Test Results – Legacy 1GbE
      Data Import – Single Client, Single 'Put' Operation (chart: throughput in MBps vs. time in seconds; series: 1 operation)
      – A single client running a single operation makes maximal use of the network
      – HDFS efficiently transfers data to DataNodes within the cluster, averaging 108MBps out of the client server
  • 17. Test Results – Legacy 1GbE
      Data Import – Single Client, Multiple 'Put' Operations (chart: throughput in MBps vs. time in seconds; series: 1, 4, and 8 operations)
      – When more than one 'put' operation runs on a client, the 1GbE network becomes the bottleneck
      – Increasing the number of operations did not increase client throughput, which was restricted by the network connection
  • 18. Test Results – Legacy 1GbE
      Data Import – Multiple Clients, Multiple 'Put' Operations (chart: throughput in MBps vs. time in seconds; series: 1, 4, and 8 operations)
      – Expected to observe throughput scale with additional clients
      – Combined In and Out traffic averaged 225MBps
  • 19. Test Results – Legacy 1GbE
      Data Import – Multiple Clients, Multiple 'Put' Operations (chart: throughput in MBps vs. time in seconds; series: 1, 4, and 8 operations)
      – As network load increases, 1GbE quickly reaches saturation and becomes the system bottleneck
  • 20. Test Results – Emulex OneConnect 10GbE
      Data Import – Single Client, Single 'Put' Operation (chart: throughput in MBps vs. time in seconds; series: 1GbE and 10GbE)
      – Immediate performance improvement of 50% compared to 1GbE network
      – Data transfer completed in less than three quarters of the time
  • 21. Test Results – Emulex OneConnect 10GbE
      Data Import – Single Client, Multiple 'Put' Operations (chart: throughput in MBps vs. time in seconds; series: 1, 4, and 8 operations)
      – Increased network load is met with increased throughput
      – Achieved transfer rates of 800MBps, nearly 8X the observed throughput of the 1GbE configuration
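As a rough check on the 1GbE saturation described in the legacy results, the observed rates can be compared with the nominal line rate. The sketch below assumes that "MBps" on the slides means megabytes per second, uses decimal units, and ignores protocol overhead; the function names are hypothetical.

```python
# Sketch under stated assumptions: decimal units, no protocol overhead,
# and "MBps" on the slides read as megabytes per second.
def line_rate_mb_per_s(link_gbps, full_duplex=False):
    """Nominal link capacity in MB/s; doubled when counting both directions."""
    one_way = link_gbps * 1000 / 8
    return one_way * 2 if full_duplex else one_way


def utilization(observed_mb_per_s, link_gbps, full_duplex=False):
    """Fraction of the nominal link capacity that the observed rate represents."""
    return observed_mb_per_s / line_rate_mb_per_s(link_gbps, full_duplex)


if __name__ == "__main__":
    # 108MBps out of a single client (slide 16) against 125 MB/s one way
    print(f"single client, outbound: {utilization(108, 1):.1%}")
    # 225MBps combined in and out (slide 18) against 250 MB/s both ways
    print(f"combined in and out:     {utilization(225, 1, full_duplex=True):.1%}")
```

On these assumptions both figures sit within about 10 to 15 percent of the nominal 1GbE ceiling, which matches the deck's description of the link as saturated.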
  • 22. Test Results – Emulex OneConnect 10GbE
      Data Import – Multiple Clients, Multiple 'Put' Operations (chart: throughput in MBps vs. time in seconds; series: 1, 4, and 8 operations)
      – Throughput scales with additional clients being brought on-line
      – The 10GbE network does not limit transfer rates as the clients and their operations increase
  • 23. Tale of the Tape – 1GbE vs 10GbE
      Maximum Throughput Achieved (chart: throughput in MBps vs. time in seconds; series: 1G and 10G)

      Clients             3
      DataNodes           11
      'Put' Operations    6
      Total Operations    18
      Data Size           270GB
      1GbE Max MBps       250
      10GbE Max MBps      1,674 (6.7X faster)
  • 24. Tale of the Tape – 1GbE vs 10GbE
      Average Throughput Achieved (chart: throughput in MBps vs. number of 'put' operations; series: 1G and 10G)
      ~4X throughput enables more efficient real time analysis

      Clients             3
      DataNodes           11
      'Put' Operations    6
      Total Operations    18
      Data Size           270GB
      1GbE Avg MBps       216
      10GbE Avg MBps      831 (3.85X faster)
  • 25. Tale of the Tape – 1GbE vs 10GbE
      Time to Completion (seconds) (chart: time in seconds vs. number of 'put' operations; series: 1G and 10G)
      Load times reduced by 75%, improving batch analysis

      Clients             3
      DataNodes           11
      'Put' Operations    6
      Total Operations    18
      Data Size           270GB
      1GbE Completion     453
      10GbE Completion    115 (3.94X faster)
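The ratios quoted on the three "Tale of the Tape" slides follow directly from the raw figures. A minimal sketch of the arithmetic, using only the numbers shown on slides 23 to 25 (the dictionary keys and function names are made up for illustration):

```python
# Figures copied from slides 23 to 25; the key names are illustrative only.
RESULTS = {
    "max_mbps": {"1GbE": 250, "10GbE": 1674},      # higher is better
    "avg_mbps": {"1GbE": 216, "10GbE": 831},       # higher is better
    "completion_s": {"1GbE": 453, "10GbE": 115},   # lower is better
}


def throughput_speedup(metric):
    """How many times higher the 10GbE throughput is than the 1GbE throughput."""
    return RESULTS[metric]["10GbE"] / RESULTS[metric]["1GbE"]


def completion_speedup():
    """How many times shorter the 10GbE load time is than the 1GbE load time."""
    times = RESULTS["completion_s"]
    return times["1GbE"] / times["10GbE"]


def load_time_reduction():
    """Fraction of the 1GbE load time saved by moving to 10GbE."""
    times = RESULTS["completion_s"]
    return 1 - times["10GbE"] / times["1GbE"]


if __name__ == "__main__":
    print(f"max throughput: {throughput_speedup('max_mbps'):.2f}X")
    print(f"avg throughput: {throughput_speedup('avg_mbps'):.2f}X")
    print(f"completion:     {completion_speedup():.2f}X")
    print(f"time saved:     {load_time_reduction():.1%}")
```

The computed time saving is a little under 75%, so the "reduced by 75%" headline is a rounded figure.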
  • 26. Key Takeaways
      Hadoop runs faster with 10G
      – Up to 8 times faster in some scenarios
      Fine tuning parameters is important for performance
      – Improvements may not be possible without proper configuration
      Future performance gains are possible
      – Hadoop was designed for 1GbE, but small changes will enable the full potential of 10GbE
      Hadoop is better with Emulex OneConnect Ethernet Adapters
      – “It just works”, right out of the box
      – Leverage our expertise to configure your Hadoop installation for maximum performance
  • 28. Questions
      Q: Which 1GbE and 10GbE switches were included in our tests? And would we see better performance with a switch that had lower latency?
      A: We used several different models of Cisco switches, each with different latency attributes. We found that latency didn’t impact throughput in a significant way. In one case, when moving to a switch with double the latency performance, we witnessed only a roughly 1% increase in throughput. Within the construct of our tests, we did not find that latency was critical to the performance results.
  • 29. Questions
      Q: Did we find the network being the bottleneck prior to the disk subsystem becoming the bottleneck?
      A: Yes, and it comes through in our graphs. It’s important to note that at the beginning of our tests, we encountered some disk performance bottlenecks due to configuration issues. This shows that it is essential to understand the configuration settings for your Hadoop cluster in order to tap the full potential of your disks. With commodity disks, the standard performance characteristic is 100MBps per disk; typical environments have 6 disks per node, totaling 600MBps in performance potential. In some cases, you don’t need disk operations to actually happen because data is moved from memory to memory, but in most cases, data is moved from disk to disk on different machines. In those cases, disk performance is important. However, in our test cases, disk performance was not a bottleneck.
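The disk-versus-network reasoning in this answer can be sketched as a comparison of two nominal ceilings per node. This is an idealized model rather than anything taken from the tests: it assumes the 100MBps-per-disk and 6-disks-per-node figures quoted above, decimal units, and no HDFS, replication, or protocol overhead. The function name is hypothetical.

```python
# Idealized sketch: nominal figures only, ignoring HDFS and protocol overhead.
def node_ceiling(disks, mb_per_s_per_disk, link_gbps):
    """Return (limiting resource, ceiling in MB/s) for one node."""
    disk_mb = disks * mb_per_s_per_disk   # aggregate sequential disk rate
    net_mb = link_gbps * 1000 / 8         # one-way nominal link rate
    limiter = "network" if net_mb < disk_mb else "disk"
    return limiter, min(disk_mb, net_mb)


if __name__ == "__main__":
    for gbps in (1, 10):
        limiter, ceiling = node_ceiling(6, 100, gbps)
        print(f"{gbps}GbE: limited by {limiter} at {ceiling:.0f} MB/s")
```

With these nominal numbers, a 1GbE link caps a six-disk node well below its 600MBps disk potential, while a single 10GbE port moves the nominal ceiling back to the disks.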
  • 30. Questions
      Q: How many 1GbE NICs were used? Were multiple 1GbE NICs bridged together, or just a single 10GbE NIC?
      A: Our configuration used a single 1GbE NIC with two ports, which is the typical commodity server configuration. Theoretically, you can install multiple cards and get better performance, but it is a more difficult proposition and would cost more than a single 10GbE NIC. There also likely would not be enough slots on the motherboard to accommodate that many cards.
  • 31. Questions
      Q: What is the maximum throughput of 10GbE?
      A: 10GbE maximum throughput is 1.25GB/s for single direction data transfer. When aggregated with receiving data, 2.5GB/s is the maximum. Hadoop is not designed to accommodate this speed yet. Hopefully, it will be there soon. It’s important to mention that most 10GbE solutions today come with two ports, which means that you can achieve up to 5GB/s. Of course, in order to leverage that performance, you need a disk subsystem that operates close to that level. We observed that making use of two 10GbE ports requires 12 high performance disks. Today, that is not necessary because Hadoop does not use the network efficiently, so even with 6 disks, you will see a significant performance gain.
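The three throughput figures in this answer (1.25GB/s, 2.5GB/s, and 5GB/s) are unit conversions of the 10Gb/s line rate. A short sketch of that arithmetic, assuming decimal units and ignoring framing and protocol overhead (the function name and parameters are illustrative):

```python
# Nominal figures only: 8 bits per byte, decimal units, no protocol overhead.
def max_throughput_gb_per_s(link_gbps=10, bidirectional=False, ports=1):
    """Nominal aggregate throughput in GB/s for an Ethernet adapter."""
    one_way = link_gbps / 8                    # Gb/s to GB/s, one direction
    directions = 2 if bidirectional else 1     # count send plus receive
    return one_way * directions * ports


if __name__ == "__main__":
    print(max_throughput_gb_per_s())                               # one port, one direction
    print(max_throughput_gb_per_s(bidirectional=True))             # one port, both directions
    print(max_throughput_gb_per_s(bidirectional=True, ports=2))    # two ports, both directions
```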
  • 32. Questions
      Q: Do we have a list of the parameters that need to be tuned within Hadoop in order to maximize the performance of our 10GbE NICs?
      A: The settings will vary depending on the environment. There isn’t a one-size-fits-all approach. Some of these parameters have been published in our white paper, and we will review that paper to ensure that all of those parameters are addressed.
  • 33. Questions
      Q: Are these results comparable to other 10GbE NICs or is this something unique to the Emulex technology portfolio?
      A: We included multiple cards from our competitors in this research project. Emulex cards did offer performance advantages over our competition of approximately 10%. The important observation was that competitors' cards were more prone to failures: servers stopped responding, system reboots were needed, and so on. Emulex cards were far more reliable across the board, which we believe is more important than fractional performance gains.
  • 34. Questions
      Q: If the tests did not saturate the bandwidth of a 1GbE link, is the cause of the performance increase with 10GbE attributable to the “bursty” nature of the transfer itself?
      A: Hadoop is not optimized for networking, which is why there are some odd observations from time to time. There are times when, even on 1GbE connections, it’s possible to not reach 50% of maximum throughput, a by-product of its design. Hadoop was designed to run multiple jobs and operations, and in those instances these performance issues do not manifest themselves.
  • 35. Questions
      Q: Would a round-robin bonding configuration be possible with 10GbE, and would there be a performance gain from that?
      A: Theoretically, it is possible. Practically, it is unlikely due to the underlying disk system becoming the bottleneck (for the moment). If there are SSDs, or more than 6 disks being used, there is potential for performance improvement.
  • 36. Questions
      Q: Have we run tests with SSDs, higher RPM spindles, or larger spindle configurations?
      A: Yes, we did, and we encountered some interesting results. While we did see improvements of approximately 40%, we anticipated much better results with SSDs. The biggest issue with SSDs has to do with the way Hadoop interfaces with them: it does not tap into the full potential of the disk. Ultimately, we landed on throughput being the most important factor for performance, not necessarily I/O.
  • 37. Thank You…