Secure Cloud Storage and
    Computing Using
Reconfigurable Hardware

Victor Costan (龍望), Hsin-Jung Yang (楊昕蓉),
      Srini Devadas, Nickolai Zeldovich
Why Security Matters
Cloud Computing: Dreams and Reality

• The Cloud: Ideal Picture   • The Cloud: Reality
Cloud Storage: Attack Vectors




 Hypervisor        State        Hardware
   Bugs         Manipulation     Attacks
Replay Attacks are Harmful
Spot the Differences
Spot the Differences: Job
Spot the Differences: Name, Relationship Status
Why It Matters

• We rely on fresh data to make decisions
  – Google searches
  – Facebook profiles
  – Twitter, LinkedIn


• Outdated data has a big impact on users
  – Wrong profile information: confusion, embarrassment
  – Old search results: bad business decisions, embarrassment
  – Old document versions: costly business decisions, regulatory issues
System Design
Design:
Cloud Storage API
• Block Device
  – Fixed block size (1 MB)
  – Write(block number, block)
  – Read(block number) → block


• Easy to reason about the security

• File systems operate on top of this abstraction


                   B1            B2            B3          B4

                            Disk divided into 1MB blocks
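
A minimal sketch of the block device interface above, assuming an in-memory store. The class name `BlockDevice` and the choice to return zeros for unwritten blocks are illustrative assumptions, not part of the talk.

```python
BLOCK_SIZE = 1 << 20  # 1 MB fixed block size, as on the slide


class BlockDevice:
    """In-memory stand-in for the cloud block device (hypothetical class)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._blocks = {}  # block number -> bytes

    def _check(self, block_number):
        if not 0 <= block_number < self.num_blocks:
            raise IndexError("block number out of range")

    def write(self, block_number, block):
        """Write(block number, block)"""
        self._check(block_number)
        if len(block) != BLOCK_SIZE:
            raise ValueError("block must be exactly BLOCK_SIZE bytes")
        self._blocks[block_number] = bytes(block)

    def read(self, block_number):
        """Read(block number) -> block"""
        self._check(block_number)
        # Assumption: unwritten blocks read back as all zeros.
        return self._blocks.get(block_number, bytes(BLOCK_SIZE))
```

A file system would sit on top of `read` and `write`; the security argument only has to cover these two calls.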
Design:
System Architecture

                                                                Client
 FPGA / ASIC                 Secure NVRAM
  (Trusted)                       Chip



                                             System Bus
                                             (Untrusted)
                                                                   Internet
                                                                 (Untrusted)



    CPU           Disk            RAM
 (Untrusted)                   (Untrusted)       Network Card
               (Untrusted)                        (Untrusted)
Design:
Trusted Storage on Untrusted Disks
160-bit hash in trusted memory authenticates 1TB disk


                                    Root Hash
                                                                            Root hash matches
                                    h7=h(h5||h6)                            iff all blocks match
  20
levels
                h5=h(h1||h2)                                                       Nodes hash
                                                        h6=h(h3||h4)              their children


         h1=h(B1)        h2=h(B2)            h3=h(B3)            h4=h(B4)         Leaves hash
                                                                                   their blocks


           B1                  B2                  B3                  B4

                        Disk divided into 1MB blocks
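
The tree above can be sketched in a few lines. SHA-1 is used here only because it produces a 160-bit digest; the slides do not name the hash function, so that choice is an assumption, as are the helper names `h`, `root_hash`, and `tree_levels`. The sketch assumes the number of blocks is a power of two.

```python
import hashlib


def h(data):
    """160-bit hash (SHA-1 is an assumption; the slides only say 160-bit)."""
    return hashlib.sha1(data).digest()


def root_hash(blocks):
    """Leaves hash their blocks; inner nodes hash their two children."""
    level = [h(block) for block in blocks]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def tree_levels(disk_bytes, block_bytes):
    """Number of levels below the root for a power-of-two number of leaves."""
    leaves = disk_bytes // block_bytes
    return (leaves - 1).bit_length()


blocks = [b"B1", b"B2", b"B3", b"B4"]
root = root_hash(blocks)
levels = tree_levels(1 << 40, 1 << 20)  # 1 TB disk, 1 MB blocks
```

With a 1 TB disk and 1 MB blocks there are 2^20 leaves, which gives the 20 levels shown on the slide.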
Design:
Hash Tree Caching
 Node      Hash                   Verified   Left child   Right child
 number
 1         fabe3c05d8ba995af93e   Y          Y            N
 2         e6fc9bc13d624ace2394   Y          Y            Y
 4         53a81fc2dcc53e4da819   Y          N            N
 5         b2ce548dfa2f91d83ec6   Y          N            N

• The FPGA caches hash tree nodes
• The untrusted OS is free to choose the caching policy, for maximum
  performance

                 1
         2               3
     4       5       6       7
Design:
Hash Tree Cache
• Server stores entire hash tree in RAM
• FPGA has a cache that stores a subset of nodes
• Server tells FPGA what nodes to store




                          Cache management commands



              1                             Node Hash    Verified
                                               1 fabe…      Y
      2               3                        2 e6fc…      Y
                                               4 53a8…      Y
  4       5       6         7
                                               5 b2ce…      Y
Design:
Hash Tree Cache - Load

• Server tells the FPGA to load a node into a cache entry
• The cache entry is unverified right after a load

                1                                       1

       2                                        2

  4                                         4       5

 Node Hash          Verified          Node Hash         Verified
      1 fabe…          Y                  1 fabe…           Y
      2 e6fc…          Y                  2 e6fc…           Y
      4 53a8…          N                  4 53a8…           N
                                          5 b2ce…           N
Design:
Hash Tree Cache - Verify

• Server tells the FPGA to use a node to verify its children
• FPGA checks that parent’s hash matches children hashes

                1                                       1

       2                                        2

  4        5                                4       5

 Node Hash          Verified          Node Hash         Verified
      1 fabe…          Y                 1 fabe…            Y
      2 e6fc…          Y                 2 e6fc…            Y
      4 53a8…          N                 4 53a8…            Y
      5 b2ce…          N                 5 b2ce…            Y
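
The Load and Verify commands can be sketched as follows. This is an illustrative model, not the FPGA's actual command set: the class name `HashTreeCache`, the heap-style numbering (children of node n are 2n and 2n+1), and SHA-1 as the 160-bit hash are assumptions.

```python
import hashlib


def h(data):
    return hashlib.sha1(data).digest()  # assumed 160-bit hash


class HashTreeCache:
    """Toy model of the trusted cache; the root (node 1) starts verified."""

    def __init__(self, root_hash):
        self.entries = {1: [root_hash, True]}  # node -> [hash, verified]

    def load(self, node, node_hash):
        """Load: the untrusted server supplies a hash; it starts unverified."""
        if node in self.entries:
            raise ValueError("node is already cached")
        self.entries[node] = [node_hash, False]

    def verify(self, parent):
        """Verify: a verified parent vouches for both of its children."""
        left, right = 2 * parent, 2 * parent + 1
        if parent not in self.entries or not self.entries[parent][1]:
            raise ValueError("parent must be cached and verified")
        if left not in self.entries or right not in self.entries:
            raise ValueError("both children must be loaded")
        expected = h(self.entries[left][0] + self.entries[right][0])
        if expected != self.entries[parent][0]:
            return False  # children do not match the parent's hash
        self.entries[left][1] = True
        self.entries[right][1] = True
        return True
```

Refusing to load a node that is already cached mirrors the rule, stated later in the deck, that a node may occupy at most one cache line.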
Design:
Hash Tree Cache - Efficiency

• Checking leaf 33 requires 10 node loads for a cold cache on
  this toy example (38 loads on the real FPGA tree)
• Remember the root is always loaded in the cache

                                         1

                                 2           3

                             4       5

                    8            9

          16            17

     32        33
Design:
Hash Tree Cache - Efficiency

• Checking leaf 38 requires only 4 node loads, because 9 is already in
  the cache and verified
• Server can predict client requests and manage cache for
  high performance
                                                    1

                                      2                 3

                             4                  5

                    8                  9

          16            17       18        19

     32        33                     38        39
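
The load counts on these two slides can be reproduced with a short sketch. It assumes heap-style node numbering (the sibling of node n is n XOR 1, the parent is n // 2) and that nodes are always loaded and verified in sibling pairs; the function name `nodes_to_load` is hypothetical.

```python
def nodes_to_load(leaf, verified):
    """Nodes the server must load so that `leaf` can be verified.

    `verified` is the set of node numbers already cached and verified;
    it must contain the root (node 1), which is always in the cache.
    """
    if 1 not in verified:
        raise ValueError("the root must always be in the cache")
    loads = []
    node = leaf
    while node not in verified:
        loads += [node, node ^ 1]  # the node and its sibling
        node //= 2                 # move up to the parent
    return loads


cache = {1}                               # cold cache: only the root
first = nodes_to_load(33, cache)          # 10 loads
cache.update(first)
second = nodes_to_load(38, cache)         # 4 loads: 38, 39, 19, 18
```

The second lookup stops at node 9, which the first lookup already brought into the cache.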
Results
Results: Server Prototype
Results: Normal Operation
Results: FPGA Board, Normal Operation
Results: Attack Does Not Impact User
Results: FPGA Board, Under Attack
Results: Performance Block Diagram

         Read / Write 1MB Data Block to Disk

                Limit: Disk I/O Speed



                Hash 1MB Data Block

Limit: Hash Engine Speed      Limit: FPGA Data Bus



           Load & Verify Hash Tree Nodes

Limit: Hash Engine Speed       Limit: Dependencies



           Update Hash Tree (Writes Only)

Limit: Hash Engine Speed       Limit: Dependencies



                HMAC (Sign) Result

              Limit: Hash Engine Speed
Results: Prototype Performance (est.)

Disk I/O             Throughput
7,200 RPM HDD           70 MB/s
10,000 RPM HDD         100 MB/s
15,000 RPM HDD         130 MB/s
SSD                    250 MB/s

1 MB = 1 block
Results: Prototype Performance (est.)

Operation                Throughput
Block Hash                 800 MB/s
Pipelined Block Hash     3,200 MB/s

1 MB = 1 block

Transport                Throughput
PCI Express x16          4,096 MB/s
SATA II                    384 MB/s
PCI Express x1             250 MB/s
Ethernet                   125 MB/s
Results: Prototype Performance (est.)

Operation                    Throughput
Tree Node Hash                 1.25 M/s
Pipelined Tree Node Hash        5.0 M/s
Tree Operations                62.5 k/s
Optimized Tree Operations       2.5 M/s

1 MB = 1 block

Transport                    Throughput
PCI Express x16              4,096 MB/s
SATA II                        384 MB/s
PCI Express x1                 250 MB/s
Ethernet                       125 MB/s
Results: Prototype Performance (est.)

Operation                    Throughput
Tree Node Hash                 1.25 M/s
Pipelined Tree Node Hash        5.0 M/s
Tree Operations                62.5 k/s

1 MB = 1 block

Transport                    Throughput
PCI Express x16              4,096 MB/s
SATA II                        384 MB/s
PCI Express x1                 250 MB/s
Ethernet                       125 MB/s
Results: Prototype Performance (est.)

Operation                Throughput
Node HMAC                  1.25 M/s

1 MB = 1 block

Transport                Throughput
PCI Express x16          4,096 MB/s
SATA II                    384 MB/s
PCI Express x1             250 MB/s
Ethernet                   125 MB/s
Results: Performance Block Diagram

Read / Write 1MB Data Block to Disk   (Limit: Disk I/O Speed)
Hash 1MB Data Block                   (Limits: Hash Engine Speed, FPGA Data Bus)
Load & Verify Hash Tree Nodes         (Limits: Hash Engine Speed, Dependencies)
Update Hash Tree (Writes Only)        (Limits: Hash Engine Speed, Dependencies)
HMAC (Sign) Result                    (Limit: Hash Engine Speed)

• Steps are performed in parallel (pipelined), because they are in
  different system components
• However, the slowest step is the bottleneck for the entire system
• Each step can be made faster by adding more hardware (e.g. more
  disks), assuming cache policies can scale up
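
The bottleneck argument can be made concrete with a small sketch. It takes three of the estimated figures from the preceding slides (one 10,000 RPM HDD, the non-pipelined block hash, and PCI Express x16) and deliberately leaves out the tree and HMAC steps, whose rates are given in operations per second rather than MB/s. The function name `bottleneck` and the stage labels are illustrative.

```python
def bottleneck(stage_mb_per_s):
    """Return (stage name, throughput) of the slowest pipelined stage."""
    name = min(stage_mb_per_s, key=stage_mb_per_s.get)
    return name, stage_mb_per_s[name]


# Estimated figures from the slides, in MB/s (1 MB = 1 block).
stages = {
    "disk (one 10,000 RPM HDD)": 100,
    "block hash": 800,
    "FPGA data bus (PCI Express x16)": 4096,
}

slowest = bottleneck(stages)
```

With a single disk, the disk is the slowest stage in this simplified model, so the pipeline as a whole runs at the disk's rate.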
Results: Ping-Pong Workload
[Plot: Block (0 to 10) vs. Time (0 to 20)]

• Typical collaboration scenario
• Real-Life
  – Google Docs
  – Facebook Messages
  – Dropbox
• Straight-up LRU shines here
Results: Photo Gallery Workload
[Plot: Block (0 to 10) vs. Time (0 to 20)]

• Modeled after data on photo applications
• Real-Life
  – Facebook’s #1 Feature
  – Google Picasa
  – Flixter
• Special policy inspired by Facebook Haystack classifies photos, loads
  cache predictively
Results: Map-Reduce Workload
[Plot: Block (0 to 30) vs. Time (0 to 10)]

• Index-generating Map-Reduce
• Real-Life
  – Google PageRank
  – Facebook friend graph (EdgeRank)
• Special policy that takes advantage of Map-Reduce access pattern
Results: Cache Hit Rates
[Plot: cache hit rate (0.5 to 1) for the Spec LRU, Haystack, and MR-Aware policies]

• Applications: 2 users collaborating on a file (ping-pong), photo
  gallery browsing, Map-Reduce job
• Cache policies: Speculative Least-Recently Used, Facebook Haystack’s
  policy optimized for caching, policy optimized for Map-Reduce access
  patterns
• Conclusion: no policy works well on all applications, so app server
  must drive policy
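
To illustrate why the policy has to match the access pattern, here is a sketch of a plain LRU cache measured on two made-up traces. This is ordinary LRU, not the Speculative LRU policy from the talk, and both traces are hypothetical rather than the measured workloads.

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Fraction of accesses in `trace` served from an LRU cache."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(trace)


ping_pong = [3, 4] * 4        # two users alternating between two blocks
scan = list(range(8))         # one sequential pass, no reuse

ping_pong_rate = lru_hit_rate(ping_pong, capacity=2)  # 0.75
scan_rate = lru_hit_rate(scan, capacity=2)            # 0.0
```

The same policy does well on the alternating trace and gets no hits on the sequential scan, which is the kind of gap an application-driven policy is meant to close.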
Results: Protocol Overhead

  • Client – Server Bandwidth overhead: 0.002%
      – Operation: 1 HMAC (20 bytes) per 1MB = 0.002%
      – Handshake: extra secret exchange piggybacks on SSL: 5%


  • Latency overhead (1 client): 4%
      – Without security: 8.2ms / request
      – With security: 8.5ms / request
      – Latency overhead = the latency of a very fast Internet hop


  • No throughput overhead (N-clients)
      – With or without security: 100MB/s
      – Need 40 HDDs to saturate PCI-E x16, 52 HDDs to saturate FPGA
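
The bandwidth and latency percentages above follow directly from the quoted numbers; a quick check, with variable names chosen for illustration:

```python
HMAC_BYTES = 20          # one 160-bit HMAC per operation
BLOCK_BYTES = 1 << 20    # 1 MB block

# Bandwidth overhead: 20 bytes per 1 MB block, about 0.002%.
bandwidth_overhead_pct = 100 * HMAC_BYTES / BLOCK_BYTES

# Latency overhead: 8.2 ms without security vs. 8.5 ms with, about 4%.
latency_overhead_pct = 100 * (8.5 - 8.2) / 8.2
```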



MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY
Results: Protocol Overhead

• Protocol is simple
  enough to implement
  on browser side
  – Chrome
  – Firefox
  – Internet Explorer 10


• Easy integration in
  existing Web
  applications

• End-to-end security
Thank You!



  Questions?
Other Applications


  • FPGA can be used to load user-specified circuits and
    perform arbitrary computation with security guarantees

  • Applications: encrypted image search, financial calculations

  • Potential applications in highly regulated industries, e.g.
    medical record keeping and processing, secure financial
    services




Secure Computation:
   Overview
   Task → Untrusted computation: VM image     → Cloud Machine: VM image on CPU cores
        → Trusted computation:   Circuit spec → Cloud Machine: Circuit spec on FPGA LUTs




     • Most code is untrusted, executes in a VM

     • Trusted code is broken up into kernels which become
       circuits deployed onto an FPGA

     • If efficiency is not an issue, deploy a processor on the
       FPGA, execute software securely
MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY             6/9/2011
Secure Computation: Challenge

• Multi-tenancy is the key to the cloud’s cost effectiveness
• FPGA can host different applications running in parallel
• Challenge: isolation between applications, just like a hypervisor

[Diagram: a VM Hypervisor hosts the Client 1, Client 2, and Client 3 VMs;
 over PCI Express, an FPGA controller hosts the Client 1, Client 2, and
 Client 3 Applications]
Design:
FPGA Boot Sequence
                         random nonce

                         PKcard + Manufacturer Certificate

Check certificate against e-fuses
Check PKcard against certificate
                                    PUFsyndrome + SignPKcard(PUFsyndrome)

Compute SKfpga from PUFsyndrome

                         Root Hash + SignPKcard(nonce || Root Hash)

Verify signature
                        EncSKfpga(SKcard) + MACSKfpga(nonce || SKcard)

Verify MAC
Design:
Client Trust Model
• Each FPGA – NVRAM pair has an Endorsement Key (EK)
• Manufacturer certifies the public EK
• Client uses the public EK to encrypt an HMAC key, which
  becomes its shared secret with the trusted hardware


   Manufacturer        → sign →                 Endorsement Certificate (PubEK)
   Client              → verify →               Endorsement Certificate
   Client              → generate →             HMAC key
   HMAC key            → encrypt with PubEK →   Encrypted HMAC key
   Encrypted HMAC key  → decrypt with PrivEK →  HMAC key
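
Once the HMAC key is shared, the client can check that a response really came from the trusted hardware. The sketch below covers only that step; the public-key encryption of the key is omitted, and the message layout (nonce, block number, block hash) and the function names are assumptions for illustration, not the talk's wire format.

```python
import hashlib
import hmac
import os


def sign_read(key, nonce, block_number, block):
    """Trusted-hardware side: HMAC over nonce, block number, block hash."""
    message = (
        nonce
        + block_number.to_bytes(8, "big")
        + hashlib.sha1(block).digest()
    )
    return hmac.new(key, message, hashlib.sha1).digest()  # 20 bytes


def verify_read(key, nonce, block_number, block, tag):
    """Client side: recompute the HMAC and compare in constant time."""
    expected = sign_read(key, nonce, block_number, block)
    return hmac.compare_digest(expected, tag)


key = os.urandom(20)    # the shared HMAC key
nonce = os.urandom(16)  # fresh per request, so old responses cannot be replayed
tag = sign_read(key, nonce, 7, b"current block contents")
```

Because the nonce is fresh for every request, a server that replays an old block together with its old tag fails the check.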
Design:
Hash Tree Security

1. Computationally infeasible to come up with a block B1’ such that
   B1 ≠ B1’ but h(B1) = h(B1’)

2. Computationally infeasible to come up with a node hash h1’ such that
   h1 ≠ h1’ but h(h1||h2) = h(h1’||h2)

Therefore, the root hash authenticates the entire contents of
the tree.
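
A short sketch of how the two properties are used in practice: a block is checked by recomputing the path from its leaf to the root. The 0-based leaf index, the bottom-up sibling list, and SHA-1 as the hash are assumptions made for this example.

```python
import hashlib


def h(data):
    return hashlib.sha1(data).digest()  # assumed 160-bit hash


def verify_block(block, index, siblings, root):
    """Recompute the root from a block and its sibling hashes (bottom-up).

    `index` is the 0-based leaf position; an even index is a left child.
    """
    node = h(block)
    for sibling in siblings:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Four-block tree from the earlier slide: h5 = h(h1||h2), h6 = h(h3||h4).
h1, h2, h3, h4 = (h(b) for b in (b"B1", b"B2", b"B3", b"B4"))
h5, h6 = h(h1 + h2), h(h3 + h4)
root = h(h5 + h6)

ok = verify_block(b"B3", 2, [h4, h5], root)            # genuine block
tampered = verify_block(b"B3 (old)", 2, [h4, h5], root)  # stale block
```

A stale or modified block changes the leaf hash, and by the two properties above that change propagates all the way to the root.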
Design:
FPGA Boot Sequence Security

• Server OS transfers messages between FPGA and Trusted
  Memory → untrusted channel

• FPGA authenticates Trusted Memory using Manufacturer
  Certificate, whose public key is burned into FPGA’s e-fuses

• Trusted Memory authenticates FPGA using its Physically
  Unclonable Function (PUF)

• At manufacturing time, FPGA is paired with memory chip

• FPGA can be paired with new memory chip if necessary
Design:
Hash Tree Cache Security

• Server OS responsible for loading and verifying tree nodes

• Parent node hash verifies children nodes

• Reading a block requires the block’s leaf to be verified

• Writing a block requires the path from the block’s leaf to the
  root to be loaded and verified

• A node can be loaded in at most one cache line, to prevent
  replay attacks using stale node hashes

Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Trusted Cloud Storage Tech Talk

  • 1. Secure Cloud Storage and Computing Using Reconfigurable Hardware Victor Costan (龍望), Hsin-Jung Yang (楊昕蓉), Srini Devadas, Nickolai Zeldovich
  • 3. Cloud Computing: Dreams and Reality • The Cloud: Ideal Picture • The Cloud: Reality
  • 4. Cloud Storage: Attack Vectors • Hypervisor Bugs • State Manipulation • Hardware Attacks
  • 11. Spot the Differences: Name, Relationship Status
  • 12. Why It Matters • We rely on fresh data to make decisions – Google searches – Facebook profiles – Twitter, LinkedIn • Outdated data has a big impact on users – Wrong profile information: confusion, embarrassment – Old search results: bad business decisions, embarrassment – Old document versions: costly business decisions, regulatory issues
  • 14. Design: Cloud Storage API • Block Device – Fixed block size (1 MB) – Write(block number, block) – Read(block number) → block • Easy to reason about the security • File systems operate on top of this abstraction • Figure: disk divided into 1MB blocks (B1, B2, B3, B4)
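The block-device API on slide 14 can be sketched in a few lines of Python. This is only an illustrative in-memory stand-in: the class name `InMemoryBlockDevice`, the zero-filled default for unwritten blocks, and the error handling are assumptions, not details from the talk.

```python
BLOCK_SIZE = 1 << 20  # 1 MB, the fixed block size from the slide


class InMemoryBlockDevice:
    """Toy stand-in for the cloud block device (hypothetical class name)."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._blocks = {}  # sparse store; unwritten blocks read back as zeros

    def write(self, block_number: int, block: bytes) -> None:
        """Write(block number, block): replace one whole block."""
        self._check(block_number)
        if len(block) != BLOCK_SIZE:
            raise ValueError("block must be exactly BLOCK_SIZE bytes")
        self._blocks[block_number] = bytes(block)

    def read(self, block_number: int) -> bytes:
        """Read(block number) -> block."""
        self._check(block_number)
        return self._blocks.get(block_number, bytes(BLOCK_SIZE))

    def _check(self, block_number: int) -> None:
        if not 0 <= block_number < self.num_blocks:
            raise IndexError("block number out of range")


dev = InMemoryBlockDevice(num_blocks=4)
payload = b"\x01" * BLOCK_SIZE
dev.write(2, payload)
```

Because every operation touches exactly one fixed-size block, a security argument only has to cover two calls, which is presumably what the slide means by "easy to reason about".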
  • 15. Design: System Architecture • Diagram components: Client; FPGA / ASIC (Trusted); Secure NVRAM Chip; System Bus (Untrusted); Internet (Untrusted); CPU (Untrusted); Disk (Untrusted); RAM (Untrusted); Network Card (Untrusted)
  • 16. Design: Trusted Storage on Untrusted Disks • 160-bit hash in trusted memory authenticates 1TB disk • Root hash matches iff all blocks match • 20 levels • Nodes hash their children: Root Hash h7=h(h5||h6), h5=h(h1||h2), h6=h(h3||h4) • Leaves hash their blocks: h1=h(B1), h2=h(B2), h3=h(B3), h4=h(B4) • Disk divided into 1MB blocks: B1, B2, B3, B4
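The four-block tree on slide 16 can be reproduced with a short Python sketch. SHA-1 is an assumption here, chosen only because it produces the 160-bit digests the slide mentions; the function names are illustrative. A 1 TB disk split into 1 MB blocks has 2^20 leaves, which is where the 20 levels come from.

```python
import hashlib


def h(data: bytes) -> bytes:
    """160-bit hash; SHA-1 is an assumption, the slide only gives the size."""
    return hashlib.sha1(data).digest()


def root_hash(blocks):
    """Hash the leaves, then hash pairs of children until one node is left.

    Assumes a non-empty list whose length is a power of two, as in the figure.
    """
    level = [h(b) for b in blocks]            # h1=h(B1), h2=h(B2), ...
    while len(level) > 1:
        level = [h(level[i] + level[i + 1])   # parent = h(left || right)
                 for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"B1", b"B2", b"B3", b"B4"]
h1, h2, h3, h4 = (h(b) for b in blocks)
h7 = h(h(h1 + h2) + h(h3 + h4))               # the slide's root formula

print(root_hash(blocks) == h7)                                  # True
print(root_hash([b"B1", b"tampered", b"B3", b"B4"]) == h7)      # False
```

Changing any block changes its leaf hash and therefore every hash on the path to the root, so comparing against the trusted root detects the change.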
  • 17. Design: Hash Tree Caching • The FPGA caches hash tree nodes • The untrusted OS is free to choose the caching policy, for maximum performance • Figure: hash tree with nodes 1 to 7
      Node number  Hash                  Verified  Left child  Right child
      1            fabe3c05d8ba995af93e  Y         Y           N
      2            e6fc9bc13d624ace2394  Y         Y           Y
      4            53a81fc2dcc53e4da819  Y         N           N
      5            b2ce548dfa2f91d83ec6  Y         N           N
  • 18. Design: Hash Tree Cache • Server stores entire hash tree in RAM • FPGA has a cache that stores a subset of nodes • Server tells FPGA what nodes to store • Figure: hash tree with nodes 1 to 7, labeled "Cache management commands"
      Node  Hash   Verified
      1     fabe…  Y
      2     e6fc…  Y
      4     53a8…  Y
      5     b2ce…  Y
  • 19. Design: Hash Tree Cache - Load • Server tells the FPGA to load a node into a cache entry • The cache entry is unverified right after a load • Figure: cache contents before and after the load
      Node  Hash   Verified before  Verified after
      1     fabe…  Y                Y
      2     e6fc…  Y                Y
      4     53a8…  N                N
      5     b2ce…  (not cached)     N
  • 20. Design: Hash Tree Cache - Verify • Server tells the FPGA to use a node to verify its children • FPGA checks that parent’s hash matches children hashes • Figure: cache contents before and after the verify
      Node  Hash   Verified before  Verified after
      1     fabe…  Y                Y
      2     e6fc…  Y                Y
      4     53a8…  N                Y
      5     b2ce…  N                Y
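The load and verify commands from slides 19 and 20 might be modeled as below. This is a toy sketch, not the FPGA design: the class name `HashTreeCache`, the use of SHA-1, and the exceptions are assumptions. Node numbering follows the cache slides, where node n has children 2n and 2n+1 and the root is node 1.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()  # SHA-1 is an assumption (160-bit hash)


class HashTreeCache:
    """Toy model of the FPGA cache; node n has children 2n and 2n+1."""

    def __init__(self, root_hash: bytes) -> None:
        # The root comes from trusted memory, so it starts out verified.
        self.entries = {1: [root_hash, True]}  # node -> [hash, verified]

    def load(self, node: int, node_hash: bytes) -> None:
        """Store a hash supplied by the untrusted server; unverified at first."""
        if node in self.entries:
            raise ValueError("node is already cached")
        self.entries[node] = [node_hash, False]

    def verify(self, parent: int) -> None:
        """Use a verified parent to verify both of its cached children."""
        parent_hash, parent_ok = self.entries[parent]
        left, right = self.entries[2 * parent], self.entries[2 * parent + 1]
        if not parent_ok:
            raise ValueError("parent is not verified")
        if h(left[0] + right[0]) != parent_hash:
            raise ValueError("children do not match parent hash")
        left[1] = right[1] = True


# Build a tiny tree by hand: node 2 has children 4 and 5, root 1 has 2 and 3.
h4, h5, h3 = h(b"left"), h(b"right"), h(b"other")
h2 = h(h4 + h5)
root = h(h2 + h3)

cache = HashTreeCache(root)
cache.load(2, h2)
cache.load(3, h3)
cache.verify(1)        # the root vouches for nodes 2 and 3
cache.load(4, h4)
cache.load(5, h5)
cache.verify(2)        # node 2 vouches for nodes 4 and 5
```

The untrusted server chooses which nodes to load and when, but a forged hash can never reach the verified state, because `verify` only succeeds when the children hash to an already verified parent.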
  • 21. Design: Hash Tree Cache - Efficiency • Checking leaf 33 requires 10 node loads for a cold cache on this toy example (38 loads on the real FPGA tree) • Remember the root is always loaded in the cache • Figure: tree nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 32, 33
  • 22. Design: Hash Tree Cache - Efficiency • Checking leaf 38 requires only 4 node loads, because 9 is already in the cache and verified • Server can predict client requests and manage cache for high performance • Figure: tree nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 18, 19, 32, 33, 38, 39
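The load counts on the two efficiency slides can be checked with a small sketch. It assumes heap-style numbering (node n has children 2n and 2n+1) and that a node and its sibling are always loaded together, since the parent hashes both; the function name `nodes_to_load` is illustrative.

```python
def nodes_to_load(leaf, verified):
    """Return the nodes the server must load so that `leaf` can be verified.

    Walk up from the leaf; at each step the node and its sibling are needed,
    because the parent hashes both children. Stop at the first ancestor that
    is already verified. The root (node 1) must be in `verified`.
    """
    assert 1 in verified, "the root is always cached and verified"
    needed = []
    node = leaf
    while node not in verified:
        needed += [node, node ^ 1]   # node ^ 1 is the sibling in heap numbering
        node //= 2                   # move up to the parent
    return needed


cold = {1}                                   # cold cache: only the root
first = nodes_to_load(33, cold)
print(len(first))                            # 10

warm = cold | set(first)                     # assume all of them got verified
print(len(nodes_to_load(38, warm)))          # 4
```

The second lookup stops at node 9, which the first lookup already brought in as a sibling, so only nodes 38, 39, 19 and 18 are missing.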
  • 24. Results: System Architecture • Diagram components: Client; FPGA / ASIC (Trusted); Secure NVRAM Chip; System Bus (Untrusted); Internet (Untrusted); CPU (Untrusted); Disk (Untrusted); RAM (Untrusted); Network Card (Untrusted)
  • 28. Results: FPGA Board, Normal Operation
  • 29. Results: Attack Does Not Impact User
  • 30. Results: FPGA Board, Under Attack
  • 31. Results: Performance Block Diagram • Read / Write 1MB Data Block to Disk (Limit: Disk I/O Speed) • Hash 1MB Data Block (Limit: Hash Engine Speed; Limit: FPGA Data Bus) • Load & Verify Hash Tree Nodes (Limit: Hash Engine Speed; Limit: Dependencies) • Update Hash Tree, Writes Only (Limit: Hash Engine Speed; Limit: Dependencies) • HMAC (Sign) Result (Limit: Hash Engine Speed)
  • 33. Results: Prototype Performance (est.) • 1 MB = 1 block • Table shown beside the performance block diagram
      Disk I/O         Throughput
      7,200 RPM HDD    70 MB/s
      10,000 RPM HDD   100 MB/s
      15,000 RPM HDD   130 MB/s
      SSD              250 MB/s
  • 35. Results: Prototype Performance (est.) • 1 MB = 1 block • Tables shown beside the performance block diagram
      Operation             Throughput
      Block Hash            800 MB/s
      Pipelined Block Hash  3,200 MB/s
      Transport             Throughput
      PCI Express x16       4,096 MB/s
      SATA II               384 MB/s
      PCI Express x1        250 MB/s
      Ethernet              125 MB/s
  • 37. Results: Prototype Performance (est.) • 1 MB = 1 block • Tables shown beside the performance block diagram
      Operation                  Throughput
      Tree Node Hash             1.25 M/s
      Pipelined Tree Node Hash   5.0 M/s
      Tree Operations            62.5 k/s
      Optimized Tree Operations  2.5 M/s
      Transport                  Throughput
      PCI Express x16            4,096 MB/s
      SATA II                    384 MB/s
      PCI Express x1             250 MB/s
      Ethernet                   125 MB/s
  • 39. Results: Prototype Performance (est.) • 1 MB = 1 block • Tables shown beside the performance block diagram
      Operation                 Throughput
      Tree Node Hash            1.25 M/s
      Pipelined Tree Node Hash  5.0 M/s
      Tree Operations           62.5 k/s
      Transport                 Throughput
      PCI Express x16           4,096 MB/s
      SATA II                   384 MB/s
      PCI Express x1            250 MB/s
      Ethernet                  125 MB/s
  • 41. Results: Prototype Performance (est.) • 1 MB = 1 block • Tables shown beside the performance block diagram
      Operation        Throughput
      Node HMAC        1.25 M/s
      Transport        Throughput
      PCI Express x16  4,096 MB/s
      SATA II          384 MB/s
      PCI Express x1   250 MB/s
      Ethernet         125 MB/s
  • 42. Results: Performance Block Diagram • Steps are performed in parallel (pipelined), because they are in different system components • However, the slowest step is the bottleneck for the entire system • Each step can be made faster by adding more hardware (e.g. more disks), assuming cache policies can scale up • Diagram steps: Read / Write 1MB Data Block to Disk; Hash 1MB Data Block; Load & Verify Hash Tree Nodes; Update Hash Tree (Writes Only); HMAC (Sign) Result
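The bottleneck argument on slide 42 amounts to taking the minimum over the stage throughputs. The sketch below plugs in the non-pipelined estimates from the performance slides, converted to 1 MB blocks per second. Treating one tree operation and one HMAC as the cost of each block is a simplifying assumption, and the stage labels are illustrative.

```python
# Stage throughputs in 1 MB blocks per second, from the estimate slides.
# Assumption: each block needs exactly one tree operation and one HMAC.
stages = {
    "disk (10,000 RPM HDD)": 100,     # 100 MB/s
    "block hash": 800,                # 800 MB/s
    "tree operations": 62_500,        # 62.5 k/s
    "HMAC": 1_250_000,                # 1.25 M/s
}


def bottleneck(stages):
    """A pipeline runs at the rate of its slowest stage."""
    name = min(stages, key=stages.get)
    return name, stages[name]


print(bottleneck(stages))             # ('disk (10,000 RPM HDD)', 100)

# Adding hardware to the slowest stage moves the bottleneck elsewhere.
ten_disks = {**stages, "disk (10,000 RPM HDD)": 10 * 100}
print(bottleneck(ten_disks))          # ('block hash', 800)
```

With a single disk the security machinery is far from the limiting factor, which matches the later claim of no throughput overhead at 100 MB/s.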
  • 43. Results: Ping-Pong Workload • Typical collaboration scenario • Real-Life – Google Docs – Facebook Messages – Dropbox • Straight-up LRU shines here • Plot: Block (0 to 10) vs. Time (0 to 20)
  • 44. Results: Photo Gallery Workload • Modeled after data on photo applications • Real-Life – Facebook’s #1 Feature – Google Picasa – Flixter • Special policy inspired by Facebook Haystack classifies photos, loads cache predictively • Plot: Block (0 to 10) vs. Time (0 to 20)
  • 45. Results: Map-Reduce Workload • Index-generating Map-Reduce • Real-Life – Google PageRank – Facebook friend graph (EdgeRank) • Special policy that takes advantage of Map-Reduce access pattern • Plot: Block (0 to 30) vs. Time (0 to 10)
  • 46. Results: Cache Hit Rates • Applications: 2 users collaborating on a file (ping-pong), photo gallery browsing, Map-Reduce job • Cache policies: Speculative Least-Recently Used, Facebook Haystack’s policy optimized for caching, policy optimized for Map-Reduce access patterns • Conclusion: no policy works well on all applications, so app server must drive policy • Chart: hit rate (0.5 to 1) for Spec LRU, Haystack, MR-Aware
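To illustrate why no single policy fits every workload, here is a minimal hit-rate simulation for plain LRU (not the speculative variant from the talk). The two traces are hypothetical and only mimic the shape of a ping-pong and a scan-like access pattern; they are not the measured workloads, and the simulation tracks blocks rather than hash tree nodes.

```python
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Replay a block-access trace against an LRU cache; return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(trace)


# Hypothetical traces, not the workloads measured in the talk.
ping_pong = [3, 7] * 10        # two users alternate between two blocks
scan = list(range(20))         # every block is touched exactly once

print(lru_hit_rate(ping_pong, capacity=4))   # 0.9
print(lru_hit_rate(scan, capacity=4))        # 0.0
```

LRU does well when the same few blocks recur, and poorly on a one-pass scan, where a policy that knows the access pattern in advance would have to prefetch instead.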
  • 47. Results: Protocol Overhead • Client – Server Bandwidth overhead: 0.002% – Operation: 1 HMAC (20 bytes) per 1MB = 0.002% – Handshake: extra secret exchange piggybacks on SSL: 5% • Latency overhead (1 client): 4% – Without security: 8.2ms / request – With security: 8.5ms / request – Latency overhead = the latency of a very fast Internet hop • No throughput overhead (N-clients) – With or without security: 100MB/s – Need 40 HDDs to saturate PCI-E x16, 52 HDDs to saturate FPGA MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY
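The bandwidth and latency percentages on slide 47 follow from simple arithmetic on the figures it quotes. The variable names below are illustrative, and 1 MB is taken as 2^20 bytes.

```python
HMAC_BYTES = 20
BLOCK_BYTES = 1 << 20                      # 1 MB block, taken as 2**20 bytes

bandwidth_overhead = HMAC_BYTES / BLOCK_BYTES
latency_overhead = (8.5 - 8.2) / 8.2       # ms per request, with vs. without

print(f"{bandwidth_overhead:.3%}")         # 0.002%
print(f"{latency_overhead:.0%}")           # 4%
```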
  • 48. Results: Protocol Overhead • Protocol is simple enough to implement on browser side – Chrome – Firefox – Internet Explorer 10 • Easy integration in existing Web applications • End-to-end security
  • 49. Thank You! Questions?
  • 50. Other Applications • FPGA can be used to load user-specified circuits and perform arbitrary computation with security guarantees • Applications: encrypted image search, financial calculations • Potential applications in highly regulated industries, e.g. medical record keeping and processing, secure financial services
  • 51. Secure Computation: Overview • Untrusted computation: VM image → CPU cores • Trusted computation: Circuit spec → FPGA LUTs • Diagram labels: Cloud Task, VM image, Circuit spec, Machine • Most code is untrusted, executes in a VM • Trusted code is broken up into kernels which become circuits deployed onto an FPGA • If efficiency is not an issue, deploy a processor on the FPGA, execute software securely • 6/9/2011
  • 52. Secure Computation: Challenge • Multi-tenancy is the key to the cloud’s cost effectiveness • FPGA can host different applications running in parallel • Challenge: isolation between applications, just like a hypervisor • Diagram: VM Hypervisor with Client 1 VM, Client 2 VM, Client 3 VM; PCI Express; FPGA controller with Client 1 Application, Client 2 Application, Client 3 Application
  • 54. Design: FPGA Boot Sequence • random nonce • PKcard + Manufacturer Certificate – Check certificate against e-fuses – Check PKcard against certificate • PUFsyndrome + Sign_PKcard(PUFsyndrome) – Compute SKfpga from PUFsyndrome • Root Hash + Sign_PKcard(nonce || Root Hash) – Verify signature • Enc_SKfpga(SKcard) + MAC_SKfpga(nonce || SKcard) – Verify MAC
  • 55. Design: Client Trust Model • Each FPGA – NVRAM pair has an Endorsement Key (EK) • Manufacturer certifies the public EK • Client uses the public EK to encrypt an HMAC key, which becomes its shared secret with the trusted hardware • Diagram: Manufacturer signs the Endorsement Certificate and the Client verifies it; the Client generates an HMAC key and encrypts it with PubEK; the Encrypted HMAC key is decrypted with PrivEK to recover the HMAC key
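Once the client and the trusted hardware share an HMAC key, each result can be authenticated with a 20-byte tag. The sketch below covers only that step and skips the public-key handshake. The message layout (nonce, block number, block hash), the per-request nonce, the use of HMAC-SHA1, and the function names are all assumptions made for illustration, not the protocol from the talk.

```python
import hashlib
import hmac
import os


def sign_result(key: bytes, nonce: bytes, block_number: int, block: bytes) -> bytes:
    """Trusted-hardware side: HMAC over the nonce, block number and block hash.

    The message layout is an assumption, not the talk's wire format.
    """
    message = nonce + block_number.to_bytes(8, "big") + hashlib.sha1(block).digest()
    return hmac.new(key, message, hashlib.sha1).digest()   # 20-byte tag


def check_result(key: bytes, nonce: bytes, block_number: int,
                 block: bytes, tag: bytes) -> bool:
    """Client side: recompute the HMAC and compare in constant time."""
    expected = sign_result(key, nonce, block_number, block)
    return hmac.compare_digest(expected, tag)


key = os.urandom(20)       # stands in for the shared HMAC key from the handshake
nonce = os.urandom(16)     # fresh per request, so an old answer cannot be replayed
block = b"current contents"
tag = sign_result(key, nonce, 7, block)

print(check_result(key, nonce, 7, block, tag))              # True
print(check_result(key, nonce, 7, b"stale contents", tag))  # False
```

Binding the tag to a fresh nonce is one way to make a replayed response fail verification on the client, even if the stale data was validly signed in the past.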
  • 56. Design: Hash Tree Security 1. Impossible to come up with a block B1’ such that B1 ≠ B1’ but h(B1) = h(B1’) 2. Impossible to come up with a node hash h1’ such that h1 ≠ h1’ but h(h1||h2) = h(h1’||h2) Therefore, the root hash authenticates the entire contents of the tree.
  • 57. Design: FPGA Boot Sequence Security • Server OS transfers messages between FPGA and Trusted Memory → untrusted channel • FPGA authenticates Trusted Memory using Manufacturer Certificate, whose public key is burned into FPGA’s e-fuses • Trusted Memory authenticates FPGA using its Physically Unclonable Function (PUF) • At manufacturing time, FPGA is paired with memory chip • FPGA can be paired with new memory chip if necessary
  • 58. Design: Hash Tree Cache Security • Server OS responsible for loading and verifying tree nodes • Parent node hash verifies children nodes • Reading a block requires the block’s leaf to be verified • Writing a block requires the path from the block’s leaf to the root to be loaded and verified • A node can be loaded in at most one cache line, to prevent replay attacks using stale node hashes
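The write rule on slide 58 (the path from the leaf to the root must be loaded and verified) exists because a write changes every hash on that path. A minimal sketch of the update, assuming SHA-1, heap-style numbering with the root at node 1, and a plain dictionary in place of the FPGA cache:

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()  # SHA-1 is an assumption (160-bit hash)


def update_leaf(tree, leaf, new_block):
    """Write path: rehash the leaf, then every ancestor up to the root.

    `tree` maps node number -> hash, with node n's children at 2n and 2n+1.
    Assumes the whole path and its siblings are already loaded and verified.
    """
    tree[leaf] = h(new_block)
    node = leaf // 2
    while node >= 1:
        tree[node] = h(tree[2 * node] + tree[2 * node + 1])
        node //= 2
    return tree[1]   # new root hash, to be kept in trusted memory


# Four-block tree: leaves are nodes 4..7, the root is node 1.
tree = {n: h(b"B%d" % (n - 3)) for n in range(4, 8)}
for n in (3, 2, 1):
    tree[n] = h(tree[2 * n] + tree[2 * n + 1])

old_root = tree[1]
old_sibling_subtree = tree[3]
new_root = update_leaf(tree, 5, b"B2 v2")
print(new_root != old_root)              # True
```

Only the nodes on the path (5, 2 and 1 here) change; the sibling subtree under node 3 is reused as-is, which is why the siblings must be verified before the write.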