SlideShare una empresa de Scribd logo
1 de 8
AES implementation
on FPGA
From the FASTEST to the SMALLEST
By
Tim Good and Mohammed Benaissa
Presented By Leonardo Antichi
The AIM
• Show two new FPGA designs for AES implementation:
• Faster & continuous throughput (25Gbps)
• Smallest Area (124 slices)
• Both support a 128-bit key
• Demonstrate the advantage of an FPGA specific
optimisation over an ASIC number-of-gates approach
showing the speed improvement made thanks to the loop
unrolled design
The Fastest
• Key Feature:
• 25 Gbps on a Xilinx
Spartan-III
• Key may be periodically
changed without loss
of throughput
• Operating mode may
be changed between
encryption and
decryption at will
• How:
• Specific FPGA pipeline
placement (deep cuts)
• Parallel loop-unrolled
design
• SubBytes operation in
computational form
(rather than a “SBox”)
• Separate clock domain
for key expansion
The Fastest
• Key Feature:
• 25 Gbps on a Xilinx
Spartan-III
• Key may be periodically
changed without loss of
throughput
• Operating mode may be
changed between encryption
and decryption at will
• How:
• Specific FPGA pipeline
placement (deep cuts)
• Parallel loop-unrolled design
• SubBytes operation in
computational form (rather
than a “SBox”)
• Separate clock domain for
key expansion
Fig. 1. Block diagram for each middle round Fig. 2. Block diagram for final round
The Smallest
• Key Feature:
• 124 slices only (60% of the
Xilinx Spartan-II device)
• 2,2 Mbps throughput
• How:
• Specific Processor
Architecture using 8-bit
data-path (ASIP)
• Subroutines and looping
• Pipelined design
permitting execution of a
new instruction every
cycle
• 360 bits of RAM
The Smallest
• Key Feature:
• 124 slices only (60% of the
Xilinx Spartan-II device)
• 2,2 Mbps throughput
• How:
• Specific Processor
Architecture using 8-bit
data-path (ASIP)
• Subroutines and looping
• Pipelined design permitting
execution of a new
instruction every cycle
• 360 bits of RAM
Fig. 3. ASIP Architecture
S-Box, 42
Fig. 4. Slice utilization versus design unit
Conclusion
Fig. 5. Throughput versus area for the different FPGA designs
Thank you
For your attention

Más contenido relacionado

La actualidad más candente

LEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGATO project
 
“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...
“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...
“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...Edge AI and Vision Alliance
 
Hyperscan - Mohammad Abdul Awal
Hyperscan - Mohammad Abdul AwalHyperscan - Mohammad Abdul Awal
Hyperscan - Mohammad Abdul Awalharryvanhaaren
 
DPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy HarveyDPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy Harveyharryvanhaaren
 
The Microarchitecure Of FPGA Based Soft Processor
The Microarchitecure Of FPGA Based Soft ProcessorThe Microarchitecure Of FPGA Based Soft Processor
The Microarchitecure Of FPGA Based Soft ProcessorDeepak Tomar
 
TMS20DM8148 Embedded Linux Session II
TMS20DM8148 Embedded Linux Session IITMS20DM8148 Embedded Linux Session II
TMS20DM8148 Embedded Linux Session IINEEVEE Technologies
 
HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...
HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...
HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...Linaro
 
NXP i.MX6 Multi Media Processor & Peripherals
NXP i.MX6 Multi Media Processor & PeripheralsNXP i.MX6 Multi Media Processor & Peripherals
NXP i.MX6 Multi Media Processor & PeripheralsNEEVEE Technologies
 
0 foundation update__final - Mendy Furmanek
0 foundation update__final - Mendy Furmanek0 foundation update__final - Mendy Furmanek
0 foundation update__final - Mendy FurmanekYutaka Kawai
 
TIAD 2016 : Network automation with Ansible and OpenConfig/YANG
TIAD 2016 : Network automation with Ansible and OpenConfig/YANGTIAD 2016 : Network automation with Ansible and OpenConfig/YANG
TIAD 2016 : Network automation with Ansible and OpenConfig/YANGThe Incredible Automation Day
 
Open Network OS Overview as of 2015/10/16
Open Network OS Overview as of 2015/10/16Open Network OS Overview as of 2015/10/16
Open Network OS Overview as of 2015/10/16Kentaro Ebisawa
 
DPDK Acceleration with Arkville
DPDK Acceleration with ArkvilleDPDK Acceleration with Arkville
DPDK Acceleration with ArkvilleShepard Siegel
 
Atf 3 q15-1 - introduction
Atf 3 q15-1 - introductionAtf 3 q15-1 - introduction
Atf 3 q15-1 - introductionMason Mei
 
Design and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test ViewDesign and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test ViewODSA Workgroup
 
ONAP - Open Network Automation Platform
ONAP - Open Network Automation PlatformONAP - Open Network Automation Platform
ONAP - Open Network Automation PlatformAtul Pandey
 

La actualidad más candente (20)

Building a Router
Building a RouterBuilding a Router
Building a Router
 
LEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous HardwareLEGaTO Heterogeneous Hardware
LEGaTO Heterogeneous Hardware
 
“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...
“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...
“Flexible Machine Learning Solutions with Lattice FPGAs,” a Presentation from...
 
Hyperscan - Mohammad Abdul Awal
Hyperscan - Mohammad Abdul AwalHyperscan - Mohammad Abdul Awal
Hyperscan - Mohammad Abdul Awal
 
DPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy HarveyDPDK Architecture Musings - Andy Harvey
DPDK Architecture Musings - Andy Harvey
 
Ppt quick logic
Ppt quick logicPpt quick logic
Ppt quick logic
 
The Microarchitecure Of FPGA Based Soft Processor
The Microarchitecure Of FPGA Based Soft ProcessorThe Microarchitecure Of FPGA Based Soft Processor
The Microarchitecure Of FPGA Based Soft Processor
 
TMS20DM8148 Embedded Linux Session II
TMS20DM8148 Embedded Linux Session IITMS20DM8148 Embedded Linux Session II
TMS20DM8148 Embedded Linux Session II
 
HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...
HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...
HKG15-506: Comcast - Lessons learned from migrating the RDK code base to the ...
 
iCAM
iCAMiCAM
iCAM
 
PLB
PLBPLB
PLB
 
NXP i.MX6 Multi Media Processor & Peripherals
NXP i.MX6 Multi Media Processor & PeripheralsNXP i.MX6 Multi Media Processor & Peripherals
NXP i.MX6 Multi Media Processor & Peripherals
 
0 foundation update__final - Mendy Furmanek
0 foundation update__final - Mendy Furmanek0 foundation update__final - Mendy Furmanek
0 foundation update__final - Mendy Furmanek
 
TIAD 2016 : Network automation with Ansible and OpenConfig/YANG
TIAD 2016 : Network automation with Ansible and OpenConfig/YANGTIAD 2016 : Network automation with Ansible and OpenConfig/YANG
TIAD 2016 : Network automation with Ansible and OpenConfig/YANG
 
Open Network OS Overview as of 2015/10/16
Open Network OS Overview as of 2015/10/16Open Network OS Overview as of 2015/10/16
Open Network OS Overview as of 2015/10/16
 
DPDK Acceleration with Arkville
DPDK Acceleration with ArkvilleDPDK Acceleration with Arkville
DPDK Acceleration with Arkville
 
Atf 3 q15-1 - introduction
Atf 3 q15-1 - introductionAtf 3 q15-1 - introduction
Atf 3 q15-1 - introduction
 
Router3
Router3Router3
Router3
 
Design and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test ViewDesign and Testing Challenges for Chiplet Based Design: Assembly and Test View
Design and Testing Challenges for Chiplet Based Design: Assembly and Test View
 
ONAP - Open Network Automation Platform
ONAP - Open Network Automation PlatformONAP - Open Network Automation Platform
ONAP - Open Network Automation Platform
 

Similar a AES Implementation on FPGA

Field programmable Gate Arrays Chapter 6.pdf
Field programmable Gate Arrays Chapter 6.pdfField programmable Gate Arrays Chapter 6.pdf
Field programmable Gate Arrays Chapter 6.pdfffwwx10
 
FPGA Selection Methodology for Real time projects
FPGA Selection Methodology for Real time projectsFPGA Selection Methodology for Real time projects
FPGA Selection Methodology for Real time projectsKrishna Gaihre
 
Synopsys User Group Presentation
Synopsys User Group PresentationSynopsys User Group Presentation
Synopsys User Group Presentationemlawgr
 
CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)
CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)
CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)byteLAKE
 
2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf
2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf
2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdfssuser866937
 
VLSI design Dr B.jagadeesh UNIT-5.pptx
VLSI design Dr B.jagadeesh   UNIT-5.pptxVLSI design Dr B.jagadeesh   UNIT-5.pptx
VLSI design Dr B.jagadeesh UNIT-5.pptxjagadeesh276791
 
S2C China ICCAD 2010 Presentation
S2C China ICCAD 2010 PresentationS2C China ICCAD 2010 Presentation
S2C China ICCAD 2010 Presentationsrpollock
 
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...Cesar Maciel
 
Vhdl Project List - Verilog Projects
Vhdl Project List - Verilog Projects Vhdl Project List - Verilog Projects
Vhdl Project List - Verilog Projects E2MATRIX
 
Cpld and fpga mod vi
Cpld and fpga   mod viCpld and fpga   mod vi
Cpld and fpga mod viAgi George
 
Basic FPGA Architecture, Virtex CLB IO blocks
Basic FPGA Architecture, Virtex CLB IO blocksBasic FPGA Architecture, Virtex CLB IO blocks
Basic FPGA Architecture, Virtex CLB IO blocksVenkataramanLakshmin1
 

Similar a AES Implementation on FPGA (20)

HiPEAC-Keynote.pptx
HiPEAC-Keynote.pptxHiPEAC-Keynote.pptx
HiPEAC-Keynote.pptx
 
nios.ppt
nios.pptnios.ppt
nios.ppt
 
Introduction to EDA Tools
Introduction to EDA ToolsIntroduction to EDA Tools
Introduction to EDA Tools
 
SoC FPGA Technology
SoC FPGA TechnologySoC FPGA Technology
SoC FPGA Technology
 
Field programmable Gate Arrays Chapter 6.pdf
Field programmable Gate Arrays Chapter 6.pdfField programmable Gate Arrays Chapter 6.pdf
Field programmable Gate Arrays Chapter 6.pdf
 
FPGA Selection Methodology for Real time projects
FPGA Selection Methodology for Real time projectsFPGA Selection Methodology for Real time projects
FPGA Selection Methodology for Real time projects
 
VF360 OpenVPX Board w. Altera Stratix and TI KeyStone DSP
VF360 OpenVPX Board w. Altera Stratix and TI KeyStone DSPVF360 OpenVPX Board w. Altera Stratix and TI KeyStone DSP
VF360 OpenVPX Board w. Altera Stratix and TI KeyStone DSP
 
Synopsys User Group Presentation
Synopsys User Group PresentationSynopsys User Group Presentation
Synopsys User Group Presentation
 
Pc 104 express w. virtex 5-2014_5
Pc 104 express w. virtex 5-2014_5Pc 104 express w. virtex 5-2014_5
Pc 104 express w. virtex 5-2014_5
 
CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)
CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)
CFD acceleration with FPGA (byteLAKE's presentation from PPAM 2019)
 
Smart logic
Smart logicSmart logic
Smart logic
 
2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf
2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf
2022-Cauldron-If-Conversion-for-a-Partially-Predicated-VLIW-Architecture.pdf
 
An introduction to FPGAs and Their MPSOCs
An introduction to FPGAs and Their MPSOCs  An introduction to FPGAs and Their MPSOCs
An introduction to FPGAs and Their MPSOCs
 
VLSI design Dr B.jagadeesh UNIT-5.pptx
VLSI design Dr B.jagadeesh   UNIT-5.pptxVLSI design Dr B.jagadeesh   UNIT-5.pptx
VLSI design Dr B.jagadeesh UNIT-5.pptx
 
S2C China ICCAD 2010 Presentation
S2C China ICCAD 2010 PresentationS2C China ICCAD 2010 Presentation
S2C China ICCAD 2010 Presentation
 
ASIC VS FPGA.ppt
ASIC VS FPGA.pptASIC VS FPGA.ppt
ASIC VS FPGA.ppt
 
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
Heterogeneous Computing on POWER - IBM and OpenPOWER technologies to accelera...
 
Vhdl Project List - Verilog Projects
Vhdl Project List - Verilog Projects Vhdl Project List - Verilog Projects
Vhdl Project List - Verilog Projects
 
Cpld and fpga mod vi
Cpld and fpga   mod viCpld and fpga   mod vi
Cpld and fpga mod vi
 
Basic FPGA Architecture, Virtex CLB IO blocks
Basic FPGA Architecture, Virtex CLB IO blocksBasic FPGA Architecture, Virtex CLB IO blocks
Basic FPGA Architecture, Virtex CLB IO blocks
 

Más de Leonardo Antichi

Más de Leonardo Antichi (6)

The Equation Group & Greyfish
The Equation Group & GreyfishThe Equation Group & Greyfish
The Equation Group & Greyfish
 
The CCleaner Infection
The CCleaner InfectionThe CCleaner Infection
The CCleaner Infection
 
Short Brocade Presentation
Short Brocade PresentationShort Brocade Presentation
Short Brocade Presentation
 
Checkpoint Overview
Checkpoint OverviewCheckpoint Overview
Checkpoint Overview
 
Forcepoint Overview
Forcepoint OverviewForcepoint Overview
Forcepoint Overview
 
Behavioral biometrics
Behavioral biometricsBehavioral biometrics
Behavioral biometrics
 

Último

CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 

Último (20)

CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 

AES Implementation on FPGA

  • 1. AES implementation on FPGA From the FASTEST to the SMALLEST By Tim Good and Mohammed Benaissa Presented By Leonardo Antichi
  • 2. The AIM • Show two new FPGA designs for AES implementation: • Faster & continuous throughput (25Gbps) • Smallest Area (124 slices) • Both support a 128-bit key • Demonstrate the advantage of an FPGA specific optimisation over an ASIC number-of-gates approach showing the speed improvement made thanks to the loop unrolled design
  • 3. The Fastest • Key Feature: • 25 Gbps on a Xilinx Spartan-III • Key may be periodically changed without loss of throughput • Operating mode may be changed between encryption and decryption at will • How: • Specific FPGA pipeline placement (deep cuts) • Parallel loop-unrolled design • SubBytes operation in computational form (rather than a “SBox”) • Separate clock domain for key expansion
  • 4. The Fastest • Key Feature: • 25 Gbps on a Xilinx Spartan-III • Key may be periodically changed without loss of throughput • Operating mode may be changed between encryption and decryption at will • How: • Specific FPGA pipeline placement (deep cuts) • Parallel loop-unrolled design • SubBytes operation in computational form (rather than a “SBox”) • Separate clock domain for key expansion Fig. 1. Block diagram for each middle round Fig. 2. Block diagram for final round
  • 5. The Smallest • Key Feature: • 124 slices only (60% of the Xilinx Spartan-II device) • 2,2 Mbps throughput • How: • Specific Processor Architecture using 8-bit data-path (ASIP) • Subroutines and looping • Pipelined design permitting execution of a new instruction every cycle • 360 bits of RAM
  • 6. The Smallest • Key Feature: • 124 slices only (60% of the Xilinx Spartan-II device) • 2,2 Mbps throughput • How: • Specific Processor Architecture using 8-bit data-path (ASIP) • Subroutines and looping • Pipelined design permitting execution of a new instruction every cycle • 360 bits of RAM Fig. 3. ASIP Architecture S-Box, 42 Fig. 4. Slice utilization versus design unit
  • 7. Conclusion Fig. 5. Throughput versus area for the different FPGA designs
  • 8. Thank you For your attention

Notas del editor

  1. THE AIM WHY FPGA? Expanding Design cost, risk, time to market Mass produced ARCHITECTURE CHOISES THE TWO DESIGNS The research objective is to explore the design space associated with the AES algorithm and in particular its FPGA (Field Programmable Gate Array ) hardware implementation. Why FPGA? Because FPGA has been expanding from its traditional role in prototyping to mainstream production. This change is being driven by commercial pressures to reduce design cost, risk and achieve a faster time to market. The Advances in technology have resulted in mask programmed mass produced versions of FPGA fabrics being offered by the leading manufacturers which, for some applications, remove the necessity to move prototype designs from FPGA to ASIC whilst still achieving a low unit cost. It is important also to underline that architectural choices for a given application are driven by the system requirements in terms of speed and the resources used and those can simply be viewed as throughput and area. For this porpoise two new FPGA designs are presented: The first is believed to be the fastest, achieving 25 Gb/s throughput using a Xilinx Spartan-III (XC3S2000) device, while, The second is believed to be the smallest and fits into a Xilinx Spartan-II (XC2S15) device, only requiring two block memories and 124 slices to achieve a throughput of 2.2 Mbps.
  2. DEEP CUT PIPELINE PLACEMENT UNROLLED DESIGN NO SBOX (Composite field arithmetic) FPGAs are particularly fast in terms of throughput when designs are implemented using deep pipeline cuts. The attainable speed is set by the longest path in the design so it is important to place the pipeline cuts such that the path lengths between the pipeline latches are balanced. First, the algorithm must be converted into a suitable form for deep pipelining. This is achieved by removing all the loops to form a loop-unrolled design where the data is moved through the stationary execution resources. On each clock cycle, a new item of data is input and progresses over numerous cycles through the pipeline resulting in the output of data each cycle, however, with some unavoidable latency. One of the key optimisations was to express the SubBytes operation in computational form rather than as a look-up table. Earlier implementations used the look-up table approach (the “S” box) but this resulted in an unbreakable delay due to the time required pass through the FPGA block memories. The FIPS-197 specification provided the mathematical derivation for SubBytes in terms of Galois Field (2^8) arithmetic. This was efficiently exploited by hardware implementations using composite field arithmetic which permitted a deeper level of pipelining and thus improved throughput.
  3. KEY CHANGES The high speed design presented here includes support for continued throughput during key changes for both encryption and decryption which previous pipelined designs have omitted.
  4. 8bit design Longer clock less area 13546 cycles 18885 cycles ASIP 259 slices There already exist a number of good 32-bit based designs. In these designs the AES is implemented by breaking up a round into a number of smaller 32-bit wide operations. We decided to move to an 8 bit design increasing the clock necessary for completing the task but saving resources in terms of area occupied. This results in a final design using the PicoBlaze of 119 slices plus the block memories which are accounted for here by an equivalent number of slices (once again 32 bits per slice was used). The resulting design had an equivalent slice count of 452 and with a 90.2 MHz maximum clock. Key expansion followed by encipher took 13546 cycles and key expansion followed by decipher 18885 cycles. The average encipher-decipher throughput was 0.71 Mbps. An application specific instruction processor (ASIP) was developed based around an 8-bit datapath and minimal program ROM size. Minimisation of the ROM size resulted in a requirement to support subroutines and looping. This added area to the control portion of the design but the saving was made in terms of the size of the ROM. The total design, including an equivalent number of slices for the block memories only occupies 259 slices to give a throughput of 2.2Mbps.
  5. PROCESSOR DESIGN AREA (60%datapath 40%control) The datapath consisted of two processing units, the first to perform the SubBytes operation using resource shared composite field arithmetic and the second to perform multiply accumulate operations in Galois Field 2^8. A minimal set of instructions was developed (15 in total) to perform the operations required for the AES. The processor (Figure 3) used a pipelined design permitting execution of a new instruction every cycle. Figure 4 is a pie chart depicting the balance of area between the various design units. Of the processor hardware approx 60% of the area is required for the datapath and 40% for the control.
  6. THE FASTEST, APPLICATION THE SMALLEST, APPLICATION 32BIT These designs show the extremes of what is possible and have radically different applications In terms of speed to area ratio the unrolled designs perform the best as there is no controller overhead. However, such designs are very large and need a 1 – 2 million gate device but achieve throughputs up to 25 Gbps. These designs have applications in fixed infrastructure such as IPsec for e-commerce servers. The low area design described here achieves 2.2 Mbps which is sufficient for most wireless and home applications. The area required is small thus fundamentally low power so has utility in the future mobile area. The design requires just over half the available resources of the smallest available Spartan-II FPGA. The advantage of the 8-bit ASIP over the more traditional 8-bit microcontroller architecture (PicoBlaze) is shown by the approximate factor of three improvement in throughput and 40% reduction in area (including estimated area for memories). The 32-bit datapath designs occupy the middle ground between the two extremes and have utility where moderate throughput in the 100 – 200 Mbps is required.