SUMMARY OF COURSE PROJECTS
SETIAWAN SOEKAMTOPUTRA

MASTER OF ELECTRICAL AND COMPUTER ENGINEERING
ILLINOIS INSTITUTE OF TECHNOLOGY
DECEMBER 2010 GRADUATE
CONTENTS

• 32-bit Pipelined CPU
• MC68K-Based Monitor Program
• Pipelined MIPS Processor with hazard handler and data
  forwarding
• Simple Mesh-Like and Ring-Like Network on Chip Design
• Small office network design
• 4-bit 10T adder circuit with dual-Vt logic design
• Single-ended 6T vs. standard 6T SRAM bitcell design
• QR Matrix Factorization
• Electro Active Polymer Energy Harvesting Design
• Advanced Encryption Standard Hardware Design

SPRING 2009

• Introduction to VLSI Design
  • 32-bit Pipelined CPU
  • Multiplier with accumulator and pipeline optimization
• Microcomputer
  • MC68K-Based Monitor Program
• Advanced Computer Architecture
  • Pipelined MIPS Processor with hazard handler and data
    forwarding




32-BIT PIPELINED CPU

• Hardware Description Language
  • Verilog
• Tools
  • Compiler: Cadence Verilog-XL
  • Logic synthesis: Synopsys Design Compiler
  • Simulation: Cadence SimVision, Mentor Graphics ModelSim
  • Place and route: Cadence SoC Encounter
• Objectives
  • Carry the design through the ASIC flow using Verilog
    • RTL, post-synthesis, and post-place-and-route simulation for verification
  • Determine maximum frequency, area, delay, and power


32-BIT PIPELINED CPU

• 32-bit Memory File
• Eight ALU functions: multiplication, add, subtraction,
  OR, AND, XOR, XNOR
• M: multiplicand, N: multiplier
• Multiplier:
  • A radix-2^r multiplier produces N/r partial products
  • Radix-4 Booth-encoded multiplier → reduces the number of
    partial products (N/2 vs. N)
  • Wallace tree → reduces the number of logic levels required to
    perform the summation
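
The partial-product reduction from radix-4 Booth encoding can be illustrated with a short sketch. This is a minimal Python model, not the project's Verilog; the function names are hypothetical, and the final `sum()` stands in for the Wallace tree, which in hardware adds the partial products with carry-save adders.

```python
def booth_radix4_digits(multiplier, n_bits):
    """Recode an n_bits-wide two's-complement multiplier into radix-4 Booth digits.

    Each digit is in {-2, -1, 0, 1, 2}; digit i carries weight 4**i,
    so an n_bits multiplier yields n_bits/2 digits (partial products).
    """
    if n_bits % 2:
        raise ValueError("n_bits must be even for radix-4 recoding")
    y = multiplier & ((1 << n_bits) - 1)          # two's-complement bit pattern
    # Key is (y[2i+1], y[2i], y[2i-1]); value is -2*y[2i+1] + y[2i] + y[2i-1].
    table = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
             0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}
    digits = []
    prev = 0                                      # implicit bit y[-1] = 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(table[(b1 << 2) | (b0 << 1) | prev])
        prev = b1
    return digits


def booth_multiply(multiplicand, multiplier, n_bits=32):
    """Multiply by summing one partial product per Booth digit."""
    digits = booth_radix4_digits(multiplier, n_bits)
    partial_products = [d * multiplicand * 4 ** i for i, d in enumerate(digits)]
    return sum(partial_products)                  # stand-in for the Wallace tree


print(len(booth_radix4_digits(5, 32)))  # 16 partial products instead of 32
print(booth_multiply(-10, 5))           # -50
```

Each partial product is one of 0, ±M, or ±2M, which hardware can form with a shift and an optional negation.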



32-BIT PIPELINED CPU

• Results
  • Maximum frequency: between 40 and 41 MHz




32-BIT PIPELINED CPU

• Case studies:
  • Case 1: Modify ALU multiplier to multiplier with accumulator
    (MAC) (useful for implementing DSP)
  • Case 2: Pipeline optimization
• MAC benefit: reduces the number of instructions needed to
  compute a sum of products.
• Pipeline optimization is applied by inserting registers
  on the critical path (in this case, the MAC unit)




32-BIT PIPELINED CPU

• Case 1 results




• Case 2 results




32-BIT PIPELINED CPU

• Case 2: decision on where to place the registers




32-BIT PIPELINED CPU

• Provided:
  • Multiplier accumulator block diagram
  • Simple CPU design written in Verilog
  • All required tools
• Implementation
  • Build the aforementioned MAC unit in Verilog and modify the
    design to accommodate the new unit
  • Insert a number of registers for pipelining
• Design functionality test
  • Verify in simulation that the function
    F = (-10)*5 + (-60)*2 + (-60)*8 outputs the correct result
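
The expected value of the test function can be checked with a small software model of a multiply-accumulate loop. This is only a sketch: the 32-bit accumulator width and the function name are assumptions made here, and the code is not the project's Verilog.

```python
def mac_sum_of_products(pairs, width=32):
    """Accumulate a*b terms the way a MAC unit would: acc <- acc + a*b per step."""
    mask = (1 << width) - 1
    acc = 0
    for a, b in pairs:
        acc = (acc + a * b) & mask        # wrap to the (assumed) register width
    # Reinterpret the register contents as a signed two's-complement value.
    if acc >> (width - 1):
        return acc - (1 << width)
    return acc


F = mac_sum_of_products([(-10, 5), (-60, 2), (-60, 8)])
print(F)  # -650
```

With a MAC, each product is folded into the running sum in a single operation, which is where the instruction-count saving for sum-of-products computations comes from.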


32-BIT PIPELINED CPU

• Results




32-BIT PIPELINED CPU

• Additional Analysis Result
  • Finding the maximum frequency
  • Expected maximum frequency of the design: 58 MHz
  • Frequency vs. area vs. power consumption




MC68K-BASED MONITOR PROGRAM

• Instructor: Dr. Jafar Saniie
• Requirements/Specifications
  • Build a simple monitor program for the MC68000 processor
    that lets the user perform common memory and register
    accesses and provides basic exception handlers.
• Language
  • 68000 assembly language
• Tools
  • Easy68k Editor/Assembler/Simulator




MC68K-BASED MONITOR PROGRAM

 • Monitor program flowchart
MC68K-BASED MONITOR PROGRAM


• Monitor program system diagram
MC68K-BASED MONITOR PROGRAM

• Includes a command interpreter that checks and validates
  user input.
• Monitor debugger commands:
  •   MEMD    Memory Display
  •   MEMS    Memory Set
  •   SORT    Memory Sort
  •   FILL    Memory Fill
  •   MOVE    Memory Move
  •   MEMM    Memory Modify
  •   FIND    Block Memory Search
  •   REGM    Register Modify
  •   REGD    Register Display
  •   RUNS    Execute program at specified location
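
The check-and-dispatch role of the command interpreter can be sketched as a lookup table of handlers. This is an illustrative Python model, not the 68000 assembly from the project: the argument syntax (hexadecimal start, end, value), the 256-byte memory, and all function names are assumptions, and only two of the commands are modeled.

```python
MEMORY = bytearray(256)  # hypothetical 256-byte memory, for illustration only


def cmd_fill(args):
    """FILL <start> <end> <value>: write value to every byte in the range."""
    start, end, value = (int(a, 16) for a in args)
    for addr in range(start, end + 1):
        MEMORY[addr] = value
    return "OK"


def cmd_memd(args):
    """MEMD <start> <end>: display the bytes in the range as hex."""
    start, end = (int(a, 16) for a in args)
    return " ".join(f"{b:02X}" for b in MEMORY[start:end + 1])


# Command name -> (handler, required argument count)
COMMANDS = {"FILL": (cmd_fill, 3), "MEMD": (cmd_memd, 2)}


def interpret(line):
    """Validate a command line, then dispatch it to its handler."""
    parts = line.split()
    if not parts:
        return "ERROR: empty command"
    name, args = parts[0].upper(), parts[1:]
    if name not in COMMANDS:
        return f"ERROR: unknown command {name}"
    handler, n_args = COMMANDS[name]
    if len(args) != n_args:
        return f"ERROR: {name} expects {n_args} arguments"
    try:
        return handler(args)
    except (ValueError, IndexError):
        return "ERROR: invalid argument"


print(interpret("FILL 10 13 AB"))  # OK
print(interpret("MEMD 10 13"))     # AB AB AB AB
```

Keeping validation in one place means each handler only has to deal with well-formed requests, which is one common way to structure a monitor's command loop.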



MC68K-BASED MONITOR PROGRAM

• Monitor debugger exception-handling commands:
 •   TBUS    Bus Error Exception
 •   TADD    Address Error Exception
 •   TILL    Illegal Instruction Exception
 •   TPRI    Privilege Violation Exception
 •   TDIV    Division by Zero Exception




MC68K-BASED MONITOR PROGRAM

 • Results (a subset of the 17 commands implemented)
   • Register display
   • Memory display
   • Command interpreter
HIGH-PERFORMANCE PIPELINED
        MIPS PROCESSOR
• MIPS (Microprocessor without Interlocked Pipeline Stages) is a
  reduced instruction set computer (RISC) instruction set
  architecture (ISA)
• Instructor: Prof. Jia Wang
• Requirements/Specifications
  • Design a MIPS processor with pipelining, data forwarding, and hazard
    handling capabilities.
  • Run RTL simulation to verify functionality
• Language
  • VHDL
• Tools
  • ModelSim PE 6.5
  • MARS 3.6 MIPS Simulator
• Provided:
  • Data memory unit design
  • Testbench code

HIGH-PERFORMANCE PIPELINED
        MIPS PROCESSOR
• Data width: 32-bit
• 5-stage pipeline
  •   Instruction Fetch
  •   Instruction Decode
  •   Execute
  •   Memory Access
  •   Write-Back
• Main Modules
  •   Program counter (PC)
  •   Control Unit
  •   ALU Control Unit
  •   Register File
  •   ALU
  •   Instruction Memory
  •   Data Memory
  •   Hazard Detection Unit
  •   Forwarding Unit
• Branch Hazard
  •   Branch calculation occurs in the Instruction Decode stage
  •   A branch miss costs only one stall cycle
• Data Hazard
  •   Stall if the data being written is going to be used by the next instruction
• Data Forwarding
  •   Result data is used immediately rather than being written back to the
      register file first
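
The forwarding and stall decisions can be sketched as two small functions. This is a minimal Python illustration of the textbook five-stage MIPS scheme, not the project's VHDL; the slides do not show the actual conditions, so the rules below and the pipeline-register field names (`id_ex_rs`, `ex_mem_rd`, and so on) are assumptions.

```python
def forward_select(id_ex_rs, ex_mem_rd, ex_mem_regwrite,
                   mem_wb_rd, mem_wb_regwrite):
    """Pick the source of one ALU operand: 'EX/MEM', 'MEM/WB', or 'REG'."""
    # The most recent result (EX/MEM) takes priority over the older one (MEM/WB).
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == id_ex_rs:
        return "EX/MEM"
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == id_ex_rs:
        return "MEM/WB"
    return "REG"  # no hazard: read the value from the register file


def load_use_stall(id_ex_memread, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in decode needs a value still being loaded."""
    return bool(id_ex_memread and id_ex_rt in (if_id_rs, if_id_rt))


# add $3,$1,$2 followed by sub $5,$3,$4: $3 is forwarded from EX/MEM.
print(forward_select(id_ex_rs=3, ex_mem_rd=3, ex_mem_regwrite=True,
                     mem_wb_rd=0, mem_wb_regwrite=False))   # EX/MEM
# lw $2,0($1) followed by add $4,$2,$3: one stall cycle is required.
print(load_use_stall(id_ex_memread=True, id_ex_rt=2,
                     if_id_rs=2, if_id_rt=3))                # True
```

Register 0 is excluded because it is hard-wired to zero in MIPS, so a write to it must never be forwarded.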

HIGH-PERFORMANCE PIPELINED MIPS PROCESSOR


 • MIPS Architecture




HIGH-PERFORMANCE PIPELINED
       MIPS PROCESSOR
• Test program (Running on MARS 3.6)




HIGH-PERFORMANCE PIPELINED
       MIPS PROCESSOR
• Result




FALL 2009

• Hardware/Software Co-Design
 • Simple Mesh-Like Network on Chip Design
 • Simple Ring-Like Network on Chip Design
• Introduction to Computer Network
 • Design of 2-story small office computer network




HARDWARE/SOFTWARE CO-
           DESIGN


• Projects:
 • Network on chip prototype design with three
   nodes
 • Simple Mesh-Like Network on Chip Design




NETWORK ON CHIP PROTOTYPE
   DESIGN WITH THREE NODES
• Instructor: Prof. Jia Wang
• Specifications
  • Three nodes in a partially connected mesh NoC
    architecture
  • Three processing elements and three routers.
  • Queue system: FIFO
• Language
  • SystemC running on Visual C++
• Tools
  • Microsoft Visual C++



NETWORK ON CHIP PROTOTYPE
   DESIGN WITH THREE NODES
• Three-node NoC System Diagram




• Third node function (called PE_dumpbox)
  • It receives all packets that cannot be processed by the
    destination processing unit due to overloading in the network
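
The role of PE_dumpbox can be sketched with a bounded FIFO that diverts packets when it is full. This is a Python illustration, not the project's SystemC: the slides do not state the buffer depth or the exact overflow policy, so the capacity of 2 and the divert-on-full rule are assumptions, as are the class and method names.

```python
from collections import deque


class Router:
    """Router with a bounded FIFO buffer (capacity is an assumed parameter)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.fifo = deque()

    def receive(self, packet, dumpbox):
        """Queue the packet, or divert it to the dump box when the FIFO is full."""
        if len(self.fifo) < self.capacity:
            self.fifo.append(packet)
            return True
        dumpbox.append(packet)
        return False

    def deliver(self):
        """Hand the oldest queued packet to the processing element, if any."""
        return self.fifo.popleft() if self.fifo else None


dumpbox = []                      # stands in for PE_dumpbox
router = Router(capacity=2)
for packet in ["p1", "p2", "p3"]:
    router.receive(packet, dumpbox)

print(dumpbox)           # ['p3']
print(router.deliver())  # p1
```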

NETWORK ON CHIP PROTOTYPE
      DESIGN WITH THREE NODES
• Results
    • Overload in the Router 1 network buffer at cycle 3
    • The third processing unit, PE_dumpbox, receives the packet
MESH-LIKE NETWORK ON CHIP
          PROTOTYPE DESIGN
• Specifications
  •   A simple mesh-like NoC architecture.
  •   Each router has one processing element (PE).
  •   Queue system: FIFO
  •   Size: 4-by-4 grid
• Language
  • SystemC
• Tools
  • Microsoft Visual C++




MESH-LIKE NETWORK ON CHIP
      PROTOTYPE DESIGN
• Simple NoC Architecture




MESH-LIKE NETWORK ON CHIP
       PROTOTYPE DESIGN
• Results
  • Generated packets
  • Result shows that the packets are delivered
MESH-LIKE NETWORK ON CHIP
       PROTOTYPE DESIGN
• Results
  • Delays occur because only one packet is delivered to a
    processing element (PE) at a time
MESH-LIKE NETWORK ON CHIP
      PROTOTYPE DESIGN
• Benefit and drawback:
 • Benefit: packets reach the destination address in fewer hops →
   reducing contention and increasing the average bit rate.
 • Drawback: the design becomes more complex and more wires
   are needed.




INTRODUCTION TO COMPUTER
           NETWORK
• Project:
  • Design a prototype of a two-story small office computer network
    capable of serving 20 users, with three department LANs,
    four servers, and wireless Internet
• Language
  • N/A
• Tools
  • Microsoft Visio




SMALL OFFICE NETWORK DESIGN

• Proposed configurations
 • IP address allocation




SMALL OFFICE NETWORK DESIGN

• Proposed configurations
 • Design Topology




SMALL OFFICE NETWORK DESIGN

• Office Layout
  • Floor plans of the 1st floor and 2nd floor
  • Colored arrows show how cables are managed
SPRING 2010

• Advanced VLSI
  • 4-bit 10T adder circuit with dual-Vt logic design
• High Performance VLSI IC System
  • Single-ended 6T vs. standard 6T SRAM bitcell design
    comparison
• QR Factorization
  • Implementing QR factorization algorithm in C




4-BIT 10T ADDER CIRCUIT WITH
          DUAL-VT LOGIC DESIGN
• Project:
  • 4-bit 10T adder circuit with dual-Vt logic design
• Specifications
  • Adder circuit is based on:
      J. Lin, M. Sheu, and C. Ho. A Novel High-Speed and Energy Efficient 10-Transistor Full
      Adder Design. IEEE Trans. on Circuits and Systems, May 2007.
  •   Adder: cascaded ripple-carry adders
  •   Technology node: 45 nm (FreePDK)
  •   Supply voltage: 1.1 V at 25 MHz
  •   Performance measurements (delay and power consumption) for the 10T
      adder circuit using high-threshold-voltage (high-Vt), low-Vt, and dual-Vt transistors
• Tools
  • Cadence Virtuoso Schematic Design
  • Synopsys HSPICE Simulator
  • Nanosim Simulator


4-BIT 10T ADDER CIRCUIT WITH
     DUAL-VT LOGIC DESIGN
• High Vt vs. low Vt




• Full Adder Design (1-bit)
  • Complementary and level restoring carry logic (CLRCL)




4-BIT 10T ADDER CIRCUIT WITH
     DUAL-VT LOGIC DESIGN
• Full Adder Design (1-bit) Critical Path
  • Dual-VT: low-VT transistors are used on the critical path for
    speed; high-VT transistors are used elsewhere for low leakage
  • The NMOS in the multiplexer and the PMOS in the inverter are low-VT transistors




4-BIT 10T ADDER CIRCUIT WITH
      DUAL-VT LOGIC DESIGN
• Logic Equation
          Sum  = (A XNOR B)·Cin + (A XOR B)·Cin_bar
          Cout = (A XOR B)·Cin + (A XNOR B)·A
• Design Components
 • Inverter (left) and multiplexer (right)
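
The Sum and Cout equations above can be checked against ordinary binary addition. The sketch below is a bit-level Python model (function names are hypothetical) that evaluates the two equations and chains four of them into a ripple-carry adder; it says nothing about timing or the transistor-level circuit.

```python
from itertools import product


def full_adder(a, b, cin):
    """Evaluate the slide's Sum and Cout equations for single-bit inputs."""
    xor = a ^ b
    xnor = 1 - xor
    s = (xnor & cin) | (xor & (1 - cin))     # Sum  = XNOR·Cin + XOR·Cin_bar
    cout = (xor & cin) | (xnor & a)          # Cout = XOR·Cin + XNOR·A
    return s, cout


def ripple_add4(x, y, cin=0):
    """Chain four full adders; the carry ripples from bit 0 up to bit 3."""
    total, carry = 0, cin
    for i in range(4):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


# Exhaustive check of the 1-bit cell against integer addition.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    assert 2 * cout + s == a + b + cin

print(ripple_add4(9, 8))  # (1, 1): 9 + 8 = 17 = carry-out 1, sum bits 0001
```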




4-BIT 10T ADDER CIRCUIT WITH
       DUAL-VT LOGIC DESIGN
• 1-bit Full Adder (consisting of multiplexers and
  inverters) and its symbol




• 4-bit Full Adder




4-BIT 10T ADDER CIRCUIT WITH
       DUAL-VT LOGIC DESIGN
• Methodology
  • Combinations of input vectors are used to measure delay and
    power consumption
       • Delay: switching delay between the least significant bit (bit 0)
         and the most significant bit (bit 3)
       • Power: average and maximum power during simulation

• Results
  • Delay (in seconds): bar chart of high-to-low and low-to-high delay
    for High-VT, Low-VT, and Dual-VT (axis range 0 to 4.00E-10)
4-BIT 10T ADDER CIRCUIT WITH
               DUAL-VT LOGIC DESIGN
• Results
  • Power consumption (in watts): bar charts of average power
    (axis range 0 to 6.00E-05) and maximum power (axis range 0 to 5.00E-04)
    for High-VT, Low-VT, and Dual-VT
4-BIT 10T ADDER CIRCUIT WITH
       DUAL-VT LOGIC DESIGN
• Results




4-BIT 10T ADDER CIRCUIT WITH
       DUAL-VT LOGIC DESIGN
• Issue
  • Voltage degradation, especially with high-Vt transistors or at high
    frequency (> 125 MHz), because pass transistors
    deliver a weak 1 (NMOS) or a weak 0 (PMOS).




SINGLE-ENDED 6T VS. STANDARD 6T
      SRAM BITCELL DESIGN
• Specifications
  • Design from:
    J. Singh, et al. Single Ended 6T SRAM with Isolated Read-Port for Low-Power
    Embedded Systems. IEEE, 2009.
  • Technology node: 45 nm
  • Devices: high-VT MOSFETs
• Tools
  • Cadence Virtuoso Schematic Design
  • Synopsys HSPICE Simulator




SINGLE-ENDED 6T VS. STANDARD 6T
      SRAM BITCELL DESIGN
• Background
 • SRAM consumes the majority of die area
 • Dynamic power: consumed by read and write activity
 • Static power: consumed while retaining the stored logic value
• Benefits/Drawbacks of Single-Ended SRAM
 • Faster read of logic '1'
 • One bit line (no complementary bit-bar line) → wire
   reduction
 • More delay when writing '1' due to the weak-1 behavior of the NMOS pass
   transistor (but around 85% of writes are zero writes)
 • Role of the isolated read port: prevents the bitcell content from being
   exposed during reads
 • Considerably lower power dissipation, better read SNM

SINGLE-ENDED 6T VS. STANDARD 6T
      SRAM BITCELL DESIGN
• Standard 6T SRAM
 • Read: precharge BL and BL* → WordLine = 1
 • Write: assert the new value on BL and BL* → WordLine = 1
 • Transistor sizing:
   • Access transistor: medium
   • Pull-up transistor: weak
   • Pull-down transistor: strong



SINGLE-ENDED 6T VS. STANDARD 6T
      SRAM BITCELL DESIGN
• Standard SRAM Design (using Cadence Virtuoso)




SINGLE-ENDED 6T VS. STANDARD 6T
      SRAM BITCELL DESIGN
• Single-Ended SRAM Design




SINGLE-ENDED 6T VS. STANDARD 6T
         SRAM BITCELL DESIGN
• Comparison Results
 • Write delay (0 to 0.5 Vdd or 1 to 0.5 Vdd)
 • "…around 85% of the instruction write bits are '0,' and over 90% of the data
   write bits are '0.'…" (quoted from [3])

[3] Y. Chang, F. Lai, C. Yang. Zero-Aware Asymmetric SRAM Cell for
    Reducing Cache Power in Writing Zero. IEEE Trans. on VLSI
    Systems, Vol. 12, No. 8, August 2004.
SINGLE-ENDED 6T VS. STANDARD 6T
      SRAM BITCELL DESIGN
• Comparison Results
 • Power Consumption Comparison




SINGLE-ENDED 6T VS. STANDARD 6T
      SRAM BITCELL DESIGN
• Noise Margin




QR MATRIX FACTORIZATION

• Purpose:
  • Implement the QR factorization algorithm in C
• Specifications
  • Written in C under Red Hat Linux
• QR Factorization
  • A matrix decomposition method for solving linear systems
    without inverting the left-hand-side matrix.
  • Applicable to: m-by-n matrix A
  • Decomposition: A = QR, where Q is an orthogonal matrix of size m-by-m
    and R is an upper triangular matrix
  • The QR decomposition provides an alternative way of solving the
    system of equations Ax = b without inverting the matrix A. Because
    Q is orthogonal, Q^T Q = I, so Ax = b is equivalent to
    Rx = Q^T b, which is easier to solve since R is triangular.
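
The solve path described above (factor A, form Q^T b, back-substitute through R) can be sketched as follows. The slides do not show which QR algorithm the project used and the original was written in C, so this Python version using modified Gram-Schmidt is an assumption chosen for brevity. It produces the thin factorization (Q is m-by-n rather than m-by-m), which is sufficient for solving the system.

```python
def qr_gram_schmidt(A):
    """Thin QR by modified Gram-Schmidt: A (m-by-n) = Q (m-by-n) * R (n-by-n)."""
    m, n = len(A), len(A[0])
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(m)]
        for k in range(j):
            # Remove the component of column j along the k-th basis vector.
            R[k][j] = sum(Q[i][k] * v[i] for i in range(m))
            v = [v[i] - R[k][j] * Q[i][k] for i in range(m)]
        R[j][j] = sum(x * x for x in v) ** 0.5
        if R[j][j] == 0.0:
            raise ValueError("columns are linearly dependent")
        for i in range(m):
            Q[i][j] = v[i] / R[j][j]
    return Q, R


def solve_qr(A, b):
    """Solve Ax = b via Rx = Q^T b and back-substitution (no matrix inverse)."""
    Q, R = qr_gram_schmidt(A)
    n = len(R)
    y = [sum(Q[i][j] * b[i] for i in range(len(b))) for j in range(n)]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x


# 2x + y = 3 and x + 3y = 5 have the solution x = 0.8, y = 1.4.
print(solve_qr([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]))
```

Householder reflections are usually preferred when numerical robustness matters; Gram-Schmidt is used here only because it is short.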



QR MATRIX FACTORIZATION

• Algorithm




QR MATRIX FACTORIZATION

• Result




FALL 2010




• Electro Active Polymer Energy Harvesting
• Advanced Encryption Standard




ELECTRO ACTIVE POLYMER
   ENERGY HARVESTING DESIGN
• EAP circuitry converts mechanical energy to electrical
  energy when the EAP is stretched under a bias
  voltage.
• EAP material: VHB 4905 tape and carbon grease




ELECTRO ACTIVE POLYMER
   ENERGY HARVESTING DESIGN
• Previous prototype:
  • Charge management IC: TI's bq2000
  • Li-ion battery: 3 V, 45 mAh
  • Application: TI's eZ430-F2013
  • Boost converter to supply the biasing voltage (5 V → 1.5 kV):
    • EMCO Q15N-5
• Drawbacks
  • High energy consumption
  • EAP output power is too small to even turn on the battery
    charging circuit (which needs 20.6 mA)
• Solutions
  • EAP material efficiency
    • Higher capacitance
  • Battery and circuit that can store small amounts of energy without
    requiring much energy to operate
  • Apply a low biasing voltage → eliminates the use of the boost
    converter

ELECTRO ACTIVE POLYMER
   ENERGY HARVESTING DESIGN
• Simulation model using Simulink
  • Circuit model parameters:
    • EAP Model parameters, input voltage (battery), and output
      capacitor Co




ELECTRO ACTIVE POLYMER
   ENERGY HARVESTING DESIGN
• Simulation model using Simulink
  • EAP Model Parameters:
    • Cidle, Cforced, and force frequency f (how often the EAP is stretched)
    • An absolute-value function creates an always-positive sine waveform from
      the original sine wave
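
One way these parameters could combine into a time-varying capacitance is sketched below. The Simulink model itself is not reproduced in the slides, so the formula C(t) = Cidle + (Cforced - Cidle)·|sin(2πft)| and the numeric values are assumptions for illustration only.

```python
import math


def eap_capacitance(t, c_idle, c_forced, f):
    """Assumed model: capacitance swings from c_idle up to c_forced
    following a rectified (always-positive) sine of frequency f."""
    rectified = abs(math.sin(2 * math.pi * f * t))
    return c_idle + (c_forced - c_idle) * rectified


# Hypothetical values: 1 nF idle, 3 nF stretched, 2 Hz forcing.
c_idle, c_forced, f = 1e-9, 3e-9, 2.0
for t in (0.0, 1 / (4 * f), 3 / (4 * f)):
    print(t, eap_capacitance(t, c_idle, c_forced, f))
```

Because of the absolute value, the waveform peaks in both half-cycles of the underlying sine, so the capacitance never drops below the idle value.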




ELECTRO ACTIVE POLYMER
   ENERGY HARVESTING DESIGN
• Simulation result:




ELECTRO ACTIVE POLYMER
      ENERGY HARVESTING DESIGN
• Prototype:
  •   Battery charging: Cymbet CBC5300
  •   Battery: 2xCBC050 (3x50 µAh) at 3.5 V output
  •   Capability to harvest 1.05 V
  •   PCB layout tool: Altium Designer
  •   Application: MSP430-F2274 with CC2500 2.4 GHz RF
      transceiver




ELECTRO ACTIVE POLYMER
ENERGY HARVESTING DESIGN




• Battery charging profile for CBC050
ADVANCED ENCRYPTION STANDARD
      HARDWARE DESIGN
• AES variants with 512-bit and 1024-bit keys
• Area and power consumption comparison with 128-bit
  and 256-bit AES keys
• CMOS technology: 45 nm
• Operating voltage: 1.1 V at 100 MHz
• Language: Verilog
• Tools:
  • Synthesis: Synopsys Design Compiler
  • Simulation: ModelSim
• Goal: find the relationship between key size and implemented
  hardware area and power consumption.

ADVANCED ENCRYPTION STANDARD
      HARDWARE DESIGN
• AES encryption flow
  • Inputs: Cipher Key → Key Expansion → RoundKey[i]; Plaintext
  • Initial round: AddRoundKey with RoundKey[0]
  • Normal round: SubBytes → ShiftRows → MixColumns → AddRoundKey with
    RoundKey[i]; i = i + 1; repeat while i < number of rounds
  • Final round: SubBytes → ShiftRows → AddRoundKey
  • Output: ciphered text
ADVANCED ENCRYPTION STANDARD
                   HARDWARE DESIGN
 • Block View of AES Operation
   • Blocks: Plain_text, Cipher_key, Key Expansion Module, AddRoundKey,
     SubBytes and ShiftRows, MixColumns, AddRoundKey, multiplexers
     (one with initial value zero), Ciphered_text
   • Initial AddRoundKey: plaintext bytes 0 to 15 XOR first round key bytes 0 to 15
   • State block:
        0  4  8 12
        1  5  9 13
        2  6 10 14
        3  7 11 15
   • SubBytes (replaces each byte with its S-box value):
        S0  S4  S8  S12
        S1  S5  S9  S13
        S2  S6  S10 S14
        S3  S7  S11 S15
   • State block after ShiftRows:
        S0  S4  S8  S12
        S5  S9  S13 S1
        S10 S14 S2  S6
        S15 S3  S7  S11
   • MixColumns: a(x) applied per column
   • State block after MixColumns, XORed with the next round key, is
     ready for the next round:
        M0  M4  M8  M12          K0 K4 K8  K12
        M5  M9  M13 M1    XOR    K1 K5 K9  K13
        M10 M14 M2  M6           K2 K6 K10 K14
        M15 M3  M7  M11          K3 K7 K11 K15
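
The ShiftRows permutation shown in the block view can be sketched in a few lines. This is a Python illustration of the standard AES step, not the project's Verilog; the state is assumed to be a flat list of 16 entries in the column-major order used on the slide (index = row + 4 × column).

```python
def shift_rows(state):
    """Rotate row r of the 4x4 state left by r positions.

    `state` holds 16 entries in column-major order: index = row + 4 * col.
    """
    out = [None] * 16
    for row in range(4):
        for col in range(4):
            out[row + 4 * col] = state[row + 4 * ((col + row) % 4)]
    return out


state = [f"S{i}" for i in range(16)]
shifted = shift_rows(state)
for row in range(4):
    print(" ".join(shifted[row + 4 * col] for col in range(4)))
# S0 S4 S8 S12
# S5 S9 S13 S1
# S10 S14 S2 S6
# S15 S3 S7 S11
```

In hardware this step is pure wiring, since it only reorders bytes and involves no logic.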
ADVANCED ENCRYPTION STANDARD
        HARDWARE DESIGN
 • Block Diagram
   • Inputs: Plain_text, Cipher_key
   • Blocks: Key Expansion Module, AddRoundKey, SubBytes and ShiftRows,
     MixColumns, AddRoundKey, multiplexers (one with initial value zero)
   • Output: Ciphered_text
ADVANCED ENCRYPTION STANDARD
        HARDWARE DESIGN
Results
• Total power vs. key size, with linear trend line: y = 0.852x + 2.739, R² = 0.985

            power (dynamic) in mW   power (static) in mW   Total Power in mW        area
   AES128                  3.3574              0.2971603           3.6545603   58824.876
   AES256                  3.9442              0.3341722           4.2783722   64188.036
   AES512                  5.0289               0.409219            5.438119   76881.193
   AES1024                 5.6042              0.5053051           6.1095051   96312.560
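
The trend line reported on this slide can be approximately reproduced from the tabulated total power values with an ordinary least-squares fit. One assumption is needed: the x values are taken to be the category index (1 for AES128 through 4 for AES1024) rather than the key size in bits, because that choice matches the reported slope and intercept.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope*x + intercept, plus R squared."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_res / ss_tot


# Total power in mW for AES128, AES256, AES512, AES1024 (from the table).
total_power_mw = [3.6545603, 4.2783722, 5.438119, 6.1095051]
slope, intercept, r2 = linear_fit([1, 2, 3, 4], total_power_mw)
print(f"y = {slope:.3f}x + {intercept:.3f}, R^2 = {r2:.4f}")
```

The fitted slope and intercept round to the slide's 0.852 and 2.739, and R² comes out within rounding distance of the reported 0.985. Since each step on the x axis doubles the key size, a linear trend in the index corresponds to power growing roughly with the logarithm of the key size.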

ADVANCED ENCRYPTION STANDARD
       HARDWARE DESIGN
Results: Area

[Chart: area for AES128, AES256, AES512, and AES1024; y-axis from 50000 to 100000]

                  AES128        AES256       AES512        AES1024
   area        58824.87654   64188.0369   76881.19388   96312.56036
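
One way to read these area figures against key size is to normalize each design to the AES128 baseline. The sketch below does only that arithmetic. The slide does not state the area unit, so the code works with ratios, which do not depend on it. Variable names are illustrative.

```python
# Area per key size in bits, copied from the slide (unit not stated there).
area = {
    128: 58824.87654,
    256: 64188.0369,
    512: 76881.19388,
    1024: 96312.56036,
}

baseline = area[128]
for key_bits, value in area.items():
    ratio = value / baseline
    key_factor = key_bits // 128
    print(f"AES{key_bits}: {ratio:.2f}x the AES128 area for {key_factor}x the key bits")

# Ratio for the largest design, kept for further comparison.
area_ratio_1024 = area[1024] / baseline
```

On these numbers the 1024-bit design has eight times the key bits of the 128-bit design but less than twice its area, so area grows much more slowly than key size across the four designs.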




