An Energy Efficient Sub-threshold Multiplication
 and Accumulation Unit for Low Power Digital
        Signal Processing Applications

                 Harsha Yelisala


           SPRING 2009 - SUMMER 2010
Technology Profile

   The following technologies are used in this project,
       90nm Pass Transistor Technology.
       Cadence IC design.
       Virtuoso schematic.
       Virtuoso Analog Design Environment.
       Cadence Spectre Simulator.
       Virtuoso Layout Suite.
       Synopsys Nanosim.
       Synopsys Hspice.
       Tcl Scripting.
       Perl Scripting.
       Python Programming Language.
Aim



  The objectives of this project are:
      1. To design an industry-standard, energy-efficient circuit in a
      90nm technology.
      2. To emphasize the subthreshold mode of operation.
      3. To gain hands-on expertise with Cadence and Synopsys tools.
      4. To understand the hardware design flow.
      5. To work with the Perl and Tcl scripting languages.
Introduction



   Abstract
   The growing use of power-hungry devices has opened a new area
   of research in energy- and power-efficient design. Conventional
   design methodologies prove inefficient when energy efficiency is
   the prime metric. Of the several novel approaches, the one that is
   most promising in terms of high energy savings and reduced
   complexity is the subthreshold mode of operation. A 220mV
   energy-efficient subthreshold MAC unit is designed from a custom
   cell library built in 90nm pass transistor technology.
Work Flow



     1. Studying the literature regarding Subthreshold operation.
     2. Investigating various logic families for Subthreshold scheme.
     3. Designing a custom library of standard cells out of the
     proposed logic family.
     4. Designing a MAC unit.
     5. Verifying and testing the unit from power and energy
     perspective.
Subthreshold Mode




  Modes of a MOSFET
  A basic MOS transistor works in three different modes of operation.
  1. Active or Saturation Mode
  2. Linear or Triode Mode
  3. Cutoff or Subthreshold Mode
All about Subthreshold Mode!
   What is Subthreshold mode
   A CMOS transistor operates in the subthreshold region when the
   gate-to-source voltage (Vgs) is less than the threshold
   voltage (Vth).
   Advantages:
   1. Because the device operates at ultra-low supply voltages
   (200-300mV), the dynamic power component is greatly reduced.
   2. Well suited to low-power, low-speed applications such as sensor
   nodes and battery-operated devices.

   Disadvantages:
   1. Because the drive currents are weak leakage currents, nodes take
   a long time to charge and discharge, which limits the operating
   frequency to roughly 1-10MHz.
   2. Transistor sizing is critical.
   3. Low on-off current ratio.
   4. High sensitivity to process, voltage and temperature (PVT) variations.
Subthreshold Current Model (1 of 2)


   In the subthreshold regime, the drain current (Ids) varies
   exponentially with the gate-to-source voltage. In a long-channel
   device, the threshold voltage does not depend on drain voltage or
   channel length. In sub-micron technology, however, drain-induced
   barrier lowering (DIBL) makes the threshold voltage depend on the
   drain voltage, because the source/drain depletion regions
   penetrate significantly into the channel.
   The subthreshold current of a CMOS transistor is given by the
   following equation,

   $$I_{sub} = I_0 \, e^{(V_{gs} - V_{th} + \eta V_{ds})/(n v_t)} \left(1 - e^{-V_{ds}/v_t}\right) \qquad (1)$$
Subthreshold Current Model (2 of 2)

   $$I_{sub} = I_0 \, e^{(V_{gs} - V_{th} + \eta V_{ds})/(n v_t)} \left(1 - e^{-V_{ds}/v_t}\right) \qquad (2)$$
   where

   $$I_0 = \mu_o C_{ox} (W/L)(n - 1) v_t^2 \qquad (3)$$
   and    Vgs = transistor gate to source voltage,
          Vds = drain to source voltage,
          Vth = threshold voltage,
          vt = KT/q is the thermal voltage,
          n = subthreshold slope factor = (1 + Cd/Cox)
          Cd = depletion layer capacitance
          Cox = gate oxide capacitance
          η = DIBL coefficient
          µo = mobility.
          W and L are the width and channel length of the MOSFET
   respectively.
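   As an illustration, the current model above can be evaluated numerically.
   The sketch below is not part of the original design flow. It assumes the
   drain-voltage factor uses the thermal voltage vt, as in the standard
   weak-inversion model, and the default values of i0, n and eta are
   placeholders chosen only for demonstration, not extracted 90nm parameters.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage vt = kT/q in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def subthreshold_current(vgs, vds, vth, i0=1e-7, n=1.4, eta=0.05, temp_k=300.0):
    """Subthreshold drain current in amperes.

    i0, n and eta are hypothetical illustrative values (assumptions).
    """
    vt = thermal_voltage(temp_k)
    gate_term = math.exp((vgs - vth + eta * vds) / (n * vt))
    drain_term = 1.0 - math.exp(-vds / vt)
    return i0 * gate_term * drain_term


if __name__ == "__main__":
    for vgs in (0.10, 0.15, 0.20):
        current = subthreshold_current(vgs, vds=0.22, vth=0.35)
        print(f"Vgs = {vgs:.2f} V -> Isub = {current:.3e} A")
```

   Raising Vgs by n * vt * ln(10) multiplies the current by ten, which is the
   exponential dependence the model describes.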
Subthreshold Power Model (1 )




   For low-frequency mobile devices, subthreshold design achieves a
   radical reduction in circuit power at the cost of operating
   speed. The total power consumption of a digital circuit is given
   by the following equation.

   $$P_{total} = P_{dynamic} + P_{\text{short-circuit}} + P_{static} \qquad (4)$$
Subthreshold Power Model (2 )



   Dynamic Power
   Dynamic power is described by the following equation,

   $$P_{dynamic} = \alpha f C_{eff} V_{dd}^2 \qquad (5)$$
   where α is the activity factor, f is the switching frequency and
   Ceff is the effective capacitance. Because dynamic power is
   proportional to the square of the supply voltage, a significant
   power reduction is achieved at subthreshold supply voltages.
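   To make the quadratic dependence concrete, the sketch below evaluates the
   dynamic power equation at 1.2V and at 220mV. The activity factor,
   frequency and capacitance values are arbitrary assumptions held equal in
   both cases, so only the supply-voltage effect is shown. In a real
   subthreshold circuit the clock frequency also drops, which reduces power
   further.

```python
def dynamic_power(alpha, freq_hz, c_eff_farad, vdd_volt):
    """Dynamic power in watts: alpha * f * Ceff * Vdd^2."""
    return alpha * freq_hz * c_eff_farad * vdd_volt ** 2


# Hypothetical operating point, identical for both supply voltages.
ALPHA, FREQ_HZ, C_EFF = 0.1, 1e6, 1e-15

p_nominal = dynamic_power(ALPHA, FREQ_HZ, C_EFF, 1.2)
p_subthreshold = dynamic_power(ALPHA, FREQ_HZ, C_EFF, 0.22)
ratio = p_nominal / p_subthreshold

if __name__ == "__main__":
    print(f"Power reduction from supply scaling alone: {ratio:.2f}x")
```

   With everything else fixed, the ratio equals (1.2 / 0.22)^2, a little
   under 30x.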
Subthreshold Power Model (3 )
   Dynamic Power
   At 220mV, the dynamic charging current, which is directly
   proportional to dynamic power, is reduced by almost 248.49X
   compared with a supply voltage of 1.2V for an inverter at the TT
   process corner.

   [Plot: current (uA, log scale from 10^-1 to 10^3) versus supply voltage (0.2V to 1.2V) for the TT, FS, SF, SS and FF process corners]
Subthreshold Power Model (4 )

   Static Power
   Static power is the power consumed by the circuit in the idle state
   and is described by the following equation.

   $$P_{static} = I_{Leakage} V_{dd} \qquad (6)$$

   The leakage current consists of several components: subthreshold
   leakage, gate tunneling, gate-induced drain leakage (GIDL) and
   reverse-bias diode leakage. The subthreshold leakage varies
   according to equation (2). Thus, as the drain voltage is reduced,
   the DIBL effect weakens, which in turn reduces the subthreshold
   leakage current. Gate tunneling contributes significantly to the
   overall leakage current and also decreases with gate or supply
   voltage. GIDL and reverse-bias diode leakage are likewise
   significantly reduced by the lower supply voltage of a
   subthreshold circuit.
Subthreshold Power Model (5 )
   Static Power
   At 220mV, the subthreshold leakage current in weak inversion is
   reduced by almost 8.55X compared with strong inversion (supply
   voltage 1.2V) at the TT process corner.

   [Plot: current (nA, log scale from 10^-1 to 10^3) versus supply voltage (0.2V to 1.2V) for the TT, FS, SF, SS and FF process corners]
Subthreshold Power Model (6 )



   Short Circuit Power
   Short circuit power is the power dissipated by the current
   conducted between Vdd and Vss during a logic transition. It is
   described by the following equation.

   $$P_{\text{short-circuit}} = I_{\text{short-circuit}} V_{dd} \qquad (7)$$
   Although the short-circuit current flows for longer because of the
   slower operation in subthreshold, the reduced supply voltage
   decreases electron conduction, which in turn reduces Ishort-circuit.
Subthreshold Power Model (7 )
   Short Circuit Power
   At 220mV, there is a 446.45X reduction in short circuit current
   compared to full rail voltage of 1.2V in TT process corner.

   [Plot: current (uA, log scale from 10^-2 to 10^2) versus supply voltage (0.2V to 1.2V) for the TT, FS, SF, SS and FF process corners]

   Figure: Short circuit current rating under varying supply voltage for an
Subthreshold Design Challenges (1)




      Transistor Sizing Criticality
      On-Off Current Ratio
      PVT variations
      Noise Margin
Subthreshold Design Challenges (2)



   Transistor Sizing Criticality
   The relative strength of the pull-up and pull-down networks is
   critical for optimal rise and fall times. Because subthreshold
   current depends exponentially on Vth, any variation in the
   threshold of the NMOS and PMOS devices can change the β ratio
   drastically, which directly affects rise/fall times and may
   trigger logic failure. This shift in β ratio is observed at low
   voltage, forcing us to size the cell transistors very carefully.
Subthreshold Design Challenges (2)
   Ratio of NMOS ION and PMOS ION at different corners
   [Plot: ION NMOS / ION PMOS (log scale from 10^0 to 10^3) versus supply voltage (0V to 1.4V) for the TT, FF, FS, SF and SS process corners]

      Figure: Ratio of NMOS ION and PMOS ION at different corners
Subthreshold Design Challenges (2)
   Ratio of NMOS ION and PMOS ION at different temperatures

   [Plot: ION NMOS / ION PMOS (linear scale from 0 to 30) versus supply voltage (0V to 1.4V) at temperatures from -40C to 120C in 20C steps]

    Figure: Ratio of NMOS ION and PMOS ION at different temperatures
Subthreshold Design Challenges (3)

   On-Off Current Ratio
   The drain current of a MOSFET increases exponentially with gate
   voltage in the subthreshold region, whereas in strong inversion it
   changes very slowly because of velocity saturation of the majority
   carriers. In the subthreshold region, threshold voltage deviation
   and degradation of the ION/IOFF current ratio make circuit
   operation very critical. At a subthreshold supply such as 0.2V,
   ION/IOFF degrades to below 300 at room temperature. There is a
   strong race condition between on and off devices while a critical
   signal is being set, and this determines the maximum number of
   allowable cells per bit-line. When this current ratio degrades to
   a very low value, it becomes very difficult to differentiate
   between logic ‘1’ and logic ‘0’. When process variations are
   considered, this ratio becomes worse in the FF corner, as shown.
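   A rough estimate of this ratio follows from the subthreshold current
   model in equation (1): taking Vgs = Vdd for the on device and Vgs = 0 for
   the off device at the same Vds, every factor cancels except
   exp(Vdd / (n * vt)). The sketch below evaluates that expression. The slope
   factor n = 1.4 is an assumed value, not a measured parameter of the 90nm
   process, and the estimate ignores threshold shifts with temperature.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def on_off_ratio(vdd, n=1.4, temp_k=300.0):
    """Estimate ION/IOFF as exp(Vdd / (n * vt)); n is an assumed slope factor."""
    vt = K_BOLTZMANN * temp_k / Q_ELECTRON
    return math.exp(vdd / (n * vt))


if __name__ == "__main__":
    for temp_c in (-40, 20, 120):
        ratio = on_off_ratio(0.2, temp_k=temp_c + 273.15)
        print(f"{temp_c:>4} C: ION/IOFF at 0.2 V is about {ratio:.0f}")
```

   Under these assumptions the estimate at 0.2V and 300K falls below 300, and
   the ratio shrinks as temperature rises because vt grows with temperature.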
Subthreshold Design Challenges (3)
   On-Off Current Ratio
   [Plot: NMOS ION / IOFF (log scale from 10^0 to 10^5) versus supply voltage (0V to 1.4V) at temperatures from -40C to 120C in 20C steps]

       Figure: Ratio of NMOS ION and IOFF at different temperatures


   Observation: Significant β ratio variation is observed in low
   voltage at different temperatures.
Subthreshold Design Challenges (3)
   On-Off Current Ratio
   [Plot: PMOS ION / IOFF (log scale from 10^0 to 10^7) versus supply voltage (0V to 1.4V) at temperatures from -40C to 120C in 20C steps]

       Figure: Ratio of PMOS ION and IOFF at different temperatures


   Observation: Significant β ratio variation is observed in low
   voltage at different temperatures.
On-Off Current Ratio
   [Plot: NMOS ION / IOFF (log scale from 10^0 to 10^5) versus supply voltage (0V to 1.4V) for the TT, FF, FS, SF and SS process corners]

      Figure: Ratio of NMOS ION and IOFF at different corners


Observation: Significant β ratio variation is observed in low
voltage at different temperatures.
On-Off Current Ratio
   [Plot: PMOS ION / IOFF (log scale from 10^1 to 10^6) versus supply voltage (0V to 1.4V) for the TT, FF, FS, SF and SS process corners]

      Figure: Ratio of PMOS ION and IOFF at different corners


Observation: Significant β ratio variation is observed in low
voltage at different temperatures.
A Look into other Logic families



   The conventional Complementary MOS (CMOS) logic family, when
   operated at subthreshold voltages, has several disadvantages.
   A few of them are:
   1. High power dissipation.
   2. Weak noise margins.
   3. Large delays.
   Thus it is evident that the CMOS logic family is not optimal for
   subthreshold operation.
A study of several other logic families is made, with power and
energy consumption as the prime concern.

Table: Minimum working voltages for different logic families for a basic
AND gate
    Logic Family      Minimum Voltage (mV)   Delay (ns)   Driving Current (nA)   Power (nW)   PDP (fJ)
    Sub-CMOS                  250              2.56             3330               1859        4.759
    Pseudo NMOS               220              4.765            102.56             0.6023      2.87
    DTMOS                     180              8.4173           32.54              233.63      1.97
    Domino                    240              7.6477           476.13             639.41      4.89
    Pass Transistor           200              4.9953           201.43             426.17      2.13
    DTPT                      175              6.598            128.39             204.68      1.35




Table: Energy comparison at 250mV for different logic families for a basic
AND gate
    Logic Family      Delay (ns)   Driving Current (nA)   Power (nW)   PDP (fJ)
    Sub-CMOS            2.56             3330               1859        4.759
    Pseudo NMOS         3.8637           761.938            0.9848      3.805
    DTMOS               11.116           89.204             1.501       16.68
    Domino              4.5477           568.31             1.119       5.09
    Pass Transistor     2.2641           652.88             1.502       3.39
    DTPT                1.8432           830                1.503       2.77
Custom Cell Library


   All the standard cells are designed in 90nm PT technology. The
   cells are fine tuned for their sizings, driving capability and minimum
   working voltage magnitudes. The cells that are customized are:
       Inverter
       Buffer
       And
       Or
       Xor
       Xnor
Inverter




   This is the only gate in the library that is based on CMOS
   technology. The only modification is that the driving capability of
   the cell is increased by improving the effective channel length of
   the P and N devices as shown.
Buffer




  Buffer gate is obtained by connecting two inverters in series.
And (1 of 2)

   Operation:
       When A=0, B=0 the transistors p1, n1, n3 are on and p2, n2, p3,
       p4 are off; the gate transmits gnd.
       When A=0, B=1 the transistors p1, n1, p3 are on and p2, n2, n3,
       p4 are off; the gate transmits gnd.
       When A=1, B=0 the transistors p2, n2, n3, p4 are on and p1, n1,
       p3 are off; the gate transmits B.
       When A=1, B=1 the transistors p2, n2, p3, p4 are on and p1, n1,
       n3 are off; the gate transmits vdc.

   Figure: And gate (transistors p1, n1, p2, n2, n3, p3, p4; inputs A, A', B, B'; rails vdc and gnd; node output)
And (2 of 2)

   Need for additional MOSFETs n3, p3, p4:
       When the inputs are A=1, B=0, the output node is discharged to
       zero.
       When the inputs are A=1, B=1, the output should be connected to B
       and charged to ‘1’.
       But because of the larger subthreshold delay, the node that was
       discharged earlier takes a long time to charge to ‘1’.
       Hence an alternate path is provided to charge the output node
       to ‘1’.

   Figure: And gate (transistors p1, n1, p2, n2, n3, p3, p4; inputs A, A', B, B'; rails vdc and gnd; node output)
Or (1 of 2)

   Operation:
       When A=0, B=0 the transistors p1, n1 are on and p2 is off; the
       gate transmits B.
       When A=0, B=1 the transistors p1, n1 are on and p2 is off; the
       gate transmits B.
       When A=1, B=0 the transistors p1, n1 are off and p2 is on; the
       gate transmits A.
       When A=1, B=1 the transistors p1, n1 are off and p2 is on; the
       gate transmits A.

   Figure: Or gate (transistors p1, n1, p2; inputs A, A', B; node output)
Or (2 of 2)




   This works fine in the strong inversion region. But in subthreshold
   mode, the output current is not sufficient for the gate to drive a
   FO4 load. Hence a chain of two inverters is connected at the final
   output, and the result is used as the custom OR gate.

   Figure: Or gate (transistors p1, n1, p2; inputs A, A', B; node output)
Xnor

   Operation:
       When A=0, B=0 the transistors p1, n1 are on and p2, n2 are off;
       the gate transmits B'.
       When A=0, B=1 the transistors p1, n1 are on and p2, n2 are off;
       the gate transmits B'.
       When A=1, B=0 the transistors p1, n1 are off and p2, n2 are on;
       the gate transmits B.
       When A=1, B=1 the transistors p1, n1 are off and p2, n2 are on;
       the gate transmits B.

   Figure: Xnor gate (transistors n1, p1, n2, p2; inputs A, A', B, B'; node output)
Xor(1 of 2)

   Operation:
       When A=0, B=0 the transistors p1, n1 are off and p2, n2 are on;
       the gate transmits B.
       When A=0, B=1 the transistors p1, n1 are off and p2, n2 are on;
       the gate transmits B.
       When A=1, B=0 the transistors p1, n1 are on and p2, n2 are off;
       the gate transmits B'.
       When A=1, B=1 the transistors p1, n1 are on and p2, n2 are off;
       the gate transmits B'.

   Figure: Xor gate (transistors p1, n1, p2, n2; inputs A, B; node output)
Xor(2 of 2)



   However, the direct XOR implementation is not used in our custom
   library, because investigation showed that the XOR derived from
   XNOR works at a much lower minimum working voltage than the direct
   XOR implementation. The details are given in later slides.

   Figure: Xor gate (transistors p1, n1, p2, n2; inputs A, B; node output)
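   The operation tables of the pass transistor gates can be checked with a
   small behavioral model. In each cell, input A selects which signal is
   passed to the output. The sketch below models only that selection and
   says nothing about transistor sizing, delay or voltage levels. The
   function names are illustrative, and B' is taken to mean the complement
   of B.

```python
def pt_and(a, b):
    """A = 0 passes gnd (0); A = 1 passes B."""
    return b if a else 0


def pt_or(a, b):
    """A = 0 passes B; A = 1 passes A."""
    return a if a else b


def pt_xnor(a, b):
    """A = 0 passes B' (complement of B); A = 1 passes B."""
    return b if a else 1 - b


def pt_xor(a, b):
    """A = 0 passes B; A = 1 passes B' (complement of B)."""
    return 1 - b if a else b


if __name__ == "__main__":
    print("A B | AND OR XNOR XOR")
    for a in (0, 1):
        for b in (0, 1):
            print(f"{a} {b} |  {pt_and(a, b)}   {pt_or(a, b)}   "
                  f"{pt_xnor(a, b)}    {pt_xor(a, b)}")
```

   Each function reproduces the Boolean function its gate is named for,
   which confirms that the per-case descriptions are mutually consistent.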
Summary of the standard cells in PT technology




   Table: Electrical characteristics of different basic cells using pass
   transistor logic in TT process corner

    Basic cell   Minimum Voltage (mV)   Delay (ns)   Driving Current (fA)   Power (nW)   PDP (aJ)
    Buffer              148              2.7258           582.06             0.134        0.365
    Inverter            150              1.5655           590.65             0.197        0.308
    XOR                 155              1.5739           611.69             0.562        0.884
    NAND                170              0.9638           673.64             0.435        0.419
    AND                 175              2.1523           689.82             0.47         1.011
    OR                  155              3.9219           611.81             0.431        1.6903
    Full adder          185              2.9647           734.61             29.516       87.506
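   The PDP column of the table above is the product of the power and delay
   columns: a power in nW multiplied by a delay in ns gives an energy in aJ
   (10^-18 J). The short check below recomputes the PDP from the tabulated
   delay and power values. The dictionary and function names are
   illustrative only.

```python
# cell name -> (delay in ns, power in nW, tabulated PDP in aJ)
CELLS = {
    "Buffer": (2.7258, 0.134, 0.365),
    "Inverter": (1.5655, 0.197, 0.308),
    "XOR": (1.5739, 0.562, 0.884),
    "NAND": (0.9638, 0.435, 0.419),
    "AND": (2.1523, 0.47, 1.011),
    "OR": (3.9219, 0.431, 1.6903),
    "Full adder": (2.9647, 29.516, 87.506),
}


def pdp_attojoule(delay_ns, power_nw):
    """Power-delay product: 1 ns * 1 nW = 1e-18 J = 1 aJ."""
    return delay_ns * power_nw


if __name__ == "__main__":
    for name, (delay_ns, power_nw, listed) in CELLS.items():
        computed = pdp_attojoule(delay_ns, power_nw)
        print(f"{name:<10} computed {computed:8.4f} aJ, listed {listed} aJ")
```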
Design of a MAC Unit



      Multiply-accumulate (MAC) is one of the most frequent and
      energy-consuming operations in DSP and other computationally
      intensive applications.
      It is a fundamental building block in all DSP tasks.
      Therefore, the design of an ultra-low-power MAC is a subject
      of substantial research interest.
      An energy-efficient MAC unit is designed using the custom
      cell library.
Design of a MAC Unit




   Brief Specifications:
       Inputs : 8-bit Multiplier, 8-bit Multiplicand, 17-bit Addend
       Outputs :17-bit MAC output
       Type of Multiplier : Radix-4 Booth encoded multiplier
       Type of Adder : Ripple carry adder
Block diagram of MAC unit

   [Block diagram: inputs MD<7:0> and MR<7:0>. MULTIPLIER section: 2's Complement (-MD), two Shifters (-2MD and 2MD), Booth Encoder, Partial Product Generation (PP0 to PP3, followed by P0 to P3) and Partial Product Adder (<16:0>). ADDER section: Adder producing the output <16:0>.]

                               Figure: Block diagram of MAC unit


Flowchart of MAC Unit

   [Flowchart nodes: MULTIPLICAND, 2's Complement, Shifters, MULTIPLIER, Booth encoder, Partial product generation, Partial product addition, ADDER INPUT, Adder, MAC OUTPUT]

              Figure: Flowchart for MAC operation
Sequence of logic flow
      The multiplicand (MD) input enters the 2's complement block,
      which negates MD to give -MD.
      -MD shifted left by one bit gives -2MD.
      The non-negated MD is also shifted left by one bit to obtain 2MD.
      The Booth encoder block encodes the 8-bit multiplier (MR) into
      12 bits (four groups of three), which control the partial product
      generation.
      Partial product generation selects four 8-bit vectors from
      0, MD, 2MD, -MD and -2MD based on the encoded bits.
      The four partial products are generated by the PP0, PP1, PP2
      and PP3 blocks respectively.
      The partial products are shifted and sign-extended to 16 bits
      by the P0, P1, P2 and P3 blocks respectively.
      The resulting partial products are added to obtain the
      17-bit multiplier output.
      A 17-bit external input is added to the multiplier product
      to give the final MAC output.
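
   The sequence above can be sketched as a small behavioural model. This
   is only an illustration under stated assumptions, not the circuit
   from the slides: the names mac, to_signed and BOOTH_MULTIPLE are
   hypothetical, operands are passed as 8-bit two's complement patterns,
   and Python integers are unbounded, so the 16-bit and 17-bit widths of
   the hardware are not enforced here.

```python
# Multiple of MD selected by each (MR[i+1], MR[i], MR[i-1]) group.
BOOTH_MULTIPLE = {
    0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(md, mr, adder_input, bits=8):
    """Return MD * MR + adder_input, following the block sequence above."""
    md = to_signed(md, bits)
    neg_md = -md                       # 2's complement block
    candidates = {                     # shifters produce the doubled values
        0: 0, 1: md, 2: md << 1, -1: neg_md, -2: neg_md << 1,
    }
    padded = (mr & ((1 << bits) - 1)) << 1   # append MR(-1) = 0
    product = 0
    for k in range(bits // 2):
        group = (padded >> (2 * k)) & 0b111      # Booth encoder input
        pp = candidates[BOOTH_MULTIPLE[group]]   # PPk: select one vector
        product += pp << (2 * k)                 # Pk: shift, then add
    return product + adder_input                 # final adder
```

   For instance, mac(54, 72, 0) evaluates to 3888, which is 54 × 72.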
Modified Booth encoding algorithm



   The modified Booth encoding algorithm is a common choice for
   multiplying signed numbers. It is selected here because it reduces
   the number of partial products to half the number of multiplier
   bits, compared with a conventional Booth encoding scheme. This
   reduces the number of iterations at the cost of increased circuit
   complexity. With half as many partial products to generate and add,
   the power consumption is also reduced, nominally by half. The
   modified Booth encoder based multiplier architecture is designed
   with power consumption in view.
Algorithm Description and Control Implementation
   The modified Booth algorithm considers 3 multiplier bits (MRi+1 ,
   MRi , MRi−1 ) at a time and encodes them to one of -2MD, -MD, 0,
   MD, 2MD according to the table below. MRi refers to the i th bit of
   the multiplier, with bit 0 the LSB. Successive groups advance by two
   bits, so adjacent groups overlap by one bit, and MR−1 is taken to
   be 0.

   Table: Mapping of multiplier bits to encoded bits using Radix 4 Booth
   Encoder
              MRi+1   MRi    MRi−1   Partial Product   A   B   C
               0       0      0              0         0   0   0
               0       0      1             MD         0   1   0
               0       1      0             MD         0   1   0
               0       1      1           2MD          0   0   1
               1       0      0           -2MD         1   0   1
               1       0      1            -MD         1   1   0
               1       1      0            -MD         1   1   0
               1       1      1              0         1   0   0

   where A, B, C indicate the encoded bits for a given MRi+1 , MRi ,
   MRi−1 bits of the multiplier bit sequence starting from the LSB.
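
   The table can be turned directly into a lookup, which makes the
   grouping of the multiplier bits explicit. This is a minimal sketch:
   the names BOOTH_TABLE and booth_encode are hypothetical, and the
   slides give only the table, not the gate-level encoder, so the
   lookup stands in for whatever logic the real encoder uses.

```python
# (MR[i+1], MR[i], MR[i-1]) -> (A, B, C), copied from the table above.
BOOTH_TABLE = {
    (0, 0, 0): (0, 0, 0),  # 0
    (0, 0, 1): (0, 1, 0),  # MD
    (0, 1, 0): (0, 1, 0),  # MD
    (0, 1, 1): (0, 0, 1),  # 2MD
    (1, 0, 0): (1, 0, 1),  # -2MD
    (1, 0, 1): (1, 1, 0),  # -MD
    (1, 1, 0): (1, 1, 0),  # -MD
    (1, 1, 1): (1, 0, 0),  # 0
}


def booth_encode(mr, bits=8):
    """Return one (A, B, C) triple per 3-bit group, LSB group first."""
    if bits % 2:
        raise ValueError("bits must be even")
    padded = (mr & ((1 << bits) - 1)) << 1   # append MR(-1) = 0
    codes = []
    for i in range(0, bits, 2):              # groups advance by two bits
        group = (padded >> i) & 0b111
        key = ((group >> 2) & 1, (group >> 1) & 1, group & 1)
        codes.append(BOOTH_TABLE[key])
    return codes
```

   An 8-bit multiplier yields four triples, that is 12 encoded bits in
   total, one triple per partial product.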
Example
  Consider an example where,
    Multiplier (MR)   : 01001000
    Multiplicand (MD) : 00110110
    Adder input       : 01100010001000001
  So, 2MD = 01101100, -MD = 11001010, -2MD = 10010100.
  Encoding the MR, with MR−1 = 0 appended (010010000) and groups taken
  from the LSB:
    (MR1, MR0, MR−1) = 000 encodes to 000   (0)
    (MR3, MR2, MR1)  = 100 encodes to 101   (-2MD)
    (MR5, MR4, MR3)  = 001 encodes to 010   (MD)
    (MR7, MR6, MR5)  = 010 encodes to 010   (MD)
  Partial Products:            After shifting and sign extending:
    pp0 : 00000000             p0 : 0000000000000000
    pp1 : 10010100             p1 : 1111111001010000
    pp2 : 00110110             p2 : 0000001101100000
    pp3 : 00110110             p3 : 0000110110000000
  Product    = 00000111100110000
  Adder      = 01100010001000001
  MAC OUTPUT = Adder + Product = 01101001101110001
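
  As a decimal cross-check of the partial products in this example, the
  multiplier can be written as a sum of signed radix-4 digits. The
  digit symbol d_k is introduced here only for illustration and does
  not appear in the slides. With MD = 54 and MR = 72:

```latex
\begin{aligned}
MR &= \sum_{k=0}^{3} d_k\,4^{k}
    = 0\cdot 4^{0} + (-2)\cdot 4^{1} + 1\cdot 4^{2} + 1\cdot 4^{3} = 72,\\
MD \times MR &= \underbrace{0}_{p_0}
    + \underbrace{(-2\cdot 54)\cdot 4}_{p_1 = -432}
    + \underbrace{54\cdot 16}_{p_2 = 864}
    + \underbrace{54\cdot 64}_{p_3 = 3456} = 3888 .
\end{aligned}
```

  The value 3888 is 0000111100110000 in binary, which agrees with the
  product obtained from p0 to p3.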
Test Chip

   A 17-bit subthreshold MAC unit is implemented in 90nm CMOS
   technology. The fan-in of each logic gate is carefully selected to
   achieve maximum robustness at near-threshold supply voltages. Since
   the pad-frame input to the MAC is at 1.2V, the input data and clock
   signals are down-converted using a level-shifting down converter.
   The output of the MAC is up-converted to 1.2V by an efficient
   2-stage level shifter before being latched to the output pad frame.
   The design layout is done using Cadence Virtuoso. A total of four
   metal layers are employed in the MAC unit. The MAC unit measures
   658.4µm × 149.49µm, an area of 0.098mm² in 90nm technology.
   Transistor-level circuit analysis is performed using random test
   vectors. The design is extensively tested across PVT variations.
Full chip layout of the proposed design with pad frame




                   Figure: Layout of MAC unit


Design Specs



              Table: Subthreshold MAC design specifications
                       Minimum voltage       220mV
                             Speed           1 MHz
                      Energy per operation   1.63pJ
                        Average power        2.04uW
                        Standby power         1.4uW

   The MAC unit is configured to operate at an extremely low voltage
   of 220mV at a speed of 1MHz in the worst-case process corner (SS)
   at room temperature, and remains functional down to 180mV in the
   typical corner (TT).
MAC Simulation Results (1 of 8)

   Figure: Average Power Consumption of MAC at different supply voltages
   (x-axis: voltage, 200 to 500 mV; y-axis: power, 0 to 100 µW)
MAC Simulation Results (2 of 8)

   Figure: Operating frequency of MAC unit at different supply voltages
   under global variation
   (x-axis: voltage, 220 to 250 mV; y-axis: frequency, 0 to 12 MHz;
   one curve per process corner: SS, SF, FS, TT, FF)
MAC Simulation Results (3 of 8)

   Figure: Energy/operation at different supply voltages
   (x-axis: voltage, 200 to 500 mV; y-axis: energy/op, 1000 to 7000 fJ)
MAC Simulation Results (4 of 8)

   Figure: Short circuit, static and capacitive current ratings at different
   supply voltages
   (x-axis: voltage, 200 to 500 mV; y-axis: current, −1 to 3 µA;
   curves: static current, dynamic current, capacitive current)
MAC Simulation Results (5 of 8)

   Figure: Standby power versus supply voltage at different temperatures
   (x-axis: supply, 200 to 500 mV; y-axis: standby power, −0.5 to 3 µW;
   curves: 0°C, 27°C, 100°C)
MAC Simulation Results (6 of 8)

   Figure: Current ratings at different operating temperatures at supply
   voltage 220mV
   (x-axis: temperature, −40 to 120 °C; y-axis: current, −1 to 3 µA;
   curves: static current, dynamic current, capacitive current)
MAC Simulation Results (7 of 8)

   Figure: Performance of MAC at different temperatures at supply voltage
   220mV
   (x-axis: temperature, −40 to 120 °C; y-axis: delay, 100 to 1000 ns)
MAC Simulation Results (8 of 8)

   Figure: Average power of MAC at different temperatures at supply
   voltage 220mV
   (x-axis: temperature, −40 to 120 °C; y-axis: power, 0 to 300 µW)
Conclusion
   In this research project,
        Several logic families are investigated in the subthreshold
        range to build optimum subthreshold standard cells.
        The pass transistor logic family was chosen for its energy
        efficiency compared to other subthreshold logic families.
        An optimal design choice is made for each subthreshold
        standard cell, based on power-delay product.
        A 17-bit subthreshold MAC chip is implemented using
        customized subthreshold standard cells.
        The custom cell layout is done using Cadence Virtuoso and
        tested in all process corners using the Nanosim simulator.
        It is designed to work at a minimum voltage of 220mV and
        consumes as little as 1.62pJ per operation at an operating
        frequency of 1.0MHz.

  • 6. Subthreshold Mode What is Subthreshold mode A basic MOS transistor works in three different modes of operation. 1. Active or Saturation Mode 2. Linear or Triode Mode 3. Cutoff or Subthreshold Mode
  • 7. Modes of a MOSFET operation Modes of a MOSFET A basic MOS transistor works in three different modes of operation. 1. Active or Saturation Mode 2. Linear or Triode Mode 3. Cutoff or Subthreshold Mode
  • 8. All about Subthreshold Mode! What is Subthreshold mode The subthreshold operation of CMOS transistor is performed when the gate to source potential (Vgs ) is less than threshold voltage(Vth ). Advantages: 1. As the device is operating in ultra low voltages(200-300mV), the dynamic power component is highly reduced. 2. Highly suitable for low power low speed applications like sensor nodes, battery operated devices etc., Disadvantages: 1. As the driving currents are the weak leakage currents the time to charge and discharge the nodes is high, making the speed in between 1-10MHz. 2. Transistor sizing criticality 3. Low On-Off Current ratio. 4. High Sensitivity to Process, Voltage and Temperature variations.
  • 9. Subthreshold Current Model (1 of 2) In Subthreshold regime, the drain current(Ids ) varies exponentially. In long channel device, threshold voltage does not depend on drain voltage or channel length. But in sub-micron technology, due to drain induced barrier lowering(DIBL), threshold voltage does depend on drain voltage, as source/drain depletion region penetrates significantly into the channel. The subthreshold current of CMOS transistor is given by the following equation, Isub = I0 × e (Vgs −Vth +ηVds )/nvt × 1 − e −Vds /Vth . (1)
  • 10. Subthreshold Current Model (2 of 2) Isub = I0 × e (Vgs −Vth +ηVds )/nvt × 1 − e −Vds /Vth . (2) where 2 I0 = µo Cox (W /L)(n − 1)Vth (3) and Vgs = transistor gate to source voltage, Vds = drain to source voltage, Vth = threshold voltage, vt = KT /q is the thermal voltage, n = subthreshold slope factor = (1 + Cd /Cox ) Cd = drain capacitance Cox = gate capacitance η = DIBL co-efficient µo = Mobility. W and L are the width and channel length of MOSFET respectively.
  • 11. Subthreshold Power Model (1 ) For low frequency mobile devices, the advantage of subthreshold design is widely achieved through radical circuit power reduction at the cost of operating speed . The total power consumption of the digital circuit is given by following equation. Ptotal = Pdynamic + Pshort−circuit + Pstatic (4)
  • 12. Subthreshold Power Model (2 ) Dynamic Power Dynamic power is described by following equation, Pdynamic = αfCeff Vdd 2 (5) where α is activity factor, f is switching frequency, Ceff is the effective capacitance. As dynamic power is directly proportional with the square of supply voltage, significant power reduction is achieved in subthreshold voltage.
  • 13. Subthreshold Power Model (3 ) Dynamic Power At 220mV, the dynamic charging current which is directly proportional with dynamic power, is reduced by almost 248.49X compared to supply voltage of 1.2V for an inverter at TT process corner. 3 10 TT FS SF 2 SS 10 FF Current (uA) 1 10 0 10 −1 10 0.2 0.4 0.6 0.8 1 1.2 Supply voltage (V)
  • 14. Subthreshold Power Model (4 ) Static Power Static power is the power consumed by the circuit during idle state and described by following equation. Pstatic = ILeakage Vdd (6) The leakage current consists of various components, subthreshold leakage, gate tunneling, gate induced drain lowering (GIDL) and reverse bias diode leakage. The subthreshold leakage varies according to equation (2). Thus with reduction of drain voltage, the DIBL effect reduces which in turn reduces subthreshold leakage current. The gate tunneling has significant contribution to overall leakage current, which also reduces with gate or supply voltage. GIDL and reverse bias diode leakage also significantly reduce due to supply voltage reduction in a subthreshold circuit.
  • 15. Subthreshold Power Model (5 ) Static Power At 220mV, the subthreshold leakage current at weak inversion is reduced by almost 8.55X compared to strong inversion(supply voltage 1.2V) at TT process corner. 3 10 TT FS SF 2 SS 10 FF Current (nA) 1 10 0 10 −1 10 0.2 0.4 0.6 0.8 1 1.2 Supply voltage (V)
  • 16. Subthreshold Power Model (6 ) Short Circuit Power Short circuit power is the power dissipated due to current conduction between Vdd and VSS during logic transition. It is described by the following equation. Pstatic = Ishort−circuit Vdd (7) Although short-circuit current flowing time is increased due to slower operation in subthreshold, but reduced supply voltage decreases electron conduction, which in turn reduces Ishort−circuit .
  • 17. Subthreshold Power Model (5 ) Short Circuit Power At 220mV, there is a 446.45X reduction in short circuit current compared to full rail voltage of 1.2V in TT process corner. 2 10 1 10 Current (uA) 0 10 TT FS −1 SF 10 SS FF −2 10 0.2 0.4 0.6 0.8 1 1.2 Supply voltage (V) Figure: Short circuit current rating under varying supply voltage for an
  • 18. Subthreshold Design Challenges (1) Transistor Sizing Criticality On-Off Current Ratio PVT variations Noise Margin
  • 19. Subthreshold Design Challenges (2) Transistor Sizing Criticality The relative strength of pull-up, pull-down is very critical for optimal rise and fall time. As subthreshold current depends exponentially on Vth , any variation in threshold of NMOS and PMOS can change the β ratio drastically which directly affects rise/fall time and may trigger logic failure. The shift in β ratio is observed in low-voltage, enforcing us to size the cell transistor very carefully.
  • 20. Subthreshold Design Challenges (2) Transistor Sizing Criticality The relative strength of pull-up, pull-down is very critical for optimal rise and fall time. As subthreshold current depends exponentially on Vth , any variation in threshold of NMOS and PMOS can change the β ratio drastically which directly affects rise/fall time and may trigger logic failure. The shift in β ratio is observed in low-voltage, enforcing us to size the cell transistor very carefully.
  • 21. Subthreshold Design Challenges (2) Ratio of NMOS ION and PMOS ION at different corners 3 10 TT FF FS SF SS ION NMOS / ION PMOS 2 10 1 10 0 10 0 0.2 0.4 0.6 0.8 1 1.2 1.4 Supply(V) Figure: Ratio of NMOS ION and PMOS ION at different corners
  • 22. Subthreshold Design Challenges (2) Ratio of NMOS ION and PMOS ION at different temperatures 30 −40C −20C 0C 25 20C 40C 60C ION NMOS / ION PMOS 20 80C 100C 120C 15 10 5 0 0 0.2 0.4 0.6 0.8 1 1.2 1.4 Supply(V) Figure: Ratio of NMOS ION and PMOS ION at different temperatures
  • 23. Subthreshold Design Challenges (3) On-Off Current Ratio The drain current of MOSFET increases exponentially in subthreshold region whereas in strong inversion it changes very slowly due to velocity saturation of majority carriers. In subthreshold region, the threshold voltage deviation and degradation of ION /IOFF of the current makes the circuit operation very critical. In subthreshold region like 0.2V, ION /IOFF degrades to below 300 at room temperature.There is strong race condition between on and off devices during setting of a critical signal and this determines the maximum number of allowable cells per bit-line. When this current ratio degrades to very low value, it becomes very difficult to differentiate between logic ‘1’ and logic ‘0’. If we consider process variations, this ratio becomes worse in FF corner as shown.
  • 24. Subthreshold Design Challenges (3) On-Off Current Ratio. Figure: Ratio of NMOS ION and IOFF at different temperatures (−40C to 120C), log scale, versus Supply (V) from 0 to 1.4 V. Observation: Significant β ratio variation is observed in low voltage at different temperatures.
  • 25. Subthreshold Design Challenges (3) On-Off Current Ratio. Figure: Ratio of PMOS ION and IOFF at different temperatures (−40C to 120C), log scale, versus Supply (V) from 0 to 1.4 V. Observation: Significant β ratio variation is observed in low voltage at different temperatures.
  • 26. On-Off Current Ratio. Figure: Ratio of NMOS ION and IOFF at different corners (TT, FF, FS, SF, SS), log scale, versus Supply (V) from 0 to 1.4 V. Observation: Significant β ratio variation is observed in low voltage at different temperatures.
  • 27. On-Off Current Ratio. Figure: Ratio of PMOS ION and IOFF at different corners (TT, FF, FS, SF, SS), log scale, versus Supply (V) from 0 to 1.4 V. Observation: Significant β ratio variation is observed in low voltage at different temperatures.
  • 28. A Look into other Logic families. The conventional Complementary MOS (CMOS) logic family poses several disadvantages when operated at subthreshold voltages. A few of them are: 1. High power dissipation. 2. Weak noise margins. 3. Huge delays. Thus a CMOS logic family is not optimum for subthreshold operation.
  • 29. A study of several other logic families is made with power and energy consumption as the prime concern.
    Table: Minimum working voltages for different logic families for a basic AND gate
    Logic Family      Minimum Voltage(mV)   Delay(ns)   Driving Current(nA)   Power(nW)   PDP(fJ)
    Sub-CMOS          250                   2.56        3330                  1859        4.759
    Pseudo NMOS       220                   4.765       102.56                0.6023      2.87
    DTMOS             180                   8.4173      32.54                 233.63      1.97
    Domino            240                   7.6477      476.13                639.41      4.89
    Pass Transistor   200                   4.9953      201.43                426.17      2.13
    DTPT              175                   6.598       128.39                204.68      1.35
    Table: Energy comparison at 250mV for different logic families for a basic AND gate
    Logic Family      Delay(ns)   Driving Current(nA)   Power(nW)   PDP(fJ)
    Sub-CMOS          2.56        3330                  1859        4.759
    Pseudo NMOS       3.8637      761.938               0.9848      3.805
    DTMOS             11.116      89.204                1.501       16.68
    Domino            4.5477      568.31                1.119       5.09
    Pass Transistor   2.2641      652.88                1.502       3.39
    DTPT              1.8432      830                   1.503       2.77
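The PDP column is the product of the power and delay columns. The sketch below recomputes it for the 250mV table as a consistency check. It assumes that the power entries other than Sub-CMOS are in µW rather than nW, since the listed PDP values only reproduce under that reading; the Sub-CMOS entry (1859 nW) is therefore entered as 1.859 µW. This unit interpretation is an assumption, not something stated on the slide.

```python
# (delay_ns, power_uW, listed_pdp_fJ) per logic family, from the 250mV table.
# Assumption: power is in microwatts; Sub-CMOS (listed as 1859 nW) is 1.859 uW.
TABLE_250MV = {
    "Sub-CMOS": (2.56, 1.859, 4.759),
    "Pseudo NMOS": (3.8637, 0.9848, 3.805),
    "DTMOS": (11.116, 1.501, 16.68),
    "Domino": (4.5477, 1.119, 5.09),
    "Pass Transistor": (2.2641, 1.502, 3.39),
    "DTPT": (1.8432, 1.503, 2.77),
}


def pdp_fj(delay_ns, power_uw):
    """Power-delay product in fJ (1 ns * 1 uW = 1e-15 J = 1 fJ)."""
    return delay_ns * power_uw


def rank_by_pdp(table):
    """Family names sorted from lowest to highest computed PDP."""
    return sorted(table, key=lambda name: pdp_fj(*table[name][:2]))


if __name__ == "__main__":
    for name in rank_by_pdp(TABLE_250MV):
        delay, power, listed = TABLE_250MV[name]
        print(f"{name:16s} computed {pdp_fj(delay, power):7.3f} fJ, listed {listed} fJ")
```

Ranking by the computed PDP places the two pass-transistor based families (DTPT and Pass Transistor) at the low-energy end of the list.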
  • 30. Custom Cell Library. All the standard cells are designed in 90nm PT technology. The cells are fine tuned for their sizing, driving capability and minimum working voltage. The cells that are customized are: Inverter, Buffer, And, Or, Xor, Xnor.
  • 31. Inverter This is the only gate in the library that is based on CMOS technology. The only modification is that the driving capability of the cell is increased by improving the effective channel length of the P and N devices as shown.
  • 32. Buffer Buffer gate is obtained by connecting two inverters in series.
  • 33. And (1 of 2) Operation:
    When A=0, B=0 the transistors p1, n1, n3 are on and p2, n2, p3, p4 are off, and the gate transmits gnd.
    When A=0, B=1 the transistors p1, n1, p3 are on and p2, n2, n3, p4 are off, and the gate transmits gnd.
    When A=1, B=0 the transistors p2, n2, n3, p4 are on and p1, n1, p3 are off, and the gate transmits B.
    When A=1, B=1 the transistors p2, n2, p3, p4 are on and p1, n1, n3 are off, and the gate transmits vdc.
    Figure: And gate (signals A, A', B, B', vdc, gnd, output; transistors p1 to p4 and n1 to n3).
  • 34. And (2 of 2) Need for the additional MOSFETs n3, p3, p4:
    When the inputs are A=1, B=0, the output node is discharged to zero.
    When the inputs are A=1, B=1, the output should be connected to B, which should charge it to ‘1’.
    But due to the larger subthreshold delay, the node that was discharged earlier takes a longer time to charge to ‘1’.
    Hence an alternate path is provided to charge the output node to ‘1’.
    Figure: And gate
  • 35. Or (1 of 2) Operation:
    When A=0, B=0 the transistors p1, n1 are on and p2 is off, and the gate transmits B.
    When A=0, B=1 the transistors p1, n1 are on and p2 is off, and the gate transmits B.
    When A=1, B=0 the transistors p1, n1 are off and p2 is on, and the gate transmits A.
    When A=1, B=1 the transistors p1, n1 are off and p2 is on, and the gate transmits A.
    Figure: Or gate (signals A, A', B, output; transistors p1, p2, n1).
  • 36. Or (2 of 2) This works fine in the strong inversion region. But when subthreshold mode is considered, the output current is not sufficient for the gate to drive a FO4 load. Hence a chain of two inverters is connected at the final output, and the result is taken as the custom OR gate. Figure: Or gate
  • 37. Xnor Operation:
    When A=0, B=0 the transistors p1, n1 are on and p2, n2 are off, and the gate transmits B'.
    When A=0, B=1 the transistors p1, n1 are on and p2, n2 are off, and the gate transmits B'.
    When A=1, B=0 the transistors p1, n1 are off and p2, n2 are on, and the gate transmits B.
    When A=1, B=1 the transistors p1, n1 are off and p2, n2 are on, and the gate transmits B.
    Figure: Xnor gate (signals A, A', B, B', output; transistors p1, p2, n1, n2).
  • 38. Xor (1 of 2) Operation:
    When A=0, B=0 the transistors p1, n1 are off and p2, n2 are on, and the gate transmits B.
    When A=0, B=1 the transistors p1, n1 are off and p2, n2 are on, and the gate transmits B.
    When A=1, B=0 the transistors p1, n1 are on and p2, n2 are off, and the gate transmits B'.
    When A=1, B=1 the transistors p1, n1 are on and p2, n2 are off, and the gate transmits B'.
    Figure: Xor gate (signals A, B, output; transistors p1, p2, n1, n2).
  • 39. Xor (2 of 2) However, the direct XOR implementation is not used in our custom library, because investigation showed that the XOR derived from the XNOR works at a much lower minimum working voltage than the direct XOR implementation. The details are given in later slides. Figure: Xor gate
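The gate descriptions above all follow one pattern: input A (with its complement) selects which signal is passed to the output. A minimal behavioural sketch of that selection is given below. It models logic values only, not transistor-level behaviour, and it assumes that the XOR is derived from the XNOR by inverting its output, which the slides do not spell out. The function names are illustrative.

```python
from itertools import product


def pt_and(a, b):
    """A=0 passes gnd (0); A=1 passes B."""
    return b if a else 0


def pt_or(a, b):
    """A=0 passes B; A=1 passes A."""
    return a if a else b


def pt_xnor(a, b):
    """A=0 passes the complement of B; A=1 passes B."""
    return b if a else 1 - b


def pt_xor_from_xnor(a, b):
    """Assumed derivation: XNOR followed by an inverter."""
    return 1 - pt_xnor(a, b)


def truth_table(gate):
    """Gate outputs for (A, B) = 00, 01, 10, 11."""
    return [gate(a, b) for a, b in product((0, 1), repeat=2)]


if __name__ == "__main__":
    for gate in (pt_and, pt_or, pt_xnor, pt_xor_from_xnor):
        print(gate.__name__, truth_table(gate))
```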
  • 40. Summary of the standard cells in PT technology
    Table: Electrical characteristics of different basic cells using pass transistor logic in TT process corner
    Basic cell   Minimum Voltage(mV)   Delay(ns)   Driving Current(fA)   Power(nW)   PDP(aJ)
    Buffer       148                   2.7258      582.06                0.134       0.365
    Inverter     150                   1.5655      590.65                0.197       0.308
    XOR          155                   1.5739      611.69                0.562       0.884
    NAND         170                   0.9638      673.64                0.435       0.419
    AND          175                   2.1523      689.82                0.47        1.011
    OR           155                   3.9219      611.81                0.431       1.6903
    Full adder   185                   2.9647      734.61                29.516      87.506
  • 41. Design of a MAC Unit. Multiply-accumulate (MAC) is one of the most frequent and most energy-consuming operations in DSP and other computationally intensive applications. It is a fundamental building block in all DSP tasks. Therefore, designing an ultra-low power MAC is a subject of substantial research interest. An energy efficient MAC unit is designed using the custom cell library.
  • 42. Design of a MAC Unit. Brief Specifications:
    Inputs: 8-bit Multiplier, 8-bit Multiplicand, 17-bit Addend
    Outputs: 17-bit MAC output
    Type of Multiplier: Radix-4 Booth encoded multiplier
    Type of Adder: Ripple carry adder
  • 43. Block diagram of MAC unit. Figure: Block diagram of MAC unit. The MULTIPLIER section takes MD<7:0> and MR<7:0> and contains the 2s Complement block (-MD), the shifters (-2MD and 2MD), the Booth Encoder, Partial Product Generation (PP0 to PP3), the blocks P0 to P3 and the Partial Product Adder. The ADDER section combines the product with the INPUT <16:0> to produce the OUTPUT <16:0>.
  • 44. Flowchart of MAC Unit. Figure: Flowchart for MAC operation. MULTIPLICAND -> 2s Complement and Shifters; MULTIPLIER -> Booth encoder; both feed Partial product generation -> Partial product addition -> Adder (together with ADDER INPUT) -> MAC OUTPUT.
  • 45. Sequence of logic flow. The multiplicand (MD) input enters the 2s complement block, which negates the value of MD. The resulting -MD, shifted left, gives -2MD. The non-negated MD is also shifted left to obtain 2MD. The Booth encoder block encodes the 8 bit multiplier (MR) into 12 bits, which control the partial product generation. The partial product generation selects four 8 bit vectors based on the encoded bits. The four partial products are generated by the PP0, PP1, PP2 and PP3 blocks respectively. The partial products are shifted and sign extended to 16 bits by the P0, P1, P2 and P3 blocks respectively. The resulting partial products are finally added to obtain the 17 bit multiplier output. A 17 bit external input is added to the multiplier product to give the final MAC output.
  • 46. Modified Booth encoding algorithm. The modified Booth encoding algorithm is frequently chosen for multiplication of signed numbers. It is selected here because it reduces the number of partial products to half the number of multiplier bits, compared with a conventional Booth encoding scheme. This reduces the number of iterations at the cost of increased circuit complexity. Thus the power consumption is also reduced by half. The modified Booth encoder based multiplier architecture is designed with power consumption in view.
  • 47. Algorithm Description and Control Implementation. The modified Booth algorithm considers 3 multiplier bits (MRi+1, MRi, MRi−1) at a time and encodes them to one of the values -2MD, -MD, 0, MD, 2MD based on the table below. MRi refers to the i-th bit of the multiplier, where i ranges from 0 to the number of multiplier bits, and MR−1 is taken to be 0.
    Table: Mapping of multiplier bits to encoded bits using Radix 4 Booth Encoder
    MRi+1   MRi   MRi−1   Partial Product   A   B   C
    0       0     0       0                 0   0   0
    0       0     1       MD                0   1   0
    0       1     0       MD                0   1   0
    0       1     1       2MD               0   0   1
    1       0     0       -2MD              1   0   1
    1       0     1       -MD               1   1   0
    1       1     0       -MD               1   1   0
    1       1     1       0                 1   0   0
    where A, B, C indicate the encoded bits for the given MRi+1, MRi, MRi−1 bits of the multiplier bit sequence, starting from the LSB.
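The mapping in the table can be written as a small lookup, sketched below. This is a behavioural model of the encoder only; the names BOOTH_TABLE, booth_encode and signed_value are illustrative and do not come from the design files.

```python
# (MR[i+1], MR[i], MR[i-1]) -> (multiple of MD, (A, B, C)), as in the table.
BOOTH_TABLE = {
    (0, 0, 0): (0, (0, 0, 0)),
    (0, 0, 1): (1, (0, 1, 0)),
    (0, 1, 0): (1, (0, 1, 0)),
    (0, 1, 1): (2, (0, 0, 1)),
    (1, 0, 0): (-2, (1, 0, 1)),
    (1, 0, 1): (-1, (1, 1, 0)),
    (1, 1, 0): (-1, (1, 1, 0)),
    (1, 1, 1): (0, (1, 0, 0)),
}


def booth_encode(mr, width=8):
    """Return one (digit, (A, B, C)) pair per overlapping triplet, LSB first."""
    bits = [(mr >> i) & 1 for i in range(width)]
    padded = [0] + bits  # padded[k] holds MR[k-1], so MR[-1] = 0
    return [
        BOOTH_TABLE[(padded[i + 2], padded[i + 1], padded[i])]
        for i in range(0, width, 2)
    ]


def signed_value(mr, width=8):
    """Interpret a width-bit pattern as a two's complement number."""
    return mr - (1 << width) if mr >> (width - 1) else mr


if __name__ == "__main__":
    for digit, code in booth_encode(0b01001000):
        print(f"digit {digit:+d} * MD, encoded bits A,B,C = {code}")
```

An 8 bit multiplier yields four triplets and therefore 4 × 3 = 12 encoded bits. Weighting the k-th digit by 4^k recovers the signed multiplier value, which is why only four partial products are needed.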
  • 48. Example. Consider an example where
    Multiplier (MR):     01001000
    Multiplicand (MD):   00110110
    Adder input:         01100010001000001
    So 2MD = 01101100, -MD = 11001010, -2MD = 10010100.
    Encoding the MR (010010000 after appending MR−1 = 0), taking triplets from the LSB:
    000 encodes to 000
    100 encodes to 101
    001 encodes to 010
    010 encodes to 010
    Partial products:    After shifting and sign extending:
    pp0: 00000000        p0: 0000000000000000
    pp1: 10010100        p1: 1111111001010000
    pp2: 00110110        p2: 0000001101100000
    pp3: 00110110        p3: 0000110110000000
    Adder      = 01100010001000001
    Product    = 00000111100110000
    MAC OUTPUT = 01101001101110001
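The partial products and the product of the example can be cross-checked with a short behavioural model of the datapath. The sketch below is written under stated assumptions: Python integers stand in for the hardware vectors, the Booth digits are computed arithmetically as -2·MRi+1 + MRi + MRi−1 rather than through the A, B, C control bits, and the result is wrapped to 17 bits. All names are illustrative.

```python
def to_signed(value, width):
    """Interpret the low `width` bits of value as two's complement."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def bits(value, width):
    """Two's complement bit string of the given width."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def booth_digits(mr, width=8):
    """Radix-4 Booth digits, LSB first: d = -2*MR[i+1] + MR[i] + MR[i-1]."""
    def bit(k):
        return (mr >> k) & 1 if k >= 0 else 0
    return [-2 * bit(i + 1) + bit(i) + bit(i - 1) for i in range(0, width, 2)]


def mac(md, mr, addend, width=8):
    """Return (shifted partial products, product, MAC result)."""
    md_signed = to_signed(md, width)
    partials = [
        (digit * md_signed) << (2 * k)  # select the multiple, then shift by 2k
        for k, digit in enumerate(booth_digits(mr, width))
    ]
    product = sum(partials)
    out_width = 2 * width + 1
    result = to_signed(product + to_signed(addend, out_width), out_width)
    return partials, product, result


if __name__ == "__main__":
    partials, product, result = mac(0b00110110, 0b01001000, 0b01100010001000001)
    for k, p in enumerate(partials):
        print(f"p{k}: {bits(p, 16)}")
    print("Product   :", bits(product, 17))
    print("MAC output:", bits(result, 17))
```

For the inputs of the example, the model reproduces the four sign-extended partial products and the product 3888 (72 × 54).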
  • 49. Test Chip. A 17 bit subthreshold MAC unit is implemented in 90nm CMOS technology. The fan-in of each logic gate is carefully selected to achieve maximum robustness at near-threshold supply voltage. Since the pad-frame input to the MAC is 1.2V, input data and clock signals are down-converted using a level shifter down converter. The output of the MAC is up-converted to 1.2V using an efficient 2-stage level shifter before being latched to the output padframe. The design layout is done using Cadence Virtuoso. A total of four metal layers are employed to design the MAC unit. The MAC unit size is 658.4µm × 149.49µm, which corresponds to an area of 0.098mm² in 90nm technology. The transistor level circuit analysis is performed using random test vectors. The design is extensively tested for PVT variations.
  • 50. Full chip layout of the proposed design with pad frame. Figure: Layout of MAC unit
  • 51. Design Specs
    Table: Subthreshold MAC design specifications
    Minimum voltage        220mV
    Speed                  1 MHz
    Energy per operation   1.63pJ
    Average power          2.04uW
    Standby power          1.4uW
    The MAC unit is configured to operate at an extremely low voltage of 220mV at a speed of 1MHz for the worst case process corner (SS) at room temperature, and it remains functional down to 180mV at the typical corner (TT).
  • 52. MAC Simulation Results (1 of 8) Figure: Average Power Consumption of MAC at different supply voltages. Axes: power (uW) versus voltage (mV), 200 to 500 mV.
  • 53. MAC Simulation Results (2 of 8) Figure: Operating frequency of MAC unit at different supply voltages under global variation. Axes: Frequency (MHz) versus Voltage (mV), 220 to 250 mV; curves for SS, SF, FS, TT, FF.
  • 54. MAC Simulation Results (3 of 8) Figure: Energy/operation at different supply voltages. Axes: Energy/op (fJ) versus voltage (mV), 200 to 500 mV.
  • 55. MAC Simulation Results (4 of 8) Figure: Short circuit, static and capacitive current ratings at different supply voltages. Axes: Current (uA) versus Voltage (mV), 200 to 500 mV; curves for static current, dynamic current, capacitive current.
  • 56. MAC Simulation Results (5 of 8) Figure: Standby power versus supply voltage at different temperatures. Axes: Stand By Power (uW) versus Supply (mV), 200 to 500 mV; curves for 0C, 27C, 100C.
  • 57. MAC Simulation Results (6 of 8) Figure: Current ratings at different operating temperatures at supply voltage 220mV. Axes: Current (uA) versus temp (C), −40 to 120; curves for static current, dynamic current, capacitive current.
  • 58. MAC Simulation Results (7 of 8) Figure: Performance of MAC at different temperatures at supply voltage 220mV. Axes: delay (ns) versus temp (C), −40 to 120.
  • 59. MAC Simulation Results (8 of 8) Figure: Average power of MAC at different temperatures at supply voltage 220mV. Axes: power (uW) versus temp (C), −40 to 120.
  • 60. Conclusion. In this research project, several logic families are investigated in the subthreshold range to build optimum subthreshold standard cells. The pass transistor logic family was chosen for its energy efficiency compared to other subthreshold logic families. An optimal design choice is made for each subthreshold standard cell, based on power delay product. A 17 bit subthreshold MAC chip is implemented using the customized subthreshold standard cells. The custom cell layout is done using Cadence Virtuoso and tested in all process corners using the Nanosim simulator. The unit is designed to work at a minimum voltage of 220mV and consumes an ultra low energy of as little as 1.62pJ per operation at an operating performance of 1.0MHz.