SlideShare una empresa de Scribd logo
1 de 19
Descargar para leer sin conexión
HW Emulators:
Does it belong in your
Verification tool chest?
DV Club, May 23rd, 2007



 Jai Kumar
     Verification Technologist
          Sun Microsystems Inc.
     jai.kumar@sun.com
          http://sun.com
Verification Challenges

• Large SoCs -
  > Multi-core, multi-thread every where
  > Complex IOs: ENET, PCIE, etc.
  > Verification state-space explosion
• Fierce Competition -
  > Time-to-Market
  > Cost
• Tools not keeping the pace with requirements -
  > Capacity
  > Performance

                                                   Slide 2
UltraSPARC T1 Processor
   • A true revolutionary processor
   • Up to eight 4-way multithreaded
     cores for up to 32 simultaneous
     threads
   • All cores connected through a
     134.4GB/s crossbar switch
   • High-bandwidth 12-way
     associative
   • 4 DDR2 channels (23GB/s)
   • Power : 63W @1.2GHz, 1.2v
   • ~300M transistors
   • 378 sq. mm die
   • Verification covered in earlier DV
     Club presentation
DV Club                                   Jai Kumar   Slide 3
UltraSPARC T2 Processor
                             • True SoC
                             • 8 SPARC Cores, 8
                               threads each
                             • Shared 4MB L2 8-
                               banks, 16-way
                               associative
                             • 4 dual-channel
                               FBDIMM memory
                               controllers
                             • 2 1/10 Gb ENET ports
                               w/ onboard packet
                               classification and
                               filtering
                             • 1 PCIE x8 1.0 port

DV Club          Jai Kumar                      Slide 4
What is an Accelerator/Emulator?

• Emulation- synonomous with HW Acceleration;
  subtle difference – involves target system hw
• High-level proto-typing system – virtual silicon
• RTL is synthesized to gates; gates are mapped to
  FPGA or custom-processor based HW to execute
  design in parallel at hw speed
• Ease-of-use – make it appear as a fast
  “simulator” - more than 10,000X faster than SW
  Simulator
                                               Slide 5
Convince it is the right thing to do...




          Emulator HW               Gulfstream jet


DV Club                 Jai Kumar                    Slide 6
Cost of Bugs – simple analysis
 • Showstopper bug early in design phase
          > Cost: almost negligible
 • Showstopper bug close to tapeout
          > Cost: schedule impact
 • Showstopper bug after tapeout
          > Cost: Few Silicon respins, schedule impact
 • Showstopper bug close at Revenue Release
          > Loss of Revenue at $1+ Million per day
          > Reduce competitive edge
 • Showstopper bug 1year after Revenue Release
          > Cost of recall at $150Million
          > Damages corporate reputation                       Source: N.Winkworth & B.Blohm

 Getting SW/HW right the first time is critical to survival!
DV Club                                      Jai Kumar                                  Slide 7
When you put everything together
  and your simulator comes to a
  crawl....
           Who do you call?




             Emulators
DV Club                Jai Kumar   Slide 8
Design Size Impact on Sim Speed

                           1000000


                           100000
Performance (cycles/sec)




                            10000
                                                                                             SW Sim
                                                                                             Acceleration
                              1000
                                                                                             Emulation


                               100


                                10


                                 1
                                     0   20   40     60     80    100      120   140   160
                                                   Design Size (M gates)
                                                                                               Slide 9
When you need to run millions of
  cycles to catch those nasty deep
  corner case bugs....
           Who do you call?




             Emulators
DV Club                Jai Kumar     Slide 10
Simplified Bug Find Rate

                                         90

                                         80
          Bug Rate - Simulation Cycles




                                         70

                                         60

                                         50
                                                                                                                 Bug Rate
                                         40                                                                      M cycles

                                         30

                                         20

                                         10

                                         0
                                              1   2   3   4   5    6     7     8         9   10   11   12   13
                                                                  Time

DV Club                                                                      Jai Kumar                                      Slide 11
When you need to boot firmware,
  and Solaris....

           Who do you call?




                                   Simulation - ~27years
             Emulators             Emulation - 8.5hours

DV Club                Jai Kumar                    Slide 12
When you need to run real world
  PCIE, ENET IO traffic on your
  SoCs....
           Who do you call?




             Emulators
DV Club                Jai Kumar    Slide 13
Emulation Resources – at a glance
            Walk                 Run                     Drive             Fly




          SW Simulator      Competitor X                 Xtreme Server   Palladium2
Solaris
Boot         26.7 Years       2.6 Years                 2Days19Hrs       8 Hrs 30mins
TIme


                Flexibility: Supported RTL, TestBench constructs

  DV Club                                   Jai Kumar                              Slide 14
Usage Model
                                        SW Simulator (VCS/XSIM)

          Block-Level                 Full-Chip                            System                Post-Silicon
                                                   Verification Timeline                TapeOut
              Short, Directed Tests/ISS Compared
                                                                           Long, Random Tests/Self Checking

                                                                                DFT
                                    Tharas CoSim

                                                       Xtreme CoSim
                                                                                Xtreme Targetless Emulation


                                                                                  Palladium Targetless/ICE
                                                                                    Firmware/     IO Verif
                                                                                    SW Stack      (PCIE/
                                                                                                  ENET)

DV Club                                                    Jai Kumar                                          Slide 15
Emulation Verification:

    Get System HW/SW right the first time:

    •Prevent Functional Bug escapes (reduce Silicon re-
    spins - find bugs before tape-out)
    •Facilitate early SW Development and System
    Integration
    •Aid Post-Silicon Debug & Fix Validation



DV Club                      Jai Kumar              Slide 16
Planning for Success
      ●   Setup Emulation Environment early -
          −   Create technology-awareness within team early in design phase
          −   Decide Usage mode – co-sim, targetless, in-circuit emulation
          −   Minimize learning curve - integrate into existing simulation flow
          −   Not everything can be run on the emulator – prioritize & plan usage
      ●   Reduce Time-to-Model Build -
          −   Apply RTL enforcement (use lint, emulator tools);
      ●   Minimize Modeling Issues
          −   Array Modeling Methodology – abstract higher, make it “synthesizable”, race free
          −   Optimize for Capacity and Performance: Eliminate non-functional models, abstract
              low-level details, minimize and align clocks
      ●   Simplify Debug
          − Implement critical-set of Monitors for acceleration
          − Invest in debug tools to ease the debug complexity
          − Do not use it as yet another simulator (minimize waveform dumps)
          Maximize ROI.
DV Club                                             Jai Kumar                                    Slide 17
HW Emulation Results
   •      Initial design bringup in co-simulation mode with ISS
   •      Target-less emulation for fastest throughput - trillions of cycles
   •      Ran really long directed tests
   •      Ran a number of self-checking random code generators to target
          tests to specific functionality – some tests run for days
   •      Reset Sequence testing, RAS Testing, On-Chip Debug test
   •      PCIE & ENET Testing using Speed Bridges
   •      JTAG Scan Testing using In-Circuit Emulation
   •      Motherboard bringup and test using In-Circuit Emulation
   •      Facilitated readiness of silicon test/debug tools
   •      Post-Silicon RTL bug fix validation

DV Club                                    Jai Kumar                           Slide 18
Summary
          • Exponential complexity with large SoCs
            > multi-core,
            > multi-threads,
            > complex IOs
          • HW Emulation is a must
            > Traditional SW simulators alone are not enough
          • Cost is justified in big scheme of product
            development


DV Club                               Jai Kumar                Slide 19

Más contenido relacionado

Similar a Kumar sv07

HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?DVClub
 
JRuby Jam Session
JRuby Jam Session JRuby Jam Session
JRuby Jam Session Engine Yard
 
OpenSPARC T1 Processor
OpenSPARC T1 ProcessorOpenSPARC T1 Processor
OpenSPARC T1 ProcessorDVClub
 
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...DVClub
 
USAA Mono-to-Serverless.pdf
USAA Mono-to-Serverless.pdfUSAA Mono-to-Serverless.pdf
USAA Mono-to-Serverless.pdfRichHagarty
 
SRAM redundancy insertion
SRAM redundancy insertionSRAM redundancy insertion
SRAM redundancy insertionchiportal
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best PracticesJeff Larkin
 
Evolution of the modern graphics architectures with a focus on GPUs | Turing1...
Evolution of the modern graphics architectures with a focus on GPUs | Turing1...Evolution of the modern graphics architectures with a focus on GPUs | Turing1...
Evolution of the modern graphics architectures with a focus on GPUs | Turing1...Persistent Systems Ltd.
 
Programming the Cell Processor A simple raytracer from pseudo-code to spu-code
Programming the Cell Processor A simple raytracer from pseudo-code to spu-codeProgramming the Cell Processor A simple raytracer from pseudo-code to spu-code
Programming the Cell Processor A simple raytracer from pseudo-code to spu-codeSlide_N
 
7nm "Navi" GPU - A GPU Built For Performance
7nm "Navi" GPU - A GPU Built For Performance 7nm "Navi" GPU - A GPU Built For Performance
7nm "Navi" GPU - A GPU Built For Performance AMD
 
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...Enkitec
 
Efficient use of NodeJS
Efficient use of NodeJSEfficient use of NodeJS
Efficient use of NodeJSYura Bogdanov
 
Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...Hitachi Vantara
 
Performance tuning
Performance tuningPerformance tuning
Performance tuningJon Haddad
 

Similar a Kumar sv07 (20)

HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?HW Emulators: Does it Belong in your Verification Tool Chest?
HW Emulators: Does it Belong in your Verification Tool Chest?
 
JRuby Jam Session
JRuby Jam Session JRuby Jam Session
JRuby Jam Session
 
OpenSPARC T1 Processor
OpenSPARC T1 ProcessorOpenSPARC T1 Processor
OpenSPARC T1 Processor
 
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
Leveraging Low-Cost FPGA Prototyping for Validation of Highly Threaded Server...
 
USAA Mono-to-Serverless.pdf
USAA Mono-to-Serverless.pdfUSAA Mono-to-Serverless.pdf
USAA Mono-to-Serverless.pdf
 
SRAM redundancy insertion
SRAM redundancy insertionSRAM redundancy insertion
SRAM redundancy insertion
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best Practices
 
Evolution of the modern graphics architectures with a focus on GPUs | Turing1...
Evolution of the modern graphics architectures with a focus on GPUs | Turing1...Evolution of the modern graphics architectures with a focus on GPUs | Turing1...
Evolution of the modern graphics architectures with a focus on GPUs | Turing1...
 
Programming the Cell Processor A simple raytracer from pseudo-code to spu-code
Programming the Cell Processor A simple raytracer from pseudo-code to spu-codeProgramming the Cell Processor A simple raytracer from pseudo-code to spu-code
Programming the Cell Processor A simple raytracer from pseudo-code to spu-code
 
7nm "Navi" GPU - A GPU Built For Performance
7nm "Navi" GPU - A GPU Built For Performance 7nm "Navi" GPU - A GPU Built For Performance
7nm "Navi" GPU - A GPU Built For Performance
 
NoSQLを知る
NoSQLを知るNoSQLを知る
NoSQLを知る
 
capacitor-inspection-system.pptx
capacitor-inspection-system.pptxcapacitor-inspection-system.pptx
capacitor-inspection-system.pptx
 
Build Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVMBuild Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVM
 
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
Bottlenecks, Bottlenecks, and more Bottlenecks: Lessons Learned from 2 Years ...
 
Efficient use of NodeJS
Efficient use of NodeJSEfficient use of NodeJS
Efficient use of NodeJS
 
Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...Why hitachi virtual storage platform does so well in a mainframe environment ...
Why hitachi virtual storage platform does so well in a mainframe environment ...
 
Sun Microsystems
Sun MicrosystemsSun Microsystems
Sun Microsystems
 
Performance tuning
Performance tuningPerformance tuning
Performance tuning
 
Smallsat 2021
Smallsat 2021Smallsat 2021
Smallsat 2021
 
Android Optimization: Myth and Reality
Android Optimization: Myth and RealityAndroid Optimization: Myth and Reality
Android Optimization: Myth and Reality
 

Más de Obsidian Software (20)

Zhang rtp q307
Zhang rtp q307Zhang rtp q307
Zhang rtp q307
 
Zehr dv club_12052006
Zehr dv club_12052006Zehr dv club_12052006
Zehr dv club_12052006
 
Yang greenstein part_2
Yang greenstein part_2Yang greenstein part_2
Yang greenstein part_2
 
Yang greenstein part_1
Yang greenstein part_1Yang greenstein part_1
Yang greenstein part_1
 
Williamson arm validation metrics
Williamson arm validation metricsWilliamson arm validation metrics
Williamson arm validation metrics
 
Whipp q3 2008_sv
Whipp q3 2008_svWhipp q3 2008_sv
Whipp q3 2008_sv
 
Vishakantaiah validating
Vishakantaiah validatingVishakantaiah validating
Vishakantaiah validating
 
Validation and-design-in-a-small-team-environment
Validation and-design-in-a-small-team-environmentValidation and-design-in-a-small-team-environment
Validation and-design-in-a-small-team-environment
 
Tobin verification isglobal
Tobin verification isglobalTobin verification isglobal
Tobin verification isglobal
 
Tierney bq207
Tierney bq207Tierney bq207
Tierney bq207
 
The validation attitude
The validation attitudeThe validation attitude
The validation attitude
 
Thaker q3 2008
Thaker q3 2008Thaker q3 2008
Thaker q3 2008
 
Thaker q3 2008
Thaker q3 2008Thaker q3 2008
Thaker q3 2008
 
Strickland dvclub
Strickland dvclubStrickland dvclub
Strickland dvclub
 
Stinson post si and verification
Stinson post si and verificationStinson post si and verification
Stinson post si and verification
 
Shultz dallas q108
Shultz dallas q108Shultz dallas q108
Shultz dallas q108
 
Shreeve dv club_ams
Shreeve dv club_amsShreeve dv club_ams
Shreeve dv club_ams
 
Sharam salamian
Sharam salamianSharam salamian
Sharam salamian
 
Schulz sv q2_2009
Schulz sv q2_2009Schulz sv q2_2009
Schulz sv q2_2009
 
Schulz dallas q1_2008
Schulz dallas q1_2008Schulz dallas q1_2008
Schulz dallas q1_2008
 

Kumar sv07

  • 1. HW Emulators: Does it belong in your Verification tool chest? DV Club, May 23rd, 2007 Jai Kumar Verification Technologist Sun Microsystems Inc. jai.kumar@sun.com http://sun.com
  • 2. Verification Challenges • Large SoCs - > Multi-core, multi-thread every where > Complex IOs: ENET, PCIE, etc. > Verification state-space explosion • Fierce Competition - > Time-to-Market > Cost • Tools not keeping the pace with requirements - > Capacity > Performance Slide 2
  • 3. UltraSPARC T1 Processor • A true revolutionary processor • Up to eight 4-way multithreaded cores for up to 32 simultaneous threads • All cores connected through a 134.4GB/s crossbar switch • High-bandwidth 12-way associative • 4 DDR2 channels (23GB/s) • Power : 63W @1.2GHz, 1.2v • ~300M transistors • 378 sq. mm die • Verification covered in earlier DV Club presentation DV Club Jai Kumar Slide 3
  • 4. UltraSPARC T2 Processor • True SoC • 8 SPARC Cores, 8 threads each • Shared 4MB L2 8- banks, 16-way associative • 4 dual-channel FBDIMM memory controllers • 2 1/10 Gb ENET ports w/ onboard packet classification and filtering • 1 PCIE x8 1.0 port DV Club Jai Kumar Slide 4
  • 5. What is an Accelerator/Emulator? • Emulation- synonomous with HW Acceleration; subtle difference – involves target system hw • High-level proto-typing system – virtual silicon • RTL is synthesized to gates; gates are mapped to FPGA or custom-processor based HW to execute design in parallel at hw speed • Ease-of-use – make it appear as a fast “simulator” - more than 10,000X faster than SW Simulator Slide 5
  • 6. Convince it is the right thing to do... Emulator HW Gulfstream jet DV Club Jai Kumar Slide 6
  • 7. Cost of Bugs – simple analysis • Showstopper bug early in design phase > Cost: almost negligible • Showstopper bug close to tapeout > Cost: schedule impact • Showstopper bug after tapeout > Cost: Few Silicon respins, schedule impact • Showstopper bug close at Revenue Release > Loss of Revenue at $1+ Million per day > Reduce competitive edge • Showstopper bug 1year after Revenue Release > Cost of recall at $150Million > Damages corporate reputation Source: N.Winkworth & B.Blohm Getting SW/HW right the first time is critical to survival! DV Club Jai Kumar Slide 7
  • 8. When you put everything together and your simulator comes to a crawl.... Who do you call? Emulators DV Club Jai Kumar Slide 8
  • 9. Design Size Impact on Sim Speed 1000000 100000 Performance (cycles/sec) 10000 SW Sim Acceleration 1000 Emulation 100 10 1 0 20 40 60 80 100 120 140 160 Design Size (M gates) Slide 9
  • 10. When you need to run millions of cycles to catch those nasty deep corner case bugs.... Who do you call? Emulators DV Club Jai Kumar Slide 10
  • 11. Simplified Bug Find Rate 90 80 Bug Rate - Simulation Cycles 70 60 50 Bug Rate 40 M cycles 30 20 10 0 1 2 3 4 5 6 7 8 9 10 11 12 13 Time DV Club Jai Kumar Slide 11
  • 12. When you need to boot firmware, and Solaris.... Who do you call? Simulation - ~27years Emulators Emulation - 8.5hours DV Club Jai Kumar Slide 12
  • 13. When you need to run real world PCIE, ENET IO traffic on your SoCs.... Who do you call? Emulators DV Club Jai Kumar Slide 13
  • 14. Emulation Resources – at a glance Walk Run Drive Fly SW Simulator Competitor X Xtreme Server Palladium2 Solaris Boot 26.7 Years 2.6 Years 2Days19Hrs 8 Hrs 30mins TIme Flexibility: Supported RTL, TestBench constructs DV Club Jai Kumar Slide 14
  • 15. Usage Model SW Simulator (VCS/XSIM) Block-Level Full-Chip System Post-Silicon Verification Timeline TapeOut Short, Directed Tests/ISS Compared Long, Random Tests/Self Checking DFT Tharas CoSim Xtreme CoSim Xtreme Targetless Emulation Palladium Targetless/ICE Firmware/ IO Verif SW Stack (PCIE/ ENET) DV Club Jai Kumar Slide 15
  • 16. Emulation Verification: Get System HW/SW right the first time: •Prevent Functional Bug escapes (reduce Silicon re- spins - find bugs before tape-out) •Facilitate early SW Development and System Integration •Aid Post-Silicon Debug & Fix Validation DV Club Jai Kumar Slide 16
  • 17. Planning for Success ● Setup Emulation Environment early - − Create technology-awareness within team early in design phase − Decide Usage mode – co-sim, targetless, in-circuit emulation − Minimize learning curve - integrate into existing simulation flow − Not everything can be run on the emulator – prioritize & plan usage ● Reduce Time-to-Model Build - − Apply RTL enforcement (use lint, emulator tools); ● Minimize Modeling Issues − Array Modeling Methodology – abstract higher, make it “synthesizable”, race free − Optimize for Capacity and Performance: Eliminate non-functional models, abstract low-level details, minimize and align clocks ● Simplify Debug − Implement critical-set of Monitors for acceleration − Invest in debug tools to ease the debug complexity − Do not use it as yet another simulator (minimize waveform dumps) Maximize ROI. DV Club Jai Kumar Slide 17
  • 18. HW Emulation Results • Initial design bringup in co-simulation mode with ISS • Target-less emulation for fastest throughput - trillions of cycles • Ran really long directed tests • Ran a number of self-checking random code generators to target tests to specific functionality – some tests run for days • Reset Sequence testing, RAS Testing, On-Chip Debug test • PCIE & ENET Testing using Speed Bridges • JTAG Scan Testing using In-Circuit Emulation • Motherboard bringup and test using In-Circuit Emulation • Facilitated readiness of silicon test/debug tools • Post-Silicon RTL bug fix validation DV Club Jai Kumar Slide 18
  • 19. Summary • Exponential complexity with large SoCs > multi-core, > multi-threads, > complex IOs • HW Emulation is a must > Traditional SW simulators alone are not enough • Cost is justified in big scheme of product development DV Club Jai Kumar Slide 19