SlideShare una empresa de Scribd logo
1 de 26
Descargar para leer sin conexión
Mike Muller
CTO
Is there anything new in heterogeneous computing?
Evolution

Wearable
Intelligence
13
Mobile
Computing

PC
82

89

93

07

10
IOT

Embedded
77

97

Consumer

Smart
Appliances

Computing

Cloud
Server
1960

1970

1980

1990

2000

2010

2020
What’s the Innovation?
Wireless

3G

MEMS
CCD
Media
Social Media?
Semiconductor Process?

GPS
Mobility Trends: CMOS
10,000

cm2/(V·s)

1,000
100
10
1990

NMOS
PMOS
1995

2000

2005

2010

2015

Planar CMOS

5nm

HNW

FinFET

Strain

3.5nm

2020

2025

III-V GE NEMS

HKMG

Switches

7nm

14nm 10nm

VNW

spintronics

2D: C, MoS

Graphene wire, CNT via

Interconnect

Al wires

// 3DIC Opto I/O Opto int

CU wires

SADP

Patterning

LELE

SAQP

LELELE

EUV

Seq. 3D

EUV + DWEB
EUV LELE

EUV + DSA
Printing:

Moore’s Law and Ink Jets
Drops/Second

1/Size (pL-1)

1E11

1E1
10’s microns

1E10
100’s microns

1E9

1E0

1E8
1E7

1E-1

1E6
10,000 nozzles

1E5

1E-2

10 nozzles

1E4
1E3

1E-3
1980

1985

1990

1995

2000

2005

2010

2015

2020
Printing and Imprinting Thin Film Transistors (TFT)
 Can be transparent, bio-degradable and even ingestible
 Unit cost 1000 less than mainstream CMOS




 CMOS @ $40,000/m2 vs. TFT @ $10/m2
Printing CAPEX can be less than $1,000
 350dpi = 200um @ 20 m/s
 Can print batteries, antenna
 Mainly organic at ~20 volts
Imprint CAPEX a $2M DVD press is high volume
 Better controllability hence higher density and performance
 1um today scale to 50nm features as used today for BluRay discs
 Mainly Inorganic NMOS only at ~2 volts
Mobility Trends: CMOS & Thin Film Transistors
10000
1000
CPU

cm2/(V·s)

100
10
1
0.1

ARM1

3µ
6MHz
CortexM0

0.01

2µ
20kHz

0.001
0.0001
0.00001
1990

1995

2000

2005

2010

2015

Conventional NMOS
Conventional PMOS
TFT

2020

2025
Top Right

and Bottom Left
Is There Anything New in Heterogeneous Computing?
Vector Add

Reduction

Matrix Mul

GPU OpenCL on GPU

1.00

1.00

1.00

GPU OpenCL on FPGA

0.14

0.02

0.89

FPGA OpenCL on FPGA

1.71

1.62

31.85

1998
Manual Partitioning
C & Assembler

ARM

+

DSP

2013
Manual Partitioning
C++ & OpenCL/RenderScript

ARM

+

GPU
How Do People Program?

~20M Programmers

Web

Mobile
Embedded
~200k

Desktop

 Simple, old-school ray tracer
 Start with C++ code and accelerate the code with Heterogeneous Systems
void traceScreen()
{
for(y = 0; y < height; ++y) {
for(x = 0; x < width; ++x){
Ray ray = generateRay(x, y);
IntersectableObject *obj = traceRay(ray);
framebuffer[y][x] = colorPixelForObject(obj);
}
}
}

void traceScreen()
{
par_for_2D(height, width, [&](int y, int x) {
Ray ray = generateRay(x, y);
IntersectableObject *obj = traceRay(ray);
framebuffer[y][x] = colorPixelForObject(obj);
});
}
Moving the Code onto OpenCL 1.x
 Need to make the following changes
a)
b)
c)
d)
e)
f)
g)

Get rid of all the pointers, both in scene vector and internally in CSGObject
Rewrite the use of std::vector, as OpenCL C does not understand C++ data type internals
Get rid of the virtual function calls
Change the classes to structs
Get rid of recursion in CSGObject
Avoid accessing the global scene variable in accelerated code
Port the code base to OpenCL C
Moving the Code onto OpenCL 2
 Need to make the following changes
a)
b)
c)
d)
e)
f)
g)

Get rid of all the pointers, both in scene vector and internally in CSGObject
Rewrite the use of std::vector, as OpenCL C does not understand C++ data type internals
Get rid of the virtual function calls
Change the classes to structs
Get rid of recursion in CSGObject
Avoid accessing the global scene variable in accelerated code
Port the code base to OpenCL C

 OpenCL 2 solves point a) with shared address space, but not the rest
Moving the Code onto C++ AMP
 Need to make the following changes
a)
b)
c)
d)
e)
f)
g)

Get rid of all the pointers, both in scene vector and internally in CSGObject
Rewrite the use of std::vector, as C++ AMP cannot call into C++ standard library
Get rid of the virtual function calls
Change the classes to structs
Get rid of recursion in CSGObject
Avoid accessing the global scene variable in accelerated code
Port the code base to OpenCL C

 C++ AMP solves points d), f) and g), but not the rest
Moving the Code onto HSA
 Need to make the following changes
a)
b)
c)
d)
e)
f)
g)

Get rid of all the pointers, both in scene vector and internally in CSGObject
Rewrite the use of std::vector, as HSAIL does not understand C++ data type internals
Get rid of the virtual function calls
Change the classes to structs
Get rid of recursion in CSGObject
Avoid accessing the global scene variable in accelerated code
Port the code base to a language on top of HSAIL

 HSA solves points a), c), d), e) and soon f)
What Makes GPUs Good For Power Efficient Compute?
 Relaxed single-threaded performance




 No dynamic scheduling
 No branch prediction
 No register renaming, no result forwarding
 Longer pipelines
 Lower clock frequencies
Multi-threading
 Tolerate long latencies to memory
Increasing the ALU/control ratio
 Short-vectors exposed to programmers
 SIMT/Warp/VLIW/Wavefront based execution
..
Heterogeneous Compute Homogeneous Architecture

big

LITTLE

 How about a SIMTish ARM?
 Familiar programming model, C++ and OpenMP
 Fewer seams
 Sharing data structures and function pointers/vtables

Integer Pipe
FP Pipe
Load/Store Pipe

Write

SIMT
Queue

RESEARCH

Throughput
Moving the Code onto a Warped ARM
 Need to make the following changes








Get rid of all the pointers, both in scene vector and internally in CSGObject
Rewrite the use of std::vector, as OpenCL C does not understand C++ data type internals
Get rid of the virtual function calls
Change the classes to structs
Get rid of recursion in CSGObject
Avoid accessing the global scene variable in accelerated code
Port the code base to OpenCL C
Performance vs Effort
 We’ve implemented SGEMM, a matrix-matrix multiplication benchmark, in various
ways, to investigate the tradeoff between programmer effort and performance payoff
SGEMM version
ARM in C

Speedup

Effort
1x

Low

ARM in C with NEON intrinsics, prefetching

15x

Medium - High

ARM in assembly with NEON, prefetching

26x

High

SIMTish ARM in C

35x

Low

SIMTish ARM in C, unrolled

44x

Low - Medium

Mali GPU x 4 way

136x

High
Scale Needs Standards
Works for geeks…
No proper orchestration
Battle for the apps platform
Needs home IT support
Or only single manufacturer

IPv4
Sonosnet

IPv6

Imagine that there
were a 1000 of these
connected devices….
Functional Becomes the Internet of things
Functional

Little Data
Mike

My Data

X

Gym

X
Life
Insurance

!
Their Data

Car
Insurance

Rob Curtis Haymakers Cambridge
Picture by Keith Jones
Sharing Needs Trust
IOT Medical Devices
 First implantable Pacemaker 1958
 Can a pacemaker be hacked to kill?
 Or just a plot line in US TV series
RF interface for adjusting settings


 First hacked in 2008


 “Sustained effort by a team of specialists” – The New York Times
 Range a few cm
Today
 MIT grad students
 One weekend
 Range 50 feet
Trust Needs Security
It’s a Heterogeneous Future

Reach

The future
Open Data
and Objects

Scale Needs Standards
Sharing Needs Trust
Trust Needs Security

Applications
Mobile internet
Internet / broadband
M2M
SaaS
Fixed Telephony Networks

Smart
Everything
Sensors & Actuators
Networks

Today

Mobile Telephony

Más contenido relacionado

La actualidad más candente

CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...AMD Developer Central
 
TressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozTressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozAMD Developer Central
 
PL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil HenningPL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil HenningAMD Developer Central
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAMD Developer Central
 
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerPL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerAMD Developer Central
 
Leverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesLeverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesAMD Developer Central
 
GS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-BilodeauGS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-BilodeauAMD Developer Central
 
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed HinkelCE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed HinkelAMD Developer Central
 
HSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterHSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterAMD Developer Central
 
Bolt C++ Standard Template Libary for HSA by Ben Sanders, AMD
Bolt C++ Standard Template Libary for HSA  by Ben Sanders, AMDBolt C++ Standard Template Libary for HSA  by Ben Sanders, AMD
Bolt C++ Standard Template Libary for HSA by Ben Sanders, AMDHSA Foundation
 
PG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard Hoffnung
PG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard HoffnungPG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard Hoffnung
PG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard HoffnungAMD Developer Central
 
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...AMD Developer Central
 
ONNC - 0.9.1 release
ONNC - 0.9.1 releaseONNC - 0.9.1 release
ONNC - 0.9.1 releaseLuba Tang
 
Intel's Presentation in SIGGRAPH OpenCL BOF
Intel's Presentation in SIGGRAPH OpenCL BOFIntel's Presentation in SIGGRAPH OpenCL BOF
Intel's Presentation in SIGGRAPH OpenCL BOFOfer Rosenberg
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...AMD Developer Central
 
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary DemosMM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary DemosAMD Developer Central
 

La actualidad más candente (20)

CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
CC-4000, Characterizing APU Performance in HadoopCL on Heterogeneous Distribu...
 
TressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas ThibierozTressFX The Fast and The Furry by Nicolas Thibieroz
TressFX The Fast and The Furry by Nicolas Thibieroz
 
GPU Ecosystem
GPU EcosystemGPU Ecosystem
GPU Ecosystem
 
PL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil HenningPL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
PL-4048, Adapting languages for parallel processing on GPUs, by Neil Henning
 
Gcn performance ftw by stephan hodes
Gcn performance ftw by stephan hodesGcn performance ftw by stephan hodes
Gcn performance ftw by stephan hodes
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
 
Media SDK Webinar 2014
Media SDK Webinar 2014Media SDK Webinar 2014
Media SDK Webinar 2014
 
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor MillerPL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
PL-4043, Accelerating OpenVL for Heterogeneous Platforms, by Gregor Miller
 
Leverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math LibrariesLeverage the Speed of OpenCL™ with AMD Math Libraries
Leverage the Speed of OpenCL™ with AMD Math Libraries
 
GS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-BilodeauGS-4147, TressFX 2.0, by Bill-Bilodeau
GS-4147, TressFX 2.0, by Bill-Bilodeau
 
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed HinkelCE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
CE-4027, Sensor Fusion – HID virtualized over LPC, by Reed Hinkel
 
HSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben GasterHSA-4123, HSA Memory Model, by Ben Gaster
HSA-4123, HSA Memory Model, by Ben Gaster
 
Bolt C++ Standard Template Libary for HSA by Ben Sanders, AMD
Bolt C++ Standard Template Libary for HSA  by Ben Sanders, AMDBolt C++ Standard Template Libary for HSA  by Ben Sanders, AMD
Bolt C++ Standard Template Libary for HSA by Ben Sanders, AMD
 
PG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard Hoffnung
PG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard HoffnungPG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard Hoffnung
PG-4037, Fast modal analysis with NX Nastran and GPUs, by Leonard Hoffnung
 
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
PT-4053, Advanced OpenCL - Debugging and Profiling Using AMD CodeXL, by Uri S...
 
ONNC - 0.9.1 release
ONNC - 0.9.1 releaseONNC - 0.9.1 release
ONNC - 0.9.1 release
 
Intel's Presentation in SIGGRAPH OpenCL BOF
Intel's Presentation in SIGGRAPH OpenCL BOFIntel's Presentation in SIGGRAPH OpenCL BOF
Intel's Presentation in SIGGRAPH OpenCL BOF
 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
 
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary DemosMM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
MM-4105, Realtime 4K HDR Decoding with GPU ACES, by Gary Demos
 

Similar a Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by Mike Muller, Chief Technology Officer, ARM

OrientDB and Hazelcast
OrientDB and HazelcastOrientDB and Hazelcast
OrientDB and HazelcastLuca Garulli
 
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...Edge AI and Vision Alliance
 
Marek Suplata Projects
Marek Suplata ProjectsMarek Suplata Projects
Marek Suplata Projectsguest14f12f
 
OrientDB & Hazelcast: In-Memory Distributed Graph Database
 OrientDB & Hazelcast: In-Memory Distributed Graph Database OrientDB & Hazelcast: In-Memory Distributed Graph Database
OrientDB & Hazelcast: In-Memory Distributed Graph DatabaseHazelcast
 
Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}
Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}
Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}guest3f9c6b
 
D I G I T A L I C A P P L I C A T I O N S J N T U M O D E L P A P E R{Www
D I G I T A L  I C  A P P L I C A T I O N S  J N T U  M O D E L  P A P E R{WwwD I G I T A L  I C  A P P L I C A T I O N S  J N T U  M O D E L  P A P E R{Www
D I G I T A L I C A P P L I C A T I O N S J N T U M O D E L P A P E R{Wwwguest3f9c6b
 
11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond Moore11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond MooreRCCSRENKEI
 
tau 2015 spyrou fpga timing
tau 2015 spyrou fpga timingtau 2015 spyrou fpga timing
tau 2015 spyrou fpga timingTom Spyrou
 
Edge optimized architecture for fabric defect detection in real-time
Edge optimized architecture for fabric defect detection in real-timeEdge optimized architecture for fabric defect detection in real-time
Edge optimized architecture for fabric defect detection in real-timeShuquan Huang
 
20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Joris20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Jorisimec.archive
 
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...Christopher Diamantopoulos
 
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...Intel® Software
 
Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel Intel® Software
 
CAD STANDARDS - SMART MANUFACTURING MECH
CAD STANDARDS - SMART MANUFACTURING MECHCAD STANDARDS - SMART MANUFACTURING MECH
CAD STANDARDS - SMART MANUFACTURING MECHRAJESHS631800
 

Similar a Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by Mike Muller, Chief Technology Officer, ARM (20)

Circuit Simplifier
Circuit SimplifierCircuit Simplifier
Circuit Simplifier
 
OrientDB and Hazelcast
OrientDB and HazelcastOrientDB and Hazelcast
OrientDB and Hazelcast
 
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
“Introduction to the TVM Open Source Deep Learning Compiler Stack,” a Present...
 
Marek Suplata Projects
Marek Suplata ProjectsMarek Suplata Projects
Marek Suplata Projects
 
OrientDB & Hazelcast: In-Memory Distributed Graph Database
 OrientDB & Hazelcast: In-Memory Distributed Graph Database OrientDB & Hazelcast: In-Memory Distributed Graph Database
OrientDB & Hazelcast: In-Memory Distributed Graph Database
 
An35225228
An35225228An35225228
An35225228
 
Embedded system
Embedded systemEmbedded system
Embedded system
 
Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}
Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}
Digital Ic Applications Jntu Model Paper{Www.Studentyogi.Com}
 
D I G I T A L I C A P P L I C A T I O N S J N T U M O D E L P A P E R{Www
D I G I T A L  I C  A P P L I C A T I O N S  J N T U  M O D E L  P A P E R{WwwD I G I T A L  I C  A P P L I C A T I O N S  J N T U  M O D E L  P A P E R{Www
D I G I T A L I C A P P L I C A T I O N S J N T U M O D E L P A P E R{Www
 
11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond Moore11 Synchoricity as the basis for going Beyond Moore
11 Synchoricity as the basis for going Beyond Moore
 
tau 2015 spyrou fpga timing
tau 2015 spyrou fpga timingtau 2015 spyrou fpga timing
tau 2015 spyrou fpga timing
 
Introduction to 2D/3D Graphics
Introduction to 2D/3D GraphicsIntroduction to 2D/3D Graphics
Introduction to 2D/3D Graphics
 
Edge optimized architecture for fabric defect detection in real-time
Edge optimized architecture for fabric defect detection in real-timeEdge optimized architecture for fabric defect detection in real-time
Edge optimized architecture for fabric defect detection in real-time
 
I010315760
I010315760I010315760
I010315760
 
20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Joris20081114 Friday Food iLabt Bart Joris
20081114 Friday Food iLabt Bart Joris
 
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
IMAGE CAPTURE, PROCESSING AND TRANSFER VIA ETHERNET UNDER CONTROL OF MATLAB G...
 
Dsp lab manual 15 11-2016
Dsp lab manual 15 11-2016Dsp lab manual 15 11-2016
Dsp lab manual 15 11-2016
 
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
Use C++ and Intel® Threading Building Blocks (Intel® TBB) for Hardware Progra...
 
Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel Scalability for All: Unreal Engine* 4 with Intel
Scalability for All: Unreal Engine* 4 with Intel
 
CAD STANDARDS - SMART MANUFACTURING MECH
CAD STANDARDS - SMART MANUFACTURING MECHCAD STANDARDS - SMART MANUFACTURING MECH
CAD STANDARDS - SMART MANUFACTURING MECH
 

Más de AMD Developer Central

DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsDX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsAMD Developer Central
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceAMD Developer Central
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...AMD Developer Central
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellAMD Developer Central
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonAMD Developer Central
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevAMD Developer Central
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasAMD Developer Central
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...AMD Developer Central
 
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...AMD Developer Central
 
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14AMD Developer Central
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14AMD Developer Central
 
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...AMD Developer Central
 
Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14AMD Developer Central
 
Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14AMD Developer Central
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14AMD Developer Central
 
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahAMD Developer Central
 

Más de AMD Developer Central (20)

DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsDX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
 
Introduction to Node.js
Introduction to Node.jsIntroduction to Node.js
Introduction to Node.js
 
DirectGMA on AMD’S FirePro™ GPUS
DirectGMA on AMD’S  FirePro™ GPUSDirectGMA on AMD’S  FirePro™ GPUS
DirectGMA on AMD’S FirePro™ GPUS
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop Intelligence
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
 
Inside XBox- One, by Martin Fuller
Inside XBox- One, by Martin FullerInside XBox- One, by Martin Fuller
Inside XBox- One, by Martin Fuller
 
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnellRendering Battlefield 4 with Mantle by Yuriy ODonnell
Rendering Battlefield 4 with Mantle by Yuriy ODonnell
 
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil PerssonLow-level Shader Optimization for Next-Gen and DX11 by Emil Persson
Low-level Shader Optimization for Next-Gen and DX11 by Emil Persson
 
Inside XBOX ONE by Martin Fuller
Inside XBOX ONE by Martin FullerInside XBOX ONE by Martin Fuller
Inside XBOX ONE by Martin Fuller
 
Introduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan NevraevIntroduction to Direct 3D 12 by Ivan Nevraev
Introduction to Direct 3D 12 by Ivan Nevraev
 
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth ThomasHoly smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
Holy smoke! Faster Particle Rendering using Direct Compute by Gareth Thomas
 
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...Computer Vision Powered by Heterogeneous System Architecture (HSA) by  Dr. Ha...
Computer Vision Powered by Heterogeneous System Architecture (HSA) by Dr. Ha...
 
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...Productive OpenCL Programming An Introduction to OpenCL Libraries  with Array...
Productive OpenCL Programming An Introduction to OpenCL Libraries with Array...
 
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
Rendering Battlefield 4 with Mantle by Johan Andersson - AMD at GDC14
 
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
RapidFire - the Easy Route to low Latency Cloud Gaming Solutions - AMD at GDC14
 
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
Mantle and Nitrous - Combining Efficient Engine Design with a modern API - AM...
 
Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14Mantle - Introducing a new API for Graphics - AMD at GDC14
Mantle - Introducing a new API for Graphics - AMD at GDC14
 
Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14Direct3D and the Future of Graphics APIs - AMD at GDC14
Direct3D and the Future of Graphics APIs - AMD at GDC14
 
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
Vertex Shader Tricks by Bill Bilodeau - AMD at GDC14
 
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla MahGS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
GS-4106 The AMD GCN Architecture - A Crash Course, by Layla Mah
 

Último

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Último (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by Mike Muller, Chief Technology Officer, ARM

  • 1. Mike Muller CTO Is there anything new in heterogeneous computing?
  • 4. Mobility Trends: CMOS 10,000 cm2/(V·s) 1,000 100 10 1990 NMOS PMOS 1995 2000 2005 2010 2015 Planar CMOS 5nm HNW FinFET Strain 3.5nm 2020 2025 III-V GE NEMS HKMG Switches 7nm 14nm 10nm VNW spintronics 2D: C, MoS Graphene wire, CNT via Interconnect Al wires // 3DIC Opto I/O Opto int CU wires SADP Patterning LELE SAQP LELELE EUV Seq. 3D EUV + DWEB EUV LELE EUV + DSA
  • 5. Printing: Moore’s Law and Ink Jets Drops/Second 1/Size (pL-1) 1E11 1E1 10’s microns 1E10 100’s microns 1E9 1E0 1E8 1E7 1E-1 1E6 10,000 nozzles 1E5 1E-2 10 nozzles 1E4 1E3 1E-3 1980 1985 1990 1995 2000 2005 2010 2015 2020
  • 6. Printing and Imprinting Thin Film Transistors (TFT)  Can be transparent, bio-degradable and even ingestible  Unit cost 1000 less than mainstream CMOS    CMOS @ $40,000/m2 vs. TFT @ $10/m2 Printing CAPEX can be less than $1,000  350dpi = 200um @ 20 m/s  Can print batteries, antenna  Mainly organic at ~20 volts Imprint CAPEX a $2M DVD press is high volume  Better controllability hence higher density and performance  1um today scale to 50nm features as used today for BluRay discs  Mainly Inorganic NMOS only at ~2 volts
  • 7. Mobility Trends: CMOS & Thin Film Transistors 10000 1000 CPU cm2/(V·s) 100 10 1 0.1 ARM1 3µ 6MHz CortexM0 0.01 2µ 20kHz 0.001 0.0001 0.00001 1990 1995 2000 2005 2010 2015 Conventional NMOS Conventional PMOS TFT 2020 2025
  • 9. Is There Anything New in Heterogeneous Computing? Vector Add Reduction Matrix Mul GPU OpenCL on GPU 1.00 1.00 1.00 GPU OpenCL on FPGA 0.14 0.02 0.89 FPGA OpenCL on FPGA 1.71 1.62 31.85 1998 Manual Partitioning C & Assembler ARM + DSP 2013 Manual Partitioning C++ & OpenCL/RenderScript ARM + GPU
  • 10. How Do People Program? ~20M Programmers Web Mobile Embedded ~200k Desktop  Simple, old-school ray tracer  Start with C++ code and accelerate the code with Heterogeneous Systems void traceScreen() { for(y = 0; y < height; ++y) { for(x = 0; x < width; ++x){ Ray ray = generateRay(x, y); IntersectableObject *obj = traceRay(ray); framebuffer[y][x] = colorPixelForObject(obj); } } } void traceScreen() { par_for_2D(height, width, [&](int y, int x) { Ray ray = generateRay(x, y); IntersectableObject *obj = traceRay(ray); framebuffer[y][x] = colorPixelForObject(obj); }); }
  • 11. Moving the Code onto OpenCL 1.x  Need to make the following changes a) b) c) d) e) f) g) Get rid of all the pointers, both in scene vector and internally in CSGObject Rewrite the use of std::vector, as OpenCL C does not understand C++ data type internals Get rid of the virtual function calls Change the classes to structs Get rid of recursion in CSGObject Avoid accessing the global scene variable in accelerated code Port the code base to OpenCL C
  • 12. Moving the Code onto OpenCL 2  Need to make the following changes a) b) c) d) e) f) g) Get rid of all the pointers, both in scene vector and internally in CSGObject Rewrite the use of std::vector, as OpenCL C does not understand C++ data type internals Get rid of the virtual function calls Change the classes to structs Get rid of recursion in CSGObject Avoid accessing the global scene variable in accelerated code Port the code base to OpenCL C  OpenCL 2 solves point a) with shared address space, but not the rest
  • 13. Moving the Code onto C++ AMP  Need to make the following changes a) b) c) d) e) f) g) Get rid of all the pointers, both in scene vector and internally in CSGObject Rewrite the use of std::vector, as C++ AMP cannot call into C++ standard library Get rid of the virtual function calls Change the classes to structs Get rid of recursion in CSGObject Avoid accessing the global scene variable in accelerated code Port the code base to OpenCL C  C++ AMP solves points d), f) and g), but not the rest
  • 14. Moving the Code onto HSA  Need to make the following changes a) b) c) d) e) f) g) Get rid of all the pointers, both in scene vector and internally in CSGObject Rewrite the use of std::vector, as HSAIL does not understand C++ data type internals Get rid of the virtual function calls Change the classes to structs Get rid of recursion in CSGObject Avoid accessing the global scene variable in accelerated code Port the code base to a language on top of HSAIL  HSA solves points a), c), d), e) and soon f)
  • 15. What Makes GPUs Good For Power Efficient Compute?  Relaxed single-threaded performance    No dynamic scheduling  No branch prediction  No register renaming, no result forwarding  Longer pipelines  Lower clock frequencies Multi-threading  Tolerate long latencies to memory Increasing the ALU/control ratio  Short-vectors exposed to programmers  SIMT/Warp/VLIW/Wavefront based execution
  • 16. .. Heterogeneous Compute Homogeneous Architecture big LITTLE  How about a SIMTish ARM?  Familiar programming model, C++ and OpenMP  Fewer seams  Sharing data structures and function pointers/vtables Integer Pipe FP Pipe Load/Store Pipe Write SIMT Queue RESEARCH Throughput
  • 17. Moving the Code onto a Warped ARM  Need to make the following changes        Get rid of all the pointers, both in scene vector and internally in CSGObject Rewrite the use of std::vector, as OpenCL C does not understand C++ data type internals Get rid of the virtual function calls Change the classes to structs Get rid of recursion in CSGObject Avoid accessing the global scene variable in accelerated code Port the code base to OpenCL C
  • 18. Performance vs Effort  We’ve implemented SGEMM, a matrix-matrix multiplication benchmark, in various ways, to investigate the tradeoff between programmer effort and performance payoff SGEMM version ARM in C Speedup Effort 1x Low ARM in C with NEON intrinsics, prefetching 15x Medium - High ARM in assembly with NEON, prefetching 26x High SIMTish ARM in C 35x Low SIMTish ARM in C, unrolled 44x Low - Medium Mali GPU x 4 way 136x High
  • 20. Works for geeks… No proper orchestration Battle for the apps platform Needs home IT support Or only single manufacturer IPv4 Sonosnet IPv6 Imagine that there were a 1000 of these connected devices….
  • 21. Functional Becomes the Internet of things Functional Little Data
  • 22. Mike My Data X Gym X Life Insurance ! Their Data Car Insurance Rob Curtis Haymakers Cambridge Picture by Keith Jones
  • 24. IOT Medical Devices  First implantable Pacemaker 1958  Can a pacemaker be hacked to kill?  Or just a plot line in US TV series RF interface for adjusting settings   First hacked in 2008   “Sustained effort by a team of specialists” – The New York Times  Range a few cm Today  MIT grad students  One weekend  Range 50 feet
  • 26. It’s a Heterogeneous Future Reach The future Open Data and Objects Scale Needs Standards Sharing Needs Trust Trust Needs Security Applications Mobile internet Internet / broadband M2M SaaS Fixed Telephony Networks Smart Everything Sensors & Actuators Networks Today Mobile Telephony