| © 2013 Aptina Imaging Corporation | Aptina Confidential1
© 2013 Aptina Imaging Corporation. All rights reserved. Products are warranted only to meet Aptina’s production data sheet specifications. Information, products, and/or
specifications are subject to change without notice. All information is provided on an “AS IS” basis without warranties of any kind. Dates are estimates only. Drawings not to
scale. Aptina and the Aptina logo are trademarks of Aptina Imaging Corporation. All other trademarks are the property of their respective owners.
Imaging on Embedded GPUs
Investigating flexible imaging pipelines using
embedded GPUs
Mikaël Bourges-Sévenier (msevenier at aptina dot com)
Director, High-Performance Imaging
December 19, 2013
Bay Area Multimedia
Agenda
•  Overview: the need for computational imaging
•  What is imaging?
•  Architecture of some embedded GPUs
•  8MP MobileHDR pipeline on ARM Mali T604
•  Khronos Camera: a standard API for computational imaging
•  Q&A
Computational Imaging evolution
[Diagram labels: Spatial (Volumetric), Gesture, AR, Face Detect, Face Track, Presence, Colorimetry, Brightness; Web Cam, Smart Camera; True Color, Brightness Compensation, Exposure control; User Identity, Access Control; Augmented Information; 3D Imaging; Interactive Services]
Mobile Compute driving Imaging use cases
•  Requires significant computing over large data sets
[Diagram: use cases along a Time axis: Augmented Reality; Face, Body and Gesture Tracking; Computational Photography; 3D Scene/Object Reconstruction]
Increasing Use of Imaging Sensors
[Chart axes: Differentiation Opportunity vs. Time]
Photography
Input = 2D Camera
Processors = ISP + CPU
Product = Static Images
Computational Photography
Input = MEMS + 2D Camera
Processors = ISP + CPU + GPU
Product = Real Time Images
We are here
Perceptual Imaging
Input = MEMS + Depth Camera
Processors = ISP + CPU + GPU + DSP
Product = Real Time Extracted Information
Perceptual Imaging
1. Uses the full array of mobile sensors
2. to extract information in real-time
3. about the user and environment
4. to generate enhanced user interactions
Hardware Saves Power (e.g. Camera Sensor ISP)
•  CPU
‣  Single processor or Neon SIMD - running fast
‣  Makes heavy use of general memory
‣  Non-optimal performance and power
•  GPU
‣  Programmable and flexible
‣  Many-way parallelism - run at lower frequency
‣  Efficient image caching close to processors
‣  BUT cycles frames in and out of memory
•  Camera ISP (Image Signal Processor)
‣  Little or no programmability
‣  Data flows through compact hardware pipe
‣  Scan-line-based - no global memory
‣  Best perf/watt
Evolution of Embedded GPUs
[Chart: GFLOPS (0 to 450) vs. time (Sep-2011 to Jun-2014), with a trend line. GPUs plotted: Adreno 320, Adreno 330, Mali T628, PowerVR 6, Tegra 5, PowerVR 5XT, Mali T604]
•  Trend: 40% more GFLOPS/quarter
•  Estimated at sustained peak performance. Likely to be much less in practice.
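As a rough check on the trend line, a sustained 40% gain per quarter compounds to almost 4x per year. The sketch below only does that arithmetic. The 100 GFLOPS starting point is the Mali T604 throughput quoted elsewhere in this deck, and treating the trend as holding exactly for four quarters is an assumption made for illustration.

```python
def compound_growth(rate_per_quarter: float, quarters: int) -> float:
    """Return the multiplicative growth after `quarters` quarters."""
    return (1.0 + rate_per_quarter) ** quarters


# 40% more GFLOPS per quarter, compounded over one year (4 quarters).
yearly = compound_growth(0.40, 4)

# Hypothetical projection from a 100 GFLOPS part; peak numbers only.
projected = 100.0 * yearly

print(f"growth per year: {yearly:.2f}x")
print(f"100 GFLOPS -> {projected:.0f} GFLOPS after one year")
```

These are peak figures, so the caveat on the chart applies to the projection as well.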
Computational Imaging pipeline
•  Pre-processing: for non-standard Bayer pixels (e.g. iHDR)
•  ISP: for fast demosaic, lens shading, denoising, 3A, statistics …
•  Post-processing: for special reconstruction of colors (e.g. Clarity+)
•  Processing requires control of metadata aligned with data
[Diagram: Lens, Color Filter Array and CMOS sensor feed Pre-processing, the Image Signal Processor (ISP), Post-processing and finally the App. Data labels: Bayer, RGB, YUV. Other labels: lens, sensor, aperture control; metadata; 3A stats]
Different Computing Devices
•  Latency-optimized CPU: fast serial processing
‣  Lots of big on-chip caches
‣  Sophisticated control
•  Throughput-optimized GPU: scalable parallel processing
‣  Multithreading can hide latency
‣  Simpler control, cost amortized over ALUs via SIMD
•  DSPs are similar to CPUs
‣  Typically integer optimized (some have rudimentary floating point support)
‣  With signal processing intrinsics
•  FPGA
‣  Can be tailored to a cross between CPU/DSP and GPU
[Diagram: SISD (scalar ALU) computes c = a + b; SIMD (vector ALU) computes (c1, c2, c3, c4) = (a1, a2, a3, a4) + (b1, b2, b3, b4) in one operation]
OpenCL works on all devices but performance isn’t guaranteed
Stream-based vs. Frame-based
•  Stream-based (ISP)
‣  For low-memory devices
‣  Set of lines processed by kernels
‣  Delay: #lines a kernel needs
•  Frame-based (GPU)
‣  For fast data-parallel devices
‣  Full image frame processed
‣  Delay: whole frame(s)
•  Completely different kernels
[Diagram, stream-based: a continuous stream of pixels enters a kernel; a queue (Q) accumulates lines for the next kernel, which outputs the final image]
[Diagram, frame-based: three kernels in sequence, each reading one full frame and writing the next]
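The difference in buffering can be illustrated with a toy vertical 3-line mean filter. This is a minimal sketch in plain Python, not ISP or GPU code: the function names and the filter choice are assumptions made for illustration. The frame-based version needs the whole frame in memory, while the stream-based version holds only the three lines the kernel needs and starts producing output after that delay.

```python
from collections import deque
from typing import Iterable, Iterator, List

Line = List[int]


def vertical_mean3_frame(frame: List[Line]) -> List[Line]:
    """Frame-based: the whole frame is available before any output."""
    out = []
    for y in range(1, len(frame) - 1):
        out.append([(a + b + c) // 3
                    for a, b, c in zip(frame[y - 1], frame[y], frame[y + 1])])
    return out


def vertical_mean3_stream(lines: Iterable[Line]) -> Iterator[Line]:
    """Stream-based: only a 3-line window is buffered at any time."""
    window = deque(maxlen=3)
    for line in lines:
        window.append(line)
        if len(window) == 3:  # delay = number of lines the kernel needs
            yield [(a + b + c) // 3 for a, b, c in zip(*window)]


# 5 lines of 4 pixels; pixel value = x + 10 * y.
frame = [[x + 10 * y for x in range(4)] for y in range(5)]
streamed = list(vertical_mean3_stream(iter(frame)))
print(streamed == vertical_mean3_frame(frame))
```

Both versions compute the same pixels; what changes is how much data must be resident and when the first output line can leave the kernel.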
What is Imaging?
Capture an image from a camera sensor and process it to get a renderable image.
How Imaging Sensors work
[Image source: http://www.photoaxe.com]
Bayer GRBG pattern
•  50% green
•  25% red, 25% blue
Bayer CFA is one type of pattern
Bayer Demosaicing
•  Twice as many G samples as R or B (50% vs. 25% each), since the eye is more sensitive to luminance than chrominance
•  Convert pixel colors from Bayer space to Full RGB color
•  Complex interpolation to avoid artifacts (e.g. on edges)
[Diagram: Bayer mosaic converted to a grid in which every pixel has full RGB]
Bayer pattern indices (shown on a 2x2 layout: 0 1 / 2 3):
0 GRBG
1 RGGB
2 GBRG
3 BGGR
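The four Bayer orders can be modelled as 2x2 tiles, as in the sketch below. It looks up the filter colour at a photosite and estimates green at a red or blue site by averaging its four green neighbours. This is a deliberately naive interpolation for illustration only; the helper names are hypothetical, and a real demosaic uses the more complex, edge-aware interpolation the slide mentions.

```python
# 2x2 tiles (top row, bottom row) for the four Bayer orders named on the slide.
BAYER_TILES = {
    "GRBG": ("GR", "BG"),
    "RGGB": ("RG", "GB"),
    "GBRG": ("GB", "RG"),
    "BGGR": ("BG", "GR"),
}


def cfa_color(pattern: str, row: int, col: int) -> str:
    """Return 'R', 'G' or 'B' for the photosite at (row, col)."""
    return BAYER_TILES[pattern][row % 2][col % 2]


def green_at(raw, pattern: str, row: int, col: int) -> int:
    """Naive green estimate at an interior photosite (no edge handling)."""
    if cfa_color(pattern, row, col) == "G":
        return raw[row][col]
    # In every Bayer order the 4-neighbours of an R or B site are green.
    neighbours = (raw[row - 1][col], raw[row + 1][col],
                  raw[row][col - 1], raw[row][col + 1])
    return sum(neighbours) // 4


# 4x4 GRBG test frame: green sites read 100, red and blue sites read 0.
raw = [[100 if cfa_color("GRBG", r, c) == "G" else 0 for c in range(4)]
       for r in range(4)]
counts = {ch: sum(cfa_color("GRBG", r, c) == ch
                  for r in range(4) for c in range(4)) for ch in "RGB"}
print(counts)                        # {'R': 4, 'G': 8, 'B': 4}
print(green_at(raw, "GRBG", 2, 1))   # 100
```

The counts confirm the 50% green, 25% red, 25% blue split of the mosaic.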
OpenCL (memory system)
Desktop: non-uniform memory
•  Data is physically copied between GPU and CPU memory
Embedded: uniform memory
•  __local memory may be in __global
•  Cheap data exchange between CPU and GPU
A tour of some embedded GPUs
ARM Mali T604, Qualcomm Adreno 330
ARM Mali T604, T628
•  Found in Samsung Exynos 5 Dual (T604)/Octa (T628) Application Processors
‣  Chromebook, Nexus 10, Samsung S4…
•  32nm process for T604, 28nm for T628
•  T604 has 4 shader cores, T628 has 8 cores
•  Tri-pipe architecture: each GPU core has 3 types of instruction pipelines
‣  1x load/store
‣  1x texture
‣  2x ALU (T604) / 4x ALU (T628)
•  64-bit integers and IEEE 754 floating-point ALUs
OpenCL and OpenGL ES
The Vithar Architecture:
[Diagram: OpenGL ES and OpenCL work enters at Thread Issue, runs on one Load/Store pipeline, two Arithmetic pipelines and one Texturing pipeline, and leaves at Thread Completion]
•  3 kinds of pipelines
‣  Arithmetic
‣  Load/Store
‣  Texture
•  Barrel-threaded (like AMD/NVIDIA)
•  No SIMT execution (unlike AMD/NVIDIA)
•  SIMD (like AMD)
‣  Use vectors for best performance!
•  256 threads max (64 in practice)
Midgard Job Execution and Load-balancing
•  Automatic hardware load balancing
•  Seamless concurrent execution
•  Integrated seamless power manager
Qualcomm MSM8974
•  Process: 28nm
•  CPU: 4x Krait 2.3 GHz
‣  ARMv7A Neon instruction set
‣  Power and performance efficiencies over ARM
‣  4KB+4KB L0, 16KB+16KB L1, 2MB L2 cache
‣  No 64b support
•  GPU: Adreno 330 450 MHz
‣  32x 32b scalar ALUs/pipeline, 8 pipelines, 129.6 GFLOPS
•  16b kernels provide 2x performance
‣  128b registers
‣  8 KB local memory per shader core
‣  8 KB constant memory
‣  12 reads, 4 writes simultaneous per clock
‣  512 work-items max
‣  1.5 MB on-chip SRAM
‣  Tiled renderer max 3.6 GPix/s
•  Hexagon DSP
‣  3x core, 600 MHz, 16 KB L1, 256 KB L2, integrated MMU
‣  Limited floating-point support (no division, no log/exp…)
•  RAM: 2GB 2x LP-DDR3 800 MHz (12.8 GB/s)
MSM8974 Adreno 330 vs Adreno 320
•  Adreno 330 has better performance
‣  450 MHz GPU clock (up from 400 MHz in Adreno 320)
‣  2x better shader performance than A320 – 2x more ALU blocks
•  Dedicated GPU power rail
‣  Will allow GPU to be at a lower frequency and voltage than the FABR
Adreno 330 Shader Processor “SP” Block
•  Total of 32 (32-bit) scalar ALUs
•  16-bit ALUs are used if the entire kernel is 16-bit; otherwise the 32-bit ALUs are used
MobileHDR pipeline
Arndale Samsung Exynos 5 Dual board
•  Arndale Samsung Exynos 5 board
‣  CPU: ARM Cortex-A15 (2-core) 1.7 GHz 32nm
•  32KB L1 cache, 1MB L2 cache
‣  GPU: ARM MALI T604
•  64 concurrent threads
•  Vector ALUs
•  128b registers
•  OpenCL 1.1 Full Profile
‣  RAM: 2GB LP-DDR3 800 MHz (12.8 GB/s)
‣  Truly unified cached memory
•  CPU and GPU memory is shared – NO COPY!
•  128b wide L1 and L2 access
ARM Mali T604 GPUs
In Samsung Exynos 5 Dual
Type                             Vector GPU
Process                          32nm
OpenCL                           1.1 Full Profile
Unified memory                   Yes
Rendering                        Tile
Work-items                       256
Clock                            533MHz
L2 cache                         1MB
Register width                   128b
Global memory                    2GB LP-DDR3 800MHz (12.8 GB/s)
ALUs                             8 (2 ALUs/core)
Throughput                       100 GFLOPS
Local memory                     32KB/core (global)
Constant memory                  64KB
Texture cache                    Yes
Compute devices (shader cores)   4
Cacheline                        64 bytes
16/32/64b floats                 No/yes/yes
Avoid buffer copy
•  Mali/Adreno have unified memory
‣  Use CL_MEM_ALLOC_HOST_PTR to avoid copy between CPU and GPU
•  Mali has no local memory
•  Adreno has local memory (1.5MB SRAM 115GB/s)
Host data pointers
[Diagram: three cases, each showing CPU (Host), GPU (Compute Device) and Global Memory]
malloc()
•  Buffers created by user (malloc) are not mapped into the GPU memory space
clCreateBuffer(CL_MEM_USE_HOST_PTR)
•  Creates a new buffer and copies the data over (but the copy operations are expensive)
clCreateBuffer(CL_MEM_ALLOC_HOST_PTR)
•  Buffer created by clCreateBuffer(), shared by CPU (Host) and GPU (Compute Device)
Where possible don’t use CL_…
–  Create buffers at the start of your app
–  Use CL_MEM_ALLOC_HOST_PTR instead of m…
–  Then you can use the buffer on both…
Aptina Sensor with MobileHDR™ Turned off
Aptina Sensor with MobileHDR™ Turned on
AR0833 8MP Camera sensor
•  Frame is inscribed in a 1/3.2” circle
‣  4:3 for images e.g. 8MP 3264 x 2448
‣  16:9 for video e.g. 6MP 3264 x 1836
•  10 bits per pixel (framed in 16 bits)
•  At 30fps, 16:9 video is 180 MPix/s, which needs 343 MiB/s (about 360 MB/s) at 16 bits per pixel
•  Interlaced HDR feature
•  Interface with ISP
‣  Data over MIPI CSI-2 (serial)
‣  Control over I2C
[Diagram: 1/3.2" image circle inscribing the 4:3 frame (3264 x 2448) and the 16:9 frame (3264 x 1836)]
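The bandwidth figure on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes the 16:9 video mode, 2 bytes per pixel (10 bits framed in 16), and megabytes counted as 2^20 bytes. The comparison against 12.8 GB/s uses the memory bandwidth quoted for the Arndale board and is only a rough indication, since every pipeline stage reads and writes the frame again.

```python
def sensor_bandwidth(width: int, height: int, fps: int, bytes_per_pixel: int = 2):
    """Return (pixels per second, bytes per second) for a raw stream."""
    pixels_per_s = width * height * fps
    return pixels_per_s, pixels_per_s * bytes_per_pixel


# 16:9 video mode: 3264 x 1836 at 30 fps, 10-bit pixels framed in 16 bits.
pix, byt = sensor_bandwidth(3264, 1836, 30)
print(f"{pix / 1e6:.0f} MPix/s")            # 180 MPix/s
print(f"{byt / 2**20:.0f} MiB/s")           # 343 MiB/s
print(f"{byt / 12.8e9:.1%} of 12.8 GB/s")   # 2.8% of 12.8 GB/s

# 4:3 still mode at the same frame rate, for comparison.
still_pix, still_byt = sensor_bandwidth(3264, 2448, 30)
print(f"{still_pix / 1e6:.0f} MPix/s, {still_byt / 2**20:.0f} MiB/s")
```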
Feature: Interlaced HDR
•  1 frame contains 2 exposures, interlaced
•  Exposure ratio between odd and even row pairs
‣  User controlled: 1x, 2x, 4x, 8x
Interlaced HDR Readout (from the AR0833 data sheet, AR0833_DS Rev. F, Pub. 4/13)
The sensor enables HDR by outputting frames where even and odd row pairs within a single frame are captured at different integration times. This output is then matched with an algorithm designed to reconstruct this output into an HDR still image or video. The sensor HDR is controlled by two shutter pointers (Shutter pointer1, Shutter pointer2) that control the integration of the odd (Shutter pointer1) and even (Shutter pointer 2) row pairs.
Figure 16: HDR Integration Time
[Figure: Shutter pointer 1 and Shutter pointer 2 lead the Sample pointer by T int 1 and T int 2. The output frame from the sensor interleaves I-FRAME 1 (Exposure 1) and I-FRAME 2 (Exposure 2)]
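To make the row-pair layout concrete, here is a deliberately naive reconstruction sketch. It is not Aptina's algorithm, which the slides do not describe. It assumes that row pairs 0-1, 4-5, ... carry the long exposure and the pairs in between carry the short one, that `ratio` is the user-selected exposure ratio, and that a long-exposure pixel at full scale (1023 for 10 bits) is saturated. Saturated long pixels are replaced by the scaled short-exposure pixel two rows away, which has the same Bayer colour.

```python
from typing import List

SATURATION = 1023  # full scale of a 10-bit pixel


def is_long_row(row: int) -> bool:
    """Assumption: row pairs 0-1, 4-5, ... use the long integration time."""
    return (row // 2) % 2 == 0


def reconstruct_ihdr(raw: List[List[int]], ratio: int) -> List[List[int]]:
    """Naive merge of an interlaced-HDR Bayer frame onto one linear scale."""
    height = len(raw)
    assert height >= 4, "need at least one long and one short row pair"
    out = []
    for y, line in enumerate(raw):
        if not is_long_row(y):
            # Short exposure: scale up to the long-exposure range.
            out.append([v * ratio for v in line])
            continue
        # Nearest short-exposure row with the same Bayer phase.
        ys = y + 2 if y + 2 < height else y - 2
        out.append([raw[ys][x] * ratio if v >= SATURATION else v
                    for x, v in enumerate(line)])
    return out


raw = [[1023, 500],   # long
       [200, 1023],   # long
       [300, 60],     # short
       [25, 400]]     # short
hdr = reconstruct_ihdr(raw, ratio=4)
print(hdr)  # [[1200, 500], [200, 1600], [1200, 240], [100, 1600]]
```

With a 10-bit input and an 8x ratio the merged values need 13 bits, so the output has to be stored in a wider format than the sensor data.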
mobileHDR demo
•  Zero-copy between sensor/OpenCL and OpenCL/OpenGL
•  On Arndale board (Samsung Exynos 5 Dual with Mali T604 GPU)
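[Diagram: OpenCL pipeline stages: Noise Reduction, iHDR Reconstruction, Bayer scaler, Tone Mapping, Color Correction. Data labels: 10b iHDR 3264x1836 input, 14b, RGB888, 1080p. The OpenCL output (CL Image) is shared with OpenGL ES (GL Texture) through an EGLImage]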
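The tone-mapping stage has to compress the wide HDR range into 8 bits per channel. The slides do not say which operator the demo uses, so the sketch below substitutes a plain gamma curve applied through a lookup table. It assumes the 14b label means 14-bit linear data and that RGB888 means 8 bits per output channel; the gamma value is arbitrary.

```python
MAX_IN = (1 << 14) - 1  # largest 14-bit value


def build_tone_lut(gamma: float = 2.2) -> list:
    """Precompute an 8-bit output value for every 14-bit input value."""
    return [round(255 * (v / MAX_IN) ** (1.0 / gamma))
            for v in range(MAX_IN + 1)]


LUT = build_tone_lut()


def tone_map(pixels):
    """Map a sequence of 14-bit linear values to 8-bit display values."""
    return [LUT[p] for p in pixels]


mapped = tone_map([0, 64, 1024, 8192, MAX_IN])
print(mapped)
```

A lookup table keeps the per-pixel work to a single memory read, which suits both ISP hardware and GPU kernels.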
Summary
•  Embedded GPUs are ideal candidates for computational imaging
‣  Performance at reasonable image size is now available
‣  Power efficiency is being addressed
•  OpenCL 1.1 is available on all recent application processors
‣  But may be reserved to OEM
‣  Performance portability isn’t guaranteed (but that is true of any high-performance application)
•  Opening the camera image-processing “black box” is now feasible for incredible new applications
Khronos Camera
A standard to control image acquisition and
processing.
Typical Imaging Pipeline
•  Pre- and Post-processing can be done on CPU, GPU, DSP…
•  ISP controls camera via 3A algorithms
Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)
•  ISP may be a separate chip or within Application Processor
[Diagram: Lens, Color Filter Array and CMOS sensor feed Pre-processing, the Image Signal Processor (ISP), Post-processing and finally the App. Data labels: Bayer, RGB/YUV. Feedback label: lens, sensor, aperture control (3A)]
Need for advanced camera control API:
- to drive more flexible app camera control
- over more types of camera sensors
- with tighter integration with the rest of the system
Advanced Camera Control Use Cases
•  High-dynamic range (HDR) and computational flash photography
‣  High-speed burst with individual frame control over exposure and flash
•  Rolling shutter elimination
‣  High-precision intra-frame synchronization between camera and motion sensor
•  HDR Panorama, photo-spheres
‣  Continuous frame capture with constant exposure and white balance
•  Subject isolation and depth detection
‣  High-speed burst with individual frame control over focus
•  Time-of-flight or structured light depth camera processing
‣  Aligned stacking of data from multiple sensors
•  Augmented Reality
‣  60Hz, low-latency capture with motion sensor synchronization
‣  Multiple Region of Interest (ROI) capture
‣  Multiple sensors for scene scaling
‣  Detailed feedback on camera operation per frame
Camera API Architecture (FCAM based)
•  No global state
‣  State travels with image requests
‣  Every stage in the pipeline may have different state
•  -> allows fast, deterministic state changes
•  Synchronize devices
‣  Lens, flash, sound capture, gyro…
‣  Devices can schedule Actions
•  E.g. to be triggered on exposure change
•  Enables device synchronization
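The "no global state" idea can be sketched as follows. This is not the FCAM or Khronos Camera API: `Request`, `Frame`, `capture` and `fire_flash` are hypothetical names for a toy model in which every request carries its own capture state and the actions scheduled for that frame, so consecutive frames in a burst can use different settings without reconfiguring a shared camera object.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Request:
    """Hypothetical per-frame request: all capture state travels with it."""
    exposure_us: int
    gain: float
    actions: tuple = ()  # callables triggered for this frame only


@dataclass
class Frame:
    """Result of one request; keeps the exact state it was captured with."""
    request: Request
    log: List[str] = field(default_factory=list)


def capture(request: Request) -> Frame:
    """Simulated sensor stage: reads only the request, never global state."""
    frame = Frame(request)
    for action in request.actions:
        frame.log.append(action(request))
    return frame


def fire_flash(req: Request) -> str:
    """Hypothetical device action synchronized with the exposure."""
    return f"flash fired for {req.exposure_us} us exposure"


# A two-frame burst: short exposure, then long exposure with flash.
burst = [Request(1000, 1.0), Request(8000, 1.0, actions=(fire_flash,))]
frames = [capture(r) for r in burst]
for f in frames:
    print(f.request.exposure_us, f.log)
```

Because each frame keeps the request that produced it, the application always knows which settings apply to which image.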
Visual Sensor Revolution
•  Single sensor RGB cameras are just the start of the mobile visual revolution
‣  IR sensors – LEAP Motion, eye-trackers
•  Multi-sensors: Stereo pairs -> Plenoptic array -> Depth cameras
‣  Stereo pair can enable object scaling and enhanced depth extraction
‣  Plenoptic Field processing needs FFTs and ray-casting
•  Hybrid visual sensing solutions
‣  Different sensors mixed for different distances and lighting conditions
•  GPUs today – more dedicated ISPs tomorrow?
[Images: Dual Camera (LG Electronics); Plenoptic Array (Pelican Imaging); Capri Structured Light 3D Camera (PrimeSense)]
Khronos APIs for Augmented Reality
[Diagram: an application on CPUs, GPUs and DSPs ties together Advanced Camera Control and stream generation, Vision Processing, Sensor Fusion (MEMS Sensors), 3D Rendering and Video Composition on GPU, and Audio Rendering. Callouts: Camera Control API; EGLStream - stream data between APIs; precision timestamps on all sensor samples]
AR needs not just advanced sensor processing, vision
acceleration, computation and rendering - but also for all
these subsystems to work efficiently together
Khronos Camera API
•  Catalyze camera functionality not available on any current platform
‣  Open API that aligns with future platform directions for easy adoption
‣  E.g. could be used to implement future versions of Android Camera HAL
•  Control multiple sensors with synch and alignment
‣  E.g. Stereo pairs, Plenoptic arrays, TOF or structured light depth cameras
•  More detailed control per frame
‣  Format flexibility, Region of Interest (ROI) selection
•  Global Timing & Synchronization
‣  E.g. Between cameras and MEMS sensors
•  Application control over ISP processing (including 3A)
‣  Including multiple, re-entrant ISPs
•  Flexible processing/streaming
‣  Multiple output streams and streaming rows (not just frames)
‣  RAW, Bayer and YUV Processing
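Global timing matters because an application has to pair each camera frame with the motion-sensor sample taken closest to it. The sketch below shows one simple way to do that once all samples share a clock: a nearest-timestamp lookup with binary search. The function name and the sample timestamps are made up for illustration and are not part of any Khronos API.

```python
from bisect import bisect_left
from typing import Sequence


def nearest_sample(timestamps: Sequence[int], t: int) -> int:
    """Index of the sample closest in time to t (timestamps sorted ascending)."""
    if not timestamps:
        raise ValueError("no samples")
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick the closer of the two neighbours; ties go to the earlier sample.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


# Hypothetical data on a shared microsecond clock.
gyro_ts_us = [0, 5000, 10000, 15000, 20000]   # 200 Hz gyro samples
frame_ts_us = [1000, 17667]                   # two camera frames

matches = [nearest_sample(gyro_ts_us, t) for t in frame_ts_us]
print(matches)  # [0, 4]
```

Without a common time base and precise timestamps, this lookup would pair frames with the wrong motion data.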
Camera API Design Milestones and Philosophy
•  C-language API starting from proven designs
‣  e.g. FCAM, Android camera HAL V3
•  Design alignment with widely used hardware standards
‣  e.g. MIPI CSI
•  Focus on mobile, power-limited devices
‣  But do not preclude other use cases such as automotive, surveillance, DSLR…
•  Minimize overlap and maximize interoperability with other Khronos APIs
‣  But other Khronos APIs are not required
•  Provide support for vendor-specific extensions
Apr13
Jul13
Group charter
approved
4Q13
Provisional
specification
1Q14
First draft
specification
2Q14
Sample
implementation
and tests
3Q14
Specification
ratification
Questions & Answers
Thank you!
Imaging on Embedded GPUs

Más contenido relacionado

La actualidad más candente

XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...The Linux Foundation
 
Virtualization Support in ARMv8+
Virtualization Support in ARMv8+Virtualization Support in ARMv8+
Virtualization Support in ARMv8+Aananth C N
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMDevOps.com
 
التعرف على الوجه مقابل مصادقة الوجه.pdf
التعرف على الوجه مقابل مصادقة الوجه.pdfالتعرف على الوجه مقابل مصادقة الوجه.pdf
التعرف على الوجه مقابل مصادقة الوجه.pdfBahaa Abdulhadi
 
Redesigning Xen Memory Sharing (Grant) Mechanism
Redesigning Xen Memory Sharing (Grant) MechanismRedesigning Xen Memory Sharing (Grant) Mechanism
Redesigning Xen Memory Sharing (Grant) MechanismThe Linux Foundation
 
Embedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debuggingEmbedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debuggingAnne Nicolas
 
Nimble storage investor_deck_public
Nimble storage investor_deck_publicNimble storage investor_deck_public
Nimble storage investor_deck_publicSequoia Capital
 
Redteaming HID attacks
Redteaming HID attacksRedteaming HID attacks
Redteaming HID attacksJuan Espin
 
Supercomputers
SupercomputersSupercomputers
Supercomputersparwind
 
Electronic authentication more than just a password
Electronic authentication more than just a passwordElectronic authentication more than just a password
Electronic authentication more than just a passwordNicholas Davis
 
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPQ1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPMemory Fabric Forum
 
Uboot startup sequence
Uboot startup sequenceUboot startup sequence
Uboot startup sequenceHoucheng Lin
 
Graphical Password Authentication
Graphical Password AuthenticationGraphical Password Authentication
Graphical Password AuthenticationAbhijit Akotkar
 
DRM Basics With Irdeto and Bitmovin
DRM Basics With Irdeto and BitmovinDRM Basics With Irdeto and Bitmovin
DRM Basics With Irdeto and BitmovinBitmovin Inc
 
MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...
MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...
MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...MIPI Alliance
 
Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...
Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...
Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...Memory Fabric Forum
 
CXL Consortium Update: Advancing Coherent Connectivity
CXL Consortium Update: Advancing Coherent ConnectivityCXL Consortium Update: Advancing Coherent Connectivity
CXL Consortium Update: Advancing Coherent ConnectivityMemory Fabric Forum
 

La actualidad más candente (20)

CXL Fabric Management Standards
CXL Fabric Management StandardsCXL Fabric Management Standards
CXL Fabric Management Standards
 
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
XPDDS17: Shared Virtual Memory Virtualization Implementation on Xen - Yi Liu,...
 
Virtualization Support in ARMv8+
Virtualization Support in ARMv8+Virtualization Support in ARMv8+
Virtualization Support in ARMv8+
 
Achieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVMAchieving the Ultimate Performance with KVM
Achieving the Ultimate Performance with KVM
 
التعرف على الوجه مقابل مصادقة الوجه.pdf
التعرف على الوجه مقابل مصادقة الوجه.pdfالتعرف على الوجه مقابل مصادقة الوجه.pdf
التعرف على الوجه مقابل مصادقة الوجه.pdf
 
eMMC 5.0 Total IP Solution
eMMC 5.0 Total IP SolutioneMMC 5.0 Total IP Solution
eMMC 5.0 Total IP Solution
 
Redesigning Xen Memory Sharing (Grant) Mechanism
Redesigning Xen Memory Sharing (Grant) MechanismRedesigning Xen Memory Sharing (Grant) Mechanism
Redesigning Xen Memory Sharing (Grant) Mechanism
 
Embedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debuggingEmbedded Recipes 2019 - Introduction to JTAG debugging
Embedded Recipes 2019 - Introduction to JTAG debugging
 
DFI_Blog
DFI_BlogDFI_Blog
DFI_Blog
 
Nimble storage investor_deck_public
Nimble storage investor_deck_publicNimble storage investor_deck_public
Nimble storage investor_deck_public
 
Redteaming HID attacks
Redteaming HID attacksRedteaming HID attacks
Redteaming HID attacks
 
Supercomputers
SupercomputersSupercomputers
Supercomputers
 
Electronic authentication more than just a password
Electronic authentication more than just a passwordElectronic authentication more than just a password
Electronic authentication more than just a password
 
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPQ1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
 
Uboot startup sequence
Uboot startup sequenceUboot startup sequence
Uboot startup sequence
 
Graphical Password Authentication
Graphical Password AuthenticationGraphical Password Authentication
Graphical Password Authentication
 
DRM Basics With Irdeto and Bitmovin
DRM Basics With Irdeto and BitmovinDRM Basics With Irdeto and Bitmovin
DRM Basics With Irdeto and Bitmovin
 
MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...
MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...
MIPI DevCon 2021: Meeting the Needs of Next-Generation Displays with a High-P...
 
Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...
Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...
Micron: Memory Expansion with CXL Modules: Benefits, Use Cases and Enriching ...
 
CXL Consortium Update: Advancing Coherent Connectivity
CXL Consortium Update: Advancing Coherent ConnectivityCXL Consortium Update: Advancing Coherent Connectivity
CXL Consortium Update: Advancing Coherent Connectivity
 

Destacado

Qualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceQualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceJJ Wu
 
Svn에서 git으로 이주하기
Svn에서 git으로 이주하기Svn에서 git으로 이주하기
Svn에서 git으로 이주하기Seunghwa Song
 
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRVROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRVJuxi Leitner
 
ABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat InternalsABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat InternalsBenjamin Zores
 
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...Edge AI and Vision Alliance
 
OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기Seunghwa Song
 
Camera 2.0 in Android 4.2
Camera 2.0 in Android 4.2 Camera 2.0 in Android 4.2
Camera 2.0 in Android 4.2 Balwinder Kaur
 

Destacado (8)

Qualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile DeviceQualcomm SnapDragon 800 Mobile Device
Qualcomm SnapDragon 800 Mobile Device
 
CS-ISP Overview
CS-ISP OverviewCS-ISP Overview
CS-ISP Overview
 
Svn에서 git으로 이주하기
Svn에서 git으로 이주하기Svn에서 git으로 이주하기
Svn에서 git으로 이주하기
 
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRVROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
ROS Hands-On Intro/Tutorial (Robotic Vision Summer School 2015) #RVSS #ACRV
 
ABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat InternalsABS 2014 - Android Kit Kat Internals
ABS 2014 - Android Kit Kat Internals
 
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li..."The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
"The OpenVX Hardware Acceleration API for Embedded Vision Applications and Li...
 
OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기OpenCV 에서 OpenCL 살짝 써보기
OpenCV 에서 OpenCL 살짝 써보기
 
Camera 2.0 in Android 4.2
Camera 2.0 in Android 4.2 Camera 2.0 in Android 4.2
Camera 2.0 in Android 4.2
 

Similar a Imaging on Embedded GPUs

“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...
“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...
“Jumpstart Your Edge AI Vision Application with New Development Kits from Avn...Edge AI and Vision Alliance
 
Droidcon2013 triangles gangolells_imagination
Droidcon2013 triangles gangolells_imaginationDroidcon2013 triangles gangolells_imagination
Droidcon2013 triangles gangolells_imaginationDroidcon Berlin
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)Fatima Qayyum
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architectureDhaval Kaneria
 
Hai Tao at AI Frontiers: Deep Learning For Embedded Vision System
Imaging on Embedded GPUs

  • 1. Imaging on Embedded GPUs: Investigating flexible imaging pipelines using embedded GPUs
      Mikaël Bourges-Sévenier (msevenier at aptina dot com), Director, High-Performance Imaging
      December 19, 2013, Bay Area Multimedia
      © 2013 Aptina Imaging Corporation | Aptina Confidential
      © 2013 Aptina Imaging Corporation. All rights reserved. Products are warranted only to meet Aptina’s production data sheet specifications. Information, products, and/or specifications are subject to change without notice. All information is provided on an “AS IS” basis without warranties of any kind. Dates are estimates only. Drawings not to scale. Aptina and the Aptina logo are trademarks of Aptina Imaging Corporation. All other trademarks are the property of their respective owners.
  • 2. Agenda
      • Overview: the need for computational imaging
      • What is imaging?
      • Architecture of some embedded GPUs
      • 8MP MobileHDR pipeline on ARM Mali T604
      • Khronos Camera: a standard API for computational imaging
      • Q&A
  • 3. Computational Imaging evolution
      [Diagram: evolution from web cam to smart camera. Capabilities shown: brightness, colorimetry, presence, face detect, face track, AR, gesture, spatial (volumetric). Applications shown: true color, brightness compensation and exposure control; user identity and access control; augmented information; 3D imaging; interactive services.]
  • 4. Mobile Compute driving Imaging use cases
      • Requires significant computing over large data sets
      • [Diagram: use cases plotted over time: Augmented Reality; Face, Body and Gesture Tracking; Computational Photography; 3D Scene/Object Reconstruction.]
  • 5. Increasing Use of Imaging Sensors
      [Chart: differentiation opportunity versus time, in three stages.]
      • Photography
          ‣ Input = 2D Camera
          ‣ Processors = ISP + CPU
          ‣ Product = Static Images
      • Computational Photography (we are here)
          ‣ Input = MEMS + 2D Camera
          ‣ Processors = ISP + CPU + GPU
          ‣ Product = Real Time Images
      • Perceptual Imaging
          ‣ Input = MEMS + Depth Camera
          ‣ Processors = ISP + CPU + GPU + DSP
          ‣ Product = Real Time Extracted Information
      • Perceptual Imaging:
          1. Uses the full array of mobile sensors
          2. to extract information in real-time
          3. about the user and environment
          4. to generate enhanced user interactions
  • 6. Hardware Save Power, e.g. Camera Sensor ISP
      • CPU
          ‣ Single processor or Neon SIMD, running fast
          ‣ Makes heavy use of general memory
          ‣ Non-optimal performance and power
      • GPU
          ‣ Programmable and flexible
          ‣ Many-way parallelism, so it can run at lower frequency
          ‣ Efficient image caching close to processors
          ‣ BUT cycles frames in and out of memory
      • Camera ISP (Image Signal Processor)
          ‣ Little or no programmability
          ‣ Data flows through a compact hardware pipe
          ‣ Scan-line-based, no global memory
          ‣ Best perf/watt
  • 7. Evolution of Embedded GPUs: GFLOPS Trend
      • [Chart: GFLOPS (axis 0 to 450) versus date (Sep-2011 to Jun-2014). GPUs plotted: Adreno 320, Adreno 330, Mali T628, PowerVR 6, Tegra 5, PowerVR 5XT, Mali T604.]
      • 40% more GFLOPS/quarter
      • Estimated at sustained peak performance. Likely to be much less in practice.
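To put the "40% more GFLOPS/quarter" trend in yearly terms, the short sketch below compounds the quarterly rate. The function name and the assumption that the rate stays constant are illustrative only; the slide itself notes that real performance is likely to be lower than the plotted peaks.

```python
def projected_gflops(start_gflops, quarters, growth_per_quarter=0.40):
    """Compound a constant per-quarter growth rate (an assumption) over time."""
    return start_gflops * (1.0 + growth_per_quarter) ** quarters

# Growth factor after one year (4 quarters) and two years (8 quarters),
# starting from a normalised value of 1.0.
one_year = projected_gflops(1.0, 4)
two_years = projected_gflops(1.0, 8)
print(f"1 year: x{one_year:.2f}, 2 years: x{two_years:.2f}")
```

Under that assumption, 40% per quarter compounds to roughly 3.8x per year.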
  • 8. Computational Imaging pipeline
      • Pre-processing: for non-standard Bayer pixels (e.g. iHDR)
      • ISP: for fast demosaic, lens shading, denoising, 3A, statistics …
      • Post-processing: for special reconstruction of colors (e.g. Clarity+)
      • Processing requires control of metadata aligned with data
      • [Diagram: Lens -> Color Filter Array -> CMOS sensor -> (Bayer) -> Pre-processing -> Image Signal Processor (ISP) -> (RGB, YUV) -> Post-processing -> App. Metadata and 3A stats accompany the data; lens, sensor and aperture control feeds back to the camera.]
  • 9. Different Computing Devices
      • Latency-optimized CPU: fast serial processing
          ‣ Lots of big on-chip caches
          ‣ Sophisticated control
      • Throughput-optimized GPU: scalable parallel processing
          ‣ Multithreading can hide latency
          ‣ Simpler control, cost amortized over ALUs via SIMD
      • DSPs are similar to CPUs
          ‣ Typically integer optimized (some have rudimentary floating-point support)
          ‣ With signal processing intrinsics
      • FPGA
          ‣ Can be tailored to a cross between CPU/DSP and GPU
      • [Diagram: SISD (scalar ALU) adds a and b to give c; SIMD (vector ALU) adds a1..a4 and b1..b4 to give c1..c4.]
      • OpenCL works on all devices but performance isn’t guaranteed
  • 10. Stream-based vs. Frame-based
      • Stream-based (ISP)
          ‣ For low-memory devices
          ‣ Set of lines processed by kernels
          ‣ Delay: number of lines a kernel needs
      • Frame-based (GPU)
          ‣ For fast data-parallel devices
          ‣ Full image frame processed
          ‣ Delay: whole frame(s)
      • Completely different kernels
      • [Diagram: stream-based: a continuous stream of pixels flows through kernels that accumulate lines, producing the final image; frame-based: a chain of kernels, each passing a full frame to the next.]
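The difference in delay between the two models can be sketched with a toy 3-line vertical average. This is only an illustration written for this transcript: the function names and the choice of filter are assumptions, not kernels from the talk. The streaming version emits an output line as soon as it has buffered the lines it needs, while the frame version needs the whole frame first.

```python
from collections import deque

def stream_vertical_average(lines, taps=3):
    """Stream-based: buffer only `taps` lines, emit output as lines arrive."""
    window = deque(maxlen=taps)
    for line in lines:
        window.append(line)
        if len(window) == taps:
            # Delay is `taps` lines, not a whole frame.
            yield [sum(column) / taps for column in zip(*window)]

def frame_vertical_average(frame, taps=3):
    """Frame-based: the full frame must be in memory before processing."""
    return [
        [sum(column) / taps for column in zip(*frame[y:y + taps])]
        for y in range(len(frame) - taps + 1)
    ]

# A 5-line, 4-pixel-wide test frame where every pixel holds its row index.
frame = [[float(y)] * 4 for y in range(5)]
streamed = list(stream_vertical_average(iter(frame)))
assert streamed == frame_vertical_average(frame)
```

Both functions compute the same result; what changes is how much data must be held before the first output line appears, which is why the two models end up with differently structured kernels.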
  • 11. What is Imaging? Capture an image from a camera sensor and process it to get a renderable image.
  • 12. How Imaging Sensors work (image source: http://www.photoaxe.com)
      • Bayer GRBG pattern
          ‣ 50% green
          ‣ 25% red and 25% blue
      • Bayer CFA is one type of pattern
  • 13. Bayer Demosaicing
      • Twice as many G samples as R or B (50% of pixels are green), since the eye is more sensitive to luminance than chrominance
      • Convert pixel colors from Bayer space to full RGB color
      • Complex interpolation to avoid artifacts (e.g. on edges)
      • [Diagram: after demosaicing, every pixel carries a full RGB triple.]
      • Bayer pattern variants: 0 GRBG, 1 RGGB, 2 GBRG, 3 BGGR
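As a rough illustration of what demosaicing does, the sketch below looks up the filter color of each pixel for a given pattern and fills in missing channels by averaging same-colored samples in the 3x3 neighborhood. This is a deliberately naive stand-in, not the edge-aware "complex interpolation" the slide refers to; the function names are hypothetical and the input is assumed to be at least 2x2 pixels.

```python
def bayer_color(pattern, row, col):
    """Return 'R', 'G' or 'B' for a pixel, given a 2x2 pattern such as 'GRBG'."""
    return pattern[(row % 2) * 2 + (col % 2)]

def demosaic_average(raw, pattern):
    """Naive demosaic: average same-colored samples in each 3x3 neighborhood.

    Assumes `raw` is a list of rows with at least 2 rows and 2 columns, so
    every neighborhood contains at least one sample of each color.
    """
    height, width = len(raw), len(raw[0])
    output = []
    for y in range(height):
        out_row = []
        for x in range(width):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        color = bayer_color(pattern, yy, xx)
                        sums[color] += raw[yy][xx]
                        counts[color] += 1
            out_row.append(tuple(sums[c] / counts[c] for c in "RGB"))
        output.append(out_row)
    return output

# A flat gray field should stay flat after demosaicing.
flat = [[100] * 4 for _ in range(4)]
rgb = demosaic_average(flat, "GRBG")
assert all(pixel == (100.0, 100.0, 100.0) for row in rgb for pixel in row)
```

Plain averaging like this blurs edges and produces color fringes, which is the kind of artifact more elaborate interpolation is designed to avoid.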
  • 14. OpenCL (memory system)
      • Desktop: non-uniform memory
          ‣ Data is physically copied between GPU and CPU memory
      • Embedded: uniform memory
          ‣ __local memory may be in __global
          ‣ Cheap data exchange between CPU and GPU
  • 15. A tour of some embedded GPUs: ARM Mali T604, Qualcomm Adreno 330
  • 16. ARM Mali T604, T628
      • Found in Samsung Exynos 5 Dual (T604)/Octa (T628) Application Processors
          ‣ Chromebook, Nexus 10, Samsung S4…
      • 32nm process for T604, 28nm for T628
      • T604 has 4 shader cores, T628 has 8 cores
      • Tri-pipe architecture: each GPU core has 3 types of instruction pipelines
          ‣ 1x load/store
          ‣ 1x texture
          ‣ 2x ALU (T604) / 4x ALU (T628)
      • 64-bit integers and IEEE 754 floating-point ALUs
  • 17. OpenCL and OpenGL ES
      • 3 kinds of pipelines
          ‣ Arithmetic
          ‣ Load/Store
          ‣ Texture
      • Barrel-threaded (like AMD/NVIDIA)
      • No SIMT execution (unlike AMD/NVIDIA)
      • SIMD (like AMD)
          ‣ Use vectors for best performance!
      • 256 threads max (64 in practice)
      • [Diagram, "The Vithar Architecture": thread issue, two arithmetic pipelines, load/store pipeline, texturing pipeline, thread completion.]
  • 18. Midgard Job Execution and Load-balancing
      • Automatic hardware load balancing
      • Seamless concurrent execution
      • Integrated seamless power manager
  • 19. Qualcomm MSM8974
      • Process: 28nm
      • CPU: 4x Krait 2.3 GHz
          ‣ ARMv7A Neon instruction set
          ‣ Power and performance efficiencies over ARM
          ‣ 4KB+4KB L0, 16KB+16KB L1, 2MB L2 cache
          ‣ No 64b support
      • GPU: Adreno 330 450 MHz
          ‣ 32x 32b scalar ALUs/pipeline, 8 pipelines, 129.6 GFLOPS
              • 16b kernels provide 2x performance
          ‣ 128b registers
          ‣ 8 KB local memory per shader core
          ‣ 8 KB constant memory
          ‣ 12 reads, 4 writes simultaneous per clock
          ‣ 512 work-items max
          ‣ 1.5 MB on-chip SRAM
          ‣ Tiled renderer max 3.6 GPix/s
      • Hexagon DSP
          ‣ 3x core, 600 MHz, 16 KB L1, 256 KB L2, integrated MMU
          ‣ Limited floating-point support (no division, no log/exp…)
      • RAM: 2GB 2x LP-DDR3 800 MHz (12.8 GB/s)
      • MSM8974 Adreno 330 vs Adreno 320: Adreno 330 has better performance
          ‣ 450 MHz GPU clock (up from 400 MHz in Adreno 320)
          ‣ 2x better shader performance than A320: 2x more ALU blocks
          ‣ Dedicated GPU power rail: will allow GPU to be at a lower frequency and voltage than the FABR
      • Adreno 330 Shader Processor “SP” block
          ‣ Total of 32 (32-bit) scalar ALUs
          ‣ 16-bit ALUs are used if the whole kernel is 16-bit, otherwise the 32b ALU is used
  • 20. MobileHDR pipeline
  • 21. Arndale Samsung Exynos 5 Dual board
      • Arndale Samsung Exynos 5 board
          ‣ CPU: ARM Cortex-A15 (2-core) 1.7 GHz 32nm
              • 32KB L1 cache, 1MB L2 cache
          ‣ GPU: ARM Mali T604
              • 64 concurrent threads
              • Vector ALUs
              • 128b registers
              • OpenCL 1.1 Full Profile
          ‣ RAM: 2GB LP-DDR3 800 MHz (12.8 GB/s)
          ‣ Truly unified cached memory
              • CPU and GPU memory is shared: NO COPY!
              • 128b wide L1 and L2 access
  • 22. ARM Mali T604 GPUs in Samsung Exynos 5 Dual
      • Type: Vector GPU
      • Process: 32nm
      • OpenCL: 1.1 Full Profile
      • Unified memory: Yes
      • Rendering: Tile
      • Work-items: 256
      • Clock: 533 MHz
      • L2 cache: 1MB
      • Register width: 128b
      • Global memory: 2GB LP-DDR3 800 MHz (12.8 GB/s)
      • ALUs: 8 (2 ALUs/core)
      • Throughput: 100 GFLOPS
      • Local memory: 32KB/core (global)
      • Constant memory: 64KB
      • Texture cache: Yes
      • Compute devices (shader cores): 4
      • Cacheline: 64 bytes
      • 16/32/64b floats: No/Yes/Yes
  • 23. Avoid buffer copy
      • Mali/Adreno have unified memory
          ‣ Use CL_MEM_ALLOC_HOST_PTR to avoid copy between CPU and GPU
      • Mali has no local memory
      • Adreno has local memory (1.5MB SRAM 115GB/s)
      • Host data pointers [diagrams: global memory shared by the CPU (host) and the GPU (compute device)]
          ‣ malloc(): buffers created by the user (malloc) are not mapped into the GPU memory space
          ‣ clCreateBuffer(CL_MEM_USE_HOST_PTR): creates a new buffer and copies the data over (but the copy operations are expensive)
          ‣ clCreateBuffer(CL_MEM_ALLOC_HOST_PTR): the buffer is created by clCreateBuffer()
      • Where possible don’t use CL_…
          ‣ Create buffers at the start of your app
          ‣ Use CL_MEM_ALLOC_HOST_PTR instead of m…
          ‣ Then you can use the buffer on both …
  • 24. Aptina Sensor with MobileHDR™ turned off
  • 25. Aptina Sensor with MobileHDR™ turned on
  • 26. AR0833 8MP Camera sensor
      • Frame is inscribed in a 1/3.2” circle
          ‣ 4:3 for images, e.g. 8MP 3264 x 2448
          ‣ 16:9 for video, e.g. 6MP 3264 x 1836
      • 10-bit per pixel (framed in 16 bits)
      • At 30fps, we need 343 MB/s for 180 MPix/s
      • Interlaced HDR feature
      • Interface with ISP
          ‣ Data over MIPI CSI-2 (serial)
          ‣ Control over I2C
      • [Diagram: 4:3 (3264 x 2448) and 16:9 (3264 x 1836) frames inside the 1/3.2" image circle.]
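The bandwidth figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes the numbers refer to the 16:9 video mode (3264 x 1836) with each 10-bit pixel stored in 2 bytes, and that "MB" means 2^20 bytes; those readings are assumptions, chosen because they reproduce the slide's 180 MPix/s and 343 MB/s. The function name is illustrative.

```python
def sensor_bandwidth(width, height, fps, bytes_per_pixel=2):
    """Return (pixels per second, bytes per second) for an uncompressed stream."""
    pixels_per_second = width * height * fps
    bytes_per_second = pixels_per_second * bytes_per_pixel
    return pixels_per_second, bytes_per_second

# 16:9 video mode at 30 fps, 10-bit pixels framed in 16 bits (2 bytes).
pixels, data = sensor_bandwidth(3264, 1836, 30)
print(f"{pixels / 1e6:.1f} MPix/s, {data / 2**20:.1f} MiB/s")
```

For comparison, the same calculation with decimal megabytes gives about 360 MB/s, and the full 4:3 frame at 30 fps would need about 240 MPix/s.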
  • 27. Feature: Interlaced HDR
      • 1 frame contains 2 exposures interlaced
      • Ratio between odd and even pairs
          ‣ User controlled: 1x, 2x, 4x, 8x
      • Interlaced HDR Readout (from the AR0833 data sheet, AR0833_DS Rev. F, Pub. 4/13 EN): The sensor enables HDR by outputting frames where even and odd row pairs within a single frame are captured at different integration times. This output is then matched with an algorithm designed to reconstruct this output into an HDR still image or video. The sensor HDR is controlled by two shutter pointers (Shutter pointer1, Shutter pointer2) that control the integration of the odd (Shutter pointer1) and even (Shutter pointer 2) row pairs.
      • [Figure 16: HDR Integration Time. Labels: Tint 1, Tint 2, sample pointer, shutter pointer 1, shutter pointer 2, exposure I-FRAME 1, exposure I-FRAME 2, output frame from sensor (I-FRAME 1 and 2, i.e. Exposure 1 and Exposure 2).]
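A first step in processing such a frame is to separate the rows belonging to each exposure. The sketch below is a minimal illustration: it assumes the first row pair (rows 0 and 1) belongs to the first exposure and that row pairs strictly alternate, and the function names are hypothetical. It also shows how the second exposure could be brought to a common scale using the user-selected ratio; which of the two exposures is the shorter one is likewise an assumption here.

```python
def split_interlaced_rows(frame):
    """Split a frame into two row lists, alternating every pair of rows."""
    exposure_1, exposure_2 = [], []
    for y, row in enumerate(frame):
        # Rows 0-1 -> exposure 1, rows 2-3 -> exposure 2, rows 4-5 -> exposure 1, ...
        target = exposure_1 if (y // 2) % 2 == 0 else exposure_2
        target.append(row)
    return exposure_1, exposure_2

def scale_rows(rows, ratio):
    """Multiply every sample by the exposure ratio (1, 2, 4 or 8 on the slide)."""
    return [[sample * ratio for sample in row] for row in rows]

# An 8-row test frame where every sample holds its row index.
frame = [[y, y] for y in range(8)]
first, second = split_interlaced_rows(frame)
second_scaled = scale_rows(second, 4)  # assumes exposure 2 is 4x shorter
```

A real reconstruction would then interpolate the missing rows of each exposure and blend the two, handling saturated and noisy samples; none of that is attempted here.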
  • 28. mobileHDR demo
      • Zero-copy between sensor/OpenCL and OpenCL/OpenGL
      • On Arndale board (Samsung Exynos 5 Dual with Mali T604 GPU)
      • [Diagram: pipeline stages shown: Noise Reduction, iHDR Reconstruction, Bayer scaler, Tone Mapping, Color Correction. Data labels: 10b iHDR 3264x1836, 14b, RGB888, 1080p. Buffers: EGLImage, CL Image (OpenCL), GL Texture (OpenGL ES).]
  • 29. Summary
      • Embedded GPUs are ideal candidates for computational imaging
          ‣ Performance at reasonable image sizes is now available
          ‣ Power efficiency is being addressed
      • OpenCL 1.1 is available on all recent application processors
          ‣ But may be reserved for OEMs
          ‣ Performance portability isn’t guaranteed (but that is true of any high-performance application)
      • Opening the camera image-processing “black box” is now feasible, enabling incredible new applications
  • 30. Khronos Camera: a standard to control image acquisition and processing.
  • 31. Typical Imaging Pipeline
      • Pre- and post-processing can be done on CPU, GPU, DSP…
      • ISP controls camera via 3A algorithms: Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)
      • ISP may be a separate chip or within the Application Processor
      • [Diagram: Lens -> Color Filter Array -> CMOS sensor -> (Bayer) -> Pre-processing -> Image Signal Processor (ISP) -> (RGB/YUV) -> Post-processing -> App. 3A drives lens, sensor and aperture control.]
      • Need for an advanced camera control API:
          ‣ to drive more flexible app camera control
          ‣ over more types of camera sensors
          ‣ with tighter integration with the rest of the system
  • 32. Advanced Camera Control Use Cases
      • High-dynamic range (HDR) and computational flash photography
          ‣ High-speed burst with individual frame control over exposure and flash
      • Rolling shutter elimination
          ‣ High-precision intra-frame synchronization between camera and motion sensor
      • HDR Panorama, photo-spheres
          ‣ Continuous frame capture with constant exposure and white balance
      • Subject isolation and depth detection
      • High-speed burst with individual frame control over focus
      • Time-of-flight or structured light depth camera processing
          ‣ Aligned stacking of data from multiple sensors
      • Augmented Reality
          ‣ 60Hz, low-latency capture with motion sensor synchronization
          ‣ Multiple Region of Interest (ROI) capture
          ‣ Multiple sensors for scene scaling
          ‣ Detailed feedback on camera operation per frame
  • 33. Camera API Architecture (FCAM based)
    •  No global state
      ‣  State travels with image requests
      ‣  Every stage in the pipeline may have different state
        •  -> allows fast, deterministic state changes
    •  Synchronize devices
      ‣  Lens, flash, sound capture, gyro…
      ‣  Devices can schedule Actions
        •  E.g. to be triggered on exposure change
        •  Enables device synchronization
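    The "no global state" idea can be sketched as a toy model. This is not the FCAM or Khronos Camera API: the `Request`, `Action`, and `Frame` classes and the `capture` function below are hypothetical names invented for illustration. Each request carries its own sensor settings and device actions, and each returned frame keeps a reference to the request that produced it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Action:
    """A device command scheduled relative to the start of exposure."""
    device: str
    command: str
    offset_us: int


@dataclass(frozen=True)
class Request:
    """All state needed for one frame; nothing is read from a global camera state."""
    exposure_us: int
    gain: float
    actions: Tuple[Action, ...] = ()


@dataclass
class Frame:
    """A captured frame tagged with the exact request that produced it."""
    request: Request
    fired: List[str] = field(default_factory=list)


def capture(requests):
    """Simulate a pipeline: process requests in order and record fired actions."""
    frames = []
    for req in requests:
        frame = Frame(request=req)
        for act in sorted(req.actions, key=lambda a: a.offset_us):
            # Only actions that fall inside this frame's exposure window fire.
            if 0 <= act.offset_us <= req.exposure_us:
                frame.fired.append(f"{act.device}:{act.command}@{act.offset_us}us")
        frames.append(frame)
    return frames


# Two back-to-back frames with different settings; the flash fires only on the second.
burst = [
    Request(exposure_us=10_000, gain=1.0),
    Request(exposure_us=40_000, gain=2.0, actions=(Action("flash", "fire", 0),)),
]
frames = capture(burst)
```

    Because each frame reports the settings it was actually captured with, an application can change parameters on every frame without waiting for a global setting to take effect.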
  • 34. Visual Sensor Revolution
    •  Single-sensor RGB cameras are just the start of the mobile visual revolution
      ‣  IR sensors: Leap Motion, eye-trackers
    •  Multi-sensors: Stereo pairs -> Plenoptic array -> Depth cameras
      ‣  Stereo pair can enable object scaling and enhanced depth extraction
      ‣  Plenoptic field processing needs FFTs and ray-casting
    •  Hybrid visual sensing solutions
      ‣  Different sensors mixed for different distances and lighting conditions
    •  GPUs today, more dedicated ISPs tomorrow?
    [Images: Dual camera (LG Electronics); Plenoptic array (Pelican Imaging); Capri structured light 3D camera (PrimeSense)]
  • 35. Khronos APIs for Augmented Reality
    [Diagram labels: Advanced camera control and stream generation; 3D rendering and video composition on GPU; Audio rendering; Application on CPUs, GPUs and DSPs; Sensor fusion; Vision processing; MEMS sensors; Camera Control API; EGLStream (stream data between APIs); Precision timestamps on all sensor samples]
    •  AR needs not just advanced sensor processing, vision acceleration, computation, and rendering, but also for all these subsystems to work efficiently together
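    One reason precision timestamps on all sensor samples matter: sensor fusion has to estimate what a motion sensor read at the instant a frame was exposed, even though the two devices sample at different rates. The sketch below linearly interpolates a gyro-like signal at a frame timestamp. The function name, the nanosecond units, and the sample values are assumptions for illustration, and timestamps are assumed sorted in increasing order on a shared clock.

```python
from bisect import bisect_left


def sample_at(timestamps_ns, values, t_ns):
    """Linearly interpolate a sensor signal at time t_ns.

    `timestamps_ns` must be sorted ascending and the same length as `values`.
    Queries outside the recorded range are clamped to the nearest sample.
    """
    if not timestamps_ns or len(timestamps_ns) != len(values):
        raise ValueError("timestamps and values must be non-empty and equal length")
    if t_ns <= timestamps_ns[0]:
        return values[0]
    if t_ns >= timestamps_ns[-1]:
        return values[-1]
    # First index whose timestamp is >= t_ns; the previous sample is strictly earlier.
    i = bisect_left(timestamps_ns, t_ns)
    t0, t1 = timestamps_ns[i - 1], timestamps_ns[i]
    weight = (t_ns - t0) / (t1 - t0)
    return values[i - 1] + weight * (values[i] - values[i - 1])


# Hypothetical gyro samples every 10 ms, and a frame exposed at t = 15 ms.
gyro_t = [0, 10_000_000, 20_000_000]
gyro_rate = [0.0, 1.0, 3.0]
rate_at_frame = sample_at(gyro_t, gyro_rate, 15_000_000)
```

    This only works if camera frames and MEMS samples are stamped against the same clock, which is what the slide's "precision timestamps" requirement is about.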
  • 36. Khronos Camera API
    •  Catalyze camera functionality not available on any current platform
      ‣  Open API that aligns with future platform directions for easy adoption
      ‣  E.g. could be used to implement future versions of Android Camera HAL
    •  Control multiple sensors with synchronization and alignment
      ‣  E.g. stereo pairs, plenoptic arrays, TOF or structured light depth cameras
    •  More detailed control per frame
      ‣  Format flexibility, Region of Interest (ROI) selection
    •  Global timing and synchronization
      ‣  E.g. between cameras and MEMS sensors
    •  Application control over ISP processing (including 3A)
      ‣  Including multiple, re-entrant ISPs
    •  Flexible processing/streaming
      ‣  Multiple output streams and streaming rows (not just frames)
      ‣  RAW, Bayer and YUV processing
  • 37. Camera API Design Milestones and Philosophy
    •  C-language API starting from proven designs
      ‣  e.g. FCAM, Android camera HAL V3
    •  Design alignment with widely used hardware standards
      ‣  e.g. MIPI CSI
    •  Focus on mobile, power-limited devices
      ‣  But do not preclude other use cases such as automotive, surveillance, DSLR…
    •  Minimize overlap and maximize interoperability with other Khronos APIs
      ‣  But other Khronos APIs are not required
    •  Provide support for vendor-specific extensions
    [Timeline: Apr13 · Jul13 · Group charter approved · 4Q13 · Provisional specification · 1Q14 · First draft specification · 2Q14 · Sample implementation and tests · 3Q14 · Specification ratification]
  • 38. Questions & Answers
    Thank you!