Introduction to SPIR for
Application and Compiler
Developers

Yaxun Sam Liu
OUTLINE
y What is SPIR and why it is useful
‒ Why do we need SPIR since we already have LLVM IR

y SPIR for Application Developers
‒ How to generate SPIR
‒ How to load SPIR
‒ Portability considerations using SPIR

y SPIR for Compiler Developers
‒ Introduction to SPIR spec
‒ How to implement a SPIR loader

y References

2 | INTRODUCTION TO SPIR FOR APPLICATION DEVELOPERS AND COMPILER DEVELOPERS | November 5, 2013
WHAT IS SPIR
y A Binary Format
‒ SPIR means Standard Portable Intermediate Representation
‒ A portable binary format for OpenCL™ programs
‒ Defined by the SPIR spec
‒ Based on LLVM IR
‒ Supports most OpenCL™ core features
‒ Current version is 1.2, corresponding to OpenCL™ 1.2
‒ Developed by the SPIR subgroup of the Khronos Group OpenCL™ working group
‒ A SPIR binary is bitness-aware:
  ‒ The pointer size in a SPIR binary is either 32 or 64 bits, depending on the target device
  ‒ Shipping a product in SPIR for both 32-bit and 64-bit devices requires two sets of SPIR binaries

y An extension for OpenCL™
‒ Defined by the SPIR host API
‒ Denoted by cl_khr_spir
‒ OpenCL™ devices supporting cl_khr_spir are able to load and run SPIR binaries
WHY IS SPIR USEFUL
y Why is SPIR useful
‒ For game and application developers
  ‒ Can ship an OpenCL™ program as a binary instead of source code
  ‒ Can ship just a few binaries per OpenCL™ program instead of a separate binary for every platform and device

‒ For compiler developers
  ‒ Can compile other programming languages to SPIR, which can then run on OpenCL™ devices

HOW TO GENERATE SPIR
y SPIR generation is optional for devices supporting cl_khr_spir
‒ A device supporting cl_khr_spir is only required to be able to consume SPIR
‒ Whether to support SPIR generation is the vendor’s choice

y Generating SPIR in host program
‒ The SPIR spec and host API do not define how to generate SPIR
‒ If SPIR generation is supported, it is likely to be done as follows:
  ‒ Load OpenCL™ source code with clCreateProgramWithSource
  ‒ Compile the source code with clCompileProgram, passing a vendor-specific option for generating SPIR
  ‒ Get the SPIR binary with clGetProgramInfo and CL_PROGRAM_BINARIES
  ‒ Save the SPIR binary to a file

y Generating SPIR by offline compiler
‒ Clang 3.3/3.4 can compile OpenCL™ source code to SPIR-like LLVM bitcode
‒ A patch for Clang 3.2, available to Khronos members, can compile OpenCL™ source code to SPIR 1.2
‒ Clang options for generating SPIR: -cl-std=CL1.2 -emit-llvm -triple spir[32|64]-unknown-unknown
HOW TO LOAD SPIR
LOAD A SINGLE SPIR BINARY
SPIR binary -> clCreateProgramWithBinary -> cl_program -> clBuildProgram -> cl_program -> clCreateKernel -> cl_kernel
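
Before clCreateProgramWithBinary can be called, the SPIR binary has to be in host memory. The following is a minimal sketch of that first step only, in plain C with no OpenCL dependency. The helper name read_binary_file is hypothetical and is not part of the OpenCL API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file into a malloc'd buffer.
 * Returns NULL on failure. On success *size_out holds the byte count
 * and the caller owns (and must free) the returned buffer. */
unsigned char *read_binary_file(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    /* Determine the file length by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);

    /* Allocate at least one byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }

    if (fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);
    *size_out = (size_t)len;
    return buf;
}
```

The buffer and its size would then be handed to clCreateProgramWithBinary, followed by clBuildProgram and clCreateKernel as in the flow above.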

HOW TO LOAD SPIR
MULTIPLE SPIR BINARIES, OPENCL™ SOURCE CODES AND VENDOR-SPECIFIC BINARIES
OpenCL™ source -> clCreateProgramWithSource -> cl_program -> clCompileProgram -> cl_program
SPIR binary -> clCreateProgramWithBinary -> cl_program -> clCompileProgram -> cl_program
Vendor-specific binary -> clCreateProgramWithBinary -> cl_program
All resulting cl_programs -> clLinkProgram -> cl_program -> clCreateKernel -> cl_kernel

PORTABILITY CONSIDERATIONS USING SPIR
y Check whether a device supports SPIR
‒ Get all supported extensions by clGetDeviceInfo with CL_DEVICE_EXTENSIONS
‒ Check whether cl_khr_spir is included
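
The extension check can be sketched in plain C as follows. The string returned for CL_DEVICE_EXTENSIONS is a space-separated list, so the sketch matches whole tokens rather than substrings. The function name has_extension is hypothetical, and the extension string is assumed to have already been obtained from clGetDeviceInfo.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a complete
 * space-separated token in `ext_list` (the CL_DEVICE_EXTENSIONS string). */
bool has_extension(const char *ext_list, const char *name)
{
    size_t n = strlen(name);
    if (n == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the match only if it is bounded by spaces or string ends. */
        bool starts = (p == ext_list) || (p[-1] == ' ');
        bool ends = (p[n] == ' ') || (p[n] == '\0');
        if (starts && ends)
            return true;
        p++;
    }
    return false;
}
```

A plain strstr call would also accept a longer extension name that merely begins with cl_khr_spir, which is why the token boundaries are checked.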

y Supporting both 32 and 64 bit devices
‒ Two sets of SPIR binaries are needed, one for 32 bit devices, the other for 64 bit devices
‒ Check bitness of a device by clGetDeviceInfo with CL_DEVICE_ADDRESS_BITS
‒ Load 32 or 64 bit SPIR binaries accordingly
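
A minimal sketch of the selection step, assuming the value queried with CL_DEVICE_ADDRESS_BITS is already available as an integer. The function name and both file names are hypothetical placeholders for whatever the application ships.

```c
#include <stddef.h>

/* Hypothetical helper: maps the device address width to the SPIR binary
 * built for that width. Returns NULL for widths SPIR does not cover. */
const char *spir_binary_for_device(unsigned address_bits)
{
    switch (address_bits) {
    case 32: return "kernels_spir32.bc"; /* placeholder file name */
    case 64: return "kernels_spir64.bc"; /* placeholder file name */
    default: return NULL;
    }
}
```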

y Supporting optional extensions
‒ Get all supported extensions by clGetDeviceInfo with CL_DEVICE_EXTENSIONS
‒ Check if the required extension is supported
‒ If yes, load the SPIR binary
‒ If no, either fall back to a SPIR binary or OpenCL™ source that uses only core features, or fail gracefully
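
The fallback decision above can be written as a small pure function. This is only one possible policy, not something the SPIR spec prescribes. The enum, the function name, and the order of preference (SPIR before source) are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical outcome of the selection logic. */
typedef enum {
    LOAD_SPIR_WITH_EXTENSION, /* SPIR binary that uses the optional extension */
    LOAD_SPIR_CORE_ONLY,      /* SPIR binary that avoids the extension */
    LOAD_SOURCE_CORE_ONLY,    /* OpenCL source that avoids the extension */
    LOAD_FAIL                 /* nothing usable: report an error to the user */
} load_choice;

/* dev_has_spir / dev_has_ext: results of the CL_DEVICE_EXTENSIONS checks.
 * have_core_spir / have_core_source: which fallbacks the application ships. */
load_choice choose_program(bool dev_has_spir, bool dev_has_ext,
                           bool have_core_spir, bool have_core_source)
{
    if (dev_has_spir && dev_has_ext)
        return LOAD_SPIR_WITH_EXTENSION;
    if (dev_has_spir && have_core_spir)
        return LOAD_SPIR_CORE_ONLY;
    if (have_core_source)
        return LOAD_SOURCE_CORE_ONLY;
    return LOAD_FAIL;
}
```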

PORTABILITY CONSIDERATIONS USING SPIR
y SPIR binaries generated from non-portable OpenCL™ source are not portable
‒ Not following the restrictions specified in section 6.9 of the OpenCL™ 1.2 spec
‒ Casting a pointer from one address space to a different address space
‒ Casting an OpenCL™ opaque structure to a different type
‒ Performing arithmetic operations or comparisons on a sampler
‒ Applying sizeof to OpenCL™ opaque structures
‒ etc.

SPIR FOR COMPILER DEVELOPERS
y Introduction to SPIR spec
‒ Relation between SPIR 1.2 and LLVM 3.2
‒ Mapping of OpenCL™ to SPIR
  ‒ Data types
  ‒ Enumeration values
  ‒ Calling conventions
  ‒ Address spaces
  ‒ Name mangling
  ‒ Used extensions
  ‒ Kernel argument info

y How to implement a SPIR loader
‒ Overall structure
‒ Transforming data types
‒ Transforming metadata
‒ Demangling and mapping builtin function names
RELATION BETWEEN SPIR AND LLVM BITCODE
y A SPIR binary is a subset of LLVM bitcode
‒ A valid SPIR 1.2 binary is valid LLVM 3.2 bitcode
‒ SPIR is defined by mapping OpenCL™ C entities to LLVM and by imposing restrictions on the LLVM 3.2 bitcode format
  ‒ Specific target triple and data layout for 32-bit and 64-bit devices
  ‒ Specific ABI
  ‒ Specific calling conventions
  ‒ Restrictions on allowed instructions, intrinsic functions, linkage types, parameter attributes, visibility styles, function attributes, etc.
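
Because a SPIR binary is LLVM bitcode, a loader can cheaply reject obviously wrong input by checking the bitcode magic number ('B', 'C', 0xC0, 0xDE) before parsing. The sketch below shows only that necessary-but-not-sufficient check. It does not validate any of the SPIR restrictions listed above, it does not handle the optional bitcode wrapper header, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the buffer begins with the LLVM bitcode
 * magic number. Passing this check does not make the buffer valid SPIR. */
bool starts_with_bitcode_magic(const unsigned char *buf, size_t len)
{
    return len >= 4 &&
           buf[0] == 'B' && buf[1] == 'C' &&
           buf[2] == 0xC0 && buf[3] == 0xDE;
}
```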

y The ideas behind SPIR
‒ To be expressive enough to represent OpenCL™ C programs
‒ To carry enough information for the OpenCL™ runtime to execute and query the kernels
‒ To avoid introducing unnecessary entities
  ‒ This may limit SPIR’s expressiveness for other languages, but it makes a SPIR loader easier to develop
‒ To balance the burden between the SPIR producer and the loader

MAPPING OF OPENCL™ TO SPIR
DATA TYPES

y OpenCL™ builtin scalar types are mapped to LLVM primitive types
‒ bool -> i1
‒ char -> i8
‒ unsigned char, uchar -> i8
‒ short -> i16
‒ unsigned short, ushort -> i16
‒ int -> i32
‒ unsigned int, uint -> i32
‒ long -> i64
‒ unsigned long, ulong -> i64
‒ float -> float
‒ double -> double
‒ half -> half
‒ void -> void

y OpenCL™ builtin vector types are mapped to LLVM vector types
‒ charn -> <n x i8>
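
The scalar and vector mappings above amount to a lookup table plus a format string. A minimal sketch follows, using the short OpenCL type names (uchar, ushort, uint, ulong) as keys. Both function names are hypothetical, and the vector width is not validated against the widths OpenCL allows.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the LLVM type for an OpenCL scalar type
 * name, or NULL if the name is not in the table from the slide. */
const char *spir_scalar_type(const char *ocl_type)
{
    static const struct { const char *ocl; const char *llvm; } map[] = {
        {"bool", "i1"},     {"char", "i8"},       {"uchar", "i8"},
        {"short", "i16"},   {"ushort", "i16"},    {"int", "i32"},
        {"uint", "i32"},    {"long", "i64"},      {"ulong", "i64"},
        {"float", "float"}, {"double", "double"}, {"half", "half"},
        {"void", "void"},
    };
    for (size_t i = 0; i < sizeof map / sizeof map[0]; ++i)
        if (strcmp(map[i].ocl, ocl_type) == 0)
            return map[i].llvm;
    return NULL;
}

/* Hypothetical helper: formats the LLVM vector type for an OpenCL vector
 * such as char4. Returns 0 on success, -1 on unknown type or truncation. */
int spir_vector_type(const char *elem, unsigned n, char *out, size_t out_size)
{
    const char *llvm = spir_scalar_type(elem);
    if (!llvm)
        return -1;
    int w = snprintf(out, out_size, "<%u x %s>", n, llvm);
    return (w >= 0 && (size_t)w < out_size) ? 0 : -1;
}
```

Signed and unsigned names deliberately map to the same LLVM type, which is why signedness has to be recovered from other sources such as mangled names.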

MAPPING OF OPENCL™ TO SPIR
DATA TYPES

y Image and event types are mapped to LLVM opaque structure
‒ image1d_t -> %opencl.image1d_t
‒ image1d_array_t -> %opencl.image1d_array_t
‒ image1d_buffer_t -> %opencl.image1d_buffer_t
‒ image2d_t -> %opencl.image2d_t
‒ image2d_array_t -> %opencl.image2d_array_t
‒ image3d_t -> %opencl.image3d_t
‒ image2d_msaa_t -> %opencl.image2d_msaa_t
‒ image2d_array_msaa_t -> %opencl.image2d_array_msaa_t
‒ image2d_msaa_depth_t -> %opencl.image2d_msaa_depth_t
‒ image2d_array_msaa_depth_t -> %opencl.image2d_array_msaa_depth_t
‒ image2d_depth_t -> %opencl.image2d_depth_t
‒ image2d_array_depth_t -> %opencl.image2d_array_depth_t
‒ event_t -> %opencl.event_t

MAPPING OF OPENCL™ TO SPIR
DATA TYPES

y Sampler type is mapped to LLVM i32 type
‒ Although a sampler is represented by an integer in SPIR, arithmetic operations on it and comparisons with other values are not allowed.

y size_t, ptrdiff_t, intptr_t and uintptr_t are mapped to LLVM i32 or i64, depending on the bitness of the SPIR binary
y Signed/unsignedness of integer types
‒ LLVM does not have unsigned integer types
‒ OpenCL™ unsigned and signed integer types of the same bit width are mapped to the same type in SPIR
‒ If the signedness of an integer type is needed, it can usually be obtained from
  ‒ Mangled function names
  ‒ Sign extension of function arguments and return type
  ‒ Kernel argument metadata

MAPPING OF OPENCL™ TO SPIR
CALLING CONVENTIONS

y SPIR uses the calling convention to indicate whether a function is a kernel function
‒ Kernel functions use the spir_kernel calling convention
‒ Non-kernel functions use the spir_func calling convention
‒ No other calling conventions are allowed in SPIR

MAPPING OF OPENCL™ TO SPIR
ADDRESS SPACES

y OpenCL™ C address spaces are mapped to LLVM address spaces
‒ Private -> 0
‒ Global -> 1
‒ Constant -> 2
‒ Local -> 3

y Casting a pointer to a different address space is not allowed
y OpenCL™ C function-level local variables are mapped to LLVM module scope global variables
‒ The variable name is mapped as <function name>.<variable name>
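
A small sketch of both rules from this slide: the address space numbers as an enum, and the naming rule for function-level local variables as a format string. The enum and function names are hypothetical, as are the kernel and variable names in the usage note below.

```c
#include <stdio.h>

/* Address space numbers from the slide; the enumerator names are hypothetical. */
enum spir_addr_space {
    SPIR_AS_PRIVATE = 0,
    SPIR_AS_GLOBAL = 1,
    SPIR_AS_CONSTANT = 2,
    SPIR_AS_LOCAL = 3
};

/* Hypothetical helper: builds the module-scope name
 * "<function name>.<variable name>" for a function-level local variable.
 * Returns 0 on success, -1 if the output buffer is too small. */
int spir_local_var_name(const char *func, const char *var,
                        char *out, size_t out_size)
{
    int w = snprintf(out, out_size, "%s.%s", func, var);
    return (w >= 0 && (size_t)w < out_size) ? 0 : -1;
}
```

For a kernel named reduce with a local array named scratch, this yields the global variable name reduce.scratch, which would live in address space 3.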

MAPPING OF OPENCL™ TO SPIR
ENUMERATION VALUES

y SPIR defines enumeration values used by OpenCL™ C programs
‒ Image channel order -> same as cl.h
‒ Image data type -> same as cl.h
‒ Sampler enumeration values (based on cl.h but not exactly the same)
  ‒ Addressing mode
    ‒ CLK_ADDRESS_NONE=0x0000
    ‒ CLK_ADDRESS_CLAMP_TO_EDGE=0x0002
    ‒ CLK_ADDRESS_CLAMP=0x0004
    ‒ CLK_ADDRESS_REPEAT=0x0006
    ‒ CLK_ADDRESS_MIRRORED_REPEAT=0x0008
  ‒ Normalized coords
    ‒ CLK_NORMALIZED_COORDS_FALSE=0x0000
    ‒ CLK_NORMALIZED_COORDS_TRUE=0x0001
  ‒ Filter mode
    ‒ CLK_FILTER_NEAREST=0x0010
    ‒ CLK_FILTER_LINEAR=0x0020
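
The sampler values listed above occupy disjoint bit ranges, so a loader could split an i32 sampler into its three fields with masks. The sketch below uses the enumeration values from the slide. The three mask constants are an assumption inferred from those values rather than taken from the spec, and the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Enumeration values as listed on the slide. */
enum {
    CLK_ADDRESS_NONE = 0x0000,
    CLK_ADDRESS_CLAMP_TO_EDGE = 0x0002,
    CLK_ADDRESS_CLAMP = 0x0004,
    CLK_ADDRESS_REPEAT = 0x0006,
    CLK_ADDRESS_MIRRORED_REPEAT = 0x0008,
    CLK_NORMALIZED_COORDS_FALSE = 0x0000,
    CLK_NORMALIZED_COORDS_TRUE = 0x0001,
    CLK_FILTER_NEAREST = 0x0010,
    CLK_FILTER_LINEAR = 0x0020
};

/* Assumed masks, inferred from the value ranges above. */
enum {
    SAMPLER_NORMALIZED_MASK = 0x0001,
    SAMPLER_ADDRESS_MASK = 0x000E,
    SAMPLER_FILTER_MASK = 0x0030
};

typedef struct {
    uint32_t addressing; /* one of the CLK_ADDRESS_* values */
    bool normalized;     /* true for CLK_NORMALIZED_COORDS_TRUE */
    uint32_t filter;     /* one of the CLK_FILTER_* values */
} sampler_fields;

/* Hypothetical helper: splits a SPIR sampler value into its fields. */
sampler_fields decode_sampler(uint32_t s)
{
    sampler_fields f;
    f.addressing = s & SAMPLER_ADDRESS_MASK;
    f.normalized = (s & SAMPLER_NORMALIZED_MASK) != 0;
    f.filter = s & SAMPLER_FILTER_MASK;
    return f;
}
```

Decoding fields this way is bit extraction rather than arithmetic on a sampler inside a kernel, so it applies to a loader or runtime, not to OpenCL C source.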

MAPPING OF OPENCL™ TO SPIR
NAME MANGLING

y OpenCL™ C builtin functions are mangled
y OpenCL™ C kernel functions and non-kernel user functions are not mangled
y Other languages may choose to mangle non-kernel user functions
y SPIR adopts the name mangling scheme of Itanium C++ ABI section 5.1, with extended rules for OpenCL™ C data types, address spaces and access qualifiers
‒ Unsigned and signed integer types of the same bit width are mangled to different names
‒ Pointers to non-private address space N -> PU3ASN<mangled element type>
‒ Vector type of N elements -> DvN_<mangled element type>
‒ OpenCL™ C opaque types (image, sampler, event) -> <string length>ocl_<type name>, e.g.
  ‒ sampler_t -> 11ocl_sampler
‒ Access qualifiers: read only -> U1R, write only -> U1W, read write -> U1B
‒ size_t and uintptr_t are treated as uint or ulong
‒ ptrdiff_t and intptr_t are treated as int or long
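
The mangling rules above can be illustrated with three small string builders. This is a deliberately simplified sketch. It takes already-mangled element and parameter encodings as input (for example j for uint and f for float, per the Itanium scheme), and it does not implement Itanium substitutions such as S_, which real mangled names use when a type repeats. All function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* "_Z" + <length of name> + name + concatenated parameter encodings. */
int mangle_builtin(const char *name, const char *params,
                   char *out, size_t size)
{
    int w = snprintf(out, size, "_Z%zu%s%s", strlen(name), name, params);
    return (w >= 0 && (size_t)w < size) ? 0 : -1;
}

/* Vector of n elements: DvN_<mangled element type>. */
int mangle_vector(unsigned n, const char *elem, char *out, size_t size)
{
    int w = snprintf(out, size, "Dv%u_%s", n, elem);
    return (w >= 0 && (size_t)w < size) ? 0 : -1;
}

/* Pointer into address space 1..3: PU3ASN<mangled element type>.
 * Private (address space 0) pointers get the plain Itanium form P<type>. */
int mangle_pointer(unsigned addr_space, const char *elem,
                   char *out, size_t size)
{
    int w;
    if (addr_space > 3)
        return -1;
    if (addr_space == 0)
        w = snprintf(out, size, "P%s", elem);
    else
        w = snprintf(out, size, "PU3AS%u%s", addr_space, elem);
    return (w >= 0 && (size_t)w < size) ? 0 : -1;
}
```

With these helpers, get_global_id taking a uint mangles to _Z13get_global_idj, and a pointer to float in the global address space encodes as PU3AS1f.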

MAPPING OF OPENCL™ TO SPIR
USE OF OPTIONAL CORE FEATURES AND EXTENSIONS

y SPIR contains information about used optional features and extensions
‒ Runtime can reject SPIR binaries using unsupported optional features
‒ Application can select SPIR binaries based on used optional features and extensions

y Metadata for used core features
‒ opencl.used.optional.core.features
‒ Two core features are allowed:
  ‒ cl_image: indicates images are used
  ‒ cl_double: indicates doubles are used

y Metadata for used extensions
‒ opencl.used.extensions
  ‒ cl_khr_int64_base_atomics
  ‒ cl_khr_int64_extended_atomics
  ‒ cl_khr_fp16
  ‒ cl_khr_gl_sharing
  ‒ cl_khr_gl_event
  ‒ etc.

MAPPING OF OPENCL™ TO SPIR
KERNEL ATTRIBUTES

y SPIR contains information about optional kernel attributes
‒ reqd_work_group_size
‒ work_group_size_hint
‒ vec_type_hint

y For each kernel, there is a metadata node listing its optional kernel attributes
‒ !opencl.kernels = !{!0, !1, ..., !N}
  ‒ !0 = metadata !{ <function signature>, !01, !02, ..., !0i }
  ‒ !1 = metadata !{ <function signature>, !11, !12, ..., !1j }
  ‒ ...
  ‒ !N = metadata !{ <function signature>, !N1, !N2, ..., !Nk }

MAPPING OF OPENCL™ TO SPIR
KERNEL ARGUMENT INFO

y SPIR contains the kernel argument information that the OpenCL™ runtime requires for executing kernels
y For each kernel argument, there is metadata
‒ kernel_arg_addr_space
‒ kernel_arg_access_qual
‒ kernel_arg_type
‒ kernel_arg_base_type
‒ kernel_arg_type_qual
‒ kernel_arg_name : optional, only exists if -cl-kernel-arg-info is used when producing SPIR

SPIR ABI
y SPIR uses the default ABI of Clang 3.2
‒ Any aggregate type is passed as a pointer. Memory allocation (if needed) is the responsibility of the caller function.
‒ Enumeration types are handled as the underlying integer type.
‒ If the argument type is a promotable integer type, it will be extended according to the C99 integer promotion rules.
‒ Any other type, including floating point types, vectors, etc., is passed directly as the corresponding LLVM type.

HOW TO IMPLEMENT SPIR LOADER
OVERALL STRUCTURE – IDEAL CASE
User’s OpenCL™ source -> compile -> SPIR binary
User’s SPIR binary (used as is)
Builtin library source -> compile -> SPIR binary
All SPIR binaries -> optimize, link -> linked SPIR binary -> optimize, codegen -> executable kernels

y Backend consumes SPIR directly without transforming to vendor’s LLVM format
HOW TO IMPLEMENT SPIR LOADER
OVERALL STRUCTURE – ACTUAL CASE
User’s OpenCL™ source -> compile -> vendor’s LLVM binary
User’s SPIR binary -> SPIR loader -> vendor’s LLVM binary
Builtin library source -> compile -> vendor’s LLVM binary
All vendor’s LLVM binaries -> optimize, link -> vendor’s linked binary -> optimize, codegen -> executable kernels

y Backend transforms SPIR to vendor’s LLVM format
WHY IS SPIR LOADER NEEDED
y Vendor uses different LLVM entities or a different format to convey the information required by the OpenCL™ runtime for querying and executing kernels
y Vendor’s frontend does special transformations which are not done by SPIR producer
y Vendor’s backend is shared by different frontends, some of which do not generate SPIR
y Vendor’s builtin library uses different name mangling scheme

HOW TO IMPLEMENT SPIR LOADER
y Verify that the SPIR target triple and data layout are compatible with the target device
y Set target triple for target device
y Demangle builtin functions and re-mangle them using vendor’s name mangling scheme
y Transform data types
y Transform metadata
y Transform calling conventions
y Perform special transformations done by frontend
‒ If possible, consider moving the transformations from frontend to backend

SPIR CONFORMANCE TEST
y SPIR is a Khronos extension
y To claim SPIR support, a vendor’s OpenCL™ implementation needs to pass the SPIR conformance test
y The SPIR 1.2 conformance test is going to be part of the OpenCL™ 1.2 conformance test

REFERENCES
y Khronos OpenCL™ Working Group, SPIR subgroup. SPIR provisional spec, version 1.2. http://www.khronos.org/files/opencl-spir-12-provisional.pdf
y LLVM Team. LLVM Bitcode File Format, version 3.2. http://www.llvm.org/releases/3.2/docs/BitCodeFormat.html, 2012.
y CodeSourcery, Compaq, EDG, HP, IBM, Intel, Red Hat, SGI, and others. Itanium C++ ABI. http://mentorembedded.github.com/cxx-abi/abi.html
y Khronos OpenCL™ Working Group. The OpenCL™ Specification, version 1.2. http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf, November 2012.
y LLVM Team. LLVM Language Reference Manual, version 3.2. http://www.llvm.org/releases/3.2/docs/LangRef.html, 2012.

DISCLAIMER & ATTRIBUTION
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap
changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers,
software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information.
However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to
notify any person of such revisions or changes.
AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY
INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD
BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION
CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices,
Inc. in the United States and/or other jurisdictions. OpenCL™ is a trademark of Apple Inc. Other names are for informational purposes only and may be
trademarks of their respective owners.
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

PL-4051, An Introduction to SPIR for OpenCL Application Developers and Compiler Developers, by Yaxun Liu

  • 1. Introduction to SPIR for Application and Compiler Developers Yaxun Sam Liu
  • 2. OUTLINE
      • What is SPIR and why it is useful
          ‒ Why do we need SPIR since we already have LLVM IR
      • SPIR for Application Developers
          ‒ How to generate SPIR
          ‒ How to load SPIR
          ‒ Portability considerations using SPIR
      • SPIR for Compiler Developers
          ‒ Introduction to SPIR spec
          ‒ How to implement a SPIR loader
      • References
    2 | INTRODUCTION TO SPIR FOR APPLICATION DEVELOPERS AND COMPILER DEVELOPERS | November 5, 2013
  • 3. WHAT IS SPIR
      • A binary format
          ‒ SPIR means Standard Portable Intermediate Representation
          ‒ A portable binary format for OpenCL™ programs
          ‒ Defined by the SPIR spec
          ‒ Based on LLVM IR
          ‒ Supports most of the OpenCL™ core features
          ‒ Current version is 1.2, corresponding to OpenCL™ 1.2
          ‒ Developed by the SPIR subgroup of the Khronos Group OpenCL™ working group
          ‒ A SPIR binary is bitness aware, which means:
              ‒ The pointer size in a SPIR binary is either 32 bit or 64 bit, depending on the target devices
              ‒ Two sets of SPIR binaries are needed to ship a product in SPIR to both 32 and 64 bit devices
      • An extension for OpenCL™
          ‒ Defined by the SPIR host API
          ‒ Denoted by cl_khr_spir
          ‒ OpenCL™ devices supporting cl_khr_spir are able to load a SPIR binary and run it
  • 4. WHY IS SPIR USEFUL
      • For Game/Application Developers
          ‒ Can ship an OpenCL™ program in binary form instead of source code
          ‒ Can ship just a few binaries for one OpenCL™ program instead of tons of binaries for different platforms/devices
      • For Compiler Developers
          ‒ Can compile other programming languages to SPIR, which can be run on OpenCL™ devices
  • 5. HOW TO GENERATE SPIR
      • SPIR generation is optional for devices supporting cl_khr_spir
          ‒ A device supporting cl_khr_spir is only required to be able to consume SPIR
          ‒ Whether to support SPIR generation is the vendor's choice
      • Generating SPIR in a host program
          ‒ The SPIR spec and host API do not define how to generate SPIR
          ‒ If SPIR generation is supported, it is likely to be done as follows:
              ‒ Load OpenCL™ source code with clCreateProgramWithSource
              ‒ Compile the OpenCL™ source code with clCompileProgram and a vendor-specific option for generating SPIR
              ‒ Get the SPIR binary with clGetProgramInfo and CL_PROGRAM_BINARIES
              ‒ Save the SPIR binary to a file
      • Generating SPIR with an offline compiler
          ‒ Clang 3.3/3.4 can compile OpenCL™ source code to SPIR-like LLVM bitcode
          ‒ A patch for Clang 3.2, available to Khronos members, can compile OpenCL™ source code to SPIR 1.2
          ‒ Clang options for generating SPIR: -cl-std=CL1.2 -emit-llvm -triple spir[32|64]-unknown-unknown
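The offline-compiler options above can be assembled per target bitness. This is a minimal sketch, not a definitive build script: the helper name `spir_clang_options` is hypothetical, and it assumes the 32 bit triple is spelled `spir-unknown-unknown` and the 64 bit triple `spir64-unknown-unknown`. How the options reach Clang (driver or frontend invocation) depends on the Clang build and is left out.

```python
def spir_clang_options(address_bits):
    """Return the Clang options listed on the slide for one SPIR bitness.

    Assumption: the triples are spelled "spir-unknown-unknown" (32 bit)
    and "spir64-unknown-unknown" (64 bit).
    """
    if address_bits == 32:
        triple = "spir-unknown-unknown"
    elif address_bits == 64:
        triple = "spir64-unknown-unknown"
    else:
        raise ValueError("SPIR targets are either 32 or 64 bit")
    return ["-cl-std=CL1.2", "-emit-llvm", "-triple", triple]
```

Since a SPIR binary is bitness aware, a product shipping to both kinds of devices would call the helper once with 32 and once with 64.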
  • 6. HOW TO LOAD SPIR: LOAD A SINGLE SPIR BINARY
      • Flow diagram: SPIR binary -> clCreateProgramWithBinary -> cl_program -> clBuildProgram -> cl_program -> clCreateKernel -> cl_kernel
  • 7. HOW TO LOAD SPIR: MULTIPLE SPIR BINARIES, OPENCL™ SOURCE CODES AND VENDOR-SPECIFIC BINARIES
      • Flow diagram:
          ‒ OpenCL™ source -> clCreateProgramWithSource -> cl_program -> clCompileProgram -> cl_program
          ‒ SPIR binary -> clCreateProgramWithBinary -> cl_program -> clCompileProgram -> cl_program
          ‒ Vendor-specific binary -> clCreateProgramWithBinary -> cl_program
          ‒ All resulting cl_program objects -> clLinkProgram -> cl_program -> clCreateKernel -> cl_kernel
  • 8. PORTABILITY CONSIDERATIONS USING SPIR
      • Check whether a device supports SPIR
          ‒ Get all supported extensions with clGetDeviceInfo and CL_DEVICE_EXTENSIONS
          ‒ Check whether cl_khr_spir is included
      • Supporting both 32 and 64 bit devices
          ‒ Two sets of SPIR binaries are needed, one for 32 bit devices and one for 64 bit devices
          ‒ Check the bitness of a device with clGetDeviceInfo and CL_DEVICE_ADDRESS_BITS
          ‒ Load the 32 or 64 bit SPIR binaries accordingly
      • Supporting optional extensions
          ‒ Get all supported extensions with clGetDeviceInfo and CL_DEVICE_EXTENSIONS
          ‒ Check whether the required extension is supported
          ‒ If yes, load the SPIR binary
          ‒ If no, either fall back to a SPIR binary or OpenCL™ source that uses only core features, or fail gracefully
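The selection logic above can be sketched as a pure function over the values that clGetDeviceInfo returns. This is an illustration under stated assumptions rather than a definitive implementation: the function names, the file-naming scheme, and the "full"/"core" variants are hypothetical, and the actual clGetDeviceInfo calls are left to the host program.

```python
def has_extension(extensions, name):
    """CL_DEVICE_EXTENSIONS is a space-separated list of extension names."""
    return name in extensions.split()


def choose_spir_binary(extensions, address_bits, required=()):
    """Pick a SPIR binary file name for one device, or None.

    extensions   -- string queried with CL_DEVICE_EXTENSIONS
    address_bits -- integer queried with CL_DEVICE_ADDRESS_BITS
    required     -- optional extensions used by the "full" binary
    The file names are hypothetical placeholders.
    """
    if not has_extension(extensions, "cl_khr_spir"):
        return None  # no SPIR support: fall back to source or fail gracefully
    if address_bits not in (32, 64):
        raise ValueError("unexpected device bitness")
    if all(has_extension(extensions, ext) for ext in required):
        variant = "full"
    else:
        variant = "core"  # fallback binary that uses only core features
    return f"kernels_{variant}_spir{address_bits}.bc"
```

Splitting the extension string into whole names, rather than searching for a substring, avoids matching an extension whose name merely starts with `cl_khr_spir`.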
  • 9. PORTABILITY CONSIDERATIONS USING SPIR
      • SPIR binaries generated from non-portable OpenCL™ source are not portable
          ‒ Source is non-portable when it does not follow the restrictions specified in section 6.9 of the OpenCL™ 1.2 spec, for example:
              ‒ Casting a pointer from one address space to a different address space
              ‒ Casting an OpenCL™ opaque structure to a different type
              ‒ Performing arithmetic operations or comparisons on a sampler
              ‒ Performing sizeof on OpenCL™ opaque structures
              ‒ etc.
  • 10. SPIR FOR COMPILER DEVELOPERS
      • Introduction to SPIR spec
          ‒ Relation between SPIR 1.2 and LLVM 3.2
          ‒ Mapping of OpenCL™ to SPIR
              ‒ Data types
              ‒ Enumeration values
              ‒ Calling conventions
              ‒ Address spaces
              ‒ Name mangling
              ‒ Used extensions
              ‒ Kernel argument info
      • How to implement a SPIR loader
          ‒ Overall structure
          ‒ Transforming data types
          ‒ Transforming metadata
          ‒ Demangling and mapping builtin function names
  • 11. RELATION BETWEEN SPIR AND LLVM BITCODE
      • A SPIR binary is a subset of LLVM bitcode
          ‒ A valid SPIR 1.2 binary is valid LLVM 3.2 bitcode
          ‒ SPIR is defined by mapping OpenCL™ C entities to LLVM and by imposing restrictions on the LLVM 3.2 bitcode format:
              ‒ Specific target triple and data layout for 32 and 64 bit devices
              ‒ Specific ABI
              ‒ Specific calling conventions
              ‒ Restrictions on allowed instructions, intrinsic functions, linkage types, parameter attributes, visibility styles, function attributes, etc.
      • The ideas behind SPIR
          ‒ To be expressive enough to represent OpenCL™ C programs
          ‒ To carry enough information for the OpenCL™ runtime to execute and query the kernels
          ‒ To not introduce unnecessary entities
              ‒ This may limit SPIR's expressiveness for other languages, but facilitates development of a SPIR loader
          ‒ To balance the burden between SPIR producer and loader
  • 12. MAPPING OF OPENCL™ TO SPIR: DATA TYPES
      • OpenCL™ builtin scalar types are mapped to LLVM primitive types
          ‒ bool -> i1
          ‒ char -> i8
          ‒ unsigned char, uchar -> i8
          ‒ short -> i16
          ‒ unsigned short, ushort -> i16
          ‒ int -> i32
          ‒ unsigned int, uint -> i32
          ‒ long -> i64
          ‒ unsigned long, ulong -> i64
          ‒ float -> float
          ‒ double -> double
          ‒ half -> half
          ‒ void -> void
      • OpenCL™ builtin vector types are mapped to LLVM vector types
          ‒ charn -> < n x i8 >
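The scalar and vector mapping above amounts to a lookup table plus one rule for vector widths. The sketch below is illustrative only: the names `SCALAR_TO_LLVM` and `to_llvm_type` are hypothetical, and it assumes the OpenCL™ 1.2 vector widths 2, 3, 4, 8 and 16.

```python
import re

# Scalar mapping as listed on the slide; signed and unsigned types of the
# same width share one LLVM type.
SCALAR_TO_LLVM = {
    "bool": "i1",
    "char": "i8", "unsigned char": "i8", "uchar": "i8",
    "short": "i16", "unsigned short": "i16", "ushort": "i16",
    "int": "i32", "unsigned int": "i32", "uint": "i32",
    "long": "i64", "unsigned long": "i64", "ulong": "i64",
    "float": "float", "double": "double", "half": "half",
    "void": "void",
}

# Element types that have vector forms such as char4 or float16.
VECTOR_ELEMENTS = {"char", "uchar", "short", "ushort", "int", "uint",
                   "long", "ulong", "float", "double", "half"}


def to_llvm_type(ocl_type):
    """Map an OpenCL scalar or vector type name to its LLVM type spelling."""
    if ocl_type in SCALAR_TO_LLVM:
        return SCALAR_TO_LLVM[ocl_type]
    match = re.fullmatch(r"([a-z]+)(2|3|4|8|16)", ocl_type)
    if match and match.group(1) in VECTOR_ELEMENTS:
        element = SCALAR_TO_LLVM[match.group(1)]
        return f"<{match.group(2)} x {element}>"
    raise ValueError(f"not a builtin scalar or vector type: {ocl_type}")
```

For instance, `to_llvm_type("uchar4")` and `to_llvm_type("char4")` both give `<4 x i8>`, which shows why signedness has to be recovered from elsewhere.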
  • 13. MAPPING OF OPENCL™ TO SPIR: DATA TYPES
      • Image and event types are mapped to LLVM opaque structures
          ‒ image1d_t -> %opencl.image1d_t
          ‒ image1d_array_t -> %opencl.image1d_array_t
          ‒ image1d_buffer_t -> %opencl.image1d_buffer_t
          ‒ image2d_t -> %opencl.image2d_t
          ‒ image2d_array_t -> %opencl.image2d_array_t
          ‒ image3d_t -> %opencl.image3d_t
          ‒ image2d_msaa_t -> %opencl.image2d_msaa_t
          ‒ image2d_array_msaa_t -> %opencl.image2d_array_msaa_t
          ‒ image2d_msaa_depth_t -> %opencl.image2d_msaa_depth_t
          ‒ image2d_array_msaa_depth_t -> %opencl.image2d_array_msaa_depth_t
          ‒ image2d_depth_t -> %opencl.image2d_depth_t
          ‒ image2d_array_depth_t -> %opencl.image2d_array_depth_t
          ‒ event_t -> %opencl.event_t
  • 14. MAPPING OF OPENCL™ TO SPIR: DATA TYPES
      • The sampler type is mapped to the LLVM i32 type
          ‒ Although a sampler is represented by an integer in SPIR, arithmetic operations on it and comparisons with other values are not allowed
      • size_t, ptrdiff_t, intptr_t and uintptr_t are mapped to LLVM i32 or i64, depending on the bitness of the SPIR binary
      • Signedness of integer types
          ‒ LLVM does not have unsigned integer types
          ‒ OpenCL™ unsigned and signed integer types of the same bit width are mapped to the same type in SPIR
          ‒ If the signedness of an integer type is needed, the information can usually be obtained through:
              ‒ Mangled function names
              ‒ Sign extension of function arguments and return type
              ‒ Kernel argument metadata
  • 15. MAPPING OF OPENCL™ TO SPIR: CALLING CONVENTIONS
      • SPIR uses the calling convention to indicate whether a function is a kernel function
          ‒ Kernel functions use the spir_kernel calling convention
          ‒ Non-kernel functions use the spir_func calling convention
          ‒ No other calling conventions are allowed in SPIR
  • 16. MAPPING OF OPENCL™ TO SPIR: ADDRESS SPACES
      • OpenCL™ C address spaces are mapped to LLVM address spaces
          ‒ Private -> 0
          ‒ Global -> 1
          ‒ Constant -> 2
          ‒ Local -> 3
      • Casting a pointer to a different address space is not allowed
      • OpenCL™ C function-level local variables are mapped to LLVM module scope global variables
          ‒ The variable name is mapped as <function name>.<variable name>
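As a small illustration of the address space numbering and the naming rule for function-level local variables, the sketch below spells out the resulting LLVM pointer types and global names as strings. The helper names are hypothetical, and the pointer spelling assumes the typed-pointer syntax of LLVM 3.2 (for example `float addrspace(1)*`).

```python
# Address space numbers as listed on the slide.
ADDRESS_SPACE = {"private": 0, "global": 1, "constant": 2, "local": 3}


def llvm_pointer_type(element_type, address_space):
    """Spell an LLVM 3.2 style pointer type for an OpenCL C address space."""
    number = ADDRESS_SPACE[address_space]
    if number == 0:
        return f"{element_type}*"  # address space 0 is implicit in LLVM IR
    return f"{element_type} addrspace({number})*"


def local_variable_global_name(function_name, variable_name):
    """Name of the module scope global that holds a function-level local."""
    return f"@{function_name}.{variable_name}"
```

A `local float` buffer named `tile` declared inside a kernel `blur` (both names made up for this example) would thus live in address space 3 under the name `@blur.tile`.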
  • 17. MAPPING OF OPENCL™ TO SPIR: ENUMERATION VALUES
      • SPIR defines the enumeration values used by OpenCL™ C programs
          ‒ Image channel order -> same as cl.h
          ‒ Image data type -> same as cl.h
          ‒ Sampler enumeration values (based on cl.h but not exactly the same)
              ‒ Addressing mode
                  ‒ CLK_ADDRESS_NONE = 0x0000
                  ‒ CLK_ADDRESS_CLAMP_TO_EDGE = 0x0002
                  ‒ CLK_ADDRESS_CLAMP = 0x0004
                  ‒ CLK_ADDRESS_REPEAT = 0x0006
                  ‒ CLK_ADDRESS_MIRRORED_REPEAT = 0x0008
              ‒ Normalized coords
                  ‒ CLK_NORMALIZED_COORDS_FALSE = 0x0000
                  ‒ CLK_NORMALIZED_COORDS_TRUE = 0x0001
              ‒ Filter mode
                  ‒ CLK_FILTER_NEAREST = 0x0010
                  ‒ CLK_FILTER_LINEAR = 0x0020
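The sampler values above occupy disjoint bit ranges, so a sampler initializer can be composed with a bitwise OR and taken apart with masks. The sketch below assumes exactly that layout; the masks and the helper names `encode_sampler` and `decode_sampler` are derived from the listed values for illustration and are not quoted from the spec.

```python
# Enumeration values as listed on the slide.
ADDRESSING = {
    "CLK_ADDRESS_NONE": 0x0000,
    "CLK_ADDRESS_CLAMP_TO_EDGE": 0x0002,
    "CLK_ADDRESS_CLAMP": 0x0004,
    "CLK_ADDRESS_REPEAT": 0x0006,
    "CLK_ADDRESS_MIRRORED_REPEAT": 0x0008,
}
NORMALIZED = {
    "CLK_NORMALIZED_COORDS_FALSE": 0x0000,
    "CLK_NORMALIZED_COORDS_TRUE": 0x0001,
}
FILTER = {
    "CLK_FILTER_NEAREST": 0x0010,
    "CLK_FILTER_LINEAR": 0x0020,
}

# Masks inferred from the values above (an assumption of this sketch).
ADDRESSING_MASK, NORMALIZED_MASK, FILTER_MASK = 0x000E, 0x0001, 0x0030


def encode_sampler(addressing, normalized, filter_mode):
    """Combine three enumerators into one i32 sampler value."""
    return ADDRESSING[addressing] | NORMALIZED[normalized] | FILTER[filter_mode]


def decode_sampler(value):
    """Split a sampler value back into its three enumerator names."""
    def lookup(table, mask):
        return next(name for name, bits in table.items() if bits == value & mask)
    return (lookup(ADDRESSING, ADDRESSING_MASK),
            lookup(NORMALIZED, NORMALIZED_MASK),
            lookup(FILTER, FILTER_MASK))
```

With these tables, `CLK_ADDRESS_CLAMP | CLK_NORMALIZED_COORDS_TRUE | CLK_FILTER_LINEAR` encodes to `0x25`.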
  • 18. MAPPING OF OPENCL™ TO SPIR: NAME MANGLING
      • OpenCL™ C builtin functions are mangled
      • OpenCL™ C kernel functions and non-kernel user functions are not mangled
      • Other languages may choose to mangle non-kernel user functions
      • SPIR adopts the name mangling scheme of the Itanium C++ ABI, section 5.1, with extended rules for OpenCL™ C data types, address spaces and access qualifiers
          ‒ Unsigned and signed integer types of the same bit width are mangled to different names
          ‒ Pointers to non-private address space N -> PU3ASN<mangled element type>
          ‒ Vector type of N elements -> DvN_<mangled element type>
          ‒ OpenCL™ C opaque types (image, sampler, event) -> <string length>ocl_<type name>, e.g. sampler_t -> 11ocl_sampler
          ‒ Access qualifiers: read only -> U1R, write only -> U1W, read write -> U1B
          ‒ size_t and uintptr_t are treated as uint or ulong
          ‒ ptrdiff_t and intptr_t are treated as int or long
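The mangling rules above can be tried out with a deliberately small mangler. This is a sketch under stated assumptions, not a complete implementation of the scheme: the function `mangle_builtin` and the tuple form `("ptr", N, pointee)` for pointers are hypothetical, the scalar codes are the standard Itanium builtin type codes, and qualifiers (const, access qualifiers) and Itanium substitutions (`S_` and friends) are not handled. Signatures that would need a substitution are rejected instead of being mangled incorrectly.

```python
import re

# Itanium builtin type codes; signed and unsigned types get different codes.
SCALAR_CODES = {"bool": "b", "char": "c", "uchar": "h", "short": "s",
                "ushort": "t", "int": "i", "uint": "j", "long": "l",
                "ulong": "m", "float": "f", "double": "d", "half": "Dh"}
OPAQUE = re.compile(r"(image\w+|sampler|event)_t")
VECTOR = re.compile(r"([a-z]+)(2|3|4|8|16)")


def _encode(param, seen):
    """Encode one parameter type; `seen` collects compound encodings."""
    if isinstance(param, tuple):            # ("ptr", address space N, pointee)
        _, space, pointee = param
        qualifier = "" if space == 0 else f"U3AS{space}"
        code = "P" + qualifier + _encode(pointee, seen)
    elif param in SCALAR_CODES:
        return SCALAR_CODES[param]          # builtin codes are never substituted
    elif OPAQUE.fullmatch(param):
        name = "ocl_" + param[:-2]          # sampler_t -> ocl_sampler
        code = f"{len(name)}{name}"
    else:
        match = VECTOR.fullmatch(param)
        if not match or match.group(1) not in SCALAR_CODES:
            raise ValueError(f"unsupported type: {param!r}")
        code = f"Dv{match.group(2)}_{SCALAR_CODES[match.group(1)]}"
    if code in seen:
        raise NotImplementedError("repeated type needs Itanium substitutions")
    seen.add(code)
    return code


def mangle_builtin(name, params):
    """Mangle a builtin function name with its parameter types."""
    seen = set()
    encoded = "".join(_encode(p, seen) for p in params) or "v"
    return f"_Z{len(name)}{name}{encoded}"
```

Under these rules `mangle_builtin("get_global_id", ["uint"])` gives `_Z13get_global_idj`, and `mangle_builtin("fract", ["float", ("ptr", 1, "float")])` gives `_Z5fractfPU3AS1f`. A signature such as `max(int4, int4)` raises `NotImplementedError`, because its second parameter would be written as a substitution.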
  • 19. MAPPING OF OPENCL™ TO SPIR: USE OF OPTIONAL CORE FEATURES AND EXTENSIONS
      • SPIR contains information about the optional features and extensions a binary uses
          ‒ The runtime can reject SPIR binaries that use unsupported optional features
          ‒ An application can select SPIR binaries based on the optional features and extensions they use
      • Metadata for used core features
          ‒ opencl.used.optional.core.features
          ‒ Two core features are allowed:
              ‒ cl_image: indicates images are used
              ‒ cl_double: indicates doubles are used
      • Metadata for used extensions
          ‒ opencl.used.extensions
              ‒ cl_khr_int64_base_atomics
              ‒ cl_khr_int64_extended_atomics
              ‒ cl_khr_fp16
              ‒ cl_khr_gl_sharing
              ‒ cl_khr_gl_event
              ‒ etc.
  • 20. MAPPING OF OPENCL™ TO SPIR: KERNEL ATTRIBUTES
      • SPIR contains information about optional kernel attributes
          ‒ reqd_work_group_size
          ‒ work_group_size_hint
          ‒ vec_type_hint
      • For each kernel, there is a metadata node for its optional kernel attributes
          ‒ !opencl.kernels = {!0, !1, ..., !N}
              ‒ !0 = metadata { < function signature >, !01, !02, ..., !0i }
              ‒ !1 = metadata { < function signature >, !11, !12, ..., !1j }
              ‒ ...
              ‒ !N = metadata { < function signature >, !N1, !N2, ..., !Nk }
  • 21. MAPPING OF OPENCL™ TO SPIR: KERNEL ARGUMENT INFO
      • SPIR contains the kernel argument information required by the OpenCL™ runtime for executing kernels
      • For each kernel argument, there is metadata
          ‒ kernel_arg_addr_space
          ‒ kernel_arg_access_qual
          ‒ kernel_arg_type
          ‒ kernel_arg_base_type
          ‒ kernel_arg_type_qual
          ‒ kernel_arg_name: optional, only exists if -cl-kernel-arg-info is used when producing SPIR
  • 22. SPIR ABI
      • SPIR uses the default ABI of Clang 3.2
          ‒ Any aggregate type is passed as a pointer. Memory allocation (if needed) is the responsibility of the caller function.
          ‒ Enumeration types are handled as the underlying integer type.
          ‒ If the argument type is a promotable integer type, it is extended according to the C99 integer promotion rules.
          ‒ Any other type, including floating point types, vectors, etc., is passed directly as the corresponding LLVM type.
  • 23. HOW TO IMPLEMENT SPIR LOADER: OVERALL STRUCTURE – IDEAL CASE
      • Flow diagram:
          ‒ User's OpenCL™ source -> compile -> SPIR binary
          ‒ User's SPIR binary
          ‒ Builtin library source -> compile -> SPIR binary
          ‒ All SPIR binaries -> optimize, link -> linked SPIR binary -> optimize, codegen -> executable kernels
      • Backend consumes SPIR directly without transforming it to the vendor's LLVM format
  • 24. HOW TO IMPLEMENT SPIR LOADER: OVERALL STRUCTURE – ACTUAL CASE
      • Flow diagram:
          ‒ User's OpenCL™ source -> compile -> vendor's LLVM binary
          ‒ User's SPIR binary -> SPIR loader -> vendor's LLVM binary
          ‒ Builtin library source -> compile -> vendor's LLVM binary
          ‒ All vendor's LLVM binaries -> optimize, link -> vendor's linked binary -> optimize, codegen -> executable kernels
      • Backend transforms SPIR to the vendor's LLVM format
  • 25. WHY IS SPIR LOADER NEEDED
      • The vendor uses different LLVM entities or a different format to convey the information required by the OpenCL™ runtime for querying and executing kernels
      • The vendor's frontend does special transformations which are not done by the SPIR producer
      • The vendor's backend is shared by different frontends, some of which do not generate SPIR
      • The vendor's builtin library uses a different name mangling scheme
  • 26. HOW TO IMPLEMENT SPIR LOADER
      • Verify that the SPIR target triple and data layout are compatible with the target device
      • Set the target triple for the target device
      • Demangle builtin functions and re-mangle them using the vendor's name mangling scheme
      • Transform data types
      • Transform metadata
      • Transform calling conventions
      • Perform the special transformations done by the frontend
          ‒ If possible, consider moving these transformations from the frontend to the backend
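The order of the loader steps above can be sketched as a small driver. This is a toy model, not an LLVM-based implementation: the module is a plain dictionary, `load_spir` and its parameters are hypothetical, each later step is passed in as a function, and the SPIR triples are assumed to be spelled `spir-unknown-unknown` (32 bit) and `spir64-unknown-unknown` (64 bit).

```python
# Assumed spelling of the SPIR target triples and their pointer sizes.
SPIR_TRIPLES = {"spir-unknown-unknown": 32, "spir64-unknown-unknown": 64}


def load_spir(module, device_triple, device_address_bits, transforms=()):
    """Verify a SPIR module against a device, retarget it, then transform it.

    module     -- dict stand-in for an LLVM module; must have a "triple" key
    transforms -- functions taking and returning a module dict, for example
                  re-mangling builtins or rewriting metadata
    """
    spir_bits = SPIR_TRIPLES.get(module["triple"])
    if spir_bits is None:
        raise ValueError(f"not a SPIR target triple: {module['triple']}")
    if spir_bits != device_address_bits:
        raise ValueError("SPIR bitness does not match the target device")
    loaded = dict(module, triple=device_triple)  # set the device's triple
    for transform in transforms:
        loaded = transform(loaded)
    return loaded
```

Keeping each transformation as a separate function mirrors the step list and lets a vendor move individual steps between frontend and backend without touching the driver.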
  • 27. SPIR CONFORMANCE TEST
      • SPIR is a Khronos extension
      • To claim SPIR support, a vendor's OpenCL™ implementation needs to pass the SPIR conformance test
      • The SPIR 1.2 conformance test is going to be part of the OpenCL™ 1.2 conformance test
  • 28. REFERENCES
      • Khronos OpenCL™ Working Group, SPIR subgroup. SPIR provisional spec, version 1.2. http://www.khronos.org/files/opencl-spir-12-provisional.pdf
      • LLVM Team. LLVM Bitcode File Format, version 3.2. http://www.llvm.org/releases/3.2/docs/BitCodeFormat.html, 2012.
      • CodeSourcery, Compaq, EDG, HP, IBM, Intel, Red Hat, SGI, and others. Itanium C++ ABI. http://mentorembedded.github.com/cxx-abi/abi.html
      • Khronos OpenCL™ Working Group. The OpenCL™ Specification, version 1.2. http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf, November 2012.
      • LLVM Team. LLVM Language Reference Manual, version 3.2. http://www.llvm.org/releases/3.2/docs/LangRef.html, 2012.
  • 29. DISCLAIMER & ATTRIBUTION
      The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.
      AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
      ATTRIBUTION: © 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. OpenCL™ is a trademark of Apple Inc. Other names are for informational purposes only and may be trademarks of their respective owners.