SlideShare una empresa de Scribd logo
1 de 8
Descargar para leer sin conexión
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 1
Runtime: Device class [4.6.4]
device();
explicit device(const device_selector &deviceSelector);
explicit device(cl_device_id deviceId);
cl_device_id get() const;
platform get_platform() const;
bool is_host() const;
bool is_cpu() const;
bool is_gpu() const;
bool is_accelerator() const;
template <info::device param> typename info::param_traits<
info::device, param>::return_type get_info() const;
bool has_extension(const string_class &extension) const;
template <info::partition_property prop>
vector_class<device> create_sub_devices(
size_t nbSubDev) const;
template <info::partition_property prop>
vector_class<device> create_sub_devices(
const vector_class<size_t> &counts) const;
template <info::partition_property prop>
vector_class<device> create_sub_devices(
info::affinity_domain affinityDomain) const;
static vector_class<device> get_devices(
info::device_type deviceType =
info::device_type::all);
(Continued on next page)
SYCL™ is a C++ programming model for OpenCL which builds on the underlying concepts,
portability, and efficiency of OpenCL while adding much of the ease of use and flexibility of C++.
[n.n] refers to sections in the SYCL 1.2.1 specification available at khronos.org/registry/sycl
SYCL example [3.1.3]
Below is an example of a typical SYCL application which schedules a job
to run in parallel on any OpenCL accelerator.
#include <CL/sycl.hpp>
#include <iostream>
int main() {
using namespace cl::sycl;
int data[1024]; // Allocates data to be worked on
// Include all the SYCL work in a {} block to ensure all
// SYCL tasks are completed before exiting the block.
{
// Create a queue to enqueue work to
queue myQueue;
// Wrap the data array variable in a buffer.
buffer<size_t, 1> resultBuf { data, range<1> { 1024 } };
// Create a command group to
// issue commands to the queue.
myQueue.submit([&](handler & cgh)
{
// Request access to the buffer
auto writeResult = resultBuf.get_access<access::mode::write>(cgh);
// Enqueue a parallel_for task.
cgh.parallel_for<class simple_test>(range<1>{ 1024 }, [=](id<1> idx)
{
writeResult[idx] = idx[0];
}); // End of the kernel function
}); // End of the queue commands
} // End of scope, so wait for the queued work to complete
// Print result
for (int i = 0; i < 1024; i++) {
std::cout <<''data[''<< i << ''] = '' << data[i] << std::endl;
}
return 0;
}
Runtime: Context class [4.6.3]
explicit context(async_handler asyncHandler = {});
context(const device &dev,
async_handler asyncHandler = {});
context(const platform &plt,
async_handler asyncHandler = {});
context(const vector_class<device> &deviceList,
async_handler asyncHandler = {});
context(cl_context clContext,
async_handler asyncHandler = {});
cl_context get() const;
bool is_host() const;
template <info::context param>
typename info::param_traits
<info::context, param>::return_type get_info() const;
platform get_platform() const;
vector_class<device> get_devices() const;
Context queries using get_info():
Descriptor Return type
info::context::reference_count cl_uint
info::context::platform platform
info::context::devices vector_class<device>
Runtime: Platform class [4.6.2]
platform();
explicit platform(cl_platform_id platformID);
explicit platform(const device_selector &deviceSelector);
cl_platform_id get() const;
static vector_class<platform> get_platforms();
vector_class<device> get_devices(
info::device_type = info::device_type::all) const;
template <info::platform param>
typename info::param_traits<
info::platform, param>::return_type get_info() const;
bool has_extension(const string_class & extension) const;
bool is_host() const;
Platform information descriptors:
Platform descriptors Return type
info::platform::profile string_class
info::platform::version string_class
info::platform::name string_class
info::platform::vendor string_class
info::platform::extensions vector_class<string_class>
Header file
SYCL programs must include the
<CL/sycl.hpp> header file to provide
all of the SYCL features.
Namespace
All SYCL names are defined in the
cl::sycl namespace.
Queue
See queue class functions [4.6.5]
on page 2 of this reference guide.
Buffer
See buffer class functions [4.7.2]
on page 2 of this reference guide.
Accessor
See accessor class function [4.7.6]
on page 3 of this reference guide.
Handler
See handler class functions [4.8.3]
on page 5 of this reference guide.
Scopes
The kernel scope specifies a single
kernel function that will be, or has
been, compiled by a device compiler
and executed on a device. See
Invoking Kernels [4.8.5] in the spec.
The command group scope specifies
a unit of work which is comprised of
a kernel function and accessors.
The application scope specifies all
other code outside of a command
group scope.
Runtime: Device selection class [4.6.1]
device_selector();
device_selector(const device_selector &rhs);
device_selector &operator=(const device_selector &rhs);
virtual ~device_selector();
device select_device() const;
virtual int operator()(const device &device) const;
SYCL device selectors:
Device selectors Description
default_selector Devices selected by heuristics of the system
gpu_selector
Select devices according to device type
info::device::device_type::gpu
cpu_selector
Select devices according to device type
info::device::device_type::cpu
host_selector Selects the SYCL host
Runtime: Common interface [4.3.2 - 4.3.4]
Member functions for by-value semantics
T may be device, context, queue, program, kernel, event,
buffer, image, sampler, accessor, or stream.
T(const T &rhs);
T(T &&rhs);
T &operator=(const T &rhs);
T &operator=(T &&rhs);
~T();
bool operator==(const T &rhs) const;
bool operator!=(const T &rhs) const;
Member functions for by-value semantics
In the following functions, T may be id, range, item, nd_item,
h_item, group, or nd_range.
T(const T &rhs) = default;
T(T &&rhs) = default;
T &operator=(const T &rhs) = default;
T &operator=(T &&rhs) = default;
~T() = default;
bool operator==(const T &rhs) const;
bool operator!=(const T &rhs) const;
Properties interface [4.3.4.1]
Some runtime classes provide this properties interface.
class T {
. . .
template <typename propertyT>
bool has_property() const;
template <typename propertyT>
propertyT get_property() const;
. . .
};
class property_list {
public:
template <typename... propertyTN>
property_list(propertyTN... props);
};
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 2
Runtime: Device class (continued)
Device queries using get_info():
Descriptor in info::device Return type
device_type info::device_type
vendor_id cl_uint
max_compute_units cl_uint
max_work_item_dimensions cl_uint
max_work_item_sizes id<3>
max_work_group_size size_t
preferred_vector_width_char
cl_uint
preferred_vector_width_short
preferred_vector_width_int
preferred_vector_width_long
preferred_vector_width_float
preferred_vector_width_double
preferred_vector_width_half
native_vector_width_char
cl_uint
native_vector_width_short
native_vector_width_int
native_vector_width_long
native_vector_width_float
native_vector_width_double
native_vector_width_half
max_clock_frequency cl_uint
address_bits cl_uint
max_mem_alloc_size cl_ulong
image_support bool
Descriptor in info::device Return type
max_read_image_args cl_uint
max_write_image_args cl_uint
image2d_max_width size_t
image2d_max_height size_t
image3d_max_width size_t
image3d_max_height size_t
image3d_max_depth size_t
image_max_buffer_size size_t
image_max_array_size size_t
max_samplers cl_uint
max_parameter_size size_t
mem_base_addr_align cl_uint
half_fp_config vector_class<info::fp_config>
single_fp_config vector_class<info::fp_config>
double_fp_config vector_class<info::fp_config>
global_mem_cache_type info::global_mem_cache_type
global_mem_cache_line_size cl_uint
global_mem_cache_size cl_ulong
global_mem_size cl_ulong
max_constant_buffer_size cl_ulong
max_constant_args cl_uint
local_mem_type info::local_mem_type
local_mem_size cl_ulong
error_correction_support bool
host_unified_memory bool
profiling_timer_resolution size_t
is_endian_little bool
Descriptor in info::device Return type
is_available bool
is_compiler_available bool
is_linker_available bool
execution_capabilities
vector_class
<info::execution_capability>
queue_profiling bool
built_in_kernels vector_class<string_class>
platform platform
name string_class
vendor string_class
driver_version string_class
profile string_class
version string_class
opencl_c_version string_class
extensions vector_class<string_class>
printf_buffer_size size_t
preferred_interop_user_sync bool
parent_device device
partition_max_sub_devices cl_uint
partition_properties
vector_class <info::
partition_property>
partition_affinity_domains
vector_class
<info::partition_affinity_
domain>
partition_type_property info::partition_property
partition_type_affinity_domain info::partition_affinity_domain
reference_count cl_uint
Runtime: Queue class [4.6.5]
Property class members:
property::queue::enable_profiling::enable_profiling();
explicit queue(const property_list &propList = {});
queue(const async_handler &asyncHandler,
const property_list &propList = {});
queue(const device_selector &deviceSelector,
const property_list &propList = {});
queue(const device_selector &deviceSelector,
const async_handler &asyncHandler,
const property_list &propList = {});
queue(const device &syclDevice,
const property_list &propList = {});
queue(const device &syclDevice,
const async_handler &asyncHandler,
const property_list &propList = {});
queue(const context &syclContext,
const device_selector &deviceSelector,
const property_list &propList = {});
queue(const context &syclContext,
const device_selector &deviceSelector,
const async_handler &asyncHandler,
const property_list &propList = {});
queue(cl_command_queue clQueue,
const context &syclContext,
const async_handler &asyncHandler = {});
cl_command_queue get() const;
context get_context() const;
device get_device() const;
bool is_host() const;
void wait();
void wait_and_throw();
void throw_asynchronous();
template <info::queue param>
typename info::param_traits<
info::queue, param>::return_type get_info() const;
template <typename T> event submit(T cgf);
template <typename T> event submit(T cgf,
const queue &secondaryQueue);
Queue queries using get_info():
Descriptor Return type
info::queue::context context
info::queue::device device
info::queue::reference_count cl_uint
Buffer class [4.7.2]
Property class members:
property::buffer::use_host_ptr::use_host_ptr();
property::buffer::use_mutex::use_mutex(mutex_class &mutexRef);
property::buffer::context_bound::context_bound(
context boundContext);
mutex_class *property::buffer::use_mutex::get_mutex_ptr() const;
context property::buffer::context_bound::get_context() const;
Class Declaration
template <typename T, int dimensions = 1,
typename AllocatorT = cl::sycl::buffer_allocator>
class buffer;
Member Functions
buffer(const range<dimensions> &bufferRange,
const property_list &propList = {});
buffer(const range<dimensions> &bufferRange,
AllocatorT allocator, const property_list &propList = {});
buffer(T *hostData, const range<dimensions> &bufferRange,
const property_list &propList = {});
buffer(T *hostData, const range<dimensions> &bufferRange,
AllocatorT allocator, const property_list &propList = {});
buffer(const T *hostData,
const range<dimensions> &bufferRange,
const property_list &propList = {});
buffer(const T *hostData,
const range<dimensions> &bufferRange,
AllocatorT allocator, const property_list &propList = {});
buffer(shared_ptr_class<T> &hostData,
const range<dimensions> &bufferRange,
AllocatorT allocator, const property_list &propList = {});
buffer(shared_ptr_class<T> &hostData,
const range<dimensions> &bufferRange,
const property_list &propList = {});
template <class InputIterator>
buffer(InputIterator first, InputIterator last,
AllocatorT allocator, const property_list &propList = {});
template <class InputIterator>
buffer(InputIterator first, InputIterator last,
const property_list &propList = {});
buffer(buffer<T, dimensions, AllocatorT> &b,
const index<dimensions> &baseIndex,
const range<dimensions> &subRange);
buffer(cl_mem clMemObject, const context &syclContext,
event availableEvent = {});
range<dimensions> get_range() const;
size_t get_count() const;
size_t get_size() const;
AllocatorT get_allocator() const;
template <access::mode mode,
access::target target = access::target::global_buffer>
accessor<T, dimensions, mode, target>
get_access(handler &commandGroupHandler);
template <access::mode mode>accessor<T, dimensions,
mode, access::target::host_buffer> get_access();
template <access::mode mode,
access::target target = access::target::global_buffer>
accessor<T, dimensions, mode, target>
get_access(handler &commandGroupHandler,
range<dimensions> accessRange,
id<dimensions> accessOffset = {});
template <access::mode mode>
accessor<T, dimensions, mode, access::target::host_buffer>
get_access(range<dimensions> accessRange,
id<dimensions> accessOffset = {});
template <typename Destination = std::nullptr_t>
void set_final_data(Destination finalData = std::nullptr);
void set_write_back(bool flag = true);
bool is_sub_buffer() const;
template <typename reinterpretT, int reinterpretDim>
buffer<reinterpretT, reinterpretDim, AllocatorT>
reinterpret(range<reinterpretDim> reinterpretRange)
const;
Runtime: Event class [4.6.6]
event()
event(cl_event clEvent, const context &syclContext);
cl_event get();
vector_class<event> get_wait_list();
void wait();
void wait_and_throw();
static void wait(const vector_class<event> &eventList);
static void wait_and_throw(
const vector_class<event> &eventList);
template <info::event param>typename info::param_traits<
info::event, param>::return_type get_info() const;
template <info::event_profiling param>
typename info::param_traits<info::event_profiling,
param>::return_type get_profiling_info() const;
Event queries using get_info()
Descriptor Return type
info::event::command_execution_status info::event::command_
status
info::event::reference_count cl_uint
info::event_profiling::command_submit cl_ulong
info::event_profiling::command_start cl_ulong
info::event_profiling::command_end cl_ulong
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 3
Accessor class [4.7.6]
Enums
access::mode access::target
read,
write,
read_write,
discard_write,
discard_read_write,
atomic
global_buffer,
constant_buffer,
local,
image,
host_buffer,
host_image,
image_array
Class Declaration
template <typename dataT, int dimensions,
access::mode accessmode, access::target accessTarget
= access::target::global_buffer, access::placeholder
isPlaceholder = access::placeholder::false_t> class accessor;
Members of class accessor for buffers
accessor(buffer<dataT, 1> &bufferRef);
accessor(buffer<dataT, 1> &bufferRef,
handler &commandGroupHandlerRef);
accessor(buffer<dataT, dimensions> &bufferRef);
accessor(buffer<dataT, dimensions> &bufferRef,
handler &commandGroupHandlerRef);
accessor(buffer<dataT, dimensions> &bufferRef,
range<dimensions> accessRange,
id<dimensions> accessOffset = {});
accessor(buffer<dataT, dimensions> &bufferRef,
handler &commandGroupHandlerRef,
range<dimensions> accessRange,
id<dimensions> accessOffset = {});
constexpr bool is_placeholder() const;
size_t get_size() const;
size_t get_count() const;
range<dimensions> get_range() const;
range<dimensions> get_offset() const;
operator dataT &() const;
dataT &operator[](id<dimensions> index) const;
dataT &operator[](size_t index) const;
operator dataT() const;
dataT operator[](id<dimensions> index) const;
dataT operator[](size_t index) const;
operator atomic<dataT, access::address_space::global_space>
() const;
atomic<dataT, access::address_space::global_space>
operator[](id<dimensions> index) const;
atomic<dataT, access::address_space::global_space>
operator[](size_t index) const;
__unspecified__ &operator[](size_t index) const;
dataT *get_pointer() const;
global_ptr<dataT> get_pointer() const;
constant_ptr<dataT> get_pointer() const;
Members of class accessor for local access
accessor(handler &commandGroupHandlerRef);
accessor(range<dimensions> allocationSize,
handler &commandGroupHandlerRef);
size_t get_size() const;
size_t get_count() const;
operator dataT &() const;
dataT &operator[](id<dimensions> index) const;
dataT &operator[](size_t index) const;
operator atomic<dataT, access::address_space::local_space>
() const;
atomic<dataT, access::address_space::local_space>
operator[](id<dimensions> index) const;
atomic<dataT, access::address_space::local_space>
operator[](size_t index) const;
__unspecified__ &operator[](size_t index) const;
local_ptr<dataT> get_pointer() const;
Members of class accessor for images
template <typename AllocatorT>
accessor(image<dimensions, AllocatorT> &imageRef);
template <typename AllocatorT>
accessor(image<dimensions, AllocatorT> &imageRef,
handler &commandGroupHandlerRef);
template <typename AllocatorT>
accessor(image<dimensions + 1, AllocatorT> &imageRef,
handler &commandGroupHandlerRef);
accessor<dataT, dimensions, mode, image>
operator[](size_t index) const;
size_t get_size() const;
size_t get_count() const;
template <typename coordT>
dataT read(const coordT &coords) const;
template <typename coordT>
dataT read(const coordT &coords, const sampler &smpl)
const;
template <typename coordT> void write(const coordT
&coords, const dataT &color) const;
Accessor Capabilities
Buffer [Table 4.44]
The data type must match that of the SYCL buffer. The
dimensionality is 0, 1, 2, or 3.
Access Target
Accessor
Type Access Mode
Placeholder
modes
global_buffer Device
read, write, read_write
discard_write,
discard_read_write,
atomic
false_t
true_t
constant_buffer Device read
false_t
true_t
host_buffer Host
read, write, read_write
discard_write,
discard_read_write
false_t
Local [Table 4.47]
The data type must match that of the SYCL buffer. The
dimensionality is 0, 1, 2, or 3.
Access Target
Accessor
Type Access Mode
Placeholder
modes
local Device read_write, atomic false_t
(Continued on next page)
Image class [4.7.3]
Property class members:
property::image::use_host_ptr::use_host_ptr();
property::image::use_mutex::use_mutex(mutex_class &mutexRef);
property::image::context_bound::context_bound(
context boundContext);
mutex_class *property::image::use_mutex::get_mutex_ptr() const;
context property::image::context_bound::get_context() const;
Enums
image_channel_order image_channel_type
a,
r,
rx,
rg,
rgx,
ra,
rgb,
rgbx,
rgba,
argb,
bgra,
intensity,
luminance,
abgr
snorm_int8,
snorm_int16,
unorm_int8,
unorm_int16,
unorm_short_565,
unorm_short_555,
unorm_int_101010,
signed_int8,
signed_int16,
signed_int32,
unsigned_int8,
unsigned_int16,
unsigned_int32,
fp16,
fp32
Class Declaration
template <int dimensions = 1, typename AllocatorT =
cl::sycl::image_allocator> class image;
Member Functions
image(image_channel_order order,
image_channel_type type,
const range<dimensions> &range,
const property_list &propList = {});
image(image_channel_order order,
image_channel_type type,
const range<dimensions> &range, AllocatorT allocator,
const property_list &propList = {});
image(image_channel_order order, image_channel_type type,
const range<dimensions> &range,
const range<dimensions - 1> &pitch,
const property_list &propList = {});
image(void *hostPointer, image_channel_order order,
image_channel_type type,
const range<dimensions> &range,
const range<dimensions - 1> &pitch, AllocatorT allocator,
const property_list &propList = {});
image(void *hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range,
const property_list &propList = {});
image(void *hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range, AllocatorT allocator,
const property_list &propList = {});
image(void *hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range,
const range<dimensions - 1> &pitch,
const property_list &propList = {});
image(void *hostPointer, image_channel_order order,
image_channel_type type,
const range<dimensions> &range, const
range<dimensions - 1> &pitch, AllocatorT allocator,
const property_list &propList = {});
image(void *hostPointer, image_channel_order order,
image_channel_type type,
const range<dimensions> &range,
const property_list &propList = {});
image(void *hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range, AllocatorT allocator,
const property_list &propList = {});
image(shared_ptr_class<void> &hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range,
const property_list &propList = {});
image(shared_ptr_class<void> &hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range, AllocatorT allocator,
const property_list &propList = {});
image(shared_ptr_class<void> &hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range,
const range<dimensions - 1> &pitch,
const property_list &propList = {});
image(shared_ptr_class<void> &hostPointer,
image_channel_order order, image_channel_type type,
const range<dimensions> &range, const
range<dimensions - 1> &pitch, AllocatorT allocator,
const property_list &propList = {});
image(cl_mem clMemObject, const context &syclContext,
event availableEvent = {});
range<dimensions> get_range() const;
range<dimensions-1> get_pitch() const;
size_t get_count() const;
size_t get_size() const;
AllocatorT get_allocator() const;
template <typename dataT, access::mode mode>
accessor<dataT, dimensions,
accessMode, access::target::image>
get_access(handler &commandGroupHandler);
template <typename dataT, access::mode mode>
accessor<dataT, dimensions, accessMode,
access::target::host_image> get_access();
template <typename Destination = std::nullptr_t>
void set_final_data(Destination finalData = std::nullptr);
void set_write_back(bool flag = true);
Data management: Host allocation [4.7.1]
The default allocator for memory objects is implementation
defined, but users can supply their own allocator class.
buffer<int, 1, UserDefinedAllocator<int> > b(d);
Default allocators
Allocators Description
buffer_allocator Default buffer allocator used by the runtime,
when no allocator is defined by the user.
image_allocator Default image allocator used by the runtime for
the image class, when no allocator is defined by
the user. Must be a byte-sized allocator.
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 4
Multi_ptr class [4.7.7]
Class Declaration
template <typename elementType,
access::address_space addressSpace> class multi_ptr;
Member Functions
multi_ptr();
multi_ptr(const multi_ptr&);
multi_ptr(multi_ptr&&);
multi_ptr(pointer_t);
multi_ptr(elementType*);
multi_ptr(void*);
multi_ptr(std::nullptr_t);
~multi_ptr();
multi_ptr &operator=(const multi_ptr&);
multi_ptr &operator=(multi_ptr&&);
multi_ptr &operator=(pointer_t);
multi_ptr &operator=(elementType*);
multi_ptr &operator=(void*);
multi_ptr &operator=(std::nullptr_t);
elementType &operator=() const;
elementType *operator=() const;
template <int dimensions, access::mode addressMode>
multi_ptr(accessor<elementType, dimensions,
addressMode, access::target::global_buffer>);
template <int dimensions, access::mode addressMode>
multi_ptr(accessor<elementType, dimensions,
addressMode, access::target::local>);
template <int dimensions, access::mode addressMode>
multi_ptr(accessor<elementType, dimensions,
addressMode, access::target::constant_buffer>);
template <typename elementType, int dimensions,
access::mode addressMode>
multi_ptr(accessor<elementType, dimensions,
addressMode, access::target::global_buffer>);
template <typename elementType, int dimensions,
access::mode addressMode>
multi_ptr (accessor<elementType, dimensions,
addressMode, access::target::local>);
template <typename elementType, int dimensions,
access::mode addressMode>
multi_ptr(accessor<elementType, dimensions,
addressMode, access::target::constant_buffer>);
pointer_t get() const;
operator void*() const;
template <typename elementType>
explicit operator multi_ptr<elementType, addressSpace>()
const;
Non-member Functions
template <typename elementType, access::address_space
addressSpace> multi_ptr<elementType, addressSpace>
make_ptr(multi_ptr<elementType,
addressSpace>::pointer_t);
template
<typename elementType, access::address_space
addressSpace> multi_ptr<elementType, addressSpace>
make_ptr(elementType*);
template
<typename elementType, access::address_space
addressSpace>
bool operatorOP(const multi_ptr<elementType,
addressSpace> & lhs, const multi_ptr<elementType,
addressSpace>& rhs);
OP is one of ==, !=, <, >, <=, >=
template
<typename elementType, access::address_space
addressSpace>
bool operatorOP(const multi_ptr<elementType,
addressSpace>& lhs, std::nullptr_t rhs);
OP is one of ==, !=, <, >, <=, >=
template
<typename elementType, access::address_space
addressSpace>
bool operatorOP(std::nullptr_t lhs,
const multi_ptr<elementType, addressSpace>& rhs);
OP is one of ==, !=, <, >, <=, >=
Template specialization aliases [4.7.7.2]
template <typename elementType>
using global_ptr = multi_ptr<elementType,
access::address_space::global_space>;
template <typename elementType>
using local_ptr = multi_ptr<elementType,
access::address_space::local_space>;
template <typename elementType>
using constant_ptr = multi_ptr<elementType,
access::address_space::constant_space>;
template <typename elementType>
using private_ptr = multi_ptr<elementType,
access::address_space::private_space>;
void prefetch(size_t numElements) const;
Ranges and index space identifiers[4.8.1]
Class range [4.8.1.1]
Class declaration
template <size_t dimensions = 1> class range;
Member functions
range(size_t dim0);
range(size_t dim0, size_t dim1);
range(size_t dim0, size_t dim1, size_t dim2);
size_t get(int dimension) const;
size_t &operator[](int dimension);
size_t size() const;
range<dimensions> operatorOP(
const range<dimensions> &rhs) const;
range<dimensions> operatorOP(const size_t &rhs) const;
range<dimensions> &operatorOP(
const range<dimensions> &rhs);
range<dimensions> &operatorOP(const size_t &rhs);
Non-member function
template <int dimensions>
range<dimensions> operatorOP(const size_t &lhs,
const range<dimensions> &rhs);
Class nd_range [4.8.1.2]
Class declaration
template <size_t dimensions = 1> struct nd_range;
Member functions
nd_range(range<dimensions> globalSize,
range<dimensions> localSize,
id<dimensions>
offset = id<dimensions>());
range<dimensions> get_global_range() const;
range<dimensions> get_local_range() const;
range<dimensions> get_group_range() const;
id<dimensions> get_offset() const;
Class id [4.8.1.3]
Class declaration
template <size_t dimensions = 1> struct id;
Member functions
id();
id(size_t dim0);
id(size_t dim0, size_t dim1);
id(size_t dim0, size_t dim1, size_t dim2);
id(const range<dimensions> &range);
id(const item<dimensions> &item);
size_t get(int dimension) const;
size_t &operator[](int dimension) const;
operator size_t();
id<dimensions> operatorOP(const id<dimensions> &rhs)
const;
where OP is one of +, -, *, /, %, <<, >>, &, |, ˆ, &&, ||, <, >, <=, >=
id<dimensions> operatorOP(const size_t &rhs) const;
where OP is one of +, -, *, /, %, <<, >>, &, |, ˆ, &&, ||, <, >, <=, >=
id<dimensions> &operatorOP(const id<dimensions> &rhs);
where OP is one of +=, -=, *=, /=, %=, <<=, >>=, &=, |=, ˆ=
id<dimensions> &operatorOP(const size_t &rhs);
where OP is one of +=, -=, *=, /=, %=, <<=, >>=, &=, |=, ˆ=
Non-member functions
template <int dimensions> id<dimensions> operatorOP(
const size_t &lhs, const id<dimensions> &rhs);
where OP is one of +, -, *, /, %, <<, >>, &, |, ˆ
Class item [4.8.1.4-5]
Class declaration
template <int dimensions = 1, bool with_offset = true>
struct item;
Member functions
id<dimensions> get_id() const;
size_t get_id(int dimension) const;
size_t &operator[](int dimension);
range<dimensions> get_range() const;
id<dimensions> get_offset() const;
operator item<dimensions, true>() const;
size_t get_linear_id() const;
Class nd_item [4.8.1.6]
Class declaration
template <size_t dimensions = 1> struct nd_item;
Member functions
id<dimensions> get_global_id() const;
size_t get_global_id(int dimension) const;
size_t get_global_linear_id() const;
id<dimensions> get_local_id() const;
size_t get_local_id(int dimension) const;
size_t get_local_linear_id() const;
group<dimensions> get_group() const;
size_t get_group(int dimension) const;
size_t get_group_linear_id() const;
size_t get_num_range(int dimension) const;
id<dimensions> get_num_range() const;
range<dimensions> get_global_range() const;
range<dimensions> get_local_range() const;
id<dimensions> get_offset() const;
nd_range<dimensions> get_nd_range() const;
void barrier(access::fence_space accessSpace =
access::fence_space::global_and_local) const;
template
<access::mode accessMode = access::mode::read_write>
void mem_fence(access::fence_space accessSpace =
access::fence_space::global_and_local) const;
template <typename dataT>
device_event async_work_group_copy(
local_ptr<dataT> dest, global_ptr<dataT> src,
size_t numElements) const;
template <typename dataT>
device_event async_work_group_copy(
global_ptr<dataT> dest, local_ptr<dataT> src,
size_t numElements) const;
(Continued on next page)
Sampler class [4.7.8]
Enums
addressing_mode sampler_filtering_mode
mirrored_repeat
repeat
clamp_to_edge
clamp
none
nearest
linear
coordinate_normalization_mode
normalized
unnormalized
Class sampler members
sampler(coordinate_normalization_mode
normalizationMode, addressing_mode addressingMode,
filtering_mode filteringMode);
sampler(cl_sampler clSampler, const context &syclContext);
addressing_mode get_addresssing_mode() const;
filtering_mode get_filtering_mode() const;
coordinate_normalization_mode
get_coordinate_normalization_mode() const;
Accessor class (continued)
Image [Table 4.50]
The data types are cl_int4, cl_uint4, cl_float4, and cl_half4, and
the dimensionality is between 1 and 3 (inclusive).
Access Target
Accessor
Type Dimensionality Access Mode
Place-
holder
image Device 1, 2, or 3
read, write,
discard_write,
false_timage_array Device 1 or 2
host_image Host 1, 2, or 3
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 5
Kernel class [4.8.7]
kernel(cl_kernel clKernel, const context &syclContext);
cl_kernel get() const;
bool is_host() const;
context get_context() const;
program get_program() const;
template <info::kernel param>
typename info::param_traits<
info::kernel, param>::return_type get_info() const;
template <info::kernel_work_group param>
typename info::param_traits<info::kernel_work_group,
param>::return_type
get_work_group_info(const device &dev) const;
Kernel queries using get_info()
Descriptor Return type
info::kernel::function_name string_class
info::kernel::num_args cl_uint
info::kernel::context context
info::kernel::program program
info::kernel::reference_count cl_uint
info::kernel::attributes string_class
Ranges, index space identifiers (cont.)
template <typename dataT>
device_event async_work_group_copy(
local_ptr<dataT> dest, global_ptr<dataT> src,
size_t numElements, size_t srcStride) const;
template <typename dataT>
device_event async_work_group_copy(
global_ptr<dataT> dest, local_ptr<dataT> src,
size_t numElements, size_t destStride) const;
template <typename... eventTN>
void wait_for(eventTN... events) const;
Class h_item [4.8.1.7]
Class declaration
template <int dimensions> struct h_item;
Member functions
item<dimensions, false> get_global() const;
item<dimensions, false> get_local() const;
item<dimensions, false> get_logical_local() const;
item<dimensions, false> get_physical_local() const;
range<dimensions> get_global_range() const;
size_t get_global_range(int dimension) const;
id<dimensions> get_global_id() const;
size_t get_global_id(int dimension) const;
range<dimensions> get_local_range() const;
size_t get_local_range(int dimension) const;
id<dimensions> get_local_id() const;
size_t get_local_id(int dimension) const;
range<dimensions> get_logical_local_range() const;
size_t get_logical_local_range(int dimension) const;
id<dimensions> get_logical_local_id() const;
size_t get_logical_local_id(int dimension) const;
range<dimensions> get_physical_local_range() const;
size_t get_physical_local_range(int dimension) const;
id<dimensions> get_physical_local_id() const;
size_t get_physical_local_id(int dimension) const;
Class group [4.8.1.8]
Class declaration
template <int dimensions = 1> struct group;
Member functions
id<dimensions> get_id() const;
size_t get_id(int dimension) const;
range<dimensions> get_global_range() const;
size_t get_global_range(int dimension) const;
range<dimensions> get_group_range() const;
range<dimensions> get_local_range() const;
size_t get_local_range(int dimension) const;
size_t get_group_range(int dimension) const;
size_t operator[](int dimension) const;
size_t get_linear_id() const;
template<typename workItemFunctionT>
void parallel_for_work_item(workItemFunctionT func)
const;
template<typename workItemFunctionT>
void parallel_for_work_item(range<dimensions>
flexibleRange, workItemFunctionT func) const;
template
<access::mode accessMode = access::mode::read_write>
void mem_fence(access::fence_space accessSpace =
access::fence_space::global_and_local) const;
template <typename dataT>
device_event async_work_group_copy(
local_ptr<dataT> dest, global_ptr<dataT> src,
size_t numElements) const;
template <typename dataT>
device_event async_work_group_copy(
global_ptr<dataT> dest, local_ptr<dataT> src,
size_t numElements) const;
template <typename dataT>
device_event async_work_group_copy(
local_ptr<dataT> dest, global_ptr<dataT> src,
size_t numElements, size_t srcStride) const;
template <typename dataT>
device_event async_work_group_copy(
global_ptr<dataT> dest, local_ptr<dataT> src,
size_t numElements, size_t destStride) const;
template <typename... eventTN>
void wait_for(eventTN... events) const;
class device_event member [4.8.1.9 - 4.8.1.10]
void wait() const;
Programclass[4.8.8]
Enums
program_state
none compiled linked
Member functions
explicit program(const context &context);
program(const context &context,
vector_class<device> deviceList);
program(vector_class<program> programList,
string_class linkOptions ='''');
program(const context &context, cl_program clProgram);
cl_program get() const;
bool is_host() const;
template <typename kernelT>
void compile_with_kernel_type(
string_class compileOptions ='''');
template <typename kernelT>
void build_with_kernel_type(
string_class buildOptions = '''');
void compile_with_source(string_class kernelSource,
string_class compileOptions = '''');
void build_with_source(string_class kernelSource,
string_class buildOptions = '''');
void link(string_class linkOptions ='''');
template <typename kernelT>
bool has_kernel<kernelT>() const;
bool has_kernel(string_class kernelName) const;
template <typename kernelT>
kernel get_kernel<kernelT>() const;
kernel get_kernel(string_class kernelName) const;
template <info::program param>
typename info::param_traits<
info::program, param>::return_type get_info() const;
vector_class<vector_class<char>> get_binaries() const;
context get_context() const;
vector_class<device> get_devices() const;
string_class get_compile_options() const;
string_class get_link_options() const;
string_class get_build_options() const;
program_state get_state() const;
Information descriptors:
Descriptor Return type
info::program::reference_count cl_uint
info::program::context cl_context
info::program::devices vector_class<device>
Command group class handler [4.8.3 - 6]
template <typename dataT, int dimensions, access::mode
accessMode, access::target accessTarget>
void require(accessor<dataT, dimensions, accessMode,
accessTarget, placeholder::true_t> acc);
template <typename T> void set_arg(int argIndex, T && arg);
template <typename... Ts> void set_args(Ts &&... args);
template <typename KernelName, typename KernelType>
void single_task(KernelType kernelFunc);
template <typename KernelName,
typename KernelType, int dimensions>
void parallel_for(range<dimensions> numWorkItems,
KernelType kernelFunc);
template <typename KernelName,
typename KernelType, int dimensions>
void parallel_for(range<dimensions> numWorkItems,
id<dimensions> workItemOffset, KernelType kernelFunc);
template <typename KernelName,
typename KernelType, int dimensions>
void parallel_for(nd_range<dimensions> executionRange,
KernelType kernelFunc);
template <typename KernelName,
typename WorkgroupFunctionType, int dimensions>
void parallel_for_work_group(
range<dimensions> numWorkGroups,
WorkgroupFunctionType kernelFunc);
template <typename KernelName, typename
WorkgroupFunctionType, int dimensions>
void parallel_for_work_group(
range<dimensions> numWorkGroups,
range<dimensions> workGroupSize,
WorkgroupFunctionType kernelFunc);
void single_task(kernel syclKernel);
template <int dimensions>
void parallel_for(range<dimensions> numWorkItems,
kernel syclKernel);
template <int dimensions>
void parallel_for(range<dimensions> numWorkItems,
id<dimensions> workItemOffset, kernel syclKernel);
template <int dimensions> void parallel_for(
nd_range<dimensions> ndRange, kernel syclKernel);
template <typename T, int dim, access::mode mode,
access::target tgt> void copy(accessor<T, dim, mode, tgt>
src, shared_ptr_class<T> dest);
template <typename T, int dim, access::mode mode,
access::target tgt> void copy(shared_ptr_class<T> src,
accessor<T, dim, mode, tgt> dest);
template <typename T, int dim, access::mode mode,
access::target tgt> void copy(
accessor<T, dim, mode, tgt> src, T * dest);
template <typename T, int dim, access::mode mode,
access::target tgt> void copy(
const T * src, accessor<T, dim, mode, tgt> dest);
template <typename T, int dim, access::mode mode,
access::target tgt> void copy(
accessor<T, dim, mode, tgt> src,
accessor<T, dim, mode, tgt> dest);
template <typename T, int dim, access::mode mode,
access::target tgt> void update_host(
accessor<T, dim, mode, tgt> acc);
template<typename T, int dim, access::mode mode,
access::target tgt> void fill(
accessor<T, dim, mode, tgt> dest, const T& src);
class private_memory members [4.8.5.3]
class private_memory {
public:
private_memory(const group<Dimensions> &);
T &operator()(const h_item<Dimensions> &id);
};
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 6
Scalar data types [4.10.1. 6.5]
SYCL supports the C++11 ISO standard data types. The following data type aliases are defined for OpenCL interoperability in the cl::sycl namespace:
char,
signed char,
unsigned char,
short int,
unsigned short int,
bool,
int,
unsigned int,
long int,
unsigned long int,
long long int,
unsignedlonglongint,
size_t,
float,
double,
cl_bool,
cl_char,
cl_uchar,
cl_short,
cl_ushort
cl_int,
cl_uint
cl_long,
cl_ulong
cl_float
cl_double
cl_half
half
byte
Synchronization & atomics [4.11]
Enums
fence_space : char memory_order : int
local_space
global_space
global_and_local
relaxed
Class atomic
Class declaration
template <typename T, access::address_space addressSpace
= access::address_space::global_space> class atomic;
Member functions
template <typename pointerT>
atomic(multi_ptr<pointerT, addressSpace> ptr);
void store(T operand, memory_order memoryOrder =
memory_order::relaxed) volatile;
T load(memory_order memoryOrder =
memory_order::relaxed) const volatile;
T exchange(T operand,
memory_order memoryOrder = memory_order::relaxed)
volatile;
bool compare_exchange_strong(T& expected, T desired,
memory_order successMemoryOrder =
memory_order::relaxed,
memory_order failMemoryOrder =
memory_order::relaxed) volatile;
T fetch_X(T operand, memory_order memoryOrder =
memory_order::relaxed) volatile;
where X is one of add, sub, and, or, xor, min, max
Available only when T != float
Non-member functions
template <typename T, access::address_space addressSpace>
T atomic_load(atomic<T, addressSpace> object,
memory_order memoryOrder = memory_order::relaxed)
volatile;
template <typename T, access::address_space addressSpace>
void atomic_store(atomic<T, addressSpace>
object, T operand, memory_order memoryOrder =
memory_order::relaxed) volatile;
template <typename T, access::address_space addressSpace>
T atomic_exchange(atomic<T, addressSpace> object,
T operand, memory_order memoryOrder =
memory_order::relaxed) volatile;
template <typename T, access::address_space addressSpace>
bool atomic_compare_exchange_strong(
atomic<T, addressSpace> object, T &expected, T desired,
memory_order successMemoryOrder =
memory_order::relaxed, memory_order failMemoryOrder
= memory_order::relaxed) volatile;
template <typename T, access::address_space addressSpace>
T atomic_fetch_X(atomic<T, addressSpace> object,
T operand, memory_order memoryOrder =
memory_order::relaxed) volatile;
where X is one of add, sub, and, or, xor, min, max
Exception classes [4.9.2]
Class: exception
const char *what() const;
context get_context() const;
bool has_context() const;
cl_int get_cl_code() const;
Class: exception_list
size_t size() const;
iterator begin() const;
iterator end() const;
Exception types
Derived from
cl::sycl::runtime_error class
Derived from
cl::sycl::device_error class
kernel_error compile_program_error
nd_range_error link_program_error
accessor_error invalid_object_error
event_error memory_allocation_error
invalid_parameter_error platform_error
profiling_error
feature_not_supported
Stream class [4.12]
Enums
stream_manipulator
scientific,
hex,
oct,
noshowbase,
showbase,
showpos,
endl,
noshowpos,
fixed,
hexfloat,
defaultfloat,
dec
Member functions
stream(size_t bufferSize, size_t maxStatementSize, handler& cgh);
size_t get_size() const;
size_t get_max_statement_size() const;
Non-member functions
template <typename T>
const stream& operator<<(const stream& os, const T &rhs);
__precision_manipulator__ setprecision(int precision);
__width_manipulator__ setw(int width);
Vector data types [4.10.2]
RET’s dataT template parameter depends on vec’s dataT
template parameter:
dataT RET
cl_char, cl_uchar cl_char
cl_short, cl_ushort or cl_half cl_short
cl_int, cl_uint or cl_float cl_int
cl_long, cl_ulong or cl_double cl_long
Class declaration
template <typename dataT, int numElements> class vec;
Members
using element_type = dataT;
vec();
explicit vec(const dataT &arg);
template <typename... argTN> vec(const argTN&... args);
vec(const vec<dataT, numElements> &rhs);
#ifdef __SYCL_DEVICE_ONLY__
vec(vector_t openclVector);
operator vector_t() const;
#endif
operator dataT() const; // Available only if numElements == 1
size_t get_count() const;
size_t get_size() const;
template <typename convertT,
rounding_mode roundingMode>
vec<convertT, numElements> convert() const;
template <typename asT> asT as() const;
The following XYZW members are available only when
numElements <= 4. RGBA members are available only when
numElements == 4.
template<int... swizzleIndexes>
__swizzled_vec__ swizzle() const;
__swizzled_vec__ XYZW_ACCESS() const;
__swizzled_vec__ RGBA_ACCESS() const;
__swizzled_vec__ INDEX_ACCESS() const;
#ifdef SYCL_SIMPLE_SWIZZLES
__swizzled_vec__ XYZW_SWIZZLE() const;
__swizzled_vec__ RGBA_SWIZZLE() const;
#endif
The following lo, hi, odd, and even members are available only
when numElements > 1.
__swizzled_vec__ lo() const;
__swizzled_vec__ hi() const;
__swizzled_vec__ odd() const;
__swizzled_vec__ even() const;
template <access::address_space addressSpace>
void load(size_t offset,
multi_ptr<dataT, addressSpace> ptr);
template <access::address_space addressSpace>
void store(size_t offset,
multi_ptr<dataT, addressSpace> ptr) const;
vec<dataT, numElements> &operator=(const vec<dataT, numElements> &rhs);
vec<dataT, numElements> &operator=(const dataT &rhs);
vec<RET, numElements> operator!();
vec<dataT, numElements> operator~(); // Not available for floating point types
Member functions with OP variable
For all types,
OP may be:
For non floating-point
types, OP may be:
vec<dataT, numElements> operatorOP(const vec<dataT, numElements> &rhs) const;
+, -, *, / %, &, |, ^, <<, >>
vec<dataT, numElements> operatorOP(const dataT &rhs) const;
vec<dataT, numElements> &operatorOP(const vec<dataT, numElements> &rhs); +=, -=, *=,
/=
%=, &=, |=, ^=,
<<=, >>=vec<dataT, numElements> &operatorOP(const dataT &rhs);
vec<RET, numElements> operatorOP(const vec<dataT, numElements> &rhs) const; &&, ||,
==, !=, <, >,
<=, >=vec<RET, numElements> operatorOP(const dataT &rhs) const;
vec<dataT, numElements> &operatorOP();
++, --
vec<dataT, numElements> operatorOP(int);
Non-member functions
Non-member functions with OP variable
For all types, OP
may be:
For non floating-point
types, OP may be:
template <typename dataT, int numElements>
vec<dataT, numElements> operatorOP(const dataT &lhs)
const vec<dataT, numElements> &rhs);
+, -, *, / %, &, |, ^, <<, >>
template <typename dataT, int numElements> vec<RET, numElements>
operatorOP(const dataT &lhs, const vec<dataT, numElements> &rhs);
&&, ||, ==, !=, <,
>, <=, >=
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 7
Math functions[4.13.3]
Math funtions are available in the namespace cl::sycl. In all cases
below, n may be one of 2, 3, 4, 8, or 16.
Tf (genfloat in the spec) is type float[n], double[n], or half[n].
Tff (genfloatf) is type float[n].
Tfd (genfloatd) is type double[n].
Th (genfloath) is type half[n].
sTf (sgenfloat) is type float, double, or half.
Ti (genint) is type int[n].
uTi (ugenint) is type unsigned int or uintn.
uTli (ugenlonginteger) is unsigned long int, ulonglongn, ulongn,
unsigned long long int.
N indicates native variants, available in cl::sycl::native.
H indicates half variants, available in cl::sycl::halfprecision,
implemented with a minimum of 10 bits of accuracy.
Tf acos (Tf x) Arc cosine
Tf acosh (Tf x) Inverse hyperbolic cosine
Tf acospi (Tf x) acos (x) / π
Tf asin (Tf x) Arc sine
Tf asinh (Tf x) Inverse hyperbolic sine
Tf asinpi (Tf x) asin (x) / π
Tf atan (Tf y_over_x) Arc tangent
Tf atan2 (Tf y, Tf x) Arc tangent of y / x
Tf atanh (Tf x) Hyperbolic arc tangent
Tf atanpi (Tf x) atan (x) / π
Tf atan2pi (Tf y, Tf x) atan2 (y, x) / π
Tf cbrt (Tf x) Cube root
Tf ceil (Tf x) Round to integer toward + infinity
Tf copysign (Tf x, Tf y) x with sign changed to sign of y
Tf cos (Tf x)
Tff cos (Tff x) N H
Cosine
Tf cosh (Tf x) Hyperbolic cosine
Tf cospi (Tf x) cos (π x)
Tff divide (Tff x, Tff y) N H
x / y
(Not available in cl::sycl.)
Tf erfc (Tf x) Complementary error function
Tf erf (Tf x) Calculates error function
Tf exp (Tf x)
Tff exp (Tff x) N H
Exponential base e
Tf exp2 (Tf x)
Tff exp2 (Tff x) N H
Exponential base 2
Tf exp10 (Tf x)
Tff exp10 (Tff x) N H
Exponential base 10
Tf expm1 (Tf x) ex
-1.0
Tf fabs (Tf x) Absolute value
Tf fdim (Tf x, Tf y) Positive difference between x and y
Tf floor (Tf x) Round to integer toward infinity
Tf fma (Tf a, Tf b, Tf c) Multiply and add, then round
Tf fmax (Tf x, Tf y)
Tf fmax (Tf x, sTf y)
Return y if x < y,
otherwise it returns x
Tf fmin (Tf x, Tf y)
Tf fmin (Tf x, sTf y)
Return y if y < x,
otherwise it returns x
Tf fmod (Tf x, Tf y) Modulus. Returns x – y * trunc (x/y)
Tf fract (Tf x, Tf *iptr) Fractional value in x
Tf frexp (Tf x, Ti *exp) Extract mantissa and exponent
Tf hypot (Tf x, Tf y) Square root of x2
+ y2
Ti ilogb (Tf x) Return exponent as an integer value
Tf ldexp (Tf x, Ti k)
doublen ldexp (doublen x, int k)
x * 2n
Tf lgamma (Tf x) Log gamma function
Tf lgamma_r (Tf x, Ti *signp) Log gamma function
Tf log (Tf x)
Tff log (Tff x) N H
Natural logarithm
Tf log2 (Tf x)
Tff log2 (Tff x) N H
Base 2 logarithm
Tf log10 (Tf x)
Tff log10 (Tff x) N H
Base 10 logarithm
Tf log1p (Tf x) ln (1.0 + x)
Tf logb (Tf x) Return exponent as an integer value
Tf mad (Tf a, Tf b, Tf c) Approximates a * b + c
Tf maxmag (Tf x, Tf y) Maximum magnitude of x and y
Tf minmag (Tf x, Tf y) Minimum magnitude of x and y
Tf modf (Tf x, Tf *iptr) Decompose floating-point number
Tff nan(uTi nancode)
Tfd nan(uTli nancode)
Quiet NaN
(Return is scalar when nancode
is scalar)
Tf nextafter (Tf x, Tf y) Next representable floating-point
value after x in the direction of y
Tf pow (Tf x, Tf y) Compute x to the power of y
Tf pown (Tf x, Tiy) Compute xy
, where y is an integer
Tf powr (Tf x, Tf y)
Tff powr (Tff x, Tff y) N
Tff powr (Tff x, Th y) H
Compute xy
, where x is >= 0
Tff recip (Tff x) N H
1 / x
(Not available in cl::sycl.)
Tf remainder (Tf x, Tf y) Floating point remainder
Tf remquo (Tf x, Tf y, Ti *quo) Remainder and quotient
Tf rint (Tf x) Round to nearest even integer
Tf rootn (Tf x, Ti y) Compute x to the power of 1/y
Tf round (Tf x) Integral value nearest to x
rounding
Tf rsqrt (Tf x)
Tff rsqrt (Tff x) N H
Inverse square root
Tf sin (Tf x)
Tff sin (Tff x) N H
Sine
Tf sincos (Tf x, Tf *cosval) Sine and cosine of x
Tf sinh (Tf x) Hyperbolic sine
Tf sinpi (Tf x) sin (π x)
Tf sqrt (Tf x)
Tff sqrt (Tff x) N H
Square root
Tf tan (Tf x)
Tff tan (Tff x) N H
Tangent
Tf tanh (Tf x) Hyperbolic tangent
Tf tanpi (Tf x) tan (π x)
Tf tgamma (Tf x) Gamma function
Tf trunc (Tf x) Round to integer toward zero
Integer functions[4.13.4]
Integer funtions are available in the namespace cl::sycl. In all
cases below, n may be one of 2, 3, 4, 8, or 16. If a type in the
functions below is shown with [xbit] in its name, this indicates
that the type is x bits in size.
Tint (geninteger in the spec) is type int[n], uint[n], unsigned int,
char, char[n], signed char, scharn, ucharn, unsigned
short[n], unsigned short, ushort[n], longn, ulongn, long
int, unsigned long int, long long int, longlongn, ulonglongn
unsigned long long int.
uTint (ugeninteger) is type unsigned char, ucharn,
unsigned short, ushortn, unsigned int, uintn, unsigned long int,
ulongn, ulonglongn, unsigned long long int.
iTint (igeninteger) is type signed char, scharn, short[n], int[n],
long int, longn, long long int, longlongn.
sTint (sgeninteger) is type char, signed char, unsigned char, short,
unsigned short, int, unsigned int, long int, unsigned long int,
long long int, unsigned long long int.
uTint abs (Tint x) | x |
uTint abs_diff (Tint x, Tint y) | x – y | without modulo overflow
Tint add_sat (Tint x, Tint y) x + y and saturates the result
Tint hadd (Tint x, Tint y) (x + y) >> 1 withoutmod.overflow
Tint rhadd (Tint x, Tint y) (x + y + 1) >> 1
Tint clamp (Tint x, Tint min,
Tint max)
Tint clamp (Tint x, sTint min,
sTint max)
min(max(x, min), max)
Tint clz (Tint x) number of leading 0-bits in x
Tint mad_hi (Tint a, Tint b,
Tint c)
mul_hi(a, b) + c
Tint mad_sat (Tint a, Tint b,
Tint c)
a * b + c and saturates the result
Tint max (Tint x, Tint y)
Tint max (Tint x, sTint y)
y if x < y, otherwise it returns x
Tint min (Tint x, Tint y)
Tint min (Tint x, sTint y)
y if y < x, otherwise it returns x
Tint mul_hi (Tint x, Tint y) high half of the product of x and y
Tint popcount (Tint x) Number of non-zero bits in x
Tint rotate (Tint v, Tint i) result[indx] = v[indx] << i[indx]
Tint sub_sat (Tint x, Tint y) x - y and saturates the result
uTint16bit upsample (uTint8bit hi,
uTint8bit lo)
result[i]= ((ushort)hi[i]<< 8)|lo[i]
iTint16bit upsample (iTint8bit hi,
uTint8bit lo)
result[i]=((short)hi[i]<< 8)|lo[i]
uTint32bit upsample ( uTint16bit hi,
uTint16bit lo)
result[i]=((uint)hi[i]<< 16)|lo[i]
iTint32bit upsample (iTint16bit hi,
uTint16bit lo)
result[i]=((int)hi[i]<< 16)|lo[i]
uTint64bit upsample (uTint32bit hi,
uTint32bit lo)
result[i]=((ulonglong)hi[i]<< 32)
|lo[i]
iTint64bit upsample (iTint32bit hi,
uTint32bit lo)
result[i]=((longlong)hi[i]<< 32)|lo[i]
Tint32bit mad24 (Tint32bit x,
Tint32bit y, Tint32bit z)
Tint32bit mad24 (Tint32bit x,
Tint32bit y, Tint32bit z)
Multiply 24-bit integer values x, y, add 32-
bit integer result to 32-bit integer z
Tint32bit mul24 (Tint32bit x,
Tint32bit y)
Multiply 24-bit integer values x and y
www.khronos.org/sycl©2018 Khronos Group - Rev. 0818
SYCL 1.2.1 API Reference Guide Page 8
© 2018 Khronos Group. All rights reserved. SYCL is a trademark of the Khronos Group. The Khronos Group
is an industry consortium creating open standards for the authoring and acceleration of parallel computing,
graphics, dynamic media, and more on a wide variety of platforms and devices. See www.khronos.org to
learn more about the Khronos Group. See www.khronos.org/sycl to learn more about SYCL.
Relational built-in functions [4.13.7]
Relational functions are available in the namespace cl::sycl. In
all cases below, n may be one of 2, 3, 4, 8, or 16. If a type in the
functions below is shown with [xbit] in its name, this indicates
that the type is x bits in size.
Tint (geninteger in the spec) is type int[n], uint[n], unsigned int,
char, char[n], signed char, scharn, ucharn, unsigned
short[n], unsigned short, ushort[n], longn, ulongn, long
int, unsigned long int, long long int, longlongn, ulonglongn
unsigned long long int.
iTint (igeninteger) is type signed char, scharn, short[n], int[n],
long int, longn, long long int, longlongn.
uTint (ugeninteger) is type unsigned char, ucharn,
unsigned short, ushortn, unsigned int, uintn,
unsigned long int, ulongn, ulonglongn, unsigned long long int.
Ti (genint) is type int[n].
uTi (ugenint) is type unsigned int or uintn.
Tff (genfloatf) is type float[n].
Tfd (genfloatd) is type double[n].
T (gentype) is type float[n], double[n], or half[n], or any type
listed for above for Tint.
int any (iTint x) 1 if MSB in component of x is
set; else 0
int all (iTint x) 1 if MSB in all components of x are
set; else 0
T bitselect (T a, T b, T c) Eachbitofresultiscorresponding
bitofaifcorrespondingbitofcis0
Tint select (Tint a, Tint b, iTint c)
Tint select (Tint a, Tint b, uTint c)
Tff select (Tff a, Tff b, Ti c)
Tff select (Tff a, Tff b, uTi c)
Tfd select (Tfd a, Tfd b, iTint64bit c)
Tfd select (Tfd a, Tfd b,
uTint64bit c)
For each component of a vector
type, result[i] = if MSB of c[i] is set
? b[i] : a[i] For scalar type, result
= c? b : a
iTint32bit function (Tff x, Tff y)
iTint64bit function (Tfd x, Tfd y)
function may be one of isequal, isnotequal,
isgreater, isgreaterequal, isless, islessequal,
islessgreater, isordered, isunordered.
This format is used
for many relational
functions. Replace
function with the
function name.
iTint32bit function (Tff x)
iTint64bit function (Tfd x)
function may be one of isfinite, isinf, isnan,
isnormal, signbit.
Geometric Functions [4.13.6]
Geometric functions are available in the namespace cl::sycl.
Tgf (gengeofloat in the spec) is type float, float2, float3, float4.
Tgd (gengeodouble) is type double, double2, double3, double4.
float4 cross (float4 p0, float4 p1)
float3 cross (float3 p0, float3 p1)
double4 cross (double4 p0, double4 p1)
double3 cross (double3 p0, double3 p1)
Cross product
float distance (Tgf p0, Tgf p1)
double distance (Tgd p0, Tgd p1)
Vector distance
float dot (Tgf p0, Tgf p1)
double dot (Tgd p0, Tgd p1)
Dot product
float length (Tgf p)
double length (Tgd p)
Vector length
Tgf normalize (Tgf p)
Tgd normalize (Tgd p)
Normal vector length 1
float fast_distance (Tgf p0, Tgf p1) Vector distance
float fast_length (Tgf p) Vector length
Tgf fast_normalize (Tgf p) Normal vector length 1
Common functions[4.13.5]
Common funtions are available in the namespace cl::sycl. On the
host the vector types use the vec class and on an OpenCL device
use the corresponding OpenCL vector types. In all cases below, n
may be one of 2, 3, 4, 8, or 16.
Tf (genfloat in the spec) is type float[n], double[n], or half[n].
Tff (genfloatf) is type float[n].
Tfd (genfloatd) is type double[n].
Tf clamp (Tf x, Tf minval, Tf maxval);
Tff clamp (Tff x, float minval,float maxval);
Tfd clamp (Tfd x, double minval,
doublen maxval);
Clamp x to range given by
minval, maxval
Tf degrees (Tf radians); radians to degrees
Tf abs (Tf x, Tf y);
Tff abs (Tff x, float y);
Tfd abs (Tfd x, double y);
Max of x and y
Tf max (Tf x, Tf y);
Tff max (Tff x, float y);
Tfd max (Tfd x, double y);
Max of x and y
Tf min (Tf x, Tf y);
Tff min (Tff x, float y);
Tfd min (Tfd x, double y);
Min of x and y
Tf mix (Tf x, Tf y, Tf a);
Tff mix (Tff x, Tff y, float a);
Tfd mix (Tfd x, Tfd y, double a) ;
Linear blend of x and y
Tf radians (Tf degrees); degrees to radians
Tf step (Tf edge, Tf x);
Tff step (float edge, Tff x);
Tfd step (double edge, Tfd x);
0.0 if x < edge, else 1.0
Tf smoothstep (Tf edge0, Tf edge1, Tf x);
Tff smoothstep(float edge0,float edge1,
Tff x);
Tfd smoothstep(double edge0,
double edge1,Tfd x);
Step and interpolate
Tf sign (Tf x); Sign of x
Preprocessor directives and macros [6.6]
CL_SYCL_LANGUAGE_VERSION Integer version, e.g.: 121
__FAST_RELAXED_MATH__ -cl-fast-relaxed-math
__SYCL_DEVICE_ONLY__ defined when in device compilation
__SYCL_SINGLE_SOURCE__ produce host and device binary
SYCL_EXTERNAL
allow kernel external linkage
(optional)
Examples of how to invoke kernels
Example: single_task invoke [4.8.5.1]
SYCL provides a simple interface to enqueue a kernel that will be
sequentially executed on an OpenCL device.
myQueue.submit([&](handler & cgh) {
cgh.single_task<class kernel_name>(
[=] () {
// [kernel code]
}));
});
Examples: parallel_for invoke [4.8.5.2]
Example #1
Using a lambda function for a kernel invocation.
myQueue.submit([&](handler & cgh) {
auto acc = myBuffer.get_access<access::mode::write>(cgh);
cgh.parallel_for<class myKernel>
(range<1>(numWorkItems),
[=] (id<1> index) {
acc[index] = 42.0f;
});
});
Example #2
Allowing the runtime to choose the index space best matching
the range provided using information given the scheduled
interface instead of the general interface.
class MyKernel;
myQueue.submit([&](handler & cgh)
{
auto acc=myBuffer.get_access<access::mode::write>(cgh);
cgh.parallel_for<class myKernel>(range<1>(
numWorkItems), [=] (item<1> item)
{
size_t index = item.get_global();
acc[index] = 42.0f;
});
});
Example #3
Two examples of launching a kernel functor over a 3D grid. In
the first example, work_item IDs range from 0 to 2.
myQueue.submit([&](handler & cgh) {
cgh.parallel_for<class example_kernel1>(
range<3>(3,3,3), // global range
[=] (item<3> it) {
//[kernel code]
});
});
Example #4
Launching sixty-four work-items in a three-dimensional grid with
four in each dimension and divided into eight work-groups.
myQueue.submit([&](handler & cgh) {
cgh.parallel_for<class example_kernel>(
nd_range<3>(range<3>(4, 4, 4), range<3>(2, 2, 2)),
[=] (nd_item<3> item) {
//[kernel code]
// Work-group synchronization
item.barrier(access::fence_space::global_space);
//[kernel code]
});
});
Parallel For hierarchical invoke [4.8.5.3]
myQueue.submit([&](handler & cgh) {
// Issue 8 work-groups of 8 work-items each
cgh.parallel_for_work_group<class example_kernel>(
range<3>(2, 2, 2), range<3>(2, 2, 2),
[=](group<3> myGroup)
{
// [workgroup code]
int myLocal; // this variable is shared between workitems
// This variable will be instantiated for each work-item
// separately
private_memory<int> myPrivate(myGroup);
// Issue parallel sets of work-items each sized using the
// runtime default
myGroup.parallel_for_work_item([&](h_item<3> myItem)
{
// [work-item code]
myPrivate(myItem) = 0;
});
// Carry private value across loops
myGroup.parallel_for_work_item([&](h_item<3> myItem)
{
//[work-item code]
output[myItem.get_global()] = myPrivate(myItem);
});
//[workgroup code]
});
});

Más contenido relacionado

La actualidad más candente

The OpenCL C++ Wrapper 1.2 Reference Card
The OpenCL C++ Wrapper 1.2 Reference CardThe OpenCL C++ Wrapper 1.2 Reference Card
The OpenCL C++ Wrapper 1.2 Reference CardThe Khronos Group Inc.
 
Алексей Кутумов, Вектор с нуля
Алексей Кутумов, Вектор с нуляАлексей Кутумов, Вектор с нуля
Алексей Кутумов, Вектор с нуляSergey Platonov
 
Open CL For Speedup Workshop
Open CL For Speedup WorkshopOpen CL For Speedup Workshop
Open CL For Speedup WorkshopOfer Rosenberg
 

La actualidad más candente (20)

OpenGL SC 2.0 Quick Reference
OpenGL SC 2.0 Quick ReferenceOpenGL SC 2.0 Quick Reference
OpenGL SC 2.0 Quick Reference
 
OpenMAX AL 1.0 Reference Card
OpenMAX AL 1.0 Reference CardOpenMAX AL 1.0 Reference Card
OpenMAX AL 1.0 Reference Card
 
Vulkan 1.0 Quick Reference
Vulkan 1.0 Quick ReferenceVulkan 1.0 Quick Reference
Vulkan 1.0 Quick Reference
 
OpenCL 2.0 Reference Card
OpenCL 2.0 Reference CardOpenCL 2.0 Reference Card
OpenCL 2.0 Reference Card
 
OpenXR 1.0 Reference Guide
OpenXR 1.0 Reference GuideOpenXR 1.0 Reference Guide
OpenXR 1.0 Reference Guide
 
glTF 2.0 Reference Guide
glTF 2.0 Reference GuideglTF 2.0 Reference Guide
glTF 2.0 Reference Guide
 
WebGL 2.0 Reference Guide
WebGL 2.0 Reference GuideWebGL 2.0 Reference Guide
WebGL 2.0 Reference Guide
 
OpenGL 4.5 Reference Card
OpenGL 4.5 Reference CardOpenGL 4.5 Reference Card
OpenGL 4.5 Reference Card
 
OpenGL 4.6 Reference Guide
OpenGL 4.6 Reference GuideOpenGL 4.6 Reference Guide
OpenGL 4.6 Reference Guide
 
OpenCL 1.2 Reference Card
OpenCL 1.2 Reference CardOpenCL 1.2 Reference Card
OpenCL 1.2 Reference Card
 
OpenSL ES 1.1 Reference Card
OpenSL ES 1.1 Reference CardOpenSL ES 1.1 Reference Card
OpenSL ES 1.1 Reference Card
 
OpenGL 4.4 Reference Card
OpenGL 4.4 Reference CardOpenGL 4.4 Reference Card
OpenGL 4.4 Reference Card
 
OpenWF 1.0 Reference Card
OpenWF 1.0 Reference CardOpenWF 1.0 Reference Card
OpenWF 1.0 Reference Card
 
OpenVG 1.1 Reference Card
OpenVG 1.1 Reference Card OpenVG 1.1 Reference Card
OpenVG 1.1 Reference Card
 
OpenCL 1.1 Reference Card
OpenCL 1.1 Reference CardOpenCL 1.1 Reference Card
OpenCL 1.1 Reference Card
 
The OpenCL C++ Wrapper 1.2 Reference Card
The OpenCL C++ Wrapper 1.2 Reference CardThe OpenCL C++ Wrapper 1.2 Reference Card
The OpenCL C++ Wrapper 1.2 Reference Card
 
Алексей Кутумов, Вектор с нуля
Алексей Кутумов, Вектор с нуляАлексей Кутумов, Вектор с нуля
Алексей Кутумов, Вектор с нуля
 
OpenCL Reference Card
OpenCL Reference CardOpenCL Reference Card
OpenCL Reference Card
 
Open CL For Speedup Workshop
Open CL For Speedup WorkshopOpen CL For Speedup Workshop
Open CL For Speedup Workshop
 
EGL 1.4 Reference Card
EGL 1.4 Reference CardEGL 1.4 Reference Card
EGL 1.4 Reference Card
 

Similar a SYCL 1.2.1 Reference Card

Attributes & .NET components
Attributes & .NET componentsAttributes & .NET components
Attributes & .NET componentsBình Trọng Án
 
Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Vladimir Kochetkov
 
DevOps Enabling Your Team
DevOps Enabling Your TeamDevOps Enabling Your Team
DevOps Enabling Your TeamGR8Conf
 
Symfony2 from the Trenches
Symfony2 from the TrenchesSymfony2 from the Trenches
Symfony2 from the TrenchesJonathan Wage
 
Build resource server &amp; client for OCF Cloud (2018.8.30)
Build resource server &amp; client for OCF Cloud (2018.8.30)Build resource server &amp; client for OCF Cloud (2018.8.30)
Build resource server &amp; client for OCF Cloud (2018.8.30)남균 김
 
ITT 2014 - Peter Steinberger - Architecting Modular Codebases
ITT 2014 - Peter Steinberger - Architecting Modular CodebasesITT 2014 - Peter Steinberger - Architecting Modular Codebases
ITT 2014 - Peter Steinberger - Architecting Modular CodebasesIstanbul Tech Talks
 
Symfony2 - from the trenches
Symfony2 - from the trenchesSymfony2 - from the trenches
Symfony2 - from the trenchesLukas Smith
 
SynapseIndia mobile apps deployment framework internal architecture
SynapseIndia mobile apps deployment framework internal architectureSynapseIndia mobile apps deployment framework internal architecture
SynapseIndia mobile apps deployment framework internal architectureSynapseindiappsdevelopment
 
Monitoring InfluxEnterprise
Monitoring InfluxEnterpriseMonitoring InfluxEnterprise
Monitoring InfluxEnterpriseInfluxData
 
Samsung WebCL Prototype API
Samsung WebCL Prototype APISamsung WebCL Prototype API
Samsung WebCL Prototype APIRyo Jin
 
GHC Participant Training
GHC Participant TrainingGHC Participant Training
GHC Participant TrainingAidIQ
 
Metasploit Railguns presentation @ tcs hyderabad
Metasploit Railguns presentation @ tcs hyderabadMetasploit Railguns presentation @ tcs hyderabad
Metasploit Railguns presentation @ tcs hyderabadChaitanya krishna
 
Pro typescript.ch03.Object Orientation in TypeScript
Pro typescript.ch03.Object Orientation in TypeScriptPro typescript.ch03.Object Orientation in TypeScript
Pro typescript.ch03.Object Orientation in TypeScriptSeok-joon Yun
 
How to instantiate any view controller for free
How to instantiate any view controller for freeHow to instantiate any view controller for free
How to instantiate any view controller for freeBenotCaron
 

Similar a SYCL 1.2.1 Reference Card (20)

OpenCL C++ Wrapper 1.2 Reference Card
OpenCL C++ Wrapper 1.2 Reference CardOpenCL C++ Wrapper 1.2 Reference Card
OpenCL C++ Wrapper 1.2 Reference Card
 
Attributes & .NET components
Attributes & .NET componentsAttributes & .NET components
Attributes & .NET components
 
Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible! Hack an ASP .NET website? Hard, but possible!
Hack an ASP .NET website? Hard, but possible!
 
DevOps Enabling Your Team
DevOps Enabling Your TeamDevOps Enabling Your Team
DevOps Enabling Your Team
 
Symfony2 from the Trenches
Symfony2 from the TrenchesSymfony2 from the Trenches
Symfony2 from the Trenches
 
Build resource server &amp; client for OCF Cloud (2018.8.30)
Build resource server &amp; client for OCF Cloud (2018.8.30)Build resource server &amp; client for OCF Cloud (2018.8.30)
Build resource server &amp; client for OCF Cloud (2018.8.30)
 
ITT 2014 - Peter Steinberger - Architecting Modular Codebases
ITT 2014 - Peter Steinberger - Architecting Modular CodebasesITT 2014 - Peter Steinberger - Architecting Modular Codebases
ITT 2014 - Peter Steinberger - Architecting Modular Codebases
 
Symfony2 - from the trenches
Symfony2 - from the trenchesSymfony2 - from the trenches
Symfony2 - from the trenches
 
SynapseIndia mobile apps deployment framework internal architecture
SynapseIndia mobile apps deployment framework internal architectureSynapseIndia mobile apps deployment framework internal architecture
SynapseIndia mobile apps deployment framework internal architecture
 
Monitoring InfluxEnterprise
Monitoring InfluxEnterpriseMonitoring InfluxEnterprise
Monitoring InfluxEnterprise
 
Opencl cheet sheet
Opencl cheet sheetOpencl cheet sheet
Opencl cheet sheet
 
Hack ASP.NET website
Hack ASP.NET websiteHack ASP.NET website
Hack ASP.NET website
 
TO Hack an ASP .NET website?
TO Hack an ASP .NET website?  TO Hack an ASP .NET website?
TO Hack an ASP .NET website?
 
Reflection
ReflectionReflection
Reflection
 
srgoc
srgocsrgoc
srgoc
 
Samsung WebCL Prototype API
Samsung WebCL Prototype APISamsung WebCL Prototype API
Samsung WebCL Prototype API
 
GHC Participant Training
GHC Participant TrainingGHC Participant Training
GHC Participant Training
 
Metasploit Railguns presentation @ tcs hyderabad
Metasploit Railguns presentation @ tcs hyderabadMetasploit Railguns presentation @ tcs hyderabad
Metasploit Railguns presentation @ tcs hyderabad
 
Pro typescript.ch03.Object Orientation in TypeScript
Pro typescript.ch03.Object Orientation in TypeScriptPro typescript.ch03.Object Orientation in TypeScript
Pro typescript.ch03.Object Orientation in TypeScript
 
How to instantiate any view controller for free
How to instantiate any view controller for freeHow to instantiate any view controller for free
How to instantiate any view controller for free
 

Más de The Khronos Group Inc.

Vulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP TranslationVulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP TranslationThe Khronos Group Inc.
 
Vulkan Update Japan Virtual Open House Feb 2021
Vulkan Update Japan Virtual Open House Feb 2021Vulkan Update Japan Virtual Open House Feb 2021
Vulkan Update Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
Vulkan ML Japan Virtual Open House Feb 2021
Vulkan ML Japan Virtual Open House Feb 2021Vulkan ML Japan Virtual Open House Feb 2021
Vulkan ML Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
glTF Overview Japan Virtual Open House Feb 2021
glTF Overview Japan Virtual Open House Feb 2021glTF Overview Japan Virtual Open House Feb 2021
glTF Overview Japan Virtual Open House Feb 2021The Khronos Group Inc.
 
Khronos Overview Japan Virtual Open House Feb 2021
Khronos Overview Japan Virtual Open House Feb 2021Khronos Overview Japan Virtual Open House Feb 2021
Khronos Overview Japan Virtual Open House Feb 2021The Khronos Group Inc.
 

Más de The Khronos Group Inc. (16)

Vulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP TranslationVulkan Ray Tracing Update JP Translation
Vulkan Ray Tracing Update JP Translation
 
Vulkan ML JP Translation
Vulkan ML JP TranslationVulkan ML JP Translation
Vulkan ML JP Translation
 
OpenCL Overview JP Translation
OpenCL Overview JP TranslationOpenCL Overview JP Translation
OpenCL Overview JP Translation
 
glTF overview JP Translation
glTF overview JP TranslationglTF overview JP Translation
glTF overview JP Translation
 
Khronos Overview JP Translation
Khronos Overview JP TranslationKhronos Overview JP Translation
Khronos Overview JP Translation
 
Vulkan Update Japan Virtual Open House Feb 2021
Vulkan Update Japan Virtual Open House Feb 2021Vulkan Update Japan Virtual Open House Feb 2021
Vulkan Update Japan Virtual Open House Feb 2021
 
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
Vulkan Ray Tracing Update Japan Virtual Open House Feb 2021
 
OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021OpenCL Overview Japan Virtual Open House Feb 2021
OpenCL Overview Japan Virtual Open House Feb 2021
 
Vulkan ML Japan Virtual Open House Feb 2021
Vulkan ML Japan Virtual Open House Feb 2021Vulkan ML Japan Virtual Open House Feb 2021
Vulkan ML Japan Virtual Open House Feb 2021
 
glTF Overview Japan Virtual Open House Feb 2021
glTF Overview Japan Virtual Open House Feb 2021glTF Overview Japan Virtual Open House Feb 2021
glTF Overview Japan Virtual Open House Feb 2021
 
Khronos Overview Japan Virtual Open House Feb 2021
Khronos Overview Japan Virtual Open House Feb 2021Khronos Overview Japan Virtual Open House Feb 2021
Khronos Overview Japan Virtual Open House Feb 2021
 
SYCL 2020 Specification
SYCL 2020 SpecificationSYCL 2020 Specification
SYCL 2020 Specification
 
OpenVX 1.3 Reference Guide
OpenVX 1.3 Reference GuideOpenVX 1.3 Reference Guide
OpenVX 1.3 Reference Guide
 
OpenVX 1.2 Reference Guide
OpenVX 1.2 Reference GuideOpenVX 1.2 Reference Guide
OpenVX 1.2 Reference Guide
 
OpenVX 1.1 Reference Guide
OpenVX 1.1 Reference GuideOpenVX 1.1 Reference Guide
OpenVX 1.1 Reference Guide
 
OpenVX 1.0 Reference Guide
OpenVX 1.0 Reference GuideOpenVX 1.0 Reference Guide
OpenVX 1.0 Reference Guide
 

Último

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 

Último (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

SYCL 1.2.1 Reference Card

  • 1. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 1 Runtime: Device class [4.6.4] device(); explicit device(const device_selector &deviceSelector); explicit device(cl_device_id deviceId); cl_device_id get() const; platform get_platform() const; bool is_host() const; bool is_cpu() const; bool is_gpu() const; bool is_accelerator() const; template <info::device param> typename info::param_traits< info::device, param>::return_type get_info() const; bool has_extension(const string_class &extension) const; template <info::partition_property prop> vector_class<device> create_sub_devices( size_t nbSubDev) const; template <info::partition_property prop> vector_class<device> create_sub_devices( const vector_class<size_t> &counts) const; template <info::partition_property prop> vector_class<device> create_sub_devices( info::affinity_domain affinityDomain) const; static vector_class<device> get_devices( info::device_type deviceType = info::device_type::all); (Continued on next page) SYCL™ is a C++ programming model for OpenCL which builds on the underlying concepts, portability, and efficiency of OpenCL while adding much of the ease of use and flexibility of C++. [n.n] refers to sections in the SYCL 1.2.1 specification available at khronos.org/registry/sycl SYCL example [3.1.3] Below is an example of a typical SYCL application which schedules a job to run in parallel on any OpenCL accelerator. #include <CL/sycl.hpp> #include <iostream> int main() { using namespace cl::sycl; int data[1024]; // Allocates data to be worked on // Include all the SYCL work in a {} block to ensure all // SYCL tasks are completed before exiting the block. { // Create a queue to enqueue work to queue myQueue; // Wrap the data array variable in a buffer. buffer<size_t, 1> resultBuf { data, range<1> { 1024 } }; // Create a command group to // issue commands to the queue. myQueue.submit([&](handler & cgh) { // Request access to the buffer auto writeResult = resultBuf.get_access<access::mode::write>(cgh); // Enqueue a parallel_for task. cgh.parallel_for<class simple_test>(range<1>{ 1024 }, [=](id<1> idx) { writeResult[idx] = idx[0]; }); // End of the kernel function }); // End of the queue commands } // End of scope, so wait for the queued work to complete // Print result for (int i = 0; i < 1024; i++) { std::cout <<''data[''<< i << ''] = '' << data[i] << std::endl; } return 0; } Runtime: Context class [4.6.3] explicit context(async_handler asyncHandler = {}); context(const device &dev, async_handler asyncHandler = {}); context(const platform &plt, async_handler asyncHandler = {}); context(const vector_class<device> &deviceList, async_handler asyncHandler = {}); context(cl_context clContext, async_handler asyncHandler = {}); cl_context get() const; bool is_host() const; template <info::context param> typename info::param_traits <info::context, param>::return_type get_info() const; platform get_platform() const; vector_class<device> get_devices() const; Context queries using get_info(): Descriptor Return type info::context::reference_count cl_uint info::context::platform platform info::context::devices vector_class<device> Runtime: Platform class [4.6.2] platform(); explicit platform(cl_platform_id platformID); explicit platform(const device_selector &deviceSelector); cl_platform_id get() const; static vector_class<platform> get_platforms(); vector_class<device> get_devices( info::device_type = info::device_type::all) const; template <info::platform param> typename info::param_traits< info::platform, param>::return_type get_info() const; bool has_extension(const string_class & extension) const; bool is_host() const; Platform information descriptors: Platform descriptors Return type info::platform::profile string_class info::platform::version string_class info::platform::name string_class info::platform::vendor string_class info::platform::extensions vector_class<string_class> Header file SYCL programs must include the <CL/sycl.hpp> header file to provide all of the SYCL features. Namespace All SYCL names are defined in the cl::sycl namespace. Queue See queue class functions [4.6.5] on page 2 of this reference guide. Buffer See buffer class functions [4.7.2] on page 2 of this reference guide. Accessor See accessor class function [4.7.6] on page 3 of this reference guide. Handler See handler class functions [4.8.3] on page 5 of this reference guide. Scopes The kernel scope specifies a single kernel function that will be, or has been, compiled by a device compiler and executed on a device. See Invoking Kernels [4.8.5] in the spec. The command group scope specifies a unit of work which is comprised of a kernel function and accessors. The application scope specifies all other code outside of a command group scope. Runtime: Device selection class [4.6.1] device_selector(); device_selector(const device_selector &rhs); device_selector &operator=(const device_selector &rhs); virtual ~device_selector(); device select_device() const; virtual int operator()(const device &device) const; SYCL device selectors: Device selectors Description default_selector Devices selected by heuristics of the system gpu_selector Select devices according to device type info::device::device_type::gpu cpu_selector Select devices according to device type info::device::device_type::cpu host_selector Selects the SYCL host Runtime: Common interface [4.3.2 - 4.3.4] Member functions for by-value semantics T may be device, context, queue, program, kernel, event, buffer, image, sampler, accessor, or stream. T(const T &rhs); T(T &&rhs); T &operator=(const T &rhs); T &operator=(T &&rhs); ~T(); bool operator==(const T &rhs) const; bool operator!=(const T &rhs) const; Member functions for by-value semantics In the following functions, T may be id, range, item, nd_item, h_item, group, or nd_range. T(const T &rhs) = default; T(T &&rhs) = default; T &operator=(const T &rhs) = default; T &operator=(T &&rhs) = default; ~T() = default; bool operator==(const T &rhs) const; bool operator!=(const T &rhs) const; Properties interface [4.3.4.1] Some runtime classes provide this properties interface. class T { . . . template <typename propertyT> bool has_property() const; template <typename propertyT> propertyT get_property() const; . . . }; class property_list { public: template <typename... propertyTN> property_list(propertyTN... props); };
  • 2. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 2 Runtime: Device class (continued) Device queries using get_info(): Descriptor in info::device Return type device_type info::device_type vendor_id cl_uint max_compute_units cl_uint max_work_item_dimensions cl_uint max_work_item_sizes id<3> max_work_group_size size_t preferred_vector_width_char cl_uint preferred_vector_width_short preferred_vector_width_int preferred_vector_width_long preferred_vector_width_float preferred_vector_width_double preferred_vector_width_half native_vector_width_char cl_uint native_vector_width_short native_vector_width_int native_vector_width_long native_vector_width_float native_vector_width_double native_vector_width_half max_clock_frequency cl_uint address_bits cl_uint max_mem_alloc_size cl_ulong image_support bool Descriptor in info::device Return type max_read_image_args cl_uint max_write_image_args cl_uint image2d_max_width size_t image2d_max_height size_t image3d_max_width size_t image3d_max_height size_t image3d_max_depth size_t image_max_buffer_size size_t image_max_array_size size_t max_samplers cl_uint max_parameter_size size_t mem_base_addr_align cl_uint half_fp_config vector_class<info::fp_config> single_fp_config vector_class<info::fp_config> double_fp_config vector_class<info::fp_config> global_mem_cache_type info::global_mem_cache_type global_mem_cache_line_size cl_uint global_mem_cache_size cl_ulong global_mem_size cl_ulong max_constant_buffer_size cl_ulong max_constant_args cl_uint local_mem_type info::local_mem_type local_mem_size cl_ulong error_correction_support bool host_unified_memory bool profiling_timer_resolution size_t is_endian_little bool Descriptor in info::device Return type is_available bool is_compiler_available bool is_linker_available bool execution_capabilities vector_class <info::execution_capability> queue_profiling bool built_in_kernels vector_class<string_class> platform platform name string_class vendor string_class driver_version string_class profile string_class version string_class opencl_c_version string_class extensions vector_class<string_class> printf_buffer_size size_t preferred_interop_user_sync bool parent_device device partition_max_sub_devices cl_uint partition_properties vector_class <info:: partition_property> partition_affinity_domains vector_class <info::partition_affinity_ domain> partition_type_property info::partition_property partition_type_affinity_domain info::partition_affinity_domain reference_count cl_uint Runtime: Queue class [4.6.5] Property class members: property::queue::enable_profiling::enable_profiling(); explicit queue(const property_list &propList = {}); queue(const async_handler &asyncHandler, const property_list &propList = {}); queue(const device_selector &deviceSelector, const property_list &propList = {}); queue(const device_selector &deviceSelector, const async_handler &asyncHandler, const property_list &propList = {}); queue(const device &syclDevice, const property_list &propList = {}); queue(const device &syclDevice, const async_handler &asyncHandler, const property_list &propList = {}); queue(const context &syclContext, const device_selector &deviceSelector, const property_list &propList = {}); queue(const context &syclContext, const device_selector &deviceSelector, const async_handler &asyncHandler, const property_list &propList = {}); queue(cl_command_queue clQueue, const context &syclContext, const async_handler &asyncHandler = {}); cl_command_queue get() const; context get_context() const; device get_device() const; bool is_host() const; void wait(); void wait_and_throw(); void throw_asynchronous(); template <info::queue param> typename info::param_traits< info::queue, param>::return_type get_info() const; template <typename T> event submit(T cgf); template <typename T> event submit(T cgf, const queue &secondaryQueue); Queue queries using get_info(): Descriptor Return type info::queue::context context info::queue::device device info::queue::reference_count cl_uint Buffer class [4.7.2] Property class members: property::buffer::use_host_ptr::use_host_ptr(); property::buffer::use_mutex::use_mutex(mutex_class &mutexRef); property::buffer::context_bound::context_bound( context boundContext); mutex_class *property::buffer::use_mutex::get_mutex_ptr() const; context property::buffer::context_bound::get_context() const; Class Declaration template <typename T, int dimensions = 1, typename AllocatorT = cl::sycl::buffer_allocator> class buffer; Member Functions buffer(const range<dimensions> &bufferRange, const property_list &propList = {}); buffer(const range<dimensions> &bufferRange, AllocatorT allocator, const property_list &propList = {}); buffer(T *hostData, const range<dimensions> &bufferRange, const property_list &propList = {}); buffer(T *hostData, const range<dimensions> &bufferRange, AllocatorT allocator, const property_list &propList = {}); buffer(const T *hostData, const range<dimensions> &bufferRange, const property_list &propList = {}); buffer(const T *hostData, const range<dimensions> &bufferRange, AllocatorT allocator, const property_list &propList = {}); buffer(shared_ptr_class<T> &hostData, const range<dimensions> &bufferRange, AllocatorT allocator, const property_list &propList = {}); buffer(shared_ptr_class<T> &hostData, const range<dimensions> &bufferRange, const property_list &propList = {}); template <class InputIterator> buffer(InputIterator first, InputIterator last, AllocatorT allocator, const property_list &propList = {}); template <class InputIterator> buffer(InputIterator first, InputIterator last, const property_list &propList = {}); buffer(buffer<T, dimensions, AllocatorT> &b, const index<dimensions> &baseIndex, const range<dimensions> &subRange); buffer(cl_mem clMemObject, const context &syclContext, event availableEvent = {}); range<dimensions> get_range() const; size_t get_count() const; size_t get_size() const; AllocatorT get_allocator() const; template <access::mode mode, access::target target = access::target::global_buffer> accessor<T, dimensions, mode, target> get_access(handler &commandGroupHandler); template <access::mode mode>accessor<T, dimensions, mode, access::target::host_buffer> get_access(); template <access::mode mode, access::target target = access::target::global_buffer> accessor<T, dimensions, mode, target> get_access(handler &commandGroupHandler, range<dimensions> accessRange, id<dimensions> accessOffset = {}); template <access::mode mode> accessor<T, dimensions, mode, access::target::host_buffer> get_access(range<dimensions> accessRange, id<dimensions> accessOffset = {}); template <typename Destination = std::nullptr_t> void set_final_data(Destination finalData = std::nullptr); void set_write_back(bool flag = true); bool is_sub_buffer() const; template <typename reinterpretT, int reinterpretDim> buffer<reinterpretT, reinterpretDim, AllocatorT> reinterpret(range<reinterpretDim> reinterpretRange) const; Runtime: Event class [4.6.6] event() event(cl_event clEvent, const context &syclContext); cl_event get(); vector_class<event> get_wait_list(); void wait(); void wait_and_throw(); static void wait(const vector_class<event> &eventList); static void wait_and_throw( const vector_class<event> &eventList); template <info::event param>typename info::param_traits< info::event, param>::return_type get_info() const; template <info::event_profiling param> typename info::param_traits<info::event_profiling, param>::return_type get_profiling_info() const; Event queries using get_info() Descriptor Return type info::event::command_execution_status info::event::command_ status info::event::reference_count cl_uint info::event_profiling::command_submit cl_ulong info::event_profiling::command_start cl_ulong info::event_profiling::command_end cl_ulong
  • 3. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 3 Accessor class [4.7.6] Enums access::mode access::target read, write, read_write, discard_write, discard_read_write, atomic global_buffer, constant_buffer, local, image, host_buffer, host_image, image_array Class Declaration template <typename dataT, int dimensions, access::mode accessmode, access::target accessTarget = access::target::global_buffer, access::placeholder isPlaceholder = access::placeholder::false_t> class accessor; Members of class accessor for buffers accessor(buffer<dataT, 1> &bufferRef); accessor(buffer<dataT, 1> &bufferRef, handler &commandGroupHandlerRef); accessor(buffer<dataT, dimensions> &bufferRef); accessor(buffer<dataT, dimensions> &bufferRef, handler &commandGroupHandlerRef); accessor(buffer<dataT, dimensions> &bufferRef, range<dimensions> accessRange, id<dimensions> accessOffset = {}); accessor(buffer<dataT, dimensions> &bufferRef, handler &commandGroupHandlerRef, range<dimensions> accessRange, id<dimensions> accessOffset = {}); constexpr bool is_placeholder() const; size_t get_size() const; size_t get_count() const; range<dimensions> get_range() const; range<dimensions> get_offset() const; operator dataT &() const; dataT &operator[](id<dimensions> index) const; dataT &operator[](size_t index) const; operator dataT() const; dataT operator[](id<dimensions> index) const; dataT operator[](size_t index) const; operator atomic<dataT, access::address_space::global_space> () const; atomic<dataT, access::address_space::global_space> operator[](id<dimensions> index) const; atomic<dataT, access::address_space::global_space> operator[](size_t index) const; __unspecified__ &operator[](size_t index) const; dataT *get_pointer() const; global_ptr<dataT> get_pointer() const; constant_ptr<dataT> get_pointer() const; Members of class accessor for local access accessor(handler &commandGroupHandlerRef); accessor(range<dimensions> allocationSize, handler &commandGroupHandlerRef); size_t get_size() const; size_t get_count() const; operator dataT &() const; dataT &operator[](id<dimensions> index) const; dataT &operator[](size_t index) const; operator atomic<dataT, access::address_space::local_space> () const; atomic<dataT, access::address_space::local_space> operator[](id<dimensions> index) const; atomic<dataT, access::address_space::local_space> operator[](size_t index) const; __unspecified__ &operator[](size_t index) const; local_ptr<dataT> get_pointer() const; Members of class accessor for images template <typename AllocatorT> accessor(image<dimensions, AllocatorT> &imageRef); template <typename AllocatorT> accessor(image<dimensions, AllocatorT> &imageRef, handler &commandGroupHandlerRef); template <typename AllocatorT> accessor(image<dimensions + 1, AllocatorT> &imageRef, handler &commandGroupHandlerRef); accessor<dataT, dimensions, mode, image> operator[](size_t index) const; size_t get_size() const; size_t get_count() const; template <typename coordT> dataT read(const coordT &coords) const; template <typename coordT> dataT read(const coordT &coords, const sampler &smpl) const; template <typename coordT> void write(const coordT &coords, const dataT &color) const; Accessor Capabilities Buffer [Table 4.44] The data type must match that of the SYCL buffer. The dimensionality is 0, 1, 2, or 3. Access Target Accessor Type Access Mode Placeholder modes global_buffer Device read, write, read_write discard_write, discard_read_write, atomic false_t true_t constant_buffer Device read false_t true_t host_buffer Host read, write, read_write discard_write, discard_read_write false_t Local [Table 4.47] The data type must match that of the SYCL buffer. The dimensionality is 0, 1, 2, or 3. Access Target Accessor Type Access Mode Placeholder modes local Device read_write, atomic false_t (Continued on next page) Image class [4.7.3] Property class members: property::image::use_host_ptr::use_host_ptr(); property::image::use_mutex::use_mutex(mutex_class &mutexRef); property::image::context_bound::context_bound( context boundContext); mutex_class *property::image::use_mutex::get_mutex_ptr() const; context property::image::context_bound::get_context() const; Enums image_channel_order image_channel_type a, r, rx, rg, rgx, ra, rgb, rgbx, rgba, argb, bgra, intensity, luminance, abgr snorm_int8, snorm_int16, unorm_int8, unorm_int16, unorm_short_565, unorm_short_555, unorm_int_101010, signed_int8, signed_int16, signed_int32, unsigned_int8, unsigned_int16, unsigned_int32, fp16, fp32 Class Declaration template <int dimensions = 1, typename AllocatorT = cl::sycl::image_allocator> class image; Member Functions image(image_channel_order order, image_channel_type type, const range<dimensions> &range, const property_list &propList = {}); image(image_channel_order order, image_channel_type type, const range<dimensions> &range, AllocatorT allocator, const property_list &propList = {}); image(image_channel_order order, image_channel_type type, const range<dimensions> &range, const range<dimensions - 1> &pitch, const property_list &propList = {}); image(void *hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const range<dimensions - 1> &pitch, AllocatorT allocator, const property_list &propList = {}); image(void *hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const property_list &propList = {}); image(void *hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, AllocatorT allocator, const property_list &propList = {}); image(void *hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const range<dimensions - 1> &pitch, const property_list &propList = {}); image(void *hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const range<dimensions - 1> &pitch, AllocatorT allocator, const property_list &propList = {}); image(void *hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const property_list &propList = {}); image(void *hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, AllocatorT allocator, const property_list &propList = {}); image(shared_ptr_class<void> &hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const property_list &propList = {}); image(shared_ptr_class<void> &hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, AllocatorT allocator, const property_list &propList = {}); image(shared_ptr_class<void> &hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const range<dimensions - 1> &pitch, const property_list &propList = {}); image(shared_ptr_class<void> &hostPointer, image_channel_order order, image_channel_type type, const range<dimensions> &range, const range<dimensions - 1> &pitch, AllocatorT allocator, const property_list &propList = {}); image(cl_mem clMemObject, const context &syclContext, event availableEvent = {}); range<dimensions> get_range() const; range<dimensions-1> get_pitch() const; size_t get_count() const; size_t get_size() const; AllocatorT get_allocator() const; template <typename dataT, access::mode mode> accessor<dataT, dimensions, accessMode, access::target::image> get_access(handler &commandGroupHandler); template <typename dataT, access::mode mode> accessor<dataT, dimensions, accessMode, access::target::host_image> get_access(); template <typename Destination = std::nullptr_t> void set_final_data(Destination finalData = std::nullptr); void set_write_back(bool flag = true); Data management: Host allocation [4.7.1] The default allocator for memory objects is implementation defined, but users can supply their own allocator class. buffer<int, 1, UserDefinedAllocator<int> > b(d); Default allocators Allocators Description buffer_allocator Default buffer allocator used by the runtime, when no allocator is defined by the user. image_allocator Default image allocator used by the runtime for the image class, when no allocator is defined by the user. Must be a byte-sized allocator.
  • 4. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 4 Multi_ptr class [4.7.7] Class Declaration template <typename elementType, access::address_space addressSpace> class multi_ptr; Member Functions multi_ptr(); multi_ptr(const multi_ptr&); multi_ptr(multi_ptr&&); multi_ptr(pointer_t); multi_ptr(elementType*); multi_ptr(void*); multi_ptr(std::nullptr_t); ~multi_ptr(); multi_ptr &operator=(const multi_ptr&); multi_ptr &operator=(multi_ptr&&); multi_ptr &operator=(pointer_t); multi_ptr &operator=(elementType*); multi_ptr &operator=(void*); multi_ptr &operator=(std::nullptr_t); elementType &operator=() const; elementType *operator=() const; template <int dimensions, access::mode addressMode> multi_ptr(accessor<elementType, dimensions, addressMode, access::target::global_buffer>); template <int dimensions, access::mode addressMode> multi_ptr(accessor<elementType, dimensions, addressMode, access::target::local>); template <int dimensions, access::mode addressMode> multi_ptr(accessor<elementType, dimensions, addressMode, access::target::constant_buffer>); template <typename elementType, int dimensions, access::mode addressMode> multi_ptr(accessor<elementType, dimensions, addressMode, access::target::global_buffer>); template <typename elementType, int dimensions, access::mode addressMode> multi_ptr (accessor<elementType, dimensions, addressMode, access::target::local>); template <typename elementType, int dimensions, access::mode addressMode> multi_ptr(accessor<elementType, dimensions, addressMode, access::target::constant_buffer>); pointer_t get() const; operator void*() const; template <typename elementType> explicit operator multi_ptr<elementType, addressSpace>() const; Non-member Functions template <typename elementType, access::address_space addressSpace> multi_ptr<elementType, addressSpace> make_ptr(multi_ptr<elementType, addressSpace>::pointer_t); template <typename elementType, access::address_space addressSpace> multi_ptr<elementType, addressSpace> make_ptr(elementType*); template <typename elementType, access::address_space addressSpace> bool operatorOP(const multi_ptr<elementType, addressSpace> & lhs, const multi_ptr<elementType, addressSpace>& rhs); OP is one of ==, !=, <, >, <=, >= template <typename elementType, access::address_space addressSpace> bool operatorOP(const multi_ptr<elementType, addressSpace>& lhs, std::nullptr_t rhs); OP is one of ==, !=, <, >, <=, >= template <typename elementType, access::address_space addressSpace> bool operatorOP(std::nullptr_t lhs, const multi_ptr<elementType, addressSpace>& rhs); OP is one of ==, !=, <, >, <=, >= Template specialization aliases [4.7.7.2] template <typename elementType> using global_ptr = multi_ptr<elementType, access::address_space::global_space>; template <typename elementType> using local_ptr = multi_ptr<elementType, access::address_space::local_space>; template <typename elementType> using constant_ptr = multi_ptr<elementType, access::address_space::constant_space>; template <typename elementType> using private_ptr = multi_ptr<elementType, access::address_space::private_space>; void prefetch(size_t numElements) const; Ranges and index space identifiers[4.8.1] Class range [4.8.1.1] Class declaration template <size_t dimensions = 1> class range; Member functions range(size_t dim0); range(size_t dim0, size_t dim1); range(size_t dim0, size_t dim1, size_t dim2); size_t get(int dimension) const; size_t &operator[](int dimension); size_t size() const; range<dimensions> operatorOP( const range<dimensions> &rhs) const; range<dimensions> operatorOP(const size_t &rhs) const; range<dimensions> &operatorOP( const range<dimensions> &rhs); range<dimensions> &operatorOP(const size_t &rhs); Non-member function template <int dimensions> range<dimensions> operatorOP(const size_t &lhs, const range<dimensions> &rhs); Class nd_range [4.8.1.2] Class declaration template <size_t dimensions = 1> struct nd_range; Member functions nd_range(range<dimensions> globalSize, range<dimensions> localSize, id<dimensions> offset = id<dimensions>()); range<dimensions> get_global_range() const; range<dimensions> get_local_range() const; range<dimensions> get_group_range() const; id<dimensions> get_offset() const; Class id [4.8.1.3] Class declaration template <size_t dimensions = 1> struct id; Member functions id(); id(size_t dim0); id(size_t dim0, size_t dim1); id(size_t dim0, size_t dim1, size_t dim2); id(const range<dimensions> &range); id(const item<dimensions> &item); size_t get(int dimension) const; size_t &operator[](int dimension) const; operator size_t(); id<dimensions> operatorOP(const id<dimensions> &rhs) const; where OP is one of +, -, *, /, %, <<, >>, &, |, ˆ, &&, ||, <, >, <=, >= id<dimensions> operatorOP(const size_t &rhs) const; where OP is one of +, -, *, /, %, <<, >>, &, |, ˆ, &&, ||, <, >, <=, >= id<dimensions> &operatorOP(const id<dimensions> &rhs); where OP is one of +=, -=, *=, /=, %=, <<=, >>=, &=, |=, ˆ= id<dimensions> &operatorOP(const size_t &rhs); where OP is one of +=, -=, *=, /=, %=, <<=, >>=, &=, |=, ˆ= Non-member functions template <int dimensions> id<dimensions> operatorOP( const size_t &lhs, const id<dimensions> &rhs); where OP is one of +, -, *, /, %, <<, >>, &, |, ˆ Class item [4.8.1.4-5] Class declaration template <int dimensions = 1, bool with_offset = true> struct item; Member functions id<dimensions> get_id() const; size_t get_id(int dimension) const; size_t &operator[](int dimension); range<dimensions> get_range() const; id<dimensions> get_offset() const; operator item<dimensions, true>() const; size_t get_linear_id() const; Class nd_item [4.8.1.6] Class declaration template <size_t dimensions = 1> struct nd_item; Member functions id<dimensions> get_global_id() const; size_t get_global_id(int dimension) const; size_t get_global_linear_id() const; id<dimensions> get_local_id() const; size_t get_local_id(int dimension) const; size_t get_local_linear_id() const; group<dimensions> get_group() const; size_t get_group(int dimension) const; size_t get_group_linear_id() const; size_t get_num_range(int dimension) const; id<dimensions> get_num_range() const; range<dimensions> get_global_range() const; range<dimensions> get_local_range() const; id<dimensions> get_offset() const; nd_range<dimensions> get_nd_range() const; void barrier(access::fence_space accessSpace = access::fence_space::global_and_local) const; template <access::mode accessMode = access::mode::read_write> void mem_fence(access::fence_space accessSpace = access::fence_space::global_and_local) const; template <typename dataT> device_event async_work_group_copy( local_ptr<dataT> dest, global_ptr<dataT> src, size_t numElements) const; template <typename dataT> device_event async_work_group_copy( global_ptr<dataT> dest, local_ptr<dataT> src, size_t numElements) const; (Continued on next page) Sampler class [4.7.8] Enums addressing_mode sampler_filtering_mode mirrored_repeat repeat clamp_to_edge clamp none nearest linear coordinate_normalization_mode normalized unnormalized Class sampler members sampler(coordinate_normalization_mode normalizationMode, addressing_mode addressingMode, filtering_mode filteringMode); sampler(cl_sampler clSampler, const context &syclContext); addressing_mode get_addresssing_mode() const; filtering_mode get_filtering_mode() const; coordinate_normalization_mode get_coordinate_normalization_mode() const; Accessor class (continued) Image [Table 4.50] The data types are cl_int4, cl_uint4, cl_float4, and cl_half4, and the dimensionality is between 1 and 3 (inclusive). Access Target Accessor Type Dimensionality Access Mode Place- holder image Device 1, 2, or 3 read, write, discard_write, false_timage_array Device 1 or 2 host_image Host 1, 2, or 3
  • 5. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 5 Kernel class [4.8.7] kernel(cl_kernel clKernel, const context &syclContext); cl_kernel get() const; bool is_host() const; context get_context() const; program get_program() const; template <info::kernel param> typename info::param_traits< info::kernel, param>::return_type get_info() const; template <info::kernel_work_group param> typename info::param_traits<info::kernel_work_group, param>::return_type get_work_group_info(const device &dev) const; Kernel queries using get_info() Descriptor Return type info::kernel::function_name string_class info::kernel::num_args cl_uint info::kernel::context context info::kernel::program program info::kernel::reference_count cl_uint info::kernel::attributes string_class Ranges, index space identifiers (cont.) template <typename dataT> device_event async_work_group_copy( local_ptr<dataT> dest, global_ptr<dataT> src, size_t numElements, size_t srcStride) const; template <typename dataT> device_event async_work_group_copy( global_ptr<dataT> dest, local_ptr<dataT> src, size_t numElements, size_t destStride) const; template <typename... eventTN> void wait_for(eventTN... events) const; Class h_item [4.8.1.7] Class declaration template <int dimensions> struct h_item; Member functions item<dimensions, false> get_global() const; item<dimensions, false> get_local() const; item<dimensions, false> get_logical_local() const; item<dimensions, false> get_physical_local() const; range<dimensions> get_global_range() const; size_t get_global_range(int dimension) const; id<dimensions> get_global_id() const; size_t get_global_id(int dimension) const; range<dimensions> get_local_range() const; size_t get_local_range(int dimension) const; id<dimensions> get_local_id() const; size_t get_local_id(int dimension) const; range<dimensions> get_logical_local_range() const; size_t get_logical_local_range(int dimension) const; id<dimensions> get_logical_local_id() const; size_t get_logical_local_id(int dimension) const; range<dimensions> get_physical_local_range() const; size_t get_physical_local_range(int dimension) const; id<dimensions> get_physical_local_id() const; size_t get_physical_local_id(int dimension) const; Class group [4.8.1.8] Class declaration template <int dimensions = 1> struct group; Member functions id<dimensions> get_id() const; size_t get_id(int dimension) const; range<dimensions> get_global_range() const; size_t get_global_range(int dimension) const; range<dimensions> get_group_range() const; range<dimensions> get_local_range() const; size_t get_local_range(int dimension) const; size_t get_group_range(int dimension) const; size_t operator[](int dimension) const; size_t get_linear_id() const; template<typename workItemFunctionT> void parallel_for_work_item(workItemFunctionT func) const; template<typename workItemFunctionT> void parallel_for_work_item(range<dimensions> flexibleRange, workItemFunctionT func) const; template <access::mode accessMode = access::mode::read_write> void mem_fence(access::fence_space accessSpace = access::fence_space::global_and_local) const; template <typename dataT> device_event async_work_group_copy( local_ptr<dataT> dest, global_ptr<dataT> src, size_t numElements) const; template <typename dataT> device_event async_work_group_copy( global_ptr<dataT> dest, local_ptr<dataT> src, size_t numElements) const; template <typename dataT> device_event async_work_group_copy( local_ptr<dataT> dest, global_ptr<dataT> src, size_t numElements, size_t srcStride) const; template <typename dataT> device_event async_work_group_copy( global_ptr<dataT> dest, local_ptr<dataT> src, size_t numElements, size_t destStride) const; template <typename... eventTN> void wait_for(eventTN... events) const; class device_event member [4.8.1.9 - 4.8.1.10] void wait() const; Programclass[4.8.8] Enums program_state none compiled linked Member functions explicit program(const context &context); program(const context &context, vector_class<device> deviceList); program(vector_class<program> programList, string_class linkOptions =''''); program(const context &context, cl_program clProgram); cl_program get() const; bool is_host() const; template <typename kernelT> void compile_with_kernel_type( string_class compileOptions =''''); template <typename kernelT> void build_with_kernel_type( string_class buildOptions = ''''); void compile_with_source(string_class kernelSource, string_class compileOptions = ''''); void build_with_source(string_class kernelSource, string_class buildOptions = ''''); void link(string_class linkOptions =''''); template <typename kernelT> bool has_kernel<kernelT>() const; bool has_kernel(string_class kernelName) const; template <typename kernelT> kernel get_kernel<kernelT>() const; kernel get_kernel(string_class kernelName) const; template <info::program param> typename info::param_traits< info::program, param>::return_type get_info() const; vector_class<vector_class<char>> get_binaries() const; context get_context() const; vector_class<device> get_devices() const; string_class get_compile_options() const; string_class get_link_options() const; string_class get_build_options() const; program_state get_state() const; Information descriptors: Descriptor Return type info::program::reference_count cl_uint info::program::context cl_context info::program::devices vector_class<device> Command group class handler [4.8.3 - 6] template <typename dataT, int dimensions, access::mode accessMode, access::target accessTarget> void require(accessor<dataT, dimensions, accessMode, accessTarget, placeholder::true_t> acc); template <typename T> void set_arg(int argIndex, T && arg); template <typename... Ts> void set_args(Ts &&... args); template <typename KernelName, typename KernelType> void single_task(KernelType kernelFunc); template <typename KernelName, typename KernelType, int dimensions> void parallel_for(range<dimensions> numWorkItems, KernelType kernelFunc); template <typename KernelName, typename KernelType, int dimensions> void parallel_for(range<dimensions> numWorkItems, id<dimensions> workItemOffset, KernelType kernelFunc); template <typename KernelName, typename KernelType, int dimensions> void parallel_for(nd_range<dimensions> executionRange, KernelType kernelFunc); template <typename KernelName, typename WorkgroupFunctionType, int dimensions> void parallel_for_work_group( range<dimensions> numWorkGroups, WorkgroupFunctionType kernelFunc); template <typename KernelName, typename WorkgroupFunctionType, int dimensions> void parallel_for_work_group( range<dimensions> numWorkGroups, range<dimensions> workGroupSize, WorkgroupFunctionType kernelFunc); void single_task(kernel syclKernel); template <int dimensions> void parallel_for(range<dimensions> numWorkItems, kernel syclKernel); template <int dimensions> void parallel_for(range<dimensions> numWorkItems, id<dimensions> workItemOffset, kernel syclKernel); template <int dimensions> void parallel_for( nd_range<dimensions> ndRange, kernel syclKernel); template <typename T, int dim, access::mode mode, access::target tgt> void copy(accessor<T, dim, mode, tgt> src, shared_ptr_class<T> dest); template <typename T, int dim, access::mode mode, access::target tgt> void copy(shared_ptr_class<T> src, accessor<T, dim, mode, tgt> dest); template <typename T, int dim, access::mode mode, access::target tgt> void copy( accessor<T, dim, mode, tgt> src, T * dest); template <typename T, int dim, access::mode mode, access::target tgt> void copy( const T * src, accessor<T, dim, mode, tgt> dest); template <typename T, int dim, access::mode mode, access::target tgt> void copy( accessor<T, dim, mode, tgt> src, accessor<T, dim, mode, tgt> dest); template <typename T, int dim, access::mode mode, access::target tgt> void update_host( accessor<T, dim, mode, tgt> acc); template<typename T, int dim, access::mode mode, access::target tgt> void fill( accessor<T, dim, mode, tgt> dest, const T& src); class private_memory members [4.8.5.3] class private_memory { public: private_memory(const group<Dimensions> &); T &operator()(const h_item<Dimensions> &id); };
  • 6. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 6 Scalar data types [4.10.1. 6.5] SYCL supports the C++11 ISO standard data types. The following data type aliases are defined for OpenCL interoperability in the cl::sycl namespace: char, signed char, unsigned char, short int, unsigned short int, bool, int, unsigned int, long int, unsigned long int, long long int, unsignedlonglongint, size_t, float, double, cl_bool, cl_char, cl_uchar, cl_short, cl_ushort cl_int, cl_uint cl_long, cl_ulong cl_float cl_double cl_half half byte Synchronization & atomics [4.11] Enums fence_space : char memory_order : int local_space global_space global_and_local relaxed Class atomic Class declaration template <typename T, access::address_space addressSpace = access::address_space::global_space> class atomic; Member functions template <typename pointerT> atomic(multi_ptr<pointerT, addressSpace> ptr); void store(T operand, memory_order memoryOrder = memory_order::relaxed) volatile; T load(memory_order memoryOrder = memory_order::relaxed) const volatile; T exchange(T operand, memory_order memoryOrder = memory_order::relaxed) volatile; bool compare_exchange_strong(T& expected, T desired, memory_order successMemoryOrder = memory_order::relaxed, memory_order failMemoryOrder = memory_order::relaxed) volatile; T fetch_X(T operand, memory_order memoryOrder = memory_order::relaxed) volatile; where X is one of add, sub, and, or, xor, min, max Available only when T != float Non-member functions template <typename T, access::address_space addressSpace> T atomic_load(atomic<T, addressSpace> object, memory_order memoryOrder = memory_order::relaxed) volatile; template <typename T, access::address_space addressSpace> void atomic_store(atomic<T, addressSpace> object, T operand, memory_order memoryOrder = memory_order::relaxed) volatile; template <typename T, access::address_space addressSpace> T atomic_exchange(atomic<T, addressSpace> object, T operand, memory_order memoryOrder = memory_order::relaxed) volatile; template <typename T, access::address_space addressSpace> bool atomic_compare_exchange_strong( atomic<T, addressSpace> object, T &expected, T desired, memory_order successMemoryOrder = memory_order::relaxed, memory_order failMemoryOrder = memory_order::relaxed) volatile; template <typename T, access::address_space addressSpace> T atomic_fetch_X(atomic<T, addressSpace> object, T operand, memory_order memoryOrder = memory_order::relaxed) volatile; where X is one of add, sub, and, or, xor, min, max Exception classes [4.9.2] Class: exception const char *what() const; context get_context() const; bool has_context() const; cl_int get_cl_code() const; Class: exception_list size_t size() const; iterator begin() const; iterator end() const; Exception types Derived from cl::sycl::runtime_error class Derived from cl::sycl::device_error class kernel_error compile_program_error nd_range_error link_program_error accessor_error invalid_object_error event_error memory_allocation_error invalid_parameter_error platform_error profiling_error feature_not_supported Stream class [4.12] Enums stream_manipulator scientific, hex, oct, noshowbase, showbase, showpos, endl, noshowpos, fixed, hexfloat, defaultfloat, dec Member functions stream(size_t bufferSize, size_t maxStatementSize, handler& cgh); size_t get_size() const; size_t get_max_statement_size() const; Non-member functions template <typename T> const stream& operator<<(const stream& os, const T &rhs); __precision_manipulator__ setprecision(int precision); __width_manipulator__ setw(int width); Vector data types [4.10.2] RET’s dataT template parameter depends on vec’s dataT template parameter: dataT RET cl_char, cl_uchar cl_char cl_short, cl_ushort or cl_half cl_short cl_int, cl_uint or cl_float cl_int cl_long, cl_ulong or cl_double cl_long Class declaration template <typename dataT, int numElements> class vec; Members using element_type = dataT; vec(); explicit vec(const dataT &arg); template <typename... argTN> vec(const argTN&... args); vec(const vec<dataT, numElements> &rhs); #ifdef __SYCL_DEVICE_ONLY__ vec(vector_t openclVector); operator vector_t() const; #endif operator dataT() const; // Available only if numElements == 1 size_t get_count() const; size_t get_size() const; template <typename convertT, rounding_mode roundingMode> vec<convertT, numElements> convert() const; template <typename asT> asT as() const; The following XYZW members are available only when numElements <= 4. RGBA members are available only when numElements == 4. template<int... swizzleIndexes> __swizzled_vec__ swizzle() const; __swizzled_vec__ XYZW_ACCESS() const; __swizzled_vec__ RGBA_ACCESS() const; __swizzled_vec__ INDEX_ACCESS() const; #ifdef SYCL_SIMPLE_SWIZZLES __swizzled_vec__ XYZW_SWIZZLE() const; __swizzled_vec__ RGBA_SWIZZLE() const; #endif The following lo, hi, odd, and even members are available only when numElements > 1. __swizzled_vec__ lo() const; __swizzled_vec__ hi() const; __swizzled_vec__ odd() const; __swizzled_vec__ even() const; template <access::address_space addressSpace> void load(size_t offset, multi_ptr<dataT, addressSpace> ptr); template <access::address_space addressSpace> void store(size_t offset, multi_ptr<dataT, addressSpace> ptr) const; vec<dataT, numElements> &operator=(const vec<dataT, numElements> &rhs); vec<dataT, numElements> &operator=(const dataT &rhs); vec<RET, numElements> operator!(); vec<dataT, numElements> operator~(); // Not available for floating point types Member functions with OP variable For all types, OP may be: For non floating-point types, OP may be: vec<dataT, numElements> operatorOP(const vec<dataT, numElements> &rhs) const; +, -, *, / %, &, |, ^, <<, >> vec<dataT, numElements> operatorOP(const dataT &rhs) const; vec<dataT, numElements> &operatorOP(const vec<dataT, numElements> &rhs); +=, -=, *=, /= %=, &=, |=, ^=, <<=, >>=vec<dataT, numElements> &operatorOP(const dataT &rhs); vec<RET, numElements> operatorOP(const vec<dataT, numElements> &rhs) const; &&, ||, ==, !=, <, >, <=, >=vec<RET, numElements> operatorOP(const dataT &rhs) const; vec<dataT, numElements> &operatorOP(); ++, -- vec<dataT, numElements> operatorOP(int); Non-member functions Non-member functions with OP variable For all types, OP may be: For non floating-point types, OP may be: template <typename dataT, int numElements> vec<dataT, numElements> operatorOP(const dataT &lhs) const vec<dataT, numElements> &rhs); +, -, *, / %, &, |, ^, <<, >> template <typename dataT, int numElements> vec<RET, numElements> operatorOP(const dataT &lhs, const vec<dataT, numElements> &rhs); &&, ||, ==, !=, <, >, <=, >=
  • 7. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 7 Math functions[4.13.3] Math funtions are available in the namespace cl::sycl. In all cases below, n may be one of 2, 3, 4, 8, or 16. Tf (genfloat in the spec) is type float[n], double[n], or half[n]. Tff (genfloatf) is type float[n]. Tfd (genfloatd) is type double[n]. Th (genfloath) is type half[n]. sTf (sgenfloat) is type float, double, or half. Ti (genint) is type int[n]. uTi (ugenint) is type unsigned int or uintn. uTli (ugenlonginteger) is unsigned long int, ulonglongn, ulongn, unsigned long long int. N indicates native variants, available in cl::sycl::native. H indicates half variants, available in cl::sycl::halfprecision, implemented with a minimum of 10 bits of accuracy. Tf acos (Tf x) Arc cosine Tf acosh (Tf x) Inverse hyperbolic cosine Tf acospi (Tf x) acos (x) / π Tf asin (Tf x) Arc sine Tf asinh (Tf x) Inverse hyperbolic sine Tf asinpi (Tf x) asin (x) / π Tf atan (Tf y_over_x) Arc tangent Tf atan2 (Tf y, Tf x) Arc tangent of y / x Tf atanh (Tf x) Hyperbolic arc tangent Tf atanpi (Tf x) atan (x) / π Tf atan2pi (Tf y, Tf x) atan2 (y, x) / π Tf cbrt (Tf x) Cube root Tf ceil (Tf x) Round to integer toward + infinity Tf copysign (Tf x, Tf y) x with sign changed to sign of y Tf cos (Tf x) Tff cos (Tff x) N H Cosine Tf cosh (Tf x) Hyperbolic cosine Tf cospi (Tf x) cos (π x) Tff divide (Tff x, Tff y) N H x / y (Not available in cl::sycl.) Tf erfc (Tf x) Complementary error function Tf erf (Tf x) Calculates error function Tf exp (Tf x) Tff exp (Tff x) N H Exponential base e Tf exp2 (Tf x) Tff exp2 (Tff x) N H Exponential base 2 Tf exp10 (Tf x) Tff exp10 (Tff x) N H Exponential base 10 Tf expm1 (Tf x) ex -1.0 Tf fabs (Tf x) Absolute value Tf fdim (Tf x, Tf y) Positive difference between x and y Tf floor (Tf x) Round to integer toward infinity Tf fma (Tf a, Tf b, Tf c) Multiply and add, then round Tf fmax (Tf x, Tf y) Tf fmax (Tf x, sTf y) Return y if x < y, otherwise it returns x Tf fmin (Tf x, Tf y) Tf fmin (Tf x, sTf y) Return y if y < x, otherwise it returns x Tf fmod (Tf x, Tf y) Modulus. Returns x – y * trunc (x/y) Tf fract (Tf x, Tf *iptr) Fractional value in x Tf frexp (Tf x, Ti *exp) Extract mantissa and exponent Tf hypot (Tf x, Tf y) Square root of x2 + y2 Ti ilogb (Tf x) Return exponent as an integer value Tf ldexp (Tf x, Ti k) doublen ldexp (doublen x, int k) x * 2n Tf lgamma (Tf x) Log gamma function Tf lgamma_r (Tf x, Ti *signp) Log gamma function Tf log (Tf x) Tff log (Tff x) N H Natural logarithm Tf log2 (Tf x) Tff log2 (Tff x) N H Base 2 logarithm Tf log10 (Tf x) Tff log10 (Tff x) N H Base 10 logarithm Tf log1p (Tf x) ln (1.0 + x) Tf logb (Tf x) Return exponent as an integer value Tf mad (Tf a, Tf b, Tf c) Approximates a * b + c Tf maxmag (Tf x, Tf y) Maximum magnitude of x and y Tf minmag (Tf x, Tf y) Minimum magnitude of x and y Tf modf (Tf x, Tf *iptr) Decompose floating-point number Tff nan(uTi nancode) Tfd nan(uTli nancode) Quiet NaN (Return is scalar when nancode is scalar) Tf nextafter (Tf x, Tf y) Next representable floating-point value after x in the direction of y Tf pow (Tf x, Tf y) Compute x to the power of y Tf pown (Tf x, Tiy) Compute xy , where y is an integer Tf powr (Tf x, Tf y) Tff powr (Tff x, Tff y) N Tff powr (Tff x, Th y) H Compute xy , where x is >= 0 Tff recip (Tff x) N H 1 / x (Not available in cl::sycl.) Tf remainder (Tf x, Tf y) Floating point remainder Tf remquo (Tf x, Tf y, Ti *quo) Remainder and quotient Tf rint (Tf x) Round to nearest even integer Tf rootn (Tf x, Ti y) Compute x to the power of 1/y Tf round (Tf x) Integral value nearest to x rounding Tf rsqrt (Tf x) Tff rsqrt (Tff x) N H Inverse square root Tf sin (Tf x) Tff sin (Tff x) N H Sine Tf sincos (Tf x, Tf *cosval) Sine and cosine of x Tf sinh (Tf x) Hyperbolic sine Tf sinpi (Tf x) sin (π x) Tf sqrt (Tf x) Tff sqrt (Tff x) N H Square root Tf tan (Tf x) Tff tan (Tff x) N H Tangent Tf tanh (Tf x) Hyperbolic tangent Tf tanpi (Tf x) tan (π x) Tf tgamma (Tf x) Gamma function Tf trunc (Tf x) Round to integer toward zero Integer functions[4.13.4] Integer funtions are available in the namespace cl::sycl. In all cases below, n may be one of 2, 3, 4, 8, or 16. If a type in the functions below is shown with [xbit] in its name, this indicates that the type is x bits in size. Tint (geninteger in the spec) is type int[n], uint[n], unsigned int, char, char[n], signed char, scharn, ucharn, unsigned short[n], unsigned short, ushort[n], longn, ulongn, long int, unsigned long int, long long int, longlongn, ulonglongn unsigned long long int. uTint (ugeninteger) is type unsigned char, ucharn, unsigned short, ushortn, unsigned int, uintn, unsigned long int, ulongn, ulonglongn, unsigned long long int. iTint (igeninteger) is type signed char, scharn, short[n], int[n], long int, longn, long long int, longlongn. sTint (sgeninteger) is type char, signed char, unsigned char, short, unsigned short, int, unsigned int, long int, unsigned long int, long long int, unsigned long long int. uTint abs (Tint x) | x | uTint abs_diff (Tint x, Tint y) | x – y | without modulo overflow Tint add_sat (Tint x, Tint y) x + y and saturates the result Tint hadd (Tint x, Tint y) (x + y) >> 1 withoutmod.overflow Tint rhadd (Tint x, Tint y) (x + y + 1) >> 1 Tint clamp (Tint x, Tint min, Tint max) Tint clamp (Tint x, sTint min, sTint max) min(max(x, min), max) Tint clz (Tint x) number of leading 0-bits in x Tint mad_hi (Tint a, Tint b, Tint c) mul_hi(a, b) + c Tint mad_sat (Tint a, Tint b, Tint c) a * b + c and saturates the result Tint max (Tint x, Tint y) Tint max (Tint x, sTint y) y if x < y, otherwise it returns x Tint min (Tint x, Tint y) Tint min (Tint x, sTint y) y if y < x, otherwise it returns x Tint mul_hi (Tint x, Tint y) high half of the product of x and y Tint popcount (Tint x) Number of non-zero bits in x Tint rotate (Tint v, Tint i) result[indx] = v[indx] << i[indx] Tint sub_sat (Tint x, Tint y) x - y and saturates the result uTint16bit upsample (uTint8bit hi, uTint8bit lo) result[i]= ((ushort)hi[i]<< 8)|lo[i] iTint16bit upsample (iTint8bit hi, uTint8bit lo) result[i]=((short)hi[i]<< 8)|lo[i] uTint32bit upsample ( uTint16bit hi, uTint16bit lo) result[i]=((uint)hi[i]<< 16)|lo[i] iTint32bit upsample (iTint16bit hi, uTint16bit lo) result[i]=((int)hi[i]<< 16)|lo[i] uTint64bit upsample (uTint32bit hi, uTint32bit lo) result[i]=((ulonglong)hi[i]<< 32) |lo[i] iTint64bit upsample (iTint32bit hi, uTint32bit lo) result[i]=((longlong)hi[i]<< 32)|lo[i] Tint32bit mad24 (Tint32bit x, Tint32bit y, Tint32bit z) Tint32bit mad24 (Tint32bit x, Tint32bit y, Tint32bit z) Multiply 24-bit integer values x, y, add 32- bit integer result to 32-bit integer z Tint32bit mul24 (Tint32bit x, Tint32bit y) Multiply 24-bit integer values x and y
  • 8. www.khronos.org/sycl©2018 Khronos Group - Rev. 0818 SYCL 1.2.1 API Reference Guide Page 8 © 2018 Khronos Group. All rights reserved. SYCL is a trademark of the Khronos Group. The Khronos Group is an industry consortium creating open standards for the authoring and acceleration of parallel computing, graphics, dynamic media, and more on a wide variety of platforms and devices. See www.khronos.org to learn more about the Khronos Group. See www.khronos.org/sycl to learn more about SYCL. Relational built-in functions [4.13.7] Relational functions are available in the namespace cl::sycl. In all cases below, n may be one of 2, 3, 4, 8, or 16. If a type in the functions below is shown with [xbit] in its name, this indicates that the type is x bits in size. Tint (geninteger in the spec) is type int[n], uint[n], unsigned int, char, char[n], signed char, scharn, ucharn, unsigned short[n], unsigned short, ushort[n], longn, ulongn, long int, unsigned long int, long long int, longlongn, ulonglongn unsigned long long int. iTint (igeninteger) is type signed char, scharn, short[n], int[n], long int, longn, long long int, longlongn. uTint (ugeninteger) is type unsigned char, ucharn, unsigned short, ushortn, unsigned int, uintn, unsigned long int, ulongn, ulonglongn, unsigned long long int. Ti (genint) is type int[n]. uTi (ugenint) is type unsigned int or uintn. Tff (genfloatf) is type float[n]. Tfd (genfloatd) is type double[n]. T (gentype) is type float[n], double[n], or half[n], or any type listed for above for Tint. int any (iTint x) 1 if MSB in component of x is set; else 0 int all (iTint x) 1 if MSB in all components of x are set; else 0 T bitselect (T a, T b, T c) Eachbitofresultiscorresponding bitofaifcorrespondingbitofcis0 Tint select (Tint a, Tint b, iTint c) Tint select (Tint a, Tint b, uTint c) Tff select (Tff a, Tff b, Ti c) Tff select (Tff a, Tff b, uTi c) Tfd select (Tfd a, Tfd b, iTint64bit c) Tfd select (Tfd a, Tfd b, uTint64bit c) For each component of a vector type, result[i] = if MSB of c[i] is set ? b[i] : a[i] For scalar type, result = c? b : a iTint32bit function (Tff x, Tff y) iTint64bit function (Tfd x, Tfd y) function may be one of isequal, isnotequal, isgreater, isgreaterequal, isless, islessequal, islessgreater, isordered, isunordered. This format is used for many relational functions. Replace function with the function name. iTint32bit function (Tff x) iTint64bit function (Tfd x) function may be one of isfinite, isinf, isnan, isnormal, signbit. Geometric Functions [4.13.6] Geometric functions are available in the namespace cl::sycl. Tgf (gengeofloat in the spec) is type float, float2, float3, float4. Tgd (gengeodouble) is type double, double2, double3, double4. float4 cross (float4 p0, float4 p1) float3 cross (float3 p0, float3 p1) double4 cross (double4 p0, double4 p1) double3 cross (double3 p0, double3 p1) Cross product float distance (Tgf p0, Tgf p1) double distance (Tgd p0, Tgd p1) Vector distance float dot (Tgf p0, Tgf p1) double dot (Tgd p0, Tgd p1) Dot product float length (Tgf p) double length (Tgd p) Vector length Tgf normalize (Tgf p) Tgd normalize (Tgd p) Normal vector length 1 float fast_distance (Tgf p0, Tgf p1) Vector distance float fast_length (Tgf p) Vector length Tgf fast_normalize (Tgf p) Normal vector length 1 Common functions[4.13.5] Common funtions are available in the namespace cl::sycl. On the host the vector types use the vec class and on an OpenCL device use the corresponding OpenCL vector types. In all cases below, n may be one of 2, 3, 4, 8, or 16. Tf (genfloat in the spec) is type float[n], double[n], or half[n]. Tff (genfloatf) is type float[n]. Tfd (genfloatd) is type double[n]. Tf clamp (Tf x, Tf minval, Tf maxval); Tff clamp (Tff x, float minval,float maxval); Tfd clamp (Tfd x, double minval, doublen maxval); Clamp x to range given by minval, maxval Tf degrees (Tf radians); radians to degrees Tf abs (Tf x, Tf y); Tff abs (Tff x, float y); Tfd abs (Tfd x, double y); Max of x and y Tf max (Tf x, Tf y); Tff max (Tff x, float y); Tfd max (Tfd x, double y); Max of x and y Tf min (Tf x, Tf y); Tff min (Tff x, float y); Tfd min (Tfd x, double y); Min of x and y Tf mix (Tf x, Tf y, Tf a); Tff mix (Tff x, Tff y, float a); Tfd mix (Tfd x, Tfd y, double a) ; Linear blend of x and y Tf radians (Tf degrees); degrees to radians Tf step (Tf edge, Tf x); Tff step (float edge, Tff x); Tfd step (double edge, Tfd x); 0.0 if x < edge, else 1.0 Tf smoothstep (Tf edge0, Tf edge1, Tf x); Tff smoothstep(float edge0,float edge1, Tff x); Tfd smoothstep(double edge0, double edge1,Tfd x); Step and interpolate Tf sign (Tf x); Sign of x Preprocessor directives and macros [6.6] CL_SYCL_LANGUAGE_VERSION Integer version, e.g.: 121 __FAST_RELAXED_MATH__ -cl-fast-relaxed-math __SYCL_DEVICE_ONLY__ defined when in device compilation __SYCL_SINGLE_SOURCE__ produce host and device binary SYCL_EXTERNAL allow kernel external linkage (optional) Examples of how to invoke kernels Example: single_task invoke [4.8.5.1] SYCL provides a simple interface to enqueue a kernel that will be sequentially executed on an OpenCL device. myQueue.submit([&](handler & cgh) { cgh.single_task<class kernel_name>( [=] () { // [kernel code] })); }); Examples: parallel_for invoke [4.8.5.2] Example #1 Using a lambda function for a kernel invocation. myQueue.submit([&](handler & cgh) { auto acc = myBuffer.get_access<access::mode::write>(cgh); cgh.parallel_for<class myKernel> (range<1>(numWorkItems), [=] (id<1> index) { acc[index] = 42.0f; }); }); Example #2 Allowing the runtime to choose the index space best matching the range provided using information given the scheduled interface instead of the general interface. class MyKernel; myQueue.submit([&](handler & cgh) { auto acc=myBuffer.get_access<access::mode::write>(cgh); cgh.parallel_for<class myKernel>(range<1>( numWorkItems), [=] (item<1> item) { size_t index = item.get_global(); acc[index] = 42.0f; }); }); Example #3 Two examples of launching a kernel functor over a 3D grid. In the first example, work_item IDs range from 0 to 2. myQueue.submit([&](handler & cgh) { cgh.parallel_for<class example_kernel1>( range<3>(3,3,3), // global range [=] (item<3> it) { //[kernel code] }); }); Example #4 Launching sixty-four work-items in a three-dimensional grid with four in each dimension and divided into eight work-groups. myQueue.submit([&](handler & cgh) { cgh.parallel_for<class example_kernel>( nd_range<3>(range<3>(4, 4, 4), range<3>(2, 2, 2)), [=] (nd_item<3> item) { //[kernel code] // Work-group synchronization item.barrier(access::fence_space::global_space); //[kernel code] }); }); Parallel For hierarchical invoke [4.8.5.3] myQueue.submit([&](handler & cgh) { // Issue 8 work-groups of 8 work-items each cgh.parallel_for_work_group<class example_kernel>( range<3>(2, 2, 2), range<3>(2, 2, 2), [=](group<3> myGroup) { // [workgroup code] int myLocal; // this variable is shared between workitems // This variable will be instantiated for each work-item // separately private_memory<int> myPrivate(myGroup); // Issue parallel sets of work-items each sized using the // runtime default myGroup.parallel_for_work_item([&](h_item<3> myItem) { // [work-item code] myPrivate(myItem) = 0; }); // Carry private value across loops myGroup.parallel_for_work_item([&](h_item<3> myItem) { //[work-item code] output[myItem.get_global()] = myPrivate(myItem); }); //[workgroup code] }); });