oclRun is used to execute code that has been compiled for
OpenCL.
oclRun(kernel, size, ..., dim = size)

kernel: Kernel object as obtained from oclSimpleKernel
size: Length of the output vector
...: Additional arguments passed to the kernel
dim: Numeric vector describing the global work
dimensions, i.e., the index range that the kernel will be run
on. The kernel can use get_global_id(n) to obtain the
(n + 1)-th dimension index and get_global_size(n) to
get the size of that dimension. The OpenCL standard supports only up to
three dimensions; you can use index vectors as arguments if more
dimensions are required. Note that dim is not necessarily the
dimension of the result, although it can be.
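To illustrate how dim relates to get_global_id(n), the following is a minimal sketch of a two-dimensional run. The kernel name outer_sum, its extra nx argument, and the variables x, y and m are hypothetical names chosen for this illustration; only the calls already used elsewhere on this page (oclPlatforms, oclContext, oclSimpleKernel, as.clBuffer, oclRun) are assumed to exist with the signatures shown there.

```r
library(OpenCL)

x <- as.numeric(1:4)
y <- as.numeric(1:3)

## Only proceed if this machine has at least one OpenCL platform
if (length(oclPlatforms())) {
  ctx <- oclContext(precision = "single")

  ## Hypothetical kernel: output[i + j * nx] = x[i] + y[j]
  code <- c("
__kernel void outer_sum(
  __global numeric* output,
  const unsigned int count,
  __global numeric* x,
  __global numeric* y,
  const int nx)
{
  size_t i = get_global_id(0);   /* index along the first dimension  */
  size_t j = get_global_id(1);   /* index along the second dimension */
  size_t idx = i + j * (size_t) nx;
  if (idx < count)
    output[idx] = x[i] + y[j];
}")
  k <- oclSimpleKernel(ctx, "outer_sum", code)

  ## size is the total output length; dim gives the 2-D index range
  res <- oclRun(k, length(x) * length(y),
                as.clBuffer(x, ctx, "numeric"),
                as.clBuffer(y, ctx, "numeric"),
                length(x),                      # integer scalar -> int nx
                dim = c(length(x), length(y)))

  ## Reshape the flat result; it should be close to outer(x, y, "+")
  m <- matrix(as.numeric(res), nrow = length(x))
}
```

Here size is the product of the two work dimensions, which is a choice made by this sketch rather than a requirement: as noted above, dim need not match the shape of the result.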
oclRun pushes kernel arguments, executes the kernel and
retrieves the result. The kernel is expected to take a write-only
__global double * or __global float *
as its first argument, which is used for the
result, and a const unsigned int as its second argument, denoting the result
length. All other arguments are assumed to be read-only and will be
filled according to the ... values.
These can either be OpenCL buffers as generated by clBuffer for
pointer arguments, or scalar values (vectors of length one) for scalar
arguments. Only integer (int) and numeric (double or
float) scalars and OpenCL buffers are supported as kernel arguments.
The caller is responsible for matching the argument types according to the
kernel in a way similar to .C and .Call.
Note that the kernel must match the input types as well, so typically
as.clBuffer() should include the mode (e.g., "numeric")
to match the kernel and/or explicit as.numeric() coercion should
be used.
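The need for explicit coercion follows from how base R stores values, which the short sketch below illustrates without requiring OpenCL. The variable name x is only for illustration.

```r
x <- 1:10                 # the `:` operator yields an integer vector
typeof(x)                 # "integer"
typeof(as.numeric(x))     # "double"
typeof(x / 2)             # "double", since division promotes to double

## The same applies to scalars passed through `...`
typeof(2)                 # "double"
typeof(2L)                # "integer"
```

A value such as 1:10 would therefore not match a kernel argument declared with a floating-point type unless it is coerced first, whereas 1:10/2 is already double.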
The resulting buffer of length size.
library(OpenCL)
## Only proceed if this machine has at least one OpenCL platform
if (length(oclPlatforms())) {
  ctx = oclContext(precision="single")
  code = c("
__kernel void dnorm(
  __global numeric* output,
  const unsigned int count,
  __global numeric* input,
  const numeric mu, const numeric sigma)
{
  size_t i = get_global_id(0);
  if(i < count)
    output[i] = exp((numeric) (-0.5 * ((input[i] - mu) / sigma) * ((input[i] - mu) / sigma)))
      / (sigma * sqrt((numeric) (2 * 3.14159265358979323846264338327950288 )) );
}")
  k.dnorm <- oclSimpleKernel(ctx, "dnorm", code)
  f <- function(x, mu=0, sigma=1)
    as.numeric(oclRun(k.dnorm, length(x), as.clBuffer(x, ctx, "numeric"), mu, sigma))
  ## expect differences since the above uses single-precision but
  ## it should be close enough
  f(1:10/2) - dnorm(1:10/2)
  ## does the device support double-precision?
  if (any("cl_khr_fp64" == oclInfo(attributes(ctx)$device)$exts)) {
    k.dnorm <- oclSimpleKernel(ctx, "dnorm", code, "double")
    f <- function(x, mu=0, sigma=1)
      as.numeric(oclRun(k.dnorm, length(x), as.clBuffer(x, ctx, "double"), mu, sigma))
    ## probably not identical, but close...
    f(1:10/2) - dnorm(1:10/2)
  } else cat("\nSorry, your device doesn't support double-precision\n")
  ## Note that in practice you can use precision="best" in the first
  ## example which will pick "double" on devices that support it and
  ## "single" elsewhere
}
#> Warning: No OpenCL platforms found - try adding Installable Client Drivers (ICD) for your hardware.