Sub-functions#
A @qd.kernel can call other functions, as long as those functions are annotated with @qd.func.
qd.func#
@qd.func is the standard annotation for a function that can be called from a kernel. @qd.func functions can also call other @qd.func functions.
@qd.func is inlined into the calling kernel at compile time. This means:
There is no function call overhead
The compiler can optimize across the call boundary
Recursive calls are not supported
@qd.func
def add(a: qd.i32, b: qd.i32) -> qd.i32:
return a + b
@qd.kernel
def compute(a: qd.Template) -> None:
for i in range(10):
a[i] = add(i, 1)
Passing fields and ndarrays#
Fields and ndarrays can be passed to @qd.func functions:
@qd.func
def increment(arr: qd.Template, idx: qd.i32) -> None:
arr[idx] += 1
@qd.kernel
def compute(a: qd.Template) -> None:
for i in range(10):
increment(a, i)
Restricting a func to the top level (requires_top_level=True)#
Experimental. requires_top_level is an experimental feature and its behavior or API may change in a future release.
Some qd.func contain for-loops that are assumed and intended to be top-level for-loops, that each become separate offloaded tasks, and ultimately separate device kernels. If such qd.func’s are placed inside other for-loops, the qd.func will no longer generate the structure of offloaded tasks and device kernels assumed, and might either run very slowly, or crash, or give incorrect results.
To enforce that a qd.func can only be used at top-level, a qd.func maybe be annotated with qd.func(requires_top_level=True). This will throws QuadrantsSyntaxError at compile time if the qd.func is not called from top level.
@qd.func(requires_top_level=True)
def op(arr: qd.Template, n: qd.i32) -> None:
for i in range(n): # one of several top-level phase loops
arr[i] = arr[i] + 1
The check is purely compile-time — it adds no runtime or GPU cost, and a correctly placed call compiles to exactly the same code as an unmarked func.
What counts as top level (allowed):
Directly in the kernel body.
Inside a
qd.static(...)loop — these are unrolled at compile time, so the calls land at top level (see static).Directly inside a
while qd.graph_do_while(...):body (see graphs).
What is rejected: nesting the call inside a runtime for, if, or while.
@qd.kernel
def good(arr: qd.Template, n: qd.i32) -> None:
op(arr, n) # OK: top level
@qd.kernel
def also_good(arr: qd.Template, n: qd.i32) -> None:
for _ in qd.static(range(2)):
op(arr, n) # OK: qd.static is compile-time
@qd.kernel
def bad(arr: qd.Template, n: qd.i32, flag: qd.i32) -> None:
if flag > 0:
op(arr, n) # QuadrantsSyntaxError raised at compile time
The check applies transitively. Because @qd.func bodies are inlined into the caller, reaching a requires_top_level=True func through an intermediate ordinary @qd.func is treated exactly as if the call were written inline at that point. So calling it through a helper that sits at the top level is allowed, while calling it through a helper that is itself nested in a runtime for / if / while is rejected.
@qd.func
def helper(arr: qd.Template, n: qd.i32) -> None:
op(arr, n) # `op` is requires_top_level=True
@qd.kernel
def good_helper(arr: qd.Template, n: qd.i32) -> None:
helper(arr, n) # OK: helper is at the kernel top level
@qd.kernel
def bad_helper(arr: qd.Template, n: qd.i32, flag: qd.i32) -> None:
if flag > 0:
helper(arr, n) # QuadrantsSyntaxError: op is transitively nested in a runtime if