Building the CUDA graph conditional fatbin
The graph_do_while feature uses a tiny CUDA kernel that calls
cudaGraphSetConditional (a device-side function from NVIDIA’s
libcudadevrt.a) to control CUDA graph conditional while nodes. These
conditional nodes require SM 9.0+ (Hopper or later); on older GPUs,
graph_do_while falls back to a host-side loop automatically.
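The fallback decision amounts to a compute-capability comparison. The following is a minimal sketch, assuming the capability is available as a (major, minor) pair; both function names are hypothetical and are not part of the Quadrants API.

```python
def supports_conditional_while(major: int, minor: int) -> bool:
    """Return True if the GPU meets the SM 9.0+ requirement stated above."""
    # Tuple comparison orders by major version first, then by minor version.
    return (major, minor) >= (9, 0)


def choose_loop_strategy(major: int, minor: int) -> str:
    """Hypothetical dispatch between the device-side loop and the host-side fallback."""
    if supports_conditional_while(major, minor):
        return "graph_conditional"
    return "host_loop"
```

For example, a device with compute capability 8.6 would take the host-side path under this sketch, while 9.0 and later would use the conditional node.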
There are three distinct phases:
1. Fatbin generation (rare, manual): A developer runs scripts/build_condition_kernel_fatbin.py, which compiles the kernel and device-links it with libcudadevrt.a to resolve cudaGraphSetConditional. The output is a self-contained fatbin, committed to git as a C header. Requires nvcc and the CUDA toolkit.
2. Quadrants build (CI / developers): The C header is #included as a plain byte array. No CUDA toolkit needed.
3. Runtime (end users): The fatbin is loaded via cuModuleLoadData. No CUDA toolkit needed.
This page documents phase 1: regenerating the pre-built fatbin.
When to regenerate
You only need to regenerate the fatbin if:
- The condition kernel source (quadrants/runtime/cuda/graph_do_while_cond.cu) changes.
- You need to add support for a new SM architecture.
Prerequisites
- CUDA toolkit with nvcc (13.0 or later, required for SM 110 support; earlier toolkits will fail with "Unsupported gpu architecture 'compute_110'").
- The nvcc binary must be on your PATH, or CUDA_HOME must be set.
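One way to satisfy the second prerequisite is to look up nvcc before running anything. The sketch below is an assumption about how such a lookup could work, not a copy of what the script does: it checks PATH first and then CUDA_HOME/bin, and the function name find_nvcc is hypothetical.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def find_nvcc(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Locate nvcc on PATH first, then under CUDA_HOME/bin. Return None if absent."""
    env = os.environ if env is None else env

    # shutil.which searches the directories listed in the given PATH string.
    found = shutil.which("nvcc", path=env.get("PATH"))
    if found:
        return found

    # Fall back to CUDA_HOME (lookup order is an assumption of this sketch).
    cuda_home = env.get("CUDA_HOME")
    if cuda_home:
        exe = "nvcc.exe" if os.name == "nt" else "nvcc"
        candidate = Path(cuda_home) / "bin" / exe
        if candidate.is_file():
            return str(candidate)
    return None
```

A None result means neither location yielded a compiler, which is the point at which a build script would typically stop with an error.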
Regenerating
Run the script from the repo root:
python scripts/build_condition_kernel_fatbin.py
This will:
1. Compile quadrants/runtime/cuda/graph_do_while_cond.cu with relocatable device code for each target SM architecture.
2. Device-link the result with libcudadevrt.a to resolve the cudaGraphSetConditional extern.
3. Write the fatbin as a C byte array to quadrants/runtime/cuda/graph_do_while_cond_fatbin.h.
After regenerating, commit the updated header. Quadrants must be rebuilt to pick up the new fatbin.
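To make the final step concrete, the sketch below renders raw bytes as a C array definition that can be #included. It illustrates the general technique only: the function name, the default symbol name kCondKernelFatbin, and the layout of the output are hypothetical and may differ from what the script actually emits.

```python
def fatbin_to_c_header(data: bytes, symbol: str = "kCondKernelFatbin",
                       per_line: int = 12) -> str:
    """Render raw bytes as a C unsigned char array definition."""
    lines = [
        "// Generated file. Do not edit by hand.",
        "#pragma once",
        f"static const unsigned char {symbol}[] = {{",
    ]
    # Emit the bytes as hex literals, a fixed number per line for readable diffs.
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    lines.append("};")
    return "\n".join(lines) + "\n"
```

Because the result is an ordinary array, the consuming C++ code needs nothing beyond a pointer to its first element, which is why no CUDA toolkit is required when building Quadrants.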
Adding a new SM architecture
Edit the SM_VERSIONS list in scripts/build_condition_kernel_fatbin.py to
add the new SM version number (e.g., 130), then regenerate.
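A list of SM versions typically maps to one nvcc -gencode option per architecture. The sketch below shows that mapping under the assumption that the script builds its flags this way; the helper name gencode_flags and the list contents are illustrative only and do not reflect the script's real values.

```python
from typing import Iterable, List

# Illustrative values only; the real list lives in the build script.
SM_VERSIONS = [90, 110]


def gencode_flags(sm_versions: Iterable[int]) -> List[str]:
    """Build one '-gencode' option pair per SM version."""
    flags: List[str] = []
    for sm in sm_versions:
        # code=sm_XX asks nvcc to embed SASS (machine code) for that architecture.
        flags += ["-gencode", f"arch=compute_{sm},code=sm_{sm}"]
    return flags


# Adding an architecture is then a one-element change to the list.
flags_with_new_sm = gencode_flags(SM_VERSIONS + [130])
```

Note that the installed toolkit must recognize every listed architecture, otherwise nvcc rejects the corresponding compute_XX target.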
Files
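| File | Purpose |
|---|---|
| quadrants/runtime/cuda/graph_do_while_cond.cu | CUDA C source for the condition kernel |
| scripts/build_condition_kernel_fatbin.py | Regeneration script |
| quadrants/runtime/cuda/graph_do_while_cond_fatbin.h | Generated C header (checked into git) |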
How it’s used at runtime
GraphManager::ensure_condition_kernel_loaded() in
quadrants/runtime/cuda/graph_manager.cpp loads the fatbin via
cuModuleLoadData. If the fatbin does not contain SASS for the current GPU’s
SM architecture, loading fails with a clear error pointing to this script.
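When deciding whether a given fatbin covers a device, it can help to reason about the match ahead of time. The sketch below applies NVIDIA's general binary compatibility rule (SASS built for compute capability X.y runs on devices X.z with z >= y) to a list of SM versions. It is a reasoning aid under that assumption, not the check performed by the driver or by graph_manager.cpp, and it does not cover architecture-specific targets such as sm_90a. The function name is hypothetical.

```python
from typing import Iterable, Tuple


def has_sass_for_device(device_cc: Tuple[int, int],
                        sm_versions: Iterable[int]) -> bool:
    """Return True if any embedded SM version is binary compatible with the device."""
    dev_major, dev_minor = device_cc
    for sm in sm_versions:
        # An SM version such as 90 encodes major 9, minor 0.
        major, minor = divmod(sm, 10)
        if major == dev_major and minor <= dev_minor:
            return True
    return False
```

If this returns False for a target GPU, the fatbin would need to be regenerated with that architecture added to the SM list.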