From 2cdf487f6075fd681c6b4fea6f70e7543b627128 Mon Sep 17 00:00:00 2001 From: Rob Parolin Date: Sat, 16 May 2026 13:37:29 -0700 Subject: [PATCH 1/3] docs: add review-derived agent guidance --- AGENTS.md | 77 +++++++++++++++++++++++++++++++++++++++ cuda_bindings/AGENTS.md | 13 +++++++ cuda_core/AGENTS.md | 18 +++++++++ cuda_pathfinder/AGENTS.md | 10 +++++ cuda_python/AGENTS.md | 6 +++ 5 files changed, 124 insertions(+) diff --git a/AGENTS.md b/AGENTS.md index a4450fbc664..69e92f4d0d0 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -12,6 +12,83 @@ guide for package-specific conventions and workflows. - `cuda_core/`: High-level Pythonic CUDA APIs built on top of bindings. - `cuda_python/`: Metapackage and docs aggregation. +# Review-derived repository guidance + +These rules come from recurring cuda-python PR review comments. Apply them +across the repository, in addition to any package-specific `AGENTS.md`. + +## Public API and design + +- For new public APIs, major behavior changes, or broad feature work, make sure + the API surface is sketched in an issue or design discussion before coding. + Reviewers repeatedly block large feature PRs that arrive without design + context, especially before 1.0 API stabilization. +- Keep public APIs minimal and intentional. Avoid exposing private helpers just + to make tests or examples easier; prefer improving the public path or keeping + helpers private. +- When adding public behavior, update docs, examples, release notes, and API + index pages in the same PR unless the PR explicitly documents why those + updates are deferred. +- User-facing errors and warnings should name the user-actionable concept, not + a private helper. Include diagnostics when something cannot be discovered, + and avoid silent success-shaped fallbacks. + +## Tests and CI behavior + +- Add targeted regression tests for behavioral fixes. 
Do not add elaborate + tests that mostly prove an implementation detail or require large module + stubbing unless that is the only practical way to cover the bug. +- Do not weaken tests just to pass a platform or CI configuration. Avoid broad + platform skips such as "skip all WSL" or "skip all Windows"; query CUDA + driver/device capability or the specific missing library/feature instead. +- Preserve real user workflows in tests. Do not change global CUDA state, skip + real loading paths, or disable release-note/doc checks merely to reduce CI + load unless reviewers have agreed to that behavior change. +- Before pushing, run the narrowest relevant `pixi run ...`, `pytest`, docs, or + workflow validation command available for the touched package. If local + validation is impossible, state the exact reason in the PR. + +## Generated code and CUDA compatibility + +- Do not hand-edit generated binding artifacts as a shortcut. Fix the generator + source or templates and regenerate/sync outputs so the next generation does + not reintroduce the same review issue. +- Lint or formatting changes that touch generated files should either be made + in the generator (`cython-gen`, `cybind`, templates, or sync source) or should + exclude generated outputs from the check. +- Keep builds working across the supported CUDA major versions. Do not cimport + or call newly generated Cython symbols directly unless the older supported + CUDA-major build is gated or has a wrapper/fallback path. +- In Cython/CUDA code, preserve CUDA stream ordering, handle ownership, and + context-manager semantics. Cleanup paths should not mask a user's original + exception, and `__enter__` should not expose invalid handles. + +## Workflows, packaging, and metadata + +- GitHub workflow logic should parse structured data with `jq` or GitHub's + `--jq` support. Avoid substring `grep` checks for labels, milestones, or + JSON fields when exact matching is possible. 
+- Use the correct GitHub Actions context (`env`, `vars`, `github`, `inputs`) + deliberately; a wrong context often evaluates to an empty string and silently + breaks release or validation workflows. +- Keep workflow permissions minimal and explicit, and include all triggers + needed for metadata checks to rerun when labels or milestones change. +- Packaging changes should keep version constraints, wheel/sdist behavior, + release tags, and metapackage dependencies aligned across all affected + packages. Use exact version or lower-bound choices intentionally. + +## Documentation style + +- For Sphinx/Numpy-style docs, document class construction in the class + docstring and signature rather than separately documenting `__init__` or + `__new__`, unless the surrounding docs already use that convention. +- Add docs entries for new public classes/functions in the relevant + `docs/source/api*.rst` or autosummary index, and build docs when changing + generated API pages. +- Prefer concise comments that explain non-obvious compatibility, security, or + workflow choices. Remove duplicated error text, stale TODOs, and comments that + merely restate the code. + # Pull requests When creating pull requests with `gh pr create`, always assign at least one diff --git a/cuda_bindings/AGENTS.md b/cuda_bindings/AGENTS.md index 9688c9f94ca..1f38cb218a2 100644 --- a/cuda_bindings/AGENTS.md +++ b/cuda_bindings/AGENTS.md @@ -35,6 +35,12 @@ subpackage in the `cuda-python` monorepo. defined in `build_hooks.py`; update those rules when introducing new symbols. - **Platform split files**: keep `_linux.pyx` and `_windows.pyx` variants aligned when behavior should be equivalent. +- **Lint at the source**: if formatting or lint fixes affect generated files, + make the fix in the generation source (`cython-gen`, `cybind`, templates, or + sync source) or exclude generated outputs from the check. Otherwise the next + regeneration will reintroduce the same issue. 
+- **Cython copies**: prefer typed assignment for wrapper-owned C struct copies + over raw `memcpy` when the generated Cython/C types can define the copy size. ## Testing expectations @@ -65,3 +71,10 @@ subpackage in the `cuda-python` monorepo. `docs/source/module/` and tests in `tests/`. - Prefer changes that are easy to regenerate/rebuild rather than patching generated output directly. +- Preserve compatibility with the supported CUDA major-version matrix. If a new + CUDA header symbol is unavailable in an older supported build, gate it or call + through an existing Python wrapper instead of directly cimporting the new + generated Cython symbol. +- For external contributions touching generated `cuda_bindings` code, ask for a + reproducer and environment details, then route fixes through the generation + source rather than accepting one-off generated edits. diff --git a/cuda_core/AGENTS.md b/cuda_core/AGENTS.md index 357e228360d..771e3098d0a 100644 --- a/cuda_core/AGENTS.md +++ b/cuda_core/AGENTS.md @@ -63,3 +63,21 @@ This file describes `cuda_core`, the high-level Pythonic CUDA subpackage in the call-site consistency. - Prefer explicit error propagation over silent fallback paths. - If you change public behavior, update tests and docs under `docs/source/`. +- For new public APIs or broad feature work, sketch the API and behavior in an + issue/design discussion before opening a large implementation PR. Reviewers + often block major `cuda_core` features until API shape, examples, and + docs/release-note coverage are clear. +- Feature availability checks should query CUDA driver/device capabilities + instead of hard-coding broad platform skips. Prefer properties such as + capability flags over assumptions like "Windows", "Linux", or "WSL". +- Keep CUDA 12.x and 13.x build compatibility in mind. Do not directly cimport + newly generated binding symbols unless older supported CUDA-major builds are + gated or have a wrapper/fallback path. 
+- Resource and context-manager code must preserve stream ordering, ownership, + and exception semantics. `close()`/cleanup paths should use the stream that + established the resource ordering, and `__exit__` should avoid masking a + user's original exception where practical. +- Tests should cover the behavior users exercise, not just private helpers. + Avoid large module-stubbing tests for simple implementation choices; prefer + focused regressions around the public API or the smallest stable internal + boundary. diff --git a/cuda_pathfinder/AGENTS.md b/cuda_pathfinder/AGENTS.md index 52159c84fb3..0e85e9afd04 100644 --- a/cuda_pathfinder/AGENTS.md +++ b/cuda_pathfinder/AGENTS.md @@ -48,6 +48,16 @@ Windows. - **Prefer focused abstractions**: if a change is platform-specific, route it through existing platform abstraction points instead of branching in many call sites. +- **No default-location fallbacks without triage**: pathfinder intentionally + avoids guessing fallback install locations because that can hide broken + environments. Add explicit search steps or diagnostics rather than silent + fallback paths. +- **Validate environment roots before trusting them**: `CUDA_HOME`, + `CUDA_PATH`, and related roots should be checked for the needed headers or + libraries, not just for being set. +- **Keep skip logic feature-specific**: tests that depend on libraries or + driver capabilities should skip based on the missing library/capability, not + broad platform labels. ## Testing expectations diff --git a/cuda_python/AGENTS.md b/cuda_python/AGENTS.md index 7c4fb9c0b1e..8486a86ac58 100644 --- a/cuda_python/AGENTS.md +++ b/cuda_python/AGENTS.md @@ -22,3 +22,9 @@ monorepo. compatibility between metapackage versioning and subpackage constraints. - If you update docs structure, ensure `docs/build_all_docs.sh` still collects docs from `cuda_python`, `cuda_bindings`, `cuda_core`, and `cuda_pathfinder`. 
+- Packaging PRs should keep release tags, dependency pins, wheel/sdist behavior, + and docs aggregation aligned with the component packages. Use exact versions + or lower bounds intentionally and document the compatibility reason in the PR. +- Do not move substantial development-environment guidance here unless it is the + supported local-development path; generic "install with package manager" + instructions tend to get rejected in review. From 06c97df220c2e286157875585ada15a96dfe6932 Mon Sep 17 00:00:00 2001 From: Rob Parolin Date: Wed, 20 May 2026 15:04:14 -0700 Subject: [PATCH 2/3] Address AGENTS review feedback --- AGENTS.md | 14 ++++++++------ cuda_bindings/AGENTS.md | 10 ++++------ cuda_core/AGENTS.md | 17 ++++++++++------- 3 files changed, 22 insertions(+), 19 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 69e92f4d0d0..64ea5538792 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -38,9 +38,11 @@ across the repository, in addition to any package-specific `AGENTS.md`. - Add targeted regression tests for behavioral fixes. Do not add elaborate tests that mostly prove an implementation detail or require large module stubbing unless that is the only practical way to cover the bug. -- Do not weaken tests just to pass a platform or CI configuration. Avoid broad - platform skips such as "skip all WSL" or "skip all Windows"; query CUDA - driver/device capability or the specific missing library/feature instead. +- Do not weaken tests just to pass a platform or CI configuration. Use the + tightest available skip criteria: broad OS skips are appropriate only when + upstream documentation or the support matrix says the feature is unsupported + on that OS; otherwise query the specific CUDA driver/device capability or + missing library/feature. - Preserve real user workflows in tests. Do not change global CUDA state, skip real loading paths, or disable release-note/doc checks merely to reduce CI load unless reviewers have agreed to that behavior change. 
@@ -53,9 +55,9 @@ across the repository, in addition to any package-specific `AGENTS.md`. - Do not hand-edit generated binding artifacts as a shortcut. Fix the generator source or templates and regenerate/sync outputs so the next generation does not reintroduce the same review issue. -- Lint or formatting changes that touch generated files should either be made - in the generator (`cython-gen`, `cybind`, templates, or sync source) or should - exclude generated outputs from the check. +- Use `.pre-commit-config.yaml` as the source of truth for linting and + formatting. Do not perform formatting or lint fixes on files marked "This code + was automatically generated..." unless the repo config explicitly opts them in. - Keep builds working across the supported CUDA major versions. Do not cimport or call newly generated Cython symbols directly unless the older supported CUDA-major build is gated or has a wrapper/fallback path. diff --git a/cuda_bindings/AGENTS.md b/cuda_bindings/AGENTS.md index 1f38cb218a2..23aade8dfb0 100644 --- a/cuda_bindings/AGENTS.md +++ b/cuda_bindings/AGENTS.md @@ -35,10 +35,8 @@ subpackage in the `cuda-python` monorepo. defined in `build_hooks.py`; update those rules when introducing new symbols. - **Platform split files**: keep `_linux.pyx` and `_windows.pyx` variants aligned when behavior should be equivalent. -- **Lint at the source**: if formatting or lint fixes affect generated files, - make the fix in the generation source (`cython-gen`, `cybind`, templates, or - sync source) or exclude generated outputs from the check. Otherwise the next - regeneration will reintroduce the same issue. +- **Don't lint generated files:** If a file has the comment "This code was + automatically generated...", do not perform any formatting or lint fixes. - **Cython copies**: prefer typed assignment for wrapper-owned C struct copies over raw `memcpy` when the generated Cython/C types can define the copy size. 
@@ -76,5 +74,5 @@ subpackage in the `cuda-python` monorepo. through an existing Python wrapper instead of directly cimporting the new generated Cython symbol. - For external contributions touching generated `cuda_bindings` code, ask for a - reproducer and environment details, then route fixes through the generation - source rather than accepting one-off generated edits. + reproducer and environment details. Do not accept one-off generated edits; + NVIDIA maintainers should route accepted fixes through the generation source. diff --git a/cuda_core/AGENTS.md b/cuda_core/AGENTS.md index 771e3098d0a..8c326aa15ee 100644 --- a/cuda_core/AGENTS.md +++ b/cuda_core/AGENTS.md @@ -7,8 +7,11 @@ This file describes `cuda_core`, the high-level Pythonic CUDA subpackage in the `Program`, `Linker`, memory resources, graphs) on top of `cuda.bindings`. - **API intent**: keep interfaces Pythonic while preserving explicit CUDA behavior and error visibility. -- **Compatibility**: changes should remain compatible with supported - `cuda.bindings` major versions (12.x and 13.x). +- **API stability**: `cuda_core` is v1.0+; avoid breaking public APIs. Prefer + compatibility/deprecation paths and document intentional public changes in + docs and release notes. +- **Compatibility**: changes should remain compatible with the supported CUDA + major-version matrix. ## Package architecture @@ -65,14 +68,14 @@ This file describes `cuda_core`, the high-level Pythonic CUDA subpackage in the - If you change public behavior, update tests and docs under `docs/source/`. - For new public APIs or broad feature work, sketch the API and behavior in an issue/design discussion before opening a large implementation PR. Reviewers - often block major `cuda_core` features until API shape, examples, and - docs/release-note coverage are clear. + often block major `cuda_core` features until API shape, compatibility impact, + examples, and docs/release-note coverage are clear. 
- Feature availability checks should query CUDA driver/device capabilities instead of hard-coding broad platform skips. Prefer properties such as capability flags over assumptions like "Windows", "Linux", or "WSL". -- Keep CUDA 12.x and 13.x build compatibility in mind. Do not directly cimport - newly generated binding symbols unless older supported CUDA-major builds are - gated or have a wrapper/fallback path. +- Preserve compatibility with the supported CUDA major-version matrix. Do not + directly cimport newly generated binding symbols unless older supported + CUDA-major builds are gated or have a wrapper/fallback path. - Resource and context-manager code must preserve stream ordering, ownership, and exception semantics. `close()`/cleanup paths should use the stream that established the resource ordering, and `__exit__` should avoid masking a From 7a119ecd2f3db3c5afae78ce9b7b559aebd10211 Mon Sep 17 00:00:00 2001 From: Rob Parolin Date: Wed, 20 May 2026 15:11:00 -0700 Subject: [PATCH 3/3] Clarify private access testing guidance --- AGENTS.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 64ea5538792..59acdddb6cf 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -23,9 +23,10 @@ across the repository, in addition to any package-specific `AGENTS.md`. the API surface is sketched in an issue or design discussion before coding. Reviewers repeatedly block large feature PRs that arrive without design context, especially before 1.0 API stabilization. -- Keep public APIs minimal and intentional. Avoid exposing private helpers just - to make tests or examples easier; prefer improving the public path or keeping - helpers private. +- Keep public APIs minimal and intentional. Prefer public testing paths when + practical, but limited private or test-only access is acceptable when Python + tests need to exercise Cython internals that cannot be reached through a + stable public API. 
- When adding public behavior, update docs, examples, release notes, and API index pages in the same PR unless the PR explicitly documents why those updates are deferred.
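
The skip guidance in this series (skip on the specific missing library or capability, not on a broad platform label) can be sketched roughly as follows. This is a minimal illustration, not code from the repository: `missing_library_reason`, `requires_library`, and the library name `hypothetical_cuda_feature_lib` are all made-up names, and the sketch uses the standard-library `unittest` module only to stay dependency-free.

```python
import ctypes.util
import unittest
from typing import Optional


def missing_library_reason(name: str) -> Optional[str]:
    """Return a skip reason if shared library `name` cannot be located, else None.

    Hypothetical helper: it asks about the one library the test needs
    instead of checking for "Windows", "Linux", or "WSL".
    """
    if ctypes.util.find_library(name) is None:
        return f"shared library {name!r} not found on this system"
    return None


def requires_library(name: str):
    """Decorator that skips a test only when the named library is missing."""
    reason = missing_library_reason(name)
    return unittest.skipIf(reason is not None, reason or "")


class ExampleFeatureTests(unittest.TestCase):
    # "hypothetical_cuda_feature_lib" is a placeholder, not a real library name.
    @requires_library("hypothetical_cuda_feature_lib")
    def test_feature_path(self):
        # The real test body would exercise the public API that needs the library.
        self.assertTrue(True)
```

Under pytest, the same reason string could presumably feed `pytest.mark.skipif(reason is not None, reason=...)`. Either way, the skip message names the missing dependency, so a skipped test still gives a usable diagnostic.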