Skip to content

Add griffe API checking to cuda_core#2300

Draft
mdboom wants to merge 3 commits into
NVIDIA:mainfrom
mdboom:griffe
Draft

Add griffe API checking to cuda_core#2300
mdboom wants to merge 3 commits into
NVIDIA:mainfrom
mdboom:griffe

Conversation

@mdboom

@mdboom mdboom commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

This adds a CI job to run griffe over cuda_core to detect API breakage.

For this prototype, it checks both against the merge base of the PR and against the latest cuda-core tag.

This PR also deliberately makes an API breakage (the same one found in #2280) so we can confirm it's working.

One side effect of this is that griffe /requires/ the top-level __init__.py to have an __all__ so it's explicit about what the public API is. This is a minor ongoing maintenance burden, but probably worth the effort.

@copy-pr-bot

copy-pr-bot Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@mdboom

mdboom commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@github-actions github-actions Bot added CI/CD CI/CD infrastructure cuda.core Everything related to the cuda.core module labels Jul 2, 2026
@mdboom

mdboom commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

/ok to test

@mdboom

mdboom commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author
image

@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown

Comment thread .github/workflows/ci.yml
# Findings fail this job (shows red in the UI), but it never blocks merging
# the PR: it's intentionally excluded from `checks:`'s `needs`, so it isn't
# part of the required aggregator status.
api-check-core:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to wire this job up in the check status job.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we should eventually. I was thinking we could make it a "soft fail" to start with while we learn about whether it produces any false positives.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not sure we ever want this to fail. There will be times in our release cycle where we are ok with breaking the API (once something has been deprecated for some time). We could maybe convince griffe to understand that someday, but in the meantime, having human-override judgement in the loop is probably good enough.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, we could admin-merge maybe? Or is there a way to skip griffe on a per-line basis (like # noqa)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could introduce a griffe-override label or similar to enable merging. Or alternatively [API-BREAK] in the PR title, to make it totally obvious. Or both, so we can easily deal with griffe false positives.

del _patch_rlcompleter_for_cython_properties


__all__ = [

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: If this is a must to make griffe happy, should we just unify the style and always use __all__ in every submodule, private or public?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just needed in the public ones. I think that's a reasonable convention.

@leofang

leofang commented Jul 2, 2026

Copy link
Copy Markdown
Member

Very nice! It shows warnings right at the offending diff!

截圖 2026-07-02 下午5 57 03

@mdboom

mdboom commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Very nice! It shows warnings right at the offending diff!

Yep. Someday maybe we can get it to put it in the .pyx file, not the generated .pyi file, but this is more than good enough to catch breakages.

@mdboom

mdboom commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Note: The changes that this is highlighting against the cuda-core-v1.0.1 tag are coming from the fact that cuda-core-v1.0.1 predates having .pyi files. So that's not anything to be concerned about.

This PR should work as-is for checking for changes in a specific PR. For changes against the latest release, we will need to wait until the next release of cuda-core (which will have .pyi files) and then we can do that as well. (That's not really useful for review, but it's useful to have all the changes in one place so that /when/ we make a breaking release we know exactly what all of them are for release notes purposes).

@rwgk rwgk left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gpt-5.5 (with manual edits/additions)

Findings

  • Low: cuda_core/cuda/core/__init__.py adds DeviceResources to top-level __all__, but it is documented in cuda_core/docs/source/api_private.rst, not cuda_core/docs/source/api.rst. Since cuda_core/AGENTS.md defines public API by __all__, this promotes a returned helper into the explicit public surface. I would either remove it from top-level __all__ or move it to the main API page.

  • Low: .github/workflows/ci.yml runs api-check-core only when cuda_core/ changes. A follow-up that edits only .github/actions/griffe-api-check/action.yml will not exercise the action. Consider gating on core || shared, or a narrower action-specific change flag.

  • Low: .github/actions/griffe-api-check/action.yml invokes unpinned uvx griffe in both comparisons. That is fine for a prototype, but if this becomes a trusted signal, pin griffe or route it through the repo-managed environment.

Maintenance Risk

A maintenance risk from adding __all__ is source-of-truth drift. New APIs can be imported and documented but omitted from __all__, stale names can remain after removals (although import * would catch those), and returned/private helper types can accidentally become SemVer-public. The DeviceResources mismatch is a concrete example: it is imported at the top level and now listed in __all__, but the docs currently treat it as a private/returned helper.

That said, an explicit __all__ is probably the right convention if griffe is going to define and compare the supported public API from it. The burden is manageable as long as there is a consistency check that catches drift early.

A simple guard against stale names could be:

def test_cuda_core_all_exports_resolve():
    import cuda.core

    missing = [name for name in cuda.core.__all__ if not hasattr(cuda.core, name)]
    assert missing == []

Suggested Follow-On Idea

Add a follow-on docs/API consistency test that compares the public symbols in cuda.core.__all__ against the public entries in cuda_core/docs/source/api.rst, with deliberate exceptions for submodule namespaces and returned-helper/private API docs.

It should parse the Sphinx autosummary and data entries, then account for known surfaces such as cuda.core.graph, cuda.core.texture, cuda.core.utils, cuda.core.checkpoint, and entries intentionally documented in api_private.rst. The useful failure mode is: “this symbol is public by __all__ but missing from public docs” or “this symbol is documented as public but not exported by __all__.”

Comment thread .github/workflows/ci.yml
# Findings fail this job (shows red in the UI), but it never blocks merging
# the PR: it's intentionally excluded from `checks:`'s `needs`, so it isn't
# part of the required aggregator status.
api-check-core:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could introduce a griffe-override label or similar to enable merging. Or alternatively [API-BREAK] in the PR title, to make it totally obvious. Or both, so we can easily deal with griffe false positives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI/CD CI/CD infrastructure cuda.core Everything related to the cuda.core module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants