[BUG]: check_spdx.py can miss stale copyright years outside staged-file workflows #2273

Description

@rwgk

toolshed/check_spdx.py currently updates stale copyright years only when a file has a staged diff:

def is_staged(filepath):
    process = subprocess.run(
        ["git", "diff", "--staged", "--", filepath],
        capture_output=True,
        text=True,
    )
    return process.stdout.strip() != ""

and:

if not is_staged(filepath) or int(end_year) >= int(CURRENT_YEAR):
    return True, blob

This means the hook behaves differently depending on how it is invoked.

In the normal local git commit path, a modified file is staged when pre-commit runs, so check_spdx.py --fix can update a stale year such as 2025 to 2025-2026.
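For illustration, the year update described above could look roughly like the following. This is a minimal sketch, not the actual check_spdx.py code: the function name bump_year and the regular expression are assumptions made for this example, and the real script may parse the header differently.

```python
import re

# Hypothetical pattern for this sketch; check_spdx.py may match the header differently.
_YEARS = re.compile(r"(Copyright \(c\) )(\d{4})(?:-(\d{4}))?")


def bump_year(line: str, current_year: int) -> str:
    """Extend the copyright year (or year range) in `line` to end at current_year."""

    def repl(m: re.Match) -> str:
        start = m.group(2)
        end = m.group(3) or start  # a single year is both start and end
        if int(end) >= current_year:
            return m.group(0)  # already current, leave untouched
        return f"{m.group(1)}{start}-{current_year}"

    # Only the first copyright statement on the line is rewritten.
    return _YEARS.sub(repl, line, count=1)
```

With this sketch, a header containing 2025 becomes 2025-2026 when current_year is 2026, and an existing range such as 2023-2025 becomes 2023-2026.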

However, if the same file is already committed and then checked by pre-commit run --all-files, the file is no longer staged. In that mode, check_spdx.py accepts the stale year even though the file may have been modified in 2026. CI-style all-files validation has the same blind spot.
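To make the difference between the two invocation paths explicit, here is a small stand-alone model of the quoted condition. The function name year_check_is_skipped is hypothetical, and the staged flag stands in for the result of is_staged(filepath); nothing else about the script is assumed.

```python
def year_check_is_skipped(staged: bool, end_year: str, current_year: str) -> bool:
    """Model of the quoted condition: True means the file is accepted as-is."""
    return (not staged) or int(end_year) >= int(current_year)


# Same file, same stale year, two ways of invoking the hook:
print(year_check_is_skipped(staged=True, end_year="2025", current_year="2026"))   # git commit path
print(year_check_is_skipped(staged=False, end_year="2025", current_year="2026"))  # --all-files path
```

The first call returns False (the stale year is handled), the second returns True (the stale year is accepted), even though the file content is identical.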

Concrete example of the blind spot:

  1. Start with a file containing:

    SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
    
  2. Modify that file in 2026.

  3. Create a commit without first running pre-commit.

  4. Run:

    pre-commit run --all-files

Desired result:

The check should fail, or --fix should update the modified file to include the current year.

Actual result:

The check can pass because the file has no staged diff at the time check_spdx.py runs.

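This makes enforcement of the copyright-year rule depend on how the hook is invoked rather than on the file's content and history: local workflows catch a stale year only if the file happens to be staged when the hook runs, and common validation workflows such as pre-commit run --all-files do not catch it at all.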

Possible fix directions:

  • Separate "validate SPDX syntax/license fields" from "update years for changed files".
  • Add an explicit mode or option for year enforcement, for example:
    • staged files, for local pre-commit convenience;
    • files changed against a base ref, for CI/PR validation;
    • all files, only if intentionally desired.
  • In CI, prefer checking changed files against the PR base or merge base instead of relying on git diff --staged.
  • Make the hook fail in validation mode when a changed file has an outdated copyright year, instead of silently accepting it because it is not staged.
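One possible shape for the explicit mode selection listed above is sketched below. The mode names ("staged", "base", "all") and both function names are hypothetical and not existing check_spdx.py options. The git invocations themselves are standard: git diff --staged --name-only lists staged paths, git diff --name-only <base>...HEAD lists paths changed since the merge base with <base>, and git ls-files lists all tracked files.

```python
import subprocess
from typing import List, Optional


def candidate_files_cmd(mode: str, base_ref: Optional[str] = None) -> List[str]:
    """Build the git command that lists files subject to year enforcement."""
    if mode == "staged":  # local pre-commit convenience
        return ["git", "diff", "--staged", "--name-only"]
    if mode == "base":  # CI/PR validation against a base ref
        if not base_ref:
            raise ValueError("mode 'base' requires base_ref")
        # Three-dot form: changes on HEAD since its merge base with base_ref.
        return ["git", "diff", "--name-only", f"{base_ref}...HEAD"]
    if mode == "all":  # only if intentionally desired
        return ["git", "ls-files"]
    raise ValueError(f"unknown mode: {mode!r}")


def candidate_files(mode: str, base_ref: Optional[str] = None) -> List[str]:
    """Run the command and return the listed paths (requires a git checkout)."""
    result = subprocess.run(
        candidate_files_cmd(mode, base_ref),
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

In CI this might be called as candidate_files("base", "origin/main"), where origin/main is only an example base ref.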

The important part is that "was this file changed in the commit/PR being validated?" should not be inferred solely from the current staged index state.
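As a sketch of what that could mean in validation mode, the check below takes the set of changed files as an explicit input instead of querying the index. Everything here is an assumption for illustration: the function name stale_changed_files, the headers mapping (path to header line), and the regular expression are not taken from check_spdx.py.

```python
import re
from typing import Iterable, List, Mapping

# Hypothetical pattern: captures the end year of "Copyright (c) YYYY" or "YYYY-YYYY".
_END_YEAR = re.compile(r"Copyright \(c\) (?:\d{4}-)?(\d{4})")


def stale_changed_files(
    headers: Mapping[str, str],
    changed: Iterable[str],
    current_year: int,
) -> List[str]:
    """Return the changed paths whose copyright end year is before current_year."""
    stale = []
    for path in changed:
        m = _END_YEAR.search(headers.get(path, ""))
        if m and int(m.group(1)) < current_year:
            stale.append(path)
    return stale
```

A validation run would fail if the returned list is non-empty. Files that are not in changed are never flagged, regardless of their year, which keeps the rule limited to files touched by the commit or PR being validated.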

Metadata

Assignees

No one assigned

    Labels

    P1: Medium priority - Should do
