ACL COSI: handle shared BTRFS UUIDs and ESP space management#673
Draft
bfjelds wants to merge 22 commits into
ACL images ship with PARTUUID-based verity addons: templates for both A and B slots are stored in acl/uki-addons/ on the ESP, with slot A active by default. During an A/B update, trident must swap the active addon to match the target slot so the new UKI boots with the correct verity partition identity.

Add activate_verity_addon_for_target_volume(), which:
- Checks for ACL verity addon templates on the image ESP
- Copies the correct slot template into the staged addon directory
- Is a silent no-op for non-ACL images (no template dir)
- Errors if the template dir exists but the selected slot is missing

Called from copy_file_artifacts() after stage_uki_on_esp(), gated on ctx.image_distro().is_acl() to ensure only ACL images are affected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
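The behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the function name `activate_verity_addon`, its signature, and the template file names `verity-a.addon.efi` / `verity-b.addon.efi` are assumptions made for the example. Only the `acl/uki-addons/` directory and the `verity.addon.efi` name come from the PR text.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// A/B slot targeted by the update.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Slot {
    A,
    B,
}

impl Slot {
    /// Hypothetical template file names; the real names are not given in the PR.
    fn template_name(self) -> &'static str {
        match self {
            Slot::A => "verity-a.addon.efi",
            Slot::B => "verity-b.addon.efi",
        }
    }
}

/// Copies the verity addon template for `slot` into `staged_addon_dir`.
///
/// Returns `Ok(None)` when the ESP has no template directory (non-ACL image),
/// `Ok(Some(dest))` when the addon was copied, and an error when the template
/// directory exists but the template for the selected slot is missing.
pub fn activate_verity_addon(
    esp_root: &Path,
    staged_addon_dir: &Path,
    slot: Slot,
) -> io::Result<Option<PathBuf>> {
    let template_dir = esp_root.join("acl").join("uki-addons");
    if !template_dir.is_dir() {
        // No templates at all: silent no-op.
        return Ok(None);
    }

    let template = template_dir.join(slot.template_name());
    if !template.is_file() {
        return Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("missing verity addon template {}", template.display()),
        ));
    }

    fs::create_dir_all(staged_addon_dir)?;
    let dest = staged_addon_dir.join("verity.addon.efi");
    fs::copy(&template, &dest)?;
    Ok(Some(dest))
}
```

Returning `Option<PathBuf>` keeps the no-op case distinguishable from a successful copy, which a caller could use for logging.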
ACL uses identical FS UUIDs across A/B slots by design — partitions are distinguished by PARTUUID instead. The within-image uniqueness check is unaffected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Scan each UKI's .extra.d/ directory for *.addon.efi files and extract their .cmdline PE sections. Addons are stored as a new field on the boot entry so the COSI metadata captures the full effective cmdline (main UKI + addons). Both Go (mkcosi) and Rust (metadata deserialization) updated. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
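As a rough illustration of the discovery step only (the PE `.cmdline` section extraction is left out), the sketch below lists `*.addon.efi` files in a UKI's `.extra.d/` directory. It is written in Rust for consistency with the other examples here. The helper names `addon_dir_for` and `find_addons` are hypothetical and do not correspond to the Go code in mkcosi.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the addon directory for a UKI: the UKI path with `.extra.d` appended,
/// e.g. `vmlinuz.efi` -> `vmlinuz.efi.extra.d`.
pub fn addon_dir_for(uki: &Path) -> PathBuf {
    let mut name = uki.as_os_str().to_os_string();
    name.push(".extra.d");
    PathBuf::from(name)
}

/// Lists the `*.addon.efi` files next to a UKI, sorted by path.
/// A missing `.extra.d/` directory yields an empty list rather than an error.
pub fn find_addons(uki: &Path) -> io::Result<Vec<PathBuf>> {
    let dir = addon_dir_for(uki);
    if !dir.is_dir() {
        return Ok(Vec::new());
    }

    let mut addons = Vec::new();
    for entry in fs::read_dir(&dir)? {
        let path = entry?.path();
        let is_addon = path
            .file_name()
            .and_then(|name| name.to_str())
            .map_or(false, |name| name.ends_with(".addon.efi"));
        if is_addon && path.is_file() {
            addons.push(path);
        }
    }
    // Sort so the resulting metadata does not depend on directory iteration order.
    addons.sort();
    Ok(addons)
}
```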
With PARTUUID-based verity addons, usrhash= moved from the main UKI cmdline to the verity addon cmdline. Update extractUsrhashFromUKIEntries to also search addon cmdlines when looking for the root hash. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
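The lookup order described here (main UKI cmdline first, then addon cmdlines) might look like the following sketch. The real `extractUsrhashFromUKIEntries` is Go code in mkcosi. The `UkiEntry` struct, its field names, and both function names below are assumptions for illustration, and the parsing ignores kernel cmdline quoting.

```rust
/// Hypothetical, simplified view of a boot entry: the main UKI cmdline plus
/// the cmdlines extracted from its addons.
pub struct UkiEntry {
    pub cmdline: String,
    pub addon_cmdlines: Vec<String>,
}

/// Returns the value of the first non-empty `usrhash=` parameter in a cmdline.
pub fn usrhash_from_cmdline(cmdline: &str) -> Option<&str> {
    cmdline
        .split_whitespace()
        .find_map(|arg| arg.strip_prefix("usrhash="))
        .filter(|value| !value.is_empty())
}

/// Searches the main cmdline first, then each addon cmdline in order.
pub fn usrhash_from_entry(entry: &UkiEntry) -> Option<&str> {
    usrhash_from_cmdline(&entry.cmdline).or_else(|| {
        entry
            .addon_cmdlines
            .iter()
            .find_map(|cmdline| usrhash_from_cmdline(cmdline))
    })
}
```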
When staging an A/B update on ACL (Azure Container Linux) UKI images, the COSI image may share BTRFS filesystem UUIDs with the active OS. BTRFS maintains a kernel-global UUID registry and refuses to mount a filesystem whose UUID is already registered by another mounted device, causing the staging verity device mount to fail.

This change detects the UUID collision by checking the well-known ACL USR-A/USR-B partition UUIDs (by PARTUUID) before the mount loop. When a collision is detected, it bind-mounts the active /usr into the newroot instead of attempting to mount the staging verity device.

This is safe because:
- USR is verity-protected and read-only
- Matching UUIDs means identical filesystem content
- The chroot only reads from /usr during provisioning
- After reboot, initramfs sets up the correct verity device normally

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When the bind-mount workaround activates for ACL BTRFS UUID collisions, compare the staging USR verity root hash (from COSI metadata) against the active USR root hash (from /proc/cmdline usrhash= parameter) to cryptographically prove the filesystems are byte-identical. If the staging hash is available but the active hash cannot be read or does not match, the bind-mount is refused and the normal mount path proceeds (which will fail with the BTRFS UUID error, as expected for genuinely different content). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When internalParams.forceAbUpdate is true, trident will proceed with
an A/B update even when the old and new OS image SHA384 hashes match.
This is useful for testing A/B update flows repeatedly with the same
COSI file.
Usage in trident-config.yaml:
internalParams:
forceAbUpdate: true
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
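The effect of the flag on the update decision can be summarized with a small sketch. The enum and function names here are hypothetical, and the case-insensitive hash comparison is an assumption. Only the behaviour (matching SHA384 hashes normally skip the update unless forceAbUpdate is set) comes from the commit message.

```rust
/// Outcome of comparing the running image with the requested one.
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateDecision {
    /// Stage and apply the A/B update.
    Proceed,
    /// The requested image is already running; nothing to do.
    SkipIdenticalImage,
}

/// Decides whether an A/B update should run.
/// `force_ab_update` mirrors `internalParams.forceAbUpdate` from the config.
pub fn decide_ab_update(
    old_sha384: &str,
    new_sha384: &str,
    force_ab_update: bool,
) -> UpdateDecision {
    let same_image = old_sha384.eq_ignore_ascii_case(new_sha384);
    if force_ab_update || !same_image {
        UpdateDecision::Proceed
    } else {
        UpdateDecision::SkipIdenticalImage
    }
}
```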
Replace the blanket ACL skip in validate_filesystem_uniqueness() with proper validation. When a duplicate FS UUID is found during A/B update on ACL, the update is only allowed if:
1. The duplicate is on the /usr mount point
2. The staging COSI has a verity root hash
3. The active system has a usrhash= in /proc/cmdline
4. The normalized hashes match (merkle tree proof of identical content)

If COSI partition metadata is available, also validates that the staging USR partition has a known ACL PARTUUID.

Extracts ACL constants and read_active_usr_roothash() into a shared engine::acl module used by both osimage.rs and newroot.rs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
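The four conditions can be expressed as a pure function, sketched below. The name `check_acl_duplicate_uuid`, the error enum, and the normalization (trim plus ASCII lowercase) are assumptions for illustration. The actual `validate_acl_duplicate_uuid` in the PR may have a different signature and error type, and the PARTUUID check is omitted here.

```rust
/// Reasons a duplicate filesystem UUID is rejected (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum DuplicateUuidError {
    NotUsrMount(String),
    MissingStagingHash,
    MissingActiveHash,
    HashMismatch,
}

/// Trims and lowercases a hash; empty or absent hashes become `None`.
fn normalized(hash: Option<&str>) -> Option<String> {
    hash.map(|h| h.trim().to_ascii_lowercase())
        .filter(|h| !h.is_empty())
}

/// Allows a duplicate FS UUID only for /usr, and only when both verity root
/// hashes are present and equal after normalization.
pub fn check_acl_duplicate_uuid(
    mount_point: &str,
    staging_usr_roothash: Option<&str>,
    active_usr_roothash: Option<&str>,
) -> Result<(), DuplicateUuidError> {
    if mount_point != "/usr" {
        return Err(DuplicateUuidError::NotUsrMount(mount_point.to_string()));
    }
    let staging =
        normalized(staging_usr_roothash).ok_or(DuplicateUuidError::MissingStagingHash)?;
    let active =
        normalized(active_usr_roothash).ok_or(DuplicateUuidError::MissingActiveHash)?;
    if staging != active {
        return Err(DuplicateUuidError::HashMismatch);
    }
    Ok(())
}
```

Taking the active hash as a parameter, rather than reading /proc/cmdline inside the function, is what makes each rejection path testable without filesystem access.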
DiscoverablePartitionType does not have is_acl_usr() — that method lives on the HC PartitionType enum. Since we already check for known ACL USR PARTUUIDs, the part_type check was redundant. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ESP (128 MB) can overflow when multiple UKIs accumulate across A/B updates. Before staging a new UKI, remove old UKIs for the target slot:
1. Trident-managed UKIs matching the target slot (all install indices)
2. Non-trident-managed (original install) UKIs, but only when trident already manages the other slot (proving it owns boot management)

The other slot's UKI is always preserved as the active/rollback path.

Also extract UKI_SLOT_A/UKI_SLOT_B constants to replace string literals.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In multi-OS configurations, the ESP has UKI pairs per OS instance (azla0/azlb0, azla1/azlb1, etc.). Cleanup must only remove UKIs for the specific slot+os-index being updated, not all UKIs for the same slot letter across different OS instances. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In multiboot configurations, the original UKI has OS 0's partition references baked in. OS 1+ instances never depend on it, but only OS 0 should remove it since it's the owner. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
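Taken together, the cleanup rules from the last three commits (target slot only, scoped to slot plus OS index, original UKI removed only by OS 0) could be sketched as a selection function over ESP file names. The file naming scheme `vmlinuz-<n>-<suffix>.efi` is an assumption made for this example, as are the function names. Only suffixes such as `azla0` and `azlb0` come from the PR text.

```rust
/// Extracts the slot suffix (e.g. "azla0") from a trident-managed UKI name.
/// ASSUMPTION: managed UKIs are named `vmlinuz-<number>-<suffix>.efi`; any
/// other `.efi` file is treated as an original-install UKI.
fn managed_suffix(file_name: &str) -> Option<&str> {
    let stem = file_name.strip_suffix(".efi")?;
    let rest = stem.strip_prefix("vmlinuz-")?;
    let (index, suffix) = rest.split_once('-')?;
    if index.is_empty() || !index.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    suffix.starts_with("azl").then_some(suffix)
}

/// Selects the UKIs to delete before staging a new UKI for `target_suffix`.
///
/// - Managed UKIs are removed only on an exact suffix match, so "azla0"
///   does not match "azla01" or "azla1".
/// - Unmanaged (original install) UKIs are removed only when the other slot
///   is already trident-managed and the caller is OS instance 0.
/// - The other slot's UKI is never selected.
pub fn ukis_to_remove<'a>(
    files: &[&'a str],
    target_suffix: &str,
    other_suffix: &str,
    os_index: usize,
) -> Vec<&'a str> {
    let other_slot_managed = files
        .iter()
        .any(|f| managed_suffix(f) == Some(other_suffix));

    files
        .iter()
        .copied()
        .filter(|f| {
            if !f.ends_with(".efi") {
                return false;
            }
            match managed_suffix(f) {
                Some(suffix) => suffix == target_suffix,
                None => other_slot_managed && os_index == 0,
            }
        })
        .collect()
}
```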
Move /proc/cmdline read out of validate_acl_duplicate_uuid into its caller (validate_filesystem_uniqueness). The function now accepts active_usr_roothash as Option<String>, making it fully testable in unit tests without filesystem access.

Add 7 unit tests covering all validation paths:
- matching hash (success)
- case-insensitive matching (success)
- wrong mount point (reject)
- no staging verity hash (reject)
- mismatched hashes (reject)
- no active hash / None (reject)
- empty active hash (reject)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
DR-001 (High): Replace if-let with let-else for missing staging hash in detect_acl_btrfs_uuid_collision - None now logs a warning and refuses the bind-mount instead of silently proceeding unverified.

DR-002 (High): Replace suffix.contains() with exact suffix equality in cleanup_ukis_before_staging - prevents azla0 from matching azla01.efi in multiboot with 10+ OS instances.

DR-003 (Medium): Extract verity_hashes_match() into engine::acl module, replacing duplicated normalize+compare logic in newroot.rs and osimage.rs. Rejects empty hashes so "" == "" cannot incorrectly pass.

DR-004 (Medium): Document pre-staging cleanup ordering rationale in esp.rs - explains the crash-safety trade-off (active slot UKI preserved as A/B fallback).

DR-005 (Medium): Make remove_uki_and_addons idempotent by treating NotFound as success - prevents orphaned addon dirs if UKI was already removed by a prior partial cleanup.

DR-006 (Medium): Document that cleanup_ukis_before_staging is intentionally universal (not ACL-gated) - ESP space constraints apply to all UKI-based A/B updates.

DR-007 (Medium): Replace byte-index hash slicing with char-safe hash_preview() using chars().take(16) - prevents panics on non-ASCII input (defense in depth for hex hashes).

Adds unit tests for verity_hashes_match(), hash_preview(), cleanup_ukis_before_staging (exact suffix matching, multi-index cleanup), and remove_uki_and_addons (idempotency, addon directory cleanup).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
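A rough sketch of three of the helpers named in DR-003, DR-005 and DR-007 follows. `verity_hashes_match` and `hash_preview` reuse the names from the commit message, but their bodies are guesses based on the description (trimming and ASCII case-folding are assumptions). `remove_file_if_present` is a hypothetical stand-in for the NotFound handling inside `remove_uki_and_addons`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Compares two verity root hashes after trimming, ignoring ASCII case.
/// Empty hashes never match, so "" == "" cannot pass.
pub fn verity_hashes_match(a: &str, b: &str) -> bool {
    let (a, b) = (a.trim(), b.trim());
    !a.is_empty() && !b.is_empty() && a.eq_ignore_ascii_case(b)
}

/// Returns at most the first 16 characters of a hash for log output.
/// Iterating over chars avoids the panic that byte slicing (`&hash[..16]`)
/// would hit on a short string or a non-ASCII character boundary.
pub fn hash_preview(hash: &str) -> String {
    hash.chars().take(16).collect()
}

/// Removes a file, treating "already gone" as success so that a repeated or
/// resumed cleanup does not fail partway through.
pub fn remove_file_if_present(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}
```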
/azp run [GITHUB]-trident-pr-e2e
Azure Pipelines successfully started running 1 pipeline(s).
Summary
Enables trident to successfully perform A/B updates on ACL when the COSI image shares BTRFS filesystem UUIDs with the currently active OS. Also adds pre-staging ESP cleanup to prevent "no space left on device" failures during repeated updates.
Related PRs
Combined Validation
https://dev.azure.com/mariner-org/ACL/_build/results?buildId=1132645
Changes
ACL BTRFS UUID handling
- search --fs-uuid), so shared UUIDs are safe
UKI enhancements
- verity.addon.efi based on which slot is being updated
- findUkiEntries COSI metadata: ensures addon files are discovered during COSI parsing
- usrhash= parameter: extracts verity root hash from addon kernel cmdline
ESP space management
- azla0) not just slot letter, safe for multiboot
- install_index == 0: only the OS that placed the original UKI can remove it
Internal testing support
- forceAbUpdate internal param: bypasses SHA384 identity check, allowing the same COSI to be applied repeatedly for A↔B cycle testing
Testing
- cargo check -p trident, cargo fmt)
- user/bfjelds/single-acl-build updated with 5-cycle A↔B test and ESP diagnostics