metrics(site-explorer): instrument endpoint update path #2891

Merged: williampnvidia merged 1 commit into NVIDIA:main from williampnvidia:william/site-explorer-update-endpoints-metrics-v2 on Jun 26, 2026

@williampnvidia (Contributor) commented Jun 25, 2026

Summary

  • add update_explored_endpoints subphase timings to the existing site explorer phase latency histogram
  • add per-endpoint exploration step latency for Redfish probing, error-context load, and report enrichment
  • add last-run update_explored_endpoints counts for loaded records, candidate classes, selected endpoints, DB write attempts, firmware update attempts, and Redfish remediation candidates
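
A minimal sketch of the subphase-timing idea from the first bullet, assuming a plain in-memory map stands in for the phase latency histogram. `PhaseLatencies`, `time`, and the phase names used with it are hypothetical and are not taken from the PR diff.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a phase latency histogram: phase name -> samples.
#[derive(Debug, Default)]
pub struct PhaseLatencies {
    samples: HashMap<&'static str, Vec<Duration>>,
}

impl PhaseLatencies {
    /// Run `work`, record its wall-clock duration under `phase`,
    /// and pass the closure's result through unchanged.
    pub fn time<T>(&mut self, phase: &'static str, work: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = work();
        self.samples.entry(phase).or_default().push(start.elapsed());
        out
    }

    /// Number of samples recorded so far for `phase`.
    pub fn sample_count(&self, phase: &str) -> usize {
        self.samples.get(phase).map_or(0, Vec::len)
    }
}
```

Wrapping each subphase in a closure keeps the timing code out of the business logic, at the cost of one extra level of nesting per phase.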

Validation

  • PASS: cargo +nightly-2026-06-16 fmt --all -- --check
  • PASS: cargo check -p carbide-site-explorer
  • PASS: cargo clippy -p carbide-site-explorer --lib -- -D warnings
  • Needs martindev rerun for current SHA: CARGO_HOME="$HOME/.cargo" cargo make --no-workspace clippy-flow

Supersedes draft PR #2888, whose PR metadata stayed pinned to the old force-pushed head SHA.

@copy-pr-bot (bot) commented Jun 25, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@coderabbitai (bot, Contributor) commented Jun 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cef213d4-73f1-4b5f-8932-e818abed34e1

📥 Commits

Reviewing files that changed from the base of the PR and between 6a2db3d and a51b191.

📒 Files selected for processing (2)
  • crates/site-explorer/src/lib.rs
  • crates/site-explorer/src/metrics.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • crates/site-explorer/src/metrics.rs
  • crates/site-explorer/src/lib.rs

Summary by CodeRabbit

  • New Features
    • Added finer-grained site exploration telemetry, including per-step latency tracking across the update-explored-endpoints workflow.
    • Exposed additional metrics for exploration planning/candidates/selection, persistence operation counts, and remediation candidate volume.
    • Introduced new telemetry instruments to publish step-latency histograms and update-explored-endpoints count gauges.
  • Bug Fixes
    • Improved metrics emission by clearing step-latency data after recording to prevent accumulation across metric updates.
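
The clear-after-record behaviour in the bug-fix bullet can be sketched as follows. This is an illustration under assumptions, not the code from the PR: `StepLatencyBuffer` and `flush` are hypothetical names, and the `record` callback stands in for whatever histogram the real code writes to.

```rust
use std::time::Duration;

/// Hypothetical buffer of step-latency samples awaiting emission.
#[derive(Debug, Default)]
pub struct StepLatencyBuffer {
    pending: Vec<(&'static str, Duration)>,
}

impl StepLatencyBuffer {
    /// Queue one sample for the next flush.
    pub fn push(&mut self, step: &'static str, elapsed: Duration) {
        self.pending.push((step, elapsed));
    }

    /// Hand every buffered sample to `record` exactly once and leave the
    /// buffer empty, so a later flush cannot re-emit the same samples.
    /// Returns the number of samples emitted.
    pub fn flush(&mut self, mut record: impl FnMut(&'static str, Duration)) -> usize {
        // `mem::take` swaps in an empty Vec before anything is recorded.
        let drained = std::mem::take(&mut self.pending);
        let emitted = drained.len();
        for (step, elapsed) in drained {
            record(step, elapsed);
        }
        emitted
    }
}
```

Without the drain, every metrics update would replay all earlier samples and inflate the histogram counts.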

Walkthrough

The PR adds step-level exploration timings, subphase latencies for update_explored_endpoints, and last-run counts to the site explorer metrics. It threads these measurements through task execution, planning, persistence, and remediation.
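
One way to read "last-run counts" is as gauge-style values that each run overwrites rather than accumulates. The sketch below assumes that reading. The struct, its field names, and `publish`/`snapshot` are hypothetical, and plain atomics stand in for the observable gauge.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical holder for the counts from the most recent run.
#[derive(Debug, Default)]
pub struct LastRunCounts {
    loaded_records: AtomicU64,
    selected_endpoints: AtomicU64,
    db_write_attempts: AtomicU64,
}

impl LastRunCounts {
    /// Overwrite the stored values with this run's counts (no accumulation).
    pub fn publish(&self, loaded: u64, selected: u64, writes: u64) {
        self.loaded_records.store(loaded, Ordering::Relaxed);
        self.selected_endpoints.store(selected, Ordering::Relaxed);
        self.db_write_attempts.store(writes, Ordering::Relaxed);
    }

    /// Read the current values, as a gauge callback might on each scrape.
    pub fn snapshot(&self) -> [(&'static str, u64); 3] {
        [
            ("loaded_records", self.loaded_records.load(Ordering::Relaxed)),
            ("selected_endpoints", self.selected_endpoints.load(Ordering::Relaxed)),
            ("db_write_attempts", self.db_write_attempts.load(Ordering::Relaxed)),
        ]
    }
}
```

Storing rather than adding means a scrape always reflects the latest completed run, which matches gauge semantics better than a monotonically increasing counter would.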

Changes

Site explorer metrics instrumentation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Metrics storage and instrument wiring | crates/site-explorer/src/metrics.rs | SiteExplorationMetrics stores step latencies and last-run counts, and SiteExplorerInstruments exports them through a histogram and observable gauge. |
| Load and selection phase metrics | crates/site-explorer/src/lib.rs | update_explored_endpoints records load, preallocation, interface/index build, candidate, and selection phase counts and latencies. |
| Per-endpoint exploration step timing | crates/site-explorer/src/lib.rs | Each spawned exploration task tracks Redfish, failure-context, and report-enrichment durations and returns them in a structured result. |
| Result aggregation, persistence, and remediation | crates/site-explorer/src/lib.rs | Task-result aggregation emits per-step latencies and records explicit persistence counters plus persistence and remediation latencies. |
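
The per-endpoint rows above describe tasks that time their own steps and return the durations in a structured result for later aggregation. A rough sketch of that shape follows, with loud assumptions: all type and function names are hypothetical, `std::thread` stands in for whatever task runtime the crate uses, the probe is a placeholder, and loading failure context only on a failed probe is a guess.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical per-endpoint step timings returned by each task.
#[derive(Debug, Clone, Copy, Default)]
pub struct StepTimings {
    pub redfish: Duration,
    /// Assumed to be measured only when the probe fails.
    pub failure_context: Option<Duration>,
    pub report_enrichment: Duration,
}

#[derive(Debug)]
pub struct TaskResult {
    pub endpoint: String,
    pub ok: bool,
    pub timings: StepTimings,
}

/// Run `work` and return its result together with the elapsed time.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}

/// Placeholder exploration: the "probe" succeeds for any non-empty address.
pub fn explore(endpoint: String) -> TaskResult {
    let (ok, redfish) = timed(|| !endpoint.is_empty());
    let failure_context = if ok { None } else { Some(timed(|| ()).1) };
    let (_, report_enrichment) = timed(|| ());
    TaskResult {
        endpoint,
        ok,
        timings: StepTimings { redfish, failure_context, report_enrichment },
    }
}

/// Spawn one task per endpoint and collect results in input order.
pub fn explore_all(endpoints: Vec<String>) -> Vec<TaskResult> {
    let handles: Vec<_> = endpoints
        .into_iter()
        .map(|ep| thread::spawn(move || explore(ep)))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("exploration task panicked"))
        .collect()
}

/// Aggregate one step across all results, as the collector side might.
pub fn total_redfish(results: &[TaskResult]) -> Duration {
    results.iter().map(|r| r.timings.redfish).sum()
}
```

Returning timings as data keeps the spawned tasks free of shared metric state, so only the aggregating side needs access to the instruments.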

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the endpoint update metrics instrumentation change. |
| Description check | ✅ Passed | The description matches the added site-explorer timing and count metrics. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@williampnvidia force-pushed the william/site-explorer-update-endpoints-metrics-v2 branch from 338a0c6 to 83f61a8 on June 25, 2026 17:58
@williampnvidia marked this pull request as ready for review on June 25, 2026 18:04
@williampnvidia requested a review from a team as a code owner on June 25, 2026 18:04
@krish-nvidia requested a review from Matthias247 on June 25, 2026 19:29
@github-actions

🔍 Container Scan Summary

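| Service | Total | Critical | High | Medium | Low | Other |
| --- | --- | --- | --- | --- | --- | --- |
| boot-artifacts-aarch64 | 3 | 0 | 0 | 3 | 0 | 0 |
| boot-artifacts-x86_64 | 3 | 0 | 0 | 3 | 0 | 0 |
| forge-admin-cli-x86_64 | 283 | 6 | 24 | 98 | 7 | 148 |
| machine-validation-runner | 744 | 32 | 188 | 267 | 36 | 221 |
| machine_validation | 744 | 32 | 188 | 267 | 36 | 221 |
| machine_validation-aarch64 | 744 | 32 | 188 | 267 | 36 | 221 |
| nvmetal-carbide | 744 | 32 | 188 | 267 | 36 | 221 |
| TOTAL | 3265 | 134 | 776 | 1172 | 151 | 1032 |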

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@williampnvidia force-pushed the william/site-explorer-update-endpoints-metrics-v2 branch from 83f61a8 to 6a2db3d on June 25, 2026 19:35
@williampnvidia force-pushed the william/site-explorer-update-endpoints-metrics-v2 branch from 6a2db3d to a51b191 on June 25, 2026 20:30
@williampnvidia merged commit 59ed192 into NVIDIA:main on Jun 26, 2026
58 checks passed
2 participants