docs(quick-start): add site controller node DPU provisioning requirement#2859
docs(quick-start): add site controller node DPU provisioning requirement#2859shayan1995 wants to merge 2 commits into
Conversation
WalkthroughThe quick-start guide adds DPU provisioning prerequisites for site controller nodes before Kubernetes setup. The NICo container build guide adds instructions to verify that ChangesSite controller DPU prerequisites
NICo build verification
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
🔍 Container Scan SummaryNo Grype artifacts were found to aggregate. |
|
🌿 Preview your docs: https://nvidia-preview-pull-request-2859.docs.buildwithfern.com/infra-controller |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/getting-started/quick-start.md`:
- Around line 44-46: The quick-start markdown text references both the NVIDIA
DOCA documentation and the BlueField Firmware Bundle archive, but only the
archive link is present. Update the paragraph near the referenced sentence by
either adding the missing DOCA documentation link alongside the existing archive
link or rewriting the sentence to mention only the archive; use the surrounding
quick-start section content to keep the wording accurate and the links working.
- Line 42: The quick-start text is ambiguous about which DPU access path to use,
so update the sentence to name the DPU OOB/SSH endpoint explicitly instead of
referring generically to a management interface. Keep the wording aligned with
the surrounding docs terminology in quick-start.md, and make sure readers can
clearly distinguish the DPU OOB/RJ45 SSH path from the BMC/Redfish path.
In `@docs/manuals/building_nico_containers.md`:
- Around line 71-75: The verification snippet after make images-all is not
self-contained because it relies on IMAGE_REGISTRY and IMAGE_TAG being
pre-exported, which can make the docker images filter check the wrong reference
or an empty one. Update the documentation in the building_nico_containers guide
so the verification command derives or sets the same default values used by the
Makefile before running docker images, and apply the same fix to the later
related snippet. Keep the example copy-paste safe and ensure the image count
check works without requiring prior shell setup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d8964236-7ba2-4fb0-bbbe-fd0a54f7f446
📒 Files selected for processing (2)
docs/getting-started/quick-start.mddocs/manuals/building_nico_containers.md
|
|
||
| - Flash the DPU firmware to a supported version using the BlueField Firmware Bundle. | ||
| - Configure the DPU operating mode (DPU mode or NIC mode) to match your site controller networking topology. See the [network prerequisites](prerequisites/network.md) for the supported topologies. | ||
| - Ensure the DPU ARM OS is booted and reachable via its management interface. |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Name the DPU endpoint explicitly.
“Management interface” is ambiguous here; the existing docs distinguish the DPU OOB/RJ45 SSH path from the separate BMC/Redfish path, so readers may validate the wrong interface. Please call out the DPU OOB/SSH endpoint directly.
As per path instructions, review Markdown for correctness and clarity.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/getting-started/quick-start.md` at line 42, The quick-start text is
ambiguous about which DPU access path to use, so update the sentence to name the
DPU OOB/SSH endpoint explicitly instead of referring generically to a management
interface. Keep the wording aligned with the surrounding docs terminology in
quick-start.md, and make sure readers can clearly distinguish the DPU OOB/RJ45
SSH path from the BMC/Redfish path.
Source: Path instructions
| Refer to the NVIDIA DOCA documentation and the BlueField Firmware Bundle download archive for firmware flashing instructions and supported firmware versions: | ||
|
|
||
| [https://developer.nvidia.com/doca-2-9-2-lts-ovs-doca-download-archive?deployment_platform=BlueField&deployment_package=BF-FW-Bundle](https://developer.nvidia.com/doca-2-9-2-lts-ovs-doca-download-archive?deployment_platform=BlueField&deployment_package=BF-FW-Bundle) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Add the missing DOCA link or rewrite this sentence.
This paragraph says “NVIDIA DOCA documentation and the BlueField Firmware Bundle download archive,” but only the archive URL is present. Either add the actual DOCA docs link or remove that claim.
As per path instructions, review Markdown for correctness and working links.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/getting-started/quick-start.md` around lines 44 - 46, The quick-start
markdown text references both the NVIDIA DOCA documentation and the BlueField
Firmware Bundle archive, but only the archive link is present. Update the
paragraph near the referenced sentence by either adding the missing DOCA
documentation link alongside the existing archive link or rewriting the sentence
to mention only the archive; use the surrounding quick-start section content to
keep the wording accurate and the links working.
Source: Path instructions
| After `make images-all` completes, verify that all 14 deployable images were produced: | ||
|
|
||
| ```sh | ||
| docker images --filter "reference=${IMAGE_REGISTRY}/*:${IMAGE_TAG}" \ | ||
| --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}" |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Make the verification commands self-contained.
make images-all uses the Makefile defaults when IMAGE_REGISTRY/IMAGE_TAG are unset, but these shell snippets only work if the reader has already exported matching values. As written, the copy-paste check can silently filter on an empty reference and report the wrong count. As per path instructions, review Markdown for correctness, clarity, spelling, grammar, working links, and whether commands/examples are realistic and safe.
♻️ Suggested fix
-docker images --filter "reference=${IMAGE_REGISTRY}/*:${IMAGE_TAG}" \
+docker images --filter "reference=${IMAGE_REGISTRY:-localhost:5000}/*:${IMAGE_TAG:-latest}" \
--format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"
@@
-docker images --filter "reference=${IMAGE_REGISTRY}/*:${IMAGE_TAG}" \
+docker images --filter "reference=${IMAGE_REGISTRY:-localhost:5000}/*:${IMAGE_TAG:-latest}" \
--format "{{.Repository}}" | wc -lAlso applies to: 97-100
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/manuals/building_nico_containers.md` around lines 71 - 75, The
verification snippet after make images-all is not self-contained because it
relies on IMAGE_REGISTRY and IMAGE_TAG being pre-exported, which can make the
docker images filter check the wrong reference or an empty one. Update the
documentation in the building_nico_containers guide so the verification command
derives or sets the same default values used by the Makefile before running
docker images, and apply the same fix to the later related snippet. Keep the
example copy-paste safe and ensure the image count check works without requiring
prior shell setup.
Source: Path instructions
Documents the expected 14 deployable images, a quick wc -l count check, what to infer from a lower count, and which 3 intermediates are local-only.
If site controller nodes have DPUs, they must be flashed and configured before K8s cluster setup. NICo does not provision its own nodes' DPUs. Adds pointer to NVIDIA DOCA BlueField Firmware Bundle download archive.
3f80dfe to
0846595
Compare
If site controller nodes have DPUs, they must be flashed and configured
before K8s cluster setup. NICo does not provision its own nodes' DPUs.
Adds pointer to NVIDIA DOCA BlueField Firmware Bundle download archive.
Related issues
Type of Change
Breaking Changes
Testing
Additional Notes
The quick-start guide Step 2 ("Prepare the Kubernetes Cluster") had no mention of what to do if site controller nodes are equipped with DPUs. This is a silent failure mode: without pre-provisioned DPUs, the Kubernetes cluster and NICo networking topology may not come up correctly, but the guide gave operators no indication that DPU setup was required before cluster bootstrap.
Adds a "Site controller node DPU requirements" subsection that clarifies: