
[TRTLLM-12529][feat] Graceful exit when lora unsupported #13869

Open

brb-nv wants to merge 1 commit into NVIDIA:main from brb-nv:user/brb/validate-multi-lora

Conversation

@brb-nv (Collaborator) commented May 7, 2026

Description

This PR enables a graceful exit when LoRA is not supported for a module. It also adds a few tests for multi-LoRA with MoE models.

Previously, the failure surfaced as an opaque KeyError deep in the forward pass:

File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3917, in forward
    inputs, gather_ids = self._prepare_inputs(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3702, in _prepare_inputs
    return self._prepare_tp_inputs(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3018, in _prepare_tp_inputs
    lora_params = self._get_lora_params_from_requests(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3522, in _get_lora_params_from_requests
    return self.cuda_graph_lora_manager.prepare_cuda_graph_lora_params(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/peft/lora/cuda_graph_lora_manager.py", line 167, in prepare_cuda_graph_lora_params
    cuda_graph_lora_params.update_weight_pointers(peft_table, slot2task)
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/peft/lora/cuda_graph_lora_params.py", line 262, in update_weight_pointers
    key = self.layer_module2key[(layer_id, module_id)]
          ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
KeyError: (0, 14)

Now, the exception is raised during worker initialization with an actionable message:

[05/07/2026-21:57:23] [TRT-LLM] [E] [executor] Traceback (most recent call last):
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/worker.py", line 282, in worker_main
    worker: GenerationExecutorWorker = worker_cls(
                                       ^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/worker.py", line 62, in __init__
    self.setup_engine()
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/base_worker.py", line 260, in setup_engine
    self.engine = _create_py_executor(
                  ^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/base_worker.py", line 231, in _create_py_executor
    _executor = create_executor(**args)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/py_executor_creator.py", line 863, in create_py_executor
    py_executor = create_py_executor_instance(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/_util.py", line 1291, in create_py_executor_instance
    validate_lora_target_modules_supported(model_engine.model,
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/peft/lora/layer.py", line 594, in validate_lora_target_modules_supported
    raise UnsupportedLoraTargetModulesError(
tensorrt_llm._torch.peft.lora.layer.UnsupportedLoraTargetModulesError: LoRA target module(s) ['moe_4h_to_h', 'moe_gate', 'moe_h_to_4h'] have no LoraLayer registered on this model. Supported on this model: ['attn_dense', 'attn_k', 'attn_q', 'attn_qkv', 'attn_v', 'mlp_gate_up', 'shared_expert_4h_to_h', 'shared_expert_gate', 'shared_expert_h_to_4h']. Remove the unsupported entries from `LoraConfig.lora_target_modules` and from `target_modules` in your adapter checkpoints to proceed.
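For illustration, a minimal sketch of such a fail-fast check is shown below. The `_supported_lora_targets` helper and the `lora_module_name` attribute are hypothetical stand-ins; per the traceback above, the real implementation lives in `tensorrt_llm/_torch/peft/lora/layer.py`, and it resolves target names through `LoraManager.LORA_MODULE_IDS` while scanning the model for registered `LoraLayer` instances.

```python
# Hedged sketch only, not the actual TensorRT-LLM implementation. The
# _supported_lora_targets helper and the `lora_module_name` attribute are
# illustrative assumptions standing in for the real LoraLayer scan.
from typing import Iterable, Optional, Set

import torch.nn as nn


class UnsupportedLoraTargetModulesError(NotImplementedError):
    """Raised when requested LoRA targets have no LoraLayer on the model."""


def _supported_lora_targets(model: nn.Module) -> Set[str]:
    # Collect the canonical LoRA name of every submodule that advertises one.
    names: Set[str] = set()
    for _, module in model.named_modules():
        lora_name = getattr(module, "lora_module_name", None)
        if lora_name is not None:
            names.add(lora_name)
    return names


def validate_lora_target_modules_supported(
        model: nn.Module, target_modules: Optional[Iterable[str]]) -> None:
    if not target_modules:
        return  # nothing requested, nothing to validate
    supported = _supported_lora_targets(model)
    unsupported = sorted(set(target_modules) - supported)
    if unsupported:
        raise UnsupportedLoraTargetModulesError(
            f"LoRA target module(s) {unsupported} have no LoraLayer registered "
            f"on this model. Supported on this model: {sorted(supported)}. "
            "Remove the unsupported entries from `LoraConfig.lora_target_modules` "
            "and from `target_modules` in your adapter checkpoints to proceed.")
```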

Test Coverage

$ pytest -v tests/unittest/_torch/lora/test_lora.py -k validate_lora_target_modules_supported
$ pytest -s -v tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py::TestQwen15MoEMultiLoRA::test_multi_lora_attention
$ pytest -s -v tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py::TestQwen15MoEMultiLoRA::test_multi_lora_attention_and_shared_expert
$ pytest -s -v tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py::TestPerExpertMoELoRARejected::test_per_expert_moe_targets_rejected

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in the PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added validation for LoRA target modules with clear error messages listing supported and unsupported targets
    • Implemented early failure detection to prevent invalid LoRA configurations from proceeding to model initialization
    • Extended LoRA support validation for Mixture of Experts (MoE) models
  • Tests

    • Added comprehensive unit tests for target module validation logic
    • Added Multi-LoRA tests for Qwen MoE models

Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
@brb-nv brb-nv requested review from a team as code owners May 7, 2026 22:10
@brb-nv brb-nv requested a review from schetlur-nv May 7, 2026 22:10
@brb-nv brb-nv requested a review from byshiue May 7, 2026 22:12
@coderabbitai (Bot) commented May 7, 2026

📝 Walkthrough

This PR introduces LoRA target-module validation to prevent unsupported configurations. A new exception class and validation function scan models for registered LoRA layers, resolve target names through a canonical registry, and raise exceptions with informative messages when targets are unsupported or unknown. The executor integrates this validation before LoRA binding. Unit tests verify the function with a minimal attention-only model; integration tests validate multi-LoRA generation on Qwen MoE and confirm fail-fast rejection of per-expert targets.
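A rough sketch of that ordering follows; the argument names are assumptions, and only the placement of the check (before any LoRA binding) is the point.

```python
# Sketch of the integration order only; the real signature of
# create_py_executor_instance in tensorrt_llm/_torch/pyexecutor/_util.py
# differs, and validate_lora_target_modules_supported refers to the
# hypothetical sketch shown earlier in this page.
def create_py_executor_instance(model_engine, lora_config, **kwargs):
    target_modules = list(lora_config.lora_target_modules or [])
    # Fail fast before any LoRA binding so a bad adapter config surfaces at
    # worker initialization instead of as a KeyError deep in forward().
    validate_lora_target_modules_supported(model_engine.model, target_modules)
    ...  # remaining executor construction
```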

Changes

LoRA Target Module Validation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Exception Type | tensorrt_llm/_torch/peft/lora/layer.py | Defines UnsupportedLoraTargetModulesError(NotImplementedError) for reporting unsupported target modules. |
| Validation Function | tensorrt_llm/_torch/peft/lora/layer.py | Implements validate_lora_target_modules_supported() to resolve target names via LoraManager.LORA_MODULE_IDS, scan the model for registered LoraLayer instances, and raise UnsupportedLoraTargetModulesError listing unsupported and supported names. |
| Executor Integration | tensorrt_llm/_torch/pyexecutor/_util.py | Calls the validation in create_py_executor_instance() after building target_modules, failing early for unsupported targets before LoRA module construction. |
| Unit Tests | tests/unittest/_torch/lora/test_lora.py | Tests the validation function with an attention-only model: accepts supported targets, handles empty/None lists, rejects per-expert MoE targets with proper error messages, and raises ValueError for unknown names. |
| Integration Tests | tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py | Validates multi-LoRA on Qwen MoE with attention and shared-expert targets, asserts output changes per adapter, and verifies fail-fast rejection of per-expert targets with informative error messages. |
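To make the Unit Tests row concrete, here is a hedged pytest-style sketch of that test shape; the toy model and names reuse the hypothetical sketch given earlier, not the project's real fixtures in test_lora.py.

```python
# Illustrative test shape only. Assumes validate_lora_target_modules_supported
# and UnsupportedLoraTargetModulesError from the earlier sketch are in scope.
import pytest
import torch.nn as nn


class _AttnOnlyModel(nn.Module):
    # Toy attention-only model; the linear advertises its canonical LoRA name
    # via the hypothetical `lora_module_name` attribute from the sketch above.
    def __init__(self):
        super().__init__()
        self.attn_q = nn.Linear(8, 8)
        self.attn_q.lora_module_name = "attn_q"


def test_accepts_supported_and_empty_targets():
    model = _AttnOnlyModel()
    validate_lora_target_modules_supported(model, ["attn_q"])  # no raise
    validate_lora_target_modules_supported(model, None)        # no-op


def test_rejects_per_expert_moe_targets():
    model = _AttnOnlyModel()
    with pytest.raises(UnsupportedLoraTargetModulesError):
        validate_lora_target_modules_supported(model, ["moe_h_to_4h"])
```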

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 32.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title clearly and concisely describes the main feature being added: enabling graceful exit when LoRA is unsupported for a module. |
| Description check | ✅ Passed | The PR description includes a clear explanation of the problem (KeyError crashes during LoRA resolution) and the solution (graceful early validation with an informative error message), plus relevant test coverage details. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai (Bot) left a comment

🧹 Nitpick comments (1)
tensorrt_llm/_torch/peft/lora/layer.py (1)

568-569: 💤 Low value

Prefer built-in generic types over legacy typing equivalents.

Per Python 3.10+ guidelines, use dict[str, int] and list[str] instead of Dict[str, int] and List[str].

♻️ Suggested fix
-    requested: Dict[str, int] = {}
-    unknown: List[str] = []
+    requested: dict[str, int] = {}
+    unknown: list[str] = []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/_torch/peft/lora/layer.py` around lines 568 - 569, Replace the
legacy typing generics with built-in generics for the variables declared in
layer.py: change the type annotations of requested and unknown from Dict[str,
int] and List[str] to dict[str, int] and list[str] respectively (update any
matching declarations or variable annotations in the scope where requested and
unknown are defined, e.g., the assignments in _torch/peft/lora/layer.py). Also
remove or adjust any unused imports from typing if Dict/List are no longer used
elsewhere in the file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cbed61a6-f278-4683-9a59-64cb584c704c

📥 Commits

Reviewing files that changed from the base of the PR and between fd2eb03 and ea9ce05.

📒 Files selected for processing (4)
  • tensorrt_llm/_torch/peft/lora/layer.py
  • tensorrt_llm/_torch/pyexecutor/_util.py
  • tests/unittest/_torch/lora/test_lora.py
  • tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py

@brb-nv (Collaborator, Author) commented May 8, 2026

/bot run --disable-fail-fast

@tensorrt-cicd (Collaborator): PR_Github #47261 [ run ] triggered by Bot. Commit: ea9ce05

@tensorrt-cicd (Collaborator): PR_Github #47261 [ run ] completed with state FAILURE. Commit: ea9ce05

@byshiue (Collaborator) left a comment

LGTM

@brb-nv (Collaborator, Author) commented May 8, 2026

/bot run --disable-fail-fast

@brb-nv brb-nv enabled auto-merge (squash) May 8, 2026 02:46
@tensorrt-cicd (Collaborator): PR_Github #47292 [ run ] triggered by Bot. Commit: ea9ce05

@tensorrt-cicd (Collaborator): PR_Github #47292 [ run ] completed with state SUCCESS. Commit: ea9ce05
/LLM/main/L0_MergeRequest_PR pipeline #37234 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

