
[TRTLLM-12529][feat] Graceful exit when lora unsupported #13869

Open

brb-nv wants to merge 1 commit into NVIDIA:main from brb-nv:user/brb/validate-multi-lora

Conversation

@brb-nv (Collaborator) commented May 7, 2026

Description

This PR enables a graceful exit when LoRA is not supported for a module. It also adds a few tests for multi-LoRA with MoE models.

Previously, the failure surfaced as an opaque KeyError deep in the forward pass:

File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3917, in forward
    inputs, gather_ids = self._prepare_inputs(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3702, in _prepare_inputs
    return self._prepare_tp_inputs(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3018, in _prepare_tp_inputs
    lora_params = self._get_lora_params_from_requests(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/model_engine.py", line 3522, in _get_lora_params_from_requests
    return self.cuda_graph_lora_manager.prepare_cuda_graph_lora_params(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/peft/lora/cuda_graph_lora_manager.py", line 167, in prepare_cuda_graph_lora_params
    cuda_graph_lora_params.update_weight_pointers(peft_table, slot2task)
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/peft/lora/cuda_graph_lora_params.py", line 262, in update_weight_pointers
    key = self.layer_module2key[(layer_id, module_id)]
          ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
KeyError: (0, 14)

Now, the exception is raised during worker initialization with an actionable message:

[05/07/2026-21:57:23] [TRT-LLM] [E] [executor] Traceback (most recent call last):
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/worker.py", line 282, in worker_main
    worker: GenerationExecutorWorker = worker_cls(
                                       ^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/worker.py", line 62, in __init__
    self.setup_engine()
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/base_worker.py", line 260, in setup_engine
    self.engine = _create_py_executor(
                  ^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/executor/base_worker.py", line 231, in _create_py_executor
    _executor = create_executor(**args)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/py_executor_creator.py", line 863, in create_py_executor
    py_executor = create_py_executor_instance(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/pyexecutor/_util.py", line 1291, in create_py_executor_instance
    validate_lora_target_modules_supported(model_engine.model,
  File "/home/scratch.bbuddharaju_gpu/TensorRT-LLM/tests/unittest/utils/../../../tensorrt_llm/_torch/peft/lora/layer.py", line 594, in validate_lora_target_modules_supported
    raise UnsupportedLoraTargetModulesError(
tensorrt_llm._torch.peft.lora.layer.UnsupportedLoraTargetModulesError: LoRA target module(s) ['moe_4h_to_h', 'moe_gate', 'moe_h_to_4h'] have no LoraLayer registered on this model. Supported on this model: ['attn_dense', 'attn_k', 'attn_q', 'attn_qkv', 'attn_v', 'mlp_gate_up', 'shared_expert_4h_to_h', 'shared_expert_gate', 'shared_expert_h_to_4h']. Remove the unsupported entries from `LoraConfig.lora_target_modules` and from `target_modules` in your adapter checkpoints to proceed.
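For illustration, a minimal sketch of such a fail-fast check is shown below. The `_supported_lora_targets` helper and the `lora_module_name` attribute are hypothetical stand-ins; per the traceback above, the real implementation lives in `tensorrt_llm/_torch/peft/lora/layer.py`, and it resolves target names through `LoraManager.LORA_MODULE_IDS` while scanning the model for registered `LoraLayer` instances.

```python
# Hedged sketch only, not the actual TensorRT-LLM implementation. The
# _supported_lora_targets helper and the `lora_module_name` attribute are
# illustrative assumptions standing in for the real LoraLayer scan.
from typing import Iterable, Optional, Set

import torch.nn as nn


class UnsupportedLoraTargetModulesError(NotImplementedError):
    """Raised when requested LoRA targets have no LoraLayer on the model."""


def _supported_lora_targets(model: nn.Module) -> Set[str]:
    # Collect the canonical LoRA name of every submodule that advertises one.
    names: Set[str] = set()
    for _, module in model.named_modules():
        lora_name = getattr(module, "lora_module_name", None)
        if lora_name is not None:
            names.add(lora_name)
    return names


def validate_lora_target_modules_supported(
        model: nn.Module, target_modules: Optional[Iterable[str]]) -> None:
    if not target_modules:
        return  # nothing requested, nothing to validate
    supported = _supported_lora_targets(model)
    unsupported = sorted(set(target_modules) - supported)
    if unsupported:
        raise UnsupportedLoraTargetModulesError(
            f"LoRA target module(s) {unsupported} have no LoraLayer registered "
            f"on this model. Supported on this model: {sorted(supported)}. "
            "Remove the unsupported entries from `LoraConfig.lora_target_modules` "
            "and from `target_modules` in your adapter checkpoints to proceed.")
```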

Test Coverage

$ pytest -v tests/unittest/_torch/lora/test_lora.py -k validate_lora_target_modules_supported
$ pytest -s -v tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py::TestQwen15MoEMultiLoRA::test_multi_lora_attention
$ pytest -s -v tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py::TestQwen15MoEMultiLoRA::test_multi_lora_attention_and_shared_expert
$ pytest -s -v tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py::TestPerExpertMoELoRARejected::test_per_expert_moe_targets_rejected

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in the PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added validation for LoRA target modules with clear error messages listing supported and unsupported targets
    • Implemented early failure detection to prevent invalid LoRA configurations from proceeding to model initialization
    • Extended LoRA support validation for Mixture of Experts (MoE) models
  • Tests

    • Added comprehensive unit tests for target module validation logic
    • Added Multi-LoRA tests for Qwen MoE models

Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
@brb-nv brb-nv requested review from a team as code owners May 7, 2026 22:10
@brb-nv brb-nv requested a review from schetlur-nv May 7, 2026 22:10
@brb-nv brb-nv requested a review from byshiue May 7, 2026 22:12
@coderabbitai (Bot) commented May 7, 2026

📝 Walkthrough

This PR introduces LoRA target-module validation to prevent unsupported configurations. A new exception class and validation function scan models for registered LoRA layers, resolve target names through a canonical registry, and raise exceptions with informative messages when targets are unsupported or unknown. The executor integrates this validation before LoRA binding. Unit tests verify the function with a minimal attention-only model; integration tests validate multi-LoRA generation on Qwen MoE and confirm fail-fast rejection of per-expert targets.
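A rough sketch of that ordering follows; the argument names are assumptions, and only the placement of the check (before any LoRA binding) is the point.

```python
# Sketch of the integration order only; the real signature of
# create_py_executor_instance in tensorrt_llm/_torch/pyexecutor/_util.py
# differs, and validate_lora_target_modules_supported refers to the
# hypothetical sketch shown earlier in this page.
def create_py_executor_instance(model_engine, lora_config, **kwargs):
    target_modules = list(lora_config.lora_target_modules or [])
    # Fail fast before any LoRA binding so a bad adapter config surfaces at
    # worker initialization instead of as a KeyError deep in forward().
    validate_lora_target_modules_supported(model_engine.model, target_modules)
    ...  # remaining executor construction
```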

Changes

LoRA Target Module Validation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Exception Type | tensorrt_llm/_torch/peft/lora/layer.py | Defines UnsupportedLoraTargetModulesError(NotImplementedError) for reporting unsupported target modules. |
| Validation Function | tensorrt_llm/_torch/peft/lora/layer.py | Implements validate_lora_target_modules_supported() to resolve target names via LoraManager.LORA_MODULE_IDS, scan the model for registered LoraLayer instances, and raise UnsupportedLoraTargetModulesError listing unsupported and supported names. |
| Executor Integration | tensorrt_llm/_torch/pyexecutor/_util.py | Calls the validation in create_py_executor_instance() after building target_modules, failing early for unsupported targets before LoRA module construction. |
| Unit Tests | tests/unittest/_torch/lora/test_lora.py | Tests the validation function with an attention-only model: accepts supported targets, handles empty/None lists, rejects per-expert MoE targets with proper error messages, and raises ValueError for unknown names. |
| Integration Tests | tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py | Validates multi-LoRA on Qwen MoE with attention and shared-expert targets, asserts output changes per adapter, and verifies fail-fast rejection of per-expert targets with informative error messages. |
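To make the Unit Tests row concrete, here is a hedged pytest-style sketch of that test shape; the toy model and names reuse the hypothetical sketch given earlier, not the project's real fixtures in test_lora.py.

```python
# Illustrative test shape only. Assumes validate_lora_target_modules_supported
# and UnsupportedLoraTargetModulesError from the earlier sketch are in scope.
import pytest
import torch.nn as nn


class _AttnOnlyModel(nn.Module):
    # Toy attention-only model; the linear advertises its canonical LoRA name
    # via the hypothetical `lora_module_name` attribute from the sketch above.
    def __init__(self):
        super().__init__()
        self.attn_q = nn.Linear(8, 8)
        self.attn_q.lora_module_name = "attn_q"


def test_accepts_supported_and_empty_targets():
    model = _AttnOnlyModel()
    validate_lora_target_modules_supported(model, ["attn_q"])  # no raise
    validate_lora_target_modules_supported(model, None)        # no-op


def test_rejects_per_expert_moe_targets():
    model = _AttnOnlyModel()
    with pytest.raises(UnsupportedLoraTargetModulesError):
        validate_lora_target_modules_supported(model, ["moe_h_to_4h"])
```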

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 32.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title clearly and concisely describes the main feature being added: enabling graceful exit when LoRA is unsupported for a module. |
| Description check | ✅ Passed | The PR description includes a clear explanation of the problem (KeyError crashes during LoRA resolution) and the solution (graceful early validation with an informative error message), plus relevant test coverage details. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai (Bot) left a comment

🧹 Nitpick comments (1)
tensorrt_llm/_torch/peft/lora/layer.py (1)

568-569: 💤 Low value

Prefer built-in generic types over legacy typing equivalents.

Per Python 3.10+ guidelines, use dict[str, int] and list[str] instead of Dict[str, int] and List[str].

♻️ Suggested fix
-    requested: Dict[str, int] = {}
-    unknown: List[str] = []
+    requested: dict[str, int] = {}
+    unknown: list[str] = []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/_torch/peft/lora/layer.py` around lines 568 - 569, Replace the
legacy typing generics with built-in generics for the variables declared in
layer.py: change the type annotations of requested and unknown from Dict[str,
int] and List[str] to dict[str, int] and list[str] respectively (update any
matching declarations or variable annotations in the scope where requested and
unknown are defined, e.g., the assignments in _torch/peft/lora/layer.py). Also
remove or adjust any unused imports from typing if Dict/List are no longer used
elsewhere in the file.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cbed61a6-f278-4683-9a59-64cb584c704c

📥 Commits

Reviewing files that changed from the base of the PR and between fd2eb03 and ea9ce05.

📒 Files selected for processing (4)
  • tensorrt_llm/_torch/peft/lora/layer.py
  • tensorrt_llm/_torch/pyexecutor/_util.py
  • tests/unittest/_torch/lora/test_lora.py
  • tests/unittest/_torch/modules/tests_lora_modules/test_moe_multi_lora.py

@brb-nv (Collaborator, Author) commented May 8, 2026

/bot run --disable-fail-fast

@tensorrt-cicd (Collaborator): PR_Github #47261 [ run ] triggered by Bot. Commit: ea9ce05

@tensorrt-cicd (Collaborator): PR_Github #47261 [ run ] completed with state FAILURE. Commit: ea9ce05

@byshiue (Collaborator) left a comment

LGTM

@brb-nv (Collaborator, Author) commented May 8, 2026

/bot run --disable-fail-fast

@brb-nv brb-nv enabled auto-merge (squash) May 8, 2026 02:46
@tensorrt-cicd (Collaborator): PR_Github #47292 [ run ] triggered by Bot. Commit: ea9ce05

@tensorrt-cicd (Collaborator): PR_Github #47292 [ run ] completed with state SUCCESS. Commit: ea9ce05
/LLM/main/L0_MergeRequest_PR pipeline #37234 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

