Skip empty MultiLoraAdapter when no LoRAs target a model #1469

Open
fszontagh wants to merge 1 commit into leejet:master from fszontagh:fix/skip-empty-lora-adapter

Conversation

@fszontagh
Contributor

Summary

apply_loras_at_runtime() always wraps each model (cond_stage_model, diffusion_model, first_stage_model) with a MultiLoraAdapter, even when no LoRA tensors actually match that model's prefix. With an empty adapter the model still routes every Linear/Conv2d::forward() through WeightAdapter::forward_with_lora() rather than the direct ggml_ext_linear / ggml_ext_conv_2d path:

// ggml_extend.hpp Linear::forward
if (ctx->weight_adapter) {
    // -> MultiLoraAdapter::forward_with_lora -> empty patch_weight loop -> ggml_ext_linear
    return ctx->weight_adapter->forward_with_lora(...);
}
return ggml_ext_linear(...);    // direct path

For a typical character LoRA targeting only the diffusion model, this means the cond_stage (CLIP/T5/LLM) and first_stage (VAE) graphs both go through the indirect path for no benefit — every patch_weight call iterates an empty lora_models vector and the inner ggml_ext_linear/ggml_ext_conv_2d is invoked from inside the adapter rather than directly.
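To make the no-op concrete, here is a tiny standalone model of that dispatch. It is not the actual ggml_extend.hpp code; ToyLora, ToyAdapter, and ToyLinear are stand-ins that operate on plain floats instead of ggml tensors and only mirror the control flow described above:

#include <cstdio>
#include <vector>

struct ToyLora { float scale; };                       // stand-in for one loaded LoRA

struct ToyAdapter {                                    // stand-in for MultiLoraAdapter
    std::vector<ToyLora> lora_models;                  // empty when nothing matched this model
    float forward_with_lora(float w, float x) const {
        float patched = w;
        for (const auto& l : lora_models)              // zero iterations for an empty adapter
            patched += l.scale;                        // stand-in for the patch_weight step
        return patched * x;                            // stand-in for the inner ggml_ext_linear
    }
};

struct ToyLinear {                                     // stand-in for Linear::forward
    const ToyAdapter* weight_adapter = nullptr;
    float weight = 2.0f;
    float forward(float x) const {
        if (weight_adapter)                            // indirect path, taken even when empty
            return weight_adapter->forward_with_lora(weight, x);
        return weight * x;                             // direct path
    }
};

int main() {
    ToyAdapter empty;                                  // no LoRA targets this model
    ToyLinear wrapped{&empty};
    ToyLinear direct;
    // Same numeric result; the wrapped call just pays the extra hop and the empty loop.
    std::printf("wrapped=%g direct=%g\n", wrapped.forward(3.0f), direct.forward(3.0f));
    return 0;
}

The two calls produce identical results, which is why keeping the empty adapter attached buys nothing for models the LoRA does not target.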

The fix is to only attach the adapter when its lora_models list is non-empty. set_weight_adapter(nullptr) is already called at the top of apply_loras_at_runtime, so skipping the later assignment leaves the adapter correctly cleared.
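Continuing the toy model above, the guard amounts to something like the following. This is only a sketch: the real apply_loras_at_runtime builds the MultiLoraAdapter over ggml tensors, and the exact attach call may differ.

// Sketch only: attach the adapter to the toy model's forward path only when
// at least one LoRA actually matched; otherwise leave it cleared.
void toy_apply_loras(ToyLinear& model, ToyAdapter& adapter,
                     const std::vector<ToyLora>& matched) {
    model.weight_adapter = nullptr;        // mirrors set_weight_adapter(nullptr) at the top
    adapter.lora_models = matched;
    if (!adapter.lora_models.empty()) {    // the fix: skip empty adapters
        model.weight_adapter = &adapter;   // mirrors attaching the adapter
    }
    // If matched is empty, forward() keeps the direct ggml_ext_linear-style path.
}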

Test plan

  • Builds cleanly off master
  • Image generation still works for a Z-Image LoRA targeting only the diffusion model — image is bit-identical to before (the diffusion adapter is non-empty, so its path is unchanged)
  • When LoRA targets only diffusion, cond_stage_model and first_stage_model no longer carry a stale MultiLoraAdapter
  • Reviewer to confirm desired behavior when subsequent calls add cond_stage/first_stage LoRAs (the adapter will be (re)attached on the next apply_loras_at_runtime call, which I believe is the intended flow)

apply_loras_at_runtime always wraps each model (cond_stage, diffusion,
first_stage) with a MultiLoraAdapter, even when no LoRA tensors match
that model's prefix. The empty adapter routes every linear/conv through
forward_with_lora() instead of the direct kernel path, adding an extra
pointer indirection and a no-op iteration over an empty lora_models
vector for every weighted op in the model.

Skip the wrap when the matching lora_models list is empty so unaffected
models keep the fast direct path. This also avoids attaching a stale
adapter to first_stage_model in the common case where the LoRA only
targets the diffusion model.

set_weight_adapter(nullptr) is already called at the top of
apply_loras_at_runtime, so skipping the assignment leaves the adapter
correctly cleared.
fszontagh force-pushed the fix/skip-empty-lora-adapter branch from 0118ef0 to 5da694a on May 1, 2026
