Skip empty MultiLoraAdapter when no LoRAs target a model #1469
Open
fszontagh wants to merge 1 commit into leejet:master
Conversation
apply_loras_at_runtime always wraps each model (cond_stage, diffusion, first_stage) with a MultiLoraAdapter, even when no LoRA tensors match that model's prefix. The empty adapter routes every linear/conv through forward_with_lora() instead of the direct kernel path, adding an extra pointer indirection and a no-op iteration over an empty lora_models vector for every weighted op in the model.

This change skips the wrap when the matching lora_models list is empty, so unaffected models keep the fast direct path. It also avoids attaching a stale adapter to first_stage_model in the common case where the LoRA only targets the diffusion model. set_weight_adapter(nullptr) is already called at the top of apply_loras_at_runtime, so skipping the assignment leaves the adapter correctly cleared.
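As a rough illustration of that guard, here is a minimal, self-contained C++ sketch. MultiLoraAdapter, lora_models, and set_weight_adapter follow the names used in this PR; Model, LoraModel, and maybe_attach are hypothetical scaffolding for the example, not the actual stable-diffusion.cpp code:

```cpp
// Minimal sketch of the "skip empty adapter" guard described above.
// Only MultiLoraAdapter / lora_models / set_weight_adapter mirror the PR;
// everything else is illustrative scaffolding.
#include <cstdio>
#include <memory>
#include <string>
#include <vector>

struct LoraModel { std::string name; };

struct MultiLoraAdapter {
    std::vector<LoraModel> lora_models;  // LoRAs whose tensors matched this model's prefix
};

struct Model {
    std::shared_ptr<MultiLoraAdapter> adapter;
    void set_weight_adapter(std::shared_ptr<MultiLoraAdapter> a) { adapter = std::move(a); }
};

// Attach the adapter only when it actually carries LoRAs. Adapters were
// already cleared via set_weight_adapter(nullptr) at the top of
// apply_loras_at_runtime, so skipping the assignment leaves the model
// on the direct kernel path.
static void maybe_attach(Model& model, std::shared_ptr<MultiLoraAdapter> adapter) {
    if (adapter && !adapter->lora_models.empty()) {
        model.set_weight_adapter(std::move(adapter));
    }
}

int main() {
    Model diffusion, vae;

    auto diff_adapter = std::make_shared<MultiLoraAdapter>();
    diff_adapter->lora_models.push_back({"character_lora"});
    auto vae_adapter = std::make_shared<MultiLoraAdapter>();  // nothing matched the VAE prefix

    maybe_attach(diffusion, diff_adapter);  // wrapped: LoRA targets the diffusion model
    maybe_attach(vae, vae_adapter);         // skipped: VAE keeps the fast direct path

    std::printf("diffusion wrapped: %s, vae wrapped: %s\n",
                diffusion.adapter ? "yes" : "no",
                vae.adapter ? "yes" : "no");
}
```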
Summary
apply_loras_at_runtime() always wraps each model (cond_stage_model, diffusion_model, first_stage_model) with a MultiLoraAdapter, even when no LoRA tensors actually match that model's prefix. With an empty adapter the model still routes every Linear/Conv2d::forward() through WeightAdapter::forward_with_lora() rather than the direct ggml_ext_linear/ggml_ext_conv_2d path.

For a typical character LoRA targeting only the diffusion model, this means the cond_stage (CLIP/T5/LLM) and first_stage (VAE) graphs both go through the indirect path for no benefit: every patch_weight call iterates an empty lora_models vector, and the inner ggml_ext_linear/ggml_ext_conv_2d is invoked from inside the adapter rather than directly.
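To make the two call paths concrete, here is an illustrative sketch. forward_with_lora and ggml_ext_linear are named after the PR text, but the signatures and the Tensor, WeightAdapter, and Linear types are simplified placeholders, not the real ggml API:

```cpp
// Illustrative comparison of the indirect vs. direct call path.
// Types and signatures are simplified stand-ins for the real code.
#include <vector>

struct Tensor {};
struct LoraModel {};

static Tensor ggml_ext_linear(const Tensor& x) {
    return x;  // stands in for the direct kernel path
}

struct WeightAdapter {
    std::vector<LoraModel> lora_models;

    // Indirect path: even with zero matching LoRAs, every weighted op pays
    // for this extra call plus a no-op loop before reaching the kernel.
    Tensor forward_with_lora(const Tensor& x) {
        Tensor out = ggml_ext_linear(x);
        for (const LoraModel& lora : lora_models) {
            (void)lora;  // patch_weight would apply each LoRA delta here
        }
        return out;
    }
};

struct Linear {
    WeightAdapter* adapter = nullptr;  // set via set_weight_adapter in the real code

    Tensor forward(const Tensor& x) {
        if (adapter) {
            return adapter->forward_with_lora(x);  // indirect: pointer hop + loop
        }
        return ggml_ext_linear(x);  // direct: the fast path this PR preserves
    }
};
```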
The fix is to attach the adapter only when its lora_models list is non-empty. set_weight_adapter(nullptr) is already called at the top of apply_loras_at_runtime, so skipping the later assignment leaves the adapter correctly cleared.

Test plan
- Verified against master that cond_stage_model and first_stage_model no longer carry a stale MultiLoraAdapter after each apply_loras_at_runtime call (which I believe is the intended flow).