feat: Add LoRA and ControlNet support for Qwen Image models #754
base: main
- Implemented `update_lora_params` method in `NunchakuFeedForward`, `NunchakuQwenAttention`, and `NunchakuQwenImageTransformerBlock` to handle LoRA weights.
- Added `set_lora_strength` method in `SVDQW4A4Linear` for dynamic adjustment of LoRA scaling.
- Enhanced `NunchakuQwenImageTransformer2DModel` with methods to update and restore original parameters for LoRA.
- Introduced handling for unquantized and quantized parts of the model in LoRA updates.
Merged this into my local repo, both nunchaku and ComfyUI-nunchaku. Some of my trained LoRAs produce wrong results, as if the weights were applied twice; others work well. In the last couple of days I also played around with Bluear7878's PRs. With Bluear7878's changes everything seems to be working as expected, so there might be…
Thanks a lot for testing and for your detailed feedback! Regarding the issue where the LoRA weights seem to be applied twice: that might be because my code currently uses AMPLIFICATION_FACTOR = 2.0 to compensate for potential loss from W4A4 quantization. When I was testing my own LoRA, the results looked too weak without this amplification, so I kept it. If others are finding that the LoRA effect is too strong, I'll remove or adjust this compensation factor. So far, I've only trained one LoRA myself, so I don't yet have a good sense of the overall strength balance. As for the issue with generating 1920x720 images: I just tested it on my end, and it seems to be working correctly. I'm currently using the FP4 R128 model.
You are welcome. I would say AMPLIFICATION_FACTOR = 2.0 might be too high, since the LoRAs work fine without nunchaku, but who am I to judge? :) Regarding the resolution, I am using svdq-int4_r128-qwen-image-lightningv1.1-8steps, and when running at 1920x1080 or 1080x1920 I got a tensor size error in the KSampler step. Sorry, I don't have the exact error message anymore.
You can try using a model without the fused Lightning LoRA, and instead load the Lightning LoRA separately through the LoRA Loader node for testing.
- Added logic to remove '.default.' from PEFT-style naming for LoRA weights, ensuring compatibility with models trained using PEFT/Kohya.
- Updated key transformation to handle cases where '.default.weight' appears at the end of the key.
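The key cleanup described above can be sketched roughly as follows (an illustrative helper, not the PR's actual code; the function name is hypothetical):

```python
def normalize_peft_key(key: str) -> str:
    """Strip the '.default.' adapter scope that PEFT inserts into
    state-dict keys, e.g. 'attn.to_q.lora_A.default.weight' ->
    'attn.to_q.lora_A.weight'. Handles both the mid-key case and
    keys that end in '.default.weight'."""
    if key.endswith(".default.weight"):
        key = key[: -len(".default.weight")] + ".weight"
    return key.replace(".default.", ".")
```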
I'd probably leave the amplification factor at 1.0, since weights can always be turned up; turning them down when they're already amplified by 2 would mean you lose precision.
You can test it. When I adjusted AMPLIFICATION_FACTOR from 2 to 1, I found the applied LoRA intensity was not strong enough. You can compare at which strength 2 and 1 give exactly correct results.
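For reference, the factor being debated enters the computation roughly like this (a simplified numpy sketch; `strength` and the function name are illustrative, not the PR's exact code):

```python
import numpy as np

AMPLIFICATION_FACTOR = 2.0  # the value under discussion; 1.0 disables the boost


def apply_lora_delta(W, lora_A, lora_B, strength=1.0):
    """Fold a LoRA update into a base weight:
    W' = W + strength * AMPLIFICATION_FACTOR * (B @ A).
    With the factor at 2.0 the LoRA effect is doubled, which matches
    the 'applied twice' impression reported above."""
    return W + strength * AMPLIFICATION_FACTOR * (lora_B @ lora_A)
```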
Image doubling / unusual placement issue. Both of these images use the same seed, same sampler settings, same everything; both are two-stage. The prompt was only … Stage 1 has NO LoRAs on it for the first 4 steps; Stage 2 has just the 8-step lightning LoRA on it for the last 4 steps. These two images should be identical (more or less). **The other PR for LoRAs: #680** From my own experience, the second one is correct: Qwen prefers to put objects dead-center, and I've consistently seen this PR's version of nunchaku placing objects on the right when using wide ratios. More examples of this PR's right-placement of objects, all with different seeds:
I’ve addressed the issue with AMPLIFICATION_FACTOR = 1 (LoRA strength) and the position ID generation that caused characters to |
- Changed the amplification factor from 2.0 to 1.0 to address quantization precision loss in W4A4 models.
Thank you for the fixes, works great now.
Sorry for the noob question, I'm fairly new to ComfyUI. Is there any tutorial on how to implement this in my ComfyUI? I'm using the ComfyUI-Easy-Install build with the latest nunchaku 1.01.
Hi, thank you for this PR. I'm trying to get this code to work with a diffusers pipeline, but without success so far. With one LoRA, I get this error: … With two composed LoRAs, I get an image full of noise. Could you please provide an example of:
I have the feeling that code from your ComfyUI PR (especially in the …
- Added functionality to handle `.alpha` parameters for scaling `lora_A` weights.
- Extracted alpha values from tensors and applied scaling based on the rank of `lora_A` weights.
- Updated `update_lora_params` method to support multiple LoRA compositions, allowing for proper handling of composed LoRAs.
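The `.alpha` convention scales the low-rank product by `alpha / rank`, which can be folded into `lora_A` directly; a minimal sketch of the scaling described above (illustrative, not the PR's exact code):

```python
import numpy as np


def scale_lora_A_by_alpha(lora_A, alpha):
    """Apply Kohya-style '.alpha' scaling: the effective update is
    (alpha / rank) * (B @ A), so lora_A can be pre-multiplied by
    alpha / rank. The rank is the leading dimension of lora_A."""
    rank = lora_A.shape[0]
    return lora_A * (alpha / rank)
```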
Hello! Thank you for your feedback. Regarding the issues you encountered:
I have fixed the combined LoRAs issue in this PR:
Usage examples have been added to the PR description, including:
Please refer to the "API Examples" section in the updated PR description.
Thank you for your answer. After further testing with your new version, it appears that my problem occurs whenever I use a lightning LoRA. This is the full code I use:

```python
import torch
from diffusers import QwenImagePipeline
from huggingface_hub import hf_hub_download
from nunchaku import NunchakuQwenImageTransformer2DModel

transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(
    "nunchaku-tech/nunchaku-qwen-image/svdq-int4_r32-qwen-image.safetensors"
)
lora_path = hf_hub_download(
    repo_id="lightx2v/Qwen-Image-Lightning",
    filename="Qwen-Image-Lightning-8steps-V2.0-bf16.safetensors",
)
transformer.update_lora_params(lora_path)

pipeline = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipeline.enable_model_cpu_offload()

generator = torch.Generator(device="cpu").manual_seed(42)
output_image = pipeline(
    prompt="GHIBSKY style painting, sign saying 'Flux Ghibsky'",
    width=1024,
    height=1024,
    num_inference_steps=8,
    true_cfg_scale=1.0,
    generator=generator,
).images[0]
output_image.save("qwen-image.png")
```

With this code, it crashes with the following error:

```
python3: /nunchaku/src/kernels/zgemm/gemm_w4a4_launch_impl.cuh:482: static void nunchaku::kernels::GEMM_W4A4_Launch<Config, USE_FP4>::quantize_w4a4_act_fuse_lora(Tensor, Tensor, Tensor, Tensor, Tensor, Tensor, bool, bool) [with Config = nunchaku::kernels::GEMMConfig_W4A4<true>; bool USE_FP4 = false]: Assertion `lora_down.shape[0] == N' failed.
```

I also tried without CPU offloading just in case (as I saw it may not work with lightning LoRAs, but I think you're referring to offloading at the transformer level, not the pipeline one). Since I have an RTX 4090, I can do without CPU offloading by using a quantized text encoder:

```python
import torch
from diffusers import QwenImagePipeline
from transformers import Qwen2_5_VLForConditionalGeneration
from huggingface_hub import hf_hub_download
from nunchaku import NunchakuQwenImageTransformer2DModel

text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "unsloth/Qwen2.5-VL-7B-Instruct-unsloth-bnb-4bit",
    dtype=torch.bfloat16,
    device_map="auto",
)
transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(
    "nunchaku-tech/nunchaku-qwen-image/svdq-int4_r32-qwen-image.safetensors"
)
lora_path = hf_hub_download(
    repo_id="lightx2v/Qwen-Image-Lightning",
    filename="Qwen-Image-Lightning-8steps-V2.0-bf16.safetensors",
)
transformer.update_lora_params(lora_path)

pipeline = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image",
    text_encoder=text_encoder,
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

generator = torch.Generator(device="cpu").manual_seed(42)
output_image = pipeline(
    prompt="GHIBSKY style painting, sign saying 'Flux Ghibsky'",
    width=1024,
    height=1024,
    num_inference_steps=8,
    true_cfg_scale=1.0,
    generator=generator,
).images[0]
output_image.save("qwen-image.png")
```

But the result is the same: I still get the previous error. Adding a call to … When using the other PR's branch, it's working and I get this image:
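For context on the `lora_down.shape[0] == N` assertion: it fires when a LoRA tensor's dimensions don't match the layer it is being fused into. A rough, hypothetical Python analogue of that kind of shape validation (the function name, argument layout, and shape conventions are assumptions, not nunchaku's actual API):

```python
def validate_lora_shapes(out_features, in_features, down_shape, up_shape):
    """Check that a (down, up) LoRA pair fits a linear layer.

    Assumed conventions: lora_down has shape (rank, in_features) and
    lora_up has shape (out_features, rank). Returns the rank if the
    shapes are consistent, otherwise raises ValueError -- roughly what
    the failing kernel assertion enforces in C++.
    """
    rank = down_shape[0]
    if down_shape != (rank, in_features):
        raise ValueError("lora_down does not match the layer's input size")
    if up_shape != (out_features, rank):
        raise ValueError("lora_up does not match the layer's output size")
    return rank
```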
In addition to this issue, I noticed that, when using a LoRA other than a lightning LoRA, it works as if the LoRA had not been applied, even though there are no errors. For instance, with this code:

```python
import torch
from diffusers import QwenImagePipeline
from huggingface_hub import hf_hub_download
from nunchaku import NunchakuQwenImageTransformer2DModel

transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(
    "nunchaku-tech/nunchaku-qwen-image/svdq-int4_r32-qwen-image.safetensors"
)
lora_path = hf_hub_download(
    repo_id="Raelina/Raena-Qwen-Image",
    filename="raena_qwen_image_lora_v0.1_diffusers_fix.safetensors",
)
transformer.update_lora_params(lora_path)

pipeline = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipeline.enable_model_cpu_offload()

generator = torch.Generator(device="cpu").manual_seed(42)
output_image = pipeline(
    prompt="anime illustration of a girl with long black hair with blunt bangs and purple eyes, wearing a blue kimono with purple floral prints and a purple obi. She is looking at the viewer with a slight smile, standing outdoors at night. The background features a brightly lit food stand with lanterns and a blurred figure in the distance. The girl is positioned slightly to the right, with a three-quarter view from the front. The scene has a festival atmosphere, with warm yellow and orange lights from the lanterns. The viewing angle is slightly below eye level, focusing on her upper body.",
    negative_prompt=" ",
    width=1024,
    height=1408,
    num_inference_steps=50,
    true_cfg_scale=4.0,
    generator=generator,
).images[0]
output_image.save("qwen-image.png")
```

I get the following image:
But without applying any LoRA, I get:
And when using the other PR's branch, I get:
This last image seems more in line with the expected result when using this LoRA. Do you experience the same behaviour, or do you think it's a configuration issue on my end? Finally, I would also like to add that there appears to be a small error in an import of `SVDQW4A4Linear`. Shouldn't it be `from .linear import SVDQW4A4Linear` instead of `from ..linear import SVDQW4A4Linear`? This line could even be removed completely, since there is already an identical import at the top of the file, right? (line 11)
I've just tested the ComfyUI version of this PR with the LoRA from your example, and it seems to be working as expected. Both use the lightning LoRA and the same seed.
Working great. Tried with the int4 model with a baked-in 4-step LoRA. Thank you very much for your work!
If there is an issue with CPU offload compatibility (ComfyUI integration), does that mean your branch isn't really suitable for GPUs with 8 GB of VRAM? Thanks.
Thank you for all your work! I just wanted to ask whether this will get merged and what is stopping it. It seems very important, now that many LoRAs are popular (like Multi-View, Fusion, Relight).
I'm having the same issue and would like to ask how you solved it.
Hi, I deployed this PR and then used the code below. The resulting image is blurry:
I ran into the same problem. Have you solved it?
































📄 English Version
🎯 PR Title
feat: Add LoRA support for Qwen Image models
📝 Description
Overview
This PR adds comprehensive LoRA (Low-Rank Adaptation) support for Qwen Image models, enabling users to leverage LoRA fine-tuned models within Nunchaku's quantized inference framework for more flexible model customization and reduced memory footprint.
Key Features
1. New Qwen Image LoRA Module (`nunchaku/lora/qwenimage/`)
- Format Conversion (`diffusers_converter.py`, `nunchaku_converter.py`)
- LoRA Composition (`compose.py`)
- Utility Functions (`utils.py`, `packer.py`)
2. Core Model Enhancements
- Attention Module (`models/attention.py`)
- Linear Layer (`models/linear.py`)
- Transformer (`models/transformers/transformer_qwenimage.py`)
API Examples
- Using Single LoRA (Adjustable Strength)
- Using Multiple Composed LoRAs (Independent Strengths)
- Format Conversion
Technical Highlights
- **`.alpha` Parameter Support**: handles `.alpha` parameters and various naming conventions (correctly handled in both `to_diffusers` and `compose_lora`)
- **`num_loras` Parameter Support**: fixed the composed-LoRA base-model double-merging issue, ensuring correct multi-LoRA usage in Diffusers pipelines
Code Statistics
📂 File List
New Files:
- `nunchaku/lora/qwenimage/__init__.py` - LoRA module interface
- `nunchaku/lora/qwenimage/compose.py` - Multi-LoRA composition logic
- `nunchaku/lora/qwenimage/diffusers_converter.py` - Diffusers format conversion
- `nunchaku/lora/qwenimage/nunchaku_converter.py` - Nunchaku format conversion
- `nunchaku/lora/qwenimage/packer.py` - Weight packing utilities
- `nunchaku/lora/qwenimage/utils.py` - Utility functions

Modified Files:
- `nunchaku/models/attention.py` - Added LoRA update methods (+73 lines)
- `nunchaku/models/linear.py` - Added LoRA strength and scaling support (+69 lines)
- `nunchaku/models/transformers/transformer_qwenimage.py` - Deep LoRA integration (+315 lines)

Testing
Compatibility
CPU Offload Compatibility (ComfyUI Integration)
Some LoRA files (e.g., Qwen-Image-Lightning) train different layers for different transformer blocks, resulting in inconsistent internal structures (ranks) across blocks. In ComfyUI, QwenImage uses a Python-based `CPUOffloadManager` for CPU offload, which requires all blocks to have identical structure, so these LoRAs cannot be used together with CPU offload.

Scope: only affects ComfyUI scenarios with CPU offload enabled; standard inference is unaffected.
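The structural requirement can be sketched as follows (a simplified, hypothetical view in which each block is summarized by a `{layer_name: lora_rank}` dict; not the actual `CPUOffloadManager` code):

```python
def blocks_identical_structure(block_ranks):
    """Return True only if every block's {layer_name: rank} summary
    matches the first block exactly -- the condition a fixed-buffer
    offload manager needs so that any block fits the same buffer."""
    if not block_ranks:
        return True
    first = block_ranks[0]
    return all(block == first for block in block_ranks[1:])
```

A Lightning LoRA that trains, say, rank-32 adapters on some blocks and rank-16 (or none) on others would fail this check, which is why such LoRAs conflict with the ComfyUI CPU offload path.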
Symptom:
Solutions:
Technical Reason:
- Uses `CPUOffloadManager`, which copies parameters through fixed buffer blocks and requires all blocks to have identical structure

📚 Related Issues
Closes #[issue number] (if applicable)
✅ Checklist