Releases: kantan-kanto/ComfyUI-MultiModal-Prompt-Nodes
v1.0.10 – Qwen3.5 Support and Stability Improvements
- Added local GGUF support for Qwen3.5 models
- Implemented proper Qwen3.5 handler routing with `Qwen35ChatHandler`
- Fixed incorrect fallback to `Qwen3VLChatHandler` for Qwen3.5 model names
- Improved mmproj handling for Qwen3.5 (requirement checks + auto-detection flow)
- Added post-run `cleanup()` calls to VisionLLMNode, WanVideoPromptGenerator, and QwenImageEditPromptGenerator
- Refined cleanup lifecycle with `cleanup(finalize=False/True)` for regular unload vs. final teardown
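The two-stage lifecycle can be sketched as follows. This is a hypothetical illustration of the pattern described above, not the actual node implementation; class and attribute names are assumptions.

```python
class GGUFModelManager:
    """Hypothetical sketch of a two-stage cleanup lifecycle; the real
    manager in the repo may differ in names and details."""

    def __init__(self):
        self.model = None          # llama.cpp model handle (heavy, holds VRAM)
        self.chat_handler = None   # vision/chat handler (cheap to keep around)

    def cleanup(self, finalize=False):
        # Regular unload: drop the model so memory is released between runs
        self.model = None
        if finalize:
            # Final teardown: also release the chat handler and related state
            self.chat_handler = None

mgr = GGUFModelManager()
mgr.model, mgr.chat_handler = object(), object()  # stand-ins for loaded resources
mgr.cleanup()               # post-run unload keeps the handler for reuse
mgr.cleanup(finalize=True)  # full teardown releases everything
```

The split lets a node free VRAM after every run while avoiding re-initializing cheap state until final teardown.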
v1.0.9 - Local GGUF Discovery Expansion, mmproj Filtering, and Qwen/Wan Prompt Quality Improvements
Highlights
This release focuses on four major areas:
- Expanded local GGUF model discovery
- Safer and more intuitive mmproj selection
- Improved Qwen / Wan prompt rewriting quality
- Better robustness for text-only and image-to-video workflows
What Changed
Local GGUF model discovery
- Expanded local Qwen-family GGUF model search paths
- Added `models/text_encoders` and all subdirectories under both `models/LLM` and `models/text_encoders` to the search paths
- Centralized local model path and mmproj path resolution in `local_gguf_utils.py`
- Reduced duplicated path-handling logic across nodes
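The expanded discovery could be sketched like this. The function name and return shape are illustrative assumptions, not the actual `local_gguf_utils.py` API; only the directory names come from the notes above.

```python
from pathlib import Path

def candidate_gguf_models(models_root: str) -> list:
    """Collect .gguf files under models/LLM and models/text_encoders,
    including all subdirectories (illustrative sketch only)."""
    roots = [Path(models_root) / "LLM", Path(models_root) / "text_encoders"]
    found = []
    for root in roots:
        if root.is_dir():
            # rglob("*.gguf") covers the root and every subdirectory
            found.extend(sorted(root.rglob("*.gguf")))
    return found
```

Centralizing this in one helper is what lets every node share the same search behavior instead of duplicating path logic.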
mmproj handling and model selection
- Added UI-side mmproj filtering so only mmproj files in the same directory as the selected model are shown
- Improved mmproj resolution behavior for local models
- Explicitly forces text-only mode when `mmproj = (Not required)` is selected
- Prevents unnecessary or incorrect Vision handler usage in text-only workflows
- Improves safety when switching between local model configurations
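A minimal sketch of that same-directory filter, assuming hypothetical names for the helper and its arguments:

```python
from pathlib import Path

def filter_mmproj_choices(model_path: str, mmproj_paths: list) -> list:
    """Offer only mmproj files that live next to the selected model,
    plus an explicit "(Not required)" entry that forces text-only mode.
    Illustrative sketch; not the actual UI code."""
    model_dir = Path(model_path).parent
    same_dir = [p for p in mmproj_paths if Path(p).parent == model_dir]
    return ["(Not required)"] + same_dir
```

Restricting choices to the model's own directory avoids pairing a model with an mmproj built for a different architecture.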
Prompt rewriting quality
- Added dedicated system prompt flows for:
  - `qwen_image`
  - `qwen_image_edit`
  - `wan_t2v`
  - `wan_i2v`
- Strengthened prompt instructions so outputs are more likely to contain only the final prompt body
- Reduced verbose analysis-style or heading-based outputs
- Added a second-pass Simplified Chinese normalization flow when Chinese output is requested but another language is returned
- Preserves quoted text during second-pass normalization to avoid breaking user-specified text
- Improved Qwen2.5-VL behavior in `Qwen Image Edit Prompt Generator` by fixing system prompt application issues
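The quote-preserving second pass can be sketched as: protect quoted spans with placeholders before re-normalizing, then restore them afterwards. Function and placeholder names are illustrative assumptions, not the actual code.

```python
import re

def normalize_preserving_quotes(text: str, normalize) -> str:
    """Run a normalization pass (e.g. a second LLM pass to Simplified
    Chinese) while leaving user-quoted text untouched. Hypothetical sketch."""
    quoted = re.findall(r'"[^"]*"', text)
    protected = text
    for i, q in enumerate(quoted):
        # \x00 sentinels are unlikely to be altered by the normalizer
        protected = protected.replace(q, f"\x00{i}\x00", 1)
    result = normalize(protected)
    for i, q in enumerate(quoted):
        result = result.replace(f"\x00{i}\x00", q)
    return result
```

This keeps text the user explicitly quoted (e.g. on-image captions) byte-identical through the normalization pass.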
Node behavior and robustness
- `Qwen-Image` can now be used without an image input for text-only prompt generation
- Local text-only `Qwen-Image` runs no longer require mmproj
- Increased local inference `max_tokens` and `n_ctx` for longer prompt generation
- Added explicit validation for missing image input in Image-to-Video mode
- Improved output control for Wan prompt generation in Chinese-targeted workflows
Upgrade Notes
- After upgrading, you may need to reselect your GGUF model and mmproj file once
- This is because internal model path handling changed with the expanded search paths
- If you use local GGUF models, verify that the selected mmproj still matches the model directory
- Users of `Qwen Image Edit Prompt Generator` with Qwen2.5-VL should see improved output quality in this release
Notes
- Vision behavior still depends on the installed `llama-cpp-python` build and backend environment
- Some models may still show different output quality depending on whether they are used locally or via API
- Chinese-targeted prompt generation should now be more stable, but final output quality still depends on model behavior
v1.0.8 - Bug Fix
- Fixed an issue where `Qwen2.5-VL` was always loaded in text-only mode even when a valid mmproj file was specified
- Improved mmproj auto-detection logic
v1.0.7 - Stability Update
- Fixed incorrect detection of Qwen3-VL when mmproj is set to (Not required).
- Disabled automatic mmproj detection and prevented use of the VL handler in this case.
- Updated GGUFModelManager.load_model and node-side mmproj interpretation to correctly respect (Not required).
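The `(Not required)` interpretation amounts to something like the following. This is an assumption about the mapping, not the actual `GGUFModelManager.load_model` signature:

```python
def resolve_mmproj(mmproj_choice):
    """Map the UI selection to a concrete mmproj path, or None for
    text-only mode (skipping the VL handler and auto-detection).
    Illustrative sketch only."""
    if mmproj_choice in (None, "", "(Not required)"):
        return None
    return mmproj_choice
```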
v1.0.6 – Stability and Documentation Update
This release focuses on stability improvements and documentation cleanup ahead of the initial Comfy Registry publishing.
Changes
- Improved stability when switching between Qwen3-VL GGUF models
- Fixed mmproj reuse issues in local vision models
- Refined internal GGUF model lifecycle management
- Clarified project scope as a prompt generator for QwenImageEdit and Wan2.2
- Reorganized Credits and Dependencies for clearer attribution
- Updated llama-cpp-python installation notes to reference the JamePeng fork documentation
Notes
- No breaking changes to node interfaces
- This is the first version published to the Comfy Registry
v1.0.5 - Initial Release
Multimodal prompt generation nodes for ComfyUI with local Qwen-VL GGUF support.
What's Included
- Vision LLM Node - Local GGUF models (Qwen2.5-VL, Qwen3-VL) with multi-image input
- Qwen Image Edit - Image editing prompt optimization (local + cloud API)
- Wan Video Generator - T2V/I2V prompt enhancement for Wan2.2
Key Features
✅ Multi-image batch input support
✅ CPU/GPU device selection
✅ 5 style presets (raw, default, detailed, concise, creative)
✅ Auto-detect mmproj for Qwen3-VL
✅ Optimized for Chinese language prompts
✅ GPL-3.0 licensed with proper attribution
Installation
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/kantan-kanto/ComfyUI-MultiModal-Prompt-Nodes.git
cd ComfyUI-MultiModal-Prompt-Nodes
pip install -r requirements.txt
```

Full documentation: README.md