Releases: kantan-kanto/ComfyUI-MultiModal-Prompt-Nodes
v1.0.10 – Qwen3.5 Support and Stability Improvements
- Added local GGUF support for Qwen3.5 models
- Implemented proper Qwen3.5 handler routing with `Qwen35ChatHandler`
- Fixed incorrect fallback to `Qwen3VLChatHandler` for Qwen3.5 model names
- Improved mmproj handling for Qwen3.5 (requirement checks + auto-detection flow)
- Added post-run `cleanup()` calls to VisionLLMNode, WanVideoPromptGenerator, and QwenImageEditPromptGenerator
- Refined cleanup lifecycle with `cleanup(finalize=False/True)` for regular unload vs. final teardown
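The two-stage lifecycle can be sketched as follows. This is a hypothetical illustration of the pattern described above, not the actual node implementation; class and attribute names are assumptions.

```python
class GGUFModelManager:
    """Hypothetical sketch of a two-stage cleanup lifecycle; the real
    manager in the repo may differ in names and details."""

    def __init__(self):
        self.model = None          # llama.cpp model handle (heavy, holds VRAM)
        self.chat_handler = None   # vision/chat handler (cheap to keep around)

    def cleanup(self, finalize=False):
        # Regular unload: drop the model so memory is released between runs
        self.model = None
        if finalize:
            # Final teardown: also release the chat handler and related state
            self.chat_handler = None

mgr = GGUFModelManager()
mgr.model, mgr.chat_handler = object(), object()  # stand-ins for loaded resources
mgr.cleanup()               # post-run unload keeps the handler for reuse
mgr.cleanup(finalize=True)  # full teardown releases everything
```

The split lets a node free VRAM after every run while avoiding re-initializing cheap state until final teardown.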
v1.0.9 - Local GGUF Discovery Expansion, mmproj Filtering, and Qwen/Wan Prompt Quality Improvements
Highlights
This release focuses on four major areas:
- Expanded local GGUF model discovery
- Safer and more intuitive mmproj selection
- Improved Qwen / Wan prompt rewriting quality
- Better robustness for text-only and image-to-video workflows
What Changed
Local GGUF model discovery
- Expanded local Qwen-family GGUF model search paths
- Added `models/text_encoders` and all subdirectories under both `models/LLM` and `models/text_encoders` to the search paths
- Centralized local model path and mmproj path resolution in `local_gguf_utils.py`
- Reduced duplicated path-handling logic across nodes
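The expanded discovery could be sketched like this. The function name and return shape are illustrative assumptions, not the actual `local_gguf_utils.py` API; only the directory names come from the notes above.

```python
from pathlib import Path

def candidate_gguf_models(models_root: str) -> list:
    """Collect .gguf files under models/LLM and models/text_encoders,
    including all subdirectories (illustrative sketch only)."""
    roots = [Path(models_root) / "LLM", Path(models_root) / "text_encoders"]
    found = []
    for root in roots:
        if root.is_dir():
            # rglob("*.gguf") covers the root and every subdirectory
            found.extend(sorted(root.rglob("*.gguf")))
    return found
```

Centralizing this in one helper is what lets every node share the same search behavior instead of duplicating path logic.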
mmproj handling and model selection
- Added UI-side mmproj filtering so only mmproj files in the same directory as the selected model are shown
- Improved mmproj resolution behavior for local models
- Explicitly forces text-only mode when `mmproj = (Not required)` is selected
- Prevents unnecessary or incorrect Vision handler usage in text-only workflows
- Improves safety when switching between local model configurations
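A minimal sketch of that same-directory filter, assuming hypothetical names for the helper and its arguments:

```python
from pathlib import Path

def filter_mmproj_choices(model_path: str, mmproj_paths: list) -> list:
    """Offer only mmproj files that live next to the selected model,
    plus an explicit "(Not required)" entry that forces text-only mode.
    Illustrative sketch; not the actual UI code."""
    model_dir = Path(model_path).parent
    same_dir = [p for p in mmproj_paths if Path(p).parent == model_dir]
    return ["(Not required)"] + same_dir
```

Restricting choices to the model's own directory avoids pairing a model with an mmproj built for a different architecture.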
Prompt rewriting quality
- Added dedicated system prompt flows for:
  - `qwen_image`
  - `qwen_image_edit`
  - `wan_t2v`
  - `wan_i2v`
- Strengthened prompt instructions so outputs are more likely to contain only the final prompt body
- Reduced verbose analysis-style or heading-based outputs
- Added a second-pass Simplified Chinese normalization flow when Chinese output is requested but another language is returned
- Preserves quoted text during second-pass normalization to avoid breaking user-specified text
- Improved Qwen2.5-VL behavior in `Qwen Image Edit Prompt Generator` by fixing system prompt application issues
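The quote-preserving second pass can be sketched as: protect quoted spans with placeholders before re-normalizing, then restore them afterwards. Function and placeholder names are illustrative assumptions, not the actual code.

```python
import re

def normalize_preserving_quotes(text: str, normalize) -> str:
    """Run a normalization pass (e.g. a second LLM pass to Simplified
    Chinese) while leaving user-quoted text untouched. Hypothetical sketch."""
    quoted = re.findall(r'"[^"]*"', text)
    protected = text
    for i, q in enumerate(quoted):
        # \x00 sentinels are unlikely to be altered by the normalizer
        protected = protected.replace(q, f"\x00{i}\x00", 1)
    result = normalize(protected)
    for i, q in enumerate(quoted):
        result = result.replace(f"\x00{i}\x00", q)
    return result
```

This keeps text the user explicitly quoted (e.g. on-image captions) byte-identical through the normalization pass.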
Node behavior and robustness
- `Qwen-Image` can now be used without an image input for text-only prompt generation
- Local text-only `Qwen-Image` runs no longer require mmproj
- Increased local inference `max_tokens` and `n_ctx` for longer prompt generation
- Added explicit validation for missing image input in Image-to-Video mode
- Improved output control for Wan prompt generation in Chinese-targeted workflows
Upgrade Notes
- After upgrading, you may need to reselect your GGUF model and mmproj file once
- This is because internal model path handling changed with the expanded search paths
- If you use local GGUF models, verify that the selected mmproj still matches the model directory
- Users of `Qwen Image Edit Prompt Generator` with Qwen2.5-VL should see improved output quality in this release
Notes
- Vision behavior still depends on the installed `llama-cpp-python` build and backend environment
- Some models may still show different output quality depending on whether they are used locally or via API
- Chinese-targeted prompt generation should now be more stable, but final output quality still depends on model behavior
v1.0.8 - Bug Fix
- Fixed an issue where `Qwen2.5-VL` was always loaded in text-only mode even when a valid mmproj file was specified
- Improved mmproj auto-detection logic
v1.0.7 - Stability Update
- Fixed incorrect detection of Qwen3-VL when mmproj is set to (Not required).
- Disabled automatic mmproj detection and prevented use of the VL handler in this case.
- Updated GGUFModelManager.load_model and node-side mmproj interpretation to correctly respect (Not required).
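The `(Not required)` interpretation amounts to something like the following. This is an assumption about the mapping, not the actual `GGUFModelManager.load_model` signature:

```python
def resolve_mmproj(mmproj_choice):
    """Map the UI selection to a concrete mmproj path, or None for
    text-only mode (skipping the VL handler and auto-detection).
    Illustrative sketch only."""
    if mmproj_choice in (None, "", "(Not required)"):
        return None
    return mmproj_choice
```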
v1.0.6 – Stability and Documentation Update
This release focuses on stability improvements and documentation cleanup ahead of the initial Comfy Registry publishing.
Changes
- Improved stability when switching between Qwen3-VL GGUF models
- Fixed mmproj reuse issues in local vision models
- Refined internal GGUF model lifecycle management
- Clarified project scope as a prompt generator for QwenImageEdit and Wan2.2
- Reorganized Credits and Dependencies for clearer attribution
- Updated llama-cpp-python installation notes to reference the JamePeng fork documentation
Notes
- No breaking changes to node interfaces
- This is the first version published to the Comfy Registry
v1.0.5 - Initial Release
Multimodal prompt generation nodes for ComfyUI with local Qwen-VL GGUF support.
What's Included
- Vision LLM Node - Local GGUF models (Qwen2.5-VL, Qwen3-VL) with multi-image input
- Qwen Image Edit - Image editing prompt optimization (local + cloud API)
- Wan Video Generator - T2V/I2V prompt enhancement for Wan2.2
Key Features
✅ Multi-image batch input support
✅ CPU/GPU device selection
✅ 5 style presets (raw, default, detailed, concise, creative)
✅ Auto-detect mmproj for Qwen3-VL
✅ Optimized for Chinese language prompts
✅ GPL-3.0 licensed with proper attribution
Installation
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/kantan-kanto/ComfyUI-MultiModal-Prompt-Nodes.git
cd ComfyUI-MultiModal-Prompt-Nodes
pip install -r requirements.txt
```

Full documentation: README.md