January Highlights
The densest release month in ComfyUI history: infrastructure breakthroughs, model diversity explosion, and the hardware ecosystem finally approaching parity across vendors.
Temporal Rolling VAE: The Video VRAM Breakthrough
The single most impactful technical change of the quarter: Temporal Rolling VAE delivers massive VRAM reductions for video generation models including HunyuanVideo and Kandinsky. Instead of decoding all video frames simultaneously, the rolling approach processes frames in temporal windows, reducing peak VRAM consumption by an estimated 50–60% for typical video workflows.
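The rolling-window idea can be sketched in a few lines. This is a minimal illustration of windowed decoding, not ComfyUI's actual implementation; `decode_window`, the window size, and the overlap handling are hypothetical stand-ins:

```python
# Illustrative sketch of temporal rolling decode: process latent frames in
# overlapping windows so peak memory is bounded by the window size, not the
# full clip length. A real implementation would blend overlapping frames;
# here the newer window simply wins.

def rolling_decode(latent_frames, decode_window, window=16, overlap=4):
    """Decode `latent_frames` in overlapping temporal windows.

    `decode_window` is the (hypothetical) per-window VAE decode call; its
    peak memory use scales with `window` frames rather than len(latent_frames).
    """
    decoded = []
    step = window - overlap
    start = 0
    while start < len(latent_frames):
        chunk = latent_frames[start:start + window]
        frames = decode_window(chunk)       # peak VRAM ~ `window` frames
        keep_from = overlap if start > 0 else 0
        decoded.extend(frames[keep_from:])  # skip frames already emitted
        if start + window >= len(latent_frames):
            break
        start += step
    return decoded
```

With `window=16` the decoder never sees more than 16 frames at once, which is where the estimated 50–60% peak-VRAM reduction for long clips comes from.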
Combined with reduced LoRA memory reservations (especially beneficial for Flux 2 workflows) and improved dequantisation offload accounting that fixes Flux 2 OOM errors, January's memory management upgrades make 24 GB cards viable for workloads that previously required 48 GB+ professional GPUs.
Market signal: This is the infrastructure inflection point that moves video generation from "requires a workstation" to "runs on a gaming PC." The implications for the creator economy and PaaS pricing are substantial.
Model Family Explosion
January sees six new model families gain native ComfyUI support in a single month — an unprecedented rate of integration:
- Kandinsky 5.0 — Text-to-video, image-to-video, and text-to-image variants from Sber. A significant new entrant in the open video generation space.
- Sora 2 — OpenAI's video generation API arrives as a native ComfyUI node, bringing commercial-grade video synthesis into the workflow graph.
- WAN 2.6 — Updated model integration with ReferenceToVideo capabilities and enhanced motion control.
- Tripo 3.0 — Third-generation 3D model generation with optimised face-limit parameter handling.
- Ovis Image — New image model expanding the generation options beyond Flux/SD architectures.
- GPT-Image-1.5 — OpenAI's image generation API nodes, bringing the successor to the DALL-E line natively into ComfyUI.

Pattern: ComfyUI is consolidating its position as the universal orchestration layer — every major model vendor now has native or API node support.
AMD RDNA4: First-Class Citizen
The long-running AMD parity story reaches a milestone: RDNA4 PyTorch attention is enabled on ROCm 7.0+, marking the first time AMD's newest architecture receives native attention kernel support in ComfyUI without workarounds or fallbacks. This is not DirectML emulation — it is native ROCm execution at near-CUDA performance levels for supported operations.
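The gating logic behind a change like this is simple to picture. The sketch below shows the kind of capability check described above; the function names, architecture prefix, and version threshold are illustrative assumptions, not ComfyUI's actual code:

```python
# Hypothetical capability gate: use the native PyTorch attention path only
# on RDNA4 GPUs (which report gfx12xx architecture names) running ROCm 7.0
# or newer; everything else falls back to the older attention kernels.

RDNA4_ARCH_PREFIX = "gfx12"  # assumed RDNA4 identifier prefix
MIN_ROCM = (7, 0)

def rocm_version_tuple(version_str):
    """Parse a ROCm version string like '7.0.1' into a comparable tuple."""
    return tuple(int(p) for p in version_str.split(".")[:2])

def use_native_pytorch_attention(gcn_arch, rocm_version):
    """True when the native attention path applies, else use fallbacks."""
    return (gcn_arch.startswith(RDNA4_ARCH_PREFIX)
            and rocm_version_tuple(rocm_version) >= MIN_ROCM)
```

The point of the check is that RDNA3 cards and older ROCm installs keep their existing code paths, so the upgrade is strictly additive.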
For the installer market, this changes the calculus: AMD GPU users no longer need specialised packs or cloud offload for standard workflows. Stability Matrix and ComfyUI Desktop both benefit from this upstream improvement. The "Robust AMD/Intel Support" opportunity from our July analysis is now partially realised at the platform level.
Hardware note: Async memory offload for AMD GPUs (enabled by default since October) now works without the race conditions that plagued earlier releases. Combined with RDNA4 attention, AMD is genuinely competitive for local inference.
V3 Migration & Developer Infrastructure
The V3 schema migration reaches critical mass: audio encoder, differential diffusion, morphology, torch compile, ControlNet, 3D, audio, freelunch, and mask nodes all convert this month. The PyTorch 2.4+ minimum version requirement is now enforced, clearing the path for modern CUDA/ROCm features.
New infrastructure additions in January:
- NVFP4 checkpoint support — FP4 matrix multiplication for extreme quantisation scenarios
- Sage Attention 3 — Third-generation efficient attention via CLI argument
- Gemma 12B support with quantised weights
- LTXV 2 model support — second-generation Lightricks video
- MMaudio 16K VAE — high-fidelity audio workflows at 16 kHz
- Unified `/api/jobs` endpoints for workflow monitoring
- Frontend: significantly faster load/startup times
- GPU-accelerated mask editor rendering
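The NVFP4 entry above refers to 4-bit floating point (E2M1: one sign, two exponent, one mantissa bit), which can represent only 15 distinct values. A minimal sketch of how block-scaled FP4 quantisation works follows; it is simplified for illustration (the real NVFP4 format stores FP8 scale factors per 16-value block, which is not modelled here):

```python
# Illustrative E2M1 quantisation: every weight in a block is snapped to the
# nearest representable FP4 value, with one shared scale per block.

FP4_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 magnitudes
FP4_GRID = sorted({s * v for v in FP4_VALUES for s in (1, -1)})  # 15 values

def quantize_fp4_block(weights):
    """Quantise a block of floats to E2M1 values plus one shared scale."""
    scale = max(abs(w) for w in weights) / 6.0 or 1.0  # map max weight to +/-6
    quantised = [min(FP4_GRID, key=lambda g: abs(w / scale - g)) for w in weights]
    return quantised, scale

def dequantize_fp4_block(quantised, scale):
    """Reconstruct approximate weights from FP4 values and the block scale."""
    return [q * scale for q in quantised]
```

Four bits per weight plus a shared scale is what makes "frontier models on 6 GB cards" a plausible target, at the cost of coarse rounding for small-magnitude weights.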
Key Takeaway
- Temporal Rolling VAE is the most impactful single feature since Mixed Precision Quantisation — it moves video generation from workstation-tier to consumer-tier hardware.
- Six new model families in one month confirms ComfyUI's position as the universal AI workflow platform — model vendors are now racing to be "ComfyUI native."
- RDNA4 native attention on ROCm 7.0+ is the AMD parity milestone the ecosystem has waited for since 2023.
- Sora 2 + GPT-Image-1.5 API nodes bring OpenAI directly into the ComfyUI graph — commercial API and open-source local models now coexist seamlessly.
- The V3 migration is no longer "in progress" — it is the new baseline. Custom nodes that have not migrated are effectively deprecated.
Six-Month Ecosystem Evolution (July 2025 → January 2026)
| Dimension | July 2025 | January 2026 | Change |
|---|---|---|---|
| Minimum VRAM for Flux | ~16 GB (FP16) | ~8 GB (Mixed Precision + FP8) | 50% reduction |
| Video Generation | Experimental (external plugins) | Native: LTXV 2, HunyuanVideo 1.5, Kandinsky 5.0, Sora 2, Veo 3.1 | 5 native architectures |
| AMD GPU Support | ROCm 6.4, workarounds needed | ROCm 7.0+, RDNA4 native attention, async offload stable | First-class citizen |
| UI Paradigm | Node graph only | Nodes 2.0 + Linear Mode + subgraph widgets | Multi-paradigm |
| Node Schema | V1 dominant, V3 announced | V3 is baseline; V1 deprecated | Migration complete |
| 3D Generation | HunYuan 3D (early) | HunYuan 3D 2.0 + Tripo 3.0 + Rodin3D | 3 native engines |
| Enterprise Licensing | 90% GPL/AGPL barrier | FLUX.2 VAE Apache 2.0; Permissive models growing | Chasm narrowing |
| Audio Generation | None | MMaudio 16K VAE, Qwen3-TTS nodes | New modality |
Looking Ahead: Q1 2026 Outlook
Expect: Consolidation
- Nodes 2.0 exits beta; Linear Mode becomes the default for new users
- V3-only enforcement — legacy V1 nodes start breaking in core updates
- Video generation matures: expect workflow templates, not just raw nodes
- AMD RDNA4 benchmarks close the gap with NVIDIA for standard workflows
Watch: Disruption Vectors
- OpenAI expanding Sora 2 capabilities — will they offer a ComfyUI-native commercial tier?
- NVFP4 and Sage Attention 3 may enable frontier models on 6 GB cards
- Audio + video + image in a single graph — multi-modal production pipelines becoming practical
- Enterprise PaaS providers pivoting from "managed ComfyUI" to "managed multi-modal AI"