ComfyGPT: A Self-Optimizing Multi-Agent System for Comprehensive ComfyUI Workflow Generation
By: Oucheng Huang, Yuhang Ma, Zeng Zhao and more
Potential Business Impact:
Makes AI create image-making steps automatically.
ComfyUI is a popular workflow-based interface that lets users customize image generation tasks through an intuitive node-based system. However, managing node connections and diverse modules can be challenging for users. In this paper, we introduce ComfyGPT, a self-optimizing multi-agent system designed to automatically generate ComfyUI workflows from task descriptions. The key innovations of ComfyGPT are: (1) a multi-agent workflow generation system built from four specialized agents: ReformatAgent, FlowAgent, RefineAgent, and ExecuteAgent; (2) a focus on generating precise node connections instead of entire workflows, which improves generation accuracy; and (3) the use of reinforcement learning to enhance workflow generation. Moreover, we introduce FlowDataset, a large-scale dataset containing 13,571 workflow-description pairs, and FlowBench, a comprehensive benchmark for evaluating workflow generation systems. Additionally, we propose four novel evaluation metrics: Format Validation (FV), Pass Accuracy (PA), Pass Instruct Alignment (PIA), and Pass Node Diversity (PND). Experimental results demonstrate that ComfyGPT significantly outperforms existing LLM-based methods in workflow generation, marking a step forward in this field. Code is available at https://github.com/comfygpt/comfygpt.
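To make the second point concrete, here is a minimal sketch of what a connection-centric workflow representation might look like. The abstract does not specify the format ComfyGPT actually uses, so everything here is an assumption for illustration: the `Link` schema, the helper names `links_to_graph` and `dangling_inputs`, and the example node and slot names. The idea is that a generator emits a flat list of links, and the node graph is reconstructed from them afterwards.

```python
from collections import defaultdict
from typing import NamedTuple


class Link(NamedTuple):
    """One directed connection between two nodes (hypothetical schema)."""
    src_node: str    # name of the node producing the value
    src_output: str  # output slot on the source node
    dst_node: str    # name of the node consuming the value
    dst_input: str   # input slot on the destination node


def links_to_graph(links):
    """Group a flat list of links into a {node: {input: (src, output)}} map."""
    graph = defaultdict(dict)
    for link in links:
        graph[link.src_node]  # touch the key so source-only nodes appear too
        graph[link.dst_node][link.dst_input] = (link.src_node, link.src_output)
    return dict(graph)


def dangling_inputs(graph, required):
    """Return (node, input) pairs listed in `required` that no link fills."""
    return [
        (node, name)
        for node, names in required.items()
        for name in names
        if name not in graph.get(node, {})
    ]


# Illustrative links only; not taken from FlowDataset.
links = [
    Link("CheckpointLoaderSimple", "MODEL", "KSampler", "model"),
    Link("CheckpointLoaderSimple", "CLIP", "CLIPTextEncode", "clip"),
    Link("CLIPTextEncode", "CONDITIONING", "KSampler", "positive"),
]
graph = links_to_graph(links)
missing = dangling_inputs(graph, {"KSampler": ["model", "positive", "negative"]})
```

In this sketch each link is a small, independently checkable unit, so a malformed or missing connection can be detected without parsing a full workflow file. Whether ComfyGPT's agents perform this kind of check is not stated in the abstract.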
Similar Papers
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development
Computation and Language
Helps make AI art faster and easier.
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation
Computation and Language
Builds AI art tools automatically.
ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
Artificial Intelligence
AI makes creating complex pictures easier and more reliable.