Stable Diffusion 3 Medium
Stable Diffusion 3 Mid is a text-to-image model called Multimodal Diffusion Transformer (MMDiT). It has significant improvements in image quality, typography, understanding complex prompts, and being resource-efficient.
For more technical details, please refer to the Research paper.
Model Description
- Developed by: Stability AI
- Model type: MMDiT text-to-image generative model
- Model Description: This is a model that can be used to generate images based on text prompts. It is a Multimodal Diffusion Transformer (https://arxiv.org/abs/2403.03206) that uses three fixed, pretrained text encoders (OpenCLIP-ViT/G, CLIP-ViT/L and T5-xxl)
HOW TO RUN:
if you have a NVIDIA gpu:
run_nvidia_gpu.bat
To run it in slow CPU mode:
run_cpu.bat
IF YOU GET A RED ERROR IN THE UI MAKE SURE YOU HAVE A MODEL/CHECKPOINT IN: ComfyUI\models\checkpoints
You can download the stable diffusion 1.5 one from: https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.ckpt
RECOMMENDED WAY TO UPDATE:
To update the ComfyUI code: update\update_comfyui.bat
To update ComfyUI with the python dependencies, note that you should ONLY run this if you have issues with python dependencies.
update\update_comfyui_and_python_dependencies.bat
TO SHARE MODELS BETWEEN COMFYUI AND ANOTHER UI:
In the ComfyUI directory you will find a file: extra_model_paths.yaml.example
Rename this file to: extra_model_paths.yaml and edit it with your favorite text editor.