Stable Diffusion is Stability AI’s family of open text-to-image models. They turn a written prompt into an image using a diffusion process, and their defining trait is openness: the model weights have been released for download so that the models can run on consumer hardware rather than only behind a hosted API. The original 2022 launch announcement described Stable Diffusion as “a speed and quality breakthrough, meaning it can run on consumer GPUs” with under 10 GB of VRAM, framed as “empowering billions of people to create stunning art within seconds.”
The technical foundation is the latent diffusion approach from the paper “High-Resolution Image Synthesis with Latent Diffusion Models” (arXiv 2112.10752). It runs the diffusion process in a compressed latent space rather than directly on pixels, “significantly reducing computational requirements compared to pixel-based” diffusion models, and it uses cross-attention to condition generation on text prompts. That efficiency is what made running the model on ordinary hardware practical.
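To make the latent-space saving concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the commonly cited Stable Diffusion 1.x configuration, which the text above does not state: a 512×512 RGB image and an autoencoder that downsamples each side by a factor of 8 into 4 latent channels. The function names are illustrative only.

```python
def pixel_values(height: int, width: int, channels: int = 3) -> int:
    """Number of values in a pixel-space image tensor."""
    return height * width * channels


def latent_values(height: int, width: int,
                  downsample: int = 8, latent_channels: int = 4) -> int:
    """Number of values in the latent tensor.

    The downsample factor and channel count are assumptions based on the
    commonly cited Stable Diffusion 1.x setup, not figures from this article.
    """
    return (height // downsample) * (width // downsample) * latent_channels


pixels = pixel_values(512, 512)      # 512 * 512 * 3
latents = latent_values(512, 512)    # 64 * 64 * 4
ratio = pixels / latents

print(f"pixel-space values:  {pixels:,}")
print(f"latent-space values: {latents:,}")
print(f"reduction factor:    {ratio:.0f}x")
```

Under these assumptions, each denoising step operates on about 48 times fewer values than it would in pixel space. This counts tensor elements only; it is not a measurement of actual compute or memory use.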
The family has progressed through several generations. The original 1.x line was followed by Stable Diffusion 2, then SDXL (a larger, higher-quality model), and then Stable Diffusion 3, which Stability AI describes as combining “a diffusion transformer architecture and flow matching” with “greatly improved performance in multi-subject prompts, image quality, and spelling abilities.” Per Stability AI’s SD3 announcement, the Stable Diffusion 3 suite “ranges from 800M to 8B parameters.” Version names and the current lineup shift over time, so the details here, including the parameter range, reflect Stability AI’s announcements as of the verification date and should not be read as fixed facts. For the current catalog and licensing, Stability AI’s news page is the live reference.
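As a rough illustration of what that parameter range means for hardware, the sketch below estimates the memory needed just to hold the model weights. It assumes 2 bytes per parameter (16-bit precision), which is an assumption of this example and not something the announcement states. It also ignores text encoders, the autoencoder, activations, and framework overhead, so real requirements will be higher. The helper name is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an illustrative
    assumption, not a figure published by Stability AI.
    """
    return num_params * bytes_per_param / 1e9


smallest = weight_memory_gb(800e6)   # low end of the quoted range: 800M parameters
largest = weight_memory_gb(8e9)      # high end of the quoted range: 8B parameters

print(f"800M parameters: about {smallest:.1f} GB of weights")
print(f"8B parameters:   about {largest:.1f} GB of weights")
```

On these assumptions the weights alone span roughly 1.6 GB to 16 GB, which suggests why the smaller models in a suite are the natural fit for consumer GPUs.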
Distribution is open weights: the models are downloadable and self-hostable, which sets the family apart from API-only image models.
Why business readers should care: Stable Diffusion proved that a capable image generator could be released openly and run on commodity hardware. That seeded an ecosystem of downstream tools and fine-tunes, and it made generative imaging something organizations can run on their own infrastructure for cost and data-control reasons.