Riffusion
Resource for people looking to access free AI tools to streamline their work. Having a centralized place where users can discover the latest AI tech

Generate music from text prompts using spectrogram-based diffusion.
Riffusion is an open-source AI music generator that turns natural-language prompts into short audio clips. A Stable Diffusion model fine-tuned on spectrogram images produces a spectrogram for the prompt, which is then converted to sound. First released in December 2022, it introduced a text-to-spectrogram-to-audio workflow that makes experimental sound design accessible to musicians, educators, and hobbyists.
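Riffusion interprets your textual description of mood, genre, and instrumentation to generate a spectrogram image. Because that image encodes only the magnitude of each frequency, the missing phase is estimated first (the Riffusion authors describe using the Griffin-Lim algorithm), and an inverse short-time Fourier transform then converts the result into a playable audio clip: a musical loop a few seconds long that can be remixed or extended.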
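To make the reconstruction step concrete, here is a minimal sketch in plain Python, not Riffusion's own code. It assumes a single toy frame (a sine wave with a phase offset) and uses hypothetical helper names (dft, idft, max_error). It shows that a spectrogram column keeps only magnitudes, so an inverse transform recovers the waveform only when phase is available; iterative methods such as Griffin-Lim are a common way to estimate it.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform of one audio frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Naive inverse DFT; keeps the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def max_error(a, b):
    """Largest absolute sample difference between two equal-length signals."""
    return max(abs(x - y) for x, y in zip(a, b))

# Toy frame (an assumption for illustration): 4 cycles of a sine, shifted in phase.
N = 64
frame = [math.sin(2 * math.pi * 4 * t / N + 0.7) for t in range(N)]

spectrum = dft(frame)
magnitudes = [abs(c) for c in spectrum]  # what one spectrogram column stores

# With the true phase, the inverse transform returns the original frame.
exact = idft(spectrum)

# With magnitudes only (phase forced to zero), the waveform comes out shifted.
zero_phase = idft([complex(m, 0.0) for m in magnitudes])

err_exact = max_error(frame, exact)
err_zero_phase = max_error(frame, zero_phase)
```

Comparing err_exact with err_zero_phase shows the gap a phase-estimation step has to close before the clip sounds like the intended audio.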
1. Prompt Entry: Users type a description such as "funky disco groove".
2. Spectrogram Generation: The fine-tuned Stable Diffusion model creates an image representing sound frequencies over time.
3. Audio Reconstruction: Phase is estimated for the magnitude-only spectrogram, and an inverse short-time Fourier transform converts it into a raw audio waveform.
4. Remixing & Interpolation: Users vary the seed or apply img2img conditioning to blend or extend loops.
5. Export: Download full loops or individual stems for DAWs.
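The seed-based blending in step 4 can be illustrated with a small sketch. It assumes that each seed determines a Gaussian noise vector and that two such vectors are blended by spherical linear interpolation, a common choice for diffusion models; Riffusion's actual implementation may differ. The names seeded_noise and slerp are hypothetical, and the short lists stand in for real latent tensors.

```python
import math
import random

def seeded_noise(seed, size):
    """Gaussian noise vector from a seeded generator (stand-in for a latent)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

def slerp(t, a, b):
    """Spherical linear interpolation between vectors a and b for 0 <= t <= 1."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    if omega < 1e-6:  # nearly parallel: plain linear interpolation is enough
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1 - t) * omega) / math.sin(omega)
    weight_b = math.sin(t * omega) / math.sin(omega)
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

# Two seeds give two starting points; five evenly spaced blends run between them.
start = seeded_noise(seed=1, size=256)
end = seeded_noise(seed=2, size=256)
steps = [slerp(i / 4, start, end) for i in range(5)]
```

In a full pipeline each blended vector would be denoised into its own spectrogram, so neighbouring steps yield loops that morph gradually from one seed's result to the other's.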
Riffusion repurposes image-based diffusion for audio generation, giving creators a free tool for sketching musical ideas quickly. Although the generated loops are only a few seconds long, its open-source code and stem export make it a useful resource for anyone exploring generative soundscapes.