Audio Generation Clear of Copyrights: Stability AI releases enhanced text-to-audio generator Stable Audio Open

Jun 12, 2024

Sonically minded developers gained a high-profile text-to-audio generator. 

What’s new: Stability AI released Stable Audio Open, which takes text prompts and generates music or sound effects sampled at 44.1kHz. The model’s code and weights are available for noncommercial use. You can listen to a few sample outputs here.

How it works: Stability AI promotes Stable Audio Open for generating not full productions but elements that will be assembled into productions. Although it’s similar to the earlier Stable Audio 2.0, it has important differences.

  • Stable Audio Open is available for download. In contrast, Stable Audio 2.0 is available via API or web user interface.
  • The new model accepts only text input, while Stable Audio 2.0 accepts text or audio. It generates stereo clips up to 47 seconds long, rather than Stable Audio 2.0’s three minutes.
  • Its training dataset was drawn from open source audio databases that anyone can use without paying royalties. In contrast, Stable Audio 2.0 was trained on a commercial dataset.
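The differences in input and clip length can be sketched as a small request validator. This is a minimal illustration under stated assumptions, not Stability AI's API: the names `MAX_SECONDS`, `ClipRequest`, and `build_request` are hypothetical, and only the 47-second and three-minute limits come from the comparison above.

```python
from dataclasses import dataclass

# Maximum clip lengths in seconds, taken from the comparison above.
# The dictionary keys are hypothetical labels, not official model IDs.
MAX_SECONDS = {
    "stable-audio-open": 47,
    "stable-audio-2.0": 180,  # three minutes
}


@dataclass(frozen=True)
class ClipRequest:
    """Hypothetical container for a text-to-audio request."""
    prompt: str
    seconds_total: float


def build_request(prompt: str, seconds: float,
                  model: str = "stable-audio-open") -> ClipRequest:
    """Check the text prompt and clamp the duration to the model's limit."""
    if not prompt.strip():
        # This sketch models text-only prompting, so a prompt is required.
        raise ValueError("a non-empty text prompt is required")
    if seconds <= 0:
        raise ValueError("duration must be positive")
    limit = MAX_SECONDS[model]
    return ClipRequest(prompt=prompt.strip(),
                       seconds_total=min(seconds, limit))


if __name__ == "__main__":
    # A 60-second request is clamped to the open model's 47-second cap.
    print(build_request("rain on a tin roof", 60))
```

Clamping rather than rejecting an over-long request is a design choice made for this sketch; a real client might prefer to raise an error so the caller knows the clip was shortened.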

Behind the news: Stable Audio Open competes not only with Stable Audio 2.0 but also with a handful of recent models. ElevenLabs, known for voice cloning and generation, introduced Sound Effects, which generates brief sound effects from a text prompt. Users can input up to 10,000 prompt characters with a free account. For music generation, Udio and Suno offer web-based systems that take text prompts and generate structured compositions including songs with lyrics, voices, and full instrumentation. Users can generate a handful of compositions daily for free.

Why it matters: Stable Audio Open is pretrained on both music and sound effects, and it can be fine-tuned and otherwise modified. Because its training data came from sources that anyone can use without paying royalties, users are unlikely to reproduce proprietary sounds. That makes it a suitable option for those who prefer to steer clear of the music industry’s brewing intellectual property disputes.

We’re thinking: We welcome Stability AI’s latest contribution, but we don’t consider it open source. Its license doesn’t permit commercial use and thus, as far as we know, doesn’t meet the definition established by the Open Source Initiative. We urge the AI community toward greater clarity and consistency with respect to the term “open source.” 

