At first, Simplicity’s Delight sounds like a catchy pop song for a Velveeta commercial. The singer exalts “a humble slab of cheese” over a light guitar and breezy percussion.
Listen closely, and you may notice the voice sounds a little computerised – though it could be autotuned. The real giveaway that the singer may not be human comes in the second verse when they mispronounce the word “tangy”.
The song was created with software from Suno, one of a new crop of artificial intelligence (AI) startups focused on building tools to automate the music-making process.
Enter a short written command, and Suno will generate shockingly human-sounding music in seconds – anything from a dreamy electro-pop ditty about a breakup to, well, an acoustic tune celebrating the delights of fermented dairy products.
Entire albums of what appear to be AI-generated songs made with Suno are now streaming on services like Spotify.
Technological threat
Generative AI has been used to churn out all kinds of content, including text, images and videos. Now, music is shaping up to be the next frontier, with the promise of empowering anyone to express themselves through song.
In the process, however, AI startups have heightened worries over artistes’ livelihoods and could soon fill the Internet with half-baked, computerised songs.
They may also test the tolerance of music labels, which have a long history of litigation against the tech industry.
Already, artistes and labels see AI as a potential existential threat. Hundreds of musicians, including Billie Eilish, Miranda Lambert and Aerosmith, signed an open letter last month via the non-profit Artist Rights Alliance urging AI developers, tech companies, and others to halt the use of AI “to infringe upon and devalue the rights of human artistes”.
Recently, Universal Music Group (UMG) struck a deal with TikTok for greater protections against AI music after having previously pulled its song catalogue from the platform, in part over concerns that TikTok was “flooded with AI-generated recordings”. That followed UMG’s condemnation last year of a viral track that used AI to replicate the stylings of Drake and the Weeknd. Dozens of music publishers also previously sued OpenAI rival Anthropic, alleging its chatbot scraped song lyrics without permission.
While leading AI companies like OpenAI and Alphabet’s Google have teased AI music generation features in recent years, they have so far not brought these to market as consumer products. Google DeepMind, for example, unveiled a music creator called Lyria in November but has yet to release it.
The company said it was “engaging” with artistes and the music industry on “responsible development”. Instead, the AI music revolution is now being led by smaller companies.
Craving market control
Suno, a startup founded in 2020 and based in Cambridge, Massachusetts, United States, first released its music-making software last year and recently introduced a new version that lets users generate songs up to two minutes long.
Likewise, Udio, started by former Google DeepMind researchers and engineers, introduced a “beta” version of its software last month that can spit out music in roughly 30-second chunks.
“A couple years ago, it wouldn’t have been ready,” said Udio co-founder and chief executive officer David Ding, who previously worked at DeepMind on generative AI projects, including Lyria. “Now all the different pieces of research – the advances in language modelling, but also image modelling, video modelling – just seemed to indicate that the time is right for a music model to really shine.”
Udio is backed by well-known names in tech and music. It raised a US$10mil (RM47.4mil) seed funding round led by Andreessen Horowitz, with participation from Instagram co-founder Mike Krieger as well as musicians will.i.am and Common, music producer Tay Keith, and music distributor UnitedMasters. Suno would not disclose details of its financing.
To entice users, both companies currently offer freebies: Udio users can make 1,200 free songs per month while the product is still in beta.
Suno users can make 10 per day or pay for a monthly subscription with features like more song generations. Both companies run each prompt from users twice to give some variations in the output.
In the first two weeks after Udio’s launch, more than 600,000 people tried it out, co-founder Andrew Sanchez said, and users have been generating an average of 10 songs per second. (Suno did not provide comparable figures.)
Udio’s software is also evolving rapidly: In the last few weeks, the company has rolled out multiple features, including the ability to extend songs to be up to 15 minutes in length.
People tend to start using Suno by making a song for or with a friend or family member, co-founder Keenan Freyberg said, and then they go on to explore what the software can do.
Teachers have used Suno to make songs to help in the classroom, and data software company Palantir Technologies used it to create a country tune for a recent software boot camp.
The results can be catchy, weird, or both. One song called Rat Contraception Disco celebrated a New York Times story about New York City’s efforts to stop rat reproduction with birth control. Sample lyrics: “Forget the poison pellets, the traps ain’t doin’ nothin’/Time for a disco revolution, a funky little somethin’ somethin’.”
“We’ve been humbled by our inability to forecast how people are going to want to use the tools,” Freyberg said.
Coping with copyright
But as AI increasingly edges into more creative fields, the technology is on a collision course with the entertainment industry and its copyright lawyers.
Companies like Midjourney, OpenAI and Stability AI built their media-generating AI models with datasets that pulled imagery from across the Internet. While they argue that the practice is protected under the fair use doctrine of US copyright law, it has led to outrage and lawsuits.
Generative AI companies have plausible fair use defences for using works as training data, said Pamela Samuelson, a digital copyright expert and law professor at the University of California, Berkeley.
But she said courts might look at music differently than they would other works such as computer code, text or images. “The data type might actually matter,” Samuelson said. “I could see courts distinguishing based on that.”
Neither Suno nor Udio would say precisely what their AI systems are trained on. Ding said Udio used publicly available data from the Internet.
Suno co-founder Mikey Shulman said the startup believes training data is, in some ways, even more important than how the company constructs its AI software, “so we’re pretty closely guarding that secret”. But Shulman said Suno’s practices are “legal” and “fairly in line with what other people are doing”.
This secrecy is unsettling to Ed Newton-Rex, CEO of the non-profit Fairly Trained, which provides certification for AI models trained on licensed data.
Newton-Rex, who previously oversaw Stability AI’s music generation product, found it easy to generate a slew of outputs using both companies’ software that closely resemble copyrighted music. For instance, he was able to generate tunes that sound a lot like those of artistes such as Queen, Abba, Oasis, Blink-182 and Ed Sheeran.
“We don’t know what their training data is, but if their training data is copyrighted work and they’re building a competitor to that copyrighted work by training on it, I think it’s hard to see how they really do respect musicians,” Newton-Rex said.
Sanchez said Udio is speaking with a range of music-industry stakeholders, including artistes and rights holders, “to ensure our technology is a boon to all musicians and creators”.
One record label official, who spoke on condition of anonymity, expressed openness to dealmaking with AI companies that they deem to be responsible partners. Suno declined to comment on talks with the music industry, but Shulman said the startup is thinking about how to compensate artistes and watching the shifting legal landscape.
“We really want to figure this out in a way that is fair to everybody,” he said. To that end, Suno currently rejects song prompts containing artistes’ names, and Udio will replace them with other descriptors.
For instance, when asked to generate a “brooding, moody pop song in the style of Billie Eilish” about the difficulty of picking a yoghurt flavour at the grocery store, Udio replaced the artiste’s name with a handful of adjectives like “folk pop” and “indie pop”. It also put a blue “artiste replaced” label on the track listing.
Despite anxiety from artistes, the startups point to the number of people who will soon be able to make compelling, professional-sounding music of their own using at least some AI tools.
“There’s going to be an enormous number of folks for whom there was a barrier to entry, economic or otherwise, that prevented them from entering music,” Sanchez said. “And we think that this is going to enable them to do so going forward.” – Bloomberg