Meta's Voicebox is an AI-powered text-to-speech studio


Meta's Voicebox technology is not yet available to the general public. — AFP Relaxnews

After virtual reality, the Meta group is now entering the audio arena. The American tech giant has unveiled Voicebox, a handy online studio for transforming text into audio, in six different languages. For the time being, Meta has decided not to share its new AI tool with the general public.

After virtual reality, Mark Zuckerberg's group is turning to audio with Voicebox. In a blog post, the social networking giant describes the new tool as "a generative AI model that can help with audio editing, sampling and styling."

More natural voices

First and foremost, Meta's studio will enable text-to-speech generation, i.e., it will be able to transform written text into spoken audio using a synthetic voice. Among other options, users will be able to benefit from cross-lingual style transfer.

"Given a sample of speech and a passage of text in English, French, German, Spanish, Polish, or Portuguese, Voicebox can produce a reading of the text in that language," says Meta.

Even more impressive is Voicebox's ability to reproduce an audio style from a sample just two seconds long. That style can then be used to generate other audio content. The result is closer to the way people speak in everyday life, more natural and therefore more pleasing to the ear.

In addition to transforming text into audio and reproducing an audio style, the studio offers the option of editing an existing clip. The user can remove an unwanted sound or any other part of an audio track and clean up the content without having to make a new recording.

"We trained Voicebox with more than 50,000 hours of recorded speech and transcripts from public domain audiobooks in English, French, Spanish, German, Polish, and Portuguese. Voicebox is trained to predict a speech segment when given the surrounding speech and the transcript of the segment," explains Meta.
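The training setup Meta describes, predicting a hidden speech segment from the audio around it, can be pictured with a few lines of Python. This is only an illustrative sketch, not Meta's code: the function name `make_infilling_example`, the use of plain numbers as stand-ins for audio frames, and the choice to blank hidden frames with zeros are all assumptions made for the example, and the transcript that the real model also receives is left out.

```python
from typing import List, Tuple


def make_infilling_example(
    frames: List[float], start: int, end: int
) -> Tuple[List[float], List[bool], List[float]]:
    """Hide frames[start:end] and return (context, mask, target).

    Hypothetical helper: each number stands in for one audio frame.
    """
    if not 0 <= start < end <= len(frames):
        raise ValueError("span must lie inside the utterance")
    # True marks a frame the model is not allowed to see.
    mask = [start <= i < end for i in range(len(frames))]
    # Surrounding speech is kept; the hidden span is blanked out (assumed: zeros).
    context = [0.0 if hidden else value for value, hidden in zip(frames, mask)]
    # The hidden frames are what the model would be asked to predict.
    target = frames[start:end]
    return context, mask, target


frames = [0.1, 0.4, 0.3, 0.9, 0.2]
context, mask, target = make_infilling_example(frames, 1, 3)
print(context)  # [0.1, 0.0, 0.0, 0.9, 0.2]
print(target)   # [0.4, 0.3]
```

The same picture explains the editing feature: a segment the user wants to replace plays the role of the hidden span, and the model regenerates it from what surrounds it.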

However, the American group is not the first to take an interest in synthetic voices. TikTok caused a buzz with its own text-to-speech tool, launched in 2020.

The Chinese giant even made it possible to have text read aloud in the voices of Disney movie characters such as Rocket Raccoon from Guardians Of The Galaxy, C-3PO from Star Wars and Stitch from Lilo And Stitch.

More engaging and more inclusive, the use of synthetic voices continues to appeal to users and major players in social networking.

For Meta, "this type of technology could be used in the future to help creators easily edit audio tracks, allow visually impaired people to hear written messages from friends in their voices, and enable people to speak any foreign language in their own voice." A way of strengthening ties and attracting new users. – AFP Relaxnews
