Meta's Voicebox is an AI-powered text-to-speech studio


After virtual reality, the Meta group is now entering the audio arena. The American tech giant has unveiled Voicebox, a handy online studio for transforming text into audio, in six different languages. For the time being, Meta has decided not to share its new AI tool with the general public.

In a blog post, the social networking giant describes Voicebox as "a generative AI model that can help with audio editing, sampling and styling."

More natural voices

First and foremost, Meta's studio will enable text-to-speech generation, i.e., it will be able to transform written text into spoken audio using a synthetic voice. Among other options, users will be able to use cross-lingual style transfer.

"Given a sample of speech and a passage of text in English, French, German, Spanish, Polish, or Portuguese, Voicebox can produce a reading of the text in that language," says Meta.

Even more impressive is Voicebox's ability to reproduce an audio style from a sample just two seconds long, and then use that style to generate other audio content. The resulting speech is closer to the way people speak in everyday life: more natural and therefore more pleasing to the ear.

In addition to transforming text into audio and reproducing an audio style, the studio offers the option of editing an existing recording. The user can remove an unwanted sound or any other part of an audio track without having to make a new recording.

"We trained Voicebox with more than 50,000 hours of recorded speech and transcripts from public domain audiobooks in English, French, Spanish, German, Polish, and Portuguese. Voicebox is trained to predict a speech segment when given the surrounding speech and the transcript of the segment," explains Meta.

However, the American group is not the first to have taken an interest in synthetic voices. TikTok caused a buzz with its own text-to-speech tool when the feature launched in 2020.

The Chinese giant even made it possible to use the voices of Disney movie characters such as Rocket Raccoon from Guardians Of The Galaxy, C-3PO from Star Wars and Stitch from Lilo And Stitch to read text aloud.

Seen as more engaging and more inclusive, synthetic voices continue to appeal to users and to the major players in social networking.

For Meta, "this type of technology could be used in the future to help creators easily edit audio tracks, allow visually impaired people to hear written messages from friends in their voices, and enable people to speak any foreign language in their own voice." A way of strengthening ties and attracting new users. – AFP Relaxnews
