Meta's Voicebox is an AI-powered text-to-speech studio


Meta's Voicebox technology is not yet available to the general public. — AFP Relaxnews

After virtual reality, the Meta group is now entering the audio arena. The American tech giant has unveiled Voicebox, a handy online studio for transforming text into audio, in six different languages. For the time being, Meta has decided not to share its new AI tool with the general public.

In a blog post, the social networking giant describes Voicebox as "a generative AI model that can help with audio editing, sampling and styling."

More natural voices

First and foremost, Meta's studio will enable text-to-speech generation, i.e., it will be able to transform written text into spoken audio using a synthetic voice. Among other options, users will be able to benefit from cross-lingual style transfer.

"Given a sample of speech and a passage of text in English, French, German, Spanish, Polish, or Portuguese, Voicebox can produce a reading of the text in that language," says Meta.

Even more impressive is Voicebox's ability to reproduce an audio style from a sample just two seconds long, which can then be used to generate other audio content. The resulting speech is closer to the way people speak in everyday life: more natural and therefore more pleasing to the ear.

In addition to transforming text into audio and reproducing an audio style, the studio can edit an existing recording. The user can remove an unwanted sound or any other part of an audio track to clean up the content without having to make a new recording.

"We trained Voicebox with more than 50,000 hours of recorded speech and transcripts from public domain audiobooks in English, French, Spanish, German, Polish, and Portuguese. Voicebox is trained to predict a speech segment when given the surrounding speech and the transcript of the segment," explains Meta.

However, the American group is not the first to take an interest in synthetic voices. TikTok caused a buzz with its own text-to-speech tool when the feature launched in 2020.

The Chinese giant even made it possible to have text read aloud in the voices of Disney movie characters such as Rocket Raccoon from Guardians Of The Galaxy, C-3PO from Star Wars and Stitch from Lilo And Stitch.

Seen as more engaging and more inclusive, synthetic voices continue to appeal to users and to the major players in social networking.

For Meta, "this type of technology could be used in the future to help creators easily edit audio tracks, allow visually impaired people to hear written messages from friends in their voices, and enable people to speak any foreign language in their own voice." A way of strengthening ties and attracting new users. – AFP Relaxnews
