Voicebox is Meta’s first entry into the generative AI space for speech

Sunday, June 18, 2023 at 9:39PM

Image: Meta

Meta has recently unveiled its first generative artificial intelligence tool for speech. The tool, named Voicebox, can handle various speech-generation tasks without being explicitly trained for them, thanks to its in-context learning ability.

Meta detailed what Voicebox can do in a blog post, including tasks like:

In-context text-to-speech: It can mimic the audio style of a voice sample as short as two seconds and use that style to generate speech from text.
Speech editing and noise reduction: It can fix speech errors or remove unwanted noises by regenerating the affected parts of the audio without requiring a new recording.
Cross-lingual style transfer: It can read any text in one of the six languages it supports (English, French, German, Spanish, Polish, or Portuguese) using the voice and style of any speech sample in any of those languages.
Diverse speech sampling: It can produce diverse and natural-sounding speech samples from the same text using data from different speakers and regions.

Meta says Voicebox is part of its ongoing research on generative AI and has many potential applications. For example, Voicebox could provide realistic voices for virtual assistants and characters in the metaverse, let visually impaired people hear messages from friends in those friends' own voices, and give creators simple but powerful tools to create and edit audio tracks for their videos.

Nicole Batac
