January 28, 2023

Microsoft's AI technology can mimic any person's voice using just a 3-second sample

Microsoft has developed a new AI model, VALL-E, which is capable of simulating virtually anyone's voice using only a 3-second audio sample.

This is a significant improvement over other AI models, which require at least a minute of recorded audio as input. To develop VALL-E, scientists used Meta's Libri-Light library, which contains audio from more than 7,000 speakers, and trained the model on 60,000 hours of English-language recordings.

Microsoft refers to VALL-E as a "neural codec language model" based on a similar model from Meta that uses AI for text-to-speech audio.
