OpenAI’s latest tool can create an accurate replica of someone’s voice from just 15 seconds of recorded audio. Because of the risks involved, the AI lab is holding it back from public release to limit potential harm as it works to address the threat of misinformation during a critical global election year.
Voice Engine was first developed in 2022 and was initially integrated into ChatGPT for text-to-speech functionality. Despite its capabilities, OpenAI has refrained from publicizing it extensively, taking a cautious approach to its broader release.
Through discussions and testing, OpenAI aims to make informed decisions about the responsible use of synthetic speech technology. After careful consideration, selected partners have been given access to incorporate the technology into their applications and products.
Various partners, like Age of Learning and HeyGen, are utilizing the technology for educational and storytelling purposes. It enables the creation of translated content while maintaining the original speaker’s accent and voice characteristics.
OpenAI showcased a study in which the technology helped a person regain a voice lost to a medical condition. Despite this potential, OpenAI is previewing the technology rather than widely releasing it, in order to help society adapt to the challenges of advanced generative models.
OpenAI emphasizes the importance of protecting individual voices in AI applications and educating the public about the capabilities and limitations of AI technologies. Voice Engine output is watermarked so that generated voices can be traced, and agreements are in place to ensure consent from original speakers.
While OpenAI’s tools are known for their simplicity and efficiency in voice replication, competitors like Eleven Labs offer similar capabilities to the public. To address potential misuse, precautions are being taken to detect and prevent the creation of voice clones impersonating political figures in key elections.
Source: www.theguardian.com