ElevenLabs, a machine learning startup specializing in voice cloning and synthesis, has expanded its platform with a new text-to-speech model that supports 30 languages. The release marks the platform’s exit from beta and positions it as a tool for enterprises and individuals looking to tailor their content for a global audience. The expansion comes shortly after ElevenLabs’ $19 million Series A funding round, which valued the company at nearly $100 million.
The CEO and co-founder of ElevenLabs, Mati Staniszewski, stated that the company’s ultimate goal is to make all content universally accessible in any language and voice. The release of Eleven Multilingual version 2 brings them one step closer to realizing this vision by offering human-quality AI voices in a wide range of dialects.
ElevenLabs currently provides two main voice-focused AI products: Speech Synthesis and VoiceLab. Speech Synthesis is a tool that generates natural-sounding speech from text inputs, while VoiceLab allows users to clone their own voices or create entirely new synthetic voices for use with the synthesis tool. With the new text-to-speech model, users can now convert text into speech in any of the 30 supported languages.
Initially, the synthesis tool supported speech only in English. It was later expanded to include text inputs and AI voices for seven additional languages, namely Polish, German, Spanish, French, Italian, Portuguese, and Hindi. With the latest release, Eleven Multilingual version 2, users can now synthesize speech in a much wider array of languages, including Korean, Dutch, Turkish, Swedish, Indonesian, Vietnamese, Filipino, Ukrainian, Greek, Czech, Finnish, Romanian, Danish, Bulgarian, Malay, Hungarian, Norwegian, Slovak, Croatian, Classical Arabic, and Tamil.
The model is designed to understand the relationships between words and adjust delivery based on context, ensuring a more natural flow of speech. It can predict thousands of voice characteristics to generate AI voices, maintaining appropriate flow across different languages without sounding robotic.
ElevenLabs gained significant traction during its beta phase, with over a million registered users worldwide. Enterprises have used the platform to provide voice capabilities for video games, customer service avatars, audiobooks, and content for the visually impaired. The company has collaborated with arXiv to publish audio versions of its papers for enhanced accessibility, and has partnered with Storytel to offer additional AI voices alongside human narrators for audiobooks.
As part of its future plans, ElevenLabs aims to expand its products further by adding more languages and features. The company is developing a projects tool to simplify the structuring and editing of long-form content, with an editing experience it compares to Google Docs. It is also working on an AI dubbing tool that will allow users to convert speech from one language to another while preserving the original speaker’s voice.
The market for AI-powered voice and speech generation tools is projected to reach $5 billion by 2032, growing at a compound annual rate of more than 15.4%. ElevenLabs faces competition from players like MURF.AI, Play.ht, and WellSaid Labs in this rapidly growing industry.
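To put that projection in perspective, the short sketch below backs out the market size that a $5 billion figure in 2032 would imply at a constant 15.4% annual growth rate. The article does not state the base year of the forecast, so the nine-year horizon (2023 to 2032) is purely an assumption for illustration, as are the function names; treat the result as a rough reading of the numbers rather than a figure from the forecast itself.

```python
def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Back out the starting value implied by an end value and a constant CAGR."""
    return end_value / (1.0 + cagr) ** years


def project(start_value: float, cagr: float, years: int) -> float:
    """Compound a starting value forward at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


END_VALUE_USD_BN = 5.0  # projected 2032 market size, from the article
CAGR = 0.154            # the article says "over" this rate, so this is a lower bound
ASSUMED_YEARS = 9       # ASSUMPTION: forecast runs 2023 -> 2032; base year is not stated

start = implied_start_value(END_VALUE_USD_BN, CAGR, ASSUMED_YEARS)
print(f"Implied starting market size: ${start:.2f} billion")
print(f"Check, compounded forward: ${project(start, CAGR, ASSUMED_YEARS):.2f} billion")
```

Because the growth rate is quoted as a lower bound, a higher actual rate or a longer horizon would imply a smaller starting market than this sketch prints.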
With its broad language support and user-friendly platform, ElevenLabs is well-positioned to meet the needs of global audiences seeking customized and accessible content. The company’s expansion into 30 languages brings it one step closer to breaking down linguistic barriers and making content accessible to all.