Meta, the parent company of Facebook, has unveiled a new artificial intelligence (AI) translation model. The company describes SeamlessM4T as the first all-in-one multimodal and multilingual AI translation model. It can perform speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation for up to 100 languages, depending on the task.
For speech-to-speech translation, SeamlessM4T supports approximately 100 input languages and 36 output languages, including English. To encourage innovation and collaboration in the AI research community, Meta is releasing SeamlessM4T publicly under a research license. This will allow researchers and developers to build on Meta's work and explore new applications for the model.
In addition to launching SeamlessM4T, Meta is also releasing the metadata of SeamlessAlign, which is claimed to be the largest open multimodal translation dataset to date. This dataset comprises a staggering 270,000 hours of mined speech and text alignments. By making this metadata publicly available, Meta aims to further promote research and development in the field of multimodal translation.
SeamlessM4T builds on earlier work by Meta and others toward a universal translator. Last year, Meta introduced No Language Left Behind, a text-to-text machine translation model that supports 200 languages. Earlier this year, it unveiled Massively Multilingual Speech, which offers speech recognition, language identification, and speech synthesis in over 1,100 languages.
Meta’s commitment to advancing AI technology is evident through the launch of several other AI models. Just this month, the company introduced AudioCraft, an open-source AI tool that allows users to generate audio and music from text prompts. Furthermore, Meta collaborated with Microsoft to release Llama 2, another AI model available for free research and commercial use.
Companies worldwide have also developed their own generative AI models, including large language models (LLMs) and chatbots such as Baidu’s Ernie Bot, Alibaba’s Tongyi Qianwen, and Google’s Bard, as well as image generators such as Alibaba’s Tongyi Wanxiang, OpenAI’s DALL-E, and Midjourney Inc.’s Midjourney. However, as the adoption of AI continues to grow, concerns about potential risks have emerged. In response, the U.N. Security Council held its first formal meeting on the risks of AI in July. Additionally, seven technology companies, including Meta, Amazon, Google, and Microsoft, have voluntarily committed to implementing safety measures when deploying AI technologies.
The introduction of SeamlessM4T marks a notable step in AI translation. By combining speech and text translation across many languages in a single model, it has the potential to reduce language barriers in everyday communication. By releasing the model under a research license alongside the SeamlessAlign metadata, Meta aims to encourage further research and collaboration in the AI community.