Mistral AI Releases Mixtral 8x7B: A Breakthrough Open-Weights AI Language Model, France


Mistral AI Releases New French AI Model, Mixtral 8x7B, Matching GPT-3.5 on Benchmarks

Paris-based Mistral AI has introduced its latest language model, Mixtral 8x7B. This mixture of experts (MoE) model is released with open weights and reportedly performs on par with OpenAI’s GPT-3.5. The release has drawn attention from industry figures including OpenAI’s Andrej Karpathy and Nvidia’s Jim Fan. With Mixtral 8x7B, the prospect of a ChatGPT-3.5-level AI assistant running locally on consumer devices becomes more attainable.

Mistral, headed by Arthur Mensch, Guillaume Lample, and Timothée Lacroix, has quickly made its mark in the AI space by advocating for smaller models that deliver impressive performance. Unlike closed AI models from industry giants like OpenAI, Anthropic, or Google, Mistral’s models run locally with open weights that can be easily downloaded and used with fewer restrictions.

Mixtral 8x7B handles a 32K token context window and operates in multiple languages, including French, German, Spanish, Italian, and English. Much like ChatGPT, it can assist with tasks such as data analysis, software troubleshooting, and programming. Mistral claims that Mixtral 8x7B outperforms Meta’s larger LLaMA 2 70B language model, which has 70 billion parameters, and matches or even surpasses OpenAI’s GPT-3.5 on specific benchmarks.

The speed with which open-weights AI models have caught up to OpenAI’s top offerings within a year has left many astounded. Pietro Schirano, the founder of EverArt, expressed his amazement on X, stating, “Just incredible. I am running Mistral 8x7B instruct at 27 tokens per second, completely locally thanks to @LMStudioAI. A model that scores better than GPT-3.5, locally. Imagine where we will be 1 year from now.”

Sharif Shameem, the founder of Lexica, also shared his enthusiasm on X, calling the Mixtral MoE model an inflection point. He highlighted that it is a true GPT-3.5-level model that can run at 30 tokens per second on an M1 chip, making inference effectively free and keeping user data on the user’s own device. Andrej Karpathy agreed, adding that while the model’s reasoning power and capabilities have made significant strides, the surrounding user interface and experience still need refinement.

But what exactly does mixture of experts (MoE) mean? According to a helpful guide by Hugging Face, it refers to a machine learning model architecture where a gate network assigns specific tasks to specialized neural network components known as experts. This approach enhances efficiency and scalability in model training and inference by activating only a subset of experts relevant to each input, reducing computational load compared to monolithic models with the same number of parameters.
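To make the routing idea concrete, here is a minimal sketch of top-k expert gating in Python using only NumPy. The layer sizes, the number of experts, and the use of plain linear maps as “experts” are illustrative assumptions for this sketch, not Mixtral’s actual architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, n_experts, top_k = 8, 4, 2   # hidden size, number of experts, experts used per token

# Gate network: a linear layer that scores how relevant each expert is for a token.
W_gate = rng.normal(size=(d, n_experts))
# Experts: independent linear maps standing in for per-expert feed-forward blocks.
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

def moe_layer(token):
    scores = token @ W_gate                  # gate scores every expert for this token
    chosen = np.argsort(scores)[-top_k:]     # only the top-k experts are activated
    weights = softmax(scores[chosen])        # renormalize the chosen experts' scores
    # The output is a weighted sum of the chosen experts' outputs; the remaining
    # experts are never evaluated, which is where the compute saving comes from.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

token = rng.normal(size=d)
print(moe_layer(token).shape)   # (8,) -- same shape as the input token
```

In Mixtral’s case, each token reportedly activates two of eight experts per layer, which is why the model can store far more parameters than it actually uses to process any single token.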

In simpler terms, an MoE functions like a factory with a team of specialized workers (the experts) and a smart dispatcher (the gate network) that assigns each worker the tasks they are best suited for. This setup improves efficiency and speeds up the overall process, since tasks go to experts who specialize in specific areas, whereas a traditional factory might require every worker to handle every kind of task.

While Mixtral is not the first open MoE model, its notable feature lies in its relatively small parameter count and commendable performance. It is now available on Hugging Face and can be used locally through an app called LM Studio. Mistral has also introduced beta access to an API for three levels of Mistral models.
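For readers who want to try the open weights directly, a minimal sketch using the Hugging Face transformers library might look like the following. The repository name and generation settings here are assumptions based on Mistral’s Hugging Face listing, and the full-precision model requires far more memory than a typical laptop; quantized builds loaded through tools such as LM Studio are the more practical local option.

```python
# Minimal sketch of loading Mixtral via Hugging Face transformers.
# Assumes the "mistralai/Mixtral-8x7B-Instruct-v0.1" repository name and
# enough GPU/CPU memory to hold the weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain what a mixture-of-experts model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```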

The rapid progress and success of Mistral’s Mixtral 8x7B model mark an exciting advancement in the AI landscape. As the field continues to evolve, we can expect further breakthroughs in AI language models and their applications across various industries.

Tanvi Shah
Tanvi Shah is an expert author at The Reportify who explores the exciting world of artificial intelligence (AI). With a passion for AI advancements, Tanvi shares exciting news, breakthroughs, and applications in the Artificial Intelligence category. She can be reached at tanvi@thereportify.com for any inquiries or further information.
