Removed, enjoy !
So Mistral prohibits you from using their models to train or improve other models or compete against them........... I thought they were fully open...... mistral.ai/terms-of-use/
367
626
7,919
2,187,745
We’re announcing a new optimised model today! Mistral Large has top-tier reasoning capacities, is multi-lingual by design, has native function calling capacities and a 32k model. The pre-trained model has 81.2% accuracy on MMLU. Learn more on mistral.ai/news/mistral-larg…. Mistral Large is available on la Plateforme, as well as on Azure, following our commitment to bring frontier AI everywhere. We're eager to see what developers do with Mistral Large!
84
311
2,230
282,014
Self-qualifying oneself as heavyweight while shipping nothing of significance looks like hubris to me
The AI race is very hard to enter at this point: even Mistral is a small player. The US has at least 7 heavyweights, China 3, the EU: 0. In the coming 1-3 years (as AIs become increasingly capable) the amount of available compute will further gain in importance (self-acceleration). Even harder to catch up. Also I don't think it should be the government's role to push this directly: what hinders the EU is lack of VC money, overregulation and taxes.
65
89
1,919
271,845
We are announcing €600M in Series B funding for our first anniversary.  We are grateful to our new and existing investors for their continued confidence and support for our global expansion. This will accelerate our roadmap as we continue to bring frontier AI into everyone’s hands.
119
122
1,882
451,043
An over-enthusiastic employee of one of our early access customers leaked a quantised (and watermarked) version of an old model we trained and distributed quite openly. To quickly start working with a few selected customers, we retrained this model from Llama 2 the minute we got access to our entire cluster — the pretraining finished on the day of Mistral 7B release. We've made good progress since — stay tuned!
55
159
1,567
645,865
At @MistralAI we're releasing our very first model, the best 7B in town (outperforming Llama 13B on all metrics, and good at code), Apache 2.0. We believe in open models and we'll push them to the frontier mistral.ai/news/about-mistra… Very proud of the team !
50
187
1,508
288,307
A new model to hasten AI progress. 24B, 81% MMLU, no RL for now! We're super excited to see the latest development in international open-source AI (kudos to Deepseek!), and cannot wait to bring new contributions to it. We're renewing our commitment to using Apache licenses. AI brings joy: accelerate.
magnet:?xt=urn:btih:11f2d1ca613ccf5a5c60104db9f3babdfa2e6003&dn=Mistral-Small-3-Instruct&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=http%3A%2F%https://nitter.app/t.co/ua2yzvEYLu%3A1337%2Fannounce
105
127
1,515
167,788
We have heard many extrapolations of Mistral AI’s position on the AI Act, so I’ll clarify. In its early form, the AI Act was a text about product safety. Product safety laws are beneficial to consumers. Poorly designed use of automated decision-making systems can cause significant damage in many areas. In healthcare, a diagnosis assistant based on a poorly trained prediction system poses risks to the patient. Product safety regulation should be proportional to the risk level of the use case: it is undesirable to regulate entertainment software in the same way as health applications. The original EU AI Act found a reasonable equilibrium in that respect. We firmly believe in hard laws for product safety matters; the many voluntary commitments we see today bear little value. This should remain the only focus of the AI Act. The EU AI Act now proposes to regulate “foundational models”, i.e. the engine behind some AI applications. We cannot regulate an engine devoid of usage. We don’t regulate the C language because one can use it to develop malware. Instead, we ban malware and strengthen network systems (we regulate usage). Foundational language models provide a higher level of abstraction than the C language for programming computer systems; nothing in their behaviour justifies a change in the regulatory framework. Enforcing AI product safety will naturally affect the way we develop foundational models. By requiring AI application providers to comply with specific rules, the regulator fosters healthy competition among foundation model providers. It incentivises them to develop models and tools (filters, affordances for aligning models to one's beliefs) that allow for the fast development of safe products. As a small company, we can bring innovation into this space — creating good models and designing appropriate control mechanisms for deploying AI applications is why we founded Mistral. 
Note that we will eventually supply AI products, and we will craft them for zealous product safety. With a regulation focusing on product safety, Europe would already have the most protective legislation globally for citizens and consumers. Any foundational model would be affected by second-order regulatory pressure as soon as they are exposed to consumers: to empower diagnostic assistants, entertaining chatbots, and knowledge explorers, foundational models should have controlled biases and outputs. Recent versions of the AI Act started to address ill-defined “systemic risks”. In essence, the computation of some linear transformations, based on a certain amount of calculation, is now considered dangerous. Discussions around that topic may occur, and we agree that they should accompany the progress of technology. At this stage, they are very philosophical – they anticipate exponential progress in the field, where physics (scaling laws!) predicts diminishing returns with scale and the need for new paradigms. Whatever the content of these discussions, they certainly do not pertain to regulation around product safety. Still, let’s assume they do and go down that path. The AI Act comes up with the worst taxonomy possible to address systemic risks. The current version has no set rules (beyond the term highly capable) to determine whether a model brings systemic risk and should face heavy or limited regulation. We have been arguing that the least absurd set of rules for determining the capabilities of a model is post-training evaluation (but again, applications should be the focus; it is unrealistic to cover all usages of an engine in a regulatory test), followed by compute threshold (model capabilities being loosely related to compute). In its current format, the EU AI Act establishes no decision criteria. For all its pitfalls, the US Executive Order bears at least the merit of clarity in relying on compute threshold. 
The intention of introducing a two-level regulation is virtuous. Its effect is catastrophic. As we understand it, introducing a threshold aims to create a free innovation space for small companies. Yet, it effectively solidifies the existence of two categories of companies: those with the right to scale, i.e., the incumbent that can afford to face heavy compliance requirements, and those that can’t because they lack an army of lawyers, i.e., the newcomers. This signals to everyone that only prominent existing actors can provide state-of-the-art solutions. Mechanistically, this is highly counterproductive to the rising European AI ecosystem.  To be clear, we are not interested in benefiting from threshold effects: we play in the main league, we don’t need geographical protection, and we simply want rules that do not give an unfair advantage to incumbents (that all happen to be non-European). Transparency around technology development benefits safety and should be encouraged. Finally, we have been vocal about the benefits of open-sourcing AI technology. This is the best way to subject it to the most rigorous scrutiny. Providing model weights to the community (or even better, developing models in the open end-to-end, which is not something we do yet) should be well regarded by regulators, as it allows for more interpretable and steerable applications. A large community of users can much more efficiently identify the flaws of open models that can propagate to AI applications than an in-house team of red-teamers. Open models can then be corrected, making AI applications safer. The Linux kernel is today deemed safe because millions of eyes have reviewed its code in its 32 years of existence. Tomorrow’s AI systems will be safe because we’ll collectively work on making them controllable. The only validated way of working collectively on software is open-source development. Long prose, back to building!
59
332
1,330
789,363
Clarifying a couple of things since we’re reading creative interpretations of our latest announcements: - We’re still committed to leading open-weight models! We ask for a little patience, 1.5k H100s only got us that far. - We have a reselling agreement with Microsoft, that we’re very excited about. Alongside similar partnerships, it will accelerate our growth. - Microsoft invested in a small convertible note alongside many other companies, as a distribution partner. We are an independent European company with global ambitions, that part is not changing either. We’re seeing some interest for Le Chat and Mistral Large, on both la Plateforme and Azure, and we’ll be iterating fast!
52
144
1,081
157,295
We're back to school! Very proud of our team accomplishments, and honored to partner with @ASMLcompany in our next phase. We're very excited to push frontier AI capabilities in science and technology, with exciting releases ahead.
We’ve raised €1.7B to accelerate technological progress with AI! This Series C funding round, led by @ASMLcompany, fuels Mistral AI scientific research to keep pushing the frontier of AI to tackle the most critical technological challenges faced by strategic industries.
42
75
1,106
117,524
Congratulations, interesting how Github stars seem to correlate to superfluous parameters 😉
The Grok-1 repo is getting pretty popular. I will be responding to pull requests and issues. Feel free to contribute!
44
81
902
157,096
It may soon be a crime to compress public domain human knowledge into public domain matrices. We need to regulate the usage of AI in applications, not gradient descent
26
150
969
380,491
Mixtral is now powering Leo, Brave browser assistant!
Today's update for Brave on desktop (v1.62) dramatically improves Leo, our privacy-preserving AI browser assistant. The most important upgrade is that we've changed the default LLM for Leo to the high-performing and open-source Mixtral 8x7B from @MistralAI for all users.
11
85
812
145,479
Our new OCR model is available through our public API and as a self-deployable solution. In use on le Chat, and part of our specialist model family that is getting bigger !
Introducing the world's best OCR model! mistral.ai/news/mistral-ocr
25
47
749
72,389
The team is fast! It's been super exciting to see le Chat more and more widely adopted. It's an early product, and we can't wait to show you what's coming next. mistral.ai/en/news/all-new-l…
Le Chat is fast (1,100 tok/s for flash queries on an updated Mistral Large). Download it at mistral.ai/app/android or mistral.ai/app/ios
58
62
689
80,367
Our first shot at audio
Introducing the world's best (and open) speech recognition models!
25
34
648
67,691
Our work w/ @mblondel_ml 'Differentiable Dynamic Programming for Structured Prediction and Attention' was accepted at @icmlconf ! arxiv.org/abs/1802.03676 Sparsity and backprop in CRF-like inference layers using max-smoothing, application in text + time series (NER, NMT, DTW)
9
189
627
mistral.ai/news/devstral we’ve released an open-source model which is great at agentic coding tasks - by far Pareto optimal and a good prelude to what’s coming next
20
73
625
38,921
Complex matters slowly coming together — we actually got surprised ourselves
Mistral Medium 3.1 just landed on @lmarena_ai leaderboard—punching way above its weight! 🏆 #1 in English (no Style Control) 🏆 2nd overall (no Style Control) 🏆 Top 3 in Coding & Long Queries 🏆 8th overall Small model. Big impact. Try it now on Le Chat and the API!
29
43
616
78,314
Today, we're announcing Mistral NeMo, a tiny multilingual model, 128k context length, trained with quantization awareness in collaboration with the NVIDIA research team.
18
64
603
52,373
As the Olympic season reaches Paris, we express our respect to ancient Greeks by releasing two new research models, MathΣtral and Codestral Mamba.
14
44
597
47,940
Triangle of happiness!
Introducing Mistral Small 3.1. Multimodal, Apache 2.0, outperforms Gemma 3 and GPT 4o-mini. mistral.ai/news/mistral-smal…
19
37
592
43,908
Expanding from a science company to a science and product company was no easy task, and that release is a very significant milestone in our journey. We're looking forward to how you'll use le Chat, now a slightly more mature animal
We're proud to introduce the next generation of le Chat. Search, PDF upload, coding, image generation, le Canevas... All in one place: chat.mistral.ai/ mistral.ai/news/mistral-chat…
25
48
583
67,248
Reasoning with latency-optimized models is quite a UX game changer. Super proud of what the team has accomplished with this Magistral release! mistral.ai/news/magistral
16
58
539
33,217
This person has put a Mistral 7B model into a stuffed parrot and displays Chinchilla Equation (2) on his torso. Does anyone have his number? This seems a little unsafe.
13
18
449
116,221
Mistral is proud to provide the text LLM powering Unmute, the open-source voice AI from @kyutai_labs!
Kyutai TTS and Unmute are now open source! The text-to-speech is natural, customizable, and fast: it can serve 32 users with a 350ms latency on a single L40S. Try it out and get started on the project page: kyutai.org/next/tts
14
55
460
41,714
Welcome to the party
Meet DBRX, a new sota open llm from @databricks. It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and - as an MoE - inference is blazingly fast. Simply put, it's the model your data has been waiting for.
9
20
440
87,996
Our first hackathon! We’ll be in SF the week before to meet our users
Thrilled to announce the first @MistralAI Hackathon in San Francisco on March 23-24! Sign up at: partiful.com/e/Zk9c9HVsmtsGD… Keynote from Mistral AI founders: @arthurmensch & @GuillaumeLample. Mistral AI mentors: @dchaplot, @sandeep1337, @theo_gervet, @mjmj1oo, @sophiamyang
27
28
425
46,927
With Codestral, our newest state-of-the-art code model, we are introducing the Mistral AI non-production license (MNPL). It allows developers to use our technology for non-commercial use and research. It ensures that every actor on the value chain builds successful businesses. mistral.ai/news/mistral-ai-n….
30
50
362
100,723
Moving forward with a new Small, Pixtral on le Chat and la Plateforme, reduced prices across the board, and an API free tier!
5
22
351
33,406
Day 1 delivery as always :)
Mistral Large is now available to all Perplexity Pro users! Head to your settings page to set it as your default model or test drive it with our Rewrite feature. This model will be available on our mobile apps very soon. Stay tuned!
8
9
344
60,787
Leaving the AI Safety Summit after some constructive discussions today and yesterday. I voiced how open-source was today the safest way to develop AI, putting this transformative technology under the highest level of scrutiny. With many others, we recalled the enormous opportunities that AI brings for better education, better healthcare, unlocking critical science problems, and making our jobs more rewarding. Transparency and fast information flow across different actors are needed to continue enabling these opportunities. Finally, we were many to stress how any new institutions measuring AI progress should be fully independent to avoid the pitfalls of regulatory capture. And it appears we have been heard! We must ensure that any such institution involves the entire world. More data is needed to understand the limitations of current models, and we look forward to engaging with the AI community to jointly agree on a scientific monitoring framework around new AI capabilities in the coming year.
7
55
330
141,065
At Mistral, we've grown aware that to create the best AI experience, one needs to co-design models and product interfaces. Pixtral was trained with high-impact front-end applications in mind and is a good example of that.
We also released Pixtral Large, a new SOTA vision model. mistral.ai/news/pixtral-larg…
14
32
323
43,902
Replying to @far__el
It’s removed, we missed it in our final review — no joke of ours, just a lot of materials to get right !
6
14
306
34,534
"Online Sinkhorn" refines an Optimal Transport distance between two continuous distributions, based on a stream of samples. Stochastic approximation again ! Joint work with @gabrielpeyre, to be presented (oral) at #NeurIPS2020. arxiv.org/abs/2003.01415 github.com/arthurmensch/onli…
2
57
274
Flamingo does feel slightly conscious these days 🦩
7
23
276
Apache 2.0 indeed
6
16
273
30,771
Local Codestral generating #scikitlearn and Keras code to run a medical imaging prediction on the fly. I like that :)
Open Interpreter's new release looks great for locally running LLMs. The tool lets LLMs run code (Python, Javascript, Shell, and more) locally. You can chat with Open Interpreter through a ChatGPT-like interface in your terminal by running `interpreter` after installing. Just do `pip install open-interpreter`, then run `interpreter --local` to set up fast, local LLMs. Congrats @hellokillian 👏
4
28
253
71,448
Self deployable fully packaged frontier AI. If you’re a country you may want to look here instead, we can help!
Introducing Le Chat Enterprise, the most customizable and secure agent-powered AI assistant for businesses, making AI a real leverage for competitiveness. - Integration with your company knowledge (starting with Gmail, Google Drive, Sharepoint…) - Ability to add frequently used documents for better-informed outputs - Enterprise-grade features: agents, coding assistant, web search, global news coverage… - Secure deployment: on-prem, in your cloud, or as a service.
18
22
251
18,590
Very proud of the team, a first step towards making model customisation much simpler.
Announcing `mistral-finetune`, the official repo and guide on how to fine-tune Mistral open-source models:  github.com/mistralai/mistral…
2
17
230
31,750
Very excited to be bringing our models to Snowflake customers as part of this multi-year partnership. LLMs become all the more interesting when contextualised on data, and we’re eager to see developers create powerful applications combining Mistral models with the Data Cloud.
We’re excited to announce a global partnership to bring @MistralAI's most powerful language models directly to Snowflake customers in the Data Cloud.  Learn more about how Snowflake users can leverage AI with their enterprise data: okt.to/skPbvy
1
27
242
36,848
Very happy to partner with @awscloud to expose Mistral models on Amazon Bedrock, as we continue to bring our technology to every developer.
A colossal AI has arrived. Get large with @MistralAI. ☁️💥💻 Mistral Large is now on #AmazonBedrock. Make the most of your data with cutting-edge text generation, top-tier reasoning capabilities, & advanced language processing. #AWS #generativeAI 👉 go.aws/43GcD8V
6
23
240
51,663
Mistral models will very soon be available as a service on Azure, thank you @satyanadella! We bring our technology where developers build.
Copilot will be the new UI for both the world's knowledge and your organization's knowledge, but most importantly, it will be your agent that helps you act on that knowledge. Here are highlights from my keynote today at #MSIgnite.
4
21
230
72,915
Mistral Medium 2 was Miqu by the way
Introducing Mistral Medium 3: our new multimodal model offering SOTA performance at 8X lower cost. - A new class of models that balances performance, cost, and deployability. - High performance in coding and function-calling. - Full enterprise capabilities, including hybrid or on-premises/in-VPC deployment, custom post-training, and seamless integration into enterprise tools and systems. Check out our blog to learn more:
12
12
235
31,217
Codestral 25.01 is not only on top of the Copilot Arena leaderboard, it's also 2x faster than the first Codestral -- that matters a lot for code completion
Exciting news from @CopilotArena! The latest Codestral 25.01 release is now topping the Copilot Arena leaderboard (joint #1, +12 points over previous Codestral!). Congrats to @MistralAI🎆 Try out the new model today in the @CopilotArena VSCode extension.
6
33
231
31,442
3 notebooks on @PyTorch : from optimization (autodiff basics) to learning (a 2-parameter MLP) to deep-learning (Fashion Mnist + learning rate tricks). Thanks @ogrisel @CharlesOllion for this collaboration ! github.com/m2dsupsdlclass/le…
1
64
220
Notice periods are our biggest pain by far
Two European startups @recraftai and @bfl_ml lead image generation in the world. The third place model, Imagen 3, was developed in London but under a Californian company. artificialanalysis.ai/text-t… Europe could lead in other AI products by supporting more entrepreneurship and competition. One issue is that American companies enforce 6 month to 1 year notice periods and non-competes in Europe, but don’t do it in California. A good example of this is @GoogleDeepMind. They force employees to sign these contracts or retaliate by preventing them from getting merit promotions. This is wrong - it’s time for the UK government ( @matthewclifford ) not to be seduced by Google and develop its own AI industry. @xai Grok 3 came to be because a few deepminders left Google under the protection of musk. People leaving DeepMind also led to the creation of @MistralAI - if it was easier to leave, imagine how much more competitive Europe could be. Eliminate notice periods and non-competes in Europe. It has become a question of national security.
7
12
228
49,369
Welcome Marie!
After nearly a decade at Google, I’m happy to share that I’m starting a new position at @MistralAI. If you are excited about joining a small company which has already had an outsized impact in the field, check out our roles, we are hiring!
4
52
209
45,320
The safest way to make useful AI while mitigating misuse is to work on improving it in the open, with the highest level of scrutiny — as always in software. That’s why open-source is the one and only path toward AI safety.
3
31
211
151,038
We are very excited to partner with Harvey to help build domain-specific models! Our platform's deployment flexibility and high customisation capabilities will help Harvey address the highly regulated legal industry. harvey.ai/blog/mistral-annou…
7
16
200
40,646
Totally thrilled to be alongside @GuillaumeLample and @tlacroix6 to create Mistral AI. A lot of work ahead of us!
Life update: I recently left Meta, and we are starting Mistral.AI, a new AI company with @arthurmensch and @tlacroix6
27
15
198
47,863
We shipped higher level apis for agent orchestration - MCP compatible, all server-side logic deployable privately.
Introducing Agents API: your go-to tool for building tailored agents to solve complex real-world problems! mistral.ai/news/agents-api
6
25
199
18,074
A first step towards faster and stronger differentiated models for your use cases. We've rethought fine-tuning to make it super easy to use, both at training and inference.
10
21
196
55,575
AI needs to be connected to the physical world, proud to be supporting !
Today, @ekindogus and I are excited to introduce @periodiclabs. Our goal is to create an AI scientist. Science works by conjecturing how the world might be, running experiments, and learning from the results. Intelligence is necessary, but not sufficient. New knowledge is created when ideas are found to be consistent with reality. And so, at Periodic, we are building AI scientists and the autonomous laboratories for them to operate. Until now, scientific AI advances have come from models trained on the internet. But despite its vastness — it’s still finite (estimates are ~10T text tokens where one English word may be 1-2 tokens). And in recent years the best frontier AI models have fully exhausted it. Researchers seek better use of this data, but as any scientist knows: though re-reading a textbook may give new insights, they eventually need to try their idea to see if it holds. Autonomous labs are central to our strategy. They provide huge amounts of high-quality data (each experiment can produce GBs of data!) that exists nowhere else. They generate valuable negative results which are seldom published. But most importantly, they give our AI scientists the tools to act. We’re starting in the physical sciences. Technological progress is limited by our ability to design the physical world. We’re starting here because experiments have high signal-to-noise and are (relatively) fast, physical simulations effectively model many systems, but more broadly, physics is a verifiable environment. AI has progressed fastest in domains with data and verifiable results - for example, in math and code. Here, nature is the RL environment. One of our goals is to discover superconductors that work at higher temperatures than today's materials. Significant advances could help us create next-generation transportation and build power grids with minimal losses. 
But this is just one example — if we can automate materials design, we have the potential to accelerate Moore’s Law, space travel, and nuclear fusion. We’re also working to deploy our solutions with industry. As an example, we're helping a semiconductor manufacturer that is facing issues with heat dissipation on their chips. We’re training custom agents for their engineers and researchers to make sense of their experimental data in order to iterate faster. Our founding team co-created ChatGPT, DeepMind’s GNoME, OpenAI’s Operator (now Agent), the neural attention mechanism, MatterGen; have scaled autonomous physics labs; and have contributed to some of the most important materials discoveries of the last decade. We’ve come together to scale up and reimagine how science is done. We’re fortunate to be backed by investors who share our vision, including @a16z who led our $300M round, as well as @Felicis, DST Global, NVentures (NVIDIA’s venture capital arm), @Accel and individuals including @JeffBezos , @eladgil , @ericschmidt, and @JeffDean. Their support will help us grow our team, scale our labs, and develop the first generation of AI scientists.
10
13
207
36,060
Mistral 7B is now in prod, nice work @perplexity_ai !
💥 Mistral 7B Instruct is available now. Try it free—labs.perplexity.ai
6
18
172
34,594
"Geometric losses for distributional learning" w/ @mblondel_ml and @gabrielpeyre accepted @icmlconf. We derive a geometric softmax with reg. optimal transport and Fenchel duality. Accounts for cost bw/ classes, outputs discrete/*continuous* distributions. bit.ly/2HkBDKq
39
165
To seriously compare two methods in deep learning (say, GANs), you're looking at 2 methods * 5 random seeds * 5 learning rates * 2 days = 100 days of GPU usage, and that's a minimum. This is often beyond academic labs' capacity, how do you cope?
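The budget in the post above can be sketched as a quick calculation. The factor names and the `n_gpus` value are illustrative assumptions, not figures from the post; only the counts (2 methods, 5 seeds, 5 learning rates, 2 days per run) come from it.

```python
from math import prod

# Counts taken from the post; the key names are illustrative.
grid = {"methods": 2, "random_seeds": 5, "learning_rates": 5}
days_per_run = 2  # GPU-days per single training run, as stated in the post


def gpu_days(grid: dict[str, int], days_per_run: float) -> float:
    """Total GPU-days for a full grid: number of runs times cost per run."""
    return prod(grid.values()) * days_per_run


total = gpu_days(grid, days_per_run)
print(total)  # 100

# Hypothetical: wall-clock time if the runs are spread over n_gpus devices.
n_gpus = 8
print(total / n_gpus)  # 12.5
```

Each extra hyperparameter axis multiplies the total, which is why even a modest grid quickly outgrows a small lab's hardware.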
7
19
161
New steps towards faster model customisation and application building
9
17
161
28,513
(sorry for self-promotion) I will be presenting our work on a new geometric softmax with continuous output, based on OT and Fenchel duality, tonight 6:30 @icmlconf, poster #179. Joint work with @mblondel_ml and @gabrielpeyre
2
28
161
New steps toward completing our AI platform - proud of the team!
2
11
145
22,613
Thank you for your trust @SebLecornu, let's build defence AI together 🇫🇷
The ministerial agency for defence AI (AMIAD) will soon enter into a partnership with @MistralAI. Our future classified supercomputer will also be accessible to public bodies and companies that want to develop AI securely.
5
20
141
13,957
We need to hammer this home: the soon-to-become AGI of many is mere statistical knowledge compression. It’s great, it can help solve millions of problems in healthcare, in education, it can make the jobs of everybody more creativity-oriented.
2
7
126
8,371
Very proud to be partnering with @Capgemini to help enterprises adopt our technology!
Happy to share the news of our latest partnership, announced today: leveraging Mistral AI’s LLM technology, @Capgemini aims to make #GenAI more accessible for enterprises looking to customize and deploy multiple use cases with a lower carbon footprint. bit.ly/4bsZovL
4
13
130
34,044
Very proud of this massive scale partnership. We’re investing in infrastructure to bring AI strategic autonomy to the entire world
MGX, Bpifrance, Mistral AI and NVIDIA are choosing France together! They will develop an open AI campus in Île-de-France, combining data centers, high-performance computing, education and research. France has so much AI talent!
6
6
132
10,679
We purposely made it great at optimal transport as you may have guessed !
Just tested this model on a few challenging math questions and I found it very helpful. Magistral keeps doubting its answers ("wait, but...") & trying to improve them, which makes it great at exploring & exploiting knowledge from its train data (and it's fast). Congrats Mistral !
2
11
128
18,064
We're presenting RETRO at 4:15pm @icmlconf with @borgeaud_s, and later today at the poster session. Add a retrieval DB to divide your model size by 10, don't miss out!
1
16
122
Excited to partner with Cloudflare on low-latency generative AI!
Today we’re excited to announce that we’ve added the Mistral-7B-v0.1-instruct to Workers AI. Mistral 7B is a 7.3 billion parameter language model with a number of unique advantages. Try it today! cfl.re/47iSsyO
4
5
113
34,381
At NeurIPS this week, reach out if you want to discuss our work around LLM at @DeepMind (Chinchilla, Retro, Flamingo), and if you're interested in working with us!
2
2
100
We need less religion and more science to make AI safe and useful
Things we need to get past in the AI safety discussion to make progress: - Circular arguments / tautologies : AGI definitionally being the feared end goal is a substance free position. - Bad/incomplete inductive arguments : I've yet to find an inductive step that has any rigor to it. - Bad analogy : the silicon stack isn't the carbon stack. AI is not a nuclear weapon. - Conflating replication with aggregate capability : Just because you get an exponential with replication does not mean you have exponential capability. To wit, 10 people with 150 IQ is not the same thing as 1 person with 1500 IQ. - Imputing exponential growth from a control system: Many (most?) control systems that get increasingly complex end up with diminishing marginal returns. Not the opposite. - Bad forms of Pascal's wager : If we actually operated this way, we'd all be every religion, and adopt every regulation for every hypothetical. - Claims with no evidentiary basis : recursive self-improvement of intelligence sounds good, but there is absolutely no evidence I can find that supports it can happen. - Appeal to experts : there are plenty on either side. Let's focus on the actual claims.
10
9
90
35,094
Happy to present our work on games @icmlconf: Gradient extrapolation with alternated player update speeds-up equilibrium finding in convex games/GANs ! Joint work with S. Jelassi, C. Domingo, D. Scieur, @joanbruna @NYUDataScience. arxiv.org/pdf/1905.12363.pdf github.com/arthurmensch/dseg
1
17
89
Thank you for your trust Chuck, we're extremely excited to be working with Cisco
Thrilled to announce our AI Renewals Agent, developed in partnership w/ @MistralAI. This is the first big step toward improving our customer experience with Agentic AI – big thanks to the teams driving this innovation! newsroom.cisco.com/c/r/newsr…
4
4
87
14,770
I had the great pleasure to talk about sovereignty in AI with Jensen and @AnjneyMidha. We explain why every nation state needs an AI strategy and what matters for it to be successful. Full link below
9
18
87
19,657
We’re bringing le Chat and our AI Platform to Dell’s hardware for on-prem deployment dell.com/en-us/blog/bringing…
1
10
88
5,841
Can this change? Yes, although not without a paradigm change in the technology, and that’s why it’s great that we talk about risks. But right now the true risk is to mechanically leave the development of AI to 2 or 3 large corporations.
2
5
77
12,224
Replying to @gneubig
Hello, thanks for this study! It seems you have been using a third-party model github.com/neulab/gemini-ben…… based on the Mixtral base model. You'll probably get better results by working with our instructed version (mistral-small on our API). Let us know if we can help!
1
2
80
14,366
DX improvements on la Plateforme!
Introducing @MistralAI Structured Outputs: Define your desired output format with @pydantic and ensure responses are structured exactly as specified🚀
6
4
79
15,429
Answered some hard questions during our trip to SF. Thanks @jacobeffron and @jordan_segall for hosting!
New Unsupervised Learning with @MistralAI CEO and Co-Founder @arthurmensch and @jacobeffron on:
- Naming “Mistral”
- How the LLM landscape will evolve
- Mistral’s commercial strategy
- Regulating AI safety
YouTube: piped.video/_N2KPEdh69s
Apple: apple.co/4cAWKof
Spotify: spoti.fi/4cNRYUL
4
6
68
18,543
But it’s not that hard to replicate. Anyone with 100M and the will to do it can create a rather good model from anywhere on earth. Good actor, or bad actor.
3
4
67
7,179
We present our work on multi-task learning in fMRI at poster #151 #NIPS2017 this afternoon with @GaelVaroquaux @julienmairal @danilobzdok, feel free to come and discuss ! arxiv.org/abs/1710.11438
28
74
Very happy to see the release of these works! In particular, we push semi-parametric language models (an AR Transformer and a nearest-neighbor database) to an unprecedented scale, and obtain continuous improvements with DB size. The semi-parametric route holds many promises!
Replying to @GoogleDeepMind
The three studies explore: Gopher - a SOTA 280B parameter transformer, ethical and social risks, & a new retrieval architecture with better training efficiency. 1: dpmd.ai/llm-gopher 2: dpmd.ai/llm-ethics 3: dpmd.ai/llm-retrieval (more dpmd.ai/llm-retro) 2/
1
13
75
Super excited by this partnership with @SnowflakeDB, with specialist models!
Exciting to combine @SnowflakeDB 's SQL expertise with cutting edge AI from our friends at @MistralAI to create the world’s best SQL copilot!!
2
8
67
18,568
The good news is that, for all of the illegal uses of AI we can imagine (misinformation and knowledge search), AI currently does nothing to lift the actual bottleneck standing in the way (distribution of information and actual execution of what the LLM recommends doing, respectively).
1
3
59
86,786
Great to see the release of Llama II, open-source LLMs are making good progress! Still a lot of room to improve OS models' positioning on the efficiency/performance front, so that they eventually catch up with proprietary solutions. An interesting challenge 😇
3
6
69
16,146
Replying to @grove0100
Canevas is a French word!
5
59
3,659
"It directly follows from Koshi-Schwartz inequality", welcome to a subtitled #NeurIPS2020 conference that promises to be fun
5
1
53
Our paper #nips2017: Multi-layer classif° and dropout regul° permit transfer learning between fMRI datasets. goo.gl/WrnWqb
1
16
52
solved :)
3
48
4,722
Replying to @julien_c
my number one priority
2
42
4,331
I will be presenting this work at @icmlconf on Thursday morning, and at poster #48. Please come and discuss :-) (alpha) code online github.com/arthurmensch/didy…
Smoothing the max operator in a dynamic program recursion induces a random walk on the computational graph. The expected path on that walk can be computed efficiently by backpropagation, which converges to backtracking as smoothing vanishes. arxiv.org/abs/1802.03676
10
42
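The idea in the quoted post above (smooth the max in a dynamic program, then read the expected path off the gradient) can be illustrated on a toy chain. This is only a rough sketch, not the code from the linked repository: it assumes log-sum-exp (entropy) smoothing as one possible choice, and the names `smoothed_max`, `smoothed_viterbi`, `theta`, `trans` and `gamma` are all made up for illustration.

```python
import math


def smoothed_max(xs, gamma):
    """Log-sum-exp smoothing of max: returns (value, gradient).

    The gradient is a softmax over xs; it tends to a one-hot argmax
    as gamma goes to 0.
    """
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp((x - m) / gamma) for x in xs]
    z = sum(exps)
    return m + gamma * math.log(z), [e / z for e in exps]


def smoothed_viterbi(theta, trans, gamma):
    """Smoothed best-path value on a chain, plus the expected path.

    theta[t][j]: score of state j at step t (hypothetical toy potentials).
    trans[i][j]: score of moving from state i to state j.
    Returns (value, marginals), where marginals[t][j] is the gradient of
    value with respect to theta[t][j].
    """
    T, S = len(theta), len(theta[0])
    v = [list(theta[0])]
    q = [None]  # q[t][j][i]: probability of stepping back from j to i
    for t in range(1, T):
        row, probs = [], []
        for j in range(S):
            candidates = [v[t - 1][i] + trans[i][j] for i in range(S)]
            val, p = smoothed_max(candidates, gamma)
            row.append(theta[t][j] + val)
            probs.append(p)
        v.append(row)
        q.append(probs)
    value, last = smoothed_max(v[-1], gamma)

    # Backward pass (backpropagation): push probability mass back along q.
    marginals = [[0.0] * S for _ in range(T)]
    marginals[T - 1] = last
    for t in range(T - 1, 0, -1):
        for j in range(S):
            for i in range(S):
                marginals[t - 1][i] += marginals[t][j] * q[t][j][i]
    return value, marginals


theta = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
trans = [[0.0, -0.5], [-0.5, 0.0]]
soft_value, soft_path = smoothed_viterbi(theta, trans, gamma=1.0)
hard_value, hard_path = smoothed_viterbi(theta, trans, gamma=0.01)
```

With a large `gamma` the marginals spread probability over several paths; with a small `gamma` they concentrate on the single best path, which is what ordinary backtracking would return.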
Our paper on faster stochastic matrix factorization to appear in IEEE TSP 😃 Convergence proofs, 10x speed-ups, code. goo.gl/F5Jgpe
1
13
38
My slides #icml2016: online learning + random subsampling for fast matrix factorization. Thanks to the audience ! amensch.fr/docs/presentation…
16
41
What an abusive subtitle system @NeurIPSConf 😇 I think there is a slight bias against the French accent, and towards French in general: the optimal transport distance (a national pride) was systematically dubbed the American transport distance !
3
34
Large language models are too big: very happy to share this work which is a welcome finding for computational sobriety! If you're into training big models, consider a dataset collection specialist for your next hire, more text tokens are needed ;)
1
37