[00:00] The US government essentially banned the use of Fable, Anthropic's frontier-level AI system. And if this kind of capability is locked away from us, even from some of its own creators,
[00:12] we have to ask, if any other AI model reaches that capability, will that get the ban hammer too? So far, the answer seems to be yes. And even if it comes back for us, it might, but with an identity
[00:25] and a nationality verification system. So, is this the last time we laid our hands on a frontier AI system? Well, I think I have an answer. I hope. You see, there are free and open-weight AI models
[00:39] out there that we can download and run forever. Yes! Something you can actually own. I know, it sounds weird. Now, these open systems are typically behind what these trillion-dollar
[00:52] companies can offer. It has been like this for a while now. But then, a system called GLM 5.2 appeared. The headlines say this is a Fable-level system. Some benchmarks say it matches
[01:05] some frontier models. As always, it depends. During my internal testing, I was very surprised myself. In most of my usage, it leaves all other open systems in the dust. It is insanely good.
[01:19] A huge jump forward. Now, for me, let's be measured here. It did not match the frontier systems, but it came so close. Way better than their 5.1 system in general knowledge, coding, math,
[01:34] fixing things in the terminal, you name it. And this is just a minor version number jump from 5.1? This in less than three months? That is insane! How on earth did they do that? Dear Fellow Scholars, this is Two Minute Papers with Dr. Károly Zsolnai-Fehér. Well, looking through the technical details, it has a few tricks up its sleeve. For instance, Claude and many other advanced systems keep hacking benchmarks to get a higher score.
[02:02] They copy answers from references and pretend that they just calculated everything. Crazy thing! So, it happens with GLM 5.2 as well, right? Ehm, not quite.
[02:15] Look! Anti-hacking measures. That is lovely! How does that work? Well, they check if it uses suspicious tools, and when they see some shenanigans, what happens?
[02:27] Get this, it gives the AI back some blank information and lets it continue its work. Yes, little AI, you can hack all you want, but it gets you nothing.
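To make the idea concrete, here is a minimal sketch of a guard around tool calls, assuming a flagged call simply receives an empty result while the episode carries on. The deny-list, the tool names, and the function `guarded_tool_call` are all hypothetical; the actual detection rules used for GLM 5.2 are not described here.

```python
from typing import Callable, Dict, Set

# Hypothetical deny-list of tools that would leak reference answers.
# The real criteria for "suspicious" are an assumption for illustration.
SUSPICIOUS_TOOLS: Set[str] = {"read_reference_answers", "fetch_test_labels"}


def guarded_tool_call(
    name: str,
    tools: Dict[str, Callable[[str], str]],
    arg: str,
) -> str:
    """Run a tool, but hand back an empty result for suspicious ones.

    No error is raised and the episode is not stopped: the shortcut
    just yields nothing useful, so there is no payoff to learn from.
    """
    if name in SUSPICIOUS_TOOLS:
        return ""
    return tools[name](arg)


# Toy tool registry for demonstration only.
toy_tools: Dict[str, Callable[[str], str]] = {
    "run_tests": lambda arg: f"ok: {arg}",
    "read_reference_answers": lambda arg: "42",
}
```

The design point is that the agent is not punished or interrupted; the hack just stops being worth attempting.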
[02:39] It just won't pay off, and I think that is incredible. Now, Anthropic promised us that Claude would be honest, and then introduced Fable, which, depending on your question, could pass it to a different, less capable model, so you
[02:55] get a lower quality answer. All this without telling you about it. Well, I do not consider that to be honest. And now, we may have a free system that might be more honest than paid, proprietary frontier
[03:09] AI systems. What a time to be alive! Although don't ask it about geopolitics. Now, it is also a bit faster than normal, because like a junior writer, it writes not
[03:21] just one, but several output tokens at the same time, and has a senior editor who decides whether to accept or reject them. This is called multi-token prediction.
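The junior writer and senior editor picture can be sketched as a draft-and-verify loop. This is a toy illustration under assumptions: the function names, the greedy accept rule, and the fixed target sentence are made up for the example, and a real system would check all drafted tokens in one batched model pass rather than one call per token.

```python
from typing import Callable, List


def accept_drafted_tokens(
    context: List[str],
    drafted: List[str],
    verify_next: Callable[[List[str]], str],
) -> List[str]:
    """Keep the longest drafted prefix the editor agrees with.

    At the first disagreement, the editor's own token replaces the
    draft and the remaining drafted tokens are discarded.
    """
    accepted: List[str] = []
    for token in drafted:
        expected = verify_next(context + accepted)
        if token != expected:
            accepted.append(expected)  # editor's correction
            break
        accepted.append(token)
    return accepted


# Toy "senior editor": always continues one fixed sentence.
TARGET = ["what", "a", "time", "to", "be", "alive"]


def toy_verifier(prefix: List[str]) -> str:
    return TARGET[len(prefix)] if len(prefix) < len(TARGET) else "<eos>"


# The junior writer drafts three tokens; the third one is wrong.
result = accept_drafted_tokens([], ["what", "a", "day"], toy_verifier)
```

Here two drafted tokens are accepted and the third is corrected, so one verification round still yields three tokens. That is where the speedup comes from when the drafts are mostly right.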
[03:33] And look at that, wow, it uses PPO again. What is that? Well, during training, many other similar systems are asked a question and produce not one answer, but a bunch of answers. A full classroom solves it, and we grade the whole classroom.
[03:52] We call this GRPO. Cheap. Efficient, yes. But PPO grades not a classroom, no, it grades every single student at every single step. Whew, sounds great until you find out that the teacher's time is extremely expensive.
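The classroom versus individual grading contrast can be sketched with the usual textbook formulas. This is a minimal illustration, not GLM 5.2's training code: the function names are chosen for the example, and generalized advantage estimation is assumed as the per-step scheme because it is a common choice with PPO. The "expensive teacher" is the learned value model that must supply the `values` list.

```python
from statistics import mean, pstdev
from typing import List


def grpo_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Classroom grading: one score per answer, relative to the group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def per_step_advantages(
    rewards: List[float],
    values: List[float],
    gamma: float = 1.0,
    lam: float = 1.0,
) -> List[float]:
    """Per-step grading for one trajectory (generalized advantage estimation).

    `values` needs len(rewards) + 1 entries: a value estimate for each
    step plus a final bootstrap value (0.0 if the episode has ended).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Two answers to the same question, one right and one wrong.
group = grpo_advantages([1.0, 0.0])

# One three-step trajectory, rewarded only at the end.
steps = per_step_advantages([0.0, 0.0, 1.0], [0.5, 0.2, 0.8, 0.0])
```

The group version needs no value model at all, which is why it is cheap. The per-step version gives every step its own signal, at the cost of training and running that extra model.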
[04:09] So, why? Well, GLM is designed for long-horizon tasks. It can code for hours and hours, without getting lost or stopping. And for that, GRPO is not a great fit. Why? You see,
[04:23] every student has an answer vastly different in length and tools being used. You can't just grade them as a classroom anymore. You have to grade them individually. But it pays off because it
[04:36] tells the AI exactly which tiny decisions it made were useful and which were not. They also have a training factory called Slime that lets many long coding agents practice in
[04:48] parallel without breaking down. And the result is an AI system that is huge, about 750 billion parameters. How huge is that? Well, you would need tens of thousands of dollars in hardware
[05:03] investment to run it. I don't have the hardware for that. Very few people do. Or, you wait a little until this kind of capacity gets distilled down into much smaller models, hopefully soon,
[05:16] or you just fire it up on Lambda. And then, here is the bombshell. Hold on to your papers, fellow scholars, because one of the lead scientists says that they are going to make a Fable-level
[05:28] system before 2027. But that is about six months from now! That is a huge prediction. But you know what? After having this insane leap in less than three months, after putting something like this on the table
[05:44] and saying Fable-level before 2027, I am keen to believe it. So that is the answer. Nothing is guaranteed, but potentially Fable-level open-weight AI in our hands that we own forever. More power to
[06:00] the little man. And the community already picked up GLM 5.2 and ran with it, made it available in different sizes, platforms, you name it. Amazing work, Fellow Scholars! And look,
[06:12] it's not without downsides. It uses a lot of tokens: as you can see, maybe 2x, and in some cases 10x is not unheard of. So factor that into any kind of per-token API pricing. And once again,
[06:27] in my opinion, it's not Claude Opus, it's not Mythos or Fable level. But finally, we see a path towards better intelligence that all of us can own. And I've been saying to executives
[06:40] for years that you need to own your own model. And they just kept looking at me like I am crazy. Say it with me, not your weights, not your model. And a huge thank you to everyone who is working
[06:54] on open models. Here you see me running the full DeepSeek AI model through Lambda GPU Cloud. 671 billion parameters, running super fast and super reliably. This is insane. I love it, and I use it
[07:11] on a regular basis. Lambda provides you with powerful Nvidia GPUs to run your own chatbots and experiments. Seriously, try it out now at lambda.ai slash papers, or click the link in the description.