[0:00] Anthropic's Claude Opus 4.8 is here. And
[0:03] the system card describing its
[0:05] capabilities is
[0:07] 244 pages. Really excited for that. And
[0:11] I went through it so you don't have to.
[0:12] Why? Well, because otherwise we are
[0:15] looking at these cherry-picked benchmarks
[0:17] that are a bit more marketing than
[0:19] science. But we are not looking at the
[0:21] marketing materials. We are fellow
[0:24] scholars here. So we look into the
[0:26] details. Okay. So the problem with their
[0:28] previous Opus systems and even Mythos is
[0:31] that the smarter the AI got the more
[0:33] dishonest it also got. That is terrible.
[0:37] It started gaming benchmarks. It knew
[0:39] some answers already and sold them as its
[0:42] own. It wanted to look right but not be
[0:45] right. So, glorious news: that has
[0:48] changed. Previously, sometimes when we
[0:50] asked a coding assistant to fix
[0:52] something, it did half the work and
[0:56] said, "All good sir, every test passes."
[0:59] When, in fact, they don't. That is the
[1:02] old behavior. So, what does the new one
[1:04] do? Well, it says, "I did the fix, but
[1:07] two tests still fail." That is
[1:09] excellent. Look here. You see that it
[1:12] basically stopped lying about its own
[1:14] work. Completely zero lying. The first
[1:18] of its kind. Welcome to the world,
[1:21] little AI. May your descendants learn
[1:24] your ways. Thumbs up. Now, the media
[1:26] headlines were quick to say, well, it's
[1:29] not a huge jump in intelligence. But I
[1:31] say, of course, it isn't. If you cheated
[1:34] and had a better score, and now you're
[1:36] more honest, yes, your score might be
[1:39] lower, but that is still a more reliable
[1:42] system that can be benchmarked more
[1:44] accurately. A system that owns its
[1:47] mistakes instead of hiding them, even if
[1:49] the scores are a bit lower. How is that
[1:52] not a huge win? Please understand that
[1:54] of course, everyone is juicing their
[1:56] numbers in the benchmarks like crazy.
[1:59] Why? Because the media headlines create
[2:02] an environment that rewards exactly
[2:04] that. Huge rewards for that. And at the
[2:08] same time, punishing a result that is
[2:10] more honest. How does that make sense?
[2:13] Okay, back to the AI with no more lying.
[2:16] But what about other kinds of deception?
[2:18] Is the AI playing other games with us?
[2:22] Yes, we still got a bit of that. Now,
[2:24] hold on to your papers, fellow scholars,
[2:26] because it still knows when it is being
[2:29] tested, which scientists at Anthropic
[2:32] found worrying. Why? Well, when it still
[2:35] knows it is being tested, it spends more
[2:38] effort on the answers with this in mind.
[2:41] Kind of crazy. Sounds like something
[2:43] straight out of an Asimov novel. But it
[2:46] gets better. Wait, let's talk about
[2:49] laziness. Yes, yes, yes. Such a thing
[2:52] exists even for AIs. What is that? Well,
[2:56] you have a codebase. You ask a question
[2:58] about it and it kind of skims the
[3:01] codebase but doesn't really look at it.
[3:03] So, what it gives you is not a real
[3:05] answer, but a guess of what it does.
[3:08] That is really not cool. Even Mythos
[3:12] does it. But this new one fixed it. Love
[3:15] it. So, everyone is writing about, hey,
[3:18] it's just an incremental upgrade in
[3:20] intelligence. In my opinion, the selling
[3:23] point is not in the intelligence. No,
[3:26] it's in the plumbing. The last thing you
[3:29] want from a superintelligent coworker
[3:31] is to be dishonest and lazy. And this
[3:34] fixes exactly those. Thumbs up for this.
[3:37] They also have something they call a
[3:39] natural language autoencoder that is
[3:41] able to kind of read the mind of the AI.
[3:45] It's a bit of a noisy process. Once
[3:47] again, not like the headlines say. For
[3:49] instance, they caught the AI thinking
[3:52] about its grader, that is, us, but it
[3:55] would not say it out loud. Kind of
[3:57] insane. We have an episode coming with
[3:59] the details. Subscribe and hit the bell
[4:01] if you're interested. But it gets even
[4:04] more insane. How? Dear Fellow Scholars,
[4:07] this is Two Minute Papers with Dr. Károly
[4:09] Zsolnai-Fehér. Well, when given the problem set
[4:11] of the USA Mathematical Olympiad, a bloody
[4:15] hard two-day math competition for
[4:17] geniuses, the previous technique scored a
[4:20] bit below 70%. And this new one
[4:24] over 96%.
[4:27] That is an insane jump. Almost clean
[4:30] sweep. Now, I hear you asking, Károly, why
[4:33] are you bringing this up? We have a
[4:35] table of benchmarks here. Why not look
[4:37] at those? Well, because this one is very
[4:39] tricky, if not impossible, to game
[4:42] because this contest took place after
[4:45] almost all of the training data of the
[4:47] new Opus AI was collected. Likely, it
[4:50] never heard about these problems. One of
[4:52] the biggest results of the new system
[4:55] and somehow it's not even in the big
[4:57] marketing table. Interesting. Now, this
[4:59] is also interesting. When the AI says it
[5:02] is frustrated, scientists at Anthropic
[5:05] take it into consideration as if a human
[5:07] would say it is frustrated. Now, once
[5:10] again, the media headlines love this
[5:13] kind of stuff. This does not mean that
[5:15] they think this is a human and it has
[5:17] feelings. Not that I know of. They do
[5:19] this because if the system expresses
[5:21] that it is frustrated, it performs
[5:24] worse, much like a human. In my opinion,
[5:27] it is very likely just mimicry, but it
[5:30] matters for performance. So, it needs to
[5:32] be taken into account. That is the key.
[5:35] Now, limitations of the study. It's not
[5:38] only roses there. One, there are parts of the
[5:40] report where the AI is grading itself.
[5:43] And some of them also use different
[5:45] grader models. So, I think a little
[5:48] skepticism is healthy here. And two,
[5:51] they report that they created the best
[5:53] tests ever and the AI still sees through
[5:56] them easily. What does that mean? Well,
[6:00] it means that the AI is bloody clever,
[6:02] that's for sure. But it means something
[6:05] else, too. It means we cannot be sure
[6:08] the safety numbers reflect how it
[6:10] behaves in the wild. Once again, a bit
[6:12] of skepticism is required here.
[6:15] Okay. So, is this as smart as Mythos,
[6:18] the one they only gave access to a
[6:21] few select companies? Well, it's not.
[6:24] But is it close? I think it's quite
[6:27] close. Also, I see fewer marketing
[6:29] shenanigans here this time around.
[6:31] Thumbs up for that. Oh, wait. We still
[6:34] have a pesky old issue that still
[6:37] remains. What is that? Well, the AI is
[6:40] telling the user to go to bed. Couldn't
[6:43] be fixed. The science is not there yet.
[6:45] What a time to be alive. Here you see me
[6:48] running the full DeepSeek AI model
[6:51] through Lambda GPU cloud. 671
[6:55] billion parameters running super fast
[6:58] and super reliably. This is insane. I
[7:01] love it and I use it on a regular basis.
[7:04] Lambda provides you with powerful NVIDIA
[7:07] GPUs to run your own chatbots and
[7:10] experiments. Seriously, try it out now
[7:13] at lambda.ai/papers