[0:00] DeepSeek just dropped their latest [0:02] flagship model, V4. It's massive, [0:06] powerful, open-source, and a fraction of [0:10] the cost. And it might be the model that [0:12] ends America's lead in artificial [0:15] intelligence. Not because China caught [0:17] up, but because of what happens next. [0:20] So, usually at this point, I would do a [0:22] model overview. I would tell you about [0:23] the model. I would show you the [0:25] benchmarks. I would test it and show you [0:27] what I think. But as I looked at it, I [0:29] realized there was actually a much [0:31] bigger story here. America has the best [0:33] chips. It has the most money flowing [0:36] into AI labs. Yet, China was able to [0:40] release a frontier-level model that [0:43] matches the best of them. Completely [0:45] open-source, completely open-weights, [0:48] and at a fraction of the cost and [0:50] resources. They are literally working [0:52] with nerfed Nvidia GPUs. That's not [0:55] supposed to be possible, and the fallout [0:57] will be bigger than people realize. And [0:59] so today, I'm going to tell you what's [1:01] special about DeepSeek V4, why it [1:03] matters, and what it means for the [1:06] world. But first, if you love seeing [1:08] videos about AI models, go ahead and hit [1:11] the like and subscribe button. I want to [1:14] reach as many people as possible and [1:15] teach them about artificial [1:17] intelligence, get them excited about it. [1:19] And hitting the like and subscribe [1:20] button helps the channel more than you [1:23] realize. So, thank you for doing that in [1:25] advance. Okay, so what is so special [1:27] about DeepSeek V4? First, let me tell [1:29] you about who DeepSeek actually is. If [1:32] you remember, about 18 months ago, they [1:34] dropped a model that literally changed [1:37] the world. It was called DeepSeek R1, [1:40] and it was an open-source, open-weights [1:42] model that could think. 
Remember, back [1:45] then, models that could think were only [1:48] developed by the closed-source AI labs [1:50] in the United States. They dropped R1, [1:53] showed the world that other countries [1:56] and open-source labs could develop [1:59] models that were at the frontier, and [2:01] the stock market dropped 20% pretty much [2:04] overnight. And what was really special [2:06] about DeepSeek R1 was how efficiently [2:09] they were able to train it, at a fraction of [2:11] the hundreds of billions [2:13] of dollars paid by the frontier US AI [2:16] labs. And so people thought, "Wow, if [2:19] they can train it at a fraction of the [2:20] price, then maybe Nvidia GPUs are not [2:23] actually worth that much." But it turns [2:25] out they were very wrong about that. [2:27] When things get cheaper, we [2:29] actually use a lot more of them. That's [2:31] called the Jevons paradox. Okay. And now [2:33] fast forward to today. DeepSeek is back [2:36] with V4. And they wrote an incredibly [2:39] thorough white paper explaining how they [2:41] did all of it, including being super [2:44] honest about some of their failures, [2:46] much more honest than any of the [2:48] closed-source AI labs in the United States. All [2:50] right, so here's the post. It came out [2:52] late last night. Let me tell you about [2:54] it. DeepSeek V4 preview is here, and it [2:58] comes in two flavors, Pro and Flash. [3:01] First, it has a million-token context [3:03] length. That is amazing because that is [3:06] the frontier. So immediately check that [3:08] box. They are at the frontier of context [3:10] limits. Next, Pro is a 1.6 [3:15] trillion total parameter model with 49 billion [3:18] active parameters. This is called [3:20] mixture of experts. It basically allows [3:22] you to have a massive model but run only [3:26] the small parts of the model that are [3:29] specific to the question or the prompt [3:32] that you're giving it. 
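As a rough illustration of the mixture-of-experts idea described above, the sketch below does two things. It computes the share of parameters that are active per token from the 1.6 trillion total and 49 billion active figures quoted in the video, and it shows a toy top-k router that picks a few experts per token. The function names, the router scores, and the choice of k=2 are assumptions made for illustration only; they are not taken from DeepSeek's white paper and do not describe how V4 actually routes tokens.

```python
import math


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters that run for a single token."""
    return active_params / total_params


def top_k_route(router_scores: list[float], k: int) -> dict[int, float]:
    """Toy router: keep the k highest-scoring experts and
    softmax-normalize their scores into mixing weights."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_scores[i] for i in chosen)  # subtract for numerical stability
    exps = {i: math.exp(router_scores[i] - peak) for i in chosen}
    norm = sum(exps.values())
    return {i: value / norm for i, value in exps.items()}


# Figures quoted in the video: 1.6 trillion total, 49 billion active.
pro_fraction = active_fraction(1.6e12, 49e9)
print(f"Active share per token: {pro_fraction:.1%}")  # prints 3.1%

# Hypothetical router scores for four experts; only two are used per token.
weights = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
print(weights)
```

Under these assumptions, only about 3% of the weights do work on any given token, which is why a model this large can still be comparatively cheap to run.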
They also have V4 [3:35] Flash, which is their workhorse model. [3:37] It's going to be smaller, it's going to [3:39] be faster, and it's going to be much [3:40] cheaper. This is 284 [3:43] billion total parameters with 13 billion [3:46] active. And if we look at this [3:48] screenshot, we can see both of them were [3:51] trained with about 33 trillion tokens of [3:55] training data. So, some of the [3:56] characteristics of these models: they [3:59] have enhanced agentic capabilities. It [4:01] is comparable to the state-of-the-art [4:03] agentic coding models like Opus 4.7 and [4:06] GPT 5.5, literally the models that were [4:10] just released in the last week from [4:11] Anthropic and OpenAI. It has rich world [4:14] knowledge and world-class reasoning, and it beats [4:17] all current open models in math, STEM, and [4:19] coding, rivaling top closed-source [4:21] models. All right. So let me show you [4:22] some of the major benchmarks. Here we [4:24] have MMLU-Pro, which is knowledge and [4:26] reasoning. We can see here in the dark [4:29] green bar, this is DeepSeek, with orange [4:31] being Opus 4.6, purple being GPT 5.4, and [4:34] then in the stripes, those are the new [4:37] models, Opus 4.7 and GPT 5.5. But what [4:40] we're seeing is, although it is slightly [4:43] behind here, it's right up there. Okay? [4:46] And remember, it is a fraction of the [4:48] price. Here, GPQA Diamond, same thing. [4:50] SWE-bench Verified. And basically what [4:53] you're seeing across the board is, yes, it [4:56] is behind, but just a little bit. And [4:59] that's the real story here. Most use [5:02] cases, the vast majority of use cases, do [5:05] not require the absolute frontier level of [5:08] intelligence. And the fact that DeepSeek [5:10] is so much more efficient and so much [5:12] cheaper is actually the problem for the [5:16] United States. And so let's talk about [5:18] cost, because that is really what we need [5:22] to be scared of. And if you're not sure [5:24] why, I'm going to explain. 
Let's look at [5:26] the cost first. This is AI model price [5:29] versus performance. On the y-axis, we [5:32] have intelligence. Just think, the higher, [5:34] the smarter. On the x-axis, it is the [5:38] price. The more to the left it is, the [5:42] cheaper it is. Cheaper is better. And so [5:44] what you want is to be up here, in this [5:47] top left. You want to be as cheap as [5:50] possible and as intelligent as possible. [5:53] And so what do we see? We see GPT 5.5, [5:56] which was just released, at the very [5:58] top. We have Opus 4.7 right next to it. [6:02] And I'm just measuring intelligence [6:04] right now. GPT 5.4 extra high right over [6:07] here. And then we have DeepSeek V4 Pro, [6:10] a little bit behind, a little bit lower [6:12] on intelligence, but much, much cheaper. [6:17] And then look at Flash down here. [6:19] Certainly a big drop in intelligence. [6:22] Still really good, but this is an [6:24] absolute workhorse model price right [6:26] here. This is pennies per million [6:29] tokens. Now, I want to show you how the [6:31] rivalry between the US frontier labs and [6:35] Chinese frontier labs has gone over the [6:37] last few years. So we have GPT-4, which [6:41] came out in May 2023, and we had this [6:44] massive gap. This is the Elo score in [6:46] Arena. And then Qwen, then GLM-4, and at [6:51] this point, right after o1-preview came [6:53] out. Remember, o1 was the first thinking [6:56] model. Right after that, just a few months later, [6:58] DeepSeek R1 changed the world and [7:00] closed the gap almost completely. The US [7:04] labs did shoot ahead, and there's been [7:06] this back and forth, ebb and flow, between [7:08] them. Every time the US shoots ahead, [7:11] Chinese open source catches up. They [7:13] have always been behind, but that might [7:15] not always be the case. And so that [7:18] brings us to a geopolitical question. [7:21] Are export controls actually working? 
[7:25] Export controls basically mean the US, [7:28] specifically Nvidia, is not allowed to [7:31] sell its top chips, its best GB300 and a [7:36] few others, to China directly. Now, there [7:39] are a lot of rumors that China is going [7:41] around those export controls, [7:43] importing the chips into other countries and [7:45] smuggling them in. And there's an entire [7:47] story there. We're not going to get into [7:48] that today, though. But are export [7:50] controls working, assuming that they are [7:53] actually enforced? Well, the answer is [7:56] kind of yes and kind of no. Export [7:58] controls are working because China [8:00] doesn't have the same compute resources [8:02] that the United States has. This is just [8:05] a fact. Even if they're able to smuggle [8:08] in chips, it is difficult, and they [8:11] certainly don't have as much compute as [8:13] we have in the United States. But if [8:16] they did, imagine what they'd be able to [8:18] do. Because on the flip side, the export [8:22] controls kind of aren't working, because [8:24] Chinese labs are innovating on the algorithm [8:26] side. They are coming up with incredible [8:29] algorithmic unlocks that make training [8:31] and running inference of these models, [8:34] including DeepSeek, incredibly efficient. [8:37] And so even using nerfed GPUs, even [8:41] using Chinese native GPUs, they're still [8:45] able to create a frontier-level model. [8:48] And in fact, Nvidia, specifically [8:49] Jensen, has made arguments for selling [8:53] our top GPUs to them: China is going to [8:55] be developing and producing their own AI [8:58] chips. They should be built on American [9:01] technology. And that argument is [9:03] actually why DeepSeek V4 is [9:06] such a big deal and such a big threat to [9:09] the US economy. But just the flip side [9:12] to it, they're going to make their own [9:14] chips. 
They're going to make their own [9:17] incredible models, and they are going to [9:19] be very attractive to US companies and [9:21] our allies. But more on that in a [9:24] minute. All right, I want to talk about [9:25] distillation hacking, because it's all [9:27] related. Just a few weeks ago, Anthropic [9:30] put out a report basically saying they [9:32] have proof that the top Chinese AI labs [9:36] have been running distillation attacks [9:39] against their Claude model. And what does [9:41] that actually mean? The simplest way to [9:43] explain a distillation attack is that [9:45] the Chinese AI labs are essentially [9:48] trying to steal the data from Claude and [9:50] from ChatGPT. They're asking it [9:53] questions, getting the answers, and then [9:55] using those question-and-answer pairs to [9:58] train their own models. Those [9:59] question-and-answer pairs are everything. [10:01] That's the IP of companies like [10:03] Anthropic and OpenAI. And just [10:05] yesterday, the US government put out a [10:08] statement on distillation attacks. This [10:10] is Director Michael Kratsios: The US has [10:13] evidence that foreign entities, primarily [10:15] in China, are running industrial-scale [10:17] distillation campaigns to steal American [10:19] AI. We will be taking action to protect [10:21] American innovation. Now, this was [10:24] already reported by Anthropic a few [10:26] weeks ago. So, this is not really new [10:27] news, but the US government actually [10:30] saying, yes, it's happening, is the new [10:32] part of it. And I'm going to explain why [10:35] this ties into this overall story in a [10:37] moment. These foreign entities are using [10:39] tens of thousands of proxies and [10:40] jailbreaking techniques and coordinated [10:42] campaigns to systematically extract [10:44] American breakthroughs. But here's the [10:46] thing. 
If you look at Anthropic's report, [10:49] the Chinese labs, and specifically [10:51] DeepSeek, didn't really steal all that [10:54] much data. And there is actually an [10:56] argument that they weren't stealing at [10:58] all. Maybe it's against the terms of [11:00] service, but a lot of it can be [11:02] explained by simple benchmark [11:04] comparisons. If you're a frontier lab [11:07] and you want to know how well your [11:09] model does against your competitor's model, [11:11] well, the only way to know is to run [11:14] benchmarks against both. And those [11:16] benchmarks look exactly the same as a [11:18] distillation attack. All right, so this [11:20] is the report from Anthropic. I just [11:22] want to very briefly show one thing. The [11:24] scale of DeepSeek's distillation attack [11:27] is just 150,000 exchanges. That is not [11:31] much. Now, Moonshot, the company behind [11:33] Kimi, had 3.4 million, and MiniMax had [11:37] 13 million. So certainly DeepSeek, of the [11:40] Chinese labs, has been doing this quote-[11:43]unquote distillation attack far less than [11:45] the other labs. And 150,000 exchanges is [11:50] not really enough to explain the level [11:53] of quality that DeepSeek has been able to [11:55] achieve. And then you pair that with the [11:57] fact that they've open-sourced the whole [11:59] thing. They have an incredibly detailed [12:01] and thorough white paper that explains [12:03] exactly how they were able to achieve [12:05] it. It just doesn't mesh. And so, back to [12:09] whether export controls are actually working. [12:11] Well, Twitter user Jukcon pointed out [12:13] something very interesting in the report, [12:16] because of course, like I said, DeepSeek [12:18] put out a very thorough report. It says, [12:21] due to constraints in high-end compute [12:23] capacity, the current service capacity [12:25] for Pro is very limited. 
After the 950 [12:28] supernodes are launched at scale in the [12:31] second half of this year, the price of [12:33] Pro is expected to be reduced [12:34] significantly. So they are very compute [12:37] constrained. They were able to bake and [12:39] produce this model, but they can't even [12:42] serve it in the most optimized way. And [12:44] they're also charging more than they [12:46] would have otherwise. So the price is [12:48] going to continue to drop, and the price [12:50] is what I want to focus on now. So why [12:52] is the price and efficiency of DeepSeek [12:56] V4 such a big deal? Yes, it is nearly [13:00] state-of-the-art. Nearly, not quite. It [13:03] is almost as good as the top models, Opus [13:06] 4.7 and GPT 5.5. But here's the thing: it [13:10] doesn't need to be as good. And just [13:14] being nearly as good is good enough for [13:17] almost everybody, including enterprise [13:20] companies in the United States. And that [13:22] is what matters. Imagine you are the CEO [13:26] of a company in the United States or one [13:28] of our allied countries, and you're looking [13:31] at Opus 4.7. You're looking at GPT 5.5, [13:35] and you're looking at the costs, and you [13:37] see GPT 5.5 is $30 per million output [13:41] tokens. You see Opus 4.7 is similarly [13:43] priced. And then you look at DeepSeek, [13:46] and it can accomplish all of your use [13:49] cases, because you're not doing frontier [13:51] scientific research. You're not trying [13:53] to crack some of the hardest coding [13:55] problems in the world. You have a [13:57] business and you're trying to run your [13:59] business. And you look at the price and [14:01] it is literally a fraction. And you get [14:03] to control it more precisely. It's open [14:06] source. You can fine-tune it all you [14:08] like. You can make it exactly how you [14:09] like, host it how you like, and your [14:12] bill will be a fraction of the size it [14:15] would be otherwise. 
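To make the CEO's arithmetic concrete, here is a minimal sketch of the bill comparison. Only the $30 per million output tokens figure comes from the video. The open-model price and the monthly token volume below are hypothetical placeholders chosen purely for illustration; they are not DeepSeek's actual pricing, and the helper name `output_cost_usd` is likewise made up for this example.

```python
def output_cost_usd(output_tokens: int, usd_per_million: float) -> float:
    """Cost of generating `output_tokens` at a per-million-token price."""
    return output_tokens / 1_000_000 * usd_per_million


FRONTIER_USD_PER_MILLION = 30.0        # quoted in the video for GPT 5.5 output tokens
HYPOTHETICAL_OPEN_USD_PER_MILLION = 3.0  # ASSUMPTION: placeholder, not a real price
MONTHLY_OUTPUT_TOKENS = 500_000_000    # ASSUMPTION: placeholder workload

frontier_bill = output_cost_usd(MONTHLY_OUTPUT_TOKENS, FRONTIER_USD_PER_MILLION)
open_bill = output_cost_usd(MONTHLY_OUTPUT_TOKENS, HYPOTHETICAL_OPEN_USD_PER_MILLION)

print(f"Frontier model bill:  ${frontier_bill:,.0f} per month")
print(f"Open model bill:      ${open_bill:,.0f} per month")
print(f"Open model costs {open_bill / frontier_bill:.0%} of the frontier bill")
```

Swapping in real prices and a real workload changes the numbers but not the shape of the decision: the bill scales linearly with the per-token price, so a price that is a fraction of the frontier price produces a bill that is the same fraction.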
The calculus that [14:17] these CEOs are making becomes very [14:20] obvious. Why would you pay so much more [14:23] for a US frontier lab to serve you their [14:27] model over an open-source Chinese model? [14:30] And that's where the problem comes in, [14:32] because more and more enterprise companies in the US and [14:34] our allied countries are going [14:37] to think about this and make the [14:39] decision to build on top of Chinese [14:41] open-source technology. And that's the [14:44] big argument. Remember, Jensen just made [14:47] the argument that, hey, China is going to [14:48] be building their own chips. They're [14:50] going to be building their own models. [14:51] They might as well be built on US chips. [14:54] Well, the same argument applies on the flip [14:57] side, with US companies building on top [14:59] of Chinese open-source models. That is [15:01] a big security risk for the United [15:03] States, because if Chinese companies [15:05] decide to change their architecture or [15:08] cut us off suddenly, we're in a really [15:10] bad spot. And so, let's think about [15:12] this. We have trillions of dollars [15:15] pouring into the AI industry in the [15:17] United States. We have infrastructure [15:20] buildout happening more quickly than any [15:22] infrastructure buildout in history. So [15:24] if you have all of this investment that [15:26] requires a return, and all of a sudden [15:29] we're not getting that return, there is [15:31] the potential for the US economy to [15:33] collapse, especially because we are [15:36] betting so heavily on artificial [15:38] intelligence being the future of our [15:41] economy. And then think about it [15:43] culturally. Think about how social media [15:45] changed the world, and social media came [15:48] from the United States. We were able to [15:50] control the narrative in a lot of [15:52] places. Now flip that on its head. 
[15:55] Imagine we're all built on Chinese [15:57] models and they're dictating what the [16:00] models are able to say and what they're [16:02] not able to say. These are big questions [16:05] that we're going to have to grapple with [16:06] if US companies decide to build their AI [16:09] strategy on top of Chinese open-source [16:11] models, and they are looking very [16:14] attractive right now. All right, so [16:16] where do we go from here? Well, I think [16:18] there need to be two big initiatives in [16:21] the United States. Number one, we need [16:23] to go much harder on open source. The [16:26] frontier labs in the US are not [16:29] open-source friendly for the most part, maybe [16:31] with the exception of Google, but Google [16:33] is building small open-source models, [16:36] not at the same level and not with the same [16:38] capability as a DeepSeek V4. And then we [16:41] also need to work on efficiency. Even if [16:45] we are to keep models closed-source and [16:47] served by OpenAI and [16:49] Anthropic, they need to get much cheaper, [16:52] much more quickly, because US enterprise [16:54] companies need to look at these [16:56] different models and it needs to make [16:57] sense cost-wise. That's going to enable [17:00] the entire world to build on top of US [17:04] artificial intelligence. So, if DeepSeek [17:06] is doing everything right, Anthropic [17:09] might be doing everything wrong, at [17:10] least lately. I made a video about it, [17:13] so check out the video on the screen [17:15] right now. People say I went a little [17:17] bit too hard on them, but I've been [17:19] really frustrated.