I Tested DeepSeek vs ChatGPT: The Results Shocked Me

0h 06m video · Published Jan 15, 2026 · Transcribed Jun 15, 2026 · AI Crunch

Beginner 3 min read For: General audience interested in AI comparisons and open-source models.

AI Trust Score 85/100

✅ Highly Legit

"Title accurately reflects the comparison test, though 'shocked' is slightly exaggerated."

AI Summary

A tiny Chinese startup, DeepSeek, has built a model matching GPT-4's performance at one-tenth the training cost and offers it for free. This video tests DeepSeek V3 against GPT-4o to see if free AI can be better.

Chapters

1 Introduction to DeepSeek 00:00
2 Coding Test: Snake Game 02:11
3 Reasoning Test and Conclusion 04:06

[00:22]

DeepSeek's Mixture of Experts Architecture

DeepSeek uses a Mixture of Experts (MoE) architecture with 64 specialized experts, activating only 2 per query, saving computational power.
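The routing idea described above can be illustrated with a minimal Python sketch. This is not DeepSeek's implementation: the expert count (64) and the number of active experts (2) are the figures quoted in the video, the gate scores are random stand-ins for a learned router, and the names `route`, `moe_forward`, and the toy `experts` list are hypothetical.

```python
import math
import random

NUM_EXPERTS = 64  # figure quoted in the video
TOP_K = 2         # experts activated per query, per the video


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_scores, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, gate_scores, experts, k=TOP_K):
    """Weighted sum of the outputs of the selected experts only.

    The remaining experts are never called, which is where the
    compute saving comes from.
    """
    return sum(w * experts[i](x) for i, w in route(gate_scores, k))


# Toy experts (assumption): expert i simply multiplies its input by i + 1.
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]

# Random stand-in for the scores a trained router would produce.
rng = random.Random(0)
gate_scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route(gate_scores)
output = moe_forward(1.0, gate_scores, experts)
```

Only the two selected experts run for a given input. In a real model the router and the experts are neural network layers trained together, and routing usually happens per token rather than per question.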

[01:42]

Cost Efficiency Threatens Big Tech

DeepSeek's efficiency undercuts OpenAI's $20/month subscription model, potentially making paid AI obsolete.

[02:45]

Coding Test: Snake Game

DeepSeek Coder V2 generated a working Python snake game about 20% faster than GPT-4o and ran with zero errors on the first try, while GPT-4o's code omitted a variable definition and needed a second prompt to fix.
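The twist in this test (the snake speeds up each time it eats an apple) comes down to a small piece of state management. The sketch below is only an illustration of that logic, not the code either model produced, which the source does not reproduce. `GameState`, `SPEED_STEP`, `MAX_SPEED`, and the starting values are hypothetical choices.

```python
from dataclasses import dataclass

SPEED_STEP = 1.0   # hypothetical increase in ticks per second per apple
MAX_SPEED = 30.0   # hypothetical cap so the game stays playable


@dataclass
class GameState:
    speed: float = 10.0  # ticks per second; hypothetical starting value
    score: int = 0


def on_apple_eaten(state: GameState) -> GameState:
    """Update score and speed when the snake eats an apple."""
    state.score += 1
    state.speed = min(state.speed + SPEED_STEP, MAX_SPEED)
    return state


state = GameState()
on_apple_eaten(state)
```

In a Pygame main loop, a value like `state.speed` would typically be passed to `pygame.time.Clock.tick` on each frame so that the loop runs faster as the score grows.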

[04:20]

Reasoning Test: Drying Shirts

Asked how long 10 shirts take to dry if 5 shirts take 4 hours in the sun, DeepSeek correctly answered 4 hours, recognizing that the shirts dry simultaneously.
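The arithmetic behind the trap can be made explicit with a short sketch. The function name `drying_time` and the `capacity` parameter are hypothetical; they model the transcript's caveat that the answer holds "assuming you have enough space".

```python
import math


def drying_time(shirts, hours_per_batch=4.0, capacity=None):
    """Hours needed to dry `shirts` shirts.

    capacity is the number of shirts that fit in the sun at once;
    None means unlimited space, so everything dries in one batch.
    """
    if shirts <= 0:
        return 0.0
    if capacity is None:
        return hours_per_batch
    batches = math.ceil(shirts / capacity)
    return batches * hours_per_batch


parallel = drying_time(10)             # enough space: one batch
limited = drying_time(10, capacity=5)  # room for only 5 at a time: two batches
```

With enough space the time stays at 4 hours. The 8-hour answer is only right if no more than 5 shirts fit at once.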

[05:22]

DeepSeek Matches or Beats GPT-4o in Reasoning

In 5 out of 7 logic tests, DeepSeek matched or beat GPT-4o, demonstrating sophisticated reasoning.

[05:51]

Competition Benefits All Users

DeepSeek's free model forces OpenAI and Google to lower prices or improve, benefiting consumers.

DeepSeek proves that high-quality AI can be free and open-source, challenging the paid AI model and benefiting all users through competition.

Mentioned in this Video

DeepSeek V3

tool

GPT-4o

tool

DeepSeek Coder V2

tool

Study Flashcards (5)

What architecture does DeepSeek use?

easy

Mixture of Experts (MoE) with 64 specialized experts.

00:49

How many experts are activated per query in DeepSeek?

easy

Two experts are activated per query.

01:18

In the coding test, which model generated a working snake game faster?

medium

DeepSeek generated the code 20% faster than GPT-4o.

03:04

What was the result of the reasoning test about drying shirts?

medium

DeepSeek answered 4 hours, understanding that drying is simultaneous.

04:48

How many logic tests did DeepSeek match or beat GPT-4o?

hard

5 out of 7 logic tests.

05:34

💡 Key Takeaways

🔧

Mixture of Experts Explained

A clear analogy compares a monolithic model to a team of specialists, making a complex concept accessible.

00:49

💡

Threat to Paid AI

Highlights the economic disruption potential of efficient open-source models.

01:42

📊

DeepSeek Outperforms in Coding

In a single snake-game test, DeepSeek was faster than GPT-4o and its code ran without errors on the first try.

03:04

💡

Contextual Reasoning Success

Shows ability to understand real-world context, a key AI milestone.

04:48

⚖️

Competition Benefits All

Frames DeepSeek's success as a win for consumers, forcing price drops.

05:51

Full Transcript

[00:00] A tiny Chinese startup just humiliated

[00:02] OpenAI. They built a model that matches

[00:04] GPT-4's performance. But here is the

[00:06] crazy part. They did it for one-tenth of

[00:09] the training cost and they are giving it

[00:10] away for free. Is the era of paid AI

[00:13] over? Today we put DeepSeek V3 to the

[00:17] test to see if free finally means

[00:18] better. Before we dive into the tests,

[00:20] it's crucial to understand what

[00:22] makes DeepSeek so different. In a world

[00:25] flooded with AI models that are

[00:27] essentially variations on a theme,

[00:29] DeepSeek broke the mold. This

[00:31] isn't just another copycat trying to

[00:33] chase GPT-4's

[00:34] tail. They fundamentally rethought the

[00:37] architecture from the ground up, leading

[00:39] to a breakthrough in efficiency

[00:40] and power. The secret sauce? It's a

[00:44] concept that's been around for a while,

[00:45] but has only recently been perfected.

[00:47] They're using a sophisticated

[00:49] architecture called a mixture of

[00:50] experts, or MoE.

[00:53] To understand why this is a game changer,

[00:55] imagine a traditional model like GPT-4 as

[00:58] one giant monolithic brain. It's

[01:01] incredibly powerful, but it's also

[01:02] incredibly expensive. Every time

[01:04] you ask it a question, no matter how

[01:06] simple, the entire massive brain

[01:09] has to power up and process it.

[01:11] DeepSeek, on the other hand, is like a

[01:13] team of 64 highly specialized experts.

[01:16] When you ask a coding question,

[01:18] a smart routing network instantly

[01:20] identifies the two best experts for the

[01:22] job, say the Python Pro and the

[01:24] Algorithm Ace, and only wakes them up.

[01:27] The other 62 experts remain dormant,

[01:29] saving an immense amount of

[01:31] computational power. This makes it not

[01:33] just faster, but radically

[01:35] cheaper to train and run. We're talking

[01:37] about achieving top-tier performance

[01:39] while using only a fraction of the

[01:41] computational resources.

[01:43] This efficiency is what should be

[01:44] terrifying for big tech. Think about it.

[01:47] If DeepSeek can offer this

[01:49] level of intelligence for mere pennies

[01:51] on the dollar through an API,

[01:52] why would anyone continue to pay a

[01:54] premium? OpenAI's entire $20

[01:58] subscription model, which subsidizes

[01:59] the immense cost of their giant

[02:01] brain, is suddenly on very shaky ground.

[02:05] This isn't just a new competitor. It's a

[02:07] potential extinction-level event for the

[02:09] old way of doing AI. Talk is

[02:11] cheap. Let's put these models to the

[02:13] test with a real-world coding challenge.

[02:16] On one side, we have the

[02:17] reigning champion, GPT-4o. On the other,

[02:20] the challenger, DeepSeek Coder

[02:22] V2. This isn't just any model. It's an

[02:25] open-source mixture of experts

[02:27] model trained on a colossal two trillion

[02:29] tokens of code and natural language. It

[02:32] boasts top scores on benchmarks

[02:34] like HumanEval,

[02:37] claiming to rival proprietary models at

[02:39] a fraction of the cost. But

[02:41] benchmarks are one thing. Practical

[02:43] application is another. So I

[02:45] asked both to write a Python script for

[02:47] a classic snake game using the Pygame

[02:49] library. To make it interesting, I added

[02:52] a twist. The snake must speed up every

[02:54] time it eats an apple. This

[02:56] tests not just basic code generation,

[02:58] but also state management and

[03:00] logical implementation.

[03:02] Right away, DeepSeek's performance was

[03:04] impressive. It generated the

[03:06] complete functional code about 20%

[03:08] faster than GPT-4o.

[03:11] For developers, that speed translates

[03:13] directly to productivity,

[03:15] enabling faster iteration and problem

[03:17] solving. But speed is meaningless if the

[03:19] code is broken. So the real question is,

[03:22] does it actually work? And the

[03:24] answer is a resounding yes. DeepSeek's

[03:27] code ran perfectly on the very

[03:29] first try. Zero errors, zero debugging.

[03:33] GPT-4o, however, stumbled. It missed a

[03:36] crucial variable definition,

[03:37] throwing a syntax error that broke the

[03:39] program. I had to go back and prompt it

[03:42] a second time to get a working

[03:43] fix. This initial test highlights a key

[03:46] difference: reliability.

[03:48] While both models eventually produced the

[03:50] correct code, DeepSeek delivered

[03:52] a flawless solution faster and on the

[03:55] first attempt. For any developer

[03:57] on a deadline, that's a game changer. In this

[04:00] round,

[04:01] DeepSeek isn't just a contender. It's

[04:03] looking like the new heavyweight

[04:04] champion of coding. Now for the second

[04:06] test, reasoning and logic. This is where

[04:09] many models, especially earlier

[04:11] open-source ones, fall flat. They can

[04:13] perform complex calculations, but often

[04:16] miss the simple real-world context

[04:18] that humans grasp instantly.

[04:20] This is a crucial hurdle for AI to

[04:22] overcome if it's going to be genuinely

[04:24] useful. So, I set a classic logic trap

[04:27] to see if DeepSeek could think,

[04:29] not just calculate. I asked, "If I dry

[04:33] five shirts in the sun and it takes 4

[04:35] hours, how long does it take to

[04:36] dry 10 shirts?" The trap is obvious. A

[04:40] purely mathematical brain might double

[04:42] the time to 8 hours. It's a simple

[04:44] question, but it's a fantastic test for

[04:46] contextual understanding.

[04:48] DeepSeek answers 4 hours. It

[04:51] immediately understands that drying is

[04:53] a simultaneous event. The shirts

[04:55] all dry together, so adding more shirts

[04:57] doesn't extend the time, assuming you

[04:59] have enough space. This ability

[05:01] to handle nuance and implicit

[05:03] assumptions is incredibly impressive.

[05:05] It's a sign of sophisticated training on

[05:07] diverse, high-quality data. This isn't

[05:10] just a one-off trick. This

[05:12] reasoning power extends across the

[05:14] board. In my tests, it excelled at

[05:16] debugging code, planning multi-step

[05:18] projects, and even breaking down complex

[05:20] scientific concepts.

[05:22] It feels far less robotic than other

[05:24] open-source models. It's not just

[05:26] regurgitating

[05:27] data. It's connecting dots and

[05:29] demonstrating genuine problem-solving

[05:31] skills. In fact, in five out of

[05:34] the seven logic and reasoning tests I

[05:35] ran, it either matched or outright beat

[05:38] the current industry leader, GPT-4o.

[05:40] That is a monumental achievement for

[05:43] a model that's completely open-source.

[05:46] And remember the best part, you

[05:47] aren't paying a single cent for this

[05:49] level of intelligence. Why does this

[05:51] matter if you aren't a

[05:52] developer? Because of competition. For

[05:55] the last 2 years, we accepted that smart

[05:57] AI costs $20 a month. DeepSeek

[06:01] just proved that intelligence is

[06:02] becoming a commodity like electricity.

[06:04] It's getting cheaper every day. This

[06:07] forces OpenAI and Google to either

[06:09] lower their prices or release something

[06:11] significantly better. Either way, we

[06:14] win. One caveat. This is a Chinese

[06:17] model. If you are working on top secret

[06:19] government data, maybe stick to

[06:21] local models. But for learning, coding,

[06:24] and general tasks, it's a

[06:25] no-brainer. If you want to run AI

[06:28] completely privately on your own

[06:29] computer, check out this tutorial next.

[06:32] The revolution is open source.

Topics #ai #open source #deepseek #chatgpt #machine learning