I built a 2500W LLM monster... it DESTROYS EVERYTHING

Transcribed Jun 28, 2026

Intermediate | 10 min read | For: PC enthusiasts and AI developers interested in building or understanding high-end workstation hardware for large language models.

322.0K Views | 7.5K Likes | 643 Comments | 349 Dislikes | 2.5% 📈 Moderate

AI Summary

The video follows Alex as he builds an ultra-high-end AI workstation with a Threadripper CPU and dual RTX Pro 6000 GPUs. He visits MicroCenter to select a case, power supplies, fans, and a UPS, then assembles the system and tests it with a massive 235-billion-parameter LLM. The build is designed to handle demanding AI workloads that no single consumer GPU can manage.

Chapters

1 Introduction and problem 0:00
2 Shopping at MicroCenter - parts 0:19
3 Power supply decisions 3:34
4 UPS and fans 6:57
5 Build and test 10:10

[0:00]

Need for upgrade

Alex's existing AI rig can't fit a second RTX Pro 6000, so he plans to expand with a new build.

[0:19]

Visit to MicroCenter

Alex meets Dan at MicroCenter, bringing his 2500W power supply and explaining the specs: Threadripper TRX board, 9970X CPU, two RTX Pro 6000 GPUs.

[1:18]

Dual PSU solution

A single 2500W unit needs a special high-voltage plug, because a standard US 120V/15A outlet tops out at 1800W. They therefore lean toward a dual-PSU setup, with the CPU and one GPU on one PSU and the second GPU on the other.
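The outlet limit above is plain P = V × I arithmetic. A minimal sketch, assuming a standard US 120V/15A circuit (the summary gives only the 1800W figure) and using hypothetical helper names:

```python
# Sketch: peak power a wall circuit can deliver, and whether a PSU rating fits on it.
# Assumption: a standard US branch circuit is 120 V at 15 A; the source only states 1800W.

def circuit_watts(volts: float, amps: float) -> float:
    """Peak power available from a circuit: P = V * I."""
    return volts * amps


def fits_on_circuit(load_watts: float, volts: float = 120.0, amps: float = 15.0) -> bool:
    """True if the rated load does not exceed the circuit's peak power."""
    return load_watts <= circuit_watts(volts, amps)


US_OUTLET_W = circuit_watts(120, 15)

single_psu_ok = fits_on_circuit(2500)    # the single 2500W unit
smaller_psu_ok = fits_on_circuit(1300)   # one of the smaller units discussed later

print(US_OUTLET_W, single_psu_ok, smaller_psu_ok)
```

This only checks each supply's rating against one circuit. The summary does not say whether the two PSUs end up on the same circuit or on separate ones.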

[2:40]

Choosing a case

Dan notes that dual PSUs require a full tower case; they explore options like a limited-edition Doom case and one with an OLED screen.

[3:34]

Power supply recommendations

Dan suggests a Taichi 1650W PSU for one GPU and an ROG Loki SFX-L 1000W for the second, and walks through efficiency ratings from Bronze to Titanium (about 95% efficiency for Titanium).
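To see what an efficiency rating means at the wall, a rough sketch: wall draw is the DC load divided by efficiency, and the difference is lost as heat. The 95% figure is the one quoted above for Titanium. The 85% figure for a lower tier and the 1000W load are illustrative assumptions, not numbers from the video.

```python
# Sketch: wall draw and waste heat for a given DC load and PSU efficiency.
# Assumptions: efficiency is treated as constant across load (real PSUs vary with load),
# and 0.85 is an illustrative lower-tier value, not taken from the source.

def wall_draw_watts(dc_load_watts: float, efficiency: float) -> float:
    """Power pulled from the outlet to deliver dc_load_watts to the components."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return dc_load_watts / efficiency


def waste_heat_watts(dc_load_watts: float, efficiency: float) -> float:
    """Power lost inside the PSU as heat."""
    return wall_draw_watts(dc_load_watts, efficiency) - dc_load_watts


load = 1000.0  # hypothetical DC load in watts
for label, eff in [("Titanium (~95%)", 0.95), ("lower tier (assumed 85%)", 0.85)]:
    print(f"{label}: wall {wall_draw_watts(load, eff):.0f} W, "
          f"heat {waste_heat_watts(load, eff):.0f} W")
```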

[5:23]

Cost comparison

The dual-PSU option costs about $680 (1300W + 1200W) versus $830 for two 1650W units; they choose the cheaper pair.
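The comparison above can be worked through as simple arithmetic. This sketch uses only the prices and wattages quoted in the summary. The cost-per-watt metric is an added illustration, not something the video is reported to compute.

```python
# Sketch: compare the two PSU options by total price, total capacity, and dollars per watt.
# Prices and wattages come from the summary; the $/W metric is an illustrative addition.

options = {
    "1300W + 1200W": {"price": 680, "watts": 1300 + 1200},
    "2 x 1650W": {"price": 830, "watts": 2 * 1650},
}

for name, o in options.items():
    o["usd_per_watt"] = o["price"] / o["watts"]
    print(f"{name}: ${o['price']} for {o['watts']} W "
          f"(${o['usd_per_watt']:.3f}/W)")

savings = options["2 x 1650W"]["price"] - options["1300W + 1200W"]["price"]
print(f"Savings from the cheaper pair: ${savings}")
```

The cheaper pair totals 2500W, matching the build's headline figure, while the pricier pair buys extra headroom at a slightly lower cost per watt.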

[6:34]

Synchronization adapter

An $11 adapter links the two PSUs through the motherboard power cables so that they switch on together.

[6:57]

UPS requirement

A massive 1500W+ UPS (about $800) is needed to protect against power blips during LLM workflows.

[8:15]

Fans selection

Noctua fans are chosen for quiet, efficient airflow; the case allows assembling fan arrays outside and sliding them in.

[10:58]

Build completion and test

Alex assembles the system with two RTX Pro 6000s and runs the Qwen 3 235B model (142GB on disk) using Ollama, achieving 68 tokens per second with both GPUs active.
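A back-of-the-envelope check shows why the second GPU matters for this model. The sketch assumes 96GB of VRAM per RTX Pro 6000 (a spec not stated in the summary) and treats the 142GB on-disk size as a lower bound on memory needed, ignoring KV cache and runtime overhead. The function names are hypothetical.

```python
# Sketch: does a model of a given size fit in the combined VRAM of N GPUs,
# and how long does a response take at a given generation speed?
# Assumptions: 96 GB VRAM per card (not stated in the source); on-disk size is used
# as a rough lower bound on memory footprint, with no allowance for KV cache.

VRAM_PER_GPU_GB = 96   # assumed per-card VRAM
MODEL_SIZE_GB = 142    # on-disk size from the summary
TOKENS_PER_SECOND = 68 # throughput from the summary


def fits_in_vram(model_gb: float, gpu_count: int, vram_gb: float = VRAM_PER_GPU_GB) -> bool:
    """True if the model's weights fit in the pooled VRAM of gpu_count cards."""
    return model_gb <= gpu_count * vram_gb


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time to emit the given number of tokens at a steady rate."""
    return tokens / tokens_per_second


print("one GPU:", fits_in_vram(MODEL_SIZE_GB, 1))
print("two GPUs:", fits_in_vram(MODEL_SIZE_GB, 2))
print(f"1000 tokens: {seconds_to_generate(1000, TOKENS_PER_SECOND):.1f} s")
```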

Alex successfully builds a powerhouse AI workstation capable of running the largest open-source LLMs, proving that multiple RTX Pro 6000s are necessary for models beyond consumer GPU memory limits. The build demonstrates practical decisions around power, cooling, and component selection for extreme AI hardware.

Clickbait Check

65% Legit

"The video delivers on the '2500W LLM monster' but 'DESTROYS EVERYTHING' is hyperbolic since it only runs models at 68 tok/s, not literally destroying benchmarks."

Mentioned in this Video

Ollama (Olama)

tool

Qwen 3 235B model

tool

Dan (MicroCenter employee)

person

MicroCenter

service

Study Flashcards (7)

What model was tested on the new build?

easy

Qwen 3 235B (235 billion parameters).