---
title: 'What a Surprise'
source: 'https://youtube.com/watch?v=EQ8nrP6PKV4'
video_id: 'EQ8nrP6PKV4'
date: 2026-06-22
duration_sec: 0
---

# What a Surprise

> Source: [What a Surprise](https://youtube.com/watch?v=EQ8nrP6PKV4)

## Summary

The video discusses how AI companies are non-consensually using copyrighted content, such as YouTube videos and music, to train their AI models, labeling it as theft. The creator highlights a report from The Atlantic that provides an open tool to search what data these models are trained on. The creator personally found 67 of their own videos in nine datasets used by AI corporations.

### Key Points

- **AI's non-consensual impact** [0:02] — AI is already deeply affecting people's lives, often without their knowledge or consent.
- **The Atlantic's AI watchdog report** [0:20] — The Atlantic's report exposes what AI models are being trained on. The creator calls taking this content without permission 'stealing', joking that it is 'the scientific term'.
- **Open tool to search AI training data** [0:51] — The Atlantic provides an open tool where users can directly search what music and videos AI is being trained on.
- **Artists call out AI companies** [1:04] — Huge artists are calling out AI companies like Suno for ripping their music, including unreleased material, to train models.
- **Creator's own videos stolen** [1:56] — The creator searched their name in the dataset and found 67 of their videos in nine datasets used by AI corporations.
- **Example of stolen content** [2:29] — Examples include a video about a horrible chair, Doom mod commentary, washing machine repair championship, and a foot fetish wedding dildo circus.
- **YT Temporal 180M dataset** [3:16] — YT Temporal 180M is a collection of 5.4 million YouTube videos compiled by researchers at the University of Washington and the Allen Institute for AI to train a multimodal model called MERLOT, released in 2021.
- **Theft is theft regardless of wealth** [4:50] — The creator argues that stealing content is theft regardless of the entity's wealth, but billion-dollar corporations get to skirt around it.
- **Nintendo's trailers stolen** [6:18] — Nintendo, a highly litigious company, had 4,926 trailers used to train AI models without permission, including by Runway Gen 3.
- **Runway Gen 3's use of Nintendo content** [7:57] — Runway Gen 3 took 4,772 Nintendo trailers, likely without permission, to train their AI model.
- **Runway AI's internal documents** [8:09] — An internal Runway document obtained by 404 Media lists 3,970 channels identified as sources of high-quality video for training, including Nintendo video game trailers.
- **Open tool includes all videos before May 17, 2024** [9:00] — The search tool includes all YouTube videos from named channels published before May 17, 2024, one month before Runway introduced their model.
- **Likely usage of compiled videos** [9:40] — The creator wagers that since videos are in the dataset, they were likely used to train the models, though not 100% confirmed.

### Conclusion

The video underscores the massive scale of unapproved data scraping by AI companies, framing it as theft that an ordinary person would be charged for while billion-dollar corporations evade accountability. It points to The Atlantic's dataset search tool as an easy way for anyone to see what has been taken.

## Transcript

That's the sound of AI coming in and
non-consensually motorboating us all.
Whether you want to admit it or not,
whether you even recognize it or not,
your life is already being affected
deeply by AI. And I know I've yapped
about it a lot, but I did just see The
Atlantic's most recent report where they
actually pulled down the pants on a lot
of what AI models are being trained on
here, aka what they're stealing to train
their AI models on because they don't
have permission for it. The word, the
nomenclature is stealing. That's the
scientific term. But because these
multi-billion corporations are in the
driver's seat now at the helm, they
can go ahead and rebrand that whole
stealing thing into something entirely
different. They're trying to argue that
it's all by the book in fact. So, uh,
The Atlantic now has this AI watchdog,
which is just an open tool. Anyone can go
in and they can just directly search
what music AI is being trained on as
well as like videos AI is being trained
on. Now, a ton of huge artists have been
made aware of this and have called out
AI companies for just directly ripping
their music. Some of it unreleased, by
the way, in order to train their AI
models on. Notably, Suno has been
getting called out with spitballs fired
at it and rotten tomatoes thrown at it
because they're pretty shameless and
unapologetic about it. Anyone remember
back in the day the whole you wouldn't
download a car, one of the most iconic
[ __ ] anti-piracy advertisements
ever? Arguably just one of the most
well-known uh campaigns ever as well.
Well, now the same people that used to
run that [ __ ] are the ones that are
literally just downloading everything,
stealing everything they can. It It's
It's pretty incredible the uh 180
they've done there on that whole
messaging. And I know what you're
thinking. Yes, they have stolen my
videos to train their AI models on. I
searched my name in their data set. 67
of my videos have made it into nine data
sets used by AI corporations to train
their models.
Heaven help us. Lord have mercy on the
absent souls of these AI models that are
being [ __ ] Clockwork Oranged,
having the eyelids pulled wide open to
be trained off my videos here. Those
have to be the dumbest AI models you can
find. Just look at this poor bastard
here from YT Temporal 180M who's got 221
of my videos shoved down its throat
being trained off of things like
horrible chair where I'm just making fun
of a dog [ __ ] chair. Some old gameplay
commentary like Doom four feathers where
I'm playing a Doom mod where I'm a
chicken shooting at warthogs from Halo.
It's [ __ ] uh washing machine repair
championship where I'm just commentating
a washing machine repair competition
between some of the highest quality
athletes you can find in the washing
machine circuit. Not to besmirch the
good name there. A foot fetish wedding
dildo circus where you know I did a lot
of cool trick shots with dildos. Like
this this [ __ ] AI model. It must be
just sitting there drooling. Actually,
let me learn a little bit about my son
here, seeing as I taught him everything
he knows. YT Temporal 180M. It's a
collection of 5.4 million YouTube videos
compiled by a team of researchers at the
University of Washington and the Allen
Institute for AI to train a multi a
multimodal model called MERLOT.
It was released in 2021.
Bro, is MERLOT an idiot? Be honest with
me. Is this the dumbest AI you can find?
If it's being trained even partly on
some of my videos, there's a chance you
ask it who wrote the Declaration of
Independence and it says dildo titty
fart or something. Very fascinating.
Very interesting. So, YT Temporal got a
huge dollop of some of my incredible
work like hentai survive. What the hell
is this? I don't even remember this. As
the 21st century continues to evolve,
human sexual fetishes are evolving right
there alongside it. Okay, you know what?
That one might actually be somewhat
educational. That that actually might
help them out a little bit because it
it's not wrong. Now, obviously, my
videos weren't handpicked for these AI
models or anything. They made it into
those giant compilations a lot of these
groups put together solely to train
models off of to give like a huge sample
size. I I understand that, but it
doesn't make it any less garbage. It's
so [ __ ] ridiculous because it is just
stealing it. Same thing when they do it
with music. It is just stealing all of
that to train their models off of. And
it's been a huge contentious topic for a
while now when it comes to AI. And no
amount of like trying to put makeup on
it changes the truth that they are just
stealing to train their models. Now, if
you as just a normal person try and
follow their footsteps and do the same
thing of just taking a ton of artists
music and videos in order to make your
own product off of those works, you're
going to get arrested and charged with a
little something called theft. But that
word doesn't exist once you reach a
certain level of wealth. These
billion-dollar corporations get to kind
of skirt around that a little bit.
They're able to tiptoe around that, and
we don't need to worry about a little
pesky thing called theft for them. It's
very different rules they play by there.
And I know there's not a soul on this
planet surprised by this, but it's still
something I think worth yapping about,
especially now that it's so easily
accessible to see how many things just
get up by these AI companies to train
their models off of. All of these huge
compilations of work and IP that they
steal to train their models on in order
to sell it to the people that are now
hooked on AI as everyone is in this
giant [ __ ] gold rush, this whirlwind
of the in the industry. It's just so
bizarre how things have just accelerated
to this point. Now, one thing I got very
curious about is trying to think of like
the most litigious company I could think
of to see if their work had also just
been stolen by a ton of these AI models.
So, obviously, the first thing that pops
in my noodle is Nintendo. And yeah,
Nintendo is not exempt from this. 4,926.
That is a pretty big chunk of their
trailers being used to train AI models
that I'm sure Nintendo didn't sign off
on because Nintendo wouldn't just be
giving away this for free. 1,000%.
You would need to pay oodles of clams to
have access to their material in order
to train your own stuff off of to sell
that product. They are extremely strict
when it comes to their copyright. They
rule that [ __ ] with an iron fist. They
are Judge Dredd when it comes to their
copyright. And yet here we have Runway
Gen 3 that just shamelessly takes 4,772
of their trailers. Again, Nintendo is
extremely strict with their trailers.
You can use their trailers in like
YouTube videos if you follow a very very
particular set of guidelines around it
that is extremely transformative.
Like there are intense rules.
Nintendo's so brutal with this. I
remember there was a couple YouTube
channels that got taken down because
they used Nintendo music. Like they they
took Nintendo music from games they
owned, put it in their videos, and they
lost their whole channels for it. There
was also that time where some streamers
got banned because they watched Nintendo
trailers during the Direct on stream.
Like the point is they take that very
seriously. And now Runway Gen 3 just
comes in here from the top turnbuckle.
Takes 4,772
of their trailers, most likely
completely for free without Nintendo's
permission, didn't pay a dime for it, to
train their model on. So, uh, let's see
what this is. Runway AI collected
YouTube videos to train a
video-generating AI model released as Gen
3 in 2024. An internal company document
that was obtained by 404 Media lists
3,970.
What? Wait, that can't be right. Oh, oh,
channels. I thought I was talking about
videos in general. I was like, "No, they
have more than that from just Nintendo."
That Runway identified as sources of
high-quality video for training.
Nintendo video game trailers. Okay. The
spreadsheet contains comments describing
what is desirable about some of the
channels. For example, beautiful
cinematic landscapes, high quality
scenes from movies, only four videos,
but they are really well done. Super
high quality sci-fi short films, and the
holy grail of car cinematics so far.
That must be talking about like Mario
Kart or something. It's not clear which
if any videos Runway actually used for
training its AI system. Our search our
search tool includes all YouTube videos
from the named channels that were
published before May 17th, 2024, which
is 1 month before Runway introduced
their model. So, this is something that
The Atlantic also made note of. It's not
100% confirmed that they used all of
these videos, but the fact that they
were compiled by these companies. I
think you can make a pretty educated
guess that it was likely used to train
their models. And also another thing
they mention is that there are likely a
lot of other ones that even though
they're not here, doesn't mean they
weren't used to train their model. It's
a very tricky and messy, sloppy thing to
nail down exactly what and what isn't
being used to train models. But I would
wager a guess that since it's here in
the data set, they probably used it.
Much like all of these also most likely
used it, like this company, which also
used a lot of my YouTube videos, which
that actually kind of gave me a giggle
when I saw that Nintendo of America was
part of the same data set as this one
where my videos are in, such as
Seaweed's [ __ ] cool or whatever that
one was. I already forgot. So Nintendo
and I, we're basically in the in the
same ballpark now when it comes to
quality. That's pretty cool. I bet
Nintendo's thrilled about that. Uh but
anyway, point is thanks to the Atlantic
data sets here that you can freely
explore. You can see so so so many
things have just been taken by these
companies and put in these data sets
that are presumably being used to train
their AI models on. It's just it's
pretty egregious. I I wanted to yap
about it a little bit. That's it. See
you.
