---
title: 'Google’s AI endgame is here… everything you missed at I/O 2026'
source: 'https://youtube.com/watch?v=9OQ5vaYbGV0'
video_id: '9OQ5vaYbGV0'
date: 2026-06-28
duration_sec: 0
---

# Google’s AI endgame is here… everything you missed at I/O 2026

> Source: [Google’s AI endgame is here… everything you missed at I/O 2026](https://youtube.com/watch?v=9OQ5vaYbGV0)

## Summary

The video covers the major announcements from Google I/O 2026, focusing on the company's shift to an 'agentic Gemini era' where AI is integrated into all products. It highlights new models like Gemini Omni and Flash 3.5, hardware updates like TPU-T and TPU-I, and a new web API for developers.

### Key Points

- **Google's AI Vision** [0:00] — Google I/O 2026 unveiled an ambitious vision where Gemini AI is embedded into every product, marking the 'agentic Gemini era' where search, Gmail, Android, and glasses become AI agents.
- **Scaling Achievements** [1:05] — Google scaled from 9.7 trillion tokens per month to 3.2 quadrillion tokens per month in two years, supported by increased capital expenditures and new TPU chips.
- **TPU Chip Split** [1:35] — Google announced splitting TPU chips into TPU-T for training and TPU-I for inference, optimizing each for specific tasks.
- **Gemini Omni Model** [2:02] — Gemini Omni is a multimodal model that takes text, video, and sound as input and produces any output, simulating reality on demand.
- **Neural Expressive Design** [2:24] — A new design system for the Gemini app called Neural Expressive optimizes UI for generating elements like diagrams and mini apps on demand.
- **Gemini Flash 3.5** [2:43] — According to vendor-reported benchmarks, Gemini Flash 3.5 performs nearly on par with Opus 4.7 and GPT-5.5 at a much higher speed; it is the fast model, not Google's top-tier model.
- **Antigravity IDE** [3:18] — Google's Antigravity IDE (formerly Windsurf) now focuses on managing AI agents rather than writing code; the live demo built an OS from scratch and then generated the missing drivers needed to run Doom.
- **Price Increase** [4:02] — Gemini 3.5 Flash is three times more expensive than the previous version and 30 times more than Gemini 1.5 Flash, though still cheaper than Claude.
- **HTML on Canvas API** [4:15] — Chrome introduced the HTML on Canvas API, allowing HTML elements to be used directly in canvas for highly interactive UIs with WebGL and WebGPU.
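
The HTML on Canvas point can be illustrated with a small feature-detection sketch. The video names no methods, so `drawElementImage` below is an assumption borrowed from the experimental HTML-in-Canvas proposal, and the shipped name or signature may differ. `ElementDrawingContext` and `drawHtmlOrFallback` are hypothetical names introduced only for this example.

```typescript
// Minimal structural type so the sketch compiles without the DOM lib.
// ASSUMPTION: `drawElementImage` follows the experimental HTML-in-Canvas
// proposal; the video does not name the method, and it may differ in Chrome.
interface ElementDrawingContext {
  drawElementImage?: (element: unknown, x: number, y: number) => unknown;
  fillText: (text: string, x: number, y: number) => void;
}

/**
 * Draws `element` into the canvas when the (assumed) method exists,
 * otherwise falls back to plain canvas text.
 * Returns true when the HTML path was used.
 */
function drawHtmlOrFallback(
  ctx: ElementDrawingContext,
  element: unknown,
  fallbackText: string,
  x: number,
  y: number,
): boolean {
  if (typeof ctx.drawElementImage === "function") {
    ctx.drawElementImage(element, x, y);
    return true;
  }
  // Older browsers: render a text-only stand-in for the element.
  ctx.fillText(fallbackText, x, y);
  return false;
}
```

In a browser, `ctx` would be the canvas's 2D context and `element` a child of the canvas. Under the proposal, the canvas also needs to opt its children into layout, which is likewise an assumption here. Detecting the method first keeps the page usable in browsers that have not shipped the API.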

### Conclusion

Google I/O 2026 showcased a comprehensive AI-first strategy, with new models, hardware, and developer tools, though price increases and the absence of Gemini 3.5 Pro disappointed some. The event underscored Google's ambition to become the interface to reality itself.
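
The scaling and pricing figures quoted in the video can be sanity-checked with a little arithmetic. This is a rough sketch that takes the quoted numbers at face value, assumes steady compound growth over the two years, and reads "three times more" as a 3x multiple. All variable names are illustrative.

```typescript
// Figures as quoted in the video (not independently verified).
const tokensPerMonthBefore = 9.7e12; // 9.7 trillion tokens per month
const tokensPerMonthNow = 3.2e15;    // 3.2 quadrillion tokens per month
const years = 2;

// Overall growth factor over the period: roughly 330x.
const growthFactor = tokensPerMonthNow / tokensPerMonthBefore;

// Equivalent per-year factor, ASSUMING steady compound growth: roughly 18x.
const annualFactor = Math.pow(growthFactor, 1 / years);

// Price multiples quoted for Gemini 3.5 Flash.
const priceVsPreviousFlash = 3; // ASSUMPTION: "three times more" means 3x
const priceVs15Flash = 30;      // relative to Gemini 1.5 Flash

// Implied multiple of the previous Flash version over Gemini 1.5 Flash.
const previousVs15Flash = priceVs15Flash / priceVsPreviousFlash;

console.log(`growth: ~${growthFactor.toFixed(0)}x over ${years} years`);
console.log(`annualized: ~${annualFactor.toFixed(1)}x per year`);
console.log(`previous Flash vs 1.5 Flash: ${previousVs15Flash}x`);
```

Under these assumptions, token volume grew much faster than the per-token price of the Flash tier, although the video gives no absolute prices to compare.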

## Transcript

Yesterday, Google I/O wrapped, and I was
able to watch in person as Sundar and
Demis laid out an ambitious vision for
the future of software. And apparently,
that future is Gemini hiding inside of
every product like the microplastics in
your bloodstream. But the road map is
basically take Gemini, append a noun to
it, and ship it. Gemini Spark, Gemini
Omni, Gemini Flow, and the list goes on.
But they're calling it the agentic
Gemini era. Search is now an AI
agent, Gmail is an AI agent, Android is
an AI agent, your glasses are an AI
agent. And as I watched the keynote, I
realized something: Google is no
longer trying to organize the world's
information with blue hyperlinks,
because search engines are now an
archaic technology. Instead, Google is
trying to become the interface to
reality itself before Anthropic and
OpenAI create better realities. But
luckily, Google I/O wasn't all about AI.
I didn't see any updates about Angular,
but I did come across a new awesome web
API that every web developer should know
about. In today's video, we'll break
down everything you missed at Google
I/O. It is May 22nd, 2026, and you're
watching The Code Report. Whether you
love it or hate it, one thing is
undeniably impressive about Google, and
that's its ability to scale. Not only is
it serving its core products to billions
of daily active users, but in the last 2
years, they've gone from serving 9.7
trillion tokens per month to a
staggering 3.2 quadrillion tokens per
month. And that number is going to
continue accelerating. In addition,
Alphabet's capital expenditures have
exploded, building new infrastructure to
support all these stupid AI images you
guys create with Nano Banana. You ever
see a pug dressed like an accountant?
No.
You want to? Uh
One thing that makes this massive scale
possible is Google's TPU chip, or Tensor
Processing Unit. I remember being amazed
seeing a TPU at my first Google I/O back
in 2018. But this week, they announced
they're splitting these chips into two
distinct jobs, training and
inference, with the TPU-T and TPU-I. In
other words, Google now has one chip
that's optimized to teach a robot how to
think, and another chip that's optimized
for it to hallucinate search results on
a global scale. The headline
announcement at Google I/O though was
Gemini Omni, a model that takes any
input like text, video, and sound and
produces any output. Demis Hassabis, who
might be the smartest guy at Google,
appears to be fully world model pilled
because models like this don't just
generate pixels anymore. They understand
language, physics, motion, and
everything else in your world just well
enough to simulate reality on demand.
But along with this new model comes an
entirely new design system for the
Gemini app called Neural Expressive. At
first glance, the UI looks like a simple
glow up with new icons and better
gradients. But what's unique about it is
that it's optimized for generating UI
elements on demand, like diagrams,
timelines, and even mini apps that
didn't exist before your prompt. Now,
when it comes to Google's core large
language models, they released Gemini
Flash 3.5, which is not the big brain
model, but the fast model. According to
the trust me bro benchmarks, it performs
nearly on par with Opus 4.7 and GPT-5.5,
but runs at a much faster speed. Like if
we look at this trust me bro diagram, we
see that Flash is entirely in a quadrant
of its own in terms of speed and
intelligence. However, it's important to
remember that this is not their top-tier
model. The Gemini 3.5 Pro is still under
wraps and not expected to release until
later this summer, which was very
disappointing to a lot of people on the
internet. Speaking of disappointment
though, not everybody was happy with the
new direction of Google's Antigravity
IDE. Antigravity was formerly known as
Windsurf and was a VS Code fork for AI coding
just like Cursor. And once again,
following in the footsteps of Cursor,
its latest version looks like an OpenAI
Codex clone that's more focused on
managing agents than writing code. Old
school programmers might not be happy
with this change, but the live demo was
pretty badass. They used the tool to
build a complete operating system from
scratch, which took like 12 hours and
billions of tokens. But then, they tried
to play Doom on it and it failed due to
missing drivers. However, live on stage,
they had Gemini code up those drivers
and within a few seconds, Doom was up
and running. The most impressive part
was just the sheer speed at which this
thing could spit out tokens. But the
speed is not the only thing increasing.
The price of Gemini 3.5 Flash is
three times more than the previous
version and 30 times more than Gemini
1.5 Flash. It's still a lot cheaper than
Claude, but not nearly as cheap as it
used to be. Almost everything at I/O
involved AI in one way or another. But
if you're a web developer, one cool
thing you should know about in Chrome is
the HTML on Canvas API, which as the
name implies, allows you to use HTML
elements directly in a canvas now.
>> Awesome. Native HTML elements rendered
into the canvas.
Woo!
That means you can build highly
interactive UIs where you control every
pixel with tools like WebGL and WebGPU,
while simultaneously using HTML for your
more basic UI elements. The only
question is which AI coding model should
you use to work with this API? Well,
that's why you need to know about
Emergent, the sponsor of today's video.
Everyone's switching between five
different coding models these days, but
we still need something to help us ship
full stack applications that actually
work. And that's exactly where Emergent
can help. Right now, I'm using it to
build a pull request review dashboard
where I can paste in a GitHub link and
get an AI summary of all the changes and
risks per repo. You still start with a
prompt, but instead of one LLM guessing
how to build everything, Emergent spins
up specialized agents to work on your
app's front end, back end, database,
testing, and deployment all in parallel.
You also don't need to mess with any
Supabase wiring or Express boilerplate,
because that one prompt sets up your
app's database, auth, and APIs. If
you're really into self-torture, feel
free to keep scaffolding this stuff by
hand, or you could just describe the
tool you want and let Emergent's agent
swarm build it all for you. You can try it
out for free at the link below. This has
been the Code Report. Thanks for
watching and I will see you in the next
one.