---
title: 'AI Agents Explained: How to Create and Use AI Agents in 2026'
source: 'https://youtube.com/watch?v=4TvH-OZhwxI'
video_id: '4TvH-OZhwxI'
date: 2026-06-30
duration_sec: 1467
---

# AI Agents Explained: How to Create and Use AI Agents in 2026

> Source: [AI Agents Explained: How to Create and Use AI Agents in 2026](https://youtube.com/watch?v=4TvH-OZhwxI)

## Summary

This video explains the critical difference between a simple chatbot and a full AI agent, demonstrating how the same underlying model becomes a powerful autonomous worker when given tools, memory, goals, and a feedback loop. It covers four leading agent platforms—Claude Code, Codex, OpenClaw, and Anti-gravity—shows how to install and use each, and teaches a structured 'prompt contract' method to ensure agents execute tasks correctly and efficiently.

### Key Points

- **The $20 Mistake** [00:00] — Most people pay for an AI agent but use it like a simple chatbot. The agent can run 10 tasks at once and finish a week of work before lunch, but users only ask it questions. This is the most expensive mistake in AI.
- **Agent in Action: File Cleanup** [00:34] — A real-world demo shows an agent cleaning 80+ messy files (PDFs, receipts, invoices) into 8 categorized folders, renaming by date/vendor, and building an expense spreadsheet—all from one plain-English command.
- **Chatbot vs. Agent: The Brain Analogy** [01:32] — A chatbot is like a world-class chef with no kitchen, tools, or recipes. An agent is the same brain plus four components: the LLM (reasoning), tools (terminal/browser/filesystem), memory (a notebook), and goals (a destination). Tied together with an observe-think-act loop.
- **The Four Agent Platforms Overview** [03:01] — After a year of testing, four platforms stand out: Claude Code (best for interpretable reasoning), Codex (lowest friction for ChatGPT users), OpenClaw (life automation via messengers), and Anti-gravity (visual/front-end work).
- **Claude Code: Installation & Pricing** [03:40] — Requires a paid Anthropic plan ($17-20/mo). Download from 'Claude code desktop download'. Free Claude.ai does NOT include Claude code. Works for any computer task, not just coding.
- **Codex: OpenAI's Agent** [06:51] — For ChatGPT Plus subscribers (already included). Download from 'OpenAI Codex'. Runs on Mac, Windows, Linux. Has a unique cloud version at ChatGPT.com for long-running tasks. Lowest friction for existing ChatGPT users.
- **OpenClaw: Open Source & Messengers** [09:11] — Self-hosted, open-source agent that lives inside Telegram, WhatsApp, iMessage, etc. Single-line install: go to claw.bot, copy command. Best for life automation: email triage, reminders, document creation.
- **Anti-gravity: Google's Agent** [12:24] — Desktop app (fork of VS Code), free preview with Gemini 3 Pro. Best for visual/front-end work: UI mockups, design iteration, image/video tasks.
- **Quick Rule of Thumb for Platform Choice** [14:43] — Words and code: Claude Code. Low friction (ChatGPT user): Codex. Life automation: OpenClaw. Visual/front-end: Anti-gravity.
- **The Prompt Contract: Goal, Constraints, Format, Failure** [16:54] — Chatbot prompts are descriptions; agent prompts must be contracts. The four sections: Goal (clear outcome), Constraints (guardrails), Format (exact output shape), Failure (what to do when stuck). Example given for building a landing page.
- **Memory Files: The Permanent Fix** [22:04] — Every agent platform reads a memory file at session start (Claude.md, agents.md, etc.). Drop rules there and the agent never repeats mistakes. Make it self-modifying: ask the agent to append new rules when corrected.
- **Action Plan** [24:08] — Open one platform tonight, pick a real task, write a prompt contract, add a memory file with 3 rules, and run it end-to-end. Stop reading, start running.
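
The observe-think-act loop from the "Chatbot vs. Agent" point can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any of the four platforms is implemented: `run_agent`, `think`, `move_one`, and the dict-based `world` are hypothetical names, and `think` is a hard-coded stand-in for the LLM.

```python
from typing import Callable, Dict, List, Optional

World = Dict[str, List[str]]  # hypothetical stand-in for "the state of the world"


def run_agent(
    world: World,
    goal_met: Callable[[World], bool],
    think: Callable[[World, List[str]], Optional[str]],
    tools: Dict[str, Callable[[World], None]],
    max_steps: int = 20,
) -> List[str]:
    """Repeat observe -> think -> act until the goal is met or steps run out."""
    memory: List[str] = []  # actions taken so far (the "notebook")
    for _ in range(max_steps):
        # Observe: take a snapshot of the current state.
        observation = {name: list(items) for name, items in world.items()}
        if goal_met(observation):  # check against the goal
            break
        action = think(observation, memory)  # think (an LLM call in a real agent)
        if action is None:  # nothing left to try
            break
        tools[action](world)  # act: a tool changes the world
        memory.append(action)
    return memory


def move_one(world: World) -> None:
    """Toy tool: move one file from the inbox to the sorted folder."""
    world["sorted"].append(world["inbox"].pop(0))


def think(observation: World, memory: List[str]) -> Optional[str]:
    """Toy policy: keep moving files while any remain in the inbox."""
    return "move_one" if observation["inbox"] else None


world: World = {"inbox": ["invoice.pdf", "receipt.png"], "sorted": []}
history = run_agent(
    world,
    goal_met=lambda w: not w["inbox"],
    think=think,
    tools={"move_one": move_one},
)
print(history)  # ['move_one', 'move_one']
```

The `max_steps` cap is a design choice in this sketch: it keeps a confused loop from running forever, which is the same concern the Failure section of the prompt contract addresses.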

### Conclusion

The key to unlocking an AI agent's true power is to stop using it as a chatbot. By applying the prompt contract framework (Goal, Constraints, Format, Failure) and leveraging a persistent memory file, anyone can turn a $20 subscription into a hyper-productive digital worker.
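
One way to make the four-section prompt contract hard to skip is to build it from a small data structure that refuses to render when a section is blank. The sketch below is only an illustration: `PromptContract` and its field names are hypothetical, and the example values are shortened from the landing page contract in the video.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PromptContract:
    goal: str               # what "finished" looks like
    constraints: List[str]  # what the agent must not do
    output_format: str      # exact shape of the deliverable
    failure: str            # what to do when stuck

    def render(self) -> str:
        """Return the contract as plain text; refuse if any section is blank."""
        sections = {
            "goal": self.goal.strip(),
            "constraints": [c.strip() for c in self.constraints if c.strip()],
            "format": self.output_format.strip(),
            "failure": self.failure.strip(),
        }
        missing = [name for name, value in sections.items() if not value]
        if missing:
            raise ValueError(f"Missing contract sections: {', '.join(missing)}")
        lines = [f"Goal: {sections['goal']}", "Constraints:"]
        lines += [f"- {c}" for c in sections["constraints"]]
        lines += [f"Format: {sections['format']}", f"Failure: {sections['failure']}"]
        return "\n".join(lines)


contract = PromptContract(
    goal="Build a single-page landing site for my SaaS launch, ready for local review.",
    constraints=[
        "No JavaScript frameworks",
        "No external CDN scripts",
        "Don't touch any file outside the landing folder",
        "Don't deploy anywhere",
    ],
    output_format="A landing folder containing index.html plus a brief.md.",
    failure="If the target audience is unclear, stop and ask one consolidated question.",
)
print(contract.render())
```

The rendered text can be pasted into any of the agent platforms as the task prompt; nothing here depends on a platform-specific API.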

## Transcript

Same Claude, same 20 bucks a month, two completely different tools, a chatbot and an AI agent. One answers questions inside a chat window, the other opens your browser, runs 10 tasks at once, and finishes a week of work before lunch. Most people are paying for the second one,
and using it like the first. And that's the most expensive mistake in AI right now. Let me show you exactly how to flip that switch. Here's a folder on my desktop, 80-something files, PDFs, screenshots, receipts, invoices, contracts, random downloads from the last three months,
total mess. Looks like everybody's downloads folder. One command, plain English, I asked the agent to read every file, sort everything into categorized folders, rename it by date and vendor,
and build me one spreadsheet with every expense for tax season. Hit enter. Now watch, a few minutes later, eight clean folders, every file renamed to the same format, and one spreadsheet with every expense from the last three months, date, vendor, amount, category, ready to hand to my accountant.
A chatbot can tell you how to do this, an agent just does it. Now the prompt I typed wasn't lucky, there's a specific structure that makes agents actually follow through instead of going off the rails. We'll get to it later in the video, and once you see it, you'll never write a prompt the old way
again. But first, how this actually works. Quick foundation. Because if you skip this, nothing else in the video makes sense. A chatbot and an agent can run on the exact same model. The brain is identical,
the difference is what's wired around the brain. Picture a world-class chef sitting in an empty room, same training, same palate, same instincts as the chef running a Michelin kitchen across town.
But with no stove, no ingredients, and no order ticket telling them what to make, they can't actually cook a thing. The skill didn't change, everything around the skill did. That's exactly the difference between a chatbot and an agent. A chatbot is the brain alone. You type, it answers, conversation ends.
It can't open a file, run a command, browse a website, or remember what you taught it last week. It lives inside the chat window. Useful but boxed in, like a brilliant mind with no hands and no notebook. An agent is the same brain with four things bolted on. The LLM: the reasoning engine.
Tools: terminal, browser, file system, APIs, the hands. Memory: files the agent reads at the start of every session, so it doesn't show up empty-handed each time, the notebook. And goals: not vague wishes,
but specific outcomes with a clear definition of done, the destination. Tie those four together with a loop and you have an agent. The loop has three steps. Observe: the agent looks at the current state
of the world, what files exist, what the page shows, what the last command returned. Think: it decides what to do next based on what's actually true right now, not what you said five minutes ago. Act: it uses one of its tools to change something. Then it repeats: observe, think, act, checking against
the goal until done. That's the entire engine. Every agent on every platform runs this same loop, so that's the whole picture. LLM, tools, memory, goals, loop. There are dozens of agent platforms out
there. Most are noise. After a year of testing, four earned a permanent spot in my workflow, and they each win at something different. Quick tour of all four, exactly how to install each one, and what the
screen actually looks like the first time you open it. Claude Code, Anthropic's official agent, runs as a desktop app on your machine: Mac, Windows, or even Windows ARM64. The part most people get
wrong is pricing. Free Claude.ai does not include Claude Code. You need a paid Anthropic plan, the Pro plan, $17 a month on annual, or $20 if you pay monthly. Same subscription covers the
chat at Claude.ai and Claude Code. To install, Google "Claude Code desktop download". You'll land on a page with downloads for macOS, Windows, and Windows ARM64. Click the one for your machine, drag
Claude Code into your Applications folder, open it, sign in with Google or email. Done. First run, you'll see a chat page. Click the Code button, choose a folder to work in, and you're ready. Type your
task in plain English: "Spin up a simple to-do list app, add, check off, delete, nothing fancy." One setting to know about: bypass permissions. It sounds scary, but it just lets the agent act independently
without asking before each step. From there, the interface is what you'd expect. A message box at the bottom, a thinking panel that shows you what the model is doing in real time, and tool calls
as it edits files or runs commands. You can also queue a follow-up message like open it, and it'll pick that up the moment the current task finishes. And one thing worth saying out loud before we move on,
Claude Code is not just a coding tool. The name throws people off. Remember that folder cleanup demo at the start of the video? That was Claude Code. Same agent handles video editing tasks,
batch renaming, parsing PDFs, pulling data out of screenshots, anything you can describe on your computer in plain English. The ceiling is your imagination, not the tool. Treat Claude Code as a
general-purpose agent that happens to be great at code, and you'll start handing it work you'd never think to give a chatbot. Where Claude Code wins: interpretable reasoning. You can literally watch the model think step by step and steer it mid-flight. Pause it, redirect it, or hand it new context.
That makes it the strongest of the four when you're orchestrating something complex or chaining agents together. Anything where you want a thinking partner, not a missile: Claude Code. Quick break before the next platform. Claude Code builds the thing. Brevo's the part most builders skip: keeping the list
you've already got warm. Brevo's an all-in-one email, SMS, automation, and CRM platform. I use Aura AI daily, their built-in assistant. Watch, three seconds, and here's what comes back. Subject line, preview, body in my tone, and CTA button. The segment picked itself: 347 people who went quiet
last month. Tuesday, 10 a.m., when this list actually opens email. One edit, hit send. 41% open rate. 6.2% click-through. On a re-engagement list. Openers auto-flow into a follow-up I built once.
Free plan, 300 sends a day. Pricing scales by sends, not contact count. If you need enterprise-scale lifecycle marketing, this isn't HubSpot, and it isn't priced like it either. Use this code for
50% off starter and standard plans for three months, new paying customers only. Link in the description. All right. Back to the agents. Next up is Codex, OpenAI's official agent, and the easiest way in
for anyone who already lives inside ChatGPT. First, if you don't have an OpenAI account, head to OpenAI.com, click try ChatGPT in the top right corner and continue with Google,
phone, or email. Quick onboarding, you're in, but ChatGPT alone is just a chatbot. To get the agent, you need Codex specifically. To install, Google "OpenAI Codex". You'll land on a page with a download
button. The site auto-detects your OS. On Mac, it offers a macOS build. On Windows, you get the Windows installer, available since early 2024, and there's a Linux version too. Click download,
drag Codex into your Applications folder on Mac, same as any other app, and open it. It's bundled with the ChatGPT Plus, Pro, Business, and Enterprise plans, so if you already pay $20 a month for ChatGPT Plus, Codex is already covered. Same login, no extra subscription. First run, the app opens
to a clean workspace. In the middle, you create a new folder for your project. Call it whatever you want. Open inside it. Now you can type your task: spin up a simple to-do list app, add, check off, delete,
nothing fancy. Codex starts thinking, looks through your workspace, drafts files, and writes them out. You can queue a follow-up message, like "open it", and either send it immediately with shift or wait. The agent picks it up as soon as the current step finishes. Visually, you see thinking, tool calls,
and file writes as they happen. Same general feel as Claude Code, slightly different flavor. Codex also has one trick the others don't: a cloud version at ChatGPT.com. Hand off a long-running
task to a sandbox in the cloud, walk away, come back to a finished branch you can review. Useful when you don't want to keep your laptop running. Where Codex wins: friction. If you already pay for
ChatGPT, Codex is already half installed. Same account, no new subscription, no new tools to learn. The IDE extension also drops it straight into VS Code or Cursor as a sidebar with a built-in diff
view, so every change the agent proposes you can see file by file before it lands. For someone who doesn't want to think about pricing, plans, or auth, this is the lowest-friction way to get an agent running today. OpenClaw is the wild card and honestly the most fun one to play with. It's open source,
you self-host it on your own machine, and it was built by Peter Steinberger, the founder of PSPDFKit, as the personal AI agent he wanted for himself. It blew up fast. Over 100,000 stars on GitHub
in its first week, one of the fastest-growing AI repos of the year. The killer feature is where it lives. Not in a browser, not in a terminal, but inside your messengers. Telegram, WhatsApp, iMessage, Discord, Signal, Slack, Microsoft Teams, over 20 messaging apps work with it.
Text the bot from a coffee shop. The agent does the work on your computer at home and texts you back when it's done. Installation: easier than people think. One line of code. Step one, go to claw.bot, scroll down to Quick Start, copy the install command. Step two, open a terminal:
on Mac, hit Command+Space, type "terminal", press Enter; on Windows, search for Command Prompt or PowerShell; on Linux, open the terminal. Step three, paste the command, hit Enter. Done. That
single line installs OpenClaw and kicks off an onboarding wizard that walks through the rest. The first screen says "I understand this is very powerful and very risky." Confirm and the wizard
takes over. The wizard walks through five quick choices. Onboarding mode: pick Quick Start. AI provider: Anthropic for Claude, OpenAI for GPT, MiniMax for budget. Not permanent, you can switch
later. API key: paste in your key from the provider. If you don't have one, pause, go to platform.openai.com or the Anthropic console, generate a key, paste it back. Default model: newer models cost more
per call, but give better answers. Skills: pick three to five core integrations from the marketplace, Apple Notes, Notion, Things 3, PowerPoint, Google Docs, whatever you actually use. Don't go crazy,
you can add more later. Messenger: Telegram is the popular pick because the app is clean and you can dedicate it just to your bot. Paste your Telegram bot token, connect, you're done. Five to ten
minutes start to finish. The interface, that's the whole trick. There isn't a clunky agent dashboard. There's a Telegram bot if you want it, but the way I actually use OpenClaw is the web dashboard. Clean chat interface, full thread in front of me. I can see what the agent is doing in real time.
Open it, type your task. Summarize my unread emails from today, build me a Kanban board for my Q2 launch, respond to this email asking to push the meeting to Thursday, hit send. The agent picks up
the task on your computer, does the work, and the reply lands back in the same thread a few seconds later. Where OpenClaw wins: life automation. Not code, real-life stuff. Email triage, reminders,
document creation, project management, knowledge organization. Text it a video idea, it files it. Text it a tweet draft, it stores it. Text it a research note, it categorizes it.
Shopping and daily morning briefs. With full computer control and skill integrations, basically anything a human can do on a computer, OpenClaw can do. This is the agent that finally made AI feel like the thing we were promised, a real assistant living inside the apps you already use,
not another tab to manage. Anti-gravity is Google's agent platform built on Gemini, and the part most people miss is that it's not a website, it's a desktop app, specifically a heavily modified fork of VS Code, so it feels like a real IDE that happens to have an agent living inside it.
Mac, Windows, Linux, all supported. It's currently in public preview, free for individuals with generous rate limits on Gemini 3 Pro, no card required. Installation: Google "Google Anti-gravity
download", and you'll land on the official page. On Mac, check whether you're on Apple Silicon or Intel before you pick a build: type "About This Mac" in Spotlight, look at the chip line. M-something means Apple Silicon, anything that says Intel means Intel. Same idea on Windows,
pick the right architecture. Download, drag Anti-gravity into Applications on Mac, open it. If you're already signed in to Google in your browser, the app picks up auth automatically. If not, sign in once with your Google account. First run, the layout is VS Code with a twist:
code editor in the middle, file tree on the left, and on the right side a dedicated agent panel where you talk to Gemini. Type your task in the agent panel, spin up a simple to-do list app, add, check off, delete, nothing fancy, and Gemini gets to work. You'll see a generating tab at the
bottom, a model selector for fast versus Gemini 3 Pro, a thinking tab that tells you how long the agent has been reasoning, and any web searches it runs appear inline. Same general feel as
Claude Code and Codex, slightly different flavor, but you'll pick up the UX in five minutes. You can also queue follow-up messages, "open it", for example, and the agent fires them as soon as it finishes the current step. Where Anti-gravity wins: anything visual. Hands down the best agent for front-end work,
UI mockups, design iteration, landing pages, and anything that involves images or video. Gemini's multimodal stack is just ahead of the others when you need the agent to actually see what it's doing:
read a screenshot, check a layout, compare two design variations, generate a hero image. If your work is design, marketing, or anything visual, this is your agent. Quick rule of thumb to wrap this section. Words and code: use Claude Code. Lowest friction if
you're already in ChatGPT: use Codex. Life automation from a chat window: use OpenClaw. Visual or front-end work: use Anti-gravity. Quick honest moment. Claude Code, Codex, OpenClaw, Anti-gravity. These are world-class general-purpose agents, brilliant at thinking, code, automation,
browser work. But the moment you try to run an actual content operation with them, a YouTube channel, an Instagram, a TikTok, you hit the wall. They don't know your audience. They don't hold the channel strategy across sessions. They don't talk to each other. You end up
being the human glue between five chats, and the whole point of agents quietly disappears. That's exactly the gap we built AI Master for. Same idea as the four platforms in this video, agents with a loop, tools, memory, goals, but specialized for content production, and wired together
as a team. Three agents talking to each other. The producer agent runs the show. Holds your channel strategy, generates ideas, picks angles, decides what gets made, and when. The script writer agent takes those briefs and writes the scripts. In your voice, your length, your format. The designer agent
gets the brief like any designer would and ships the visuals. Thumbnails, covers, on-screen graphics. They hand work to each other automatically. You sit at the top and approve. It's not just YouTube.
Same three agents run any content operation. Instagram, TikTok, LinkedIn, newsletter, a podcast. You describe your project once at onboarding and the team adapts to that platform.
If you don't want to figure all this out yourself, we'll do it for you. Picking up the tools is one thing. Building an actual content engine on top of them is a different level of work, and that's the part we've already done. Strategy, scripts,
generation, production, publishing. It's the same system running on our channels and on our clients' channels right now, working like a pipeline, not a pile of chats. You don't get a stack of tools.
You get a finished process that produces content and a team that runs it end to end. The link is in the description below. Now that you've seen the four platforms, let's fix the single thing that breaks the most agent runs across all of them. The prompt itself. Because here's what nobody tells you
when you switch from chatbots to agents. The way you write a prompt has to change completely. A chatbot prompt is a description of what you want. An agent prompt is a contract. A brief the agent has to deliver against. Description versus contract. Two different sentences, two completely different
outcomes. Chatbot prompt sounds like, build me a landing page for my new product. Short, vague, the model fills in the blanks however it wants. Hand that exact same line to an agent and you've just lit money on fire. The agent has tools, a loop, real autonomy. It'll spin up, start scaffolding
files, install whatever framework it feels like, push a dark mode hero section you didn't ask for, and 10 minutes later hand you something you didn't want. But it costs real tokens to produce. The fix isn't a longer prompt. It's a structured one. A real agent prompt, what I call a prompt contract,
has four sections. Goal, constraints, format, failure. Memorize those four words. Every prompt you give an agent for the rest of your life should answer all four. Let me break them down. Goal: the outcome, not the action. The goal section answers one question: what does finished actually look like?
Not the action, the outcome. "Build a landing page" is an action. "Build a single-page landing site for my new product launch that pushes visitors toward one email sign-up above the fold, ready for
me to review locally before deploying" is an outcome. The first one is a wish. The second one tells the agent how to know when it's done. If your goal sentence doesn't include a clear finish line, the agent will invent one, and you won't like it. Constraints: the guardrails. Constraints are
everything the agent is not allowed to do. This is the section that prevents disasters. Don't install new dependencies without asking. Don't touch any file outside the landing folder. No external
CDN scripts. Don't deploy anywhere. Local files only. Don't pull copy from competitor sites. Constraints exist because an agent will happily do something that's technically inside the goal, but completely against the spirit of it. Every time an agent does something stupid you didn't
anticipate, that becomes a permanent constraint in your next prompt. Your constraint list is your scar tissue. It only grows. Format: the exact shape of the output. Format is where most prompts fall apart. The agent might do the work perfectly and then dump it into a structure you can't actually
use. Tell it the exact shape you want. Output a single index.html file with inline CSS. No JavaScript frameworks. Mobile responsive. Plus a brief.md alongside it that lists every section in order with the
headline copy you used. Or output a landing folder containing index.html, styles.css, and assets. Nothing else. Be that specific. The format section is where you reach into the agent's head and
decide the shape of the deliverable before it starts. If you can't describe the format in one sentence, you don't actually know what you want yet and you should not have hit enter. Failure: what to do when
stuck. This is the section almost nobody writes and it's the one that saves the most tokens. What should the agent do when it gets stuck? Should it ask you a clarifying question? Should it stop and report? Should it make a best-guess assumption and flag it? Without instructions, agents default to the worst
option. They keep trying, looping, burning tokens until something works or you kill it. One sentence fixes this. "If you're missing information you need, stop and ask before continuing." Or "If a tool call fails
twice in a row, stop and report what you tried." Define how the agent handles uncertainty or it will define it for you. Here's what a contract looks like end to end. Goal: build a single-page landing
site for my SaaS launch next week, optimized to convert visitors into email sign-ups above the fold, ready for me to review locally before deploying. Constraints: single index.html file. Inline CSS.
No JavaScript frameworks. No external CDN scripts. Light theme only. Mobile responsive. Don't touch any file outside the landing folder. Don't deploy anywhere. Format: landing folder containing an
index.html plus a brief.md that lists every section in order with the headline copy used. Failure: if the target audience or core value prop is unclear from the project notes, stop and ask one consolidated
question rather than guessing. That's a contract. Five sentences. The agent now knows the outcome, the rails, the deliverable and the escape hatch. Ten minutes later, you have a real landing page
instead of a mystery. Try it on your next agent run. Pick a task, write the four sections before you hit enter. The first time you do it, it'll feel like overhead. By the third run, you'll wonder how
you ever briefed an agent any other way. Description versus contract. That's the whole shift. Here's the moment most people give up on agents. They run a task. The agent does something dumb. They correct it. They run another task. Same dumb thing. 10th time in a row. People assume the agent is broken. The agent
isn't broken. You just never told it to remember. Every serious platform has the same feature with different file names. Claude Code reads Claude.md. OpenClaw reads agents.md. Anti-gravity has its own version. Same idea
everywhere. A plain text file the agent reads at the start of every session before it touches anything. Whatever is in that file becomes a rule it follows forever. Drop it in the root of your project and you've just given your agent a long-term memory. A few weeks ago, I was building a customer dashboard
with Claude Code. Every single run, the agent slipped emojis into customer-facing copy. Confirmation messages, error states, button labels, little smileys and rockets everywhere. I'd strip them out.
Next session, they came back. So I just told the agent: create a Claude.md file in the root of this project and add one rule. Never use emojis in customer-facing copy unless I explicitly ask.
This is a B2B product. It spun up the file, dropped the rule in, saved it. Took about 10 seconds. That bug never showed up again in any project. Ever. Now, layer on the real trick. Make the file
self-modifying. Ask the agent to add this rule: before you finish any task, if I corrected you or you hit a bug from a wrong assumption, append a new rule to the learned rules section at the bottom of
this file. Now, the agent updates its own memory. Session 1, you have one rule. Session 5, you have 20. Session 20, the agent rarely makes a preference mistake because it's been writing its own scar tissue
the whole time. Three things to remember. First, the prompt contract: goal, constraints, format, failure. That's what turns an agent from an expensive chatbot into something that ships. Second,
memory. One memory file in the root of your project, Claude.md, agents.md, whatever your platform uses and the agent stops making the same mistakes session after session. Third, the four platforms each win at something different. Pick the one that fits your work and go deep. Here's your action plan.
Open one of the four platforms tonight. Pick a real task, not a toy. Write a prompt contract, add a memory file with three rules. Claude.md, agents.md, whatever fits your platform.
Run it end to end. Stop reading about AI agents. Start running them. Your future self will thank you.
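
The self-modifying memory file from the transcript can also be maintained by a small helper instead of by the agent itself. The sketch below is one possible approach, assuming the learned rules section sits at the bottom of the file as the video suggests: `append_learned_rule` and the `## Learned rules` heading are hypothetical names, and the demo writes to a temporary directory so it never touches a real project. The demo uses the spelling `CLAUDE.md`, which is how Claude Code's documentation names the file.

```python
import tempfile
from pathlib import Path

SECTION_HEADING = "## Learned rules"  # hypothetical heading; use whatever your file uses


def append_learned_rule(memory_file: Path, rule: str) -> bool:
    """Append a rule under the learned rules section; return False if already present.

    Assumes the section is the last one in the file, so appending at the end
    of the file lands inside it.
    """
    rule_line = f"- {rule.strip()}"
    text = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    lines = text.splitlines()
    if rule_line in lines:
        return False  # never store the same rule twice
    if SECTION_HEADING not in lines:
        if lines and lines[-1].strip():
            lines.append("")  # blank line before the new section
        lines.append(SECTION_HEADING)
        lines.append("")
    lines.append(rule_line)
    memory_file.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True


# Demo in a throwaway directory standing in for a project root.
with tempfile.TemporaryDirectory() as project_root:
    memory_file = Path(project_root) / "CLAUDE.md"
    memory_file.write_text("# Project rules\n", encoding="utf-8")
    added = append_learned_rule(memory_file, "Never use emojis in customer-facing copy.")
    repeated = append_learned_rule(memory_file, "Never use emojis in customer-facing copy.")
    final_text = memory_file.read_text(encoding="utf-8")

print(final_text)
```

Skipping duplicates keeps the file from bloating as corrections pile up across sessions.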