AI Agents Explained: How to Create and Use AI Agents in 2026

24-minute video · Transcribed Jun 30, 2026
Intermediate · 14 min read · For: Tech-savvy professionals and creators already familiar with LLMs and chatbots who want to move to AI agents for real automation and productivity.
Views: 89.6K · Likes: 1.5K · Comments: 72 · Dislikes: 56 · 1.8% (📊 Average)

AI Summary

This video explains the critical difference between a simple chatbot and a full AI agent, demonstrating how the same underlying model becomes a powerful autonomous worker when given tools, memory, goals, and a feedback loop. It covers four leading agent platforms—Claude Code, Codex, OpenClaw, and Anti-gravity—shows how to install and use each, and teaches a structured 'prompt contract' method to ensure agents execute tasks correctly and efficiently.

[00:00]
The $20 Mistake

Most people pay for an AI agent but use it like a simple chatbot. The agent can run 10 tasks at once and finish a week of work before lunch, but users only ask it questions. This is the most expensive mistake in AI.

[00:34]
Agent in Action: File Cleanup

A real-world demo shows an agent cleaning 80+ messy files (PDFs, receipts, invoices) into 8 categorized folders, renaming by date/vendor, and building an expense spreadsheet—all from one plain-English command.
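
As a rough illustration of the renaming and spreadsheet steps in this demo, here is a minimal Python sketch. It is not what the agent in the video actually runs: the `Receipt` record, the `YYYY-MM-DD_vendor` naming scheme, and the function names are all assumptions made for the example, and the code only plans the moves instead of touching files on disk.

```python
import csv
import io
import re
from dataclasses import dataclass
from datetime import date
from pathlib import Path


@dataclass
class Receipt:
    """Hypothetical record an agent might extract from one file."""
    original_name: str
    issued: date
    vendor: str
    amount: float
    category: str


def standard_name(receipt: Receipt) -> str:
    """Build a 'YYYY-MM-DD_vendor.ext' name (an assumed naming scheme)."""
    suffix = Path(receipt.original_name).suffix.lower()
    vendor_slug = re.sub(r"[^a-z0-9]+", "-", receipt.vendor.lower()).strip("-")
    return f"{receipt.issued.isoformat()}_{vendor_slug}{suffix}"


def plan_moves(receipts: list[Receipt]) -> dict[str, str]:
    """Map each original name to 'category/new_name' without touching disk."""
    return {r.original_name: f"{r.category}/{standard_name(r)}" for r in receipts}


def expense_csv(receipts: list[Receipt]) -> str:
    """Render the expense sheet as CSV text: date, vendor, amount, category."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "vendor", "amount", "category"])
    for r in sorted(receipts, key=lambda item: item.issued):
        writer.writerow([r.issued.isoformat(), r.vendor, f"{r.amount:.2f}", r.category])
    return buffer.getvalue()


# Made-up sample data, only to show the shape of the output.
receipts = [
    Receipt("IMG_0042.PDF", date(2026, 3, 14), "Acme Hosting, Inc.", 49.0, "software"),
    Receipt("scan.jpg", date(2026, 1, 5), "City Taxi", 23.5, "travel"),
]
moves = plan_moves(receipts)
sheet = expense_csv(receipts)
```

Planning the moves as a dictionary first keeps the step reviewable: you can inspect `moves` before anything is renamed.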

[01:32]
Chatbot vs. Agent: The Brain Analogy

A chatbot is like a world-class chef with no kitchen, tools, or recipes. An agent is the same brain plus four components: the LLM (reasoning), tools (terminal/browser/filesystem), memory (a notebook), and goals (a destination). Tied together with an observe-think-act loop.
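
The observe-think-act loop can be sketched in a few lines of Python. This is a toy under stated assumptions, not any platform's real implementation: `run_agent`, the dictionary state, and the `move_next_file` tool are hypothetical names, and `think` is a stand-in for what would be an LLM call in a real agent.

```python
from typing import Callable

State = dict[str, list[str]]
Tool = Callable[[State], State]


def run_agent(
    state: State,
    goal_met: Callable[[State], bool],
    think: Callable[[State], str],
    tools: dict[str, Tool],
    max_steps: int = 20,
) -> tuple[State, list[str]]:
    """Repeat observe -> think -> act until the goal is met or steps run out."""
    history: list[str] = []
    for _ in range(max_steps):
        observation = dict(state)        # Observe: snapshot the current state
        if goal_met(observation):        # Check in against the goal
            break
        action = think(observation)      # Think: choose the next tool by name
        state = tools[action](state)     # Act: the tool changes the state
        history.append(action)
    return state, history


# Toy stand-ins: a real agent would call an LLM in `think` and real tools to act.
def think(observation: State) -> str:
    return "move_next_file"


def move_next_file(state: State) -> State:
    first, *rest = state["inbox"]
    return {"inbox": rest, "sorted": state["sorted"] + [first]}


final_state, history = run_agent(
    state={"inbox": ["a.pdf", "b.pdf"], "sorted": []},
    goal_met=lambda s: not s["inbox"],
    think=think,
    tools={"move_next_file": move_next_file},
)
```

The `max_steps` cap is a design choice added for the sketch so that a goal that is never met cannot loop forever.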

[03:01]
The Four Agent Platforms Overview

After a year of testing, four platforms stand out: Claude Code (best for interpretable reasoning), Codex (lowest friction for ChatGPT users), OpenClaw (life automation via messengers), and Anti-gravity (visual/front-end work).

[03:40]
Claude Code: Installation & Pricing

Requires a paid Anthropic plan ($17-20/mo). Search for 'Claude Code desktop download' to find the installer. Free Claude.ai does NOT include Claude Code. Works for any computer task, not just coding.

[06:51]
Codex: OpenAI's Agent

For ChatGPT Plus subscribers (already included). Search for 'OpenAI Codex' to find the download. Runs on Mac, Windows, Linux. Has a unique cloud version at ChatGPT.com for long-running tasks. Lowest friction for existing ChatGPT users.

[09:11]
OpenClaw: Open Source & Messengers

Self-hosted, open-source agent that lives inside Telegram, WhatsApp, iMessage, etc. Single-line install: go to claw.bot, copy command. Best for life automation: email triage, reminders, document creation.

[12:24]
Anti-gravity: Google's Agent

Desktop app (fork of VS Code), free preview with Gemini 3 Pro. Best for visual/front-end work: UI mockups, design iteration, image/video tasks.

[14:43]
Quick Rule of Thumb for Platform Choice

Words and code: Claude Code. Low friction (ChatGPT user): Codex. Life automation: OpenClaw. Visual/front-end: Anti-gravity.

[16:54]
The Prompt Contract: Goal, Constraints, Format, Failure

Chatbot prompts are descriptions; agent prompts must be contracts. The four sections: Goal (clear outcome), Constraints (guardrails), Format (exact output shape), Failure (what to do when stuck). Example given for building a landing page.
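
One way to make the four sections hard to skip is to build the prompt from a small structure that refuses to render when a section is empty. The sketch below is only an illustration: the `PromptContract` class and its `render` layout are assumptions, not a format any platform requires. The sample values paraphrase the landing-page example from the video.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptContract:
    """Hypothetical container for the four sections of a prompt contract."""
    goal: str
    constraints: list[str]
    output_format: str
    failure: str

    def render(self) -> str:
        """Return the prompt text, or raise if any section is empty."""
        for section in fields(self):
            if not getattr(self, section.name):
                raise ValueError(f"missing section: {section.name}")
        rules = "\n".join(f"- {rule}" for rule in self.constraints)
        return (
            f"Goal: {self.goal}\n"
            f"Constraints:\n{rules}\n"
            f"Format: {self.output_format}\n"
            f"Failure: {self.failure}"
        )


contract = PromptContract(
    goal="Single-page landing site for my SaaS launch, ready for local review.",
    constraints=[
        "No JavaScript frameworks",
        "No external CDN scripts",
        "Don't touch any file outside the landing folder",
        "Don't deploy anywhere",
    ],
    output_format="A landing folder containing index.html plus a brief.md.",
    failure="If the audience or value prop is unclear, stop and ask one question.",
)
prompt_text = contract.render()
```

The rendered `prompt_text` is what you would paste into the agent; the validation step simply catches a missing Failure or Format section before you hit enter.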

[22:04]
Memory Files: The Permanent Fix

Every agent platform reads a memory file at session start (Claude.md, agents.md, etc.). Drop rules there and the agent never repeats mistakes. Make it self-modifying: ask the agent to append new rules when corrected.
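
The self-modifying part amounts to appending a bullet to a section at the bottom of the memory file. In the video the agent does this itself; the Python sketch below only shows the file operation, under assumptions: the `## Learned rules` heading and the `append_learned_rule` helper are hypothetical, and the code assumes that section is the last one in the file.

```python
from pathlib import Path

LEARNED_HEADING = "## Learned rules"  # assumed section title, not a platform standard


def append_learned_rule(memory_file: Path, rule: str) -> bool:
    """Append `rule` as a bullet at the end of the file; skip exact duplicates.

    Returns True if the rule was written, False if it was already present.
    Assumes the learned-rules section is the last section of the file.
    """
    text = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    bullet = f"- {rule.strip()}"
    if bullet in text.splitlines():
        return False
    if LEARNED_HEADING not in text:
        separator = "\n\n" if text.strip() else ""
        text = text.rstrip("\n") + separator + LEARNED_HEADING + "\n"
    text = text.rstrip("\n") + "\n" + bullet + "\n"
    memory_file.write_text(text, encoding="utf-8")
    return True
```

For example, `append_learned_rule(Path("Claude.md"), "Never use emojis in customer-facing copy unless explicitly asked.")` would create the file if needed and add the rule once; calling it again with the same rule leaves the file unchanged.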

[24:08]
Action Plan

Open one platform tonight, pick a real task, write a prompt contract, add a memory file with 3 rules, and run it end-to-end. Stop reading, start running.

The key to unlocking an AI agent's true power is to stop using it as a chatbot. By applying the prompt contract framework (Goal, Constraints, Format, Failure) and leveraging a persistent memory file, anyone can turn a $20 subscription into a hyper-productive digital worker.

Clickbait Check

92% Legit

"The title is a strong match: the video thoroughly explains the conceptual differences, installs, use cases, prompt strategies, and memory files for AI agents—delivering exactly what the headline promises."

Tutorial Checklist

1 03:40 Install Claude Code: Google 'Claude code desktop download', click download for your OS, drag into Applications, open and sign in.
2 06:51 Install Codex: Google 'OpenAI Codex', click download (auto-detects OS), drag into Applications (Mac) or run installer (Windows). Sign in with your ChatGPT Plus account.
3 09:59 Install OpenClaw: Go to claw.bot, copy the install command, paste into terminal, complete the wizard's quick choices (onboarding mode, AI provider, API key, default model, skills, messenger).
4 12:54 Install Anti-gravity: Google 'Google anti-gravity download', pick your OS/architecture, download, drag into Applications, sign in with Google account.
5 17:48 Write a Prompt Contract: For every agent task, write four sections: Goal (what finished looks like), Constraints (what not to do), Format (exact output shape), Failure (what to do when stuck).
6 22:04 Create a Memory File: In the project root, create Claude.md (for Claude Code) or agents.md (for OpenClaw); other platforms use their own file name. Add rules like 'Never use emojis in customer-facing copy unless explicitly asked'. Enable self-modification by asking the agent to append corrected rules automatically.
7 24:08 Execute a Real Task: Open one of the four platforms, pick a non-toy task, write a prompt contract, ensure the memory file exists, and run the task end-to-end.
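
The Failure section in step 5 can be read as a small policy. The video's example rule is "if a tool call fails twice in a row, stop and report what you tried", which could be sketched as below. This is a hypothetical wrapper written for illustration, not a feature of any of the four platforms; `StopAndReport` and `call_with_failure_rule` are invented names.

```python
from typing import Callable


class StopAndReport(Exception):
    """Raised when the agent should halt and tell the user what it tried."""


def call_with_failure_rule(
    tool: Callable[[], str], max_consecutive_failures: int = 2
) -> str:
    """Run `tool`, retrying after a failure, but stop once the limit is reached."""
    attempts: list[str] = []
    for _ in range(max_consecutive_failures):
        try:
            return tool()
        except Exception as error:  # record what was tried instead of looping forever
            attempts.append(f"{type(error).__name__}: {error}")
    raise StopAndReport("stopped after repeated failures: " + "; ".join(attempts))
```

The point of the design is the bounded loop: the error message carries every attempt, so the report tells you what was tried rather than just that something failed.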

Study Flashcards (10)

What are the four components that turn an LLM into an agent?

Difficulty: easy

LLM (reasoning engine), Tools (terminal, browser, file system), Memory (files read at session start), Goals (specific outcomes).

02:18

What are the three steps of the agent loop?

Difficulty: easy

Observe (look at current state), Think (decide next action), Act (use a tool to change something). Repeat until the goal is met.

02:44

Which Claude plan includes Claude Code?

Difficulty: easy

The paid Anthropic Pro plan ($17/month annual or $20/month). Free Claude.ai does not include Claude Code.

03:40

What is the key advantage of Codex over other agents?

Difficulty: medium

Lowest friction: if you already pay for ChatGPT Plus, Codex is already included and requires no new subscription or tools. It also has a cloud version for long-running tasks.

08:42

What unique feature does OpenClaw have compared to other agents?

Difficulty: medium

It lives inside messengers (Telegram, WhatsApp, iMessage, etc.) so you can text it a task from anywhere and it works on your home computer, then texts back the result.

09:24

What is the recommended installation method for OpenClaw?

Difficulty: medium

A single-line install command pasted into terminal, followed by an onboarding wizard that asks 5 quick choices: onboarding mode, AI provider, API key, default model, and skills.

09:59

Which agent platform is best for visual and front-end work?

Difficulty: hard

Anti-gravity (Google's agent), because Gemini's multi-modal stack is ahead of others when the agent needs to 'see' screenshots, check layouts, or generate images/video.

14:15

What are the four sections of a 'prompt contract'?

Difficulty: medium

Goal (outcome), Constraints (guardrails), Format (exact output shape), Failure (what to do when stuck).

17:48

What is the single most important line to add to a memory file to enable continuous improvement?

Difficulty: hard

Ask the agent to append a new rule to the 'learned rule' section at the bottom of the memory file whenever it is corrected or encounters a bug from a wrong assumption.

23:15

What file does Claude Code read at startup for project-level rules?

Difficulty: easy

Claude.md (in the root of the project).

22:19

💡 Key Takeaways

💡

The Chef Without a Kitchen

This analogy perfectly and memorably distinguishes a chatbot (brain alone) from an agent (brain + tools + memory + goals), making the abstract concept instantly graspable.

01:32
⚖️

The Observe-Think-Act Loop

This is the foundational engine of every agent platform. Understanding it is essential to debugging and designing agent behavior.

02:44
🔧

The Prompt Contract Framework

Introduces a structured method (Goal, Constraints, Format, Failure) that directly solves the common problem of agents 'going off the rails' and wasting tokens.

17:48
🔧

Self-Modifying Memory Files

Reveals a practical, powerful technique to make an agent improve automatically over time by writing its own rules into a persistent memory file.

22:04
💬

Stop Reading, Start Running

A strong call to action that shifts the viewer from passive learning to active implementation, emphasizing that the real value comes from doing.

24:08

✂️ Creator Tools: Viral Hooks

AI-generated clip ideas for Shorts based on the transcript

You're Paying for an AI Agent but Using It Wrong

45s

The dramatic contrast between paying for advanced AI and using it as a basic chatbot creates immediate FOMO and curiosity.

Chef Analogy: Chatbot vs Agent Explained

46s

The simple visual analogy of a chef without a kitchen makes the technical concept of agent architecture instantly understandable and shareable.

The #1 Mistake When Prompting AI Agents

54s

Reveals a critical flaw in how people use agents and provides a four-step framework that dramatically improves results, making it highly valuable and likely to be saved.

Stop Correcting Your AI Agent – Do This Instead

46s

The 'memory file' hack solves a common frustration of agents repeating mistakes, offering a permanent fix that feels like a secret tip, driving engagement and shares.

[00:00] Same Claude, same 20 bucks a month, two completely different tools, a chatbot and an AI agent. One answers questions inside a chat window, the other opens your browser, runs 10 tasks at once, and finishes a week of work before lunch. Most people are paying for the second one,

[00:16] and using it like the first. And that's the most expensive mistake in AI right now. Let me show you exactly how to flip that switch. Here's a folder on my desktop: 80-something files, PDFs, screenshots, receipts, invoices, contracts, random downloads from the last three months,

[00:34] total mess. Looks like everybody's downloads folder. One command, plain English: I asked the agent to read every file, sort everything into categorized folders, rename each file by date and vendor,

[00:46] and build me one spreadsheet with every expense for tax season. Hit enter. Now watch, a few minutes later, eight clean folders, every file renamed to the same format, and one spreadsheet with every expense from the last three months, date, vendor, amount, category, ready to hand to my accountant.

[01:03] A chatbot can tell you how to do this, an agent just does it. Now the prompt I typed wasn't lucky, there's a specific structure that makes agents actually follow through instead of going off the rails. We'll get to it later in the video, and once you see it, you'll never write a prompt the old way

[01:19] again. But first, how this actually works. Quick foundation. Because if you skip this, nothing else in the video makes sense. A chatbot and an agent can run on the exact same model. The brain is identical,

[01:32] the difference is what's wired around the brain. Picture a world-class chef sitting in an empty room: same training, same palate, same instincts as the chef running a Michelin kitchen across town.

[01:44] But with no stove, no ingredients, and no order ticket telling them what to make, they can't actually cook a thing. The skill didn't change, everything around the skill did. That's exactly the difference between a chatbot and an agent. A chatbot is the brain alone. You type, it answers. Conversation ends,

[02:01] it can't open a file, run a command, browse a website, or remember what you taught it last week. It lives inside the chat window. Useful but boxed in, like a brilliant mind with no hands and no notebook. An agent is the same brain with four things bolted on: the LLM, the reasoning engine,

[02:18] tools: terminal, browser, file system, APIs, the hands. Memory: files the agent reads at the start of every session, so it doesn't show up empty-handed each time, the notebook. And goals: not vague wishes,

[02:32] but specific outcomes with a clear definition of done, the destination. Tie those four together with a loop and you have an agent. The loop has three steps. Observe: the agent looks at the current state

[02:44] of the world. What files exist, what the page shows, what the last command returned. Think: it decides what to do next based on what's actually true right now, not what you said five minutes ago. Act: it uses one of its tools to change something. Then it repeats: observe, think, act, check in against

[03:01] the goal until done. That's the entire engine. Every agent on every platform runs this same loop, so that's the whole picture: LLM, tools, memory, goals, loop. There are dozens of agent platforms out

[03:14] there. Most are noise. After a year of testing, four earned a permanent spot in my workload, and they each win at something different. Quick tour of all four, exactly how to install each one, and what the

[03:26] screen actually looks like the first time you open it. Claude Code, Anthropic's official agent, runs as a desktop app on your machine: Mac, Windows, or even Windows ARM64. The part most people get

[03:40] wrong is pricing. Free Claude.ai does not include Claude Code. You need a paid Anthropic plan: the Pro plan, $17 a month on annual, or $20 if you pay monthly. The same subscription covers the

[03:53] chat at Claude.ai and Claude Code. To install, search Google for Claude Code desktop download. You'll land on a page with downloads for macOS, Windows, and Windows ARM64. Click the one for your machine, drag

[04:06] Claude Code into your Applications folder, open it, sign in with Google or email. Done. First run, you'll see a chat page. Click the Code button, choose a folder to work in, and you're ready. Type your

[04:18] task in plain English: spin up a simple to-do list app, add, check off, delete, nothing fancy. One setting to know about: bypass permissions. It sounds scary, but it just lets the agent act independently

[04:30] without asking before each step. From there, the interface is what you'd expect: a message box at the bottom, a thinking panel that shows you what the model is doing in real time, and tool calls

[04:42] as it edits files or runs commands. You can also queue a follow-up message like "open it", and it'll pick that up the moment the current task finishes. And one thing worth saying out loud before we move on:

[04:54] Claude Code is not just a coding tool. The name throws people off. Remember that folder cleanup demo at the start of the video? That was Claude Code. The same agent handles video editing tasks,

[05:07] batch renaming, parsing PDFs, pulling data out of screenshots, anything you can describe on your computer in plain English. The ceiling is your imagination, not the tool. Treat Claude Code as a

[05:19] general-purpose agent that happens to be great at code, and you'll start handing it work you'd never think to give a chatbot. Where Claude Code wins: interpretable reasoning. You can literally watch the model think step by step and steer it mid-flight. Pause it, redirect it, or hand it new context.

[05:37] That makes it the strongest of the four when you're orchestrating something complex or chaining agents together. Anything where you want a thinking partner, not a missile: Claude Code. Quick break before the next platform. Claude Code builds the thing. Brevo's the part most builders skip: keeping the list

[05:54] you've already got warm. Brevo's an all-in-one email, SMS, automation, and CRM platform. I use Aura AI, their built-in assistant, daily. Watch: three seconds, and here's what comes back. Subject line, preview, body in my tone, and CTA button. The segment picked itself. 347 people who went quiet.

[06:13] Last month. Tuesday, 10 a.m., when this list actually opens email. One edit, hit send. 41% open rate, 6.2% click-through, on a re-engagement list. Openers auto-flow into a follow-up I built once.

[06:26] Free plan, 300 sends a day. Pricing scales by sends, not contact count. If you need enterprise-scale lifecycle marketing, this isn't HubSpot, and it isn't priced like it either. Use this code for

[06:38] 50% off starter and standard plans for three months, new paying customers only. Link in the description. All right. Back to the agents. Next up is Codex, OpenAI's official agent, and the easiest way in

[06:51] for anyone who already lives inside ChatGPT. First, if you don't have an OpenAI account, head to OpenAI.com, click try ChatGPT in the top right corner and continue with Google,

[07:04] phone, or email. Quick onboarding, you're in, but ChatGPT alone is just a chatbot. To get the agent, you need Codex specifically. To install, search Google for OpenAI Codex. You'll land on a page with a download

[07:17] button. The site auto-detects your OS. On Mac, it offers a macOS build. On Windows, you get the Windows installer, available since early 2024, and there's a Linux version too. Click download,

[07:30] drag Codex into your Applications folder on Mac, same as any other app, and open it. It's bundled with the ChatGPT Plus, Pro, Business, and Enterprise plans, so if you already pay $20 a month for ChatGPT Plus, Codex is already covered. Same login, no extra subscription. First run, the app opens

[07:48] to a clean workspace. In the middle, you create a new folder for your project. Call it whatever you want. Open it. Now you can type your task: spin up a simple to-do list app, add, check off, delete,

[08:00] nothing fancy. Codex starts thinking, looks through your workspace, drafts files, and writes them out. You can queue a follow-up message, like "open it", and either send it immediately with Shift or wait. The agent picks it up as soon as the current step finishes. Visually, you see thinking, tool calls,

[08:17] and file writes as they happen. Same general feel as Claude Code, slightly different flavor. Codex also has one trick the others don't: a cloud version at ChatGPT.com. Hand off a long running

[08:30] task to a sandbox in the cloud, walk away, come back to a finished branch you can review. Useful when you don't want to keep your laptop running. Where Codex wins: friction. If you already pay for

[08:42] ChatGPT, Codex is already half installed. Same account, no new subscription, no new tools to learn. The IDE extension also drops it straight into VS Code or Cursor as a sidebar with a built-in diff

[08:55] view, so every change the agent proposes you can see file by file before it lands. For someone who doesn't want to think about pricing, plans, or auth, this is the lowest-friction way to get an agent running today. OpenClaw is the wild card and honestly the most fun one to play with. It's open source,

[09:11] you self-host it on your own machine, and it was built by Peter Steinberger, the founder of PSPDFKit, as the personal AI agent he wanted for himself. It blew up fast: over 100,000 stars on GitHub

[09:24] in its first week, one of the fastest growing AI repos of the year. The killer feature is where it lives. Not in a browser. Not in a terminal, but inside your messengers: Telegram, WhatsApp, iMessage, Discord, Signal, Slack, Microsoft Teams. Over 20 messaging apps work with it.

[09:41] Text the bot from a coffee shop. The agent does the work on your computer at home and texts you back when it's done. Installation: easier than people think. One line of code. Step one: go to claw.bot, scroll down to Quick Start, copy the install command. Step two: open a terminal,

[09:59] on Mac, hit Command-Space, type Terminal, press Enter; on Windows, search for Command Prompt or PowerShell; on Linux, open the terminal. Step three: paste the command, hit Enter. Done. That's

[10:11] a single-line install. OpenClaw then kicks off an onboarding wizard that walks through the rest. The first screen says "I understand this is very powerful and very risky." Confirm and the wizard

[10:23] takes over. The wizard walks through five quick choices. Onboarding mode: pick Quick Start. AI provider: Anthropic for Claude, OpenAI for GPT, MiniMax for budget. Not permanent, you can switch

[10:35] later. API key: paste in your key from the provider. If you don't have one, pause, go to platform.openai.com or the Anthropic Console, generate a key, paste it back. Default model: newer models cost more

[10:48] per call, but give better answers. Skills: pick three to five core integrations from the marketplace, such as Apple Notes, Notion, Things 3, PowerPoint, Google Docs, whatever you actually use. Don't go crazy,

[11:01] you can add more later. Messenger: Telegram is the popular pick because the app is clean and you can dedicate it just to your bot. Paste your Telegram bot token, connect, and you're done. Five to ten

[11:13] minutes start to finish. The interface: that's the whole trick. There isn't a clunky agent dashboard. There's a Telegram bot if you want it, but the way I actually use OpenClaw is the web dashboard. Clean chat interface, full thread in front of me. I can see what the agent is doing in real time.

[11:30] Open it, type your task. Summarize my unread emails from today, build me a Kanban board for my Q2 launch, respond to this email asking to push the meeting to Thursday, hit send. The agent picks up

[11:42] the task on your computer, does the work, and the reply lands back in the same thread a few seconds later. Where OpenClaw wins: life automation. Not code, real-life stuff. Email triage, reminders,

[11:55] document creation, project management, knowledge organization. Text it a video idea, it files it. Text it a tweet draft, it stores it. Text it a research note, it categorizes it,

[12:07] shopping and daily morning briefs. With full computer control and skill integrations, basically anything a human can do on a computer, OpenClaw can do. This is the agent that finally made AI feel like the thing we were promised, a real assistant living inside the apps you already use,

[12:24] not another tab to manage. Anti-gravity is Google's agent platform built on Gemini, and the part most people miss is that it's not a website, it's a desktop app, specifically a heavily modified fork of VS Code, so it feels like a real IDE that happens to have an agent living inside it.

[12:42] Mac, Windows, Linux, all supported. It's currently in public preview, free for individuals with generous rate limits on Gemini 3 Pro, no card required. Installation: search Google for Google Anti-gravity

[12:54] download, and you'll land on the official page. On Mac, check whether you're on Apple Silicon or Intel before you pick a build: type About This Mac in Spotlight and look at the chip line. M-something means Apple Silicon, anything that says Intel means Intel. Same idea on Windows,

[13:11] pick the right architecture. Download, drag Anti-gravity into Applications on Mac, open it. If you're already signed in to Google in your browser, the app picks up auth automatically. If not, sign in once with your Google account. First run, the layout is VS Code with a twist:

[13:27] code editor in the middle, file tree on the left, and on the right side a dedicated agent panel where you talk to Gemini. Type your task in the agent panel: spin up a simple to-do list app, add, check off, delete, nothing fancy, and Gemini gets to work. You'll see a generating tab at the

[13:44] bottom, a model selector for fast versus Gemini 3 Pro, a thinking tab that tells you how long the agent has been reasoning, and any web searches it runs appear in line. Same general feel as

[13:57] Claude Code and Codex, slightly different flavor, but you'll pick up the UX in 5 minutes. You can also queue follow-up messages, "open it", for example, and the agent fires them as soon as it finishes the current step. Where Anti-gravity wins: anything visual. Hands down the best agent for front-end work,

[14:15] UI mockups, design iteration, landing pages, and anything that involves images or video. Gemini's multi-modal stack is just ahead of the others when you need the agent to actually see what it's doing,

[14:27] read a screenshot, check a layout, compare two design variations, generate a hero image. If your work is design, marketing, or anything visual, this is your agent. Quick rule of thumb to wrap this section. Words and code: use Claude Code. Lowest friction if

[14:43] you're already in ChatGPT: use Codex. Life automation from a chat window: use OpenClaw. Visual or front-end work: use Anti-gravity. Quick honest moment. Claude Code, Codex, OpenClaw, Anti-gravity: these are world-class general-purpose agents, brilliant at thinking, code, automation,

[14:59] browser work. But the moment you try to run an actual content operation with them, a YouTube channel, an Instagram, a TikTok, you hit the wall. They don't know your audience. They don't hold the channel strategy across sessions. They don't talk to each other. You end up

[15:11] being the human glue between five chats, and the whole point of agents quietly disappears. That's exactly the gap we built AI Master for. Same idea as the four platforms in this video, agents with a loop, tools, memory, goals, but specialized for content production, and wired together

[15:29] as a team. Three agents talking to each other. The producer agent runs the show. Holds your channel strategy, generates ideas, picks angles, decides what gets made, and when. The script writer agent takes those briefs and writes the scripts. In your voice, your length, your format. The designer agent

[15:45] gets the brief like any designer would and ships the visuals. Thumbnails, covers, on-screen graphics. They hand work to each other automatically. You sit at the top and approve. It's not just YouTube.

[15:58] Same three agents run any content operation. Instagram, TikTok, LinkedIn, newsletter, a podcast. You describe your project once at onboarding and the team adapts to that platform.

[16:10] If you don't want to figure all this out yourself, we'll do it for you. Picking up the tools is one thing. Building an actual content engine on top of them is a different level of work, and that's the part we've already done. Strategy, scripts,

[16:25] generation, production, publishing. It's the same system running on our channels and on our clients' channels right now, working like a pipeline, not a pile of chats. You don't get a stack of tools.

[16:37] You get a finished process that produces content and a team that runs it end to end. The link is in the description below. Now that you've seen the four platforms, let's fix the single thing that breaks the most agent runs across all of them. The prompt itself. Because here's what nobody tells you

[16:54] when you switch from chatbots to agents. The way you write a prompt has to change completely. A chatbot prompt is a description of what you want. An agent prompt is a contract, a brief the agent has to deliver against. Description versus contract. Two different sentences, two completely different

[17:12] outcomes. Chatbot prompt sounds like, build me a landing page for my new product. Short, vague, the model fills in the blanks however it wants. Hand that exact same line to an agent and you've just lit money on fire. The agent has tools, a loop, real autonomy. It'll spin up, start scaffolding

[17:30] files, install whatever framework it feels like, push a dark mode hero section you didn't ask for, and 10 minutes later hand you something you didn't want, but that cost real tokens to produce. The fix isn't a longer prompt. It's a structured one. A real agent prompt, what I call a prompt contract,

[17:48] has four sections: goal, constraints, format, failure. Memorize those four words. Every prompt you give an agent for the rest of your life should answer all four. Let me break them down. Goal: the outcome, not the action. The goal section answers one question. What does finished actually look like

[18:06] not the action? The outcome. Build a landing page is an action. Build a single-page landing site for my new product launch that pushes visitors toward one email sign-up above the fold, ready for

[18:18] me to review locally before deploying is an outcome. The first one is a wish. The second one tells the agent how to know when it's done. If your goal sentence doesn't include a clear finish line, the agent will invent one, and you won't like it. Constraints: the guardrails. Constraints are

[18:33] everything the agent is not allowed to do. This is the section that prevents disasters. Don't install new dependencies without asking. Don't touch any file outside the landing folder. No external

[18:45] CDN scripts. Don't deploy anywhere. Local files only. Don't pull copy from competitor sites. Constraints exist because an agent will happily do something that's technically inside the goal, but completely against the spirit of it. Every time an agent does something stupid you didn't

[19:01] anticipate, that becomes a permanent constraint in your next prompt. Your constraint list is your scar tissue. It only grows. Format: the exact shape of the output. Format is where most prompts fall apart. The agent might do the work perfectly and then dump it into a structure you can't actually

[19:18] use. Tell it the exact shape you want. Output a single index.html file with inline CSS. No JavaScript frameworks. Mobile responsive. Plus a brief.md alongside it that lists every section in order with the

[19:33] headline copy you used. Or output a landing folder containing index.html, styles.css, and assets. Nothing else. Be that specific. The format section is where you reach into the agent's head and

[19:48] decide the shape of the deliverable before it starts. If you can't describe the format in one sentence, you don't actually know what you want yet and you should not have hit enter. Failure. What to do when

[20:00] stuck? This is the section almost nobody writes and it's the one that saves the most tokens. What should the agent do when it gets stuck? Should it ask you a clarifying question? Should it stop and report? Should it make a best-guess assumption and flag it? Without instructions, agents default to the worst

[20:17] option. They keep trying, looping, burning tokens until something works or you kill it. One sentence fixes this: if you're missing information you need, stop and ask before continuing. Or: if a tool call fails

[20:29] twice in a row, stop and report what you tried. Define how the agent handles uncertainty or it will define it for you. Here's what a contract looks like end to end. Goal: build a single-page landing

[20:42] site for my SaaS launch next week, optimized to convert visitors into email sign-ups above the fold, ready for me to review locally before deploying. Constraints: single index.html file. Inline CSS.

[20:56] No JavaScript frameworks. No external CDN scripts. Light theme only. Mobile responsive. Don't touch any file outside the landing folder. Don't deploy anywhere. Format: landing folder containing an

[21:10] index.html plus a brief.md that lists every section in order with the headline copy used. Failure: if the target audience or core value prop is unclear from the project notes, stop and ask one consolidated

[21:24] question rather than guessing. That's a contract. Five sentences. The agent now knows the outcome, the rails, the deliverable and the escape hatch. Ten minutes later, you have a real landing page

[21:36] instead of a mystery. Try it on your next agent run. Pick a task, write the four sections before you hit enter. The first time you do it, it'll feel like overhead. By the third run, you'll wonder how

[21:48] you ever briefed an agent any other way. Description versus contract. That's the whole shift. Here's the moment most people give up on agents. They run a task. The agent does something dumb. They correct it. They run another task. Same dumb thing. Tenth time in a row, people assume the agent is broken. The agent

[22:04] isn't broken. You just never told it to remember. Every serious platform has the same feature with different file names. Claude Code reads Claude.md. OpenClaw reads agents.md. Anti-gravity has its own version. Same idea

[22:19] everywhere: a plain text file the agent reads at the start of every session before it touches anything. Whatever is in that file becomes a rule it follows forever. Drop it in the root of your project and you've just given your agent a long-term memory. A few weeks ago, I was building a customer dashboard

[22:36] with Claude Code. Every single run, the agent slipped emojis into customer-facing copy. Confirmation messages, error states, button labels, little smileys and rockets everywhere. I'd strip them out.

[22:49] Next session, they came back. So I just told the agent: create a Claude.md file in the root of this project and add one rule. Never use emojis in customer-facing copy unless I explicitly ask.

[23:02] This is a B2B product. It spun up the file, dropped the rule in, saved it. Took about 10 seconds. That bug never showed up again in any project. Ever. Now, layer on the real trick. Make the file

[23:15] self-modifying. Ask the agent to add this rule: before you finish any task, if I corrected you or you hit a bug from a wrong assumption, append a new rule to the learned rule section at the bottom of

[23:27] this file. Now, the agent updates its own memory. Session 1, you have one rule. Session 5, you have 20. Session 20, the agent rarely makes a preference mistake because it's been writing its own scar tissue

[23:40] the whole time. Three things to remember. First, the prompt contract: goal, constraints, format, failure. That's what turns an agent from an expensive chatbot into something that ships. Second,

[23:52] memory. One memory file in the root of your project, Claude.md, agents.md, whatever your platform uses and the agent stops making the same mistakes session after session. Third, the four platforms each win at something different. Pick the one that fits your work and go deep. Here's your action plan.

[24:08] Open one of the four platforms tonight. Pick a real task, not a toy. Write a prompt contract, add a memory file with three rules. Claude.md, agents.md, whatever fits your platform.

[24:20] Run it end to end. Stop reading about AI agents. Start running them. Your future self will thank you.