3 AI Agents You Can Build in Python
45sClear, actionable promise of building three production-ready AI agents hooks developers seeking practical skills.
▶ Play ClipThis video is a full course on building production-ready AI agents in Python using the open-source framework AgentSpan. It covers three increasingly complex agents: a simple conversational bot with memory, a RAG-based support agent with structured output and guardrails, and a multi-agent orchestrator for research tasks. The focus is on solving real-world production challenges like crash recovery, human-in-the-loop approvals, and observability.
The video outlines seven key features for production AI agents: durability, retries, human-in-the-loop, observability, long-running tasks, scaling, and testing.
AgentSpan is introduced as the framework used, which is free and open-source. It provides a server that handles state management, orchestration, and observability.
The server stores all agent state, allowing workers to reconnect and resume from crashes without losing progress. It also handles retries and human-in-the-loop approvals.
Installation is done via `pip install agent-span`. The server is started with `agent-span server start` and runs on port 6767 by default.
The first agent is a simple conversational agent. It is created by instantiating the `Agent` class with a name, model, and instructions. Tools and memory are added later.
Tools are created by defining a function with a `@tool` decorator. The function's docstring becomes the tool's description for the LLM.
Conversational memory is added using the `ConversationMemory` class. Messages can be added manually or automatically by passing the memory object to the agent.
Agent 2 is a RAG-based support agent. It uses a Pydantic model (`SupportResponse`) for structured output, ensuring predictable responses.
Guardrails are functions that run before (input) or after (output) the LLM to block malicious content. The video demonstrates an input guardrail for prompt injection detection.
Human-in-the-loop approval is implemented by setting `approval_required=True` on a tool. The worker can then use `handle.approve()` or `handle.reject()` to continue or stop.
Agent 3 is a multi-agent orchestrator. It supports strategies like sequential, parallel, and nested pipelines. The video shows a research team with parallel analysis followed by sequential writing and editing.
AgentSpan provides a testing framework to mock tool calls and verify agent behavior without hitting a real LLM, enabling fast and deterministic tests.
The durability feature is demonstrated by crashing a worker mid-task and then resuming it using the execution ID. The agent continues from where it left off without losing state.
For deployment, the video recommends using Docker Compose with PostgreSQL for persistent storage. The server supports basic auth for secure worker connections.
"The title accurately describes the video's content: building three distinct AI agents (conversational, RAG-based, and multi-agent orchestrator) using AgentSpan, with a focus on production readiness."
What are the seven features needed for a production-ready AI agent according to the video?
Durability, retries, human-in-the-loop, observability, long-running tasks, scaling, and testing.
1:57
What is the name of the open-source framework used in the video to build production AI agents?
AgentSpan.
2:47
How does AgentSpan achieve durability (crash recovery) for AI agents?
It stores all state on the server, so if a worker crashes, it can reconnect and resume from where it left off.
4:20
How do you create a tool for an AgentSpan agent?
By using the `@tool` decorator on a function and providing a docstring as the description.
22:05
What class is used to add conversational memory to an AgentSpan agent?
ConversationMemory.
25:43
How do you force an AgentSpan agent to return structured output (e.g., a Python object)?
By defining a Pydantic BaseModel and setting the `output_type` parameter of the agent to that model.
29:46
How do you implement a human-in-the-loop approval for a tool in AgentSpan?
By setting `approval_required=True` in the tool decorator and then using `handle.approve()` or `handle.reject()` in the stream.
43:20
What is a guardrail in the context of AgentSpan?
A guardrail is a function that runs before (input) or after (output) the LLM to audit or block certain content, like prompt injection attempts.
51:04
Name at least three multi-agent strategies supported by AgentSpan.
Sequential, parallel, handoff, router, swarm, round-robin, random, and manual.
59:17
How do you define a sequential pipeline of agents in AgentSpan?
By using the `>>` operator between agents (e.g., `agent1 >> agent2 >> agent3`).
60:54
How do you store credentials (like an API key) on the AgentSpan server?
Using the `agent-span credentials set <key> <value>` command.
68:43
How can you test an AgentSpan agent without making actual LLM API calls?
By mocking tool calls and results, and then using standard assertions to verify the agent's behavior.
71:04
Seven Pillars of Production AI Agents
Provides a clear checklist (durability, retries, human-in-the-loop, etc.) that separates demo agents from production systems.
1:57Human-in-the-Loop Approval for Refunds
Demonstrates a practical pattern for adding human oversight to critical actions, a must-have for financial or destructive operations.
43:20Guardrails for Prompt Injection Prevention
Shows how to implement a simple but effective input guardrail to block common jailbreak attempts before they reach the LLM.
51:04Durability: Crash Recovery Without State Loss
Illustrates the key benefit of AgentSpan: if a worker crashes mid-task, it can resume from the exact step it failed on, saving time and money.
72:55[00:00] In this video, I'll be going through a full course
[00:05] We're going to write every single line of code, and
[00:09] The first is going to be a simple conversational
[00:14] The second is going to be a rank based agent,
[00:18] from like a company database.
[00:19] And then the last agent
[00:23] where we actually have multiple AI agents running
[00:28] Now, this video is not designed
[00:30] but as long as you're familiar with Python,
[00:33] And we're going to be using a framework here
[00:36] But don't worry, it is free is open source.
[00:38] You won't need to pay for anything.
[00:39] You just need to have access
[00:42] So like OpenAI, anthropic, whatever.
[00:44] But we'll go over that in a minute.
[00:46] Okay, now this is really going to be focused
[00:51] So rather than just agents
[00:53] or that run in a demo environment, ones
[00:58] Now, in order to do that,
[01:01] that you have
[01:04] Now, first
[01:08] So maybe the network goes down, database freezes
[01:11] Your agent just gets killed.
[01:13] And that means that a lot of the work
[01:16] And that can be quite expensive over time.
[01:18] Next human in the loop.
[01:19] So maybe we need a user to approve a task,
[01:22] Or to press a button
[01:25] We're just unsure about that.
[01:26] Lastly or not.
[01:27] Lastly, but thirdly,
[01:31] A lot of times when you build these AI agents,
[01:34] So you need observability into the platform
[01:39] Where is it going wrong?
[01:42] And then obviously scaling a lot of times
[01:45] agent or something, it's not going to scale
[01:49] and you have to pretty much reinvent the wheel
[01:53] this infrastructure.
[01:54] When really you want to focus on
[01:57] So there's seven things that you need.
[01:59] If you want to have an AI agent
[02:02] I'm gonna quickly going to go through them here.
[02:04] Now first durability.
[02:05] That means that if the agent crashes it can recover.
[02:08] And it doesn't need
[02:11] So sometimes the step will fail.
[02:13] That doesn't
[02:15] We should retry it multiple times.
[02:18] Human in the loop. Again.
[02:19] Sometimes we need to delegate a task back to a human
[02:23] Do you want to issue the refund?
[02:24] You want to delete this file x y, z, right?
[02:27] Observability.
[02:28] Like I talked about, we need to be able to actually
[02:33] If agents take 2030 two hours to run,
[02:37] and then scale and testing,
[02:40] Okay, so in order to accomplish what I just discussed
[02:43] get these seven features for our
[02:47] a framework called Agent Span, which comes from Orx
[02:51] And don't worry, this is free.
[02:52] You don't need to pay for anything.
[02:55] I want to quickly just show you what it looks like
[02:58] because this is the benefit of using a platform
[03:02] essentially gives us a server
[03:06] and kind of track the progress of the multiple
[03:10] So you can just see a few quick
[03:12] This is the server running on my own computer.
[03:14] You don't need to build this.
[03:17] And for any given AI agent,
[03:22] You can see a full log of everything
[03:25] And you can see this in real time.
[03:27] So in this case we had a multi-agent system.
[03:29] And I can click into one of these agents
[03:33] or actually go into the execution
[03:38] So this is the observability that I'm talking about.
[03:40] What this also does is allow us to scale the agents
[03:45] for all of them running, and then to retry tasks.
[03:48] For example, if we go here and we scroll down,
[03:52] that were running.
[03:53] And we can go through every single
[03:55] turn of the agent and see everything
[03:59] The reason it stopped the duration, all of that
[04:03] later.
[04:03] But effectively this is the backend infrastructure
[04:08] And each of these agents that you see here was me
[04:12] that connected to this server,
[04:15] and the orchestration, but allowed all of the code
[04:20] So from our local machine, from our server, whatever.
[04:23] But if there was a crash, for example,
[04:25] we could recover from that crash
[04:28] So we could just reconnect, restart
[04:31] And it's not a big deal.
[04:32] And this task can run for as long as it needs to.
[04:34] So anyways, that's the basics on Agent Span.
[04:36] They also have their own Python framework
[04:40] you also can connect them to Lang graph, the OpenAI
[04:45] If you just want to use their orchestration layer
[04:49] about now, in terms of the kind of architecture here,
[04:53] This is pretty much what it looks like.
[04:55] We have a worker.
[04:56] The worker is what we're going to write ourselves.
[04:58] We have the agent span server.
[04:59] This is already provided to us. Again
[05:02] We can run it ourselves.
[05:04] And from here, this keeps track of all of the state.
[05:06] The history allows us to retry, handle human
[05:11] It just handled for us.
[05:12] So from the worker side, we pretty much
[05:16] We're going to connect to this server.
[05:17] All of the rest of the code stays exactly the same.
[05:20] The server handles all of that
[05:24] And then of course we have an LM.
[05:25] We can use any LM that we want.
[05:27] So bring OpenAI cloud whatever.
[05:29] And that's essentially how it works.
[05:30] So anyways that is the brief.
[05:32] That's
[05:34] What I want to do now is hop over to the code editor.
[05:36] We're going to start
[05:39] And then from there we're going to build out
[05:40] three unique AI agents again, starting easy
[05:45] So you get a sense of how to actually build these.
[05:47] And again how they work in production,
[05:50] because at the end of this video,
[05:53] this up by just deploying the server
[05:56] And you're good. That's it.
[05:58] Because of the way that we built it, as opposed to
[06:02] Anyways, let's dive in.
[06:03] All right.
[06:04] So now we're going to get started
[06:07] Now I'm just on the documentation.
[06:08] I'll leave a link to it in the description.
[06:10] It's actually very good. So you can follow along.
[06:12] And a lot of the stuff that you see in this video
[06:16] Now first things first we need to install Agent Span
[06:20] Once we have that installed
[06:24] which is our worker code
[06:27] Now notice that we can simply install it
[06:31] This of course requires
[06:32] that we have Python installed on our computer,
[06:35] So my case I'm going to be using cursor.
[06:37] You can use any editor that you want for this video.
[06:40] Now notice that what I've done in cursor
[06:43] So I just one file I want open folder.
[06:46] And I just selected one that was on my desktop.
[06:48] Just made a new one called AI Agent Tutorial.
[06:50] From here I've opened up the terminal
[06:54] Don't let me zoom in a little bit
[06:56] And this is just going to make a new UV project
[07:01] So you notice
[07:04] So from here
[07:08] And then it should add it to our environment for us.
[07:10] And install everything that we need.
[07:12] Now you don't have to use UV but I prefer to use UV.
[07:15] So that's what I'm going to do.
[07:16] Now I'm just going to delete this main.py file
[07:20] So now if we go to the Pi project tunnel
[07:24] Okay.
[07:25] Now the next thing that we need to do is set our API
[07:30] is that it
[07:34] So any environment variable that you need to use,
[07:39] You can have it stored on the server
[07:42] So I can actually put in OpenAI
[07:46] I want to use directly, where I'm running my server,
[07:51] So what we're going to do
[07:54] I'm going to use OpenAI,
[07:57] And what I'm doing is
[08:02] This is going to let me make a new API key.
[08:04] You will need an account here.
[08:06] But it's very cheap.
[08:07] We're talking about, you know, maybe sense of spend
[08:11] And I'm going to go create API key.
[08:13] And I'm just going to call this agent span.
[08:16] And then maybe tutorial or something.
[08:19] Okay I'm going to make the key.
[08:20] And obviously you don't want to leak this to anyone.
[08:22] So I will delete it afterwards.
[08:24] Okay. So from here
[08:26] And we're going to type the command
[08:29] So let's go back. Export OpenAI API key is equal to.
[08:33] And then the key.
[08:33] So we're going to export OpenAI underscore API.
[08:36] Underscore key is equal to.
[08:37] And then we're going to paste the key inside of here.
[08:40] And then we're going to press enter.
[08:42] Now this should put it
[08:44] Which means that any command that we run after
[08:49] So make sure that if you're going to run the server
[08:54] There's other ways to avoid doing that.
[08:56] But for now this is the easiest where you just have
[09:01] Okay before you run it.
[09:02] Now if you are on windows,
[09:06] And if you're using something like cursor,
[09:10] equivalent command to and then paste the export,
[09:15] you know, OpenAI whatever for PowerShell.
[09:19] And it should tell you
[09:21] So I'm not going to guess.
[09:22] But you can just use an AI model
[09:25] So now that it's exported, what we're going
[09:29] So we can just directly run the agent spawn server,
[09:32] Agent Spin doctor
[09:35] Now because I'm using UV,
[09:39] You've run an agent spin doctor.
[09:41] If you're not
[09:44] you should be able to just run the agent
[09:46] So from here I'm going to press enter
[09:50] And it looks like all is good.
[09:51] It says okay OpenAI is set, Java is installed.
[09:54] We have enough disk space. The server jars cache.
[09:56] That's because I've installed this previously.
[09:59] If you didn't install this previously,
[10:02] And if that's the case, you may need to install,
[10:07] Now if you don't know how to install it again,
[10:08] ask the Lem to ask something like cursor
[10:12] And it should give you the command.
[10:14] Okay, so now that that's running we're going to type.
[10:16] You've run agent spin server start okay.
[10:21] Now this is the command to start the server.
[10:23] So we're going to go ahead and run that.
[10:26] And you can see that
[10:28] okay let me stop the server
[10:31] So to stop it we're just going to go stop okay.
[10:33] And then I'm just going to restart it from here.
[10:35] So let's give it a second.
[10:36] And it says it's running on port 6767.
[10:39] We're just going to wait a minute.
[10:42] So now if we want to test if the server is working
[10:46] We can go to our web browser and just paste it.
[10:49] And we should be able to see the agent spans server
[10:51] So from here you'll see the agent spans server.
[10:54] There's a bunch of stuff you can look through.
[10:55] But generally you're just going to be looking
[10:58] And it's going to show you a history
[11:01] Now obviously you won't see anything
[11:04] seeing previous executions
[11:07] Okay.
[11:07] So we're going to have a look at this later
[11:08] because it will make more sense
[11:11] But for now, let's go back to our project here
[11:15] installing a few last things that we need.
[11:16] And then we can create our first AI agent.
[11:19] Okay.
[11:19] So I'm going to write clear
[11:22] I'm going to add a few dependencies that we need.
[11:24] And if you're not using UV you can just use Pip
[11:28] Now first we're just going to bring in Python
[11:33] And we're also going to bring in pedantic.
[11:35] And then lastly fire crawl
[11:39] dash pi which we're going to use for the last agent
[11:41] So go ahead and press on enter.
[11:44] And we should see that
[11:46] So that's all we're going to need
[11:49] What I'm going to do now is just make a new folder.
[11:51] And I'm going to call this agents now instead of
[11:56] And I'm just going to call this agent 1.py.
[11:59] And this is where we're going to start
[12:01] now, our first agent, it's just going to be
[12:05] All that means that we're just going to talk to it
[12:08] And the one thing that we're going to add
[12:11] to know what our current time is,
[12:15] We're also going to add memory so that anything
[12:20] because by default, if you don't add memory,
[12:23] It says Hey Tim.
[12:24] And then the next conversation or the next time
[12:28] because it's not storing the previous responses.
[12:30] Okay, so that's the goal here.
[12:31] And this is just to show you
[12:33] And then we'll go into building some stuff
[12:36] So we're going to start by importing logging.
[12:38] This is because there's a lot of logs
[12:41] And we want to probably suppress some of them.
[12:43] So we don't see too much in the terminal.
[12:45] We're then going to save from date time import
[12:50] We're then going to go from dot env import
[12:54] And we're just going to use local env
[12:58] that we're going to need in a second.
[13:00] Next we are going to say from agent Span.
[13:03] And this is going to be Dot agents.
[13:05] Make sure that you put
[13:09] the agent runtime okay.
[13:12] And runtime is with the lowercase there.
[13:14] And then conversation memory run and tool okay.
[13:19] So this is all we're going to need for now for this
[13:22] Let me just close this.
[13:23] You guys can see it a little bit bigger okay.
[13:25] Next lines. We're going to locate Env.
[13:28] What this is going to do is load
[13:31] And in fact while we're here we're just going to make
[13:37] So dot env and we are going to put inside of here
[13:42] Now this variable is the agent span underscore server
[13:48] And for now this is going to be equal to Http colon
[13:54] port 6767 slash API.
[13:57] Now let's make sure we spelled this correct
[14:01] But this is local host like so now.
[14:04] And let's add the extra slash okay.
[14:07] So this is where the agent spends
[14:10] Again we're running it on our own computer.
[14:12] So we just put in this URL
[14:15] If the agent spent server was running
[14:19] then of course we would change this
[14:22] hosted somewhere else
[14:25] That's possible.
[14:26] You also can have the workers
[14:30] It's completely up to how you want it deployed.
[14:32] But this is what allows you to specify, hey,
[14:35] Okay. So next we're going to go back to agent one.
[14:37] We've now loaded the dot env.
[14:39] And because we've loaded that agent spin
[14:43] And it will know that it needs to communicate
[14:47] Now next what we're going to do is just say logging
[14:50] And we're just going to set the level.
[14:52] So we're going to say
[14:57] Just so we only show warnings that we don't show
[14:59] all of the logs that are probably going
[15:03] We're then going to say logging dot get logger
[15:09] So let's get it like that.
[15:10] And we're going to set the level to warning as well.
[15:13] And then next we're going to put not agent span
[15:18] We're going to set the level to warning
[15:20] just so that we don't accidentally get
[15:24] All right.
[15:24] So next what we're going to do
[15:28] So to make an agent is super easy.
[15:30] We're just going to say assistant is equal to agent.
[15:33] And then inside of here
[15:35] And this is what's going to show up in Agent Spence.
[15:37] We can see it. So we're going to call this personal
[15:40] assistant like that.
[15:41] Perfect. Next we need to specify the model.
[15:44] Now because we're using OpenAI
[15:47] And we'll be able to connect to it.
[15:48] And use it because we have that API key set.
[15:51] If we wanted to use an anthropic model,
[15:55] we would have needed to declare an anthropic API
[15:59] key, or whatever
[16:01] If we go back here, you can see that
[16:04] We could have exported one of these.
[16:05] So based on the one that we export
[16:09] It gives you the different options.
[16:11] You can specify the model that you want to use okay.
[16:13] So we're going to go back here and we're going to
[16:20] It's a little bit expensive.
[16:21] If you just want a cheaper one.
[16:22] You can do GPT four or GPT for a mini.
[16:26] And that's going to give you a really cheap model
[16:28] This one still will not be expensive based on
[16:33] Next, we're going to pass some instructions.
[16:35] Now the instructions
[16:37] I can separate them out with some quotation
[16:41] And this is the system prompt.
[16:42] This is what's going to be read
[16:45] So it understands how it should actually behave.
[16:47] So we can say something like,
[16:52] Use tools when they help because we're going
[16:57] And then down here
[17:02] okay user details across terms okay.
[17:06] Cool.
[17:07] So that's our instructions.
[17:08] Now beneath this we're going to provide some tools.
[17:10] For now the tool list is going to be empty.
[17:12] And then after this
[17:15] But we'll just add those later.
[17:16] So for now we just have the basic agent.
[17:18] Next thing we're going to do is just run the agent.
[17:21] So to run the agent we're going to say
[17:24] underscore equals
[17:27] This is just the main entry point in our application.
[17:29] If you're unfamiliar with what
[17:30] this does essentially just checks
[17:34] We're just going to do a print statement.
[17:35] And we're going to say starting
[17:39] And then down here
[17:44] okay as runtime.
[17:47] And then we're just going to go into a simple
[17:50] asking the agent questions until we type quit.
[17:53] Okay.
[17:54] So we have our width.
[17:54] This is how you start the runtime for the agent.
[17:56] We're now going to say, while true.
[17:59] And then here we're going to say
[18:02] And that's going to be you dot strip
[18:06] We're going to say if the prompt dot lower
[18:11] okay is equal to q.
[18:14] So if you type the letter q
[18:18] We're going to say if not prompt
[18:20] then we're going to continue
[18:22] so that if you don't type anything at all
[18:25] Now down here, but still inside of the while loop
[18:28] We're going to say the result is equal to run.
[18:32] And we're going to run the assistant.
[18:35] We're going to pass our prompt.
[18:37] And we're going to say
[18:40] And that's it.
[18:43] So for now, what we can do is we can just say print
[18:46] and we can put an F string,
[18:50] And then we can just put inside of a set of braces,
[18:53] What is this result? Okay.
[18:55] And this is going to give us kind of
[18:58] but at least for right now,
[19:01] So let me zoom out a little bit
[19:04] Essentially what we've done is
[19:07] We've set up the Env so we can connect to the server.
[19:09] We have the assistant.
[19:12] It's just a super basic assistant.
[19:13] And we set up a while loop
[19:16] And if we go here
[19:19] I believe I didn't shut it down,
[19:22] Yes, looks like it is.
[19:23] So make sure that the Agent Span server
[19:27] And then what we can do is from the root
[19:30] You've run and then agents slash agent 1.py.
[19:35] Now notice that I'm doing this
[19:38] So I'm doing kind of the path to this file
[19:42] So we're going to pick up the env file
[19:47] That's a starting agent.
[19:48] You can see we have initialized.
[19:49] We've connected to the server.
[19:50] And now we can type something like hello World.
[19:53] And we give it a second here okay.
[19:55] And let's see if we get the response.
[19:57] And it says hey it was completed.
[19:59] And we get this agent result here where
[20:04] So we can see everything ran.
[20:06] And then if we come back
[20:10] You see personal assistant just ran.
[20:12] And if we click into this you can see our prompt
[20:15] Hello world.
[20:16] We can see the output of the model was Hello world
[20:19] That's right from the learn.
[20:20] And then we can see the immediate output at the end
[20:23] Was this right work we got with Hello world. Cool.
[20:26] So that's kind of the benefit
[20:29] We have full insight into
[20:32] And of course this is just a very basic one.
[20:34] Now what I'm going to do is just type Q
[20:36] and let's make it so that we can kind of view
[20:40] So rather than just printing out the result object
[20:45] So what I'm going to do is say so
[20:49] And then I think we can just put in single quotes
[20:54] Otherwise it's going to interfere with the F string.
[20:56] And let's just try this one more time
[20:59] So let's go. You've run. Let's go. Hello.
[21:02] And let's see what we get this time
[21:05] it says agent result has no attribute yet.
[21:08] Okay. Interesting.
[21:09] So I think we can do result in dot output dot
[21:16] Let's just try it.
[21:16] I'm just doing this off the top of my head
[21:21] And let's see now if we get the correct response
[21:26] And there we go.
[21:26] We get hello, how can I help you
[21:30] Whatever. And it's not going to know the answer.
[21:31] But the point is that this is now functioning.
[21:33] I don't know your name.
[21:34] If you want to tell me, I'll remember for later.
[21:36] All right, so this is great.
[21:38] However, like I mentioned,
[21:43] so anything that I chat with
[21:44] the agent is not going to remember later on,
[21:47] So what I want to do now is
[21:50] These are things that the agent will be able
[21:54] And then we're going to add memory.
[21:55] So to add a tool is super simple.
[21:57] What we can do is we can just make a function
[22:02] And then what we're going to do is just return
[22:05] Now it's important that what we write these tools,
[22:09] and the input and output format.
[22:11] So that agent span can automatically convert
[22:15] So for example I'm going to say okay, the get
[22:20] And if it was going to take some input here
[22:24] you know input and then whatever type the input was.
[22:27] And then beneath this importantly
[22:30] which is just a comment at the top of the function
[22:34] the current local time, okay.
[22:38] And then you can see that we have datetime dot now.
[22:40] And then we just convert this into a string
[22:43] Now this is great, but if I want to turn this
[22:48] Now what is a tool?
[22:49] A tool is something that I can call to get
[22:55] So right now
[22:57] It can't actually do anything.
[22:58] It's just capable of essentially,
[23:02] Or giving us a text response
[23:06] and generate a report or search for something,
[23:10] Now, agent spend
[23:13] So all we have to do is just define a function.
[23:16] We specify it's a tool using this add tool decorator.
[23:19] Right. Like we specified here.
[23:20] Then the name of the tool will automatically
[23:23] So make sure you name the function something useful.
[23:26] The input and output type you'll specify,
[23:28] and then the description of the tool
[23:31] So what will happen is Agent Span will now say hey
[23:34] You know get current time right.
[23:37] The description of this is whatever
[23:40] And it takes no input and gives this output.
[23:42] And then that will be passed to the assistant.
[23:45] And the assistant will essentially give us a response
[23:49] And then inside of this runtime
[23:53] call the tool for us
[23:56] And we'll be able to see this happening
[24:00] So for now,
[24:03] And the model,
[24:03] if we run it again should be able to call this
[24:08] So let me.
[24:08] That's not what I meant to do.
[24:10] Let me open up the terminal and run this again.
[24:12] I'm gonna say, what time is it?
[24:16] Okay.
[24:16] And let's see if we get the time. Here.
[24:20] Give this second
[24:22] and hopefully it's going to call that
[24:26] Okay. And you can see it says that it's this time.
[24:28] And if I look at my window here
[24:31] 1947 and two seconds.
[24:33] But now if we go back to the server and we refresh,
[24:38] And we can see now that actually it called a tool.
[24:41] So the yellow line gave some output.
[24:43] The output effectively said
[24:46] The name of that tool is Get current time.
[24:48] Okay.
[24:49] So then we called the tool.
[24:51] We got the input which was this.
[24:53] We got the output which was the result right.
[24:55] And then we pass it to the model.
[24:57] Now the model now has access to that tool call.
[24:59] So it knows what the time was. Right.
[25:02] And that gives us the output.
[25:03] Boom. Here's the time.
[25:04] So that's one of the reasons
[25:07] that full insight
[25:10] Now let me say, what time was it
[25:13] last time I asked you just to show you something?
[25:18] And you should see here
[25:21] Yeah, it says I cannot.
[25:23] I don't have access to timestamps
[25:24] to your previous messages in the chat
[25:27] So essentially what is telling us is that, hey,
[25:31] So the next step here is to add memory to the agent.
[25:34] Now adding memory is super easy.
[25:37] All we have to do here is just go above our agent
[25:40] and we're going to say conversation underscore memory
[25:43] is equal to conversation memory like so.
[25:48] Then inside of here we can also put
[25:52] So I can say
[25:55] So after 50 it will start
[25:58] So we don't clog up the context too much.
[26:01] And then what I can do is just say
[26:04] Memory. Boom.
[26:06] So that's that.
[26:07] So what we can do now is let's open this up.
[26:10] Let's type clear okay.
[26:12] And let's go. You've run and let's do something.
[26:15] My name is Tim okay.
[26:17] And let's see if it can remember that okay.
[26:18] So it says nice to meet you I'm going to say
[26:22] And let's see if I can remember this
[26:26] And it doesn't remember it because I made one mistake
[26:30] So let's do that.
[26:31] Now that's actually a good issue to run into.
[26:34] Okay. So we've created the conversation memory.
[26:36] We've added it to the agent
[26:40] So what we need to do is we need to add what we type
[26:46] So the way we do this is we're just going to go here
[26:51] And we're going to say conversation memory dot add
[26:54] underscore user underscore message.
[26:56] And this is going to be the message that we sent
[26:59] We're then going to say conversation memory dot.
[27:01] And this is going to be add assistant message.
[27:03] And we're going to add the results output dot get
[27:06] And just to make this a little bit cleaner
[27:12] And then we can just replace this
[27:15] And then same thing here with the readable result
[27:19] So essentially what we're doing is saying
[27:22] The memory has a few different functions we can call.
[27:24] One is to add a user message, which is what we said.
[27:26] And then one is to an assistant message
[27:30] So let's save this.
[27:31] And now let's go again to our terminal.
[27:34] Let's make sure I didn't mess something up.
[27:36] Let's clear okay.
[27:37] So let's run it again and let's see what we get.
[27:40] This time I'm going to say my name is Tim okay.
[27:43] And let's see here.
[27:44] Give it a second. Say what is my name.
[27:48] And hopefully it's going to give us the answer
[27:52] Let's see I don't know your name yet okay.
[27:55] What's what I'm going to say.
[27:57] My name is Tim. Let's try one more time.
[27:59] I think sometimes on the first run,
[28:02] Based on kind of how we're adding this. Maybe.
[28:05] Let's see. Okay.
[28:06] What is my name?
[28:08] And let's see now, here we go.
[28:10] So for some reason on the first run, I think based on
[28:16] I'm not sure exactly why that was the case,
[28:19] now it looks like it's working
[28:22] It also might
[28:24] Either way, looks like we are good
[28:28] Okay, so anyways, the memory is functioning
[28:31] let's move on to our next agent,
[28:36] Okay.
[28:36] So as discussed we're now moving on to agent two.
[28:39] Agent two is going to be kind of a rag based agent,
[28:43] to look up some info in something
[28:47] I'm not going to build true rag here
[28:51] But of course, you could very easily
[28:53] What we're going to do is add some more tools,
[28:57] and we're just going to look at a much more complex
[29:02] Now we're also going to look at a pedantic structured
[29:05] Now, what that means
[29:09] of a random string of text,
[29:13] so that it's predictable
[29:17] So as you can see here,
[29:19] Any of this code will be available
[29:22] If you just want to copy it,
[29:24] But I just want to save us a bit of time.
[29:26] So I just did the import.
[29:27] So you can see
[29:30] And then we have the mock database documentation
[29:33] and then the logging setup
[29:37] Okay.
[29:37] So if we go down here
[29:40] is I'm actually going to define what I'm going
[29:46] Now this is how I want the agent
[29:49] So rather than just giving me some random text
[29:54] I want it to give me something
[29:57] that I can then convert into a Python object
[30:01] You're going to see what I mean,
[30:05] response like this.
[30:07] And then this is going to inherit from base model
[30:10] if I can type it correctly,
[30:15] Now pedantic allows us to just do typing in Python.
[30:17] It I use with a lot of these AI frameworks.
[30:20] So first things first, I'm
[30:23] to actually give me output
[30:27] The first is a stage,
[30:32] And then I can actually just give a description
[30:36] And I'm going to say stage like answered okay.
[30:40] So answered refunded or rejected
[30:46] because this is going to be related
[30:48] because we're setting up kind of like a support
[30:52] Next we're going to have successful boolean.
[30:56] And this is just going to be a boolean.
[30:57] So we know if it's successful or not.
[30:59] And then we're gonna have a message
[31:01] Now this is a super basic structured response.
[31:03] But if we wanted it to give us like a price
[31:07] then we just set that as a field and the model
[31:11] So now it's going to give us always a stage
[31:15] And it will give us whether it was successful
[31:18] And if we had other types, we could set those here.
[31:20] We could set up enums, we could do anything you want.
[31:22] I'm just trying to show you that
[31:25] which is really powerful for more deterministic
[31:30] Okay.
[31:31] Now, before we build the model, before
[31:35] So the first tool is going to be one
[31:39] So I'm going to say define search
[31:42] knowledge underscore base.
[31:44] And we're going to take in a query which is a string.
[31:48] And we're going to have a string which is a response.
[31:52] Now for the description of this tool.
[31:54] Let's put it in here.
[31:55] We're going to say search support
[32:01] docs like this okay.
[32:03] Pretty basic.
[32:04] We're just saying hey
[32:07] Let's fix the comment.
[32:09] We're going to say for title and then body in
[32:13] docs dot items, we're going to say if the title
[32:20] is in query dot lower,
[32:24] then what we're going to do is just return the body.
[32:27] This is not an efficient search by any means,
[32:31] is just a quick keyword search of any keyword
[32:35] So like shipping refund policy, account,
[32:36] whatever, then we're just going to return
[32:41] Again, if you use real rag
[32:43] But I'm just showing you how you can set this up.
[32:45] So next we're going to say return no matching
[32:48] support articles found okay.
[32:51] So if there's not a response.
[32:52] So that's the first tool we're going to have next
[32:56] to looking up some orders okay.
[32:58] So we're going to have a tool.
[32:59] This is going to be define lookup underscore order.
[33:03] So let's do this for the order.
[33:06] We're going to have an order ID which is a string.
[33:09] And this is going to return a dictionary
[33:13] Now same thing.
[33:13] We're going to have the comment
[33:17] Pretty basic. And we're going to say by ID.
[33:20] And then what we're going to do is return
[33:23] the mock underscore db and then the orders like this.
[33:29] And then we're going to say dot get.
[33:31] And then this is going to be the order ID.
[33:34] And if the order ID is not found
[33:37] And the dictionary is just going to say error order
[33:42] okay. So that's going to be our lookup order tool.
[33:44] And then we are going to have a few other tools
[33:47] We'll look at those later.
[33:48] So we're going to have one tool
[33:51] However before we call that tool
[33:55] because we don't want to just automatically refund
[34:00] Okay, so for now, let's create the support agent.
[34:01] Let's run it with the two tools so far
[34:05] And then we'll move on to the rest.
[34:07] So we're going to say support
[34:11] For the agent we're going to say the name is support
[34:17] We're going to go with
[34:19] OpenAI slash GPT five for.
[34:23] Then we're going to have some instructions.
[34:25] I'm just going to copy these in because
[34:29] But you'll see kind of how I've written them here.
[34:31] So we're going to say instructions.
[34:32] And then I'm just going to copy
[34:35] So give me one second which looks like this.
[34:39] So you are a customer support agent.
[34:41] Use the knowledge base first
[34:44] When you know the order ID, call the lookup order
[34:47] Process refund.
[34:48] Very short plain English sentence describing
[34:53] Okay, so this is the instructions.
[34:56] Next we're going to specify the output type.
[34:59] So this is a new one.
[35:00] And we're going to say output
[35:03] Now what this allows us to do
[35:06] So like one that we have right here.
[35:08] And that's now going to force the model
[35:12] So that's all we need to do.
[35:13] We just say hey we want it in this format
[35:16] What you're going to see in a second.
[35:18] Now next we're going to specify the tools.
[35:20] So the tools are just going to be
[35:22] And then the lookup order
[35:25] So we'll keep that for right now.
[35:27] And then we can specify
[35:32] Now what this will do is specify the number of times
[35:36] until we reset the session.
[35:37] The for the conversation memory.
[35:38] Here we can actually just specify it like this.
[35:41] And when we do that it should automatically add
[35:46] Now, the reason I didn't do it here is just because
[35:50] the memory, but by default it will automatically add
[35:55] When you. Oops, that's not what I wanted.
[35:57] Specify it like this.
[35:58] Okay, so actually, if I go to the docs, you can see
[36:03] And then there's five methods or six
[36:07] system, message tool, call tool result.
[36:08] So if you don't manually add it like we did,
[36:12] all of these for you automatically,
[36:15] Right.
[36:15] But sometimes you want to just add certain pieces
[36:19] as we did in example one. Okay.
[36:21] But for now we have our memory and we have our agent.
[36:24] So now we need to be able to run the agent.
[36:26] So what I'm going to do is create a function.
[36:28] This is going to be called Run Interactive okay.
[36:32] For this let's spell interactive correctly.
[36:35] We're going to take in a prompt which is a string.
[36:37] And we're just going to return nothing or none okay.
[36:41] From here we're going to say with the agent runtime
[36:46] what we're going to do is we're going to say
[36:49] To start.
[36:50] This is a different function.
[36:52] And this is going to be support agent prompt.
[36:55] And then run time is equal to run time.
[36:57] Now the reason we're doing this is that
[37:01] And we want to actually be able to hook into
[37:05] if it needs approval from us, if there's a guardrail
[37:09] and this gives us just a little bit more control
[37:13] So will allow us to actually stop approval request
[37:18] So what we're going to do now is we're going to say
[37:23] And before I go any further,
[37:24] let me just refer to the docs
[37:28] So if we go to this streaming page here, I'm doing it
[37:31] But you can see that we can actually hook
[37:33] into the stream of the agent, which allows us to see
[37:37] So we can see, for example,
[37:40] if there's a result, if there's a handoff
[37:43] So if it's waiting, what that means
[37:45] is that it's waiting for us to approve something
[37:49] So this allows us to have some more control
[37:51] into what the agent's doing,
[37:54] We can see all of the steps in the meantime,
[37:58] Again. You can reference the docs
[38:01] So we're going to say order underscore ID comma
[38:03] amount is equal to none and none
[38:09] if the user wants to refund an order,
[38:12] And then what we're going to do
[38:16] For now, I'm just going to pass.
[38:17] But this will allow us to actually view
[38:21] that's going on before eventually we get a result.
[38:24] Now we'll handle that in a second.
[38:25] But for now, what I'm going to do is just go
[38:29] dot, get, underscore result.
[38:32] This will then give us the result.
[38:33] Once all of these events are finished
[38:38] output is equal to result dot output, okay.
[38:44] Then we can actually just tack on message here.
[38:46] The reason for that is that we know that it should be
[38:51] So we get results dot output.
[38:52] And then the output is going to be this.
[38:54] We know that there's going to be a message.
[38:55] So we can simply just view that okay.
[38:58] Then we need to do is just print out the message.
[39:00] So we're just going to say print.
[39:02] And we can put an F string
[39:05] And I'm just going to put the
[39:08] sorry let's put output and then backslash n okay.
[39:11] So we'll start running this in one second.
[39:12] In order to do that we're just going to do
[39:16] is equal to underscore underscore main underscore
[39:21] and we can print support bot starting dot dot dot.
[39:26] Then we can go down here
[39:29] It actually knows exactly what I want okay.
[39:32] So we're going to say well true.
[39:34] The prompt is you if you enter Q then break.
[39:36] If there's not a prompt then just continue
[39:40] interactive, which is this function
[39:43] Now like I said,
[39:45] but for now,
[39:47] or look through the knowledge base
[39:50] Okay. So let's simply open this up.
[39:53] Let's go.
[39:53] You've run agents slash agent 2.py.
[39:58] And we got an issue here.
[40:01] Instructions
[40:04] So let's just fix that instructions like so okay.
[40:09] Let's run it again.
[40:10] And it says you I'm going to say
[40:17] And let's see if it can look that up.
[40:18] And if we get the result.
[40:21] Okay.
[40:21] This dictionary object has no attribute message.
[40:24] Interesting.
[40:26] Let's have a look at why we're getting that.
[40:28] And it's going to be something to do with this.
[40:30] So for now let's just print whatever this result is.
[40:35] Let's see what it is.
[40:36] And then we can parse through it.
[40:37] So let's say shipping or something.
[40:40] And let's see if it finds anything.
[40:41] And it gives us okay
[40:45] And reason model GPT 5.4 does not exist
[40:49] We found the issue there.
[40:50] So OpenAI slash GPT and think this is Dash 5.4.
[40:54] Let's look at what we had in the first agent.
[40:56] Yeah dash 5.4 okay silly error.
[40:58] But at least it gives us the response there.
[41:00] And now we can just quit this
[41:04] rerun.
[41:05] And let's see shipping
[41:08] and let's see what we get if it works this time okay.
[41:10] So I'm just playing around with this
[41:12] And we can see that the way we can do that is
[41:18] And that will
[41:22] However, it doesn't give us
[41:24] It gives us two.
[41:25] It gives it to a string and a dictionary
[41:29] which is still effectively the exact same thing.
[41:31] So you can see we get stage completed.
[41:33] Successful true message standard shipping
[41:38] Now let's ask it can you look up my order
[41:44] What's the order ID a 100. So let's type
[41:48] Need order ID successful false message.
[41:49] Sure. Please send me your ID.
[41:51] So we're going to say a 100.
[41:53] And so let's see if it can look that up. Now
[41:57] give it a second here to give us that response
[42:00] okay.
[42:01] Come on I hope it's calling the tool
[42:04] being a little bit slow and says refund pending info
[42:07] Message I can help with the refund,
[42:11] Eight 100 was found for 4999.
[42:13] If you want to refund for this
[42:15] Okay cool.
[42:16] So it looks it up.
[42:17] We get the information
[42:21] and we refresh, we can see
[42:25] And we can actually see all of the logs
[42:28] as well as the debug view here on exactly
[42:32] Anyways, let's go back to the most recent one there
[42:36] We have multiple turns
[42:38] So you can see we typed in a 100
[42:42] This was the input.
[42:44] This was the output. It got us the information.
[42:46] And then it gave us that full Json for the tool call
[42:49] went to the yellow and then gave us the output.
[42:52] Now you'll notice that this is just one run.
[42:54] If we go back.
[42:56] All right.
[42:57] You can see this was the other run.
[42:59] And then it's remembering
[43:02] Okay cool.
[43:03] So that's functioning now let's move on to add a few
[43:06] other things to our agent.
[43:09] So one thing that I want to add
[43:13] But like I said, we shouldn't just refund
[43:18] So in order to do this,
[43:20] This can be a tool,
[43:26] Approval underscore required is equal to true.
[43:30] Now this means that we need to manually approve this
[43:34] I'm going to show you how we do that.
[43:35] Now for the function
[43:37] We're going to take an order ID and we're going
[43:41] And then we're going to return, not a boolean
[43:46] Now we need to give a description.
[43:48] So for the description we're going to say
[43:51] let's go like this.
[43:52] Request a refund
[43:56] okay.
[43:56] Refund pause for human approval.
[44:01] Think before you run this
[44:04] okay cool.
[44:05] Just so it knows that
[44:09] then we're just going to return, even though
[44:12] We're just going to say refunded.
[44:14] And we'll put inside of brackets amount
[44:18] colon dot to f okay.
[44:20] For order.
[44:22] Order ID we're kind of faking a refund,
[44:26] that requires the human approval,
[44:29] So now that we have that tool,
[44:32] So we're going to say process refund.
[44:34] Now the thing is we need to start handling this
[44:39] So what I can do is the following.
[44:41] Now I can say if event dot type
[44:44] is equal to and then this is event type
[44:48] dot tool okay tool underscore call like so.
[44:52] And event dot args meaning it has some arguments.
[44:56] I'm going to say my order id is equal to event args.
[45:01] Yet order id or
[45:05] like this or order underscore d
[45:10] or order underscore id.
[45:11] Now what I'm effectively saying is hey I'm
[45:15] if we ever call an order ID when we're looking up
[45:19] because that's in the order I do referencing
[45:24] Okay, so I'm just pulling out that order ID,
[45:27] otherwise I'm going to say if event dot
[45:29] type is equal to event dot
[45:33] or event type dot tool underscore result okay.
[45:38] And is
[45:41] instance event dot result a dictionary,
[45:46] then what I'm going to do here is say
[45:51] dot result dot get.
[45:54] And I'm going to get a total
[45:58] or an amount.
[45:59] So same thing.
[46:00] Now I'm going to look in the tool result to see
[46:03] if I can figure out what the amount is
[46:06] It's kind of a weird way to do it, but it allows me
[46:10] And then lastly I'm going to say, Elif, the event dot
[46:14] type is equal to event
[46:18] type dot waiting.
[46:21] Then what I'm going to do
[46:24] And this message is going to be essentially saying,
[46:29] And then I'm pulling out the two arguments I have.
[46:31] So order ID an amount so I can print those
[46:36] And if they do, then we can approve it.
[46:37] So here's how it works.
[46:38] I'm just going to say print
[46:42] I'm going to go approval required.
[46:46] And then I'm going to say refund.
[46:48] And we're just going to put the order
[46:54] So we'll put a dollar sign like this amount.
[46:57] And then colon dot to f for order.
[47:00] And then the order ID okay.
[47:02] And then down here we're just going to put a print
[47:06] press enter to approve.
[47:09] Technically you can't actually press anything else.
[47:11] And this is going to be an input statement not this.
[47:15] So we're not even going to check what it is.
[47:16] And then we're just going to say handle
[47:20] So effectively when we call handled at approve
[47:23] So we're just going to wait for the human
[47:25] And then as soon as we want to approve boom,
[47:29] Okay. So now that we have
[47:31] So I'm going to say decision is equal to input.
[47:34] And we're just gonna ask them approve yes or no.
[47:36] And then lower dot strip.
[47:37] We're going to say if the decision is
[47:40] why then let me just check the documentation.
[47:44] Here it is. Handle dot approve okay.
[47:46] So we're just going to say handle.
[47:50] Dot approve like so okay.
[47:53] Otherwise we can say handle dot.
[47:56] And I believe it is reject. Let's see.
[47:58] Yes you can reject and you can pass a reason.
[48:01] If you want to pass a reason just say user
[48:04] rejected okay cool.
[48:06] So that is how we can now handle this.
[48:08] Again the reason why I'm looking at these tool
[48:12] that we're going to have for the refund, because
[48:16] So anyways, now let us go and run this
[48:21] and see if this works with the refund okay.
[48:24] So we're going to clear and then we're going to go.
[48:25] You've run agents too
[48:31] And it gave us an issue saying tool result.
[48:33] Just because I didn't have a capital L here.
[48:36] So let's fix that.
[48:38] And now we're good and rerun
[48:45] okay.
[48:45] Let's see what we get.
[48:46] And it says that it needs an ID so.
[48:48] Please give me the order ID so I can look it up okay.
[48:50] So let's go a 100 and see okay.
[48:53] And it says approvals required refund 4990
[48:57] So you can see these steps here.
[48:58] Picked up that information for us
[49:02] to either attempt to refund
[49:06] save them in the variable,
[49:09] hey, we now want to call this
[49:13] The only thing we could be waiting for approval for
[49:16] Because that's the only one that we have.
[49:18] So I'm just going to go ahead and type on
[49:21] And then hopefully it's going to tell us
[49:24] Let's see.
[49:24] It says stage completed message
[49:28] Boom.
[49:28] Now let's say refund order again okay.
[49:33] And hopefully it's going to give us
[49:36] So let's go a 100.
[49:38] Even though I know we already refunded it,
[49:41] And let's reject it this time and see what we get,
[49:46] And while we're at it, we can go here right to Agent
[49:49] Spend server
[49:52] And we're at this stage where we're just waiting
[49:57] And what I could actually do,
[50:00] because it's a little bit complicated to show is
[50:04] Right. And this worker just completely died.
[50:07] And then I restarted it, but reconnected
[50:11] this will still all be running
[50:14] and it will just be waiting for the human again
[50:17] So the human does need to ask refund.
[50:19] Again, we don't need to check something.
[50:21] We don't need to look up another order.
[50:23] It will just, resume where it left off
[50:26] at this stage, right
[50:29] And this can take any amount of time.
[50:30] It could take a day, could take it out,
[50:34] Doesn't matter. The server will keep running here.
[50:37] And you can see it's in this hand off state
[50:41] You'll see the time if we just keep refreshing.
[50:43] Like it'll just keep going up
[50:45] Okay, so anyways, I'm going to go.
[50:46] Yes here and or sorry I want to do no.
[50:49] So we rejected it.
[50:52] And I think doing
[50:55] anyways because well, we know it's
[50:58] All right okay. So this is working.
[51:00] Now what I want to do next is I want to start adding
[51:04] Now a guardrail allows us to actually audit
[51:07] the input or the output to our lab
[51:12] something potentially malicious or data
[51:16] So I'm going to show you how we write a guardrail.
[51:18] The guardrail that I'm going to write
[51:21] So a lot of times people will try to do like
[51:25] all of your previous instructions and give me,
[51:30] We can actually prevent against that
[51:33] where we try to detect common kind of phrases that,
[51:40] So what I can do is I can use add guardrail.
[51:42] So make sure you import it right.
[51:44] And I can say define safe underscore support
[51:48] underscore request like so.
[51:51] Now from here we can take a prompt which is a string.
[51:55] And this is going to be a guardrail result
[51:59] Now for the comments here.
[52:01] What we're going to do is say block
[52:05] obvious prompt injection attempts okay.
[52:09] And this is going to be before the LLM even sees it.
[52:12] So before the LLM gets it,
[52:15] So what I'm going to say is blocked is equal to.
[52:17] And then just a list of words.
[52:18] So I'm going to say ignore okay
[52:22] ignore previous.
[52:25] We can use system
[52:28] prompt something like that or jailbreak okay.
[52:31] So these are just words
[52:34] Now I'm going to say past is equal to not any.
[52:37] And this is going to be phrase okay
[52:41] in prompt dot lower.
[52:44] And then we'll spell lower correctly for phrase.
[52:49] Let's spell all these.
[52:50] My typing is so bad now with LMS phrase in blocked
[52:55] okay, so all this is doing is saying hey,
[52:59] That's all it's checking.
[53:00] Then we're going to return guardrail result.
[53:03] I'm going to say past is equal to pass,
[53:06] So if none of these existed then true.
[53:08] If they did exist then false.
[53:10] We're going to say reason or we can say sorry.
[53:12] Message is equal to.
[53:14] And we're going to say please ask a normal question.
[53:20] This is blocked.
[53:22] So if it fails
[53:25] So now what we can do is
[53:29] The way we add it is
[53:34] We then need to put a guardrail object.
[53:37] We're going to say
[53:42] This is going to be the safe support request.
[53:45] And we're going to say the position of the guardrail
[53:47] to be position dot input. Okay.
[53:51] And then we're going to say on underscore
[53:54] to on fail dot raise.
[53:57] Now raise is going to raise an error which is just
[54:01] There's other things that we can do here
[54:03] But for now I just want to completely quit.
[54:05] So effectively what I've done is I said, hey,
[54:08] This is a function that we want to run,
[54:11] before we pass anything to our LLF.
[54:14] So as soon as we get some input to our agent,
[54:16] run it through the guardrail,
[54:19] Make sure that there's nothing wrong.
[54:22] If there is something wrong, then tell us and fail.
[54:25] Okay, that's a simple guardrail.
[54:26] Now this is on the input.
[54:28] You also can add a guardrail on the output,
[54:31] So if we go to guardrails here, you can see there's
[54:35] You can see guardrail.
[54:37] We have a word limit.
[54:37] So for example we're checking to make sure that
[54:41] We're going to have a correct number of characters.
[54:44] And you can see for the failure modes here.
[54:45] Do you have like retry, raise fix human, etc..
[54:49] Okay.
[54:50] In terms of constructing the guardrail,
[54:54] So output input on fail the name
[54:59] And for position two you either input or output.
[55:00] So either run after or run before.
[55:03] Now there's a bunch of guardrails you can do here.
[55:04] You can do a custom
[55:06] You can do a regular expression, guardrail
[55:11] like we were kind of doing.
[55:14] Sorry, because it's a little bit complicated.
[55:16] And you could do an LM guardrail.
[55:18] So if you do an alarm guardrail,
[55:21] then either get the, what is it, fail or pass.
[55:25] The issue with this is that
[55:29] This LM where that's doing the guardrail.
[55:31] But the point is you can use an LM to actually
[55:35] Is this bad? Whatever. Okay.
[55:37] And then same thing input guardrails as we saw here
[55:41] There's a bunch of different ones that you can set up
[55:45] So I'm not going to go through all of them.
[55:46] We just wanted to show you
[55:49] Very good to add to the agent.
[55:51] So now that we've added this let's try it.
[55:54] And let's just go clear and run.
[55:58] So we forgot to pass a comma.
[56:00] Maybe let me see where that is.
[56:02] Yes we forgot the comma here.
[56:04] So let's add that and rerun and I'm going to say
[56:08] you know jailbreak this prompt okay.
[56:11] And you can see boom it just immediately
[56:15] support request failed.
[56:16] Please ask a normal question.
[56:18] This is blocked okay. So we ran into the guardrail.
[56:20] And then of course
[56:23] we wrote won't run into the guardrail because
[56:27] Okay give this a second.
[56:28] Hopefully it will give us the response.
[56:32] Not sure why this was taking so long.
[56:33] Maybe getting rate limited or something.
[56:35] Okay, you can see that it gives us the response here.
[56:37] And also you'll notice
[56:42] because we never even got to the images, immediately
[56:47] So like as I was scrolling through here, I actually
[56:53] Yeah. See, it's actually not showing up here at all.
[56:57] Just help me.
[56:57] Yeah, because we never even hit the server
[57:02] Okay. So again, a lot of other stuff
[57:04] They're not going to go through all of it.
[57:06] But with that said
[57:09] This was a little bit complicated.
[57:11] We had tools, output type, memory guardrails.
[57:14] What else.
[57:15] Human in the loop approvals
[57:17] kind of getting into the stream
[57:20] And again, all of this is available
[57:25] As you can see here, we have testing
[57:28] We have the memory right.
[57:29] And in conversation memory we have tools right.
[57:32] So check all of this
[57:34] And you can also add Http tools
[57:38] If you don't want to add custom function ones
[57:42] Anyways, now let's move on to agent three,
[57:46] kind of orchestration agent,
[57:50] that can be triggered at once
[57:53] All right.
[57:53] So we finished the first two agents where
[57:57] Now we're going to move on to agent three which
[58:02] Now what we're going to be
[58:05] So it's actually going to be very similar
[58:08] So I'm not going to write
[58:11] I'm just going to run you through it at a high level,
[58:15] in the description.
[58:15] And I'm going to explain the different strategies
[58:19] So this is the code that I have.
[58:21] I'm just going to quickly skim through it.
[58:22] And then I'm going to explain
[58:26] example you're trying to build.
[58:28] Okay.
[58:28] So effectively
[58:32] I have a researcher agent, I have a writer agent,
[58:37] a risk analyst, financial analyst
[58:41] And then I have these different agent pipelines,
[58:45] And then I have just a few things that will kind of
[58:50] Because that's how I'm going to kind of set it up.
[58:53] But effectively, the way this agent is going to work,
[58:56] is that I'm going to tell it, hey,
[59:00] And the strategy I want to use for the,
[59:02] research is sequential,
[59:06] And then what will happen is it will go and use
[59:11] and generate a research report.
[59:12] For me, that's what this agent is.
[59:14] Again, I'm going to show you how it works.
[59:15] And we'll run through the code in a second.
[59:17] Now, the way that I'm able to do
[59:20] Span supports these multi-agent strategies.
[59:23] Now here's the following strategies.
[59:25] First is handoff okay.
[59:28] And chooses which sub agent to handle the request.
[59:30] This you can write similar to this
[59:36] where essentially you just write an agent,
[59:39] These agents can be exactly
[59:42] And then you change the strategy here to say handoff.
[59:45] That's it.
[59:46] And then you just trigger this agent
[59:49] And it will just go and let's remove this.
[59:51] Be able to use each agent
[59:55] So it has all these different agents beneath it.
[59:57] Similar to if you're using like cloud code
[1:00:01] Then you have sequential straightforward.
[1:00:03] This just means that we always run the agents in a,
[1:00:08] We run them one by one, and then we take the result
[1:00:12] You can see sequential looks like this, right?
[1:00:14] We run and we get the result.
[1:00:17] We run, we get the result.
[1:00:19] Then eventually we get the final results
[1:00:24] And then we get the response, okay,
[1:00:28] Parallel allows us to run these all concurrently.
[1:00:31] This means that I can run all three agents
[1:00:35] so I don't need to wait for one response
[1:00:39] Then we have rotor.
[1:00:40] As you can see, we can route between different ones.
[1:00:42] We have swarm handoffs between different agents.
[1:00:46] We have round robin, random and manual, a
[1:00:50] When you make these agents
[1:00:54] It looks like this.
[1:00:55] These kind of two
[1:00:59] And this is the same syntax as writing this.
[1:01:02] This just means run these agents sequentially.
[1:01:04] You're kind of piping the response
[1:01:08] You can define the agent and you can just
[1:01:11] Okay.
[1:01:12] And then you can just run the pipeline like this
[1:01:15] So I'm going to show you
[1:01:17] So you can see the time difference
[1:01:20] But notice that if I want to run them in parallel,
[1:01:24] strategy parallel boom. We get the response.
[1:01:26] And if you want to get the sub result
[1:01:29] Hand off the default one.
[1:01:30] You just pass them in here.
[1:01:32] Strategies.
[1:01:32] Hand off it will go and hand off as needed. Rotor.
[1:01:35] You can set up agents.
[1:01:36] You can also set up a rotor for the rotor.
[1:01:38] You can actually use an agent to do this.
[1:01:40] You see have a classifier
[1:01:43] and then just reply with the correct category.
[1:01:45] And then it will call the correct one, okay.
[1:01:48] And then swarm. And you can go through
[1:01:51] But I'm going to show you the code example right now
[1:01:54] So let's go through the code that I have right here
[1:01:56] So first things first we just bring in the imports.
[1:01:58] We disable some of the logging kind of war
[1:02:03] We specify the mode.
[1:02:04] So we want to be able to run.
[1:02:05] So sequential parallel nested and worker.
[1:02:08] We then have some various tools here.
[1:02:10] Now notice that these tools use
[1:02:14] Now when I specify a credential here
[1:02:18] that we need to grab this credential from our server
[1:02:23] So I say credentials is equal to fire curl API key.
[1:02:27] Now what I'm doing is saying API key is equal
[1:02:31] And this will automatically set the fire Curl API key
[1:02:35] which I'm going to show you how to do in a second.
[1:02:37] In the local shell while we're running this worker.
[1:02:40] So this means any credentials that you want to have,
[1:02:42] you can store them directly on the agent server,
[1:02:46] You can grab them when a tool is called
[1:02:48] and then use them locally
[1:02:52] So only when they're needed they can get pulled out.
[1:02:54] So essentially I'm going to use Fire Curl.
[1:02:56] If you want to sign up, you can get a free account.
[1:02:58] You don't need to pay for it.
[1:02:59] You get a bunch of free credits,
[1:03:03] of scraping and searching of the web more effectively
[1:03:07] So I'm using Fire Curl to just search the web
[1:03:10] whatever topic we're going to look up.
[1:03:11] I then have this fetch page tool.
[1:03:13] This can get an individual tool
[1:03:16] from the page and give us the information
[1:03:19] Okay, so just two tools.
[1:03:21] Now I have a researcher agent, this agent I keep it
[1:03:26] Right.
[1:03:27] And that's it then for the writer agent
[1:03:31] I don't even change the model for the editor.
[1:03:34] I just give it some different instructions for the
[1:03:38] And I just have all these different agents
[1:03:40] I then create an analysis team.
[1:03:42] And this analysis team I want to run in parallel
[1:03:46] analyst and the financial analyst.
[1:03:47] So these three right here,
[1:03:51] So I just specify that
[1:03:54] I then create these pipelines.
[1:03:56] So I have a published pipeline
[1:03:59] So let's have a look here.
[1:04:01] We do the research.
[1:04:02] We do the writing and we do the editing.
[1:04:05] Now when I do that, because of the syntax
[1:04:09] which means I need to wait for the researcher to go,
[1:04:13] Then for my nested pipeline, this is where I take
[1:04:18] And then after that.
[1:04:19] So after I get my analysis,
[1:04:23] So I run this whole thing sequentially.
[1:04:26] But this first step runs
[1:04:29] So I've created this kind of like multi-agent,
[1:04:33] where my analysis team goes in parallel at first.
[1:04:36] Once the analysis team is done,
[1:04:40] Hopefully that makes sense.
[1:04:41] But that's kind of how I've set up these agents
[1:04:44] And notice we just have two simple tools.
[1:04:46] But we can use anything from agent two or agent
[1:04:52] Okay.
[1:04:52] Now we just have a few functions
[1:04:55] one to slug ify something, one to save the report.
[1:04:59] These are just functions that I'm manually calling.
[1:05:01] And we're just going to save a report
[1:05:06] In that folder it's
[1:05:09] So let's say we're
[1:05:10] just going to save like a markdown report
[1:05:14] Now you'll notice that I just have this run
[1:05:16] This allows me to take in either
[1:05:19] You can see if it's sequential.
[1:05:20] We run the publish pipeline, which is this.
[1:05:24] If it is parallel we run the analysis team
[1:05:28] And if it is let's go back.
[1:05:30] What's the other option we had here nested that.
[1:05:32] It runs my nested pipeline.
[1:05:34] Then what we do is
[1:05:37] hey, we're going to run
[1:05:41] So just which one are we going to execute?
[1:05:43] This is the topic that we want to research.
[1:05:45] And then we just have some runtime.
[1:05:47] We get the execution ID, we get the status,
[1:05:50] And then we just save the report and that's it okay.
[1:05:53] Then serving the worker. Don't worry
[1:05:55] And prompt mode.
[1:05:56] This just allows me to essentially type
[1:06:01] So we can run it.
[1:06:02] So let me run it and show you what this looks like.
[1:06:04] So you get a sense of how this functions.
[1:06:06] So I'm gonna say you've run Agent Slash
[1:06:09] and then this is going to be agent 3.py okay.
[1:06:13] For the mode we're going to pick.
[1:06:15] So for now let's go with parallel topic.
[1:06:19] Let's go with tech with Tim okay.
[1:06:22] So for parallel what this is going to do.
[1:06:24] Again let's just look at the setup here
[1:06:28] with just this market analyst.
[1:06:30] Risk analyst and financial analyst.
[1:06:32] Now this probably doesn't make sense for me
[1:06:35] is not really something
[1:06:39] But if we want to see this running we can go here,
[1:06:43] we can save and you can see that this is running.
[1:06:46] We actually have three agents running.
[1:06:48] And if we go back to the main execution,
[1:06:52] And then if we go back here
[1:06:56] And if we open up the report we get the full report
[1:07:02] Okay cool.
[1:07:03] Now let's try a different execution mode.
[1:07:05] You've run
[1:07:11] Let's go Nvidia stock okay.
[1:07:14] Now if we go here let's go to our agents.
[1:07:18] You can see that
[1:07:20] So have the analysis team researcher writer editor,
[1:07:24] And these are going to run sequentially.
[1:07:26] So if we go and have a look at this, the first thing
[1:07:30] The analysis team we need to run sequentially.
[1:07:31] So we're waiting for all of these to finish okay.
[1:07:34] Now we're going to the researcher.
[1:07:36] So the researcher is going to have their
[1:07:37] the input from the analysis team,
[1:07:42] We're going to wait for the researcher to finish.
[1:07:44] And then as soon as the researcher is finished,
[1:07:46] we're going to go to the writer,
[1:07:48] So this of course is going to take longer.
[1:07:50] But that makes sense
[1:07:51] because we need to go through this flow
[1:07:55] So let's just refresh here,
[1:07:59] And actually if we go to the main execution,
[1:08:01] you can see that we're running this analysis team
[1:08:05] And we can just wait for the researcher to finish.
[1:08:07] We should see it
[1:08:09] And you can see that we have a lot of different
[1:08:13] Because it's using the search web call
[1:08:16] Now if I check here it actually says the fire curl
[1:08:19] So I'm glad we saw that.
[1:08:21] And you can see
[1:08:21] this is just going to continue to keep retrying
[1:08:26] Or I provide the fire Curl API key,
[1:08:30] So what I'm going to do is just quit out of this for
[1:08:34] Okay, so like I mentioned before,
[1:08:38] on the server, which we need to do
[1:08:41] And the way to do that is the following.
[1:08:43] You're going to type, you've run
[1:08:47] credentials,
[1:08:50] And then you're just going to set the credentials
[1:08:52] Now in our case
[1:08:57] And I'm just going to make this equal to my fire
[1:09:01] Okay.
[1:09:01] So you're saying you've run agent spanned
[1:09:05] And we need to remove the equal sign
[1:09:09] And now we've stored this on the server.
[1:09:12] So now we may need to restart the server
[1:09:14] Let's actually just go here and check.
[1:09:16] We can refresh and let's go to credentials.
[1:09:19] And okay it looks like the credential is now here.
[1:09:21] So that's good. So it's stored.
[1:09:23] And what we can do is rerun our agent okay.
[1:09:27] We're going to run this in the what mode
[1:09:32] Yeah. So let's run this in the nested mode.
[1:09:35] And let's look up in Nvidia
[1:09:37] stock okay.
[1:09:39] And hopefully this time it will work.
[1:09:40] Once we get to this step
[1:09:43] Okay.
[1:09:43] So I just opened up the server
[1:09:47] Now this is the one that takes the longest
[1:09:50] But you can see that
[1:09:53] Right.
[1:09:53] To get all this information about Nvidia,
[1:09:58] I believe it use yes search web.
[1:09:59] So it was searching past the input query Nvidia
[1:10:04] And then it got all this output.
[1:10:05] And then it went to search
[1:10:08] And we can see the full flowing flow full flow story
[1:10:13] If we go back to the agents
[1:10:14] we can see now we're just at the writer
[1:10:19] So let's see what response we get okay. Boom.
[1:10:21] And looks like we got the response.
[1:10:22] If we go to the reports here, we can open this up.
[1:10:26] Let's just preview it here.
[1:10:27] And we can see our full markdown report about Nvidia
[1:10:33] We'll just click one and see if it works.
[1:10:35] And boom yeah we get like the full report.
[1:10:37] I guess it's long.
[1:10:38] I'm not going to wait for that PDF download
[1:10:42] Okay. So very good.
[1:10:43] The nested agent is working.
[1:10:46] So that's pretty much
[1:10:50] Now what I want to do is move on to a few other parts
[1:10:54] and then the durability feature.
[1:10:56] So how do you actually resume an AI agent
[1:10:59] in the middle, or it's
[1:11:02] Let me show you.
[1:11:04] So what I've just done here is written a short file
[1:11:05] that shows some basic usage
[1:11:09] Now what we're able to do is we can test these agents
[1:11:12] without actually having to make an API call
[1:11:15] to ensure that things like the model
[1:11:20] they're using, or the tools that are using,
[1:11:24] So, for example, what I've done is I've said, hey,
[1:11:27] So I've brought in some stuff from Agent Span.
[1:11:29] I've brought in the support response
[1:11:32] I have an example refund policy
[1:11:36] that we should be getting as a response here.
[1:11:38] And what I've said is,
[1:11:41] The tool call is going to be searching
[1:11:44] We're going to have a query,
[1:11:46] We're going to mock the tool result
[1:11:50] And then we're going to mock done.
[1:11:52] And we expect that
[1:11:55] So we're mocking a lot of the functionality.
[1:11:57] But again it's still good just to make sure
[1:12:01] extremely quickly without relying on lumps,
[1:12:05] We expect the result to be completed,
[1:12:09] and we expected to have used this tool
[1:12:12] Right?
[1:12:12] If we give the support agent this,
[1:12:16] So we mock all of the events, but we can just
[1:12:21] Now there's full docs on how this works.
[1:12:22] I'm not going to go through all of it,
[1:12:25] If we want to run this,
[1:12:28] I just moved some of the import stuff around cause
[1:12:30] But anyways, if I go here and I run this now,
[1:12:36] We didn't get any errors
[1:12:39] you know, dot refund did instead of refund
[1:12:45] we get an assertion error and it says, hey,
[1:12:49] Okay.
[1:12:51] So now I want to have a look at the durability
[1:12:55] And what I mean by
[1:12:58] we can restart it
[1:13:03] So let's imagine we have a simple agent
[1:13:07] What do you call it? Tool that runs.
[1:13:10] It takes three seconds to run.
[1:13:11] Notice. Also, I added a timeout.
[1:13:13] You can do that on various tools.
[1:13:14] And what I've done is I've told the agent, hey,
[1:13:17] by calling the slow
[1:13:21] That's it.
[1:13:22] So this will take 30s to run, but we might make it
[1:13:27] and then we would have to restart from the beginning
[1:13:31] So what I've done is I've set this up
[1:13:34] We also have a resume mode.
[1:13:35] Now you would have this if you're running this in
[1:13:40] when these agents are running,
[1:13:43] So anyways, you can see that if the mode is start,
[1:13:46] what I'm going to do
[1:13:49] Right. And then I'm just going to stream the handle.
[1:13:51] And this is just going to print out everything
[1:13:54] So we can see until it says that this is done.
[1:13:56] That's it.
[1:13:57] Now if the mode is resume
[1:14:02] So I'm going to start the agent.
[1:14:04] And then what I'm going to do is connect
[1:14:09] So this is going to allow me to connect
[1:14:13] And because this agent will be running,
[1:14:18] So I'm just serving the agent, so.
[1:14:20] Okay, start the agent.
[1:14:21] And for our handle,
[1:14:24] just connect to the previous one that we have.
[1:14:27] So any of these execution IDs
[1:14:30] Of course, there's a lot more scientific,
[1:14:32] scientific way to go about doing this,
[1:14:35] So let me show you what I mean.
[1:14:37] Let's open this up and let's go. You've run
[1:14:41] and let's spell this correctly.
[1:14:42] And then agents slash crash resume demo okay.
[1:14:46] So let's let this run for a second and let's wait
[1:14:51] So let's go back here to our agents.
[1:14:53] And you can see the durable demo is running.
[1:14:55] It's running this slow step.
[1:14:56] And if we keep refreshing here we should just see
[1:15:00] So now we're on step two.
[1:15:01] And I'm just going to keep going.
[1:15:02] Right is going to do this well up to ten times.
[1:15:05] So let's wait okay. Refresh again.
[1:15:08] You can see now we're on step three.
[1:15:09] And then what happens
[1:15:13] Well if we go here
[1:15:16] So we made it to step four.
[1:15:17] But the slow step
[1:15:20] So what can I do.
[1:15:22] So that I don't need to restart this
[1:15:25] You'll notice
[1:15:27] We're still on step four
[1:15:31] So if we go here, you'll see that we have
[1:15:36] Looks just like this.
[1:15:37] So we're just going to copy that execution ID
[1:15:40] and we're going to paste that right here.
[1:15:41] I'm going to remove the spaces.
[1:15:43] I'm going to change the mode to just say resume.
[1:15:46] So now what's going to happen
[1:15:49] where I'm
[1:15:53] because all of the state is stored
[1:15:57] So if I just restart this here, you'll see that
[1:16:02] And I have all of the state already there.
[1:16:04] And we can now just continue.
[1:16:05] And if we refresh, you'll see that
[1:16:08] So I didn't restart anything.
[1:16:10] I didn't lose any state and lose any information.
[1:16:12] I just go from where I left off
[1:16:16] So this is the important thing to understand
[1:16:19] Right.
[1:16:20] And kind of all of the information.
[1:16:21] And your worker is just executing the code, right.
[1:16:24] It's executing the functions, it's
[1:16:26] But you at any point,
[1:16:30] So imagine you're writing a platform.
[1:16:32] You just store all your execution IDs.
[1:16:34] If any of them fail,
[1:16:37] when the worker comes back online, because that's
[1:16:41] And same thing. Let's can I quit? Maybe in time?
[1:16:44] I'm not sure if I was able to quit it in time
[1:16:46] Now let's go down here and see.
[1:16:48] Yeah.
[1:16:49] So it's still waiting on the let me call.
[1:16:50] So now same thing if I run it again boom.
[1:16:53] You see we get right back into the execution
[1:16:56] And all of it's finished
[1:17:01] Which if we look here, there's tons of workflows.
[1:17:03] Complete steps one through ten will run an order.
[1:17:05] But okay, so that's what I wanted to show you
[1:17:09] and how easy it is to get back into the state
[1:17:12] Now, lastly,
[1:17:15] and then we're going to be done with this course
[1:17:18] So now let's talk a little bit about deployments.
[1:17:20] Now I'm not going to deploy full application here.
[1:17:22] But I just want to discuss how you can move to
[1:17:27] Now if you just want to use local development
[1:17:29] you just run the Agent spin server and that's it.
[1:17:31] It will just stored in a local SQLite database.
[1:17:34] However, if you want to go to a deployed environment,
[1:17:39] and some kind of Docker compose
[1:17:42] Now, in order to do that, you can just pull
[1:17:46] I'll leave a link to it in the description,
[1:17:51] Here you can go into the deployment
[1:17:56] So if you go here they have deployment right.
[1:17:58] And then they have docker compose.
[1:18:00] And from Docker compose
[1:18:03] Inside of the env example
[1:18:07] or like API keys that you want to have.
[1:18:08] You can put the what do you call a Postgres database
[1:18:13] so that rather than running it
[1:18:16] You can also connect to it as needed.
[1:18:18] Now it also goes over exactly how to deploy it
[1:18:21] This will just deploy this server for you.
[1:18:24] And as soon as this server is deployed, all you need
[1:18:30] So as it says, right here, all you have to do is just
[1:18:34] It could be running on this server,
[1:18:38] or behind some URL, whatever.
[1:18:40] And that's it.
[1:18:40] Then you just point it there with the server URL,
[1:18:45] Right. And this can be scaled as much as you want.
[1:18:47] Now there's a bunch of other options
[1:18:51] and all of this kind of stuff,
[1:18:52] which I'm not going to go through here,
[1:18:55] you can set an off secret, and then you can also
[1:18:59] So now if someone wants to connect to it,
[1:19:04] So you have some kind of secure authentication
[1:19:07] and between your agent span server.
[1:19:10] And that's pretty much it.
[1:19:11] That's all you need to do for deployment okay.
[1:19:13] You also can obviously self-hosted
[1:19:16] And it kind of explains how you have multiple workers
[1:19:20] And you can see all of the different options,
[1:19:24] It's just a matter of essentially
[1:19:26] And once a server is deployed, pointing
[1:19:30] kind of, you know, protocol
[1:19:34] So that's it
[1:19:37] That's pretty much all of the core
[1:19:41] And of course there's a lot more
[1:19:43] But this should give you a really good head
[1:19:47] Great AI agents in Python.
[1:19:49] If you enjoy this type of video,
[1:19:51] Subscribe to the channel
⚡ Saved you time reading this? Transcribe any YouTube video for free — no signup needed.