[0:00] In this video, I'll be going through a full course
on how to build production agents in Python.
[0:05] We're going to write every single line of code, and
I'm going to show you how to build three AI agents.
[0:09] The first is going to be a simple conversational
agent that has access to conversational memory.
[0:14] The second is going to be a rank based agent,
where it can pull out information
[0:18] from like a company database.
[0:19] And then the last agent
is going to be a multi-agent orchestrator,
[0:23] where we actually have multiple AI agents running
at the same time to achieve a longer running task.
[0:28] Now, this video is not designed
for complete beginners,
[0:30] but as long as you're familiar with Python,
you should be able to follow along.
[0:33] And we're going to be using a framework here
called Agent Span.
[0:36] But don't worry, it is free is open source.
[0:38] You won't need to pay for anything.
[0:39] You just need to have access
to some kind of AI model.
[0:42] So like OpenAI, anthropic, whatever.
[0:44] But we'll go over that in a minute.
[0:46] Okay, now this is really going to be focused
on how to build production AI agents.
[0:51] So rather than just agents
that can run in your terminal
[0:53] or that run in a demo environment, ones
that you could actually eventually scale up.
[0:58] Now, in order to do that,
we need to talk about the main problems
[1:01] that you have
when you actually try to run AI agents in production.
[1:04] Now, first
we have processes that crash mid run, right?
[1:08] So maybe the network goes down, database freezes
whatever.
[1:11] Your agent just gets killed.
[1:13] And that means that a lot of the work
that's done can be completely wasted.
[1:16] And that can be quite expensive over time.
[1:18] Next human in the loop.
[1:19] So maybe we need a user to approve a task,
something, right?
[1:22] Or to press a button
that could take any amount of time.
[1:25] We're just unsure about that.
[1:26] Lastly or not.
[1:27] Lastly, but thirdly,
and one that's most important to me is visibility.
[1:31] A lot of times when you build these AI agents,
you have no idea what they're actually doing.
[1:34] So you need observability into the platform
to see what step is on.
[1:39] Where is it going wrong?
What tools is it calling, etc..
[1:42] And then obviously scaling a lot of times
if you just build a simple like long chain
[1:45] agent or something, it's not going to scale
to tens of thousands of users,
[1:49] and you have to pretty much reinvent the wheel
and spend most of your time deploying out all
[1:53] this infrastructure.
[1:54] When really you want to focus on
just building the AI.
[1:57] So there's seven things that you need.
[1:59] If you want to have an AI agent
actually be production ready.
[2:02] I'm gonna quickly going to go through them here.
[2:04] Now first durability.
[2:05] That means that if the agent crashes it can recover.
[2:08] And it doesn't need
to completely restart next retries.
[2:11] So sometimes the step will fail.
[2:13] That doesn't
mean we should completely exit the process.
[2:15] We should retry it multiple times.
[2:18] Human in the loop. Again.
[2:19] Sometimes we need to delegate a task back to a human
and say hey, are you sure you want to do this?
[2:23] Do you want to issue the refund?
[2:24] You want to delete this file x y, z, right?
[2:27] Observability.
[2:28] Like I talked about, we need to be able to actually
see what's going on in real time long running tasks.
[2:33] If agents take 2030 two hours to run,
we should be able to handle that
[2:37] and then scale and testing,
which we can talk about a little bit later.
[2:40] Okay, so in order to accomplish what I just discussed
there and essentially
[2:43] get these seven features for our
AI agents, we're going to be using
[2:47] a framework called Agent Span, which comes from Orx
who's kindly sponsored this video.
[2:51] And don't worry, this is free.
[2:52] You don't need to pay for anything.
It is completely open source.
[2:55] I want to quickly just show you what it looks like
when you actually get this running,
[2:58] because this is the benefit of using a platform
like this network is
[3:02] essentially gives us a server
which is going to handle all of the different state
[3:06] and kind of track the progress of the multiple
AI agents that we're running.
[3:10] So you can just see a few quick
examples here from the dashboard.
[3:12] This is the server running on my own computer.
[3:14] You don't need to build this.
You literally just install it and run it.
[3:17] And for any given AI agent,
let's say we go to this analysis team agent.
[3:22] You can see a full log of everything
that's actually gone on.
[3:25] And you can see this in real time.
[3:27] So in this case we had a multi-agent system.
[3:29] And I can click into one of these agents
and see the input, the output, the Json, the summary,
[3:33] or actually go into the execution
of this agent itself to see everything that went on.
[3:38] So this is the observability that I'm talking about.
[3:40] What this also does is allow us to scale the agents
by having a built in queue system
[3:45] for all of them running, and then to retry tasks.
[3:48] For example, if we go here and we scroll down,
you can see there was like ten tasks
[3:52] that were running.
[3:53] And we can go through every single
[3:55] turn of the agent and see everything
that went on along with the tokens.
[3:59] The reason it stopped the duration, all of that
good stuff, then this will make a lot more sense
[4:03] later.
[4:03] But effectively this is the backend infrastructure
that we run our agents against.
[4:08] And each of these agents that you see here was me
running code
[4:12] that connected to this server,
and the server handled the state
[4:15] and the orchestration, but allowed all of the code
to be executed where it was run.
[4:20] So from our local machine, from our server, whatever.
[4:23] But if there was a crash, for example,
[4:25] we could recover from that crash
because all of the state is stored on this server.
[4:28] So we could just reconnect, restart
where we left off.
[4:31] And it's not a big deal.
[4:32] And this task can run for as long as it needs to.
[4:34] So anyways, that's the basics on Agent Span.
[4:36] They also have their own Python framework
for building AI agents, which we're going to use, but
[4:40] you also can connect them to Lang graph, the OpenAI
SDK, Google ADK, I believe a few other ones as well.
[4:45] If you just want to use their orchestration layer
or kind of the server that I talked
[4:49] about now, in terms of the kind of architecture here,
let me quickly go through it.
[4:53] This is pretty much what it looks like.
[4:55] We have a worker.
[4:56] The worker is what we're going to write ourselves.
[4:58] We have the agent span server.
[4:59] This is already provided to us. Again
it's open source.
[5:02] We can run it ourselves.
We don't need to pay for anything.
[5:04] And from here, this keeps track of all of the state.
[5:06] The history allows us to retry, handle human
in the loop, multi-agent, all of that kind of stuff.
[5:11] It just handled for us.
[5:12] So from the worker side, we pretty much
just say, hey, we're building an agent.
[5:16] We're going to connect to this server.
[5:17] All of the rest of the code stays exactly the same.
[5:20] The server handles all of that
durable execution stuff that I talked about.
[5:24] And then of course we have an LM.
[5:25] We can use any LM that we want.
[5:27] So bring OpenAI cloud whatever.
[5:29] And that's essentially how it works.
[5:30] So anyways that is the brief.
[5:32] That's
what I'm going to show you how to do in this video.
[5:34] What I want to do now is hop over to the code editor.
[5:36] We're going to start
getting some things installed and set up.
[5:39] And then from there we're going to build out
[5:40] three unique AI agents again, starting easy
and then medium and more difficult.
[5:45] So you get a sense of how to actually build these.
[5:47] And again how they work in production,
which is the most important part
[5:50] because at the end of this video,
you could very easily go deploy
[5:53] this up by just deploying the server
and deploying your workers.
[5:56] And you're good. That's it.
[5:58] Because of the way that we built it, as opposed to
if you use a lot of the other frameworks out there.
[6:02] Anyways, let's dive in.
[6:03] All right.
[6:04] So now we're going to get started
with the installation steps here for Agent Spann.
[6:07] Now I'm just on the documentation.
[6:08] I'll leave a link to it in the description.
[6:10] It's actually very good. So you can follow along.
[6:12] And a lot of the stuff that you see in this video
I just pulled directly from the documentation.
[6:16] Now first things first we need to install Agent Span
and the Agent Spans server.
[6:20] Once we have that installed
that is very easy for us to just write the code,
[6:24] which is our worker code
which will connect to the server.
[6:27] Now notice that we can simply install it
using pip install agent span.
[6:31] This of course requires
[6:32] that we have Python installed on our computer,
and that we have some kind of code editor.
[6:35] So my case I'm going to be using cursor.
[6:37] You can use any editor that you want for this video.
[6:40] Now notice that what I've done in cursor
is I've just made a new folder here.
[6:43] So I just one file I want open folder.
[6:46] And I just selected one that was on my desktop.
[6:48] Just made a new one called AI Agent Tutorial.
[6:50] From here I've opened up the terminal
and I'm going to type a UV init.
[6:54] Don't let me zoom in a little bit
so you guys can see this.
[6:56] And this is just going to make a new UV project
because I'm going to use UV to install agent span.
[7:01] So you notice
it says you can use UV pip install agent span.
[7:04] So from here
we're just going to type UV add agent span like so.
[7:08] And then it should add it to our environment for us.
[7:10] And install everything that we need.
[7:12] Now you don't have to use UV but I prefer to use UV.
[7:15] So that's what I'm going to do.
[7:16] Now I'm just going to delete this main.py file
because we don't need that as well.
[7:20] So now if we go to the Pi project tunnel
you can see Agent Span is installed.
[7:24] Okay.
[7:25] Now the next thing that we need to do is set our API
key, that the interesting thing about Agent Span
[7:30] is that it
actually will hold the various provider keys for us.
[7:34] So any environment variable that you need to use,
you don't have to have it in your worker code.
[7:39] You can have it stored on the server
which is going to be more secure.
[7:42] So I can actually put in OpenAI
API key or anthropic API key or whatever provider
[7:46] I want to use directly, where I'm running my server,
which you're going to see in a second.
[7:51] So what we're going to do
is just get one of these API keys for this video.
[7:54] I'm going to use OpenAI,
but you can use anthropic if you want.
[7:57] And what I'm doing is
I'm going to platform.openai.com/home okay.
[8:02] This is going to let me make a new API key.
[8:04] You will need an account here.
And you will need to pay for this.
[8:06] But it's very cheap.
[8:07] We're talking about, you know, maybe sense of spend
to follow along with this tutorial.
[8:11] And I'm going to go create API key.
[8:13] And I'm just going to call this agent span.
[8:16] And then maybe tutorial or something.
[8:19] Okay I'm going to make the key.
[8:20] And obviously you don't want to leak this to anyone.
[8:22] So I will delete it afterwards.
[8:24] Okay. So from here
we're going to go into our terminal.
[8:26] And we're going to type the command
as it shows here from the documentation.
[8:29] So let's go back. Export OpenAI API key is equal to.
[8:33] And then the key.
[8:33] So we're going to export OpenAI underscore API.
[8:36] Underscore key is equal to.
[8:37] And then we're going to paste the key inside of here.
[8:40] And then we're going to press enter.
[8:42] Now this should put it
inside of the current shell session.
[8:44] Which means that any command that we run after
this should have access to this variable here okay.
[8:49] So make sure that if you're going to run the server
again that you first export the key beforehand.
[8:54] There's other ways to avoid doing that.
[8:56] But for now this is the easiest where you just have
to have this environment variable set in your shell.
[9:01] Okay before you run it.
[9:02] Now if you are on windows,
this command will likely look a little bit different.
[9:06] And if you're using something like cursor,
I would just ask it, hey, what is the, you know,
[9:10] equivalent command to and then paste the export,
[9:15] you know, OpenAI whatever for PowerShell.
[9:19] And it should tell you
I don't know what the exact command is.
[9:21] So I'm not going to guess.
[9:22] But you can just use an AI model
and it should tell you how to export it properly.
[9:25] So now that it's exported, what we're going
to attempt to do is run the agent Spans server.
[9:29] So we can just directly run the agent spawn server,
or we can run
[9:32] Agent Spin doctor
just to make sure that is all working.
[9:35] Now because I'm using UV,
that means that if I want to run this I need to do.
[9:39] You've run an agent spin doctor.
[9:41] If you're not
and you just installed it globally with Pip,
[9:44] you should be able to just run the agent
spin command.
[9:46] So from here I'm going to press enter
and let's see what it says.
[9:50] And it looks like all is good.
[9:51] It says okay OpenAI is set, Java is installed.
[9:54] We have enough disk space. The server jars cache.
[9:56] That's because I've installed this previously.
[9:59] If you didn't install this previously,
it may tell you that something's wrong.
[10:02] And if that's the case, you may need to install,
for example, Java 21, okay, etc..
[10:07] Now if you don't know how to install it again,
[10:08] ask the Lem to ask something like cursor
hey, how do I install Java 21?
[10:12] And it should give you the command.
[10:14] Okay, so now that that's running we're going to type.
[10:16] You've run agent spin server start okay.
[10:21] Now this is the command to start the server.
[10:23] So we're going to go ahead and run that.
[10:26] And you can see that
it says server is already running
[10:28] okay let me stop the server
because I may have it in another port.
[10:31] So to stop it we're just going to go stop okay.
[10:33] And then I'm just going to restart it from here.
[10:35] So let's give it a second.
[10:36] And it says it's running on port 6767.
[10:39] We're just going to wait a minute.
And it says that it is running.
[10:42] So now if we want to test if the server is working
we can just copy this URL right here.
[10:46] We can go to our web browser and just paste it.
[10:49] And we should be able to see the agent spans server
okay.
[10:51] So from here you'll see the agent spans server.
[10:54] There's a bunch of stuff you can look through.
[10:55] But generally you're just going to be looking
through executions right here.
[10:58] And it's going to show you a history
of all of the executions.
[11:01] Now obviously you won't see anything
if it's your first time, but for me, I'm
[11:04] seeing previous executions
because I've ran the server before.
[11:07] Okay.
[11:07] So we're going to have a look at this later
[11:08] because it will make more sense
when we actually get executions.
[11:11] But for now, let's go back to our project here
and let's start
[11:15] installing a few last things that we need.
[11:16] And then we can create our first AI agent.
[11:19] Okay.
[11:19] So I'm going to write clear
and I'm just going to type UV add.
[11:22] I'm going to add a few dependencies that we need.
[11:24] And if you're not using UV you can just use Pip
to add the equivalent dependencies.
[11:28] Now first we're just going to bring in Python
dash dot envy.
[11:33] And we're also going to bring in pedantic.
[11:35] And then lastly fire crawl
[11:39] dash pi which we're going to use for the last agent
okay.
[11:41] So go ahead and press on enter.
[11:44] And we should see that
we get them all installed okay.
[11:46] So that's all we're going to need
installed for our project.
[11:49] What I'm going to do now is just make a new folder.
[11:51] And I'm going to call this agents now instead of
agents, I'm just going to make a new agent.
[11:56] And I'm just going to call this agent 1.py.
[11:59] And this is where we're going to start
writing our code
[12:01] now, our first agent, it's just going to be
a simple conversational agent.
[12:05] All that means that we're just going to talk to it
kind of like a chat bot.
[12:08] And the one thing that we're going to add
is that we're going to allow the agent
[12:11] to know what our current time is,
and to get information about us as a user.
[12:15] We're also going to add memory so that anything
that we say previously it can actually remember,
[12:20] because by default, if you don't add memory,
I can say, hey, my name is Tim.
[12:23] It says Hey Tim.
[12:24] And then the next conversation or the next time
I ask it something, it will completely forget
[12:28] because it's not storing the previous responses.
[12:30] Okay, so that's the goal here.
[12:31] And this is just to show you
the basics of the framework.
[12:33] And then we'll go into building some stuff
that's more complicated.
[12:36] So we're going to start by importing logging.
[12:38] This is because there's a lot of logs
that are going to be output by agent span.
[12:41] And we want to probably suppress some of them.
[12:43] So we don't see too much in the terminal.
[12:45] We're then going to save from date time import
date time.
[12:50] We're then going to go from dot env import
load dot env.
[12:54] And we're just going to use local env
to load an environment variable file
[12:58] that we're going to need in a second.
[13:00] Next we are going to say from agent Span.
[13:03] And this is going to be Dot agents.
[13:05] Make sure that you put
plural. We're going to import agent
[13:09] the agent runtime okay.
[13:12] And runtime is with the lowercase there.
[13:14] And then conversation memory run and tool okay.
[13:19] So this is all we're going to need for now for this
basic agent.
[13:22] Let me just close this.
[13:23] You guys can see it a little bit bigger okay.
[13:25] Next lines. We're going to locate Env.
[13:28] What this is going to do is load
any environment variable files that are present.
[13:31] And in fact while we're here we're just going to make
a new dot env file in the root of our project.
[13:37] So dot env and we are going to put inside of here
one variable that we need.
[13:42] Now this variable is the agent span underscore server
underscore URL okay.
[13:48] And for now this is going to be equal to Http colon
slash slash localhost
[13:54] port 6767 slash API.
[13:57] Now let's make sure we spelled this correct
because I completely butchered the spelling here.
[14:01] But this is local host like so now.
[14:04] And let's add the extra slash okay.
[14:07] So this is where the agent spends
servers running right now.
[14:10] Again we're running it on our own computer.
[14:12] So we just put in this URL
and yours will be the exact same.
[14:15] If the agent spent server was running
on a different computer wasn't running on localhost,
[14:19] then of course we would change this
because maybe we're going to have the server
[14:22] hosted somewhere else
and our workers hosted somewhere else.
[14:25] That's possible.
[14:26] You also can have the workers
and the agent spin server on the same server.
[14:30] It's completely up to how you want it deployed.
[14:32] But this is what allows you to specify, hey,
where actually is this server?
[14:35] Okay. So next we're going to go back to agent one.
[14:37] We've now loaded the dot env.
[14:39] And because we've loaded that agent spin
will now automatically see this variable.
[14:43] And it will know that it needs to communicate
with the server at that location.
[14:47] Now next what we're going to do is just say logging
dot basic config.
[14:50] And we're just going to set the level.
[14:52] So we're going to say
level is equal to logging dot warning okay.
[14:57] Just so we only show warnings that we don't show
[14:59] all of the logs that are probably going
to kind of mess up the terminal.
[15:03] We're then going to say logging dot get logger
and we're going to get the agent span logger.
[15:09] So let's get it like that.
[15:10] And we're going to set the level to warning as well.
[15:13] And then next we're going to put not agent span
but we're going to put conductor and same thing.
[15:18] We're going to set the level to warning
[15:20] just so that we don't accidentally get
a bunch of random info logs showing up.
[15:24] All right.
[15:24] So next what we're going to do
is we're going to create a basic agent.
[15:28] So to make an agent is super easy.
[15:30] We're just going to say assistant is equal to agent.
[15:33] And then inside of here
we're just going to give the agent a name.
[15:35] And this is what's going to show up in Agent Spence.
[15:37] We can see it. So we're going to call this personal
[15:40] assistant like that.
[15:41] Perfect. Next we need to specify the model.
[15:44] Now because we're using OpenAI
we can specify any OpenAI model.
[15:47] And we'll be able to connect to it.
[15:48] And use it because we have that API key set.
[15:51] If we wanted to use an anthropic model,
then when we started running the agent spawn server,
[15:55] we would have needed to declare an anthropic API
key or a Gemini API
[15:59] key, or whatever
the other model is that you want to use, right?
[16:01] If we go back here, you can see that
we had the option, right?
[16:04] We could have exported one of these.
[16:05] So based on the one that we export
and you can see all the providers here right.
[16:09] It gives you the different options.
[16:11] You can specify the model that you want to use okay.
[16:13] So we're going to go back here and we're going to
change this to OpenAI again GPT 5.4.
[16:20] It's a little bit expensive.
[16:21] If you just want a cheaper one.
[16:22] You can do GPT four or GPT for a mini.
[16:26] And that's going to give you a really cheap model
that's going to cost literally nothing.
[16:28] This one still will not be expensive based on
how we're using it, but it is more expensive.
[16:33] Next, we're going to pass some instructions.
[16:35] Now the instructions
I'm going to put in a set of braces just so that
[16:37] I can separate them out with some quotation
marks here.
[16:41] And this is the system prompt.
[16:42] This is what's going to be read
at the beginning of each message.
[16:45] So it understands how it should actually behave.
[16:47] So we can say something like,
you are a concise personal assistant.
[16:52] Use tools when they help because we're going
to provide some tools for this in a second.
[16:57] And then down here
we're going to say, and remember, use full
[17:02] okay user details across terms okay.
[17:06] Cool.
[17:07] So that's our instructions.
[17:08] Now beneath this we're going to provide some tools.
[17:10] For now the tool list is going to be empty.
[17:12] And then after this
we are going to provide some memory.
[17:15] But we'll just add those later.
[17:16] So for now we just have the basic agent.
[17:18] Next thing we're going to do is just run the agent.
[17:21] So to run the agent we're going to say
if underscore underscore name, underscore
[17:24] underscore equals
underscore underscore main underscore underscore.
[17:27] This is just the main entry point in our application.
[17:29] If you're unfamiliar with what
[17:30] this does essentially just checks
to make sure we're running this Python file directly.
[17:34] We're just going to do a print statement.
[17:35] And we're going to say starting
agent dot dot dot okay.
[17:39] And then down here
we're going to say with the agent runtime
[17:44] okay as runtime.
[17:47] And then we're just going to go into a simple
while loop where we just keep
[17:50] asking the agent questions until we type quit.
[17:53] Okay.
[17:54] So we have our width.
[17:54] This is how you start the runtime for the agent.
[17:56] We're now going to say, while true.
[17:59] And then here we're going to say
prompt is equal to input.
[18:02] And that's going to be you dot strip
just to remove any leading or trailing spaces.
[18:06] We're going to say if the prompt dot lower
[18:11] okay is equal to q.
[18:14] So if you type the letter q
then we are just going to break okay.
[18:18] We're going to say if not prompt
[18:20] then we're going to continue
and just ask you to type something
[18:22] so that if you don't type anything at all
we don't prompt the model okay.
[18:25] Now down here, but still inside of the while loop
we are going to do the following.
[18:28] We're going to say the result is equal to run.
[18:32] And we're going to run the assistant.
[18:35] We're going to pass our prompt.
[18:37] And we're going to say
the runtime is equal to the agent runtime right here.
[18:40] And that's it.
That's all we need to do to run the agent.
[18:43] So for now, what we can do is we can just say print
[18:46] and we can put an F string,
or we can say assistant like this.
[18:50] And then we can just put inside of a set of braces,
maybe.
[18:53] What is this result? Okay.
[18:55] And this is going to give us kind of
a messy dictionary, which we can look through later,
[18:58] but at least for right now,
it should give us the response.
[19:01] So let me zoom out a little bit
so you guys can read this better.
[19:04] Essentially what we've done is
we've imported a few things we need.
[19:07] We've set up the Env so we can connect to the server.
[19:09] We have the assistant.
We don't have any tools or anything.
[19:12] It's just a super basic assistant.
[19:13] And we set up a while loop
so we can now communicate with it.
[19:16] And if we go here
we'll just make sure the agent spent servers running.
[19:19] I believe I didn't shut it down,
so it should still be running here.
[19:22] Yes, looks like it is.
[19:23] So make sure that the Agent Span server
is going guys before you try to do this.
[19:27] And then what we can do is from the root
of our directory we're going to type.
[19:30] You've run and then agents slash agent 1.py.
[19:35] Now notice that I'm doing this
from where my env file is present.
[19:38] So I'm doing kind of the path to this file
agent slash agent 1.py.
[19:42] So we're going to pick up the env file
and we're going to load it and let's hit enter.
[19:47] That's a starting agent.
[19:48] You can see we have initialized.
[19:49] We've connected to the server.
[19:50] And now we can type something like hello World.
[19:53] And we give it a second here okay.
[19:55] And let's see if we get the response.
[19:57] And it says hey it was completed.
[19:59] And we get this agent result here where
we have some result in the output called hello world.
[20:04] So we can see everything ran.
[20:06] And then if we come back
here, let's just refresh the server.
[20:10] You see personal assistant just ran.
[20:12] And if we click into this you can see our prompt
which was where is it here.
[20:15] Hello world.
[20:16] We can see the output of the model was Hello world
okay.
[20:19] That's right from the learn.
[20:20] And then we can see the immediate output at the end
here.
[20:23] Was this right work we got with Hello world. Cool.
[20:26] So that's kind of the benefit
is that we can see exactly what's going on.
[20:29] We have full insight into
how the AI agent is running.
[20:32] And of course this is just a very basic one.
[20:34] Now what I'm going to do is just type Q
to get out of this,
[20:36] and let's make it so that we can kind of view
the response a little bit better.
[20:40] So rather than just printing out the result object
here, let's print out the kind of output here.
[20:45] So what I'm going to do is say so
I'm just going to say result dot get.
[20:49] And then I think we can just put in single quotes
here a result, make sure that it's single quotes.
[20:54] Otherwise it's going to interfere with the F string.
[20:56] And let's just try this one more time
where we run the agent.
[20:59] So let's go. You've run. Let's go. Hello.
[21:02] And let's see what we get this time
[21:05] it says agent result has no attribute yet.
[21:08] Okay. Interesting.
[21:09] So I think we can do result in dot output dot
get maybe I think that's going to work.
[21:16] Let's just try it.
[21:16] I'm just doing this off the top of my head
here, and let's run it again and just type hello.
[21:21] And let's see now if we get the correct response
give it a second.
[21:26] And there we go.
[21:26] We get hello, how can I help you
and say what is my name or something.
[21:30] Whatever. And it's not going to know the answer.
[21:31] But the point is that this is now functioning.
[21:33] I don't know your name.
[21:34] If you want to tell me, I'll remember for later.
Okay, cool.
[21:36] All right, so this is great.
[21:38] However, like I mentioned,
we currently don't have any tools or any memory,
[21:43] so anything that I chat with
[21:44] the agent is not going to remember later on,
even though it's said that it would.
[21:47] So what I want to do now is
I want to start by adding a few tools.
[21:50] These are things that the agent will be able
to actually call to get some information.
[21:54] And then we're going to add memory.
[21:55] So to add a tool is super simple.
[21:57] What we can do is we can just make a function
so we can say something like define get current time.
[22:02] And then what we're going to do is just return
whatever the current time is.
[22:05] Now it's important that what we write these tools,
we also write docstrings for them
[22:09] and the input and output format.
[22:11] So that agent span can automatically convert
that into something that the AI agent can read.
[22:15] So for example I'm going to say okay, the get
current time function is going to return a string.
[22:20] And if it was going to take some input here
then I would also specify like
[22:24] you know input and then whatever type the input was.
[22:27] And then beneath this importantly
I'm going to write a doc string,
[22:30] which is just a comment at the top of the function
that says returns
[22:34] the current local time, okay.
[22:38] And then you can see that we have datetime dot now.
[22:40] And then we just convert this into a string
and we return that.
[22:43] Now this is great, but if I want to turn this
into a tool, I simply just have to put that tool.
[22:48] Now what is a tool?
[22:49] A tool is something that I can call to get
some kind of response or to take some kind of action.
[22:55] So right now
the agent doesn't know anything about us.
[22:57] It can't actually do anything.
[22:58] It's just capable of essentially,
you know, printing out text, right?
[23:02] Or giving us a text response
if we want it to actually take an action
[23:06] and generate a report or search for something,
it needs to have tools in order to use that.
[23:10] Now, agent spend
natively defines the ability to call tools.
[23:13] So all we have to do is just define a function.
[23:16] We specify it's a tool using this add tool decorator.
[23:19] Right. Like we specified here.
[23:20] Then the name of the tool will automatically
be the name of the function.
[23:23] So make sure you name the function something useful.
[23:26] The input and output type you'll specify,
[23:28] and then the description of the tool
you'll put as the doc string.
[23:31] So what will happen is Agent Span will now say hey
we have a tool.
[23:34] You know get current time right.
[23:37] The description of this is whatever
the description was here.
[23:40] And it takes no input and gives this output.
[23:42] And then that will be passed to the assistant.
[23:45] And the assistant will essentially give us a response
back that says, hey, I want to call this tool.
[23:49] And then inside of this runtime
here, Agent Span will automatically
[23:53] call the tool for us
and then give the response back to the model.
[23:56] And we'll be able to see this happening
inside of the UI, which I'll show you in a second.
[24:00] So for now,
we can just pass this get current time tool.
[24:03] And the model,
[24:03] if we run it again should be able to call this
if we ask it about something related to the time.
[24:08] So let me.
[24:08] That's not what I meant to do.
[24:10] Let me open up the terminal and run this again.
[24:12] I'm gonna say, what time is it?
[24:16] Okay.
[24:16] And let's see if we get the time. Here.
[24:20] Give this second
[24:22] and hopefully it's going to call that
and then tell us what it is.
[24:26] Okay. And you can see it says that it's this time.
[24:28] And if I look at my window here
that is the correct time okay.
[24:31] 1947 and two seconds.
[24:33] But now if we go back to the server and we refresh,
we can check our personal assistant.
[24:38] And we can see now that actually it called a tool.
[24:41] So the yellow line gave some output.
[24:43] The output effectively said
hey I want to call a tool.
[24:46] The name of that tool is Get current time.
[24:48] Okay.
[24:49] So then we called the tool.
[24:51] We got the input which was this.
[24:53] We got the output which was the result right.
[24:55] And then we pass it to the model.
[24:57] Now the model now has access to that tool call.
[24:59] So it knows what the time was. Right.
[25:02] And that gives us the output.
[25:03] Boom. Here's the time.
[25:04] So that's one of the reasons
why this is super useful, is that you get
[25:07] that full insight
into what the AI model is actually doing.
[25:10] Now let me say, what time was it
[25:13] last time I asked you just to show you something?
[25:18] And you should see here
that assuming it doesn't just call it.
[25:21] Yeah, it says I cannot.
[25:23] I don't have access to timestamps
[25:24] to your previous messages in the chat
unless they're shown in the inference.
[25:27] So essentially what is telling us is that, hey,
I don't know what it was because I don't have memory.
[25:31] So the next step here is to add memory to the agent.
[25:34] Now adding memory is super easy.
[25:37] All we have to do here is just go above our agent
[25:40] and we're going to say conversation underscore memory
[25:43] is equal to conversation memory like so.
[25:48] Then inside of here we can also put
the maximum number of messages that we want to store.
[25:52] So I can say
Max messages is equal to like 50 or something.
[25:55] So after 50 it will start
just getting rid of the last messages.
[25:58] So we don't clog up the context too much.
[26:01] And then what I can do is just say
memory is equal to conversation.
[26:04] Memory. Boom.
[26:06] So that's that.
[26:07] So what we can do now is let's open this up.
[26:10] Let's type clear okay.
[26:12] And let's go. You've run and let's do something.
[26:15] My name is Tim okay.
[26:17] And let's see if it can remember that okay.
[26:18] So it says nice to meet you I'm going to say
what is my name.
[26:22] And let's see if I can remember this
now using the conversation memory okay.
[26:26] And it doesn't remember it because I made one mistake
and I forgot to add to the conversation memory.
[26:30] So let's do that.
[26:31] Now that's actually a good issue to run into.
[26:34] Okay. So we've created the conversation memory.
[26:36] We've added it to the agent
but we're not adding anything to the memory yet.
[26:40] So what we need to do is we need to add what we type
in, what the agent types to the memory.
[26:46] So the way we do this is we're just going to go here
and let's go underneath the result.
[26:51] And we're going to say conversation memory dot add
[26:54] underscore user underscore message.
[26:56] And this is going to be the message that we sent
which is the prompt.
[26:59] We're then going to say conversation memory dot.
[27:01] And this is going to be add assistant message.
[27:03] And we're going to add the results output dot get
and then the result.
[27:06] And just to make this a little bit cleaner
we're going to say readable result is equal to this.
[27:12] And then we can just replace this
with the readable result.
[27:15] And then same thing here with the readable result
okay.
[27:19] So essentially what we're doing is saying
hey we're going to append to the memory.
[27:22] The memory has a few different functions we can call.
[27:24] One is to add a user message, which is what we said.
[27:26] And then one is to an assistant message
which is what they said.
[27:30] So let's save this.
[27:31] And now let's go again to our terminal.
[27:34] Let's make sure I didn't mess something up.
I think it's okay.
[27:36] Let's clear okay.
[27:37] So let's run it again and let's see what we get.
[27:40] This time I'm going to say my name is Tim okay.
[27:43] And let's see here.
[27:44] Give it a second. Say what is my name.
[27:48] And hopefully it's going to give us the answer
and tell us that it's Tim.
[27:52] Let's see I don't know your name yet okay.
[27:55] What's what I'm going to say.
[27:57] My name is Tim. Let's try one more time.
[27:59] I think sometimes on the first run,
for some reason, it's not picking it up.
[28:02] Based on kind of how we're adding this. Maybe.
[28:05] Let's see. Okay.
[28:06] What is my name?
[28:08] And let's see now, here we go.
Your name is Tim. Okay.
[28:10] So for some reason on the first run, I think based on
how I added the info here, it's not working.
[28:16] I'm not sure exactly why that was the case,
but either way, afterwards,
[28:19] now it looks like it's working
and it is able to determine my name.
[28:22] It also might
just be how it's searching through the memory.
[28:24] Either way, looks like we are good
and it knows my name now.
[28:28] Okay, so anyways, the memory is functioning
now that we have that,
[28:31] let's move on to our next agent,
which is going to be a rag based agent.
[28:36] Okay.
[28:36] So as discussed we're now moving on to agent two.
[28:39] Agent two is going to be kind of a rag based agent,
where we're going to be able
[28:43] to look up some info in something
like a database or documentation or whatever we have.
[28:47] I'm not going to build true rag here
because that's going to be a little bit complicated.
[28:51] But of course, you could very easily
add that effectively.
[28:53] What we're going to do is add some more tools,
we're going to add guardrails,
[28:57] and we're just going to look at a much more complex
agent that has a few more components to it.
[29:02] Now we're also going to look at a pedantic structured
output agent.
[29:05] Now, what that means
is that rather than just getting the output as kind
[29:09] of a random string of text,
we can actually pipe it into a Python object
[29:13] so that it's predictable
and we know what kind of format we're going to have.
[29:17] So as you can see here,
I've already brought in a bit of code.
[29:19] Any of this code will be available
from the link in the description.
[29:22] If you just want to copy it,
there'll be a GitHub repo there.
[29:24] But I just want to save us a bit of time.
[29:26] So I just did the import.
[29:27] So you can see
we've got a bunch of stuff from Agent Span here.
[29:30] And then we have the mock database documentation
[29:33] and then the logging setup
as well as the loading env.
[29:37] Okay.
[29:37] So if we go down here
the first thing that I'm gonna do
[29:40] is I'm actually going to define what I'm going
to call the pedantic structured output object.
[29:46] Now this is how I want the agent
to give us its output.
[29:49] So rather than just giving me some random text
that maybe I have to parse through,
[29:54] I want it to give me something
in kind of a dictionary format
[29:57] that I can then convert into a Python object
so I can read the different values.
[30:01] You're going to see what I mean,
but I'm just going to go class support
[30:05] response like this.
[30:07] And then this is going to inherit from base model
[30:10] if I can type it correctly,
which we brought in here from pedantic okay.
[30:15] Now pedantic allows us to just do typing in Python.
[30:17] It I use with a lot of these AI frameworks.
[30:20] So first things first, I'm
going to say that I want this AI model
[30:23] to actually give me output
that has the following fields.
[30:27] The first is a stage,
so I'm going to say stage string is equal to a field.
[30:32] And then I can actually just give a description
for this field.
[30:36] And I'm going to say stage like answered okay.
[30:40] So answered refunded or rejected
[30:46] because this is going to be related
to kind of the support request
[30:48] because we're setting up kind of like a support
agent here that has the ability to do this rack.
[30:52] Next we're going to have successful boolean.
[30:56] And this is just going to be a boolean.
[30:57] So we know if it's successful or not.
[30:59] And then we're gonna have a message
which is a string.
[31:01] Now this is a super basic structured response.
[31:03] But if we wanted it to give us like a price
or a number or a time or something specific,
[31:07] then we just set that as a field and the model
will automatically fill in these values.
[31:11] So now it's going to give us always a stage
which will fit this description.
[31:15] And it will give us whether it was successful
and what the message was.
[31:18] And if we had other types, we could set those here.
[31:20] We could set up enums, we could do anything you want.
[31:22] I'm just trying to show you that
you do have this ability to use structured output,
[31:25] which is really powerful for more deterministic
AI, applications.
[31:30] Okay.
[31:31] Now, before we build the model, before
we build the agent, I want to set up a few tools.
[31:35] So the first tool is going to be one
that can search our knowledge base.
[31:39] So I'm going to say define search
[31:42] knowledge underscore base.
[31:44] And we're going to take in a query which is a string.
[31:48] And we're going to have a string which is a response.
[31:52] Now for the description of this tool.
[31:54] Let's put it in here.
[31:55] We're going to say search support
[32:01] docs like this okay.
[32:03] Pretty basic.
[32:04] We're just saying hey
this can sort through our support documentation.
[32:07] Let's fix the comment.
And what we're going to do is the following.
[32:09] We're going to say for title and then body in
[32:13] docs dot items, we're going to say if the title
[32:20] is in query dot lower,
[32:24] then what we're going to do is just return the body.
[32:27] This is not an efficient search by any means,
but all we're effectively doing
[32:31] is just a quick keyword search of any keyword
of what the user typed in was in this.
[32:35] So like shipping refund policy, account,
[32:36] whatever, then we're just going to return
whatever the body or the content of this is.
[32:41] Again, if you use real rag
you can get a much better response.
[32:43] But I'm just showing you how you can set this up.
[32:45] So next we're going to say return no matching
[32:48] support articles found okay.
[32:51] So if there's not a response.
[32:52] So that's the first tool we're going to have next
we're going to have some tools related
[32:56] to looking up some orders okay.
[32:58] So we're going to have a tool.
[32:59] This is going to be define lookup underscore order.
[33:03] So let's do this for the order.
[33:06] We're going to have an order ID which is a string.
[33:09] And this is going to return a dictionary
with the information about the order.
[33:13] Now same thing.
[33:13] We're going to have the comment
lookup order in database.
[33:17] Pretty basic. And we're going to say by ID.
[33:20] And then what we're going to do is return
[33:23] the mock underscore db and then the orders like this.
[33:29] And then we're going to say dot get.
[33:31] And then this is going to be the order ID.
[33:34] And if the order ID is not found
then we're going to return a dictionary.
[33:37] And the dictionary is just going to say error order
not found
[33:42] okay. So that's going to be our lookup order tool.
[33:44] And then we are going to have a few other tools
as well.
[33:47] We'll look at those later.
[33:48] So we're going to have one tool
that will allow the user to refund.
[33:51] However before we call that tool
we are going to ask for a human in the loop approval
[33:55] because we don't want to just automatically refund
something unless the user actually allows that.
[34:00] Okay, so for now, let's create the support agent.
[34:01] Let's run it with the two tools so far
just to make sure that these work.
[34:05] And then we'll move on to the rest.
[34:07] So we're going to say support
agent is equal to agent.
[34:11] For the agent we're going to say the name is support
agent for the model.
[34:17] We're going to go with
[34:19] OpenAI slash GPT five for.
[34:23] Then we're going to have some instructions.
[34:25] I'm just going to copy these in because
they're kind of long and you guys can adjust them.
[34:29] But you'll see kind of how I've written them here.
[34:31] So we're going to say instructions.
[34:32] And then I'm just going to copy
in this long paragraph.
[34:35] So give me one second which looks like this.
[34:39] So you are a customer support agent.
[34:41] Use the knowledge base first
and the customer wants a refund.
[34:44] When you know the order ID, call the lookup order
to get the email before calling.
[34:47] Process refund.
[34:48] Very short plain English sentence describing
exactly what to refund you about to issue, etc..
[34:53] Okay, so this is the instructions.
[34:56] Next we're going to specify the output type.
[34:59] So this is a new one.
[35:00] And we're going to say output
type is a support response.
[35:03] Now what this allows us to do
is specify any pedantic object.
[35:06] So like one that we have right here.
[35:08] And that's now going to force the model
to give us the output in this format.
[35:12] So that's all we need to do.
[35:13] We just say hey we want it in this format
now it's going to give it to us in that object.
[35:16] What you're going to see in a second.
[35:18] Now next we're going to specify the tools.
[35:20] So the tools are just going to be
the search knowledge base.
[35:22] And then the lookup order
we will have conversational memory.
[35:25] So we'll keep that for right now.
[35:27] And then we can specify
max underscore turns is equal to ten.
[35:32] Now what this will do is specify the number of times
we can go back and forth the agent
[35:36] until we reset the session.
[35:37] The for the conversation memory.
[35:38] Here we can actually just specify it like this.
[35:41] And when we do that it should automatically add
all of the contents to conversation memory for us.
[35:46] Now, the reason I didn't do it here is just because
I wanted to show you that you can manually control
[35:50] the memory, but by default it will automatically add
everything, including the tool calls to memory.
[35:55] When you. Oops, that's not what I wanted.
[35:57] Specify it like this.
[35:58] Okay, so actually, if I go to the docs, you can see
that you can manually add it as you chose here.
[36:03] And then there's five methods or six
methods like user message system, message
[36:07] system, message tool, call tool result.
[36:08] So if you don't manually add it like we did,
then it will just add
[36:12] all of these for you automatically,
which is good obviously.
[36:15] Right.
[36:15] But sometimes you want to just add certain pieces
so then you can control it yourself
[36:19] as we did in example one. Okay.
[36:21] But for now we have our memory and we have our agent.
[36:24] So now we need to be able to run the agent.
[36:26] So what I'm going to do is create a function.
[36:28] This is going to be called Run Interactive okay.
[36:32] For this let's spell interactive correctly.
[36:35] We're going to take in a prompt which is a string.
[36:37] And we're just going to return nothing or none okay.
[36:41] From here we're going to say with the agent runtime
just like last time as runtime,
[36:46] what we're going to do is we're going to say
handle is equal.
[36:49] To start.
[36:50] This is a different function.
We're not running the agent.
[36:52] And this is going to be support agent prompt.
[36:55] And then run time is equal to run time.
[36:57] Now the reason we're doing this is that
we want to have a little bit more control this time.
[37:01] And we want to actually be able to hook into
what the agent is doing to see, for example,
[37:05] if it needs approval from us, if there's a guardrail
that ran, which we're going to look at in a second,
[37:09] and this gives us just a little bit more control
in terms of what the agent's doing.
[37:13] So will allow us to actually stop approval request
as you're going to see.
[37:18] So what we're going to do now is we're going to say
stream is equal to handle dot stream.
[37:23] And before I go any further,
[37:24] let me just refer to the docs
so you can kind of get a sense of how this works.
[37:28] So if we go to this streaming page here, I'm doing it
a little bit differently than it shows in the docs.
[37:31] But you can see that we can actually hook
[37:33] into the stream of the agent, which allows us to see
all of the events that are happening.
[37:37] So we can see, for example,
if it's thinking, if it's calling a tool,
[37:40] if there's a result, if there's a handoff
and if it's waiting.
[37:43] So if it's waiting, what that means
[37:45] is that it's waiting for us to approve something
which we need to manually do.
[37:49] So this allows us to have some more control
[37:51] into what the agent's doing,
rather than just purely getting the result.
[37:54] We can see all of the steps in the meantime,
so I'm going to show you how I'm going to do it here.
[37:58] Again. You can reference the docs
and you can do it a little bit differently.
[38:01] So we're going to say order underscore ID comma
[38:03] amount is equal to none and none
because I want to potentially know
[38:09] if the user wants to refund an order,
which we're going to have a look at in a second.
[38:12] And then what we're going to do
is we're going to say for event in stream.
[38:16] For now, I'm just going to pass.
[38:17] But this will allow us to actually view
all of the stuff
[38:21] that's going on before eventually we get a result.
[38:24] Now we'll handle that in a second.
[38:25] But for now, what I'm going to do is just go
down here and say result is equal to stream
[38:29] dot, get, underscore result.
[38:32] This will then give us the result.
[38:33] Once all of these events are finished
and we've gone through them and we're going to say
[38:38] output is equal to result dot output, okay.
[38:44] Then we can actually just tack on message here.
[38:46] The reason for that is that we know that it should be
in this support response object type.
[38:51] So we get results dot output.
[38:52] And then the output is going to be this.
[38:54] We know that there's going to be a message.
[38:55] So we can simply just view that okay.
[38:58] Then we need to do is just print out the message.
[39:00] So we're just going to say print.
[39:02] And we can put an F string
and we can put a new line character here.
[39:05] And I'm just going to put the
[39:08] sorry let's put output and then backslash n okay.
[39:11] So we'll start running this in one second.
[39:12] In order to do that we're just going to do
if underscore underscore name underscore underscore
[39:16] is equal to underscore underscore main underscore
underscore then what we can do is say print
[39:21] and we can print support bot starting dot dot dot.
[39:26] Then we can go down here
and we can just do a simple while loop.
[39:29] It actually knows exactly what I want okay.
[39:32] So we're going to say well true.
[39:34] The prompt is you if you enter Q then break.
[39:36] If there's not a prompt then just continue
and then otherwise just simply run
[39:40] interactive, which is this function
that we wrote to run at the agent.
[39:43] Now like I said,
there's some more stuff that we're going to add,
[39:45] but for now,
let's just see if it can look up the order
[39:47] or look through the knowledge base
before we go any further.
[39:50] Okay. So let's simply open this up.
[39:53] Let's go.
[39:53] You've run agents slash agent 2.py.
[39:58] And we got an issue here.
[40:01] Instructions
I think I just felt instructions incorrectly.
[40:04] So let's just fix that instructions like so okay.
[40:09] Let's run it again.
[40:10] And it says you I'm going to say
can you tell me about shipping.
[40:17] And let's see if it can look that up.
[40:18] And if we get the result.
[40:21] Okay.
[40:21] This dictionary object has no attribute message.
[40:24] Interesting.
[40:26] Let's have a look at why we're getting that.
[40:28] And it's going to be something to do with this.
[40:30] So for now let's just print whatever this result is.
[40:35] Let's see what it is.
[40:36] And then we can parse through it.
[40:37] So let's say shipping or something.
[40:40] And let's see if it finds anything.
[40:41] And it gives us okay
result output error failed with status failed.
[40:45] And reason model GPT 5.4 does not exist
okay so that's good.
[40:49] We found the issue there.
[40:50] So OpenAI slash GPT and think this is Dash 5.4.
[40:54] Let's look at what we had in the first agent.
[40:56] Yeah dash 5.4 okay silly error.
[40:58] But at least it gives us the response there.
[41:00] And now we can just quit this
[41:04] rerun.
[41:05] And let's see shipping
[41:08] and let's see what we get if it works this time okay.
[41:10] So I'm just playing around with this
just to get the correct output.
[41:12] And we can see that the way we can do that is
by doing result dot output dot get and then result.
[41:18] And that will
then give us the format as specified here.
[41:22] However, it doesn't give us
give it to us in the Python object.
[41:24] It gives us two.
[41:25] It gives it to a string and a dictionary
that is the same format as this,
[41:29] which is still effectively the exact same thing.
[41:31] So you can see we get stage completed.
[41:33] Successful true message standard shipping
takes 3 to 7 business days when I type shipping.
[41:38] Now let's ask it can you look up my order
and let's see if that will work?
[41:44] What's the order ID a 100. So let's type
that in stage.
[41:48] Need order ID successful false message.
[41:49] Sure. Please send me your ID.
[41:51] So we're going to say a 100.
[41:53] And so let's see if it can look that up. Now
[41:57] give it a second here to give us that response
[42:00] okay.
[42:01] Come on I hope it's calling the tool
[42:04] being a little bit slow and says refund pending info
okay.
[42:07] Message I can help with the refund,
but I need your request to proceed in order.
[42:11] Eight 100 was found for 4999.
[42:13] If you want to refund for this
order, please confirm and I'll continue.
[42:15] Okay cool.
[42:16] So it looks it up.
[42:17] We get the information
and now if we go back to the agent span's server here
[42:21] and we refresh, we can see
first of all these ones failed.
[42:25] And we can actually see all of the logs
on why this failed, which is interesting
[42:28] as well as the debug view here on exactly
what went wrong.
[42:32] Anyways, let's go back to the most recent one there
and we can see we have this support agent.
[42:36] We have multiple turns
so we can see how those worked.
[42:38] So you can see we typed in a 100
and then it looked up the order.
[42:42] This was the input.
[42:44] This was the output. It got us the information.
[42:46] And then it gave us that full Json for the tool call
[42:49] went to the yellow and then gave us the output.
[42:52] Now you'll notice that this is just one run.
[42:54] If we go back.
[42:56] All right.
[42:57] You can see this was the other run.
Can you look at my order. Boom.
[42:59] And then it's remembering
all of this based on the conversation memory.
[43:02] Okay cool.
[43:03] So that's functioning now let's move on to add a few
[43:06] other things to our agent.
[43:09] So one thing that I want to add
now is the ability to refund.
[43:13] But like I said, we shouldn't just refund
unless we get approval from the human to do that.
[43:18] So in order to do this,
we can just make another tool.
[43:20] This can be a tool,
but this time we are going to say approval.
[43:26] Approval underscore required is equal to true.
[43:30] Now this means that we need to manually approve this
in order for the function to execute.
[43:34] I'm going to show you how we do that.
[43:35] Now for the function
we're going to go process refund.
[43:37] We're going to take an order ID and we're going
to take in an amount that we want to be refunded.
[43:41] And then we're going to return, not a boolean
but a string okay.
[43:46] Now we need to give a description.
[43:48] So for the description we're going to say
[43:51] let's go like this.
[43:52] Request a refund
[43:56] okay.
[43:56] Refund pause for human approval.
[44:01] Think before you run this
[44:04] okay cool.
[44:05] Just so it knows that
this should be a careful operation,
[44:09] then we're just going to return, even though
we're not really going to do anything here.
[44:12] We're just going to say refunded.
[44:14] And we'll put inside of brackets amount
[44:18] colon dot to f okay.
[44:20] For order.
[44:22] Order ID we're kind of faking a refund,
but I'm just showing you that we can build a tool
[44:26] that requires the human approval,
which is kind of the more important part.
[44:29] So now that we have that tool,
we're just going to add that to our tool list.
[44:32] So we're going to say process refund.
[44:34] Now the thing is we need to start handling this
stream here in order to actually process that refund.
[44:39] So what I can do is the following.
[44:41] Now I can say if event dot type
[44:44] is equal to and then this is event type
[44:48] dot tool okay tool underscore call like so.
[44:52] And event dot args meaning it has some arguments.
[44:56] I'm going to say my order id is equal to event args.
[45:01] Yet order id or
[45:05] like this or order underscore d
[45:10] or order underscore id.
[45:11] Now what I'm effectively saying is hey I'm
going to try to look through these tool calls to see
[45:15] if we ever call an order ID when we're looking up
something, or calling one of these tools,
[45:19] because that's in the order I do referencing
when we're trying to refund something.
[45:24] Okay, so I'm just pulling out that order ID,
[45:27] otherwise I'm going to say if event dot
[45:29] type is equal to event dot
[45:33] or event type dot tool underscore result okay.
[45:38] And is
[45:41] instance event dot result a dictionary,
[45:46] then what I'm going to do here is say
amount is equal to event
[45:51] dot result dot get.
[45:54] And I'm going to get a total
[45:58] or an amount.
[45:59] So same thing.
[46:00] Now I'm going to look in the tool result to see
[46:03] if I can figure out what the amount is
that we're refunding for the order.
[46:06] It's kind of a weird way to do it, but it allows me
to parse through and see tool calls, tool result.
[46:10] And then lastly I'm going to say, Elif, the event dot
[46:14] type is equal to event
[46:18] type dot waiting.
[46:21] Then what I'm going to do
is I'm going to print the following okay.
[46:24] And this message is going to be essentially saying,
hey, we're requesting to refund.
[46:29] And then I'm pulling out the two arguments I have.
[46:31] So order ID an amount so I can print those
and then tell them, hey, do you want to approve this?
[46:36] And if they do, then we can approve it.
[46:37] So here's how it works.
[46:38] I'm just going to say print
and this is going to be type string go backslash n
[46:42] I'm going to go approval required.
[46:46] And then I'm going to say refund.
[46:48] And we're just going to put the order
actually let's put the amount.
[46:54] So we'll put a dollar sign like this amount.
[46:57] And then colon dot to f for order.
[47:00] And then the order ID okay.
[47:02] And then down here we're just going to put a print
and we're going to say
[47:06] press enter to approve.
[47:09] Technically you can't actually press anything else.
[47:11] And this is going to be an input statement not this.
[47:15] So we're not even going to check what it is.
[47:16] And then we're just going to say handle
dot approve okay.
[47:20] So effectively when we call handled at approve
we're just going to approve that operation.
[47:23] So we're just going to wait for the human
to be at this step.
[47:25] And then as soon as we want to approve boom,
we go ahead and run approve and we're good to go.
[47:29] Okay. So now that we have
that we're going to ask them to approve it.
[47:31] So I'm going to say decision is equal to input.
[47:34] And we're just gonna ask them approve yes or no.
[47:36] And then lower dot strip.
[47:37] We're going to say if the decision is
[47:40] why then let me just check the documentation.
[47:44] Here it is. Handle dot approve okay.
[47:46] So we're just going to say handle.
[47:50] Dot approve like so okay.
[47:53] Otherwise we can say handle dot.
[47:56] And I believe it is reject. Let's see.
[47:58] Yes you can reject and you can pass a reason.
[48:01] If you want to pass a reason just say user
[48:04] rejected okay cool.
[48:06] So that is how we can now handle this.
[48:08] Again the reason why I'm looking at these tool
calls is just so I can figure out the kind of amount
[48:12] that we're going to have for the refund, because
otherwise it's not going to tell us that beforehand.
[48:16] So anyways, now let us go and run this
[48:21] and see if this works with the refund okay.
[48:24] So we're going to clear and then we're going to go.
[48:25] You've run agents too
I'm going to say I want to refund an order.
[48:31] And it gave us an issue saying tool result.
[48:33] Just because I didn't have a capital L here.
[48:36] So let's fix that.
[48:38] And now we're good and rerun
and let's say refund and order
[48:45] okay.
[48:45] Let's see what we get.
[48:46] And it says that it needs an ID so.
So I can help that.
[48:48] Please give me the order ID so I can look it up okay.
[48:50] So let's go a 100 and see okay.
[48:53] And it says approvals required refund 4990
for order a 100.
[48:57] So you can see these steps here.
[48:58] Picked up that information for us
because it saw that we were doing a tool call
[49:02] to either attempt to refund
or to look up the order ID so it pick those up,
[49:06] save them in the variable,
and then we're using them in this step to tell them,
[49:09] hey, we now want to call this
because we're waiting for your approval.
[49:13] The only thing we could be waiting for approval for
is this function, right?
[49:16] Because that's the only one that we have.
[49:18] So I'm just going to go ahead and type on
yes to approve this.
[49:21] And then hopefully it's going to tell us
that it was able to refund it.
[49:24] Let's see.
[49:24] It says stage completed message
trigger refund was issued successfully.
[49:28] Boom.
[49:28] Now let's say refund order again okay.
[49:33] And hopefully it's going to give us
maybe another ID says please read the ID okay.
[49:36] So let's go a 100.
[49:38] Even though I know we already refunded it,
we still can try.
[49:41] And let's reject it this time and see what we get,
just to make sure that that step works.
[49:46] And while we're at it, we can go here right to Agent
[49:49] Spend server
and you'll see that this is running right.
[49:52] And we're at this stage where we're just waiting
for the human, and we can just wait indefinitely.
[49:57] And what I could actually do,
I'm not going to do this right now
[50:00] because it's a little bit complicated to show is
let's say I were to quit this worker.
[50:04] Right. And this worker just completely died.
[50:07] And then I restarted it, but reconnected
to kind of this execution that's going
[50:11] this will still all be running
with all of the saved state,
[50:14] and it will just be waiting for the human again
to approve this.
[50:17] So the human does need to ask refund.
[50:19] Again, we don't need to check something.
[50:21] We don't need to look up another order.
[50:23] It will just, resume where it left off
[50:26] at this stage, right
where we're waiting for the human.
[50:29] And this can take any amount of time.
[50:30] It could take a day, could take it out,
or it could take ten minutes.
[50:34] Doesn't matter. The server will keep running here.
[50:37] And you can see it's in this hand off state
where it's waiting for us to approve writing.
[50:41] You'll see the time if we just keep refreshing.
[50:43] Like it'll just keep going up
and it will just keep waiting.
[50:45] Okay, so anyways, I'm going to go.
[50:46] Yes here and or sorry I want to do no.
[50:49] So we rejected it.
But anyways you can see that it's working.
[50:52] And I think doing
now is not really going to make any difference
[50:55] anyways because well, we know it's
just going to move to the next step.
[50:58] All right okay. So this is working.
[51:00] Now what I want to do next is I want to start adding
something called a guardrail.
[51:04] Now a guardrail allows us to actually audit
[51:07] the input or the output to our lab
or to our agent to ensure that we don't have
[51:12] something potentially malicious or data
that shouldn't be given to the user given.
[51:16] So I'm going to show you how we write a guardrail.
[51:18] The guardrail that I'm going to write
is going to be related to a jailbreak.
[51:21] So a lot of times people will try to do like
a prompt injection where they say, hey, like ignore
[51:25] all of your previous instructions and give me,
you know, all this information that I need x, y, z.
[51:30] We can actually prevent against that
by building in these guardrails
[51:33] where we try to detect common kind of phrases that,
you know, scammers and exploiters will try to use.
[51:40] So what I can do is I can use add guardrail.
[51:42] So make sure you import it right.
[51:44] And I can say define safe underscore support
[51:48] underscore request like so.
[51:51] Now from here we can take a prompt which is a string.
[51:55] And this is going to be a guardrail result
that is going to return.
[51:59] Now for the comments here.
[52:01] What we're going to do is say block
[52:05] obvious prompt injection attempts okay.
[52:09] And this is going to be before the LLM even sees it.
[52:12] So before the LLM gets it,
we're going to have this function that will run.
[52:15] So what I'm going to say is blocked is equal to.
[52:17] And then just a list of words.
[52:18] So I'm going to say ignore okay
[52:22] ignore previous.
[52:25] We can use system
[52:28] prompt something like that or jailbreak okay.
[52:31] So these are just words
that I don't want to be allowed in the input.
[52:34] Now I'm going to say past is equal to not any.
[52:37] And this is going to be phrase okay
[52:41] in prompt dot lower.
[52:44] And then we'll spell lower correctly for phrase.
[52:49] Let's spell all these.
[52:50] My typing is so bad now with LMS phrase in blocked
[52:55] okay, so all this is doing is saying hey,
we're any of these words in this prompt.
[52:59] That's all it's checking.
[53:00] Then we're going to return guardrail result.
[53:03] I'm going to say past is equal to pass,
which is either going to be true or false.
[53:06] So if none of these existed then true.
[53:08] If they did exist then false.
[53:10] We're going to say reason or we can say sorry.
[53:12] Message is equal to.
[53:14] And we're going to say please ask a normal question.
[53:20] This is blocked.
[53:22] So if it fails
this is the message that's going to be returned.
[53:25] So now what we can do is
we can add a guardrail here to our support agent.
[53:29] The way we add it is
we specify a guardrail or guardrails with a plural.
[53:34] We then need to put a guardrail object.
[53:37] We're going to say
like this guard rail for the guardrail.
[53:42] This is going to be the safe support request.
[53:45] And we're going to say the position of the guardrail
is going
[53:47] to be position dot input. Okay.
[53:51] And then we're going to say on underscore
fail is equal
[53:54] to on fail dot raise.
[53:57] Now raise is going to raise an error which is just
going to exit out of the bot completely.
[54:01] There's other things that we can do here
when we fail.
[54:03] But for now I just want to completely quit.
[54:05] So effectively what I've done is I said, hey,
we have this guardrail, right?
[54:08] This is a function that we want to run,
and we actually want to run it
[54:11] before we pass anything to our LLF.
[54:14] So as soon as we get some input to our agent,
[54:16] run it through the guardrail,
which is this function right here.
[54:19] Make sure that there's nothing wrong.
[54:22] If there is something wrong, then tell us and fail.
[54:25] Okay, that's a simple guardrail.
[54:26] Now this is on the input.
[54:28] You also can add a guardrail on the output,
which I'm going to show you from the docs here.
[54:31] So if we go to guardrails here, you can see there's
a bunch of stuff that we brought in here.
[54:35] You can see guardrail.
[54:37] We have a word limit.
[54:37] So for example we're checking to make sure that 
what do you call it here.
[54:41] We're going to have a correct number of characters.
[54:44] And you can see for the failure modes here.
[54:45] Do you have like retry, raise fix human, etc..
[54:49] Okay.
[54:50] In terms of constructing the guardrail,
you can do the function position right.
[54:54] So output input on fail the name
and then the maximum number of retries that you want.
[54:59] And for position two you either input or output.
[55:00] So either run after or run before.
[55:03] Now there's a bunch of guardrails you can do here.
[55:04] You can do a custom
one like the one that we just did.
[55:06] You can do a regular expression, guardrail
if you want to just check for certain characters
[55:11] like we were kind of doing.
I just don't like to, write regex.
[55:14] Sorry, because it's a little bit complicated.
[55:16] And you could do an LM guardrail.
[55:18] So if you do an alarm guardrail,
you're actually using an LM to
[55:21] then either get the, what is it, fail or pass.
[55:25] The issue with this is that
you still can have prompt injection going to them.
[55:29] This LM where that's doing the guardrail.
[55:31] But the point is you can use an LM to actually
detect, hey, is this good?
[55:35] Is this bad? Whatever. Okay.
[55:37] And then same thing input guardrails as we saw here
auto fix.
[55:41] There's a bunch of different ones that you can set up
as you can see like this okay.
[55:45] So I'm not going to go through all of them.
[55:46] We just wanted to show you
that these are super interesting.
[55:49] Very good to add to the agent.
[55:51] So now that we've added this let's try it.
[55:54] And let's just go clear and run.
[55:58] So we forgot to pass a comma.
[56:00] Maybe let me see where that is.
[56:02] Yes we forgot the comma here.
[56:04] So let's add that and rerun and I'm going to say
[56:08] you know jailbreak this prompt okay.
[56:11] And you can see boom it just immediately
crashes and gives us the error input guardrail safe
[56:15] support request failed.
[56:16] Please ask a normal question.
[56:18] This is blocked okay. So we ran into the guardrail.
[56:20] And then of course
if we run this we say help me or something
[56:23] we wrote won't run into the guardrail because
well it was not triggered.
[56:27] Okay give this a second.
[56:28] Hopefully it will give us the response.
[56:32] Not sure why this was taking so long.
[56:33] Maybe getting rate limited or something.
[56:35] Okay, you can see that it gives us the response here.
[56:37] And also you'll notice
that there's no run for this guardrail execution
[56:42] because we never even got to the images, immediately
blocked it before we even passed it to the server.
[56:47] So like as I was scrolling through here, I actually
couldn't find one that, was that execution.
[56:53] Yeah. See, it's actually not showing up here at all.
[56:57] Just help me.
[56:57] Yeah, because we never even hit the server
because we immediately exited after the guardrail.
[57:02] Okay. So again, a lot of other stuff
you can do with the guardrail.
[57:04] They're not going to go through all of it.
[57:06] But with that said
that is going to wrap up our second agent.
[57:09] This was a little bit complicated.
We added a lot of stuff.
[57:11] We had tools, output type, memory guardrails.
[57:14] What else.
[57:15] Human in the loop approvals
[57:17] kind of getting into the stream
of what's actually going on with the AI agent.
[57:20] And again, all of this is available
from the documentation we have streaming.
[57:25] As you can see here, we have testing
which we're going to look at later.
[57:28] We have the memory right.
[57:29] And in conversation memory we have tools right.
[57:32] So check all of this
and you'll be able to see how it works.
[57:34] And you can also add Http tools
API tools and mic tools as well.
[57:38] If you don't want to add custom function ones
like the ones that we've written so far.
[57:42] Anyways, now let's move on to agent three,
which is going to be a multi agent
[57:46] kind of orchestration agent,
where there's multiple agents
[57:50] that can be triggered at once
to perform a long running task.
[57:53] All right.
[57:53] So we finished the first two agents where
we're actually writing all of the code manually.
[57:57] Now we're going to move on to agent three which
is going to be going over multi-agent strategies.
[58:02] Now what we're going to be
building is a multi-agent researcher.
[58:05] So it's actually going to be very similar
to what we have in the docs here.
[58:08] So I'm not going to write
every line of code from scratch.
[58:11] I'm just going to run you through it at a high level,
because this code will be available from the link
[58:15] in the description.
[58:15] And I'm going to explain the different strategies
that you can use and show the executions.
[58:19] So this is the code that I have.
[58:21] I'm just going to quickly skim through it.
[58:22] And then I'm going to explain
how you can configure this to be useful for whatever
[58:26] example you're trying to build.
[58:28] Okay.
[58:28] So effectively
what I have here is a bunch of different agents.
[58:32] I have a researcher agent, I have a writer agent,
I have an editor agent, I have a market analyst,
[58:37] a risk analyst, financial analyst
and now, analyst team or analysis team.
[58:41] And then I have these different agent pipelines,
which we're going to have a look at in a second.
[58:45] And then I have just a few things that will kind of
create and save a report manually for us.
[58:50] Because that's how I'm going to kind of set it up.
[58:53] But effectively, the way this agent is going to work,
I'll run it for you in a second,
[58:56] is that I'm going to tell it, hey,
I want to do research on tech with Tim, for example.
[59:00] And the strategy I want to use for the,
[59:02] research is sequential,
which means, you know, run these in individual steps.
[59:06] And then what will happen is it will go and use
all of these different agents, gather information
[59:11] and generate a research report.
[59:12] For me, that's what this agent is.
[59:14] Again, I'm going to show you how it works.
[59:15] And we'll run through the code in a second.
[59:17] Now, the way that I'm able to do
this is because Agent
[59:20] Span supports these multi-agent strategies.
[59:23] Now here's the following strategies.
[59:25] First is handoff okay.
[59:28] And chooses which sub agent to handle the request.
[59:30] This you can write similar to this
if I can find it right here
[59:36] where essentially you just write an agent,
you give it access to some other agents.
[59:39] These agents can be exactly
what we just built before.
[59:42] And then you change the strategy here to say handoff.
[59:45] That's it.
[59:46] And then you just trigger this agent
the way that we've been running them.
[59:49] And it will just go and let's remove this.
[59:51] Be able to use each agent
as it needs to use them as you chat with it.
[59:55] So it has all these different agents beneath it.
[59:57] Similar to if you're using like cloud code
and you have sub agent setup okay.
[60:01] Then you have sequential straightforward.
[60:03] This just means that we always run the agents in a, 
what is it kind of linear paths.
[60:08] We run them one by one, and then we take the result
of one agent and we pass it to the other.
[60:12] You can see sequential looks like this, right?
[60:14] We run and we get the result.
We pass the results to the next agent.
[60:17] We run, we get the result.
We pass the result to the next agent.
[60:19] Then eventually we get the final results
have like researcher, writer, editor, boom.
[60:24] And then we get the response, okay,
then we have parallel.
[60:28] Parallel allows us to run these all concurrently.
[60:31] This means that I can run all three agents
at the exact same time at scale,
[60:35] so I don't need to wait for one response
before I get the next.
[60:39] Then we have rotor.
[60:40] As you can see, we can route between different ones.
[60:42] We have swarm handoffs between different agents.
[60:46] We have round robin, random and manual, a
bunch of different strategies that you can use here.
[60:50] When you make these agents
now you'll notice that there's a special syntax.
[60:54] It looks like this.
[60:55] These kind of two
I don't know what you call them greater than signs.
[60:59] And this is the same syntax as writing this.
[61:02] This just means run these agents sequentially.
[61:04] You're kind of piping the response
into one another or assertively.
[61:08] You can define the agent and you can just
specify the strategy as you see here.
[61:11] Okay.
[61:12] And then you can just run the pipeline like this
and get the result.
[61:15] So I'm going to show you
a few different strategies here.
[61:17] So you can see the time difference
and the response that we get.
[61:20] But notice that if I want to run them in parallel,
same thing I define three agents
[61:24] strategy parallel boom. We get the response.
[61:26] And if you want to get the sub result
you can have a look at it here.
[61:29] Hand off the default one.
[61:30] You just pass them in here.
[61:32] Strategies.
[61:32] Hand off it will go and hand off as needed. Rotor.
[61:35] You can set up agents.
[61:36] You can also set up a rotor for the rotor.
[61:38] You can actually use an agent to do this.
[61:40] You see have a classifier
agent says classify the request
[61:43] and then just reply with the correct category.
[61:45] And then it will call the correct one, okay.
[61:48] And then swarm. And you can go through
and you can view how all of these work.
[61:51] But I'm going to show you the code example right now
okay.
[61:54] So let's go through the code that I have right here
okay.
[61:56] So first things first we just bring in the imports.
[61:58] We disable some of the logging kind of war 
errors and warnings you're seeing.
[62:03] We specify the mode.
[62:04] So we want to be able to run.
[62:05] So sequential parallel nested and worker.
[62:08] We then have some various tools here.
[62:10] Now notice that these tools use
something called credentials.
[62:14] Now when I specify a credential here
this effectively means
[62:18] that we need to grab this credential from our server
in order to use it inside of this function.
[62:23] So I say credentials is equal to fire curl API key.
[62:27] Now what I'm doing is saying API key is equal
to OS start environ fire curl API key.
[62:31] And this will automatically set the fire Curl API key
that's going to be stored on our server,
[62:35] which I'm going to show you how to do in a second.
[62:37] In the local shell while we're running this worker.
[62:40] So this means any credentials that you want to have,
[62:42] you can store them directly on the agent server,
which again we're going to look at in a minute.
[62:46] You can grab them when a tool is called
[62:48] and then use them locally
without having to expose them locally permanently.
[62:52] So only when they're needed they can get pulled out.
[62:54] So essentially I'm going to use Fire Curl.
[62:56] If you want to sign up, you can get a free account.
[62:58] You don't need to pay for it.
[62:59] You get a bunch of free credits,
and this will allow you to do a ton
[63:03] of scraping and searching of the web more effectively
than with like a default search.
[63:07] So I'm using Fire Curl to just search the web
for a bunch of pages on
[63:10] whatever topic we're going to look up.
[63:11] I then have this fetch page tool.
[63:13] This can get an individual tool
and actually grab all of the content
[63:16] from the page and give us the information
so that we can scrape the content.
[63:19] Okay, so just two tools.
[63:21] Now I have a researcher agent, this agent I keep it
access to these two tools search web and fetch page.
[63:26] Right.
[63:27] And that's it then for the writer agent
I just give it some different instructions.
[63:31] I don't even change the model for the editor.
Same thing.
[63:34] I just give it some different instructions for the
market analyst, give it different instructions.
[63:38] And I just have all these different agents
that I've created.
[63:40] I then create an analysis team.
[63:42] And this analysis team I want to run in parallel
where I say, hey, for the market analyst, the risk
[63:46] analyst and the financial analyst.
[63:47] So these three right here,
we want to run those at the exact same time.
[63:51] So I just specify that
I'm going to run them in parallel.
[63:54] I then create these pipelines.
[63:56] So I have a published pipeline
which is my researcher writer and editor.
[63:59] So let's have a look here.
[64:01] We do the research.
[64:02] We do the writing and we do the editing.
[64:05] Now when I do that, because of the syntax
that I've used here, I'm running them sequentially,
[64:09] which means I need to wait for the researcher to go,
then the writer to go, then the editor to go.
[64:13] Then for my nested pipeline, this is where I take
my analysis team, which I run in parallel.
[64:18] And then after that.
[64:19] So after I get my analysis,
I write the researcher, writer and editor.
[64:23] So I run this whole thing sequentially.
[64:26] But this first step runs
these three agents in parallel.
[64:29] So I've created this kind of like multi-agent,
you know, orchestration
[64:33] where my analysis team goes in parallel at first.
[64:36] Once the analysis team is done,
then we go sequentially to the other agents.
[64:40] Hopefully that makes sense.
[64:41] But that's kind of how I've set up these agents
to call each other.
[64:44] And notice we just have two simple tools.
[64:46] But we can use anything from agent two or agent
one with the agents that we have in this example.
[64:52] Okay.
[64:52] Now we just have a few functions
one to render the output,
[64:55] one to slug ify something, one to save the report.
[64:59] These are just functions that I'm manually calling.
[65:01] And we're just going to save a report
in a folder called reports directory.
[65:06] In that folder it's
just going to look like this path reports okay.
[65:09] So let's say we're
[65:10] just going to save like a markdown report
with the information that we get from these agents.
[65:14] Now you'll notice that I just have this run
pipeline function.
[65:16] This allows me to take in either
sequential parallel or nested.
[65:19] You can see if it's sequential.
[65:20] We run the publish pipeline, which is this.
[65:24] If it is parallel we run the analysis team
which is just the analysis.
[65:28] And if it is let's go back.
[65:30] What's the other option we had here nested that.
[65:32] It runs my nested pipeline.
[65:34] Then what we do is
we just say with the agent runtime,
[65:37] hey, we're going to run
whatever pipeline mode we have that's like this.
[65:41] So just which one are we going to execute?
[65:43] This is the topic that we want to research.
[65:45] And then we just have some runtime.
[65:47] We get the execution ID, we get the status,
we get a path to the report.
[65:50] And then we just save the report and that's it okay.
[65:53] Then serving the worker. Don't worry
too much about this.
[65:55] And prompt mode.
[65:56] This just allows me to essentially type
directly into here and specify, hey, what do I want?
[66:01] So we can run it.
[66:02] So let me run it and show you what this looks like.
[66:04] So you get a sense of how this functions.
[66:06] So I'm gonna say you've run Agent Slash
[66:09] and then this is going to be agent 3.py okay.
[66:13] For the mode we're going to pick.
[66:15] So for now let's go with parallel topic.
[66:19] Let's go with tech with Tim okay.
[66:22] So for parallel what this is going to do.
[66:24] Again let's just look at the setup here
is it's going to run the analysis team
[66:28] with just this market analyst.
[66:30] Risk analyst and financial analyst.
[66:32] Now this probably doesn't make sense for me
because tech with Tim
[66:35] is not really something
that's going to have like a market analysis.
[66:39] But if we want to see this running we can go here,
[66:43] we can save and you can see that this is running.
[66:46] We actually have three agents running.
[66:48] And if we go back to the main execution,
see we have analysis team financial risk and market.
[66:52] And then if we go back here
it says the report was saved to this directory.
[66:56] And if we open up the report we get the full report
from these three different agents.
[67:02] Okay cool.
[67:03] Now let's try a different execution mode.
So let's go.
[67:05] You've run
agent 3.py and let's try nested for the topic.
[67:11] Let's go Nvidia stock okay.
[67:14] Now if we go here let's go to our agents.
[67:18] You can see that
we now have a bunch of agents running right.
[67:20] So have the analysis team researcher writer editor,
the analyst team market risk financial.
[67:24] And these are going to run sequentially.
[67:26] So if we go and have a look at this, the first thing
we're doing is running the analysis team.
[67:30] The analysis team we need to run sequentially.
[67:31] So we're waiting for all of these to finish okay.
Looks like they're finished.
[67:34] Now we're going to the researcher.
[67:36] So the researcher is going to have their
[67:37] the input from the analysis team,
which you can see is piped in right here.
[67:42] We're going to wait for the researcher to finish.
[67:44] And then as soon as the researcher is finished,
[67:46] we're going to go to the writer,
and then we're going to go to the editor.
[67:48] So this of course is going to take longer.
[67:50] But that makes sense
[67:51] because we need to go through this flow
to pass the data between the different models.
[67:55] So let's just refresh here,
wait for it to finish and see what we get.
[67:59] And actually if we go to the main execution,
[68:01] you can see that we're running this analysis team
and then this researcher.
[68:05] And we can just wait for the researcher to finish.
[68:07] We should see it
all right here okay. So it's running now.
[68:09] And you can see that we have a lot of different
tool calls that are being executed here.
[68:13] Because it's using the search web call
from fire Crawl.
[68:16] Now if I check here it actually says the fire curl
API key is not defined.
[68:19] So I'm glad we saw that.
[68:21] And you can see
[68:21] this is just going to continue to keep retrying
and retrying until I eventually crash this.
[68:26] Or I provide the fire Curl API key,
which is kind of how this is designed to run.
[68:30] So what I'm going to do is just quit out of this for
right now and show you how we can provide that key.
[68:34] Okay, so like I mentioned before,
you can actually store credentials
[68:38] on the server, which we need to do
because of how we're looking them up in the tool.
[68:41] And the way to do that is the following.
[68:43] You're going to type, you've run
if you're using UV agent spin
[68:47] credentials,
make sure we spell that correctly and then set.
[68:50] And then you're just going to set the credentials
that you want.
[68:52] Now in our case
it is the fire crawl underscore API underscore key.
[68:57] And I'm just going to make this equal to my fire
curl API key which I will disable afterwards.
[69:01] Okay.
[69:01] So you're saying you've run agent spanned
credential set fire curl API key.
[69:05] And we need to remove the equal sign
because that's how the syntax is.
[69:09] And now we've stored this on the server.
[69:12] So now we may need to restart the server
I'm not sure.
[69:14] Let's actually just go here and check.
[69:16] We can refresh and let's go to credentials.
[69:19] And okay it looks like the credential is now here.
[69:21] So that's good. So it's stored.
[69:23] And what we can do is rerun our agent okay.
[69:27] We're going to run this in the what mode
was I running this in the nested mode I think.
[69:32] Yeah. So let's run this in the nested mode.
[69:35] And let's look up in Nvidia
[69:37] stock okay.
[69:39] And hopefully this time it will work.
[69:40] Once we get to this step
where it's trying to call fire curl.
[69:43] Okay.
[69:43] So I just opened up the server
and we can see the researcher is running.
[69:47] Now this is the one that takes the longest
because it's using fire curl.
[69:50] But you can see that
it's fetching all of these different pages.
[69:53] Right.
[69:53] To get all this information about Nvidia,
you can see if we go back to the top
[69:58] I believe it use yes search web.
[69:59] So it was searching past the input query Nvidia
Investor Relations annual report.
[70:04] And then it got all this output.
[70:05] And then it went to search
all of these individual pages.
[70:08] And we can see the full flowing flow full flow story
right here until eventually we get the output.
[70:13] If we go back to the agents
[70:14] we can see now we're just at the writer
which is going, and then we should be good.
[70:19] So let's see what response we get okay. Boom.
[70:21] And looks like we got the response.
[70:22] If we go to the reports here, we can open this up.
[70:26] Let's just preview it here.
[70:27] And we can see our full markdown report about Nvidia
stock analysis with the different sources.
[70:33] We'll just click one and see if it works.
[70:35] And boom yeah we get like the full report.
[70:37] I guess it's long.
[70:38] I'm not going to wait for that PDF download
and all of the other information.
[70:42] Okay. So very good.
[70:43] The nested agent is working.
[70:46] So that's pretty much
what I wanted to show you for agent three.
[70:50] Now what I want to do is move on to a few other parts
that we should be understanding, which is testing
[70:54] and then the durability feature.
[70:56] So how do you actually resume an AI agent
when it crashes
[70:59] in the middle, or it's
waiting for a human or something along those lines?
[71:02] Let me show you.
[71:04] So what I've just done here is written a short file
[71:05] that shows some basic usage
of testing an agent span agent.
[71:09] Now what we're able to do is we can test these agents
[71:12] without actually having to make an API call
[71:15] to ensure that things like the model
or the pedantic, response model
[71:20] they're using, or the tools that are using,
or these kind of things work properly.
[71:24] So, for example, what I've done is I've said, hey,
I want to test agent two.
[71:27] So I've brought in some stuff from Agent Span.
[71:29] I've brought in the support response
and the support agent.
[71:32] I have an example refund policy
where there's like some, you know, thing
[71:36] that we should be getting as a response here.
[71:38] And what I've said is,
okay, hey, we're going to do a tool call.
[71:41] The tool call is going to be searching
the knowledge base.
[71:44] We're going to have a query,
which is the refund policy.
[71:46] We're going to mock the tool result
which will be refund policy.
[71:50] And then we're going to mock done.
[71:52] And we expect that
we should get this support response.
[71:55] So we're mocking a lot of the functionality.
[71:57] But again it's still good just to make sure
the agents working as we expect and to run
[72:01] extremely quickly without relying on lumps,
we then can use a standard expect.
[72:05] We expect the result to be completed,
the output to contain refund,
[72:09] and we expected to have used this tool
search knowledge base.
[72:12] Right?
[72:12] If we give the support agent this,
which is what is the refund policy.
[72:16] So we mock all of the events, but we can just
make sure that those events are triggered properly.
[72:21] Now there's full docs on how this works.
[72:22] I'm not going to go through all of it,
but very basic.
[72:25] If we want to run this,
I can just come here and go, okay, so sorry,
[72:28] I just moved some of the import stuff around cause
I had it in the wrong place.
[72:30] But anyways, if I go here and I run this now,
you can see mock test passed and all is good.
[72:36] We didn't get any errors
and if we change this to maybe say like,
[72:39] you know, dot refund did instead of refund
and we run this, you can see that
[72:45] we get an assertion error and it says, hey,
there's some issue you need to now go and fix this.
[72:49] Okay.
So just showing you the basic testing usage okay.
[72:51] So now I want to have a look at the durability
feature here of Agent Span.
[72:55] And what I mean by
that is if an agent were to crash or go offline,
[72:58] we can restart it
without having to repeat all of the steps.
[73:03] So let's imagine we have a simple agent
like we have here where there's a slow step.
[73:07] What do you call it? Tool that runs.
[73:10] It takes three seconds to run.
[73:11] Notice. Also, I added a timeout.
[73:13] You can do that on various tools.
[73:14] And what I've done is I've told the agent, hey,
I just want you to run a ten step workflow
[73:17] by calling the slow
step for each step and run it ten times.
[73:21] That's it.
[73:22] So this will take 30s to run, but we might make it
to step nine or something, might crash or break,
[73:27] and then we would have to restart from the beginning
if we didn't have this durability.
[73:31] So what I've done is I've set this up
so that we have a mode, we have a start mode.
[73:34] We also have a resume mode.
[73:35] Now you would have this if you're running this in
production, because you would know the execution ID
[73:40] when these agents are running,
which I'm going to show you in a second.
[73:43] So anyways, you can see that if the mode is start,
[73:46] what I'm going to do
is I'm just going to start the ten step workflow.
[73:49] Right. And then I'm just going to stream the handle.
[73:51] And this is just going to print out everything
that's going on.
[73:54] So we can see until it says that this is done.
[73:56] That's it.
[73:57] Now if the mode is resume
I'm actually going to serve the durable agent okay.
[74:02] So I'm going to start the agent.
[74:04] And then what I'm going to do is connect
to the execution ID that we had previously.
[74:09] So this is going to allow me to connect
to the existing, execution.
[74:13] And because this agent will be running,
we can just go and resume from where we left off.
[74:18] So I'm just serving the agent, so.
[74:20] Okay, start the agent.
[74:21] And for our handle,
rather than starting a new process,
[74:24] just connect to the previous one that we have.
[74:27] So any of these execution IDs
that are not yet finished.
[74:30] Of course, there's a lot more scientific,
[74:32] scientific way to go about doing this,
but that's the basic way that I'm going to show you.
[74:35] So let me show you what I mean.
[74:37] Let's open this up and let's go. You've run
[74:41] and let's spell this correctly.
[74:42] And then agents slash crash resume demo okay.
[74:46] So let's let this run for a second and let's wait
till it gets to kind of some, you know, later steps.
[74:51] So let's go back here to our agents.
[74:53] And you can see the durable demo is running.
[74:55] It's running this slow step.
[74:56] And if we keep refreshing here we should just see
that it keeps moving on to the next step.
[75:00] So now we're on step two.
[75:01] And I'm just going to keep going.
[75:02] Right is going to do this well up to ten times.
[75:05] So let's wait okay. Refresh again.
[75:08] You can see now we're on step three.
[75:09] And then what happens
if I just crash it boom it stops.
[75:13] Well if we go here
you'll notice that this is still running right.
[75:16] So we made it to step four.
[75:17] But the slow step
we're just waiting on this to finish.
[75:20] So what can I do.
[75:22] So that I don't need to restart this
from the very beginning?
[75:25] You'll notice
it's not going to advance any further. Right?
[75:27] We're still on step four
without having to restart the whole thing.
[75:31] So if we go here, you'll see that we have
an execution ID that would have been printed out.
[75:36] Looks just like this.
[75:37] So we're just going to copy that execution ID
[75:40] and we're going to paste that right here.
[75:41] I'm going to remove the spaces.
[75:43] I'm going to change the mode to just say resume.
[75:46] So now what's going to happen
is I'm just going to go and I'm going to use this
[75:49] where I'm
going to connect to that previous execution ID now,
[75:53] because all of the state is stored
here on the agent span server when I reconnect.
[75:57] So if I just restart this here, you'll see that
it brings me back to where I already was.
[76:02] And I have all of the state already there.
[76:04] And we can now just continue.
[76:05] And if we refresh, you'll see that
we now go to turn number five.
[76:08] So I didn't restart anything.
[76:10] I didn't lose any state and lose any information.
[76:12] I just go from where I left off
and I just restarted the worker.
[76:16] So this is the important thing to understand
is that agent Span is storing the state.
[76:19] Right.
[76:20] And kind of all of the information.
[76:21] And your worker is just executing the code, right.
[76:24] It's executing the functions, it's
completing the task.
[76:26] But you at any point,
if it fails, can go back and reconnect to that.
[76:30] So imagine you're writing a platform.
[76:32] You just store all your execution IDs.
[76:34] If any of them fail,
you just simply reconnect back to them and continue
[76:37] when the worker comes back online, because that's
something that happens a lot in production.
[76:41] And same thing. Let's can I quit? Maybe in time?
[76:44] I'm not sure if I was able to quit it in time
or if this is going to be completed.
[76:46] Now let's go down here and see.
[76:48] Yeah.
[76:49] So it's still waiting on the let me call.
[76:50] So now same thing if I run it again boom.
[76:53] You see we get right back into the execution
we had before and we're done.
[76:56] And all of it's finished
and we get whatever that final response was.
[77:01] Which if we look here, there's tons of workflows.
[77:03] Complete steps one through ten will run an order.
[77:05] But okay, so that's what I wanted to show you
with this kind of crash and resume
[77:09] and how easy it is to get back into the state
where you were before.
[77:12] Now, lastly,
let's talk a little bit about deployment,
[77:15] and then we're going to be done with this course
okay.
[77:18] So now let's talk a little bit about deployments.
[77:20] Now I'm not going to deploy full application here.
[77:22] But I just want to discuss how you can move to
this stage if you do want to deploy your apps.
[77:27] Now if you just want to use local development
like we were doing right,
[77:29] you just run the Agent spin server and that's it.
[77:31] It will just stored in a local SQLite database.
[77:34] However, if you want to go to a deployed environment,
you probably want to use PostgreSQL
[77:39] and some kind of Docker compose
to be running this for you.
[77:42] Now, in order to do that, you can just pull
the GitHub repo that Agent Span has.
[77:46] I'll leave a link to it in the description,
and when you pull this, it gives you the information.
[77:51] Here you can go into the deployment
and then Docker compose directory.
[77:56] So if you go here they have deployment right.
[77:58] And then they have docker compose.
[78:00] And from Docker compose
you can just adjust the variables here.
[78:03] Inside of the env example
you can put any environment variables
[78:07] or like API keys that you want to have.
[78:08] You can put the what do you call a Postgres database
that you want to connect to
[78:13] so that rather than running it
locally, it's going to run with that remote DBS.
[78:16] You can also connect to it as needed.
[78:18] Now it also goes over exactly how to deploy it
using Docker Compose.
[78:21] This will just deploy this server for you.
[78:24] And as soon as this server is deployed, all you need
to do is just point your workers to this server.
[78:30] So as it says, right here, all you have to do is just
say, hey, here's the URL where this is running.
[78:34] It could be running on this server,
could be running another server behind some endpoint,
[78:38] or behind some URL, whatever.
[78:40] And that's it.
[78:40] Then you just point it there with the server URL,
you start working and everything is good.
[78:45] Right. And this can be scaled as much as you want.
[78:47] Now there's a bunch of other options
in terms of using Kubernetes and setting up the off
[78:51] and all of this kind of stuff,
[78:52] which I'm not going to go through here,
but you can see that you can set an off key,
[78:55] you can set an off secret, and then you can also
just configure those directly from code.
[78:59] So now if someone wants to connect to it,
they do need to pass those values from their worker.
[79:04] So you have some kind of secure authentication
going between the worker
[79:07] and between your agent span server.
[79:10] And that's pretty much it.
[79:11] That's all you need to do for deployment okay.
[79:13] You also can obviously self-hosted
this as a service right here.
[79:16] And it kind of explains how you have multiple workers
going to the server connected to Postgres.
[79:20] And you can see all of the different options,
but it's very straightforward.
[79:24] It's just a matter of essentially
deploying the server.
[79:26] And once a server is deployed, pointing
your workers towards and then adding that basic auth
[79:30] kind of, you know, protocol
so that not anyone can connect to the worker.
[79:34] So that's it
guys, that's going to wrap up this video.
[79:37] That's pretty much all of the core
things that you could do inside of agents.
[79:41] And of course there's a lot more
I didn't go over everything.
[79:43] But this should give you a really good head
start to building production.
[79:47] Great AI agents in Python.
[79:49] If you enjoy this type of video,
make sure leave a like.
[79:51] Subscribe to the channel
and I will see you in the next one.