[0:00] In this video, I'll be going through a full course on how to build production agents in Python. [0:05] We're going to write every single line of code, and I'm going to show you how to build three AI agents. [0:09] The first is going to be a simple conversational agent that has access to conversational memory. [0:14] The second is going to be a rank based agent, where it can pull out information [0:18] from like a company database. [0:19] And then the last agent is going to be a multi-agent orchestrator, [0:23] where we actually have multiple AI agents running at the same time to achieve a longer running task. [0:28] Now, this video is not designed for complete beginners, [0:30] but as long as you're familiar with Python, you should be able to follow along. [0:33] And we're going to be using a framework here called Agent Span. [0:36] But don't worry, it is free is open source. [0:38] You won't need to pay for anything. [0:39] You just need to have access to some kind of AI model. [0:42] So like OpenAI, anthropic, whatever. [0:44] But we'll go over that in a minute. [0:46] Okay, now this is really going to be focused on how to build production AI agents. [0:51] So rather than just agents that can run in your terminal [0:53] or that run in a demo environment, ones that you could actually eventually scale up. [0:58] Now, in order to do that, we need to talk about the main problems [1:01] that you have when you actually try to run AI agents in production. [1:04] Now, first we have processes that crash mid run, right? [1:08] So maybe the network goes down, database freezes whatever. [1:11] Your agent just gets killed. [1:13] And that means that a lot of the work that's done can be completely wasted. [1:16] And that can be quite expensive over time. [1:18] Next human in the loop. [1:19] So maybe we need a user to approve a task, something, right? [1:22] Or to press a button that could take any amount of time. [1:25] We're just unsure about that. [1:26] Lastly or not. [1:27] Lastly, but thirdly, and one that's most important to me is visibility. [1:31] A lot of times when you build these AI agents, you have no idea what they're actually doing. [1:34] So you need observability into the platform to see what step is on. [1:39] Where is it going wrong? What tools is it calling, etc.. [1:42] And then obviously scaling a lot of times if you just build a simple like long chain [1:45] agent or something, it's not going to scale to tens of thousands of users, [1:49] and you have to pretty much reinvent the wheel and spend most of your time deploying out all [1:53] this infrastructure. [1:54] When really you want to focus on just building the AI. [1:57] So there's seven things that you need. [1:59] If you want to have an AI agent actually be production ready. [2:02] I'm gonna quickly going to go through them here. [2:04] Now first durability. [2:05] That means that if the agent crashes it can recover. [2:08] And it doesn't need to completely restart next retries. [2:11] So sometimes the step will fail. [2:13] That doesn't mean we should completely exit the process. [2:15] We should retry it multiple times. [2:18] Human in the loop. Again. [2:19] Sometimes we need to delegate a task back to a human and say hey, are you sure you want to do this? [2:23] Do you want to issue the refund? [2:24] You want to delete this file x y, z, right? [2:27] Observability. [2:28] Like I talked about, we need to be able to actually see what's going on in real time long running tasks. [2:33] If agents take 2030 two hours to run, we should be able to handle that [2:37] and then scale and testing, which we can talk about a little bit later. [2:40] Okay, so in order to accomplish what I just discussed there and essentially [2:43] get these seven features for our AI agents, we're going to be using [2:47] a framework called Agent Span, which comes from Orx who's kindly sponsored this video. [2:51] And don't worry, this is free. [2:52] You don't need to pay for anything. It is completely open source. [2:55] I want to quickly just show you what it looks like when you actually get this running, [2:58] because this is the benefit of using a platform like this network is [3:02] essentially gives us a server which is going to handle all of the different state [3:06] and kind of track the progress of the multiple AI agents that we're running. [3:10] So you can just see a few quick examples here from the dashboard. [3:12] This is the server running on my own computer. [3:14] You don't need to build this. You literally just install it and run it. [3:17] And for any given AI agent, let's say we go to this analysis team agent. [3:22] You can see a full log of everything that's actually gone on. [3:25] And you can see this in real time. [3:27] So in this case we had a multi-agent system. [3:29] And I can click into one of these agents and see the input, the output, the Json, the summary, [3:33] or actually go into the execution of this agent itself to see everything that went on. [3:38] So this is the observability that I'm talking about. [3:40] What this also does is allow us to scale the agents by having a built in queue system [3:45] for all of them running, and then to retry tasks. [3:48] For example, if we go here and we scroll down, you can see there was like ten tasks [3:52] that were running. [3:53] And we can go through every single [3:55] turn of the agent and see everything that went on along with the tokens. [3:59] The reason it stopped the duration, all of that good stuff, then this will make a lot more sense [4:03] later. [4:03] But effectively this is the backend infrastructure that we run our agents against. [4:08] And each of these agents that you see here was me running code [4:12] that connected to this server, and the server handled the state [4:15] and the orchestration, but allowed all of the code to be executed where it was run. [4:20] So from our local machine, from our server, whatever. [4:23] But if there was a crash, for example, [4:25] we could recover from that crash because all of the state is stored on this server. [4:28] So we could just reconnect, restart where we left off. [4:31] And it's not a big deal. [4:32] And this task can run for as long as it needs to. [4:34] So anyways, that's the basics on Agent Span. [4:36] They also have their own Python framework for building AI agents, which we're going to use, but [4:40] you also can connect them to Lang graph, the OpenAI SDK, Google ADK, I believe a few other ones as well. [4:45] If you just want to use their orchestration layer or kind of the server that I talked [4:49] about now, in terms of the kind of architecture here, let me quickly go through it. [4:53] This is pretty much what it looks like. [4:55] We have a worker. [4:56] The worker is what we're going to write ourselves. [4:58] We have the agent span server. [4:59] This is already provided to us. Again it's open source. [5:02] We can run it ourselves. We don't need to pay for anything. [5:04] And from here, this keeps track of all of the state. [5:06] The history allows us to retry, handle human in the loop, multi-agent, all of that kind of stuff. [5:11] It just handled for us. [5:12] So from the worker side, we pretty much just say, hey, we're building an agent. [5:16] We're going to connect to this server. [5:17] All of the rest of the code stays exactly the same. [5:20] The server handles all of that durable execution stuff that I talked about. [5:24] And then of course we have an LM. [5:25] We can use any LM that we want. [5:27] So bring OpenAI cloud whatever. [5:29] And that's essentially how it works. [5:30] So anyways that is the brief. [5:32] That's what I'm going to show you how to do in this video. [5:34] What I want to do now is hop over to the code editor. [5:36] We're going to start getting some things installed and set up. [5:39] And then from there we're going to build out [5:40] three unique AI agents again, starting easy and then medium and more difficult. [5:45] So you get a sense of how to actually build these. [5:47] And again how they work in production, which is the most important part [5:50] because at the end of this video, you could very easily go deploy [5:53] this up by just deploying the server and deploying your workers. [5:56] And you're good. That's it. [5:58] Because of the way that we built it, as opposed to if you use a lot of the other frameworks out there. [6:02] Anyways, let's dive in. [6:03] All right. [6:04] So now we're going to get started with the installation steps here for Agent Spann. [6:07] Now I'm just on the documentation. [6:08] I'll leave a link to it in the description. [6:10] It's actually very good. So you can follow along. [6:12] And a lot of the stuff that you see in this video I just pulled directly from the documentation. [6:16] Now first things first we need to install Agent Span and the Agent Spans server. [6:20] Once we have that installed that is very easy for us to just write the code, [6:24] which is our worker code which will connect to the server. [6:27] Now notice that we can simply install it using pip install agent span. [6:31] This of course requires [6:32] that we have Python installed on our computer, and that we have some kind of code editor. [6:35] So my case I'm going to be using cursor. [6:37] You can use any editor that you want for this video. [6:40] Now notice that what I've done in cursor is I've just made a new folder here. [6:43] So I just one file I want open folder. [6:46] And I just selected one that was on my desktop. [6:48] Just made a new one called AI Agent Tutorial. [6:50] From here I've opened up the terminal and I'm going to type a UV init. [6:54] Don't let me zoom in a little bit so you guys can see this. [6:56] And this is just going to make a new UV project because I'm going to use UV to install agent span. [7:01] So you notice it says you can use UV pip install agent span. [7:04] So from here we're just going to type UV add agent span like so. [7:08] And then it should add it to our environment for us. [7:10] And install everything that we need. [7:12] Now you don't have to use UV but I prefer to use UV. [7:15] So that's what I'm going to do. [7:16] Now I'm just going to delete this main.py file because we don't need that as well. [7:20] So now if we go to the Pi project tunnel you can see Agent Span is installed. [7:24] Okay. [7:25] Now the next thing that we need to do is set our API key, that the interesting thing about Agent Span [7:30] is that it actually will hold the various provider keys for us. [7:34] So any environment variable that you need to use, you don't have to have it in your worker code. [7:39] You can have it stored on the server which is going to be more secure. [7:42] So I can actually put in OpenAI API key or anthropic API key or whatever provider [7:46] I want to use directly, where I'm running my server, which you're going to see in a second. [7:51] So what we're going to do is just get one of these API keys for this video. [7:54] I'm going to use OpenAI, but you can use anthropic if you want. [7:57] And what I'm doing is I'm going to platform.openai.com/home okay. [8:02] This is going to let me make a new API key. [8:04] You will need an account here. And you will need to pay for this. [8:06] But it's very cheap. [8:07] We're talking about, you know, maybe sense of spend to follow along with this tutorial. [8:11] And I'm going to go create API key. [8:13] And I'm just going to call this agent span. [8:16] And then maybe tutorial or something. [8:19] Okay I'm going to make the key. [8:20] And obviously you don't want to leak this to anyone. [8:22] So I will delete it afterwards. [8:24] Okay. So from here we're going to go into our terminal. [8:26] And we're going to type the command as it shows here from the documentation. [8:29] So let's go back. Export OpenAI API key is equal to. [8:33] And then the key. [8:33] So we're going to export OpenAI underscore API. [8:36] Underscore key is equal to. [8:37] And then we're going to paste the key inside of here. [8:40] And then we're going to press enter. [8:42] Now this should put it inside of the current shell session. [8:44] Which means that any command that we run after this should have access to this variable here okay. [8:49] So make sure that if you're going to run the server again that you first export the key beforehand. [8:54] There's other ways to avoid doing that. [8:56] But for now this is the easiest where you just have to have this environment variable set in your shell. [9:01] Okay before you run it. [9:02] Now if you are on windows, this command will likely look a little bit different. [9:06] And if you're using something like cursor, I would just ask it, hey, what is the, you know, [9:10] equivalent command to and then paste the export, [9:15] you know, OpenAI whatever for PowerShell. [9:19] And it should tell you I don't know what the exact command is. [9:21] So I'm not going to guess. [9:22] But you can just use an AI model and it should tell you how to export it properly. [9:25] So now that it's exported, what we're going to attempt to do is run the agent Spans server. [9:29] So we can just directly run the agent spawn server, or we can run [9:32] Agent Spin doctor just to make sure that is all working. [9:35] Now because I'm using UV, that means that if I want to run this I need to do. [9:39] You've run an agent spin doctor. [9:41] If you're not and you just installed it globally with Pip, [9:44] you should be able to just run the agent spin command. [9:46] So from here I'm going to press enter and let's see what it says. [9:50] And it looks like all is good. [9:51] It says okay OpenAI is set, Java is installed. [9:54] We have enough disk space. The server jars cache. [9:56] That's because I've installed this previously. [9:59] If you didn't install this previously, it may tell you that something's wrong. [10:02] And if that's the case, you may need to install, for example, Java 21, okay, etc.. [10:07] Now if you don't know how to install it again, [10:08] ask the Lem to ask something like cursor hey, how do I install Java 21? [10:12] And it should give you the command. [10:14] Okay, so now that that's running we're going to type. [10:16] You've run agent spin server start okay. [10:21] Now this is the command to start the server. [10:23] So we're going to go ahead and run that. [10:26] And you can see that it says server is already running [10:28] okay let me stop the server because I may have it in another port. [10:31] So to stop it we're just going to go stop okay. [10:33] And then I'm just going to restart it from here. [10:35] So let's give it a second. [10:36] And it says it's running on port 6767. [10:39] We're just going to wait a minute. And it says that it is running. [10:42] So now if we want to test if the server is working we can just copy this URL right here. [10:46] We can go to our web browser and just paste it. [10:49] And we should be able to see the agent spans server okay. [10:51] So from here you'll see the agent spans server. [10:54] There's a bunch of stuff you can look through. [10:55] But generally you're just going to be looking through executions right here. [10:58] And it's going to show you a history of all of the executions. [11:01] Now obviously you won't see anything if it's your first time, but for me, I'm [11:04] seeing previous executions because I've ran the server before. [11:07] Okay. [11:07] So we're going to have a look at this later [11:08] because it will make more sense when we actually get executions. [11:11] But for now, let's go back to our project here and let's start [11:15] installing a few last things that we need. [11:16] And then we can create our first AI agent. [11:19] Okay. [11:19] So I'm going to write clear and I'm just going to type UV add. [11:22] I'm going to add a few dependencies that we need. [11:24] And if you're not using UV you can just use Pip to add the equivalent dependencies. [11:28] Now first we're just going to bring in Python dash dot envy. [11:33] And we're also going to bring in pedantic. [11:35] And then lastly fire crawl [11:39] dash pi which we're going to use for the last agent okay. [11:41] So go ahead and press on enter. [11:44] And we should see that we get them all installed okay. [11:46] So that's all we're going to need installed for our project. [11:49] What I'm going to do now is just make a new folder. [11:51] And I'm going to call this agents now instead of agents, I'm just going to make a new agent. [11:56] And I'm just going to call this agent 1.py. [11:59] And this is where we're going to start writing our code [12:01] now, our first agent, it's just going to be a simple conversational agent. [12:05] All that means that we're just going to talk to it kind of like a chat bot. [12:08] And the one thing that we're going to add is that we're going to allow the agent [12:11] to know what our current time is, and to get information about us as a user. [12:15] We're also going to add memory so that anything that we say previously it can actually remember, [12:20] because by default, if you don't add memory, I can say, hey, my name is Tim. [12:23] It says Hey Tim. [12:24] And then the next conversation or the next time I ask it something, it will completely forget [12:28] because it's not storing the previous responses. [12:30] Okay, so that's the goal here. [12:31] And this is just to show you the basics of the framework. [12:33] And then we'll go into building some stuff that's more complicated. [12:36] So we're going to start by importing logging. [12:38] This is because there's a lot of logs that are going to be output by agent span. [12:41] And we want to probably suppress some of them. [12:43] So we don't see too much in the terminal. [12:45] We're then going to save from date time import date time. [12:50] We're then going to go from dot env import load dot env. [12:54] And we're just going to use local env to load an environment variable file [12:58] that we're going to need in a second. [13:00] Next we are going to say from agent Span. [13:03] And this is going to be Dot agents. [13:05] Make sure that you put plural. We're going to import agent [13:09] the agent runtime okay. [13:12] And runtime is with the lowercase there. [13:14] And then conversation memory run and tool okay. [13:19] So this is all we're going to need for now for this basic agent. [13:22] Let me just close this. [13:23] You guys can see it a little bit bigger okay. [13:25] Next lines. We're going to locate Env. [13:28] What this is going to do is load any environment variable files that are present. [13:31] And in fact while we're here we're just going to make a new dot env file in the root of our project. [13:37] So dot env and we are going to put inside of here one variable that we need. [13:42] Now this variable is the agent span underscore server underscore URL okay. [13:48] And for now this is going to be equal to Http colon slash slash localhost [13:54] port 6767 slash API. [13:57] Now let's make sure we spelled this correct because I completely butchered the spelling here. [14:01] But this is local host like so now. [14:04] And let's add the extra slash okay. [14:07] So this is where the agent spends servers running right now. [14:10] Again we're running it on our own computer. [14:12] So we just put in this URL and yours will be the exact same. [14:15] If the agent spent server was running on a different computer wasn't running on localhost, [14:19] then of course we would change this because maybe we're going to have the server [14:22] hosted somewhere else and our workers hosted somewhere else. [14:25] That's possible. [14:26] You also can have the workers and the agent spin server on the same server. [14:30] It's completely up to how you want it deployed. [14:32] But this is what allows you to specify, hey, where actually is this server? [14:35] Okay. So next we're going to go back to agent one. [14:37] We've now loaded the dot env. [14:39] And because we've loaded that agent spin will now automatically see this variable. [14:43] And it will know that it needs to communicate with the server at that location. [14:47] Now next what we're going to do is just say logging dot basic config. [14:50] And we're just going to set the level. [14:52] So we're going to say level is equal to logging dot warning okay. [14:57] Just so we only show warnings that we don't show [14:59] all of the logs that are probably going to kind of mess up the terminal. [15:03] We're then going to say logging dot get logger and we're going to get the agent span logger. [15:09] So let's get it like that. [15:10] And we're going to set the level to warning as well. [15:13] And then next we're going to put not agent span but we're going to put conductor and same thing. [15:18] We're going to set the level to warning [15:20] just so that we don't accidentally get a bunch of random info logs showing up. [15:24] All right. [15:24] So next what we're going to do is we're going to create a basic agent. [15:28] So to make an agent is super easy. [15:30] We're just going to say assistant is equal to agent. [15:33] And then inside of here we're just going to give the agent a name. [15:35] And this is what's going to show up in Agent Spence. [15:37] We can see it. So we're going to call this personal [15:40] assistant like that. [15:41] Perfect. Next we need to specify the model. [15:44] Now because we're using OpenAI we can specify any OpenAI model. [15:47] And we'll be able to connect to it. [15:48] And use it because we have that API key set. [15:51] If we wanted to use an anthropic model, then when we started running the agent spawn server, [15:55] we would have needed to declare an anthropic API key or a Gemini API [15:59] key, or whatever the other model is that you want to use, right? [16:01] If we go back here, you can see that we had the option, right? [16:04] We could have exported one of these. [16:05] So based on the one that we export and you can see all the providers here right. [16:09] It gives you the different options. [16:11] You can specify the model that you want to use okay. [16:13] So we're going to go back here and we're going to change this to OpenAI again GPT 5.4. [16:20] It's a little bit expensive. [16:21] If you just want a cheaper one. [16:22] You can do GPT four or GPT for a mini. [16:26] And that's going to give you a really cheap model that's going to cost literally nothing. [16:28] This one still will not be expensive based on how we're using it, but it is more expensive. [16:33] Next, we're going to pass some instructions. [16:35] Now the instructions I'm going to put in a set of braces just so that [16:37] I can separate them out with some quotation marks here. [16:41] And this is the system prompt. [16:42] This is what's going to be read at the beginning of each message. [16:45] So it understands how it should actually behave. [16:47] So we can say something like, you are a concise personal assistant. [16:52] Use tools when they help because we're going to provide some tools for this in a second. [16:57] And then down here we're going to say, and remember, use full [17:02] okay user details across terms okay. [17:06] Cool. [17:07] So that's our instructions. [17:08] Now beneath this we're going to provide some tools. [17:10] For now the tool list is going to be empty. [17:12] And then after this we are going to provide some memory. [17:15] But we'll just add those later. [17:16] So for now we just have the basic agent. [17:18] Next thing we're going to do is just run the agent. [17:21] So to run the agent we're going to say if underscore underscore name, underscore [17:24] underscore equals underscore underscore main underscore underscore. [17:27] This is just the main entry point in our application. [17:29] If you're unfamiliar with what [17:30] this does essentially just checks to make sure we're running this Python file directly. [17:34] We're just going to do a print statement. [17:35] And we're going to say starting agent dot dot dot okay. [17:39] And then down here we're going to say with the agent runtime [17:44] okay as runtime. [17:47] And then we're just going to go into a simple while loop where we just keep [17:50] asking the agent questions until we type quit. [17:53] Okay. [17:54] So we have our width. [17:54] This is how you start the runtime for the agent. [17:56] We're now going to say, while true. [17:59] And then here we're going to say prompt is equal to input. [18:02] And that's going to be you dot strip just to remove any leading or trailing spaces. [18:06] We're going to say if the prompt dot lower [18:11] okay is equal to q. [18:14] So if you type the letter q then we are just going to break okay. [18:18] We're going to say if not prompt [18:20] then we're going to continue and just ask you to type something [18:22] so that if you don't type anything at all we don't prompt the model okay. [18:25] Now down here, but still inside of the while loop we are going to do the following. [18:28] We're going to say the result is equal to run. [18:32] And we're going to run the assistant. [18:35] We're going to pass our prompt. [18:37] And we're going to say the runtime is equal to the agent runtime right here. [18:40] And that's it. That's all we need to do to run the agent. [18:43] So for now, what we can do is we can just say print [18:46] and we can put an F string, or we can say assistant like this. [18:50] And then we can just put inside of a set of braces, maybe. [18:53] What is this result? Okay. [18:55] And this is going to give us kind of a messy dictionary, which we can look through later, [18:58] but at least for right now, it should give us the response. [19:01] So let me zoom out a little bit so you guys can read this better. [19:04] Essentially what we've done is we've imported a few things we need. [19:07] We've set up the Env so we can connect to the server. [19:09] We have the assistant. We don't have any tools or anything. [19:12] It's just a super basic assistant. [19:13] And we set up a while loop so we can now communicate with it. [19:16] And if we go here we'll just make sure the agent spent servers running. [19:19] I believe I didn't shut it down, so it should still be running here. [19:22] Yes, looks like it is. [19:23] So make sure that the Agent Span server is going guys before you try to do this. [19:27] And then what we can do is from the root of our directory we're going to type. [19:30] You've run and then agents slash agent 1.py. [19:35] Now notice that I'm doing this from where my env file is present. [19:38] So I'm doing kind of the path to this file agent slash agent 1.py. [19:42] So we're going to pick up the env file and we're going to load it and let's hit enter. [19:47] That's a starting agent. [19:48] You can see we have initialized. [19:49] We've connected to the server. [19:50] And now we can type something like hello World. [19:53] And we give it a second here okay. [19:55] And let's see if we get the response. [19:57] And it says hey it was completed. [19:59] And we get this agent result here where we have some result in the output called hello world. [20:04] So we can see everything ran. [20:06] And then if we come back here, let's just refresh the server. [20:10] You see personal assistant just ran. [20:12] And if we click into this you can see our prompt which was where is it here. [20:15] Hello world. [20:16] We can see the output of the model was Hello world okay. [20:19] That's right from the learn. [20:20] And then we can see the immediate output at the end here. [20:23] Was this right work we got with Hello world. Cool. [20:26] So that's kind of the benefit is that we can see exactly what's going on. [20:29] We have full insight into how the AI agent is running. [20:32] And of course this is just a very basic one. [20:34] Now what I'm going to do is just type Q to get out of this, [20:36] and let's make it so that we can kind of view the response a little bit better. [20:40] So rather than just printing out the result object here, let's print out the kind of output here. [20:45] So what I'm going to do is say so I'm just going to say result dot get. [20:49] And then I think we can just put in single quotes here a result, make sure that it's single quotes. [20:54] Otherwise it's going to interfere with the F string. [20:56] And let's just try this one more time where we run the agent. [20:59] So let's go. You've run. Let's go. Hello. [21:02] And let's see what we get this time [21:05] it says agent result has no attribute yet. [21:08] Okay. Interesting. [21:09] So I think we can do result in dot output dot get maybe I think that's going to work. [21:16] Let's just try it. [21:16] I'm just doing this off the top of my head here, and let's run it again and just type hello. [21:21] And let's see now if we get the correct response give it a second. [21:26] And there we go. [21:26] We get hello, how can I help you and say what is my name or something. [21:30] Whatever. And it's not going to know the answer. [21:31] But the point is that this is now functioning. [21:33] I don't know your name. [21:34] If you want to tell me, I'll remember for later. Okay, cool. [21:36] All right, so this is great. [21:38] However, like I mentioned, we currently don't have any tools or any memory, [21:43] so anything that I chat with [21:44] the agent is not going to remember later on, even though it's said that it would. [21:47] So what I want to do now is I want to start by adding a few tools. [21:50] These are things that the agent will be able to actually call to get some information. [21:54] And then we're going to add memory. [21:55] So to add a tool is super simple. [21:57] What we can do is we can just make a function so we can say something like define get current time. [22:02] And then what we're going to do is just return whatever the current time is. [22:05] Now it's important that what we write these tools, we also write docstrings for them [22:09] and the input and output format. [22:11] So that agent span can automatically convert that into something that the AI agent can read. [22:15] So for example I'm going to say okay, the get current time function is going to return a string. [22:20] And if it was going to take some input here then I would also specify like [22:24] you know input and then whatever type the input was. [22:27] And then beneath this importantly I'm going to write a doc string, [22:30] which is just a comment at the top of the function that says returns [22:34] the current local time, okay. [22:38] And then you can see that we have datetime dot now. [22:40] And then we just convert this into a string and we return that. [22:43] Now this is great, but if I want to turn this into a tool, I simply just have to put that tool. [22:48] Now what is a tool? [22:49] A tool is something that I can call to get some kind of response or to take some kind of action. [22:55] So right now the agent doesn't know anything about us. [22:57] It can't actually do anything. [22:58] It's just capable of essentially, you know, printing out text, right? [23:02] Or giving us a text response if we want it to actually take an action [23:06] and generate a report or search for something, it needs to have tools in order to use that. [23:10] Now, agent spend natively defines the ability to call tools. [23:13] So all we have to do is just define a function. [23:16] We specify it's a tool using this add tool decorator. [23:19] Right. Like we specified here. [23:20] Then the name of the tool will automatically be the name of the function. [23:23] So make sure you name the function something useful. [23:26] The input and output type you'll specify, [23:28] and then the description of the tool you'll put as the doc string. [23:31] So what will happen is Agent Span will now say hey we have a tool. [23:34] You know get current time right. [23:37] The description of this is whatever the description was here. [23:40] And it takes no input and gives this output. [23:42] And then that will be passed to the assistant. [23:45] And the assistant will essentially give us a response back that says, hey, I want to call this tool. [23:49] And then inside of this runtime here, Agent Span will automatically [23:53] call the tool for us and then give the response back to the model. [23:56] And we'll be able to see this happening inside of the UI, which I'll show you in a second. [24:00] So for now, we can just pass this get current time tool. [24:03] And the model, [24:03] if we run it again should be able to call this if we ask it about something related to the time. [24:08] So let me. [24:08] That's not what I meant to do. [24:10] Let me open up the terminal and run this again. [24:12] I'm gonna say, what time is it? [24:16] Okay. [24:16] And let's see if we get the time. Here. [24:20] Give this second [24:22] and hopefully it's going to call that and then tell us what it is. [24:26] Okay. And you can see it says that it's this time. [24:28] And if I look at my window here that is the correct time okay. [24:31] 1947 and two seconds. [24:33] But now if we go back to the server and we refresh, we can check our personal assistant. [24:38] And we can see now that actually it called a tool. [24:41] So the yellow line gave some output. [24:43] The output effectively said hey I want to call a tool. [24:46] The name of that tool is Get current time. [24:48] Okay. [24:49] So then we called the tool. [24:51] We got the input which was this. [24:53] We got the output which was the result right. [24:55] And then we pass it to the model. [24:57] Now the model now has access to that tool call. [24:59] So it knows what the time was. Right. [25:02] And that gives us the output. [25:03] Boom. Here's the time. [25:04] So that's one of the reasons why this is super useful, is that you get [25:07] that full insight into what the AI model is actually doing. [25:10] Now let me say, what time was it [25:13] last time I asked you just to show you something? [25:18] And you should see here that assuming it doesn't just call it. [25:21] Yeah, it says I cannot. [25:23] I don't have access to timestamps [25:24] to your previous messages in the chat unless they're shown in the inference. [25:27] So essentially what is telling us is that, hey, I don't know what it was because I don't have memory. [25:31] So the next step here is to add memory to the agent. [25:34] Now adding memory is super easy. [25:37] All we have to do here is just go above our agent [25:40] and we're going to say conversation underscore memory [25:43] is equal to conversation memory like so. [25:48] Then inside of here we can also put the maximum number of messages that we want to store. [25:52] So I can say Max messages is equal to like 50 or something. [25:55] So after 50 it will start just getting rid of the last messages. [25:58] So we don't clog up the context too much. [26:01] And then what I can do is just say memory is equal to conversation. [26:04] Memory. Boom. [26:06] So that's that. [26:07] So what we can do now is let's open this up. [26:10] Let's type clear okay. [26:12] And let's go. You've run and let's do something. [26:15] My name is Tim okay. [26:17] And let's see if it can remember that okay. [26:18] So it says nice to meet you I'm going to say what is my name. [26:22] And let's see if I can remember this now using the conversation memory okay. [26:26] And it doesn't remember it because I made one mistake and I forgot to add to the conversation memory. [26:30] So let's do that. [26:31] Now that's actually a good issue to run into. [26:34] Okay. So we've created the conversation memory. [26:36] We've added it to the agent but we're not adding anything to the memory yet. [26:40] So what we need to do is we need to add what we type in, what the agent types to the memory. [26:46] So the way we do this is we're just going to go here and let's go underneath the result. [26:51] And we're going to say conversation memory dot add [26:54] underscore user underscore message. [26:56] And this is going to be the message that we sent which is the prompt. [26:59] We're then going to say conversation memory dot. [27:01] And this is going to be add assistant message. [27:03] And we're going to add the results output dot get and then the result. [27:06] And just to make this a little bit cleaner we're going to say readable result is equal to this. [27:12] And then we can just replace this with the readable result. [27:15] And then same thing here with the readable result okay. [27:19] So essentially what we're doing is saying hey we're going to append to the memory. [27:22] The memory has a few different functions we can call. [27:24] One is to add a user message, which is what we said. [27:26] And then one is to an assistant message which is what they said. [27:30] So let's save this. [27:31] And now let's go again to our terminal. [27:34] Let's make sure I didn't mess something up. I think it's okay. [27:36] Let's clear okay. [27:37] So let's run it again and let's see what we get. [27:40] This time I'm going to say my name is Tim okay. [27:43] And let's see here. [27:44] Give it a second. Say what is my name. [27:48] And hopefully it's going to give us the answer and tell us that it's Tim. [27:52] Let's see I don't know your name yet okay. [27:55] What's what I'm going to say. [27:57] My name is Tim. Let's try one more time. [27:59] I think sometimes on the first run, for some reason, it's not picking it up. [28:02] Based on kind of how we're adding this. Maybe. [28:05] Let's see. Okay. [28:06] What is my name? [28:08] And let's see now, here we go. Your name is Tim. Okay. [28:10] So for some reason on the first run, I think based on how I added the info here, it's not working. [28:16] I'm not sure exactly why that was the case, but either way, afterwards, [28:19] now it looks like it's working and it is able to determine my name. [28:22] It also might just be how it's searching through the memory. [28:24] Either way, looks like we are good and it knows my name now. [28:28] Okay, so anyways, the memory is functioning now that we have that, [28:31] let's move on to our next agent, which is going to be a rag based agent. [28:36] Okay. [28:36] So as discussed we're now moving on to agent two. [28:39] Agent two is going to be kind of a rag based agent, where we're going to be able [28:43] to look up some info in something like a database or documentation or whatever we have. [28:47] I'm not going to build true rag here because that's going to be a little bit complicated. [28:51] But of course, you could very easily add that effectively. [28:53] What we're going to do is add some more tools, we're going to add guardrails, [28:57] and we're just going to look at a much more complex agent that has a few more components to it. [29:02] Now we're also going to look at a pedantic structured output agent. [29:05] Now, what that means is that rather than just getting the output as kind [29:09] of a random string of text, we can actually pipe it into a Python object [29:13] so that it's predictable and we know what kind of format we're going to have. [29:17] So as you can see here, I've already brought in a bit of code. [29:19] Any of this code will be available from the link in the description. [29:22] If you just want to copy it, there'll be a GitHub repo there. [29:24] But I just want to save us a bit of time. [29:26] So I just did the import. [29:27] So you can see we've got a bunch of stuff from Agent Span here. [29:30] And then we have the mock database documentation [29:33] and then the logging setup as well as the loading env. [29:37] Okay. [29:37] So if we go down here the first thing that I'm gonna do [29:40] is I'm actually going to define what I'm going to call the pedantic structured output object. [29:46] Now this is how I want the agent to give us its output. [29:49] So rather than just giving me some random text that maybe I have to parse through, [29:54] I want it to give me something in kind of a dictionary format [29:57] that I can then convert into a Python object so I can read the different values. [30:01] You're going to see what I mean, but I'm just going to go class support [30:05] response like this. [30:07] And then this is going to inherit from base model [30:10] if I can type it correctly, which we brought in here from pedantic okay. [30:15] Now pedantic allows us to just do typing in Python. [30:17] It I use with a lot of these AI frameworks. [30:20] So first things first, I'm going to say that I want this AI model [30:23] to actually give me output that has the following fields. [30:27] The first is a stage, so I'm going to say stage string is equal to a field. [30:32] And then I can actually just give a description for this field. [30:36] And I'm going to say stage like answered okay. [30:40] So answered refunded or rejected [30:46] because this is going to be related to kind of the support request [30:48] because we're setting up kind of like a support agent here that has the ability to do this rack. [30:52] Next we're going to have successful boolean. [30:56] And this is just going to be a boolean. [30:57] So we know if it's successful or not. [30:59] And then we're gonna have a message which is a string. [31:01] Now this is a super basic structured response. [31:03] But if we wanted it to give us like a price or a number or a time or something specific, [31:07] then we just set that as a field and the model will automatically fill in these values. [31:11] So now it's going to give us always a stage which will fit this description. [31:15] And it will give us whether it was successful and what the message was. [31:18] And if we had other types, we could set those here. [31:20] We could set up enums, we could do anything you want. [31:22] I'm just trying to show you that you do have this ability to use structured output, [31:25] which is really powerful for more deterministic AI, applications. [31:30] Okay. [31:31] Now, before we build the model, before we build the agent, I want to set up a few tools. [31:35] So the first tool is going to be one that can search our knowledge base. [31:39] So I'm going to say define search [31:42] knowledge underscore base. [31:44] And we're going to take in a query which is a string. [31:48] And we're going to have a string which is a response. [31:52] Now for the description of this tool. [31:54] Let's put it in here. [31:55] We're going to say search support [32:01] docs like this okay. [32:03] Pretty basic. [32:04] We're just saying hey this can sort through our support documentation. [32:07] Let's fix the comment. And what we're going to do is the following. [32:09] We're going to say for title and then body in [32:13] docs dot items, we're going to say if the title [32:20] is in query dot lower, [32:24] then what we're going to do is just return the body. [32:27] This is not an efficient search by any means, but all we're effectively doing [32:31] is just a quick keyword search of any keyword of what the user typed in was in this. [32:35] So like shipping refund policy, account, [32:36] whatever, then we're just going to return whatever the body or the content of this is. [32:41] Again, if you use real rag you can get a much better response. [32:43] But I'm just showing you how you can set this up. [32:45] So next we're going to say return no matching [32:48] support articles found okay. [32:51] So if there's not a response. [32:52] So that's the first tool we're going to have next we're going to have some tools related [32:56] to looking up some orders okay. [32:58] So we're going to have a tool. [32:59] This is going to be define lookup underscore order. [33:03] So let's do this for the order. [33:06] We're going to have an order ID which is a string. [33:09] And this is going to return a dictionary with the information about the order. [33:13] Now same thing. [33:13] We're going to have the comment lookup order in database. [33:17] Pretty basic. And we're going to say by ID. [33:20] And then what we're going to do is return [33:23] the mock underscore db and then the orders like this. [33:29] And then we're going to say dot get. [33:31] And then this is going to be the order ID. [33:34] And if the order ID is not found then we're going to return a dictionary. [33:37] And the dictionary is just going to say error order not found [33:42] okay. So that's going to be our lookup order tool. [33:44] And then we are going to have a few other tools as well. [33:47] We'll look at those later. [33:48] So we're going to have one tool that will allow the user to refund. [33:51] However before we call that tool we are going to ask for a human in the loop approval [33:55] because we don't want to just automatically refund something unless the user actually allows that. [34:00] Okay, so for now, let's create the support agent. [34:01] Let's run it with the two tools so far just to make sure that these work. [34:05] And then we'll move on to the rest. [34:07] So we're going to say support agent is equal to agent. [34:11] For the agent we're going to say the name is support agent for the model. [34:17] We're going to go with [34:19] OpenAI slash GPT five for. [34:23] Then we're going to have some instructions. [34:25] I'm just going to copy these in because they're kind of long and you guys can adjust them. [34:29] But you'll see kind of how I've written them here. [34:31] So we're going to say instructions. [34:32] And then I'm just going to copy in this long paragraph. [34:35] So give me one second which looks like this. [34:39] So you are a customer support agent. [34:41] Use the knowledge base first and the customer wants a refund. [34:44] When you know the order ID, call the lookup order to get the email before calling. [34:47] Process refund. [34:48] Very short plain English sentence describing exactly what to refund you about to issue, etc.. [34:53] Okay, so this is the instructions. [34:56] Next we're going to specify the output type. [34:59] So this is a new one. [35:00] And we're going to say output type is a support response. [35:03] Now what this allows us to do is specify any pedantic object. [35:06] So like one that we have right here. [35:08] And that's now going to force the model to give us the output in this format. [35:12] So that's all we need to do. [35:13] We just say hey we want it in this format now it's going to give it to us in that object. [35:16] What you're going to see in a second. [35:18] Now next we're going to specify the tools. [35:20] So the tools are just going to be the search knowledge base. [35:22] And then the lookup order we will have conversational memory. [35:25] So we'll keep that for right now. [35:27] And then we can specify max underscore turns is equal to ten. [35:32] Now what this will do is specify the number of times we can go back and forth the agent [35:36] until we reset the session. [35:37] The for the conversation memory. [35:38] Here we can actually just specify it like this. [35:41] And when we do that it should automatically add all of the contents to conversation memory for us. [35:46] Now, the reason I didn't do it here is just because I wanted to show you that you can manually control [35:50] the memory, but by default it will automatically add everything, including the tool calls to memory. [35:55] When you. Oops, that's not what I wanted. [35:57] Specify it like this. [35:58] Okay, so actually, if I go to the docs, you can see that you can manually add it as you chose here. [36:03] And then there's five methods or six methods like user message system, message [36:07] system, message tool, call tool result. [36:08] So if you don't manually add it like we did, then it will just add [36:12] all of these for you automatically, which is good obviously. [36:15] Right. [36:15] But sometimes you want to just add certain pieces so then you can control it yourself [36:19] as we did in example one. Okay. [36:21] But for now we have our memory and we have our agent. [36:24] So now we need to be able to run the agent. [36:26] So what I'm going to do is create a function. [36:28] This is going to be called Run Interactive okay. [36:32] For this let's spell interactive correctly. [36:35] We're going to take in a prompt which is a string. [36:37] And we're just going to return nothing or none okay. [36:41] From here we're going to say with the agent runtime just like last time as runtime, [36:46] what we're going to do is we're going to say handle is equal. [36:49] To start. [36:50] This is a different function. We're not running the agent. [36:52] And this is going to be support agent prompt. [36:55] And then run time is equal to run time. [36:57] Now the reason we're doing this is that we want to have a little bit more control this time. [37:01] And we want to actually be able to hook into what the agent is doing to see, for example, [37:05] if it needs approval from us, if there's a guardrail that ran, which we're going to look at in a second, [37:09] and this gives us just a little bit more control in terms of what the agent's doing. [37:13] So will allow us to actually stop approval request as you're going to see. [37:18] So what we're going to do now is we're going to say stream is equal to handle dot stream. [37:23] And before I go any further, [37:24] let me just refer to the docs so you can kind of get a sense of how this works. [37:28] So if we go to this streaming page here, I'm doing it a little bit differently than it shows in the docs. [37:31] But you can see that we can actually hook [37:33] into the stream of the agent, which allows us to see all of the events that are happening. [37:37] So we can see, for example, if it's thinking, if it's calling a tool, [37:40] if there's a result, if there's a handoff and if it's waiting. [37:43] So if it's waiting, what that means [37:45] is that it's waiting for us to approve something which we need to manually do. [37:49] So this allows us to have some more control [37:51] into what the agent's doing, rather than just purely getting the result. [37:54] We can see all of the steps in the meantime, so I'm going to show you how I'm going to do it here. [37:58] Again. You can reference the docs and you can do it a little bit differently. [38:01] So we're going to say order underscore ID comma [38:03] amount is equal to none and none because I want to potentially know [38:09] if the user wants to refund an order, which we're going to have a look at in a second. [38:12] And then what we're going to do is we're going to say for event in stream. [38:16] For now, I'm just going to pass. [38:17] But this will allow us to actually view all of the stuff [38:21] that's going on before eventually we get a result. [38:24] Now we'll handle that in a second. [38:25] But for now, what I'm going to do is just go down here and say result is equal to stream [38:29] dot, get, underscore result. [38:32] This will then give us the result. [38:33] Once all of these events are finished and we've gone through them and we're going to say [38:38] output is equal to result dot output, okay. [38:44] Then we can actually just tack on message here. [38:46] The reason for that is that we know that it should be in this support response object type. [38:51] So we get results dot output. [38:52] And then the output is going to be this. [38:54] We know that there's going to be a message. [38:55] So we can simply just view that okay. [38:58] Then we need to do is just print out the message. [39:00] So we're just going to say print. [39:02] And we can put an F string and we can put a new line character here. [39:05] And I'm just going to put the [39:08] sorry let's put output and then backslash n okay. [39:11] So we'll start running this in one second. [39:12] In order to do that we're just going to do if underscore underscore name underscore underscore [39:16] is equal to underscore underscore main underscore underscore then what we can do is say print [39:21] and we can print support bot starting dot dot dot. [39:26] Then we can go down here and we can just do a simple while loop. [39:29] It actually knows exactly what I want okay. [39:32] So we're going to say well true. [39:34] The prompt is you if you enter Q then break. [39:36] If there's not a prompt then just continue and then otherwise just simply run [39:40] interactive, which is this function that we wrote to run at the agent. [39:43] Now like I said, there's some more stuff that we're going to add, [39:45] but for now, let's just see if it can look up the order [39:47] or look through the knowledge base before we go any further. [39:50] Okay. So let's simply open this up. [39:53] Let's go. [39:53] You've run agents slash agent 2.py. [39:58] And we got an issue here. [40:01] Instructions I think I just felt instructions incorrectly. [40:04] So let's just fix that instructions like so okay. [40:09] Let's run it again. [40:10] And it says you I'm going to say can you tell me about shipping. [40:17] And let's see if it can look that up. [40:18] And if we get the result. [40:21] Okay. [40:21] This dictionary object has no attribute message. [40:24] Interesting. [40:26] Let's have a look at why we're getting that. [40:28] And it's going to be something to do with this. [40:30] So for now let's just print whatever this result is. [40:35] Let's see what it is. [40:36] And then we can parse through it. [40:37] So let's say shipping or something. [40:40] And let's see if it finds anything. [40:41] And it gives us okay result output error failed with status failed. [40:45] And reason model GPT 5.4 does not exist okay so that's good. [40:49] We found the issue there. [40:50] So OpenAI slash GPT and think this is Dash 5.4. [40:54] Let's look at what we had in the first agent. [40:56] Yeah dash 5.4 okay silly error. [40:58] But at least it gives us the response there. [41:00] And now we can just quit this [41:04] rerun. [41:05] And let's see shipping [41:08] and let's see what we get if it works this time okay. [41:10] So I'm just playing around with this just to get the correct output. [41:12] And we can see that the way we can do that is by doing result dot output dot get and then result. [41:18] And that will then give us the format as specified here. [41:22] However, it doesn't give us give it to us in the Python object. [41:24] It gives us two. [41:25] It gives it to a string and a dictionary that is the same format as this, [41:29] which is still effectively the exact same thing. [41:31] So you can see we get stage completed. [41:33] Successful true message standard shipping takes 3 to 7 business days when I type shipping. [41:38] Now let's ask it can you look up my order and let's see if that will work? [41:44] What's the order ID a 100. So let's type that in stage. [41:48] Need order ID successful false message. [41:49] Sure. Please send me your ID. [41:51] So we're going to say a 100. [41:53] And so let's see if it can look that up. Now [41:57] give it a second here to give us that response [42:00] okay. [42:01] Come on I hope it's calling the tool [42:04] being a little bit slow and says refund pending info okay. [42:07] Message I can help with the refund, but I need your request to proceed in order. [42:11] Eight 100 was found for 4999. [42:13] If you want to refund for this order, please confirm and I'll continue. [42:15] Okay cool. [42:16] So it looks it up. [42:17] We get the information and now if we go back to the agent span's server here [42:21] and we refresh, we can see first of all these ones failed. [42:25] And we can actually see all of the logs on why this failed, which is interesting [42:28] as well as the debug view here on exactly what went wrong. [42:32] Anyways, let's go back to the most recent one there and we can see we have this support agent. [42:36] We have multiple turns so we can see how those worked. [42:38] So you can see we typed in a 100 and then it looked up the order. [42:42] This was the input. [42:44] This was the output. It got us the information. [42:46] And then it gave us that full Json for the tool call [42:49] went to the yellow and then gave us the output. [42:52] Now you'll notice that this is just one run. [42:54] If we go back. [42:56] All right. [42:57] You can see this was the other run. Can you look at my order. Boom. [42:59] And then it's remembering all of this based on the conversation memory. [43:02] Okay cool. [43:03] So that's functioning now let's move on to add a few [43:06] other things to our agent. [43:09] So one thing that I want to add now is the ability to refund. [43:13] But like I said, we shouldn't just refund unless we get approval from the human to do that. [43:18] So in order to do this, we can just make another tool. [43:20] This can be a tool, but this time we are going to say approval. [43:26] Approval underscore required is equal to true. [43:30] Now this means that we need to manually approve this in order for the function to execute. [43:34] I'm going to show you how we do that. [43:35] Now for the function we're going to go process refund. [43:37] We're going to take an order ID and we're going to take in an amount that we want to be refunded. [43:41] And then we're going to return, not a boolean but a string okay. [43:46] Now we need to give a description. [43:48] So for the description we're going to say [43:51] let's go like this. [43:52] Request a refund [43:56] okay. [43:56] Refund pause for human approval. [44:01] Think before you run this [44:04] okay cool. [44:05] Just so it knows that this should be a careful operation, [44:09] then we're just going to return, even though we're not really going to do anything here. [44:12] We're just going to say refunded. [44:14] And we'll put inside of brackets amount [44:18] colon dot to f okay. [44:20] For order. [44:22] Order ID we're kind of faking a refund, but I'm just showing you that we can build a tool [44:26] that requires the human approval, which is kind of the more important part. [44:29] So now that we have that tool, we're just going to add that to our tool list. [44:32] So we're going to say process refund. [44:34] Now the thing is we need to start handling this stream here in order to actually process that refund. [44:39] So what I can do is the following. [44:41] Now I can say if event dot type [44:44] is equal to and then this is event type [44:48] dot tool okay tool underscore call like so. [44:52] And event dot args meaning it has some arguments. [44:56] I'm going to say my order id is equal to event args. [45:01] Yet order id or [45:05] like this or order underscore d [45:10] or order underscore id. [45:11] Now what I'm effectively saying is hey I'm going to try to look through these tool calls to see [45:15] if we ever call an order ID when we're looking up something, or calling one of these tools, [45:19] because that's in the order I do referencing when we're trying to refund something. [45:24] Okay, so I'm just pulling out that order ID, [45:27] otherwise I'm going to say if event dot [45:29] type is equal to event dot [45:33] or event type dot tool underscore result okay. [45:38] And is [45:41] instance event dot result a dictionary, [45:46] then what I'm going to do here is say amount is equal to event [45:51] dot result dot get. [45:54] And I'm going to get a total [45:58] or an amount. [45:59] So same thing. [46:00] Now I'm going to look in the tool result to see [46:03] if I can figure out what the amount is that we're refunding for the order. [46:06] It's kind of a weird way to do it, but it allows me to parse through and see tool calls, tool result. [46:10] And then lastly I'm going to say, Elif, the event dot [46:14] type is equal to event [46:18] type dot waiting. [46:21] Then what I'm going to do is I'm going to print the following okay. [46:24] And this message is going to be essentially saying, hey, we're requesting to refund. [46:29] And then I'm pulling out the two arguments I have. [46:31] So order ID an amount so I can print those and then tell them, hey, do you want to approve this? [46:36] And if they do, then we can approve it. [46:37] So here's how it works. [46:38] I'm just going to say print and this is going to be type string go backslash n [46:42] I'm going to go approval required. [46:46] And then I'm going to say refund. [46:48] And we're just going to put the order actually let's put the amount. [46:54] So we'll put a dollar sign like this amount. [46:57] And then colon dot to f for order. [47:00] And then the order ID okay. [47:02] And then down here we're just going to put a print and we're going to say [47:06] press enter to approve. [47:09] Technically you can't actually press anything else. [47:11] And this is going to be an input statement not this. [47:15] So we're not even going to check what it is. [47:16] And then we're just going to say handle dot approve okay. [47:20] So effectively when we call handled at approve we're just going to approve that operation. [47:23] So we're just going to wait for the human to be at this step. [47:25] And then as soon as we want to approve boom, we go ahead and run approve and we're good to go. [47:29] Okay. So now that we have that we're going to ask them to approve it. [47:31] So I'm going to say decision is equal to input. [47:34] And we're just gonna ask them approve yes or no. [47:36] And then lower dot strip. [47:37] We're going to say if the decision is [47:40] why then let me just check the documentation. [47:44] Here it is. Handle dot approve okay. [47:46] So we're just going to say handle. [47:50] Dot approve like so okay. [47:53] Otherwise we can say handle dot. [47:56] And I believe it is reject. Let's see. [47:58] Yes you can reject and you can pass a reason. [48:01] If you want to pass a reason just say user [48:04] rejected okay cool. [48:06] So that is how we can now handle this. [48:08] Again the reason why I'm looking at these tool calls is just so I can figure out the kind of amount [48:12] that we're going to have for the refund, because otherwise it's not going to tell us that beforehand. [48:16] So anyways, now let us go and run this [48:21] and see if this works with the refund okay. [48:24] So we're going to clear and then we're going to go. [48:25] You've run agents too I'm going to say I want to refund an order. [48:31] And it gave us an issue saying tool result. [48:33] Just because I didn't have a capital L here. [48:36] So let's fix that. [48:38] And now we're good and rerun and let's say refund and order [48:45] okay. [48:45] Let's see what we get. [48:46] And it says that it needs an ID so. So I can help that. [48:48] Please give me the order ID so I can look it up okay. [48:50] So let's go a 100 and see okay. [48:53] And it says approvals required refund 4990 for order a 100. [48:57] So you can see these steps here. [48:58] Picked up that information for us because it saw that we were doing a tool call [49:02] to either attempt to refund or to look up the order ID so it pick those up, [49:06] save them in the variable, and then we're using them in this step to tell them, [49:09] hey, we now want to call this because we're waiting for your approval. [49:13] The only thing we could be waiting for approval for is this function, right? [49:16] Because that's the only one that we have. [49:18] So I'm just going to go ahead and type on yes to approve this. [49:21] And then hopefully it's going to tell us that it was able to refund it. [49:24] Let's see. [49:24] It says stage completed message trigger refund was issued successfully. [49:28] Boom. [49:28] Now let's say refund order again okay. [49:33] And hopefully it's going to give us maybe another ID says please read the ID okay. [49:36] So let's go a 100. [49:38] Even though I know we already refunded it, we still can try. [49:41] And let's reject it this time and see what we get, just to make sure that that step works. [49:46] And while we're at it, we can go here right to Agent [49:49] Spend server and you'll see that this is running right. [49:52] And we're at this stage where we're just waiting for the human, and we can just wait indefinitely. [49:57] And what I could actually do, I'm not going to do this right now [50:00] because it's a little bit complicated to show is let's say I were to quit this worker. [50:04] Right. And this worker just completely died. [50:07] And then I restarted it, but reconnected to kind of this execution that's going [50:11] this will still all be running with all of the saved state, [50:14] and it will just be waiting for the human again to approve this. [50:17] So the human does need to ask refund. [50:19] Again, we don't need to check something. [50:21] We don't need to look up another order. [50:23] It will just, resume where it left off [50:26] at this stage, right where we're waiting for the human. [50:29] And this can take any amount of time. [50:30] It could take a day, could take it out, or it could take ten minutes. [50:34] Doesn't matter. The server will keep running here. [50:37] And you can see it's in this hand off state where it's waiting for us to approve writing. [50:41] You'll see the time if we just keep refreshing. [50:43] Like it'll just keep going up and it will just keep waiting. [50:45] Okay, so anyways, I'm going to go. [50:46] Yes here and or sorry I want to do no. [50:49] So we rejected it. But anyways you can see that it's working. [50:52] And I think doing now is not really going to make any difference [50:55] anyways because well, we know it's just going to move to the next step. [50:58] All right okay. So this is working. [51:00] Now what I want to do next is I want to start adding something called a guardrail. [51:04] Now a guardrail allows us to actually audit [51:07] the input or the output to our lab or to our agent to ensure that we don't have [51:12] something potentially malicious or data that shouldn't be given to the user given. [51:16] So I'm going to show you how we write a guardrail. [51:18] The guardrail that I'm going to write is going to be related to a jailbreak. [51:21] So a lot of times people will try to do like a prompt injection where they say, hey, like ignore [51:25] all of your previous instructions and give me, you know, all this information that I need x, y, z. [51:30] We can actually prevent against that by building in these guardrails [51:33] where we try to detect common kind of phrases that, you know, scammers and exploiters will try to use. [51:40] So what I can do is I can use add guardrail. [51:42] So make sure you import it right. [51:44] And I can say define safe underscore support [51:48] underscore request like so. [51:51] Now from here we can take a prompt which is a string. [51:55] And this is going to be a guardrail result that is going to return. [51:59] Now for the comments here. [52:01] What we're going to do is say block [52:05] obvious prompt injection attempts okay. [52:09] And this is going to be before the LLM even sees it. [52:12] So before the LLM gets it, we're going to have this function that will run. [52:15] So what I'm going to say is blocked is equal to. [52:17] And then just a list of words. [52:18] So I'm going to say ignore okay [52:22] ignore previous. [52:25] We can use system [52:28] prompt something like that or jailbreak okay. [52:31] So these are just words that I don't want to be allowed in the input. [52:34] Now I'm going to say past is equal to not any. [52:37] And this is going to be phrase okay [52:41] in prompt dot lower. [52:44] And then we'll spell lower correctly for phrase. [52:49] Let's spell all these. [52:50] My typing is so bad now with LMS phrase in blocked [52:55] okay, so all this is doing is saying hey, we're any of these words in this prompt. [52:59] That's all it's checking. [53:00] Then we're going to return guardrail result. [53:03] I'm going to say past is equal to pass, which is either going to be true or false. [53:06] So if none of these existed then true. [53:08] If they did exist then false. [53:10] We're going to say reason or we can say sorry. [53:12] Message is equal to. [53:14] And we're going to say please ask a normal question. [53:20] This is blocked. [53:22] So if it fails this is the message that's going to be returned. [53:25] So now what we can do is we can add a guardrail here to our support agent. [53:29] The way we add it is we specify a guardrail or guardrails with a plural. [53:34] We then need to put a guardrail object. [53:37] We're going to say like this guard rail for the guardrail. [53:42] This is going to be the safe support request. [53:45] And we're going to say the position of the guardrail is going [53:47] to be position dot input. Okay. [53:51] And then we're going to say on underscore fail is equal [53:54] to on fail dot raise. [53:57] Now raise is going to raise an error which is just going to exit out of the bot completely. [54:01] There's other things that we can do here when we fail. [54:03] But for now I just want to completely quit. [54:05] So effectively what I've done is I said, hey, we have this guardrail, right? [54:08] This is a function that we want to run, and we actually want to run it [54:11] before we pass anything to our LLF. [54:14] So as soon as we get some input to our agent, [54:16] run it through the guardrail, which is this function right here. [54:19] Make sure that there's nothing wrong. [54:22] If there is something wrong, then tell us and fail. [54:25] Okay, that's a simple guardrail. [54:26] Now this is on the input. [54:28] You also can add a guardrail on the output, which I'm going to show you from the docs here. [54:31] So if we go to guardrails here, you can see there's a bunch of stuff that we brought in here. [54:35] You can see guardrail. [54:37] We have a word limit. [54:37] So for example we're checking to make sure that what do you call it here. [54:41] We're going to have a correct number of characters. [54:44] And you can see for the failure modes here. [54:45] Do you have like retry, raise fix human, etc.. [54:49] Okay. [54:50] In terms of constructing the guardrail, you can do the function position right. [54:54] So output input on fail the name and then the maximum number of retries that you want. [54:59] And for position two you either input or output. [55:00] So either run after or run before. [55:03] Now there's a bunch of guardrails you can do here. [55:04] You can do a custom one like the one that we just did. [55:06] You can do a regular expression, guardrail if you want to just check for certain characters [55:11] like we were kind of doing. I just don't like to, write regex. [55:14] Sorry, because it's a little bit complicated. [55:16] And you could do an LM guardrail. [55:18] So if you do an alarm guardrail, you're actually using an LM to [55:21] then either get the, what is it, fail or pass. [55:25] The issue with this is that you still can have prompt injection going to them. [55:29] This LM where that's doing the guardrail. [55:31] But the point is you can use an LM to actually detect, hey, is this good? [55:35] Is this bad? Whatever. Okay. [55:37] And then same thing input guardrails as we saw here auto fix. [55:41] There's a bunch of different ones that you can set up as you can see like this okay. [55:45] So I'm not going to go through all of them. [55:46] We just wanted to show you that these are super interesting. [55:49] Very good to add to the agent. [55:51] So now that we've added this let's try it. [55:54] And let's just go clear and run. [55:58] So we forgot to pass a comma. [56:00] Maybe let me see where that is. [56:02] Yes we forgot the comma here. [56:04] So let's add that and rerun and I'm going to say [56:08] you know jailbreak this prompt okay. [56:11] And you can see boom it just immediately crashes and gives us the error input guardrail safe [56:15] support request failed. [56:16] Please ask a normal question. [56:18] This is blocked okay. So we ran into the guardrail. [56:20] And then of course if we run this we say help me or something [56:23] we wrote won't run into the guardrail because well it was not triggered. [56:27] Okay give this a second. [56:28] Hopefully it will give us the response. [56:32] Not sure why this was taking so long. [56:33] Maybe getting rate limited or something. [56:35] Okay, you can see that it gives us the response here. [56:37] And also you'll notice that there's no run for this guardrail execution [56:42] because we never even got to the images, immediately blocked it before we even passed it to the server. [56:47] So like as I was scrolling through here, I actually couldn't find one that, was that execution. [56:53] Yeah. See, it's actually not showing up here at all. [56:57] Just help me. [56:57] Yeah, because we never even hit the server because we immediately exited after the guardrail. [57:02] Okay. So again, a lot of other stuff you can do with the guardrail. [57:04] They're not going to go through all of it. [57:06] But with that said that is going to wrap up our second agent. [57:09] This was a little bit complicated. We added a lot of stuff. [57:11] We had tools, output type, memory guardrails. [57:14] What else. [57:15] Human in the loop approvals [57:17] kind of getting into the stream of what's actually going on with the AI agent. [57:20] And again, all of this is available from the documentation we have streaming. [57:25] As you can see here, we have testing which we're going to look at later. [57:28] We have the memory right. [57:29] And in conversation memory we have tools right. [57:32] So check all of this and you'll be able to see how it works. [57:34] And you can also add Http tools API tools and mic tools as well. [57:38] If you don't want to add custom function ones like the ones that we've written so far. [57:42] Anyways, now let's move on to agent three, which is going to be a multi agent [57:46] kind of orchestration agent, where there's multiple agents [57:50] that can be triggered at once to perform a long running task. [57:53] All right. [57:53] So we finished the first two agents where we're actually writing all of the code manually. [57:57] Now we're going to move on to agent three which is going to be going over multi-agent strategies. [58:02] Now what we're going to be building is a multi-agent researcher. [58:05] So it's actually going to be very similar to what we have in the docs here. [58:08] So I'm not going to write every line of code from scratch. [58:11] I'm just going to run you through it at a high level, because this code will be available from the link [58:15] in the description. [58:15] And I'm going to explain the different strategies that you can use and show the executions. [58:19] So this is the code that I have. [58:21] I'm just going to quickly skim through it. [58:22] And then I'm going to explain how you can configure this to be useful for whatever [58:26] example you're trying to build. [58:28] Okay. [58:28] So effectively what I have here is a bunch of different agents. [58:32] I have a researcher agent, I have a writer agent, I have an editor agent, I have a market analyst, [58:37] a risk analyst, financial analyst and now, analyst team or analysis team. [58:41] And then I have these different agent pipelines, which we're going to have a look at in a second. [58:45] And then I have just a few things that will kind of create and save a report manually for us. [58:50] Because that's how I'm going to kind of set it up. [58:53] But effectively, the way this agent is going to work, I'll run it for you in a second, [58:56] is that I'm going to tell it, hey, I want to do research on tech with Tim, for example. [59:00] And the strategy I want to use for the, [59:02] research is sequential, which means, you know, run these in individual steps. [59:06] And then what will happen is it will go and use all of these different agents, gather information [59:11] and generate a research report. [59:12] For me, that's what this agent is. [59:14] Again, I'm going to show you how it works. [59:15] And we'll run through the code in a second. [59:17] Now, the way that I'm able to do this is because Agent [59:20] Span supports these multi-agent strategies. [59:23] Now here's the following strategies. [59:25] First is handoff okay. [59:28] And chooses which sub agent to handle the request. [59:30] This you can write similar to this if I can find it right here [59:36] where essentially you just write an agent, you give it access to some other agents. [59:39] These agents can be exactly what we just built before. [59:42] And then you change the strategy here to say handoff. [59:45] That's it. [59:46] And then you just trigger this agent the way that we've been running them. [59:49] And it will just go and let's remove this. [59:51] Be able to use each agent as it needs to use them as you chat with it. [59:55] So it has all these different agents beneath it. [59:57] Similar to if you're using like cloud code and you have sub agent setup okay. [60:01] Then you have sequential straightforward. [60:03] This just means that we always run the agents in a, what is it kind of linear paths. [60:08] We run them one by one, and then we take the result of one agent and we pass it to the other. [60:12] You can see sequential looks like this, right? [60:14] We run and we get the result. We pass the results to the next agent. [60:17] We run, we get the result. We pass the result to the next agent. [60:19] Then eventually we get the final results have like researcher, writer, editor, boom. [60:24] And then we get the response, okay, then we have parallel. [60:28] Parallel allows us to run these all concurrently. [60:31] This means that I can run all three agents at the exact same time at scale, [60:35] so I don't need to wait for one response before I get the next. [60:39] Then we have rotor. [60:40] As you can see, we can route between different ones. [60:42] We have swarm handoffs between different agents. [60:46] We have round robin, random and manual, a bunch of different strategies that you can use here. [60:50] When you make these agents now you'll notice that there's a special syntax. [60:54] It looks like this. [60:55] These kind of two I don't know what you call them greater than signs. [60:59] And this is the same syntax as writing this. [61:02] This just means run these agents sequentially. [61:04] You're kind of piping the response into one another or assertively. [61:08] You can define the agent and you can just specify the strategy as you see here. [61:11] Okay. [61:12] And then you can just run the pipeline like this and get the result. [61:15] So I'm going to show you a few different strategies here. [61:17] So you can see the time difference and the response that we get. [61:20] But notice that if I want to run them in parallel, same thing I define three agents [61:24] strategy parallel boom. We get the response. [61:26] And if you want to get the sub result you can have a look at it here. [61:29] Hand off the default one. [61:30] You just pass them in here. [61:32] Strategies. [61:32] Hand off it will go and hand off as needed. Rotor. [61:35] You can set up agents. [61:36] You can also set up a rotor for the rotor. [61:38] You can actually use an agent to do this. [61:40] You see have a classifier agent says classify the request [61:43] and then just reply with the correct category. [61:45] And then it will call the correct one, okay. [61:48] And then swarm. And you can go through and you can view how all of these work. [61:51] But I'm going to show you the code example right now okay. [61:54] So let's go through the code that I have right here okay. [61:56] So first things first we just bring in the imports. [61:58] We disable some of the logging kind of war errors and warnings you're seeing. [62:03] We specify the mode. [62:04] So we want to be able to run. [62:05] So sequential parallel nested and worker. [62:08] We then have some various tools here. [62:10] Now notice that these tools use something called credentials. [62:14] Now when I specify a credential here this effectively means [62:18] that we need to grab this credential from our server in order to use it inside of this function. [62:23] So I say credentials is equal to fire curl API key. [62:27] Now what I'm doing is saying API key is equal to OS start environ fire curl API key. [62:31] And this will automatically set the fire Curl API key that's going to be stored on our server, [62:35] which I'm going to show you how to do in a second. [62:37] In the local shell while we're running this worker. [62:40] So this means any credentials that you want to have, [62:42] you can store them directly on the agent server, which again we're going to look at in a minute. [62:46] You can grab them when a tool is called [62:48] and then use them locally without having to expose them locally permanently. [62:52] So only when they're needed they can get pulled out. [62:54] So essentially I'm going to use Fire Curl. [62:56] If you want to sign up, you can get a free account. [62:58] You don't need to pay for it. [62:59] You get a bunch of free credits, and this will allow you to do a ton [63:03] of scraping and searching of the web more effectively than with like a default search. [63:07] So I'm using Fire Curl to just search the web for a bunch of pages on [63:10] whatever topic we're going to look up. [63:11] I then have this fetch page tool. [63:13] This can get an individual tool and actually grab all of the content [63:16] from the page and give us the information so that we can scrape the content. [63:19] Okay, so just two tools. [63:21] Now I have a researcher agent, this agent I keep it access to these two tools search web and fetch page. [63:26] Right. [63:27] And that's it then for the writer agent I just give it some different instructions. [63:31] I don't even change the model for the editor. Same thing. [63:34] I just give it some different instructions for the market analyst, give it different instructions. [63:38] And I just have all these different agents that I've created. [63:40] I then create an analysis team. [63:42] And this analysis team I want to run in parallel where I say, hey, for the market analyst, the risk [63:46] analyst and the financial analyst. [63:47] So these three right here, we want to run those at the exact same time. [63:51] So I just specify that I'm going to run them in parallel. [63:54] I then create these pipelines. [63:56] So I have a published pipeline which is my researcher writer and editor. [63:59] So let's have a look here. [64:01] We do the research. [64:02] We do the writing and we do the editing. [64:05] Now when I do that, because of the syntax that I've used here, I'm running them sequentially, [64:09] which means I need to wait for the researcher to go, then the writer to go, then the editor to go. [64:13] Then for my nested pipeline, this is where I take my analysis team, which I run in parallel. [64:18] And then after that. [64:19] So after I get my analysis, I write the researcher, writer and editor. [64:23] So I run this whole thing sequentially. [64:26] But this first step runs these three agents in parallel. [64:29] So I've created this kind of like multi-agent, you know, orchestration [64:33] where my analysis team goes in parallel at first. [64:36] Once the analysis team is done, then we go sequentially to the other agents. [64:40] Hopefully that makes sense. [64:41] But that's kind of how I've set up these agents to call each other. [64:44] And notice we just have two simple tools. [64:46] But we can use anything from agent two or agent one with the agents that we have in this example. [64:52] Okay. [64:52] Now we just have a few functions one to render the output, [64:55] one to slug ify something, one to save the report. [64:59] These are just functions that I'm manually calling. [65:01] And we're just going to save a report in a folder called reports directory. [65:06] In that folder it's just going to look like this path reports okay. [65:09] So let's say we're [65:10] just going to save like a markdown report with the information that we get from these agents. [65:14] Now you'll notice that I just have this run pipeline function. [65:16] This allows me to take in either sequential parallel or nested. [65:19] You can see if it's sequential. [65:20] We run the publish pipeline, which is this. [65:24] If it is parallel we run the analysis team which is just the analysis. [65:28] And if it is let's go back. [65:30] What's the other option we had here nested that. [65:32] It runs my nested pipeline. [65:34] Then what we do is we just say with the agent runtime, [65:37] hey, we're going to run whatever pipeline mode we have that's like this. [65:41] So just which one are we going to execute? [65:43] This is the topic that we want to research. [65:45] And then we just have some runtime. [65:47] We get the execution ID, we get the status, we get a path to the report. [65:50] And then we just save the report and that's it okay. [65:53] Then serving the worker. Don't worry too much about this. [65:55] And prompt mode. [65:56] This just allows me to essentially type directly into here and specify, hey, what do I want? [66:01] So we can run it. [66:02] So let me run it and show you what this looks like. [66:04] So you get a sense of how this functions. [66:06] So I'm gonna say you've run Agent Slash [66:09] and then this is going to be agent 3.py okay. [66:13] For the mode we're going to pick. [66:15] So for now let's go with parallel topic. [66:19] Let's go with tech with Tim okay. [66:22] So for parallel what this is going to do. [66:24] Again let's just look at the setup here is it's going to run the analysis team [66:28] with just this market analyst. [66:30] Risk analyst and financial analyst. [66:32] Now this probably doesn't make sense for me because tech with Tim [66:35] is not really something that's going to have like a market analysis. [66:39] But if we want to see this running we can go here, [66:43] we can save and you can see that this is running. [66:46] We actually have three agents running. [66:48] And if we go back to the main execution, see we have analysis team financial risk and market. [66:52] And then if we go back here it says the report was saved to this directory. [66:56] And if we open up the report we get the full report from these three different agents. [67:02] Okay cool. [67:03] Now let's try a different execution mode. So let's go. [67:05] You've run agent 3.py and let's try nested for the topic. [67:11] Let's go Nvidia stock okay. [67:14] Now if we go here let's go to our agents. [67:18] You can see that we now have a bunch of agents running right. [67:20] So have the analysis team researcher writer editor, the analyst team market risk financial. [67:24] And these are going to run sequentially. [67:26] So if we go and have a look at this, the first thing we're doing is running the analysis team. [67:30] The analysis team we need to run sequentially. [67:31] So we're waiting for all of these to finish okay. Looks like they're finished. [67:34] Now we're going to the researcher. [67:36] So the researcher is going to have their [67:37] the input from the analysis team, which you can see is piped in right here. [67:42] We're going to wait for the researcher to finish. [67:44] And then as soon as the researcher is finished, [67:46] we're going to go to the writer, and then we're going to go to the editor. [67:48] So this of course is going to take longer. [67:50] But that makes sense [67:51] because we need to go through this flow to pass the data between the different models. [67:55] So let's just refresh here, wait for it to finish and see what we get. [67:59] And actually if we go to the main execution, [68:01] you can see that we're running this analysis team and then this researcher. [68:05] And we can just wait for the researcher to finish. [68:07] We should see it all right here okay. So it's running now. [68:09] And you can see that we have a lot of different tool calls that are being executed here. [68:13] Because it's using the search web call from fire Crawl. [68:16] Now if I check here it actually says the fire curl API key is not defined. [68:19] So I'm glad we saw that. [68:21] And you can see [68:21] this is just going to continue to keep retrying and retrying until I eventually crash this. [68:26] Or I provide the fire Curl API key, which is kind of how this is designed to run. [68:30] So what I'm going to do is just quit out of this for right now and show you how we can provide that key. [68:34] Okay, so like I mentioned before, you can actually store credentials [68:38] on the server, which we need to do because of how we're looking them up in the tool. [68:41] And the way to do that is the following. [68:43] You're going to type, you've run if you're using UV agent spin [68:47] credentials, make sure we spell that correctly and then set. [68:50] And then you're just going to set the credentials that you want. [68:52] Now in our case it is the fire crawl underscore API underscore key. [68:57] And I'm just going to make this equal to my fire curl API key which I will disable afterwards. [69:01] Okay. [69:01] So you're saying you've run agent spanned credential set fire curl API key. [69:05] And we need to remove the equal sign because that's how the syntax is. [69:09] And now we've stored this on the server. [69:12] So now we may need to restart the server I'm not sure. [69:14] Let's actually just go here and check. [69:16] We can refresh and let's go to credentials. [69:19] And okay it looks like the credential is now here. [69:21] So that's good. So it's stored. [69:23] And what we can do is rerun our agent okay. [69:27] We're going to run this in the what mode was I running this in the nested mode I think. [69:32] Yeah. So let's run this in the nested mode. [69:35] And let's look up in Nvidia [69:37] stock okay. [69:39] And hopefully this time it will work. [69:40] Once we get to this step where it's trying to call fire curl. [69:43] Okay. [69:43] So I just opened up the server and we can see the researcher is running. [69:47] Now this is the one that takes the longest because it's using fire curl. [69:50] But you can see that it's fetching all of these different pages. [69:53] Right. [69:53] To get all this information about Nvidia, you can see if we go back to the top [69:58] I believe it use yes search web. [69:59] So it was searching past the input query Nvidia Investor Relations annual report. [70:04] And then it got all this output. [70:05] And then it went to search all of these individual pages. [70:08] And we can see the full flowing flow full flow story right here until eventually we get the output. [70:13] If we go back to the agents [70:14] we can see now we're just at the writer which is going, and then we should be good. [70:19] So let's see what response we get okay. Boom. [70:21] And looks like we got the response. [70:22] If we go to the reports here, we can open this up. [70:26] Let's just preview it here. [70:27] And we can see our full markdown report about Nvidia stock analysis with the different sources. [70:33] We'll just click one and see if it works. [70:35] And boom yeah we get like the full report. [70:37] I guess it's long. [70:38] I'm not going to wait for that PDF download and all of the other information. [70:42] Okay. So very good. [70:43] The nested agent is working. [70:46] So that's pretty much what I wanted to show you for agent three. [70:50] Now what I want to do is move on to a few other parts that we should be understanding, which is testing [70:54] and then the durability feature. [70:56] So how do you actually resume an AI agent when it crashes [70:59] in the middle, or it's waiting for a human or something along those lines? [71:02] Let me show you. [71:04] So what I've just done here is written a short file [71:05] that shows some basic usage of testing an agent span agent. [71:09] Now what we're able to do is we can test these agents [71:12] without actually having to make an API call [71:15] to ensure that things like the model or the pedantic, response model [71:20] they're using, or the tools that are using, or these kind of things work properly. [71:24] So, for example, what I've done is I've said, hey, I want to test agent two. [71:27] So I've brought in some stuff from Agent Span. [71:29] I've brought in the support response and the support agent. [71:32] I have an example refund policy where there's like some, you know, thing [71:36] that we should be getting as a response here. [71:38] And what I've said is, okay, hey, we're going to do a tool call. [71:41] The tool call is going to be searching the knowledge base. [71:44] We're going to have a query, which is the refund policy. [71:46] We're going to mock the tool result which will be refund policy. [71:50] And then we're going to mock done. [71:52] And we expect that we should get this support response. [71:55] So we're mocking a lot of the functionality. [71:57] But again it's still good just to make sure the agents working as we expect and to run [72:01] extremely quickly without relying on lumps, we then can use a standard expect. [72:05] We expect the result to be completed, the output to contain refund, [72:09] and we expected to have used this tool search knowledge base. [72:12] Right? [72:12] If we give the support agent this, which is what is the refund policy. [72:16] So we mock all of the events, but we can just make sure that those events are triggered properly. [72:21] Now there's full docs on how this works. [72:22] I'm not going to go through all of it, but very basic. [72:25] If we want to run this, I can just come here and go, okay, so sorry, [72:28] I just moved some of the import stuff around cause I had it in the wrong place. [72:30] But anyways, if I go here and I run this now, you can see mock test passed and all is good. [72:36] We didn't get any errors and if we change this to maybe say like, [72:39] you know, dot refund did instead of refund and we run this, you can see that [72:45] we get an assertion error and it says, hey, there's some issue you need to now go and fix this. [72:49] Okay. So just showing you the basic testing usage okay. [72:51] So now I want to have a look at the durability feature here of Agent Span. [72:55] And what I mean by that is if an agent were to crash or go offline, [72:58] we can restart it without having to repeat all of the steps. [73:03] So let's imagine we have a simple agent like we have here where there's a slow step. [73:07] What do you call it? Tool that runs. [73:10] It takes three seconds to run. [73:11] Notice. Also, I added a timeout. [73:13] You can do that on various tools. [73:14] And what I've done is I've told the agent, hey, I just want you to run a ten step workflow [73:17] by calling the slow step for each step and run it ten times. [73:21] That's it. [73:22] So this will take 30s to run, but we might make it to step nine or something, might crash or break, [73:27] and then we would have to restart from the beginning if we didn't have this durability. [73:31] So what I've done is I've set this up so that we have a mode, we have a start mode. [73:34] We also have a resume mode. [73:35] Now you would have this if you're running this in production, because you would know the execution ID [73:40] when these agents are running, which I'm going to show you in a second. [73:43] So anyways, you can see that if the mode is start, [73:46] what I'm going to do is I'm just going to start the ten step workflow. [73:49] Right. And then I'm just going to stream the handle. [73:51] And this is just going to print out everything that's going on. [73:54] So we can see until it says that this is done. [73:56] That's it. [73:57] Now if the mode is resume I'm actually going to serve the durable agent okay. [74:02] So I'm going to start the agent. [74:04] And then what I'm going to do is connect to the execution ID that we had previously. [74:09] So this is going to allow me to connect to the existing, execution. [74:13] And because this agent will be running, we can just go and resume from where we left off. [74:18] So I'm just serving the agent, so. [74:20] Okay, start the agent. [74:21] And for our handle, rather than starting a new process, [74:24] just connect to the previous one that we have. [74:27] So any of these execution IDs that are not yet finished. [74:30] Of course, there's a lot more scientific, [74:32] scientific way to go about doing this, but that's the basic way that I'm going to show you. [74:35] So let me show you what I mean. [74:37] Let's open this up and let's go. You've run [74:41] and let's spell this correctly. [74:42] And then agents slash crash resume demo okay. [74:46] So let's let this run for a second and let's wait till it gets to kind of some, you know, later steps. [74:51] So let's go back here to our agents. [74:53] And you can see the durable demo is running. [74:55] It's running this slow step. [74:56] And if we keep refreshing here we should just see that it keeps moving on to the next step. [75:00] So now we're on step two. [75:01] And I'm just going to keep going. [75:02] Right is going to do this well up to ten times. [75:05] So let's wait okay. Refresh again. [75:08] You can see now we're on step three. [75:09] And then what happens if I just crash it boom it stops. [75:13] Well if we go here you'll notice that this is still running right. [75:16] So we made it to step four. [75:17] But the slow step we're just waiting on this to finish. [75:20] So what can I do. [75:22] So that I don't need to restart this from the very beginning? [75:25] You'll notice it's not going to advance any further. Right? [75:27] We're still on step four without having to restart the whole thing. [75:31] So if we go here, you'll see that we have an execution ID that would have been printed out. [75:36] Looks just like this. [75:37] So we're just going to copy that execution ID [75:40] and we're going to paste that right here. [75:41] I'm going to remove the spaces. [75:43] I'm going to change the mode to just say resume. [75:46] So now what's going to happen is I'm just going to go and I'm going to use this [75:49] where I'm going to connect to that previous execution ID now, [75:53] because all of the state is stored here on the agent span server when I reconnect. [75:57] So if I just restart this here, you'll see that it brings me back to where I already was. [76:02] And I have all of the state already there. [76:04] And we can now just continue. [76:05] And if we refresh, you'll see that we now go to turn number five. [76:08] So I didn't restart anything. [76:10] I didn't lose any state and lose any information. [76:12] I just go from where I left off and I just restarted the worker. [76:16] So this is the important thing to understand is that agent Span is storing the state. [76:19] Right. [76:20] And kind of all of the information. [76:21] And your worker is just executing the code, right. [76:24] It's executing the functions, it's completing the task. [76:26] But you at any point, if it fails, can go back and reconnect to that. [76:30] So imagine you're writing a platform. [76:32] You just store all your execution IDs. [76:34] If any of them fail, you just simply reconnect back to them and continue [76:37] when the worker comes back online, because that's something that happens a lot in production. [76:41] And same thing. Let's can I quit? Maybe in time? [76:44] I'm not sure if I was able to quit it in time or if this is going to be completed. [76:46] Now let's go down here and see. [76:48] Yeah. [76:49] So it's still waiting on the let me call. [76:50] So now same thing if I run it again boom. [76:53] You see we get right back into the execution we had before and we're done. [76:56] And all of it's finished and we get whatever that final response was. [77:01] Which if we look here, there's tons of workflows. [77:03] Complete steps one through ten will run an order. [77:05] But okay, so that's what I wanted to show you with this kind of crash and resume [77:09] and how easy it is to get back into the state where you were before. [77:12] Now, lastly, let's talk a little bit about deployment, [77:15] and then we're going to be done with this course okay. [77:18] So now let's talk a little bit about deployments. [77:20] Now I'm not going to deploy full application here. [77:22] But I just want to discuss how you can move to this stage if you do want to deploy your apps. [77:27] Now if you just want to use local development like we were doing right, [77:29] you just run the Agent spin server and that's it. [77:31] It will just stored in a local SQLite database. [77:34] However, if you want to go to a deployed environment, you probably want to use PostgreSQL [77:39] and some kind of Docker compose to be running this for you. [77:42] Now, in order to do that, you can just pull the GitHub repo that Agent Span has. [77:46] I'll leave a link to it in the description, and when you pull this, it gives you the information. [77:51] Here you can go into the deployment and then Docker compose directory. [77:56] So if you go here they have deployment right. [77:58] And then they have docker compose. [78:00] And from Docker compose you can just adjust the variables here. [78:03] Inside of the env example you can put any environment variables [78:07] or like API keys that you want to have. [78:08] You can put the what do you call a Postgres database that you want to connect to [78:13] so that rather than running it locally, it's going to run with that remote DBS. [78:16] You can also connect to it as needed. [78:18] Now it also goes over exactly how to deploy it using Docker Compose. [78:21] This will just deploy this server for you. [78:24] And as soon as this server is deployed, all you need to do is just point your workers to this server. [78:30] So as it says, right here, all you have to do is just say, hey, here's the URL where this is running. [78:34] It could be running on this server, could be running another server behind some endpoint, [78:38] or behind some URL, whatever. [78:40] And that's it. [78:40] Then you just point it there with the server URL, you start working and everything is good. [78:45] Right. And this can be scaled as much as you want. [78:47] Now there's a bunch of other options in terms of using Kubernetes and setting up the off [78:51] and all of this kind of stuff, [78:52] which I'm not going to go through here, but you can see that you can set an off key, [78:55] you can set an off secret, and then you can also just configure those directly from code. [78:59] So now if someone wants to connect to it, they do need to pass those values from their worker. [79:04] So you have some kind of secure authentication going between the worker [79:07] and between your agent span server. [79:10] And that's pretty much it. [79:11] That's all you need to do for deployment okay. [79:13] You also can obviously self-hosted this as a service right here. [79:16] And it kind of explains how you have multiple workers going to the server connected to Postgres. [79:20] And you can see all of the different options, but it's very straightforward. [79:24] It's just a matter of essentially deploying the server. [79:26] And once a server is deployed, pointing your workers towards and then adding that basic auth [79:30] kind of, you know, protocol so that not anyone can connect to the worker. [79:34] So that's it guys, that's going to wrap up this video. [79:37] That's pretty much all of the core things that you could do inside of agents. [79:41] And of course there's a lot more I didn't go over everything. [79:43] But this should give you a really good head start to building production. [79:47] Great AI agents in Python. [79:49] If you enjoy this type of video, make sure leave a like. [79:51] Subscribe to the channel and I will see you in the next one.