[00:00] Everywhere you look, people are talking about AI agents. It's one of the hardest topics right now, but with so much information flying around, it's easy to feel overwhelmed, like you are already behind or don't even know where to begin.
[00:14] So I decided to dive into this topic for you, so you don't have to spend weeks piecing it all together yourself. I went through everything, classic AI books, online courses, research papers, and countless YouTube videos,
[00:27] also figure out what's actually going on with AI agents today. And to put it to test, I built an AI agent myself in Python to see if this stuff really works. In this video, I'll walk you through the foundations of AI agents
[00:41] and show you exactly how to build one step-by-step. By the end of this video, you'll have some concrete and practical ideas for how to create your own AI agents, whether it's for a personal project,
[00:53] something at work, or even a prototype for your next startup idea. Some time ago, some people said AI agents were just hype, but as AI getting smarter and more reliable, there have been some interesting developments recently.
[01:07] One recent example is Manus AI, a Chinese AI company who made headlines with a general AI agent which can plan trips, analyze data, create video presentations, and conduct in-depth research.
[01:20] Google also launched an experiment with a data science agent that performs data analysis autonomously. Open AI has also just recently released some brand new toolkit for building agents.
[01:33] And if you follow the news, big names like Jeff Bezos and Satya Nadella, believe AI agents will be the future of how we interact with computers and Bill Gates even says the biggest tech race right now
[01:46] is all about who builds the best AI agent. But as of today, reward AI agents still have a lot of rough edges and complex systems often require heavy human oversight. However, even with the current state,
[01:59] AI agents have proved useful in many time consuming tasks and it's becoming a game changer in many areas. There's a good chance you've already been using chat to be tea for a while. It's probably one of the most well-known AI agents.
[02:13] It can generate content but also browse the web, conduct research, and execute Python code. Other examples of specialized AI agents include writer agents that generate block articles,
[02:26] coding agents that write and debug code research agents that analyze documents and extract insight. Another common example is customer chatbot agents that handle customer questions
[02:38] and ultimate responses. That would choose the few examples among many, many use cases. So that's exactly why I want to help as many of you as possible to get up to speed with the solid foundation
[02:51] and the best insights on this fast-moving landscape. And as it turns out, there's really no better way to learn something than by actually doing it. So I figured what better time to try building an AI agent
[03:04] and see if it actually could solve one of my own problems. One of my own struggles is I pour my heart and soul into these YouTube videos but they only reach people on YouTube.
[03:16] And what about all potential viewers on Instagram or LinkedIn or those who prefer reading blocks? The reality is I hardly have time to manually repurpose every single video
[03:28] for multiple platforms. That's when I had this lightbulb moment. What if I could build an AI agent that would watch my videos and automatically create tailored content for other platforms in my own voice?
[03:42] So later in this video, I'll walk you through exactly how I built this social media content assistant agent that extracts the transcript from a YouTube video on my channel and then creates block posts
[03:55] or LinkedIn articles or Instagram captions for my new video using my writing style. I was very excited about this project. And so if you're tired of some time consuming and repetitive tasks in your life
[04:09] and want to build your own AI assistant that works while you sleep or at least tries to, I believe you learned a lot from this project. And the goal here is to learn the key ideas behind how things work
[04:22] and once you've got that down, you can tweak it and remix it and build whatever project you come up with next. For example, if you're struggling with focus and did work like me, why don't we build an AI agent that will celebrate your focus
[04:37] with eye of the tiger after your deep work session and brutally roasts you with slack messages when you get distracted? That sounds like a pretty useful AI agent to me. So how do we actually build an AI agent?
[04:51] There are really two main ways to go about it and the best choice depends on your background and what you're hoping to create. The first way is by using low-code or no-code tools
[05:03] and these are perfect for folks who don't have much programming experience but still want to build something powerful. Platforms like NHN, FlowWise, Bubble and Rapid AI
[05:16] that you design AI agents visually. This approach is fast, user-friendly and great for quickly getting something up and running but of course there are few trade-offs. Most of these platforms come with subscription fees
[05:30] which you'll be paying on top of any API cost for your airline and when something doesn't work quite right you get some errors, it can be frustrating because you might not have full control
[05:42] or visibility into what's going wrong and it's also harder to customize deeply when you're relying on a pre-built system. The second approach is to code your own AI agent from scratch
[05:54] and if you're comfortable with Python or even just willing to learn it opens up a whole new level of flexibility and that's actually what we'll be doing in this product walkthrough building it yourself means you're not tied
[06:07] to any platform's limitations or pricing and you get to understand exactly how everything works under the hood personally I find this a lot more fun and rewarding too there are some great Python frameworks available now
[06:21] that make this process easier you might have heard of Lanxin probably the most well-known toolkit for building our lam-based applications we also have crew AI which is great for
[06:33] simulating collaborative agents with specific roles then Microsoft's autogen we also have Lama Index which is ideal if your agent needs to work with a lot of documents
[06:45] and extract contact intelligently and recently we have OpenAI's agent SDK which is what we'll be using in this project I find it to be one of the most straightforward options
[06:57] if you're already planning to work with OpenAI's models that said even with the help these frameworks writing a fully functioning AI agent still involves a fair amount of coding and that's when I had another lightmap moment
[07:12] what if we use a coding agent to help us build an AI agent yep we are going for meta here later in this video we'll be building our social media content assistant agent with a little help from a coding agent code Juni
[07:27] this coding agent is made by the awesome fox at JetBrains who has kindly sponsored this video Juni just recently became publicly available and now all JetBrains users can find it under a single JetBrains AI subscription
[07:42] you can think of Juni as an autonomous coding buddy that can help you with complex parts of the project and execute simple tasks or fully take over some of them if you let it having a smart coding agent to do the heavy lifting in your project
[07:57] can seriously save you hours or maybe even days you see exactly how it works when we get into the tutorial later on so first thing first if you're not familiar with the concept of AI agents yet
[08:10] let me give you a quick crash course it's important to understand the foundation before you start building things but if you already know the basics you could skip this part of this video and jump right to the project walkthrough
[08:22] i've also left a link to a comprehensive notes on AI agents that i created you can download it using the link in the video description all right let's talk about what AI agents actually are
[08:35] most modern definitions focus on AI agents powered by our lambs but the concept is much broader and has been around four decades simply put an agent is anything that can perceive its environment
[08:49] and act upon that environment according to the classic book by Stuart Russell and Pito Norvik i highly recommend getting this book if you're a little bit of a tech nerd an AI agent is anything that can perceive its environment through sensors
[09:04] and act upon that environment through actuators that's quite a bit of jargon so let's break it down so here's a simple diagram to illustrate the idea of how agents work at its core
[09:17] first we have the agent and what exactly is an agent well it can be a software system or hardware system it can be a self-driving car or even a robot if we extend the definition to biological systems as well
[09:31] then technically humans can also be considered agents though that's a little bit of a little odd way to think about ourselves then we have the environment an agent's environment is the system it
[09:44] interacts with and gathers information from for a game agent the game is its environment for a self-driving car agent the environment is the road system and it's adjacent areas and for a robot the environment is the physical world
[10:01] for a travel booking AI agent the environment could be the travel booking system that the AI agent uses to complete the task next an agent gathers information and feedback from its environment through its sensors
[10:16] for robotic agent it might use cameras and microphones to gather information and feedback from its environment and for software agent the sensor input could be text voice images or data obtained from an API
[10:32] then the agent does something with the input receives from the environment for example using an LLM like GPT-4 or Cloud Sunnet or even an open source model to reason about the task and based on that take relevant actions
[10:49] and affect its environment through actuators for example a robot actuators might include motors or movements or grippers for handling objects for software agents actuators might involve action like browsing the web
[11:05] modifying files in the file system or executing commands in a system to accomplish a task now these actions that an AI agent can perform is augmented by the tools it has access to
[11:19] as you might notice from this diagram the workflow is circular it is like loop the agent receives some input do something and then get feedback revise the output get more feedback improve the output and so on and so on
[11:35] so in other words you can say AI agent operates in sense think act loop it senses so that is the perception then things which is reasoning decision and learning then acts and then repeats
[11:51] this loop can be extremely fast for example milliseconds in an automated trading agent or more drawn out for example an AI scientific researcher agent that spends hours
[12:04] analyzing complex data so this loop pattern is what makes the agentic workflow different from a non-agentic workflow for non-agentic workflow you give an input to the system and the system
[12:18] spits out the output for example classify the sentiment of this text so it gives you the output and that's done so the workflow is very much like a straight line so the input and the output
[12:30] then done but with agentic workflow the system can get feedback from the environment and iterate to improve the results and this often leads to a better output what most people talk about today
[12:43] are agentic workflows that have some agentic components but we humans still need to give feedback and help along the way for the agent to successfully complete a complex task so this is not fully
[12:58] autonomous agents that we're talking about that can figure out everything from start to end by itself at the time of filming we still don't have fully autonomous agents yet this is a whole new level
[13:11] and would likely need another technological breakthrough but if it works it can be quite frightening thinking about how much it can alter our lives together with all the security concerns and job
[13:23] replacements so if the concept of agents is nothing new why AI agents have recently become such a bus just a few years ago most AI systems were either narrow so doing one single task or required
[13:39] significant human orchestration now breakthroughs in AI especially large language models mean that an AI agent can be instructed in natural language and reason about how to perform a wide variety of tasks
[13:54] we no longer have to hard code the workflow because the AI can now figure out the needed steps what actions take which tools to use so agents now become much more general purpose and become more
[14:07] autonomous and this has made the concept of AI agents much more accessible and useful than ever before an average user can now instruct an AI agent in plain English to do something for example search the
[14:21] web summarize a report or plan a route etc and the agent can actually reason about it and carry it out also more powerful models are much more needed for agent use cases compared to non-agent use
[14:36] cases for two reasons firstly we have the compound mistakes an agent often needs to perform multiple steps to accomplish a task and the overall accuracy decreases as the number of steps increases
[14:50] if the models accuracy is 95% per step over 10 steps the accuracy will drop to 60% and over 100 steps the accuracy will only be 0.6% an agent use cases also have higher stakes with access to different tools
[15:07] agents are capable of performing more impactful tasks so not just generating text so any failure could have more severe consequences and that's why with the current capability of AI models agents
[15:21] work best when they have a singular purpose a narrow scope and a small number of tools this can reduce the potential mistakes confusion for the AI agents and cost of failure of an AI agent
[15:34] alright it's time to get our hands dirty with building an AI agent in our project we'll be building a content writer and assistant agent system that automatically generates social media posts
[15:47] from my youtube videos and a little bit more than that for this project we're going to be using pie charm which is a very popular IDE for coding in python and so you might have already
[15:59] working with pie charm regularly at work so let me first open up pie charm but if you prefer using some other IDE you can use them as well so just a quick note that you can use pie charm for
[16:11] free however if you want to use some advanced features like AI features like AI assistant or AI agent you need to have an AI pro subscription or you can also use the pie charm pro subscription
[16:24] JetBrains also have a 30 day free trial for pie charm pro subscription so you can take advantage of that and see if this is something for you alright so let's now create a new project now I'll create
[16:38] a new project and an empty project photo that I have here called social media agents let's go ahead and click on this one and click open and next we can choose an interpreter for our project so
[16:54] here we have a few different python versions I have python 3.12 installed which is also the latest python version in my computer so I'll just leave it like this and now let's go ahead and create
[17:10] this project now for those of you who want to use Juni which is the coding agent from JetBrains you can click on this AI icon over here on the top right and click on go pro and then the Juni
[17:24] plugin will be automatically installed after it's done you can see that we have a Juni icon over here this is the coding agent that we were talking about we'll be using it to help with routing tasks
[17:39] such as creating requirements optimizing code creating a user interface with streamlit and documenting at the end of the project there are basically two modes the first one is code which is basically ask the agent to do some task for example add a function or add a script
[17:57] in your project directory or even create project specific guidelines or create a list of tasks or improve optimize your code base and the other mode is ask which is basically to ask questions
[18:14] like how you would with chativity however tools like chativity don't have full context of your project they can't read or edit your file so they can't collaborate with you directly inside the IDE
[18:28] like a coding agent can so these are two basic modes and we also have the brave mode as well and this brave mode is quite interesting sometimes your agents will need to access your terminal
[18:40] and run some terminal commands for example to install a certain python package or to run some tests so for these sensitive tasks Juni will explicitly ask for your permission to run these certain terminal
[18:54] commands so the brave mode here if you turn it on this will allow Juni to execute terminal commands without confirmation from you so in a way it's kind of convenient but of course it's more risky
[19:09] because you don't get to check the command that your agent is trying to run in advance this is kind of like use it at your own risk now let's ask our agent to do something simple for example to create
[19:22] a new empty python script called social media agent something like this just to see how it works
[19:34] you can see that the agent is working on it is sending the alarm request so it comes up with a little plan so first to check if there's any sub directories in the current directory create a new python script
[19:46] called social media agent.py in the root directory and then it go ahead and create a social media agent.py file which is an empty script that we requested so if you're happy with the result
[20:00] you can click on done otherwise you can also decline this output altogether. Let's go ahead and click on done in this case so for our AI agent project we're going to be using OpenAI models together
[20:14] with the OpenAI agents SDK which is the agent software development kit from OpenAI and this is a very new tool from OpenAI to help us really build agent AI apps in a lightweight and easy to use
[20:33] way and with few abstractions so even newcomers can easily learn to do this without having a lot of learning curve like for other frameworks that you may have heard of like Cruei or Langchain or
[20:48] Microsoft AutoGen. Since we are using OpenAI models I find this the most straightforward tool to use for this project. In this documentation page you can find the intro to the agent's SDK
[21:03] the quick start we just need to install the OpenAI agents package and we need to have an OpenAI API key which I believe a lot of you already have if you are having an OpenAI account. There's also
[21:18] example how to create your first AI agent which is pretty simple you can just simply define an agent using this agent class and give it some instruction. For example you provide help with math problems
[21:32] explain your reasoning blah blah blah and so this is the basic structure of an agent and then there's also documentation on how you can configure your agents with model which model you want to use.
[21:48] For example here for this high cool agent we're using the 03 mini model and then you can also even add certain tools for your agent. For example here we are having this get weather tool simply a custom
[22:04] function and you use this function tool decorator to let OpenAI know that this is a tool for your agent then you can supply this tool inside the square brackets in this argument. Now OpenAI also have
[22:21] some other built-in tools the first one is web search tool which is useful when you want your agent to be able to search the web. The second one is file search tool which allows your agent to retrieve
[22:34] information from OpenAI vector stores and the other one is a computer tool which allows your agent to do some computer tasks. So this is a quick overview of this agent software development kit
[22:48] from OpenAI and as you can see it's not difficult to get started so this is what we'll be using. So getting back to our project in PyCharm now let's go ahead and install some necessary packages
[23:03] for our project. Now of course you can go to the terminal and type the pip install command yourself but since we have the Julie agent we can simply ask get to do it for us. So let's ask the agents
[23:17] to install a few packages. First we want to install OpenAI which is the package that help us interface with OpenAI APIs. We'll also install OpenAI agents which is the OpenAI agents SDK and then
[23:34] we'll also want to install the package called YouTube transcript API. This package will help us pull the transcript from any YouTube video for example through the video ID and then we also want to
[23:50] install the Python dot environment package and this Python package allows us to load variables from the global environment instead of hard coding those API keys in our Python script. So let's go ahead
[24:08] and click on run and here we can see that our agent has created a file called requirements dot txt that contains all the packages that we have specified and now in our Python script we can install
[24:26] these requirements directly from this requirements dot txt file so that's pretty convenient. To make it easier for you guys to follow I'm going to organize our script into a few different steps.
[24:41] Step one we're going to get our OpenAI API key and then in step three we are going to define our social media agents. So this agent is going to take a certain YouTube video transcript
[24:58] and then generate content based on this transcript for LinkedIn or Instagram or whatever you define and before we do this I realize that I miss one step that is we also want to define the tools for
[25:14] our agents. For example you can define a custom function to give our agents some extra capabilities and we'll get into that in a bit and then we have the first step is to define some other functions
[25:30] for example if we have any helper functions and in the final step we're just going to run the agent. So this project is going to be pretty simple with only one agent but if we have multiple agents
[25:44] then this will get a little bit more complicated but this will be really the foundation of how you can develop an agent and all the more complex projects with more complex agents are
[25:57] just basically extension of this. All right so first we'll import some packages and modules so we import some built-in modules the async-seal module it basically allows a Python program to run
[26:10] asynchronously so what that means is for example when we define an asynchronous function and within this function there are a couple of different steps so step one could be for example calling an API like OpenAI API to generate a piece of text. So this API code typically takes a
[26:28] few seconds to finish but in an asynchronous function it doesn't need to wait for this step to finish before it starts doing other things so this will help make a program run more efficiently
[26:40] save more time so we're going to use this to run our agents and we see how it looks like in a bit and we have the OS module and we also import the YouTube transcript API module from this package
[26:53] that we just downloaded and then we also import a few different tools from this OpenAI agent's development toolkit and then we import the OpenAI module and the dot environment module so this is
[27:07] all that we need to set up this simple agent. Now the next step is to get our OpenAI API key so in our working directory we're going to create an ENV file. This dot ENV file will store our
[27:23] OpenAI API key so let's go ahead and define that so we have our OpenAI API key being the API key that you have and you can find this key in your OpenAI account so let's go ahead and save this file
[27:38] now we can go ahead and load this OpenAI API key from our dot ENV file. I have my auto completion here so I'm just going to use this to speed up our project but if you have any other
[27:51] code completion tools or chat GPT or some other tools feel free to use them as well. Let me just go ahead and accept this code so basically we load in our OpenAI API key from the environment
[28:07] also prefer to make this variable name uppercase just to signify that this is a constant and then in step two we are going to define the tools for our agent so here we are going to define this tool that is to generate content, social media content from transcripts and so this is going
[28:26] to be a function we call it generate social media content or just generate content just to be short this function will take the video transcript and this video transcript will be a string and the
[28:42] social media platform will also be a string as well for example a LinkedIn or Instagram or a medium etc and let me just quickly make a print statement here so this print statement just like to make it
[28:59] easier for us to follow what the agent is doing. Next up we're going to initialize the OpenAI clients and let me move this line over here and then we can start generate content so here we are going
[29:14] to create the response and here we are going to specify how this will look like so we will generate the response using the GPT 40 model we will give it some instruction so my prompt is something like
[29:29] here is the new video transcripts so generate a new social media post on my social media platform for example LinkedIn based on my provided transcripts so we also define the maximum output tokens
[29:43] and we cap it at 2500 and then we just return the output from this response so if you've used OpenAI API in your project before you can see that there's some updates in the way their
[29:58] API is organized and this is based on the newest version of OpenAI API that is called response API instead of the previous older version that is chat completion API that's why you can see
[30:13] some differences here but if you have any doubts just feel free to check out their official documentation page to check out their latest API documentation all right there is another thing that we forgot to do
[30:26] is that we will need to add a decorator for this function to let the OpenAI agents SDK know that this function is not an ordinary function but it's supposed to be a tool for an agent so this is the
[30:44] only thing you need to keep in mind when creating tools for an agent now the next step is to define the agent itself so here we only have one agent that is the content writer agent so now we're going to
[30:58] define this agent so let me call it content writer agent and this agent will basically have a name so the name could be yeah content writer agent sounds good and we have some instructions for
[31:16] this agent so here we're giving instructions for this agent you are content writer for social media platform you'll be given a video transcript and a social media platform and you will generate a
[31:28] social media post based on the video transcript and the social media platform and depending on your use case you can customize it make it more specific more relevant to your use case for example here
[31:41] I prefer to make it a little bit more specific I will say you're a talented content writer who write engaging humorous informative and highly readable social media posts so this sounds a little bit more
[31:54] like what I'm aiming at regarding the tool we have one tool for the agent that we just created that is generate content tool and so we're adding it to the tools arguments over here but next to that we can
[32:10] also have some other tools for example we can add the web search tool that is a built-in tool in the OpenAI agents SDK and so this web search tool allows our agent the content writer agent
[32:27] to search the web when necessary this is also something I want my agent to do so I'm going to add here as well you may search the web for updated information on the topic and fill in some
[32:39] useful details if needed so if it's needed our agent will also use this web search tool to fill in some details for us and here we also want to specify our model which model we want our agent to use
[32:55] so here I'm just using GPT 40 mini model this model is slightly smaller than the GPT 40 model and from my experience this model is enough for a simple agent like this another thing to note here
[33:09] is that we can also specify the output type for the output of our agent so if we don't specify anything then it's pretty much like your agent will always return a string but you can change
[33:24] this output type to any output structured output that you like that may be easier for you to process later on in your workflow and then our agents will make sure that the output
[33:36] satisfy this output structure or schema that we define and where I want this agent the output is a list of posts so we're going to have to import some extra things here so we will import
[33:51] the data class module from the data classes package and then we also have to import the list module here so now let's go ahead and define this post class so we're going to say this post class
[34:07] will rather have the name of the platform for example LinkedIn or Instagram and then it also has the generated content so here in our instruction we say it's not one single social media platform
[34:21] but the user can also request to have the generated post for multiple platforms at the same time so this will all be multiple so that is it for defining our content writer agent and we'll see in a bit how
[34:36] we can run this agent later in this script now in the next step we just need to define some helper functions in our case we only need one helper function that is a function to fetch the youtube
[34:50] transcript from a youtube video using the video ID so suppose that our user will provide the video ID in her request or her query so what we first need to do in our workflow before we generate the
[35:05] content is to fetch this video transcript so we'll have a function called for example get transcript from a video ID being string and return a string and we'll be using the youtube
[35:21] transcript API and then we'll fetch the transcript and you're going to see here that we have a little problem here so that we this method get transcript is replicated so we need to adjust this a little bit
[35:34] so let me quickly check the latest documentation for this package and you can see here there's some documentation on how you can fetch the video transcript from a video ID so let me just quickly copy and
[35:49] paste this line of code in here we have the video ID so it's all good and we can return the fetch transcript now it's worth noting that the fetch transcript here is actually an iterable including
[36:04] multiple snippets and each snippet will contain some text so what we want to do is to concatenate all the texts together to get the full transcript so let me go ahead and do that let me just say concatenate
[36:19] all the text from snippets in the fetch transcript and so we have transcript text and for each snippet in the fetch transcript we're going to concatenate them together I have the feeling that this is not
[36:35] optimal so I'm going to ask the agent to optimize the get transcript function and also implement try catch mechanism for error so let's go ahead and do that all right so our agent has now edited
[36:54] this function to optimize it and also add error handling for our function so let's go ahead and check what it does so it also adds some docs string here it also gives an option to configure the
[37:08] languages for the transcript so if the language is none then we will pick the English transcript and so that is good and it implement the try accept mechanism here so it basically fetch the transcript
[37:25] and then also join the snippets the text from the snippets together this way and this is much more elegant than what we had before and for the accept block we have a bunch of different error messages
[37:40] here you know these things may take you a lot of time to write them yourself so I would encourage you to use this kind of coding agent or different tools or to completion or chat GPD or whatever to save
[37:53] you some time when improving optimizing your code so that is it for this function I actually also want to make sure this function actually works I want to test it explicitly on a test video you can also
[38:07] run this function and test it in your Python console but here I'm using the agent to test it let me say run this get transcript function to get the transcript from video I have a video here that I want
[38:24] to fetch the transcript from so for example on my channel I have a recent video called should data scientist pivot to AI and you can see that this little string here after the v equal character
[38:39] this is the video ID so you can simply use this video ID to run this function and let's see if it actually works from video ID like this so let's see if our agent will be able to test this function
[38:55] so our agent has created a test transcript here that will help us run this function so let me go ahead and run this file all right so we have this output here so this is the first
[39:08] 500 character of the transcript so it seems to be working correctly so let me just close this file and also remove it from the project okay now the final step is to run the agent so to run the agent
[39:25] we mentioned earlier that we will use an asynchronous function to run the agent so let me define the main function here to make it easier to test our agent in our script I'm going to hard code a video ID here
[39:42] this is a video ID that we just saw earlier on my channel and this function will pull the transcript from this video ID for us and later we can make it a little bit more dynamic by designing a user
[39:55] friendly interface for user to input this video ID by herself and here we'll say given a user input for example this user query generate a LinkedIn post based on this video transcript this is a user
[40:12] input that we can also make more dynamic later with a user interface with streamlit so let's not worry about it so now the next thing we want to do is to package the input for the agent so this input
[40:26] will take the user query and with the role of user and we are going to run this content writer agent so what we're doing is that we will run this content writer agent using the input that we just
[40:42] defined earlier with the user query then we're going to get out the output from running this agent and so this is just some helpful syntax so we don't need to worry too much about it I just follow
[40:57] the code example from the documentation now after all of that is done all we need to do is to run this main function asynchronously using this asyncCO module now is the moment of truth let's see if
[41:13] this whole workflow actually works so let me go ahead and run this script so here we are starting to generate social media content for LinkedIn and this is what we mentioned in our user query so this is
[41:29] what our agent is doing for us so here is the response from our agent this includes the platform names which is linked in and the content that has been generated for us using the video transcript that we got
[41:46] based on the video ID here so this means that our agent is working properly and we can also see that the output that we got from running this agent has exactly the same structure that we specify here in
[42:00] in our code here we specify the output type being a list of the social media post and the post contains a platform name and the content itself here we can see that this structure is very clearly
[42:15] is a list of a number of objects which include the platform and the content so if we want to generate multiple social media posts for example if we don't want to only generate a LinkedIn post but also we
[42:31] want to say an Instagram caption then when we run our agent we should see a list of two pieces of content one is for LinkedIn and one is for Instagram I think it's also good to really understand how the
[42:46] running of a function works so let me quickly go to the OpenAI agents Python GitHub repo here from OpenAI and here they have a very interesting and very useful explanation of how agent loop works so when you
[43:03] call runno.run so running a function we will run a loop until we get a final output so this is also how running an agent is different from running an ordinary function so here are the different steps so
[43:18] we first call the LM so the agents will use the model that we specify so for example in our case we are using GBT40 mini model and using the instructions that we give our agent and the message history from
[43:34] the user and the LM will return a response which may include two goals so for example here in our content writer agent so given a user query for example generate for me a LinkedIn post based on this video
[43:50] transcript this agent will use this model GBT40 mini to automatically figure out okay now we need to use this tool that is called generate content that we have in our toolbox and the agent will also
[44:09] automatically figure out what kind of values that it should put into to run this function so this generate content tool or this generate content function so it will supply the correct value for
[44:23] the video transcript which is a bunch of text and it will also fill in what kind of social media platform here which is for example LinkedIn so that is the basics of how an agent can work with tools
[44:36] and after that if the response has a final output so suppose that it runs this generate content function and the output is final then we'll return it and end the loop so the final output here
[44:50] is defined below if you set an output type on the agent the final output is when the LM returns something of that type so for example for this agent we define the output type being this type a
[45:06] list of social media posts then what this agent does is that it will iterate several times until the final output satisfy the output type that we specified for our agent so if the output doesn't meet
[45:22] this required structure that we specified in the output type then the agent will iterate this whole agent loop again until it gets the output with the right structure in case we don't define any
[45:35] output type so we can leave it this argument out then the first LM response without any two calls is considered the the final output so all in all this is a mechanism behind an agent
[45:49] and how they work under the hood another very useful thing that I think everyone should utilize when using the OpenAI's agents SDK is that we can also add trace to trace back all the steps
[46:04] that your agent has executed so in the code we can simply add one line here that is with trace and then we will name our trace for example writing content and let's also make sure to import
[46:19] the trace here in our project so when we run this script again yeah we also see that we are generating two posts one for LinkedIn and one for Instagram and so this is all working very well
[46:35] and we go to this platform.openai.com slash traces let me open up this this website then you can see that yeah we have a trace here which is writing content which is the name that we give
[46:50] to our trace you can name your trace anything you want this is just to know which one is which and so if we click on this writing content trace you can see all the steps that have been executed
[47:03] when we run our agent so here we have an API call that basically takes the instruction that we give our agent you are the talented content writer who writes engaging blocks blah blah blah and then
[47:18] we also have the user input which is the user query generate a LinkedIn post and an Instagram caption based on this video transcript and here is the video transcript that has been fetched
[47:31] from the video ID and here is the output from this API call so our agent decides that based on the instructions and the input it will call the generate content function or toe and it automatically
[47:49] populate the right value for each of the the arguments in this function so this is the video transcript which is a bunch of texts really long and also the social media platform which is LinkedIn
[48:05] so this is how our agent figures out that okay it first need to generate the content for the LinkedIn post and so this is a function call that it has executed and here is the output which is the LinkedIn
[48:22] post and next it decides that it should also generate the Instagram post as well so in this function call it calls the generate content to another time and this time it says that we need to generate the
[48:38] Instagram post instead then we have the function call for this Instagram post and then we have subsequently the output which is the Instagram caption and finally we have the final API call here
[48:53] with again the user original user instructions the user query which is to generate the LinkedIn post and Instagram caption and here are the two function calls and the agents assembles for us the final
[49:07] output which is the response that has both the LinkedIn and the Instagram post so this is the final response that our agent produced after finishing the agent loop so you can see that my tracing bag all
[49:22] these different steps that our agent has done it's really easy to see if it's working correctly or if there's any errors in a particular step and that makes it very easy to troubleshoot and debug your agent
[49:36] so going back to our project I also want to extend or improve my agents a little bit further by creating a user friendly interface for our agent so let me pull up Juni here our coding agent
[49:51] to ask create a streamlit web app that allows users input query and video ID then runs the content writer agent and display the output in a nice way so this is just my initial idea of how this
[50:07] streamlit web app should look like so let's see what our coding agent comes up with okay so now it's adding streamlit in our requirements file I'll go ahead and install this requirement and as you
[50:19] notice the coding agent also added a readme file that tells users how they can run this project so how they can install requirements and how they can run this streamlit application so this is pretty cool
[50:35] and it saved a lot of time doing all these stuff manually so it looks pretty good and now it's done for now so let's open up the terminal and let's run streamlitrunapp.py and so the app has popped up I can
[50:53] see that we have an input field here for the video ID so again we're going to use this test video and then we will type our query so again we're going to say generate a LinkedIn post and an
[51:06] Instagram caption based on this video and so that is our query and now we can select the platform this has been automatically done for us so that's pretty cool and now let's hit the generate content
[51:20] button you can see that there has also been some text description here instructions here to explain the user how they can find the video ID which I find pretty cool and so here we are generating content
[51:33] this may take a minute or two so these kind of like small details are really interesting to see if I were to create this app myself I probably wouldn't bother adding all these details and yet here is
[51:45] the generate content produced by this social media agent so in this video you've learned the foundations of AI agents and also we've got our hands dirty building our first AI agent in Python
[51:58] we've also learned how to use a coding agent like Juni and coding assistance and code completion to speed up our workflow and if you want to build a solid foundation in Python machine learning and AI and also building applications such as AI agents please feel free to check out my
[52:16] Python for AI project course in the past 10 months I've put my heart and soul into building this comprehensive course for those of you who really want to level up your Python skill and also building the
[52:28] solid foundation in machine learning and AI so if this is something for you I hope to welcome you to this little community of more than 250 learners that have joined so far so congratulations and thank
[52:41] you for making it to the end of this video thank you for watching bye bye