[0:00] In this video I'll be showing you how to [0:01] build a local AI agent in just a few [0:04] minutes using Python. We'll be using Ollama, [0:07] LangChain, and something called ChromaDB [0:09] to act as our vector search database, [0:12] because I'm going to show you how to add [0:13] retrieval-augmented generation into this [0:16] app. That essentially means we can [0:17] retrieve relevant information from [0:19] something like a CSV file or a PDF and [0:22] bring that into our model. Now, all of [0:24] this is completely free. You don't need [0:26] an OpenAI account, you don't need a Claude [0:28] account or something; you can do this all [0:30] from your local computer. So let me show [0:32] you how to set it all up. So I'm just [0:34] going to show you a quick demo of the [0:35] finished product and then we'll get into [0:37] the tutorial. Now, you can see on the [0:39] right-hand side of my screen here that I [0:40] just opened up a CSV file. This CSV file [0:43] just contains some fake reviews for a [0:45] random pizza restaurant, so we have title, [0:47] date, rating, and review, and you can see [0:49] something like "Best pizza in town". Here's [0:51] the date, here's the ID, or sorry, the [0:53] rating of the review out of five, and [0:55] then you have what the actual review is, [0:57] and there's kind of some information. Now, [0:59] what I'm going to show you is how to [1:00] build an AI agent here that can actually [1:02] go and look up relevant reviews from [1:05] this document to answer questions about [1:07] the restaurant. I don't know about you [1:08] guys, but whenever I go to a new place I [1:10] always look at the reviews, and typically [1:12] I'm looking for an answer to my [1:13] particular question, so this can kind of [1:15] do that for you. So for example, maybe I [1:17] want to know, you know, how is the quality [1:21] of the pizza? Okay, well, what it can do is [1:23] then go to this document, find the [1:25] relevant reviews, which you can see it [1:26] kind of pulls into 
here and analyzes, and [1:29] then it gives me a conclusion: "Overall, [1:31] without more data or context, it's [1:32] challenging to give a definitive score [1:33] on the pizza based solely on the reviews. [1:35] However, they do suggest a restaurant [1:37] with potentially room for improvement in [1:38] presentation and overall consistency." So [1:40] there you go, right? I could ask something [1:42] like, are there vegan options? Let's see [1:47] what that gives us. And you can see here, [1:49] "In conclusion, based on the reviews, there [1:51] appear to be at least one vegan pizza [1:53] and possibly more vegan options [1:54] available." Okay, cool. So that's what we're [1:56] going to build. This isn't going to be [1:57] super complicated, it'll be pretty fast, [1:59] so stick around and let me show [2:00] you how we make it. All right, so we have [2:02] a few quick setup steps here and then we [2:04] can dive right into the code. Now, the [2:06] first thing that we're going to need is [2:07] obviously some kind of CSV file. Now, you [2:09] can use anything that you want, and I'll [2:11] show you how to adjust this code for [2:13] your own example, but if you want to [2:14] download the CSV file that I'm using, [2:16] I'll leave a link to it in the [2:18] description, and in fact all of the code [2:19] will be available from the GitHub, so you [2:21] can go to the GitHub and you can [2:22] download this CSV file and just bring it [2:25] into a new folder in VS Code. So to [2:27] begin, open up some kind of code editor; [2:29] I'm using VS Code. Create a new [2:30] folder; you can see I have one called [2:32] local AI agent. Bring in the CSV file, and [2:35] then I also created this [2:36] requirements.txt file, which just has the [2:39] three things that we're going to need to [2:41] install in Python. So let's get started [2:43] with that, installing our Python [2:45] dependencies, and then I'll show you the [2:46] next steps. So what we need to do is open [2:48] up our terminal. Again, 
I'm inside of the [2:51] directory that I want to write code in [2:52] for this video, and what I'm going to do [2:54] is create a virtual environment. So to do [2:56] that I'm going to type python -m venv [3:00] and then venv. If you're on Mac or Linux [3:03] you can change this to python3. What [3:05] this will do is create a new isolated [3:07] environment that we can install various [3:09] dependencies into. If you don't know [3:11] anything about virtual environments and [3:12] you want to learn more, I'll leave a [3:14] video on screen. Now that the virtual [3:16] environment has been created, we need to [3:17] activate it. To activate it, if you're on [3:20] Windows, it's going to be dot slash, the [3:22] name of the virtual environment, slash, [3:24] and then Scripts with a capital S, and [3:26] then slash activate, so ./venv/Scripts/activate. When you type that [3:29] you should see that you get the name of [3:31] the virtual environment as a prefix [3:33] before your command line. Now, if you are [3:35] on Mac or Linux, then the command is [3:37] going to be source ./venv, and then this is [3:39] going to be /bin/activate. Okay, so it's [3:44] different: if you're on Windows it's this [3:45] one, and if you are on Mac or Linux it's [3:47] going to be this one, and again, I'll [3:48] leave a video on screen that will go [3:50] through this more in depth. Now that we [3:52] have the virtual environment activated, [3:54] what we're going to do is install the [3:56] various dependencies inside of here. Now, [3:58] if you have this requirements. 
txt file, [4:00] then you can say pip install -r and [4:04] then you can do requirements.txt, and [4:07] this will install all of the [4:08] requirements into our virtual [4:10] environment. However, if you don't have [4:12] the requirements.txt file, you can just [4:14] type them out, so you can just install [4:16] langchain, you can install [4:19] langchain-ollama, and you can install [4:22] langchain-chroma, like that. Okay, so we just need to [4:25] install these dependencies in order to [4:26] be able to use these in Python. So that's [4:28] going to take a second installing [4:30] those dependencies for us, and then once [4:31] that's done I'll be right back. Okay, so [4:34] those are installed, and the next thing [4:35] that we're going to need to get is [4:36] something called Ollama. Now, Ollama allows [4:39] us to run models locally on our own [4:41] computer using our own hardware, so [4:43] that's why we're able to do everything [4:44] locally here rather than having to use [4:46] something like an OpenAI API key. So [4:49] please go to this page, just ollama.com, if [4:52] you don't already have this software and [4:53] simply download it. Once you download it, [4:56] what you should be able to do is just [4:57] open up some kind of terminal or command [4:59] prompt and then type the command ollama. [5:02] If you have any issues with this, again, [5:04] I'll put another video on screen that [5:05] walks through Ollama in depth and will [5:07] show you how to set this up. But once we [5:09] have Ollama installed on our computer, what [5:12] we're going to do is install an Ollama [5:14] model. Now, Ollama, again, is just this open [5:16] source software that allows us to pull [5:18] various models to our own computer and [5:20] then run them using our own hardware. Now, [5:22] depending on the type of hardware you [5:24] have, that will dictate the models you'll [5:26] be able to run. For example, you probably [5:28] can't run a 200-gigabyte model if you don't [5:30] have a graphics card in your 
computer, so [5:32] I'm going to show you a few models that [5:33] should work on most machines if you have [5:36] a graphics card. If you don't have a [5:37] graphics card and you just have a CPU, [5:39] there are some very small models that you [5:41] can download and use, but obviously the [5:43] performance won't be as good. So what you [5:45] can do is you can actually go to the [5:46] Ollama library (I'll leave this link in [5:49] the description), and you can see that [5:50] there are various different models, and it [5:52] kind of shows you all of the options [5:53] that they have. Now, we're going to pull [5:55] two models to our computer. We're just [5:57] going to pull Llama 3.2, so this is kind [6:00] of a smaller model that we can use that [6:02] performs pretty well, and then we're [6:04] going to pull an embedding model, and [6:05] I'll show you the name of that in one [6:07] second, which we'll use to embed the [6:09] documents that we add into our vector [6:11] store. If that means nothing to you, don't [6:13] worry, just follow along with the next [6:14] steps. Okay, so we're going to go into our [6:16] terminal, and again we're going to make [6:18] sure that the ollama command works, and then [6:20] we're going to type ollama pull, and we're [6:22] going to start by pulling the model [6:24] llama3.2. Now, you can pull any model [6:27] that you want; you can go [6:28] look at the directory, but I'm going to [6:30] go with 3.2. Once that's done (okay, you [6:32] can see it's here because I already had [6:33] it downloaded), then we can move on to the [6:35] next one. Now, the next model that we're [6:37] going to pull is going to be an [6:38] embedding model. This embedding model [6:40] is going to be mxbai, and then this is [6:44] going to be -embed-large, so mxbai-embed-large. There are [6:47] various other embedding models you can [6:49] use, but this is the one we'll use for [6:50] this video. Okay, so we're going to go and [6:52] hit enter, and then again it's downloaded to [6:54] our computer. These 
are not super big, so [6:56] you should be able to run them on your [6:58] computer if you have any kind of GPU. All [7:00] right, so now that we have these models, [7:01] we're good to start writing some code. So [7:03] I'm going to go back into VS Code, I'm [7:05] going to make a file called [7:07] main.py, and in this file I'm going to [7:10] start writing some code. Now, you'll [7:11] notice that I actually get this [7:12] autocomplete here. This is coming from [7:14] GitHub Copilot, you know, that really [7:16] cool assistant that replaces a lot of [7:18] your manual typing work. They've actually [7:20] sponsored this video, and speaking of [7:22] Microsoft's GitHub Copilot, I was [7:24] fortunate enough to have them sponsor a [7:26] video a few weeks ago on AI agents and [7:29] today's video, where I promised to [7:30] highlight some of the standout ways that [7:32] developers are using GitHub Copilot [7:34] that you guys submitted with the coding [7:36] with Copilot hashtag. So let's get into [7:38] it. Check out these examples of how [7:40] developers are using GitHub Copilot, [7:42] like Emy, who created an entire Flutter [7:44] mobile app; tug Duel, who created a Python [7:47] script to resize and save images; Adrien, [7:49] who used Copilot as a beginner when he [7:51] was working in Jupyter to learn better [7:53] ways to write functions; and Yousef, who [7:55] uses it to avoid manually writing [7:57] tedious documentation. Now, personally, I use [8:00] GitHub Copilot every single time I open [8:02] up VS Code, and it's insane how well it [8:04] can predict what I want to do next and [8:06] save me tons of hours of manual typing. [8:08] It's literally like it can read my mind. [8:10] Now, I'm sure that you guys have more [8:12] stories on how you're using GitHub [8:14] Copilot, so please share them with me [8:16] using the coding with Copilot hashtag, [8:18] because I'm excited to check them out. [8:20] Now, with that said, let's get back to the [8:22] video. All right, so back into the 
code [8:23] editor here, let's go ahead and get [8:25] started. Now, we're going to begin by just [8:27] importing a few things, so we're going to [8:29] say from [8:31] langchain_ollama.llms import OllamaLLM. We're [8:36] then going to say from [8:40] langchain_core.prompts import [8:45] ChatPromptTemplate. Okay, now, if you're unfamiliar [8:47] with LangChain, this is a framework that [8:49] just makes it a lot easier for us to [8:50] work with LLMs. It's very popular in [8:53] Python, and it has all of these [8:55] extensions, like the Ollama extension that [8:57] allows us to directly use our Ollama [8:59] models. And by the way, what will happen [9:02] is Ollama should be running in the [9:04] background on your computer, and it's [9:05] going to expose a server, or like an HTTP [9:09] REST API, that we'll be able to [9:11] communicate with from our program. So [9:13] when you pull these models, they are [9:15] actually running on your own computer, [9:17] and we can trigger Ollama to utilize these [9:19] models from code in Python, or we can [9:22] actually just do it directly from the [9:23] command line. So everything that I'm [9:25] showing you here will run 100% locally [9:27] on your own computer, even though it [9:29] might not necessarily feel like that. It [9:30] also means it'll be pretty fast. Okay, so [9:34] after this we're going to specify our [9:35] model. Now, I'm going to show you in this [9:37] code snippet here how to utilize an Ollama [9:39] model quite quickly, and then we'll [9:41] start connecting some more complexity to [9:43] it with the vector database, and I'll [9:45] talk about what that means. So I'm going [9:46] to say model is equal to OllamaLLM, and then [9:49] inside of here I need to specify the [9:52] specific model from Ollama that I want to [9:54] use. Now, if you're confused on what to [9:56] put here, you can open up your command [9:58] prompt, you can type ollama list, like [10:01] this, and it will show you the models [10:03] that you have available, so you 
can see [10:05] that I have this embedding model, I have [10:06] Llama 3.2, I have Mistral, I have Llama 2, [10:09] so any of these models I can use. So what [10:12] I'm going to do is just copy llama3.2. [10:14] You don't need the :latest part of it, you [10:16] can just do the original name, and you [10:18] can put it right here. Okay, so I'm going [10:20] to use OllamaLLM with model equal to [10:22] llama3.2, and now I can start utilizing [10:25] this model and kind of invoking it. So [10:28] next, what we're going to do is [10:29] create a template, and this [10:31] template is going to be just a string, [10:34] and inside of this string we're just [10:35] going to specify what we want the model [10:37] to actually do. So we're going to say [10:39] something like: you are an [10:41] expert in answering questions about a [10:46] pizza restaurant. Okay, here are some [10:51] relevant reviews, and then we're just [10:53] going to put inside of a variable here, [10:56] reviews, and say here is the question to [11:00] answer, okay, and then we're going to put [11:02] a question. Perfect. Then what we're going [11:05] to do is we're going to say our prompt [11:07] is equal to a ChatPromptTemplate. We're [11:10] going to pass our template, and actually [11:12] we don't need to pass the model, I don't [11:14] know why it's doing that, and now we've [11:15] created a chat prompt template where [11:17] we'll be able to pass in a reviews [11:19] variable and a question variable, and [11:21] then the model can respond to that. Okay, [11:24] then we're going to create a chain, so [11:26] with the chain we can say prompt, and [11:28] then we can put a pipe, and then we can [11:30] put model. Now, what this allows us to do [11:32] is essentially invoke this entire chain [11:35] that can combine multiple things [11:37] together to run our LLM. So first, what [11:40] we'll do is we'll pass the variables reviews [11:42] and question into this prompt, this chat [11:45] prompt template that we just created, 
and [11:47] then that will automatically get passed [11:48] to our model, because we put it inside of [11:51] this chain, and then it will return to us [11:53] whatever the answer is. So if we want to [11:55] test this out really quickly, because [11:57] this is literally all we need in [11:58] order to do this, we can say [12:01] chain.invoke, and then inside of a [12:04] Python dictionary we need to specify the [12:06] two variables that we had inside of this [12:09] prompt, so we're going to have reviews [12:11] and then question. Okay, so we'll start [12:13] with reviews, and for now we can just [12:16] make this an empty list, and then we can [12:18] say question and something like, what is [12:20] the best pizza place in town? That might [12:22] not necessarily make sense, because this [12:23] is just about one pizza place, but I just [12:25] want to show you a quick demo. So we're [12:27] going to say result is equal to this, and [12:30] then we're going to go down and we're [12:31] going to say print result. Okay, so we can [12:35] just test this out and make sure that [12:36] it's working, and it should go ahead and [12:38] invoke our Ollama LLM and give us some [12:41] kind of response. So let's go here and [12:43] run this. We can do that by typing python [12:45] and the name of our file, which is main.py, or [12:48] python3 main.py. So I'm going to hit [12:50] enter, give this a second to run, and we [12:53] got an error, some kind of formatting [12:54] issue, so let's see what the problem is. [12:57] Okay, so silly mistake here: what we [12:58] actually need to say is [13:00] ChatPromptTemplate. 
from_template. I forgot to [13:03] specify this method, so of course that [13:05] was giving us an issue. So let's go back [13:07] here and fix that quickly, python main.py, [13:10] and we should see that this works now. [13:12] Give it a second, and you can see it says, [13:14] based on our customer feedback and [13:15] ratings, I would highly recommend this [13:16] the top-rated pizza place. One reviewer [13:18] mentioned blah blah blah blah, in fact [13:20] our own team has sampled their pizza. So [13:22] it just came up with something random [13:23] here, because I didn't actually give it [13:25] any reviews, so it's kind of [13:26] hallucinating the response, but you get [13:28] the idea. Okay, it did actually work, we [13:30] were able to use Ollama, and we got a [13:32] response from the model, which is really [13:34] just the point of what we were testing [13:35] here. Okay, so now what we're going to do [13:37] is we're just going to put this inside [13:39] of a while loop, so essentially we can just [13:41] keep asking it questions, and then we're [13:42] going to set up the vector search so we [13:44] can actually get a relevant response. So [13:46] let's set up a simple loop here. We're [13:48] just going to say while True, then we're [13:50] going to ask a question, so we're going [13:51] to say question is equal to input, so we [13:55] can get some input from the user, and [13:56] we'll say, you know, ask [13:59] your [14:00] question, and then we're just going to [14:02] put a set of parentheses here and say q [14:04] to quit, so if they type q then we can [14:07] quit. We're going to say if the question [14:09] is equal to q, then we are going to break; [14:13] otherwise we can invoke this chain, so [14:15] we're going to say result is equal to [14:18] chain. 
invoke. Okay, and for the question [14:21] we'll just put the question the user [14:22] asked, so we'll replace this with [14:25] question, and then we can print the [14:27] result. Now, we also can just have a few [14:29] kind of formatting lines here, so I'm [14:30] just going to say print, and I'm just [14:32] going to print kind of a big line with a [14:34] few backslash-n characters, and then same [14:37] thing here, I'm just going to print a few [14:40] backslash-n's so we can kind of read [14:42] what's happening. Okay, so we don't need [14:44] to test this, but this will just allow us [14:46] to continue to ask questions until we [14:48] type in q. Now, what I want to do is show [14:50] you how to set up the vector search. All [14:52] right, so we're going to create a new [14:53] file here called vector.py (you can call [14:56] this anything that you want), and here's [14:58] where we're going to write the logic for [15:00] actually embedding our documents and [15:02] then looking them up, or vectorizing our [15:04] documents. Now, in case you're unfamiliar [15:06] with vector search, this essentially is [15:08] going to be a database. It's going to be [15:10] hosted locally on our own computer using [15:12] something called ChromaDB, which we [15:14] installed earlier, and this is going to [15:16] allow us to really quickly look up [15:18] relevant information that we can then [15:20] pass to our model, and then our model can [15:23] use that data to give us some more [15:25] contextually relevant replies. So [15:27] obviously LLMs are really good at kind [15:29] of synthesizing text and giving us [15:31] responses, but usually they don't have [15:33] the correct data. So in this case, what [15:35] we're going to do is we're going to take [15:36] this entire CSV file, we're going to put [15:38] it inside of this vector-enabled [15:41] database, and then as soon as we ask a [15:43] question, we're going to look up the [15:45] relevant documents in that database, [15:47] we're going to pass 
those to the LLM as [15:49] a list of reviews, and then it will be [15:51] able to search through those reviews and [15:53] answer our question. Okay, so that's like [15:55] the very basics on vector search; let me [15:57] show you how we do that. So we're going [15:59] to say from langchain_ollama [16:03] we're going to [16:04] import OllamaEmbeddings. Okay, now, one [16:08] thing that we need when we do this [16:10] vectorization process is an embedding [16:12] model. This model will be able to take [16:14] text and convert it into a vector; this [16:16] is essentially numbers that we can then [16:18] use to look up data really efficiently. [16:21] Next, we're going to say from langchain [16:24] and then _chroma, and we're [16:26] going to import Chroma like this, [16:28] which is going to be our vector store. [16:30] We're then going to say from [16:33] langchain_core.documents import Document. [16:38] We're going to create documents and then [16:40] pass these to our, what do you call it, [16:42] Chroma database. We're then going to [16:44] import os, and we're going to import [16:46] something that I forgot to install [16:47] before, which is pandas as pd. Okay, now, [16:51] pandas is a library that we can use to [16:52] really easily read in our CSV file, so [16:55] just quickly, before I forget, we do need [16:57] to install this. So same as before, we're [16:59] going to type pip install pandas in our [17:02] virtual environment, and then we should [17:04] install that dependency and be able to [17:06] use it. I'll also add it to the [17:07] requirements.txt file, so if you guys [17:09] were to have downloaded this before, you [17:11] would already have it. Okay, so pandas is [17:13] installing; we can just wait for that to [17:15] run and start writing some more code. So [17:17] first things first, we're going to load [17:18] in our CSV file. We're going to use the [17:20] data in the CSV file for our vector [17:23] store, so of course we're going to 
need [17:24] the data. So we're going to say df, [17:26] standing for DataFrame, and this is [17:27] going to be pd.read_csv, and we're [17:31] going to read in [17:35] realistic_restaurant_reviews.csv, [17:37] and obviously, you know, read [17:40] in whatever the name of your CSV file is. [17:43] I think that I spelled that correctly, [17:45] although maybe not restaurant. Let's see, [17:48] you know what, we can just do this: rename, [17:50] copy, and then paste here to avoid any [17:53] misspellings. Okay, anyway, so we have our [17:55] DataFrame here. Next, we're going to [17:57] bring in the embedding model, so we're [17:58] going to say embeddings is equal to [18:02] OllamaEmbeddings, and then we're going to [18:04] say model is equal to, and then the name [18:06] of the model that we installed, which is [18:08] mxbai-embed-large. [18:13] Okay, now, after that we're going to [18:15] specify the location where we want to [18:17] store our vector database, so I'm going [18:19] to say ./ and then [18:22] chroma_langchain_db. [18:25] You can call this anything [18:27] that you want, but this is just going to be [18:28] a folder where we store our database. [18:31] Okay, next, after that we're going to say [18:34] add_documents is equal to, and then [18:37] we're going to say not os.path. 
exists [18:41] and then the database location. Now, what [18:43] I want to do is I want to check and see [18:44] if this database already exists. If it [18:47] does, that means that I've already [18:48] performed the process of converting the [18:50] CSV file into vectors and adding it into [18:53] the database. If it doesn't exist, then it [18:55] means that I need to do that. Okay, so we [18:57] don't need to keep doing this every [18:58] single time; we can just one time [19:00] vectorize our data, and then once it's [19:02] vectorized and it's in the database we [19:04] don't need to do that again, we can just [19:05] start using it. So below here I'm going [19:07] to say if add_documents, so if we do [19:10] actually need to add them, then we're [19:11] going to say the following: we're going [19:13] to say documents is equal to an empty [19:15] list, and we're going to say ids is equal [19:17] to an empty list as well. Then what we're [19:19] going to do is we're going to iterate [19:21] through our rows, so we're going to say for [19:23] i, row in df.iterrows(). This is [19:26] simply going to go row by row through [19:28] our CSV file and then allow us to access [19:31] the various entries. Now, what we're going [19:33] to do is we're going to create [19:34] individual documents, we're going to add [19:36] them to the documents list, and then [19:37] we're going to add them to our vector [19:39] store. Okay, so we're going to say [19:41] document is equal to Document, and inside [19:44] of this Document we need to pass three [19:46] things. We need to pass a page_content, [19:49] and this page content is going to be [19:52] what we will actually be vectorizing and [19:54] what we'll be looking up. So if you [19:56] wanted to adjust this for your own [19:58] example, any of the content that you want [20:00] to use to actually look up the [20:02] information in the database, that needs [20:04] to go in the page content. So what we're [20:05] going to do is we're going to combine [20:08] the 
title of the review with the review [20:11] itself, so that we have a bunch of [20:12] information to be able to actually query [20:14] our data. Okay, there are all kinds of [20:16] different things you can do here, but you [20:18] want to include the important [20:19] information that you'll be querying [20:20] based on in the page content. So we're [20:23] going to take row["Title"], and then [20:26] we're going to say plus a space, and then [20:28] row["Review"]. Okay, then we're going to [20:31] specify some metadata (Copilot's already [20:33] doing it for me), so we have metadata, and [20:35] then rating, and that's row["Rating"], and [20:37] then we're going to have date, and then [20:39] this is going to be [20:42] row["Date"]. Okay, so the metadata is just [20:46] additional information that we will grab [20:48] along with the document, but we won't be [20:51] querying based on the metadata. Okay, [20:53] so hopefully that makes sense. Again, just [20:55] additional data that will be included [20:57] with the document, but it won't [20:58] necessarily be used to actually query. [21:00] And then lastly, we can specify an id, so [21:03] we're going to say the id is the string [21:05] of i, which is just the index of this [21:07] value in the row, or in the, what do [21:10] you call it, the CSV file, and just make [21:12] sure that you convert this to a string. [21:14] Okay, so I think that should be good for [21:16] now. After this, what we're going to do is [21:18] we're going to say ids.append, and [21:20] we're going to append str(i), and then [21:22] we're going to say documents. 
append, and [21:25] we're going to append our document. Now, [21:28] the reason why we need to store the ids [21:30] is because when we actually create this [21:31] data in the vector store, for some reason [21:33] we need two separate lists: we need a [21:36] list of documents, and then we need a [21:37] list of their associated ids, in case for [21:39] some reason they're different. So I know [21:41] it seems a bit weird that we have the id [21:43] twice, but just follow along, because we [21:44] need that for this process. Okay, so now [21:47] we've kind of prepared the data in [21:49] documents, and the next thing we need to [21:50] do is add this to the vector store, so we [21:53] need to create the vector store. So we're [21:54] going to say vector_store is equal to [21:58] Chroma, and then inside of Chroma we're [22:01] going to specify the location and the [22:03] collection name. So we're going to say [22:05] collection_name is equal to restaurant_reviews, [22:07] we're going to say the [22:09] persist_directory (Copilot is leading [22:12] me wrong here) is going to be equal to [22:14] the db_location. Now, this just means that [22:17] we'll store it persistently rather than [22:18] just storing it in memory. You don't need [22:21] to do this, but I recommend that you do [22:23] store this permanently so that you don't [22:24] need to keep regenerating this Chroma [22:26] database. And then lastly, we need to [22:28] pass the embedding_function, which will [22:30] be equal to our embeddings from Ollama. [22:34] Okay, so we're using all of this stuff [22:36] locally: we have ChromaDB locally, we [22:37] have the local embeddings model, and now [22:40] we have the vector store. Next, I'm going [22:42] to do a quick if statement, and I'm going [22:43] to say if add_documents, then we're going [22:45] to say vector_store. 
add_documents, and [22:48] then this is going to be documents is [22:51] equal to documents and ids is equal to [22:55] ids. Okay, so this is how you add this: you [22:58] just say vector_store.add_documents, you [23:00] specify the documents that you want to [23:02] add, which we've already prepared here, [23:04] and then you specify the corresponding [23:05] ids, and we're only doing that if this [23:09] did not already exist, because if it did [23:11] already exist then we don't need to add [23:13] the documents, right, and we wouldn't have [23:14] already prepared this data. Hopefully [23:16] that makes sense, but that essentially [23:18] will create the vector store for us and [23:20] automatically add the data. Last thing [23:22] we're going to do is we're going to make [23:24] this vector store usable by our LLM, [23:27] so I'm going to show you how to do that. [23:29] We're going to say [23:31] retriever, okay, is equal to [23:35] vector_store. [23:36] as_retriever. Okay, now, inside of here [23:40] there are a few parameters that we can [23:42] pass; for example, we can specify the [23:44] number of documents that we want it to [23:45] look up, so I'm going to say search_kwargs [23:47] is equal to k and then [23:50] 5. Now, when I do this, what's going to [23:53] happen is it's going to look up five [23:54] relevant reviews and then pass those [23:57] five reviews to the LLM. Now, if we [23:59] wanted 10 reviews we would make this 10, [24:01] if we wanted one review we would make [24:03] this 1. You can specify as many or as [24:05] few as you want, obviously minimum of one, [24:07] but I'm going to go with five. Okay, so [24:10] now we have the retriever, and what this [24:13] retriever will allow us to do is look up [24:15] documents; then we can pass those [24:17] documents into the prompt for our LLM. So [24:20] quickly recapping: we import all the [24:23] relevant data, we bring in the CSV file, [24:26] we define the embeddings model from [24:28] Ollama, we check 
if this location already [24:31] exists; if it doesn't, then we're going to [24:33] prepare all of our data by converting it [24:35] into documents. We're going to initialize [24:37] the vector store. If for some reason this [24:40] directory already exists, then there's no [24:42] need to add the data, but if it doesn't [24:44] exist, then we're going to add this data [24:46] into the vector store by adding all of [24:48] our documents. This will automatically [24:50] embed all of the documents for us and [24:52] add them to the vector store, and then we [24:54] can create this retriever from the [24:56] vector store, which will allow us to grab [24:58] documents. So the last step is to simply [25:00] use this retriever from our main.py file, [25:03] so we're going to go into main.py, we're [25:05] going to say from vector import [25:07] retriever, because we're just going to [25:09] import it from the other file, and now, [25:11] before we actually invoke this chain, we [25:14] can use the retriever to grab the [25:16] relevant reviews, and then we can pass [25:18] the reviews as a parameter to our prompt. [25:21] Okay, so in order to do that, we're just [25:23] going to say reviews is equal to [25:26] retriever, and then this is going to be [25:28] .invoke. We're just going to invoke [25:30] this with our question, and then we can [25:32] simply pass the reviews that are [25:34] returned here to our chain. So all that [25:37] we do here is we just say retriever.invoke, [25:39] we pass the question, or like the [25:42] search string that we want to use to [25:43] look up the relevant reviews. What will [25:45] happen is the retriever is automatically [25:48] going to embed that question, it's going [25:51] to go into the vector store, it's going to [25:52] look up all of the relevant reviews [25:54] using a similarity search algorithm, it's [25:57] going to grab the top five reviews, and [25:59] then it's going to pass this to our [26:00] chain, and then we can print out the [26:02] 
result and hopefully we get something [26:03] meaningful based on those reviews so [26:06] let's give this a run now and pray that [26:08] it works with python and then [26:12] main.py give this a second to run you [26:15] can see that it creates this Chroma [26:17] LangChain DB directory it will take a second [26:19] because it does need to embed all of our [26:21] documents and now we can ask a question [26:23] so I'm going to say how are the you know [26:28] vegan options if I can spell anything [26:31] correctly which apparently I cannot okay [26:33] so let's see what we get here and you [26:35] can see that it pulls up a few different [26:37] reviews here and it says based on the [26:38] reviews provided it appears the vegan [26:40] options at the pizza restaurant are a [26:41] mixed bag on the positive side some [26:43] reviewers have raved about the vegan [26:45] pizzas saying they're hidden gems okay and [26:47] it even tells us what document it got [26:49] this from however not all reviews are [26:51] glowing one reviewer had a vastly [26:53] different experience with the vegan [26:54] cheese option calling it tasteless and [26:56] then it says overall it seems that the [26:57] vegan options are hit or miss but [26:59] there's definitely potential and then it [27:00] gave us an overall rating three out of [27:02] five based on the two positive reviews out [27:04] of the four total okay cool we can also [27:07] ask it something like you know how is [27:10] the ambiance or something I don't know [27:12] if I spelled that correctly but let's [27:13] see what it says it said overall I would [27:15] say the ambiance of the pizza restaurant [27:16] has an all style no substance feel and [27:19] apparently they don't like the pizza [27:20] restaurant based on these reviews but [27:22] you guys get the idea it is insanely [27:24] fast it uses the vector store database [27:26] everything runs completely locally [27:28] and when we're ready to quit we can hit q and 
[27:31] we can exit out this was a simple [27:34] example that was just meant to [27:35] demonstrate how you can run LLMs locally [27:38] on your own computer using your own [27:39] hardware obviously you can adjust the [27:41] CSV file and you can make this any type [27:43] of data that you want it also doesn't [27:45] need to be CSV data you can just convert [27:47] anything that you want into documents [27:49] like I demonstrated here and if you want [27:51] the code from this video it will be [27:53] available from the link in the [27:54] description if you guys enjoyed make [27:56] sure to leave a like subscribe to the [27:57] channel and I will see you in the next [27:59] one [28:01] [Music]
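
To make the retrieval step a bit more concrete, here is a minimal sketch of what happens when the retriever is invoked with a question: the question is embedded, compared against every stored review, and the top matches are handed to the prompt. This is not the code from the video. It swaps the embedding model and the Chroma vector store for a toy word-count embedding and a plain Python list so it runs with only the standard library. The names `REVIEWS`, `embed`, `cosine`, `ToyRetriever`, and `build_prompt`, the sample reviews, and the prompt wording are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical sample reviews for illustration; not rows from the video's CSV.
REVIEWS = [
    "The vegan pizza was a hidden gem",
    "Vegan cheese option was tasteless",
    "Lovely ambiance but slow service",
    "Crispy crust and fresh sauce",
]


def embed(text):
    """Toy embedding: word counts. The real app uses an embedding model instead."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyRetriever:
    """Stand-in for the vector store retriever: embed once, search on invoke."""

    def __init__(self, docs, k=5):
        self.k = k
        # Embed every document up front, like adding documents to a vector store.
        self.index = [(doc, embed(doc)) for doc in docs]

    def invoke(self, question):
        # Embed the question, rank documents by similarity, keep the top k.
        query = embed(question)
        ranked = sorted(
            self.index, key=lambda pair: cosine(query, pair[1]), reverse=True
        )
        return [doc for doc, _ in ranked[: self.k]]


def build_prompt(reviews, question):
    """Hypothetical prompt layout; the template in the video may differ."""
    joined = "\n".join(f"- {review}" for review in reviews)
    return f"Here are some relevant reviews:\n{joined}\n\nQuestion: {question}"


retriever = ToyRetriever(REVIEWS, k=2)  # the video grabs the top five
question = "how are the vegan options"
reviews = retriever.invoke(question)
print(build_prompt(reviews, question))
```

In the real app the word-count step would be replaced by the embedding model and the list by the persisted vector store, but the shape of the call stays the same: invoke the retriever with the question, then pass the returned reviews into the prompt alongside the question.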