Build a Local AI Agent in Minutes
This tutorial demonstrates how to build a local AI agent using Python, Ollama, LangChain, and Chroma DB for retrieval-augmented generation (RAG). The agent can query a CSV file of restaurant reviews to answer questions, all running locally without any cloud APIs.
Build a local AI agent in minutes using Python, Ollama, LangChain, and Chroma DB for RAG, enabling retrieval from CSV/PDF files.
Agent queries a CSV of fake pizza restaurant reviews to answer questions like 'how is the quality of the pizza?' and 'are there vegan options?'.
Create a virtual environment, install langchain, langchain-ollama, langchain-chroma, and pandas.
Install Ollama, pull llama3.2 for the LLM and mxbai-embed-large for embeddings.
Import Ollama LLM, create a chat prompt template, build a chain, and test with a simple question.
Create a separate file to load CSV, define embeddings, initialize Chroma DB, and create a retriever.
Import retriever into main.py, use it to fetch relevant reviews before invoking the LLM chain, and run the interactive loop.
This tutorial shows how to build a fully local AI agent with RAG using Python, Ollama, LangChain, and Chroma DB. The approach can be adapted to any CSV or document data, enabling private, offline question-answering.
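
The retrieve-then-answer loop summarized above can be sketched in plain Python. This is a minimal sketch, not the tutorial's actual main.py: `run_agent`, `fake_retrieve`, and `fake_answer` are hypothetical names, and the two fake functions stand in for the Chroma retriever and the LangChain chain so the example runs without any packages installed.

```python
from typing import Callable, Iterable, List


def run_agent(
    questions: Iterable[str],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[List[str], str], str],
) -> List[str]:
    """Answer each question in turn, stopping when 'q' is entered."""
    answers = []
    for question in questions:
        if question.strip().lower() == "q":  # 'q' quits, as in the tutorial's loop
            break
        reviews = retrieve(question)               # step 1: fetch relevant reviews
        answers.append(answer(reviews, question))  # step 2: pass both to the model
    return answers


# Hypothetical stand-ins for the retriever and the LLM chain.
def fake_retrieve(question: str) -> List[str]:
    return ["Great crust, fresh toppings."]


def fake_answer(reviews: List[str], question: str) -> str:
    return f"{question} -> based on {len(reviews)} review(s)"


demo = run_agent(["how is the pizza?", "q", "never asked"], fake_retrieve, fake_answer)
print(demo)
```

In the real program the questions would come from `input()`, and the two callables would presumably wrap the retriever and the chain built later in the video.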
"Title accurately describes the tutorial; it delivers exactly what it promises."
What is the purpose of the embedding model in this project?
To convert text into vectors for efficient similarity search in the vector database.
16:10
Which Ollama models are used in this tutorial?
llama3.2 for the LLM and mxbai-embed-large for embeddings.
05:55
What does the retriever's search_kwargs={'k': 5} parameter do?
It specifies that the retriever should return the top 5 most relevant documents.
23:47
How do you activate the virtual environment on Windows?
Type .\venv\Scripts\activate
03:20
What is the role of Chroma DB in this project?
It acts as a local vector database to store and retrieve document embeddings.
15:08
What three things are passed to the Document object when creating documents?
page_content, metadata, and id.
19:44
Why is the vector store persisted to disk?
To avoid re-embedding the documents every time the program runs.
22:17
What command lists all available Ollama models?
ollama list
09:58
Local AI Agent with RAG
Demonstrates a fully local, free alternative to cloud-based AI agents using RAG.
Vector Search Explanation
Clearly explains how vector search enables efficient retrieval of relevant documents.
15:08
Document Structure for RAG
Shows how to structure documents with page_content and metadata for effective retrieval.
19:44
Retriever Configuration
Demonstrates configuring the retriever to return a specific number of documents (k=5).
23:47
[00:00] in this video I'll be showing you how to
[00:01] build a local AI agent in just a few
[00:04] minutes using Python we'll be using Ollama
[00:07] LangChain and something called Chroma
[00:09] DB to act as our Vector search database
[00:12] because I'm going to show you how to add
[00:13] retrieval augmented generation into this
[00:16] app that essentially means we can
[00:17] retrieve relevant information from
[00:19] something like a CSV file or a PDF and
[00:22] bring that into our model now all of
[00:24] this is completely free you don't need
[00:26] an OpenAI account you don't need a Claude
[00:28] account or something you can do this all
[00:30] from your local computer so let me show
[00:32] you how to set it all up so I'm just
[00:34] going to show you a quick demo of the
[00:35] finished product and then we'll get into
[00:37] the tutorial now you can see on the
[00:39] right hand side of my screen here that I
[00:40] just opened up a CSV file this CSV file
[00:43] just contains some fake reviews for a
[00:45] random pizza restaurant so we have title
[00:47] date rating and review and you can see
[00:49] something like best pizza in town here's
[00:51] the date here's the ID or sorry the
[00:53] rating of the review out of five and
[00:55] then you have what the actual review is
[00:57] and there's kind of some information now
[00:59] what I'm going to show you is how to
[01:00] build an AI agent here that can actually
[01:02] go and look up relevant reviews from
[01:05] this document to answer questions about
[01:07] the restaurant I don't know about you
[01:08] guys but whenever I go to a new place I
[01:10] always look at the reviews and typically
[01:12] I'm looking for an answer to my
[01:13] particular question so this can kind of
[01:15] do that for you so for example maybe I
[01:17] want to know you know how is the quality
[01:21] of the pizza okay well what it can do is
[01:23] then go to this document find the
[01:25] relevant reviews which you can see it
[01:26] kind of pulls into here and analyzes and
[01:29] then it gives me a conclusion overall
[01:31] without more data or context it's
[01:32] challenging to give a definitive score
[01:33] on the pizza based solely on the reviews
[01:35] however they do suggest a restaurant
[01:37] with potentially room for improvement in
[01:38] presentation and overall consistency so
[01:40] there you go right I could ask something
[01:42] like are there vegan options let's see
[01:47] what that gives us and you can see here
[01:49] in conclusion based on the reviews there
[01:51] appear to be at least one vegan pizza
[01:53] and possibly more vegan options
[01:54] available Okay cool so that's what we're
[01:56] going to build this isn't going to be
[01:57] super complicated it'll be pretty fast
[01:59] so stick around and let me show
[02:00] you how we make it all right so we have
[02:02] a few quick setup steps here and then we
[02:04] can dive right into the code now the
[02:06] first thing that we're going to need is
[02:07] obviously some kind of CSV file now you
[02:09] can use anything that you want and I'll
[02:11] show you how to adjust this code for
[02:13] your own example but if you want to
[02:14] download the CSV file that I'm using
[02:16] I'll leave a link to it in the
[02:18] description and in fact all of the code
[02:19] will be available from the GitHub so you
[02:21] can go to the GitHub and you can
[02:22] download this CSV file and just bring it
[02:25] into a new folder in VS Code so to
[02:27] begin open up some kind of code editor
[02:29] I'm using VS Code create a new
[02:30] folder you can see I have one called
[02:32] local AI agent bring in the CSV file and
[02:35] then I also created this
[02:36] requirements.txt file which just has the
[02:39] three things that we're going to need to
[02:41] install in Python so let's get started
[02:43] with that installing our Python
[02:45] dependencies and then I'll show you the
[02:46] next steps so what we need to do is open
[02:48] up our terminal again I'm inside of the
[02:51] directory that I want to write code in
[02:52] for this video and what I'm going to do
[02:54] is create a virtual environment so to do
[02:56] that I'm going to type python -m venv
[03:00] and then venv if you're on Mac or Linux
[03:03] you can change this to python3 and what
[03:05] this will do is create a new isolated
[03:07] environment that we can install various
[03:09] dependencies into if you don't know
[03:11] anything about virtual environments and
[03:12] you want to learn more I'll leave a
[03:14] video on screen now that the virtual
[03:16] environment has been created we need to
[03:17] activate it to activate it if you're on
[03:20] Windows is going to be dot slash the
[03:22] name of the virtual environment slash
[03:24] and then scripts with a capital S and
[03:26] then slash activate when you type that
[03:29] you should see that you get the name of
[03:31] the virtual environment as a prefix
[03:33] before your command line now if you are
[03:35] on Mac or Linux then the command is
[03:37] going to be source ./venv/ and then this is
[03:39] going to be bin/activate okay so it's
[03:44] different if you're on Windows it's this
[03:45] one and if you are Mac or Linux it's
[03:47] going to be this one and again I'll
[03:48] leave a video on screen that will go
[03:50] through this more in depth now that we
[03:52] have the virtual environment activated
[03:54] what we're going to do is install the
[03:56] various dependencies inside of here now
[03:58] if you have this requirements.txt file
[04:00] then you can say pip install -r and
[04:04] then you can do requirements.txt and
[04:07] this will install all of the
[04:08] requirements into our virtual
[04:10] environment however if you don't have
[04:12] the requirements.txt file you can just
[04:14] type them out so you can just install
[04:16] langchain you can install
[04:19] langchain-ollama and you can install
[04:22] langchain-chroma like that okay so we just need to
[04:25] install these dependencies in order to
[04:26] be able to use these in Python so that's
[04:28] going to take a second installing all of
[04:30] those dependencies for us and then once
[04:31] that's done I'll be right back okay so
[04:34] those are installed and the next thing
[04:35] that we're going to need to get is
[04:36] something called Ollama now Ollama allows
[04:39] us to run models locally on our own
[04:41] computer using our own Hardware so
[04:43] that's why we're able to do everything
[04:44] locally here rather than having to use
[04:46] something like an OpenAI API key so
[04:49] please go to this page just ollama.com if
[04:52] you don't already have this software and
[04:53] simply download it once you download it
[04:56] what you should be able to do is just
[04:57] open up some kind of terminal or command
[04:59] prompt and then type the command ollama
[05:02] if you have any issues with this again
[05:04] I'll put another video on screen that
[05:05] walks through Ollama in depth and will
[05:07] show you how to set this up but once we
[05:09] have Ollama installed on our computer what
[05:12] we're going to do is install an Ollama
[05:14] model now Ollama again it's just this open
[05:16] source software and allows us to pull
[05:18] various models to our own computer and
[05:20] then run them using our own Hardware now
[05:22] depending on the type of Hardware you
[05:24] have that will dictate the models you'll
[05:26] be able to run for example you probably
[05:28] can't run a 200 gigabyte model if you don't
[05:30] have a graphics card in your computer so
[05:32] I'm going to show you a few models that
[05:33] should work on most machines if you have
[05:36] a graphics card if you don't have a
[05:37] graphics card and you just have a CPU
[05:39] there's some very small models that you
[05:41] can download and use but obviously the
[05:43] performance won't be as good so what you
[05:45] can do is you can actually go to the
[05:46] Ollama library I'll leave this link in
[05:49] the description and you can see that
[05:50] there's various different models and it
[05:52] kind of shows you all of the options
[05:53] that they have now we're going to pull
[05:55] two models to our computer we're just
[05:57] going to pull llama 3.2 so this is kind
[06:00] of a smaller model that we can use that
[06:02] performs pretty well and then we're
[06:04] going to pull an embedding model and
[06:05] I'll show you the name of that in 1
[06:07] second which we'll use to embed the
[06:09] documents that we add into our Vector
[06:11] store if that means nothing to you don't
[06:13] worry just follow along with the next
[06:14] steps okay so we're going to go into our
[06:16] terminal and again we're going to make
[06:18] sure that the ollama command works and then
[06:20] we're going to type ollama pull and we're
[06:22] going to start by pulling the model
[06:24] llama3.2 now you can pull any model
[06:27] that you want you can choose you can go
[06:28] look at the directory but I'm going to
[06:30] go with 3.2 once that's done okay you
[06:32] can see it's here because I already had
[06:33] it downloaded then we can move on to the
[06:35] next one now the next model that we're
[06:37] going to pull is going to be an
[06:38] embedding model now this embedding model
[06:40] is going to be mxbai and then this is
[06:44] going to be -embed-large there's
[06:47] various other embedding models you can
[06:49] use but this is the one we'll use for
[06:50] this video okay so we're going to go and
[06:52] hit enter and then again download it to
[06:54] our computer these are not super big so
[06:56] you should be able to run them on your
[06:58] computer if you have any kind of GPU all
[07:00] right so now that we have these models
[07:01] we're good to start writing some code so
[07:03] I'm going to go back into VS Code I'm
[07:05] going to make a file called
[07:07] main.py and in this file I'm going to
[07:10] start writing some code now you'll
[07:11] notice that I actually get this
[07:12] autocomplete here this is coming from
[07:14] GitHub Copilot you know that really
[07:16] cool assistant that replaces a lot of
[07:18] your manual typing work they've actually
[07:20] sponsored this video and speaking of
[07:22] Microsoft's GitHub Copilot I was
[07:24] fortunate enough to have them sponsor a
[07:26] video a few weeks ago on AI agents and
[07:29] today's video where I promise to
[07:30] highlight some of the standout ways that
[07:32] developers are using GitHub Copilot
[07:34] that you guys submitted with the coding
[07:36] with Copilot hashtag so let's get into
[07:38] it check out these examples of how
[07:40] developers are using GitHub Copilot
[07:42] like Emy who created an entire Flutter
[07:44] mobile app tug Duel who created a Python
[07:47] script to resize and save images Adrien
[07:49] who used Copilot as a beginner when he
[07:51] was working in Jupyter to learn better
[07:53] ways to write functions and Yousef who
[07:55] uses it to avoid manually writing
[07:57] tedious documentation now personally I use
[08:00] GitHub Copilot every single time I open
[08:02] up VS Code and it's insane how well it
[08:04] can predict what I want to do next and
[08:06] save me tons of hours of manual typing
[08:08] it's literally like it can read my mind
[08:10] now I'm sure that you guys have more
[08:12] stories on how you're using GitHub
[08:14] Copilot so please share them with me
[08:16] using the coding with Copilot hashtag
[08:18] because I'm excited to check them out
[08:20] now with that said let's get back to the
[08:22] video all right so back into the code
[08:23] editor here let's go ahead and get
[08:25] started now we're going to begin by just
[08:27] importing a few things so we're going to
[08:29] say from
[08:31] langchain_ollama.llms import OllamaLLM we're
[08:36] then going to say from
[08:40] langchain_core.prompts import ChatPromptTemplate
[08:45] okay now if you're unfamiliar
[08:47] with LangChain this is a framework that
[08:49] just makes it a lot easier for us to
[08:50] work with llms it's very popular in
[08:53] Python and it has all of these
[08:55] extensions like the Ollama extension that
[08:57] allows us to directly use our Ollama
[08:59] models and by the way what will happen
[09:02] is Ollama should be running in the
[09:04] background on your computer and it's
[09:05] going to expose a server or like an HTTP
[09:09] REST API that we'll be able to
[09:11] communicate with from our program so
[09:13] when you pull these models they are
[09:15] actually running on your own computer
[09:17] and we can trigger Ollama to utilize these
[09:19] models from code in python or we can
[09:22] actually just do it directly from the
[09:23] command line so everything that I'm
[09:25] showing you here will run 100% locally
[09:27] on your own computer even though it
[09:29] might not necessarily feel like that it
[09:30] also means it'll be pretty fast okay so
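
To make the local HTTP server mentioned above concrete, here is a small sketch that talks to Ollama directly with only the standard library. It assumes Ollama is listening on its default local address (port 11434) and uses the `/api/generate` endpoint; the helper names `build_generate_request` and `generate` are made up for this illustration. LangChain is not involved here; in the video this communication happens for you behind the scenes.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs Ollama running)."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


# Building the request works offline; only generate() touches the network.
request = build_generate_request("llama3.2", "Say hello in five words.")
print(request.get_method(), request.full_url)
```

Calling `generate("llama3.2", "...")` should return the model's reply as a string once the server is up and the model has been pulled.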
[09:34] after this we're going to specify our
[09:35] model now I'm going to show you in this
[09:37] code snippet here how to utilize an Ollama
[09:39] model like quite quickly and then we'll
[09:41] start connecting some more complexity to
[09:43] it with the vector database and I'll
[09:45] talk about what that means so I'm going
[09:46] to say model is equal to OllamaLLM and then
[09:49] inside of here I need to specify the
[09:52] specific model from Ollama that I want to
[09:54] use now if you're confused on what to
[09:56] put here you can open up your command
[09:58] prompt you can type ollama list like
[10:01] this and it will show you the models
[10:03] that you have available so you can see
[10:05] that I have this embedding model I have
[10:06] llama3.2 I have mistral I have llama2
[10:09] so any of these models I can use so what
[10:12] I'm going to do is just copy llama3.2
[10:14] you don't need the :latest part of it you
[10:16] can just do the original name and you
[10:18] can put it right here okay so I'm going
[10:20] to use model OllamaLLM model equal to
[10:22] llama3.2 and now I can start utilizing
[10:25] this model and kind of invoking it so
[10:28] next what we're going to do is we're
[10:29] going to create a template and this
[10:31] template is going to be just a string
[10:34] and inside of this string we're just
[10:35] going to specify what we want the model
[10:37] to actually do so we're going to say
[10:39] something like you are an
[10:41] expert in answering questions about a
[10:46] pizza restaurant okay here are some
[10:51] relevant reviews and then we're just
[10:53] going to put inside of a variable here
[10:56] reviews and say here is the question to
[11:00] answer okay and then we're going to put
[11:02] a question perfect then what we're going
[11:05] to do is we're going to say our prompt
[11:07] is equal to a chat prompt template we're
[11:10] going to pass our template and actually
[11:12] we don't need to pass the model I don't
[11:14] know why it's doing that and now we've
[11:15] created a chat prompt template where
[11:17] we'll be able to pass in a reviews
[11:19] variable and a question variable and
[11:21] then the model can respond to that okay
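
The template described above is a string with two placeholders. The sketch below fills them with Python's built-in `str.format`, which stands in for LangChain's prompt template so the example runs without any packages; the sample review and question are made up, and `fill_prompt` is a hypothetical helper name.

```python
TEMPLATE = """You are an expert in answering questions about a pizza restaurant.

Here are some relevant reviews: {reviews}

Here is the question to answer: {question}
"""


def fill_prompt(reviews: list, question: str) -> str:
    """Substitute both variables, which is what happens before the model sees the prompt."""
    return TEMPLATE.format(reviews=reviews, question=question)


filled = fill_prompt(["Best pizza in town"], "Is the crust good?")
print(filled)
```

The point is only that `reviews` and `question` are the two values the caller must supply every time the prompt is used.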
[11:24] then we're going to create a chain so
[11:26] with the chain we can say prompt and
[11:28] then we can put a pipe and then we can
[11:30] put model now what this allows us to do
[11:32] is essentially invoke this entire chain
[11:35] that can combine multiple things
[11:37] together to run our llm so first what
[11:40] we'll do is we'll pass variables reviews
[11:42] and question into this prompt this chat
[11:45] prompt template that we just created and
[11:47] then that will automatically get passed
[11:48] to our model because we put it inside of
[11:51] this chain and then it will return to us
[11:53] whatever the answer is so if we want to
[11:55] test this out really quickly because
[11:57] this is literally all we need in
[11:58] order to do this we can say
[12:01] chain.invoke and then inside of a
[12:04] python dictionary we need to specify the
[12:06] two variables that we had inside of this
[12:09] prompt so we're going to have reviews
[12:11] and then question okay so we'll start
[12:13] with reviews and for now we can just
[12:16] make this an empty list and then we can
[12:18] say question and something like what is
[12:20] the best pizza place in town that might
[12:22] not necessarily make sense because this
[12:23] is just about one pizza place but I just
[12:25] want to show you a quick demo so we're
[12:27] going to say result is equal to this and
[12:30] then we're going to go down and we're
[12:31] going to say print result okay so we can
[12:35] just test this out and make sure that
[12:36] it's working and it should go ahead and
[12:38] invoke our Ollama LLM and give us some
[12:41] kind of response so let's go here and
[12:43] run this we can do that by typing python
[12:45] the name of our file which is main.py or
[12:48] python3 main.py so I'm going to hit
[12:50] enter give this a second to run and we
[12:53] got an error some kind of formatting
[12:54] issue so let's see what the problem is
[12:57] okay so silly mistake here what we
[12:58] actually need to say
[13:00] ChatPromptTemplate.from_template I forgot to
[13:03] specify this method so of course that
[13:05] was giving us an issue so let's go back
[13:07] here and fix that quickly python main.py
[13:10] and we should see that this works now
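
Putting the steps so far together, the first version of main.py might look roughly like the sketch below, with the `from_template` fix applied. The exact template wording and the function names `build_chain` and `main` are assumptions; the LangChain imports sit inside the function only so the file can be loaded on a machine where the packages are not installed yet.

```python
TEMPLATE = """You are an expert in answering questions about a pizza restaurant.

Here are some relevant reviews: {reviews}

Here is the question to answer: {question}
"""


def build_chain(model_name: str = "llama3.2"):
    """Return prompt | model. Requires langchain-ollama and a running Ollama."""
    # Imported here so this file can be loaded without the packages installed.
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_ollama.llms import OllamaLLM

    model = OllamaLLM(model=model_name)
    # from_template is the method that was missing in the first attempt.
    prompt = ChatPromptTemplate.from_template(TEMPLATE)
    return prompt | model  # the prompt's output is piped into the model


def main() -> None:
    """Run one test question with no reviews, as in the quick demo."""
    chain = build_chain()
    result = chain.invoke(
        {"reviews": [], "question": "What is the best pizza place in town?"}
    )
    print(result)


# Call main() once Ollama is running and llama3.2 has been pulled.
```

With an empty reviews list the model has nothing to ground its answer on, so an invented response is the expected outcome of this test.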
[13:12] give it a second and you can see it say
[13:14] based on our customer feedback and
[13:15] ratings I would highly recommend this
[13:16] the top rated pizza place One reviewer
[13:18] mentioned blah blah blah blah in fact
[13:20] our own team has sampled their pizza so
[13:22] it just came up with something random
[13:23] here because I didn't actually give it
[13:25] any reviews so it's kind of
[13:26] hallucinating the response but you get
[13:28] the idea okay it did actually work we
[13:30] were able to use Ollama and we got a
[13:32] response from the model which is really
[13:34] just the point of what we were testing
[13:35] here okay so now what we're going to do
[13:37] is we're just going to put this inside
[13:39] of a while loop so essentially we can just
[13:41] keep asking it questions and then we're
[13:42] going to set up the vector search so we
[13:44] can actually get a relevant response so
[13:46] let's set up a simple Loop here we're
[13:48] just going to say while true then we're
[13:50] going to ask a question so we're going
[13:51] to say question is equal to input so we
[13:55] can get some input from the user and
[13:56] we'll say you know ask
[13:59] your
[14:00] question and then we're just going to
[14:02] put a set of parentheses here and say Q
[14:04] to quit so if they type Q then we can
[14:07] quit we're going to say if the question
[14:09] is equal to Q then we are going to break
[14:13] otherwise we can invoke this chain so
[14:15] we're going to say result is equal to
[14:18] chain.invoke okay and for the question
[14:21] we'll just put the question the user
[14:22] asked so we'll replace this with
[14:25] question and then we can print the
[14:27] result now we also can just have a few
[14:29] kind of formatting variables here so I'm
[14:30] just going to say print and I'm just
[14:32] going to print kind of a big line with a
[14:34] few backslash n characters and then same
[14:37] thing here I'm just going to print a few
[14:40] backslash n's so we can kind of read
[14:42] what's happening okay so we don't need
[14:44] to test this but this will just allow us
[14:46] to continue to ask questions until we
[14:48] type in Q now what I want to do is show
[14:50] you how to set up the vector search all
[14:52] right so we're going to create a new
[14:53] file here called vector.py you can call
[14:56] this anything that you want and here's
[14:58] where we're going to write the logic for
[15:00] actually embedding our documents and
[15:02] then looking them up or vectorizing our
[15:04] documents now in case you're unfamiliar
[15:06] with Vector search this essentially is
[15:08] going to be a database it's going to be
[15:10] hosted locally on our own computer using
[15:12] something called Chroma DB which we
[15:14] installed earlier and this is going to
[15:16] allow us to really quickly look up
[15:18] relevant information that we can then
[15:20] pass to our model and then our model can
[15:23] use that data to give us some more
[15:25] contextually relevant replies so
[15:27] obviously llms are really good at kind
[15:29] of synthesizing text and giving us
[15:31] responses but usually they don't have
[15:33] the correct data so in this case what
[15:35] we're going to do is we're going to take
[15:36] this entire CSV file we're going to put
[15:38] it inside of this Vector enabled
[15:41] database and then as soon as we ask a
[15:43] question we're going to look up the
[15:45] relevant documents in that database
[15:47] we're going to pass those to the llm as
[15:49] a list of reviews and then it will be
[15:51] able to search through those reviews and
[15:53] answer our question okay so that's like
[15:55] the very Basics on Vector search let me
[15:57] show you how we do that so we're going
[15:59] to say from langchain_ollama
[16:03] we're going to
[16:04] import OllamaEmbeddings okay now one
[16:08] thing that we need when we do this
[16:10] vectorization process is an embedding
[16:12] model this model will be able to take
[16:14] text and convert it into a vector this
[16:16] is essentially numbers that we can then
[16:18] use to look up data really efficiently
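
To illustrate what "numbers we can use to look up data" means, here is a toy similarity search in plain Python. The three-dimensional vectors are made up by hand; a real embedding model produces much longer ones, and Chroma's internal search is more sophisticated than this loop. The function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: List[float], store: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up vectors standing in for embedded reviews.
store = {
    "crust review": [0.9, 0.1, 0.0],
    "vegan review": [0.1, 0.9, 0.1],
    "parking review": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # pretend embedding of a question about the crust
best = top_k(query, store, k=2)
print([doc_id for doc_id, _ in best])  # ['crust review', 'vegan review']
```

The `k` here plays the same role as the retriever's `k` setting discussed later: it caps how many of the closest documents are returned.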
[16:21] next we're going to say from
[16:24] langchain_chroma and we're
[16:26] going to import Chroma like this which
[16:28] is going to be our vector store
[16:30] we're then going to say from
[16:33] langchain_core.documents import Document
[16:38] we're going to create documents and then
[16:40] pass these to our uh what do you call it
[16:42] chroma database we're then going to
[16:44] import OS and we're going to import
[16:46] something that I forgot to install
[16:47] before which is pandas as pd okay now
[16:51] pandas is a library that we can use to
[16:52] really easily read in our CSV file so
[16:55] just quickly before I forget we do need
[16:57] to install this so same as before we're
[16:59] going to type pip install pandas in our
[17:02] virtual environment and then we should
[17:04] install that dependency and be able to
[17:06] use it I'll also add it to the
[17:07] requirements.txt file so if you guys
[17:09] were to have downloaded this before you
[17:11] would already have it okay so pandas is
[17:13] installing we can just wait for that to
[17:15] run and start writing some more code so
[17:17] first things first we're going to load
[17:18] in our CSV file we're going to use the
[17:20] data in the CSV file for our Vector
[17:23] store so of course we're going to need
[17:24] the data so we're going to say DF
[17:26] standing for data frame and this is
[17:27] going to be pd.read_csv and we're
[17:31] going to read in
[17:35] realistic_restaurant_reviews.csv
[17:37] and obviously you know read
[17:40] in whatever the name of your CSV file is
[17:43] I think that I spelled that correctly
[17:45] although maybe not restaurant let's see
[17:48] you know what we can just do this rename
[17:50] copy and then paste here to avoid any
[17:53] misspellings okay anyway so we have our
[17:55] data frame here next we're going to
[17:57] bring in the embedding model so we're
[17:58] going to say embeddings is equal to the
[18:02] OllamaEmbeddings and then we're going to
[18:04] say model is equal to and then the name
[18:06] of the model that we installed which is
[18:08] mxbai-embed-large
[18:13] okay now after that we're going to
[18:15] specify the location where we want to
[18:17] store our Vector database so I'm going
[18:19] to say dot slash and then
[18:22] chroma_langchain and then this is going
[18:25] to be _db you can call this anything
[18:27] that you want but this is just going to
[18:28] be a folder where we store our database
[18:31] okay next after that we're going to say
[18:34] add_documents is equal to and then
[18:37] we're going to say not os.path.exists
[18:41] and then the database location now what
[18:43] I want to do is I want to check and see
[18:44] if this database already exists if it
[18:47] does that means that I've already
[18:48] performed the process of converting the
[18:50] CSV file into vectors and adding into
[18:53] the database if it doesn't exist then it
[18:55] means that I need to do that okay so we
[18:57] don't need to keep doing this every
[18:58] single time we can just one time
[19:00] vectorize our data and then once it's
[19:02] vectorized and it's in the database we
[19:04] don't need to do that again we can just
[19:05] start using it so below here I'm going
[19:07] to say if add documents so if we do
[19:10] actually need to add them then we're
[19:11] going to say the following we're going
[19:13] to say documents is equal to an empty
[19:15] list and we're going to say ids is equal
[19:17] to an empty list as well then what we're
[19:19] going to do is we're going to iterate
[19:21] through our rows so we're going to say for
[19:23] i, row in df.iterrows() this is
[19:26] simply going to go row by row through
[19:28] our CSV file and then allow us to access
[19:31] the various entries now what we're going
[19:33] to do is we're going to create
[19:34] individual documents we're going to add
[19:36] them to the documents list and then
[19:37] we're going to add them to our Vector
[19:39] store okay so we're going to say
[19:41] document is equal to document and inside
[19:44] of this document we need to pass three
[19:46] things we need to pass a page content
[19:49] and this page content is going to be
[19:52] what we will actually be vectorizing and
[19:54] what we'll be looking up so if you
[19:56] wanted to adjust this for your own
[19:58] example any of the content that you want
[20:00] to use to actually look up the
[20:02] information in the database that needs
[20:04] to go in the page content so what we're
[20:05] going to do is we're going to combine
[20:08] the title of the review with the review
[20:11] itself so that we have a bunch of
[20:12] information to be able to actually query
[20:14] our data okay there's all kinds of
[20:16] different things you can do here but you
[20:18] want to include the important
[20:19] information that you'll be querying
[20:20] based on in the page content so we're
[20:23] going to take row at title and then
[20:26] we're going to say plus a space and then
[20:28] row at review okay then we're going to
[20:31] specify some metadata co-pilot's already
[20:33] doing it for me so we have metadata and
[20:35] then rating and that's row rating and
[20:37] then we're going to have date and then
[20:39] this is going to be
[20:42] row date okay so the metadata is just
[20:46] additional information that we will grab
[20:48] along with the document but we won't be
[20:51] querying based on the metadata okay
[20:53] so hopefully that makes sense again just
[20:55] additional data that will be included
[20:57] with the document but it won't
[20:58] necessarily be used to actually query
[21:00] and then lastly we can specify an ID so
[21:03] we're going to say the ID is the string
[21:05] of I which is just the index of this
[21:07] value in the row or in the uh what do
[21:10] you call it the CSV file and just make
[21:12] sure that you convert this to a string
[21:14] okay so I think that should be good for
[21:16] now after this what we're going to do is
[21:18] we're going to say ids.append and
[21:20] we're going to append str(i) and then
[21:22] we're going to say documents.append and
[21:25] we're going to append our document now
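
The row-to-document step can be sketched without pandas or LangChain. Below, a plain dict stands in for LangChain's Document, the standard `csv` module stands in for `df.iterrows()`, and the two sample rows are invented; the capitalized column names are an assumption about the CSV header.

```python
import csv
import io

# Two made-up rows; column names are assumed to match the tutorial's CSV.
SAMPLE_CSV = """Title,Date,Rating,Review
Best pizza in town,2024-01-05,5,Crispy crust and fresh toppings.
Slow service,2024-02-11,2,Waited forty minutes for a cold slice.
"""


def rows_to_documents(csv_text: str):
    """Return (documents, ids); each document is a plain dict stand-in."""
    documents, ids = [], []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        doc_id = str(i)  # ids are stored as strings
        documents.append({
            # The text that gets embedded and searched.
            "page_content": row["Title"] + " " + row["Review"],
            # Carried along with the result but not used for the search.
            "metadata": {"rating": row["Rating"], "date": row["Date"]},
            "id": doc_id,
        })
        ids.append(doc_id)
    return documents, ids


documents, ids = rows_to_documents(SAMPLE_CSV)
print(ids)
print(documents[0]["page_content"])
```

To adapt this to other data, the main decision is which columns belong in `page_content` (what you want to search on) and which are only worth keeping as metadata.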
[21:28] the reason why we need to store the IDS
[21:30] is because when we actually create this
[21:31] data in the vector store for some reason
[21:33] we need two separate lists we need a
[21:36] list of documents and then we need a
[21:37] list of their Associated IDs in case for
[21:39] some reason they're different so I know
[21:41] it seems a bit weird that we have the ID
[21:43] twice but just follow along because we
[21:44] need that for this process okay so now
[21:47] we've kind of prepared the data in
[21:49] documents and the next thing we need to
[21:50] do is add this to the vector store so we
[21:53] need to create the vector store so we're
[21:54] going to say Vector store is equal to
[21:58] chroma and then inside of chroma we're
[22:01] going to specify the location and the
[22:03] collection name so we're going to say
[22:05] collection name is equal to restaurant
[22:07] reviews we're going to say the
[22:09] persist_directory Copilot is leading
[22:12] me wrong here is going to be equal to
[22:14] the DB location now this just means that
[22:17] we'll store it persistently rather than
[22:18] just storing it in memory you don't need
[22:21] to do this but I recommend that you do
[22:23] store this permanently so that you don't
[22:24] need to keep regenerating this chroma
[22:26] database and then lastly we need to
[22:28] pass the embedding function which will
[22:30] be equal to our embeddings from Ollama
[22:34] okay so we're using all of this stuff
[22:36] locally we have the Chroma DB locally we
[22:37] have the local embeddings model and now
[22:40] we have the vector store next I'm going
[22:42] to do a quick if statement and I'm going
[22:43] to say if add_documents then we're going
[22:45] to say vector_store.add_documents and
[22:48] then this is going to be documents is
[22:51] equal to documents and ids is equal to
[22:55] ids okay so this is how you add this you
[22:58] just say vector_store.add_documents you
[23:00] specify the documents that you want to
[23:02] add which we've already prepared here
[23:04] and then you specify the corresponding
[23:05] IDs and we're only doing that if the database
[23:09] did not already exist because if it did
[23:11] already exist then we don't need to add
[23:13] the documents right and we wouldn't have
[23:14] already prepared this data hopefully
[23:16] that makes sense but that essentially
[23:18] will create the vector store for us and
[23:20] automatically add the data last thing
[23:22] we're going to do is we're going to make
[23:24] this vector store be usable by our LLM
[23:27] so I'm going to show you how to do that
[23:29] we're going to say
[23:31] retriever is equal to
[23:35] vector_store
[23:36] .as_retriever okay now inside of here
[23:40] there's a few parameters that we can
[23:42] pass for example we can specify the
[23:44] number of documents that we want it to
[23:45] look up so I'm going to say search_kwargs
[23:47] is equal to k and then
[23:50] five now when I do this what's going to
[23:53] happen is it's going to look up five
[23:54] relevant reviews and then pass those
[23:57] five reviews to the LLM now if we
[23:59] wanted 10 reviews we would make this 10
[24:01] if we wanted one review we would make
[24:03] this one you can specify as many or as
[24:05] few as you want obviously minimum of one
[24:07] but I'm going to go with five okay so
[24:10] now we have the retriever and what this
[24:13] retriever will allow us to do is look up
[24:15] documents then we can pass those
[24:17] documents into the prompt for our LLM
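To make the role of k concrete, here is a toy top-k lookup. It is only a sketch of the idea: the word-count "embedding" and cosine similarity below are simplifications chosen for the example, not what the mxbai-embed-large model or Chroma actually compute, and the sample reviews are invented.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag of lowercase word counts (an assumption for
    illustration; a real embedding model produces dense vectors)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, reviews, k=5):
    """Return the k reviews most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(
        reviews, key=lambda review: cosine(query_vector, embed(review)), reverse=True
    )
    return ranked[:k]


# Invented sample reviews.
REVIEWS = [
    "the vegan pizza was a hidden gem",
    "parking was hard to find",
    "vegan cheese tasted bland",
    "friendly staff and fast service",
]

top_two = retrieve("vegan pizza options", REVIEWS, k=2)
print(top_two)
```

Raising k hands the model more context per question, while lowering it keeps the prompt short; asking for more documents than exist simply returns all of them in this sketch.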
[24:20] quickly recapping we import all the
[24:23] relevant data we bring in the CSV file
[24:26] we define the embeddings model from
[24:28] Ollama we check if this location already
[24:31] exists if it doesn't then we're going to
[24:33] prepare all of our data by converting it
[24:35] into documents we're going to initialize
[24:37] the vector store if for some reason this
[24:40] directory already exists then there's no
[24:42] need to add the data but if it doesn't
[24:44] exist then we're going to add this data
[24:46] into the vector store by adding all of
[24:48] our documents this will automatically
[24:50] embed all of the documents for us and
[24:52] add it to the vector store and then we
[24:54] can create this retriever from the
[24:56] vector store which will allow us to grab
[24:58] documents so the last step is to simply
[25:00] use this retriever from our main.py file
[25:03] so we're going to go into main.py we're
[25:05] going to say from vector import
[25:07] retriever because we're just going to
[25:09] import it from the other file and now
[25:11] before we actually invoke this chain we
[25:14] can use the retriever to grab the
[25:16] relevant reviews and then we can pass
[25:18] the reviews as a parameter to our prompt
[25:21] okay so in order to do that we're just
[25:23] going to say reviews is equal to the
[25:26] retriever and then this is going to be
[25:28] .invoke we're just going to invoke
[25:30] this with our question and then we can
[25:32] simply pass the reviews that are
[25:34] returned here to our chain so all that
[25:37] we do here is we just say retriever.invoke
[25:39] invoke we pass the question or like the
[25:42] search string that we want to use to
[25:43] look up the relevant reviews what will
[25:45] happen is the retriever is automatically
[25:48] going to embed that question it's going
[25:51] to go into the vector store it's going
[25:52] to look up all of the relevant reviews
[25:54] using a similarity search algorithm it's
[25:57] going to grab the top five reviews and
[25:59] then it's going to pass this to our
[26:00] chain and then we can print out the
[26:02] result and hopefully we get something
[26:03] meaningful based on those reviews
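The retrieve-then-prompt flow just described can be sketched without any model running. Everything here is a stand-in: `fake_retriever` plays the part of `retriever.invoke(question)`, `fake_llm` plays the part of the chain's model call, and the prompt wording is an assumed template rather than the exact one used in the video.

```python
# Assumed prompt wording with the two placeholders the walkthrough mentions.
PROMPT_TEMPLATE = (
    "You are an expert in answering questions about a pizza restaurant.\n\n"
    "Here are some relevant reviews:\n{reviews}\n\n"
    "Here is the question to answer: {question}"
)


def fake_retriever(question):
    """Hypothetical stand-in for retriever.invoke(question)."""
    return ["Vegan pizza was a hidden gem.", "Vegan cheese was tasteless."]


def fake_llm(prompt):
    """Hypothetical stand-in for the local model; it only reports prompt size."""
    return f"(model reply to a {len(prompt)}-character prompt)"


def answer(question, retrieve=fake_retriever, llm=fake_llm):
    """Fetch reviews first, then place them in the prompt with the question."""
    reviews = retrieve(question)
    prompt = PROMPT_TEMPLATE.format(
        reviews="\n".join(f"- {review}" for review in reviews),
        question=question,
    )
    return prompt, llm(prompt)


prompt, reply = answer("How are the vegan options?")
print(prompt)
print(reply)
```

The point of the ordering is that the model only ever sees the handful of reviews the retriever selected, so the answer is grounded in those reviews rather than in the whole CSV.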
[26:06] let's give this a run now and pray that
[26:08] it works with python and then
[26:12] main.py give this a second to run you
[26:15] can see that it creates this Chroma LangChain
[26:17] DB directory it will take a second
[26:19] because it does need to embed all of our
[26:21] documents and now we can ask a question
[26:23] so I'm going to say how are the you know
[26:28] vegan options if I can spell anything
[26:31] correctly which apparently I cannot okay
[26:33] so let's see what we get here and you
[26:35] can see that it pulls up a few different
[26:37] reviews here and it says based on the
[26:38] reviews provided it appears the vegan
[26:40] options at the pizza restaurant are a
[26:41] mixed bag on the positive side some
[26:43] reviewers have raved about the vegan
[26:45] pizzas saying they're hidden gems okay and
[26:47] it even tells us what document it got
[26:49] this from however not all reviews are
[26:51] glowing one reviewer had a vastly
[26:53] different experience with the vegan
[26:54] cheese option calling it tasteless and
[26:56] then it says overall it seems that the
[26:57] vegan options are hit or miss but
[26:59] there's definitely potential and then it
[27:00] gave us an overall rating of three out of
[27:02] five based on the two positive reviews out
[27:04] of the four total okay cool we can also
[27:07] ask it something like you know how is
[27:10] the ambiance or something I don't know
[27:12] if I spelled that correctly but let's
[27:13] see what it says it said overall I would
[27:15] say the ambiance of the pizza restaurant
[27:16] has an all style no substance feel and
[27:19] apparently they don't like the pizza
[27:20] restaurant based on these reviews but
[27:22] you guys get the idea it is insanely
[27:24] fast it uses the vector store database
[27:26] everything runs completely locally
[27:28] and when we're ready to quit we can hit q and
[27:31] we can exit out this was a simple
[27:34] example that was just meant to
[27:35] demonstrate how you can run LLMs locally
[27:38] on your own computer using your own
[27:39] hardware obviously you can adjust the
[27:41] CSV file and you can make this any type
[27:43] of data that you want it also doesn't
[27:45] need to be CSV data you can just convert
[27:47] anything that you want into documents
[27:49] like I demonstrated here and if you want
[27:51] the code from this video it will be
[27:53] available from the link in the
[27:54] description if you guys enjoyed make
[27:56] sure to leave a like subscribe to the
[27:57] channel and I will see you in the next
[27:59] one
[28:01] [Music]