
How to Build a Local AI Agent With Python (Ollama, LangChain & RAG)

Transcribed Jun 15, 2026
Intermediate 8 min read For: Python developers interested in building local AI applications with RAG.
478.4K views · 10.6K likes · 376 comments · 71 dislikes · 2.3% 📈 Moderate

AI Summary

This tutorial demonstrates how to build a local AI agent using Python, Ollama, LangChain, and Chroma DB for retrieval-augmented generation (RAG). The agent can query a CSV file of restaurant reviews to answer questions, all running locally without any cloud APIs.

[00:00]
Project Overview

Build a local AI agent in minutes using Python, Ollama, LangChain, and Chroma DB for RAG, enabling retrieval from CSV/PDF files.

[00:34]
Demo: Querying Reviews

Agent queries a CSV of fake pizza restaurant reviews to answer questions like 'how is the quality of the pizza?' and 'are there vegan options?'.

[02:02]
Setup: Dependencies

Create a virtual environment, install langchain, langchain-ollama, langchain-chroma, and pandas.

[04:34]
Setup: Ollama Models

Install Ollama, pull llama3.2 for the LLM and mxbai-embed-large for embeddings.

[07:03]
Coding: Basic LLM Chain

Import Ollama LLM, create a chat prompt template, build a chain, and test with a simple question.

[14:52]
Coding: Vector Store Setup

Create a separate file to load CSV, define embeddings, initialize Chroma DB, and create a retriever.

[24:58]
Integration and Testing

Import retriever into main.py, use it to fetch relevant reviews before invoking the LLM chain, and run the interactive loop.

This tutorial shows how to build a fully local AI agent with RAG using Python, Ollama, LangChain, and Chroma DB. The approach can be adapted to any CSV or document data, enabling private, offline question-answering.
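
The basic chain from the [07:03] chapter can be sketched roughly as follows. This is a minimal sketch, not the author's exact file: the template wording is paraphrased from the video, and `render_prompt` and `build_chain` are hypothetical helper names. `build_chain` assumes `langchain-ollama` is installed and an Ollama server is running with `llama3.2` pulled; `render_prompt` only shows what the filled-in prompt looks like and needs nothing beyond the standard library.

```python
# Prompt text paraphrased from the video; the exact wording is an assumption.
TEMPLATE = """You are an expert in answering questions about a pizza restaurant.

Here are some relevant reviews: {reviews}

Here is the question to answer: {question}
"""


def render_prompt(reviews, question):
    """Fill the two placeholders by hand to show what the model receives."""
    return TEMPLATE.format(reviews=reviews, question=question)


def build_chain(model_name="llama3.2"):
    """Return prompt | model. Requires langchain-ollama and a running Ollama."""
    # Imported lazily so render_prompt() works without LangChain installed.
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_ollama.llms import OllamaLLM

    model = OllamaLLM(model=model_name)
    # from_template is the method the video initially forgot to call.
    prompt = ChatPromptTemplate.from_template(TEMPLATE)
    return prompt | model
```

With Ollama running, `build_chain().invoke({"reviews": [], "question": "What is the best pizza place in town?"})` should return a text answer. With an empty review list the model has nothing to ground on, so the answer may be invented, as the video points out.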

Clickbait Check

95% Legit

"Title accurately describes the tutorial; it delivers exactly what it promises."


Tutorial Checklist

1 02:02 Create a new folder and add your CSV file (e.g., realistic_restaurant_reviews.csv).
2 02:43 Create a virtual environment: python -m venv venv, then activate it (Windows: .\venv\Scripts\activate; Mac/Linux: source venv/bin/activate).
3 03:56 Install dependencies: pip install langchain langchain-ollama langchain-chroma pandas (or use requirements.txt).
4 04:34 Download and install Ollama from ollama.com, then pull models: ollama pull llama3.2 and ollama pull mxbai-embed-large.
5 07:03 Create main.py: import Ollama LLM, create ChatPromptTemplate, build chain, and test with a simple invoke.
6 14:52 Create vector.py: load CSV with pandas, define OllamaEmbeddings, initialize Chroma DB, create documents with page_content (title + review) and metadata (rating, date), add to vector store, and create retriever with search_kwargs={'k': 5}.
7 24:58 In main.py, import retriever from vector, before invoking chain call retriever.invoke(question) to get relevant reviews, then pass reviews to chain.invoke().
8 13:37 Wrap the invoke logic in a while True loop to allow multiple questions; break on 'q'.
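
Steps 7 and 8 of the checklist can be sketched as one small function. This is an illustrative sketch rather than the video's exact `main.py`: `answer_loop` is a hypothetical name, and the `ask` and `show` parameters are assumptions added so the loop can be exercised without a terminal. The only requirement on `chain` and `retriever` is an `.invoke()` method, which LangChain runnables provide.

```python
def answer_loop(chain, retriever, ask=input, show=print):
    """Keep answering questions until the user types 'q'.

    chain and retriever only need an .invoke() method; ask and show
    default to input/print and can be swapped out for testing.
    """
    while True:
        show("\n\n-------------------------------")
        question = ask("Ask your question (q to quit): ")
        show("\n\n")
        if question == "q":
            break
        # Fetch the most relevant reviews first, then pass them to the LLM chain.
        reviews = retriever.invoke(question)
        result = chain.invoke({"reviews": reviews, "question": question})
        show(result)
```

In the tutorial's setup this would be called with the chain built in `main.py` and the retriever imported from `vector.py`.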

Study Flashcards (8)

What is the purpose of the embedding model in this project?

medium

To convert text into vectors for efficient similarity search in the vector database.

16:10

Which Ollama models are used in this tutorial?

easy

llama3.2 for the LLM and mxbai-embed-large for embeddings.

05:55

What does the retriever's search_kwargs={'k': 5} parameter do?

medium

It specifies that the retriever should return the top 5 most relevant documents.

23:47

How do you activate the virtual environment on Windows?

easy

Type .\venv\Scripts\activate

03:20

What is the role of Chroma DB in this project?

medium

It acts as a local vector database to store and retrieve document embeddings.

15:08

What three things are passed to the Document object when creating documents?

hard

page_content, metadata, and id.

19:44

Why is the vector store persisted to disk?

medium

To avoid re-embedding the documents every time the program runs.

22:17

What command lists all available Ollama models?

easy

ollama list

09:58
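
Several of the cards above (the three `Document` fields, persisting the store, and `k=5`) meet in `vector.py`. The sketch below shows one way that file might be organized; it is not the author's exact code. The column names `Title`, `Review`, `Rating`, and `Date`, the `str(index)` id, the `collection_name`, and the helper names are all assumptions made for illustration. `build_retriever` needs `pandas`, `langchain-ollama`, `langchain-chroma`, and a running Ollama with `mxbai-embed-large` pulled.

```python
import os


def row_to_document_fields(index, row):
    """Map one CSV row to the three things a Document receives."""
    return {
        # Text that is embedded and searched: title plus review body.
        "page_content": f"{row['Title']} {row['Review']}",
        # Extra data returned with the document but not used for the search.
        "metadata": {"rating": row["Rating"], "date": row["Date"]},
        # Assumption: the row index as a string serves as the id.
        "id": str(index),
    }


def needs_indexing(db_location):
    """True when no persisted store exists yet, so rows must be embedded."""
    return not os.path.exists(db_location)


def build_retriever(csv_path, db_location="./chroma_langchain_db", k=5):
    # Lazy imports keep the helpers above usable without these packages.
    import pandas as pd
    from langchain_chroma import Chroma
    from langchain_core.documents import Document
    from langchain_ollama import OllamaEmbeddings

    # Check before constructing Chroma, since that may create the folder.
    add_documents = needs_indexing(db_location)
    embeddings = OllamaEmbeddings(model="mxbai-embed-large")
    vector_store = Chroma(
        collection_name="restaurant_reviews",  # hypothetical name
        persist_directory=db_location,
        embedding_function=embeddings,
    )
    if add_documents:
        df = pd.read_csv(csv_path)
        documents, ids = [], []
        for i, row in df.iterrows():
            fields = row_to_document_fields(i, row)
            documents.append(Document(**fields))
            ids.append(fields["id"])
        vector_store.add_documents(documents=documents, ids=ids)
    return vector_store.as_retriever(search_kwargs={"k": k})
```

Because the existence check runs only once per start, deleting the database folder is the simplest way to force a re-embed after the CSV changes.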

💡 Key Takeaways

🔧

Local AI Agent with RAG

Demonstrates a fully local, free alternative to cloud-based AI agents using RAG.

💡

Vector Search Explanation

Clearly explains how vector search enables efficient retrieval of relevant documents.

15:08
🔧

Document Structure for RAG

Shows how to structure documents with page_content and metadata for effective retrieval.

19:44
🔧

Retriever Configuration

Demonstrates configuring the retriever to return a specific number of documents (k=5).

23:47
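
To make the vector search and `k=5` takeaways concrete, here is a toy, standard-library-only sketch of top-k retrieval. It is a conceptual illustration, not how Chroma is implemented: the two-dimensional vectors are made up, real embeddings from `mxbai-embed-large` have far more dimensions, and Chroma's actual index and distance metric may differ from the cosine similarity used here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed, k=5):
    """Return the k texts whose vectors are most similar to the query.

    indexed is a list of (text, vector) pairs.
    """
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up vectors: first axis is "about vegan food", second is "about parking".
reviews = [
    ("vegan pizza was great", [1.0, 0.0]),
    ("parking was hard", [0.0, 1.0]),
    ("vegan options available", [0.9, 0.1]),
]
print(top_k([1.0, 0.0], reviews, k=2))
```

A larger `k` gives the LLM more context at the cost of a longer prompt; a smaller `k` keeps the prompt short but may miss relevant reviews.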

✂️ Creator Tools: Viral Hooks

AI-generated clip ideas for Shorts based on the transcript

Build a Local AI Agent in Minutes

40s

Shows a quick demo of a working AI agent that answers questions about pizza reviews, hooking viewers with immediate value.


Setup: Install Dependencies & Ollama

60s

Provides a clear, step-by-step setup guide for running AI models locally, appealing to developers who want to avoid cloud costs.


Code a Local LLM in Python

60s

Demonstrates writing actual Python code with LangChain and Ollama, showing how easy it is to get a local LLM running.


Add RAG with ChromaDB Vector Search

60s

Explains retrieval-augmented generation (RAG) in a simple way, showing how to make the AI answer based on custom data.


Live Demo: AI Answers Pizza Questions

60s

Runs the final agent and asks real questions about vegan options and ambiance, proving the system works and delivering satisfying results.


[00:00] in this video I'll be showing you how to

[00:01] build a local AI agent in just a few

[00:04] minutes using Python we'll be using Ollama

[00:07] LangChain and something called Chroma

[00:09] DB to act as our vector search database

[00:12] because I'm going to show you how to add

[00:13] retrieval augmented generation into this

[00:16] app that essentially means we can

[00:17] retrieve relevant information from

[00:19] something like a CSV file or a PDF and

[00:22] bring that into our model now all of

[00:24] this is completely free you don't need

[00:26] an OpenAI account you don't need a Claude

[00:28] account or something you can do this all

[00:30] from your local computer so let me show

[00:32] you how to set it all up so I'm just

[00:34] going to show you a quick demo of the

[00:35] finished product and then we'll get into

[00:37] the tutorial now you can see on the

[00:39] right hand side of my screen here that I

[00:40] just opened up a CSV file this CSV file

[00:43] just contains some fake reviews for a

[00:45] random pizza restaurant so we have title

[00:47] date rating and review and you can see

[00:49] something like best pizza in town here's

[00:51] the date here's the ID or sorry the

[00:53] rating of the review out of five and

[00:55] then you have what the actual review is

[00:57] and there's kind of some information now

[00:59] what I'm going to show you is how to

[01:00] build an AI agent here that can actually

[01:02] go and look up relevant reviews from

[01:05] this document to answer questions about

[01:07] the restaurant I don't know about you

[01:08] guys but whenever I go to a new place I

[01:10] always look at the reviews and typically

[01:12] I'm looking for an answer to my

[01:13] particular question so this can kind of

[01:15] do that for you so for example maybe I

[01:17] want to know you know how is the quality

[01:21] of the pizza okay well what it can do is

[01:23] then go to this document find the

[01:25] relevant reviews which you can see it

[01:26] kind of pulls into here and analyzes and

[01:29] then it gives me a conclusion overall

[01:31] without more data or context it's

[01:32] challenging to give a definitive score

[01:33] on the pizza based solely on the reviews

[01:35] however they do suggest a restaurant

[01:37] with potentially room for improvement in

[01:38] presentation and overall consistency so

[01:40] there you go right I could ask something

[01:42] like are there vegan options let's see

[01:47] what that gives us and you can see here

[01:49] in conclusion based on the reviews there

[01:51] appear to be at least one vegan pizza

[01:53] and possibly more vegan options

[01:54] available Okay cool so that's what we're

[01:56] going to build this isn't going to be

[01:57] super complicated it'll be pretty fast

[01:59] so stick around and let me show

[02:00] you how we make it all right so we have

[02:02] a few quick setup steps here and then we

[02:04] can dive right into the code now the

[02:06] first thing that we're going to need is

[02:07] obviously some kind of CSV file now you

[02:09] can use anything that you want and I'll

[02:11] show you how to adjust this code for

[02:13] your own example but if you want to

[02:14] download the CSV file that I'm using

[02:16] I'll leave a link to it in the

[02:18] description and in fact all of the code

[02:19] will be available from the GitHub so you

[02:21] can go to the GitHub and you can

[02:22] download this CSV file and just bring it

[02:25] into a new folder in VS Code so to

[02:27] begin open up some kind of code editor

[02:29] I'm using VS Code create a new

[02:30] folder you can see I have one called

[02:32] local AI agent bring in the CSV file and

[02:35] then I also created this

[02:36] requirements.txt file which just has the

[02:39] three things that we're going to need to

[02:41] install in Python so let's get started

[02:43] with that installing our Python

[02:45] dependencies and then I'll show you the

[02:46] next steps so what we need to do is open

[02:48] up our terminal again I'm inside of the

[02:51] directory that I want to write code in

[02:52] for this video and what I'm going to do

[02:54] is create a virtual environment so to do

[02:56] that I'm going to type python -m venv

[03:00] and then venv if you're on Mac or Linux

[03:03] you can change this to python3 and what

[03:05] this will do is create a new isolated

[03:07] environment that we can install various

[03:09] dependencies into if you don't know

[03:11] anything about virtual environments and

[03:12] you want to learn more I'll leave a

[03:14] video on screen now that the virtual

[03:16] environment has been created we need to

[03:17] activate it to activate it if you're on

[03:20] Windows is going to be dot slash the

[03:22] name of the virtual environment slash

[03:24] and then scripts with a capital S and

[03:26] then slash activate when you type that

[03:29] you should see that you get the name of

[03:31] the virtual environment as a prefix

[03:33] before your command line now if you are

[03:35] on Mac or Linux then the command is

[03:37] going to be source venv and then this is

[03:39] going to be /bin/activate okay so it's

[03:44] different if you're on Windows it's this

[03:45] one and if you are Mac or Linux it's

[03:47] going to be this one and again I'll

[03:48] leave a video on screen that we'll go

[03:50] through this more in depth now that we

[03:52] have the virtual environment activated

[03:54] what we're going to do is install the

[03:56] various dependencies inside of here now

[03:58] if you have this requirements.txt file

[04:00] then you can say pip install -r and

[04:04] then you can do requirements.txt and

[04:07] this will install all of the

[04:08] requirements into our virtual

[04:10] environment however if you don't have

[04:12] the requirements.txt file you can just

[04:14] type them out so you can just install

[04:16] langchain you can install langchain

[04:19] -ollama and you can install langchain-

[04:22] chroma like that okay so we just need to

[04:25] install these dependencies in order to

[04:26] be able to use these in Python so that's

[04:28] going to take a second installing all of

[04:30] those dependencies for us and then once

[04:31] that's done I'll be right back okay so

[04:34] those are installed and the next thing

[04:35] that we're going to need to get is

[04:36] something called Ollama now Ollama allows

[04:39] us to run models locally on our own

[04:41] computer using our own Hardware so

[04:43] that's why we're able to do everything

[04:44] locally here rather than having to use

[04:46] something like an openai API key so

[04:49] please go to this page just ollama.com if

[04:52] you don't already have this software and

[04:53] simply download it once you download it

[04:56] what you should be able to do is just

[04:57] open up some kind of terminal or command

[04:59] prompt and then type the command ollama

[05:02] if you have any issues with this again

[05:04] I'll put another video on screen that

[05:05] walks through Ollama in depth and will

[05:07] show you how to set this up but once we

[05:09] have Ollama installed on our computer what

[05:12] we're going to do is install an Ollama

[05:14] model now Ollama again it's just this open

[05:16] source software and allows us to pull

[05:18] various models to our own computer and

[05:20] then run them using our own Hardware now

[05:22] depending on the type of Hardware you

[05:24] have that will dictate the models you'll

[05:26] be able to run for example you probably

[05:28] can't run a 200 gigabyte model if you don't

[05:30] have a graphics card in your computer so

[05:32] I'm going to show you a few models that

[05:33] should work on most machines if you have

[05:36] a graphics card if you don't have a

[05:37] graphics card and you just have a CPU

[05:39] there's some very small models that you

[05:41] can download and use but obviously the

[05:43] performance won't be as good so what you

[05:45] can do is you can actually go to the

[05:46] Ollama library I'll leave this link in

[05:49] the description and you can see that

[05:50] there's various different models and it

[05:52] kind of shows you all of the options

[05:53] that they have now we're going to pull

[05:55] two models to our computer we're just

[05:57] going to pull llama 3.2 so this is kind

[06:00] of a smaller model that we can use that

[06:02] performs pretty well and then we're

[06:04] going to pull an embedding model and

[06:05] I'll show you the name of that in 1

[06:07] second which we'll use to embed the

[06:09] documents that we add into our Vector

[06:11] store if that means nothing to you don't

[06:13] worry just follow along with the next

[06:14] steps okay so we're going to go into our

[06:16] terminal and again we're going to make

[06:18] sure that ollama command works and then

[06:20] we're going to type ollama pull and we're

[06:22] going to start by pulling the model

[06:24] llama3.2 now you can pull any model

[06:27] that you want you can choose you can go

[06:28] look at the directory but I'm going to

[06:30] go with 3.2 once that's done okay you

[06:32] can see it's here because I already had

[06:33] it downloaded then we can move on to the

[06:35] next one now the next model that we're

[06:37] going to pull is going to be an

[06:38] embedding model now this embedding model

[06:40] is going to be mxbai and then this is

[06:44] going to be -embed-large there's

[06:47] various other embedding models you can

[06:49] use but this is the one we'll use for

[06:50] this video okay so we're going to go and

[06:52] hit enter and then again downloaded to

[06:54] our computer these are not super big so

[06:56] you should be able to run them on your

[06:58] computer if you have any kind of GPU all

[07:00] right so now that we have these models

[07:01] we're good to start writing some code so

[07:03] I'm going to go back into VS Code I'm

[07:05] going to make a file called

[07:07] main.py and in this file I'm going to

[07:10] start writing some code now you'll

[07:11] notice that I actually get this

[07:12] autocomplete here this is coming from

[07:14] GitHub Copilot you know that really

[07:16] cool assistant that replaces a lot of

[07:18] your manual typing work they've actually

[07:20] sponsored this video and speaking of

[07:22] Microsoft's GitHub Copilot I was

[07:24] fortunate enough to have them sponsor a

[07:26] video a few weeks ago on AI agents and

[07:29] today's video where I promise to

[07:30] highlight some of the standout ways that

[07:32] developers are using GitHub Copilot

[07:34] that you guys submitted with the coding

[07:36] with Copilot hashtag so let's get into

[07:38] it check out these examples of how

[07:40] developers are using GitHub Copilot

[07:42] like Emy who created an entire Flutter

[07:44] mobile app tug Duel who created a Python

[07:47] script to resize and save images Adrien

[07:49] who used Copilot as a beginner when he

[07:51] was working in Jupyter to learn better

[07:53] ways to write functions and Yousef who

[07:55] uses it to avoid manually writing

[07:57] tedious documentation now personally I use

[08:00] GitHub Copilot every single time I open

[08:02] up VS Code and it's insane how well it

[08:04] can predict what I want to do next and

[08:06] save me tons of hours of manual typing

[08:08] it's literally like it can read my mind

[08:10] now I'm sure that you guys have more

[08:12] stories on how you're using GitHub

[08:14] Copilot so please share them with me

[08:16] using the coding with Copilot hashtag

[08:18] because I'm excited to check them out

[08:20] now with that said let's get back to the

[08:22] video all right so back into the code

[08:23] editor here let's go ahead and get

[08:25] started now we're going to begin by just

[08:27] importing a few things so we're going to

[08:29] say from

[08:31] langchain_ollama.llms import OllamaLLM we're

[08:36] then going to say from

[08:40] langchain_core.prompts import ChatPromptTemplate

[08:45] okay now if you're unfamiliar

[08:47] with LangChain this is a framework that

[08:49] just makes it a lot easier for us to

[08:50] work with LLMs it's very popular in

[08:53] Python and it has all of these

[08:55] extensions like the Ollama extension that

[08:57] allows us to directly use our Ollama

[08:59] models and by the way what will happen

[09:02] is Ollama should be running in the

[09:04] background on your computer and it's

[09:05] going to expose a server or like an HTTP

[09:09] REST API that we'll be able to

[09:11] communicate with from our program so

[09:13] when you pull these models they are

[09:15] actually running on your own computer

[09:17] and we can trigger Ollama to utilize these

[09:19] models from code in Python or we can

[09:22] actually just do it directly from the

[09:23] command line so everything that I'm

[09:25] showing you here will run 100% locally

[09:27] on your own computer even though it

[09:29] might not necessarily feel like that it

[09:30] also means it'll be pretty fast okay so

[09:34] after this we're going to specify our

[09:35] model now I'm going to show you in this

[09:37] code snippet here how to utilize an Ollama

[09:39] model like quite quickly and then we'll

[09:41] start connecting some more complexity to

[09:43] it with the vector database and I'll

[09:45] talk about what that means so I'm going

[09:46] to say model is equal to OllamaLLM and then

[09:49] inside of here I need to specify the

[09:52] specific model from Ollama that I want to

[09:54] use now if you're confused on what to

[09:56] put here you can open up your command

[09:58] prompt you can type ollama list like

[10:01] this and it will show you the models

[10:03] that you have available so you can see

[10:05] that I have this embedding model I have

[10:06] llama3.2 I have mistral I have llama2

[10:09] so any of these models I can use so what

[10:12] I'm going to do is just copy llama3.2

[10:14] you don't need the :latest part of it you

[10:16] can just do the original name and you

[10:18] can put it right here okay so I'm going

[10:20] to use OllamaLLM with model equal to

[10:22] llama3.2 and now I can start utilizing

[10:25] this model and kind of invoking it so

[10:28] next what we're going to do is is we're

[10:29] going to create a template and this

[10:31] template is going to be just a string

[10:34] and inside of this string we're just

[10:35] going to specify what we want the model

[10:37] to actually do so we're going to say

[10:39] something like you are an

[10:41] expert in answering questions about a

[10:46] pizza restaurant okay here are some

[10:51] relevant reviews and then we're just

[10:53] going to put inside of a variable here

[10:56] reviews and say here is the question to

[11:00] answer okay and then we're going to put

[11:02] a question perfect then what we're going

[11:05] to do is we're going to say our prompt

[11:07] is equal to a ChatPromptTemplate we're

[11:10] going to pass our template and actually

[11:12] we don't need to pass the model I don't

[11:14] know why it's doing that and now we've

[11:15] created a chat prompt template where

[11:17] we'll be able to pass in a reviews

[11:19] variable and a question variable and

[11:21] then the model can respond to that okay

[11:24] then we're going to create a chain so

[11:26] with the chain we can say prompt and

[11:28] then we can put a pipe and then we can

[11:30] put model now what this allows us to do

[11:32] is essentially invoke this entire chain

[11:35] that can combine multiple things

[11:37] together to run our llm so first what

[11:40] we'll do is we'll pass variables reviews

[11:42] and question into this prompt this chat

[11:45] prompt template that we just created and

[11:47] then that will automatically get passed

[11:48] to our model because we put it inside of

[11:51] this chain and then it will return to us

[11:53] whatever the answer is so if we want to

[11:55] test this out really quickly because

[11:57] this is literally all we need in

[11:58] order to do this we can say

[12:01] chain.invoke and then inside of a

[12:04] python dictionary we need to specify the

[12:06] two variables that we had inside of this

[12:09] prompt so we're going to have reviews

[12:11] and then question okay so we'll start

[12:13] with reviews and for now we can just

[12:16] make this an empty list and then we can

[12:18] say question and something like what is

[12:20] the best pizza place in town that might

[12:22] not necessarily make sense because this

[12:23] is just about one pizza place but I just

[12:25] want to show you a quick demo so we're

[12:27] going to say result is equal to this and

[12:30] then we're going to go down and we're

[12:31] going to say print result okay so we can

[12:35] just test this out and make sure that

[12:36] it's working and it should go ahead and

[12:38] invoke our Ollama LLM and give us some

[12:41] kind of response so let's go here and

[12:43] run this we can do that by typing python

[12:45] the name of our file which is main.py or

[12:48] python3 main.py so I'm going to hit

[12:50] enter give this a second to run and we

[12:53] got an error some kind of formatting

[12:54] issue so let's see what the problem is

[12:57] okay so silly mistake here what we

[12:58] actually need to say ChatPromptTemplate

[13:00] .from_template I forgot to

[13:03] specify this method so of course that

[13:05] was giving us an issue so let's go back

[13:07] here and fix that quickly python main.py

[13:10] and we should see that this works now

[13:12] give it a second and you can see it says

[13:14] based on our customer feedback and

[13:15] ratings I would highly recommend this

[13:16] the top rated pizza place One reviewer

[13:18] mentioned blah blah blah blah in fact

[13:20] our own team has sampled their pizza so

[13:22] it just came up with something random

[13:23] here because I didn't actually give it

[13:25] any reviews so it's kind of

[13:26] hallucinating the response but you get

[13:28] the idea okay it did actually work we

[13:30] were able to use Ollama and we got a

[13:32] response from the model which is really

[13:34] just the point of what we were testing

[13:35] here okay so now what we're going to do

[13:37] is we're just going to put this inside

[13:39] of a while loop so essentially we can just

[13:41] keep asking it questions and then we're

[13:42] going to set up the vector search so we

[13:44] can actually get a relevant response so

[13:46] let's set up a simple loop here we're

[13:48] just going to say while True then we're

[13:50] going to ask a question so we're going

[13:51] to say question is equal to input so we

[13:55] can get some input from the user and

[13:56] we'll say you know ask

[13:59] your

[14:00] question and then we're just going to

[14:02] put a set of parentheses here and say Q

[14:04] to quit so if they type Q then we can

[14:07] quit we're going to say if the question

[14:09] is equal to Q then we are going to break

[14:13] otherwise we can invoke this chain so

[14:15] we're going to say result is equal to

[14:18] chain.invoke okay and for the question

[14:21] we'll just put the question the user

[14:22] asked so we'll replace this with

[14:25] question and then we can print the

[14:27] result now we also can just have a few

[14:29] kind of formatting variables here so I'm

[14:30] just going to say print and I'm just

[14:32] going to print kind of a big line with a

[14:34] few \n characters and then same

[14:37] thing here I'm just going to print a few

[14:40] \n characters so we can kind of read

[14:42] what's happening okay so we don't need

[14:44] to test this but this will just allow us

[14:46] to continue to ask questions until we

[14:48] type in Q now what I want to do is show

[14:50] you how to set up the vector search all

[14:52] right so we're going to create a new

[14:53] file here called vector.py you can call

[14:56] this anything that you want and here's

[14:58] where we're going to write the logic for

[15:00] actually embedding our documents and

[15:02] then looking them up or vectorizing our

[15:04] documents now in case you're unfamiliar

[15:06] with Vector search this essentially is

[15:08] going to be a database it's going to be

[15:10] hosted locally on our own computer using

[15:12] something called Chroma DB which we

[15:14] installed earlier and this is going to

[15:16] allow us to really quickly look up

[15:18] relevant information that we can then

[15:20] pass to our model and then our model can

[15:23] use that data to give us some more

[15:25] contextually relevant replies so

[15:27] obviously llms are really good at kind

[15:29] of synthesizing text and giving us

[15:31] responses but usually they don't have

[15:33] the correct data so in this case what

[15:35] we're going to do is we're going to take

[15:36] this entire CSV file we're going to put

[15:38] it inside of this Vector enabled

[15:41] database and then as soon as we ask a

[15:43] question we're going to look up the

[15:45] relevant documents in that database

[15:47] we're going to pass those to the llm as

[15:49] a list of reviews and then it will be

[15:51] able to search through those reviews and

[15:53] answer our question okay so that's like

[15:55] the very Basics on Vector search let me

[15:57] show you how we do that so we're going

[15:59] to say from langchain_ollama

[16:03] we're going to

[16:04] import the OllamaEmbeddings okay now one

[16:08] thing that we need when we do this

[16:10] vectorization process is an embedding

[16:12] model this model will be able to take

[16:14] text and convert it into a vector this

[16:16] is essentially numbers that we can then

[16:18] use to look up data really efficiently

[16:21] next we're going to say from langchain

[16:24] and then _chroma and we're

[16:26] going to import Chroma like this which

[16:28] is going to be our vector store

[16:30] we're then going to say from langchain

[16:33] _core.documents import Document

[16:38] we're going to create documents and then

[16:40] pass these to our uh what do you call it

[16:42] Chroma database we're then going to

[16:44] import os and we're going to import

[16:46] something that I forgot to install

[16:47] before which is pandas as pd okay now

[16:51] pandas is a library that we can use to

[16:52] really easily read in our CSV file so

[16:55] just quickly before I forget we do need

[16:57] to install this so same as before we're

[16:59] going to type pip install pandas in our

[17:02] virtual environment and then we should

[17:04] install that dependency and be able to

[17:06] use it I'll also add it to the

[17:07] requirements.txt file so if you guys

[17:09] were to have downloaded this before you

[17:11] would already have it okay so pandas is

[17:13] installing we can just wait for that to

[17:15] run and start writing some more code so

[17:17] first things first we're going to load

[17:18] in our CSV file we're going to use the

[17:20] data in the CSV file for our Vector

[17:23] store so of course we're going to need

[17:24] the data so we're going to say DF

[17:26] standing for data frame and this is

[17:27] going to be pd.read_csv and we're

[17:31] going to read in the realistic_

[17:35] restaurant_

[17:37] reviews.csv and obviously you know read

[17:40] in whatever the name of your CSV file is

[17:43] I think that I spelled that correctly

[17:45] although maybe not restaurant let's see

[17:48] you know what we can just do this rename

[17:50] copy and then paste here to avoid any

[17:53] misspellings okay anyway so we have our

[17:55] data frame here next we're going to

[17:57] bring in the embedding model so we're

[17:58] going to say embeddings is equal to the

[18:02] OllamaEmbeddings and then we're going to

[18:04] say model is equal to and then the name

[18:06] of the model that we installed which is

[18:08] mxbai-embed-

[18:13] large okay now after that we're going to

[18:15] specify the location where we want to

[18:17] store our vector database so I'm going

[18:19] to say ./ and then

[18:22] chroma_langchain and then this is going

[18:25] to be _db you can call this anything

[18:27] that you want but this is just going to

[18:28] be a folder where we store our uh database

[18:31] okay next after that we're going to say

[18:34] add_documents is equal to and then

[18:37] we're going to say not os.path.exists

[18:41] and then the database location now what

[18:43] I want to do is I want to check and see

[18:44] if this database already exists if it

[18:47] does that means that I've already

[18:48] performed the process of converting the

[18:50] CSV file into vectors and adding into

[18:53] the database if it doesn't exist then it

[18:55] means that I need to do that okay so we

[18:57] don't need to keep doing this every

[18:58] single time we can just one time

[19:00] vectorize our data and then once it's

[19:02] vectorized and it's in the database we

[19:04] don't need to do that again we can just

[19:05] start using it so below here I'm going

[19:07] to say if add documents so if we do

[19:10] actually need to add them then we're

[19:11] going to say the following we're going

[19:13] to say documents is equal to an empty

[19:15] list and we're going to say IDs is equal

[19:17] to an empty list as well then what we're

[19:19] going to do is we're going to iterate

[19:21] through our rows so we're going to say for

[19:23] i comma row in df.iterrows this is

[19:26] simply going to go row by row through

[19:28] our CSV file and then allow us to access

[19:31] the various entries now what we're going

[19:33] to do is we're going to create

[19:34] individual documents we're going to add

[19:36] them to the documents list and then

[19:37] we're going to add them to our vector

[19:39] store okay so we're going to say

[19:41] document is equal to Document and inside

[19:44] of this Document we need to pass three

[19:46] things we need to pass a page_content

[19:49] and this page_content is going to be

[19:52] what we will actually be vectorizing and

[19:54] what we'll be looking up so if you

[19:56] wanted to adjust this for your own

[19:58] example any of the content that you want

[20:00] to use to actually look up the

[20:02] information in the database that needs

[20:04] to go in the page_content so what we're

[20:05] going to do is we're going to combine

[20:08] the title of the review with the review

[20:11] itself so that we have a bunch of

[20:12] information to be able to actually query

[20:14] our data okay there's all kinds of

[20:16] different things you can do here but you

[20:18] want to include the important

[20:19] information that you'll be querying

[20:20] based on in the page_content so we're

[20:23] going to take row["Title"] and then

[20:26] we're going to say plus a space and then

[20:28] row["Review"] okay then we're going to

[20:31] specify some metadata Copilot's already

[20:33] doing it for me so we have metadata and

[20:35] then rating and that's row["Rating"] and

[20:37] then we're going to have date and then

[20:39] this is going to be

[20:42] row["Date"] okay so the metadata is just

[20:46] additional information that we will grab

[20:48] along with the document but we won't be

[20:51] querying based on the metadata okay

[20:53] so hopefully that makes sense again just

[20:55] additional data that will be included

[20:57] with the document but it won't

[20:58] necessarily be used to actually query
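As a rough sketch of the two fields just described, the helpers below build the searchable text and the metadata for a single row. The column names `Title`, `Review`, `Rating`, and `Date` are assumptions about the CSV header, and the sample row is made up. A plain dict stands in for the pandas row, which supports the same `row[...]` indexing. `Document` is imported inside `to_document` so the first two helpers can be tried without LangChain installed.

```python
def page_content_for(row):
    """Text that gets embedded and searched: review title plus review body."""
    return row["Title"] + " " + row["Review"]


def metadata_for(row):
    """Extra fields returned alongside a match but not used for the lookup."""
    return {"rating": row["Rating"], "date": row["Date"]}


def to_document(row):
    # Imported here so the helpers above work without LangChain installed.
    from langchain_core.documents import Document

    return Document(
        page_content=page_content_for(row),
        metadata=metadata_for(row),
    )


# Hypothetical sample row; the column names are assumed, not read from the real CSV.
sample_row = {
    "Title": "Great crust",
    "Review": "Crispy base and fresh toppings.",
    "Rating": 5,
    "Date": "2024-01-15",
}

print(page_content_for(sample_row))
print(metadata_for(sample_row))
```

Anything you want a question to be able to match has to end up in `page_content_for`; moving a field into `metadata_for` keeps it attached to the result without affecting the search.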

[21:00] and then lastly we can specify an ID so

[21:03] we're going to say the ID is the string

[21:05] of i which is just the index of this

[21:07] row in the

[21:10] CSV file and just make

[21:12] sure that you convert this to a string

[21:14] okay so I think that should be good for

[21:16] now after this what we're going to do is

[21:18] we're going to say ids.append and

[21:20] we're going to append str(i) and then

[21:22] we're going to say documents.append and

[21:25] we're going to append our document now

[21:28] the reason why we need to store the IDs

[21:30] is because when we actually create this

[21:31] data in the vector store for some reason

[21:33] we need two separate lists we need a

[21:36] list of documents and then we need a

[21:37] list of their associated IDs in case for

[21:39] some reason they're different so I know

[21:41] it seems a bit weird that we have the ID

[21:43] twice but just follow along because we

[21:44] need that for this process okay so now

[21:47] we've kind of prepared the data in

[21:49] documents and the next thing we need to

[21:50] do is add this to the vector store so we

[21:53] need to create the vector store so we're

[21:54] going to say vector_store is equal to

[21:58] Chroma and then inside of Chroma we're

[22:01] going to specify the location and the

[22:03] collection name so we're going to say

[22:05] collection_name is equal to

[22:07] restaurant_reviews we're going to say the

[22:09] persist_directory Copilot is leading

[22:12] me wrong here is going to be equal to

[22:14] the db_location now this just means that

[22:17] we'll store it persistently rather than

[22:18] just storing it in memory you don't need

[22:21] to do this but I recommend that you do

[22:23] store this permanently so that you don't

[22:24] need to keep regenerating this Chroma

[22:26] database and then lastly we need to

[22:28] pass the embedding_function which will

[22:30] be equal to our embeddings from Ollama
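The setup described up to this point can be sketched roughly as follows. The folder name and the underscore spelling of the collection name are taken from how they are spoken in the video, so treat both as assumptions. The LangChain imports sit inside `open_vector_store` so the existence check can be tried on its own without Ollama or Chroma installed.

```python
import os

DB_LOCATION = "./chroma_langchain_db"   # folder name as spoken; any path works
COLLECTION_NAME = "restaurant_reviews"  # assumed spelling of the spoken name


def needs_indexing(db_location):
    """True when the database folder does not exist yet, so the CSV still has to be embedded."""
    return not os.path.exists(db_location)


def open_vector_store(db_location=DB_LOCATION):
    # Imported here so needs_indexing() works without these packages installed.
    from langchain_chroma import Chroma
    from langchain_ollama import OllamaEmbeddings

    embeddings = OllamaEmbeddings(model="mxbai-embed-large")
    return Chroma(
        collection_name=COLLECTION_NAME,
        persist_directory=db_location,  # keep the data on disk between runs
        embedding_function=embeddings,
    )
```

A likely usage is `add_documents = needs_indexing(DB_LOCATION)` followed by `vector_store = open_vector_store()`. The order matters: the check should run before the store is opened, because opening a persistent store is expected to create the folder.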

[22:34] okay so we're using all of this stuff

[22:36] locally we have the Chroma DB locally we

[22:37] have the local embeddings model and now

[22:40] we have the vector store next I'm going

[22:42] to do a quick if statement and I'm going

[22:43] to say if add_documents then we're going

[22:45] to say vector_store.add_documents and

[22:48] then this is going to be documents is

[22:51] equal to documents and ids is equal to

[22:55] ids okay so this is how you add this you

[22:58] just say vector_store.add_documents you

[23:00] specify the documents that you want to

[23:02] add which we've already prepared here

[23:04] and then you specify the corresponding

[23:05] IDs and we're only doing that if this

[23:09] did not already exist because if it did

[23:11] already exist then we don't need to add

[23:13] the documents right and we wouldn't have

[23:14] already prepared this data hopefully

[23:16] that makes sense but that essentially

[23:18] will create the vector store for us and

[23:20] automatically add the data last thing

[23:22] we're going to do is we're going to make

[23:24] this vector store be usable by our LLM

[23:27] so I'm going to show you how to do that

[23:29] we're going to say

[23:31] retriever okay is equal to the

[23:35] vector_store.as_retriever

[23:36] okay now inside of here

[23:40] there's a few parameters that we can

[23:42] pass for example we can specify the

[23:44] number of documents that we wanted to

[23:45] look up so I'm going to say

[23:47] search_kwargs is equal to k and then

[23:50] five now when I do this what's going to

[23:53] happen is it's going to look up five

[23:54] relevant reviews and then pass those

[23:57] five reviews to the LLM now if we

[23:59] wanted 10 reviews we would make this 10

[24:01] if we wanted one review we would make

[24:03] this one you can specify as many or as

[24:05] few as you want obviously minimum of one

[24:07] but I'm going to go with five okay so

[24:10] now we have the retriever and what this

[24:13] retriever will allow us to do is look up

[24:15] documents then we can pass those

[24:17] documents into the prompt for our LLM so
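A minimal sketch of the last two steps, assuming `vector_store` is the Chroma store created earlier and `documents` is the prepared list. The function and helper names are made up for illustration; `add_documents` and `as_retriever` are the two calls named in the walkthrough.

```python
def string_ids(indices):
    """Row indices converted to strings, kept as a list parallel to the documents."""
    return [str(i) for i in indices]


def index_and_get_retriever(vector_store, documents, ids, add_documents, k=5):
    # Only embed and store the documents on the first run, when the
    # database folder did not exist yet.
    if add_documents:
        vector_store.add_documents(documents=documents, ids=ids)
    # k is the number of reviews handed to the LLM for each question.
    return vector_store.as_retriever(search_kwargs={"k": k})
```

Raising `k` gives the model more reviews to draw on at the cost of a longer prompt; lowering it keeps the prompt short but may miss relevant reviews.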

[24:20] quickly recapping we import all the

[24:23] relevant data we bring in the CSV file

[24:26] we define the embeddings model from

[24:28] Ollama we check if this location already

[24:31] exists if it doesn't then we're going to

[24:33] prepare all of our data by converting it

[24:35] into documents we're going to initialize

[24:37] the vector store if for some reason this

[24:40] directory already exists then there's no

[24:42] need to add the data but if it doesn't

[24:44] exist then we're going to add this data

[24:46] into the vector store by adding all of

[24:48] our documents this will automatically

[24:50] embed all of the documents for us and

[24:52] add it to the vector store and then we

[24:54] can create this retriever from the

[24:56] vector store which will allow us to grab

[24:58] documents so the last step is to Simply

[25:00] use this retriever from our main.py file

[25:03] so we're going to go into main.py we're

[25:05] going to say from vector import

[25:07] retriever because we're just going to

[25:09] import it from the other file and now

[25:11] before we actually invoke this chain we

[25:14] can use the retriever to grab the

[25:16] relevant reviews and then we can pass

[25:18] the reviews as a parameter to our prompt

[25:21] okay so in order to do that we're just

[25:23] going to say the reviews is equal to the

[25:26] retriever and then this is going to be

[25:28] .invoke we're just going to invoke

[25:30] this with our question and then we can

[25:32] simply pass the reviews that are

[25:34] returned here to our chain so all that

[25:37] we do here is we just say

[25:39] retriever.invoke we pass the question or like the

[25:42] search string that we want to use to

[25:43] look up the relevant reviews what will

[25:45] happen is the retriever is automatically

[25:48] going to embed that question it's going

[25:51] to go into the vector store it's going

[25:52] to look up all of the relevant reviews

[25:54] using a similarity search algorithm it's

[25:57] going to grab the top five reviews and

[25:59] then it's going to pass this to our

[26:00] chain and then we can print out the

[26:02] result and hopefully we get something

[26:03] meaningful based on those reviews so
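The flow in main.py might look roughly like this. The prompt variable names `reviews` and `question` are assumptions, since the prompt template is defined earlier in the video, and `retriever` and `chain` are passed in rather than imported so the sketch stays self-contained. The q-to-quit convention matches the demo.

```python
def answer(question, retriever, chain):
    # The retriever embeds the question and returns the most similar reviews.
    reviews = retriever.invoke(question)
    # The reviews are passed to the prompt together with the question.
    return chain.invoke({"reviews": reviews, "question": question})


def run_loop(retriever, chain, read=input, write=print):
    """Keep answering questions until the user types q."""
    while True:
        question = read("Ask your question (q to quit): ")
        if question == "q":
            break
        write(answer(question, retriever, chain))
```

With the real objects this would be called as `run_loop(retriever, chain)` after `from vector import retriever`.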

[26:06] let's give this a run now and pray that

[26:08] it works with python and then

[26:12] main.py give this a second to run you

[26:15] can see that it creates this

[26:17] chroma_langchain_db directory it will take a second

[26:19] because it does need to embed all of our

[26:21] documents and now we can ask a question

[26:23] so I'm going to say how are the you know

[26:28] vegan options if I can spell anything

[26:31] correctly which apparently I cannot okay

[26:33] so let's see what we get here and you

[26:35] can see that it pulls up a few different

[26:37] reviews here and it says based on the

[26:38] reviews provided it appears the vegan

[26:40] options of the pizza restaurant are a

[26:41] mixed bag on the positive side some

[26:43] reviewers have raved about the vegan

[26:45] pizzas saying they're hidden gems okay and

[26:47] it even tells us what document it got

[26:49] this from however not all reviews are

[26:51] glowing One reviewer had a vastly

[26:53] different experience with the vegan

[26:54] cheese option calling it tasteless and

[26:56] then it says overall it seems that the

[26:57] vegan options are hit or miss but

[26:59] there's definitely potential and then it

[27:00] gave us an overall rating of three out of

[27:02] five based on the two positive reviews out

[27:04] of the four total okay cool we can also

[27:07] ask it something like you know how is

[27:10] the ambiance or something I don't know

[27:12] if I spelled that correctly but let's

[27:13] see what it says it said overall I would

[27:15] say the ambiance of the pizza restaurant

[27:16] has an all style no substance feel and

[27:19] apparently they don't like the pizza

[27:20] restaurant based on these reviews but

[27:22] you guys get the idea it is insanely

[27:24] fast it uses the vector store database

[27:26] everything runs completely locally

[27:28] and when we're ready to quit we can hit q and

[27:31] we can exit out this was a simple

[27:34] example that was just meant to

[27:35] demonstrate how you can run LLMs locally

[27:38] on your own computer using your own

[27:39] hardware obviously you can adjust the

[27:41] CSV file and you can make this any type

[27:43] of data that you want it also doesn't

[27:45] need to be CSV data you can just convert

[27:47] anything that you want into documents

[27:49] like I demonstrated here and if you want

[27:51] the code from this video it will be

[27:53] available from the link in the

[27:54] description if you guys enjoyed make

[27:56] sure to leave a like subscribe to the

[27:57] channel and I will see you in the next

[27:59] one

[28:01] [Music]
