
What is RAG? (Retrieval Augmented Generation)

Transcribed Jun 17, 2026
Beginner, 6 min read
For: Anyone curious about how to use large language models with their own data, such as developers, product managers, or business analysts.
291.7K views, 9.3K likes, 311 comments, 40 dislikes, 3.3% (📈 Moderate)

AI Summary

This video explains Retrieval Augmented Generation (RAG), a popular pattern for using large language models (LLMs) on your own content. It compares the search engine experience with the LLM experience, then details how RAG works: vectorizing content, retrieving relevant chunks, and augmenting the prompt before sending it to the LLM. The speaker emphasizes that RAG is widely used to create ChatGPT-like experiences for employees or customers using proprietary data.

[0:00]
Introduction to the video series and RAG

The video is part of a series rotating through educational, use case, and ethics topics. Today's topic is RAG, a very important and popular solution pattern.

[1:14]
Search engines vs. LLMs

Search engines list links; users must click, read, and digest. LLMs digest content and generate an answer, creating a better experience.

[2:28]
Applying LLMs to your own content

RAG allows you to apply the same LLM experience to your own content (website, PDFs, ticketing systems) that may not be available on the internet.

[4:30]
Prompt before the prompt

The user's question is bundled into a prompt. In a RAG system, instructions (e.g., 'you are a contact center specialist') and relevant content are placed ahead of the question; the speaker calls this the 'prompt before the prompt'.
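A minimal sketch of how such a prompt might be assembled, assuming plain string concatenation. The function name `build_prompt`, the "Content:" and "Patient question:" labels, and the sample parking text are illustrative assumptions, not something shown in the video.

```python
def build_prompt(instructions: str, context_chunks: list[str], question: str) -> str:
    """Place instructions and retrieved content ahead of the user's question."""
    # Number each chunk so the model (and a human reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        f"{instructions}\n\n"
        "Please use this content to answer the patient question at the end.\n\n"
        f"Content:\n{context}\n\n"
        f"Patient question: {question}"
    )


prompt = build_prompt(
    instructions=(
        "You are a contact center specialist working for a hospital, "
        "answering patient questions that come in over the internet. "
        "Please be nice to the patients, responsive, and folksy."
    ),
    context_chunks=["Free visitor parking is available in Lot B."],  # made-up sample content
    question="Do you have parking?",
)
print(prompt)
```

The question goes last, which matches the speaker's instruction to "answer the patient question at the end."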

[7:37]
Vectorization of content

Content is broken into chunks, each converted into a vector (numeric representation). Similar topics have similar vectors.
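As a rough illustration of chunking and vectorizing, the sketch below splits text on blank lines and turns each chunk into a fixed-length list of numbers. The `toy_embed` function is a stand-in that hashes words into buckets; it is an assumption for demonstration only and captures word overlap, not meaning the way a real model would. The sample document text is made up.

```python
import math
import re
import zlib

DIM = 64  # toy vector size (assumption); real models produce much longer vectors


def chunk_by_paragraph(text: str) -> list[str]:
    """Split on blank lines so each paragraph becomes one chunk."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def toy_embed(text: str) -> list[float]:
    """Hash each word into one of DIM buckets, then scale to unit length."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z]+", text.lower()):
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


# Made-up sample content with two paragraphs.
document = """Visitor parking is free in Lot B next to the main entrance.

Before knee surgery, stop eating at midnight and arrange a ride home."""

chunks = chunk_by_paragraph(document)
vectors = [toy_embed(c) for c in chunks]
print(len(chunks), "chunks,", len(vectors[0]), "numbers per vector")
```

In a real system the `toy_embed` call would be replaced by a request to a model, and the resulting vectors would be written to a vector database rather than kept in a list.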

[8:46]
Retrieving relevant content

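The user question is also vectorized in real time. A mathematical comparison between the question vector and the content vectors finds the closest matches (the speaker suggests roughly the top five), and the chunks behind those vectors are added to the prompt.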
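The video does not name the comparison it uses. One common choice is cosine similarity, sketched below under that assumption. The three-dimensional vectors are hand-made for illustration (their axes loosely mean "parking", "surgery", and "billing"); real vectors come from a model and are far longer.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question_vec: list[float], content_vecs: list[list[float]], k: int = 5) -> list[int]:
    """Return the indices of the k content vectors closest to the question."""
    scores = [cosine_similarity(question_vec, v) for v in content_vecs]
    ranked = sorted(range(len(content_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical vectors for four chunks.
content_vecs = [
    [0.9, 0.1, 0.0],  # chunk 0: where to park
    [0.1, 0.9, 0.1],  # chunk 1: preparing for knee surgery
    [0.0, 0.2, 0.9],  # chunk 2: billing
    [0.8, 0.0, 0.3],  # chunk 3: parking fees
]
question_vec = [1.0, 0.0, 0.1]  # "Do you have parking?"

closest = top_k(question_vec, content_vecs, k=2)
print(closest)
```

With these numbers the two parking chunks (indices 0 and 3) rank highest. A vector database does the same kind of comparison, usually with an index so it does not have to score every stored vector.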

[10:37]
Definition of RAG

The whole process is called RAG: retrieving relevant documents, augmenting the generation process with them.

[11:05]
Popularity of RAG

RAG is extremely popular; the majority of LLM projects the speaker sees use this pattern to create ChatGPT-like experiences for employees or customers.

Clickbait Check

95% Legit

"The title accurately describes the video's content: a clear explanation of Retrieval Augmented Generation."

Tutorial Checklist

1. [3:32] Identify your content source (website, PDFs, ticketing system, etc.).
2. [7:37] Break your content into chunks (a paragraph, a couple of paragraphs, or a page).
3. [7:50] Convert each chunk into a vector using an LLM (the same model or a different one) and store the vectors in a vector database.
4. [8:46] When a user asks a question, vectorize the question in real time.
5. [9:06] Perform a mathematical comparison to find the content vectors most similar to the question vector (for example, the top five).
6. [9:32] Retrieve the corresponding chunks of content.
7. [5:29] Build the prompt: add instructions (e.g., 'you are a contact center specialist'), then the retrieved content, then the user question.
8. [6:40] Send the full prompt to the LLM and return the generated response.
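The checklist can be sketched end to end in a few lines. This is a toy under stated assumptions: `embed` uses word counts instead of a real model, the "vector database" is an in-memory list, `fake_llm` is a placeholder rather than a real API call, and the sample hospital content is made up.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'vector': lowercase word counts (stands in for an embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fake_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (hypothetical)."""
    return f"(response to a {len(prompt)}-character prompt)"


def answer(question: str, content: str, instructions: str, k: int = 5, llm=fake_llm):
    chunks = [p.strip() for p in content.split("\n\n") if p.strip()]  # step 2: chunk
    store = [(embed(c), c) for c in chunks]                           # step 3: vectorize and store
    q_vec = embed(question)                                           # step 4: vectorize question
    ranked = sorted(store, key=lambda item: similarity(q_vec, item[0]), reverse=True)  # step 5
    retrieved = [chunk for _, chunk in ranked[:k]]                    # step 6: fetch chunks
    prompt = "\n\n".join([instructions, *retrieved, f"Question: {question}"])  # step 7
    return prompt, llm(prompt)                                        # step 8: call the model


# Step 1: the content source (made-up sample text).
content = (
    "Visitor parking is free in Lot B.\n\n"
    "Before knee surgery, stop eating at midnight.\n\n"
    "The cafeteria opens at 7 am."
)
prompt, reply = answer(
    "Do you have parking?",
    content,
    "You are a contact center specialist for a hospital.",
    k=1,
)
print(prompt)
print(reply)
```

Here `k=1` keeps only the parking paragraph. With a larger `k` this toy would also include chunks that share no words with the question, which is one reason real systems rely on learned embeddings and sometimes a similarity cutoff.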

Study Flashcards (7)

What does RAG stand for?

Difficulty: easy

Retrieval Augmented Generation

0:21

What is the main purpose of RAG?

Difficulty: medium

To provide a ChatGPT-like experience but using your own content (e.g., website, PDFs, ticketing systems).

0:49

How does an LLM experience differ from a search engine experience?

Difficulty: medium

Search engines list links; LLMs digest content and generate an answer.

1:14

What is the key technique used in RAG to incorporate custom content?

Difficulty: hard

The prompt before the prompt: instructions and relevant content added to the user's question before sending to the LLM.

5:29

How does the system retrieve only the relevant parts of your content?

Difficulty: hard

Chunks of content are converted into vectors (numeric representations) and stored in a vector database. The user's question is also vectorized, and the closest vectors are retrieved.

7:37

What type of database is typically used to store the vectorized content?

Difficulty: medium

A vector database.

9:58

What do the 'retrieval' and 'augmented' parts of RAG refer to?

Difficulty: medium

Retrieving relevant documents, augmenting the generation process with those documents.

10:37

💡 Key Takeaways

💡

RAG enables LLMs on custom content

Explains the core value of RAG: using LLMs on your own data, not just public internet content.

0:49
🔧

Prompt before the prompt

Key technique of adding instructions and relevant content to the user query before sending to the LLM.

5:29
🔧

Vectorization and retrieval

Describes how content is chunked, vectorized, and stored, then retrieved based on similarity to the question vector.

7:37
📊

Majority of LLM projects use RAG

Highlights the popularity and practical importance of this pattern in real-world applications.

11:05

✂️ Creator Tools: Viral Hooks

AI-generated clip ideas for Shorts based on the transcript

Why ChatGPT is Better Than Google Search?

57s

Relatable comparison between search engines and LLMs sparks curiosity and engagement.


How to Make ChatGPT Answer Using YOUR Data

50s

Practical business value for anyone wanting to apply AI to their own content.


The Secret Trick: Prompt Before the Prompt

60s

Reveals an insider prompt engineering technique that is both educational and actionable.


How AI Understands Similar Topics (Vectors Explained)

59s

Simplifies a complex technical concept into an understandable analogy, making it shareable.


Why RAG is EVERYWHERE in AI

44s

Trend confirmation and summary that resonates with viewers following AI developments.


[00:00] Hello everyone, welcome to my Code

[00:02] to Care video series. What I'm doing

[00:06] is I'm rotating through three different

[00:08] types of topics: educational topics,

[00:11] use case topics, and then kind of bias,

[00:13] ethics, safety topics. So now on the

[00:17] education rotation, and today what I

[00:19] wanted to talk about is: what is

[00:21] retrieval augmented generation, or RAG?

[00:26] And you may think that I'm going into

[00:28] some kind of nook and cranny of the AI

[00:31] field, but this is a very important

[00:33] and popular kind of solution pattern

[00:36] that I see being used over and over

[00:39] and over again for how to leverage

[00:42] large language models. So I thought I

[00:44] would explain it to you. And

[00:47] the thing that this is used for is

[00:49] basically systems that leverage large

[00:51] language models, but on your own content.

[00:55] So let me describe that. If you think of

[00:57] like the ChatGPT experience, and if you

[01:00] think about that relative to, like, the

[01:02] search engine experience that we had

[01:05] before: if you ask a question like, I

[01:08] don't know, "what color is the sky?" or "how

[01:10] do I fix this plumbing issue?" or

[01:12] something like that, a search engine

[01:14] would go out, or appear to go out,

[01:17] search the internet, find relevant

[01:19] content, and then just list that content

[01:21] for you, list those links. And then you, as

[01:24] a user, would need to click on the links

[01:26] that seem right, read it, digest it,

[01:29] and figure out the answer to your

[01:31] question. What a large language model

[01:33] does is it seems to do that first part,

[01:35] meaning leverage the content on the

[01:37] whole internet, but instead of just

[01:39] listing that content, it sort of digests

[01:41] it, combines it, assembles it

[01:44] together, and answers your question, sort

[01:46] of generates an answer. So it's a

[01:49] whole lot better. I mean, search engines

[01:51] have been great, but this is taking the

[01:52] whole experience to another level. And in

[01:55] addition to question answering,

[01:57] you can also give it instructions, like

[01:59] "write me this document" or "write me a

[02:01] lesson plan to teach geometry to seventh

[02:03] graders," and it will do something

[02:05] similar: it will kind of assemble content

[02:08] that it has seen that

[02:10] talks about geometry or seventh graders

[02:12] or how to do lesson plans or whatever,

[02:15] pulls that together, assembles it, and

[02:17] then writes out a lesson plan. Okay, so

[02:21] it's a much better experience than just

[02:23] taking the raw content from the internet,

[02:25] but it really creates something new

[02:28] from that. Now let's say you want that

[02:30] same experience, but on your own content.

[02:33] So it might be a chatbot on your website,

[02:36] or you might have a library of PDF

[02:37] documents that is documentation for

[02:40] one of your products, and instead of

[02:42] just linking the user to sections

[02:46] of the documentation, you want to

[02:47] actually answer their question. It

[02:50] might be your service ticketing

[02:52] system, so when a new issue comes in you

[02:54] could say "how would I resolve this issue?"

[02:56] and it can assemble past similar issues

[02:59] and then come up with a new

[03:01] solution based on that. So this is an

[03:04] incredible experience that these large

[03:07] language models offer, but how can you

[03:09] create that experience on your own

[03:11] content that might not be available

[03:14] to the internet or available to these

[03:16] large language models? Well, the solution

[03:18] to this is this RAG architecture, this

[03:22] retrieval augmented generation

[03:24] architecture. So now I'm going to do my

[03:25] best to explain that to

[03:28] you. So let's say you have a

[03:32] user, and I'm going to use the example of

[03:35] a patient chatbot, and the content

[03:39] source is going to be that content from

[03:41] your website, let's say, or could be

[03:43] content from PDF documents or

[03:46] whatever, but you want this to be the

[03:47] content to answer the patient's

[03:49] questions. So if the patient has a

[03:50] question like "how do I prepare for my

[03:52] knee surgery?", instead of just going to

[03:55] ChatGPT and getting a generic

[03:56] answer, you'd like to provide an answer

[03:59] that's from your health system. Or a

[04:02] question like "do you have parking?": you'd

[04:04] like to provide an answer for your

[04:06] health system, for the office where

[04:07] the patient is seen. Okay, so that's a

[04:10] scenario that I'd like to do. So the

[04:12] patient has a

[04:13] question, and I'm going to do "do you

[04:16] have

[04:17] parking?"

[04:23] You can imagine that

[04:26] question being bundled up into a prompt,

[04:30] what's called a prompt, and I'll describe

[04:32] this more

[04:33] later. So there is the question. That

[04:37] prompt is sent to a large language

[04:40] model, and that large language model will

[04:44] come up with a response to that question.

[04:48] Okay, now if you just wanted to use

[04:51] ChatGPT, let's say, or some other LLM

[04:54] without any extra content, you could just

[04:57] use this flow: "how do I prepare for my

[04:59] knee surgery?" or "do you have parking?", put

[05:02] that into a prompt, send that to the

[05:04] large language model, and get a response

[05:06] back. Okay, but what we want to do

[05:09] is enhance this experience with our own

[05:11] content. So let's say here is your

[05:13] content

[05:14] source, and again this might be all the

[05:17] content of your website or PDF documents

[05:21] or internal ticketing system or

[05:23] databases or that sort of thing,

[05:27] and what you'd like to do is something

[05:29] called the prompt before the prompt.

[05:32] So in these systems you don't just send

[05:34] the user question to the large language

[05:36] model; you usually have some level of

[05:38] instructions. So the instructions might

[05:41] be "you are a contact center specialist

[05:44] working for a hospital, answering patient

[05:47] questions that come in over the internet;

[05:50] please be nice to the patients and

[05:53] responsive and folksy, because that fits

[05:55] with our brand," or some instructions like

[05:58] that are sometimes sent with the prompt.

[06:01] And then,

[06:03] additionally, you want to provide the

[06:06] information that the LLM needs to

[06:08] answer the question. So what you'd

[06:11] ideally like is information from your

[06:14] website to be included here,

[06:18] and that to be sent to the LLM as well.

[06:20] So the full prompt might be your

[06:23] instructions; it might be something like

[06:25] "please use this content in order to

[06:28] answer the patient question at the end,"

[06:30] and then you put in a bunch of

[06:32] information about parking or about knee

[06:34] surgery or whatever the patient asked.

[06:36] You put that in the prompt before the

[06:38] prompt, then you have the question, then

[06:40] you send that whole package to the LLM,

[06:43] and the LLM will give a great response

[06:45] based on your

[06:47] content. Okay, with me so far? So

[06:51] this notion is the prompt before the

[06:53] prompt, and that's why prompt

[06:56] engineering and these types of things

[06:58] are a big field right now, because

[07:00] you can really hone these systems

[07:03] by doing a better and better job with

[07:05] the actual prompt before the prompt

[07:08] in this

[07:10] style. Now the last trick here is your

[07:14] website or your content is huge, and it

[07:16] talks about all kinds of topics beyond

[07:19] parking and beyond knee surgery, so you

[07:21] really want to somehow pull out only the

[07:24] parts of your content that are relevant

[07:26] to the patient's question. So this is

[07:29] another tricky part of this whole

[07:32] RAG architecture, and the way that

[07:34] works is that you take all your

[07:37] content and you break it into chunks, or

[07:40] these systems will break it into chunks.

[07:42] So a chunk might be a paragraph of content

[07:44] or a couple of paragraphs, a page,

[07:46] something like that. And then those

[07:50] chunks are sent to a large language

[07:53] model, could be the same one or a

[07:55] different one, and they are turned into a

[07:58] vector.

[08:01] And so each paragraph or each

[08:04] chunk will have a

[08:06] vector, which is just a series of

[08:09] numbers, and that series of

[08:12] numbers, you can think of it as the

[08:14] numeric representation of the essence of

[08:17] that

[08:18] paragraph. And what's different about

[08:21] these numbers is they're not random

[08:23] numbers, but paragraphs that talk about a

[08:25] similar topic have close-by numbers; they

[08:28] almost have the same vectors. Okay, so in

[08:31] addition to the, it's a numericized

[08:33] version of the paragraph, but it's such

[08:37] that similar paragraphs on similar

[08:39] topics will have similar vectors, will

[08:42] have similar numbers. So that means that

[08:46] what happens is when a user will

[08:49] ask a question like "do you have parking?",

[08:51] let's

[08:52] say, then that is also sent to the LLM in

[08:55] real time, right after the user asked the

[08:58] question.

[08:59] That comes up with the vector as well;

[09:02] you could think of that as the question

[09:04] vector. And then what happens is we do

[09:06] a mathematical comparison real quick

[09:09] between the vector of the question and

[09:11] then the vectors of your content, and

[09:13] pick, like, the top five documents that

[09:15] are closest to this question. So "do you

[09:17] have parking?" will be a vector, then you

[09:21] have all your content, and it's going to

[09:23] try and find the five documents that

[09:25] talk the most about parking, basically.

[09:28] And so it'll find those, I don't know

[09:30] what that is, it'll find those documents,

[09:32] let's say, from these. It'll grab the

[09:34] paragraphs associated with those

[09:37] documents, and it'll use that

[09:41] here. So those will be the subset of your

[09:45] content, basically, that is used as part

[09:48] of the prompt before the prompt. Okay, so

[09:51] this whole concept is kind of

[09:54] vectorizing your content; typically

[09:58] that is then stored in something

[09:59] called a vector database, which is

[10:01] basically a representation of your

[10:03] content in this numeric form. And then

[10:06] this system that you build, this RAG

[10:08] system, will take the question, find and

[10:12] retrieve the most relevant content, make

[10:15] that as part of the prompt before the

[10:17] prompt, send that to the LLM, and then

[10:20] you'll get a good response back. Actually,

[10:23] so it's a little bit confusing, but

[10:25] it's actually not that confusing;

[10:28] I just made it more confusing by this

[10:30] horrible drawing. But this

[10:32] whole thing is what is called RAG:

[10:37] retrieval, so you're retrieving the

[10:39] relevant documents from your content;

[10:42] you're augmenting the generation process,

[10:45] so you're augmenting the LLM's ability to

[10:48] do generative AI based on the documents

[10:51] that you retrieve. So that's why it's

[10:52] retrieval augmented

[10:55] generation. Okay, so I hope that made

[10:58] sense. Like I said, this is a very

[11:00] popular solution pattern that I'm

[11:03] seeing over and over again. In fact, the

[11:05] majority of LLM projects that I see are

[11:08] this kind of thing: using my content,

[11:11] packaging that up with an LLM system to

[11:14] create a kind of ChatGPT-like

[11:16] experience for my employees or for my

[11:20] customers, for my users, that kind of

[11:22] thing. And it works extremely well; that's

[11:24] why it's so popular. So I

[11:27] hope that was interesting and

[11:29] educational and made sense. If you have

[11:31] any questions, please leave them for me

[11:33] as part of the comments. Thank you

[11:35] very much.
