Why ChatGPT is Better Than Google Search?
This video explains Retrieval Augmented Generation (RAG), a popular pattern for using large language models (LLMs) on your own content. It compares the search engine experience with the LLM experience, then details how RAG works: vectorizing content, retrieving relevant chunks, and augmenting the prompt before sending it to the LLM. The speaker emphasizes that RAG is widely used to create ChatGPT-like experiences for employees or customers using proprietary data.
The video is part of a series rotating through educational, use case, and ethics topics. Today's topic is RAG, a very important and popular solution pattern.
Search engines list links; users must click, read, and digest. LLMs digest content and generate an answer, creating a better experience.
RAG allows you to apply the same LLM experience to your own content (website, PDFs, ticketing systems) that may not be available on the internet.
The user question is bundled into a prompt. Instructions (e.g., 'you are a contact center specialist') and relevant content are placed ahead of the question; the speaker calls this the "prompt before the prompt".
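As a rough illustration of that prompt assembly, here is a minimal Python sketch. The function name `build_prompt`, the wording of the instruction text, and the parking sentence are all assumptions made up for this example; the video does not show any code.

```python
def build_prompt(instructions: str, context_chunks: list[str], question: str) -> str:
    """Place instructions and retrieved content ahead of the user's question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        f"{instructions}\n\n"
        "Please use this content to answer the question at the end.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    instructions="You are a contact center specialist working for a hospital. "
                 "Be friendly and responsive.",
    # Made-up content standing in for text retrieved from your own website.
    context_chunks=["Free parking is available in the garage next to the main entrance."],
    question="Do you have parking?",
)
print(prompt)
```

The resulting string is what would be sent to the LLM in place of the bare question.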
Content is broken into chunks, each converted into a vector (numeric representation). Similar topics have similar vectors.
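To make the idea of "similar topics have similar vectors" concrete, the sketch below splits text into paragraph chunks and compares them. It is only a toy: `toy_embed` counts words instead of calling a real embedding model, so it captures word overlap rather than meaning, and every name and sample sentence here is an assumption for illustration.

```python
import math
import re
from collections import Counter


def chunk_by_paragraph(text: str) -> list[str]:
    """Split text into chunks on blank lines (one chunk per paragraph)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def toy_embed(chunk: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z]+", chunk.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


content = """Visitor parking is available in the north garage.

Parking in the garage is free for patients.

Before knee surgery, stop eating at midnight."""

chunks = chunk_by_paragraph(content)
vectors = [toy_embed(c) for c in chunks]

parking_vs_parking = cosine_similarity(vectors[0], vectors[1])
parking_vs_surgery = cosine_similarity(vectors[0], vectors[2])
print(parking_vs_parking > parking_vs_surgery)  # True
```

The two parking paragraphs score closer to each other than the parking and surgery paragraphs do, which is the property retrieval relies on.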
The user question is also vectorized. A mathematical comparison finds the closest content vectors (the speaker's example is the top five), and those chunks are used in the prompt.
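The video does not say which comparison is used; cosine similarity is one common choice and is assumed in the sketch below. The three-dimensional vectors are invented for readability, whereas real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question_vec: list[float], chunk_vecs: list[list[float]], k: int = 5) -> list[int]:
    """Return the indices of the k chunk vectors closest to the question vector."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine_similarity(question_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up vectors for three content chunks.
chunk_vecs = [
    [0.9, 0.1, 0.0],  # chunk 0: parking locations
    [0.0, 0.2, 0.9],  # chunk 1: knee surgery preparation
    [0.8, 0.3, 0.1],  # chunk 2: parking rates
]
question_vec = [1.0, 0.0, 0.0]  # stands in for "Do you have parking?"

print(top_k(question_vec, chunk_vecs, k=2))  # [0, 2]
```

The returned indices identify which chunks of text get copied into the prompt.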
The whole process is called RAG: retrieving relevant documents, augmenting the generation process with them.
RAG is extremely popular; the majority of LLM projects the speaker sees use this pattern to create ChatGPT-like experiences for employees or customers.
What does RAG stand for?
Retrieval Augmented Generation
0:21
What is the main purpose of RAG?
To provide a ChatGPT-like experience but using your own content (e.g., website, PDFs, ticketing systems).
0:49
How does an LLM experience differ from a search engine experience?
Search engines list links; LLMs digest content and generate an answer.
1:14
What is the key technique used in RAG to incorporate custom content?
The prompt before the prompt: instructions and relevant content added to the user's question before sending to the LLM.
5:29
How does the system retrieve only the relevant parts of your content?
Chunks of content are converted into vectors (numeric representations) and stored in a vector database. The user's question is also vectorized, and the closest vectors are retrieved.
7:37
What type of database is typically used to store the vectorized content?
A vector database.
9:58
What do the 'retrieval' and 'augmented' parts of RAG refer to?
Retrieving relevant documents, augmenting the generation process with those documents.
10:37
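The vector database mentioned in the answers above can be pictured as a store of (vector, chunk text) pairs with a nearest-neighbor search. The class below is a hypothetical in-memory toy, not the API of any real product; it scans every record, whereas production vector databases typically use specialized indexes to avoid that.

```python
import math
from dataclasses import dataclass, field


@dataclass
class InMemoryVectorStore:
    """Hypothetical toy store: keeps (vector, chunk text) pairs in a list."""
    records: list = field(default_factory=list)

    def add(self, vector: list[float], text: str) -> None:
        """Store one chunk together with its vector."""
        self.records.append((vector, text))

    def search(self, query: list[float], k: int = 5) -> list[str]:
        """Return the text of the k stored chunks nearest to the query vector."""
        def score(record):
            vector, _ = record
            dot = sum(x * y for x, y in zip(query, vector))
            norms = math.hypot(*query) * math.hypot(*vector)
            return dot / norms if norms else 0.0

        ranked = sorted(self.records, key=score, reverse=True)
        return [text for _, text in ranked[:k]]


store = InMemoryVectorStore()
# Made-up two-dimensional vectors and example sentences.
store.add([0.9, 0.1], "Parking is available in the north garage.")
store.add([0.1, 0.9], "Stop eating at midnight before knee surgery.")

results = store.search([1.0, 0.0], k=1)
print(results)  # ['Parking is available in the north garage.']
```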
RAG enables LLMs on custom content
Explains the core value of RAG: using LLMs on your own data, not just public internet content.
0:49
Prompt before the prompt
Key technique of adding instructions and relevant content to the user query before sending to the LLM.
5:29
Vectorization and retrieval
Describes how content is chunked, vectorized, and stored, then retrieved based on similarity to the question vector.
7:37
Majority of LLM projects use RAG
Highlights the popularity and practical importance of this pattern in real-world applications.
11:05
[00:00] Hello everyone, welcome to my Code to Care video series. What I'm doing is rotating through three different types of topics: educational topics, use case topics, and then bias, ethics, and safety topics. We're now on the education rotation, and today I want to talk about what retrieval augmented generation, or RAG, is.
[00:26] You may think that I'm going into some nook and cranny of the AI field, but this is a very important and popular solution pattern that I see being used over and over again for how to leverage large language models, so I thought I would explain it to you. The thing this is used for is basically systems that leverage large language models, but on your own content.
[00:55] Let me describe that. Think of the ChatGPT experience relative to the search engine experience that we had before. If you ask a question like "What color is the sky?" or "How do I fix this plumbing issue?", a search engine would go out, or appear to go out, search the internet, find relevant content, and then just list that content for you, list those links. Then you as a user would need to click on the links that seem right, read the content, digest it, and figure out the answer to your question.
[01:31] What a large language model does is it seems to do that first part, meaning leverage the content on the whole internet, but instead of just listing that content, it sort of digests it, combines it, assembles it together, and answers your question. It generates an answer. So it's a whole lot better. Search engines have been great, but this takes the whole experience to another level.
[01:55] In addition to question answering, you can also give it instructions, like "write me this document" or "write me a lesson plan to teach geometry to seventh graders", and it will do something similar. It will assemble content that it has seen that talks about geometry, or seventh graders, or how to do lesson plans, pull that together, and then write out a lesson plan.
[02:21] It's a much better experience than just taking the raw content from the internet; it really creates something new from that. Now let's say you want that same experience, but on your own content. It might be a chatbot on your website. You might have a library of PDF documents that is documentation for one of your products, and instead of just linking the user to sections of the documentation, you want to actually answer their question. It might be your service ticketing system, so when a new issue comes in, you could ask "How would I resolve this issue?" and it can assemble past similar issues and then come up with a new solution based on that.
[03:04] This is an incredible experience that these large language models offer, but how can you create that experience on your own content, which might not be available to the internet or to these large language models? The solution is this RAG architecture, this retrieval augmented generation architecture, so now I'm going to do my best to explain it to you.
[03:28] Let's say you have a user, and I'm going to use the example of a patient chatbot. The content source is going to be the content from your website, let's say, or it could be content from PDF documents or whatever, but you want this to be the content used to answer the patient's questions. If the patient has a question like "How do I prepare for my knee surgery?", instead of just going to ChatGPT and getting a generic answer, you'd like to provide an answer that's from your health system. Or for a question like "Do you have parking?", you'd like to provide an answer for your health system, for the office where the patient is seen. So that's the scenario I'd like to walk through.
[04:12] The patient has a question, and I'm going to use "Do you have parking?" You can imagine that question being bundled up into what's called a prompt, and I'll describe this more later. That prompt is sent to a large language model, and the large language model will come up with a response to that question.
[04:48] Now, if you just wanted to use ChatGPT, let's say, or some other LLM without any extra content, you could just use this flow: take "How do I prepare for my knee surgery?" or "Do you have parking?", put that into a prompt, send it to the large language model, and get a response back. But what we want to do is enhance this experience with our own content. So let's say here is your content source. Again, this might be all the content of your website, or PDF documents, or an internal ticketing system, or databases, that sort of thing.
[05:27] What you'd like to do is something called the prompt before the prompt. In these systems you don't just send the user question to the large language model; you usually have some level of instructions. The instructions might be: "You are a contact center specialist working for a hospital, answering patient questions that come in over the internet. Please be nice to the patients, and responsive and folksy, because that fits with our brand." Instructions like that are sometimes sent with the prompt.
[06:01] Additionally, you want to provide the information that the LLM needs to answer the question. What you'd ideally like is for information from your website to be included here and sent to the LLM as well. So the full prompt might be your instructions, then something like "Please use this content in order to answer the patient question at the end", and then you put in a bunch of information about parking, or about knee surgery, or whatever the patient asked. You put that in the prompt before the prompt, then you have the question, then you send that whole package to the LLM, and the LLM will give a great response based on your content. With me so far?
[06:51] So this notion is the prompt before the prompt, and that's why prompt engineering and these types of things are a big field right now: you can really hone these systems by doing a better and better job with the actual prompt before the prompt.
[07:10] Now, the last trick here is that your website or your content is huge, and it talks about all kinds of topics beyond parking and beyond knee surgery. You really want to somehow pull out only the parts of your content that are relevant to the patient's question. This is another tricky part of this whole RAG architecture. The way that works is that you take all your content and you break it into chunks, or these systems will break it into chunks. A chunk might be a paragraph of content, or a couple of paragraphs, or a page, something like that. Then those chunks are sent to a large language model, which could be the same one or a different one, and they are turned into a vector.
[08:01] So each paragraph, or each chunk, will have a vector, which is just a series of numbers. You can think of that series of numbers as the numeric representation of the essence of that paragraph. What's different about these numbers is that they're not random: paragraphs that talk about a similar topic have nearby numbers; they almost have the same vectors. So it's a numeric version of the paragraph, but such that similar paragraphs on similar topics will have similar vectors, similar numbers.
[08:46] That means that when a user asks a question like "Do you have parking?", that question is also sent to the LLM in real time, right after the user asks it, and that comes up with a vector as well. You could think of that as the question vector. Then we do a quick mathematical comparison between the vector of the question and the vectors of your content, and pick, say, the top five documents that are closest to this question. So "Do you have parking?" will be a vector, then you have all your content, and it's going to try to find the five documents that talk the most about parking, basically.
[09:28] It'll find those documents, grab the paragraphs associated with them, and use that here. Those will be the subset of your content that is used as part of the prompt before the prompt.
[09:51] This whole concept is vectorizing your content. Typically those vectors are then stored in something called a vector database, which is basically a representation of your content in this numeric form. Then this system that you build, this RAG system, will take the question, retrieve the most relevant content, make that part of the prompt before the prompt, send that to the LLM, and then you'll get a good response back.
[10:23] It's a little bit confusing, but it's actually not that confusing; I just made it more confusing with this horrible drawing. This whole thing is what is called RAG. Retrieval: you're retrieving the relevant documents from your content. Augmented generation: you're augmenting the generation process, so you're augmenting the LLM's ability to do generative AI based on the documents that you retrieve. That's why it's retrieval augmented generation.
[10:55] I hope that made sense. Like I said, this is a very popular solution pattern that I'm seeing over and over again. In fact, the majority of LLM projects that I see are this kind of thing: using my content, packaging that up with an LLM system to create a ChatGPT-like experience for my employees, for my customers, for my users. It works extremely well, and that's why it's so popular. I hope that was interesting, educational, and made sense. If you have any questions, please leave them for me in the comments. Thank you very much.