How to Fine Tune Llama 3 for Better Instruction Following?

0h 08m video Transcribed Jun 30, 2026

55.8K

Views

1.3K

Likes

89

Comments

100

Dislikes

2.6%

📈 Moderate

✂️ Creator Tools: Viral Hooks

AI-generated clip ideas for Shorts based on the transcript

Why Fine-Tune Llama 3?

41s

Clearly demonstrates the problem of base models not following instructions, creating immediate curiosity for the solution.

▶ Play Clip

Step-by-Step Fine-Tuning Setup

60s

Provides a clear, actionable tutorial on setting up the environment, appealing to developers wanting to replicate the process.

▶ Play Clip

Before vs After: Fine-Tuning Magic

56s

Shows a dramatic before-and-after comparison of model responses, highlighting the effectiveness of fine-tuning.

▶ Play Clip

Training in Action: Loss & Results

59s

Reveals real-time training metrics and the final improved response, making the technical process tangible and impressive.

▶ Play Clip

Upload to Hugging Face: Share Your Model

54s

Ends with a practical guide to sharing the fine-tuned model, empowering viewers to contribute to the community.

▶ Play Clip

Full Transcript

Download .txt Download .md

[00:00] This is amazing. Now we are going to fine tune Lama 3 model in this we are going to see how to instruction fine tune how to save the model locally and finally upload that to hugging face.

[00:13] Why we should fine tune for example if you take a model that is the base model when we ask a question like this list the top five most popular movies of all time is going to respond some random or maybe

[00:27] a continuation from this text it is not going to follow instruction so to make it to follow instruction we need to fine tune the model so after fine tuning when you ask the same question list the top five

[00:41] most popular movies of all time it is going to give us the list so how to fine tune that's exactly what we're going to see today let's get started. Hi everyone I'm really excited to show you about

[00:55] Lama 3 fine tuning or model training I'm going to take you through step by step on how to do this and finally upload that in hugging face so that everyone else can use this model but before that I regularly create videos and regards to artificial intelligence on my YouTube channel so do subscribe

[01:10] and click the bell icon to stay tuned make sure you click the like button so this video can be helpful for many others like you we're going to use this data sets OIG to fine tune our model so if you

[01:22] see the data it starts with human and off that is expecting the bot response like this so when we ask question as a human the bot should respond in instruction following manner this data set is called

[01:36] open instruction generalist data set so if I open the data set this how it's going to live like there are multiple lines each line is an instruction data so as a human here's the question what are

[01:50] some tips for creating a successful business plan and if I take only one row this is how it's going to look like it contains text and metadata that means the source data so I'm going to extract

[02:02] the information from this line and this how it is going to look like human and the question then the bot is going to respond with the list of answers so now we're going to teach the model how to respond

[02:15] like this when we ask a question so first step conda create hyphen and unsloth python equals 3.11 and then click enter we're going to use unsloth to fine tune this lama 3 model so we're creating the

[02:28] conda environment this will automatically create that next conda activate unsloth now pip install hugging phase hub i python unsloth collab and then click enter I will put all the information in the description below after this export the hugging phase token like this and then click enter this

[02:44] is used to download the model from hugging phase I'm also going to use weights and biases this is used to save all our training data or the training metrics in a clean dashboard format to do that

[02:58] you can install pip install w and b next w and b logging this will automatically initiate the process of logging into w and b I've already logged in now let's create a file called appletpy and let's

[03:10] open it so in this first we are going to set up the configuration then load the model then we are going to see how before fine tuning how it's going to look like then we are going to fine tune the model or train the model next we are going to see how it's going to look like after fine tuning at

[03:25] the end we are going to save that to hugging phase the first step importing and configuration import os fast language model torch sft trainer training arguments load data set next defining some variables

[03:38] max sequence length the URL where the data set is that is a data set which we just looked at next we are loading the data set now we are going to load the large language model that is llama 3 using the

[03:51] fast language model from pre-trained and here is the model in which we are providing this will automatically load the model next we are going to see how it's going to look like before training so here we are going to ask the model and print out the response so before pre-training we are going to ask this

[04:07] question list the top five most popular movies of all time and we are asking a response now I'm going to run this code in your terminal python app.py and then click enter if you're using konda there are

[04:19] few more steps to follow that's why we got error like this if you see the unslawed documentation here are the list of steps so I'm going to install konda installed python torch going to copy the code

[04:31] so I'm going to use 12.1 version so I've changed that there and then click enter next going to install these two steps control v and then click enter now it's all done this is the configuration which I'm

[04:43] using rtx a6000 and I am using mass compute you can use merwin present coupon code to get 50% off now I'm ready to run python app.py and then click enter now it's running the code which means

[04:55] downloading the model and here is the response as you can see here as a human when you ask a question list the top five most popular movies of all time it's not giving me a proper answer so we're going

[05:08] to fine tune this back to the code so the fourth step is fine tuning so I'm going to define this function fast language model get pepped model so we are going to use qlora for fine tuning so these are the

[05:21] basic configuration you can change this based on your requirement I'm passing the model here so the next step initiating the sft trainer and here we are providing the data set and it's just a text field the

[05:33] model the tokenizer and basic settings you can modify this based on your requirement next I'm also saving this in the outputs folder you can modify this based on which folder you want to save this

[05:47] model next trainer dot train this will automatically start the training process now we are going to print after training how it's going to look like going to ask the same question now the final step is to save the model so we're going to save the lower model that's the adapter next we want to merge all the

[06:05] files which means our current file llama 3 files and our adapter together to do that model dot save pre-trained merged this is going to say that in the outputs folder the tokenizer and merged next model

[06:21] dot push to hub merged so this way you're saving that to the hub hugging face hub next we are pushing the un merged version that's the adapter here's the adapter here's a completely merged version that's

[06:33] it all done so basically as a quick overview we imported and defined the configuration loaded the model we ask a question before training to check we are doing a training after that using sft trainer

[06:47] next we are going to print out how it's going to look like after training and save the model in hugging face now I'm going to run this code in your terminal python 100 and then click enter now you can see the model got loaded and the training is happening you can see the loss gradient norm and learning

[07:04] rate so before training you can see this is the output I asked the question list the top five most popular movies of all time and it is not printing the correct answer it is just repeating itself so

[07:19] ultimate goal is that we need to fine tune this and make it respond properly so the training is going on you can even view the training in the weights and biases dashboard you can see the loss

[07:31] coming down and now the training is complete and after training how it's going to look like so we are asking the same question list the top five popular movies of all time and here's a response number one

[07:45] then number two this is amazing now we have successfully fine tuned or trained the model to respond to instruction you can even provide your own data and fine tune this Lamar 3 model the next step is that

[08:00] we are saving the model and it's getting pushed to hugging face you can see that the model got uploaded in hugging face in this path so we have uploaded two models one is the merged version which means it

[08:16] contains all the required files to run the large language model and another one is only the adapter so if I open the adapter version and this is where it got uploaded and if you see the files it

[08:28] contains only adapter but when we open the merged version here is the merged version which contains all the files even the model weights to run the model how you can download this model and run it

[08:40] the instruction is provided here and this is the code to run the model which we have just fine tuned I'm really excited about this I'm going to create more videos similar to this so stay tuned I hope you like this video do like share and subscribe and thanks for watching