---
title: 'Master AI image generation - ComfyUI FULL TUTORIAL'
source: 'https://youtube.com/watch?v=g74Cq9Ip2ik'
video_id: 'g74Cq9Ip2ik'
date: 2026-06-15
duration_sec: 0
---

# Master AI image generation - ComfyUI FULL TUTORIAL

> Source: [Master AI image generation - ComfyUI FULL TUTORIAL](https://youtube.com/watch?v=g74Cq9Ip2ik)

## Summary

This tutorial provides a comprehensive guide to ComfyUI, a free and open-source AI image generation tool that can run on CPU, Apple M1/M2 chips, or Nvidia GPUs. It covers installation, text-to-image, image-to-image, upscaling, ControlNet, face swapping, and integration with external AI tools.

### Key Points

- **Introduction to ComfyUI** [00:00] — ComfyUI is a free, open-source AI image generator that can run on CPU, Apple M1/M2, or Nvidia GPU. It offers text-to-image, image-to-image, upscaling, pose control, depth control, face swapping, and consistent character generation.
- **Installation Process** [01:48] — Download the 7z file from GitHub (1.4 GB), unzip it, and run the appropriate .bat file for your GPU (Nvidia or CPU). The folder can be extracted anywhere.
- **Downloading Models** [03:55] — Models define the style (realistic, anime, etc.). Use CivitAI to browse and download checkpoints. imgsys ranks models by blind user votes. Recommended: RealVis XL (SDXL). Place the .safetensors file in ComfyUI/models/checkpoints.
- **Understanding Nodes** [08:26] — Nodes are the building blocks. Key nodes: Checkpoint Loader (model), CLIP Text Encode (prompts), KSampler (denoising algorithm), Empty Latent Image (random noise), VAEDecode (converts latent to image), and Preview Image.
- **KSampler Settings** [12:04] — Seed: starting noise (randomize for variety, fixed for reproducibility). Steps: number of denoising steps (20 is typical). CFG: how closely to follow prompt (7-8 recommended). Sampler: algorithm (e.g., Euler, DPM++ 2M). Scheduler: quality adjustment (e.g., Karras). Denoise: strength of noise removal (1 for text-to-image).
- **Upscaling Methods** [16:52] — Basic upscaling (Upscale Image By) just resizes without adding detail. Better: Upscale Image Using Model (e.g., 4x UltraSharp) uses AI to enhance details. Best: Ultimate SD Upscale (tiled upscaling) breaks image into tiles, applies image-to-image with low denoise, then stitches them back.
- **Image-to-Image Workflow** [30:09] — Replace Empty Latent Image with Load Image node, then use VAE Encode to convert to latent. Denoise strength controls similarity to original: low (0.3) retains more, high (0.8) changes more.
- **ControlNet** [51:57] — ControlNet allows precise control over pose (OpenPose), depth, edges (Canny), line art, segmentation, etc. Install the ControlNet Union model and the Art Venture preprocessor node. Connect it between the positive prompt and the KSampler.
- **Face Swapping with InstantID** [64:58] — InstantID provides realistic face swapping. Install it via the Manager and download the models (AntelopeV2, IP-Adapter, ControlNet). Connect the Apply InstantID node between the prompts and the KSampler, together with the face analysis and model loader nodes.
- **Compatibility with New Models** [76:09] — ComfyUI works with newer models like AuraFlow and Flux. For Flux, minor tweaks to model and clip loader are needed. SDXL remains the most mature with extensive tools and plugins.
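
The seed and CFG settings from the KSampler key point can be illustrated with a toy sketch. This is not ComfyUI code: `starting_noise` and `apply_cfg` are hypothetical helper names, the noise is a short list of numbers rather than a latent image, and the CFG line uses the commonly cited classifier-free guidance formula (start from the prediction without the prompt, then move toward the prompt-conditioned prediction by a factor of `cfg`).

```python
import random


def starting_noise(seed: int, size: int = 4) -> list[float]:
    """Toy stand-in for the initial noise: the same seed gives the same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def apply_cfg(uncond: list[float], cond: list[float], cfg: float) -> list[float]:
    """Classifier-free guidance: uncond + cfg * (cond - uncond), element by element."""
    return [u + cfg * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    # A fixed seed reproduces the starting noise; a different seed changes it.
    print(starting_noise(69) == starting_noise(69))
    print(starting_noise(69) == starting_noise(70))
    # A larger cfg pushes the result further in the direction of the prompt.
    print(apply_cfg([0.0, 2.0], [1.0, 2.0], cfg=7.0))
```

Under this formula, `cfg = 1` returns the prompt-conditioned prediction unchanged, and larger values exaggerate the difference between the two predictions, which is one way to read the advice that very high values can look too literal.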

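The tiling step described under Upscaling Methods can be sketched as plain geometry. The function below only computes where the tiles would sit; `tile_boxes` and the 512-pixel default are illustrative assumptions, and real tiled upscalers typically also overlap or pad tiles and blend the seams, which is left out here.

```python
def tile_boxes(width: int, height: int, tile: int = 512) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes that cover the image, row by row.

    Tiles on the right and bottom edges are clipped to the image size.
    """
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            boxes.append((left, top, right, bottom))
    return boxes


if __name__ == "__main__":
    # A 2048 x 2048 image splits into a 4 x 4 grid of 512-pixel tiles.
    print(len(tile_boxes(2048, 2048)))
    # Each tile would then go through image-to-image at low denoise
    # before being pasted back at the same coordinates.
    for box in tile_boxes(1000, 600):
        print(box)
```

Working tile by tile keeps each image-to-image pass close to the resolution the model was trained on, which is the usual argument for tiling over running one very large pass.
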
### Conclusion

ComfyUI is a powerful, free, and open-source image generation tool that offers extensive control and flexibility. With the ability to run on various hardware and support for numerous plugins and models, it is an essential tool for AI image generation enthusiasts.

## Transcript

this is the ultimate AI image generator
it's free and open source and the
awesome thing about it is you don't even
need an Nvidia GPU you can run it with
just a CPU or an apple M1 or M2 Chip and
it gives you total control not only can
you do text to image but also image to
image plus upscaling plus you can
control the poses of your character or
you can control the positions of objects
in your image or you can control the
depth of your composition and much more
you can also do face swapping and create
consistent characters you can also link
to popular AI tools such as MimicMotion
which makes your character dance or LivePortrait
which lets you animate any
photo of a face or ToonCrafter which
allows you to input a start frame and an
end frame and it would generate an anime
scene in between those two frames the
possibilities are just endless now the
tool that we're going to go over today
is called ComfyUI and the interface looks
like this so if you're seeing this for
the first time this might look very
complicated to you and that's exactly
why I'm making this tutorial I'm going
to show you step by step how to install
it and then how to do text to image image
to image upscaling how to control the
pose or the composition or the depth
of your images plus how to download and
use external tools like face swapping or
ToonCrafter I'm going to make it as
easy as possible so that even if you
don't have any technical background on
AI or stable diffusion you can still
follow along easily so let's get started
all right first let's go over how to
install ComfyUI you simply have to go
to their GitHub which I'll link to in
the description below and installing it
is really easy all you got to do is
scroll down and then you'll see this
Installing ComfyUI section and then
note that you don't need an Nvidia GPU
to run this you can also run it on CPU
only and it also supports Apple M1 or M2
although of course if you do it this way
or if you run it using CPU it's going to
be way slower compared to an Nvidia GPU
so if you're serious about running image
generation locally definitely do get an
Nvidia GPU it's just going to make your
life a lot easier and it's going to
be compatible with a lot more other
open-source AI tools by the way I'm
using a Dell Precision 5690 you can
integrate a powerful RTX 5000 Ada into
this huge thanks to Dell and Nvidia for
sponsoring this anyways I'm using
Windows so under Windows all you got to
do is click on this direct link to
download so this is going to download a
7z file which we can unzip now note that
this is 1.4 GB in size so it's going to
take a few minutes to download depending
on the speed of your internet all right
so once you've downloaded the 7z file
you can unzip it with 7-Zip or WinRAR and
then you're going to see this folder now
you can extract this folder anywhere so
I'm going to extract this on my desktop
and this is a very large folder so it's
going to take a while for this to fully
extract all right so once that is fully
extracted we can now open up the folder
and then depending on whether you have
an Nvidia GPU or something else double
click on the appropriate bat file so in
my case I do have an RTX 5000 so I'm
going to run this and then you might see
this message so we got to click more
info and then run anyway and so that's
how you would install ComfyUI very
simple to use now if you click on Queue
Prompt you're going to see this error
message we don't actually have any checkpoints
or models yet so we got to
download a model to use for our image
generation now there are thousands
of models you could potentially use a
model basically defines the style of
your image so for example you can choose
from realistic to Disney Pixar style to
watercolor to anime there are like
thousands of checkpoints or models that
you can choose from and you can browse
all of these models in a site called
CivitAI which I'll link to in the
description below now be warned though
there's a lot of NSFW content here so
make sure you have your filters on or
it's going to be very not safe for work
now because there are like thousands of
models how do you know which one is the
best of course you want to filter out
all the bad ones and just use the best
ones well there's an awesome site called
imgsys which I'll also link to in
the description below this is basically
a rankings list where users can kind of
blind test different models and then all
these results are accumulated into this
ranking list so you can see for example
that RealVis has the most points so far
followed by Colorful XL followed by
Playground version 2.5 followed by
Juggernaut which is also a very popular
model and note that most of the ones at
the top are using Stable Diffusion XL
which is basically a higher quality
version compared to Stable Diffusion 1.5
so let's say we want to use this one
RealVis XL I'm going to copy this name
and in CivitAI I'm going to simply search
for this so here is RealVis XL version
4 and here are the results you can see
if you scroll down here are some sample
images from other users so let's just
download the most recent one version 4.0
lightning so let's go ahead and download
this it is 6.4 GB just a warning so it's
going to take a while to download and
then you would place your download which
is a safetensors file in ComfyUI and then
models and then checkpoints so let's
save it directly in this folder so once
you've downloaded the checkpoint or the
model file make sure that it is located
in ComfyUI then in models and then in
checkpoints so notice that I have
downloaded this RealVis XL safetensors
file in this folder and so the next time
you start up ComfyUI which I'm going to
do now you should be able to select the
model so again I'm going to open
run_nvidia_gpu.bat and then wait for this
to give me a link all right perfect so
now if you go back here in the Load
Checkpoint node if it doesn't
automatically select one for you you can
click on this and note that I can select
this RealVis safetensors file in here
now note that for other tutorials they
suggest that you download the SDXL base
file and also the SDXL refiner file but
this is just the default stable
diffusion XL model actually this is not
necessary and right now SDXL has gotten
so good that you don't really need to
use the refiner file anymore in most
cases and this base SDXL model it's
not great so you don't have to download
this and you can save a few gigabytes of
disk space again if you look at the
rankings from imgsys the SDXL base
files at least the lightning model seems
to be in 12th Place so I wouldn't
actually recommend you download this
base model and the refiner model so you
can actually save a few gigabytes of
disk space so anyways back in here all
we need to do is download whatever model
you want and then here is the positive
prompt and then here's the negative
prompt you can use your Mouse's scroll
wheel to zoom in and out and I'll
explain what all these boxes mean in a
second but first let's just generate an
image to see if this all works so for
the positive prompt I'm going to enter a
castle in a forest and then for the
negative prompt this is basically all
the things you don't want to see in your
image for example I don't want to see
people cars KN and that's pretty much it
I'll cover some more advanced prompting
techniques later in this video but let's
just click Queue Prompt for now and you can
see right now it's running this box
which is highlighted in green and then
it's running the positive prompt the
negative it's going through the
KSampler and then now you can see the
progress bar is up here so it's in the
process of generating the image and then
it goes through this VAE Decode and
finally we get an image of a castle in
the forest now this image is not great
and we're going to go over how we can
make this better now first I want to
really give you a solid understanding of
what all these boxes mean because I
think it'll help you when you build more
complex workflows so first of all let's
just ignore all of this and start from
scratch so if I move down here you can
start a new box or node by right
clicking and then you can see add node
and you can select whatever you want now
there are so many options here I
wouldn't recommend this way so another
way to do it is to doubleclick and then
after you double click you can actually
search for whatever node you want so if
I type in checkpoint for example you
can see that we have a Load Checkpoint
node here so I'm going to select this
and because the RealVis model is the
only model I have in this models folder
it selects this by default so again this
checkpoint basically defines the style
of your image for example there's going
to be some that work particularly well
for anime some that work well for
realistic photos and then some that work
well for Disney Pixar like characters
all right so next we need to add in our
positive and negative prompts now again
we can either right click and try to
find the positive prompt node here I
would not recommend that it's really
hard to find a node in these options or
you can doubleclick and then search for
the node here in which case we see it
here or another way is to drag from one
of these connectors so the prompts
should be linked to this clip connector
so I'm going to drag a noodle out and
then once I release it you can see that
it gives me several options which it
thinks is most relevant and indeed the
prompt window is this one CLIP Text
Encode so we actually need to drag two of
these one for the positive prompt and
one for the negative prompt now there
are several ways to clone this node so
one way is to right click on this and
then click clone in which case we'll
have another node here but then we'll
need to connect this ourselves so I'm
just going to delete that and then
another way is you can just click on
this and press Ctrl+C to copy it and if
you press Ctrl+V to paste it this is what
you get but if you want this to have the
same connectors as this original node
then instead of pressing Ctrl+V you
can press Ctrl+Shift+V and so now when
I paste in this node it's automatically
connected to this now just to avoid
confusion you can also rename this so
for example you can set the title to
positive prompt and then press okay
and then here you can right click and
then set the title to negative prompt
and then press okay you can also set the
color if you want so for example I can
right click and then in colors I can set
this to red and this is important
because when you build more complicated
workflows this graph is going to be
really complicated so if you color code
things and rename these nodes it just
keeps things more organized and helps
you avoid confusion all right next step
is we need something called a
KSampler and it's basically an algorithm
that takes in your prompts and takes in
a latent image which we'll go over in a
second and create an image from that so
if we click on this conditioning
connector and drag it out we should see
KSampler here so I'm going to select
that this is for the positive and then
for the negative prompt we will connect
this to here and then for the model we
just drag this one all the way to here
where it says model and then for latent
image let's drag a line out and then
here it would give us the option to
create an empty latent image so let
me just drag this over here to keep
things cleaner all right so what on
Earth does this mean what exactly is an
empty latent image so how stable
diffusion works is it doesn't create an
image from a blank canvas actually what
it does is it starts with an image of
just random noise and then at every step
it removes some of that noise and if you
remove enough noise you get whatever
image you prompted it with so this empty
latent image is basically an image
of random noise and you can select the
width and the height so for example we
can change this to 800 if you want and
then same with the height 800 by the way
for SDXL it's best to create an image of
1024 x 1024 because that is what it's
optimized for and then if you are using
Stable Diffusion 1.5 or earlier it's
best to use 512 x 512 and then the batch
size is how many images you want so if
you set this to two basically it will
create two images at once but for now
let's just set this to one all right so
next let's go over all these settings
the seed is basically the starting point
of this random image this image is just
random noise but there could be an
infinite number of images of random
noise each of them would be slightly
different so the seed basically defines
the starting point and usually we keep
the seed at random so that's what this
one does however if you keep it at fixed
and you set the seed to a certain value for
example 69 then if all of the other
settings are the same you're going to
generate the same image every time
because the seed or the starting number
is the same but for now let's leave it
at zero and then for here let's set it
to randomize and then the number of
steps is basically again if we go back
to how stable diffusion works is how
many steps of noise removal do we want
so if it's just a few steps you're only
going to get something like this you
haven't removed enough noise yet and
after enough steps you're going to get a
very clean image of whatever you want to
generate so generally few steps would
give you a lot of noise and then after a
point like if you exceed 50 to 100 steps
then you're not going to get a better
image in fact you might get some noise
or artifacts because there's not much
remaining noise to be removed so I'm
just going to keep the number of steps
at 20 for now and then CFG is basically
how well do you want this algorithm to
follow your prompt so if you set this to
one for example it's not going to follow
your prompt and it's going to try to
generate whatever it wants basically it
can be more creative however if you set
it to 15 for example then it listens to
your prompt very literally and it tries
to generate whatever you specify in your
prompt which is sometimes not what you
want sometimes if it's too literal
you're going to get some weird results
so generally a value of like seven or
eight would work best and then the
sampler name this is basically the
algorithm that's used to remove this
noise and generate the image for you so
Euler is a common one this is one of the
fastest ones and then if you want to go
for Quality I would suggest one of these
so DPM++ 2M or 2M SDE but for now let's
just go with Euler and then for
scheduler this basically gives even more
quality so usually if you want a really
good quality image you would select Karras or
exponential these settings are just very
subtle differences you can add to the
image so just play around with it so you
can get a feel for what settings work
best for your particular use case and
then denoise what this is is because
again our latent image or starting image
is basically an image of random noise
the denoising strength is basically
how much noise do we want to remove from
this initial image so if we're starting
from scratch then obviously we want to
remove 100% of the noise so that's why
this is set as one all right so the next
step is we next need to connect this
latent connector so I'm going to drag
this out and we need to connect it to
what is called a VAE Decode a VAE
basically encodes an image into a latent
space but right now we want to do the
reverse of that so we need to decode
this latent image into an image that we
actually want to see so that's why we
need this final step to decode this
latent image and then we need to drag
the VAE from our checkpoint to this VAE
connector that we see here so in most
cases your checkpoint should come built
in with a VAE so all we got to do is
just connect this VAE to this VAE
connector here and then we are almost
done right now we just need to produce
the image so once we drag this out we
actually have two options we can either
preview the image so if we preview the
image it's not going to save the image
in our computer or we can select save
image which also shows us a preview of
the image but it also automatically
saves the image so I'm going to select
preview first because I don't want it to
save every image it generates I only
want to select the good ones to save on
my computer so I'm going to show you how
to save a previewed image in a second
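
The graph assembled above (Load Checkpoint, two CLIP Text Encode nodes, Empty Latent Image, KSampler, VAE Decode, Preview Image) can also be written down as data. The sketch below is modeled on the JSON that ComfyUI exports in its API format; the class names and input keys reflect my understanding of that format and should be checked against an export from your own install. The checkpoint filename and both helper functions are hypothetical.

```python
import json

# Hypothetical filename: use whatever file sits in ComfyUI/models/checkpoints.
CHECKPOINT = "realvisxl_v40_lightning.safetensors"


def build_text_to_image_graph(positive: str, negative: str, seed: int = 0) -> dict:
    """Return the basic text-to-image graph as a plain dict.

    Keys are node ids. A link is written as [source_node_id, output_index];
    the checkpoint loader is assumed to expose MODEL, CLIP, VAE as outputs 0, 1, 2.
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": CHECKPOINT}},
        "2": {"class_type": "CLIPTextEncode",  # positive prompt
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",  # negative prompt
              "inputs": {"text": negative, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["4", 0],
                         "seed": seed, "steps": 20, "cfg": 7.0,
                         "sampler_name": "euler", "scheduler": "karras",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "PreviewImage",
              "inputs": {"images": ["6", 0]}},
    }


def dangling_links(graph: dict) -> list[tuple[str, str]]:
    """List (node_id, input_name) pairs whose link points at a missing node."""
    missing = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and len(value) == 2 and value[0] not in graph:
                missing.append((node_id, name))
    return missing


if __name__ == "__main__":
    graph = build_text_to_image_graph("a medieval warrior", "cartoon, watermark")
    print(json.dumps(graph, indent=2))
    print("dangling links:", dangling_links(graph))
```

Reading the dict top to bottom mirrors the order in which the nodes were wired on the canvas, which makes it easier to see why the KSampler needs four incoming connections before it can run.
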
there are very few software tools that I
use every day but this is one of them
thanks to our sponsor turbo type it's
free forever and it saves me so much
time basically you can create custom
keyboard shortcuts so that you don't
need to keep typing out repetitive
things for example if there's a prompt
that I use in chat GPT very often I can
make a shortcut here and then when I go
to chat GPT or anywhere else I just need
to type in the shortcut and voila or
let's say I have a very long email
address I can also make a shortcut for
that so that whenever I need to enter in
my email I can simply type in the
shortcut and it types out the email
finally it also supports Rich Text for
example you can add in bold and italics
and add links to your text as well so
let's say I need to send out a lot of
cold emails with the following template
well I can just create a shortcut for
that and then whenever I start an email
I just need to type in the shortcut and
voila the text is already styled and
linked for me there are hundreds of
pre-existing templates that you can
choose from including common prompts for
chat GPT business finance medicine and
more this tool saves me so much time
every day there's absolutely no reason
not to use this because they have a free
forever plan so definitely check it out
and download the free Chrome extension
in the link below but that is pretty
much it so for the positive prompt let's
put a medieval warrior realistic 8K
masterpiece these are just some keywords
that I tend to use a lot to give it more
detail and then also ultra detailed is
another good one and then for negative
prompts again these are all the things
we don't want to see in our image so for
example I don't want it to be a painting or a
cartoon or anime drawing I don't want it
to have any copyright or watermarks and
I think we are good to go for now so
before we click run note that I had this
previously this is the default so I
don't want to run both of these at
once so I'm going to select all of these
nodes and delete them so to select
multiple nodes at once what you can do
is hold down control and then drag to
Encompass all the nodes that you want to
select and then if you want to move this
group of nodes around you need to hold
down the shift key and then you can drag
this wherever you want and then to
delete everything at once all you got to
do is press delete all right so moving
back down here everything else looks
good so I'm going to press Queue Prompt so
note that it's starting here it's
loading the checkpoint and then it's
moving over to KSampler now it's
generating the
image and then it's decoding the image
and then voila we have a medieval
warrior so basically this is like the
standard workflow for a simple text to
image generation and then let's do
another one so let's say I want to
generate two images at once all we got
to do is increase the batch size to two
and then press Queue Prompt again and note
that it starts off here it starts in the
KSampler it doesn't start here or
here and that's because we haven't
changed any of these other nodes in a
previous step so all of this is already
saved in memory it only loads from here
and so this makes ComfyUI very efficient
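
The idea of re-running only what changed can be illustrated with a small sketch. This is a toy model and not ComfyUI's actual caching code: each node gets a signature built from its own settings plus the signatures of everything upstream, and a node needs to run again only when its signature differs from the previous run. All function names and the example graph are hypothetical.

```python
import hashlib
import json


def node_signature(graph: dict, node_id: str) -> str:
    """Hash a node's type, its widget values, and the signatures of its inputs."""
    node = graph[node_id]
    parts = {"class_type": node["class_type"]}
    for name, value in sorted(node["inputs"].items()):
        if isinstance(value, list):  # a link: [source_node_id, output_index]
            parts[name] = [node_signature(graph, value[0]), value[1]]
        else:                        # a plain widget value
            parts[name] = value
    return hashlib.sha256(json.dumps(parts, sort_keys=True).encode()).hexdigest()


def nodes_to_rerun(old_graph: dict, new_graph: dict) -> list[str]:
    """Return ids of nodes that are new or whose signature changed, sorted by id."""
    return sorted(
        node_id for node_id in new_graph
        if node_id not in old_graph
        or node_signature(old_graph, node_id) != node_signature(new_graph, node_id)
    )


def example_graph(batch_size: int) -> dict:
    """A cut-down text-to-image graph with readable node ids."""
    return {
        "loader": {"class_type": "CheckpointLoaderSimple",
                   "inputs": {"ckpt_name": "model.safetensors"}},
        "prompt": {"class_type": "CLIPTextEncode",
                   "inputs": {"text": "a medieval warrior", "clip": ["loader", 1]}},
        "latent": {"class_type": "EmptyLatentImage",
                   "inputs": {"width": 1024, "height": 1024, "batch_size": batch_size}},
        "sampler": {"class_type": "KSampler",
                    "inputs": {"model": ["loader", 0], "positive": ["prompt", 0],
                               "latent_image": ["latent", 0], "steps": 20}},
        "decode": {"class_type": "VAEDecode",
                   "inputs": {"samples": ["sampler", 0], "vae": ["loader", 2]}},
    }


if __name__ == "__main__":
    # Changing only the batch size leaves the loader and prompt untouched.
    print(nodes_to_rerun(example_graph(1), example_graph(2)))
```

In this toy version the checkpoint loader and the prompt encoder keep their signatures when only the batch size changes, so only the latent, sampler, and decode nodes are listed for re-execution.
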
and then so now we have two images this
is the first one here's the second one
and remember this is not saving your
image this is just the preview image
node so to save it all you got to do is
right click and then press save image so
this is the most basic text to image
workflow for stable diffusion hopefully
this gives you a better understanding of
what a KSampler is what an empty latent
image is and what a VAE Decode is because
once you set up more complex workflows
you're going to need to understand what
these nodes actually do so I hope this
gives you a good understanding all right
before we move on to the next section
let's go over some tips and tricks for
navigation and organization and
productivity so first thing is I'm not
sure if you can see it on my screen
share right now but there is a very
faint dark blue frame on your ComfyUI
canvas so I'm hovering my cursor over
that frame right now I wish they could
make the contrast higher so you can
actually see the blue line but anyways
there's this very faint Blue Line and so
whenever you start an interface the
default location would be within this
blue frame so it's always best to have
your workflow within this blue frame so
that whenever you load up ComfyUI your
workflow will show up right away and you
don't need to like try to find it within
this huge canvas all right next thing
you can do is let's say this is a
workflow that you want to use again in
the future you can click this to save it
or press Ctrl+S so let's name this
temp.json and then you can save this
wherever you want I'm just going to save
this in the ComfyUI folder another
keyboard shortcut to be aware of is Ctrl+A
which is select all and then delete or
backspace which will delete everything
you've selected all right so right now
this whole workflow is gone now I can
undo it by pressing Ctrl+Z which would
undo my delete of the workflow or I can
also redo the action by pressing Ctrl+Y
which deletes the whole workflow again
now to load up a previously saved
workflow you can press load here or
press Ctrl+O so if we go into our ComfyUI
folder and then select this temp.json
which we just saved you can see it
has loaded our workflow back up now a
few shortcuts for navigation you can
either press on your mouse and then move
the canvas around or you don't actually
need to click on your mouse you can also
hold down the space bar and then move
your mouse around and it would still
move the canvas now you can zoom in and
out by using the scroll wheel and then
to select multiple nodes you just simply
hold down control and then select this
one and then let's say I want to select
this one and select this one so I'm
holding down control for each of these
and then let's say I want to delete
these I can just press delete and then
let me undo that by pressing Ctrl+Z
you can also select multiple nodes by
holding down control and then dragging a
frame around all the nodes you want to
select and then if you want to move
these nodes around simply hold shift and
then you can drag this group of nodes
that you've selected wherever you want
another thing you can do is let me
delete this first now every time you
drag a connector out there's also an
option called reroute so if I click this
it's basically just an extra blank node
which extends your connection further so
it's basically the same thing as just
connecting this vae to this vae but the
nice thing about this is let's say you
don't want this line to be hidden behind
this node well you can drag it out like
this so you can clearly see that this
line is being connected here and then
one more thing is right now you see that
for example in KSampler these values
are set in this node but what if you
want to set this value somewhere else
and then link it to here so for example
for CFG if you want to set this value
somewhere else and then link it to the
KSampler you can right click on this
and then under convert widget to input
you can set any of these options to an
input so let's say we want to set CFG to
an input now you can see that CFG has
disappeared from here and it is now an
input connector so let's drag this out
and then we need to actually use the
node called Primitive so right now this
value is set to seven it's connected to
this CFG input so the CFG of this
KSampler is 7 so that's how you would use
it all right one final thing I want to
share with you and this is really cool
let me press Ctrl+A and delete all of
these let's say I made an image
previously using ComfyUI well if I drag
that image onto the canvas what happens
is it actually gives me the entire
workflow that I used to create the image
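
The drag-and-drop trick works because the workflow travels inside the image file as metadata. The sketch below reads PNG text chunks with only the standard library. It assumes the workflow is stored as JSON in an uncompressed `tEXt` chunk under the key `workflow`, which matches what I have seen in ComfyUI output but should be verified on your own files. The function names are hypothetical.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length
        if chunk_type == b"IEND":
            break
    return chunks


def workflow_from_png_bytes(data: bytes):
    """Return the embedded workflow as a dict, or None if the metadata is absent."""
    raw = read_png_text_chunks(data).get("workflow")
    return json.loads(raw) if raw is not None else None


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        workflow = workflow_from_png_bytes(handle.read())
    print("no embedded workflow" if workflow is None else json.dumps(workflow, indent=2))
```

If an image has been re-encoded or stripped by a hosting site, the chunk is gone and the function returns `None`, which is the "assuming they haven't deleted that metadata" caveat in practice.
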
how cool is that so like if you go
online and other users have shared their
ComfyUI generations and assuming they
haven't deleted that metadata you can
actually download their image and then
drag and drop that image onto ComfyUI to
look at the entire workflow that they
used to generate that image this allows
you to learn really quickly so those are
like the basic things you need to know
for organization and productivity using
ComfyUI so let's move on to the next
section first of all we need to install
this plug-in called ComfyUI Manager it
will make your life a lot easier for
installing extensions and plugins and
missing models so I'm going to link to
this GitHub repo in the description as
well and if you scroll down a bit they
will give you some installation
instructions so in our ComfyUI folder
and then in our custom_nodes folder we
just need to open command prompt here so
in this bar up at the top type in CMD
and this will open up our Command Prompt
and you can see that we are now inside
our custom_nodes folder next all we got
to do is git clone this repo so we will
paste it in here now you do need to have
git installed first if you don't here's
how to install git if you already have
git installed feel free to skip to the
next section so all we got to do is
download the latest release for whatever
operating system you're using so I'm
using Windows so I'm just going to click
on download for Windows I'm running 64
bit so I'm going to click on this to
download and it's now downloading this
exe file so once that's completed all we
got to do is open that exe file and then
follow the steps so I'm going to click
on next I'm just going to go with the
default install location which is
Program Files\Git so I'll click next for
that and then I'm just going to leave
this at the default and then I'm going
to click next again and click next here
we're just going to use the default
settings for all of these there's a lot
of settings that you need to go through
so I'm just going to click next for all
of these all right and then it should go
ahead and install all the files so this
might take a few
minutes perfect so now we have git
installed all right so assuming you have
git installed already you simply copy
this line and then paste it in here and
then press enter and you'll see that
now we are cloning this ComfyUI Manager
into the custom_nodes folder and then if
you actually open up your custom_nodes
folder you can see this new folder
called ComfyUI-Manager all right so
next we need to restart ComfyUI so
going back in our Windows portable
folder we are going to run ComfyUI
again and you can see that it has
detected that we have ComfyUI Manager
installed so it's now installing
dependencies all right so after you open
ComfyUI you should now see this
Manager button in the right menu so next
I'm going to show you how to upscale an
image and then we're also going to move
on to some more advanced workflows like
image to image and ControlNet and
installing other plugins so first let's
click on this manager button and then
the main button that you will use is
this one Custom Nodes Manager there's
also a button where you can like update
ComfyUI or update all or install
missing custom nodes for example if you
import a workflow that was made by
another user you might have some missing
models or nodes or dependencies so
clicking this button will just
automatically install all of them so
that this other user's workflow will
work on your computer and then here this
is Model Manager so instead of going to
CivitAI and browsing through all the
models you can just easily download the
model here through this interface so
let's click on this and then let's
search upscale and you should see a lot
of different upscaler algorithms so
usually the ones that I find work best
are RealESRGAN x4 this is for realistic
photos as the name implies and then
there's also 4x UltraSharp which works
pretty good as well just to keep it
simple for this tutorial I'm just going
to download these two but definitely you
can install all these and play around
with it and see which algorithm works
best for you so I'm going to select
these two and then click install all
right so after it is finished installing
note that you need to click the refresh
button on the main menu to apply these
installations so we're going to click
close and then close again and then
click refresh all right let's dive in to
see how we can upscale images so we're
going to use the same prompts a medieval
warrior realistic 8K Masterpiece Ultra
detailed and then for the initial image
width and height let's set it to 512 now
I understand that for SDXL the optimal
dimensions are 1024 x 1024 but in this
example I'm just going to show you a
really blurry and low resolution image
and then we're going to upscale it by
four times so you can clearly see the
before and after and then for the batch
size let's leave it at 1 for now number
of steps we can also leave it at 20 and
then here instead of preview image let's
delete that and then we can just drag
this image out and then we don't see any
upscale here so we need to click search
and then we can search upscale now there
are a few options one is upscale image
which is kind of the same as upscale
image by and then we have upscale image
using model now I'll show you what this
one does first upscale image by I would
not recommend this because this is
basically just upscaling your initial
image but it's not adding any details
it's basically just increasing the size
so let's say we want to upscale this by
two times so instead of 512 x 512 it's
going to be 1024 x 1024 and then for the
image let's drag it out and we will have
a preview image node and also I want to
preview the image before we upscale it
so you can compare the before and after
so over here I'm also going to drag a
preview image node all right now let's
run this and see what we get so I'm
going to press queue prompt
all right so let's look at the
initial image this is only 512 x 512 so
I'm zooming in quite a bit and you can
see the details of his face are very
blurry and then let's look at the
upscaled image yes this is 1024 x 1024
but you can see the details are the same
this is pretty much the same image the
same blurriness we're not adding any
details here so again it's not
recommended to use this upscale image by
method because you're simply just
resizing the image but you're not adding
more details to the image so I'm going
to delete this one and also this one so
next let's drag out another node and
this time I'm going to search upscale
again and then we are going to use
upscale image using model and then after
that we need to actually input an
upscale model so I'm going to drag this
node out and then we should get the one
and only option which is upscale model
loader so because I've downloaded
4x-UltraSharp and RealESRGAN we should
automatically see them over here now
both of these are 4x so your output
image will be four times the resolution
if you want 2x for example well you can
go into manager again and then click on
model manager and then find an upscale
model that is only 2x like this
RealESRGAN x2 all right and then just one last
step is we need to drag out another node
to either save the image or preview the
image so if I run this
again you can see that
if I zoom in on both of these this is
512 x 512 so the details are very blurry
but if you expand this image which is
like over 2,000 x 2,000 you can see that
the details are a lot sharper especially
the patterns on his helmet and on his
armor now this isn't the best way to
upscale it's actually best to do one
round of image to image first before we
upscale so I'll show you that in a
second so one more thing I want to
mention is that sometimes you don't want
to upscale all the images that you
generate you want to decide which image
you want to upscale because let's say
for this initial image you don't like
the design you don't like the
composition you don't want to proceed
further with upscaling and waste
Computing resources so how do we decide
if we want to upscale or not well first
of all you can break off a workflow by
selecting the node where you want to
break it off in this case I want it to
pause here so it only generates the
initial image but it doesn't proceed
further to the upscale unless I want it
to do that and then I'm going to press
Ctrl+M and this will mute the node and
basically everything that goes after
this point is going to be paused until I
unmute this node now when I press queue
prompt and it generates an image I can
decide whether I want to proceed further
with this upscale method and if I do
then I would unmute this node and press
queue prompt again however one thing to note
is in this case you do need to set the
same seed otherwise if you don't set the
same seed when you press queue prompt again
it's going to generate a completely
different image and then it's going to
upscale that image so going back to here
we need to actually set this to fixed so
it's going to be a fixed seed and then
so if we run queue prompt again you can see
that it's generating a new image and
that image is being fed into here and
then let's say I like this image I want
to upscale it then I would click on this
node which is now muted I'm going to
unmute this by pressing Ctrl+M so now
when I press queue prompt again you can see
it's actually proceeding from here and
then now it's upscaling this image and
we can now see our upscaled image so
that's one way to do it another even
more efficient option is if you go back
to manager and then click on custom
nodes manager and then let's search for
Image Chooser and this is a node that's
created by chrisgoringe let's go ahead
and install
this all right so it says
restart required so let's restart this
all right so we are back here now let's
say we want to generate four images and
we would choose one of them to go
through the upscale so let's increase
the batch size here to four and then we
can keep everything as is for the seed
actually we can set this back to
randomize and then yes this has to go
through this VAE decode node to decode the
latent image and then instead of preview
image let me just delete that let me
also delete this node for now and then
let's drag this image out and we will
search for Preview Chooser which is down
here and then for any images that we
select here we can then proceed to
upscale it so let me just run this for
you first so you can see what this does
so right now it's loading the checkpoint
again because we've restarted the
interface now it's inputting the
positive and negative prompts now it's
generating four images through this K
sampler then it's going to decode these
four images and then so now we have four
images all right so let's say we want to
select this one to upscale so we would
click on this or if you want to select
more you can always increase the count
to two and then select this one for
example but I'm going to decrease this
to one and unselect this so we are only
going to proceed with this image through
the upscaler and then we simply click
progress selected image and then this
would go through the upscaler and it's
using our 4x-UltraSharp model and
then voila here is our upscaled image so
in a nutshell that's how you do
upscaling now there is an even better
method to upscale images which kind of
uses image to image so next we're going
to go over how to do image to image and
then we're going to go back to this
better upscaling method all right next
I'm going to show you how to do image to
image so what I'm going to do is first
of all hold Ctrl and then drag
these nodes to select all these nodes
I'm going to copy it and then paste it
somewhere here and then I'm going to
hold down shift and then move it to
where I would like all right now going
back to this another way instead of just
deleting this workflow is to hold
control and select all of this and then
press Ctrl+B to bypass so if you see
the nodes highlighted in purple that
means they will be bypassed they will not
run so only this will run all right so
this is our standard text to image
workflow right we have our checkpoint
and then we have our positive prompt our
negative prompt this goes into the K
sampler the K sampler takes an empty latent
image which is just an image of random
noise and then after going through this
algorithm and going through this amount
of steps then the final step is we need
to use our vae to decode this latent
space back into an image that we want to
see now for image to image we don't want
to start with an image of random noise
right we want our input to be an image
so let me select this node and delete
that and then I'm going to double click
anywhere and then search for load image
so let me click on this so this is the
default but you can click this button to
upload an image I'm going to upload this
image now we can't just directly drag
this image to this latent image connector
that's because we need to first convert
this image to a latent image and then
connect the latent image into here so
let me drag this connector out and you
should see the first option here would
be VAE Encode and that's exactly what
we want to do we want to use a VAE to
encode this into latent space and then
drag the latent image onto this
connector all right and then where do we
get our VAE
well we can just get the VAE from the
checkpoint that we loaded so let's drag
this to here and then you can see that
everything is connected now we do need
to set up some additional settings for
the K sampler so one thing I forgot to
mention is that if your checkpoint is
lightning it only takes around like 5 to
8 steps you actually don't need 20 steps
that's the awesome thing about lightning
models and some models can even generate
a decent image in as few as two
steps for us let's set the steps to
seven sampler and scheduler we can leave
as is denoise this is what we want to
change if you remember for our text to
image denoise this is saying to take our
latent image of random noise and replace
100% of that noise so that we get the
image that we specified in our prompt
but in this case because we're not
starting with random noise we're
starting with an image we don't want to
remove everything we want to retain some
of this image so in this case if you're
doing image to image the denoising
strength means how much of this original
image do you want to remove so if you
set this to 100% or 1.0 in this case
it's going to remove everything from
this image you're going to get a
completely different image conversely if
you set this to zero then it's just
going to produce this exact image
nothing would change so depending on how
similar you want the image to be it's
better to have something in between so
let's do 0.3 first so it's more similar to
this image and then I'll show you an
example of 0.8 for comparison all right so
now that we have everything in place
there's just one final step which is to
drag this image connector out and then
we are going to preview image let me
just reposition this here so you can see
the entire workflow and we are good to
go let's click queue prompt so it's our
first time starting this workflow so it
takes some time to load in the
checkpoints and then it's going through
the prompts it's taking this image and
encoding it it's now going through this
K sampler and then it's going to decode
the image and then preview the image for
us all right perfect so you can see if I
drag this over here just temporarily for
comparison this is image to image so
here's our original image here's our new
image and the denoising strength was set
to 0.3 all right let me now try one with
0.8 for example and you should see that it
would be less similar to the original
image so I'm going to set the denoising
strength to 0.8 press OK and then run
this again and notice it's not starting
from the beginning it's starting from
here because this is the last node where
we changed the settings so again this
makes comfy UI very efficient and now
it's decoding it and you can see for
this one it's more different compared to
the original image
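
The image-to-image graph built above can also be written down as data. Below is a minimal sketch in Python, assuming ComfyUI's API-style workflow format (a dict of node id to `class_type` and `inputs`, with links written as `[source_node_id, output_index]`). The class names and input names are my best understanding of the built-in nodes and should be checked against your install. The file names, the `cfg` default, and the sampler and scheduler values are placeholders, not settings taken from the video.

```python
import json


def img2img_workflow(image_name, checkpoint, positive, negative,
                     denoise, steps=7, seed=0, cfg=7.0):
    """Build an image-to-image graph as a plain dict (node id -> node).

    denoise=1.0 discards the uploaded image entirely, denoise=0.0 returns it
    unchanged, and values in between trade similarity for change.
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0.0 and 1.0")
    return {
        # Checkpoint loader outputs: 0 = model, 1 = CLIP, 2 = VAE (assumed order).
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": checkpoint}},
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        # The uploaded image replaces the empty latent image of text-to-image.
        "4": {"class_type": "LoadImage", "inputs": {"image": image_name}},
        # The image must be encoded into latent space before sampling.
        "5": {"class_type": "VAEEncode",
              "inputs": {"pixels": ["4", 0], "vae": ["1", 2]}},
        "6": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "negative": ["3", 0], "latent_image": ["5", 0],
                         "seed": seed, "steps": steps, "cfg": cfg,
                         "sampler_name": "euler", "scheduler": "normal",
                         "denoise": denoise}},
        "7": {"class_type": "VAEDecode",
              "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
        "8": {"class_type": "PreviewImage", "inputs": {"images": ["7", 0]}},
    }


# Hypothetical file names, used only to show the call.
workflow = img2img_workflow("input.png", "checkpoint.safetensors",
                            "a medieval warrior", "anime, cartoon",
                            denoise=0.3)
print(json.dumps(workflow["6"], indent=2))
```

Only the `denoise` value differs between the 0.3 and 0.8 runs shown above, which is also why only the sampler and the nodes after it need to run again.
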
so basically in a nutshell this denoise
value determines how similar you want
your output image to be compared with
your uploaded image all right now
remember how I said there's a much
better way to upscale images using image
to image well now that we've gone over
image to image and specifically the
denoising strength setting I can now show
you a much better way to upscale images
and this is one of the best ways to
actually upscale images in order for
this to work we need to download another
node so let's go into manager and then
click on custom nodes manager and this
time we are going to search for ultimate
SD upscale so let's click install here
and after it has installed it says we
need to restart ComfyUI so let's click
that all right so let's start with the
default text to image workflow and I'm
going to show you how to set this up so
instead of the K sampler this new
node is basically going to replace the K
sampler so let me hold control and then
select all three of these nodes and then
click delete and then I'm going to drag
a connector out here and then search for
ultimate SD upscale and then we are also
going to connect the negative prompt
here we're going to connect the vae here
and also connect the model here this
time we are not going to use a latent
image so I'm going to delete that and
then instead we need to upload an image
so I'm going to drag a connector out
from image and then select load image
and then I'm going to choose this image
now this is a 512 x 512 image that we
generated previously you can see it's
very blurry and then for the prompt we
are going to set this to the same prompt
that we used before which is medieval
warrior 8K Masterpiece Ultra detailed
realistic and then for the negative
prompt again same as before we are going
to type in anime 2D cartoon painting
watermark all right so we are almost
good to go one last thing is we need to
select an upscale model so let's drag
this connector out and then select the
one and only upscale model loader and
this will pull up the upscaler models
that we've downloaded previously in this
tutorial so let's go with 4x-UltraSharp
and now what this ultimate upscale does
is called tiled upscaling so let's say
we upscale this by two what this is
actually going to do is break this into
four sections so it's 2 x 2 and then
it's going to apply image to image for
each quadrant so it's going to generate
image to image for this one and then
image to image for this one and then
this one and then this one and then it's
going to stitch all four quadrants
together to give you your upscaled image
and now the trick is if you use image to
image with this with a low denoising
strength which again means you retain
most of your original image
then this method is actually a
lot better than just upscaling with
4x-UltraSharp so again the key here is
to set the denoising strength to a relatively
low value I think 0.2 is a good start or
you could even go with 0.15 all right so
all of these settings we've gone over
before tile width and tile height just
refer to the dimensions of each one of
these quadrants so this one would be 512
x 512 this one would also be 512 x 512
or whatever you set here usually you
would just set this to the width and
height of your original image and then
mask blur and tile padding this just
refers to how well these tiles blend
together after they are glued back
together to form your final image so I
just tend to leave it at the default but
feel free to play around with these
settings if for some reason you get a
very obvious line in between tiles and
then that's pretty much it the final
step is to drag this connector out and
then I'm going to select preview image
all right so let's click Q prompt and
I'll show you what that gives us all
right so if you now compare these two
images you can see that this is a lot
more detailed and the details are a lot
finer now previously we did a clean
4x-UltraSharp upscale which looks like
this right the details are not great you
can see the hair doesn't really look
like hair same with his facial hair same
with his face it Still Remains blurry
now this is 4 x 512 so this is 2048 x
2048 while this one is only 1024 x 1024
since we upscaled it two times so to
give you an Apples to Apples comparison
we can either set this value to four
which would upscale it four times or let
me show you another trick you can do to
upscale this further and this really
shows the versatility of ComfyUI you can
literally just customize the workflow to
whatever you want so let's leave this to
two and then get rid of this and then we
can actually plug in another ultimate
upscale here so I'm going to click
search and then search ultimate again
and then for model we can drag model
here positive prompt we can use the same
one negative prompt we can also use the
same one vae we can drag that from the
model and then upscale model we can also
use the same one or a better way to do
this is let me just select this node and
delete it again is to click this press
Ctrl+C and then over somewhere here press
Ctrl+Shift+V and it would
automatically link everything that is
linked from the node that you are
copying now we don't want this original
image to be linked here so let me get
rid of this linkage and instead we just
want to pass our 2x upscaled image to
here to upscale by 2x again all right
now the only thing that we need to
change is because this image that is
being passed here is now 1024 x 1024 we
should set the tile width and tile height here to 1024
by 1024 all right so now if we drag this
out and select preview image let me run
this and I'll show you the insane
quality that this can generate compared
to if we just did a normal upscale
method like this all right so here is
our preview image let me just save this
first and then I'm going to pull both of
these side by side on the left this is
only using 4x-UltraSharp to upscale
my image of 512 x 512 by four times and
then this one is using ultimate upscale
to upscale my image four times now
notice the insane difference if I zoom
in on this guy's face and zoom in on
this guy's face notice how much more
detail this ultimate upscaler is able
to generate his facial hair his eyebrows
his eyes are super detailed the lighting
on his nose also super detailed whereas
for this one on the left even though
it's the same resolution his face is
just blurry and his facial hair does not
look realistic his eyebrows his nose are
super blurry and then same with the
crown here you can see the details are
really lacking in this left photo
whereas for this one with the ultimate
upscale everything just looks super
sharp and crisp now notice that because
we are using image to image with a denoising
strength of 0.2 there are going to
be subtle differences from the original
image so for example you can see this
dude is looking straight at the camera
now whereas this guy is looking slightly
to the right and that's because it
doesn't just take the original image and
upscale it so if you really want to
retain 100% of your original image
then I think this plain 4x-UltraSharp method is better
however if you're okay with changing
some of the details to get a much
sharper and more detailed image then
ultimate upscaler is one of the best
options out there so yeah at least for
now ultimate upscale or basically this
is tiled upscaling this is one of the
best methods to upscale images and it
basically takes your image and breaks it
down into tiles and then for each tile
it does image to image but with a very
low denoising strength so that it
retains most of the original image but
it just adds more detail to that image
and then it glues all these tiles back
together to give you your upscaled image
this is one of the best upscaling
methods out there right now so that
covers upscaling next let's move on to
some more complex stuff
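
The tile arithmetic behind the two chained passes above can be checked with a few lines of Python. This is only a sketch of the idea: the function name `tile_grid` is made up for illustration, it ignores tile padding and mask blur, and the real Ultimate SD Upscale node may lay out its tiles differently.

```python
import math


def tile_grid(width, height, upscale_by, tile_width, tile_height):
    """Return (output width, output height, tile columns, tile rows).

    Assumes the image is first enlarged by `upscale_by` and then cut into
    tiles of at most tile_width x tile_height, each of which gets its own
    low-denoise image-to-image pass before the tiles are stitched together.
    """
    out_w = round(width * upscale_by)
    out_h = round(height * upscale_by)
    cols = math.ceil(out_w / tile_width)
    rows = math.ceil(out_h / tile_height)
    return out_w, out_h, cols, rows


# First pass: 512 x 512 upscaled by 2 with 512 x 512 tiles.
first = tile_grid(512, 512, 2, 512, 512)
print(first)   # (1024, 1024, 2, 2), i.e. four quadrants

# Second pass: the 1024 x 1024 result upscaled by 2 again, tiles set to 1024.
second = tile_grid(first[0], first[1], 2, 1024, 1024)
print(second)  # (2048, 2048, 2, 2)
```

Two 2x passes end at 2048 x 2048, the same resolution as the single 4x upscale it was compared against.
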
all right next I'm going to show you how
to use control net in ComfyUI now what
is control net and why do we need to use
it basically it's a tool to really help
you customize your image if you're
serious about image generation you got
to learn control net so for example you
can really control the pose of your
Generation by using an open pose
pre-processor so you would upload a pose
like this and I'll show you how to do
that in a second and all your
Generations would all align with this
pose so here's another example and you
can see all these Generations follow
this pose to some extent and you can
also adjust well how much do you want
your generation to follow your pose
here's another example it's a very
powerful tool instead of pose you can
also create a depth map and use that as
a reference so for example if you upload
this depth map all your Generations
would have the same depth map to some
extent including the lights including
the laptop this is a really powerful
way for you to control what objects show
up in what areas in your image and then
here's yet another example this is
especially useful if you have multiple
characters and the scene is very complex
but you really want to control where
those characters are in the scene then
again this is a great tool to give you
more control over those settings instead
of a depth map you can also upload
something called a canny pre-processor
which is basically just lines and you
can see your Generations would follow
this canny image here is another example
so control net is very versatile there
are so many things you can do with this
something that's very similar to canny
is line art so it's essentially the same
thing you take an image and you break it
down into line art and then use that as
a reference image for your future
Generations so you can see all of these
Generations follow this line art to some
extent here are some additional examples
and if you want to generate anime
there's an even better line art
pre-processor called anime line art and
this is more optimized for anime so as
you can see here if you want the same
pose the same outfit for your character
but maybe you want different colors
different backgrounds well you can use
this option to generate those images
here's another example and then similar
to line art there's also scribble where
you can draw in some lines and then that
would also influence your generations to
some degree as you can see in these
examples this is also a good one so you
can take any image and break down that
image into different segments and use
that as a reference and you can see all
your future Generations would also
follow the guidance of this image here's
another complicated example and you can
see it's able to control for all these
objects very nicely you can see with
this segment pre-processor you're able
to control the location of all these
people all these objects very precisely
in your image all right so let's jump
right in how do we use it so first of
all let's start with a very simple text
to image workflow this is just your
positive prompt negative prompt you're
taking in an image of random noise
you're plugging it through this K
sampler it's going to decode it and give
you your final image now let's start
with adding control net first of all in
this manager section let's click on that
and then we'll click on model manager
and then we'll search for control net
Union this is the newest contr control
net model and it basically includes all
of these options that we just talked
about so you don't need to go in and
install all of those pre-processors
separately so let's go ahead and install
this note that it is 2.5 GB so depending
on the speed of your internet this might
take a few minutes to download all right
after we've installed it note that we
need to click the refresh button so
let's click close here and then close
and then click refresh all right so how
do we use control net it might be not
intuitive but we don't actually link
control net to the latent image we
actually link it to the positive prompt
so let me select this node and delete it
first and then I'm going to move up here
for a bit so let's drag this connector
out and then search for apply control
net so again it's not intuitive but
control net is actually applied to the
positive prompt before going into the
K sampler so we are going to drag
this conditioning connector back into
the positive connector of the K sampler
all right and then the next step is we
need to select a control net model so
let's drag this out and then it's just
the first option here which is control
net loader so if that control net Union
was the only thing you've installed this
is the only model you should see if not
you can click on this and it would have
a drop down of all the compatible
control net models that you've
downloaded but again this is the only
one you need this is the newest one
and it contains all of these options for
you so you don't have to go ahead and
download each of them separately all
right so the next step is the image
first of all I'm going to double click
here and then type in load image and
then select this node and then let's say
I want to specify a certain pose for my
generation so I'm going to upload this
image and I want whatever I generate
down here to follow this pose so what we
need to do is actually load this image
to a preprocessor now in order to do
that we need to download another node
and you know this is kind of messy I
wish we could just merge all of these
nodes together into one node just to
keep things cleaner but anyways it is
what it is let's click on manager and
then click on custom nodes manager and
then we'll search for Art Venture and
then you should see this one ComfyUI
Art Venture let's click on this to
install it and I'll show you what this
node does in a second all right so after
we've installed this node it says we
need to restart ComfyUI so let's click
on restart and then click okay all right
so we are back after the restart
everything is still here so what we need
to do is link our uploaded image to a
preprocessor which is the node we just
installed so let me drag this out and
then I'm going to search for control net
preprocessor and we should see this AV
control net preprocessor so let's click
on this and then why did we install this
instead of all the other options we
could choose from because this node
allows you to select from a lot of
different options so for example we can
choose sdxl which is what we are using
right our checkpoint is sdxl and so the
reason why we downloaded this node in
particular is because it contains all of
the pre-processors you need all in one
node so basically you can select things
like open pose or depth or canny or line
art or all of these other examples that
I showed you previously so let's start
with the simple one let's start with
open pose so I need to convert this
image into an open pose image and then
for the SD version we are going with
sdxl and then resolution let's set this
to 768 and then let's set the width and
height of our final image to 768 as well
note that for SDXL it's actually best to
use 1024 x 1024 and it doesn't have to be
square but just to make our generation
faster let's go with 768 and then
finally we just need to drag this
pre-processed image into the image node
of control net and then here the
strength is well how strong of an
influence do you want this control net
or basically this pose to influence your
final image so 1 is 100% and 0 is 0% if
you set this to zero then you're
basically not using control net at all
and so let's set this to something like
0.8 for example and see what that gives
us all right so just a quick summary how
you would use control net is it actually
goes in between the positive prompt and
the K sampler and then for control
net what you need to do is upload an
initial image and you need to process
that initial image into whatever
pre-processor you select and then it
would take that processed image and add
it to the control net so actually what
I'm going to do just to show you what
this actually looks like is I'm going to
drag a connector out here and then click
on preview image so you'll see what this
open pose image looks like and then what
I'm going to do actually is hold control
and select all these nodes and also
select this one and then press Ctrl+B
to bypass them so that we are only
running this I want to show you what
these steps actually do so let me click
queue prompt and note that the first time
this loads it might take a while because
you can see that it's downloading this
OpenPose model from Hugging Face all
right so after everything is finished
downloading you can see the preview
image here so basically this
pre-processor is converting our uploaded
image into this pose image and then
feeding this into control net and this
would influence the pose of our new
image so I'm going to hold down control
again and select everything press Ctrl+B
to unbypass and then this time
instead of a medieval warrior let's try
a princess arctic tundra snowing all
right and then let's click Q prompt
perfect so if I drag my load image next
to this final image you can see that
this princess is following the pose of
my uploaded image to some extent and if
we want to follow her pose completely
then we would set this value or this
control net strength to one so that's
one example of how to use control net to
control the composition of your image
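
The wiring described above, where control net sits between the positive prompt and the K sampler, can be sketched as a small graph edit in Python. It assumes ComfyUI's API-style workflow dict with numeric string ids, and that the built-in nodes are named `ControlNetLoader` and `ControlNetApply` with the inputs shown, so please verify those against your install. The helper name `add_controlnet`, the model file name, and the stub graph are hypothetical. The preprocessor is left out on purpose: `image_id` should point at whichever node outputs the preprocessed pose, depth, or canny image.

```python
import copy


def add_controlnet(workflow, sampler_id, image_id, controlnet_name,
                   strength=0.8):
    """Return a copy of `workflow` with control net applied to the positive prompt."""
    if strength < 0.0:
        raise ValueError("strength must not be negative")
    wf = copy.deepcopy(workflow)

    def next_id():
        # Assumes every existing node id is a numeric string.
        return str(max(int(key) for key in wf) + 1)

    loader_id = next_id()
    wf[loader_id] = {"class_type": "ControlNetLoader",
                     "inputs": {"control_net_name": controlnet_name}}

    apply_id = next_id()
    wf[apply_id] = {"class_type": "ControlNetApply",
                    "inputs": {
                        # Whatever fed the sampler's positive input now feeds control net.
                        "conditioning": wf[sampler_id]["inputs"]["positive"],
                        "control_net": [loader_id, 0],
                        "image": [image_id, 0],
                        # 0 means no influence, 1 follows the reference closely.
                        "strength": strength}}

    # The sampler now reads its positive conditioning from the control net node.
    wf[sampler_id]["inputs"]["positive"] = [apply_id, 0]
    return wf


# Stub graph with only the pieces needed to show the rewiring.
base = {
    "1": {"class_type": "CLIPTextEncode", "inputs": {"text": "a princess"}},
    "2": {"class_type": "LoadImage", "inputs": {"image": "pose.png"}},
    "3": {"class_type": "KSampler", "inputs": {"positive": ["1", 0]}},
}
patched = add_controlnet(base, sampler_id="3", image_id="2",
                         controlnet_name="controlnet-union-sdxl.safetensors")
print(patched["3"]["inputs"]["positive"])
```

Setting `strength` to 0 leaves the positive prompt effectively untouched, which matches the point above that a strength of zero means control net is not being used at all.
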
all right here's another example so
let's say I want an image similar to
this composition of mountains but I want
this to be sunset instead of this
lighting so I can use a control net I'll
upload this image and then instead of
open pose I would select something like
canny or you can also select depth if
you want you could also select line art
it really doesn't matter it really
depends on your use case and then we
leave everything else the same and
here's just a preview image so after the
pre-processor it looks like this this is
what the canny pre-processor does all
right so we plug this into control net
and then this time for the positive
prompt I just put in mountains and
Sunset and then all these other keywords
and then our final image looks like this
so again if I drag my original uploaded
image onto here just for a comparison
you can see that our final image matches
the shape of these mountains to some
degree but now it's Sunset instead of
this lighting how cool is that basically
there are so many different options you
can choose from for control net feel
free to just play around with all of
these pre-processors there are just so
many different pre-processor options you
can choose in control net to really give
you maximum control over the composition
of your image so that sums it up for
control net if you run into any errors
or issues just let me know in the
comments below and I'll try to help you
troubleshoot as much as possible but it
should be fairly easy to install
everything in just one click using this
manager button all right next I'm going
to show you how to install and use
external AI tools on comfy UI the
awesome thing about comfy UI is that it
supports a wide range of other
open-source AI tools for example there's
a ComfyUI node for MimicMotion which
allows you to create dancing videos from
a single photo or there's another ComfyUI
node for ToonCrafter which is a
powerful tool for generating anime
scenes if you're not familiar with
ToonCrafter check out this video where I did
a deep dive on how to install and use
it but basically you just need to enter
in a start frame and an end frame and
this tool will fill in an AI animation
in between those two frames plus there's
another ComfyUI node for another tool
called LivePortrait in this tool you
basically input an image of a face and
then you input a video of another face
talking or doing some expressions and it
can map those expressions onto your
input image if you want to learn more
about LivePortrait check out this video
anyways today I'm going to show you the
process of installing and using one of
these AI tools in ComfyUI for us we're
going to use this tool called InstantID
at its core it's basically a face swap
that takes in a reference image of a
person's face and then Maps it onto your
generation so there are a handful of
face swap tools you can use for stable
diffusion such as Roop or ReActor or this
one instant ID which I find to be the
most realistic and best Fidelity so
here's the original instant ID page you
can see that it is really good for face
swapping as you can see here's Taylor
Swift here's some Chinese actress this
looks really good it really does
preserve the details of that person's
face even across all these different
styles of generations like it doesn't
have to be realistic you can also do
face swap for painting or drawings and
it even works with different angles so
even though you just have one image of
Taylor Swift's face this AI is able to
miraculously kind of guess what that
face would look like at these different
angles so anyways let's jump right into
how we would set this up on the GitHub
page which is called comfy UI instant ID
I'll link to this in the description
below if you scroll down a bit here are
the installation instructions so you can
either download or git clone this repo
into the custom nodes directory or use
the manager of course since we have
manager installed we're just going to
use the manager so going back to our
comfy UI instance I'm going to click on
manager and then click on custom nodes
manager and then search for instant ID
and then we're going to go with this one
the comfy UI instant ID native support
and the nice thing about this one as it
says in the description is that it
implements instant ID natively and fully
integrates with comfy UI so let's click
on install and then if you open up the
command prompt while this is installing
you can see that right now it's cloning
the repo it's downloading all the files
all right so after that it says we need
to restart comfy UI so let's do that I'm
going to click on restart and then click
okay and then after clicking on restart
you can see in the CMD window that it's
actually installing some additional
dependencies such as Insight face so
depending on the speed of your
internet connection this might take a
while to download all right you can see
now it's downloading ONNX runtime GPU
all right so we've installed the nodes
we've installed Insight face and ONNX
runtime but we are not done yet so next
we need to download these Insight face
models so one of them is called
antelopev2 which we can download here so it
seems to be a zip file on Google Drive
I'm going to click download and then you
can download that anywhere once it's
finished downloading open up the zip
file and then it says we need to move
this into ComfyUI/models/insightface/models
so let's go into our comfy
UI folder and then in models we need to
create a new folder and call that
insightface and then within insightface
we need to create another folder
called models so I'm going to create new
folder models and then within models we
should have this antelopev2 folder so I'm
going to just drag and drop this into
here all right so let me exit out of
this and you can delete the zip file
afterwards and then we also need another
one so we need this main model which can
be downloaded from hugging face and then
placed into this directory and it says
you also need a control net and you need
to place it in the comfy UI control net
directory instead of downloading it from
hugging face there's a much easier way
to download these which is through the
manager so let me open manager again and
then this time we are going to click on
model manager and then we are going to
search for instant ID and you should see
that down here we have these two options
the IP adapter and control net and this
is for cubic instant ID so it's for this
repo it's basically downloading this
model which is based on IP adapter and
this control net so we are going to
select both of these and then click on
install now note that this one is like
1.7 GB this is 2.5 GB so it's going to
take a while to download all right so
once we have these two installed let's
click close And then close again and
then let's click on refresh all right so
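
Before wiring any nodes, it can help to confirm that the downloads ended up where the loaders look for them. The sketch below is only an illustration: the antelopev2 folder is the location named in the walkthrough, while the other two paths are assumptions inferred from the file names shown in the loader dropdowns, and `ComfyUI` is a placeholder for your own install folder.

```python
from pathlib import Path

# Paths are relative to the ComfyUI folder. Only the antelopev2 location is
# stated in the walkthrough; the other two are ASSUMED and should be checked
# against the node pack's README.
EXPECTED = {
    "InsightFace antelopev2 folder": "models/insightface/models/antelopev2",
    "InstantID main model (assumed path)": "models/instantid/ip-adapter.bin",
    "InstantID ControlNet (assumed path)": "models/controlnet/instantid/diffusion_pytorch_model.safetensors",
}


def missing_instantid_files(comfy_root):
    """Return the labels of expected files or folders that do not exist."""
    root = Path(comfy_root)
    return [label for label, rel in EXPECTED.items() if not (root / rel).exists()]


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at your own install folder.
    for label in missing_instantid_files("ComfyUI"):
        print("missing:", label)
```

If nothing is printed, all three locations exist under the folder you passed in.
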
let's start again with our very basic
text to image workflow so we have a
checkpoint here we plug in these prompts
and then these prompts go through
this K sampler which takes in a
latent image and then it decodes that
image and it gives us our final image
let me just adjust the placements of
these to keep things more organized as
we add the additional instant ID nodes
all right so where the instant ID node
goes is actually in between the prompts
and the K sampler so if I drag this out
here and then click search I will search
for instant ID and I should see this
apply instant ID all right so let's drag
this over here I'm going to hold control
and then hold shift and drag these nodes
over here to keep things more organized
all right so let's remove this and let's
remove this so the negative prompt goes
to here and then our model goes here and
then the model is then connected back to
this K sampler the positive is connected
to the K sampler and then the negative
is connected to the K sampler yes
there are a lot of connections that need
to be made and I wish this process was
simpler but it is what it is all right
and then we need to also input an image
so I'm going to drag this node out and
then click on load image and then I will
click upload and let me upload this
image of Will Smith and then as the
GitHub specified this also needs an
instant ID control net so let me drag
this out and then let's select control
net loader and in your options if you
search for instant ID you should only
see this one instantid/diffusion_pytorch_model
so let's select this for
the control net model and then for
insight face we also need to drag this
out and select instant ID face analysis
and then for the provider since I have a
Cuda GPU I will select Cuda and then
finally for instant ID we also need to
drag a connector out and the only option
we see here is instant ID model loader
so let's select this and by default it
should just be this one ip-adapter.bin
which is the file that we installed all
right so after all these things are
connected let's also look at these
values so the weight is basically how
important do you want this face to be or
how much influence do you want this face
to have in your final image and then for
start and end again for stable diffusion
it basically takes a latent image of
random noise and through each step like
right here we've set it to 20 steps and
so for each step it removes a bit of
noise until it gets to step 20 so for
these start and end values it's
basically saying well at what step do
you want to start applying this face
swap and at what step do you want to
stop applying this face swap so let's
say if we set this to like 0.5 or
halfway basically this would stop
applying the face swap at step 10 since
we set the number of steps to 20 if we
set this to 1 then it's going to apply
the face swap at all steps right from
Step Zero to the last step all right and
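
The start and end values described above are fractions of the sampling run. A small sketch of that arithmetic, assuming the fractions are simply multiplied by the step count (the node itself may place the boundaries slightly differently, and `active_steps` is a hypothetical helper, not part of comfy UI):

```python
def active_steps(start_at, end_at, total_steps):
    """Map fractional start/end values to a (first_step, last_step) pair.

    Simplification: each fraction is multiplied by the step count and rounded.
    """
    if not 0.0 <= start_at <= end_at <= 1.0:
        raise ValueError("expected 0 <= start_at <= end_at <= 1")
    return round(start_at * total_steps), round(end_at * total_steps)


print(active_steps(0.0, 0.5, 20))  # (0, 10): the face is applied for the first half only
print(active_steps(0.0, 1.0, 20))  # (0, 20): the face is applied at every step
```
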
then just a few more things we need to
tweak for the width let's set this to
768 for the height let's set this to
1024 and then for the positive prompt
let's say policeman realistic 8K
Masterpiece Ultra detailed and then for
the negative prompt we can say cartoon
painting anime blurry watermark all
right and if all is good we can click on
Queue Prompt and see what that gives us so
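
The connections made by hand above can be summarised as a small graph edit: the apply node takes over the sampler's model, positive and negative inputs, and the sampler then reads all three from it. The sketch below is hedged accordingly. The class names and input names (`ApplyInstantID`, `InstantIDModelLoader`, `InstantIDFaceAnalysis`, `instantid_file` and so on) are my assumptions about the node pack and should be compared with the version installed through the manager; `add_instantid`, `face.png` and the weight of 0.8 are purely illustrative.

```python
import copy


def add_instantid(workflow, ksampler_id, face_image, weight, start_at=0.0, end_at=1.0, provider="CUDA"):
    """Return a copy of `workflow` with an instant ID stage between the prompts and the sampler.

    All class and input names used here are ASSUMED, not confirmed by the video.
    """
    wf = copy.deepcopy(workflow)
    sampler = wf[ksampler_id]["inputs"]

    # Allocate fresh numeric ids after the highest existing one.
    first = max(int(k) for k in wf) + 1
    loader, faces, cnet, image, apply_id = (str(first + i) for i in range(5))

    wf[loader] = {"class_type": "InstantIDModelLoader", "inputs": {"instantid_file": "ip-adapter.bin"}}
    wf[faces] = {"class_type": "InstantIDFaceAnalysis", "inputs": {"provider": provider}}
    wf[cnet] = {"class_type": "ControlNetLoader",
                "inputs": {"control_net_name": "instantid/diffusion_pytorch_model.safetensors"}}
    wf[image] = {"class_type": "LoadImage", "inputs": {"image": face_image}}
    wf[apply_id] = {"class_type": "ApplyInstantID", "inputs": {
        "instantid": [loader, 0], "insightface": [faces, 0],
        "control_net": [cnet, 0], "image": [image, 0],
        # Take over whatever the sampler was connected to before.
        "model": sampler["model"], "positive": sampler["positive"], "negative": sampler["negative"],
        "weight": weight, "start_at": start_at, "end_at": end_at}}

    # The sampler now reads model, positive and negative from the new node.
    sampler["model"] = [apply_id, 0]
    sampler["positive"] = [apply_id, 1]
    sampler["negative"] = [apply_id, 2]
    return wf


# Minimal stand-in for the text-to-image graph: only the links that change are shown.
base = {"5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0]}}}
patched = add_instantid(base, "5", "face.png", weight=0.8)
print(patched["5"]["inputs"])
```

Working on a copy keeps the plain text-to-image graph available, so switching the face swap on and off is a matter of choosing which graph to queue.
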
right now it's loading the checkpoint
and then it's going through these
prompts it's going through this apply
instant ID which will take in our loaded
image of Will Smith and all these other
variables and then finally it's going to
Output this image of Will Smith as a
policeman so here we go now face
swapping isn't just for generating deep
fakes right generating fake images of
real people you can also use face swaps
to create consistent characters right if
you want a certain character to have the
same face throughout your video or
throughout your animation or comic book
or whatever face swap is a really good
way to apply the same face to all your
Generations let's try something else so
the awesome thing about instant ID is it
doesn't just work for realistic photos
so you could also set this to let's get
rid of painting in the negative prompt and then instead of
realistic let's set this to watercolor
painting and then let's click Queue Prompt
and see what that gives us all right
perfect so now we have a watercolor
painting kind of of Will Smith as a
policeman so I hope this gives you a
good understanding of how to install
these different custom nodes of external
tools and the awesome thing about comfy
UI is there are a lot of other tools
that it can support so for example
there's also an animate diff node for
comfy UI which helps you generate
animations from images and inside the
GitHub it should give you a
demonstration of what a workflow should
look like another person has created a
comfy UI node for Toon crafter and this
allows you to take in one image as the
input frame and then another image as
the final frame and it would interpolate
an animation in between these two frames
so again here's the workflow for the Toon
crafter node and yet another user has
created a comfy UI node for live
portrait and this basically allows
you to take one input image and then one
video of a person moving their head and
doing some strange expressions or
talking and it would animate that input
photo with this person's movements and
here they shared with you what the
workflow would look like in comfy UI so
again there's just so many custom nodes
that other users have created based on a
lot of different external AI tools and
that's what makes comfy UI awesome and
with this manager you can easily search
for all the nodes out there there are
like tens of thousands of different
custom nodes depending on what you would
like to do and that's what makes comfy
UI the most powerful free and open
source image generator out there all
right finally some of you might be
wondering well how do I use comfy UI
with the newest models such as AuraFlow
or flux well for AuraFlow actually
everything is the same this whole text
to image workflow is the same and the
only thing you need to change is the
checkpoint you just need to change the
checkpoint to this AuraFlow safetensors
file see this video on how to install
and run AuraFlow with comfy UI and then for
flux the workflow is quite similar you
just need to tweak a few things like the
model and a clip loader plus the K
sampler see this video where I go in
depth how to install and run flux on
comfy UI however if you've watched this
tutorial you should understand the basics
of how to use comfy UI and how all these
nodes work so changing between stable
diffusion and flux and AuraFlow is
actually very easy and for this tutorial
I mostly used sdxl because it's still
the most mature platform out there there
are hundreds of models and LoRAs you can
choose from plus hundreds of plugins and
tools such as control net and all of
these only work with stable diffusion
whereas for flux it's still quite new so
there aren't a lot of tools and
workflows built from the open-source
Community yet but once we do have more
of these tools let me know in the
comments and if you want me to do an
updated tutorial just for flux I'd be
happy to make one as well so that covers
my tutorial for comfy UI like I said in
the beginning this is free and open
source plus you don't even need an
Nvidia GPU to use this and the
installation is super easy if you
followed all the steps as outlined in
this video you should be able to get
comfy UI up and running on your computer
now in this tutorial we covered a lot of
different topics from text to image to
image to image to face swapping to a lot
of different workflows so if you get
stuck or you hit any errors along the
way let me know in the comments below
and I'll try to help you troubleshoot as
much as possible as always I will
continue to look out for the newest and
coolest AI tools to share with you if
you enjoyed this video remember to like
share subscribe and stay tuned for more
content also we built a site where you
can find all the AI tools out there as
well as look for jobs in AI machine
learning data science and more so check
that out at ai-
search. thanks for watching and I'll see
you in the next one
