[0:00] I've built a workflow that lets you
[0:01] control AI video with a stick. Check
[0:03] this out. You just move plastic toys or
[0:05] printed cutouts through your scene and
[0:06] the workflow erases whatever is holding
[0:09] them and then animates them to follow
[0:11] that exact motion. You can also skip the
[0:13] stick entirely and use animated previews
[0:15] instead using After Effects or Blender,
[0:18] for example.
[0:19] To show you exactly how this works and
[0:21] how you can use it on your own computer
[0:23] for free, we created an entire short
[0:25] film about toys coming to life. And
[0:27] yeah, I know Niko from Corridor Digital
[0:29] had pretty much the same idea, but he
[0:31] used a different technique. He used Wan
[0:33] Animate to transfer acting performances
[0:35] onto his toy characters. In this video,
[0:37] we're also going to look at Wan Animate,
[0:39] but I used it to turn myself into the
[0:41] 15-year-old version of myself for the
[0:43] short film because I'm a responsible
[0:45] adult now. I definitely don't play with
[0:47] toys anymore. Make sure to subscribe and
[0:49] stick around till the end for the full
[0:51] short film.
[0:54] Now, at the heart of this workflow is a
[0:56] research paper called Time-to-Move, or
[0:58] TTM for short. And the cool part is that
[1:00] it's completely training-free. So, it's
[1:02] pretty much an architecture that you can
[1:04] use with any diffusion-based video
[1:06] model. We are using it with Wan 2.2, but
[1:08] you could also use it with CogVideoX
[1:10] or Stable Video Diffusion, for example.
[1:12] First, you need to create a control
[1:13] video with some motion, either by
[1:15] dragging things around in After Effects
[1:17] or Blender, or physically moving around
[1:19] stuff through your scene with a stick.
[1:22] TTM then uses something called
[1:24] dual-clock denoising. In areas where your
[1:26] character is moving, it uses lower noise
[1:28] to follow that motion precisely. In the
[1:31] rest of the scene, higher noise lets it
[1:32] generate a clean, natural background.
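The dual-clock idea described above can be sketched in a few lines of Python. This is a toy illustration with NumPy, not the TTM implementation: the function names, the linear noise blend, and the noise levels 0.2 and 0.9 are assumptions chosen for readability. It only shows the starting point of sampling, where the masked motion region gets less noise than the background.

```python
import numpy as np


def add_noise(clean, noise_level, rng):
    """Blend a clean signal with Gaussian noise (noise_level in [0, 1])."""
    noise = rng.standard_normal(clean.shape)
    return (1.0 - noise_level) * clean + noise_level * noise


def dual_clock_start(control, motion_mask, low_noise, high_noise, rng):
    """Lightly noise the masked motion region, heavily noise everything else."""
    lightly_noised = add_noise(control, low_noise, rng)
    heavily_noised = add_noise(control, high_noise, rng)
    return np.where(motion_mask, lightly_noised, heavily_noised)


# Hypothetical toy data: 4 frames of 8x8 "video" with a moving-object mask.
rng = np.random.default_rng(0)
control = np.zeros((4, 8, 8))
motion_mask = np.zeros((4, 8, 8), dtype=bool)
motion_mask[:, 2:6, 2:6] = True

start = dual_clock_start(control, motion_mask, low_noise=0.2, high_noise=0.9, rng=rng)

# The masked region stays much closer to the control video than the background.
inside_error = np.abs(start - control)[motion_mask].mean()
outside_error = np.abs(start - control)[~motion_mask].mean()
print(inside_error < outside_error)
```

Because the masked region stays close to the control video, the sampler has little room to drift from your motion there, while the background is noisy enough to be regenerated from scratch.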
[1:35] So to make this all as easy as possible,
[1:37] I slapped together some AI models in
[1:39] ComfyUI. Here you just import a video and
[1:41] it automatically generates a black and
[1:43] white mask for your character using the
[1:45] new SAM 3 model from Meta. This workflow
[1:48] also extracts the start and end frames.
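As a rough picture of what that preparation step produces, here is a small NumPy sketch. It assumes the clip is already decoded into an array of frames and that the segmentation model returns soft per-pixel scores. The function names and the 0.5 threshold are hypothetical, and the actual workflow does this with ComfyUI nodes rather than with this code.

```python
import numpy as np


def start_and_end_frames(frames):
    """Return the first and last frame of a (num_frames, H, W, 3) clip."""
    return frames[0], frames[-1]


def to_black_and_white(scores, threshold=0.5):
    """Turn soft mask scores in [0, 1] into a 0/255 black-and-white mask."""
    return np.where(scores >= threshold, 255, 0).astype(np.uint8)


# Hypothetical toy clip: 5 frames of 4x4 RGB, each filled with its frame index.
frames = np.stack([np.full((4, 4, 3), i, dtype=np.uint8) for i in range(5)])
start_frame, end_frame = start_and_end_frames(frames)

# Hypothetical soft scores for a 2x2 image; white marks the character.
scores = np.array([[0.1, 0.9], [0.5, 0.4]])
mask = to_black_and_white(scores)
```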
[1:50] And if you want, you can use the Qwen
[1:52] Image Edit AI model to remove any sticks
[1:54] or unwanted objects or clean up your
[1:56] character if needed. Then you give it a
[1:58] simple prompt describing the action of
[2:00] your character and hit run. And that's
[2:03] it. That's all there is to it. It's
[2:05] surprisingly easy and very robust. I
[2:07] remember when we created a controllable
[2:09] creature for a previous short film using
[2:11] Wan VACE. We needed countless iterations
[2:14] to get the movement right. It was really
[2:16] exhausting. But with this workflow,
[2:17] about 70% of the shots worked on
[2:20] the first try. To use this workflow,
[2:21] you'll need ComfyUI, which is an
[2:23] open-source AI interface. If you don't
[2:25] have ComfyUI installed yet, we've
[2:26] prepared a step-by-step guide on our
[2:29] website that walks you through
[2:30] everything. But fair warning, if this is
[2:32] your first time using ComfyUI, you might
[2:33] want to start with a simpler workflow.
[2:35] Let's start by bringing this shot to
[2:36] life. First, we must prepare a mask for
[2:39] the character. And we can paint out that
[2:41] stick in the start and end frame. Now,
[2:43] it really doesn't matter how you prepare
[2:45] this. If you want, you can use After
[2:46] Effects for the mask or Nano Banana to
[2:48] paint out the stick or Photoshop or
[2:50] something. But we also created this free
[2:51] workflow that lets you do all the
[2:53] preparation straight in ComfyUI. So to use
[2:55] it, just drag and drop it into ComfyUI.
[2:57] Now in your case, you might need to
[2:59] click Manager, then Install Missing Custom
[3:02] Nodes, if you have any red nodes in the
[3:04] workflow, and let's zoom in on the left
[3:06] side here. Here you can find all the
[3:08] model loader nodes and you can find the
[3:11] corresponding model that you need to the
[3:13] left in this node right here. For the
[3:15] Qwen Image model, you can see that there
[3:16] are different versions and you need to
[3:19] pick the one that comfortably fits in
[3:21] your GPU's VRAM. Once you've downloaded
[3:23] all these models and made sure that all
[3:25] the correct ones are loaded, go to the
[3:26] load video nodes and load in your plate.
[3:29] If you want, you can name your shot
[3:31] right here. For us, this was 20. And
[3:34] then click run. Wait for the images to
[3:36] load. Let's first create the mask for
[3:37] our character. For this, we're using the
[3:39] new SAM 3 model by Meta. If your
[3:41] character enters at a later stage in the
[3:44] video, so it's not there in the first
[3:45] frame, you can just change the pick
[3:47] frame right here. But for me, the
[3:49] character is already in the video. So,
[3:50] all I need to do is just click on this
[3:52] character. And then I have to specify
[3:55] which parts do not belong to the
[3:56] character. For this, I'm just
[3:58] right-clicking on the other parts of the image
[4:01] like so. It's really important to
[4:03] exclude the stick that your character is
[4:05] on. So, I will also put a red dot right
[4:07] here. Then you just click run. And after
[4:09] a few seconds, our mask is done. And you
[4:11] can see it flickers a little bit, but
[4:12] that doesn't matter at all. Just ignore
[4:14] that. Let's now clean up the start
[4:16] frame. For this, I'm zooming in on this
[4:19] part right here. And now I need to copy
[4:21] over this first frame. For this, just
[4:23] copy and paste. Now, you can click Open
[4:27] in MaskEditor. And then you can select
[4:29] the area where your stick is. And you
[4:31] can be generous there. Left click to
[4:33] select and right click to delete parts
[4:35] again. Click save. Come over to the
[4:37] right here and add a simple prompt like
[4:40] "remove the wooden stick." Click run and
[4:42] the stick is gone. But you can also
[4:44] change these frames in more creative
[4:46] ways. For example, you can see that the
[4:47] Lego character is like hovering above
[4:49] the table. So what we could also do is
[4:51] just go back to the start here and
[4:53] create a bigger mask. Something like
[4:55] this. And I'm just creating a new
[4:56] prompt: "Remove the stick and make the
[4:58] Lego figure stand on the table." And you
[5:00] can see that worked really well, though
[5:02] it changed the legs a little bit. So
[5:04] this prompt worked a lot better. I also
[5:05] added "do not change the look of the
[5:07] figure." And yeah, that worked. So
[5:09] let's also remove the stick from the end
[5:11] frame. Open mask editor. I'm selecting
[5:14] the stick. Go over here. Create a
[5:17] prompt: "Remove the stick." And that looks
[5:19] good. The stick is gone. And now we have
[5:20] everything we need. You could say it's
[5:22] time to move. So drag and drop that
[5:24] workflow in here. And you install this
[5:27] one in the exact same way: install
[5:29] missing custom nodes, restart, and then
[5:32] you can find all the models that you
[5:33] need in these model loader nodes right
[5:36] here. Once you have everything set up,
[5:38] you can import the start and end frame.
[5:40] You will find these in your ComfyUI output
[5:42] folder, and then there is a folder with
[5:45] the shot number that you created, and then
[5:47] you can just drag and drop these in. So
[5:48] this is my start frame right here and
[5:51] then this is my end frame. Below that
[5:53] you will need to import the plate of
[5:55] your moving character and below that you
[5:58] need to import the mask that we just
[5:59] created. Next, come up here to the setup
[6:02] and here you can select which resolution
[6:03] you want to use. Next, you can choose if
[6:05] you want to use the start frame and end
[6:08] frame. You can use only a start frame or
[6:10] only an end frame. But you still need to
[6:12] import an image right here. Otherwise,
[6:14] it will give you an error. For static
[6:16] shots, a start frame is usually enough,
[6:18] but if you really want to make sure that
[6:20] your character does not change over the
[6:21] duration of your shot, I would recommend
[6:23] going with both. Next, you need to
[6:25] create a simple prompt, something like
[6:27] this. And then you can just click run.
[6:29] The sampling process for the video is
[6:31] split into two parts. And after three
[6:33] steps, you already get a rough preview
[6:35] like this. And usually you can already
[6:36] tell if your shot is working or not.
[6:38] Otherwise, you can just quit the
[6:40] process. And I would recommend trying
[6:42] another seed or adjusting your prompt.
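That check-early-and-retry loop could be sketched like this. Both callables are placeholders: `render_preview` stands in for running only the first few sampling steps, and `looks_ok` stands in for your own judgment of the rough preview. Neither is part of the workflow or of ComfyUI.

```python
from typing import Callable, Iterable, Optional, TypeVar

Preview = TypeVar("Preview")


def first_good_seed(
    seeds: Iterable[int],
    render_preview: Callable[[int], Preview],
    looks_ok: Callable[[Preview], bool],
) -> Optional[int]:
    """Return the first seed whose cheap preview passes the check, else None."""
    for seed in seeds:
        preview = render_preview(seed)  # cheap: only the first few steps
        if looks_ok(preview):
            return seed  # worth spending the full render on this seed
    return None


# Hypothetical stand-ins so the sketch runs on its own.
chosen = first_good_seed([1, 2, 3], lambda seed: seed * 10, lambda p: p >= 20)
```

The point of the design is that a bad seed costs only the preview steps, not a full render.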
[6:44] Okay, first try and the result already
[6:46] looks amazing. You can see how well it
[6:48] integrates into the shot. But the
[6:50] problem is that the legs kind of
[6:51] separate and then start sticking back
[6:52] together. And I think I can just fix
[6:55] this using the prompt, I guess. Well,
[6:58] and this kind of worked. Looks much
[7:00] better now with this prompt. So, this is
[7:02] the whole process. And as you can see,
[7:04] it works really well. Now is a good
[7:06] time to mention that this video and the
[7:07] free workflows are sponsored by our
[7:08] lovely supporters on Patreon. Thank you
[7:11] for supporting us on Patreon, keeping us
[7:13] free and independent, and also allowing
[7:14] us to share all these workflows for
[7:16] free. If you want access to advanced
[7:18] workflows, extra demo files, and our
[7:20] amazing Discord community, consider
[7:22] supporting. So, that's how we created
[7:23] all the shots of the animated toys. But
[7:25] there were also those shots where I
[7:27] needed to turn myself into my 15-year-old
[7:29] self. For this, I wanted to use Wan
[7:31] Animate, which is based on Wan 2.2, but
[7:33] specifically designed for character
[7:35] animation. The concept is pretty simple.
[7:37] You just need a driving video of your
[7:39] performance and a reference image of the
[7:40] character you want to transform into.
[7:42] When I tested it a few weeks ago, it
[7:44] worked pretty well. So, I just went
[7:45] ahead and shot everything without doing
[7:47] proper tests, trusting it would work out
[7:50] of the box. In the end, I spent more
[7:51] time wrestling with Wan Animate than I
[7:53] actually spent on the toy animation
[7:55] workflow. It started pretty promising,
[7:57] though. I used this image of me when I
[7:59] was around 15 as a reference image, and
[8:02] the shot itself looked pretty decent.
[8:03] The problem was that I looked pretty
[8:05] different in every single shot. But I
[8:06] had an idea that I wanted to try for a
[8:08] long time. Since Wan Animate and Wan 2.2
[8:11] are based on the same model, LoRAs
[8:13] trained for Wan 2.2 will work for Wan
[8:15] Animate as well. For those who don't
[8:17] know, a LoRA is pretty much like a small
[8:18] extra model that you can train to help
[8:20] the main model better understand a
[8:22] specific concept that it didn't know
[8:24] before. So, I asked my parents for more
[8:26] photos of me when I was 15. And then I
[8:29] used AI Toolkit to train the LoRA. So,
[8:31] I created my dataset with some very
[8:34] basic captions like this. And then I
[8:36] used these settings right here. Feel
[8:39] free to copy them if you want. Once it
[8:41] was done, I downloaded the LoRA, added
[8:43] it at full strength, and look at how
[8:45] much better these results are. There are
[8:47] still some issues, especially with like
[8:49] eye direction, but that's something I
[8:51] would like to fix in a future video. So,
[8:53] without further ado, here's the final
[8:54] short film.
[9:09] Ow.
[9:19] Yeah.
[9:36] Heat.
[9:54] No,
[10:03] it's you guys. Wait, you just robbed
[10:06] neglected because I'm playing video
[10:08] games all the time. I'm I'm so sorry.
[10:11] >> No, nerd. We just want you to go
[10:13] outside.
[10:18] >> All right, that's it for this one. Thank
[10:20] you so much for watching and thank you
[10:21] to our lovely Patreon supporters for
[10:23] making these videos possible. As always,
[10:26] if you create something with these
[10:27] workflows, feel free to tag me or send
[10:30] it to me. I always love to see what you
[10:32] come up with. Make sure to subscribe and
[10:34] see you next time.