ComfyUI Course - Learn ComfyUI From Scratch | Full 5 Hour Course (Ep01)
FULL TRANSCRIPT
Learning Comfy UI is like opening a
technical book at the last page.
Everything is there, but nothing makes
sense yet. This course starts from page
one. Before we go any further, I want to
be very clear about how this course
works. This is not a shortcuts course.
It is not about copying workflows
without understanding them. Each chapter
builds on the previous one. We start
simple, repeat the important ideas and
only add complexity when it actually
makes sense. You do not need any coding
knowledge. You do not need to be
technical. If you can think visually,
you can learn Comfy UI. If you want to
understand how AI image generation
really works locally and how to use
Comfy UI without feeling lost, this
course is for you. My name is Pixaroma
and on this channel I focus on creating
and teaching Comfy UI workflows in a
simple and practical way. I am a graphic
designer not a programmer and that is
actually a good thing. Developers are
great at writing code but they often
explain things in a very technical way.
This course is designed from a visual
thinker's perspective. My goal is to
explain Comfy UI logically and visually
without needing any coding knowledge.
Even if Comfy UI looks confusing right
now, that is completely normal. We will
start from the absolute basics and build
up step by step. But before we talk
about Comfy UI itself, we first need to
understand what AI image generation
actually is. Today, AI is not just one
thing. There are many different AI
models that can run locally on your own computer, such as Stable Diffusion, Flux, Qwen, and many others. It is also
important to understand that Comfy UI is
not limited to image generation. Comfy
UI is a general interface for running
many different types of AI models
locally. While it is most popular for
image generation, it can also be used
for audio, music, video, animation, 3D,
and more. As long as a model can be
connected through nodes, Comfy UI can be
used as the interface to control it.
These models by themselves are like an
engine. They are very powerful, but you
cannot really use them directly. To work
with them, we need an interface. An
interface is what allows us to send
prompts, images, and settings to the
model and then receive results back.
There are many free interfaces that let
us interact with these models. Some
popular ones are Forge UI, Swarm UI, Invoke, Fooocus, and of course, Comfy UI.
They often use similar models but they
work in very different ways. In this
course, we are going to focus on Comfy
UI. Comfy UI is different because it is
node-based. Instead of hiding everything
behind buttons and menus, it shows you
exactly what is happening step by step.
You can see how prompts, models,
samplers, and images are connected
together like building a system. Think
of it like this. The AI model is the
brain. The interface is how you talk to
that brain. Comfy UI is like building
your own control panel exactly the way
you want. Do not worry if this still
feels complex. Understanding comes from
seeing things connect, not from
memorizing nodes. In this course, I will
explain what each node does, why it
exists, and how everything connects
together. Before we install anything, we
need to talk about how you will actually
run Comfy UI. There is more than one way
to use Comfy UI and the right choice
depends on your system and your
expectations. Let's go to the official
Comfy UI website to see the available
options. The official website is
Comfy.org.
If we go to the products section, you
can see that there are two main options,
Comfy UI Cloud and Local Comfy UI. Comfy
UI Cloud runs online on their servers
and it is a paid service. This option
can be useful if your computer is too
old or not powerful enough to run AI
models locally. Local Comfy UI is free
and runs directly on your own computer,
assuming you have a reasonably capable
system. This is the option we will focus
on in this course. So, let's click on
Local Comfy UI. Here you can see three
main installation options. Download for
Windows, for Mac OS, and install from
GitHub. All of these options install
Comfy UI, but there are important
differences between them. In this
course, I will focus on Windows
operating system using the portable
version of Comfy UI. All the workflows,
tools, and installers I show are tested
on Windows using an Nvidia graphics
card. On AMD graphics cards and on Mac
OS, performance is usually slower, and
some features or custom nodes may not
work exactly the same way. So, if you
are using Windows with an Nvidia card,
it will be much easier to follow this
course step by step as I show it.
Because there are many different AI
models, hardware requirements can vary a
lot. Some models are small and can run
on a graphics card with 6 to 8 GB of
VRAM. Other models are much larger and
may require more than 24 GB of VRAM. For
this first episode, I tested the
workflows on two different systems. One
system uses an RTX 2060 with 6 GB of
VRAM and 64 GB of system RAM. The second
system uses an RTX 4090 with 24 GB of
VRAM and 128 GB of system RAM. For the
workflows in this episode, a graphics
card with 6 to 8 GB of VRAM should be
enough to follow along. In later episodes, we will explore newer and
larger models that may require more
powerful hardware. Now, let's talk about
which version of Comfy UI you should
install. As I mentioned earlier, I am
using the portable version of Comfy UI.
If we click on install from GitHub, we
are taken to the official Comfy UI
GitHub page. Here you can find detailed
installation instructions, but they
require more manual steps and setup.
Over the past year, I have been using a
portable version of Comfy UI that
includes additional tools to make the
installation process much easier. This
installer installs the original Comfy
UI, but it also adds helpful tools so
you can get up and running much faster.
This installer is called Comfy UI Easy
Install. You can find it on this GitHub
page. You can find the creator on our Discord community under the username Ivo. Thanks to Ivo for this installer. This
entire course is built around this
version of Comfy UI. You can still use
Comfy UI Desktop or Comfy UI Cloud, but
some things may look different or behave
differently compared to what you see in
this course. If you want the exact same
setup that I use and the easiest way to
follow along, I recommend using the same
version. Let me show you how to install
Comfy UI. So, we are on the easy install
GitHub page. This is the complete link.
If we scroll down, you can read more
about this installer. Even if you might
not understand what each of these things
means yet, it will make sense later as
you learn more about it. I will talk
later about the Pixaroma Discord server
where you can get more help and answers
to your questions. So, this installer
will install Git, which is a tool that
tracks changes to files in Comfy UI. It
helps developers safely update the main
app and custom nodes, fix bugs without
breaking everything, and lets you update
or roll back to an earlier working
version if a new update causes problems.
Then it will install the Comfy UI
portable version. A portable version
means the program is fully
self-contained in one folder, does not
need a normal system install, and can be
run, moved, backed up, or deleted
without affecting the rest of your computer. In Comfy UI portable, this means Python, libraries, models, and
settings all live inside the Comfy UI
folder. So, you can copy it to another drive or PC, update it safely, and
avoid breaking your system Python or
Windows setup. Python embedded means
Comfy UI comes with its own built-in
copy of Python already included inside
the Comfy UI folder instead of using the
Python installed on your system. Then it
will install all the nodes that are
useful and that I tested over the last
year. It might not make sense for you
yet if you are a beginner, but do not
worry. Take it as general knowledge for
now and it will make sense later. Then
it will add an add-ons folder with more
advanced stuff we can use later to speed
up our generation, plus some extra tools
that can be useful and then more
technical stuff explained for each one.
But all you need to know for now is
where to download it from and how to run
the installer. It is important not to
run it as administrator. That means you
just double-click to run it, not
right-click and run as administrator,
but you will see that in a minute. Also,
avoid system folders and make sure your
NVIDIA drivers are up to date since some
things work only with more recent NVIDIA
drivers. Okay, let us go back to where
it says Windows installation and let us
download the latest release from here.
Then depending on how your browser is
configured, it will either download it
to the downloads folder or ask you where
to download it and you can decide where
to put it. As you will see over time, it
needs a lot of space if you download big
models. So I suggest downloading and
installing your Comfy UI on a hard disk
that has a lot of free space and
preferably on a solid state drive
because it will load the models faster.
I will go to my D drive and I will
create a new folder called Comfy UI, but
this does not really matter. The name
can be anything easy to remember.
Sometimes I put Comfy UI followed by the
month so I know when I downloaded and
installed it. So I will save this zip
archive in that Comfy UI folder.
Now let us go to the place where we
saved the file. Since this is a zip
archive, we need to unzip it. You
right-click on it and depending on what
you prefer, you can use the Windows
integrated option and select extract
all. I like to delete the folder name at
the end so I do not end up with a folder
inside another folder. When I click
extract, it will extract these two
files. Let me delete it really quick and
show you. If you have WinRAR like me,
you just choose extract here and it does
the same thing. Once we extracted
everything, you can delete the easy
install zip file. Now we are left with
two files. A BAT file that is the
installer and a zip file that contains
extra resources that it will use. When
you run it, you might get a security
warning. That usually happens with BAT
or executable files because they install
files on your system. This one is safe.
I installed and tested it and I
personally know Ivo, the creator of this installer. You can right-click and scan
it with your antivirus and you will see
it is clean. So let us double-click on the BAT file, then press run, and it will
start installing. If you already have
git installed it will update it. If not
you will get a window like this and you
have to press yes to continue the
installation. After that it will
continue the installation of Comfy UI
and everything it needs to run. You can
take a break for 3 to 5 minutes
depending on your internet speed and
your computer. So how do you know when
it is ready? You will see a message that
says installation complete along with
the time it took. On my PC, it took 247
seconds. After that, you can press any
key to exit. So, let us recap really
quick. From the GitHub page, you
download the zip archive of the easy
installer. You create a folder named
Comfy UI and place the downloaded zip
archive in that folder. You extract the
contents of the archive. You run the
Comfy UI easy install BAT file and if it
asks to install Git, you press yes. In a
few minutes the installation is complete
and you get this screen. Now after the
installation we can see that inside our
Comfy UI folder a new folder appeared
called Comfy UI easy install. This is
portable which means you can copy this
entire folder and move it to a different
drive or folder and it will still run
Comfy UI. Basically, after you install
any Comfy UI portable version, you
should end up with a similar folder
structure to this. Since Comfy UI is
based on the Python programming
language, you will see many Python files
and BAT files inside these folders,
which are used to run those Python
files. The easy installer will also
create some shortcuts on your desktop.
If you use other versions of Comfy UI,
they might not do this, and you would
need to create the shortcuts manually.
That is one of the reasons I prefer the
easy installer. It makes everything
easier for us. Basically, we just
extracted an archive and ran a BAT file
and we now have Comfy UI. If we right
click on this shortcut and go to
properties, we can see the target of the
shortcut. If we open the file location,
you can see that it is connected to this
BAT file that starts Comfy UI. In other
versions of Comfy UI, the name might be
different like run NVIDIA GPU or
something similar. Are you ready for
your first Comfy UI launch? To start
Comfy UI, you either use this BAT file
called Start Comfy UI or from the desktop you use this shortcut called Comfy UI EZ installer, where EZ stands for easy and I stands for installer. So double-click on it and when it starts it looks
like this. The first time it will be a
little slower, but after that it will
start much faster. If you are curious by
nature, you can find all kinds of
information about your Comfy UI and your
system when it runs. For example, you
can see what operating system I am
using, what Python version is running,
and where that Python is located, what
the path to the Comfy UI folder is,
where the user directory is, how much
VRAM you have, how much system RAM you
have, and the PyTorch and CUDA versions
that are running. When it starts, this
command window will be minimized to your
taskbar, and Comfy UI will open in your
default browser. The first time it
opens, it will show you some templates
made by Comfy UI that you can load. If
you have run a workflow before, it will
open that workflow by default. So the
workflow you see open is the last one
you used. A workflow is a set of
connected nodes that tells Comfy UI what
to do step by step. Let us close this
for now. You can open these templates or
workflows from here later. Comfy UI is
made of a few main areas. You do not
need to memorize them. I am naming them
so we can talk about the same things
later. If we go to the top, you can see
it says unsaved workflow. Basically, it
is like a document that is empty at the
moment since we did not add any nodes
yet. You can have multiple documents
open similar to what we have in
Photoshop and other programs. We can
click on this plus icon to create a new
blank workflow. All these tabs on top
are open workflows and we can close,
save and edit each one. Now this grid
like empty space is called the canvas.
Instead of drawing on it, we will
arrange blocks or nodes like using Lego
pieces and connect things together to
create a working workflow. You can use
your mouse wheel to zoom in and out on
the canvas. Then we have this top bar.
Depending on what extensions you have
installed, it might look different and
have more options. Things like the
manager or the run button, which lets us
run workflows, are usually here. On the
bottom right, we have view controls. For
example, we have a select tool that lets
us select nodes and a hand tool that
lets us navigate the canvas. You can fit
a workflow in view, but right now the
canvas is empty. We also have different
zoom controls that you can use if you do
not want to use the mouse wheel or if
you do not have one. For me, the mouse
wheel is the fastest and the one I
prefer. Then we have the show mini map
option. This shows a small map that we
can use to navigate when we have very
large workflows. There is also hide
links, but since we do not have any
nodes or links yet, we will see that
later. An important one is the main menu
which you open by clicking on the letter
C, the Comfy UI logo. We also have more
options on the left sidebar for nodes,
models, and workflows, which we will
explore soon. Back to the main menu. If
we click on the C, we get this menu. New
creates a new workflow, but it is faster
to use the plus sign from the top bar.
For file, it allows us to open, save,
and export the workflows we create. For
edit, you can undo actions like moving
nodes or changing something in the
workflow, clear the workflow and unload
models. For view, we can enable and
disable different panels. And we also
have zoom in and zoom out controls. Just
like in Photoshop, we can do the same
things in multiple ways. It is the same
with Comfy UI. For theme, you can change
how it looks, but at the beginning, I
suggest leaving it on default so it is
easier to follow tutorials. Nodes 2 is
in beta at the moment of this recording
and still has some bugs, so I suggest
leaving it off until it is more stable.
You can browse templates and open
settings, which we can explore later.
For now, the default settings work fine.
Templates and settings can also be
accessed from here. So again, there are
multiple ways to access the same things.
In some newer versions, some people
might use a newer manager and it might
appear somewhere else instead of here.
For now, I am using the old manager
which appears here. Under help, you can
also find help options, but you will see
later in this video how to ask questions
and get help. We also have a console
sometimes called the bottom panel where
you can see exactly what has happened
since we opened Comfy UI. If we look at
the taskbar and open the command window
from there, it shows the same
information. One view is at the bottom
and the other is in the taskbar. To
close Comfy UI, I recommend opening the
command window from the taskbar and
closing it. You will then see a
reconnecting message in the browser
because it cannot find Comfy UI running
anymore. After that, you can close the
browser window. You can also close the
browser window first and then close the
command window. It is time to test our
first ready-made workflow. Later, I will
explain in detail what nodes are and
what they do. So let us start Comfy UI,
wait for it to finish loading and get
the interface. To open a workflow, you
have different options. You can drag a
workflow directly onto the canvas, or
you can go to the menu, then file, and
choose open. All workflows for Comfy UI
have the extension .json.
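To get a feel for what such a file contains, here is a small Python sketch. The sample below only mimics the general shape of a saved workflow, a top-level list of nodes; real files also store positions, links, and widget values, and the node names here are just examples.

```python
import json

# A tiny sample mimicking the general shape of a saved workflow file.
# Real workflow files also record node positions, links, and widget values.
sample = '''
{
  "nodes": [
    {"id": 1, "type": "CheckpointLoaderSimple"},
    {"id": 2, "type": "KSampler"},
    {"id": 3, "type": "SaveImage"}
  ]
}
'''

# Because it is plain JSON text, any tool that reads JSON can inspect it.
workflow = json.loads(sample)
node_types = [node["type"] for node in workflow["nodes"]]
print(node_types)  # → ['CheckpointLoaderSimple', 'KSampler', 'SaveImage']
```

You never need to do this yourself; it simply shows that a workflow is readable text, not a binary file.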
JSON means JavaScript Object Notation.
It is a simple text format used to store
and share data in a way that both humans
and computers can easily read. In Comfy
UI, JSON is important because workflows
are saved as JSON files. These files
store all your nodes, connections,
settings, and prompts so you can reload,
share, or edit a workflow later. I will
include these workflows for free on
Discord for those who use a different
Comfy UI version. For example, I can
open this first workflow and you can see
that it opens with all the nodes and
links ready to be used. You can use your
mouse wheel to zoom in and out to see
the entire workflow. You can click
outside the nodes somewhere on the
canvas and drag to move around. You can
also use this hand tool, which I
personally never use. With the hand
tool, you can pan around the canvas.
With the normal mouse cursor, you can
select nodes and move them around. We
will talk more about that later. Now we
have the workflow open in this tab and
you can see its name at the top. With
the X button, you can close the workflow
and go back to a new empty one. If we go
to the sidebar and click on workflows, I
can open this folder called getting
started which I prepared for you for
this episode. Only the easy installer
comes with these workflows. So, if you
are using a different version, you can
get the workflows from Discord. You can
also make the sidebar wider if you want
to see the full text. I added a few
workflows here to test in this episode.
This one is just a help file with notes
and useful information that we will use
later in the video. Let us close it and
open the one with number one in front
called Juggernaut Reborn. If I click on
workflows again, the sidebar collapses.
Now let us move around using the mouse.
Left click and drag to see it better.
Each of these blocks is called a node.
All nodes are connected to each other
using links. Those small cables that go
from one node to another. Usually a
workflow is built from left to right.
When you run a workflow, it processes
from left to right. If something does
not work or the workflow is broken,
Comfy UI tells you where the problem is.
Think of it like a car dashboard. If a
door is open or a light bulb is not
working, you get a warning icon. It is
the same here. Errors look like this. It
says prompt execution failed and it also
tells you something like value not in
the list. These are some of the simplest
errors to fix. It is like the car
telling you a light bulb is missing. In
Comfy UI, it means a specific value,
object, or file could not be found. In
our case, it could not find the
checkpoint name, which is the model
name, the brain as we called it. Comfy
UI workflows include all the nodes and
settings, basically the interface, but
they do not include the models
themselves. Those brain or engine files
are not included. Since workflows are
just JSON text files, they cannot
include images or large files like
models. In this node called load
checkpoint, the checkpoint is just a model file, the brain we talked about.
Even if I click here, I cannot select
anything because it is not in the list.
That means the model is not downloaded
yet or it is downloaded but placed in
the wrong folder. Since I did not
download any models yet, it is clear I
do not have it. That is why when I share
a workflow with you, I include a note
that tells you exactly what you need to
download for the workflow to work. Not
everyone on the internet does this, but
most good workflow creators do. The way
I organize it is like this. I tell you
where the model needs to be downloaded
and which node loads it. It says load
checkpoint, which is the node name. Then
it tells you the model name you need to
download. There is a button that says
here and then it tells you exactly which
folder to place it in and which folder
to create if it does not exist. That is
enough theory. Let us download the
model. You already saw where it needs to
be placed, but how do you find that
folder? You need to find your Comfy UI
folder. This depends on where you
installed it, on which drive, and in
which folder. You navigate until you
find the Comfy UI folder. In our case,
it is inside the Comfy UI easy install
folder. If we go inside, we see many
folders that Comfy UI needs to run. We
have an output folder where generated
images are saved. We have an input
folder where input images are stored. We
also have a models folder where all
downloaded models go. Inside the models
folder, you can see many subfolders for
different types of models. Over time,
you will learn what each one is for.
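As a sketch of how this folder layout works, the short Python script below builds a miniature models folder and then lists every model file inside it, including files placed in subfolders, which is how a model inside an SD15 subfolder still gets found. The folder and file names are examples for illustration only.

```python
import os

# Build a miniature, illustrative models folder:
# models_demo/checkpoints/SD15/juggernaut_reborn.safetensors
base = "models_demo"
os.makedirs(os.path.join(base, "checkpoints", "SD15"), exist_ok=True)
open(os.path.join(base, "checkpoints", "SD15",
                  "juggernaut_reborn.safetensors"), "w").close()

# Walk every subfolder of checkpoints, so models in SD15 (or any other
# subfolder you create for organization) are found as well.
found = []
for root, _dirs, files in os.walk(os.path.join(base, "checkpoints")):
    for name in files:
        if name.endswith((".safetensors", ".ckpt")):
            found.append(os.path.join(root, name))

print(found)
```

This is why organizing models into subfolders is safe: the subfolder changes nothing about whether the model is found, only how tidy your list looks.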
That is why I included the note so you
know exactly where to put the model
without guessing. For this workflow, the
model goes into the checkpoints folder.
We could just save it directly there and
it would work. But from my previous
tutorials, I learned that over time, you
will download many models and it becomes
hard to keep track of them. That is why
I like to organize models in subfolders.
In this case, I know this model is based
on stable diffusion 1.5. So, I will
create an SD15 folder and place the
model inside it. Now, we wait for the
model to download. Some models are a few
gigabytes in size. After that, we go
back to Comfy UI. You can see I placed
the model exactly where the instructions
said, so Comfy UI can recognize it. If
Comfy UI was closed, reopening it would
automatically detect the model. But
since Comfy UI is already open, it will
not see the new model yet. We need to
refresh it. To do that, press the R key.
You will see that the node definitions
update. Now, when I click here, I can
see the model name and select it. Right
now, there is only one model, but later
you will have a drop-down list with many
options. Now, the model is selected and
it is time to run the workflow again. By
the way, you can move the run button
anywhere you want on the canvas using
the small dots on its side. If you
prefer it docked, you can dock it back
to the top bar. Let us run it again and
see if it works. Everything turns green.
Each node runs from left to right and no
red nodes appear. That means the
workflow ran successfully and we
generated our first image. The model we
use in this chapter is quite old and
small. Later we will use smarter and
more advanced models. For practice, this
one is good enough because it is fast
and can run on smaller computers that do
not have a lot of VRAM. Each time I
press run, I get a new image because we
have a random seed here. Do not worry
about this yet. I will explain it later.
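The full explanation comes later, but the core idea behind a seed can be sketched in a few lines of Python: the same seed always reproduces the same "random" numbers, which is why a fixed seed regenerates the same image while a random seed gives a new one each run.

```python
import random

# Seeding the generator makes randomness repeatable.
random.seed(42)
first_run = [random.randint(0, 9) for _ in range(5)]

random.seed(42)  # same seed again...
second_run = [random.randint(0, 9) for _ in range(5)]

# ...produces exactly the same sequence.
print(first_run == second_run)  # → True
```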
So now we can generate an image with
Comfy UI and all of this comes from a
simple text called a prompt. Basically
we used a few nodes with specific
settings and a model trained for this
type of image generation. We can change the prompt, for example, photo of a cat closeup. Now when I run the workflow, I should get a cat. The more VRAM you have, the faster it will generate. We can see
the generated images here, but they are
also saved locally. If we look at the
output folder, we have a shortcut to it
on the desktop. Inside that folder, we
can see all the images we generated so
far. Let us go back to Comfy UI and
close this workflow. I do not want to
save it because I liked the prompts and
settings it had before. So, I choose no.
Now, we are left with an empty workflow, or you can click on the plus sign to
create a new blank workflow. Before we
move to the next chapter, I recommend
taking a short break. Research shows
that short pauses help your brain
process and retain new information. Grab
a coffee, get some water, or take a
quick bathroom break, then come back and
continue. This chapter is about
understanding the building blocks of
Comfy UI and how they connect to form a
workflow. We are in Comfy UI and we have
this blank canvas and workflow. To add a
node, you double-click on the canvas and
it will open a search box that lets you
search for a node. For example, if I
type the word load, it shows me load
image, load checkpoint and all kinds of
nodes that let us load something. If we
click on the load image node, it will be
added to the workflow. The position
where it is added depends on where you
double-click on the canvas. You can also move it after. You just left-click on a
node, hold the left mouse button, drag
it to where you want it, and then
release the button. To deselect a node,
you just click anywhere on the empty
canvas. For me, that is the fastest way
to add a node. But there are other
methods. For example, I can right-click
on the canvas in an empty area and get
this menu. From here, I can go to add
node. Then I see different categories.
If I click on the image category, I can
find the load image node. It is right
here. And if I click on it, it gets
added to the canvas. After that, you can
move it and arrange it wherever you
want. Another method is to use the node
library in the left sidebar. Here we
have all these categories. If I click on
the image category, I can see the load
image node. This is a good option if you
do not know exactly which node you are
looking for. You can also search for a
node here to filter the list, then add
the node or drag it onto the canvas. Out
of all these methods, my favorite is
still the double-click on the canvas.
Once a node is selected, you also have
the option to delete it using this icon.
You can also use the delete key or the
backspace key to delete a node after you
select it. The load image node is how we
bring an existing image into Comfy UI so
other nodes can work with it. Each node
has a title at the top that tells you
what it does. Below that, it has
controls, inputs, and outputs that
connect it to the rest of the workflow.
Let us double-click on the canvas again
and add another node. This time, I will
search for crop. And we get this node
called image crop. You can probably
guess what it does. It crops the image
that we loaded. You can change the image
using this button and upload any image
you want. If something goes into a node,
it is called an input. If something
comes out of a node, it is called an
output. The load image node has two
outputs but no inputs because the image
comes directly from your computer, not
from another node. The image crop node
has one image input and one image
output. It receives an image, modifies
it, and then sends out a new image. If
we left-click on one of the outputs from
the load image node, we can drag a
connection or a cable to the next node
and connect it. Because the output and
the input have the same color and the
same name, it is easy to see that they
belong together. In most cases,
connections work between the same colors
and different colors usually do not
connect. There are a few special cases,
but we will talk about those later. Now,
if I try to connect the green output, it
does not connect. That is because the
green output is a mask and the input on
this node expects an image which is
blue. In the beginning, this color
system helps you quickly understand
which nodes can be connected. If two
nodes cannot connect, it usually means
they are not meant to be connected.
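As a rough sketch of this color and type rule, imagine each output and input carrying a type name, and a connection being allowed only when the two types match. The type names below are illustrative, not Comfy UI's internal code.

```python
# Sketch of the connection rule: an output can only link to an input
# when both carry the same data type (shown as colors in the interface).
def can_connect(output_type, input_type):
    return output_type == input_type

# The load image node has a blue IMAGE output and a green MASK output;
# the image crop node expects a blue IMAGE input.
print(can_connect("IMAGE", "IMAGE"))  # → True: blue matches blue
print(can_connect("MASK", "IMAGE"))   # → False: green does not match blue
```

This is all the color system is doing visually: showing you at a glance which pairs would pass this check.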
Sometimes you will also find nodes that
act like adapters or converters. These
nodes take one type of output and
convert it into a different type so it
can be used by another node. Now
basically we have a workflow but is the
workflow complete? How can we test it?
It is simple. We run it and see what
message we get. In this case it says the
prompt has no output. Even if you do not
understand exactly what that means yet
try to figure it out from the words. We
do not have an output node. So let us
close this message. If we look at the
workflow, the image is loaded from the
computer. Then it goes into the image
crop node which crops the image. But
after that nothing happens. There is no
output. Think of this like editing a
photo in Photoshop. You load an image,
crop it, but if you never save or export
it, the work exists, but you do not get
a file. In Comfy UI, the save image node
is the export step. So let us
double-click on the canvas again and
search for save. We have this save image
node. We can see that the image output
color matches. So we can connect it.
Even if the label says image or images,
it still works. Now let us make the
connection. If we run the workflow, it
will process from left to right. The
image is loaded, cropped, and then saved
in the output folder. We can also see it
directly inside the save image node. All
nodes can be resized using the corners.
You will see small arrow indicators on
the corners. You can click and drag to
resize a node. In this case, I want to
see the image preview bigger, so I
resize the node. To remove a connection,
you can left-click on the output dot,
drag the cable out onto the canvas, and
it will disconnect. You can also
left-click on the small dot in the middle
of the connection and choose delete. You
also have the option to add a reroute. A
reroute is like an extension cable or a
cable organizer. It does not change the
data at all. It only helps you route
connections more cleanly and keep your
workflow readable. From that reroute
node, you can add another link to
another node if you want. You can also
have multiple reroutes on the same link,
so you can arrange nodes, links, and
reroutes in a way that looks visually
clean or helps you see faster which node
connects to which node. This is very
helpful when you have a lot of nodes in
a workflow. To remove a reroute, you
just select it and press the delete key.
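The load, crop, and save chain we built earlier can be sketched as plain Python functions, where the output of each "node" feeds the input of the next, just like the links on the canvas. The functions below are stand-ins for illustration, not real image processing.

```python
# Each function plays the role of one node in the workflow.
def load_image(path):
    return f"pixels from {path}"          # stand-in for real image data

def crop(image, width, height):
    return f"{image}, cropped to {width}x{height}"

def save_image(image):
    return f"saved: {image}"

# Running the workflow: data flows left to right through the chain,
# exactly as the links carry it from node to node on the canvas.
result = save_image(crop(load_image("cat.png"), 512, 512))
print(result)
```

If you removed the save step, the work would still happen but nothing would come out, which is exactly the "prompt has no output" error we saw.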
Let me arrange them and remove all links
so we can see it better. So in Comfy UI,
we have different types of nodes. First,
we have nodes that only have outputs.
These nodes usually load something from
outside Comfy UI like a file or some
text. In this case, the load image node
loads an image from your computer. So it
does not need any inputs, only outputs.
Then we have nodes that have both inputs
and outputs. These nodes are usually
placed in the middle of a workflow. They
receive something from one node, process
it, and then pass the result to the next
node. Finally, we have nodes that
usually sit at the end of a workflow.
These nodes only have inputs, and their
job is to show or save the result, for
example, by previewing an image or
saving it to disk. There are also nodes
that do not have any inputs or outputs
at all. For example, if I search for a
note node and add it to the canvas, you
can see that it is only informational.
These nodes are used to write notes and
make workflows easier to understand and
remember. They do not affect the
workflow at all and are just for
organization and clarity. On the top
left side of a node, next to the title,
you have a small gray dot. If you click
it, the node collapses, similar to
minimizing it. I often do this for nodes
where I already know the settings and do
not need to change them. Collapsing
nodes helps save space and makes the
workflow easier to read. If you right
click on a node, you get a menu called
the node context menu. This menu shows
options related to that specific node.
Each node has a slightly different menu
depending on what that node does. In
this case, we have options like opening,
saving, and copying the image, different
properties, resize, and colors. We can
also collapse the node from here instead
of using the gray dot. And there are
many more options. Try a few of them. If
you do not like what you did, you can
undo it with Ctrl + Z. You can also
change the title of a node. If you
double click on the node title, you can
rename it to anything you want. This
does not change how the node works at
all. It is only for your own
organization and to make the workflow
easier to understand. You can also
right-click on a node and choose title to
rename it. This is another way to change
the node name. We already know that we
can move a node around once it is
selected. But when a node is selected,
you will also see a small floating bar
at the top. From here, you can delete
the node using this icon. You can also
click on this dot to change the node
color. This lets you choose from
different colors, which is useful for
organizing your workflow or grouping
nodes by function. Changing the color
does not affect how the node works. By
default, there is no color, the gray
one. This small eye icon is the node
info. If you click it, a properties
panel opens on the right with more
information about the node. Here you can
see what the node is supposed to do and
what the values mean. From this icon,
you can also close the properties panel.
All nodes, especially the default comfy
UI nodes, should have some kind of info
unless it is a custom node and the
creator did not add any documentation.
You can drag this side panel and resize
it the way you like. Here you can see
all the information about the image crop
node and what each setting does. Let us
close it for now. If you hover over an
icon, you can see more information about
what it does. For example, because this
node works with images, it lets you open
it in the mask editor. If we click on
it, you can see that it opens in the
mask editor. This will be useful later
when we do inpainting and image editing,
but that is for another episode. For
now, it is enough to know that it is
here and you will learn more about it
over time. These numbers are just
settings that you can change in Comfy
UI. They are called parameters.
Parameters control how a node behaves
and how it processes its input. Let me
reconnect the nodes. So, we have a
working workflow again. Now, if I run
it, you can see what these parameters
actually do. We are cropping a 512 x 512
pixel area from the original image,
starting from the X and Y coordinates
set to zero. That means the crop starts
from the top left corner of the image.
So basically we are taking this small
corner from the original image. The
original image was 1,024 x 1,024 pixels,
and now the result is 512 x 512 pixels.
Even if it looks bigger here in the node
preview, it is not actually larger. That
is just the preview size. The real image
resolution is smaller. So let us change
some parameters, or settings, or
whatever is easier for you to remember
them by; values is also fine.
And run it again. Now we have a
different crop. Let me speed up the
video while I try different values so
you can see how the result changes.
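The crop we just experimented with can be sketched in plain Python. This is only a conceptual illustration of what the x, y, width, and height parameters do, not ComfyUI's actual code; the `crop` function and the tiny 4 x 4 grid are made up for the example.

```python
# A conceptual sketch of what an image crop node does, not ComfyUI's
# actual implementation: take a width x height window starting at (x, y).
def crop(pixels, x, y, width, height):
    # pixels is a list of rows; each row is a list of pixel values
    return [row[x:x + width] for row in pixels[y:y + height]]

# A tiny 4 x 4 "image" stands in for the larger photo.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
corner = crop(image, 0, 0, 2, 2)  # x = 0, y = 0: the top left corner
print(corner)  # [[0, 1], [4, 5]]
```

Changing x and y slides that window around the image, exactly like changing the numbers on the node and running again.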
Let us remove this middle node, the
image crop node. Once it is removed,
something interesting happens. Comfy UI
tries to keep the workflow connected and
automatically reconnects the nodes
directly. That happens because the
output of the first node and the input
of the last node use the same type. If
the first and last nodes had different
input and output types, the connection
would disappear when the middle node is
removed. Now let us double-click on the
canvas. You should remember by now that
every time you see this search bar, it
means I double-clicked on the canvas. Let
us search for invert and select invert
image. This node has an image input and
an image output. But it does not have
any parameters. That is because this
node is designed to do one specific
thing: invert the image. Even without
parameters, the node still performs a
function. Let us connect this node into
the workflow. Watch what happens when I
connect it to the input. You can see
that the previous connection is removed
automatically. That is because an input
can only have one connection at a time.
An output on the other hand can connect
to multiple nodes. You can think of it
like electricity. An output is like a
power strip. It can send power to many
devices. An input is like a wall socket.
It can only accept one plug at a time.
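The power strip rule can be sketched in a few lines of Python. This is a made-up data model, not how ComfyUI stores links internally; it only illustrates that plugging a new cable into an occupied input replaces the old one, while one output can feed as many inputs as you like.

```python
# Illustrative sketch of the connection rule: an output may fan out to
# many inputs, but each input remembers only one source at a time.
connections = {}  # input name -> the output currently feeding it

def connect(output, input_name):
    # Plugging into an occupied input replaces the old cable.
    connections[input_name] = output

connect("load.IMAGE", "crop.image")    # one output...
connect("load.IMAGE", "invert.image")  # ...feeding two inputs is fine
connect("invert.IMAGE", "crop.image")  # crop's old link is replaced
print(connections)
```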
If we run the workflow now, we can see
that the result is an inverted image. So
until now with these small workflows, we
did not use any AI. We only used simple
nodes like simple code to modify images.
We will see more in the next chapter
when we build a bigger workflow that
uses stable diffusion to generate an
image from text. But these small steps
help you understand how things work. At
least I hope they do. You can always ask
any questions you have on Discord as we
will have a special section for this
episode on the Discord forum. So we
learned that save image is usually the
last node in a workflow, since it does
not have any outputs; its result is an
image that goes to disk, not to another
node. But that does not
mean we cannot continue the workflow. It
only means we cannot continue from that
node. We can still continue from the
previous node which has the same image
just not saved to disk yet. Let us clone
this node and use it again. One simple
way is to press the alt key and drag a
copy of this node where you want it. Let
us delete it and try again. Now we will
use Ctrl + C to copy. And when I use
Ctrl + V, it will paste that node
where the mouse cursor is. Let us delete
it again. And now let me show you
another shortcut. Press the control key
and make a marquee selection over the
nodes you want to select. Now we
selected two nodes. With Ctrl + C, we
copy all selected nodes. And with Ctrl
+V, we paste them. If you click and drag
from any of the selected nodes, you can
move them together. If you press delete
while both are selected, it will remove
both. If we use controll + shift +v, it
will paste the nodes together with the
links they had in the workflow. Now we
have this extra link here. So basically
from one image we got two invert image
nodes, and both do the same thing:
invert the image. If the invert image node had
more parameters we could change the
settings in one and get different
results. Let us delete those again.
Practice this a few times. Press control
and select the nodes. Press Ctrl + C to
copy and Ctrl + V to paste. Move them
into position. Now look at what I am
doing. I am continuing the workflow from
the last invert image node and then I
save the result. A workflow can have
many branches like a tree. The root
starts with the image. Then it inverts
the image and from there on another
branch it inverts it again. Can you
guess what happens when I press run? The
image is inverted again and looks like
the normal image. The original image is
inverted. Then that inverted image is
inverted again and we get the original
result. We can continue the workflow
even more. From the image that was
inverted twice, we connect it to an
image crop node. Now, instead of double
clicking and searching for a node, we
can drag a connection and release it.
When we do that, a context menu appears
with suggested nodes. From here, I can
easily pick the save image node and it
is added already connected. Let us
delete it and try again. Drag the
connection and release it. Then select
search. When I select the save image
node, it is added already connected. Let
us place the nodes properly and run the
workflow. You can see how many
operations are now in this workflow.
With a single image, we can invert it,
invert it again, crop it, and save it.
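The double inversion we just saw has a simple arithmetic explanation, sketched here in plain Python. This is a conceptual illustration, not the node's real code: each pixel value v in the 0 to 255 range becomes 255 - v, so inverting twice restores the original.

```python
# Conceptual sketch of the invert branches: v becomes 255 - v,
# so applying invert twice cancels out.
def invert(pixels):
    return [[255 - v for v in row] for row in pixels]

image = [[0, 60], [120, 255]]
once = invert(image)   # [[255, 195], [135, 0]]
twice = invert(once)   # inverted again
print(twice == image)  # True: the two inversions cancel out
```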
This is similar to a small program or an
action in Photoshop, but with more
control and much more flexibility. There
are nodes for images, audio, 3D, and
many other things. This is where you
start to see the power of comfy UI. Now
let us select everything. Hold control
and drag to select all nodes. You can
press delete to remove everything or let
us cancel that and do it another way. Go
to the menu then edit and choose clear
workflow. It will ask if you want to
clear it. Click okay. And now we have a
blank workflow again. Do you like math?
I know you do not like it, but I just
want to show something quick to see the
different things it can do and help you
understand Comfy UI better. Double click
on the canvas and search for math. You
will see a few math nodes. If we look on
the right, you can see different names
like Comfy UI core, KJNodes, Easy-Use.
These names are the names of custom
nodes or extensions. By default, Comfy
UI comes only with the nodes you see
under Comfy UI core. With the easy
installer, you also get a few extra
custom nodes already installed. We will
talk more about custom nodes later when
we get to the manager. If you use the
easy installer like I showed at the
beginning of this video and install the
same version, you should have the same
nodes. So again, Comfy UI core nodes are
made by the Comfy UI team. Let us get
back to the math nodes. We will start
with something simple called math int.
Int comes from the word integer, which
means whole numbers. This node works
only with whole numbers like 1, 2, 10,
and so on, not decimals. All custom nodes
have an extra label on the top right
that shows which custom node pack they
belong to. This makes them easy to spot
compared to built-in nodes. These math
nodes are used for simple calculations
similar to a calculator. I personally do
not use math nodes very often, but they
can be very useful for automation. For
example, you might load an image, read
its width or height, and then use math
nodes to calculate a new value based on
that size. This allows you to
automatically adjust things like
resolution without manually changing
numbers every time. In this case, we
have letter A, letter B, and an
operation. The default operation is add.
Let us set A to five and B to three.
For the operation, we will leave it on
add for now. Let us add another node
that I use often called preview as text.
You can see it comes with comfy UI. This
is one of those special nodes I
mentioned earlier that can be connected
to almost anything. Even if other nodes
cannot connect directly, this node will
convert the value to text and display
it. If I run the workflow, you can see
the result. Even though they look the
same, one is actually a number and the
other is a text display of that number.
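What the math int node computes can be sketched like this. The names and the operation list here are illustrative, not ComfyUI's exact API; the point is that the node takes two whole numbers and an operation and outputs one number, which the preview node then shows as text.

```python
# A hedged sketch of a math int node: two whole numbers in, one out.
def math_int(a, b, operation="add"):
    ops = {"add": a + b, "subtract": a - b, "multiply": a * b}
    return ops[operation]

result = math_int(5, 3)               # the node's output, a number
print(f"preview as text: {result}")   # the preview displays the text "8"
```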
This makes more sense if you have coding
experience, but we will not get into
technical details here. What is
important to remember is that we can use
this node to see a result as text. It
also has options for how the preview is
displayed. Let us move this node down
and make a copy of the math int node.
Now remember, we have a result in this
node, but it is not visible unless we
use a node to preview or save it. Here
is something interesting. We do not see
any inputs on this node, but when we
drag a link to it, two input dots
appear. This means we can actually
connect values directly to these fields.
You will see this behavior with many
nodes that have number fields or text
fields. We can copy a value from an
output and feed it into these fields to
use it in the workflow. Notice how the
field for letter A is grayed out. That
means it is no longer using a manual
value. Instead, it is taking the value
from the previous node which is 8. Now
let us change the operation to multiply.
We now have 8 multiplied by 3. Let us
add another preview as text node to see
the result. When we run it, the result
is 24. As expected, let us remove that
preview node and arrange the layout.
This small workflow does something
simple. It adds two numbers and then the
result is multiplied by three. That
three could also come from another node
and so on until you build more complex
workflows. If I change the value to four
and run it again, we get the correct
result for that formula. I hope this was
not too much math. Now let us select the
middle node that does the multiplication
and right click on it. We have a
function called bypass. When we enable
bypass, the node is temporarily ignored
as if it is not part of the workflow. By
the way, you can also access bypass
quickly from this icon when the node is
selected. Now, if I run the workflow,
you can see it ignored that node and
only did the addition. If I enable it
again and run the workflow, it takes
that node into account again. You can
see that the node changes color and
becomes purple and semi-transparent.
This visual change tells us that the
node is deactivated. Bypassing a node is
useful when you want to test a workflow
without removing the node completely.
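Bypass can be sketched in a couple of lines. This is an illustrative model, not ComfyUI's implementation: a bypassed node keeps the link alive and simply passes its input straight through.

```python
# Hedged sketch of node modes: active runs the node, bypass passes
# the value through untouched, as if the node were not there.
def run_node(func, value, mode="always"):
    if mode == "bypass":
        return value      # link stays live, node is ignored
    return func(value)

triple = lambda x: x * 3
print(run_node(triple, 8))                 # 24: node active
print(run_node(triple, 8, mode="bypass"))  # 8: only the addition result
```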
Hold the control key and select all
three nodes. Now, if we right-click on an
empty area of the canvas, we have the
option to add a group. If we choose add
group, it will create an empty group.
But since we already selected the nodes,
it is better to choose add group for
selected nodes. This creates a group
that contains all those nodes. You can
think of a group like a folder that
holds multiple nodes together. One very
important thing to remember is that if
you want to move the group with all the
nodes inside, you need to drag it using
the group's top bar. If you select and
move an individual node, it will move
outside of the group. Groups can also be
resized. You can see a small triangle in
the corner that lets you change the size
of the group. If you right-click on a
group, you also have the option to
bypass all the nodes inside it. This is
very useful when you have multiple
workflows on the same canvas. For
example, you can deactivate one workflow
and enable another so only one workflow
runs at a time. This becomes important
as workflows get more complex and models
get larger because running multiple
workflows at once can require more
resources than your system can handle.
If you double click on the group title,
you can change the group name. Enough
with math. Let us work a little with
text as well. When we use AI, we give it
prompts. And sometimes it helps to
combine text from different sources to
get a better prompt.
Now I am searching for concat. And you
can see that there are multiple nodes
with similar names. That is because
concatenate is a general concept and it
exists for different data types. This
one here, concatenate, works with
strings, which means text. It simply
takes multiple pieces of text and joins
them together into a single string. That
is why I added this cat made from
multiple pieces joined together to make
it easier to remember. Even if you
search for cat, you can easily find the
concatenate node. Let us add it to the
canvas. For example, for string A, I add
the word home and for string B, the word
car. When I connect them, the output
becomes a single piece of text, a
string. Let us drag a connection from
that string and search for a node that
can preview it. We can use the same
preview as text node again. Now, because
the first workflow is bypassed, it will
only run this workflow with concatenate.
You can see how it joined those two
words, first home, then car. We can also
use a delimiter. For example, I can add
a space here and run it again. Now the
result has a space between the words or
I can add a comma and a space. And now
the result looks like proper text with
separation. Let us move this node down
and hold alt and drag a copy of the
concatenate node. You can move nodes
around to make room so the connections
are easier to see. Here we have these
text fields. And like you saw with the
math nodes, we can connect outputs
directly into these fields if they are
the same type. When I drag a link from
the string output, you will see input
dots appear showing that I can connect
there. I can connect to the first field,
the second field or even the delimiter.
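The chain we are building can be sketched in Python. The function name and keyword here are illustrative, not the node's real code; it only shows how joining with a delimiter works and how one node's string output becomes the next node's first field.

```python
# A rough sketch of a string concatenate node: join the text inputs
# with an optional delimiter.
def concatenate(*strings, delimiter=""):
    return delimiter.join(strings)

first = concatenate("home", "car", delimiter=", ")
# Feeding one node's string output into the next node's first field:
combined = concatenate(first, "flower", delimiter=", ")
print(combined)  # home, car, flower
```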
Let us connect it to the first field. So
now the home and car result becomes the
first input. And for the second field,
let us add the word flower. I will hold
alt again and drag another duplicate
since that is faster. Then I connect it.
Can you guess the result? We now get
home, car, and flower. So basically this
is how people create workflows. You
connect nodes like Lego pieces. Some
nodes can be connected together because
they share the same input and output
types and you get a result. Over time,
you can build more complex workflows
that can save you a lot of time. Let's
add another node. Double click on the
canvas and search for primitive. This
node is called primitive because it
represents the most basic types of
values. Things like numbers, text, and
true or false values are considered
primitive values. The primitive node is
used to manually create a value inside
comfy UI instead of getting it from
another node. You can use it to type in
a number, write some text, or define a
simple value that can then be connected
to other nodes. Think of it like writing
a value by hand and injecting it into
the workflow. You can see here it says
connect to the widget input. So we can
drag a link from there. Now you can see
we have a lot of inputs where we can
connect this value. If we look at this
text, notice what happens when the
connection is complete. It changes to
the type of value that was connected, a
string. Now we can manually insert any
text value there. When we run the
workflow, the result will include that
value. This is useful because sometimes
you want to use the same value in
multiple nodes. Instead of typing it
manually each time, you add a primitive
value once and connect it to all the
inputs that need it. Let us right click
on this group. Usually nodes have a
bypass option to disable or enable them.
But for groups, this option is called
set group nodes to always. Now the nodes
inside the group are enabled and we can
run that workflow if we want. Let us add
another primitive node to see how it
adapts. Last time when we connected a
primitive node to a text field, it
automatically converted the value into a
string because that input expected text.
Now if we connect a primitive node to a
math int node, it adapts differently.
This time it is converted into an
integer value. You can see that now we
can only enter numbers. It does not
allow text because this node expects an
integer. Right now the value is set to
five. Let me resize the node so we can
see it better. You can clearly see that
this one is an int and the other one is
a string. If we change this number and
run the workflow, Comfy UI will rerun
all the workflows on the canvas using
the new values. In this simple example,
it runs almost instantly. But later when
we use larger models, you will see that
some workflows can take minutes to
generate instead of seconds. So if you
look at all the nodes in these
workflows, you can clearly see that we
use this Easy-Use custom node and all the
rest do not have that label. That means
they come with comfy UI by default. Let
us go to the menu then edit and choose
clear workflow. Now I want to do a quick
recap just to make sure you have absorbed
some of the basics. We double-click on
the canvas to bring up this search
option. So we can search for nodes. You
type a word to search like load and then
you can select a node. For example, the
load image node. This node is used to
load an image from your computer. If we
click choose file to load, we can
navigate our computer and load an image.
By the way, I asked EVO to include some
images for this first episode in the
input folder. So you can have the same
images I am using. The path is Comfy UI
easy install Comfy UI input. Let us say
I select this helmet but it can be any
image. We can choose open or we can
double click on the image and it will
open. Now that the image is loaded, let
us add another node, the image crop
node,
and connect it from left to right from
output to input. To remove a connection,
you just drag from the output and
release it somewhere on the canvas. It
is like unplugging a cable and leaving
it on the floor. Let us redo the
connection. You can also click on the
small dot in the middle of the
connection and choose delete and it does
the same thing as unplugging. Let us
connect it again. We can hold control
and select multiple nodes by dragging
with the left mouse button pressed. This
lets us select multiple nodes at once.
You can also hold control and click on
the nodes one by one. If you select a
node by mistake, just click on it again
to deselect it. Once nodes are selected,
we can move them together. If you plan
to move them a lot, you can also add
them to a group. If you click on the
canvas and drag, you are moving the
canvas itself. This is useful when you
have long workflows and want to see
different parts of the workflow. Let us
move these nodes to the left. Now,
double click on the canvas and add a
node called preview. This time, it is
not preview as text. We could add that
too, but it would show numbers. Here we
add preview image. This node is similar
to save image, but it does not save the
image in the output folder. It is useful
when you just want to preview parts of a
workflow and do not need to save the
image. If you like the image, you can
still save it. You can right-click on the
image and choose save image. Then save
it anywhere you want on your hard disk.
Let us cancel that and remove the
preview image node. This time let us add
a save image node so we can save the
result and then run the workflow again.
Now the result is saved. Let us go to
the desktop. The easy installer comes
with a shortcut to the output folder. We
double click on it and now we are in the
output folder. You can see the path at
the top. Here you can see all the images
generated with Comfy UI. These images
are saved in this folder. You can delete
them, move them to different folders, or
organize them however you want. I
usually pick the images I need, move
them into the project folder, and then
delete the rest because over time you
will end up with thousands of images.
Here we can see the helmet image we just
generated, but not the previous one from
the preview image node. If we go back
one folder to the main Comfy UI folder,
we can see a temp folder. Comfy UI uses
this folder to store preview images
temporarily. The contents of this temp
folder are deleted when you start Comfy
UI again. So, you can still recover
preview images even if you did not save
them right away, as long as you did not
close Comfy UI yet. We can collapse a
node using the top left gray dot and
click it again to expand it. Once a node
is selected, it has multiple options at
the top. One very useful option is the
info icon, which gives you more
information about the node. You can
close the properties panel from here. We
can change the color of the node. And
this symbol here is for subgraphs.
Subgraphs are a bit more complex. So
maybe in a later chapter or another
episode, we will talk more about them.
This arrow lets you bypass the node. And
if you click on these dots, it shows
even more options for the node. For
example, you can change the shape,
change the color, or pin the node so it
is fixed and cannot be moved. If we move
these nodes apart to see the links, we
can also hide the links from here. Be
careful with that because it can look
like no nodes are connected. I never use
this option because I like to see how
nodes are connected. It helps me
understand the workflow better. If you
do not like how the links look, there
are ways to change their shape. I like
the default look, but some people prefer
other options. If we go to the bottom
left or open the menu and go to
settings, we can change this. Let us
click on settings. Here we have many
settings we can change. Let us search
for link since we want to change how
links are displayed. You can see the
current one is called spline under link
render mode. If we change it from spline
to straight and close the settings, you
can see the links are now straight, but
they still adapt when you move the
nodes. Let us go to settings again and
change it to linear. Now the links are
always straight lines.
Let us change it back to the default
which is spline. Now let us remove all
the nodes.
I personally prefer the spline view
because it reminds me of sci-fi scenes
with lots of cables hanging around.
Let us double click on the canvas and
add a load image node again. Now let us
add another node and search for upscale
image. We will select the node called
upscale image by and move it closer so
we do not waste space on the canvas.
Then we connect it to the workflow and
add a save image node at the end. So we
have a complete workflow. If we run this
now we get the same image as before.
That happens because some nodes have
default values that do not change
anything. They only start doing
something once you change their
parameters. In this case, the upscale
value is set to one. Scaling by one is
like multiplying a number by one. You
get the same result. Now, let us
increase the scale by value to two. When
we run it again, the image is upscaled
by two times. So, we get double the
resolution. This is similar to resizing
an image in Photoshop. In this case, it
does not use AI to add new detail. It
simply enlarges the image. These upscale
methods are different ways of resizing
an image. Nearest exact copies pixels
exactly, so it is very fast and keeps
hard edges, but it can look blocky.
Bilinear smooths pixels together, giving
a softer result that can look slightly
blurry. Area is mainly meant for
downscaling and is not ideal for
upscaling images. Bicubic uses more
surrounding pixels to produce smoother
and better looking results. Lanczos
preserves detail and sharpness the best,
but it is slower than the others.
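The nearest method, and why scaling by one changes nothing, can be sketched in plain Python. This is a conceptual illustration only; real resamplers like bilinear or Lanczos do weighted math over neighboring pixels instead of plain copying.

```python
# Conceptual sketch of "upscale image by" with the nearest method:
# every pixel is copied scale x scale times, so scale 1 changes nothing.
def upscale_nearest(pixels, scale):
    out = []
    for row in pixels:
        wide = [v for v in row for _ in range(scale)]  # repeat across
        for _ in range(scale):                         # repeat down
            out.append(list(wide))
    return out

image = [[10, 20], [30, 40]]
print(upscale_nearest(image, 1) == image)  # True: scale 1 keeps the image
print(upscale_nearest(image, 2))           # twice the width and height
```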
Let us say you do a lot of changes to a
node like titles, values, and colors,
and you forget how the default values
were. You can add the same node again
and redo all the values and connections,
or you can right-click on the node and
choose fix node, recreate. If you select
that, the node goes back to its default
state. If you right-click again, you also
have the option to clone the node and
move that clone wherever you want. You
can also do this faster by holding alt
and dragging the node. You can remove a
node from this menu, but pressing the
delete key is faster. We also have this
pin option. If you use it, a pin appears
at the top of the node. And now if you
click and drag, the node does not move.
It is pinned to the canvas. To move it
again, you need to right-click and select
unpin. Sometimes when you get workflows
from the internet, some people stack
many nodes on top of each other and pin
them. It can look like there are only
two nodes, even if there are 10 nodes
behind one. I do not recommend doing
this. If you do not want people to use
your workflow, just do not share it.
We also saw that we need to change
values on some nodes for them to work.
If we bypass a node, the workflow still
runs and ignores that node. The links
are still there and the connection
passes through the node. This is
important because there is another mode
where the connection does not pass
through. Let us enable the node again
using bypass. Now right click on it and
go to mode. You will see the option
always which means the node is active.
There is also an option called never.
This mutes the node and makes it behave
as if it does not exist. You can see
that the node is now gray, not purple
like bypass. When we run the workflow,
we get an error. That is because the
node is not passing anything through. So
the next node does not receive the image
it expects. It is like cutting the cable
where that node was. Let us remove that
node and delete the link as well. If we
run the workflow again, the result is
the same. The image is missing because
there is no connection. Let us go to the
menu then edit and click undo. You can
also use Ctrl + Z multiple times until
you get back to the state you want. I
will undo until everything is active
again. Let us zoom out with the mouse
wheel and add a group. Name your group
in a way that explains what the nodes
do. Do not name it something generic
like my group. You can move the group
around and you will notice that it works
like a magnet. If a node is inside the
group area, it stays inside. Let us make
the group larger and move it around. So
you can see that nodes are sticking to
it. Now adjust the group size and move
the nodes so it looks cleaner. Workflows
can get quite large, so I like to
optimize the space to make them easier
to read and navigate. Once the nodes are
positioned, hold control and select all
the nodes. You will see that only the
nodes are selected, not the group. Now
right-click and choose fit group to
nodes. The group will resize to fit the
nodes tightly. Now we can move the group
and it does not take much space. Groups
can have more options especially when
using custom nodes like rgthree. If we go to
settings in Comfy UI, we have general
settings, but we also have settings for
custom nodes. For example, for rgthree, we
have extra settings here. I can click
this button to open them. rgthree is
installed when you use the easy
installer.
If you install Comfy UI manually, you
need to install rgthree from the manager. We
will talk more about custom nodes later.
You can also access the same rgthree
settings directly from here, which is
faster. If we scroll down, we have
settings for groups. For example, there
is an option called show fast group
toggles in group headers. Let us enable
it. You can choose when to show it
always or on hover. I will leave it on
hover and save the settings.
Now when I hover over the group, you can
see extra buttons in the top right. From
here, we can bypass all the nodes in the
group easily. We also have an option to
mute the group. This is similar to
setting nodes to never. When a group is
muted, the nodes inside it do not run at
all and the workflow behaves as if they
do not exist. Bypass still lets the
workflow run through the nodes. Mute
does not. Let us make the group active
again. We can also run the workflow
using the play button on the group. We
can change values, for example, use
smaller values to get a smaller image,
maybe half the size. There are many more
things you can do with groups and switch
style nodes, but we will cover those in
later episodes. I told you at the
beginning to leave the Nodes 2.0 option
turned off. At the moment of this
recording, it is still in beta and has
some bugs. Maybe over time they will fix
everything and it will become stable. If
I activate it, you can see that it
changes how the nodes look. For most
nodes, things still work in a similar
way, but this change exists. So, they
can add more functionality to nodes.
With the current system they use, there
are limitations in what nodes can do.
And the new node version should give
them more possibilities to build better
nodes. Instead of the gray dot, you get
an arrow that points down and then to
the right when the node is collapsed.
The inputs are placed on the edge of the
node and some nodes have more options.
For example, in load image, you can see
previews of images from the input
folder. And you can also browse for
another image on your disk. However, for
older workflows, this can slightly
change node sizes and mess up the
layout. Some nodes might not work yet
until the node creators update them.
Because of that, until everything is
fixed and stable, I recommend leaving
Nodes 2.0 turned off. Just a quick
reminder that from the mode menu you can mute a
node by setting it to never. This is
useful when a workflow is big and has a
lot of branches. You can mute a branch
of that workflow and it will still run
without errors as long as there are no
nodes after the muted ones that expect
an input. To turn the node back on, you
go to mode and choose always. We also
have shape options for nodes, but these
are only decorative. They just change
the corners of the node. By default, the
corners are rounded. There is also the
card option which rounds only two
corners. Personally, I do not think it
is worth spending much time on this. To
remove a group, you can select it and
press the delete key, or you can right
click on it, choose edit group, and then
remove. This only removes the group
container, not the nodes inside it,
unlike folders in other systems. Let us
select these two nodes while holding
control and press delete to remove them.
Then add an image crop node and connect
the link to it. After that, let us add a
preview image node. Since we are only
testing with these settings, we get a
crop from the top left corner of the
image. Let us arrange the nodes. Then
hold control, select both nodes, and
press Ctrl + C to copy and then Ctrl +
Shift + V to paste them with the links.
Since we have them copied, let us paste
again to get a third branch. Right now,
all the settings are the same. So, all
three give the same result. We can
change the x coordinate on one to get
the top right corner of the image.
For another one, let us change the
y-coordinate to get the bottom left
corner. Now, when we run it, we split
the image into three pieces. You could
add another one for the bottom right
corner to get the missing part. That is
homework for you to figure out the
correct coordinates. Now if we change
the input image and run it again, you
can see how useful this can be. In a
later episode, we will learn how to load
multiple images from a folder and
automate this process so we can apply it
to all images in a folder. Now select
all the nodes in the workflow using the
shortcut Ctrl + A and then press delete to
remove everything. It is time for
another short break. This has been a
long chapter and I want to make sure you
have time to absorb the information.
Take a few minutes, press pause, get a
drink, or step away from the screen, and
then come back. Now, I do not know what
learning method works best for you, but
I can tell you one method that usually
works very well for video tutorials.
First, watch the entire tutorial from
start to finish without stopping too
much. This helps you build a general
understanding of what is possible and
how things fit together. Then watch it a
second time, pause the video and follow
along step by step inside Comfy UI.
After that, try to repeat the same steps
without the tutorial playing just from
memory. Once you are comfortable, start
experimenting.
Try changing nodes, parameters, or
settings that were not covered in the
tutorial. And if something does not work
or you get stuck, that is completely
normal. You can always go back to the
tutorial, rewatch a part or ask
questions on Discord. Learning Comfy UI
is not about speed. It is about
understanding.
It is time to build a workflow from
scratch. But first, let us go to
workflows. Open the getting started
folder and open workflow number one, the
one we used in a previous chapter. What
I want to do now is give you an analogy
so you can understand what is happening
here with all these nodes connected. So
it makes more sense. I will use a note
node and add some info next to each
node. You do not have to do that. Just
watch and pay attention. I will open the
same workflow but with those notes added
next to each node and I will explain
each one in detail. You probably noticed
by now that when we generate images with
AI, we usually download a file called a
model. Sometimes people call it a model
and sometimes they call it a checkpoint.
In practice, they usually mean the same
thing. A model is the trained AI itself.
It contains everything the AI learned
during training like styles, shapes, and
how images are formed. The word
checkpoint comes from machine learning.
During training, the model is saved at
different points in time called
checkpoints. Those checkpoints are what
we download and use. So when you hear
model or checkpoint, you can think of
them as the same thing, the trained AI
file that does the image generation. In
Comfy UI, you will often see the term
load checkpoint, but what you are really
doing is loading the model you want to
use. We can think of the model as the
photographer we want to hire. The load
checkpoint node is the step where we
actually hire that photographer.
Depending on what the photographer
learned during training, they will be
good at different types of photos. That
is why there are so many different
models available. Just like in real
life, some photographers specialize in
portraits, others in landscapes, macro
photography, or food photography. AI
models work in a very similar way. The
better and more complex the training of
a photographer, the more expensive they
usually are. In our case, that cost is
not money, but computer power. Larger
and more advanced models usually need
more VRAM and a stronger graphics card
to run properly. To keep things simple
for now, we are hiring one photographer.
We will use a model called Juggernaut
Reborn. And this is the photographer
that will generate our images. So now
that we hired the photographer, what
comes next? We need to give instructions
to that photographer about what we want
to get and what we want to avoid. These
instructions are called prompts. We
usually use a positive prompt to
describe what we want to see in the
image and a negative prompt to describe
what we want to avoid. In Comfy UI, we
use the same node for both. I just
colored one green for the positive
prompt and one red for the negative
prompt so they are easier to recognize.
The node we use is called clip text
encode. This node takes our written text
and translates it into a form that the
model can understand. In simple terms,
clip text encode acts like a translator
between human language and the AI. It
turns words into instructions that the
photographer can follow during the photo
shoot. Besides giving instructions on
how the photo should look, we also need
to decide how big the photo will be. For
that, we use the empty latent image
node. This node is like choosing an
empty photo paper before taking the
photo. Here is where we decide the width
and height of the image. We are defining
the size of the photo before it even
exists. At this stage, there is still no
image. It is just an empty space where
the photo will be created. Once the
photo shoot happens, the final image
will always respect the size we set
here. Now, it is time for the photo
shoot. The K sampler node is the
photo shoot itself. This is where the
photographer follows the instructions
from the prompts and uses the empty
photo paper to take the photo. Each
different seed is like taking a new
photo of the same scene. The idea is the
same, but the result is slightly
different every time. The K sampler
controls how the image is generated. It
decides how many steps the photographer
takes, how much randomness is allowed,
and how closely the final photo follows
the instructions. You do not need to
understand every parameter right now.
What matters is that the K sampler is
the core of the workflow where the
actual image creation happens.
Everything before the K sampler prepares
the photo shoot. Everything after it
finishes the photo. After the photo
shoot, the image is created, but it is
not visible yet. That is because the K
sampler does not produce a normal image.
It produces something called a latent,
which you can think of as a hidden
version of the photo. It contains the
information of the image, but it is not
in a format we can actually view. This
is where VAE decode comes in. The VAE
decode node is like the dark room in
photography. The photo already exists,
but it still needs to be developed to
become visible. So, the VAE decode takes
that latent result and converts it into
a real image that we can see, preview,
and save. Without this node, the
workflow can still generate something,
but you would not be able to view the
final photo because it is still in that
hidden latent form. And finally, the
save image node is where the finished
photo is delivered to the client. After
the VAE decode step, we usually add a
node that either previews or saves the
image. Preview nodes let us see the
result inside Comfy UI, while the save
image node writes the final image to
disk. Without one of these output nodes,
the workflow has no final result. In our
photo studio analogy, this is the moment
where the developed photo is either
shown to the client or delivered as the
final file. Now, let us zoom out and
look at the entire process. First, we
load a model from our disk. This is like
hiring a photographer. Then, we give
instructions. The positive prompt
describes what we want. For example, a
close-up portrait of a pet. The negative
prompt describes what we want to avoid.
For example, saying we do not want dogs.
Next, we decide how big the photo should
be using the empty latent image node.
This is where we choose the size of the
photo before it is taken. Now, let us
run the workflow. You can see that all
these instructions are passed into the K
sampler where the image is actually
created. The K sampler is the photo
shoot. It uses steps and different
settings similar to camera settings like
shutter speed or aperture to decide how
the photo is taken. After that, the
image goes through VAE decode where it
is converted from latent space into
actual pixels. This is like developing
the photo in a dark room and finally we
save the image. This is when the
photographer delivers the finished photo
to the client. Every image generation
workflow in Comfy UI follows this same
basic idea even when it becomes more
complex. Let us do some quick
experiments. What happens if I change
the negative prompt, the instructions
where we say what we want to avoid? For
example, if I say I do not want a cat,
it will probably give me another pet
that is not a cat and we might get a dog
instead. If we run it again, it is like
taking another photo of a pet because
the seed is random. Now we can change
the seed to be fixed. When the seed is
fixed, each time we use the same prompt,
the same settings, and the same seed, we
should get the exact same image. If I
try to run it again, you can see that
nothing happens. The result would be the
same. So, Comfy UI does not even bother
to generate it again. If we change a
setting like the seed, then it lets us
generate again and we get a different
image. If we go back to the previous
seed, we are back to the same image we
had before generated with that seed. Now
that we kind of understand how it works,
let us click on this plus sign and build
the same workflow from scratch. Double
click on the canvas and search for load.
Usually it is either load checkpoint or
load diffusion model. But in some cases
there are special loaders for specific
models. Now that we have the node, we
select the model. Since we did not
download more models yet, we only have
one, Juggernaut Reborn. So we hired our
photographer. Now let us give it
instructions. Search for prompt. And we
can find this clip text encode node.
Let us move it next to the other node. I
like to change the color to green for
the positive prompt. Right click and
clone the node or just hold alt and drag
the node to make a copy. For this second
one, let us change the color to red.
Again, this does not influence how it
works. It is the same node. It is just
visual. For the positive prompt, I will
add closeup portrait of a pet. For the
negative prompt, I will add cat. Not all
models use negative prompts. Some older
models like this one still use it, but
you will see later that some newer
models are smarter and do not need a
negative prompt and they work better
when the negative prompt is disabled.
Can you guess how these are connected?
We have clip on both input and output.
So, we can only connect clip to clip. If
we try to drag from the model, you can
see it does not work. And the same if we
try from the VAE. So, let us connect the
clip output from the model to both of
the text encoders. Now we have the
instructions for how the image should
look but we still need to define the
size. Let us search again using the word
empty and add the empty latent image
node. There is also a newer one that we
will use later for newer models but for
this workflow we will use this simple
one. I like to change the color of this
node to purple but you can leave it as
it is if you want. Now we have width and
height. Because we work with computers,
most models work better with values that
are multiples of 64 or 8. That is why we
see values like 512 instead of 500. I
know that this model was trained with
square images at 512 x 512 pixels. So I
use these values to get better results.
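To make the multiples idea concrete, here is a small Python sketch. The snap helper is my own illustration, not a ComfyUI function, but it shows why you see values like 512 instead of 500:

```python
def snap(value: int, multiple: int = 64) -> int:
    """Round an image dimension to the nearest multiple of `multiple`."""
    return max(multiple, round(value / multiple) * multiple)

# Why 512 shows up in the width and height fields instead of 500:
print(snap(500))  # 512
print(snap(777))  # 768
```

The same idea works with a multiple of 8, which many models also accept.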
Some newer models are trained with
larger images and can generate bigger
images. But that comes at a cost. Just
like printing a big photo costs more
than a small one, a bigger image takes
more time to generate and sometimes your
PC cannot handle it. More about that
later. Now, let us add the most
important part where the magic happens,
the K sampler. As you can see, this node
has four inputs where it takes all the
instructions and one output. First, we
connect the model since it has the same
color and name. Then, we connect the
conditions. The instructions are yellow.
Even if the names are different, we
connect the positive output to positive
and the negative output to negative.
That is how it knows which one is
positive and which one is negative even
though they come from the same type of
node. The last input is the empty latent
image which defines the size of the
image we want. Now we have everything
needed to generate the image but it is
still in latent format. We need pixels
to actually see it. So let us drag a
link from the output and you can see
that it suggests VAE decode. We select
it and now the image is decoded like a
dark room where the photo is developed.
Here we also have a VAE input. In this
case the VAE model is included inside
the main model which is why we can
connect it directly. In some cases the
VAE comes as a separate file and then we
use a load VAE node. You will see that
later. Now the last step is to save the
image. So we add the save image node.
Let us run the workflow and see if it
works or if we forgot something. If
everything turns green, it worked
without errors. There are cases where
the image does not look right. Even if
there are no errors, that usually means
some settings are not ideal. People who
create AI models usually provide
recommended settings, especially for the
K sampler. Just like in photography,
macro and landscape use different camera
settings. The same idea applies here. If
we look at the previous workflow, we can
see recommended settings for this model.
Steps 35, CFG 7, this sampler, and this
scheduler.
So let us change steps to 35 and CFG to 7.
For the sampler, we use DPM++ 2M, and for
the scheduler we use Karras. Now let us run it again.
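As a recap of the whole chain, here is a toy Python sketch of the data flow we just built. Every function here is a placeholder I made up to mirror the nodes and the order data moves between them; none of this is the real ComfyUI API:

```python
# Illustrative sketch of the text-to-image data flow, not real ComfyUI code.

def load_checkpoint(name):
    # "Hire the photographer": one file provides model, clip, and vae parts.
    return {"model": name, "clip": name + "-clip", "vae": name + "-vae"}

def clip_text_encode(clip, text):
    # The translator: turns human text into conditioning.
    return {"clip": clip, "text": text}

def empty_latent_image(width, height):
    # The blank photo paper: a sized space with no pixels yet.
    return {"width": width, "height": height, "data": None}

def k_sampler(model, positive, negative, latent, seed, steps, cfg,
              sampler_name, scheduler):
    # The photo shoot: fills the latent using all the instructions.
    # (positive/negative are accepted but unused in this toy version.)
    return dict(latent, data=f"latent(seed={seed}, steps={steps})")

def vae_decode(vae, latent):
    # The dark room: latent -> visible pixels.
    return f"image {latent['width']}x{latent['height']} from {latent['data']}"

ckpt = load_checkpoint("juggernaut_reborn")
pos = clip_text_encode(ckpt["clip"], "closeup portrait of a pet")
neg = clip_text_encode(ckpt["clip"], "cat")
latent = empty_latent_image(512, 512)
result = k_sampler(ckpt["model"], pos, neg, latent, seed=42, steps=35,
                   cfg=7.0, sampler_name="dpmpp_2m", scheduler="karras")
image = vae_decode(ckpt["vae"], result)
print(image)
```

The point is only the order: load, encode, size, sample, decode, save.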
For this seed, we get some small
deformations, but for next seed, it
looks fine. We will see later how to
improve the results even more. Let me
show you what happens when we try to
generate an image that is much bigger
than what the model was trained to
handle. For this example, I will double
the image size. On the first try, I did
not even get a pet. Sometimes you might
get something that looks okay, but most
of the time you will see problems. You
can get strange deformations, things
that do not make sense, or visible
mutations. If I increase the size even
more, these problems become even more
obvious. It also takes more processing
power and more time to generate the
image. The reason this happens is
because this model was trained mainly on
512 x 512 pixel images. When we ask it to
generate a much larger image, it
struggles to understand the full space.
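On the cost side, models in this family denoise a latent that is 8 times smaller per side than the final image, so doubling the width and height roughly quadruples the work. A quick sketch of that arithmetic, using the common 8x downscale as an assumption:

```python
def latent_dims(width: int, height: int, downscale: int = 8) -> tuple:
    # SD-style models work on a latent 8x smaller per side than the image.
    return width // downscale, height // downscale

for side in (512, 1024, 2048):
    w, h = latent_dims(side, side)
    print(f"{side}x{side} image -> {w}x{h} latent, "
          f"about {(side / 512) ** 2:.0f}x the work of 512x512")
```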
You can think of it like the model
trying to generate the image in parts.
One part might look okay, but then it
tries to continue the image next to it,
almost like stitching pieces together,
and that is where things break. That is
why you sometimes see double heads,
repeated objects, or strange structures
in large images. Bigger images are not
always better if the model was not
trained for that size. But if a model is
trained to handle larger images, you can
get more details and better results. Let
us say I add ugly to the negative
prompt. So we push the result toward
more beautiful images. For the positive
prompt, let us be more specific. We want
a dog and we want it to be beautiful.
Now when we run it, we get a more
beautiful dog. Because this model is
really old, like I told you, it is good
for practice. But today, we have much
bigger and more accurate models. They
produce better results with fewer
deformations, but they are larger
and need more VRAM to run properly. Our
desktop computers are very similar to
Comfy UI because they are both built
around the idea of connecting
specialized components together where
each one does a specific job. The CPU
acts like the central processor just
like the sampler or the model does the
main work in Comfy UI. The monitor is
like preview and output nodes that show
results. The keyboard and mouse are
inputs just like prompts and parameters.
Printers and speakers are output devices
like save image or audio nodes. Routers
handle communication similar to data
links between nodes. The reason we
design systems this way is because
breaking complex tasks into smaller
connected parts makes them easier to
understand, easier to control, easier to
upgrade, and more flexible. That is
exactly why Comfy UI uses nodes instead
of hiding everything behind a single
button. Now that we know how to create a
workflow, we also need to learn how to
save it. If you look at the top, you can
see it says unsaved workflow. That means
none of these settings or nodes are
stored yet. If you want to reuse the
same workflow later without recreating
everything from scratch, you need to
save it. If I click on this arrow next
to the workflow name, you can see there
are several save options. Personally, I
prefer using the main menu. So, I go to
file and here we have save, save as, and
export. When you click save and the
workflow has never been saved before,
Comfy UI will ask you to give it a name
and choose where to save it. If the
workflow was already saved and you just
made changes, clicking save will
overwrite the existing file with the
same name. Save as lets you save the
same workflow under a different name.
This is the option I use the most,
especially when I want to create
variations of a workflow. Export is very
useful because it is not limited to the
Comfy UI workflow folder. It allows you
to save the workflow anywhere on your
computer, even outside the Comfy UI
folder. The API option is mainly used
when working with online or cloud-based
workflows. So, we will not use it here.
So, let us click export. Now, it asks
for a name. Choose a name that makes
sense to you. Click confirm. Then,
choose where to save it. For example, I
can save it on my desktop. You can see
that the file is saved with the .json
extension. This JSON file contains all
the nodes, connections, and settings of
your workflow. This file is your
workflow, and you can open it anytime,
share it with others, or modify it
later. JSON files are simple text files.
You can open them with any text editor
like Notepad. JSON stands for JavaScript
Object Notation, and it is just a
structured way of writing text so both
humans and computers can read it. In
Comfy UI, the JSON file stores things
like node types, connections,
parameters, and settings, all written as
text. That is why workflow files are
small in size and easy to share. They do
not include images or models, only
instructions. If we go to workflows, you
can see I have that folder with
workflows saved there. You can do that,
too.
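Because a workflow is plain JSON, you can also inspect one with a few lines of Python. The fragment below is a simplified, made-up example of the structure; real files store many more fields per node:

```python
import json

# A trimmed-down, hypothetical workflow fragment for illustration only.
workflow_text = """
{
  "nodes": [
    {"id": 1, "type": "CheckpointLoaderSimple"},
    {"id": 2, "type": "CLIPTextEncode"},
    {"id": 3, "type": "KSampler"},
    {"id": 4, "type": "VAEDecode"},
    {"id": 5, "type": "SaveImage"}
  ]
}
"""

workflow = json.loads(workflow_text)
for node in workflow["nodes"]:
    print(node["id"], node["type"])
```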
Go to the menu. Go to file. Choose save
as. Give it a name
and confirm. Now if we go to workflows,
we can see the workflow is saved there.
Right now it is not organized into any
folder. It is just in the main list. But
you can add a folder name in front of
the workflow name when you save it. For
example, folder name, then a forward
slash, then the workflow name. Let us
see where it is saved. Go to your Comfy
UI folder. Then inside the Comfy UI
folder, go to user,
then default, then workflows. Here you
can see your saved workflow and also the
folder I created for this course that
comes with the easy installer. You can
create your own folder manually. For
example, I can create a folder called my
workflows, then drag that workflow into
it. Now, if we go back to Comfy UI,
nothing changes immediately because
Comfy UI usually reads this when it
starts. But we can refresh using this
refresh button. Now our folder appears
there and we can see the workflow inside
it. I suggest organizing your workflows
like this because over time you will
have a lot of workflows and it becomes
hard to keep track of everything. By the
way, you can also use the search bar to
search for a workflow by name. We also
have a bookmark icon. If we click it,
the workflow is added to the bookmarks
at the top. So the ones you use the most
stay there. If you click the bookmark
again, it is removed from the favorites
list. Let us collapse this and I will
show you one more thing. If we go to the
desktop and open the shortcut for the
output folder or if we go directly to
the output folder, you can see all the
images generated so far with Comfy UI.
The last one is this dog. You probably
did not think about this yet, but if you
open an image generated with Comfy UI in
Notepad, you can actually see some code
at the beginning. Just like with
workflows, this happens because Comfy UI
attaches the workflow to the image when
it saves it. After that, there is the
image data which we cannot really read.
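That readable part at the beginning lives in the PNG's text chunks. As an illustration, here is a small standard-library reader for those chunks; it is a sketch, and it assumes the workflow is stored under a tEXt keyword such as workflow:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_text_chunks(data: bytes) -> dict:
    """Return the tEXt chunks of a PNG as a {keyword: text} dictionary."""
    assert data[:8] == PNG_SIGNATURE, "not a PNG file"
    chunks = {}
    pos = 8
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = payload.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return chunks

# Hypothetical usage on a saved image:
# with open("ComfyUI_00001_.png", "rb") as f:
#     print(read_text_chunks(f.read()).get("workflow"))
```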
This means that every image has the full
workflow embedded in it, including all
the settings and prompts. Let me drag
this image onto the Comfy UI canvas so
you can see what happens. Now you can
see that it loads as a workflow with the
file name. If we generate again, we get
exactly the same image because it uses
the same seed and settings. Let us go
back to the output folder and drag a
different image. For example, this
robot. Now it loads that workflow. And
if we run it, we get the exact same
robot. This is very useful. Another
thing you might notice is that all
images start with the word ComfyUI
followed by a number. This happens
because in the save image node, the
prefix is set to that value. We can
change it. For example, I can set it to
pixa. And now when I run the workflow,
the image file name will start with that
word followed by a number. As you can
see here, if you hover over the prefix
field, you can get more information
about how to format it. You can include
things like the date and other values in
the file name. Now, let us change it
again. I will add a folder name. For
example, my images, then a forward
slash, then the image prefix.
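To see how such a prefix maps to a file path, here is a tiny sketch. It is my own approximation of the naming scheme, not ComfyUI's actual code:

```python
from pathlib import Path

def save_path(output_dir: str, prefix: str, counter: int) -> Path:
    # A prefix like "my images/pixa" puts files in a subfolder of output.
    return Path(output_dir) / f"{prefix}_{counter:05d}_.png"

print(save_path("output", "my images/pixa", 1).as_posix())
```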
When we run the workflow now, the images
will be saved inside that folder. Let us
go to the output folder. You can see we
now have a folder called my images. And
inside it, we have the images that start
with the prefix we set followed by a
number. Now, I will go back to the
workflows folder that we created earlier
and delete it.
Back in Comfy UI, if we refresh the
workflows list, you can see that the
folder is gone. We are left only with
the getting started folder we used for
this episode. When you create your own
folders and organize your workflows, I
suggest naming them in a way that makes
sense. You can name them by base model
like SDXL workflows or flux workflows or
by function like text to image
workflows, inpainting workflows or video
workflows. Choose whatever makes the
most sense to you, but organizing your
workflows early will save you a lot of
time later.
In this chapter, I want to show you how
Comfy UI is organized on your disk. This
is important because sooner or later,
you will need to know where to place
models, images, workflows, and custom
nodes. Do not worry if this looks
overwhelming at first. You do not need
to understand everything right now. I
will focus only on the folders you
actually need as a user. This is the
main Comfy UI easy install folder. Think
of this as the main workspace that
contains everything Comfy UI needs to
run. The most important things here are
the Comfy UI folder, the Python embedded
folder, the add-ons folder, and the
batch files used to start or update
Comfy UI. In normal usage, you will
mostly work inside the Comfy UI folder.
If you have a different version of Comfy
UI, you will not have the add-ons folder
and some of the BAT files will be named
differently, but pretty much everything
else should be similar. When we open the
Comfy UI folder, we see many files and
folders. Most of these are internal
files used by Comfy UI itself. As a
beginner, you do not need to touch most
of these. The important folders for us
are models, input, output, custom nodes,
and the user folder. The models folder
is where all AI models live. This
includes checkpoints, LoRA files, VAEs,
control nets, upscalers, and more.
Inside the models folder, everything is
organized by type. For example,
checkpoints or diffusion models for main
image generation models, loras for LoRA
files, vae for VAE files, controlnet for
control net models. When a workflow
tells you to download a model, it will
also tell you exactly which subfolder to
place it in. If a model is not placed in
the correct folder, Comfy UI will not
see it. The input folder is where you
place images that you want to load into
Comfy UI. For example, images used for
image to image, control net, masks, or
reference images. Any image you place
here will be visible inside Comfy UI
when using a load image node. The output
folder is where Comfy UI saves generated
images by default. Every time you use a
save image node, the result will appear
here. This makes it very easy to find
all your generated images in one place.
The custom nodes folder is where all
custom nodes are installed. These are
extra features added by the community.
Each folder here represents a custom
node package. For example, we already
used the RG3 node and we will use more
later. When you install nodes using the
manager, they usually end up here
automatically. If a custom node is
missing or broken, this is usually the
first folder you should check. Inside
the user folder, we have user specific
data. The most important part for us is
the workflows folder. This is where
Comfy UI stores workflows that you save
from inside the interface. These
workflow files are saved as JSON files.
The add-ons folder is specific to the
easy install version. It contains extra
tools, optimizations, and helper
scripts. You usually do not need to
touch this folder unless a tutorial
specifically mentions it. You do not
need to memorize this right now, but
this structure might change as new tools
are created by IVO. For example, this
BAT file lets you link a folder with
models from another Comfy UI
installation. This one installs the
Nunchaku node and this one installs Sage
Attention. There are also different
torch pack versions for more advanced
users who need a specific version for
certain custom nodes. You will also find
extra tools like one for Windows 10 that
enables long paths so Comfy UI can
download models even if the path is very
long. There is also an update folder
with BAT files, but as you will see
later, for the easy install version, I
recommend using different update BAT
files. The Python embedded folder
includes a self-contained Python
installation. This helps avoid conflicts
with other software and makes Comfy UI
easier to run and update. As you use
Comfy UI more, this folder structure
will start to make sense naturally. In
the next chapters, I will always tell
you exactly where things need to go. Let
us talk a little bit about updates and
custom nodes. What you are seeing here
is the Comfy UI easy install folder.
This setup already includes everything
needed to update Comfy UI safely. The
most important rule is this. Always
close Comfy UI before updating. Never
update while it is running in the
browser. Start Comfy UI BAT. This only
launches Comfy UI. It does not update
anything. Update Comfy UI BAT. This
updates the core Comfy UI code. Use this
when you want the latest features or
fixes. Update Comfy UI and Nodes BAT.
This updates Comfy UI and all installed
custom nodes. Update Easy Install BAT.
This updates the easy install system
itself. When should you update? Update
when something is broken. Update when a
node requires a newer version. Update
when you want new features. Do not
update right before an important
project. Updates can sometimes break
workflows. If something breaks after an
update, you can usually fix it by
updating again or removing the last
custom node you installed. One important
reminder, Comfy UI moves fast. Stability
comes from not updating every single
day. If everything works, it is okay to
stay on your current version. At some
point, you will mess up Comfy UI. Maybe
a node breaks or some dependencies get
messed up or an update has bugs. But
remember, you can always do a fresh
install when that happens. Just create a
new folder and reinstall using the easy
installer. Let us double-click on update
easy install. This updates only the easy
installer and adds extra tools and
add-ons. As we move forward in this
series, more models will appear, new
nodes will be added, and IVO likes to
create scripts that make these
installations easier. When you see that
the installation is complete, you can
read more about the new release using
this link or press any key to exit. You
may not see any changes immediately, but
if we go to the add-ons folder, you can
see that we now have more BAT files than
we had in the first chapter. Now, let us
go back to the main folder and try to
update Comfy UI to see if everything
still works or if we break some nodes.
Nodes sometimes break after an update
because Comfy UI itself changes how
things work internally. Many custom
nodes are made by independent
developers, not by the Comfy UI team.
These custom nodes often rely on
specific Comfy UI behavior, internal
APIs, or extra Python libraries and
dependencies. When Comfy UI updates,
those assumptions can change and the
node stops working until its creator
updates it to match the new version. So,
Comfy UI started after the update, but
let us open the command window to see if
everything worked correctly. Usually,
after startup, you can see import times
for custom nodes. As you saw before, all
custom nodes are inside the custom nodes
folder. But look what happened here.
After the update, one of the installed
custom nodes failed to
import. That means if you have a
workflow that uses that node, it will
not work. If you do not use that node,
you can ignore it and try updating Comfy
UI again in a few days to see if it gets
fixed. I will close Comfy UI now and try
something else. Sometimes there are
newer versions of the custom nodes and
if the author fixed the issue, updating
the nodes can fix the problem. So this
BAT file updates only Comfy UI and this
one updates both Comfy UI and the custom
nodes. This process can take a while
because it updates all the nodes. So I
will speed it up. Comfy UI started. So
let us check the command window to see
if the issue was fixed. The node was
still not fixed. This means that at the
time I recorded this video, the update
from that day broke that node. When you
watch this video, it might already be
fixed and work for you. Either because
Comfy UI fixed a bug, the node creator
patched the node, or a new developer
created a replacement node. There is one
more thing I want to try. We can go back
to an older version of Comfy UI that did
work, a version that had the right
conditions for that node. The downside
is that if Comfy UI released new
features or nodes for newer models,
those might not work on the older
version. So, it is always a compromise.
You have to choose between keeping a
specific custom node working or using
the latest Comfy UI updates. Ivo hid
this option so beginners do not
accidentally mess up their Comfy UI. Let
us go to the add-ons folder, then to
tools, and here we have the version
switcher. When we run this BAT file,
Comfy UI is downgraded to a previous
version. In my case, it went from
version 0.7 back to version 0.6.
If you run this script again, it
upgrades Comfy UI back to the latest
master branch. Let us press any key to
close this. Now that we are on an older
version, it is time to check if that
node works. Let us start Comfy UI. Wait
for the interface to load. Then open the
command window and check the custom
nodes. Now it is fixed and there are no
errors with the nodes. In a few days I
will try updating again to see if it
gets fixed in the newer version. But
this is basically how you update and
downgrade Comfy UI using the easy
installer. Other Comfy UI versions might
require you to run commands manually,
but I keep pushing Ivo to create BAT
scripts for these tasks. I want to spend
my time generating, not typing lines of
code. You will see that we have the
manager here. In other versions, you
might find it somewhere else in the
menu. Let us open the manager and see
what we have here. We also have update
and update Comfy UI. These are similar
to the BAT files, but the BAT files have
something extra. They take into account
some dependencies needed for certain
custom nodes to work, which Comfy UI
itself does not handle when updating.
For example, for the nunchaku node to
work, it needs specific dependencies,
like a certain version of a library. The
BAT file updates Comfy UI, but then
adjusts or downgrades those dependencies
to the versions required by the custom
nodes we use. Ivo tries to maintain
these BAT files and keep them updated so
they stay compatible with the versions
needed to run the workflows shown in
these video tutorials. Because I am
using the easy installer, I did not
touch these update buttons inside the
manager. I only use the BAT files. If
you have a different version of Comfy
UI, you will need to use these update
options or use a BAT file from the
update folder instead. In the manager,
you can also find the latest Comfy UI
news, such as what was fixed, what is
new, and recent changes. At the bottom,
you can see the Comfy UI version and the
manager version. Most of the time the
manager is used for managing custom
nodes. If we go to the custom nodes
manager, we can see all the available
custom nodes created by different
developers. There are a lot of them. I
personally try to keep the number of
installed nodes to a minimum and install
only what is essential or what I use
most often. Some people install hundreds
of nodes, but the more nodes you
install, the harder it becomes to keep
everything compatible because each node
can have its own dependencies and
requirements. If I filter by installed,
you will usually not see many nodes here
besides the manager itself. However, I
asked Ivo to include a few essential
nodes that I use most often. One example is the rgthree custom node pack, which includes the image comparer node that is very useful for comparing images side by
side. Each custom node has a title and a
version number. You can switch versions
if needed, for example, when an older
workflow only works with a specific
version of a node. For each node, you
also have several actions available.
Update only that node, switch the
version, temporarily disable it, or
uninstall it. You can also see how many
individual nodes are included in that
custom node package along with a short
description. Some nodes mention possible
conflicts with other nodes. If you click
on that yellow warning text, you can
read more details about those conflicts.
These conflicts usually matter only if
you use both conflicting nodes in the
same workflow. You can also see the
author of the node and the number of
stars it has on GitHub. Stars are given
by users and usually indicate how
popular or trusted a project is. Some
developers are well-known and consistently release high-quality nodes.
That said, there have been cases in the
past where certain nodes had security
issues, so it is still a good idea to be
careful. You can also see when the node
was last updated. To switch versions,
you click the version selector, choose a
version from the list, click select, and
then follow the steps shown. We will not
do that right now. As you remember,
every custom node that gets installed
ends up in the custom nodes folder. Here
you can see all the custom nodes that
come with the easy install version at
the time of this recording. Now, let us
install one node as a test just to see
how the process works. Open the manager.
Go to custom nodes manager and search
for a node called align. We will use
this as a test because it does not
require special dependencies. So in
theory it should not affect Comfy UI too
much. Each node entry has a title. If
you click on it, it opens the GitHub
page for that node. On GitHub, you can
see the code because every custom node
is basically Python code and supporting
files. You can also check the issues tab
where users report problems and
sometimes solutions are discussed. If
you scroll down, you usually find
important information like required
Comfy UI versions, Python versions, or
other dependencies. These are the
dependencies I mentioned earlier, things
the developer relied on when creating
the node. You also see installation
instructions either through the manager
which we are doing now or manually using
commands like git clone which simply
copies the code into the custom nodes
folder. Before installing any custom
node, it is a good habit to read this
information. Some nodes require things
your system might not have and then they
will not work. Now let us install this
node. Click the install button. You will
be asked to choose a version. So select
the latest version. The button changes
and installation begins. When it
finishes, you will see a restart button.
Comfy UI needs to restart for the node
to become available. Click restart and
confirm. Comfy UI shuts down. You will
see the browser trying to reconnect
while Comfy is restarting. After a few
moments, you get a confirmation message.
Click confirm. The node is now
installed. If you go back to the
manager, open custom nodes manager and
search for the align node, you will see
that it now shows an uninstall button.
If installation had failed, you would
see an import failed message instead. If
you look inside the custom nodes folder
on disk, you will now see a new folder
for this node. It is simply the same
code you saw on GitHub copied locally.
This code is what adds new nodes to the
Comfy UI interface. If you deleted this
folder manually, that would also
uninstall the node. However, let us
uninstall it properly using the manager.
Go back to the manager, click uninstall
and confirm again. You will be asked to
restart Comfy UI. Confirm. Wait for the
restart and then confirm the browser
reload.
Now, go back to the custom nodes manager
and search for the align node again. You
will see the install button again which
means the node is no longer installed.
If you check the custom nodes folder,
you will also see that the folder for
this node has been removed. This is the
basic workflow for installing, updating,
and uninstalling custom nodes using the
manager. Sometimes when you download a
workflow from other people on the
internet, you will have missing nodes
because they used custom nodes that you
do not have installed. When you do not
know what nodes they used, you can use
the install missing custom nodes button.
This will give you a list of missing
nodes and the option to install them.
That said, I personally prefer to
install nodes manually so I have full
control over what gets installed. That
is why I usually include a note node in
my workflows explaining exactly which
custom nodes are required. Now let us
look at templates. If we open templates,
we can see different workflows created
by the Comfy UI team. If we filter by
image generation workflows and select
something like a Z-Image Turbo text to
image workflow, Comfy UI will first tell
us that we have missing models. These
are the AI models required for the
workflow to generate images. Usually, it
tells you exactly which folder the model
needs to go into and gives you the model
name along with a download link or a
download button. In this example, you
can see it needs a VAE model and a few
other models. Once you download those
models and place them in the correct
folders, the workflow should work,
assuming you have enough VRAM to run it.
In this case, there are no missing
nodes. So, let us close this. Now, let
us go to menu, then file, then open, and
open a workflow that I know uses missing
custom nodes. You will see a message
saying the workflow uses custom nodes
that are not installed. At first, you
might not see any red nodes on the
canvas. That is because this workflow
uses subgraphs. Subgraphs are basically
nodes that contain other nodes inside
them. If you have experience with
Photoshop, you can think of them like
smart objects. When you see an icon with
a square and an arrow, you can click it
to enter the subgraph. Once inside, you
can see the red node that is missing. If
we now open the manager and click
install missing custom nodes, Comfy UI
detects that node and offers to install
it. For many nodes, this works
perfectly. However, some nodes like
Nunchaku require additional dependencies
and extra setup. We will talk about
those in a future episode. The important
thing to know is that for many
workflows, install missing custom nodes
can quickly fix the problem. Let us
close this for now. If we open the
manager again, you will also see a
models manager. This lets you browse and
download models by type. Personally, I
rarely use this because a model without
a workflow is not very useful. In my
tutorials and on my Discord server,
every workflow comes with notes
explaining exactly which models you need
and where to put them. The Comfy UI
templates also clearly list required
models and folders. So, let us do a
quick recap.
Use update Comfy UI.bat to update only
Comfy UI. Use update Comfy UI and
nodes.bat to update Comfy UI and all
custom nodes. Use update easyinstall.bat
to update the easy install system and
helper scripts. The update folder exists
for users with other Comfy UI versions.
The add-ons folder only exists in the
easy install version. Inside add-ons,
the tools folder includes the version
switcher, which lets you downgrade or
upgrade Comfy UI if needed. This is
useful when a new update breaks a node
you rely on. Inside the Comfy UI folder,
the custom nodes folder contains all
installed custom nodes. If you delete a
folder from here, you uninstall that
node. Sometimes if a node fails to
install correctly, deleting its folder
and reinstalling can fix the issue. I
know this is a lot of information. Do
not worry if it does not all stick right
away. Practice, experiment, and come
back to this tutorial in a month. You
will be surprised how many things
suddenly make sense that you missed the
first time. Regarding the TeaCache node,
after a few days, Comfy UI was updated
again and the problem was still not
fixed. There is now version 0.8 and even if you downgrade to version 0.7, it is
still not fixed. Comfy UI keeps adding
updates and at some point some custom
nodes will stop working. If that node is
not important for you, you can delete it
or uninstall it. You can also just
disable it from the manager or drag the
TeaCache folder into the disabled folder so
it is disabled. You can move it back out
of the disabled folder anytime you want
to try it again. In this chapter, I will
try to simplify this complex world of
diffusion and AI a little. Do not worry
if you do not understand everything that
is happening. Like I said before, you do
not have to be a mechanic and know all
the engine parts to know how to drive a
car. This is the core idea behind
diffusion image generation. The model
does not draw an image all at once. It
starts from pure random noise. This
noise looks like static on a television.
The model then runs a sequence of small
refinement steps. At each step, a small
amount of noise is removed. Early steps
reveal very rough shapes. Later steps
reveal clearer forms. Final steps add
fine details and texture. Image
generation is therefore a gradual
process. It goes from noise to less
noise to recognizable shapes and finally
to a finished image. This slide is a
simplified visualization. The real
process is more complex. In practice,
most diffusion models work in a
compressed latent space rather than
directly on pixels. A neural network
predicts what noise should be removed at
each step. Even though the real math is
more advanced, this simplified view is
enough to understand how diffusion
works. It's like sculpting. You start
with a rough block and remove material
until the shape appears. or like a foggy
window clearing up step by step. You
don't instantly get a sharp scene. It
resolves gradually. Let's open Comfy UI.
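The gradual denoising described above can be sketched as a toy loop in Python. This is only an illustration of the idea; a real diffusion model uses a trained neural network to predict the correction at each step, and the flat "target" here is a made-up stand-in:

```python
import random

def toy_denoise(seed, steps, size=8):
    """Toy illustration of iterative denoising: start from random
    noise and, at each step, remove a fraction of the difference
    to a 'clean' target. A real model predicts this correction
    with a neural network; here the target is just a flat image."""
    rng = random.Random(seed)          # the seed fixes the starting noise
    image = [rng.random() for _ in range(size)]
    target = [0.5] * size              # stand-in for the learned result
    for _ in range(steps):
        # each step removes part of the remaining "noise"
        image = [x + 0.3 * (t - x) for x, t in zip(image, target)]
    return image

rough = toy_denoise(seed=10, steps=1)    # one step: still mostly noise
clean = toy_denoise(seed=10, steps=35)   # many steps: close to the target
```

With one step the values are still far from the target, just like a one-step generation that is still mostly static; with many steps almost all of the noise is gone.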
Go to workflows and from the getting
started folder, pick workflow 1, which
is the basic text to image example. Even
if we cannot fully see what is happening
inside the KSampler step by step, we can
still get a good idea of the overall
process. Remember what we see here is a
simplified representation of what is
actually happening under the hood. First
we want a fixed seed. We will see later
that each seed starts with different
noise. Right now we are using 35 steps
which is enough for this model to
produce a clear image like this robot.
If we change the steps to one, you can
see that the model does not have enough
time to remove the noise. So the image
is very unclear with these settings. If
we add another step, the change is
subtle. Adding another one, you can
start to see something forming. By step
four, we can almost see a face. We can
automate this process to see the changes
faster. Double-click on the canvas and
add a primitive node. Like you saw in an
earlier chapter, we can adapt this node
for different fields. Drag a connection
from the primitive node and connect it
to steps. Now we have control over the
steps including what happens after each
generation. Instead of fixed or random,
choose increment. After each run, the
value increases by one. So now we have
five steps. If we run it again, we get
six steps and the image starts to change
more. As more steps are added, more
noise is removed and the image becomes
clearer. Next to the run button, there
is a small down arrow. From here, select
run instant. This means we can click run
once and it will keep running until we
stop it. You can see the workflow now
runs automatically. On each run, more
steps are added and the image keeps
refining. You may also notice that as
the number of steps increases, it
becomes harder for the computer. Just
like climbing many stairs, more steps
mean more effort. So, generation becomes
slower and slower. Soon we reach around
35 steps which is recommended for this
model to get a nice clear image.
Although some results already look good
around 20 steps. Now we want to stop
this. Click the arrow again and switch
back to run. After the current
generation finishes, it will stop. There
is also another way to see a small
preview of what is happening inside the
KSampler. From the menu, you can go to
settings, but it is faster to access the
settings from here. In the settings
search bar, type preview. You will see
an option called live preview method. By
default, it does not show anything. But
if we set it to auto, we can see a small
preview during generation. Let's delete
the primitive node. Then change the seed
to random. Now when we run the workflow,
we can see a small preview of what the
image might look like before it even
finishes generating. Let us change the
steps to 30 and run again. You can now
quickly see what is happening in the
diffusion process. Even though this
preview is low resolution, you can
clearly see how the image becomes more
and more defined as noise is removed.
Now, let me try something more drastic.
I will use a very large image size. On
some computers, this might crash Comfy
UI or take a very long time to generate.
I will run it again with these settings.
You can see that generation is now very
slow. But the preview lets us observe
how the image slowly starts to appear.
This is a bit too slow. So I will cancel
the generation here. Instead, I will try
a slightly smaller image, still larger
than what the model is comfortable with,
just so we can see the preview updating
more slowly. Now we can clearly see the
diffusion process updating every few
seconds. The speed of this preview is
also influenced by the sampler and the
scheduler. As you may remember, models
are trained on specific image sizes. If
a model was not trained on large images,
it treats them more like multiple
smaller images stitched together. For
example, our Juggernaut model was
trained on 512 pixel images only.
Personally, I prefer not to keep the
live preview enabled all the time
because it can slightly slow down
generation. So I will go back to
settings and set the preview option back
to default. I will also reset the image
width and height. You may notice that
the preview is still visible. This can
happen because something remains in
memory. To fix this, I will press F5 to
refresh the browser. Keep in mind that
refreshing the browser will reload only
the current workflow. If you had other
workflows open and did not save them,
they will be lost. Now everything is
back to normal without the preview.
There are still more useful things to
learn. This slide explains how a
diffusion model is trained. This is not
image generation yet. During training,
the model is shown millions of images
paired with text descriptions. For
example, images of cats, people,
objects, lighting styles, and
environments. The training process uses
something called forward diffusion.
Forward diffusion means gradually adding
noise to a clean image. At first, only a
small amount of noise is added. Then
more noise is added step by step.
Eventually, the image becomes almost
pure noise. At each step, the model is
trained to predict what noise was added.
In other words, it learns how images
break down as noise increases. By
repeating this process across millions
of images, the model learns patterns. It
learns what shapes look like. It learns
what objects look like. It learns how
lighting and structure behave. The goal
of training is not to memorize images.
The goal is to learn how to reverse this
process later. Training a diffusion
model requires massive data sets and
powerful hardware. In Comfy UI, we are
only using the result of that training.
Now that the model has learned how noise
works during training, we can use that
knowledge in reverse to generate images.
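The forward diffusion just described can be sketched in a few lines. Here the image is just a list of numbers and the noise is Gaussian; the point is that each step produces a training pair of the noisy image plus the exact noise that was added, which is what the model learns to predict (a toy sketch, not real training code):

```python
import random

def forward_diffusion(clean, steps, noise_scale=0.2, seed=0):
    """Toy forward diffusion: add a little noise to a clean 'image'
    (a list of numbers), step by step. Each step yields a training
    pair: the noisy image and the noise that was just added."""
    rng = random.Random(seed)
    image = list(clean)
    pairs = []
    for _ in range(steps):
        noise = [rng.gauss(0, noise_scale) for _ in image]
        image = [x + n for x, n in zip(image, noise)]
        pairs.append((list(image), noise))   # what the model trains on
    return pairs

clean = [0.2, 0.8, 0.5, 0.1]
pairs = forward_diffusion(clean, steps=50)  # image drifts toward pure noise
```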
This slide shows the difference between
training and image generation. During
training, the model starts with a clean
image. Noise is added step by step until
the image becomes pure noise. This is
called forward diffusion. This process
teaches the model how images break down
when noise is added. During generation,
the process is reversed. We start from
pure random noise. The model removes
noise step by step to create an image.
It is important to understand this
clearly. During generation, we do not
add noise like in training. We only
remove noise using what the model
learned before. This slide explains an
important concept that is often
misunderstood. The model does not store
images in memory. During training, the
model never saves photos that it has
seen. Instead, it learns patterns and
relationships.
It learns what shapes look like. It
learns what objects look like. It learns
how parts of an image relate to each
other. For example, it learns that faces
usually have eyes in a certain position.
It learns that animals have specific
structures. It learns how lighting,
shadows, and perspective usually behave.
All of this knowledge is stored as
probabilities inside the model, not as
pictures, but as learned rules. You can
think of it like learning a language.
You do not memorize every sentence you
read. You learn grammar and structure.
The model works the same way. It learns
visual grammar, not individual images.
When the model generates an image, it is
not copying anything it has seen before.
It is using learned patterns to guide
the noise removal process. That is why
results can look familiar but are still
new images. This is why changing the
prompt changes the result. The prompt
activates different learned patterns
inside the model. That is also why the
same model can generate many different
images even though it was trained only
once. So far we talked about diffusion
in a simplified way as if it happens
directly on images. In reality, most
modern diffusion models do not work
directly on pixel images. Instead, they
work in something called latent space.
Pixel space is the image as we normally
see it. It is made of pixels with width,
height, and color values. Latent space
is a compressed representation of that
image. It keeps the important structure
and information but removes unnecessary
detail. You can think of latent space as
a simplified version of the image that
is easier for the model to work with. To
move between pixel space and latent
space, the model uses a VAE. VAE stands
for variational autoencoder. The VAE
has two main jobs. First, it encodes a
pixel image into latent space. Second,
it decodes a latent image back into
pixels. During image generation,
diffusion happens in latent space. After
the denoising process is finished, the
VAE decodes the result back into a
visible image. Working in latent space
makes diffusion much faster. It also
uses less memory and less computing
power. This is why models like stable
diffusion can run on consumer graphics
cards. Without latent space, image
generation would be much slower and more
expensive. In Comfy UI, this is why we
see nodes like VAE encode and VAE
decode. When we generate images from
text, the model works in latent space
and VAE decode converts the result into
pixels we can see and save. This also
explains why image resolution and VAE
selection can affect results.
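To make the pixel space versus latent space idea concrete, here is a toy "encoder" and "decoder". A real VAE is a trained neural network, not simple averaging; this only shows that the latent is a smaller representation that can be expanded back into pixels:

```python
def toy_encode(pixels, factor=4):
    """Toy 'VAE encode': compress by averaging blocks of pixels."""
    return [sum(pixels[i:i + factor]) / factor
            for i in range(0, len(pixels), factor)]

def toy_decode(latent, factor=4):
    """Toy 'VAE decode': expand each latent value back into pixels."""
    return [value for value in latent for _ in range(factor)]

pixels = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
latent = toy_encode(pixels)      # 8 values compressed to 2
restored = toy_decode(latent)    # back to 8 values, fine detail smoothed
```

Diffusion runs on the small latent, which is why it needs far less memory; the decode step at the end is what turns the result back into a visible image.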
Now we look at how text prompts
influence image generation. The prompt
does not act only once at the beginning.
During diffusion, the prompt is used at
every denoising step. At each step, the
model checks whether the image is moving
closer to what the text describes. You
can think of the prompt as guidance. It
gently nudges the image in the right
direction while noise is being removed.
This happens repeatedly, step by step,
until the final image is formed. CFG
stands for classifier free guidance. CFG
controls how strongly the prompt
influences the denoising process. With a
low CFG value, the model follows the
prompt loosely and allows more
randomness. With a high CFG value, the
model follows the prompt more strictly
and forces the image to match the text
more closely. Here is a quick example.
You can find CFG here in the KSampler.
Too low CFG can produce images that
ignore the prompt. Too high CFG can
produce images that look unnatural or
oversharpened. CFG is like telling the
model how strict it should be about your
instructions. The prompt does not
generate the image by itself. The prompt
only guides the noise removal process.
The image is still created by diffusion
in latent space. As you can see with
CFG 1, the cat is still a cat, but it is not red like we asked. With CFG 7, the
result is much closer to the prompt.
That said, this also depends on the
model we are using. Smarter or better
trained models tend to follow the prompt
more accurately. In fact, there are some
models where we intentionally use a
fixed CFG value of one, which
effectively ignores the negative prompt.
However, pushing CFG too high can damage
the image. It can introduce artifacts or
make the result look unnatural. Because
of that, we always try to find a
balance. The goal is to use settings
that give us the quality we want in the
shortest amount of time without hurting
the final image.
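The usual classifier-free guidance formula combines two predictions: one made with the prompt and one made without it. A simplified scalar version (real models apply this to latent tensors at every denoising step):

```python
def apply_cfg(uncond, cond, cfg):
    """Classifier-free guidance: start from the unconditional
    prediction and push toward the prompt-conditioned one.
    cfg = 1 returns the conditioned prediction unchanged (which is
    why CFG 1 effectively ignores the negative prompt); higher
    values exaggerate the prompt's pull, which can cause artifacts."""
    return [u + cfg * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.5, 0.5]   # prediction with no prompt (or the negative prompt)
cond = [0.9, 0.1]     # prediction guided by the positive prompt
loose = apply_cfg(uncond, cond, cfg=1.0)   # equals cond
strict = apply_cfg(uncond, cond, cfg=7.0)  # pushed well past cond
```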
Now we talk about seeds. Seeds are very
important for understanding consistency
and variation. A seed defines the
starting noise used to generate an
image. You can think of it as the
initial random pattern the model starts
from. When diffusion begins, the model
always starts from noise. The seed
decides exactly what that noise looks
like. If you use the same prompt, the
same settings, and the same seed, you
will get the same image every time. If
you change the seed, you change the
starting noise, and the final image will
be different. The prompt guides the
process, but the seed decides the
starting point. Different starting noise
leads to different results, even when
everything else stays the same. You can
think of the seed like rolling a die
before starting. If you roll the same
number, you start from the same
situation. If you roll a different
number, the outcome changes. This is a
simplified explanation. The seed
controls a random number generator used
internally by the model. You do not need
to understand the math behind it. You
only need to know that seeds control
repeatability. Let us put it into
practice. The seed is this number here.
It can start from zero and go up to a
very large number. So each seed can
produce a slightly different result. If
you also change the prompt and settings,
you can get millions of different
results. We can control the seed
behavior. If we set it to fixed, we
generate once and the result will never
change. To generate something new, we
need to change other settings. If we
choose increment, after each generation,
the seed number will increase by one. If
we choose decrement after each
generation the seed number will decrease
by one. So let us change it to fixed and
set the seed to 10. When I generate I
get this robot. Now let us change the
seed to 15. You can see that I get a
different robot this time in profile. If
I change the seed back to 10, I get the
previous robot again because we used the
same prompt, the same settings, and the
same seed.
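The repeatable behavior we just saw can be shown directly: the seed initializes a random number generator, and that generator produces the starting noise. This sketch uses Python's built-in generator; real samplers use their own, but the principle is the same:

```python
import random

def starting_noise(seed, size=4):
    """The seed fixes the state of a random number generator,
    so the same seed always yields the same starting noise."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(size)]

assert starting_noise(10) == starting_noise(10)   # same seed: identical noise
assert starting_noise(10) != starting_noise(15)   # different seed: new noise
```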
In prompts, the order of the words
matters.
With this prompt, I got this image
because house was first. So, the model
focused on the house and mostly ignored
the car. With newer models, this happens
less often. But this is an older model,
so the effect is more noticeable. Now,
look at what happens if I put car first
and then house. This time, we clearly
get both a car and a house. Words that
appear earlier in the prompt usually
have more influence than words that come
later. You can think of the prompt as a
list of priorities. The model pays more
attention to the beginning and gradually
less attention as it moves toward the
end. On top of that, some words can
carry more weight, either because of how
the model was trained or because we
explicitly give them extra emphasis.
Because of this, two prompts with the
same words but in a different order can
produce noticeably different results.
Think of the prompt like giving
directions to someone. If you say a red
cat sitting on a chair in a room with
soft lighting, the most important idea
is red cat. Everything after that adds
detail, but the core idea comes first.
We can also add more weight to a word by
using round brackets. Right now, house
has more weight. So the model pushes the
car into the background and it is no
longer the main focus. If I add even
more brackets, the influence of house
becomes even stronger and now the car
disappears completely. If I instead add
more weight to the word blue, you will
see more blue appear in the generation.
One more thing you might notice is that
there is no spell check by default.
Sometimes it can be useful to turn it
on. To do that, go to settings,
search for spell, and enable text area
widget spell check. Now, words that are
misspelled or not part of the dictionary
will be underlined.
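How much extra weight brackets add depends on the frontend, but a common convention is that each pair of round brackets multiplies a word's weight by about 1.1. A toy calculator under that assumption (the exact factor in your Comfy UI version may differ):

```python
def bracket_weight(levels, factor=1.1):
    """Common convention: each nesting level of round brackets
    multiplies the word's attention weight by ~1.1. The exact
    factor depends on how the frontend parses the prompt."""
    return factor ** levels

plain = bracket_weight(0)      # house       -> weight 1.0
single = bracket_weight(1)     # (house)     -> weight 1.1
triple = bracket_weight(3)     # (((house))) -> weight ~1.33
```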
Now, we talk about denoising steps.
Steps control how many refinement passes
the model performs during generation.
Each step removes a small amount of
noise. The image is not created in one
action. It is refined little by little,
step by step. When you increase the
number of steps, the model has more
chances to clean up noise and add
detail. When you decrease the number of
steps, the process is faster, but the
image can look rough or incomplete. More
steps means slower generation and more
refinement. Fewer steps means faster
generation and less refinement. There is
always a balance between speed and
quality. You can think of steps like
polishing an object. More polishing
gives a smoother result. Less polishing
is faster but rougher. In Comfy UI,
steps are set inside the KSampler node.
For most models, a good starting range
is between 20 and 30 steps. Going much
higher often gives diminishing returns.
Going much lower is useful for fast
previews. Steps work together with the
seed and the prompt. The seed decides
the starting noise. The prompt guides
the direction. Steps decide how far the
refinement goes. Now we are ready to
look at a real workflow in Comfy UI.
This is called text to image. Often
shortened to text to img. Text to image
means we start from pure noise and
generate an image only from text
instructions. There is no input image
involved. This is usually the first
workflow people learn and it is the best
way to explore ideas and styles from
scratch. We start by loading a model.
This model contains everything the AI
learned during training. Next, we give
the model instructions using a text
prompt. This describes what we want to
see in the image. We also define the
image size using an empty latent image.
This decides the resolution before the
image is generated. Then the K sampler
runs the diffusion process. This is
where noise is removed step by step
guided by the prompt. After that, the
VAE decodes the latent result into a
visible image. Finally, the image is
saved to disk. Use text to image when
you want to explore new ideas, you want
to test prompts and styles, or you are
starting from nothing. This workflow is
ideal for concept art and
experimentation. But we can also start
from an image, not just from pure noise. In that case, instead of beginning with random noise, we use an existing image as the starting point and apply denoise on top of it. You can think of denoise as how much freedom the model has to change the image. With low denoise, the model stays very close to the original image. With higher denoise, it moves further away and behaves more like text to image. So rather than generating everything from scratch, we are guiding the diffusion process using an image as the base and then controlling how much it changes using the denoise value. Image to
image is like starting with a rough
sketch and deciding how much you want to
redraw it. You can see that in the text
to image workflow we have empty latent
image. That node generates the noise. In
this workflow, we have an image that is
encoded to latent so it can go to the K
sampler. Let me show you how I did it. I
removed the empty latent image node.
Then I doubleclicked on the canvas and
added a load image node. From here we
can load an image and I will choose this
robot. Now you can see it does not have
a latent output. So we cannot connect it
to the K sampler yet. So we need a VAE.
If we look we have decode and encode. We
already have VAE decode when it converts
from latent to pixels. Now we want to
encode it. An easy way to find the right
node is to drag a link and release it.
And you will see a suggestion for VAE
Encode. Now we have a latent output which
means we can connect it to the K sampler
which is what we want. If we try to run
it like this, something is missing. It
says missing VAE. You can see a big
red outline around the node with the
problem and a small circle around the
input which means we need a connection
there. So let us connect it to the VAE.
In this case, the VAE is included in the
main model. So we connect it from there.
Now we encoded it and then we decode it.
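The encode and decode pair can be sketched as a round trip between pixel space and latent space. A small illustrative helper, using the 8x downscale factor common to Stable Diffusion VAEs (the function name is hypothetical, not a ComfyUI API):

```python
def latent_size(width, height):
    """A VAE encodes an image into a latent that is 8x smaller
    in width and height; decoding goes the other way."""
    return width // 8, height // 8

# Encode: 512x512 pixels -> 64x64 latent. The KSampler works
# on this smaller latent, and VAE Decode turns it back to pixels.
print(latent_size(512, 512))  # (64, 64)
```

This is why the latent is cheap to work with: it has far fewer values than the full pixel image.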
Let us run again. And now it works. But
the result is still different from my
input. We have the right prompt, but
something is influencing it. Remember
this. Every time you use an image as
input, we need to adjust the denoise
because that controls how much the image
changes. With the default value of one,
it is at the maximum. So, it changes the
image too much. Let us change it to 0.2
and see how that affects it. Now, you
can see it is very similar to the
original. It is hard to tell what parts
changed. Let us increase it to 0.5.
Now, we can see more changes in the
robot face. There is an easy way to
compare these images. Double click on
the canvas and search for Image Comparer.
This is part of the rgthree node pack. You
can see it has two inputs, image A and
image B. I want to compare the original
image. So I will connect the load image
output to image A. For the second image,
remember the save image node is only for
saving to disk. The image we want to
compare is the one coming out of VAE
decode. So we connect that to image B.
Now let us run the workflow. We get this
small preview. Let me make it larger. It
is still too small. So I will move some
nodes to make space so you can see it
better. By default, it shows image A,
the original. When we move the mouse
over to the right, it shows the second
image. Now it is much easier to compare
before and after. If I change the denoise
to 0.1, we get a very similar result
because the amount of denoising is
small. If I change it to 0.9,
we get a big variation. All of this is
also influenced by the sampler, the
scheduler, and the model itself. But
in general, this is how it works. I
prefer to start with values around 0.7.
If that is too much, I reduce to 0.5
and keep adjusting until I like the
result. Another thing you should know is
that the input image size influences the
result size. Since we do not have an
empty latent image node where we set
width and height, the loaded image
decides the size. Comfy UI will also
round the size to a multiple of 8. For
example, if your image is 511 pixels, it
will be rounded down to the nearest
multiple of 8, which is 504.
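That rounding can be written in one line. A sketch of the idea, not ComfyUI's exact internal code:

```python
def round_to_multiple_of_8(size):
    """Round a dimension down to the nearest multiple of 8,
    since latent images work in 8-pixel units."""
    return (size // 8) * 8

print(round_to_multiple_of_8(511))  # 504
print(round_to_multiple_of_8(512))  # 512
```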
You can also control the input size by
resizing or cropping it, like you saw in
the earlier chapters. For example, I can
add an upscale image node here, then
redo the connections so the image passes
through it. I can upscale to a bigger
size with the same ratio. Now when I run
it, the final image should be larger
because it follows the input image size.
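One common way samplers treat the denoise value, a simplified sketch since the exact behavior varies by sampler, is to skip the early steps: with N total steps and denoise d, roughly the last d times N steps run on the noised input image. The hypothetical helper below only illustrates that relationship:

```python
def steps_actually_run(total_steps, denoise):
    """Toy model: with denoise < 1.0 the sampler starts partway
    through the schedule, so only a fraction of the steps
    actually modify the image."""
    run = round(total_steps * denoise)
    return max(0, min(total_steps, run))

# denoise 1.0 behaves like text to image: every step runs.
print(steps_actually_run(20, 1.0))  # 20
# denoise 0.2 stays close to the input: only a few steps run.
print(steps_actually_run(20, 0.2))  # 4
# denoise 0.7, a common starting point for image to image.
print(steps_actually_run(20, 0.7))  # 14
```

This is why low denoise keeps the original structure: the model simply has fewer steps in which to change it.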
Now we are going to talk about samplers
and schedulers which you can find here
in the K sampler. This is one of the
most confusing parts at first, but the
idea is actually simple. Everything
begins with the same initial noise. The
seed defines that noise, but once the
noise exists, two different systems
control what happens next. The sampler
decides how noise is removed. It defines
the strategy the model uses to go from
noisy to clean. Different samplers use
different mathematical paths to
denoise. Some remove noise more
directly. Some refine the image
gradually. Some are more random and
creative. Some are more stable and
precise. Even with the same prompt, the
same seed, and the same number of steps,
changing the sampler can change the
final image. So the key idea is this.
Sampler controls how each denoising step
is calculated. Or in simple terms,
sampler equals how noise is removed.
The scheduler does not change how denoising
works. It changes when denoising happens
during the steps. A linear scheduler spreads
denoising evenly across all steps. Each
step removes roughly the same amount of
noise. A nonlinear scheduler removes noise
faster at the beginning and slower near
the end. This allows fast structure
early and fine detail later. Both
approaches can reach a clean image, but
they feel different in how detail is
introduced. So the key idea here is this.
The scheduler controls when noise is removed,
or simply, scheduler equals when noise
is removed. Sampler and scheduler always
work together. You never choose one
without the other. The sampler chooses
the denoising method. The scheduler
chooses the timing of that denoising.
The same noise plus a different sampler
or a different scheduler can
produce different results.
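The linear versus nonlinear idea can be sketched as two ways of spacing noise levels across the same number of steps. These are toy numbers, not a real sigma schedule from any actual scheduler:

```python
def linear_schedule(steps):
    """Noise level drops by the same amount every step."""
    return [1.0 - i / steps for i in range(steps + 1)]

def nonlinear_schedule(steps, power=2.0):
    """Noise drops quickly at first and slowly near the end,
    leaving the late steps for fine detail."""
    return [(1.0 - i / steps) ** power for i in range(steps + 1)]

lin = linear_schedule(4)     # [1.0, 0.75, 0.5, 0.25, 0.0]
non = nonlinear_schedule(4)  # [1.0, 0.5625, 0.25, 0.0625, 0.0]
# After one step the nonlinear schedule has removed much more noise.
print(lin)
print(non)
```

Both lists start at full noise and end at zero; only the timing in between differs, which is exactly the scheduler's job.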
Let us do a little experiment in comfy
UI. From workflows, I open again this
text to image workflow and I change the
seed to fixed. Then I run the workflow.
With this sampler and scheduler, we get this
robot. Here we have a lot of samplers
and schedulers. Depending on the model
we use, some work better than others.
Let us say I pick the Euler sampler. Now
when I run it, even if the seed and
prompt are the same, the result is
slightly different because the sampler
influences how the denoise is
applied. Let us say I also change
the scheduler to simple. Now the result will
again be different because the scheduler
changes when the denoising happens
during the steps. Because the model we
use is quite small, we can actually
preview multiple results at the same
time. So I hold the control key and drag
over these three nodes. Then I use
Ctrl + C to copy them and Ctrl +
Shift + V to paste them with the links
connected. Now this workflow will
generate two images and has two K
sampler nodes. Let me use Ctrl +
Shift + V again to get a third one. Now
this workflow uses the same seed and
prompt with three different k sampler
nodes. And I want to change the samplers
and schedulers for each one. You can play with
these all day and try many combinations.
I will choose something random for this
example.
Now when I run it, you can see it
generates an image for each sampler.
Some results are quite similar, but some
details are different. For example,
parts of the robot may change from one
image to another. Let me now put the
same sampler on all of them and use
different schedulers only so we can see
how the timing of denoising
influences the result.
Again, the differences are subtle but
they are there. Sometimes this can mean
one image has five fingers and another
has six. So having options is useful
especially when you want small
variations. Now let us double click on
the canvas and add a primitive node. I
want to control the steps value for all
three K samplers, but I do not want to
change it manually on each one. So I
drag a connection from the primitive
node to the steps input of the first K
sampler. Then do the same for the second
and the third one. Now from this single
node I can control all three. If I
change steps to one, you can see we get
very similar results. If I change steps
to three, you can already see
differences. Some schedulers are faster. For
example, with one, the image is still
very noisy, while with another, you can
already see a shape forming. If I change
to four steps, the differences become
more visible. At five steps, some start
to form clearer shapes. At six steps,
some images already show eyes and a main
structure. At eight steps, the middle
one is almost fully formed. At 10 steps,
almost all of them have something that
could work for certain concepts. And at
20 steps, most of them have enough
detail to be usable in a project.
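What we just watched, shapes forming as steps increase, follows the pattern of diminishing returns mentioned earlier. A toy sketch of that curve, assuming each step removes a fixed fraction of the noise that is left (not the real sampler math):

```python
def remaining_noise(steps, removal_per_step=0.3):
    """Toy model of iterative refinement: each step removes a
    fraction of the noise still present, so early steps change
    a lot and late steps change very little."""
    noise = 1.0  # start from pure noise
    for _ in range(steps):
        noise *= (1.0 - removal_per_step)
    return noise

print(remaining_noise(5))   # still fairly noisy
print(remaining_noise(20))  # almost clean
print(remaining_noise(30))  # barely different from 20 steps
```

This is why going from 5 to 10 steps changes the image dramatically, while going from 20 to 30 often does not.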
Usually, the people who create AI models
suggest specific samplers and
schedulers, or the community tests them
and shares which ones work best. This
way, you do not have to test everything
yourself for every model. But if you do
find good settings, it is always a good
idea to share them with the community so
everyone can improve their image
generation results. Let's talk a little
about subgraphs in Comfy UI. Go to
workflows and open the juggernaut text
to image workflow. Here you can see a
bunch of nodes. Just like before, hold
the control key and drag to select most
of the nodes except the export node,
which in this case is the save image
node. Now that the nodes are selected,
look at the icons at the top. One of
them says convert selection to subgraph.
You can see its name when you hover over
it. When you click it, all those selected nodes are
combined into a single node. If you
right click on this new node, you will
see an option called unpack subgraph.
When you click it, the nodes go back to
how they were before. Let's do it again.
Select two or more nodes, then use the
subgraph button to create a subgraph.
Resize it so it is easier to see. A
subgraph is a way to group multiple
nodes into a single reusable block.
Instead of showing a long chain of nodes
every time, you collapse them into one
node that represents an entire process.
You can think of a subgraph like a
function or a macro. Inside it there can
be many nodes, but from the outside it
looks simple. It is very similar to
smart objects in Photoshop which can
contain multiple layers inside a single
object. Subgraphs solve three main
problems. First, they reduce visual
clutter. Large workflows can become
messy very quickly and subgraphs help
keep things readable. Second, they help
reuse logic. If you repeat the same
setup many times, like a prompt encoding
chain or an image pre-processing step,
you can reuse it instead of rebuilding
it every time. Third, they make
workflows easier to explain and share.
People understand a few clean blocks
much faster than dozens of individual
nodes. At the time I recorded this
tutorial, subgraphs were still being
improved and may still have some bugs. A
subgraph does not make a workflow faster
by itself. It is about organization, not
performance. Performance depends on the
nodes inside the subgraph, not on the
subgraph wrapper. A subgraph is like
putting many Lego pieces into one box
and labeling the box with what it does.
All the pieces are still there. You just
do not need to see them all the time.
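Since a subgraph behaves like a function, the idea can be sketched in plain Python: several steps hidden behind one call that looks simple from the outside. The node-like helper names here are hypothetical stand-ins, not ComfyUI code:

```python
# Hypothetical helpers standing in for individual nodes.
def load_checkpoint(name):
    return {"model": name}

def encode_prompt(text):
    return {"cond": text}

def sample(model, cond, steps):
    return f"latent({model['model']}, {cond['cond']}, {steps})"

def text_to_image_subgraph(model_name, prompt, steps=20):
    """Acts like a subgraph: many nodes inside, one call outside."""
    model = load_checkpoint(model_name)
    cond = encode_prompt(prompt)
    return sample(model, cond, steps)

# From the outside you only see one block with a few inputs.
print(text_to_image_subgraph("juggernaut", "a robot"))
```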
You can see the title says new subgraph.
Let's double click on that and rename it
to something that makes sense like text
to image and maybe also include the
model name. So, juggernaut text to
image. Now, it looks like a simple
workflow with only two nodes. I do not
like the order in which things appear in
the node. So, let's right click on the
node and select edit subgraph widgets.
Here you can choose what parameters to
show in that node and what to hide. Let
me hide all of them so we have a clean
subgraph that does not show any
parameters. You can enable them one by
one later if you want only the ones you
need. But we will build those manually
so you understand them better. Let's
close this panel. Now let's go inside
the subgraph. You can see that all the
nodes are there plus some input and
output. On top you can see a new tab
next to the workflow name. If I click on
the main workflow name that is how we
exit the subgraph. From there we can go
back inside and from inside we can go
back outside. You can see the output
where the image is saved. If we go
inside the subgraph that image output
appears here as a link. From this dot we
can drag a connection to where it says
checkpoint name. Now that field becomes
gray just like when we added a primitive
node before. If we go back outside you
can see that the checkpoint name appears
here. Let's go back inside again, double
click on that name, and rename it to
model to see what happens. Now, when we
go back to the main workflow, you can
see it says model instead of checkpoint
name. So, this is very customizable.
Let's go back inside and drag another
connection, this time to the positive
prompt and rename it so we know what it
is. Do the same for the negative prompt.
Now, when we go back outside, we have
positive and negative prompt visible. Go
back inside again and drag connections
to width and height. And maybe do the
same for all the parameters from the K
sampler. Now we can see all those
parameters exposed here. And when we go
back outside, we have this single node
that acts like a mini interface that can
control everything we need. You might
say that it looks nice, but does it
actually work? Let's try it. And the
answer is yes, it works. If you right
click on it, you can see it still has
other options like node color, bypass
and so on. With that subgraph selected,
right click on the canvas this time and
you will see an option called save
selected as template. It asks for a
name. I will name it juggernaut text to
image. Then press enter or confirm. It
looks like nothing happened, but where
was that template saved? Let's open a
new workflow. Now right click on the
canvas and go to node templates. You can
see that name there now. And you also
have the option to manage templates and
remove them. When I select that
template, it is added to the canvas with
all the nodes, connections and settings
it has inside. Now we can just drag a
link from the image output and add a
save image node or connect it to other
nodes to create more complex workflows.
Over time, this simplifies workflows
because we can organize them into pieces
and group them by category or function.
Let's go back to the first workflow just
to show you that any node or combination
of nodes can be saved as templates.
There are cases where some connections
can break when some nodes are inside a
subgraph and others are outside. So,
keep that in mind. For example, I use
this Pixaroma note node a lot. I want to
save it as a template so I can access it
easily next time. This might not be
useful for everyone, but as a workflow
and tutorial creator, I use this a lot.
I will save it as a template and give it
a name. Now I can go to any other
workflow and quickly access that
template from anywhere. You can also
have subgraphs inside other subgraphs
like boxes inside boxes.
You can disconnect or remove links at
any time. I could select two nodes here
and combine them into another subgraph
or go outside and combine all these
nodes even if some are simple nodes and
one is already a subgraph and it will
still let me create a new subgraph. If
we go inside all those nodes are there.
If we go back outside we can unpack it
using the icon or rightclick and choose
unpack subgraph. These things will make
more sense as you work with them in
practice. So play with them and have
fun. When you see that icon on a node,
you know it is a subgraph. It also has
the icon that lets you go inside the
subgraph, which is another indicator
that it is not a simple node. Remember
that you can also use the interface to
edit subgraph widgets. One thing I
forgot to show is that you can use those
dots to rearrange the order of the
parameters shown in the subgraph node.
This way you do not need to go inside
it. Most of the time you can control
things directly from the outside. Now we
are going to talk about LoRAs. LoRA
stands for low rank adaptation. In
simple terms, a LoRA is a small add-on
that modifies how a base model behaves.
A LoRA does not replace the model. It
works together with the model. You can
think of the base model as the main
photographer we hired earlier. A LoRA is
like giving that photographer extra
experience in a specific style or
subject. Why do LoRAs exist? Training a
full model is very expensive. It
requires a lot of images, time, and
powerful hardware. LoRAs exist to solve
this problem. Instead of retraining a
full model, we train a small adapter
that teaches the model something new.
This could be a specific art style, a
character, a face, a pose style, or a
lighting style. LoRAs are much smaller
than full models. That is why they are
easy to download and experiment with. So
remember, a LoRA does not work by
itself. It always needs a base model and
a compatible architecture. For example,
a Stable Diffusion 1.5 LoRA needs a
Stable Diffusion 1.5 model. An SDXL
LoRA needs an SDXL model and so on. If
you mix incompatible models and LoRAs,
the results will be broken or random.
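The "low rank" in the name can be illustrated with tiny matrices: instead of storing a full weight update, a LoRA stores two small matrices A and B whose product is the update, so the effective weight is W plus strength times B matmul A. This is a pure Python sketch of that idea, not ComfyUI code:

```python
def matmul(b, a):
    """Multiply two matrices given as lists of rows."""
    rows, inner, cols = len(b), len(a), len(a[0])
    return [[sum(b[i][k] * a[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

def apply_lora(w, a, b, strength=1.0):
    """W' = W + strength * (B @ A): base weights plus a low-rank delta."""
    delta = matmul(b, a)
    return [[w[i][j] + strength * delta[i][j]
             for j in range(len(w[0]))] for i in range(len(w))]

# A 4x4 weight matrix stores 16 numbers; a rank-1 LoRA stores only 4 + 4.
w = [[1.0] * 4 for _ in range(4)]
a = [[0.1, 0.2, 0.3, 0.4]]        # 1 x 4
b = [[1.0], [0.0], [0.5], [0.0]]  # 4 x 1
print(apply_lora(w, a, b, strength=1.0)[0])  # first row shifts by the delta
print(apply_lora(w, a, b, strength=0.0)[0])  # strength 0 leaves W unchanged
```

For real models the saving is huge: the same trick on large layers is why a LoRA file is megabytes while the base model is gigabytes.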
Let's open comfy UI to test it because
what is theory without practice, right?
Open workflow 3, the one that has LoRA
in the name. As you can see, the
workflow is very similar to what we had
before. That is one of the reasons I am
using this older model instead of a
newer one. It is easier to learn the
basics first and then we can make things
more complex as we move forward.
Compared to the first text to image
workflow we used earlier, we now have
this LoRA loader node that loads a
LoRA model. In our photographer analogy,
this means the photographer took some
classes on how to take photos of cakes
and is now specialized in that subject.
Let's look at the note node first. We
need to download the LoRA model.
Remember the workflow comes with nodes
and settings, but since it is just a
text file, it cannot include the actual
models. We have to download those
separately and place them in the correct
folder. In this case, we are using a
LoRA called cake style. It is a small
model trained on images of cakes. So, it
understands cakes better than the base
model alone. A few years ago, when
stable diffusion 1.5 models first
appeared, they could not handle many
subjects very well, and LoRAs were often
used to fix those limitations. So, we
need to download this LoRA and place it
inside the loras folder. Click where
it says here. Then we need to place that
file in the loras folder. Go to your
Comfy UI folder. Open the models folder
and then find the loras folder. If we
place it directly here, it will work
perfectly. But this time, I want to keep
things organized. I want to create a
folder that tells me which base model
this LoRA is compatible with. So, I
will create a folder called SD15.
This way I know it works with that model
and I do not mix it with others. Save
the LoRA inside that folder. If you
look at the file now, you can see that
the LoRA is much smaller than the base
model. All LoRAs should go into this
folder and it is best to organize them
by base model name like SDXL, Flux,
Qwen, and so on just like we did with
checkpoints in an earlier chapter. Now
go back to Comfy UI. We have everything
we need to run this workflow, but
because Comfy UI was already open when
we downloaded the model, it cannot see
it yet. We need to press the R key to
refresh the node definitions.
Now it appears in the list and we can
select it. You can also see a note here
with a trigger word. I like to add these
notes so I remember them. Many LoRAs
are trained using specific trigger
words. These are words the LoRA learned
during training. If you do not include
the trigger word in the prompt, the LoRA
may have little or no effect. Some LoRAs
work without trigger words, but many
require them. Always read the LoRA
description from the place where you
downloaded it. If we look at the
positive prompt, we first added the
trigger words so we do not forget them.
It is not required to be first. It can
also be placed after a few words, but I
like to put it first. Then we have the
prompt for a robot. I did it this way so
we can clearly see how the LoRA and a
simple trigger word affect the result.
Now if we run this workflow, we get this
robot cake. You might think this model
could do that without a LoRA, but it
depends on the prompt and the model. Let
me change the seed to fixed so we can
get a consistent result. So this is how
it looks with the LoRA applied. Now
what I want to do is run the
workflow without the LoRA and without
changing anything else. Same prompt,
same settings, same seed, just disable
the LoRA. To do that, I right click on
this node and choose bypass. Now, when I
run the workflow, the LoRA is bypassed.
And you can see we get a normal robot
instead of a cake robot. If I enable the
node again and run it, you can clearly
see the effect the LoRA has on the
image. Now that you see how it works,
let's adapt a normal text to image
workflow and add the LoRA ourselves for
practice. Open workflow 1, the basic
text to image workflow. Now I want to
add the LoRA between the model and the
K sampler. Double click on the canvas,
search for LoRA, and add the node
called LoRA Loader Model Only. Let me
resize it so the text is easier to see.
I also like to color these nodes blue so
I can spot them faster in big workflows,
but that is optional. Now, we need the
model connection to go through this
node. If you look now, the model is
connected directly to the K sampler, but
we want the extra knowledge from the
LoRA. Drag a connection from the model
output to the LoRA loader and then from
the LoRA loader to the K sampler. The
workflow is now complete. Let's set the
seed to fixed so we can clearly see how
different settings affect the result. It
runs without errors, so everything is
connected correctly. Even if a LoRA
sometimes works without a trigger word,
it is best to include it when one is
provided. So let's add the trigger word
cake style to the positive prompt. Now
when I run it, we get a different result
even though the seed is fixed. That
shows the LoRA is doing its job. If I
change the seed, we get another
variation. To avoid forgetting trigger
words, I like to add a note node. I
write the trigger word there, change the
note title so it is clear what it is for,
and often change the color to match the
LoRA nodes so I know they are related.
One important thing I have not mentioned
yet is that you can use multiple LoRAs.
If I want I can clone this node by
holding alt and dragging then connect
them one after another. You can stack
several LoRAs this way. I personally do
not use more than three or four at once.
In this setup, the base model is
combined with the first LoRA, then the
second LoRA, and all that information
goes into the K sampler. In the prompt,
you add the trigger words for all the
LoRAs you use. If I run this now, some
strange things can happen. First, I use
the same LoRA twice, which makes its
effect too strong. Second, when using
multiple LoRAs, it is usually a good
idea to reduce their strength so they
blend better instead of overpowering the
image. If I lower the strength values,
the result becomes much more stable and
usable. If your result looks too weird,
one of the first things to try is
reducing the LoRA strength. Let me
delete the extra LoRA and keep only
one, then set its strength to one. Each
LoRA has a strength value. This
controls how strongly the LoRA affects
the model. Low values give subtle
influence. High values give strong
influence. If the value is too high,
images can break, faces can deform, and
styles can become unstable. A good
starting range is usually between 0.6
and 1.0. There is no universal best
value. Each LoRA behaves differently.
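Stacking can be sketched the same way as before: each LoRA contributes its own delta, scaled by its own strength. A toy illustration using single numbers instead of weight matrices, not ComfyUI's actual merging code:

```python
def stack_loras(base, deltas, strengths):
    """Effective weight = base + s1*delta1 + s2*delta2 + ...
    Single numbers stand in for whole weight matrices."""
    out = base
    for d, s in zip(deltas, strengths):
        out += s * d
    return out

# The same LoRA applied twice at full strength doubles its effect:
print(stack_loras(1.0, [0.5, 0.5], [1.0, 1.0]))  # 2.0
# Lowering the strengths blends them instead of overpowering the base:
print(stack_loras(1.0, [0.5, 0.5], [0.6, 0.6]))  # 1.6
```

This is why reducing strength values is the first fix when stacked LoRAs produce strange results: the deltas add up.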
Let's delete this LoRA loader node. And
let me show you another node you can
use, this time from a custom node.
Search for Power Lora Loader. This one
comes from the rgthree node pack. What is
different compared to the previous one
is that it has two inputs and two
outputs. Because of that, we need to
route both the model and the clip
through this node. First, connect the
model output to the Power Lora Loader
model input. Then connect the clip
output to the clip input on this node.
After that, the clip output from the
Power Lora Loader goes to both the
positive and negative prompt nodes. Let
me select all these nodes and move them
a bit so you can see more clearly how
both the model and the clip go through
the LoRA loader. Now we can add the
LoRA we want directly inside this node.
We can also add multiple LoRAs here.
You can see that I can add a second one
and even a third one. If I right click
on a LoRA entry, I can remove it. I can
do the same for any of them. If we right
click on a LoRA and choose show info,
we get more details. There is also a
button called fetch from Civitai.
Civitai is a website that hosts models and
you will see it later. If the LoRA is
public and available on Civitai, this will
fetch useful information about it,
including examples and trigger words. We
also have toggle buttons here. We can
toggle all LoRAs on or off or toggle
them individually. Let me add another
LoRA so you can see how that works.
This way we can load multiple LoRAs but
enable only the ones we want at any
moment. After playing a bit with this
LoRA, I found that a strength value
around 0.55 to 0.6 works better for this
specific one. I tried it with the
trigger word, then added an orange
golden fish, cute and adorable, and got
this result. It is not bad for such a
small model. For a second example, I
tried a marzipan cake shaped like a
woman and got this result.
For a third one, I tested a marzipan
castle. Again, this is just for
practice. Later, you will see better
models that can produce much higher
quality images with fewer errors. LoRAs
are lightweight. They do not increase
VRAM usage very much. Common beginner
mistakes are using the wrong base model,
using strength values that are too high,
forgetting trigger words, and expecting
a LoRA to fully replace a model. LoRAs
enhance models. They do not replace
them. Stacking many LoRAs can slow
things down slightly, but not
dramatically. For beginners, it is best
to start with one LoRA at a time. The
base model is the photographer. The
LoRA is a specialty training course
that photographer took. The photographer
still uses the same camera. They just
learned a new style.
Now that you understand diffusion,
prompts, image to image, LoRAs, and
workflows, we are ready to talk about
ControlNet. ControlNet is one of the
most powerful features you can use in
Comfy UI. In simple terms, ControlNet
lets you guide image generation using an
extra image, not just text. Instead of
saying what you want only with words,
you can also show the model what you
want. What is ControlNet? ControlNet
is an additional neural network that
works alongside the main diffusion
model. It does not replace the model. It
does not replace the prompt. It adds
extra control. The base model still does
the image generation. The prompt still
guides the style and subject. ControlNet
adds structure and constraints. You
can think of it like this. The prompt
says what the image should look like.
The seed decides the starting noise. The
sampler and scheduler decide how noise
is removed. ControlNet tells the model
where things should go. Why does
ControlNet exist? Text prompts are
powerful, but they are also vague. If
you say a person standing, the model
decides the pose. If you say a city
street, the model decides the layout. If
you say a face, the model decides the
proportions. ControlNet exists for cases
where you want more control. For
example, you want a specific pose. You
want a specific composition. You want to
follow a sketch. You want to preserve
the structure of an input image.
ControlNet makes results more
predictable and repeatable. This is a
simplified explanation. In reality,
ControlNet works by injecting additional
conditioning into the diffusion process
at every denoising step. But you do not
need to understand the math for learning
and practical use. This mental model is
enough. ControlNet guides structure
while diffusion fills in details. Let's
while diffusion fills in details. Let's
open comfy UI. Go to workflows and
select workflow number four. The one
that has control net in the name. The
workflow is similar to the text to image
workflow. It is still a text to image
workflow but it is guided by an image
using ControlNet. You can quickly tell
it is text to image because of the empty
latent image node. I highlighted in
yellow the nodes that we usually use for
ControlNet. Let's go to the note node
to see what we need. The checkpoint
model was already downloaded earlier. We
also need to download some control net
models for custom nodes. We need this
specific custom node which comes with
the easy install version. But if you are
using a different Comfy UI version, you
need to install this node first. We have
a Canny model, another one called depth,
and another one called open pose. There
are more types available, but these are
the most popular and commonly used ones.
Let's download all three so we can test
them. First, download the Canny model.
Then, go to your Comfy UI folder. Open
the models folder and think about where
this model should go. If you guessed the
control net folder, you are right. We
place it there because different base
models can have different control net
models. Just like with Loris, a control
net is only compatible with the base
model it was trained for. So, let's
organize them properly and create a
folder so we know which base model these
control nets are compatible with. Save
the model in that folder. Next, download
the depth model and save it in the same
folder.
Then download the open pose model and
save it in the same folder as well. Wait
for all downloads to finish. If Comfy UI
was open while downloading, we need to
refresh it. Press the R key to refresh
the node definitions. Now we have
everything we need to run this workflow.
In the workflow, we have an apply
control net node with some settings. We
also have a node that loads the control
net model we downloaded and a
pre-processor node that converts the
input image into a format that control
net understands and was trained on.
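The earlier mental model, extra conditioning added at every denoising step, can be sketched as a toy loop. Everything here is a hypothetical stand-in with made-up numbers; in reality the injection happens inside the model, not by simple addition:

```python
def denoise_step(latent, guidance):
    """Toy step: nudge the latent a little toward the guidance."""
    return latent + 0.25 * (guidance - latent)

def generate(steps, prompt_cond, control_hint=None, control_strength=1.0):
    """ControlNet does not replace the prompt; it adds extra
    conditioning at every denoising step."""
    latent = 0.0  # stands in for the initial noise
    for _ in range(steps):
        guidance = prompt_cond
        if control_hint is not None:
            guidance += control_strength * control_hint
        latent = denoise_step(latent, guidance)
    return latent

print(generate(10, prompt_cond=1.0))                    # prompt only
print(generate(10, prompt_cond=1.0, control_hint=0.5))  # prompt + structure
```

Setting control_strength to zero makes the hint vanish, which mirrors what lowering the strength setting does in the apply control net node.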
Let's run the workflow. In this example,
we are using the canny model. We loaded
a bunny sketch and with the help of the
pre-processor, it generates a canny map
which is an image that detects the edges
of the input image. With the prompt, we
influence what we want to generate. And
with apply control net, the model
interprets that canny map and uses it to
guide the generation to get this image.
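A toy version of what an edge preprocessor produces, white where neighboring pixel values differ sharply and black elsewhere, can be sketched like this. Real Canny uses smoothing, gradients, and hysteresis thresholds; this only shows the idea:

```python
def edge_map(img, threshold=50):
    """Mark a pixel white (255) when it differs strongly from its
    right or bottom neighbor, black (0) otherwise."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = abs(img[y][x] - img[y][x + 1]) if x + 1 < w else 0
            down = abs(img[y][x] - img[y + 1][x]) if y + 1 < h else 0
            if max(right, down) > threshold:
                out[y][x] = 255
    return out

# A dark square on a light background: edges appear at the boundary.
img = [[200, 200, 200, 200],
       [200,  30,  30, 200],
       [200,  30,  30, 200],
       [200, 200, 200, 200]]
for row in edge_map(img):
    print(row)
```

The result is the same kind of picture the Canny map shows: only outlines survive, which is exactly the structure ControlNet follows.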
You can imagine that without control
net, it would be very hard to get
something this complex using only a
prompt, especially with the small model
we are using today. Now, let's build
this workflow ourselves so you
understand it better. Open again the
first workflow that you already know how
to build and we will adapt it to use
ControlNet.
We know control net comes before the K
sampler. So let's move some nodes to
make room for it. Double click on the
canvas and search for apply control net.
Add the node and change its color to
yellow so it is easy to recognize. Now
let's connect the parts that are obvious
first. Positive goes to positive,
negative goes to negative. For the
outputs, there is only one place where
they make sense. So we connect those as
well. At this point, the node still has
missing inputs. One of them is the VAE.
We already know where the VAE comes
from. In our case, it is included in the
checkpoint model. The same VAE we
already used for encode and decode. The
next missing input is the control net
model itself. Double click on the
canvas, search for load control net,
add the node, and color it yellow as
well.
Now connect it to the apply control net
node.
The last missing input is the image. Add
a load image node.
Then we can select an image. In this
case, I will use a bunny sketch. You
might be tempted to connect this image
directly to control net. But that
usually does not work. Control net
expects a very specific type of image
because it was trained on that type of
data. Our sketch is just a normal image.
So, we need a pre-processor to convert
it into something ControlNet
understands. Double click on the canvas and search for AIO, which stands for all-in-one. Add the pre-processor node and color it yellow. Connect the load image node to the pre-processor. Then connect the pre-processor to the Apply ControlNet node. Right now, the pre-processor is set to none, so we need to choose one. Since we plan to use a Canny ControlNet model, select a Canny edge pre-processor.
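To demystify what the pre-processor does: a Canny-style pre-processor is classic edge detection, not AI. Here is a rough sketch of the idea in Python, a simplified gradient threshold rather than the full Canny algorithm, just to show the kind of output being produced:

```python
import numpy as np

def simple_edge_map(gray: np.ndarray, threshold: float = 50.0) -> np.ndarray:
    """Very simplified stand-in for a Canny pre-processor:
    find strong intensity gradients and keep them as white edges."""
    gray = gray.astype(np.float32)
    # Horizontal and vertical intensity differences (rough gradient).
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:] = np.abs(np.diff(gray, axis=1))
    gy[1:, :] = np.abs(np.diff(gray, axis=0))
    magnitude = np.hypot(gx, gy)
    # White edges (255) on a black background (0), like a Canny map.
    return np.where(magnitude > threshold, 255, 0).astype(np.uint8)

# Toy image: a black square on a white background.
img = np.full((64, 64), 255.0)
img[16:48, 16:48] = 0.0
edges = simple_edge_map(img)
```

The real Canny algorithm adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding, but the output format is the same: white edges on black, which is exactly what the Canny ControlNet model was trained on.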
To better understand what is happening,
add a preview image node after the
pre-processor. This allows us to see the
control image that is actually being
sent to ControlNet. When we run the workflow, we get a Canny map: white edges on a black background. This shows
exactly what ControlNet will use as
guidance. You can also adjust the
resolution here if you want more detail
in the map. Now, let's look at the
result. It does not look very good yet.
This happens often when working with ControlNet, and there are a few things
to check. First, look at the prompt. We
are still using a robot prompt, but the
image is a bunny. So, let's change the
prompt to something like a watercolor
painting of a bunny. Next, reduce the
ControlNet strength and the end percent
slightly. Run again. The result is a bit
better, but it still does not follow the
sketch very well. If changing the seed
does not help, the next thing to check
is the control net model itself. If you
look at the load control net node, you
may notice that the selected model is
not a canny model, but a depth model.
That is the problem. The pre-processor
and the control net model must match.
Select the correct Canny ControlNet
model. Now run the workflow again. The
result is much better and follows the
sketch closely. Let's try another
example. Load a 3D text image. Since
this image has depth information, we can
try a depth control net instead. Change
the control net model to depth and
update the prompt to something like
golden text in snow. When you run it,
you may notice the preview still looks like a Canny map. That means we forgot to change the pre-processor. Switch the pre-processor to a depth pre-processor. The first time you run a new pre-processor, Comfy UI may take longer because it downloads a small model automatically. This only happens once.
If you get a long path error on Windows,
close Comfy UI and run the long path
enabler from the tools folder. Now we
see a depth map. Dark areas represent
parts that are farther away. Lighter
areas represent parts that are closer.
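This convention can be sketched in a few lines. Assuming we start from raw distance values (larger means farther away), a depth pre-processor normalizes them into a grayscale image where closer pixels are lighter:

```python
import numpy as np

def depth_to_map(distances: np.ndarray) -> np.ndarray:
    """Convert raw distances into a ControlNet-style depth map:
    near = light (255), far = dark (0)."""
    d = distances.astype(np.float32)
    # Normalize to 0..1, then invert so closer objects are brighter.
    norm = (d - d.min()) / (d.max() - d.min() + 1e-8)
    return ((1.0 - norm) * 255).astype(np.uint8)

# Toy scene: a near object (distance 1) in front of a far wall (distance 10).
scene = np.full((8, 8), 10.0)
scene[2:6, 2:6] = 1.0
depth_map = depth_to_map(scene)
```

Real depth pre-processors estimate those distances from a single photo with a small neural network, but the resulting map follows this same near-is-light convention.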
ControlNet uses this information to
understand spatial structure. The
generated result now follows the depth
and composition of the original image
very closely. If for some reason you get
an error saying the model is incomplete
or something similar, you can close
Comfy UI, go to the tools folder and run
the batch file called long path enabler.
This should fix the long path issue and
allow Comfy UI to download the model it
needs even when the file path is longer.
You can also try the same image with a Canny ControlNet. Switch both the model and the pre-processor back to Canny and
run again. Even if some edges are
missing, it can still guide the
generation. Well, try switching back to
depth and compare results. Often, one
will work better than the other
depending on the image. Now, let's talk
about the key control net parameters.
Control net does not replace diffusion.
It only guides it during certain parts
of denoising. Strength controls how
strongly control net influences the
image. Low values make control net a
soft suggestion and the model can drift
away. High values strongly enforce
structure and make the output closely
follow the control image. Typical values
are between 0.5 and 0.7 for natural
results and 0.8 to 1 for strict
structure matching. Start percent
controls when control net begins
influencing the denoising process. A
value of zero means control net starts
from the very first step, locking
structure early. Higher values allow the
model to form rough shapes first before
ControlNet takes over. End percent controls when ControlNet stops influencing denoising. A value of one means control
net stays active until the end, locking
structure even in fine details. Lower
values allow control net to stop
earlier, letting the model finish on its
own and add more style. In simple terms,
strength is how hard control net pulls.
Start percent is when it starts pulling
and end percent is when it lets go. That
is why control net is so powerful. You
can guide structure without killing
creativity.
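Those three parameters can be summarized in one small function. This is only an illustration of the gating logic, not ComfyUI's actual implementation:

```python
def controlnet_weight(step: int, total_steps: int,
                      strength: float = 0.7,
                      start_percent: float = 0.0,
                      end_percent: float = 1.0) -> float:
    """Return how strongly ControlNet influences a given denoising step.
    Outside the [start_percent, end_percent] window it contributes nothing."""
    progress = step / total_steps  # 0.0 at the first step, near 1.0 at the end
    if start_percent <= progress <= end_percent:
        return strength  # ControlNet is pulling
    return 0.0           # ControlNet has let go

# With end_percent = 0.8, ControlNet guides the first 80% of the steps,
# then the model finishes the fine details on its own.
weights = [controlnet_weight(s, 20, strength=0.7, end_percent=0.8)
           for s in range(20)]
```

Reading the list of weights, you can see the pull is constant at 0.7 early on and drops to zero for the last steps, which is exactly the "starts pulling, then lets go" behavior described above.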
It is time to test the pose control net
as well. But first, let us add a pose
image as reference. Let us say I add
this woman. Again, it is kind of hard to
put into words the exact same pose. So
this is a good use case for control net.
Let us change the prompt to something
else. Maybe woman in a sumo yoga pose.
Not sure what to call it. Then do not
forget to change the model to open pose.
I know it might seem complex at first,
but newer models have union control net
models that include everything in one.
So you only load one model, which makes
things easier. That said, this is how we
used to do it. And there are still cases
where we need specific models. Then we
can try either DW pose or open pose.
Type pose and select this open pose. And
then let us run it again. It will take a
bit since this is the first time I use
it. In fact, if you look at the command
window, you can see it is downloading
that model from hugging face. That is
why it takes so long. After it finishes,
it gives this pose image that looks like
a skeleton with each color representing
a bone. That is how it knows which side
is right and left and so on. So it
captured the pose and now let us see the
result. Holy sumo, what is this? Okay,
let us adjust the prompt. Maybe a fit
woman will help. That did not help much.
I bet the word sumo has too much weight.
Like we talked about before, some words
have more power than others. So if I try
without that word, I get a better
result. Even the face is not so great.
And that can happen with people in the
distance. Usually with portraits, we get
better faces. Newer models fixed most of
that. Let me try to change the
resolution to see if something changes.
Now I get a better resolution for the
skeleton. And the pose is okay. Just the
face. That face does not invite me to do
yoga. Okay. Let us try a different pose.
Something for a portrait. Let us say I
use this portrait photo. Let us change
the prompt to a businesswoman and run it
to see what we get. The pose looks okay.
Even if it is missing an arm, it should
still work. The results are much better
now that the face is closer. Let us try
a warrior woman as well.
That works well too. So with control
net, you have to continuously search for
balance. Make sure you select the right
control net model for the job. Then
choose a pre-processor that matches the
model. As you saw, it is easy to forget
to change something. I usually play with
strength and end step. Also, do not
forget control net models made for SD1.5
only work with SD 1.5 base models. If
you use SDXL, you need SDXL control net
models. In a later episode, we will
check some advanced models that do not
even need ControlNet and can do
everything from prompts. Beginner mistakes to avoid: using the wrong ControlNet model for the base model, forgetting to install or download ControlNet models, using very high strength values, expecting ControlNet to fix bad prompts, and using ControlNet when it is not needed. ControlNet is a tool, not a magic fix. When you start
using comfy UI you will notice there are
many different model types: AIO models, FP16, FP8, GGUF, and others. This can be
confusing at first, but the reason is
actually simple. At their core, all
diffusion models are just very large
collections of numbers. Those numbers
represent what the model has learned.
The knowledge itself does not change,
but the way those numbers are stored can
change. Different model formats exist to
balance memory usage, speed, and
hardware compatibility. Some formats are
larger but more precise. Others are
smaller and faster, but slightly less
accurate. FP32 is the highest precision
and is mostly used for training. It uses
a lot of memory and is rarely used for
image generation. FP16 is the most
common format for stable diffusion. It
offers a very good balance between image
quality and VRAM usage. This is the
safest and most recommended choice for
most users. FP8 uses even less memory
and can be faster on newer GPUs that
support it. The trade-off is that it can
sometimes reduce image stability or
detail slightly. AIO models stand for
all-in-one. They bundle the main model, the VAE, and sometimes the CLIP into a single
file. They are designed to be easy to
use and reduce setup mistakes. The
downside is that they give you less
flexibility if you want to swap
components later. GGUF models come from
the language model world. GGUF stands
for GPT generated unified format. They
are optimized for very low memory usage
and can run on CPU or low VRAM
systems. It is important to understand
that these formats do not make the model
smarter or more creative. They do not
change what the model knows. They only
change how efficiently that knowledge is
stored and processed. You can think of
it like the same video saved in
different resolutions. The content is
the same, but the file size and playback
requirements are different. For most
users, FP16 models are the best starting
point. AIO models are great for
beginners. FP8 is useful if your GPU
supports it. GGUF is best when memory is
very limited. Once you understand this,
choosing models becomes much easier. You
saw in the workflows that I include
links to models, but you might wonder
where I find those models, right? One of
the sites is hugging face, but it is not
the most beginner-friendly one. At the
top, we have a models tab. And here you
can find a lot of models, but not all of
them are diffusion models or used for
generating images. Some are for video,
some for audio, some for large language
models, and many are not compatible with
Comfy UI. Some require different
interfaces to run or they are so large
that you cannot even run them on your
computer. For example, I can sort them
by text to image. And here you can see
some popular ones like Qwen, Z Image, or
even the Flux model. If I click on one
of these, you will see that some models
require you to sign in and accept
certain terms. Each model has a license.
Some are open-source, some are free
with conditions, and others are
available only in certain countries. By
default, you are on the model card. This
is basically an info page about the
model. On another tab, you have files
and versions where the files are usually
available in different formats like in
this example. And there are a few more
files inside those folders. Let us go
back to the homepage. Here you can also
search for a model if you know the name
or browse popular ones like Z image.
Always check the tabs at the top to find
more information about the models since
they can be quite large and you want to
make sure you can actually run them on
your system. I know this is a lot of
information, but as I said before, I
usually include the model link directly
in the workflow, so you do not have to
stress about it. Still, it is good to
understand how models work and where
they come from. Another site that is more beginner-friendly and better organized is the Civitai website. The
downside is that recently they removed
access for some countries like the UK.
If you are from one of those countries,
you will need a VPN to access it and
download models. If you click on the
models tab, you can find all kinds of
models for different interfaces like
Comfy UI, Forge UI, and others. Most of
them are compatible with Comfy UI. On
the right side, you have filters. These
let you sort models by when they were
added. You can also filter by model
type. For example, checkpoints are the
main AI models. You can also filter by
LoRA or ControlNet since we talked
about those model types in previous
chapters. Of course, you can also filter
by base model so you know what is
compatible with your workflow. The first
workflow we used was based on an SD 1.5
model, but I can also sort by other ones
like the Flux Dev model or an older one
like SDXL. By the way, SDXL is newer
than SD 1.5 and Flux is even newer than
SDXL. So, use these buttons to sort
models. If you already know the name,
you can just search for it. For example,
I can search for Juggernaut. Here you
can see multiple versions of that model
based on different base models like SDXL
or SD 1.5. If I click on SD 1.5,
I will only see those versions. If I
click on the one that says Juggernaut,
it opens the model info page. At the
top, you can see different versions. We
used the Reborn version, but you can try
other versions as well. Below that, you
have details about the model. It clearly
says what type it is. It can be a
checkpoint, a LoRA, or a checkpoint merge.
In this case, it is a checkpoint merge,
which means the main SD 1.5 model was
mixed with other SD 1.5 models to
combine the best parts of each one. It
also clearly states that this is a base
SD 1.5 model. You can see the publish
date as well, which shows that it is
quite old. At the top you have the
download button and the file will go
into the correct folder. In this case,
it goes into the checkpoints folder. As
I mentioned before, the author sometimes
includes recommended settings. You can
see them here. This is how I knew what
settings to use in the K sampler for the
workflow. At the top, you also have a
gallery with images generated using that
model. This helps you understand what
the model is capable of. Some images
also have an info button that shows the
prompt and settings used to generate
that image. So, explore Civitai if you
have access to it and see what models
and LoRAs are available. Once you are
signed in, you also get more options to
control what type of models are visible
since some are disabled by default. So,
now that we played a little with that
old SD 1.5 juggernaut model, it is time
to try a better, newer model to see how
far AI has come in just 2 years. Let us
go to workflows again and this time open
the workflow named 5A, the one for Z
image turbo with the all-in-one model.
The workflow is quite similar to the
others we tried. We just have two extra
nodes this time. One of them is this
conditioning node that we use instead of
the one for the negative prompt. And the
other one is this model sampling node.
Since we are using a new model, we need
to download it because we do not have it
yet. The model is called Z Image Turbo. Juggernaut and Z Image Turbo are very
different types of models built with
different goals in mind. Juggernaut is
based on stable diffusion 1.5.
It uses the classic diffusion
architecture that has been used for
years. The model file itself is
relatively small, usually around 2 GB.
Juggernaut was created by the community
by fine-tuning and merging stable
diffusion models. Z image Turbo is a
newer type of model created by the Tongyi team from Alibaba. It uses a more modern
architecture designed to generate images
more efficiently. Even though Z Image Turbo is much larger in file size, it is
optimized to produce good results in
very few steps. One important difference
is how the models understand prompts.
Juggernaut relies on the classic CLIP text encoder. It understands prompts if they are short, but it often requires careful wording, sometimes keyword-like prompts. Z Image Turbo uses a more advanced text understanding system inspired by large language models. This allows it to understand prompts in a more semantic and natural way. Because of this, Z Image Turbo can often follow instructions better, even with shorter or more loosely written prompts. So, in
simple terms, Juggernaut is smaller,
very flexible, and highly compatible.
Z Image Turbo is a larger, newer model,
and smarter at understanding what you
ask for. So, we have here an all-in-one
model. And there are two types, a
smaller one, FP8, and a bigger one, BF16. It depends on your graphics card. If you can run the big one, use that one. For this first episode, I want to run it on a low VRAM card, so I will use the FP8 small version.
means it has everything it needs
included. The clip and VAE model in this
case, so we do not need to download
those models separately. That is why it
is easy to use for beginners. The models
go into the checkpoints folder and there
we can create a special folder for the
Z Image model. Also, if you want to learn
more about the model, I included an info
link here. So click on it. Now we are on
the hugging face page and you can learn
more about this specific version from
workflows to different model versions.
If we go to files, we can see different
model versions that you can try
depending on how good your graphics card
is. So let us test the small version.
Click here. Then go to comfy UI. Go to
models
then checkpoints and create a folder
called Z image. So everything is more
organized inside this folder. Place the
model. Since this is a big model, you
need to wait for it to finish
downloading. Because Comfy UI was open,
you can see that it does not appear in
the list yet, only Juggernaut. So I
press the R key to refresh node
definitions. And now we can see both
models nicely organized in folders.
First is Juggernaut and second is
Z Image. So let us select the Z Image
model. That is all for the model
download. And now we can run the
workflow. The first time you run a
workflow, it is slower because it needs
to load the model. The second time you
run it, it should be faster. For me, it
took about 10 seconds because I have a
lot of RAM and VRAM. The result looks
pretty good compared to the robots we
used to get with the SD 1.5 model. We
have much nicer details. For the image
size, I used a smaller size so it runs
faster. This model was trained with
bigger images, not like SD 1.5. So, we
can even use larger sizes like 1,600
pixels if we want. Even if you go bigger
than the size it was trained on, it does
not produce many errors like SD 1.5 did.
It just becomes a bit more diffused.
Usually, for most newer models, a good
place to start is around 1,024 pixels.
So let us say I try a landscape image
this time using these sizes. The result
looks pretty good. I like it.
Let us go back to workflows again and
open the first workflow to see what is
different and how we can recreate the Z
image turbo workflow. So we already have
the right node to load the model in this
case. So I just select the model from
the list for empty latent. This one is
used more for older models with a
different architecture. Many newer
models use a different empty latent
node. If we look at the nodes and search
for empty, we have one empty latent and
one empty SD3 latent. In this case, we
want the one with SD3. On the surface,
they look identical. It is just a
different latent representation
internally. If we make it purple, it
looks like the previous one. If you do
not have enough VRAM to run this, you
can use sizes like 768 for width and
height. I will use 1,024 pixels since it
is the most popular size and my system
can handle it. So let us delete the old
empty latent. And now reconnect the new
node. This model does not use a negative
prompt, only a positive one. So I will
remove the negative text. You can also
collapse it if you want. That way you
know not to add a negative prompt. Then
we have the settings which as you
remember are different from model to
model. If we look here, we only have
five steps. So, it can generate with
fewer steps and the CFG is one. Let us
change the steps to five and the CFG to
one. When the CFG is one, it ignores the
negative prompt. We also need a sampler
and a scheduler.
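As a quick aside, the reason a CFG of one ignores the negative prompt falls out of the classifier-free guidance formula. A minimal sketch with toy numbers (real predictions are large tensors, not three-element arrays):

```python
import numpy as np

def apply_cfg(pos_pred: np.ndarray, neg_pred: np.ndarray, cfg: float) -> np.ndarray:
    """Classifier-free guidance: push the prediction away from the
    negative conditioning and toward the positive one."""
    return neg_pred + cfg * (pos_pred - neg_pred)

pos = np.array([1.0, 2.0, 3.0])   # toy "positive prompt" prediction
neg = np.array([0.5, 0.5, 0.5])   # toy "negative prompt" prediction

# With cfg = 1 the negative term cancels out completely,
# and the result is just the positive prediction.
guided_cfg1 = apply_cfg(pos, neg, 1.0)

# With cfg > 1 the negative prediction starts to matter.
guided_cfg75 = apply_cfg(pos, neg, 7.5)
```

That is why, at CFG one, whatever we type in the negative prompt has no effect on the result.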
So, let us add the DPM++ SDE sampler.
And for the scheduler, we use beta. Let us see
what else is missing. We have this extra node called ModelSamplingAuraFlow. It has a long name. Not sure why it cannot be something simpler, but anyway, let us search for that node.
We change the shift to three and we make
the connection go through that node just
like we did with the LoRA. The ModelSamplingAuraFlow node is a special
node that modifies the model sampling
behavior before it goes into the K
sampler. It is designed to work with
models that use the AuraFlow sampling
method, which is an advanced sampling
technique used by some modern models for
better stability and quality. What this
node does is apply a sampling adjustment
or patch to the model itself. So, the
sampler works in the best way for that
model. The node takes the current model
and a shift value as inputs and outputs
a modified version of the model with the
AuraFlow sampling logic applied. The
shift parameter controls how strong that
sampling adjustment is. Changing the
shift value can subtly affect contrast,
sharpness, and how the generation
behaves internally. So, we changed the
empty latent to the SD3 version. We
added a node to shift the model values
and we adjusted the settings to work
better with the Z image model that we
loaded in our workflow. Let us run it
and see if it works. As you can see, it
works just fine and we get a nice robot.
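If you are curious what the shift value actually does: for flow-based models like this one, a common implementation remaps the sampling noise level so more of the schedule is spent in the high-noise region where composition is decided. Treat the exact formula below as an assumption; it is the shift used in the Stable Diffusion 3 family of flow models, sketched here for illustration:

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """Time-shift for flow models: remaps a noise level in [0, 1].
    shift = 1 leaves the schedule unchanged; larger values push
    sampling toward the high-noise end of the schedule."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

# shift = 1 is the identity; shift = 3 lifts mid-schedule noise levels,
# while the endpoints 0 and 1 stay fixed.
unchanged = shift_sigma(0.5, 1.0)   # 0.5
lifted = shift_sigma(0.5, 3.0)      # 0.75
```

This matches the description above: changing the shift does not change what the model knows, it only changes where along the denoising schedule the sampler spends its effort.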
If we look at the previous Z image
workflow, you can see that it does not
use a negative prompt, but instead it
has a conditioning zero out node. So let
us go back to our workflow and search
for that conditioning node. As I
mentioned before, this model does not
use a negative prompt. So you might
wonder why we do not just delete the
node. We could do that, but then we
would have a missing input and the
workflow would throw an error. To fix
this, we use the conditioning zero out
node. You can make space for it and
place it between nodes if you want. This
conditioning does not come from clip
like the negative prompt did before. We
connect it directly to the negative
input on the K sampler. You can place it
wherever you want to make the
connections clearer, but I like to put
it under the positive conditioning to
save space. The conditioning zero out
node does exactly what its name
suggests. It removes the influence of a
conditioning input without breaking the
workflow. In simple terms, it takes a
conditioning signal, usually text
conditioning, and replaces it with a
neutral zeroed version. So the model
still runs normally, but that
conditioning contributes nothing to the
generation. Why does this exist, and when is it used? In diffusion models, the
sampler always expects both positive and
negative conditioning inputs. If you
want to remove or disable one side, you
cannot just unplug it. That would break
the workflow. Conditioning zero out is a
safe way to say use conditioning but
make it have no effect. So if we run the
workflow, everything works fine without
any errors. Now the good part about Z
image turbo is that it is very good at
realistic images but it is also very
good at understanding prompts. For
example, if I want to create a portrait
of a cat with a hat, I can easily get an
image like this. But you can also create
more complex prompts by using a large
language model. Maybe you use ChatGPT, Gemini, or even a local LLM. I will use ChatGPT for this example. I ask it for
a detailed photo prompt and give it the
details of what I want. ChatGPT then gives me a long detailed prompt that I
can copy and paste directly into Comfy
UI. So, let us test it again. Now, we
get a different cat, but it is still a
bit too simple. Let us make it more
complex. I go back to ChatGPT and ask
for the cat to hold a rose in her mouth
and wear a t-shirt that says Pixa.
Again, we get a long detailed prompt.
And from that prompt, we get this image.
Sometimes the model can take things very
literally. So, you need to explain
details clearly if you want more
control. For example, you might need to
mention that you want a full rose held
horizontally in the mouth and not
something else. Let us create something
different now. This time a cartoon bunny
since this series is full of bunnies.
Anyway again we get a nice prompt and
the result looks like this. It is pretty
cute. Maybe now I want the bunny to be a
ninja. Let us see what this prompt
generates. And we get our ninja bunny.
If we generate again we get another one.
As you can see compared to older models
the results with different seeds are
quite similar. You do not get a huge
variation from seed to seed. That is why
I recommend using longer prompts and
adjusting each prompt carefully. This
model is very good at following
instructions. So the more precise you
are, the more control you get over the
result. Let us open the first workflow
again so we can compare it with workflow
5A. Now let us say I use the same long
prompt and the same fixed seed for both
workflows. If I generate with Z image, I
get a robot like this one which looks
nice and detailed. Now if we try the old
juggernaut model using the same prompt
and the same fixed seed, the result
looks like this. It is smaller and much
less detailed. Let us copy this image
and paste it into this workflow. So you
can clearly see the difference in
quality and also how well the image
follows the prompt. But maybe this
single test is not enough to fully see
the difference. So let us try something
else. Let us test text generation. Newer
models can generate readable text, but
older models usually cannot. We normally
put the text we want inside quotes. So,
let us test that. Look at this result.
It looks very good. And it understood
the assignment.
Now, let us go back to the juggernaut
model and use the same prompt. We get
something like this. What is this? What
does it even say? Gold gola or something
like that. It clearly cannot do text.
Let us go back to Z image and try
another test. A red sphere on top of a
green cube placed on a black car.
We get this realistic result. Z image is
more specialized in realism, but it can
also do 3D paintings and other styles.
Now let us see what Juggernaut does with
the same prompt. It gets the red sphere
since that was mentioned first and then
it gets lost and forgets what it needs
to do next. So clearly Z image is a very
good model to have and you will probably
spend more time playing with this model.
Still keep an eye on new models because
they keep getting smarter and better as
they get more training. You have now
seen how an all-in-one model works and
how we load checkpoints. In the next
chapter, we will use models that are
split where clip and VAE are loaded
separately so we can have more control.
Let us talk a little bit more about
diffusion models. Open Comfy UI and then
open workflow 5A and also workflow 5B so
we can compare them.
In the first workflow, Z image is loaded
as an AIO model. AIO means all-in-one.
You can see that we used a load
checkpoint node to load that model. The
checkpoint already contains the
diffusion model, the text encoder, and the VAE. Everything is bundled into a single
file. Advantages:
Very easy to use, fewer nodes, less
setup required. Good for quick testing
and simple workflows. Disadvantages:
Less flexible. You cannot swap the text
encoder. You cannot change the VAE.
Harder to customize or optimize. This
format is designed for simplicity and
convenience. Now, let us check the
second workflow, the 5B version. You can
see that we have three nodes now instead
of one. We have the load diffusion model
node that loads the main model. Then we
have the clip load node that loads the
text encoder. And then we have the load
VAE node that loads the VAE. So it is
like we split the previous checkpoint
into separate pieces. And now we have
more flexibility. Even though the final
result is still Z image turbo, the
pipeline is modular. Advantages: more control. You can change the text encoder and experiment with different VAEs.
Better for advanced workflows and
optimization and easier to update
individual components. Disadvantages:
more complex setup, more nodes, higher
chance of misconfiguration if you do not
fully understand what each part does.
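Conceptually, an all-in-one checkpoint is one big collection of named weights, and a modular setup just keeps those groups in separate files. A toy sketch of the idea (the key prefixes here are made up for illustration, not the real names inside a checkpoint):

```python
# Toy "checkpoint": weight names grouped by component prefix.
checkpoint = {
    "diffusion.block1.weight": [0.1, 0.2],
    "diffusion.block2.weight": [0.3],
    "clip.embed.weight": [0.4],
    "vae.decoder.weight": [0.5],
}

def split_checkpoint(state: dict) -> dict:
    """Split a bundled checkpoint into its components by key prefix."""
    parts = {"diffusion": {}, "clip": {}, "vae": {}}
    for name, tensor in state.items():
        prefix, rest = name.split(".", 1)
        parts[prefix][rest] = tensor
    return parts

parts = split_checkpoint(checkpoint)
# Each part could now be saved and updated independently,
# which is exactly the advantage of the modular workflow.
```

In the modular workflow, ComfyUI simply loads those three groups from three separate files instead of one bundled file; the weights themselves are the same either way.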
However, this is actually one of my
favorite workflows. The reason is
flexibility and efficiency. With a
modular setup like this, you save disk
space. For example, this VAE is the same
VAE used by the Flux model. So, if I
already use Flux, I do not need to
download the VAE again. With an
all-in-one model, every new version
means downloading the entire model
again, even if only one part changed. In
a modular setup, I can update or swap
individual components. I can test
different text encoders without
downloading the main diffusion model
again. So while modular workflows
require more understanding, they are
more efficient, more flexible, and
better for experimentation.
That is why I personally prefer this
approach. But we did not download these
models yet. I suggest that when you
follow this tutorial, you test
everything to see what is better or
faster on your computer and then keep
only the ones you like. There is no
point in keeping all types of models if
they do the same thing unless you have a
lot of space on your hard disk. So let
us start with the main diffusion model.
This long name is actually describing
how the model is built and optimized. Z
image turbo. This is the model family
and architecture. FP8. This means the
model uses 8-bit floating-point
precision. FP8 models use much less
memory than FP16.
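The memory difference is simple arithmetic: every weight is one number, and the format decides how many bytes that number takes. Using a hypothetical six-billion-parameter model as an example (the parameter count here is an assumption for illustration only):

```python
def model_size_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate file size: parameters x bytes per parameter."""
    return num_params * bytes_per_param / 1e9

params = 6e9  # hypothetical 6B-parameter model

fp32 = model_size_gb(params, 4)  # 32 bits = 4 bytes per weight
fp16 = model_size_gb(params, 2)  # 16 bits = 2 bytes per weight
fp8 = model_size_gb(params, 1)   #  8 bits = 1 byte per weight

# FP8 needs roughly half the memory of FP16
# and a quarter of FP32, for the same weights.
```

Real files are a bit larger than this estimate because of metadata, but the ratio between formats is what matters when choosing a version for your VRAM.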
I did not include a link in this
tutorial for the FP16 version, but you
can find those online if you have more
VRAM and want to try them. Scaled refers
to the FP8 format being calibrated for
better precision. This improves quality
and stability compared to a raw unscaled
FP8 format. You can think of it as FP8
with tuning for better accuracy. E5M2.
This is the specific FP8 encoding
variant used. KJ. It is usually a
variant tag or builder ID added by the
person or team that exported or
repackaged the model. It does not change
the model itself. It just helps
distinguish between different builds.
Safetensors. This is the file format. Safetensors is a safe and efficient
format and is recommended over older
formats like CKPT for better stability
and speed. We can download this model
from here. And I also added more info
about the model so you can check
different versions. You can also see the
author. So now you know what that KJ in
the model name stands for. So let us
click here and see where we place it.
Navigate to the comfy UI models folder.
You should already know this by now.
This time we do not use the checkpoints
folder because that is usually for
complete models that already include
most of what they need. Instead we place
this one in the diffusion models folder.
To keep things organized, we create a
folder called Z Image and place the model
inside. Next, we have the text encoder.
I used one recommended by ASD from the
Discord server, but there are other text
encoders you can try made by different
people. For this one, we again go to the
models folder and this time we place it
in the text encoders folder. Here I do
not create a Z Image subfolder because
many text encoders work with multiple
models. I usually create subfolders only
for main models, LoRAs, and ControlNets
when it is important for the workflow
that they match the same base model.
Then we have the VAE. This is the same
VAE that we might also use later for the
flux model. So again we go to the models
folder and this time we place it in the
VAE folder. Some of these models are
large so wait for them to finish
downloading. Once everything is done,
press the R key to refresh the node
definitions. Now let us check that all
models are visible and selected
correctly. The Z image diffusion model
is here. The clip text encoder is here
and the VAE is also here. That means we
have everything we need to run this
workflow. So let us click run and see if
we get any errors. Everything works fine
and we get this image. What I usually do
next is compare the results with the
first workflow. When I have multiple
models available, I download all of
them, test them, keep the ones I like
the most, and delete the rest. When I do
testing, it can get confusing which
model generated which image. So, here's
a small trick. Double click on the
canvas, search for iTools add, and
select the node called iTools add text
overlay. This node comes with the easy
installer, but if you have a different
Comfy UI version, you can install the
iTools nodes from the Manager. We add
this node right after VAE Decode and
before the save image node. This way,
the final image goes through this node.
The text overlay is added and then the
image is saved to disk. For example, we
can add the model type in the text
overlay. You can also add more text like
the model name or other info. Let us say
I add FP8 scaled diffusion. So I know
this image comes from this workflow. Now
when I run it, text will be added on top
of the image. We can control the text,
the background color, the font size, and
whether the text overlays the image or
is placed under it. Let us disable
overlay mode and try again. Now the text
is under the image. This way we know
exactly which model generated it. Next,
I select this node and press Ctrl + C to
copy it. Then I go to the first workflow
and press Ctrl +V to paste it. Now we
connect the node the same way as before.
We need a name that represents this
model. So let us name this one FP8
all-in-one.
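A related bookkeeping trick, if you ever script your own model comparisons outside Comfy UI, is to encode the model label and seed into the output filename instead of overlaying text. This is just a plain-Python sketch of the idea, not part of any node:

```python
from datetime import datetime

def output_name(model_label: str, seed: int, ext: str = "png") -> str:
    # Put the model label and seed in the filename so you always know
    # which model and seed produced which image
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    safe = model_label.replace(" ", "_")
    return f"{stamp}_{safe}_seed{seed}.{ext}"

print(output_name("FP8 all-in-one", 50))
```

The timestamp keeps files sortable in generation order, and the label survives even if you later move the images out of the output folder.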
Now I can test it and you can see the
text under the image. To make a fair
comparison, we use the same settings for
both workflows. Let us also enable the
bottom panel to see how much time it
takes to generate. As you remember, the
first time you run a model, it is slower
because it loads the model. We can
unload the models using this button and
clear the cache using this one. This
lets us compare which model loads faster
and which one generates faster. I run it
once and you can see the first run took
around 8 seconds. The second and third
runs are faster, around 3.57 seconds. Now
let us go to the second workflow.
I unload the models and clear the cache.
Then run it again a few times.
This one loads slower, but the second
and third runs are faster. On my older
PC, the all-in-one model was faster, so
it really depends on your system. Test
it yourself and see what works best for
you. Now, let us look at quality. We use
a fixed seed with a value of 50 and run
the workflow. Then we do the same for
the first workflow, same fixed seed and
run it.
Right click on the image result and copy
the image. Then create a new workflow
where we compare the two images. I press
Ctrl +V to paste the image and you will
see it adds a load image node with that
pasted image. I do the same for the
image from the second workflow. Now we
have both results. Let us add an image
compare node so we can compare them.
Connect the first image to image A and
the second image to image B. Then run
the workflow. Now we can enlarge and
compare them. The results are quite
similar but still slightly different.
This happens because I used a text
encoder that is different from the one
included in the all-in-one workflow. If
I had used the same text encoder, the
results would have been much more
similar. Let us try again with a
different prompt. Maybe we do a portrait
photo of an old woman. We get a result
like this one. Now let us do the same
for the second workflow and we get
another woman for this one. Let us copy
both images and go to the compare
workflow. Select the load image node and
use Ctrl +V to paste the image into that
node. Now we can compare the two
results. Again, because the text encoder
is different, the comparison is a bit
harder. Still, I kind of like the FP8
scaled version more. You can see that we
use the same settings for both
workflows. One has everything included
and the other has everything separated.
If I searched for the same clip used in
the AIO workflow, I could get much
closer results. This load clip node is
something we will use in other workflows
as well. As you can see, it has a type
option that lets you select different
types of models to match the diffusion
model you loaded. Do not stress too much
if you do not understand everything yet.
It will make more sense as you practice.
If we look at the VAE, you can see where
it goes. As you remember, we use it to
connect to nodes like VAE decode and VAE
encode. If we go back to the first
workflow, that VAE is coming directly
from the one included with the main
model. Different model formats do not
change what an AI knows. They only
change how that knowledge is stored. So
choosing the right format is about
balancing quality, speed, memory, and
flexibility for your hardware and
workflow. GGUF stands for GPT generated
unified format and it is a model format
designed to run large models efficiently
on systems with limited memory. In Comfy
UI, let us go to workflows again and
this time open workflow 5B and 5C so we
can compare them. The workflow we saw in
the previous chapter had three nodes.
Load diffusion model, load clip, and
load VAE. And you can see that it was
loading safe tensors files. If we go to
the GGUF workflow, you can see that some
nodes are different. We now have a UNET
loader that has GGUF in the name and the
file format is GGUF instead of safe
tensors. We use this node to load GGUF
type diffusion models. For the clip, we
could have used the previous node to
load an existing text encoder, but I
wanted to show that you can also use a
clip loader GGUF node to load text
encoders in GGUF format. The last node
is the same load VAE as before. So,
compared to the previous workflow, we
only changed two nodes so we can load
GGUF models, but we do not have those
models yet. So let us go to the notes
and check which node loads which model
and also look at the download links. If
we go here you can see there are many
GGUF model versions. Most of the time
you will see something with a Q version
in the name like Q2, Q4, Q6 or Q8. Most
of the time I use Q8 models. If that is
too big, I switch to Q6. And if that is
still too big, I use Q4. The lower the Q
number, the lower the quality of the
generation, but the models are smaller
and can be faster on limited hardware.
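As a rough rule of thumb, a quantized model's file size is about the parameter count times the bits per weight. Here is a small sketch; the 6-billion-parameter figure is only an illustrative assumption, not the official Z image size:

```python
def approx_size_gb(n_params: float, bits_per_weight: int) -> float:
    # bits -> bytes (divide by 8), bytes -> GB (divide by 1e9)
    return n_params * bits_per_weight / 8 / 1e9

# Hypothetical 6B-parameter model at common quantization levels
for bits in (4, 6, 8, 16):
    print(f"Q{bits}: ~{approx_size_gb(6e9, bits):.1f} GB")
```

Real GGUF files are a bit larger than this estimate because some layers are kept at higher precision, but it explains why Q4 is roughly half the size of Q8.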
Let us look at what this model name
means. Z image Turbo is the core model
family and variant name. Q4 means the
model is quantized to four-bit
precision; lower-bit quantization
reduces file size and VRAM usage. The K
indicates a specific quantization
method, usually a block-based K-quant
method, which helps preserve model
accuracy even at low bit precision. The
S usually means a small or standard
variant within that quantization type;
it trades a bit more quality for a
smaller footprint compared to M
versions, which stand for medium. GGUF
is the file format. Let us download this
model and give it a try. We go to the
Comfy UI folder, then to the models
folder. This main model, just like in
the previous workflow, goes into the
diffusion models folder. Since we
already have the Z image folder from the
previous chapter, I will place it in the
same folder because it is the same base
model just a different quantization. So,
we save it there. Now, let us do the
same for the text encoder. We click here
to download it. Then again go to the
models folder, find the text encoders
folder and place the model there. For
the VAE, if you followed the previous
chapters, you should already have it. If
not, download it and place it in the VAE
folder. These models are big, so wait
for them to finish downloading. After
the download is finished, press the R
key to refresh node definitions so Comfy
UI can see the new models. Now we go
back to the nodes and make sure we can
select the models. The Z image model is
there. The text encoder is also there. I
am using the one with GGUF in the name
because if you use a safe tensors
version here, even if it does not give
an error, the results will not be what
you expect. For the VAE, we already have
it. So now we have everything we need.
Let us run the workflow. The result
looks pretty good for a Q4 version. Let
us open the bottom panel and run it
again. You can see that the first time
it loads the model, it is slower, but
after that it takes around 5 seconds to
generate. In my case, this was slower
than the all-in-one model or the FP8
scaled version. That does not mean it
will be the same on your system. On some
systems, it might be faster. That is why
I keep saying you should test everything
and then keep the best model for your
setup. What is best for me will not
necessarily be best for you because we
have different video cards, different
VRAM amounts and probably different
drivers. Now I am curious how a larger Q
version will perform. So let us go back
to the model list. This time I want to
test a bigger one. The biggest available
here is Q8 which is around 7 GB in size.
I have 24 GB of VRAM so I can easily fit
this model in memory and even larger
ones. Sometimes if a model is larger
than your available VRAM, it will be
slower because it tries to load the
model in parts. You lose time during
that process and generation can be slow
or it can even crash Comfy UI and force
you to restart it. So let us download
this one. We place it in the diffusion
models folder inside the Z image folder.
right next to the Q4 version. Again,
wait for it to finish downloading.
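The "does it fit in VRAM" question from a moment ago can be sketched as a simple size comparison. The 2 GB headroom here is an arbitrary assumption to leave room for the text encoder, VAE, and latents:

```python
import os

def fits_in_vram(model_path: str, vram_gb: float, headroom_gb: float = 2.0) -> bool:
    # A model larger than available VRAM (minus working headroom) gets
    # loaded in parts, which slows generation down or can crash Comfy UI
    size_gb = os.path.getsize(model_path) / 1e9
    return size_gb + headroom_gb <= vram_gb
```

For example, a 7 GB Q8 file fits easily in 24 GB of VRAM, while on an 8 GB card it would be a tight squeeze once everything else is loaded.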
Luckily, I have a fast internet
connection. After that, press R to
refresh. So now in the UNET loader, we
can see both models. By the way, UNET is
the main neural network inside a
diffusion model that predicts what noise
should be removed at each step to turn
random noise into an image. First let us
change the seed to fixed so we can
compare the models properly. I get this
image for Q4. I copy the image, create a
new workflow and paste it there. I will
rename the node to Q4 so I know which
model was used. Now let us go back to
the workflow and select the Q8 model.
Everything stays identical. Only the
model changes. Let us see what we get.
It looks similar at first glance. I copy
this image, go back to the new workflow,
and paste it there as well. I rename
this one to Q8.
At first glance, the Q8 version seems to
have fewer mistakes and looks clearer.
Let us add an image compare node to
compare them properly.
Connect the two load image nodes to the
image compare node.
The first image shown is image A, which
is the Q4 version. As we move the cursor
to the right, we see the Q8 version. In
my opinion, Q8 has better details and
fewer errors. For example, some bolts
seem to be missing in the Q4 version,
while the Q8 version looks more
complete. In most cases, Q8 will be
better than Q6 and better than Q4 in
terms of quality. But now, let us check
the speed.
The first time Q8 took longer to load
because it is a 7 GB model. Let us
change the seed and try again. Now the
second run takes under 4 seconds. Let us
try once more. We change the seed again
and once more it takes under 4 seconds.
Now let us switch back to the Q4 model.
This one is lower quality but also
smaller. You can see that the first time
it loads faster. Let us change the seed
and try again. The second run takes more
than 5 seconds. Let us try one more
time. And again, it takes more than 5
seconds. This is why I keep saying you
should test all of them and then decide.
For me, Q8 is faster and gives better
quality than Q4, but that is because my
video card probably works better with
that quantization. On your system,
especially if you have an older card, it
might be the opposite and Q4 could be
faster. So please test them yourself and
then keep the one that gives you the
best quality and the best speed on your
system.
So let's go to workflows again. And now
it might make sense why I named all
three of these workflows workflow 5:
they are workflows for the same model,
just different model types. So let's open
workflow 5a and you will see how we can
adapt the workflow. Let's move this to
the side. So, this has an all-in-one
model with everything included. We want
to change it into a workflow where the
models are split. So, let's start with
the model. Instead of load checkpoint,
we search for load diffusion model. This
one only loads the model without clip
and VAE. And we select the Z image model
from the list. Then, we need a node that
has that clip output that loads the text
encoder. So, we search now for a node
called load clip. Let's make it bigger
so we can see the parameters.
We first select the text encoder. Then
we select the type. Z image uses
Lumina 2. Lumina means light. You can
think of it like reaching the end of a
tunnel. Z is the last letter of the
alphabet and at the end you see the
light. Lumina 2 represents a newer,
clearer way for the model to understand
prompts and guide image generation. It
simply means more advanced guidance
compared to older models. Then what is
left is the VAE. So, we use the load VAE
node and we select that VAE model. So,
now all that's left to do is to redo the
connections. We drag a link from model
to model. Pretty easy, right? Now, we
need the clip. So, let's drag another
link. And all that's left is the VAE.
This one will connect to the VAE decode
node. And if the workflow is image to
image, it will go to VAE Encode also.
Now, we can get rid of the load
checkpoint node. So now we successfully
replaced all the models and basically we
have the workflow version 5B that we
used before. So let's run it and it all
works okay as it should. Let's say the
model we use now is too big and our
video card doesn't have enough VRAM.
Then we can try a GGUF model to see if
it works faster or better. So let's
search for a node again and this time
search for the UNET loader, the one that
has GGUF in the name. So in this node we
can select a GGUF model. You can see I
downloaded two versions before. So let's
say Q4 is smaller in size than the FP8
version in this case. So it has better
chances to run faster than a bigger
model. But as you saw before on my
computer Q8 was faster. So maybe I will
use that to get better quality instead.
Let's connect the model. And now we can
remove that node. So we replaced an FP8
safe tensors model with a GGUF version
and if we run this workflow you can see
it works just fine and we got a nice
result. If for some reason you are not
happy with the text encoder maybe it is
not so accurate or it is too big we can
try a GGUF version of the text encoder
also. So let's delete that node and
let's search for clip loader, the GGUF
version. You can see it has clip loader
as one word. So now we can select the
GGUF model. And of course we need to
adapt the type since it is not stable
diffusion. It is Lumina 2 instead.
Remember that light at the end of the
tunnel and then link the clip to text
encode prompt. And basically now we have
the workflow 5C. So you saw that having
a modular version allows you to change
models and have more freedom just like
on your computer. If you're not happy
with a mouse or your printer, you can
change it with a smarter or faster
version. Now, if you do have enough
VRAM, you can try to increase the size
for width and height to get more
details. For example, at this size, I
got this image and now we can see more
details on those cables and overall. But
usually for Z image, I use values
between 1024 and 1280 pixels. So, at
the moment of this recording, Zimage is
a pretty good model to have. It is free
and you can generate all kinds of stuff
with it. Let's compare a few of these
models to see what the difference is.
So, for this one, I compared the FP8
all-in-one version with the FP16
all-in-one version, which is double the
size. The results are quite similar.
Maybe the FP8 is a little more
desaturated compared to FP16 and FP16
might be a little bit clearer, but it is
not a huge difference. Both are good
quality. For the Viking image, the FP8
version has fewer details in some areas.
In FP16, it added some extra things like
more ornaments. Again, FP8 looks a
little more desaturated. For the bunny,
both look good. So, I would say if FP8
is faster, has half the size, and the
results are very close, you can get away
with FP8 and keep that. Now, let's
compare the FP8 version with the FP8
scaled version. Keep in mind that the
text encoder is also different in this
case compared to the one included in the
first model, but the results are still
quite similar. Sometimes FP8 does it
better, sometimes the scaled version
does it better. So if you do more tests,
you can decide which one is better for
you. Since the results are very close,
again, it makes sense to keep the one
that is faster on your system. Now,
let's compare the FP8 version with the
GGUF version. Instead of the Q4 version,
I will use the Q8 version downloaded
from here so we can see the difference.
For the portrait, it looks a little
clearer on the Q8 version. For the
Viking, some details are more defined on
the Q8 version. For the Bunny, it is
pretty similar for me. For other models
we will test in the future, the
difference might be bigger and more
obvious, but in this case, the
difference is quite subtle. So, which
one will I keep for my video card? Maybe
the FP8 scaled or the Q8 version, mainly
because they are modular and I can save
space and time when I use the same
models for other workflows. In this
chapter, we explore batch generation and
styles. So, let's open another workflow,
this time workflow 5A, since it has fewer
nodes and you can see things better. But
the methods I show work with any
workflow. Right now, each time you press
the run button, the workflow runs once
and you get a single image. But what if
you want more images and you do not want
to click run every time? In this node,
we have an option for batch. By default,
it is set to one. You can change that,
but keep in mind it will use more VRAM
because it is like running multiple
workflows at the same time. If your
video card can handle it, it will be
faster than generating one image at a
time. So now we get two images. If I
change it to four and run again, we get
four images. If we toggle the bottom
panel, we can see the time it took for
one image, for two images, and for four
images. If we multiply 3.77,
which is the time for one generation, by
four, we get over 15 seconds. But
because we used batch, it only took 13
seconds. So, you need to see what batch
size works best for your video card. I
might be able to use a bigger batch, but
you might need a smaller one. Now, from
these four images, we can click on any
of them to open it bigger. These images
are saved in the output folder as well.
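Writing out the batch timing arithmetic from above makes the trade-off concrete. These are my numbers; yours will differ:

```python
per_image = 3.77              # seconds for one generation on my system
sequential = per_image * 4    # four separate runs, one after another
batched = 13.0                # observed time for a single batch of four

print(f"sequential: {sequential:.2f} s")  # 15.08 s
print(f"batched:    {batched:.2f} s, saving {sequential - batched:.2f} s")
```

The batch wins because the model only has to be set up once for all four images, at the cost of holding four latents in VRAM at the same time.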
To close the big preview, you can use
the X in the top right. Let's open
another one. You can also navigate using
the buttons in the bottom right corner,
so you can check all generations. The
bigger the image, the more VRAM it will
need. Let's say I set the batch to
eight. Since the image size is quite
small, the result is eight images. You
can check the results and pick your
favorite. You can right-click on an image
and save it in any folder you want. Now
let's change the batch back to one. So
we only get one image. Next to run, we
have an arrow that shows multiple
options. Here we also have batch count.
This is not the same as the batch we
used before. Think of this like a
counter where you tell it how many times
to run the workflow. So if I set the
value to four and hit run, it will run
once, then again, and again until it has
run four times. This is a bit slower
than the previous batch method, but it
uses less VRAM. If we add these values,
we get over 14 seconds. With the batch
and empty latent, we got 13 seconds. You
might say 1 second is not much, but if
you use bigger images and longer
workflows, seconds can quickly turn into
minutes. Let's change the batch back to
one. And let's explore more run options.
Run on change will run the workflow when
we change a value. So if I change the
seed, it will start running. It should
stop after the run. But I am not sure if
this is a bug or if this is how it is
supposed to work. Because the seed is
random, it keeps generating continuously
after the first change. But if the seed
is fixed, it only runs the workflow when
I make a change and then it stops. So I
will stop it manually by switching back
to run. If I change it to run instant,
it will generate forever until you stop
it. So do not forget to stop it by going
to the arrow and selecting run. After it
finishes that workflow, it will stop.
But what if we want to run multiple
prompts? Until now, we only had one
prompt and the seed was different. But
for the Zimage model, for example, the
seed variation is not that big compared
to other models like flux or stable
diffusion. So let's search for a node
called iTools line loader. This node
loads each line as a prompt. If we drag
a link from this line loader output and
connect it to the text encoder, a small
dot will appear in the top left corner
of that text input. By default, you have
three prompts here, cat, dog, and bunny.
Let's say the first prompt is a cat
photo. The second prompt is a bunny with
a flower. And the third prompt is a lion
logo. We have a seed here that decides
which prompt will generate. And we also
have control after generation. Randomize
means that after each generation the
seed will change to a different random
value. So let's run it. We got a cat and
now we have a different seed.
For this seed we got a bunny. Now let's
change the seed to fixed so we can
understand better how this works. For
the seed we put zero. In computer
programming lists usually start with
zero, not with one. So instead of 1, 2, 3,
it is 0, 1, and 2. So 0 corresponds to the
cat prompt. If I run the workflow, I get
a photo of a cat. If I change the seed
to one, it corresponds to the second
prompt, which is the bunny. So the
result is a bunny. And for seed 2, we
get a lion logo. Now we know the order
in which this node uses the prompts. Can
you guess what we will get for seed 3?
It will start over with the first
prompt. So the result will be a cat
photo. Let's add another prompt like a
rose and maybe a house with a car in
front. I will start with zero so it
starts with the first prompt. Then for
control after generate I will use
increment so it starts with the first
prompt and continues with the next one
and so on. This way it is more
controlled and not random. Now that we
have five prompts I can change the batch
to five so it runs the workflow five
times. You can see it will generate all
those images one prompt at a time in
order and it will stop after five
generations.
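The way the line loader turns the seed into a prompt choice behaves like zero-based list indexing with wrap-around. A minimal sketch, assuming simple modulo behavior, which matches what we just observed:

```python
prompts = ["a cat photo", "a bunny with a flower", "a lion logo",
           "a rose", "a house with a car in front"]

def pick(seed: int) -> str:
    # Zero-based index; seeds past the end wrap back to the first prompt
    return prompts[seed % len(prompts)]

print(pick(0))  # a cat photo
print(pick(2))  # a lion logo
print(pick(5))  # wraps around: a cat photo
```

This is also why setting control after generate to increment walks through the list in order: each run bumps the seed by one, which bumps the index by one.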
You can also put 10 if you want to get
two generations for each prompt or you
can let it run continuously and stop it
when you get something you like. We saw
how we can load prompts line by line,
but we can also load prompts from a text
file. Let's search for iTools prompt
loader. As the name says, this node
loads prompts from a file. Let's drag a
link from it to the positive prompt. You
can see here it says file path. We
already have an example with a
prompts.txt file. So let's run it. It
will pick a random prompt from that
file. And the result is this cat. Now
let's find that prompts text file. Let's
go to the Comfy UI folder. Then go to
custom nodes. And here look for the
Comfy UI iTools folder. These are all the
files used by that node. Basically, the
node itself looks like this. If we go to
the examples folder, we have a text file
with prompts. Let's open it. You can see
we have a few prompts here. And the
image that was generated corresponds to
one of these prompts. You can delete
everything and add your own prompts here
one by one. Or you can ask ChatGPT to
generate a bunch of prompts. So maybe I
will add one for a dog, maybe one for a
cat, and one for a rose. Now I can save
that file and close it.
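Loading one random prompt per run from a text file, the way the prompt loader node does, can be sketched in plain Python. This is an illustration of the behavior, not the node's actual code:

```python
import random
from pathlib import Path

def random_prompt(path: str) -> str:
    # One prompt per line; blank lines are ignored
    lines = [line.strip() for line in
             Path(path).read_text(encoding="utf-8").splitlines()]
    lines = [line for line in lines if line]
    return random.choice(lines)
```

Because each run picks a line at random, running the workflow repeatedly samples your whole prompt file without you having to edit the prompt node each time.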
Let's go back to Comfy UI and generate
again. It should pick prompts from the
same text file. Now we got that cat
playing with a mouse. Let me remove this
node and try again to see if we get
another prompt. And now we got a rose.
If the text file is in a different
location, you just add the path to that
file here so it knows where to load it
from. And of course, you can change to
run instant and let it run, then stop it
when you have enough images generated.
Let's delete this node and I will show
you more things you can do. Let's search
for a node called iTools prompt styler
and select this one. This node picks
prompts or art styles from a file. We
have positive and negative prompts. But
since our workflow only uses the
positive prompt, we drag a link from
there and connect it to the positive
prompt input. Here we have an area where
we can type our prompt. Let's say I type
a white bunny holding a rose. Then we
can select the style file. These files
that contain different prompts are
stored locally. Let's go to the custom
nodes folder again, then to iTools,
and this time go to styles. You can see
here a few example style files, which
are actually YAML files. If we go to
more examples, we have even more. Now,
if we look back here at the file list,
we can see exactly those files. Among
them, there is one called Pixaroma. I
asked the creator to add my file there
so you can access it easily. Thanks,
Mikotti. Once you have the file
selected, you can choose a template from
that file. You can see here different
templates. For example, I can select a
3D icon or something else. What is
important to remember is that you select
the file first and then the template
inside it. For example, let's open one
of these files with Notepad so we can
see what is inside. Each template looks
like this.
You have the template title, the
negative prompt, and the positive
prompt. As you saw in our workflow, we
only use the positive prompt this time.
So, it will only pick that part. In the
positive prompt, you can see the word
prompt inside brackets. That is where it
takes your prompt and combines it with
the rest of the template prompt. So, if
I do not have anything selected here for
the template, it will use something like
landscape photography of the prompt, and
instead of the word prompt, it will
insert a white bunny holding a rose.
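What the styler does with that placeholder can be sketched in a few lines. The template wording here is made up for illustration; the real YAML files have their own phrasing:

```python
template = "landscape photography of {prompt}, golden hour, highly detailed"

def apply_style(user_prompt: str, template: str) -> str:
    # Swap the placeholder for the short prompt you typed in the node
    return template.format(prompt=user_prompt)

print(apply_style("a white bunny holding a rose", template))
# landscape photography of a white bunny holding a rose, golden hour, highly detailed
```

The template carries the style language, and your short prompt only has to describe the subject.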
Basically, we recreated what these
styles do. This system saves you time by
letting you write a short prompt and
combine it with a ready-made prompt from
a template. This was created back in the
day for stable diffusion models when we
did not have access to AI prompt
generators. It still works today with
most models that recognize these prompts
even though you have much more freedom
using a custom prompt made with
ChatGPT. So if the prompt is a white bunny
holding a rose and for the file I select
the Pixaroma file, then for the template
I can filter by landscape and select
photography landscape. Now when I run
it, it should combine my bunny prompt
with the landscape photography prompt.
And the result is this one. So
everything works quite nicely. Let's
change the template. Let's say I select
3D icon and run the workflow. And we get
this 3D icon of a bunny holding a rose.
Now let's try an ancient Egyptian mural.
We run it again and we get this mural.
And you can clearly see our bunny in the
image. Let's say I select the Rococo art
style. The result is this decorative
style illustration of the bunny. Now
let's open the Pixaroma styles file
with Notepad. You can see all the
templates and prompts for each style and
you can edit them if you want. Just keep
the same format. Otherwise, it will not
work. Let's say I want to use the
template for surreal toy. If I use it in
the workflow, it will take this prompt
and replace the word prompt with my
bunny holding a rose. So, let's test it.
From the templates, I search for surreal
and select that toy style. Now, let's
run the workflow. And we get this 3D
surreal bunny with a rose. Pretty cool.
Let's scroll down and see what else we
can use. Let's say Afrofuturism art.
That means it will use that specific
prompt. Let's change the style and test
it. And the result is this one. Keep in
mind each model will interpret these
prompts differently depending on how it
was trained. There are over 300 styles
or prompts saved in this file from 3D to
art styles, painting, photography,
design, all kinds that I use most often.
Let's say I select the vector coloring
book page style. Now, when I run it, I
get this clean coloring page design. Of
course, if you want it to be more
unique, give more information in the
prompt, like how the bunny looks, how it
is dressed, how the environment looks,
and maybe make it fit your story. Let's
say I want to do a cartoon illustration.
Let's search the list to see if we have
something like that. For example, I can
select a soft 3D cartoon environment and
see what we get. And the result is this
one. Let's search for cute and test this
cute cyberpunk style. And we get this
illustration. These are good for
discovering art styles you might not
have thought to try yet. Now, let's
remove this node and search again. This
time, we look for iTools prompt styler
extra. It is called extra because it has
slots for multiple files and templates.
Let's connect it to the positive prompt.
For the base file, let's select the
Pixaroma file since that one has the
most styles. For the second file, I will
use the same one. Let's set both to
random. So we get a random combination
of two styles. If I run it now, I get
something like this. There is no bunny
because we added a new node and we did
not add a prompt yet. Let's drag a link
from the output called used templates.
This outputs the actual styles that were
used. Then search for preview and add a
preview as text node. Now when I run it,
you can see what styles it combined.
Reflection with fantasy. Now, let's add
the prompt, "A white bunny holding a
rose," and generate again. This time, it
combined propaganda art style with
knitting art, something you probably
would not think to combine. Let's select
a third file, again, the Pixaroma file. For
the third style, let's select random or
any other style you want. Now, if we
look again, it combined Japanese
traditional sticker and fine art. Let's
run it once more and we get this gilded
fantasy bunny. Pretty cool. Let's try
again. And this time we get a cute
minimal line art style. Of course, you
can also manually select which styles to
combine. For example, let's choose a
game asset style combined with low poly.
And for the third one, select Atompunk.
And the result is this one. You can also
run it multiple times to get different
seeds. By now, you should start to get
an idea of how styles work. Let's try
one last combination. Change low poly to
a steampunk style. We get this image
because we used a game asset style. If I
change the game asset to cute cartoon, I
get this cute bunny in a steampunk
environment. So, create your own styles
for the things you use most often. Or
use ChatGPT or other large language
models to generate longer prompts that
describe exactly what you need. In the
previous chapter, we saw how we can use
different prompts to change the style of
the image. But if we want a style that
the model did not learn, we cannot
generate that style. For that, we have
the LoRA files, which add extra
information to the main model. We talked
more about this in episode 13. Let's go
to workflows, and this time, let's
select workflow number six, the one with
LoRA in the name. This is a simple Z
image text to image workflow. In fact,
if we remove these nodes, we get exactly
workflow 5A that we used before. So,
let's undo that. What is different here
is this LoRA Loader Model Only node
which allows us to load a LoRA from our
computer. I just changed the color to
blue. That is all. The node with trigger
words is just a simple note. Again,
revisit chapter 13 for more details. So,
let's go and download a LoRA. I created
a LoRA for a girl with white hair, and
you can download it from here. After
that, navigate to Comfy UI, go to
models, and then open the loras
folder. Here, we already have one from
chapter 13, the SD1.5 LoRA. Now, we
create a new folder called Zimage. Since
this LoRA only works with the Zimage
model, we save the LoRA inside that
folder. After the LoRA is downloaded,
press the R key to refresh the node
definitions.
Now, if we go to the LoRA loader, we
can select that LoRA. You can see the
folder and the LoRA name there. Just
like for all other LoRAs, this is the
trigger word that I used when I trained
that LoRA. I use that in the prompt
together with more words to describe
what I want to generate. Now, when I
generate, I get this girl with white
hair. The LoRA I am using here is a
character LoRA. There are also LoRAs
for styles, objects, or functional ones
that speed things up. This also allows
you to keep a character consistent. So
even if you change the prompt and keep
the trigger words, you get the same
character, which is very useful. There
are many LoRAs trained by people online
on sites like Hugging Face or Civitai.
Over time, you can also learn how to
train them yourself, either online or
locally, if you have enough VRAM. Let's
search for a LoRA on the Civitai
website. Again, if you are from the UK,
you will need a VPN to bypass the
restrictions they set for your country.
Let's go to models. Then we can filter
them. Set time period to all. For model
type, select LoRA. For base model,
select Z Image Turbo. Now we should see
only LoRAs compatible with our base
model. We can sort by highest rated. We
have quite a few here. Let's pick one at
random. Maybe this one that lets us
create character design sheets. Now we
are on the LoRA page. At the top you
can see this LoRA is available for
different models, but we want the Z
Image version. We check the type to make
sure it is a LoRA and that the base
model is correct. We also check if it
has trigger words or other settings.
Then we can download it from here. You
must be logged in to download models. We
save the LoRA in the same loras
folder, inside the zimage folder. After
the download finishes, press the R key
to refresh. Now we should be able to see
that LoRA in the list and select it.
Let's see what else it says about this
LoRA so we can learn more. Here it
shows the trigger words. I can copy
those and paste them in a note so I have
them for later. They also give an
example prompt showing how to use it.
Let's copy that and paste it into the
positive prompt. I will remove the
beginning and ending quotes. Now, let's
copy the trigger words and place them
here instead of the previous trigger
words. We also have a subject. So, let's
say a white bunny warrior. For art
style, maybe I add a 3D render style.
The rest looks fine. For the model
strength, I will use one. If it is too
strong, I can reduce the weight. For the
size, let's make it bigger so we get
more details. Now let's run the
workflow. We get this character sheet
which is not bad. This could be useful
for concept artists to see different
angles. Let's run it again with a
different seed. And we get this result.
Now let's change some things in the
prompt. Maybe it is a medieval bunny.
For art style, I try a vector art style.
Let's run the workflow again. Now we get
a different image. This seed does not
look that good. So maybe I try another
seed to see if I get something better.
Again, not perfect, but at least it
gives some ideas. Let's go back to
Civitai and look again at models. You can see
there are LoRAs for all kinds of things.
That does not mean all of them are
great. They are trained by people like
you and me and shared for free.
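Whatever their quality, all of these LoRAs work the same way under the hood: a LoRA is a small low-rank update that gets merged into the model's weights, scaled by the strength value you set in the loader. Here is a toy numpy sketch of that idea; the matrix shapes and names are purely illustrative, not the actual Z Image internals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical base weight matrix of one layer (illustrative size only).
W = rng.standard_normal((512, 512))

# A LoRA ships two small matrices A and B with a tiny rank, here rank 8,
# which is why the LoRA file is so much smaller than the full model.
rank = 8
A = rng.standard_normal((rank, 512)) * 0.01
B = rng.standard_normal((512, rank)) * 0.01

def apply_lora(W, A, B, strength=1.0):
    # The low-rank product B @ A has the same shape as W and is added on top,
    # scaled by the strength slider from the LoRA loader node.
    return W + strength * (B @ A)

W_full = apply_lora(W, A, B, strength=1.0)  # full effect
W_soft = apply_lora(W, A, B, strength=0.5)  # reduce the weight if it is too strong
W_off = apply_lora(W, A, B, strength=0.0)   # strength 0 leaves the model untouched
```

With strength 0 the update disappears entirely, which is roughly what bypassing the loader node does; lowering the strength just scales the extra training down.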
Depending on the training, some are very
good and some are not so good. Training
is never perfect. If you want to see how
much the LoRA influences the result, we
can test that too. Change the seed to
fixed and generate once to see the
result. In my case, I got this image.
Now, let's go to the LoRA loader node.
Right click on it and select bypass.
Then, we run the workflow again with the
same settings and prompt. You can see
that without the LoRA, we do not get a
character sheet anymore. So, this LoRA
clearly helps with creating multiple
characters on a sheet. Remember, a LoRA
is like an add-on to the main model. It
adds extra training to that model, like
the model took a new course and learned
how to do character sheets. Hope that
helps. I explained ControlNet basics in
chapter 14, but there are models like
Z Image that need different nodes to run
ControlNet. Let's go to workflows. And
now I want to open workflow 4 and also
workflow 7 since both are using
ControlNet. And you can see the
difference. So let's go to the
Juggernaut workflow, which is a Stable
Diffusion 1.5 model. Here we use a Load
ControlNet Model node, and it is the
same node used for SDXL models or Flux
models. Then for ControlNet we have
different models like depth, canny, pose,
or other types that control the image
generation. Now if we go to the Z Image
Turbo workflow, we have a different node
here called Model Patch Loader. Here we
load a ControlNet model, and it is
called union because it has depth, canny
and pose integrated into one single
model. So we do not have to keep
changing the model. It is one model that
does everything it needs. Back in the
Juggernaut workflow, we had a
pre-processor node that converted our
image into a format the ControlNet
model understands.
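A pre-processor is just an image-to-image transform: it reduces your picture to the single signal, such as edges, that the ControlNet model was trained on. As a rough illustration, here is a toy gradient-based edge map in numpy; it is a simplified stand-in, not the real canny pre-processor node:

```python
import numpy as np

def toy_edge_map(gray, threshold=0.2):
    """Reduce a grayscale image (H x W floats in 0..1) to a binary edge map.

    This mimics what a canny-style pre-processor does for ControlNet:
    the model never sees your photo, only this simplified control signal.
    """
    # Horizontal and vertical differences approximate the image gradient.
    dx = np.zeros_like(gray)
    dy = np.zeros_like(gray)
    dx[:, :-1] = gray[:, 1:] - gray[:, :-1]
    dy[:-1, :] = gray[1:, :] - gray[:-1, :]
    magnitude = np.sqrt(dx**2 + dy**2)
    # Keep only strong gradients as white edge pixels.
    return (magnitude > threshold).astype(np.float32)

# A tiny test image: a dark square on a bright background.
img = np.ones((8, 8), dtype=np.float32)
img[2:6, 2:6] = 0.0
edges = toy_edge_map(img)
```

Only the outline of the square survives; the flat interior and background become black, which is exactly why a higher pre-processor resolution gives the ControlNet a more detailed map to follow.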
For the Z Image workflow, that part
remains the same. We can try different
pre-processors like canny, depth, or DW
pose, and they will work with this model.
For the last part in the Juggernaut
workflow, we had an Apply ControlNet
node between the prompts and the K
sampler, with different parameters. For Z
Image Turbo, the node is different. It
is called Qwen Image DiffSynth ControlNet.
Here we only control the strength, which
I set to 0.8 so it is not too strong.
Now let's download the required models.
By now you should already have the main
models downloaded, either FP8, FP16, or
BF16. The principle is the same even if
you use a GGUF version or other types.
We also need to download the ControlNet
model because we are not using the Load
ControlNet Model node but the Model
Patch Loader. We need to place this
model in the model patches folder. So
let's click here to download it. Go to
the Comfy UI folder, then to models, and
here you will find the model patches
folder. Let's save the model here. Wait
for it to download since it is around 3
GB. Also, keep in mind that over time
more versions can appear like version
two or three. So, always check if there
is a newer version available. After the
download is finished, press the R key to
refresh the node definitions so the
model appears in the list. Then, select
that model from the drop down. Now we
should have everything we need to run
the workflow. I have here in the load
image node a robot image loaded. For the
pre-processor I use depth or canny, but
let's start with canny. For the
resolution I will make it bigger so the
canny map has better details. Then for
the prompt we describe what we want to
get, and then we run the workflow. Now we
can see that we got a canny map that
ControlNet understands.
Look at the result. It looks much better
and it follows the edges of the original
robot. Let's try with a different image.
As you remember in the input folder, I
added some images you can use. So, let's
say I load this sphere and cube image.
For the pre-processor, let's use depth
this time. Then, let's adjust the prompt
to fit. Maybe a green sphere on top of a
golden cube in the desert, golden hour,
alien. Now, when I run the workflow, I
get a depth map for that image. For the
most part, it got it right, except the
ground. Let's help it understand what I
want. So, I will add to the prompt, the
sphere and cube levitate in air. Let's
run it again and see if it understands
it better. Now, we got exactly what we
asked for. You can also give the image
to chat GPT and ask for a prompt
together with instructions on how you
want it to look. Let's try something
else.
This time, let's upload that woman in a
yoga pose that the Juggernaut model
struggled with to see how much the model
advanced in the last 2 years. For the
pre-processor, I use the DW pose
pre-processor. For the prompt, I will
add a photo of a woman dressed in white
doing yoga on top of a mountain. Maybe I
add photo taken with a DSLR camera. Not
sure if it will take that too literally.
So now we got our pose skeleton which
looks correct. We also got an Asian
woman, which Z Image tends to generate
when you do not specify what kind of
woman it is. It also added a DSLR camera
on the ground which I do not want. So
let's go back to the prompt. I remove
the DSLR part and for the woman I add
that she is European. Now let's test
again. The result is actually great.
Same pose, the clothes I asked for and
on the mountains. A perfect result. What
do you think? Let's try to recreate this
ControlNet workflow so you can
practice. Go to workflows and let's open
workflow 5A, since this is a simple
text to image workflow for the Z Image
model. Search for Qwen image, written as
one word. Then select the DiffSynth
ControlNet node. Now we need to connect
this between the model and the K
sampler. So let's add the links so
everything goes through this node. Let's
see what other inputs we have here. It
says model patch. So let's search for
that node. We add the Model Patch
Loader.
And here we select the union ControlNet
model. Now we drag a connection from
this node to the ControlNet node. We
also need the VAE and we already know
where it is in this workflow. So we
connect that as well. All that is left
now is an image. To load an image, we
use the load image node. So search for
that node and add it. If we try to
connect the image directly, it will not
work correctly because this model is
trained with canny, depth, and pose. So we
need something to convert the image into
those formats. Search for AIO and add
the AIO Aux pre-processor node. Now our image
goes through this pre-processor. From
the list, we can select one, for
example, the Depth Anything
pre-processor. For the resolution, we
can increase it a bit to get more
detail. Now, we connect the output of
this node to the ControlNet node, since
this is the correct format that
ControlNet understands. We can also add a
preview node to see how the processed
image looks. All that is left now is to
adjust the prompt. Let's say the prompt
is a modern house in winter. We can also
increase the width and height to get
more details. Now, we are ready to test
the workflow. We can see the depth map
of the building. We can enlarge it to
see it better. The result looks like
this. It is similar, but not exactly the
same building shape. You could try a
more detailed prompt or a different
pre-processor. Let's add an image compare
node to see the differences. I want the
original image before processing. So, I
connect it to image A. Then, just after
the VAE decode, I connect that output to
image B. Now let's run the workflow
again and make the image compare node
larger. We can see that it shares some
building edges with the original image
but not all of them. If we want more
accuracy, we can change the
pre-processor.
Let's select a canny pre-processor
instead. Now when we run it, you can see
it captures all the edges in the canny
map. The result should be more accurate.
And this is the result we get. Now we
can see many things in common with the
original image. Keep in mind this is
controlled mainly by edges. So it will
not be exactly the same building. We can
get more control later when we cover
edit models like Flux 2, Qwen Edit, or
Nano Banana Pro. Up to now everything we
did in Comfy UI happened locally inside
the interface. We loaded models,
connected nodes, ran workflows, and
generated images on our own machine. API
nodes are different. They allow Comfy UI
to communicate with external services.
An API is simply a way for one program
to talk to another program over the
internet. Instead of doing everything
locally, we can send data out, let
another service process it, and then
receive a result back. Think of it like
this. Local nodes are tools on your
desk. API nodes are tools you rent
remotely. You send instructions and you
get results back. In Comfy UI, you can
click on the plus to add a new blank
workflow. Then double click on the
canvas and search, for example, for
ChatGPT. You can see that it says API node
under the node name. Let's select this
node. Now, this node looks different
compared to others. It comes already
colored in gold like a VIP version. On
top, it tells you how many credits this
node will consume depending on the
settings. Those credits change based on
what you use. Here we have a list of
models from OpenAI that are accessible
through the API. The API letters stand
for application programming interface.
For example, if we select a big model,
it can cost between 2 to 8 credits
depending on what you ask from it and
how long the answer is. If I change to
ChatGPT mini, it is almost zero
credits. It is not zero, but it is
0-point-something. So, it is quite cheap. This
node has a string output. So, like
ChatGPT, you ask something and you get a
text reply back. Let's drag a link and
search for a node that displays text.
Search for preview. And we have this
preview as text node that we can add.
Here we will get our reply from the
ChatGPT model. Let's say I ask it to
generate a prompt for a cute cartoon
bunny, something 3D. Now when I try to
run that, it asks me to sign in if I
want to use the API. We could use this
login button or we can cancel and go to
the menu then settings. Here we have the
user section in the settings and again
we have the sign-in option. Let's click
sign in. If you have a Comfy UI account,
you can use that or you can simply log
in with Google which is a faster option
for me. Then you select your Gmail email
from the list and you will be signed in.
Now you also have the option to log out.
So now we are connected but we need
credits to run API nodes. Let's go to
credits. Credits are like money. You
basically use real money to buy credits
that you can spend on a lot of models
that are available through the API in
Comfy UI. I have here some credits I
bought a while back. I can click on
purchase credits and then it asks me how
much I want to spend. For example, I
have $10 here, but that might be too
much for a beginner to spend on a first
try. Let's click on minus to see if we
can go lower. The minimum you can buy is
1,55 credits for $5. Then you can
click continue to payment. Depending on
your country, you have different options
to purchase. You can use Link, but you
can also choose without Link if you do
not have one set up. Here you have
options to pay with a card or you can
use Google Pay if you want, and you also
have the option to purchase as a
business. Back in Comfy UI, I have
enough credits to test a few nodes in
today's tutorial. Now, when I run the
workflow again, this node sends
information to the servers, wherever those
are located in the cloud, on OpenAI or
somewhere else. Depending on the
situation, sometimes it is faster,
sometimes it is slower. From the
workflow point of view, nothing special
is happening. Nodes still connect left
to right. Data still flows through
cables. The only difference is where the
computation happens. Local nodes use
your GPU or CPU. API nodes use someone
else's hardware. This has advantages and
disadvantages.
Advantages are that you can use very
powerful models that you cannot run
locally. You save local VRAM and system
resources. Some APIs are faster for
specific tasks. Disadvantages are that
you depend on an internet connection.
There may be usage limits and of course
it costs credits. It is not free. You
have less control over model internals.
So we got the response from ChatGPT and
it gave us multiple prompts and
suggestions instead of a single prompt.
So let's refine what we asked and tell
it to generate a single prompt. Maybe
repeat it once again to reinforce that.
Let's run it again. This time we got a
single prompt just like I asked. Now we
can copy the prompt and paste it into
another workflow. If we want, with this
node selected, I will use Ctrl+C to
copy the node. Then let's go to workflows
and open a workflow like this 5A
workflow that uses Z Image Turbo, which
we know likes long prompts. I will move
this node to the side, then Ctrl+V
to paste that node. To connect this node,
we just drag a link to the positive
prompt. Now we have a mix of local models
that take the prompt from an API node. We
can also drag a preview here if we want.
Let me search for preview as text. Now
we can see what prompt it gave us. I can
rename it prompt so I know this is the
prompt. Let's run the workflow. You can
see it generated a prompt for me. Then
it continues to the next part of the
workflow and generates the image. This
can be quite useful. There are free
models that can also do this, but we
will talk about that in another episode.
Let's change the prompt to be a ninja
bunny, maybe in an action pose. Generate
again. We get a new prompt describing
that bunny and the result is this image.
There are many API nodes and many
options to connect them. Let's go back
to the previous workflow where it was
just those two nodes. Now let's add a
concatenate node, a node that lets you
combine two strings or prompts. Let me
remove this prompt since we want to get
the prompt from the concatenate node. I
will use it for string B for now. And
then connect this concatenate node here.
I will add a green color so it looks
like a positive prompt node. For the
first part, I write a cute cartoon bunny
ninja. For the second part, I write
something like, "Use the prompt to
generate a single detailed prompt
creative. Adapt the prompt to match the
prompt style and mood." You can use all
kinds of chat GPT formulas here to get
exactly what you want. Let's drag a link
to a new preview as text node so we can
see the result of the concatenate node.
Maybe I name it prompt, but I might
change that later. Still exploring what
we can do. When we run it, you can see I
forgot to add a separator. So, it just
combined the ninja prompt with use the
prompt to generate. In the end, it still
understood and generated the prompt.
But let's fix that delimiter and add a
comma and a space. Of course, you can
split this into multiple nodes and make
workflows more complex, one going into
another workflow, and so on. This is the
prompt that goes into ChatGPT, and this
is the prompt that comes out of
ChatGPT, the one we want to use in other
workflows. Now we can run it again. You
can see the input prompt to ChatGPT is
this combined text. I like to use
concatenate because I can easily change
the first prompt without changing the
formula below. So it is easier to edit.
The result is this long prompt for the
Ninja Bunny.
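What the concatenate node does is plain string joining: string A, then a delimiter, then string B. A quick sketch of the same logic; the function and variable names are mine, not the node's internals:

```python
def concatenate(string_a, string_b, delimiter=", "):
    # Joins the editable subject prompt with the fixed instruction formula,
    # just like changing string A without touching string B in the node.
    return string_a + delimiter + string_b

subject = "a cute cartoon bunny ninja"
formula = "use the prompt to generate a single detailed prompt, creative"

combined = concatenate(subject, formula)

# Without a delimiter the two strings run together, which is the
# mistake we fixed earlier by adding a comma and a space.
glued = concatenate(subject, formula, delimiter="")
```

Keeping the subject and the formula in separate inputs is why editing the first prompt never risks breaking the instruction text below it.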
These two nodes are the same, so I only
need one and I remove the other. What we
did here is split a workflow into
multiple pieces so we can easily edit
the prompt without worrying about the
formula. I can quickly change the first
prompt,
run it again, and get a new prompt. It
is quite easy to use at the cost of a
few cents, or 0-point-something credits. I
probably do not need that preview
anymore since I know how they are
combined. So I will leave just one
concatenate node, the ChatGPT node, and
the preview of the final prompt. Now
that we have this, we can save it. Hold
control and drag a selection over all
nodes. Right click on the canvas, then
use save selected as template. Give it a
name, maybe ChatGPT prompt, so we know
it generates prompts. Now we can paste
it into any workflow. Let me open that
5A workflow again.
Since it was already open, I will close
it because I do not want the extra
nodes. Then open it again fresh with
default values. I move this to the side.
Right click on the canvas. Go to node
templates. And now we have that template
there. We can move it wherever we want.
Remember that the ChatGPT prompt comes
from this string output here. And we
connect it to any workflow we have to
the positive prompt. Now when I run the
workflow, ChatGPT generates a prompt.
That prompt is used in the Z Image
workflow and the result is this cute
monk cat.
But the ChatGPT model is also a vision
model which means I can give it an image
and it can see what is in that image.
Let me remove this concatenate node.
Let's add a load image node.
I upload this image of a helmet. Now we
can connect this node to where it says
images. It says images because you can
add multiple images if you use a batch
images node, but maybe we explore that
in a future episode. For the prompt,
let's say something like, give me a
single prompt description for this
image. Descriptive prompt. There are
more complex formulas, but I am just
trying something on the spot. Now, let's
run the workflow. ChatGPT looks at my
image, and after a few seconds, it
should give me a prompt based on that
image. We got this nice long prompt and
the result is this one. It is not
perfectly identical, but with a better
formula, we can probably get something
even closer. It is still pretty close to
what we asked. You can also run the
workflow multiple times to get different
seeds. Let's see what else we can do.
Let's create a new blank workflow.
Double click on the canvas and search
for Nano Banana. We have this first
version of Nano Banana that is cheaper.
It is eight credits, so you can probably
get something similar for free from
Google Gemini. We also have Nano Banana
Pro, the more powerful model that can do
big images. This one costs 28 credits.
If we change to 4K size, it will cost 51
credits. Depending on the model, some
can cost over 100 credits, so be careful
what nodes you use because you can run
out of credits pretty fast. Both accept
images, but Nano Banana Pro understands
prompts and images better. You can see
what model is used for the first Nano
Banana. And for Nano Banana Pro, it is
actually called Gemini 3 Pro image.
Let's remove the first node and use this
one to generate an image. I add a load
image node so we can load an image from
disk. Then connect the nodes. Now I
upload an image. For example, this
portrait of a man. Then I add the
prompt. We did not talk yet about
editing models like Flux 2, Qwen Edit, or
Nano Banana Pro, but these can be used
to edit or modify an image. I could say
change the t-shirt or replace the
background or hair color. Let's try
something simple like telling it to
change what he wears to a steampunk
suit. For resolution for this test, I go
with 2K since it uses about half the
credits. We also have aspect ratio.
Instead of auto, I set it to 9:16, but
you should use whatever ratio you need.
When I run it, I get a prompt failed
message. Can you guess why? It says it
has no output. That is because we did
not save the image. So, let's drag a
link and add a save image node. I also
see it has a string output. So, let's
add a text preview node to see what it
outputs there.
Now, we can run the workflow again. This
one takes longer, over a minute to
generate. You can also check your
profile here to see how many credits you
have or sign out, manage subscription,
and so on. You can also check partner
node pricing. This opens the Comfy UI
website where you can see how much it
costs to use any of the models that are
not free, the so-called partner nodes.
You have models for images, text, and
also a few nodes for video. These
usually need a lot of VRAM and you can
generate video even if you do not have
that VRAM locally but at a cost. Back in
Comfy UI we got our generation. If we
look at it the result is quite good in
2K size and it is quite similar to the
original man. So it is a good model but
expensive. For the text output we also
got something like a peek into what the
model was thinking, basically a prompt
it used to generate that image based on
the small prompt I gave it and the
image. You can explore more API node
workflows created by the Comfy UI team.
If you go to templates here you have all
kinds of workflows but if you want to
see the API ones select partner nodes
then you can filter them by model if you
know what model you are looking for or
just explore random workflows. It does
not cost you anything to open and check
a workflow. It only costs credits when
you run it. Let's say I like the preview
of this workflow. I click on it and I
get the workflow. Let's see what it
uses. We have a load image node. So, it
expects an image from our computer. We
have a Nano Banana prompt, and it says
color this image. So, if you upload a
sketch, it will color that sketch using
the nano banana pro model. By default,
it is set to 1K, but you can change the
settings to fit your needs. Let's go
again to templates and check another
workflow. Maybe this one with the shoe.
This one is more complex. It expects an
image of a product like a shoe. Then it
uses a ByteDance model, which is similar
to nano banana but a cheaper version.
Once the image is saved, it goes to
different video models. These models
cost around 103 credits each. It looks
like it generates multiple videos from
that shoe, depending on the prompt, and
then combines all those videos into one
final video. A workflow this big takes
some time to run and can cost you maybe
around 300 credits, so roughly a couple
of dollars. I did not do the exact math,
but you can spend credits very fast with
video models. One important thing to
understand is that API nodes do not make
Comfy UI cloud-based. Comfy UI is still
running locally. You are simply adding
external steps into your pipeline. From
a mental model point of view, treat API
nodes exactly like normal nodes. The
cables do not care where the data comes
from. If the output type matches, it
works. This also means you can mix local
models, gguf models, diffusion models,
and API nodes all in the same workflow.
Comfy UI is a complex application with a
lot of nodes created by different
people. And we combine all these things
together like Lego pieces. At some point
you will get an error either because you
forgot to connect a link, you connected
the wrong nodes or you used the wrong
models. In this chapter I will try to
explain what you can do when that
happens because it will happen. If we
look at the workflows we have this
workflow with the number zero. The first
one, this one is for help and resources.
And I tried to gather here some
information that might help you. Let's
start with resources. The best way to
learn Comfy UI is to watch tutorials and
to practice. You have here a link to the
Pixaroma YouTube channel, but there are
many other YouTubers who do tutorials
for Comfy UI. You can search on YouTube
for different tutorials. Try to look for
more recent ones because if a tutorial
is two years old, most things are
probably different. Now, that is one of
the reasons I made this new series. On
the top of my YouTube header, I added a
link to my Discord. Click on it, then
click go to site, and you will get an
invitation to join the Pixaroma Discord
server. If for some reason it says
invalid, try a different browser or the
mobile application. You click accept
invite and you will land in the welcome
channel. I will show you more about how
to navigate Discord in a minute. In this
note, I also included a link to Discord
and some useful info like where the
Pixaroma workflows are and so on. Let's
click on this link and we get to the
same invite. It goes to the same server
and the same welcome channel. This is my
server called Pixaroma, but there are
many other servers for Comfy UI. For
example, if you go to the comfy.org
website and then to resources, they also
have a Discord link. It is the same
process. You accept the invite and you
land on their welcome channel. On the
left side, you have the servers you
joined like my Pixaroma server or the
Comfy UI server. Let's go to Pixaroma
and explore a bit more. On the left, we
have different channel names so we stay
organized. Each channel has its purpose.
There are also categories that contain
multiple channels which you can collapse
or expand. For example, if I collapse
this category, you might think you
cannot find the Pixaroma workflows
channel. But if we click on this arrow,
we expand the category and now we can
see the Pixaroma Workflows channel
there. For every server you join, check
the rules so you know what you are
allowed to do and you do not break the
rules and get banned. If your Discord
account gets hacked and posts spam in
your name, you might get banned as well.
You can send me a message to remove the
ban if you fixed your account and it is
not hacked anymore. Here we also have a
help channel where you can find what
each channel is for and where to post.
Some channels are public, some are
private and only for members, and some
are public, but only moderators or
admins can post like news and updates.
We also have a daily challenge for
people who use AI. You can find more
info in this channel and you can
participate in the challenge in the
daily challenge channel. When you see a
number with a red circle, that means
someone mentioned you or everyone on the
server. For example, when you see that
the news and updates channel has a
notification, go check it because I
probably posted a new tutorial or shared
an update. You can see this post used
everyone to mention everyone on the
server. In off topic, you can discuss
things that do not fit in Comfy UI or
other channels, but try to avoid spam
and make sure it still respects the
rules. For Comfy UI here, you usually
have the most active chat. People talk
about Comfy UI, so if you post here for
quick help, and if members know the
answer and have time, you might get
help. If not, it might get ignored.
Another channel where you can post is
the forum. There you usually post things
for longer term discussion. You might
post today and get replies in hours,
days, or sometimes not at all if people
do not know the answer. You can see that
I can post in this channel because it
lets me type. You can ask for help here
and include screenshots and all the
details. This is also the channel where
EVO posts updates about the Easy
Installer, which he continuously
improves and adds more scripts to make
things easier. Thanks to EVO for all the
help. Make sure you check this area for
updates related to the easy installer.
The most visited channel is probably the
Pixaroma workflows channel where people
come to get my workflows from tutorials.
Here I have a list of older episodes
from 2024 to 2025 and also this new
series I am doing now. Starting with the
first episode, you can see links that
lead to specific episodes. For example,
if I click on the first episode, I land
on this page where I will also add a
link to the YouTube video once it is
ready. You will find all the chapters of
that video plus links to Comfy UI and
the workflows. You can download the
workflows either as a zip archive that
you extract or as individual JSON files.
You can also comment on this forum post.
Since I post this series as forum posts,
you can comment if something does not
work so we can try to fix it if
possible. Keep the conversation limited
to that specific episode. For the next
episodes, comment on their respective
posts. If it is not related, use the
forum or the Comfy UI channel. For
off-topic discussions, use the off-topic
channel. You can also use Discord to
navigate quickly. You can create links
to different channels. For example, if I
type the hash sign, I can select
different channels like Pixaroma
Workflows. Or I can type hash and then
help. And you can see what happens. It
adds a link to that channel. When I
press enter, I get a clickable link to
the help channel. If I click it, I land
in the help channel. Let's go back to
the off-topic channel. If you hover over
a message, you have different options
like edit or add reactions. Some servers
also allow you to use emojis from other
servers. For example, I can select this
Pixaroma bunny emoji. To remove a message,
hover again, click the three dots, and
you have different options, including
delete. If I type hash and then
Pixaroma, and select Pixaroma
workflows, press enter, then click it, I
land in the Pixaroma workflows channel.
I keep getting messages that people
cannot find the Pixaroma Workflows
channel. So, I hope this tutorial helps
you find it more easily. In channels
where you cannot comment, you will see a
message saying that posting is not
allowed. Usually only admins or
moderators can post there. Use the Comfy
UI channel for discussion and help
related to Comfy UI. Use Off-topic if you
cannot find the right channel. Let's go
to the forum. Here you can find all
kinds of forum posts. For example, we
have this pinned forum post that you
should read before you post anything.
From here you can create a new post. You
can close this if you want to see more
of the forum. When you create a post,
you can add a title, a message, and
screenshots with your workflow that has
problems. You can also add tags to your
post depending on what the post is
about. Add a clear title and a
descriptive message, not something
vague. When you are done, you can post
it using this button. You can also check
other posts to see how they are written.
For example, the workflows from the
first episode are posted in a forum
post.
Besides that, you have more channels for AI video and AI music, for ChatGPT and other AI topics, and a few more channels
that I will let you explore in your free
time. Keep discussions civilized and
help when you can. I visit the Discord
every day, but I cannot respond to all
messages. Mention Pixaroma if something
is important. In the top right, you also
have an inbox. On the left side, if you
click on the logo, you have direct
messages where you can talk with your
friends. If you click on unread, you can
see notifications, including mentions,
so you can quickly see when someone
mentioned you or everyone on the server.
You can jump directly to that message
using the jump button. Always check
mentions, especially when you see your
username. You also have a search bar
which many people forget exists. Here
you can type a model name or a few words
that people might have used in
discussions. For example, if I search
for LTX2, I can see quite a few posts
with that search query and I can jump to
any of those discussions. You can also
search posts from a specific user. For
example, I can search for posts from
Pixaroma. Make sure the username is Pixaroma and not something else, because some people try to mimic the name. Both the username and display name should be Pixaroma. Now you can see all the posts from Pixaroma. You also have more options
and filters that you can use for
different channels and searches.
You can use these arrows to reply to a
message or forward it to someone else.
Okay, enough with Discord. Let's go back
to Comfy UI. I included here more
resources for Comfy UI like the official
ones and also some unofficial ones like
Reddit or Facebook groups that you can
try. Let's open the Reddit group, for example. We have this Comfy UI Reddit group where you can see discussions, news, tutorials, and so on, and where you can post your questions. There is also one called Stable Diffusion, which includes discussions about Stable Diffusion, free models, and Comfy UI, but also other interfaces, not only Comfy UI.
You can also search for a word on Reddit
like Comfy UI and sort the results by
communities. Then you can check which
ones have more members. The two I use
the most are these ones. Make sure you
also check the other notes I added here
like definitions for beginners, what a
model is, what a text encoder is, and so
on. There's also more information about
performance, common errors and fixes,
model locations, custom nodes, and how
to update Comfy UI. I also included a
link to the easy installer in case you
want to go back to it and find more info
or check what is new in the releases. If you want, I also created an experimental custom GPT that you can try, especially for this easy install Comfy UI version. Like any ChatGPT, it can hallucinate sometimes, but it is still better than a simple chat because it is more specialized for Comfy UI. For example,
if I ask where the images are saved, it
will think and also search the knowledge
database where I added some files. Then
it will answer. You can see the answer
is pretty good. So I think it will help
a lot of beginners. Sometimes if you
think it made a mistake, maybe because
something is new and the model was
trained months ago, you can ask, "Are you sure? Look online," and it will search the web. This way you can double
check and improve your chances of
getting a more accurate response. In
this case, it knew that images are saved
in the output folder. Let's ask
something else, like where are the Pixaroma workflows? Where can I find
them? It will tell you they are on
Discord and give you the channel name.
Let me try something else. Let's open
workflow number one, the Juggernaut text-to-image workflow, and disconnect this node to cause an error. When I run it, I
get this error. Now I take a screenshot
of this error, go to that custom chat,
paste the screenshot, and ask how to fix
this error. You can see that in this
case, since it was a simple error, it
knew how to answer and told me to drag a wire from the Load Checkpoint VAE output to the VAE Decode node. This can save you a lot of
time in many cases. So, I hope you find
it useful. You can give it more
screenshots and more info, even ones
without the error, so it can understand
the workflow better. Sometimes the error asks you to post a report on GitHub, and you can find here a report that gives more info about the error. You also have a Find This Issue option, which opens the issues page on the Comfy UI GitHub page. This is the official Comfy UI GitHub page for the portable version, not the easy installer, even though the easy installer installs the same version plus extra scripts. There is an issues tab where people post problems. You can
search issues that are open or include
closed ones as well. You can also post a
new issue if it is something new and you
did not find any information about it.
Make sure it is an issue with a Comfy UI node, not a custom node. For custom nodes, you need to go to the custom node's page instead. To fix this error, we just
connect the VAE back to VAE Decode. But let's say you have your VAE as a separate file for some workflows. So you use Load VAE to load a VAE and connect that to the VAE input. Let's see what happens when
I run the workflow. It gives this error
which usually means we used models with
different architectures that are not
meant to work together. The error is
shown in VAE Decode. But VAE Decode is not really the problem. The problem is the input that goes into that node. In this case, the VAE loaded with Load VAE was the issue. Let's go back to the help
workflow. I want to remind you that when
you ask for help, include screenshots of
your workflow. Tell us what video card
you have, how much VRAM and system RAM
you have, and which operating system you
are using. Also, explain what you
already tried and what did not work.
This helps the community assist you
faster. Okay, one more chapter to go.
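Before that final chapter, one quick aside on the earlier VAE mismatch error. You can think of it as a shape check: a decoder trained for one architecture expects latents with a specific number of channels, and a latent from a different model family simply does not fit. A plain-Python sketch of the idea, not real Comfy UI code, with illustrative channel counts:

```python
# Toy model of what VAE Decode complains about: the decoder
# expects latents with a fixed channel count (4 is typical for
# Stable Diffusion models; the numbers here are illustrative).
def vae_decode(latent_channels, expected_channels=4):
    if latent_channels != expected_channels:
        # Mirrors the mismatch error: the node that fails is the
        # decoder, but the real problem is the input it was given.
        raise ValueError(
            f"expected {expected_channels} latent channels, "
            f"got {latent_channels}"
        )
    return "decoded image"

print(vae_decode(4))  # matching architectures: decodes fine

try:
    vae_decode(16)  # a latent from a different model family
except ValueError as err:
    print("VAE Decode error:", err)
```

The takeaway is the same as in the workflow: the error surfaces at VAE Decode, but the fix is upstream, at whatever loaded the mismatched VAE.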
Are you ready? You have now reached the
end of this course. At this point, you
understand how Comfy UI works, how
workflows are built, how models differ,
and how to use tools like LoRA,
ControlNet, and advanced diffusion
models. But learning Comfy UI does not
really end here. This is just the
foundation. The most important thing to
understand is that Comfy UI is not a
fixed tool. It is constantly evolving.
New models appear, new nodes are
created, new workflows solve problems in
better ways. So the best way to continue
learning is by experimenting. Open
workflows, break them, rebuild them,
change one thing at a time and see what
happens. That is how real understanding
happens. Another important habit is
reading workflows, not just using them.
When you download a workflow, do not
just press run. Look at the nodes.
Follow the connections. Ask yourself why
something is there. If a workflow looks
confusing, that usually means it is
teaching you something new. Next, stay
connected to the community. Use Discord
to ask questions, share results, and
help others when you can. Very often,
answering someone else's question will
make you understand things better
yourself. Follow model releases, but do
not chase everything. You do not need
every new model. Find a few that work
well for your style and hardware and
learn them deeply. As you get more
comfortable, start building your own
workflows from scratch, even simple
ones, especially simple ones. That is
how you move from copying to creating.
Also, remember that AI tools change
fast. What matters most is not
memorizing settings, but understanding
concepts. Noise, conditioning, sampling,
structure versus style. Those ideas will
stay useful even when models change.
Finally, do not rush. There is no finish
line here. Learning Comfy UI is a
process, not a goal. Take your time,
have fun, and keep experimenting. This
is just the beginning. So, what comes
next? Obviously, I will continue this
series and do episode 2, 3, and so on.
But I cannot make them as big as this
first episode. They will be shorter
videos focused on things we did not
learn yet like other models such as Quen
or Flux video models and so on. We still
have a lot to cover and every week or
month we see new models and new nodes
appearing. My plan for the new series is
to show you these new models and
workflows in an easier-to-understand
way so everything makes sense as much as
possible. Some of these workflows and
models are so new that nobody really
knows much about them yet. I will try to
post a new episode every week if my
health allows it. If not, then at least
one episode every 2 weeks. This new
series will have bunnies on the
thumbnails so you do not confuse it with
the old series.
For the new series, as you saw, the
workflows on Discord are posted in the
forum. This makes it easier for me to
see when you find a bug or when
something does not work anymore so I can
try to fix it. That is why the old
series, even though it still has good
tutorials and you can still watch it,
especially the last episodes, will not
receive updated workflows. I will not go
back and try to fix those old workflows.
Instead, I will focus on the new series.
When needed, I can revisit those older
workflows, adapt them to new models, and
present that in a new episode in the new
series. I wanted you to have the basics
in this long episode 1, this course that
you probably cannot find anywhere else.
I worked one month on this episode, and
I wanted everyone to have access to it
for free. I could have put it behind a
paid course, but I feel better when I
can help people. That being said, I do
appreciate your support. There are many
ways you can help me and this channel so
I can create more videos. The easiest
way is to press the like button,
subscribe to the channel, and leave a
comment, even if it is just a simple
thank you. This shows activity to the
YouTube algorithm and helps the video
reach more people. So now, if someone
asks where they can start learning Comfy
UI, you can share the link to this
course. I will also create a new
playlist that will host all the new
episodes from this series. For those who
can afford to buy me a cup of tea, since
I do not drink coffee, being a bunny, I
already have too much energy. You can
use the join button. Here you have four
different options from really cheap,
like half a cup of tea per month, to
more expensive, like a premium cup of
tea. Depending on the option you choose,
you get different perks. For example,
Legends have a private channel on
Discord where they get to know me
better. If you do not want to help
monthly, you can also help one time. On
each video, you can find this heart with
a dollar sign called super thanks. Super
thanks allows you to select an amount of
money that you want to donate and send
it. You can use this for videos that
really helped you like this course or
any other episode where you learned
something useful. Speaking about legends
and those who subscribed to the
membership, I want to thank all of you
who made this course possible with your
support. Together, we can help other
people learn new tools, understand this
crazy AI world we live in, and maybe
even make it a better place. Thank you
all. You are the best. Have a great day,
and I will see you on Discord.