Local AI Image Upscaling Workflow
Ilan: Hey everyone, this
is Prompt and Circumstance.
I'm Ilan.
David: And I'm David.
Ilan: This week, we're gonna show you
how to upscale images to 4K and beyond.
David, last time we talked about
image generation, you showed us some
pretty cool high quality images.
So how did you get those to work?
David: Yeah, I used a workflow in
Comfy UI, of course, and it was
all done locally, and I'm gonna
walk you through how to do that.
Ilan: Awesome, looking forward to it.
David: All right.
So this is the workflow, and it is probably
one of the more complicated workflows that
some of our audience might have seen.
What it's doing is it's taking
three different models and it's,
uh, chaining them all together
in order to do the upscaling.
And it also adds in some realism to
the texture of the skin because as
some of us might know, looking at
the AI generated images of people,
the skin looks sometimes plastic-y.
All right.
Let's walk through this.
All right.
So we are going to provide this workflow.
It'll be in the link in the show notes.
So you, you don't need to
create this from scratch.
You could just grab the JSON
and drag it into Comfy UI
and away you go.
Now, when you do that, it's likely going
to give you a bunch of warnings because
you don't have the models in here and
possibly some of the custom nodes.
So let's walk you through that.
All right.
So one of the first things that
you would want to do in Comfy UI
in general is install the manager.
Believe it or not, Comfy UI does not
come with a manager UI, and this is
where you would want to go in order to
install, say, the missing custom nodes,
some of which are used in this workflow.
So when you open up Comfy UI, you're
not gonna have this manager here,
and let's show you how to get that.
Ilan: Gotta have a manager, man.
Yeah.
Otherwise, the org chart is too flat.
David: It's too flat.
Gotta make it hierarchical.
Ilan: Can I import a Comfy UI CTO?
David: Then it'll just take over.
All right.
So if you don't already
have the manager installed,
simply come to
docs.comfy.org/manager/install.
It's an official add-on to Comfy UI.
We're gonna link to this page in the show notes
for the Comfy UI manager installation,
and you can just follow the instructions here
to get this installed in your instance.
Ilan: How long did it take you to
figure out that you needed that?
David: Oh, it was, like, right at the
very beginning, because I follow
the YouTube videos
and all that, and they usually say that.
Yeah.
All right.
So now that you have the custom nodes
installed, you're going to want to take
note of this note on the left-hand side.
And so this is going to give you
the URLs to all of the different
files that you're going to have
to download and where to put them.
So for example, these two links will
send you to the Civitai page,
and you will download them and
put them into the diffusion models
folder within the models folder, okay?
And same thing with the
checkpoints here, and so on.
And again, you know, if you don't
want to look at the note here, we'll
have that in the show notes. But it's
interesting to have a
look at how all of this comes
together, because the first two
are the models used for the first
two steps of the refinement, right?
So what the first model does is it
just does this base upscaling of the
image, and then the next model
does the refinement, and then there's
a subsequent additional upscaling.
All right.
So here I am in Civitai, and this is
the first model. When you click on
the link, this is where you would land.
To download it,
first off, you need an
account, so just create a free account.
Then over here, make sure you select
version four, and then you can just
download the selected model and
save it to, again, the correct folder.
Now, these things are not small.
So you, you can see here,
this model is 22 gigabytes.
Ilan: With the manager, is it allowing
you to just click once and download
all of the linked models, or do you
have to go one by one and download
them all from the links in
the note on the left-hand side?
David: The manager does not take care
of the models, or the model files,
I should say, because we're including
variational autoencoders and so forth.
So this will need to be a manual process.
Ilan: Good to know.
David: Yeah.
So on each of these pages in Civitai,
if you're not familiar with them,
you'll see some example images here.
I mean, this is kind of funny.
I guess this is meant to be like
an Eastern European block where,
I guess, we see Spider-Man
drinking a beer.
Well, I won't get into that here.
Over here, this is the other model
that we're going to be using.
So this is Z or Z, Epic Realism.
And so again, just coming
over here, download that.
It's a five gigabyte file, so not
that big, but, uh, nevertheless,
make sure you have room and
put it into the correct folder.
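Since the two downloads mentioned so far add up to about 27 gigabytes, a quick free-space check before downloading can save a failed transfer. The following is a minimal sketch using only the Python standard library. The function name `has_room`, the 5 GB headroom, and the folder being checked are illustrative assumptions, not part of the workflow.

```python
import shutil
from pathlib import Path

GB = 1000 ** 3  # decimal gigabytes, which is how download sizes are usually quoted


def has_room(folder: Path, needed_gb: float, headroom_gb: float = 5.0) -> bool:
    """Return True if the drive holding `folder` has needed_gb plus headroom free."""
    free_bytes = shutil.disk_usage(folder).free
    return free_bytes >= (needed_gb + headroom_gb) * GB


if __name__ == "__main__":
    # Sizes quoted in the episode: 22 GB for the first model, 5 GB for the second.
    needed = 22 + 5
    # Assumption: the current directory sits on the drive that will hold the models.
    target = Path.cwd()
    print(f"Room for {needed} GB of models: {has_room(target, needed)}")
```

The remaining files (LoRAs, text encoders, the VAE, the upscale models, and SeedVR2) are not sized in the episode, so the real total will be higher than 27 GB.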
So the last model that we're going to
be using is something called SeedVR2.
And this is something that has
been released open source by
ByteDance, the makers of TikTok.
And you can see here on this page that
they talk about video upscaling, which
we're going to do in a separate video, but
this model also handles image upscaling.
And so that's what, what we
are going to be using it for.
And you can see some examples
here in terms of before and
after very good details here.
Ilan: You know, I gotta say
that all of these model companies,
they need better naming schemes.
What's going on with
these super long names?
David: Yeah.
SeedVR2.
Ilan: Just ... You
know what?
Across all of them, they could
probably hire one marketer whose job it
is just to come up with good names.
David: What?
You, you mean seeding infinity in
diffusion transformer towards generic
video restoration doesn't work for you?
Ilan: You know, it doesn't quite
roll off the tongue, David.
David: Not ... Okay.
It's just a few tweaks,
I think, are needed.
Okay.
So those are the three major models,
but there's a lot of other files
that you would need to download.
For example, there are some LoRAs
here, the low-rank adaptations.
So these are files that you would
need to download off of Civitai,
and again, the links are provided.
Additionally, of course,
there's CLIP, which ... Do
you know what that stands for?
Ilan: David.
What does it stand for?
David: It stands for Contrastive
Language-Image Pre-training.
Ilan: I gotta say, I prefer CLIP.
I think this is what I'm talking about.
They, they clearly got into a room
for 10 minutes and thought about
what should we call our thing?
David: Yeah, yeah.
I mean, it's one of
those things where it
was named by engineers for engineers.
Okay.
So download, uh, these files
into the corresponding folders.
So note that one of these goes into
the clip subfolder, and then the other
one goes into the text encoders subfolder.
Also in the text encoders is this Qwen3
model, and then we
have your variational autoencoder, and
then you have your upscale models here.
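With files spread over this many folders, a small script can report what is still missing before the workflow is run. This is a minimal sketch, not something shipped with the workflow. The subfolder spellings (`diffusion_models`, `text_encoders`, and so on) assume a default Comfy UI install, every file name in `EXPECTED` is a placeholder to be replaced with the names from the note, and the install path is hypothetical.

```python
from pathlib import Path

# Hypothetical manifest: subfolder of "models" -> file names expected there.
# The file names are placeholders; use the names listed in the workflow's note.
EXPECTED = {
    "diffusion_models": ["example_diffusion_model.safetensors"],
    "checkpoints": ["example_checkpoint.safetensors"],
    "loras": ["example_lora.safetensors"],
    "clip": ["example_clip.safetensors"],
    "text_encoders": ["example_text_encoder.safetensors"],
    "vae": ["example_vae.safetensors"],
    "upscale_models": ["example_upscaler.pth"],
}


def missing_files(models_dir: Path, expected: dict[str, list[str]]) -> list[Path]:
    """List every expected model file that is not yet present on disk."""
    return [
        models_dir / subfolder / name
        for subfolder, names in expected.items()
        for name in names
        if not (models_dir / subfolder / name).is_file()
    ]


if __name__ == "__main__":
    # Hypothetical install path, relative to where the script is run.
    models = Path("ComfyUI") / "models"
    for path in missing_files(models, EXPECTED):
        print("missing:", path)
```

An empty result means every file in the manifest was found in its folder.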
And when you're done downloading
all of those files and putting
them into your folders,
just press the R key.
And what that will do, you can see
that I've done that just now, right,
is that it's going to update the,
uh, the list of files that are, uh,
available in these nodes here, right?
So when you have downloaded, you
know, the diffusion model here,
you can click on this and
select where you've put that.
It's important that you actually do
that, because the paths that have been
set here in the JSON
file might not necessarily match yours.
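To see which file names the workflow JSON refers to before re-selecting them node by node, the file can be scanned for strings that look like model files. This is a minimal sketch that walks the whole JSON rather than relying on a particular node layout. The extension list is an assumption about how model files are usually named, and the function names are illustrative.

```python
import json
from pathlib import Path

# Assumption: model files use one of these common extensions.
MODEL_EXTENSIONS = (".safetensors", ".ckpt", ".pt", ".pth", ".gguf")


def find_model_names(value) -> set[str]:
    """Recursively collect strings that look like model file names."""
    found: set[str] = set()
    if isinstance(value, str):
        if value.lower().endswith(MODEL_EXTENSIONS):
            found.add(value)
    elif isinstance(value, dict):
        for item in value.values():
            found |= find_model_names(item)
    elif isinstance(value, list):
        for item in value:
            found |= find_model_names(item)
    return found


def models_in_workflow(workflow_path: Path) -> list[str]:
    """Return the sorted model file names referenced by a workflow JSON file."""
    data = json.loads(workflow_path.read_text(encoding="utf-8"))
    return sorted(find_model_names(data))
```

Calling `models_in_workflow(Path("upscale_workflow.json"))`, where the file name is hypothetical, returns the stored names, including any subfolder prefix the workflow author used. Those prefixes are the part most likely to differ from your own folders.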
Ilan: Question: Considering that Claude
Code and similar local coding models
can control your desktop, could you
just drop that note into Claude Code
and tell it to go download these files
and put them into the correct folder?
David: That's a really neat idea.
I think it's worth trying.
I think it would understand
what it's gotta do because it's
pretty straightforward here.
It's all laid out.
I'm gonna switch over here to this, this
other workflow that we've previously
introduced, just to show you how I
got the images that I did, right?
So this is using Z-Image Turbo, and
I told it to generate just an Asian
male with a samurai bun, standing
in the middle of traffic.
It's a cinematic closeup, and there we go.
Now, if we look at
this file over here,
you know, this is 1280 by 720,
which is 720p, right?
So over here, after you've
selected all of your models, you come
over here and you would just choose
the image that you want to upscale.
So here in the load image node,
go ahead and specify that.
All right.
So I've selected the Asian man image,
and what you could do, if you really want
to, is tinker with these values here.
But I'm just gonna leave them as is,
and nothing else needs to change.
So I'm just gonna click
and see how this goes.
Ilan: How long does the upscaling take?
I remember you saying for
videos it's, like, an hour, right?
David: For videos, it takes a long time.
For images, a couple minutes.
Ilan: Which is why it takes so long
for videos, because it's, like, if it's
a couple minutes per image, that's,
like, per frame in the video, right?
David: Yeah, exactly.
When you're upscaling videos, there's
also the additional task of
being temporally consistent, so it
needs to take that into account too.
So what this is doing is this is actually
rendering three different images,
right?
It's, it's running the first
model to do the upscaling,
and then it's running the second
model to do the refinement of the
skin and so forth, and then it's
doing the final upscaling using
SeedVR2 to get our final image.
And we're gonna see what this looks
like at the very end, where there's a
comparison and you can just see, you
know, back and forth what it looks like.
Oh, sweet.
It's almost done.
Ilan: When you leave to go to the
bathroom at the restaurant, and then
you come back, and your food's at the
table.
David: The food's already served, yeah.
Ilan: Upscaling is a
dish best served cold.
David: So the workflow has completed,
and you can see that, you know, when
you pan over to the right,
you can see the various different
passes that it's taken.
And over here,
this is the final result.
So first off, the image that we
get is a much higher resolution, right?
So we started off with 1280 by
720, and we're ending up with 4,500 by
2,500, so that's a significant upscale.
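To put numbers on that jump, the two resolutions can be compared directly. This is a small worked calculation using the rounded figures quoted above, so the exact output size of the workflow may differ slightly. The function name `upscale_summary` is illustrative.

```python
def upscale_summary(src: tuple[int, int], dst: tuple[int, int]) -> dict[str, float]:
    """Compare two (width, height) resolutions."""
    src_w, src_h = src
    dst_w, dst_h = dst
    return {
        "width_factor": dst_w / src_w,
        "height_factor": dst_h / src_h,
        "pixel_factor": (dst_w * dst_h) / (src_w * src_h),
        "src_megapixels": src_w * src_h / 1_000_000,
        "dst_megapixels": dst_w * dst_h / 1_000_000,
    }


# Rounded figures from the episode: 1280 by 720 in, 4,500 by 2,500 out.
summary = upscale_summary((1280, 720), (4500, 2500))
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the image grows by roughly 3.5 times along each side and about 12 times in total pixel count, from under 1 megapixel to 11.25 megapixels. That is somewhat more than a standard 3840 by 2160 4K frame, which is about 8.3 megapixels.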
And when I zoom in here and
show the before and after,
you can see here's the
before, and here's the after, right?
I can go in
a little bit further here.
Look at the eyebrows, right?
You can see much more detail
there in the eyebrows, and
even in the eyes too, right?
Now, you'll notice though
that it's trying to keep some of the
imperfections in the skin.
So, like, if you look at the before,
you notice that there's some, like, sun
spots on the skin, but then when
I scan across, you can see that
not all of them are preserved, right?
So this does mess a little
bit with the skin in terms of,
like, identifying features.
So if that's important to you, then this
is not, it's not the right workflow.
But, you know, if all that you're doing
is, "Hey, I have a generated character
and I want to upscale that image,"
then this is, this is perfectly good.
So that's the before and after.
This is the final image here on
the left of the two.
So you can right click on this and
click open image, and that'll bring
you to here, where you can simply right
click and save this image here, this
4K image that we've generated.
So that's just how easy it is.
Ilan: That's a really cool workflow.
Thank you for showing, David.
David: All right.
So that upscaling took 360
seconds, so that's six minutes.
So not bad for going from 720p to 4K.
And again, you know, my machine
has 16 gigabytes of
VRAM and 64 gigabytes of regular RAM.
And if you are so inclined, you can
continue to upscale it further, right?
So you just come to here, select the
4K image that you've
generated, and then upscale that, and
you'll get an 8K image if you have a,
you know, big enough machine for that.
So this is a very powerful workflow.
I think it's great that this
is something that can be done
locally, and I hope our audience
can make good use of this.
Ilan: Very cool.
Well, thank you for
showing us that, David.
And for the audience, if you want more
tutorials on how to do cool things with
AI, then follow Prompt and Circumstance.