On Sampler Runtime Performance

This is mostly just a data dump for nerds.

The simplest conclusion of all this is that if a render bleeds out of VRAM (especially onto DDR3) then it gets very slow.

Also sometimes you try to make a grid larger than your scratch disk and everything crashes.

On that same Google Sheet with all my qualitative testing I had a tab comparing runtimes of checkpoints and samplers, but that much data was really unnecessary for the conclusions.

The most interesting thing from that examination was that DPM++ 2S a Karras takes twice as long as like every other sampler but also looks the best so it is worth using a lot of the time.


NightVision or ProtoVision (or both) seemed very slow

Faetastic is a little too big - constant shared memory usage

The transition into morph does not maintain the shared memory overflow - no other model I have observed from this run has done that.

1000x1000

Obviously there is memory overflow when models transition (from sampled to refiner model)

This behavior is no longer true? I am not sure why.

Maybe there is a part of running multiple things and it trying to hold grids in memory?

But not always! That is such odd behavior

It seems 50/50, also, 9/7 had the least production. I do not know why.

I need to do post analysis

Is it proportional with the compression rate of the image?

For example, does more complexity of the image mean more VRAM/RAM usage, like PNG compression?

morphXL has had a couple of examples of memory overflow but not all examples

pre-refinement 09-09-00110 looks really interesting, maybe soft, look at morphXL specifically

one example of ambience bumps

died during ambience

start of insomnia is not representative of time - 9/10 - started at 4:52pm

There is a question when it comes to the function of reloading model with the model:

Reusing loaded model sd_xl_refiner_1.0.safetensors [7440042bbd] to load insomnia_v10.safetensors [541ffc0227]

as to if that has any impact on the coalescence of runs with this specific workflow,

For example, the first image from insomnia (9/10) had a pre-refined look way different from the second image.

We shall see if the output images also have this same difference because that would suggest

that something having to do with the load order actually matters a lot about how these things are run,

which would be annoying - though maybe usable in a different way,

another reason to get into using ComfyUI if it allows this to be done intentionally,

or dodges this behavior on group runs.

Also interested to see, post pic 2 being done, if the images get better over runs?

Because those look bad.

Should compare test the civit runs of insomnia

Run 4 finally looks 'right and normal', going into refinement. Why?

At a glance the difference is cfg 2 vs cfg 7 w/ 2 looking bad and 7 (into run 5) looking good

Larger number CFG appears to be more effective in contrast to SD1.5

Base model appears to not be even close to memory overflow but this is also still early runs

Even baser just looks so different before refiner, like rerun 9/11/00021

galaxy and default do not appear to be exhibiting any memory overflow, but also, early runs still

Galaxy seems fast, but that is completely anecdotal

Sweetlens has memory overflow, first glance, absolutely does.

It is also the fifth checkpoint run - is that why? Needs comparison testing

Behavior observed on 9/11/00344

osorubeshi does not have overflow, circa 9/12/00024

nor does cinema

Ambience rerun occurred after 7 other checkpoints were run - finishing with cinema (run V2.5_1 into run V2.5_2)

run 2 begins after 9/12/00201

No Ambience overflow on image 1 or 2

Dreamshaper: 9/6/00112-9/6/00207

Copax: 9/6/00208-9/6/00303

sdvn6: 9/6/00304-9/7/00034

Protovision: 9/7/00035-9/7/00130

Nightvision: 9/7/00131-9/7/00226

Zavychromax: 9/7/00227-9/8/00081

Talmendo: 9/8/00082-9/8/00177

Formula: 9/8/00178-9/8/00273

Faetastic: 9/8/00274-9/9/00049

Morph: 9/9/00050-9/9/00145

Miamodel: 9/9/00146-9/9/00241

1_Ambience: 9/9/00242-9/9/00282

(NextCycle)

Insomnia: 9/10/00000-9/10/00095

sd_xl_base: 9/10/00096-9/11/00092

SDXLPeople: 9/11/00093-9/11/00188

Galaxy: 9/11/00189-9/11/00284

Sweetlens: 9/11/00285-9/12/00009

Osorubeshi: 9/12/00010-9/12/00105

Cinemax: 9/12/00106-9/12/00201

(NextCycle)

2_Ambience: 9/12/00202-9/12/00297

(NextCycle)
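
The ranges above can be sanity-checked with a short script. This is a minimal sketch, assuming each entry follows the `Name: M/D/NNNNN-M/D/NNNNN` pattern used in the list and that the image index restarts every day (which is how the ranges appear to behave). The function name `images_in_range` is made up for illustration. Ranges that cross midnight return `None`, because the last index of the earlier day is not recorded in these notes.

```python
import re

# Matches entries such as "Dreamshaper: 9/6/00112-9/6/00207" (name: day/index-day/index).
ENTRY = re.compile(r"^(\w+): (\d+/\d+)/(\d{5})-(\d+/\d+)/(\d{5})$")

def images_in_range(entry):
    """Return (name, count) for a same-day range, or (name, None) if it crosses days."""
    m = ENTRY.match(entry.strip())
    if m is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    name, start_day, start_idx, end_day, end_idx = m.groups()
    if start_day != end_day:
        # Assumption: the index restarts each day, so the count cannot be recovered here.
        return name, None
    return name, int(end_idx) - int(start_idx) + 1

for line in [
    "Dreamshaper: 9/6/00112-9/6/00207",  # ('Dreamshaper', 96)
    "1_Ambience: 9/9/00242-9/9/00282",   # ('1_Ambience', 41)
    "sdvn6: 9/6/00304-9/7/00034",        # ('sdvn6', None)
]:
    print(images_in_range(line))
```

A full same-day run comes out to 96 images, while 1_Ambience stops at 41, which lines up with the note that the run died during Ambience.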

Ambience later images have just a tiny amount of VRAM overflow - like the tiniest line,

And not for every image, which leads to the idea that it is a function of pixel complexity.

Noticing VRAM overflow is chance, I did not monitor this constantly, but it is slow,

So I am hypothesizing that slower runs mean more VRAM overflow
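
Since spotting overflow was down to chance, a small logger would make that hypothesis checkable. This is a rough sketch, not what was used for these runs: it assumes an NVIDIA card with `nvidia-smi` on the PATH (the notes never say which GPU this is), and the helper names and the 95% threshold are arbitrary choices for illustration. The query only reports dedicated VRAM, so usage sitting near the total is just a hint that a spill into shared memory may be happening.

```python
import subprocess
import time

# nvidia-smi query for used and total dedicated memory, in MiB, without headers or units.
QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]

def parse_memory_line(line):
    """Parse one 'used, total' line (MiB) as printed by the query above."""
    used, total = (int(part.strip()) for part in line.split(","))
    return used, total

def near_capacity(used, total, threshold=0.95):
    """True when dedicated VRAM use is at or above the (arbitrary) threshold."""
    return used / total >= threshold

def log_vram(samples, interval_s=5.0):
    """Print a timestamped VRAM reading every interval_s seconds (first GPU only)."""
    for _ in range(samples):
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        used, total = parse_memory_line(out.splitlines()[0])
        flag = "  <-- near cap" if near_capacity(used, total) else ""
        print(f"{time.strftime('%H:%M:%S')} {used}/{total} MiB{flag}")
        time.sleep(interval_s)
```

Running `log_vram(samples=12)` in a second terminal during a batch would give timestamps to line up against the slow images afterwards.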

Final Ambience image has VRAM overflow and is noticeably slower as it processes through steps,

Other images from this run were taking a couple of minutes and this is taking nearly 10

Memory overflows never happen at refiner, they are always a step that happens in the generation process

Final example had 7.26s/it vs like 1.8s/it
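
For scale, those two numbers work out to roughly a 4x slowdown. A tiny worked example, using only the figures above (the function name is just for illustration):

```python
def slowdown_factor(slow_s_per_it, normal_s_per_it):
    """Ratio of seconds per iteration between a slow run and a normal run."""
    return slow_s_per_it / normal_s_per_it

factor = slowdown_factor(7.26, 1.8)
print(f"{factor:.2f}x slower")  # 4.03x slower
```

That is consistent with an image that normally takes a couple of minutes stretching to somewhere between 8 and 10.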

Rerun of Insomnia just looked bad, not a load order issue.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

v3

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

80 steps in checkpoint, 100 steps in latent highres fix, with no upscaling. All else the same

1.04s/it running v3 which has no reload of model, 1.09s/it for generation, 1.01s/it for hires

Already it is so much faster not having to spend ~20 seconds reloading models each half cycle
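
Those rates give a rough budget for a full checkpoint. This is a back-of-the-envelope sketch: it assumes 96 images per checkpoint and ignores everything that is not a sampling step (model load, VAE decode, saving files, grids).

```python
def seconds_per_image(gen_steps, gen_s_per_it, hires_steps, hires_s_per_it):
    """Sampling time for one image: generation pass plus hires fix pass."""
    return gen_steps * gen_s_per_it + hires_steps * hires_s_per_it

IMAGES_PER_CHECKPOINT = 96  # assumption: one full checkpoint run

per_image = seconds_per_image(80, 1.09, 100, 1.01)
per_checkpoint = per_image * IMAGES_PER_CHECKPOINT

print(f"{per_image:.1f} s per image")                    # 188.2 s per image
print(f"{per_checkpoint / 3600:.2f} h per checkpoint")   # 5.02 h per checkpoint
```

So about 3 minutes per image and about 5 hours per checkpoint, before any slowdown.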

Dreamshaper: 9/14/00008-

Illogical perpetual freeze on unfinished(non-existent) 9/14/00026, leads to force close

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

v3.5

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Actual 9/14/00026 is a new run

- 9/14/00043

Goofed, 20 initial steps, had some cool results, something to reflect on and try intentionally,

does not work for this, so

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

v3.75

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Starts at 9/14/00044,

80 initial steps, 100 refiner steps, the right v3, please do not crash

Why did it start on Copax? Because I threw an interrupt and then changed the steps

did it just skip Dreamshaper? That is so weird and also kind of annoying

it/s slowdown when doing literally anything else, also later in the run the slowdown,

after a few minutes of YouTube and then closing it, the number went from

1.88->2.05->1.89.

I will have to revisit after I let the comp sleep for a while

There was also a drip of overflow, but VRAM did not look capped out, so, idk what that is

Leaving it running overnight, it may look like toward the end of the run it slowed down

to 1.88s/it, but I cannot be sure, I will monitor Protovision as it is 67/96 through

Using notepad++, notepad, and bridge on 67/96 did not affect it/s

Yeah, slowed to 1.88s/it in the last handful of images generated with Protovision too,

I was not using the computer, it must be something about the grid or something,

I do not know why else it would slow down like that.

Normal s/it is like 1.05 ish, for reference

Somehow seemed to maintain decent it/s while using OBS - probably

unconfirmed for sure by the logs, but they look like it
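
Checking the logs properly would mean pulling the rate out of each progress line. This is a sketch that assumes the console output uses the default tqdm progress bar format, where the rate is printed as either `it/s` or `s/it` just before the closing bracket. The sample line is made up for illustration, not copied from these runs. Everything is converted to seconds per iteration so slow and fast readings can be compared directly.

```python
import re

# Rate at the end of a tqdm bar, e.g. "1.09s/it]" or "2.00it/s]".
RATE = re.compile(r"(\d+(?:\.\d+)?)\s*(s/it|it/s)\]")

def seconds_per_iteration(line):
    """Return seconds per iteration from a tqdm-style line, or None if no rate is found."""
    m = RATE.search(line)
    if m is None:
        return None
    value, unit = float(m.group(1)), m.group(2)
    if unit == "it/s":
        # Invert iterations-per-second so every reading is in seconds per iteration.
        return 1.0 / value if value > 0 else None
    return value

sample = " 50%|#####     | 40/80 [00:43<00:43,  1.09s/it]"  # hypothetical log line
print(seconds_per_iteration(sample))  # 1.09
```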

Memory overflow and slowdown at end of Talmendo (9/16/00180)

But only overflow during generation not highres fix period but still slow

(MemOflow... .pngs)

Then does get fast at the start of Formula, neat

Faetastic for example slowed down at around 96-13

Specifically for Fae, between 9/16/00002 and 00003, 3 is slow,

so, 96-13=83, weird number, look at file sizes...

Transition into Morph, yeah, instantly back to 1.04s/it ish, even with Edge playing YouTube

108.97MB, weird number! Add it to Checkpoint size...

6.56GB I have a screenshot, replicable if I wanted but I do not need that extra bloat

That is a weird number, suggests grid memory allocation issue of some kind potentially?

I do not know, I will keep looking, if the number keeps getting bigger (this is my first

observation but I know the others weren't like 13)

then I will know it is something about stored data or memory allocation.

Interesting!

Well, this is differently related, but at 9/17/(((((

everything broke. Had to reset windows including a fresh install of SD.

I rematched the sizing of the Bridge window because I wanted that to be a consistent factor,

I have not yet tested how SD works comparatively, but I have all the checkpoints and images still,

just the C drive died so I had to rebuild SD

Copax: 9/14/00044-9/15/00045

sdvn6: 9/15/00046-9/15/00141

Protovision: 9/15/00142-9/15/00237

Nightvision: 9/15/00238-9/15/00333

Zavy: 9/15/00334-9/16/00084

Talmendo: 9/16/00085-9/16/00180

Formula: 9/16/00181-9/16/00276

Faetastic: 9/16/00277-9/17/00015

Morph: 9/17/00016-9/17/00111

Mia: 9/17/00112-9/17/00182 (crash)

Just look at D:\BigSDXLBatch\WeirdTransitionExample

and the note therein
