On Sampler Runtime Performance
This is mostly just a data dump for nerds.
The simplest conclusion of all this is that if a render bleeds out of VRAM (especially onto DDR3) then it gets very slow.
Also sometimes you try to make a grid larger than your scratch disk and everything crashes.
On the same Google Sheet as all my qualitative testing I had a tab comparing runtimes of checkpoints and samplers, but that much data was unnecessary for the conclusions.
The most interesting result from that comparison was that DPM++ 2S a Karras takes about twice as long as nearly every other sampler but also looks the best, so it is worth using a lot of the time.
NightVision or ProtoVision (or both) seemed very slow
Faetastic is a little too big - constant shared memory usage
The transition into morph does not maintain the shared memory overflow - no other model I have observed from this run has done that.
1000x1000
Obviously there is memory overflow when models transition (from sampled to refiner model)
This behavior is no longer true? I am not sure why.
Maybe there is a part of running multiple things and it trying to hold grids in memory?
But not always! That is such odd behavior
It seems 50/50, also, 9/7 had the least production. I do not know why.
I need to do post analysis
Is it proportional with the compression rate of the image?
F.e., does more complexity of the image mean more VRAM/RAM usage like png compression?
morphXL has had a couple of examples of memory overflow but not all examples
pre-refinement 09-09-00110 looks really interesting, maybe soft, look at morphXL specifically
one example of ambience bumps
died during ambience
start of insomnia is not representative of time - 9/10 - started at 4:52pm
There is a question when it comes to the function of reloading model with the model:
Reusing loaded model sd_xl_refiner_1.0.safetensors [7440042bbd] to load insomnia_v10.safetensors [541ffc0227]
as to if that has any impact on the coalescence of runs with this specific workflow,
For example, the first image from insomnia (9/10) had a pre-refined look way different from the second image.
We shall see if the output images also have this same difference because that would suggest
that something having to do with the load order actually matters a lot about how these things are run,
which would be annoying - though maybe usable in a different way,
another reason to get into using ComfyUI if it allows this to be done intentionally,
or dodges this behavior on group runs.
Also interested to see, post pic 2 being done, if the images get better over runs?
Because those look bad.
Should comparison-test the Civitai runs of Insomnia
Run 4 finally looks 'right and normal', going into refinement. Why?
At a glance the difference is cfg 2 vs cfg 7 w/ 2 looking bad and 7 (into run 5) looking good
A larger CFG value appears to be more effective here, in contrast to SD1.5
Base model appears to not be even close to memory overflow but this is also still early runs
Even baser just looks so different before refiner, like rerun 9/11/00021
galaxy and default do not appear to be exhibiting any memory overflow, but also, early runs still
Galaxy seems fast, but that is completely anecdotal
Sweetlens has memory overflow, first glance, absolutely does.
It is also the fifth checkpoint run - is that why? Needs comparison testing
Behavior observed on 9/11/00344
osorubeshi does not have overflow, circa 9/12/00024
nor does cinema
Ambience rerun occurred after 7 other checkpoints were run - finishing with cinema (run V2.5_1 into run V2.5_2)
run 2 begins after 9/12/00201
No Ambience overflow on image 1 or 2
Dreamshaper: 9/6/00112-9/6/00207
Copax: 9/6/00208-9/6/00303
sdvn6: 9/6/00304-9/7/00034
Protovision: 9/7/00035-9/7/00130
Nightvision: 9/7/00131-9/7/00226
Zavychromax: 9/7/00227-9/8/00081
Talmendo: 9/8/00082-9/8/00177
Formula: 9/8/00178-9/8/00273
Faetastic: 9/8/00274-9/9/00049
Morph: 9/9/00050-9/9/00145
Miamodel: 9/9/00146-9/9/00241
1_Ambience: 9/9/00242-9/9/00282
(NextCycle)
Insomnia: 9/10/00000-9/10/00095
sd_xl_base: 9/10/00096-9/11/00092
SDXLPeople: 9/11/00093-9/11/00188
Galaxy: 9/11/00189-9/11/00284
Sweetlens: 9/11/00285-9/12/00009
Osorubeshi: 9/12/00010-9/12/00105
Cinemax: 9/12/00106-9/12/00201
(NextCycle)
2_Ambience: 9/12/00202-9/12/00297
(NextCycle)
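The ranges above can be turned into per-checkpoint image counts. This is a minimal sketch, assuming each entry has the form `Name: M/D/NNNNN-M/D/NNNNN` and that the five-digit index restarts at 00000 on each new day (which is what ranges like 9/6/00304-9/7/00034 suggest). The function names are hypothetical. Ranges that cross a day boundary are left uncounted, because the last index of the first day is not recorded here.

```python
import re

# One entry looks like "Dreamshaper: 9/6/00112-9/6/00207".
RANGE_RE = re.compile(
    r"^(?P<name>[^:]+):\s*"
    r"(?P<d1>\d+/\d+)/(?P<i1>\d{5})-(?P<d2>\d+/\d+)/(?P<i2>\d{5})"
)

def parse_range(line):
    """Return (name, start_day, start_index, end_day, end_index) or None."""
    m = RANGE_RE.match(line.strip())
    if m is None:
        return None  # e.g. "(NextCycle)" or an open-ended range
    return m["name"].strip(), m["d1"], int(m["i1"]), m["d2"], int(m["i2"])

def same_day_count(line):
    """Count images in a range, but only when it starts and ends on one day."""
    parsed = parse_range(line)
    if parsed is None:
        return None
    _, d1, i1, d2, i2 = parsed
    if d1 != d2:
        return None  # assumed daily index reset makes the count unknowable
    return i2 - i1 + 1

if __name__ == "__main__":
    for entry in ["Dreamshaper: 9/6/00112-9/6/00207",
                  "1_Ambience: 9/9/00242-9/9/00282",
                  "sdvn6: 9/6/00304-9/7/00034"]:
        print(entry, "->", same_day_count(entry))
```

Under those assumptions a full checkpoint comes out at 96 images, and the first Ambience run stops short of that.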
Ambience later images have just a tiny amount of VRAM overflow - like the tiniest line,
And not for every image, which leads to the idea that it is a function of pixel complexity.
Noticing VRAM overflow is chance, I did not monitor this constantly, but it is slow,
So I am hypothesizing that slower runs mean more VRAM overflow
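Since the overflow was only noticed by chance, one way to test the hypothesis would be to log VRAM use alongside the run. The sketch below is not what was done here; it assumes an NVIDIA card with `nvidia-smi` on the PATH, and it only sees dedicated VRAM, so "nearly full" is a proxy for spilling into shared memory rather than a direct measurement. The 0.97 threshold and the function names are arbitrary, hypothetical choices.

```python
import subprocess

# Ask nvidia-smi for a timestamp plus used/total dedicated memory in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=timestamp,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

def parse_sample(line):
    """Parse one CSV line such as '2023/09/12 16:52:01.123, 7890, 8192'."""
    timestamp, used, total = [field.strip() for field in line.split(",")]
    return timestamp, int(used), int(total)

def near_capacity(used_mib, total_mib, threshold=0.97):
    """True when dedicated VRAM is close to full (proxy for overflow risk)."""
    return used_mib / total_mib >= threshold

def poll_once():
    """Run nvidia-smi once and return one parsed sample per GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_sample(line) for line in out.splitlines() if line.strip()]
```

Calling `poll_once()` every few seconds and writing the samples next to the image timestamps would show whether the slow images line up with the near-capacity samples.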
Final Ambience image has VRAM overflow and is noticeably slower as it processes through steps,
Other images from this run were taking a couple of minutes and this is taking nearly 10
Memory overflows never happen at the refiner stage; they always occur during a step of the generation process
Final example had 7.26s/it vs like 1.8s/it
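As a quick sanity check on those two rates, the slowdown factor can be compared against the "couple of minutes" versus "nearly 10" observation. This is a small sketch that assumes the step count was the same for the slow image and the normal ones; the 2 to 2.5 minute baseline is my reading of "a couple of minutes", not a logged figure.

```python
def slowdown_factor(slow_s_per_it, normal_s_per_it):
    """How many times longer each iteration takes in the slow case."""
    return slow_s_per_it / normal_s_per_it

factor = slowdown_factor(7.26, 1.8)
print(f"slowdown: {factor:.2f}x")  # slowdown: 4.03x

# Scale an assumed normal image time by the same factor.
for normal_minutes in (2.0, 2.5):
    print(f"{normal_minutes} min normal -> {normal_minutes * factor:.1f} min slow")
```

A roughly 4x slowdown on a 2 to 2.5 minute image lands at about 8 to 10 minutes, which matches what was seen.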
Rerun of Insomnia just looked bad, not a load order issue.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
v3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80 steps in checkpoint, 100 steps in latent highres fix, with no upscaling. All else the same
1.04s/it running v3 which has no reload of model, 1.09s/it for generation, 1.01s/it for hires
Already it is so much faster not having to spend ~20 seconds reloading models each half cycle
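Those rates give a rough per-image and per-checkpoint time budget. This is a back-of-the-envelope sketch: the step counts and s/it figures are the ones quoted above, while 96 images per checkpoint and two reloads per image (my reading of "each half cycle") are assumptions, and it ignores the slowdowns that show up late in some runs.

```python
# Figures quoted in the v3 notes.
GEN_STEPS = 80
HIRES_STEPS = 100
GEN_S_PER_IT = 1.09
HIRES_S_PER_IT = 1.01

# Assumptions, not measurements.
IMAGES_PER_CHECKPOINT = 96  # size of a full checkpoint batch
RELOAD_SECONDS = 20         # "~20 seconds" per model reload
RELOADS_PER_IMAGE = 2       # one reload per half cycle

def image_seconds():
    """Sampling time for one image: generation pass plus hires pass."""
    return GEN_STEPS * GEN_S_PER_IT + HIRES_STEPS * HIRES_S_PER_IT

def checkpoint_hours(images=IMAGES_PER_CHECKPOINT):
    """Sampling time for a whole checkpoint batch, in hours."""
    return image_seconds() * images / 3600

def reload_minutes_saved(images=IMAGES_PER_CHECKPOINT):
    """Reload time avoided per checkpoint by not swapping models."""
    return RELOAD_SECONDS * RELOADS_PER_IMAGE * images / 60

print(f"per image: {image_seconds():.1f} s")
print(f"per checkpoint: {checkpoint_hours():.2f} h")
print(f"reload time avoided: {reload_minutes_saved():.0f} min")
```

With these inputs an image takes a little over three minutes, a checkpoint takes about five hours, and skipping the reloads saves on the order of an hour per checkpoint.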
Dreamshaper: 9/14/00008-
Illogical perpetual freeze on unfinished (non-existent) 9/14/00026, which led to a force close
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
v3.5
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Actual 9/14/00026 is a new run
- 9/14/00043
Goofed, 20 initial steps, had some cool results, something to reflect on and try intentionally,
does not work for this, so
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
v3.75
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Starts at 9/14/00044,
80 initial steps, 100 refiner steps, the right v3, please do not crash
Why did it start on Copax? Because I threw an interrupt and then changed the steps
Did it just skip Dreamshaper? That is so weird and also kind of annoying
s/it slows down when doing literally anything else, and also later in the run;
after a few minutes of YouTube and then closing it, the number went from
1.88->2.05->1.89.
I will have to revisit after I let the comp sleep for a while
There was also a drip of overflow, but VRAM did not look capped out, so, idk what that is
Leaving it running overnight, it may look like toward the end of the run it slowed down
to 1.88s/it, but I cannot be sure, I will monitor Protovision as it is 67/96 through
Using notepad++, notepad, and bridge on 67/96 did not affect s/it
Yeah, slowed to 1.88s/it in the last handful of images generated with Protovision too,
I was not using the computer, it must be something about the grid or something,
I do not know why else it would slow down like that.
Normal s/it is like 1.05 ish, for reference
Somehow seemed to maintain decent speed while using OBS - probably,
unconfirmed for sure by the logs, but they look like it
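To confirm speeds from the logs rather than by eye, the rate can be pulled out of each progress-bar line. This is a minimal sketch, assuming the console log contains tqdm-style lines ending in either `N.NNs/it` or `N.NNit/s` (tqdm switches between the two depending on speed); the sample lines in the test block are made up for illustration and the function name is hypothetical. Everything is normalized to seconds per iteration so slow and fast lines compare directly.

```python
import re

# Matches a number followed by either unit, e.g. "1.09s/it" or "2.50it/s".
RATE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(s/it|it/s)")

def seconds_per_iteration(log_line):
    """Return the rate on a progress line as seconds per iteration, or None."""
    m = RATE_RE.search(log_line)
    if m is None:
        return None  # no rate on this line (e.g. "?it/s" at startup)
    value, unit = float(m.group(1)), m.group(2)
    if unit == "s/it":
        return value
    # it/s is the reciprocal; guard against a zero rate.
    return 1.0 / value if value > 0 else None

def slow_lines(lines, normal_s_per_it=1.05, tolerance=1.5):
    """Yield (line, s/it) for lines slower than tolerance x the normal rate."""
    for line in lines:
        rate = seconds_per_iteration(line)
        if rate is not None and rate > normal_s_per_it * tolerance:
            yield line, rate
```

Running `slow_lines` over a saved console log, with the normal rate set to about 1.05, would flag stretches like the 1.88 readings without having to watch the run.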
Memory overflow and slowdown at end of Talmendo (9/16/00180)
But only overflow during generation not highres fix period but still slow
(MemOflow... .pngs)
Then does get fast at the start of Formula, neat
Faetastic for example slowed down at around 96-13
Specifically for Fae, between 9/16/00002 and 00003, 3 is slow,
so, 96-13=83, weird number, look at file sizes...
Transition into Morph, yeah, instantly back to 1.04s/it ish, even with Edge playing YouTube
108.97MB, weird number! Add it to Checkpoint size...
6.56GB I have a screenshot, replicable if I wanted but I do not need that extra bloat
That is a weird number, suggests grid memory allocation issue of some kind potentially?
I do not know, I will keep looking, if the number keeps getting bigger (this is my first
observation but I know the others weren't like 13)
then I will know it is something about stored data or memory allocation.
Interesting!
Well, this is differently related, but at 9/17/(((((
everything broke. Had to reset windows including a fresh install of SD.
I rematched the sizing of the Bridge window because I wanted that to be a consistent factor,
I have not yet tested how SD works comparatively, but I have all the checkpoints and images still,
just the C drive died so I had to rebuild SD
Copax: 9/14/00044-9/15/00045
sdvn6: 9/15/00046-9/15/00141
Protovision: 9/15/00142-9/15/00237
Nightvision: 9/15/00238-9/15/00333
Zavy: 9/15/00334-9/16/00084
Talmendo: 9/16/00085-9/16/00180
Formula: 9/16/00181-9/16/00276
Faetastic: 9/16/00277-9/17/00015
Morph: 9/17/00016-9/17/00111
Mia: 9/17/00112-9/17/00182 (crash)
Just look at D:\BigSDXLBatch\WeirdTransitionExample
and the note therein