AiArt2 – Decoding the Variational Autoencoder

by UnCivilServant | Mar 19, 2024 | Art, Technology | 104 comments

In my previous article on AI Art, I admitted to not knowing what the VAE does. I did find that it stood for Variational Autoencoder. The rest was just a magic black box. The articles I tried to reference threw a lot of math at me. So I decided to approach the question experimentally. On the way there, I found out what some of the other magic from the previous article was. I admitted to borrowing keywords from other people to speed up getting somewhere. I’ve since been digging into what those do as well, and uncovered an aspect I’d previously overlooked.

Automatic1111 and ComfyUI have different syntaxes for the prompts, which disguised the fact that some of those magic words are triggers for another type of module: Embeddings. Embeddings are much like LoRAs, but are triggered exclusively from within the prompts. The Automatic1111 syntax doesn’t require anything but the Embedding name to trigger it, while ComfyUI differentiates them with a prefix of embedding: added before the name. This can be combined with a weight value to end up with something like embedding:name:1.5 and be valid. A number of these are specifically designed to be in the negative prompt. I’ve seen a number of names for that type, but the naming isn’t entirely consistent. They work by bundling all the nightmare fuel you don’t want; putting them in the negative prompt nudges the engine towards the positive.
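
To make the two syntaxes concrete, here is a minimal Python sketch of how the trigger text could be assembled. The helper names `comfy_embedding` and `a1111_embedding` are made up for illustration and are not part of either tool. The parenthesized weighted form simply copies how the negative prompt further down writes it.

```python
from typing import Optional


def comfy_embedding(name: str, weight: Optional[float] = None) -> str:
    """Format an embedding trigger the way this article's ComfyUI prompts write it."""
    token = f"embedding:{name}"
    if weight is None:
        return token
    # Weighted form, wrapped in parentheses like (embedding:bad-hands-5:1.5).
    return f"({token}:{weight})"


def a1111_embedding(name: str) -> str:
    """Automatic1111 needs only the bare embedding name."""
    return name


# Hypothetical assembly of a short negative prompt from a few pieces.
negative_parts = [
    "NSFW",
    comfy_embedding("EasyNegativeV2"),
    comfy_embedding("bad-hands-5", 1.5),
]
negative_prompt = ", ".join(negative_parts)
print(negative_prompt)
```

Building the prompt from a list like this makes it easier to swap a single embedding in or out without retyping the whole string.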

So, I’ve updated the prompts I’m using to make use of these embeddings. I also got lectured about representation from a fictitious source. So I updated the prompt to add a lady alongside the knight.

The new positive prompt is “knight with lady in gown in ballroom, masterpiece, best quality, sleek, highly detailed, digital painting, realistic digital painting, detailed digital painting, smooth gradients, caucasian, knight, armor, futuristic, cyberpunk, sleek, highly detailed, anime, indoors, ballroom, soft indoor lighting, day, digital painting, 1man:1.5, 1woman:1.5, muscular male, shapely female, duo, couple, redhead, green eyes, looking at viewer, armor, realistic, highres, smooth gradients, detailed face, realistic skin tone, youthful, strong, fantasy art, youthful, strong, full body”

The negative has been trimmed down a bit to “NSFW, (worst quality:2), (low quality:2), (normal quality:2), text, signature, lowres, watermark, embedding:EasyNegativeV2, embedding:HDA_BadHands_neg-neg, (embedding:bad-hands-5:1.5), embedding:BadDream, embedding:UnrealisticDream, (extra fingers, deformed hands, polydactyl:1.5),”

So many numbers

Now, in order to see the impact of the VAE on the images, I need to cut down on the other variables. So, it’s time to look at some of the options in the sampler. We are mostly interested in the first two, “seed” and “control_after_generate”. Since computers don’t do true random, the seed value determines what the rest of the system ends up producing. All other values being equal, the same seed will generate the same image. You can either enter a seed manually or let the application come up with one. “control_after_generate” gives the options ‘randomize’, ‘increment’, ‘decrement’, or ‘fixed’. Randomize is the default on installation. The key thing to bear in mind is that ‘after_generate’ is literal. When set to randomize, it will generate an image and then produce a new seed value. So the seed value displayed is not the one that produced the image just saved; that image was made from the previous seed value. I’ve found that when switching from fixed to randomize, hitting “queue prompt” will produce a new seed but not an image. So if you want to save the seed value, you can toggle to randomize, generate a new seed, then switch back to fixed before creating an image.
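
The “after” part is easier to see in a toy model. The Python sketch below is not ComfyUI’s code; `SeedWidget`, `fake_image`, and the 32-bit seed range are assumptions made purely for illustration. It only mimics the order of events: use the displayed seed, then change it.

```python
import random


class SeedWidget:
    """Toy model of the sampler's seed field; not ComfyUI's actual code."""

    def __init__(self, seed: int, mode: str = "randomize") -> None:
        self.seed = seed
        self.mode = mode  # 'fixed', 'increment', 'decrement', or 'randomize'

    def queue_prompt(self) -> int:
        """Use the displayed seed for the image, then update the display."""
        used = self.seed
        if self.mode == "increment":
            self.seed += 1
        elif self.mode == "decrement":
            self.seed -= 1
        elif self.mode == "randomize":
            # Assumed range; the real widget may allow larger values.
            self.seed = random.randrange(2**32)
        # 'fixed' leaves the seed untouched.
        return used


def fake_image(seed: int) -> list:
    """Stand-in for the sampler: the same seed gives the same 'noise'."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(4)]


widget = SeedWidget(seed=1234, mode="increment")
used = widget.queue_prompt()
print(used, widget.seed)  # seed that made the image, then the seed now shown
print(fake_image(1234) == fake_image(1234))  # same seed, same result
```

With the mode on ‘fixed’, the seed that made the image and the seed on display stay the same, which is why that setting is the convenient one for comparisons.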

That is what I did to come up with my test image for the VAE. Once I had something to work with, I left it on fixed. This kept everything consistent between images so that the only difference would be the VAE’s effect on the image. To do the test, I went and acquired some more VAE modules so I’d have a wider range of data to compare. Counting the default VAE, I ended up with a total of nine. How convenient for making a grid. So I fed the same settings through all nine and here’s what came out.
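
The nine-way comparison boils down to a loop in which only the VAE changes. Here is a rough Python sketch of that plan. `plan_vae_grid`, the placeholder VAE names, and the seed 1234 are all made up for illustration; the code only builds a list of settings and does not talk to ComfyUI.

```python
def plan_vae_grid(vae_names, base_settings, cols=3):
    """Return one job per VAE, identical except for the VAE and its grid cell."""
    jobs = []
    for index, vae in enumerate(vae_names):
        job = dict(base_settings)  # copy so every job shares the same settings
        job["vae"] = vae
        job["row"], job["col"] = divmod(index, cols)
        jobs.append(job)
    return jobs


# Hypothetical settings: a fixed seed so the VAE is the only variable.
base = {"seed": 1234, "control_after_generate": "fixed"}
vaes = ["default"] + [f"vae_{n}" for n in range(1, 9)]  # nine in total

jobs = plan_vae_grid(vaes, base)
for job in jobs:
    print(job["row"], job["col"], job["vae"])
```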

So, that’s more of a hallway than a ballroom, but we’re looking at the visual effect of the VAE. And wow, two of those really scrambled the image. After poking around, I found the most probable reason: the checkpoint being used. The two VAEs that produced the distorted orange images were SDXL versions, and the checkpoint I used was not. SDXL is a variation of Stable Diffusion which may or may not be an improvement. I have a checkpoint which is set up for the XL version. I came up with a lengthy analogy to NTSC and PAL formats, but realized it might not have clarified the issue. But if I switch to that checkpoint, we get a very different grid.

So much TV static
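
One way to avoid the scrambled results is to pair a VAE only with a checkpoint from the same family. The Python sketch below assumes you keep a hand-written table of which family each VAE belongs to. The file names and the 'sdxl' / 'non-xl' labels are hypothetical, and nothing here inspects the actual model files.

```python
# Hypothetical, hand-maintained table: VAE name -> model family.
VAE_FAMILY = {
    "example_vae_a": "non-xl",
    "example_vae_b": "non-xl",
    "example_sdxl_vae": "sdxl",
}


def usable_vaes(checkpoint_family, vae_family=VAE_FAMILY):
    """List the VAEs whose family label matches the checkpoint's."""
    return [name for name, family in vae_family.items()
            if family == checkpoint_family]


print(usable_vaes("non-xl"))
print(usable_vaes("sdxl"))
```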

Fun.

All things considered, I think I’m going to default to orangemix still.

Now I still have two big topics to cover that I know of. Both involve changing an existing image in some manner. But we’ll save those for later articles.

About The Author

UnCivilServant


A premature curmudgeon and IT drone at a government agency with a well known dislike of many things popular among the Commentariat. Also fails at shilling Books

104 Comments

    • R.J.

      This is the link I was hoping for.
      I love reading these articles. It just makes me realize how old and hopelessly out of touch I am.

  1. Gender Traitor

    Any idea what caused the top left image in the second grid to look faded?

    • UnCivilServant

      Honestly, I have no clue. That’s the default vae, I don’t think it’s optimized.

  2. The Late P Brooks

    Can you have it generate an image of something which can then be 3D printed? A ray gun, for example?

    • Not Adahn

      Greetings fellow 3d printing enthusiasts!

      /totallynotanATFagent

    • UnCivilServant

      I doubt it.

      Even trying to get a floor plan for digital use didn’t work so well. And then for a 3d print, you’d need another dimension added. You could use the “shade==height” software Sensei did on this articles for printing, but that’s still not close to what you asked for.

  3. rhywun

    Is there a racism slider?

    • UnCivilServant

      Probably. I haven’t seen it, but I would not be surprised.

  4. Brochettaward

    We’re going to make Firstory today people.

  5. UnCivilServant

    Why did no one tell me the Buck Knives factory was just east of Spokane and offers tours?

    • R.J.

      You should go. That will be fun.

      • UnCivilServant

        Already added it to my itinerary.

      • anti pro state

        If you have any botany, geology, natural history interest, I would recommend a quick stop at Ginkgo Petrified Forest State Park. Fascinating gift shop/museum with lapidary samples of petrified wood from around the globe. Halfway between Spokane and Seattle where the Hwy (90) crosses the Columbia River.

      • UnCivilServant

        I’m already looking at a nine hour drive on the day where it would make the most sense to detour that direction. (Driving in to Spokane), and a three hour detour in the wrong direction when leaving Spokane just doesn’t fit 🙁

      • anti pro state

        Yup, I get that. I stumbled upon it on a leg between Glacier and Redwoods NPs.

    • juris imprudent

      Huh, I remember when they were outside of San Diego.

      • UnCivilServant

        According to their website, they moved to this factory in 2005.

      • juris imprudent

        Yeah. They had been in SoCal from their start up. It was an early pointer at where the state was going.

  6. R.J.

    This new Windows 11 PC gives me lock screen images from the Microsoft Dall-E image generator. It’s interesting.

    • UnCivilServant

      That just seems wrong.

      Is there a “No phone home” & “No AI integration” version of that OS? I really hate the idea of their software trolling through my storage and activity.

      • R.J.

        I am going to be enabling as many privacy protections as possible. I need a Windows PC, in addition to my Linux as many business tasks just need to be done in that Microsoft environment. Still it is interesting, I don’t consider that to be the most egregious issue. The pictures are benign, sci-fi landscapes.

      • UnCivilServant

        It’s a matter of principle. No data should be headed towards Microsoft unless it’s my request for one of their useless websites while looking for where they hid the configuration options that used to be easy to find.

      • Sensei

        From my reading only the enterprise version. From memory it also requires a 24/7 activation server for all the clients or they become watermarked or crippled in some way.

        My “pro” version perpetually reenables the AI Copilot at every ******* update opportunity.

        Despite the love of Linux by techies it doesn’t pass the parent test. No version I’ve ever used properly updates or installs apps without an occasional trip to shell and SUDO.

      • R.J.

        Yes. My venerable GETAC S400 updates pretty smoothly, it is an exception to the Linux rule. Most Linux systems have issues left and right which make them unsuitable for hardcore business use. Backlight won’t get off maximum brightness, Wine won’t work and let you run Windows programs correctly except for one or two, oh, don’t forget when you have to damn near go program it and use SUDO to enable stuff like your camera, which should just work with no issues.
        Hence I keep around a Windows machine, and a Mac. I need to understand all the ecosystems and be able to play in them in case somebody wants me to project manage on a different platform.

    • R.J.

      I am sorry that anybody voted for that. Is the government now going to demand all large platforms divest themselves of unlikable owners, and allow controlled US buyers to run them instead?

      • R.J.

        Also I hate being forced into siding with China. But this is wrong. Just as wrong as the judge in the NY Trump case demanding a fine larger than anyone can bond or conceivably pay to prevent appeal. It is rotten and it stinks.

  7. ron73440

    Sorry to do this to your article UCS, but there will be no Stoic Friday this week, my mom’s cancer has suddenly gotten worse and my stepdad says we are looking at days left, so I am running home on short notice.

    Should return to your regularly scheduled programming next week.

    • R.J.

      Oh my. I wish you some good moments with your mom.

      • Sensei

        + 1 here.

    • Certified Public Asshat

      sorry Ron, hope she goes peacefully.

    • Sean

      Sorry, Ron.

    • The Other Kevin

      So sorry to hear that. Take care of your family. We can survive a post with the picture of the tipsy camera man.

    • Gender Traitor

      I’m so sorry, Ron! BTW, you DO know it’s okay NOT to be stoic about some things, right? Please take care of yourself! ::hug::

    • Zwak says the real is not governable, but self-governing.

      All the best, Ron.

    • Ownbestenemy

      All the best ron.

    • Fourscore

      Include me in what the others have said better. Drive carefully

    • Brochettaward

      You are one of the good ones, Ron. Sorry to hear about that. Enjoy the time you have the best you can.

    • Grumbletarian

      Damn, that bites. Best wishes to you and Mama Ron.

  8. Certified Public Asshat

    shapely female

    Thicc?

    • R.J.

      I wonder how the AI filter would react to that word?

      • UnCivilServant

        “thicc lady in gown in ballroom, masterpiece, best quality, sleek, highly detailed, digital painting, realistic digital painting, detailed digital painting, smooth gradients, caucasian, knight, armor, futuristic, cyberpunk, sleek, highly detailed, anime, indoors, ballroom, soft indoor lighting, day, digital painting, 1man:1.5, 1woman:1.5, muscular male, shapely female, duo, couple, redhead, green eyes, looking at viewer, armor, realistic, highres, smooth gradients, detailed face, realistic skin tone, youthful, strong, fantasy art, youthful, strong, full body” Gives me: This image

      • Gender Traitor

        Trying to figure out what that is on the lower center of her skirt… which I suspect none of you guys noticed. 😉

      • UnCivilServant

        I saw it, I have no idea what it’s supposed to be.

      • kinnath

        I feel . . . . . called out somehow.

      • UnCivilServant

        Speaking of, if you still have the same number, I can text you when I reach Cedar Rapids on Aug 28th.

      • kinnath

        Same number as before. Looking forward to meeting again.

      • Suthenboy

        Well…i…uh….I didn’t have to figure it out. There are some shapes/designs whatever in artwork, stylization that are nearly ubiquitous in all cultures. That is one of them and for good reason. Actually the two that popped in my head immediately are also nearly interchangeable.

      • Certified Public Asshat

        Oh God, she looks like…I forget her name. The girl Count Potato always posted from the Daily Mail.

      • UnCivilServant

        Ms Rose? (Took me a few minutes to dredge the name up)

      • Certified Public Asshat

        Ah yes, Demi Rose.

      • Gender Traitor

        Haven’t heard about her in quite a while. Her 15 minutes must be up.

        I didn’t think the image came out all that “thicc,” but I’m sure I’m not the best judge.

      • rhywun

        That was the longest 15 minutes ever.

      • Ownbestenemy

        Well have you SEEN the size of her ass? /applies some orbital mechanics math

      • Sean

        LOL! That’s where my mind went too.

    • Ownbestenemy

      Surprised the media isn’t running headlines like “SCOTUS Allows Terrorist to Fly”

      • Sean

        And suggest that all Muslims are terrorists?

      • Ownbestenemy

        We know they are more than willing to skewer [insert group here] for the grander narrative.

      • R.J.

        They haven’t got a leg to stand on anymore what with all the free flights to unknown aliens Biden gives away.

      • R.J.

        The ruling is to allow his case against the government to proceed, not necessarily a strike down of the list, if I read this correctly.

      • Ownbestenemy

        Correct. They (the Government) tried to weasel their way out of it when challenged and dropped him off the list and did the oh so magical ‘its moot’ now defense. At least this time SCOTUS isn’t buying dubious doctrine.

      • juris imprudent

        It’s moot, right? No? OK, how about national security? No.

        Damn.

      • creech

        Why can’t they fly? Sounds like an acknowledgement that TSA security is more theater than effective.

    • Sensei

      Bonus from the 9th no less…

      In 2016, the government told Fikre he had been dropped from the No Fly List and his lawsuit was moot, court filings show. The district court agreed, but the 9th U.S. Circuit Court of Appeals reversed, leading the government to appeal to the Supreme Court.

  9. Fatty Bolger

    One bag to rule them all.

    This girl reads this label in Galadriel’s voice from Lord of the Rings and it is absolutely HILARIOUS 😭🤣

    • Ownbestenemy

      Ha! Nicely done

  10. Sean

    https://www.phillyburbs.com/story/news/local/2024/03/18/andre-gordon-bucks-county-murder-falls-township-shooting-spree-trenton-nj-taylor-daniel-karen-gordon/73018123007/

    When Gordon was in custody, Trenton police searched the Phillips Avenue home, where they found a multi-caliber semi-automatic assault rifle, which New Jersey authorities described as a “ghost gun.” Police believe it’s the weapon that Gordon used in the carjacking in New Jersey and the murders and carjacking in Bucks County.

    A ghost gun is an unregulated, untraceable and unserialized firearm that is the fastest-growing gun safety issue in the United States, according to Everytown for Gun Safety. The weapons are described as homemade DIY guns that can be made using building blocks that can be purchased without a background check.

    If the lower was marked multi-caliber, was it really “a ghost gun” built with an 80% lower?

    I suspect not. Liars gonna lie.

    • UnCivilServant

      So, it takes multiple magazines to feed more than one barrel?

    • Brochettaward

      Ghost guns are far scarier than normal guns.

    • Fatty Bolger

      Damned ghost guns. A gun with serial numbers would have refused to shoot those people.

    • OBJ FRANKELSON

      By multi-caliber do they mean that amazing technology that allows you to shoot .223 and 5.56mm or 7.62mm/.308?

      • kinnath

        I have both of those kinds of magic rifles.

      • OBJ FRANKELSON

        (I know there are some differences in these loads, but you know what I mean)

      • UnCivilServant

        There is a difference, but they’re close enough that it can be managed.

      • Not Adahn

        .223 Wylde!

      • Necron 99

        .22 short, .22 long, and .22 long rifle.

        Love my Marlin.

    • Suthenboy

      I think that is worse than ‘the thing that goes up’.

    • Not Adahn

      Police believe it’s the weapon that Gordon used in the carjacking in New Jersey and the murders and carjacking in Bucks County.

      Unfortunately they can’t prove it because it’s a ghost gun, right?

      • Sean

        And he’ll then beat the charges on a technicality!

  11. Brochettaward

    In a story that will shock nobody, another left wing institution refuses to negotiate with a union.

    Dartmouth has refused to negotiate with its unionized men’s basketball team, via the Associated Press.

    The school believes that athletes are not employees. It is refusing to negotiate as part of the legal dance that will result in the question being resolved in court.

    The regional director of the National Labor Relations Board determined that the basketball players are employees. Dartmouth’s status as a private institution keeps it within the NLRB’s jurisdiction.

    https://www.nbcsports.com/nfl/profootballtalk/rumor-mill/news/dartmouth-refuses-to-negotiate-with-unionized-basketball-players

    • creech

      Like who would care if the team went on strike?

  12. The Late P Brooks

    according to Everytown for Gun Safety

    In other words, preposterous hogwash.

  13. Sean

    On topic (sorta):

    AI gun babe

    • Not Adahn

      Needs to be integrated into the charging handle too tho.

    • kinnath

      Freckles and proper technique. I love it.

      • Not Adahn

        An actor on Warrior (set in 1880 San Francisco) was using modern trigger discipline on a revolver. That particular anachronism leapt out at me.

      • kinnath

        I enjoyed that show a lot. I hope they get to do a fourth season.

        But, yeah, lots of anachronisms throughout the series.

      • Not Adahn

        Kung fu and naked Asian chicks. What’s not to like?

      • kinnath

        not enough naked Asian chicks?

  14. Sensei

    Paywalled so run through the site of your choice or turn off JavaScript.

    “This is so part of my core, my soul, my neshama,” Mr. Schumer said in an interview, using the Hebrew word for soul. “I said to myself, ‘This may hurt me politically; this may help me politically.’ I couldn’t look myself in the mirror if I didn’t do it.”

    That’s our selfless, never in the limelight Chuck Schumer. We’re lucky to have him!

    ‘Part of My Core’: How Schumer Decided to Speak Out Against Netanyahu
    https://www.nytimes.com/2024/03/19/us/politics/schumer-israel-netanyahu-gaza.html

    • Suthenboy

      So, repeating the holocaust – exterminating Jews and utterly destroying Israel or any haven for them is in his soul.
      That’s good to know.

      • Sensei

        Chuck Schumer doesn’t decide what to eat for breakfast until he’s done the political calculation of the menu.

    • Sean

      What a putz.

      • Toxteth O'Grady

        It’s a shande.

  15. kinnath

    The Drinker gave Dune 2 a generally positive review. I am looking forward to seeing it when it starts streaming.

    • R C Dean

      Yeah, I guess I’ll have to watch the first one.