Image to image generation

Learn the img2img tab and how to use its image to image functions

Image to image is the controlled transformation of a source image to match the characteristics of a target image or a target image domain. It is most commonly used for inpainting and for making one picture resemble another. This guide breaks down the various img2img settings in the Automatic1111 interface for Stable Diffusion.

Image to image

One of the most basic features of Stable Diffusion is image to image. It lets you generate images that follow the composition of a base image. For example, let's say I have an image of Bruce Lee showing off his kicks: I'd first upload the image to the img2img drag and drop box.

With the image uploaded, you can then adjust the sampler, denoising strength and other fields. Type what you want to see into the prompt field, e.g. "A photorealistic illustration of Mario", and voila:

Similarly, if you used "An anime illustration of Dwayne Johnson" with the Bruce Lee picture at 512x512, you might get:

"An anime illustration of Dwayne Johnson"

If I adjust the image settings and widen the output, for example to 712px by 512px, I might get something that more closely resembles the original Bruce Lee pose:

Adjusting the denoising strength also affects the image you get. The lower the denoising value, the fewer changes are made to the original image (i.e. the result retains more of the original image's features). With the same prompt as above but a denoising value of 0.4, for example, you might get:

There are also a few resize options to pick from. In the resize mode panel:

Just resize: scales the input image to fit the new image dimension by stretching or squeezing the image.

Crop and resize: fits the new image canvas into the input image. Parts that don’t fit are removed. The aspect ratio of the original image will be preserved.

Resize and fill: fits the input image into the new image canvas. The aspect ratio will be preserved.

Just resize (latent upscale): similar to "Just resize" but scaling is done in the latent space. Use a denoising strength larger than 0.5 to avoid blurry images.
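The difference between the first three resize modes comes down to how the scale factor is chosen. Here is a minimal Python sketch of that arithmetic. The function name and mode strings are made up for illustration and are not taken from the Automatic1111 code, which may round or position the image differently.

```python
def resize_plan(src_w, src_h, dst_w, dst_h, mode):
    """Return (scaled_w, scaled_h, note) for one resize mode.

    Illustrative only: mode names are hypothetical labels for the
    options in the resize mode panel.
    """
    if mode == "just_resize":
        # Each axis is scaled independently, so the aspect ratio can change.
        return dst_w, dst_h, "stretched or squeezed to fit"
    if mode == "crop_and_resize":
        # Scale until the image covers the whole canvas, then trim the overflow.
        scale = max(dst_w / src_w, dst_h / src_h)
        note = "overflow is cropped"
    elif mode == "resize_and_fill":
        # Scale until the image fits inside the canvas; the rest must be filled.
        scale = min(dst_w / src_w, dst_h / src_h)
        note = "remaining canvas is filled"
    else:
        raise ValueError(f"unknown mode: {mode}")
    return round(src_w * scale), round(src_h * scale), note


if __name__ == "__main__":
    for mode in ("just_resize", "crop_and_resize", "resize_and_fill"):
        print(mode, resize_plan(512, 512, 712, 512, mode))
```

For a 512x512 source and a 712x512 canvas, crop and resize scales the image to 712x712 and trims the extra height, while resize and fill keeps it at 512x512 and leaves 200px of width to be filled.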

Sketch

Besides uploading an image, you could also sketch an initial picture. You can draw this directly on the Automatic1111 interface or upload it. For example, if you sketched a stick figure like so:

You could then add a prompt into the prompt input field like so:

"Buzz lightyear flying across space"

Resulting in an image like this:

Note that the more closely you sketch what you want, colours included, the more likely the generated image is to resemble your design. You also don’t have to draw something from scratch: you could use the sketch functionality to modify an existing image. For example, say I used this prompt to generate a plushy toy of Deadpool:

"a beautiful photo of a plushy deadpool teddy bear, trending on instagram"

With the sketch function, I could remove the teddy bear ears that were generated. By drawing on top of them, I could ask Stable Diffusion to generate a similar image without the ears, or with the ears replaced by something else.

Inpainting

Inpainting is one of the most popular Stable Diffusion techniques for 'restoring missing parts of pictures'. You can use it to regenerate a part of an image where you would prefer something other than what was originally generated. This makes it useful for removing objects, replacing what has been drawn with something else (e.g. faces) and so forth. As an example, let's say I have an image of a Spongebob-like character.

With inpainting, if I wished to remove the wacky arms that Stable Diffusion has generated, I could use the mask tool to draw over them. Without entering any prompt, but keeping the seed and adjusting the denoising (here 0.6), I could then get the results below.

Inpaint sketch

Inpaint sketch combines inpainting and sketch. It enables you to paint like in the sketch function but only regenerates the painted area. The unpainted areas are unchanged. Below are some examples:

Inpaint Upload

Inpaint upload is similar to inpaint, but it enables you to 'paint' (edit) the image in another program and then upload it to the Automatic1111 interface. For example, if you wish to work on it in Photoshop or Figma, you could download the image, work on it, and then upload both the original and the 'masked' re-worked version before regenerating the image.
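The masking idea shared by the inpaint tabs can be sketched in a few lines of Python. This is a simplification under stated assumptions: images are plain nested lists of pixel values, white (255) in the mask marks the region to regenerate, and the helper names are invented for illustration. The real pipeline also blurs the mask edge and lets the model see the surrounding context, which this sketch ignores.

```python
def rect_mask(width, height, box):
    """Build a mask as rows of 0/255 values.

    box is (left, top, right, bottom); pixels inside it are set to 255,
    which this sketch treats as "regenerate here".
    """
    left, top, right, bottom = box
    return [
        [255 if left <= x < right and top <= y < bottom else 0
         for x in range(width)]
        for y in range(height)
    ]


def composite(original, generated, mask):
    """Use generated pixels where the mask is white, original pixels elsewhere."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


if __name__ == "__main__":
    original = [[1, 1, 1], [1, 1, 1]]
    generated = [[9, 9, 9], [9, 9, 9]]
    mask = rect_mask(3, 2, (1, 0, 2, 2))  # mask only the middle column
    print(composite(original, generated, mask))
```

Only the masked column takes values from the generated image, and every unmasked pixel is copied from the original unchanged.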

Batch

The batch tab enables you to inpaint or perform image-to-image on multiple images. You would need to upload the files into the provided directory to batch the image tasks.
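The same loop can also be scripted. The sketch below walks a directory and builds one request body per image. It assumes the web UI's optional HTTP API (started with the `--api` flag) and its `/sdapi/v1/img2img` endpoint, and the field names `init_images`, `prompt` and `denoising_strength` should be checked against the API docs of your installed version. The function name and the list of suffixes are illustrative choices.

```python
import base64
from pathlib import Path

# Assumed set of extensions to treat as images; adjust as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def batch_payloads(input_dir, prompt, denoising_strength=0.4):
    """Yield (file name, request payload) for every image file in input_dir."""
    for path in sorted(Path(input_dir).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        yield path.name, {
            "init_images": [encoded],  # base64-encoded source image
            "prompt": prompt,
            "denoising_strength": denoising_strength,
        }
```

Each payload would then be sent as JSON in a POST request to the img2img endpoint. Nothing is sent here, so the sketch can be run without a web UI instance.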

Get prompts for an image

Interrogate CLIP and Interrogate DeepBooru are useful for getting prompt inspiration. They don't recreate the exact prompt used for an image, but if you upload an image in the img2img tab and click the 'Interrogate CLIP' button, you will get a prompt describing the image.

The main difference between CLIP and DeepBooru is that the DeepBooru model is better optimised for giving prompts for anime images.
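Interrogation can also be requested from a script. The sketch below only builds the request. It assumes the web UI was started with the `--api` flag, that it listens on the default local address, and that the `/sdapi/v1/interrogate` endpoint accepts `image` and `model` fields with `clip` or `deepdanbooru` as model names. Verify these against your installed version before relying on them.

```python
import base64
import json
import urllib.request


def interrogate_request(image_bytes, model="clip",
                        base_url="http://127.0.0.1:7860"):
    """Build (but do not send) a request asking the web UI to describe an image."""
    if model not in ("clip", "deepdanbooru"):
        raise ValueError("model must be 'clip' or 'deepdanbooru'")
    body = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "model": model,  # assumed field name and values
    }
    return urllib.request.Request(
        f"{base_url}/sdapi/v1/interrogate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending it requires a running web UI, so it is left commented out:
# with urllib.request.urlopen(interrogate_request(data)) as resp:
#     print(json.load(resp))  # the reply is expected to contain the caption
```

Switching `model` to `"deepdanbooru"` asks for the anime-oriented tagger instead of CLIP.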
