Image to image generation
Learn the img2img tab and how to use the image to image functions
Image to image is the controlled transformation of a source image to match the characteristics of a target image or a target image domain. It is most commonly used for inpainting and for turning one picture into the style of another. In this guide we will break down the various settings of the img2img functionality in the Automatic1111 interface for Stable Diffusion.
The img2img tab
One of the most basic features of Stable Diffusion is image to image. It enables you to generate images that follow the composition of a base image. For example, let's say I have an image of Bruce Lee showing off his kicks: I'd first upload the image to the img2img drag-and-drop box.
Step 1: Upload the image to the img2img box
A photorealistic illustration of Mario
For example, if you used the prompt "An anime illustration of Dwayne Johnson" with the Bruce Lee picture at 512x512, you might get:
"An anime illustration of Dwayne Johnson"
If I adjust the image settings and widen the width, for example to 712px by 512px, I might get something that more closely resembles the original Bruce Lee pose:
Adjusting the generated image size to resemble the original uploaded image will increase the resemblance
Adjusting the denoising strength will also affect the image you get. The lower the denoising value, the fewer changes are made to the original image (i.e. it retains more of the original image's features). With the same prompt as above but a denoising value of 0.4, for example, you might get:
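If you prefer to drive these settings from code rather than the UI, Automatic1111 also exposes a local HTTP API. Below is a minimal sketch of building an img2img request; the field names and the `/sdapi/v1/img2img` endpoint are assumptions based on the web UI's built-in API, so verify them against the `/docs` page of your own install:

```python
import base64


def build_img2img_payload(image_bytes, prompt, denoising_strength=0.75,
                          width=512, height=512):
    """Build a JSON payload for Automatic1111's /sdapi/v1/img2img endpoint.

    Field names follow the web UI's built-in API; check the /docs page of
    your local install if your version differs.
    """
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "init_images": [encoded],                  # base64-encoded source image
        "prompt": prompt,
        "denoising_strength": denoising_strength,  # lower = closer to the source
        "width": width,
        "height": height,
    }


# The payload would then be POSTed to http://127.0.0.1:7860/sdapi/v1/img2img
payload = build_img2img_payload(
    b"...raw PNG bytes...",
    "An anime illustration of Dwayne Johnson",
    denoising_strength=0.4,
)
```

Dropping `denoising_strength` to 0.4 in the payload has the same effect as moving the slider in the UI: the result stays closer to the uploaded picture.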
Prompt: "An anime illustration of Dwayne Johnson" with denoising of 0.4
There are also a few default tabs which one can pick from for resizing. In the resize mode panel:
Resize mode panel
Just resize scales the input image to fit the new image dimensions by stretching or squeezing the image.

Crop and resize fits the new image canvas into the input image. Parts that don't fit are removed. The aspect ratio of the original image will be preserved.

Resize and fill fits the input image into the new image canvas. The aspect ratio will be preserved.

Just resize (latent upscale) is similar to "Just resize" but scaling is done in the latent space. Use a denoising strength larger than 0.5 to avoid blurry images.
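When scripting against the API, these four UI labels map to an integer field. The mapping below is a sketch; the `resize_mode` field name and its code values are assumptions to verify against your install's `/docs` page:

```python
# Sketch of a lookup from the resize-mode labels above to the integer codes
# the Automatic1111 img2img API expects in its "resize_mode" field.
# The exact code values are an assumption; verify against your install.
RESIZE_MODES = {
    "just resize": 0,
    "crop and resize": 1,
    "resize and fill": 2,
    "just resize (latent upscale)": 3,
}


def resize_mode_code(label):
    """Return the API integer for a UI resize-mode label (case-insensitive)."""
    return RESIZE_MODES[label.strip().lower()]
```

For example, `resize_mode_code("Crop and resize")` would return `1` under this assumed mapping.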
Besides uploading an image, you could also sketch an initial picture. You can draw this directly on the Automatic1111 interface or upload it. For example, if you sketched a stick figure like so:
A stick figure drawn on the Automatic1111 interface
You could then add a prompt into the prompt input field like so:
"Buzz lightyear flying across space"
Resulting in an image like this:
Steps: 50, Sampler: Euler a, CFG scale: 7, Seed: 2946328922, Size: 512x512. Denoising strength: 0.75, Mask blur: 4
Note that the closer your sketch is to the final design (including colours), the more likely the generated image will resemble it. You also don't have to draw something from scratch: you can use the sketch functionality to modify an existing image. For example, I used a prompt to generate a plushy toy of Deadpool.
"a beautiful photo of a plushy deadpool teddy bear, trending on instagram"
Plushy toy of Deadpool. Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 1261681766.
With the sketch function, I could remove the teddy bear ears that were generated. By drawing on top of them, I could then ask Stable Diffusion to generate a similar image without the ears, or with the ears replaced by something else.
Using the sketch functionality to get rid of the teddy bear ears on the Deadpool plushy.
Inpainting is one of the most popular image-editing techniques in Stable Diffusion for 'restoring missing parts of pictures'. You can use inpainting to regenerate a part of an image where you would prefer something other than what was originally generated. This is a useful feature both for removing objects and for replacing parts of an image with something else (e.g. faces). As an example, let's say I have an image of a Spongebob-like character.
Prompt: "polaroid photo of bikini bottom". Seed: 4260603284.
With inpainting, if I wished to remove the wacky arms that Stable Diffusion generated, I could use the mask tool to draw over them. Without entering any prompt, but keeping the seed and adjusting the denoising strength (here 0.6), I got the results below.
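The same inpainting step can be sketched as an API payload. Everything here follows the pattern above; the `mask`, `mask_blur`, and other field names are assumptions based on the web UI API, and in the mask image, white pixels conventionally mark the region to regenerate:

```python
import base64


def build_inpaint_payload(image_bytes, mask_bytes, prompt="",
                          seed=-1, denoising_strength=0.6, mask_blur=4):
    """Sketch of an inpainting payload for /sdapi/v1/img2img.

    Field names are assumptions based on the Automatic1111 web UI API;
    white pixels in the mask mark the region to regenerate.
    """
    def b64(data):
        return base64.b64encode(data).decode("utf-8")

    return {
        "init_images": [b64(image_bytes)],
        "mask": b64(mask_bytes),
        "prompt": prompt,                 # can stay empty, as in the example above
        "seed": seed,                     # fix the seed to reproduce the rest of the image
        "denoising_strength": denoising_strength,
        "mask_blur": mask_blur,           # softens the edge of the masked region
    }


payload = build_inpaint_payload(b"...image...", b"...mask...",
                                seed=4260603284, denoising_strength=0.6)
```

Keeping the seed fixed while leaving the prompt empty mirrors the Spongebob example: only the masked arms are regenerated, and the rest of the image stays recognisable.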
Inpainting the wacky arms of the Spongebob-like character
Inpaint sketch combines inpainting and sketch. It enables you to paint as in the sketch function, but only the painted area is regenerated; the unpainted areas are unchanged. Below are some examples:
Inpaint sketching, say, a turquoise shawl.
Results from the inpaint sketch of the image above.
The inpaint upload tab is similar to inpaint, but it enables you to 'paint' (edit) the image in another program and then upload it to the Automatic1111 interface. For example, if you wish to work in Photoshop or Figma, you can download the image, edit it, and then upload both the original and the 'masked' re-worked version before re-generating the image.
The batch tab enables you to inpaint or perform image-to-image on multiple images. You would need to upload the files into the provided directory to batch the image tasks.
Interrogate CLIP and Interrogate DeepBooru are useful for getting prompt inspiration. They won't recreate the exact prompt used to make an image, but if you upload an image in the img2img tab and click the 'Interrogate CLIP' button, you will get a prompt describing the image.
Interrogate CLIP for prompts to an image
The main difference between CLIP and DeepBooru is that the DeepBooru model is optimised for generating prompts for anime images.
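Interrogation is also reachable from code. The sketch below builds a request body for a local install; the `/sdapi/v1/interrogate` endpoint path and the `"clip"`/`"deepdanbooru"` model names are assumptions to verify against your install's `/docs` page:

```python
import base64
import json


def build_interrogate_payload(image_bytes, model="clip"):
    """Sketch of a payload for Automatic1111's interrogate endpoint.

    model is assumed to be "clip" for Interrogate CLIP and "deepdanbooru"
    for Interrogate DeepBooru on a default install.
    """
    return {
        "image": base64.b64encode(image_bytes).decode("utf-8"),
        "model": model,
    }


body = json.dumps(build_interrogate_payload(b"...png bytes...", model="deepdanbooru"))
# POST body to http://127.0.0.1:7860/sdapi/v1/interrogate; the response
# should contain a "caption" field with the suggested prompt.
```

Swapping the `model` value switches between the two interrogators, matching the two buttons in the UI.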