Runs the official Stable Diffusion release on Huggingface in a GPU-accelerated Docker container.

./build.sh run 'An impressionist painting of a parakeet eating spaghetti in the desert'
./build.sh run --image parakeet_eating_spaghetti.png --strength 0.6 'Abstract art'

By default, the pipeline uses the full model and weights, which requires a
CUDA-capable GPU with 8GB+ of VRAM. It should take a few seconds to create one image.
On less powerful GPUs you may need to modify some of the options; see the
Examples section for more details. If you lack a suitable GPU you
can set the option --device cpu instead.
Since it uses the official model, you will need to create a user access token
in your Huggingface account. Save the
user access token in a file called token.txt and make sure it is available
when building the container.
The pipeline is managed using a single build.sh script. You must
build the image before it can be run.
Make sure your user access token is saved in a file called
token.txt. The token content should begin with hf_...
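Because a missing or malformed token only shows up as a failure later, it may help to check the file first. The following is a minimal sketch, assuming a hypothetical helper named check_token that is not part of build.sh; it only verifies that the file is non-empty and that its content begins with hf_.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build.sh): sanity-check the token file
# before starting a build.
check_token() {
    local file="${1:-token.txt}"
    if [ ! -s "$file" ]; then
        echo "error: $file is missing or empty" >&2
        return 1
    fi
    local token
    # Strip whitespace and newlines that an editor may have added.
    token="$(tr -d '[:space:]' < "$file")"
    case "$token" in
        hf_?*) return 0 ;;
        *)
            echo "error: token in $file does not begin with hf_" >&2
            return 1
            ;;
    esac
}

# Example: check_token token.txt && ./build.sh build
```

This does not confirm that the token is valid on Huggingface, only that it has the expected shape.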
To build:
./build.sh build  # or just ./build.sh

To run:
./build.sh run 'Andromeda galaxy in a bottle'

First, copy an image to the input folder. Next, to run:
./build.sh run --image image.png 'Andromeda galaxy in a bottle'

First, copy an image and an image mask to the input folder. White areas of the
mask will be diffused and black areas will be kept untouched. Next, to run:
./build.sh run --model 'runwayml/stable-diffusion-inpainting' \
--image image.png --mask mask.png 'Andromeda galaxy in a bottle'

Some of the options from txt2img.py
are implemented for compatibility:
- --prompt [PROMPT]: the prompt to render into an image
- --n_samples [N_SAMPLES]: number of images to create per run (default 1)
- --n_iter [N_ITER]: number of times to run pipeline (default 1)
- --H [H]: image height in pixels (default 512, must be divisible by 64)
- --W [W]: image width in pixels (default 512, must be divisible by 64)
- --scale [SCALE]: unconditional guidance scale (default 7.5)
- --seed [SEED]: RNG seed for repeatability (default is a random seed)
- --ddim_steps [DDIM_STEPS]: number of sampling steps (default 50)
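Since --H and --W must be divisible by 64, an arbitrary target size needs to be snapped to a valid value first. Here is a small sketch, assuming a hypothetical helper named round_to_64 (not part of build.sh) that rounds down and uses 64 as an assumed lower bound.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build.sh): round a pixel size down to the
# nearest multiple of 64. The floor of 64 is an assumption for illustration.
round_to_64() {
    local size="$1"
    # Integer division discards the remainder, so this rounds down.
    local rounded=$(( size / 64 * 64 ))
    if [ "$rounded" -lt 64 ]; then
        rounded=64
    fi
    echo "$rounded"
}

# Example: ./build.sh run --W "$(round_to_64 700)" --H "$(round_to_64 500)" 'abstract art'
```

With the values in the example comment, the command would receive --W 640 and --H 448.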
Other options:
- --attention-slicing: use less memory at the expense of inference speed (default is no attention slicing)
- --device [DEVICE]: the cpu or cuda device to use to render images (default cuda)
- --half: use float16 tensors instead of float32 (default float32)
- --image [IMAGE]: the input image to use for image-to-image diffusion (default None)
- --mask [MASK]: the input mask to use for diffusion inpainting (default None)
- --model [MODEL]: the model used to render images (default is CompVis/stable-diffusion-v1-4)
- --negative-prompt [NEGATIVE_PROMPT]: the prompt to not render into an image (default None)
- --skip: skip safety checker (default is the safety checker is on)
- --strength [STRENGTH]: diffusion strength to apply to the input image (default 0.75)
- --token [TOKEN]: specify a Huggingface user access token at the command line instead of reading it from a file (default is a file)
These two commands are identical:
./build.sh run 'abstract art'
./build.sh run --prompt 'abstract art'

Set the seed to 42:
./build.sh run --seed 42 'abstract art'

Options can be combined:
./build.sh run --scale 7.0 --seed 42 'abstract art'

On systems with less than 8GB of GPU RAM, you can try mixing and matching options:
- Make images smaller than 512x512 using --W and --H to decrease memory use and increase image creation speed
- Use --half to decrease memory use but slightly decrease image quality
- Use --attention-slicing to decrease memory use but also decrease image creation speed
- Decrease the number of samples and increase the number of iterations with --n_samples and --n_iter to decrease overall memory use
- Skip the safety checker with --skip to run less code
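The memory-saving options above can also be chosen by a small wrapper. The sketch below assumes a hypothetical helper named low_memory_flags (not part of build.sh); the MiB thresholds are illustrative guesses rather than measured requirements, so adjust them for your GPU.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build.sh): print suggested build.sh flags
# for a given amount of GPU memory in MiB. All thresholds are assumptions.
low_memory_flags() {
    local vram_mib="$1"
    if [ "$vram_mib" -ge 8000 ]; then
        echo ""   # roughly 8GB or more: the defaults should fit
    elif [ "$vram_mib" -ge 6000 ]; then
        echo "--half"
    elif [ "$vram_mib" -ge 4000 ]; then
        echo "--half --attention-slicing"
    else
        echo "--half --attention-slicing --skip --W 256 --H 256"
    fi
}

# Example (leave the substitution unquoted so the flags split into words):
#   vram="$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)"
#   ./build.sh run $(low_memory_flags "$vram") 'abstract art'
```

Each tier only adds flags, so you can start from the suggestion and remove options one at a time to see which ones your GPU actually needs.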
./build.sh run --W 256 --H 256 --half --attention-slicing --skip --prompt 'abstract art'

On Windows, if you aren't using WSL2 and instead use MSYS, MinGW, or Git Bash,
prefix your commands with MSYS_NO_PATHCONV=1 (or export it beforehand):
MSYS_NO_PATHCONV=1 ./build.sh run --half --prompt 'abstract art'

The model and other files are cached in a volume called huggingface.
The images are saved as PNGs in the output folder, with filenames based on the prompt text. The
build.sh script creates and mounts this folder as a volume in the container.
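For the Windows shells mentioned earlier (MSYS, MinGW, or Git Bash), the MSYS_NO_PATHCONV export can be made conditional. This is a rough sketch, assuming a hypothetical helper named is_msys_like and assuming these shells report a kernel name starting with MSYS or MINGW from uname -s.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build.sh): succeed when the given
# `uname -s` string looks like MSYS, MinGW, or Git Bash.
is_msys_like() {
    case "$1" in
        MSYS*|MINGW*) return 0 ;;
        *) return 1 ;;
    esac
}

# Export the variable once so later ./build.sh calls in this shell inherit it.
if is_msys_like "$(uname -s)"; then
    export MSYS_NO_PATHCONV=1
fi
```

On Linux, macOS, or WSL2 the check does nothing, so the same snippet can sit in a shared shell profile.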



