Guidance for Asynchronous Image Generation with Stable Diffusion on AWS

Summary: This implementation guide provides an overview of Guidance for Asynchronous Image Generation with Stable Diffusion on AWS, its reference architecture and components, considerations for planning the deployment, and configuration steps for deploying the guidance to Amazon Web Services (AWS). This guide is intended for solutions architects, business decision makers, DevOps engineers, data scientists, and cloud professionals who want to implement 'Guidance for Asynchronous Image Generation with Stable Diffusion on AWS' in their environment.

Overview

This guidance shows how to implement fast-scaling, low-cost Stable Diffusion inference on AWS using serverless and container technologies.

Stable Diffusion is a popular open-source project for generating images using generative artificial intelligence (AI). Building a scalable and cost-efficient inference solution is a common challenge for AWS customers. This guide shows how to use serverless and container services to build an end-to-end, low-cost, and fast-scaling asynchronous image generation architecture. The GitHub repository contains the sample code and AWS Cloud Development Kit (AWS CDK) deployment scripts, helping AWS customers deploy this guidance in a few steps.

Features and benefits

The Guidance for Asynchronous Image Generation with Stable Diffusion on AWS provides the following features:

Event-driven architecture
Autoscaling of pods based on queue length using KEDA
Automatic scaling of Amazon EKS compute nodes (Amazon EC2 instances) using Karpenter
New inference EKS compute nodes provisioned within 2 minutes
Up to 70% cost savings using GPU Spot instances for compute nodes
Support for multiple community Stable Diffusion runtimes

Use cases

Stable Diffusion represents a notable improvement in text-to-image model generation. It’s broadly available and needs significantly less processing power than many other text-to-image models. Some of its capabilities include text-to-image, image-to-image, graphic artwork, and image editing.

As a powerful text-to-image generation model, Stable Diffusion has a wide range of applications, including:

Artistic creation: Stable Diffusion can automatically generate high-quality artworks such as paintings, illustrations, and concept art based on text descriptions, providing inspiration and assistance to artists.
Game design: Stable Diffusion can be used to quickly create game assets such as characters, scenes, and props, accelerating prototyping and game content iteration.
Image editing: Stable Diffusion has capabilities for image denoising, super-resolution reconstruction, style transfer, and other tasks.
E-commerce operations: Stable Diffusion can generate visual assets like product description images, reducing operational costs and improving image production efficiency for e-commerce platforms.

This guide provides an architecture diagram and step-by-step instructions for running Stable Diffusion inference tasks at scale on Amazon EKS. When configured, this guidance can perform the following tasks:

Text-to-Image: Generate images based on the provided prompts and configurations.
Image-to-Image: Generate images based on provided prompts, reference images, and configurations.
Single Image Super-Resolution: Upscale the resolution of an image while preserving details as much as possible.
Pipelines: Orchestrate the above tasks and custom tasks.

Architecture overview

This guidance consists of three main components:

Serverless task scheduling and dispatching
Stable Diffusion runtime on Amazon EKS and Amazon EC2 accelerated compute instances
Management and maintenance components

Serverless task scheduling and dispatching

This component includes an API endpoint based on Amazon API Gateway and a task dispatching part based on Amazon SNS and Amazon SQS.

Users submit requests with parameters like the model and text prompt to an API endpoint integrated with AWS services through API Gateway.
Requests are validated by AWS Lambda and published to an Amazon SNS topic.
Amazon SNS publishes the requests to the corresponding SQS queue based on the runtime name specified in the request.
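
The routing step can be pictured as attribute-based filtering: each queue subscribes with a filter on the runtime name, and a message is delivered only to the subscriptions whose filter matches. The sketch below simulates that matching in plain Python. The attribute key `runtime` and the queue names are illustrative assumptions, not values taken from the guidance's source code.

```python
from typing import Dict, List

# Hypothetical subscriptions: queue name -> filter policy on message attributes.
# The attribute key "runtime" is an assumption made for this sketch.
SUBSCRIPTIONS: Dict[str, Dict[str, List[str]]] = {
    "sdruntime-queue": {"runtime": ["sdruntime"]},
    "comfyui-queue": {"runtime": ["comfyui"]},
}


def matching_queues(attributes: Dict[str, str]) -> List[str]:
    """Return the queues whose filter policy accepts the message attributes."""
    matched = []
    for queue, policy in SUBSCRIPTIONS.items():
        # Every key in the policy must be present with one of the allowed values.
        if all(attributes.get(key) in allowed for key, allowed in policy.items()):
            matched.append(queue)
    return matched


print(matching_queues({"runtime": "sdruntime"}))  # ['sdruntime-queue']
print(matching_queues({"runtime": "unknown"}))    # []
```

A request naming a runtime that has no matching queue is delivered nowhere in this model, which is one reason to validate the runtime name before publishing.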

Stable Diffusion runtime

This component includes the Stable Diffusion runtime on Amazon EKS, supporting elastic scaling based on requests.

For each runtime version, the following features apply:

During deployment, each runtime has an independent Amazon SQS queue to receive requests.
The Queue Agent receives tasks from the Amazon SQS queue and sends them to the Stable Diffusion runtime for image generation.
The generated images are stored in an Amazon S3 bucket by the Queue Agent, and a completion notification is published to an Amazon SNS topic.
When the Amazon SQS queue accumulates too many messages, KEDA scales up the runtime replicas based on the queue length, and Karpenter launches new GPU instances to host the new replicas.
When the Amazon SQS queue no longer accumulates messages, KEDA scales down the replicas, and Karpenter terminates unnecessary GPU instances to save costs.
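
As a rough mental model of the scaling behavior described above, queue-length autoscaling usually aims for a target number of pending messages per replica, so the desired replica count is the backlog divided by that target, rounded up and clamped to configured bounds. The following is a minimal sketch of that arithmetic only. The target, minimum, and maximum values are hypothetical and are not the defaults of this guidance.

```python
import math


def desired_replicas(queue_length: int, target_per_replica: int,
                     min_replicas: int = 0, max_replicas: int = 20) -> int:
    """Approximate the replica count for a given queue backlog.

    All thresholds here are hypothetical values chosen for illustration.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(queue_length / target_per_replica)
    # Clamp the result to the configured lower and upper bounds.
    return max(min_replicas, min(max_replicas, wanted))


for backlog in (0, 7, 35, 500):
    print(backlog, desired_replicas(backlog, target_per_replica=10))
```

When the replica count rises beyond what the existing nodes can host, the new pods stay pending until Karpenter provisions additional GPU instances for them.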

Architecture diagram

This section shows an architecture diagram for the components deployed with this guidance.

Figure 1: Asynchronous Image Generation with Stable Diffusion on AWS reference architecture

Architecture steps

A user or an application sends a prompt to Amazon API Gateway, which acts as the endpoint for the guidance and handles authentication. An AWS Lambda function validates the requests, publishes them to the designated Amazon Simple Notification Service (Amazon SNS) topic, and immediately returns a response.
Amazon SNS publishes the message to Amazon Simple Queue Service (Amazon SQS) queues. Each message contains a Stable Diffusion (SD) runtime name attribute and is delivered to the queue for the matching SD runtime.
In the Amazon Elastic Kubernetes Service (Amazon EKS) cluster, the previously deployed open-source Kubernetes Event-driven Autoscaling (KEDA) component scales up new pods to process the incoming messages from the SQS model processing queues.
In the Amazon EKS cluster, the previously deployed open-source Kubernetes autoscaler, Karpenter, launches new compute nodes based on GPU Amazon Elastic Compute Cloud (Amazon EC2) instances (such as g4, g5, and p4) to schedule pending pods. The instances use pre-cached SD runtime images and are based on Bottlerocket OS for fast boot. Instances can be launched with either the On-Demand or the Spot pricing model.
Stable Diffusion Runtimes load ML model files from Amazon Simple Storage Service (Amazon S3) via Mountpoint for Amazon S3 CSI Driver on runtime initialization or on demand.
Queue agents (a software component created for this guidance) receive messages from SQS model processing queues and convert them to inputs for SD Runtime API calls.
Queue agents call SD Runtime APIs, receive and decode responses, and save the generated images to designated Amazon S3 buckets.
Queue agents send notifications from the pods to the designated SNS topic. The user receives the notifications from SNS and can access the images in the S3 buckets.
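
The last three steps (receive and convert a message, call the runtime and store the image, publish a notification) can be sketched as a single handler. This is an illustrative outline under stated assumptions, not the guidance's Queue Agent: the message fields `id` and `prompt`, the base64-encoded response, the `output/` key prefix, and the in-memory stand-ins for Amazon S3 and Amazon SNS are all hypothetical.

```python
import base64
import json
from typing import Callable, Dict


def handle_message(body: str,
                   call_runtime: Callable[[Dict], str],
                   store: Dict[str, bytes],
                   notify: Callable[[Dict], None]) -> str:
    """Process one queued task: call the runtime, store the image, notify."""
    task = json.loads(body)
    # Convert the queued message into an input for the runtime API (assumed shape).
    payload = {"prompt": task["prompt"]}
    # Call the runtime and decode the response (assumed to be base64 text).
    image_bytes = base64.b64decode(call_runtime(payload))
    key = f"output/{task['id']}.png"
    store[key] = image_bytes  # stand-in for saving to the S3 bucket
    # Publish a completion notification (stand-in for the SNS topic).
    notify({"id": task["id"], "result": "completed", "output_key": key})
    return key


# In-memory stand-ins so the sketch runs without any AWS access.
fake_bucket: Dict[str, bytes] = {}
notifications = []


def fake_runtime(payload: Dict) -> str:
    return base64.b64encode(b"fake-image").decode()


key = handle_message(json.dumps({"id": "task-1", "prompt": "a red fox"}),
                     fake_runtime, fake_bucket, notifications.append)
print(key, notifications[0]["result"])
```

Passing the runtime call, storage, and notification in as parameters keeps the flow testable without a cluster, which is the only reason for that design choice here.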

AWS services in this guidance

AWS Service	Description
Amazon Simple Storage Service (Amazon S3)	Core Service - Used for storing models and generated images.
Amazon Virtual Private Cloud (Amazon VPC)	Core Service - Provides the network security layer.
Amazon Elastic Container Registry (Amazon ECR)	Used for storing container images required by the runtimes.
Amazon API Gateway	Core Service - Provides the API endpoint for external access.
AWS Lambda	Core Service - Used for request validation and routing.
Amazon Simple Queue Service (Amazon SQS)	Core Service - Used for storing pending tasks.
Amazon Simple Notification Service (Amazon SNS)	Core Service - Used for routing tasks to different SQS queues and providing completion notifications and callbacks.
Amazon Elastic Kubernetes Service (Amazon EKS)	Core Service - Used for managing and running the Stable Diffusion runtimes.
Amazon Elastic Compute Cloud (Amazon EC2)	Core Service - Used for running the Stable Diffusion runtimes.
Amazon CloudWatch	Auxiliary service - Used for monitoring system health, providing metrics, logs, and traces.
AWS Cloud Development Kit (AWS CDK)	Used for deploying and updating this guidance.

Cost

You are responsible for the cost of the AWS services used while running this guidance. As of April 2024, running this guidance with the default settings in the US West (Oregon) Region (us-west-2) for one month and generating one million images costs approximately $436.42 (excluding free tiers), which is the sum of the per-image and fixed monthly totals in the tables below.

We recommend creating a budget through AWS Cost Explorer to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this guidance.

The main services and their pricing for usage related to the number of images are listed below (per one million images):

AWS Service	Billing Dimension	Quantity per 1M Images	Unit Price [USD]	Total [USD]
Amazon EC2	g5.2xlarge instance, Spot instance per hour	416.67*	$ 0.4968	$ 207
Amazon API Gateway	Per 1M REST API requests	1	$ 3.50	$ 3.50
AWS Lambda	Per GB-second	12,500	$ 0.0000166667	$ 0.21
AWS Lambda	Per 1M requests	1	$ 0.20	$ 0.20
Amazon SNS	Per 1M requests	2	$ 0.50	$ 0.50
Amazon SNS	Data transfer per GB	7.62**	$ 0.09	$ 0.68
Amazon SQS	Per 1M requests	2	$ 0.40	$ 0.80
Amazon S3	Per 1K PUT requests	2,000	$ 0.005	$ 10.00
Amazon S3	Per GB per month	143.05***	$ 0.023	$ 3.29
Total, 1M images				$226.18

The fixed costs unrelated to the number of images, with the main services and their pricing listed below (per month):

AWS Service	Billing Dimension	Quantity per Month	Unit Price [USD]	Total [USD]
Amazon EKS	Cluster	1	$ 72.00	$ 72.00
Amazon EC2	m5.large instance, On-Demand instance per hour	1440	$ 0.0960	$ 138.24
Total, month				$210.24

* Calculated based on an average request duration of 1.5 seconds and the average Spot instance pricing across all Availability Zones in the US West (Oregon) Region from January 29, 2024, to April 28, 2024.
** Calculated based on an average request size of 16 KB.
*** Calculated based on an average image size of 150 KB, stored for 1 month.

Please note that these are estimated costs for reference only. The actual costs may vary depending on the model you use, task parameters, current Spot instance pricing, and other factors.
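
To make the arithmetic behind the estimate easier to follow, the sketch below recomputes two of the derived quantities from the stated averages (GPU hours from 1.5 seconds per request, storage from 150 KB per image) and sums the "Total [USD]" columns exactly as listed in the two tables. It is a convenience calculation only and assumes 1 GB = 1024 × 1024 KB.

```python
from decimal import Decimal

IMAGES = 1_000_000

# Quantities derived from the stated averages.
gpu_hours = IMAGES * 1.5 / 3600           # 1.5 seconds per request
storage_gb = IMAGES * 150 / 1024 / 1024   # 150 KB per image

# "Total [USD]" column of each table, copied as listed above.
per_million = [Decimal(x) for x in
               ("207", "3.50", "0.21", "0.20", "0.50", "0.68", "0.80", "10.00", "3.29")]
fixed_monthly = [Decimal("72.00"), Decimal("138.24")]

print(f"GPU hours:     {gpu_hours:.2f}")    # 416.67
print(f"Storage (GB):  {storage_gb:.2f}")   # 143.05
print(f"Per 1M images: ${sum(per_million)}")
print(f"Fixed monthly: ${sum(fixed_monthly)}")
print(f"Combined:      ${sum(per_million) + sum(fixed_monthly)}")
```

Because the GPU line dominates the variable cost, changing the average request duration or the Spot price in this calculation shifts the estimate far more than any other input.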

Security

When you build systems on AWS infrastructure, security responsibilities are shared between you and AWS. This shared responsibility model reduces your operational burden because AWS operates, manages, and controls the components, including host operating systems, the virtualization layer, and the physical security of the facilities in which the services operate. For more information about AWS security, visit AWS Cloud Security.

IAM Roles

AWS Identity and Access Management (IAM) roles allow AWS customers to assign granular access policies and permissions to AWS services and users in the cloud.

This guidance creates separate IAM roles and grants permissions for the following components:

Amazon EKS cluster, including
- Creating and operating the cluster
- Compute node groups
- Nodes created by Karpenter
- Pods running in the cluster, including
  - Karpenter
  - KEDA
  - Fluent Bit
  - Stable Diffusion runtimes
AWS Lambda functions
Amazon API Gateway

This guidance uses IAM roles for internal user access control, following the principle of least privilege, ensuring that each component can only access authorized components and maintaining application workload isolation.

Access Control

This guidance uses an API Key mechanism for external user access control, requiring users to include a valid API Key in their requests. For more information about API Keys, please refer to the API Specification.

Networking

This guidance operates within an isolated VPC by default, separate from your other workloads. If you need to connect this VPC to your existing VPC or a transit gateway, you are responsible for the gateways, firewalls, and access control.

Service Quotas

Service quotas, also referred to as limits, are the maximum number of service resources or operations for your AWS account.

Quotas for AWS services in this guidance

To view the service quotas for all AWS services in the documentation without switching pages, view the information in the Service endpoints and quotas page in the PDF instead.

Each AWS account has quotas on the number of resources that can be created in each AWS Region. You can view service quotas in the AWS console using the Service Quotas tool. If a service quota can be increased, you can open a case through this tool to request an increase.

The main service quotas related to this guidance are:

AWS Service	Quota Entry	Estimated Usage	Adjustable
Amazon EC2	Running On-Demand G and VT instances	Based on max concurrent GPU instances	[X]
Amazon EC2	All G and VT Spot Instance Requests	Based on max concurrent GPU instances	[X]
Amazon SNS	Messages Published per Second	Based on max concurrent requests	[X]

Additionally, consider the following service quotas during deployment:

AWS Service	Quota Entry	Estimated Usage	Adjustable
Amazon VPC	VPCs per Region	1	[X]
Amazon VPC	NAT gateways per Availability Zone	1	[X]
Amazon EC2	EC2-VPC Elastic IPs	1	[X]
Amazon S3	General purpose buckets	1 per queue	[X]

Deploy the Guidance

Check the following considerations before deployment:

Deployable Regions

The services or Amazon EC2 instance types used in this guidance may not be available in all AWS Regions. Please deploy this sample code in an AWS Region that provides the required services.

Verified Deployable Regions

Region Name	Verified
US East (N. Virginia)	[X]
US West (Oregon)	[X]

When deploying the guidance into an AWS Region that is not listed in the table above, check whether that Region supports g5 instances. If it does not, you may need to manually set the instance type used by Karpenter to g4dn or another available GPU instance type.

Deploying into AWS China Regions

For information and procedures for deploying in the AWS China Regions, see Deploying in AWS China Regions.

IAM permissions

Deploying this guidance requires administrator or equivalent permissions. Due to the number of components, we do not provide a minimal permissions list.

Choose Stable Diffusion Runtime

You need a runtime component to deploy a Stable Diffusion model and provide API access to it.

Currently, the following community Stable Diffusion runtimes are available:

Runtime Name	Link	Verified
Stable Diffusion Web UI	GitHub	[X]
ComfyUI	GitHub	[X]
InvokeAI	GitHub

You can also choose other runtimes or build your own. You need to package your runtime as a container image that can run on Amazon EKS.

You need to fully understand and comply with the license terms of the Stable Diffusion runtime you use.

Example Runtime
You can use the community-provided example Dockerfile to build runtime container images for Stable Diffusion Web UI and ComfyUI runtimes. Please note that these images are for technical evaluation and testing purposes only and should not be deployed to production environments.

Model Storage
By default, this guidance will load models to the /opt/ml/code/models directory. Ensure your runtime is configured to read models from this directory. You need to disable memory-mapped (mmap) files to achieve the highest performance.
For SD Web UI, set disable_mmap_load_safetensors: true in config.json.
For ComfyUI, manually modify the source code as described in this community issue.

Notes on Stable Diffusion (SD) Web UI Runtime
For the SD Web UI runtime, there are static runtimes (pre-load models) and dynamic runtimes (load models on-demand), depending on the model being run.
Static runtimes use the model specified in the modelFilename parameter. The model is loaded into GPU memory at startup.
For dynamic runtimes, set dynamicModel: true. No model needs to be specified; the runtime loads the model named in each request from Amazon S3 and then performs inference with it.

Other important limitations

In the current version, this guidance will automatically create a new VPC when deployed. The VPC includes:
- CIDR 10.0.0.0/16
- 3 public subnets in different Availability Zones, with size /19
- 3 private subnets in different Availability Zones, with size /19
- 3 NAT gateways (placed in public subnets)
- 1 Internet gateway
- Corresponding route tables and security groups
Currently, these parameters of the new VPC cannot be customized.
In the current version, this guidance can only be deployed on a new EKS cluster with a fixed version of 1.28. We will update the cluster version as new Amazon EKS versions are released.
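
To see how much address space the VPC layout above consumes, the standard-library `ipaddress` module can split the /16 into /19 blocks. The sketch assumes, purely for illustration, that the first three blocks are public and the next three private. The actual assignment is decided by the deployment and may differ.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
blocks = list(vpc.subnets(new_prefix=19))

print(len(blocks))               # 8 possible /19 blocks in a /16
print(blocks[0].num_addresses)   # 8192 addresses per block

# Assumed assignment for illustration: first three public, next three private.
public, private = blocks[:3], blocks[3:6]
for name, nets in (("public", public), ("private", private)):
    print(name, [str(n) for n in nets])
```

Six of the eight /19 blocks are used by this layout, so the CIDR range should not overlap with any network you later plan to peer with this VPC.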

Deployment process overview

Before you launch this guidance, review the cost, architecture, security, and other considerations discussed in this guide. Follow the step-by-step instructions in this section to configure and deploy the guidance into your account.

Get source code

Run the following command to get the source code and deployment scripts:

git clone --recursive https://github.com/aws-solutions-library-samples/guidance-for-asynchronous-inference-with-stable-diffusion-on-aws
cd guidance-for-asynchronous-inference-with-stable-diffusion-on-aws

Quick start

We provide a one-click deployment script for a quick start. The total deployment time is approximately 30 minutes.

Operating System
This script supports only Linux with Bash as the shell and has been validated on Amazon Linux and Ubuntu. Windows, WSL (Windows Subsystem for Linux), and macOS are not supported. We recommend running the script on an EC2 instance or in an AWS Cloud9 workspace.

One-click deployment

Run the following command to deploy with the simplest settings:

cd deploy
./deploy.sh

This script will:

Install the necessary runtimes and tools.
Create an S3 bucket, download the Stable Diffusion 1.5 foundation model from Hugging Face, and place it in the bucket.
Create an EBS snapshot containing the SD Web UI image using our provided sample image.
Deploy the guidance with the SD Web UI runtime.

Minimal Deployment
The configuration file generated by this script is the simplest possible configuration: it contains a single runtime and does not customize settings such as scaling thresholds, custom models, or custom images. If you need to customize the configuration, run the following command:
./deploy.sh -d

With this parameter, the deployment script completes only the pre-deployment preparation and does not deploy. After modifying the configuration, run the following command to deploy:
cdk deploy --no-rollback --require-approval never

Deployment parameters

The deployment script provides some parameters for you to customize the deployed Guidance:

-h, --help: Display help information
-n, --stack-name: Customize the name of the deployed Guidance, affecting the naming of generated resources. Default is sdoneks.
-R, --region: The AWS Region to deploy the guidance to. Defaults to the current AWS profile Region.
-d, --dry-run: Only generate configuration files; do not perform deployment.
-b, --bucket: Specify an existing S3 bucket name for storing models. This S3 bucket must already exist and be in the same Region as the guidance.
-s, --snapshot: Specify an existing EBS snapshot ID. You can build the EBS snapshot yourself by following the documentation below.
-r, --runtime-name: Specify the name of the deployed runtime, affecting the name used for API calls. Default is sdruntime.
-t, --runtime-type: Specify the type of the deployed runtime, only accepting sdwebui and comfyui. Default is sdwebui.

Manual deployment

You can also deploy this guidance on AWS “manually” without using the deployment script by following these steps:

Create an Amazon S3 model storage bucket and store the required models in the bucket
(Optional) Build the container image
(Optional) Store the container image in an EBS cache to accelerate startup
Deploy and launch the guidance stack

Model storage

The models required by this guidance should be stored in an S3 bucket beforehand.

Create the S3 bucket

Follow these steps to create the S3 bucket:

AWS Management Console:

Open the Amazon S3 console.
In the left navigation pane, choose Buckets.
Choose Create bucket.
In Bucket name, enter a name for your bucket. The name must follow bucket naming rules.
In AWS Region, choose the same Region where you plan to deploy the guidance.

To ensure model loading speed, make sure the bucket is in the same AWS Region as your Guidance deployment. If you plan to deploy multiple replicas of the guidance in multiple Regions, create a separate bucket in each Region.

Choose Create bucket.

AWS Command Line Interface (AWS CLI):

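Run the following command to create the bucket. Replace <bucket name> with your desired bucket name and us-east-1 with the AWS Region where you plan to deploy the guidance. For Regions other than us-east-1, also add --create-bucket-configuration LocationConstraint=<Region> to the command: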

aws s3api create-bucket --bucket <bucket name> --region us-east-1

Store Models

Store all ML models you need to use in an S3 bucket following this directory structure:

└── /
    ├── CLIP
    ├── Codeformer
    ├── ControlNet
    ├── ESRGAN
    ├── GFPGAN
    ├── LDSR
    ├── Lora
    ├── RealESRGAN
    ├── ScuNET
    ├── Stable-diffusion
    ├── SwinIR
    ├── VAE
    ├── VAE-approx
    ├── embeddings
    └── hypernetworks

Place the models into their corresponding directories. The Stable-diffusion directory must exist and contain the Stable Diffusion model. Other directories can be omitted if there are no models.

Currently, .safetensors and .ckpt model formats are supported. If you downloaded models from Civitai without an extension, add the .ckpt extension.
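
The naming rules above can be checked before uploading. The following minimal sketch maps a local model file name to its S3 object key, rejects unsupported extensions, and appends `.ckpt` to extension-less files. The function name and the default folder are choices made for this example and are not part of the guidance.

```python
from pathlib import PurePosixPath

SUPPORTED = {".safetensors", ".ckpt"}


def model_s3_key(filename: str, folder: str = "Stable-diffusion") -> str:
    """Return the S3 object key for a model file, enforcing the extension rule."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix == "":
        filename += ".ckpt"  # extension-less downloads are treated as .ckpt
    elif suffix not in SUPPORTED:
        raise ValueError(f"unsupported model format: {suffix}")
    return f"{folder}/{filename}"


print(model_s3_key("v1-5-pruned-emaonly.safetensors"))
print(model_s3_key("dreamshaper"))
print(model_s3_key("style.safetensors", folder="Lora"))
```

Note that a file name containing dots but no real extension (for example `model.v2`) is rejected by this check, so rename such files before uploading.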

Follow these steps to upload the models to the S3 bucket:

AWS Management Console:

Open the Amazon S3 console.
In the left navigation pane, choose Buckets.
Select the bucket you created in the previous step and navigate to the desired folder.
If the corresponding folder does not exist:
- Choose Create folder.
- In Folder name, enter the folder name.
- Choose Create folder.
- Repeat the above steps until the folder structure matches the structure above.
Choose Upload.
Choose Add files and select the model files you want to upload.
Choose Upload. Do not close the browser during the upload process.

AWS CLI:

Run the following command to upload the model files to the bucket. Replace <model name> with your model file name, <folder> with the model type, and <bucket name> with your bucket name:

aws s3 cp <model name> s3://<bucket name>/<folder>/

When uploading with the AWS CLI, there is no need to create the directory structure in advance.

You can use third-party tools like s5cmd to improve upload speed.

Image Building

You can build the image from the source code and store it in your image repository.

Runtime Selection
You need to provide the Stable Diffusion runtime image yourself. You can find the supported Stable Diffusion runtimes in the Deployment Planning section.

Pre-built Images
For evaluation and testing purposes, you can use our pre-built container images:
SD Web UI: public.ecr.aws/bingjiao/sd-on-eks/sdwebui:latest
ComfyUI: public.ecr.aws/bingjiao/sd-on-eks/comfyui:latest
Queue Agent: public.ecr.aws/bingjiao/sd-on-eks/queue-agent:latest
Please note that these images are only for technical evaluation and testing purposes, and you are responsible for any license issue associated with using them.

Build the Image

Run the following command to build the queue-agent image:

docker build -t queue-agent:latest src/backend/queue_agent/

Sample Runtimes
You can use the community-provided sample Dockerfile to build the container images for Stable Diffusion Web UI and ComfyUI runtimes. These images are only for technical evaluation and testing purposes and should not be deployed to production environments.

Push the Image to Amazon ECR

Image Repository Selection
We recommend using Amazon ECR as the image repository, but you can also choose other image repositories that support the OCI standard, such as Harbor.

First-time Push
Amazon ECR requires creating the image repository before pushing.
AWS CLI:

Run the following command to create the repository:
aws ecr create-repository --repository-name sd-on-eks/queue-agent
AWS Management Console:
Open the Amazon ECR console.
Choose Get started.
For Visibility settings, choose Private.
For Repository name, enter sd-on-eks/queue-agent.
Choose Create repository.

Run the following commands to log in to the image repository and push the image. Replace us-east-1 with your AWS Region and 123456789012 with your AWS account ID:

aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

docker tag queue-agent:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/sd-on-eks/queue-agent:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/sd-on-eks/queue-agent:latest
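
Because the account ID and Region appear in several commands, it can help to build the image reference once and reuse it. The helper below is a small illustrative sketch and is not part of the guidance's tooling. It follows the `<account>.dkr.ecr.<region>.amazonaws.com` form used in the commands above and switches to the `amazonaws.com.cn` suffix for Regions whose names start with `cn-`.

```python
def ecr_image_uri(account_id: str, region: str, repository: str,
                  tag: str = "latest") -> str:
    """Build the fully qualified ECR image reference used by docker tag/push."""
    if not (len(account_id) == 12 and account_id.isdigit()):
        raise ValueError("AWS account IDs are 12 digits")
    # China Regions use the amazonaws.com.cn domain suffix.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"{account_id}.dkr.ecr.{region}.{suffix}/{repository}:{tag}"


print(ecr_image_uri("123456789012", "us-east-1", "sd-on-eks/queue-agent"))
```

The same value can then be passed to `docker tag`, `docker push`, and the runtime configuration so the three stay consistent.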

Build and Push Helm Chart

Some components in this guidance are deployed using a Helm Chart. In general, you do not need to customize the contents of the Helm Chart and can use our pre-built Helm Chart directly. You can also configure the runtime settings using the config.yaml file. The Helm Chart can be stored on any HTTP server accessible over the Internet or in an image repository compatible with the OCI standard, including Amazon ECR.

China Region Support
Due to a known issue with the CDK framework, you cannot store the Helm Chart in an ECR image repository in the China Regions. We are actively working to resolve this issue.

Using ECR Image Repository

First-time Push
Amazon ECR requires creating an image repository before pushing.
AWS CLI:
Run the following command to create the repository:
aws ecr create-repository --repository-name sd-on-eks/charts/sd-on-eks
AWS Management Console:
Open the Amazon ECR console.
Choose Get started.
For Visibility settings, choose Private.
For Repository name, enter sd-on-eks/charts/sd-on-eks.
Choose Create repository.

Run the following commands to log in to the image repository and push the Helm Chart. Replace us-east-1 with your AWS Region and 123456789012 with your AWS account ID:

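aws ecr get-login-password --region us-east-1 | helm registry login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
helm package src/charts/sd_on_eks
helm push sd-on-eks-<version>.tgz oci://123456789012.dkr.ecr.us-east-1.amazonaws.com/sd-on-eks/charts/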

After uploading, modify config.yaml and add the following content under each runtime that needs to use the Helm Chart:

modelsRuntime:
- name: sdruntime
  namespace: default
  type: sdwebui
  chartRepository: "oci://123456789012.dkr.ecr.us-east-1.amazonaws.com/sd-on-eks/charts/sd-on-eks"
  chartVersion: "1.1.0" # Modify if you customized the Helm Chart version

Using HTTP Server

Access Control
Make sure the HTTP server is open to the Internet and does not have any access control (such as IP allow-listing).

Run the following command to package the Helm Chart:

helm package src/charts/sd_on_eks

After packaging, an output file named sd-on-eks-<version>.tgz will be generated. Place this file in an empty folder and run the following command:

helm repo index .

You can place the generated compressed package and index.yaml on the HTTP server. Assuming the HTTP server domain is example.com (IP addresses are also acceptable), you need to modify config.yaml and add the following content under each runtime that needs to use the Helm Chart:

modelsRuntime:
- name: sdruntime
  namespace: default
  type: sdwebui
  chartRepository: "http://example.com/"
  chartVersion: "1.0.0"  # Modify if you customized the Helm Chart version

Image Cache Building

By pre-caching the container image as an EBS snapshot, you can optimize the startup speed of compute instances. When launching new instances, the instance’s data volume will have the container image cache pre-loaded, eliminating the need to pull from the image repository.

The EBS snapshot should be created before deploying the guidance. We provide a script for building the EBS snapshot.

Using Custom Image

If you built and pushed the image to Amazon ECR, run the following command. Replace us-east-1 with the Region where the guidance is deployed and 123456789012 with your 12-digit AWS account number:

cd utils/bottlerocket-images-cache
./snapshot.sh 123456789012.dkr.ecr.us-east-1.amazonaws.com/sd-on-eks/sdwebui:latest,123456789012.dkr.ecr.us-east-1.amazonaws.com/sd-on-eks/queue-agent:latest

Using Pre-built Image

If you are using the pre-built images provided by the guidance, run the following command:

cd utils/bottlerocket-images-cache
./snapshot.sh public.ecr.aws/bingjiao/sd-on-eks/sdwebui:latest,public.ecr.aws/bingjiao/sd-on-eks/comfyui:latest,public.ecr.aws/bingjiao/sd-on-eks/queue-agent:latest

After the script completes, it will output the EBS snapshot ID (in the format snap-0123456789). You can apply this snapshot during deployment.

For more details about this script, refer to the GitHub repository.

Manual Deployment

Follow these steps to deploy this guidance manually:

Install Required Components

Install the following runtimes before deployment:

Node.js version 18 or later.
AWS CLI.
AWS CDK Toolkit.
git.

Edit Configuration File

The configuration for this guidance is stored in the config.yaml file. We provide a configuration file template that you can customize according to your requirements.

Set the model storage bucket (required)
Replace <bucket name> in modelBucketArn with the name of the S3 bucket where you placed the models.
```
 modelBucketArn: arn:aws:s3:::<bucket name>
```
China Regions
If you are using an AWS China Region, make sure the partition in the ARN is aws-cn.
```
modelBucketArn: arn:aws-cn:s3:::<bucket name>
```
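
The partition rule can be expressed as a tiny helper when configuration files are generated programmatically. This is a sketch that covers only the two cases mentioned here (standard Regions and China Regions). Other partitions are deliberately not handled, and the function name is hypothetical.

```python
def model_bucket_arn(bucket: str, region: str) -> str:
    """Return the S3 bucket ARN for modelBucketArn, choosing the partition by Region."""
    # Regions whose names start with "cn-" belong to the aws-cn partition.
    partition = "aws-cn" if region.startswith("cn-") else "aws"
    return f"arn:{partition}:s3:::{bucket}"


print(model_bucket_arn("my-sd-models", "us-west-2"))       # arn:aws:s3:::my-sd-models
print(model_bucket_arn("my-sd-models", "cn-northwest-1"))  # arn:aws-cn:s3:::my-sd-models
```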

Set the Stable Diffusion runtime (required)

You need to specify the runtime parameters. The runtime is defined in modelsRuntime with the following configuration:

 modelsRuntime:
 - name: "sdruntime" # Required parameter, the name of the runtime, cannot be the same as other runtimes
   namespace: "default" # Required parameter, the Kubernetes namespace where the runtime is located, not recommended to place in the same namespace as other runtimes
   type: "sdwebui" # Required parameter, the type of this runtime, currently only supports "sdwebui" and "comfyui"
   modelFilename: "v1-5-pruned-emaonly.safetensors" # (SD Web UI) Name of the model used by this runtime, cannot be the same as other runtimes
   dynamicModel: false # (SD Web UI) Whether this runtime allows dynamic model loading

You can configure multiple runtimes in the modelsRuntime section.
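
When several runtimes are configured, the constraints noted in the comments above (unique names, supported types, distinct model files) are easy to violate by accident. The sketch below checks a modelsRuntime list that has already been loaded into Python dictionaries. It is an illustrative pre-flight check, not a validator shipped with the guidance, and the rule that a static SD Web UI runtime requires modelFilename is inferred from the runtime notes earlier in this guide.

```python
from typing import Dict, List

SUPPORTED_TYPES = {"sdwebui", "comfyui"}


def validate_runtimes(runtimes: List[Dict]) -> List[str]:
    """Return a list of problems found in a modelsRuntime list (empty if valid)."""
    problems = []
    seen_names, seen_models = set(), set()
    for rt in runtimes:
        name = rt.get("name")
        for field in ("name", "namespace", "type"):
            if not rt.get(field):
                problems.append(f"{name}: missing required field '{field}'")
        if rt.get("type") and rt["type"] not in SUPPORTED_TYPES:
            problems.append(f"{name}: unsupported type '{rt['type']}'")
        if name in seen_names:
            problems.append(f"{name}: duplicate runtime name")
        seen_names.add(name)
        model = rt.get("modelFilename")
        if model:
            if model in seen_models:
                problems.append(f"{name}: modelFilename '{model}' already used")
            seen_models.add(model)
        elif rt.get("type") == "sdwebui" and not rt.get("dynamicModel", False):
            problems.append(f"{name}: static SD Web UI runtime needs modelFilename")
    return problems


config = [
    {"name": "sdruntime", "namespace": "default", "type": "sdwebui",
     "modelFilename": "v1-5-pruned-emaonly.safetensors", "dynamicModel": False},
    {"name": "sdruntime", "namespace": "other", "type": "invokeai"},
]
for problem in validate_runtimes(config):
    print(problem)
```

Running a check like this before `cdk deploy` surfaces configuration mistakes in seconds rather than partway through a deployment.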

Set custom image (optional)

If you built the image and/or Helm Chart yourself, you need to specify the image in the corresponding runtime with the following configuration:

 modelsRuntime:
 - name: "sdruntime"
   namespace: "default"
   type: "sdwebui"
   modelFilename: "v1-5-pruned-emaonly.safetensors"
   dynamicModel: false
   chartRepository: "" # Optional parameter, if you built the Helm Chart, enter the address where the Chart is located. Include the protocol prefix (oci:// or https://)
   chartVersion: "" # Optional parameter, if you built the Helm Chart, enter the version of the Chart
   extraValues: # Add the following content
     runtime:
       inferenceApi:
         image:
           repository: <account_id>.dkr.ecr.<region>.amazonaws.com/sd-on-eks/sdwebui # Address of the Stable Diffusion runtime image
           tag: latest # Image tag
       queueAgent:
         image:
           repository: <account_id>.dkr.ecr.<region>.amazonaws.com/sd-on-eks/queue-agent # Address of the Queue Agent image
           tag: latest # Image tag

Set EBS snapshot-based image cache (optional)

If you built an EBS snapshot-based image cache, you need to specify the snapshot ID in the corresponding runtime with the following configuration:

 modelsRuntime:
 - name: "sdruntime"
   namespace: "default"
   type: "sdwebui"
   modelFilename: "v1-5-pruned-emaonly.safetensors"
   extraValues:
     karpenter: # Add the following content
       nodeTemplate:
         amiFamily: Bottlerocket
         dataVolume:
           snapshotID: snap-0123456789 # Replace with the EBS snapshot ID

Other detailed settings (optional)
If you need to configure the runtime in detail, refer to the Configuration Options example.

Start Deployment

After completing the configuration, run the following command to deploy:

npm install
cdk deploy

Deployment generally takes 15-20 minutes. Since the deployment is performed on the AWS side through AWS CloudFormation, you do not need to redeploy if the CDK CLI is accidentally closed.

Next Steps

After the deployment is complete, you will see the following output:

Outputs:
sdoneksStack.GetAPIKeyCommand = aws apigateway get-api-keys --query 'items[?id==`abcdefghij`].value' --include-values --output text
sdoneksStack.FrontApiEndpoint = https://abcdefghij.execute-api.us-east-1.amazonaws.com/prod/
sdoneksStack.ConfigCommand = aws eks update-kubeconfig --name sdoneksStack --region us-east-1 --role-arn arn:aws:iam::123456789012:role/sdoneksStack-sdoneksStackAccessRole
...

Deploying in AWS China Regions

This guidance supports deployment in AWS China Regions.

Region Name	Verified
China (Ningxia)	[x]

However, due to the special network environment in China, there are the following limitations:

You need to build the container image yourself or copy the pre-built image to ECR in the China Region.
Some components’ Helm Charts are hosted on GitHub, and there may be issues retrieving the Helm Charts when deploying in China Regions; retries are recommended when this issue occurs.
You cannot automatically download models from Hugging Face or GitHub and need to manually download the models and upload them to the S3 bucket.
According to Chinese laws and regulations, you must obtain an Internet Content Provider (ICP) license to serve API requests through Amazon API Gateway. As a result, this solution is not directly accessible through API Gateway in China. You can associate a domain name that has a valid ICP license with the API Gateway through custom domain names. You can also send messages directly to SNS to invoke the solution.

Steps for Deploying in China Regions

The steps for deploying in AWS China Regions differ from the regular deployment flow. Follow these steps for deployment:

Build or transfer the container image to ECR.
Download the models and store them in the S3 bucket.
Create an EBS disk snapshot.
Generate and modify the configuration file.
Proceed with deployment.

Build or Transfer Image to ECR

Since the default container images are stored in ECR Public, you may experience slow speeds or intermittent disconnections when pulling images or creating image caches. We recommend that you build the images yourself or transfer the existing images to your ECR image repository.

If you need to build the images yourself, please refer to the Image Building documentation.

If you need to transfer the pre-built images to ECR in the China Region, you can run the following commands on an instance with Docker installed and ECR permissions:

docker pull public.ecr.aws/bingjiao/sd-on-eks/sdwebui:latest
docker pull public.ecr.aws/bingjiao/sd-on-eks/comfyui:latest
docker pull public.ecr.aws/bingjiao/sd-on-eks/queue-agent:latest

aws ecr create-repository --repository-name sd-on-eks/sdwebui
aws ecr create-repository --repository-name sd-on-eks/comfyui
aws ecr create-repository --repository-name sd-on-eks/queue-agent

docker tag public.ecr.aws/bingjiao/sd-on-eks/sdwebui:latest 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/sdwebui:latest
docker tag public.ecr.aws/bingjiao/sd-on-eks/comfyui:latest 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/comfyui:latest
docker tag public.ecr.aws/bingjiao/sd-on-eks/queue-agent:latest 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/queue-agent:latest

aws ecr get-login-password --region cn-northwest-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn

docker push 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/sdwebui:latest
docker push 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/comfyui:latest
docker push 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/queue-agent:latest
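The repository addresses in the commands above follow a fixed pattern, and China Regions use the amazonaws.com.cn domain suffix. As an illustration, the sketch below generates the pull, tag, and push commands for a given account and Region. The helper names are hypothetical, and the repository creation and docker login steps shown above are not included.

```python
# Hypothetical helpers that only build command strings; nothing is executed.
def ecr_repository_uri(account_id, region, repository):
    """Build a private ECR repository address for the given Region."""
    # China Regions (cn-*) use the amazonaws.com.cn domain suffix.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"{account_id}.dkr.ecr.{region}.{suffix}/{repository}"


def mirror_commands(account_id, region, names,
                    source="public.ecr.aws/bingjiao/sd-on-eks", tag="latest"):
    """Return docker pull/tag/push commands for each image name."""
    commands = []
    for name in names:
        src = f"{source}/{name}:{tag}"
        dst = f"{ecr_repository_uri(account_id, region, 'sd-on-eks/' + name)}:{tag}"
        commands += [f"docker pull {src}",
                     f"docker tag {src} {dst}",
                     f"docker push {dst}"]
    return commands


if __name__ == "__main__":
    for line in mirror_commands("123456789012", "cn-northwest-1",
                                ["sdwebui", "comfyui", "queue-agent"]):
        print(line)
```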

We recommend that you follow the Image Building documentation to place the Helm Chart in ECR or an HTTP server.

Download Models and Store in S3 Bucket

Since Hugging Face cannot be accessed smoothly from mainland China, download the models from other mirror sites and upload them to the S3 bucket following the Model Storage documentation.

Create EBS Disk Snapshot

Follow the Image Cache Building documentation to create an EBS disk snapshot to accelerate image loading.

Generate and Modify Configuration File

Run the following command to install the tools and generate the initial configuration file:

cd deploy
./deploy.sh -b <bucket name> -s <snapshot ID> -d

This command will generate a config.yaml template in the parent directory, but this template needs to be edited for deployment in the China Region. Please edit the file according to the comments:

stackName: sdoneks
modelBucketArn: arn:aws-cn:s3:::${MODEL_BUCKET}  # Change aws to aws-cn in this ARN
APIGW:
  stageName: dev
  throttle:
    rateLimit: 30
    burstLimit: 50
modelsRuntime:
- name: sdruntime
  namespace: "default"
  modelFilename: "v1-5-pruned-emaonly.safetensors"
  dynamicModel: false
  # chartRepository: "http://example.com/" # If you self-hosted the Helm Chart, uncomment this line and change the value to the address of the Helm Chart (oci:// or http://), otherwise delete this line
  type: sdwebui
  extraValues:
    runtime:
      inferenceApi:
        image:
          repository: 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/sdwebui # Change this to the address of your ECR image repository
          tag: latest
      queueAgent:
        image:
          repository: 123456789012.dkr.ecr.cn-northwest-1.amazonaws.com.cn/sd-on-eks/queue-agent # Change this to the address of your ECR image repository
          tag: latest
    karpenter:
      nodeTemplate:
        amiFamily: Bottlerocket
        dataVolume:
          snapshotID: snap-1234567890 # The EBS snapshot ID will be automatically filled in here
      provisioner:
        instanceType:
        - "g5.xlarge"
        - "g5.2xlarge"
        capacityType:
          onDemand: true
          spot: true

After completing the modifications, run the following command to re-deploy:

cdk deploy

Deployment Verification

You can use the test script to verify if the guidance is deployed successfully. Run the following command to perform the test:

cd test
STACK_NAME=sdoneksStack RUNTIME_TYPE=sdwebui ./run.sh

If you modified the guidance stack name or runtime type, replace the sdoneksStack and sdwebui values with the values you actually used.

This script will automatically find the API Gateway endpoint, retrieve the API Key, and send test requests.

For the SD Web UI runtime, it will send text-to-image, image-to-image, and single image super-resolution requests.
For the ComfyUI runtime, it will send a Pipeline request.

Within seconds to minutes (depending on whether image caching is enabled and the minimum number of instance replicas), you can find the generated images at the output_location.

Usage Guide

API Calling Rules

After deploying the guidance, you can send requests to the Stable Diffusion runtimes through the API endpoint provided by Amazon API Gateway.

When sending requests, follow these rules:

Request Endpoint and Format

The API endpoint of the guidance can be obtained from the CloudFormation outputs:

AWS Management Console

Go to the AWS CloudFormation console
Choose Stacks
In the list, select sdoneksStack (or your custom name)
Choose Outputs
Record the value of the FrontApiEndpoint item (in the format https://abcdefghij.execute-api.ap-southeast-1.amazonaws.com/prod/)

AWS CLI

Run the following command to get the API endpoint:

aws cloudformation describe-stacks --stack-name sdoneksStack --output text --query 'Stacks[0].Outputs[?OutputKey==`FrontApiEndpoint`].OutputValue'
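The same lookup can be done from application code. The sketch below extracts one output value from a describe-stacks response. It assumes the response has the shape returned by the CloudFormation DescribeStacks API (a Stacks list whose entries carry Outputs with OutputKey and OutputValue); the helper name stack_output is hypothetical.

```python
# Hypothetical helper; operates on an already-fetched DescribeStacks response.
def stack_output(describe_stacks_response, output_key):
    """Return the OutputValue for output_key from the first stack in the response."""
    outputs = describe_stacks_response["Stacks"][0].get("Outputs", [])
    for output in outputs:
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    raise KeyError(f"stack has no output named {output_key!r}")


# Possible usage with boto3 (requires boto3 and AWS credentials):
#   import boto3
#   response = boto3.client("cloudformation").describe_stacks(StackName="sdoneksStack")
#   endpoint = stack_output(response, "FrontApiEndpoint")
```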

You need to append the API version to the endpoint. Currently, we support the v1alpha1 and v1alpha2 versions. For example, when using the v1alpha2 version API, the request should be sent to:

https://abcdefghij.execute-api.ap-southeast-1.amazonaws.com/prod/v1alpha2

This endpoint only accepts JSON-formatted POST requests and requires the Content-Type: application/json request header.

Request Types

Different runtime types accept specific request types:

For the SD Web UI runtime, only text-to-image, image-to-image, and single image super-resolution requests are accepted.
For the ComfyUI runtime, only Pipeline requests are accepted.

Please refer to the detailed documentation for each request type for the specific Request Schema.

API Key

For security reasons, all requests must include an API Key. Follow these steps to obtain the API Key:

AWS Management Console

Go to the Amazon API Gateway console
Choose API Keys
In the list, select the API Key with a name similar to SdOnEK-defau-abcdefghij (or your custom name)
Record the value of the API key item

AWS CLI

Run the following command to print the command that retrieves the API Key, then run the printed command:

echo $(aws cloudformation describe-stacks --stack-name sdoneksStack --output text --query 'Stacks[0].Outputs[?OutputKey==`GetAPIKeyCommand`].OutputValue')

When sending requests, you need to include the x-api-key request header with the value set to the API Key obtained above.
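Putting the endpoint, API version, content type, and API Key rules together, a request could be assembled as in the following sketch. It uses only the Python standard library; the function name build_request is hypothetical, and the endpoint and key values in the usage comment are placeholders.

```python
import json
import urllib.request


def build_request(endpoint, api_version, api_key, payload):
    """Build a JSON POST request for the guidance API (does not send it)."""
    # Append the API version (for example "v1alpha2") to the stack endpoint.
    url = endpoint.rstrip("/") + "/" + api_version
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",  # the endpoint only accepts JSON
        "x-api-key": api_key,                # required on every request
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Possible usage (placeholder endpoint and key):
#   request = build_request(
#       "https://abcdefghij.execute-api.ap-southeast-1.amazonaws.com/prod/",
#       "v1alpha2", "<your API key>", {"task": {...}})
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```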

Anonymous Requests
Requests without an API Key will directly return a 401 error.

Throttling Rules

To protect the backend API, API Gateway will throttle excessive requests using the same API Key.

The default settings are:

30 requests per second
Burst limit of 50 requests

For more details on throttling, refer to Throttle API requests for better throughput.

If you need to modify these settings, modify the APIGW section in config.yaml. You can also modify the corresponding Usage Plan in API Gateway.
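To see how the rate and burst limits interact, the following sketch models them as a token bucket, which is the algorithm API Gateway documents for throttling. This is an illustrative model only, not API Gateway's implementation; the class name and the simulated timestamps are assumptions.

```python
# Illustrative token-bucket model of the default limits (rate 30/s, burst 50).
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate            # tokens added per second
        self.burst = burst          # maximum number of stored tokens
        self.tokens = float(burst)  # the bucket starts full
        self.last = 0.0             # time of the previous request, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=30, burst=50)
    admitted_now = sum(bucket.allow(0.0) for _ in range(80))
    admitted_later = sum(bucket.allow(1.0) for _ in range(80))
    print(admitted_now, admitted_later)
```

In this model, a burst of 80 simultaneous requests admits 50, and one second later only 30 more tokens are available. Clients that exceed the limits should expect throttled responses and retry with a delay.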

Sending messages directly to Amazon SNS topic

If your network environment does not have access to the API Gateway endpoint, or if you want to invoke the solution through Amazon SNS, you can send messages directly to the SNS topic. However, because these messages are not validated by the Lambda function, they must follow the message schema exactly; otherwise the Queue Agent silently discards them.

Find SNS topic

You can find the SNS topic ARN in the CloudFormation outputs:

AWS Management Console

Go to the AWS CloudFormation console
Choose Stacks
In the list, select sdoneksStack (or your custom name)
Choose Outputs
Record the value of the sdNotificationLambdaARN item (in the format arn:aws:sns:us-east-1:123456789012:sdoneksStack-sdNotificationLambdaCfn-abcdefgh)

AWS CLI

Run the following command to get the SNS topic ARN:

aws cloudformation describe-stacks --stack-name sdoneksStack --output text --query 'Stacks[0].Outputs[?OutputKey==`sdNotificationLambdaARN`].OutputValue'

Message schema

The message needs to contain the runtime: <runtime name> attribute, otherwise the message will not be routed to the SQS queue of the corresponding runtime, and will be discarded by SNS.

The schema of the message body sent to SNS is similar to the v1alpha2 API schema, but does not contain the outermost task field. An example is shown below:

{
  "metadata": {
    "id": "test-t2i", // Required, task ID
    "runtime": "sdruntime", // Required, the name of the runtime used for the task
    "tasktype": "text-to-image", // Required, task type. Ensure task type fits runtime type.
    "prefix": "output", // Required, the prefix (directory name) for the output file in the S3 bucket
    "context": "" // Optional, can contain any information, will be included in the callback
  },
  "content": { // Content is passed to the runtime
  ...
  }
}

Send Message

You can send messages to the SNS topic using the AWS Management Console or an AWS SDK. Refer to the Amazon SNS documentation for details.
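As a sketch of the message schema above, the following helper builds the arguments for an SNS publish call: the body omits the outermost task field, and the runtime name is also set as a message attribute so that SNS can route the message. The helper name is hypothetical, and the String data type for the runtime attribute is an assumption; the keyword names match the boto3 SNS publish call.

```python
import json

REQUIRED_METADATA = ("id", "runtime", "tasktype", "prefix")


def build_sns_publish_args(topic_arn, metadata, content):
    """Return keyword arguments suitable for an SNS publish call."""
    missing = [key for key in REQUIRED_METADATA if not metadata.get(key)]
    if missing:
        raise ValueError(f"missing required metadata: {', '.join(missing)}")
    return {
        "TopicArn": topic_arn,
        # The body has no outermost "task" field, unlike the v1alpha2 API.
        "Message": json.dumps({"metadata": metadata, "content": content}),
        # Assumption: the "runtime" attribute is a String attribute.
        "MessageAttributes": {
            "runtime": {"DataType": "String", "StringValue": metadata["runtime"]}
        },
    }


# Possible usage with boto3 (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("sns").publish(**build_sns_publish_args(topic_arn, metadata, content))
```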

Text-to-Image (SD Web UI)

This request type is only applicable to the SD Web UI runtime.

Text-to-image is the most basic use of Stable Diffusion: an image is generated from the input prompt.

The content in the request will be passed directly to the SD Web UI, but if there are links (HTTP or S3 URLs), the link content will be converted to base64-encoded content and filled in the corresponding fields.

Request schema

v1alpha2

{
  "task": {
    "metadata": {
      "id": "test-t2i", // Required, task ID
      "runtime": "sdruntime", // Required, the name of the runtime used for the task
      "tasktype": "text-to-image", // Required, task type
      "prefix": "output", // Required, the prefix (directory name) for the output file in the S3 bucket
      "context": "" // Optional, can contain any information, will be included in the callback
    },
    "content": { // Same specification as the SD Web UI text-to-image API
      "alwayson_scripts": {},
      "prompt": "A dog",
      "steps": 16,
      "width": 512,
      "height": 512
    }
  }
}

v1alpha1

{
    "alwayson_scripts": {
        "task": "text-to-image", // Required, task type
        "sd_model_checkpoint": "v1-5-pruned-emaonly.safetensors", // Required, base model name
        "id_task": "test-t2i", // Required, task ID, used when uploading result images and returning responses
        "save_dir": "outputs" // Required, the prefix (directory name) for the output file in the S3 bucket
    },
    // Same specification as the SD Web UI text-to-image API
    "prompt": "A dog",
    "steps": 16,
    "width": 512,
    "height": 512
}

Response schema

v1alpha2

{
  "id_task": "test-t2i",
  "runtime": "sdruntime",
  "output_location": "s3://outputbucket/output/test-t2i"
}

v1alpha1

{
  "id_task": "test-t2i",
  "sd_model_checkpoint": "v1-5-pruned-emaonly.safetensors",
  "output_location": "s3://outputbucket/output/test-t2i"
}
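For illustration, a v1alpha2 text-to-image request body could be assembled as follows. This is a minimal sketch: the function name is hypothetical, the defaults mirror the example request above, and any extra keyword arguments are passed through unchanged on the assumption that they are valid SD Web UI text-to-image parameters.

```python
# Hypothetical builder for the v1alpha2 text-to-image request body.
def text_to_image_task(task_id, runtime, prompt, prefix="output", context="", **options):
    """Return a v1alpha2 request body as a dictionary."""
    content = {
        "alwayson_scripts": {},
        "prompt": prompt,
        "steps": 16,
        "width": 512,
        "height": 512,
    }
    # Extra options are forwarded as-is to the SD Web UI text-to-image API.
    content.update(options)
    return {
        "task": {
            "metadata": {
                "id": task_id,
                "runtime": runtime,
                "tasktype": "text-to-image",
                "prefix": prefix,
                "context": context,
            },
            "content": content,
        }
    }
```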

Model Switching

If the corresponding runtime is set to dynamicModel: true, you need to add the following content in the alwayson_scripts section of the request:

        "content": {
          "alwayson_scripts": {
            "sd_model_checkpoint": "v1-5-pruned-emaonly.safetensors" // Place the model name here
          }
        }

Upon receiving the request, the SD Web UI will unload the current model and load the corresponding model from memory or the S3 bucket. If the specified model does not exist, the request will directly return an error.
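When building requests in code, the model name can be added to an existing v1alpha2 request body as in the sketch below. The helper name with_model is hypothetical; it copies the request so the original is left unchanged.

```python
import copy


def with_model(task, model_filename):
    """Return a copy of a v1alpha2 request that selects the given model."""
    updated = copy.deepcopy(task)
    scripts = updated["task"]["content"].setdefault("alwayson_scripts", {})
    # Only meaningful when the target runtime has dynamicModel: true.
    scripts["sd_model_checkpoint"] = model_filename
    return updated
```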

Image Retrieval

After the image is generated, it will be stored in the S3 bucket path specified by output_location. If batch_size or other parameters that generate multiple images are set, each image will be automatically numbered and stored.

The default storage format is lossless PNG, but if special formats (such as GIF) are involved, the system will automatically recognize and add the appropriate extension.
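To fetch the results, the output_location value first has to be split into a bucket name and a key prefix. A minimal sketch using the standard library follows; the helper name split_s3_url is hypothetical.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split an s3://bucket/prefix URL into (bucket, prefix)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# Possible usage with boto3 (requires boto3 and AWS credentials):
#   import boto3
#   bucket, prefix = split_s3_url("s3://outputbucket/output/test-t2i")
#   listing = boto3.client("s3").list_objects_v2(Bucket=bucket, Prefix=prefix)
```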

Image-to-Image (SD Web UI)

This request type is only applicable to the SD Web UI runtime.

Image-to-image generates a new image from an input prompt and a reference image.

The content in the request will be passed directly to SD Web UI, but if there are links (HTTP or S3 URLs), the link content will be converted to base64-encoded content and filled in the corresponding fields.
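The link conversion described above can be pictured with the following sketch. It is an illustration of the idea only, not the Queue Agent's actual code: it walks the request content and replaces every string that starts with an HTTP or S3 scheme with the base64-encoded bytes returned by a caller-supplied fetch function. The names inline_links and fetch are hypothetical.

```python
import base64

LINK_PREFIXES = ("http://", "https://", "s3://")


def inline_links(value, fetch):
    """Recursively replace link strings with base64-encoded content.

    `fetch` is a caller-supplied function that takes a URL and returns bytes.
    """
    if isinstance(value, dict):
        return {key: inline_links(item, fetch) for key, item in value.items()}
    if isinstance(value, list):
        return [inline_links(item, fetch) for item in value]
    if isinstance(value, str) and value.startswith(LINK_PREFIXES):
        return base64.b64encode(fetch(value)).decode("ascii")
    return value
```

For example, a field such as init_images that holds an image URL would end up holding the base64 text of the downloaded image, while the prompt and numeric parameters are left untouched.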

Request Schema

v1alpha2

{
  "task": {
    "metadata": {
      "id": "test-i2i", // Required, task ID
      "runtime": "sdruntime", // Required, the name of the runtime used for the task
      "tasktype": "image-to-image", // Required, task type
      "prefix": "output", // Required, the prefix (directory name) for the output file in the S3 bucket
      "context": "" // Optional, can contain any information, will be included in the callback
    },
    "content": { // Same specification as the SD Web UI image-to-image API
      "alwayson_scripts": {},
      "prompt": "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k",
      "steps": 16,
      "width": 512,
      "height": 512,
      "init_images": ["https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"] // Place the image link here, the image will be downloaded and base64-encoded before being placed in the request
    }
  }
}

v1alpha1

{
    "alwayson_scripts": {
        "task": "image-to-image", // Required, task type
        "image_link": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png", // Required, URL of the input image
        "id_task": "test-i2i", // Required, task ID, used when uploading result images and returning responses
        "sd_model_checkpoint": "v1-5-pruned-emaonly.safetensors" // Required, base model name, associated with queue dispatching or model switching
    },
    // The following are official parameters, use the default values or pass them in directly
    "prompt": "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k",
    "steps": 16,
    "width": 512,
    "height": 512
}

Response schema

v1alpha2

{
  "id_task": "test-i2i",
  "runtime": "sdruntime",
  "output_location": "s3://outputbucket/output/test-i2i"
}

v1alpha1

{
  "id_task": "test-i2i",
  "sd_model_checkpoint": "v1-5-pruned-emaonly.safetensors",
  "output_location": "s3://outputbucket/output/test-i2i"
}

Model Switching

If the corresponding runtime is set to dynamicModel: true, you need to add the following content in the alwayson_scripts section of the request:

        "content": {
          "alwayson_scripts": {
            "sd_model_checkpoint": "v1-5-pruned-emaonly.safetensors" // Place the model name here
          }
        }

Upon receiving the request, the SD Web UI will unload the current model and load the corresponding model from memory or an S3 bucket. If the specified model does not exist, the request will return an error.

Image Retrieval

The default storage format is lossless PNG, but if special formats (such as GIF) are involved, the system will automatically recognize and add the appropriate extension.

Single Image Upscaling (SD Web UI)

This request type is only applicable to the SD Web UI runtime.
This request type only provides the v1alpha2 API.

This request type upscales a single image using a super-resolution model.

Request Schema

v1alpha2

{
  "task": {
    "metadata": {
      "id": "test-extra",
      "runtime": "sdruntime",
      "tasktype": "extra-single-image",
      "prefix": "output",
      "context": ""
    },
    "content": {
      "resize_mode": 0,
      "show_extras_results": false,
      "gfpgan_visibility": 0,
      "codeformer_visibility": 0,
      "codeformer_weight": 0,
      "upscaling_resize": 4,
      "upscaling_resize_w": 512,
      "upscaling_resize_h": 512,
      "upscaling_crop": false,
      "upscaler_1": "R-ESRGAN 4x+",
      "upscaler_2": "None",
      "extras_upscaler_2_visibility": 0,
      "upscale_first": false,
      "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
    }
  }
}

Response schema

v1alpha2

{
  "id_task": "test-extra",
  "runtime": "sdruntime",
  "output_location": "s3://outputbucket/output/test-extra"
}

Available Super-Resolution Models

The available super-resolution models are the same as the default models in SD Web UI:

Lanczos
Nearest
4x-UltraSharp
ESRGAN_4X
LDSR
R-ESRGAN 4x+
R-ESRGAN 4x+ Anime6B
ScuNET GAN
ScuNET PSNR
SwinIR 4x

If you need more super-resolution models, you can place them in the LDSR, SwinIR, ESRGAN, RealESRGAN, ScuNET, etc. directories in the S3 bucket according to the model type.

After completing the above steps, you need to restart the Pod for the new models to take effect.

Image Retrieval

After the image is generated, it will be stored in the S3 bucket path specified by output_location. The default storage format is lossless PNG, but if special formats (such as GIF) are involved, the system will automatically recognize and add the appropriate extension.

Pipeline (ComfyUI)

This request type is only applicable to the ComfyUI runtime.
This request type only provides the v1alpha2 API.

ComfyUI provides workflow orchestration capabilities, allowing you to design workflows using various nodes in the interface and export them to a JSON file.

Exporting the Workflow

After designing the workflow in the interface, follow these steps to export it:

Select the gear icon in the top-right corner of the menu panel.
Select Enable Dev mode Options.
Select Save (API Format) to save the workflow as a file.

Request Schema

v1alpha2

{
  "task": {
    "metadata": {
      "id": "test-pipeline", // Required, task ID
      "runtime": "sdruntime", // Required, the name of the runtime used for the task
      "tasktype": "pipeline", // Required, task type
      "prefix": "output", // Required, the prefix (directory name) for the output file in the S3 bucket
      "context": "" // Optional, can contain any information, will be included in the callback
    },
    "content": {
      ... // Place the exported workflow content here
    }
  }
}

Response schema

v1alpha2

{
  "id_task": "test-pipeline",
  "runtime": "sdruntime",
  "output_location": "s3://outputbucket/output/test-pipeline"
}
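A Pipeline request is the exported workflow wrapped in the task metadata. The sketch below shows one way to do the wrapping; the function name pipeline_task is hypothetical, and the workflow can be passed either as the JSON text read from the exported file or as an already-parsed dictionary.

```python
import json


def pipeline_task(task_id, runtime, workflow, prefix="output", context=""):
    """Wrap an exported ComfyUI workflow in a v1alpha2 Pipeline request body."""
    if isinstance(workflow, str):
        # Accept the raw text of the file saved with Save (API Format).
        workflow = json.loads(workflow)
    return {
        "task": {
            "metadata": {
                "id": task_id,
                "runtime": runtime,
                "tasktype": "pipeline",
                "prefix": prefix,
                "context": context,
            },
            "content": workflow,
        }
    }


# Possible usage (the file name is a placeholder):
#   with open("workflow_api.json", encoding="utf-8") as handle:
#       body = pipeline_task("test-pipeline", "sdruntime", handle.read())
```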

Image Retrieval

The default storage format is lossless PNG, but if special formats (such as GIF) are involved, the system will automatically recognize and add the appropriate extension.

Callbacks and Notifications

The Stable Diffusion on Amazon EKS solution uses an asynchronous inference mode. When an image is generated or an error occurs, the user will be notified through Amazon SNS. User applications can subscribe to the SNS topic to receive notifications about image generation completion.

Adding Subscriptions

Refer to the Amazon SNS documentation to learn about the message destination types supported by SNS.

You can find the generated SNS topic ARN in the CloudFormation outputs:

AWS Management Console

Go to the AWS CloudFormation console
Choose Stacks
In the list, select sdoneksStack (or your custom name)
Choose Outputs
Record the value of the sdNotificationOutputArn item (in the format arn:aws:sns:us-east-1:123456789012:sdoneksStack-sdNotificationOutputCfn-abcdefgh)

AWS CLI

Run the following command to get the SNS topic ARN:

aws cloudformation describe-stacks --stack-name sdoneksStack --output text --query 'Stacks[0].Outputs[?OutputKey==`sdNotificationOutputArn`].OutputValue'

To receive messages, you need to add your message receiver (such as an Amazon SQS queue, HTTP endpoint, etc.) as a subscription to this SNS topic.

AWS Management Console

In the left navigation pane, choose Subscriptions.
On the Subscriptions page, choose Create subscription.
On the Create subscription page under Details, do the following:
- For Topic ARN, select the ARN you recorded in the previous step.
- For Protocol, select the type of your receiver.
- For Endpoint, enter the address of your receiver, such as an email address or the ARN of an Amazon SQS queue.
Choose Create subscription

AWS CLI

Use Amazon SNS with the AWS CLI to add a subscription to this topic.

Callback Message Format

The guidance code will send task completion notifications to SNS in the following format, regardless of the API version used in the request:

{
    "id": "task_id", // Task ID
    "result": true, // true for successful completion, false for unsuccessful completion
    "runtime": "sdruntime", // Name of the runtime that sent the notification
    "image_url": [ // S3 URLs of the generated images, in the format of task ID + 4 random characters + image sequence number, all image links will be included if there are multiple images
        "s3://outputbucket/output/test-t2i/test-t2i-abcd-1.png"
    ],
    "output_url": "s3://outputbucket/output/test-t2i/test-t2i-abcd.out", // S3 URL of the task output, containing the full return from the runtime
    "context": { // Context content included in the request
        "abc": 123
    }
}
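A subscriber could process the callback as in the following sketch. It assumes the notification arrives either as the plain callback JSON or wrapped in the standard SNS envelope, where the payload is a JSON string in the Message field (as happens for SQS or HTTP subscriptions without raw message delivery). The function name parse_callback and the error handling are hypothetical.

```python
import json


def parse_callback(body):
    """Return (task_id, image_urls) from a callback; raise if the task failed."""
    message = json.loads(body) if isinstance(body, str) else body
    # Unwrap the SNS envelope if present: the callback is the "Message" string.
    if "Message" in message and "id" not in message:
        message = json.loads(message["Message"])
    if not message.get("result"):
        raise RuntimeError(
            f"task {message.get('id')} failed; see {message.get('output_url')}"
        )
    return message["id"], message.get("image_url", [])
```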

Uninstall the Guidance

The deployed Guidance code can be deleted using the CloudFormation console.

Permanent Deletion
All deleted resources will be permanently removed and cannot be recovered by any means.

Deletion Scope

The following resources will be permanently deleted:
- Amazon EKS cluster and all worker nodes
- SNS topics and all subscriptions
- SQS queues
- VPC, subnets, and related networking resources
- IAM roles used by the guidance

The following resources will not be deleted:
- S3 bucket for storing output images
- S3 bucket for storing models

Pre-Deletion Preparation

Before deleting the guidance resources, ensure the following conditions are met:

All SQS queues have been emptied.
No additional policies are attached to IAM roles.
No additional resources (such as EC2 instances, ENIs, Cloud9 environments, etc.) exist within the VPC.

Deleting the Solution

You can delete the guidance code using either the CDK CLI or the AWS Management Console.

Use AWS Management Console:

Navigate to the AWS CloudFormation console.
Select Stacks.
In the list, select sdoneksStack (or another name you customized).
Select Delete, and in the pop-up dialog, choose Delete.

Use AWS CDK CLI:

In the guidance’s source code directory, run the following command to uninstall the deployed code:

npx cdk destroy

Deleting the guidance resources should take approximately 20-30 minutes.

Contributors

Bingjiao Yu, Solutions Architect Container Specialist
Daniel Zilberman, Sr. Solutions Architect Tech Solutions team
Harold Sun, Sr. GCR Serverless Special Solution Architect

Acknowledgement for code contributors:

Anbei Zhao, Sr. Solutions Architect
Congyao Lu, Solutions Architect
Felix Wang, Sr. Solutions Architect
Walkley He, Manager of Container Specialist Solutions Architects
Xi Wan, Solutions Architect

Acknowledgement for project sponsors:

Xiaoming Fu, Solutions Architect Manager
Zhanling Chen, AppMod Specialist Manager

Notices

Customers are responsible for making their own independent assessment of the information in this document. This document: (a) is for informational purposes only, (b) represents AWS current product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.
