AWS Services and IAM Role Requirements for GenAI IDP Accelerator
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: MIT-0
This document outlines the AWS services used by the GenAI Intelligent Document Processing (IDP) Accelerator solution, along with the IAM role scopes needed for deployment and operation.
AWS Services Used
Core Infrastructure Services
| Service | Usage | Deployment | Runtime |
|---|---|---|---|
| Amazon S3 | Stores input documents, processed outputs, and web UI assets | ✓ | ✓ |
| Amazon DynamoDB | Tracks document processing, manages configurations and concurrency | ✓ | ✓ |
| AWS Lambda | Executes document processing functions and business logic | ✓ | ✓ |
| AWS Step Functions | Orchestrates document processing workflows | ✓ | ✓ |
| Amazon SQS | Queues documents for processing and handles throttling | ✓ | ✓ |
| Amazon EventBridge | Triggers document processing workflows when files are uploaded | ✓ | ✓ |
| Amazon CloudFront | Delivers the web UI with global distribution (default hosting mode) | ✓ | ✓ |
| Elastic Load Balancing (ALB) | Alternative web UI hosting via Application Load Balancer for VPC-based deployments (see ALB Hosting) | ✓ | ✓ |
| AWS CloudFormation | Deploys and manages the solution infrastructure | ✓ | |
| AWS SAM | Simplifies serverless application deployment | ✓ | |
| AWS CodeBuild | Builds and packages the web UI assets | ✓ | |
AI/ML Services
| Service | Usage | Deployment | Runtime |
|---|---|---|---|
| Amazon Bedrock | Provides foundation models for document understanding | ✓ | ✓ |
| Amazon Bedrock Guardrails | Enforces content safety, information security, and model usage policies | ✓ | ✓ |
| Amazon Textract | Extracts text and data from documents (OCR) | | ✓ |
| Amazon SageMaker | Hosts custom ML models for document classification (UDOP) | ✓ | ✓ |
| Amazon Bedrock Knowledge Base | Enables semantic document querying (optional) | ✓ | ✓ |
| Bedrock Data Automation (BDA) | Automates document processing workflows (Pattern 1) | ✓ | ✓ |
Auth & API Services
| Service | Usage | Deployment | Runtime |
|---|---|---|---|
| Amazon Cognito | Manages user authentication and authorization | ✓ | ✓ |
| AWS AppSync | Provides GraphQL API for the web UI | ✓ | ✓ |
| AWS WAF | Protects web applications from web exploits (optional) | ✓ | ✓ |
Monitoring & Operations
| Service | Usage | Deployment | Runtime |
|---|---|---|---|
| Amazon CloudWatch | Provides monitoring, logging, and alerting | ✓ | ✓ |
| Amazon SNS | Delivers operational alerts and notifications | ✓ | ✓ |
| AWS KMS | Manages encryption keys for secure data storage | ✓ | ✓ |
IAM Role Requirements
Enterprise Deployment Considerations
For organizations with Service Control Policies (SCPs) that mandate permissions boundaries on all IAM roles, the solution provides the `PermissionsBoundaryArn` parameter. Specify this optional parameter at deployment to attach a permissions boundary to every IAM role the stack creates, both roles defined explicitly and roles that AWS SAM generates implicitly for functions.
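Usage:
```bash
# --stack-name is required by `aws cloudformation deploy`; replace the placeholder with your stack name.
aws cloudformation deploy \
  --stack-name YOUR_STACK_NAME \
  --template-file template.yaml \
  --parameter-overrides PermissionsBoundaryArn=arn:aws:iam::123456789012:policy/MyPermissionsBoundary \
  --capabilities CAPABILITY_IAM
```
When no permissions boundary is specified, roles are deployed without one, which preserves backward compatibility.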
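For scripted deployments, the optional nature of the parameter can be captured in a small helper. The sketch below is an illustration only and not part of the solution: `build_deploy_command` and the stack name are hypothetical. It assembles the argument list for `aws cloudformation deploy` and adds the `PermissionsBoundaryArn` override only when a boundary ARN is supplied.

```python
from typing import List, Optional


def build_deploy_command(
    stack_name: str,
    template_file: str = "template.yaml",
    permissions_boundary_arn: Optional[str] = None,
) -> List[str]:
    """Assemble an `aws cloudformation deploy` argument list.

    Hypothetical helper: the override is added only when an ARN is given,
    mirroring the optional PermissionsBoundaryArn parameter.
    """
    command = [
        "aws", "cloudformation", "deploy",
        "--stack-name", stack_name,
        "--template-file", template_file,
        "--capabilities", "CAPABILITY_IAM",
    ]
    if permissions_boundary_arn:
        command += [
            "--parameter-overrides",
            f"PermissionsBoundaryArn={permissions_boundary_arn}",
        ]
    return command


if __name__ == "__main__":
    # Hypothetical stack name; the ARN is the placeholder used in this document.
    print(" ".join(build_deploy_command(
        "my-idp-stack",
        permissions_boundary_arn="arn:aws:iam::123456789012:policy/MyPermissionsBoundary",
    )))
```

The returned list can be passed to `subprocess.run` without shell quoting concerns.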
Deployment Roles
Deploying this solution requires an IAM role or user with the following permissions:
Essential Permissions
- `cloudformation:*` - Create and manage CloudFormation stacks
- `iam:*` - Create and manage IAM roles and policies
- `lambda:*` - Create and configure Lambda functions
- `states:*` - Create and manage Step Functions state machines
- `s3:*` - Create buckets and manage S3 resources
- `dynamodb:*` - Create and configure DynamoDB tables
- `sqs:*` - Create and configure SQS queues
- `events:*` - Create and configure EventBridge rules
- `cloudfront:*` - Create and configure CloudFront distributions
- `cognito-idp:*` - Create and configure Cognito user pools
- `cognito-identity:*` - Create and configure Cognito identity pools for AWS service access
- `appsync:*` - Create and configure AppSync APIs
- `logs:*` - Create and configure CloudWatch log groups
- `cloudwatch:*` - Create and configure CloudWatch dashboards and alarms
- `sns:*` - Create and configure SNS topics
Pattern-Specific Permissions
- `bedrock:*` - Create Bedrock resources (all patterns)
- `sagemaker:*` - Create SageMaker endpoints (Pattern 3)
- `aoss:*` - Create OpenSearch Serverless collections (Knowledge Base feature)
- `kms:*` - Create KMS keys for encryption
- `wafv2:*` - Configure WAF rules (optional)
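The permission lists above map directly onto an IAM policy document. The following is a minimal sketch, assuming the deployer wants one policy that covers the essential service prefixes plus whichever optional ones apply; `build_deployment_policy` and the `Sid` value are hypothetical names, and `Resource` is left as `*` only to keep the example short.

```python
import json
from typing import Dict, Iterable, List

# Service prefixes taken from the Essential Permissions list.
ESSENTIAL_PREFIXES = [
    "cloudformation", "iam", "lambda", "states", "s3", "dynamodb", "sqs",
    "events", "cloudfront", "cognito-idp", "cognito-identity", "appsync",
    "logs", "cloudwatch", "sns",
]


def build_deployment_policy(extra_prefixes: Iterable[str] = ()) -> Dict:
    """Return an IAM policy document allowing `<prefix>:*` for each service.

    Hypothetical helper; pass pattern-specific prefixes via `extra_prefixes`.
    """
    actions: List[str] = [f"{p}:*" for p in [*ESSENTIAL_PREFIXES, *extra_prefixes]]
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "IdpDeployment",  # hypothetical statement ID
            "Effect": "Allow",
            "Action": actions,
            "Resource": "*",  # broad on purpose for the sketch; scope down where possible
        }],
    }


if __name__ == "__main__":
    # Example: a Pattern 3 deployment with KMS encryption.
    print(json.dumps(build_deployment_policy(["bedrock", "sagemaker", "kms"]), indent=2))
```

Generating the document from a list makes it easy to drop optional prefixes, such as `wafv2`, when the corresponding feature is not deployed.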
Runtime Roles
The solution creates various IAM roles to run different components of the system. Key role scopes include:
Document Processing Roles
- Queue Processing Role:
  - `sqs:ReceiveMessage`, `sqs:DeleteMessage`, `sqs:GetQueueAttributes`
  - `dynamodb:GetItem`, `dynamodb:PutItem`, `dynamodb:UpdateItem`
  - `states:StartExecution`
  - `logs:CreateLogGroup`, `logs:CreateLogStream`, `logs:PutLogEvents`
- Step Functions Execution Role:
  - `lambda:InvokeFunction`
  - `states:*`
  - `events:PutEvents`
- OCR Processing Role:
  - `textract:AnalyzeDocument`, `textract:DetectDocumentText`
  - `s3:GetObject`, `s3:PutObject`
  - `logs:*`
- Classification Role:
  - `sagemaker:InvokeEndpoint` (Pattern 3)
  - `bedrock:InvokeModel` (Patterns 2 & 3)
  - `bedrock:ApplyGuardrail` (when Guardrails configured)
  - `s3:GetObject`, `s3:PutObject`
  - `logs:*`
- Extraction Role:
  - `bedrock:InvokeModel`
  - `bedrock:ApplyGuardrail` (when Guardrails configured)
  - `s3:GetObject`, `s3:PutObject`
  - `logs:*`
- BDA Integration Role (Pattern 1):
  - `bedrock:InvokeDataAutomationAsync`
  - `s3:GetObject`, `s3:PutObject`
  - `dynamodb:GetItem`, `dynamodb:PutItem`, `dynamodb:UpdateItem`
  - `cloudwatch:PutMetricData`
  - `logs:*`
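To show how one of these scopes might look as a policy, here is a sketch of the Queue Processing Role permissions with each action group tied to a specific resource. It is an assumption-laden illustration, not the policy the solution's templates generate: the function name, the `Sid` values, and the example ARNs are all hypothetical.

```python
import json
from typing import Dict


def queue_processing_policy(queue_arn: str, table_arn: str, state_machine_arn: str) -> Dict:
    """Build a resource-scoped policy for the Queue Processing Role actions.

    Hypothetical helper; the action lists come from the role scope above.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "ConsumeQueue", "Effect": "Allow",
             "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage", "sqs:GetQueueAttributes"],
             "Resource": queue_arn},
            {"Sid": "TrackDocuments", "Effect": "Allow",
             "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:UpdateItem"],
             "Resource": table_arn},
            {"Sid": "StartWorkflow", "Effect": "Allow",
             "Action": ["states:StartExecution"],
             "Resource": state_machine_arn},
            # Logging is left unscoped here for brevity; a log group ARN could be used instead.
            {"Sid": "WriteLogs", "Effect": "Allow",
             "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
             "Resource": "*"},
        ],
    }


if __name__ == "__main__":
    # All ARNs below are made-up placeholders.
    print(json.dumps(queue_processing_policy(
        "arn:aws:sqs:us-east-1:123456789012:idp-document-queue",
        "arn:aws:dynamodb:us-east-1:123456789012:table/idp-tracking",
        "arn:aws:states:us-east-1:123456789012:stateMachine:idp-workflow",
    ), indent=2))
```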
Web UI & API Roles
- AppSync Service Role:
  - `dynamodb:GetItem`, `dynamodb:Query`, `dynamodb:Scan`
  - `s3:GetObject`, `s3:PutObject`, `s3:ListBucket`
  - `lambda:InvokeFunction`
- Cognito Authentication Role:
  - `appsync:GraphQL`
  - `s3:GetObject` (for UI assets and buckets)
  - `ssm:GetParameter` (for settings)
- Knowledge Base Query Role:
  - `bedrock:InvokeModel`
  - `bedrock:Retrieve`
  - `bedrock:RetrieveAndGenerate`
  - `bedrock:ApplyGuardrail` (when Guardrails configured)
  - `aoss:APIAccessAll` (for OpenSearch Serverless access)
  - `logs:*`
- Knowledge Base Service Role:
  - `bedrock:InvokeModel`
  - `aoss:APIAccessAll`
  - `s3:ListBucket`, `s3:GetObject` (when using S3 data source)
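Several roles carry `bedrock:ApplyGuardrail` only "when Guardrails configured". One way to express that condition is to add the statement only if a guardrail ARN is present. The sketch below does this for the Knowledge Base Query Role; the function name, `Sid` values, and ARNs are hypothetical, and the Bedrock model and retrieval actions are left on `Resource: "*"` for brevity.

```python
from typing import Dict, List, Optional


def knowledge_base_query_policy(collection_arn: str, guardrail_arn: Optional[str] = None) -> Dict:
    """Build a Knowledge Base Query Role policy with an optional guardrail statement.

    Hypothetical helper illustrating a conditionally included permission.
    """
    statements: List[Dict] = [
        {"Sid": "QueryKnowledgeBase", "Effect": "Allow",
         "Action": ["bedrock:InvokeModel", "bedrock:Retrieve", "bedrock:RetrieveAndGenerate"],
         "Resource": "*"},  # not scoped in this sketch
        {"Sid": "VectorStoreAccess", "Effect": "Allow",
         "Action": ["aoss:APIAccessAll"],
         "Resource": collection_arn},
    ]
    if guardrail_arn:
        # Added only when a guardrail is configured.
        statements.append({
            "Sid": "ApplyGuardrail", "Effect": "Allow",
            "Action": ["bedrock:ApplyGuardrail"],
            "Resource": guardrail_arn,
        })
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    policy = knowledge_base_query_policy(
        "arn:aws:aoss:us-east-1:123456789012:collection/examplecollection",
        guardrail_arn="arn:aws:bedrock:us-east-1:123456789012:guardrail/exampleguardrail",
    )
    print([s["Sid"] for s in policy["Statement"]])
```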
Monitoring & Evaluation Roles
- CloudWatch Dashboard Role:
  - `cloudwatch:GetDashboard`, `cloudwatch:PutDashboard`
  - `logs:DescribeLogGroups`
- Workflow Tracking Role:
  - `dynamodb:GetItem`, `dynamodb:PutItem`, `dynamodb:UpdateItem`
  - `cloudwatch:PutMetricData`
  - `logs:*`
- Evaluation Function Role:
  - `s3:GetObject` (from baseline bucket)
  - `s3:PutObject`, `s3:GetObject` (for output bucket)
  - `dynamodb:GetItem`, `dynamodb:PutItem`, `dynamodb:UpdateItem`
  - `bedrock:InvokeModel` (for LLM-based evaluations)
  - `appsync:GraphQL` (for updating evaluation results)
  - `cloudwatch:PutMetricData`
  - `logs:*`
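The Evaluation Function Role distinguishes read-only access to the baseline bucket from read/write access to the output bucket. A minimal sketch of that split, assuming two hypothetical bucket names and a hypothetical helper `evaluation_s3_statements`, could look like this:

```python
from typing import Dict, List


def evaluation_s3_statements(baseline_bucket: str, output_bucket: str) -> List[Dict]:
    """Return S3 statements: read-only on baseline objects, read/write on output objects.

    Hypothetical helper; bucket names are supplied by the caller.
    """
    return [
        {"Sid": "ReadBaseline", "Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": f"arn:aws:s3:::{baseline_bucket}/*"},
        {"Sid": "ReadWriteOutput", "Effect": "Allow",
         "Action": ["s3:GetObject", "s3:PutObject"],
         "Resource": f"arn:aws:s3:::{output_bucket}/*"},
    ]


if __name__ == "__main__":
    # Placeholder bucket names.
    for statement in evaluation_s3_statements("idp-baseline", "idp-output"):
        print(statement["Sid"], statement["Action"], statement["Resource"])
```

Keeping the two buckets in separate statements prevents the function from overwriting baseline data.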
Service Quotas Considerations
For high-volume document processing, consider requesting quota increases for:
| Service | Quota to Increase | Typical Default |
|---|---|---|
| Amazon Bedrock | On-demand InvokeModel tokens per minute | Varies by model |
| Amazon Bedrock | On-demand InvokeModel requests per minute | Varies by model |
| Amazon Bedrock | ApplyGuardrail requests per minute | Varies by region |
| Amazon Textract | DetectDocumentText / AnalyzeDocument transactions per second | 10-25 TPS |
| Amazon SageMaker | Number of endpoints per region | 2-10 endpoints |
| AWS Lambda | Concurrent executions | 1,000 executions |
| AWS Step Functions | State transitions per second | 2,000 transitions |
| Amazon SQS | API requests per queue | Very high by default |
| Amazon CloudWatch | PutMetricData API requests per second | 150 requests/second |
| Bedrock Data Automation | Concurrent jobs (Pattern 1) | Varies by region |
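A quick estimate helps decide whether a rate quota is likely to be a bottleneck. The sketch below is a rough back-of-the-envelope calculation under simplifying assumptions: one API call per page, load spread evenly across the hour, and a hypothetical 80% headroom threshold. The function names are illustrative.

```python
def required_tps(pages_per_hour: float) -> float:
    """Average transactions per second, assuming one call per page and even load."""
    return pages_per_hour / 3600.0


def needs_quota_increase(pages_per_hour: float, quota_tps: float, headroom: float = 0.8) -> bool:
    """True when sustained demand exceeds the chosen fraction of the quota.

    `headroom` is a hypothetical safety margin, not an AWS-defined value.
    """
    return required_tps(pages_per_hour) > quota_tps * headroom


if __name__ == "__main__":
    # Compare two hypothetical workloads against a 10 TPS quota.
    for pages in (20_000, 50_000):
        print(pages, round(required_tps(pages), 2), needs_quota_increase(pages, quota_tps=10))
```

Bursty traffic needs more margin than this average suggests, so treat the result as a lower bound.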
Security Recommendations
When deploying this solution, consider the following security best practices:
- Encryption:
  - Enable SSE-KMS encryption for all S3 buckets
  - Use customer managed KMS keys for sensitive data
  - Enable encryption for DynamoDB tables
- Network Security:
  - Use CloudFront security features (geo-restrictions, HTTPS, etc.) or ALB security groups for VPC-based hosting
  - Configure AWS WAF to protect web interfaces
- Authentication:
  - Enforce MFA for admin users in Cognito
  - Set strong password policies
  - Limit admin access to necessary personnel
- IAM Best Practices:
  - Use least privilege principles for all roles
  - Regularly audit and rotate credentials
  - Enable CloudTrail logging for all API actions
- Content Safety & Control:
  - Configure Bedrock Guardrails with appropriate topic filters
  - Set up content blocking for sensitive information
  - Implement trace logging for guardrail activations
  - Use different guardrail configurations for different environments (dev/test/prod)
- Data Protection:
  - Implement lifecycle policies for S3 objects
  - Configure appropriate retention policies for logs and data
  - Consider data residency requirements when selecting regions
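As one way to act on the least-privilege recommendation, a policy document can be scanned for wildcard actions such as `logs:*` before it is attached to a role. This is a minimal sketch with a hypothetical function name; it inspects only `Allow` statements and their `Action` entries, and does not replace a full policy analysis tool.

```python
from typing import Dict, List


def find_wildcard_actions(policy: Dict) -> List[str]:
    """Return every allowed action in the policy that contains a `*` wildcard."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    findings: List[str] = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be given as a string
            actions = [actions]
        findings.extend(action for action in actions if "*" in action)
    return findings


if __name__ == "__main__":
    example = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject"], "Resource": "*"},
            {"Effect": "Allow", "Action": "logs:*", "Resource": "*"},
        ],
    }
    print(find_wildcard_actions(example))
```

A check like this could run in a CI step so that new wildcard grants are reviewed before deployment.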