# File/Folder Reference

## Root File Structure
This page highlights the key components of the file hierarchy you will encounter when navigating the guidance. For clarity, extraneous files may be omitted.
```
root/
├─ docs/
├─ business-logic/
├─ infrastructure/
├─ package.json
```
- Docs
    - This folder holds all of the files for this documentation site (MkDocs).
- Business Logic
    - This folder holds all of the "business logic", non-infrastructure files, such as:
        - Source scripts and files for Lambda, Glue, OpenSearch, API Gateway, and Flink to be used for deployment
        - Sample data event generator scripts
- Infrastructure
    - This folder holds all Infrastructure-as-Code (IaC) deployment files for the guidance. You can choose between AWS CDK and Terraform.
- Package.json
    - Has the config option for deploying with CDK or Terraform ("config": "iac": "cdk" | "tf") and all build commands; a sketch is shown below. See Getting Started for more details.
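As a rough illustration only (the exact keys and script names in your copy of the repository may differ), the IaC selector in package.json looks something like this:

```json
{
  "config": {
    "iac": "cdk"
  }
}
```

Setting "iac" to "tf" targets Terraform instead; the build commands themselves live under the standard "scripts" section of package.json and are covered in Getting Started.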
## Business Logic Folder Deep Dive
```
business-logic/
├─ api/
│ ├─ admin/
│ ├─ api-definitions/
│ ├─ lambda-authorizer/
├─ data-lake/
│ ├─ glue-scripts/
├─ events-processing/
├─ flink-event-processing/
├─ opensearch-ingestion/
├─ publish-data/
```
- api
    - This folder holds the business logic for the API backend:
        - admin/ holds all of the API serverless backend logic on AWS Lambda
        - api-definitions/ holds OpenAPI templates that represent the API calls. These serve either as a reference for CDK or are used directly as deployment templates for Terraform.
        - lambda-authorizer/ holds the Lambda code for the API Gateway component that authorizes API calls before they pass through (see the authorizer sketch after this list). See Component Deep Dive for more details.
- data-lake
    - This folder holds all of the ETL scripts that Glue runs as jobs, such as conversion jobs from one format to another, or processing jobs that optimize formats, partition the data, and move events from raw to processed. The guidance comes with skeleton sample scripts that inherit best practices but perform no direct transformations on the data, since those are customized by you for your use cases (a job skeleton is sketched after this list).
- events-processing
    - Only used in Data Lake mode. Amazon Data Firehose uses the logic in here through its integrated Lambda feature for in-flight data sanitation and ETL, akin to a "pre-processing" option for customers (see the transformation sketch after this list). See Component Deep Dive for details on the pre-processing performed by default. This is different from the ETL processing jobs run by Glue, which are preferable for cost reasons when data can be processed after ingestion into the S3 data lake; for business/technical requirements that require processing as part of ingestion, the logic here can be modified.
- flink-event-processing
    - Only used if Real Time Analytics is enabled. This folder holds the business logic used by Amazon Managed Service for Apache Flink for real-time processing of events. The sample application code uses Apache Maven and by default contains sample transformations based on the sample data sent by the publish-data script (see below).
- opensearch-ingestion
    - Only used if Real Time Analytics is enabled. This folder holds the template files used by Amazon OpenSearch Service during deployment.
- publish-data
    - Has a sample data event generator script in Python that simulates generic game events (a sketch follows below). The sample Athena queries and the Flink sample code are tied to these sample game events, so they will need to be tailored to your specific event data.
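For the api folder, the following is a minimal sketch of what a Lambda authorizer handler can look like. It is an assumption for illustration, not the shipped code in business-logic/api/lambda-authorizer/: the header name, validation rule, and principal identifier are all hypothetical.

```python
# Hypothetical sketch of a request-based Lambda authorizer; the real handler in
# business-logic/api/lambda-authorizer/ may use a different event shape and checks.
def handler(event, context):
    token = (event.get("headers") or {}).get("authorization", "")

    # Placeholder validation; a real authorizer would verify the token
    # (for example a JWT signature or an API key lookup).
    effect = "Allow" if token else "Deny"

    return {
        "principalId": "caller",  # hypothetical principal identifier
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": event["methodArn"],
                }
            ],
        },
    }
```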
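For the data-lake folder, a Glue ETL job skeleton typically looks something like the sketch below. This is only an approximation of the shipped skeleton scripts; the job parameter names (source_path, target_path) and the read/write formats are assumptions.

```python
# Illustrative Glue ETL job skeleton; see business-logic/data-lake/glue-scripts/
# for the actual skeleton scripts shipped with the guidance.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Job parameters (names here are hypothetical) passed in by the Glue workflow.
args = getResolvedOptions(sys.argv, ["JOB_NAME", "source_path", "target_path"])

glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw events, apply your own transformations, then write processed output.
raw = glue_context.spark_session.read.json(args["source_path"])
processed = raw  # customize: filtering, format conversion, partitioning, etc.
processed.write.mode("append").parquet(args["target_path"])

job.commit()
```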
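The events-processing logic plugs into Firehose's record-transformation Lambda contract (records in, records out with a recordId, a result of Ok/Dropped/ProcessingFailed, and re-encoded data). A minimal sketch of that contract follows; the sanitation rule and the event_id field used here are assumptions, not the default pre-processing described in Component Deep Dive.

```python
# Hypothetical in-flight transformation sketch following the Firehose
# record-transformation contract; the shipped pre-processing logic differs.
import base64
import json


def handler(event, context):
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))

        # Example sanitation rule (hypothetical): drop records missing an event id.
        if "event_id" not in payload:
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue

        transformed = json.dumps(payload) + "\n"
        output.append(
            {
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(transformed.encode()).decode(),
            }
        )
    return {"records": output}
```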
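Finally, a rough stand-in for the publish-data generator is shown below. The event fields, event types, and the target Kinesis stream name are hypothetical; consult the actual script for the real event schema used by the sample Athena queries and Flink code.

```python
# Illustrative stand-in for the publish-data event generator; the shipped
# script's event schema and target stream may differ.
import json
import random
import time
import uuid

import boto3

kinesis = boto3.client("kinesis")

EVENT_TYPES = ["login", "logout", "level_complete", "purchase"]  # hypothetical


def make_event():
    return {
        "event_id": str(uuid.uuid4()),
        "event_type": random.choice(EVENT_TYPES),
        "player_id": str(uuid.uuid4()),
        "event_timestamp": int(time.time()),
    }


def main(stream_name="sample-game-events", count=10):
    for _ in range(count):
        event = make_event()
        kinesis.put_record(
            StreamName=stream_name,
            Data=json.dumps(event),
            PartitionKey=event["player_id"],
        )


if __name__ == "__main__":
    main()
```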
## Infrastructure Folder Deep Dive
```
infrastructure/
├─ aws-cdk/
│ ├─ src/
├─ terraform/
│ ├─ src/
├─ config.yaml.TEMPLATE
```
- aws-cdk / terraform
    - Holds the IaC template code for the respective tool; the main source files are in /src.
- config.yaml.TEMPLATE
    - Sample template file for the settings/configurations of your deployment, which is copied into config.yaml (see the command below). See Getting Started for setting up the config file and deployment.
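Assuming a POSIX shell and that you are working from the infrastructure/ folder, the copy step might look like the following; the full setup flow is documented in Getting Started.

```sh
cp config.yaml.TEMPLATE config.yaml
```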
## CDK Folder Deep Dive
```
src/
├─ constructs/
│ ├─ samples/
├─ helpers/
├─ app-stack.ts
├─ app.ts
```
## Terraform Folder Deep Dive
```
src/
├─ constructs/
│ ├─ samples/
├─ main.tf
```
- constructs
    - Holds all of the infrastructure template components for the guidance, broken down into logical parts called constructs. See below for a reference for each construct.
- (CDK only) app.ts
    - Contains variable validation and top-level deployment dependencies specific to CDK / TypeScript.
## Construct Reference
- main construct (app-stack.ts / main.tf)
    - Root template file that connects all constructs together, along with deploying central components that have dependencies across multiple constructs (Logs + Analytics S3 buckets, DynamoDB tables, SNS encryption key + IAM policies)
- api-construct
    - Deploys API backend related infrastructure, such as API Gateway and IAM roles
- dashboard-construct
    - Deploys the operational real-time CloudWatch dashboard components, such as the dashboard, widgets, and metrics
- data-lake-construct
    - Deploys Data Lake Mode components, such as the Glue Database, Glue Tables (Iceberg or Hive), and Athena workgroup
- data-processing-construct
    - Deploys Data Lake Mode components specific to ETL processing, such as the Glue Workflow, Glue Crawler + Trigger, and Glue ETL Jobs + IAM Roles
- flink-construct
    - Deploys Real Time Mode components for Managed Service for Apache Flink, such as the Flink Application + IAM Roles and Flink Log Groups
- lambda-construct
    - Deploys all Lambda function components across the guidance, such as Data Lake Mode event processing (Firehose), API Authorizer, API Backend (Admin Service, Redshift deployment service), and corresponding Lambda IAM Roles
- metrics-construct
    - Deploys CloudWatch Alarms, such as Kinesis Data Streams alarms if those components are enabled, Data Lake Mode Lambda error alarms, Data Lake Mode Firehose alarms, DynamoDB error alarms, and API Gateway error alarms
- opensearch-construct
    - Deploys OpenSearch components such as the serverless cluster, ingestion dead letter queue, encryption, IAM roles, and dashboard
- redshift-construct
    - Deploys Redshift Mode components such as the Serverless Redshift cluster, encryption, IAM Roles, and a Security Group added to the VPC construct's networking components
- streaming-ingestion-construct
    - Deploys Data Lake Mode Firehose components + IAM role + Log Groups for Iceberg and Hive
- vpc-construct
    - Deploys bare-bones VPC components for Redshift Mode or Real Time: Enabled deployments, which have VPC dependencies
- samples/athena-construct
    - Deploys sample Athena queries in a nested samples folder; these are entirely dependent on the sample event code and are subject to the greatest adjustment when implementing your own custom events