InsuranceLake Quickstart Guide

If you’d like to quickly start transforming some sample raw insurance data and running SQL on the resulting dataset, without worrying about CI/CD, follow the steps in this section.

Contents

  • Python/CDK Basics
  • Deploy the Application
  • Try out the ETL Process
  • Next Steps

Python/CDK Basics

  1. Open the AWS Console in the us-east-2 (Ohio) Region.

    InsuranceLake uses us-east-2 by default. To change the Region, refer to the Quickstart with CI/CD.

  2. Select AWS CloudShell at the bottom of the page and wait for a few seconds until it is available for use.
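
    To confirm that CloudShell is using the account and Region you expect (an optional check, not part of the original steps), you can run:

     aws sts get-caller-identity
     echo $AWS_REGION
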
  3. Ensure you are using the latest version of the AWS Cloud Development Kit (AWS CDK) Toolkit for Node.js.
     sudo npm install -g aws-cdk
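
    To verify the installed versions (an optional check), run:

     node --version
     cdk --version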
    
  4. Clone the repositories.
     git clone https://github.com/aws-solutions-library-samples/aws-insurancelake-infrastructure.git
     git clone https://github.com/aws-solutions-library-samples/aws-insurancelake-etl.git
    
  5. Use a terminal or command prompt and change the working directory to the location of the infrastructure code.
     cd aws-insurancelake-infrastructure
    
  6. Create a Python virtual environment.

    In CloudShell, your home directory is limited to 1 GB of persistent storage. To ensure there is enough storage to download and install the required Python packages, use CloudShell’s temporary storage, located in /tmp, which has a larger capacity.

     python3 -m venv /tmp/.venv
    
  7. Activate the virtual environment.
     source /tmp/.venv/bin/activate
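
    To confirm the virtual environment is active (an optional check), verify that the Python interpreter now resolves to the virtual environment in /tmp/.venv:

     which python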
    
  8. Install required Python libraries.

    You may see a warning stating that a newer version is available; it is safe to ignore this for the Quickstart.

     pip install -r requirements.txt
    
  9. Bootstrap CDK in your AWS account.
     cdk bootstrap
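
    Bootstrapping creates a CloudFormation stack named CDKToolkit. To confirm it completed (an optional check), you can run:

     aws cloudformation describe-stacks --stack-name CDKToolkit --query "Stacks[0].StackStatus" --output text

    A status of CREATE_COMPLETE (or UPDATE_COMPLETE on subsequent runs) means the account is ready for deployments.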
    

Deploy the Application

  1. Confirm you are still in the aws-insurancelake-infrastructure directory.
  2. Deploy infrastructure resources in the development environment (one stack).
     cdk deploy Dev-InsuranceLakeInfrastructurePipeline/Dev/InsuranceLakeInfrastructureS3BucketZones
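
    If you'd like to see the full list of stack names defined by the app before deploying (an optional check), you can run:

     cdk ls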
    
  3. Review and accept AWS Identity and Access Management (IAM) credential creation for the S3 bucket stack.
    • Wait for deployment to finish (approximately 5 minutes).
  4. Copy the S3 bucket name for the Collect bucket to use later.
    • The bucket name will be in the form: dev-insurancelake-<AWS Account ID>-<Region>-collect.
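    • If you prefer the CLI, you can also find the bucket name by listing your buckets (a convenience command, not part of the original steps):

     aws s3 ls | grep collect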
  5. Switch the working directory to the location of the ETL code.
     cd ../aws-insurancelake-etl
    
  6. Deploy the ETL resources in the development environment (four stacks).
     cdk deploy Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlDynamoDb Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlGlue Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlStepFunctions Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlAthenaHelper
    
    • Wait approximately 1 minute for the DynamoDB stack deployment to finish.
  7. Review and accept IAM credential creation for the AWS Glue jobs stack.
    • Wait approximately 3 minutes for deployment to finish.
  8. Review and accept IAM credential creation for the Step Functions stack.
    • Wait approximately 7 minutes for deployment of Step Functions and Athena Helper stacks to finish.
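
    To confirm that all stacks deployed successfully (an optional check), you can list the completed CloudFormation stacks; this assumes the generated stack names contain "InsuranceLake":

     aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE --query "StackSummaries[?contains(StackName, 'InsuranceLake')].StackName" --output text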

Try out the ETL Process

  1. Populate the DynamoDB lookup table with sample lookup data.
     resources/load_dynamodb_lookup_table.py SyntheticGeneralData dev-insurancelake-etl-value-lookup resources/syntheticgeneral_lookup_data.json
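
    To confirm the lookup data loaded (an optional check), you can count the items in the table:

     aws dynamodb scan --table-name dev-insurancelake-etl-value-lookup --select COUNT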
    
  2. Transfer the sample claim data to the Collect bucket.
     aws s3 cp resources/syntheticgeneral-claim-data.csv s3://<Collect S3 bucket>/SyntheticGeneralData/ClaimData/
    
  3. Transfer the sample policy data to the Collect bucket.
     aws s3 cp resources/syntheticgeneral-policy-data.csv s3://<Collect S3 bucket>/SyntheticGeneralData/PolicyData/
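
    Uploading a file to the Collect bucket triggers the ETL state machine. To confirm the objects landed (an optional check), you can run:

     aws s3 ls s3://<Collect S3 bucket>/SyntheticGeneralData/PolicyData/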
    
  4. Open Step Functions in the AWS Console and select dev-insurancelake-etl-state-machine.
    (Screenshot: selecting the state machine in the Step Functions console)
  5. Open the state machine execution in progress and monitor the status until complete.
    (Screenshot: selecting the running execution in Step Functions)
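
    If you prefer to monitor from the CLI (an optional alternative), look up the state machine ARN and list its most recent executions:

     aws stepfunctions list-state-machines --query "stateMachines[?name=='dev-insurancelake-etl-state-machine'].stateMachineArn" --output text
     aws stepfunctions list-executions --state-machine-arn <state machine ARN> --max-results 3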
  6. Open Athena in the AWS Console.
  7. Select Launch Query Editor, and change the Workgroup to insurancelake.
  8. Run the following query to view a sample of the prepared data in the Consume bucket:
     SELECT * FROM syntheticgeneraldata_consume.policydata LIMIT 100
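
    If you prefer to run the query from the CLI (an optional alternative), you can submit it to the same workgroup and fetch the results with the returned query execution ID:

     aws athena start-query-execution --work-group insurancelake --query-string "SELECT * FROM syntheticgeneraldata_consume.policydata LIMIT 100"
     aws athena get-query-results --query-execution-id <query execution ID>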
    

Next Steps



Copyright Amazon.com and its affiliates; all rights reserved. This file is Amazon Web Services Content and may not be duplicated or distributed without permission.
