InsuranceLake Quickstart Guide
Follow the steps in this section if you’d like to get started quickly: transform some sample raw insurance data and run SQL on the resulting dataset, all without setting up CI/CD.
Python/CDK Basics
- Open the AWS Console in the us-east-2 (Ohio) Region. InsuranceLake uses us-east-2 by default; to change the Region, refer to the Quickstart with CI/CD.
- Select AWS CloudShell at the bottom of the page and wait a few seconds until it is available for use.
- Ensure you are using the latest version of the AWS SDK for Node.js and the AWS CDK.
sudo npm install -g aws-sdk aws-cdk
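If you want to confirm the tooling is in place before continuing, a quick version check (exact versions will vary) looks like this:

```bash
# Confirm the Node.js runtime and CDK CLI are installed and on the PATH
node --version
cdk --version
```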
- Clone the repositories.
git clone https://github.com/aws-solutions-library-samples/aws-insurancelake-infrastructure.git
git clone https://github.com/aws-solutions-library-samples/aws-insurancelake-etl.git
- Use a terminal or command prompt and change the working directory to the location of the infrastructure code.
cd aws-insurancelake-infrastructure
- Create a Python virtual environment.
In CloudShell your home directory is limited to 1 GB of persistent storage. To ensure you have enough storage to download and install the required Python packages, you will use CloudShell’s temporary storage, located in /tmp, which has a larger capacity.
python3 -m venv /tmp/.venv
- Activate the virtual environment.
source /tmp/.venv/bin/activate
- Install required Python libraries.
You may see a warning from pip stating that a newer version is available; it is safe to ignore this for the Quickstart.
pip install -r requirements.txt
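If you would rather clear that warning, upgrading pip inside the virtual environment is safe:

```bash
# Optional: upgrade pip within the active virtual environment to silence the notice
python3 -m pip install --upgrade pip
```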
- Bootstrap CDK in your AWS account.
cdk bootstrap
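cdk bootstrap uses your current default account and Region. If CloudShell resolves a different target than you expect, you can name it explicitly; the account ID below is a placeholder:

```bash
# Bootstrap a specific account and Region explicitly (substitute your account ID)
cdk bootstrap aws://<AWS Account ID>/us-east-2
```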
Deploy the Application
- Confirm you are still in the aws-insurancelake-infrastructure directory.
- Deploy infrastructure resources in the development environment (one stack).
cdk deploy Dev-InsuranceLakeInfrastructurePipeline/Dev/InsuranceLakeInfrastructureS3BucketZones
- Review and accept AWS Identity and Access Management (IAM) credential creation for the S3 bucket stack.
- Wait for deployment to finish (approximately 5 minutes).
- Copy the S3 bucket name for the Collect bucket to use later. The bucket name will be in the form: dev-insurancelake-<AWS Account ID>-<Region>-collect.
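If you'd rather pull the bucket name from the command line than from the console, a simple filter over your bucket list works, assuming the naming convention above:

```bash
# List all S3 buckets and filter for the Collect bucket
aws s3 ls | grep collect
```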
- Switch the working directory to the location of the ETL code.
cd ../aws-insurancelake-etl
- Deploy the ETL resources in the development environment (four stacks).
cdk deploy Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlDynamoDb Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlGlue Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlStepFunctions Dev-InsuranceLakeEtlPipeline/Dev/InsuranceLakeEtlAthenaHelper
- Wait approximately 1 minute for the DynamoDB stack deployment to finish.
- Review and accept IAM credential creation for the AWS Glue jobs stack.
- Wait approximately 3 minutes for the Glue stack deployment to finish.
- Review and accept IAM credential creation for the Step Functions stack.
- Wait approximately 7 minutes for deployment of Step Functions and Athena Helper stacks to finish.
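To double-check the deployment from the terminal, you can list completed CloudFormation stacks; expect entries whose names match the stacks in the deploy command above, though exact names may differ in your account:

```bash
# List completed CloudFormation stacks and print their names
aws cloudformation list-stacks \
    --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE \
    --query 'StackSummaries[].StackName'
```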
Try out the ETL Process
- Populate the DynamoDB lookup table with sample lookup data.
python3 resources/load_dynamodb_lookup_table.py SyntheticGeneralData dev-insurancelake-etl-value-lookup resources/syntheticgeneral_lookup_data.json
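To confirm the lookup data loaded, a quick scan of the table (using the table name from the command above) returns a few sample items:

```bash
# Sample a few items from the lookup table to verify the load
aws dynamodb scan --table-name dev-insurancelake-etl-value-lookup --max-items 5
```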
- Transfer the sample claim data to the Collect bucket.
aws s3 cp resources/syntheticgeneral-claim-data.csv s3://<Collect S3 bucket>/SyntheticGeneralData/ClaimData/
- Transfer the sample policy data to the Collect bucket.
aws s3 cp resources/syntheticgeneral-policy-data.csv s3://<Collect S3 bucket>/SyntheticGeneralData/PolicyData/
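Before watching the pipeline run, you can verify that both files landed in the Collect bucket (substitute your bucket name):

```bash
# Confirm the sample claim and policy files are in place
aws s3 ls s3://<Collect S3 bucket>/SyntheticGeneralData/ --recursive
```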
- Open Step Functions in the AWS Console and select dev-insurancelake-etl-state-machine.
- Open the state machine execution in progress and monitor the status until complete.
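If you prefer to monitor from the terminal instead of the console, the Step Functions CLI can report execution status. The first command looks up the ARN for the state machine named above; the second lists its recent executions:

```bash
# Find the state machine ARN, then check the status of its recent executions
aws stepfunctions list-state-machines \
    --query 'stateMachines[?name==`dev-insurancelake-etl-state-machine`].stateMachineArn'
aws stepfunctions list-executions --state-machine-arn <state machine ARN> --max-results 5
```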
- Open Athena in the AWS Console.
- Select Launch Query Editor and change the Workgroup to insurancelake.
- Run the following query to view a sample of prepared data in the Consume bucket:
select * from syntheticgeneraldata_consume.policydata limit 100
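The same query can be run from the CLI; this sketch assumes the insurancelake workgroup selected above, which supplies the query result location:

```bash
# Start the query in the insurancelake workgroup, then fetch results by execution ID
aws athena start-query-execution \
    --query-string "SELECT * FROM syntheticgeneraldata_consume.policydata LIMIT 100" \
    --work-group insurancelake
aws athena get-query-results --query-execution-id <QueryExecutionId from previous output>
```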
Next Steps
- Take the InsuranceLake Deep Dive Workshop.
- You may skip ahead to the Modify and test a transform step, as the earlier workshop steps overlap with these Quickstart instructions.
- Try out loading your own data.
- Try the Quickstart with CI/CD.
- Dive deeper with the included user documentation.
- Contact your AWS account team for a solution deep dive, workshops, or AWS Professional Services support.