# InsuranceLake Deployment Validation
- If you have not previously set the default AWS region for your session, set it now. The default region for InsuranceLake is `us-east-2`.

  ```bash
  export AWS_DEFAULT_REGION=<replace with your region>
  ```
- Transfer the sample claim data to the Collect bucket (Source system: SyntheticGeneralData, Table: ClaimData):

  ```bash
  export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
  aws s3 cp resources/syntheticgeneral-claim-data.csv s3://dev-insurancelake-${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-collect/SyntheticGeneralData/ClaimData/
  ```
- The claim data workflow must finish loading the data into the Cleanse bucket before the policy data in the following step can load successfully. You can poll the Cleanse bucket to confirm, as in the sketch below.
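  A minimal polling loop (the bucket name assumes the `dev` deployment used in these steps):

  ```bash
  # Wait until the Collect to Cleanse job has written the claim data partition
  CLEANSE_PREFIX=s3://dev-insurancelake-${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-cleanse/syntheticgeneraldata/claimdata/
  while [ -z "$(aws s3 ls ${CLEANSE_PREFIX} --recursive 2>/dev/null)" ]; do
      echo "Waiting for claim data to reach the Cleanse bucket..."
      sleep 30
  done
  ```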
- Transfer the sample policy data to the Collect bucket (Source system: SyntheticGeneralData, Table: PolicyData):

  ```bash
  aws s3 cp resources/syntheticgeneral-policy-data.csv s3://dev-insurancelake-${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-collect/SyntheticGeneralData/PolicyData/
  ```
- Upon successful transfer of the file, an event notification from S3 will trigger the `state-machine-trigger` Lambda function.
- The Lambda function will insert a record into the DynamoDB table `{environment}-{resource_name_prefix}-etl-job-audit` to track job start status.
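  To spot-check the audit records from the CLI, you can scan the table (a minimal sketch; the table name shown assumes a `dev` environment with the default `insurancelake` resource name prefix):

  ```bash
  # Return a few recent ETL audit records for inspection
  aws dynamodb scan \
      --table-name dev-insurancelake-etl-job-audit \
      --max-items 5
  ```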
- The Lambda function will also trigger the Step Functions State Machine. The State Machine execution name will be `<filename>-<YYYYMMDDHHMMSSxxxxxx>` and will have the required metadata as input parameters.
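  To confirm the execution started, you can list recent executions (a sketch; replace the placeholder with the ARN of your deployed state machine, found with the first command):

  ```bash
  # Find the deployed state machine, then list its most recent executions
  aws stepfunctions list-state-machines --query 'stateMachines[].name' --output text
  aws stepfunctions list-executions \
      --state-machine-arn <your-state-machine-arn> \
      --max-items 5
  ```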
- The State Machine will trigger the AWS Glue job for Collect to Cleanse data processing. 
- The Collect to Cleanse AWS Glue job will execute the transformation logic defined in configuration files. 
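  You can watch the job run from the CLI (a sketch; look up the exact job name first, since it depends on your environment and resource name prefix):

  ```bash
  # List deployed Glue jobs, then inspect recent runs of the Collect to Cleanse job
  aws glue list-jobs --query 'JobNames' --output text
  aws glue get-job-runs \
      --job-name <collect-to-cleanse-job-name> \
      --max-items 3 \
      --query 'JobRuns[].{Started:StartedOn,State:JobRunState}'
  ```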
- The AWS Glue job will load the data into the Cleanse bucket using the provided metadata. The data will be stored in S3 as `s3://{environment}-{resource_name_prefix}-{account}-{region}-cleanse/syntheticgeneraldata/claimdata/year=YYYY/month=MM/day=DD` in Apache Parquet format.
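  To confirm the Parquet output landed, list the partition (bucket naming assumes the `dev` deployment used above):

  ```bash
  # Show the date-partitioned Parquet files written by the Collect to Cleanse job
  aws s3 ls s3://dev-insurancelake-${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-cleanse/syntheticgeneraldata/claimdata/ --recursive
  ```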
- The AWS Glue job will create or update the AWS Glue Catalog table using the table name passed as a parameter, based on the folder name (`PolicyData` and `ClaimData`).
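  You can verify the catalog entries with the CLI (a sketch; the database name `syntheticgeneraldata` is an assumption based on the folder naming above):

  ```bash
  # List the tables registered in the Cleanse-layer Glue database
  aws glue get-tables \
      --database-name syntheticgeneraldata \
      --query 'TableList[].Name'
  ```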
- After the Collect to Cleanse AWS Glue job completes, the State Machine will trigger the Cleanse to Consume AWS Glue job. 
- The Cleanse to Consume AWS Glue job will execute the SQL logic defined in configuration files. 
- The Cleanse to Consume AWS Glue job will store the resulting data set in S3 as `s3://{environment}-{resource_name_prefix}-{account}-{region}-consume/syntheticgeneraldata/claimdata/year=YYYY/month=MM/day=DD` in Apache Parquet format.
- The Cleanse to Consume AWS Glue job will create or update the AWS Glue Catalog table. 
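  To verify, list the tables in the Consume-layer database (the database name matches the one queried in the final validation step below):

  ```bash
  # List the tables registered in the Consume-layer Glue database
  aws glue get-tables \
      --database-name syntheticgeneraldata_consume \
      --query 'TableList[].Name'
  ```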
- After successful completion of the Cleanse to Consume AWS Glue job, the State Machine will trigger the `etl-job-auditor` Lambda function to update the DynamoDB table `{environment}-{resource_name_prefix}-etl-job-audit` with the latest status.
- An Amazon Simple Notification Service (Amazon SNS) notification will be sent to all subscribed users. 
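  If you have not yet subscribed to the notification topic, you can do so from the CLI (a sketch; the topic ARN is a placeholder to look up from the first command's output):

  ```bash
  # Find the ETL notification topic, then subscribe an email endpoint to it
  aws sns list-topics --query 'Topics[].TopicArn' --output text
  aws sns subscribe \
      --topic-arn <etl-notification-topic-arn> \
      --protocol email \
      --notification-endpoint you@example.com
  ```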
- To validate the data load, use Athena and execute the following query:

  ```sql
  select * from syntheticgeneraldata_consume.policydata limit 100
  ```
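  The same query can also be run from the CLI if you prefer (a sketch; the Athena results bucket is a placeholder for your own query results location):

  ```bash
  # Start the query, capture its execution ID, then fetch the first rows
  QUERY_ID=$(aws athena start-query-execution \
      --query-string 'select * from syntheticgeneraldata_consume.policydata limit 100' \
      --result-configuration OutputLocation=s3://<your-athena-results-bucket>/ \
      --query QueryExecutionId --output text)
  aws athena get-query-results --query-execution-id ${QUERY_ID} --max-items 10
  ```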