Guidance for Operationalizing Development with Amazon CodeWhisperer

Summary: Amazon CodeWhisperer is a general-purpose AI code generator that provides you with code recommendations. This guidance helps you configure and optimize the tool responsibly.

What is CodeWhisperer?

Amazon CodeWhisperer is a general-purpose AI code generator that provides you with code recommendations in real time. As you write code, CodeWhisperer automatically generates suggestions based on your existing code and comments. Your personalized recommendations can vary in size and scope, ranging from a single-line comment to fully formed functions.

CodeWhisperer is supported in the following integrated development environments (IDEs):

AWS Cloud9
Visual Studio Code (VS Code)
JetBrains (includes most development tools such as PyCharm, IntelliJ, and WebStorm)
AWS Lambda console code editor

Features of CodeWhisperer include:

Real-time code suggestions customized for you.
Support for popular programming languages and IDEs.
Built-in security scans.
Code responsibly: Reference tracker for open-source code.
Code inclusively: Bias avoidance.
Enterprise administration.

Supported by the preceding features, the following benefits can improve developer productivity:

Reduction of hand-typed keystrokes
Reduction of manual search for documentation (such as developer guides and code repositories)
Automatic generation of unit tests

CodeWhisperer is NOT:

IntelliSense
An autonomous, AI-driven developer
A replacement for human developers

Responsibility of developers

With CodeWhisperer come several responsibilities for the developers who use the tool to generate code. The intention of CodeWhisperer is to improve productivity and assist developers, not replace them. While CodeWhisperer can be used by developers of any background or experience, CodeWhisperer is most effective with experienced developers who:

Are familiar with foundational concepts of computer science (such as data structures, algorithms, time complexity, and space complexity).
Are knowledgeable about the programming language used, from proper syntax to best practices for implementation.
Can effectively debug in the programming language used.
Are familiar with implementing and performing tests for the programming language used.

These attributes are critical for proper design, implementation, testing, and deployment of applications, and cannot be comprehensively replaced by CodeWhisperer alone. Just as developers review code produced by their peers to uncover bugs or paths to improvement, developers need to be vigilant with code generated by artificial intelligence (AI) systems. Developers must recognize that CodeWhisperer may suggest ineffective, erroneous, or unrelated code snippets, and understand that the developers alone are responsible for accepting and integrating the AI-suggested code into their codebases.

While it only takes a moment to accept and merge AI suggestions into a codebase, it is the developer’s responsibility to observe, consider, and address the following:

Is the code suggestion syntactically correct? Will it be interpreted and compiled successfully?
Will the source code still be interpreted and compiled successfully with the introduction of the code suggestion?
Are all the required dependency libraries included for the code suggestion? Are these dependencies secure and approved?
Is the code suggestion appropriately placed within the program?
Is the code suggestion efficient? Is there a better way?
Does the code suggestion correctly address the developer’s intent?
Does the scope of the code suggestion go beyond the developer’s intent?
Does the code suggestion implement best practices?
Is the code suggestion secure? Does it contain any vulnerabilities?
In the future, how can a developer identify that this code was generated by AI?

For example, a developer could invoke CodeWhisperer to suggest an implementation for a function that creates an Amazon Simple Storage Service (Amazon S3) bucket, and uploads local files created in the last day. Here are some hypothetical questions a developer may ask in their observations:

Does the suggested code loop through all the local files, or does it loop through a sorted subset?
Does the suggested code include over-permissive AWS Identity and Access Management (IAM) policies for any of the AWS resources used?
Does the suggested code directly use AWS credentials instead of temporary credentials?
Does the suggested code use encryption for the files stored in S3?
Why does the function create additional AWS resources that are unrelated to the function’s purpose?
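
Several of these questions can be made concrete with a minimal Python sketch of the upload half of the function described above. It is a hand-written illustration, not CodeWhisperer output. The names `recent_files` and `upload_recent_files` are hypothetical, file modification time stands in for creation time, and `s3_client` is assumed to be a boto3 S3 client (or any object with a compatible `upload_file` method) that was built from temporary credentials. Bucket creation is deliberately left out to keep the scope narrow.

```python
import time
from pathlib import Path

ONE_DAY_SECONDS = 24 * 60 * 60


def recent_files(directory, max_age_seconds=ONE_DAY_SECONDS, now=None):
    """Return regular files in `directory` modified within `max_age_seconds`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.stat().st_mtime >= cutoff
    )


def upload_recent_files(s3_client, bucket, directory):
    """Upload recent files with server-side encryption; return uploaded keys."""
    uploaded = []
    for path in recent_files(directory):
        # No access keys appear here: the client is expected to resolve
        # temporary credentials (for example, from an assumed role).
        s3_client.upload_file(
            str(path),
            bucket,
            path.name,
            ExtraArgs={"ServerSideEncryption": "AES256"},
        )
        uploaded.append(path.name)
    return uploaded
```

Reading the sketch against the questions above: it filters files before looping, requests encryption on every upload, embeds no credentials, and creates no additional AWS resources. A suggestion that differs on any of these points deserves a closer look before it is accepted.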

While developers can use a machine learning code generator such as CodeWhisperer to raise the bar on their coding productivity, developers cannot rely on AI alone to replace the decision-making for building well-documented, efficient, secure, and syntactically correct applications. Continue on to the following sections for best practices on responsibly developing with generative AI tools such as CodeWhisperer.

CodeWhisperer guardrails

The most direct way of controlling CodeWhisperer use in your organization is by controlling accessibility to the service, and configuring the sharing of your data. CodeWhisperer is a feature of the AWS Toolkit, an optional extension or plugin. It can be installed in VS Code or JetBrains IDEs by an administrator, or anyone else granted permissions necessary to modify the development environment. Code suggestions can also be paused by a developer at any time. Configuration of CodeWhisperer is dependent on the IDE that the developer intends to use.

Setting up CodeWhisperer

To use CodeWhisperer in the AWS Lambda console code editor, both IAM permissions and the IDE need to be configured. Review the details on configuring CodeWhisperer for AWS Lambda.
To use CodeWhisperer with AWS Cloud9, both IAM permissions and the IDE need to be configured. Review the details on configuring CodeWhisperer for AWS Cloud9.
For supported third-party IDEs such as VS Code and JetBrains, both the installation of the AWS Toolkit and the configuration of developer authentication are required:
- Installation of AWS Toolkit
  - Review the details on installing the AWS Toolkit for VS Code.
  - Review the details on installing the AWS Toolkit for JetBrains.
- Configuration of developer authentication
  - For enterprises with an existing relationship with AWS, AWS IAM Identity Center (successor to AWS Single Sign-On) manages user access to CodeWhisperer. Review the Configuration details for enterprise administrators. Enterprise developers can use the instructions to configure their IDE and authenticate with IAM Identity Center.
  - Individual developers unaffiliated with an enterprise organization that has an existing relationship with AWS can authenticate to CodeWhisperer with an AWS Builder ID. Individual developers follow the instructions to configure their IDE and authenticate with their AWS Builder ID.
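
For the AWS Lambda and AWS Cloud9 cases, the IAM side of the setup can be sketched as a small identity-based policy. The following Python snippet builds one that allows only the `codewhisperer:GenerateRecommendations` action. Treat the action name and the wildcard resource as assumptions to confirm against the current CodeWhisperer documentation before use; the helper name `build_codewhisperer_policy` and the statement ID are hypothetical.

```python
import json


def build_codewhisperer_policy():
    """Return an IAM policy document (as a dict) that permits only
    requesting code recommendations. Verify the action name against the
    current AWS documentation before attaching this to a role or group."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCodeWhispererRecommendations",
                "Effect": "Allow",
                "Action": ["codewhisperer:GenerateRecommendations"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be reviewed before use.
    print(json.dumps(build_codewhisperer_policy(), indent=2))
```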

The data that AWS may collect with CodeWhisperer includes your client-side telemetry and your content. Your content includes the parts of your code that CodeWhisperer uses to generate suggestions, as well as the content of the suggestions themselves.

Your client-side telemetry quantifies your usage of the service. For example, AWS may track whether you accept or reject a recommendation. Your client-side telemetry does not contain actual code, and does not contain personally identifiable information (PII) such as your IP address. For professional and individual tiers of CodeWhisperer, you can use the instructions to opt out of sharing client-side telemetry.

At the professional tier, CodeWhisperer does not collect your (code) content for service improvement purposes. At the individual tier, you can use the instructions to opt out of sharing your (code) content with CodeWhisperer.

Controlling CodeWhisperer accessibility

As a best practice for CodeWhisperer (and other AWS services), permissions should be granted on a least-privilege basis to control access to the service, as well as incurred cost. In keeping with the earlier section, Responsibility of developers, administration and use of CodeWhisperer should be conducted responsibly, especially around production environments. Developers should be entrusted to use CodeWhisperer to assist with their development productivity, not to replace their accountability. Granting least-privilege permissions also ensures that the AWS account owner does not accrue unwanted CodeWhisperer costs from inadvertent or unnecessary usage. However, developers who are looking to innovate, learn from, or experiment with CodeWhisperer should also be taken into consideration when granting access.

Here are some recommended approaches for enacting least-privilege controls for CodeWhisperer:

For enterprises using AWS IAM Identity Center, ensure that appropriate entities (users or groups) are permitted to use CodeWhisperer through IAM roles and policies. Assigning IAM roles to groups allows for reusability of security policies, and scaled control of permissions across a category of users.
Ensure that developers authenticating through AWS Builder ID or AWS IAM Identity Center, or both, are using their own unique credentials. Sharing credentials obscures the monitoring of individual activity, and inhibits the ability of administrators to grant or revoke individual access on a granular level.
Adopt the use of systems management software (for example, AWS Systems Manager) to control the versions and configurations of software installed on development machines (for example, VS Code and JetBrains IDEs). In the event that a critical update needs to be installed for an IDE or the AWS Toolkit, a remote deployment can be applied to an entire fleet of development machines in parallel. Using systems management software can also provide historical version or configuration insight for auditing or diagnostic purposes.

Pausing CodeWhisperer suggestions

In addition to configuring accessibility to CodeWhisperer from a systems permission or configuration perspective, developers may also opt to pause and resume suggestions from Amazon CodeWhisperer from within the IDE. Reasons for pausing may include:

The developer may prefer to write their own original, unbiased code.
Legal or compliance requirements may require a developer to write their own original, unbiased code for security or intellectual property-related purposes.

Learn how to pause or resume suggestions from CodeWhisperer from within an IDE.

Configuring CodeWhisperer code references

Another aspect an administrator or developer may consider is the inclusion of full code references for CodeWhisperer suggestions. In some cases, a developer can prompt CodeWhisperer to make a suggestion that is a direct copy of a code reference used in the CodeWhisperer model training set. When a developer accepts a code reference from CodeWhisperer, the suggested block will include a notation that indicates the source and any associated license.

Suggestions with code references can be configured by either an enterprise administrator through permissions in AWS, or by a developer in an IDE.

For enterprise administrators, you can opt in or out of suggestions with code references for your entire organization.
Developers can opt in or out of suggestions with code references from an IDE.

Generating effective prompts

Review the following CodeWhisperer resources to observe the improvement in developer productivity:

Code examples from the Amazon CodeWhisperer User Guide

To generate effective prompts, the following best practices are recommended with CodeWhisperer.

Use clear and succinct remarks in your comments to describe the purpose and intention of the code.
When applicable, use best practices such as the Single-Responsibility Principle and High Cohesion/Loose Coupling to organize your code in an orderly, reusable approach. Referencing classes, interfaces, modules, functions, methods, and so on, in your comments will help CodeWhisperer generate code in the proper context.
When generating a new class, field, constructor, or method in Java, use Javadoc comments when possible. Add the Javadoc comment immediately preceding the construct you intend to create.
When generating a new module, function, class, or method in Python, use docstring comments when possible. Docstring comments should be the first statement in the module, or the first statement immediately after the function, class, or method definition. The definition of these constructs can also be generated by CodeWhisperer using comments to describe the construct name, and any input parameters if applicable.
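
As an illustration of the Python guidance above, the sketch below shows a function definition whose docstring states the purpose, parameters, and return value before any body is written. The function name `count_word_frequencies` and its body are hand-written for illustration; they show the kind of completion a descriptive docstring invites, not actual CodeWhisperer output.

```python
from collections import Counter


def count_word_frequencies(text: str, top_n: int = 3) -> list:
    """Return the `top_n` most common words in `text` as (word, count) pairs.

    Words are compared case-insensitively and split on whitespace.
    Ties keep the order in which the words first appear.
    """
    # Everything below is the kind of body a coding companion might
    # propose once the signature and docstring above are in place.
    words = text.lower().split()
    return Counter(words).most_common(top_n)
```

The docstring names the inputs, the output shape, and the tie-breaking rule, which gives both the coding companion and a human reviewer a precise statement of intent to check the body against.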

Documenting generated code

Even with CodeWhisperer’s effectiveness at understanding developers’ code or comments to prompt suggestions, it is still important for developers to document, comment, and format their code for readability purposes. Readable code is essential for effective debugging and code reviews (whether during periods of new feature development, maintenance, or knowledge transfer).

With the advent of generative AI, developers should consider that it may be helpful to identify which segments of their code bases were created by machine learning-powered tools. Hypothetically, whether as a requirement of a future compliance program or as a means of clarifying developer accountability in the event of a flawed software deployment, it could be useful to know:

Was the code of interest suggested or generated by an AI coding companion like CodeWhisperer?
What prompt (if any) provided by the developer caused the coding companion to make the code suggestion in question?
What was the developer’s actual intent before accepting the suggested code from the AI coding companion?

While CodeWhisperer does not explicitly label code suggestions to distinguish them from human-produced code, developers do have an opportunity to comment or mark the code they accept from coding companions for future reference. In addition to helping address the preceding hypothetical scenarios, marking code as ‘AI-generated’ can help development teams curate heuristics and best practices for prompting AI coding companions to make effective suggestions. For instance, when a developer joins a new project or team and starts reviewing comments in the code base, they could familiarize themselves with prompt patterns the team uses with CodeWhisperer.

Here are a few suggestions for labeling or marking code generated by a coding companion like CodeWhisperer:

Use discretion when labeling code generated by tools like CodeWhisperer. Consider labeling significant blocks of code or fixes that were suggested by the coding companion. Labeling insignificant code, or line-by-line suggestions, is not necessary.
Add a comment to code blocks that were suggested by the coding companion, for example, suggested by CodeWhisperer.
Make frequent, discrete commits to your Git repository that isolate code generated by AI. Use the Git commit message to distinguish the commit as including AI-generated code, for example, git commit -m 'added function X to enable feature Y (codewhisperer)'.
Mark your Git branches to indicate whether they include AI-generated code, for example, git branch feature/X-codewhisperer.
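
One way to make such comment labels useful later is to keep their wording consistent so they can be searched. The sketch below assumes a hypothetical team convention in which accepted suggestions carry a comment containing `suggested by CodeWhisperer`; the helper `find_ai_labels` and the sample source are illustrative only.

```python
AI_MARKER = "suggested by CodeWhisperer"  # hypothetical team convention


def find_ai_labels(source: str, marker: str = AI_MARKER) -> list:
    """Return 1-based line numbers of comment lines that contain `marker`."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        # Only full-line comments are checked; the match ignores case.
        if stripped.startswith("#") and marker.lower() in stripped.lower():
            hits.append(number)
    return hits


EXAMPLE = '''\
def area(width, height):
    return width * height

# Suggested by CodeWhisperer; prompt: "convert celsius to fahrenheit"
def to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32
'''

print(find_ai_labels(EXAMPLE))  # prints [4]
```

Recording the prompt next to the label, as in the sample comment, also preserves the developer's intent for later reviewers.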

Analyzing and testing generated code

As previously mentioned in the section Responsibility of developers, it is important for the developer to confirm that code generated by CodeWhisperer is syntactically correct, fulfills the intentions of the developer, and so on. One way to verify this is to include unit tests in your project. If the code generated by CodeWhisperer passes all of its associated unit tests, and those test cases thoroughly cover the related code (for example, all possible paths), then a developer can be reasonably confident that the code is syntactically correct and fulfills the original intention. It is still the responsibility of the developer to determine whether the code is performant (for example, optimized for time and space complexity) unless the test cases take those metrics into consideration.
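
A minimal sketch of this kind of check, using Python's built-in `unittest` module, is shown below. The function `to_fahrenheit` stands in for an accepted suggestion and is hypothetical, as are the chosen test values and the `run_tests` helper. The tests pin down the developer's intent but say nothing about performance.

```python
import unittest


def to_fahrenheit(celsius):
    """Stand-in for a function accepted from a coding companion."""
    return celsius * 9 / 5 + 32


class ToFahrenheitTest(unittest.TestCase):
    def test_freezing_point(self):
        self.assertEqual(to_fahrenheit(0), 32)

    def test_boiling_point(self):
        self.assertEqual(to_fahrenheit(100), 212)

    def test_negative_temperature(self):
        self.assertEqual(to_fahrenheit(-40), -40)


def run_tests():
    """Run the suite in-process and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToFahrenheitTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real project the same test class would typically live in its own file and be run with `python -m unittest`, ideally as part of every commit or pull request.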

Another benefit provided by CodeWhisperer is the ability to automatically create unit tests, using existing code from the project as context. Not only can developers use CodeWhisperer to generate code for running their application, but CodeWhisperer can also generate unit test functions and test cases to help improve productivity. Using code coverage analysis or reporting tools in conjunction with CodeWhisperer (for generating unit tests) can shorten the time it takes to increase code coverage, and reduce the number of bugs in your deployments or code base.

Using security scans

In addition to testing code generated by AI, it is imperative to use security analysis tools to examine your code for security issues (such as known vulnerabilities or malware). Depending on the security analysis tool used and its configuration, issues can be flagged and examined before a major event like a repository commit or an artifact deployment. While security analysis should be a component of any pipeline building upstream to production, developers using CodeWhisperer can take an additional precaution by using the Security Scan feature within their IDE before making commits to their repositories.