
cli: allow disabling context lookups (was commit cdk.context.json)


The docs state that cdk.context.json should be checked in to ensure reproducibility. However, that file may contain information that’s usually considered sensitive, like AWS account numbers (e.g. aws-samples). Such data shouldn’t be committed, and utilities like git-secrets actively try to prevent that. Could you please share what the best practice is in these cases?


Top GitHub Comments

9 reactions

rix0rrr commented, Apr 17, 2020

“When am I supposed to update/create cdk.context.json, should I just run cdk synthesise while I’m developing to make sure cdk.context.json has the right data in and then commit?”

Yes.

Honestly it’s probably also a good idea to add a flag for the CLI to fail on unexpected context lookups (--no-lookups), which you can then use in CI/CD to make sure the output is deterministic (@shivlaks I just came up with another feature request 😃 )
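Independent of any CLI flag, a CI job could approximate this check by comparing the committed cdk.context.json with the file as it looks after synthesis. The following is a minimal sketch of that comparison only; the function name, the key format, and the sample values are assumptions for illustration and are not taken from the CDK.

```python
def unexpected_lookups(committed: dict, after_synth: dict) -> list:
    """Return the context keys that synthesis added or changed."""
    return sorted(
        key
        for key, value in after_synth.items()
        if key not in committed or committed[key] != value
    )


# Illustrative data: the keys and values are examples, not real lookups.
committed = {
    "availability-zones:account=123456789012:region=eu-west-1": [
        "eu-west-1a", "eu-west-1b", "eu-west-1c",
    ],
}
after_synth = dict(committed)
after_synth["ami:account=123456789012:region=eu-west-1"] = "ami-0example"

changed = unexpected_lookups(committed, after_synth)
if changed:
    # A CI job would exit with a non-zero status here.
    print("Unexpected context lookups:", changed)
```

In a real pipeline the two dictionaries would be loaded from the file before and after running synth.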

9 reactions

rix0rrr commented, Apr 17, 2020

“As far as I understand, you’d only commit the file if you want to be able to run synth without having to log in to AWS, and possibly speed things up a bit since it saves some lookups.”

Not really, depending on what’s in there. If there are 3 AZs in your region and you deploy a CDK app, a VPC will be created with a specific IP layout.

If you don’t commit context.json and on the next deploy AWS happens to have added an AZ to your region, your VPC will now try to span 4 AZs, have a different IP space layout and all your subnets will have to be destroyed and recreated (this will probably fail so you won’t actually lose any data, but your deployment will be stuck with no way to move forward).
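To see why an extra AZ can reshuffle the IP layout, here is a deliberately simplified model. It is not the CDK’s actual allocator: it assumes one public and one private subnet per AZ, carved in order out of equally sized blocks, and the function name and the 10.0.0.0/16 range are made up for the example.

```python
import ipaddress


def layout(vpc_cidr: str, azs: list, groups=("public", "private")) -> dict:
    """Assign equally sized blocks in order: every AZ of one group, then the next group."""
    network = ipaddress.ip_network(vpc_cidr)
    count = len(azs) * len(groups)
    # Smallest number of extra prefix bits that yields at least `count` blocks.
    extra_bits = (count - 1).bit_length()
    blocks = network.subnets(prefixlen_diff=extra_bits)
    return {f"{group}-{az}": str(next(blocks)) for group in groups for az in azs}


three = layout("10.0.0.0/16", ["a", "b", "c"])
four = layout("10.0.0.0/16", ["a", "b", "c", "d"])

# Subnets whose CIDR changes once the fourth AZ appears.
moved = sorted(name for name in three if three[name] != four[name])
print(moved)
```

In this model the public subnets keep their ranges, but every private subnet shifts because the new public subnet takes the block that the first private subnet used to occupy.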

Similarly, if you’ve started an instance off of an Amazon Linux AMI ID, and there now happens to be a new version of that AMI available… do you want your instance to be automatically replaced with a new version of the OS? What if you had state on that machine?

cdk.context.json is not just an API call cache, it’s a store of nondeterministic decisions taken in the past, which you must commit in order to ensure consistency in the future. It also gives you the opportunity to update the values piecemeal. (“YES I will take a new AMI ID now, NO I will not be spreading to new AZs”).
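The piecemeal update can be pictured as a selective merge: fresh lookup results are accepted only for the keys you explicitly choose, and everything else stays pinned to the committed value. This is a sketch of the idea, not of how the CDK CLI implements it; the function name and the shortened keys are hypothetical.

```python
def refresh_context(committed: dict, fresh: dict, accept: set) -> dict:
    """Take fresh values only for accepted keys; keep all other keys pinned."""
    updated = dict(committed)
    for key in accept:
        if key in fresh:
            updated[key] = fresh[key]
    return updated


# Shortened, illustrative keys and values.
committed = {"ami": "ami-old", "availability-zones": ["a", "b", "c"]}
fresh = {"ami": "ami-new", "availability-zones": ["a", "b", "c", "d"]}

# "YES I will take a new AMI ID now, NO I will not be spreading to new AZs"
updated = refresh_context(committed, fresh, accept={"ami"})
print(updated)
```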

As for account IDs, we never considered them very sensitive, similar to how usernames aren’t usually considered sensitive. AWS itself publishes many of its own account IDs for users to put in their IAM policies, for example. The keys to the castle are in the access credentials to the account, not the account ID itself.

In fact, we generally expect people to put account IDs in their source as well (or build their application to read them from a configuration file), especially in the upcoming CI/CD implementation where we’re going to force it.

It may be possible that our opinions differ from other departments whose specialization is security. The biggest risk I personally see is bucket sniping.

I see a couple of solutions to this, but they all come with their own downsides:

Account IDs in context file:

Storing context in the account itself. We could store the context for an account INSIDE the account, let’s say in SSM Parameter Store (SSMPS). There would be no more cdk.context.json, so nothing to commit (yay). Would not get rid of account IDs in the source files though.
Make account IDs nonreversible. In the context.json file, we could hash the account IDs to make them nonreversible. Would not get rid of account IDs in the source files though.
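The hashing option could look roughly like the sketch below. It assumes context keys embed the account as account= followed by 12 digits, and it truncates a SHA-256 digest to 16 hex characters; both choices, and the function name, are assumptions made for illustration rather than anything the CDK does.

```python
import hashlib
import re

# Assumed key shape: "...:account=<12 digits>:..."
ACCOUNT_RE = re.compile(r"account=(\d{12})")


def mask_account_ids(context_key: str) -> str:
    """Replace each account ID in a context key with a truncated SHA-256 digest."""
    def _mask(match):
        digest = hashlib.sha256(match.group(1).encode("ascii")).hexdigest()
        return "account=" + digest[:16]

    return ACCOUNT_RE.sub(_mask, context_key)


key = "availability-zones:account=123456789012:region=us-east-1"
masked = mask_account_ids(key)
print(masked)
```

Because account IDs are only 12 digits long, an unsalted hash like this can be brute-forced by trying every possible ID, so it obscures the value rather than truly protecting it.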

Account IDs in source:

Symbolic account IDs. Instead of specifying account IDs in source files, we’d have a symbolic identifier such as $prod1, $prod2, $prod3 etc. The next complication then becomes: where do we store the mapping, such as { "$prod1": "123456789012" }? It will then be the user’s responsibility to make sure the mapping is transported and kept in sync between all developers’ machines. Also, these values could not be stored in the context.json file directly, because there’s no guarantee that $prod1 on one machine is the same account as $prod1 on another machine, so we’d need one of the other solutions in addition to this.
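A sketch of how symbolic identifiers might be resolved against a locally stored mapping follows. The function name, the identifier pattern, and the example ARN are hypothetical; only the $prod1 mapping comes from the comment above.

```python
import re


def resolve_accounts(text: str, mapping: dict) -> str:
    """Replace symbolic identifiers such as $prod1 with mapped account IDs."""
    def _lookup(match):
        symbol = match.group(0)
        if symbol not in mapping:
            # Fail loudly so an out-of-sync mapping is noticed early.
            raise KeyError(f"No account ID mapped for {symbol}")
        return mapping[symbol]

    return re.sub(r"\$[A-Za-z][A-Za-z0-9]*", _lookup, text)


# The mapping lives outside the repository, for example in a per-developer file.
mapping = {"$prod1": "123456789012"}
resolved = resolve_accounts("arn:aws:iam::$prod1:root", mapping)
print(resolved)
```

Raising on an unknown symbol makes a missing mapping entry visible, but it does nothing about the harder problem noted above: two machines mapping the same symbol to different accounts.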

I think the end result of this will be that we may implement one of the solutions for the context, and it will be the user’s responsibility to hide account IDs from their sources if desired, implemented in whatever way suits them.
