cdk version 2.33 onwards is getting stuck

Describe the bug

I am trying to deploy an S3 bucket using version 2.32.1, and it works just fine. My CDK app is written in TypeScript (Node v16) and runs from Jenkins inside a Docker container.

Jenkins is running CDK CLI version 2.44.0. When I upgrade the packages in package.json to 2.33.0 or later, the same deployment command gets stuck and the pipeline hangs.

Am I missing something? Are there any breaking changes in 2.33.0? I couldn’t find any useful information in the release notes.

Thanks, Gal

Expected Behavior

Using the CDK packages (aws-cdk in devDependencies and aws-cdk-lib in dependencies) at the latest versions should work, so that I can deploy the S3 bucket.

Current Behavior

With the CDK packages at version 2.32.1, everything works and I can deploy the S3 bucket. After upgrading to version 2.33.0 or any later version, cdk synth/diff/deploy hangs…

Reproduction Steps

The Jenkins pipeline runs inside Docker containers. A Docker server is installed on the Jenkins agent. The first container in the pipeline is based on Python 3.8. Inside it, another Docker container based on Node.js v16 (Alpine) runs with CDK CLI version 2.44.0 installed.

This is the package.json:

{
    "name": "general",
    "version": "0.1.0",
    "bin": {
        "general": "bin/general.js"
    },
    "scripts": {
        "build": "tsc",
        "watch": "tsc -w",
        "test": "jest",
        "cdk": "cdk"
    },
    "devDependencies": {
        "@types/jest": "^27.5.2",
        "@types/node": "^10.17.27",
        "@types/prettier": "2.6.0",
        "aws-cdk": "2.32.1",
        "jest": "^27.5.1",
        "ts-jest": "^27.1.4",
        "ts-node": "^10.9.1",
        "typescript": "~3.9.7"
    },
    "dependencies": {
        "aws-cdk-lib": "2.32.1",
        "constructs": "^10.0.0",
        "@aws-cdk/aws-glue-alpha": "^2.32.1-alpha.0",
        "source-map-support": "^0.5.21"
    }
}

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.44.0

Framework Version

No response

Node.js Version

16

OS

Ubuntu 18/20

Language

TypeScript

Language Version

No response

Other information

No response

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Reactions: 3
  • Comments: 15 (7 by maintainers)

Top GitHub Comments

3 reactions
rix0rrr commented, Nov 24, 2022

The CDK behavior is as follows:

  • Setting autoDeleteObjects creates a Custom Resource that will clear the bucket on stack deletion.
  • The CDK copies files when it needs to generate a code bundle for the Custom Resource provider. This code bundle consists of your code plus an index file we add for you.
  • After these source files are generated, the files are then copied into the cdk.out directory as part of asset staging. This is the same for all assets. The directory these files are copied into depends on the hash of all source files going into it, so the source bundle needs to be complete before this step can start.

The change was:

  • We used to do the first step, copying of source files, inside the node_modules directory. This was actually incorrect, as the node_modules directory should be considered a read-only repository of library code. So we changed the code generation to be moved to the system’s temporary directory.
  • From Docker’s point of view, in the old situation the file used to be created on a volume mount, but in the new situation is now created in a directory that’s fully inside the container’s overlayfs file system.
  • (This is why the workaround is moving the $TMP dir back to a location inside a Docker volume mount)
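
The workaround in the last bullet can be sketched as follows. This is a minimal illustration, assuming the temporary directory is resolved through Node's `os.tmpdir()`, which on POSIX systems honours the `TMPDIR` environment variable; the thread itself only refers to "$TMP". The helper name `useTempDirInside` and the `.tmp` subdirectory are hypothetical, and the base directory is assumed to be a path inside a Docker volume mount, such as the job workspace.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical helper: create a temp directory under `baseDir` and point
// TMPDIR at it, so that later os.tmpdir() calls in this process (and in
// child processes that inherit the environment) resolve to it.
function useTempDirInside(baseDir: string): string {
  const tmp = path.join(path.resolve(baseDir), ".tmp");
  fs.mkdirSync(tmp, { recursive: true });
  process.env.TMPDIR = tmp;
  return tmp;
}
```

The same effect can likely be had without code by exporting `TMPDIR` in the container's environment before the CLI runs; which variable a given tool reads is worth verifying for your setup.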

The problem was:

  • Because of a combination of Docker and kernel behavior, the second copy operation would appear to copy 0 bytes.
  • The NodeJS copyFile function keeps on retrying the call to copy more and more bytes over, getting 0 every time, and waiting until the copy is complete. This never finishes, and so the build appears to hang.
  • In later kernel versions, this bug has been fixed so the copy operation returns an actual number of bytes instead of 0, allowing the copy to succeed.
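
The hang described above can be illustrated with a toy model. This is not Node's or libuv's actual code; `copyChunk` is a hypothetical stand-in for a kernel-level copy call that reports how many bytes it moved, and `copyWithStallGuard` is an invented name. The sketch shows why a call that keeps reporting 0 bytes makes a "copy until done" loop spin forever, and how a stall counter would turn that into an error.

```typescript
// Toy model, not Node's or libuv's real code: `copyChunk` stands in for a
// kernel-level copy call and returns how many bytes it copied.
type CopyChunk = (offset: number, remaining: number) => number;

// Copies `total` bytes, giving up after `maxStalls` consecutive calls that
// make no progress. Without the stall check, a `copyChunk` that always
// returns 0 would keep this loop running forever.
function copyWithStallGuard(total: number, copyChunk: CopyChunk, maxStalls = 3): number {
  let copied = 0;
  let stalls = 0;
  while (copied < total) {
    const n = copyChunk(copied, total - copied);
    if (n > 0) {
      copied += n;
      stalls = 0;
    } else if (++stalls >= maxStalls) {
      throw new Error(`copy stalled at ${copied} of ${total} bytes`);
    }
  }
  return copied;
}

// A healthy call copies up to 4 bytes at a time; a "buggy" one reports 0.
const healthy: CopyChunk = (_offset, remaining) => Math.min(4, remaining);
const buggy: CopyChunk = () => 0;
```

With `healthy` the loop finishes after a few iterations; with `buggy` the guard throws instead of looping indefinitely, which is the behavior the unguarded loop in the thread lacks.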

Full props to @nburtsev for figuring this out. I’m not sure I myself would have been able to put all of this together.


In summary:

The CDK does not directly communicate with the kernel; we just perform filesystem copies. Bugs in the interaction of other pieces of software cause the file copy to loop endlessly if the right combination of circumstances is hit.

0 reactions
galsasi1989 commented, Nov 23, 2022

Hi @rix0rrr

Thanks for your help with this issue! It was very helpful, after we had spent days or even weeks on it.

Can you please give us a high-level description of how the CDK interacts with the Linux kernel? What was changed in the CDK, and how is it related to the kernel version?

In addition, I think it’s very important to validate that all the system requirements are met when I install my CDK project’s dependencies (via pip, npm or other tools), and to throw as clear an exception as possible so that we at least have a clue next time.
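
A preflight check of the kind suggested here might look like the sketch below. It is only an illustration: the function names are hypothetical, and the thread does not say which kernel version contains the fix, so the minimum version is left as a parameter for the caller to supply. It relies on Node's `os.release()`, which on Linux returns a string such as "5.4.0-135-generic".

```typescript
import * as os from "os";

// Parses the leading "major.minor.patch" of a kernel release string such as
// "5.4.0-135-generic"; missing parts default to 0.
function parseRelease(release: string): number[] {
  const match = /^(\d+)(?:\.(\d+))?(?:\.(\d+))?/.exec(release);
  if (!match) throw new Error(`unrecognised release string: ${release}`);
  return [match[1], match[2], match[3]].map((part) => Number(part ?? 0));
}

// True when `release` is the same as or newer than `minimum`.
function isAtLeast(release: string, minimum: string): boolean {
  const a = parseRelease(release);
  const b = parseRelease(minimum);
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] > b[i];
  }
  return true;
}

// Hypothetical preflight check: fails fast with a clear message on Linux
// hosts whose kernel is older than the caller-supplied minimum.
function assertKernelAtLeast(minimum: string): void {
  if (os.platform() !== "linux") return;
  const release = os.release();
  if (!isAtLeast(release, minimum)) {
    throw new Error(`kernel ${release} is older than the required ${minimum}`);
  }
}
```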
