Toolkit: Python application containing N+ resources crashes with error: `Malformed request, "api" field is required`
My team has run into an issue with the CDK toolkit crashing when we reach a certain number of resources within our application. We’ve had to split the application multiple times to deal with this limitation, which does not appear to be documented.
Any operation which causes CDK to run `app.synth()` appears to result in a crash. This may be as simple as running `cdk list`.
The exact number of resources in question is unknown at this time, but I suspect it is somewhere around 1000, split across about 15 stacks.
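A rough way to pin down the resource count, assuming a synth still succeeds on a reduced version of the app and the templates land in the default `cdk.out` directory, is to count the entries under each template's `Resources` key. The function name `count_resources` is made up for illustration and is not part of the CDK.

```python
import json
from pathlib import Path


def count_resources(cdk_out="cdk.out"):
    """Return {template file name: resource count} for each synthesized template."""
    counts = {}
    for template_path in sorted(Path(cdk_out).glob("*.template.json")):
        with template_path.open(encoding="utf-8") as handle:
            template = json.load(handle)
        # CloudFormation templates keep their resources under the top-level "Resources" key.
        counts[template_path.name] = len(template.get("Resources", {}))
    return counts


if __name__ == "__main__":
    per_stack = count_resources()
    for name, count in per_stack.items():
        print(f"{name}: {count}")
    print(f"total: {sum(per_stack.values())}")
```

Comparing the totals from the last working configuration and the first failing one (minus the stack that was removed) would give an approximate bound on N.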
Reproduction Steps
- Create a CDK application which contains more than N [explicitly defined] resources, where N is a yet-to-be-determined number, likely on the order of 1000. More details incoming ~shortly~.
- Ensure that the resources specified contain a sufficiently large configuration (specific requirements are currently unknown).
- Run `cdk list` within the application directory.
What did you expect to happen?
CDK should output a list of stacks.
What actually happened?
CDK crashes with an error which appears to originate from JSII:
throw new Error('Malformed request, "api" field is required');
^
There is a line in the JSII code which matches this error quite well: https://github.com/aws/jsii/blob/main/packages/@jsii/runtime/lib/host.ts#L97
Environment
- CDK CLI Version: 1.104.0
- Framework Version: 1.106.1
- Node.js Version: v14.16.1
- OS: Windows
- Language (Version): Python v3.8
Other
Further details incoming.
Update: 2021-06-24: I’ve attempted numerous ways of looping over resource definitions in an attempt to recreate the issue and I have, thus far, been unable to create a test case outside of our repository, which is, sadly, not something I can share.
Above details have been updated, as well as possible, to include recent discoveries.
This is 🐛 Bug Report
Issue Analytics
- Created 2 years ago
- Comments: 28 (9 by maintainers)
Top GitHub Comments
@kgeisink yes! it is very helpful and I was able to reproduce the issue with that codebase on MacOS. I spent some time digging into debug logs but nothing immediate jumped out and I got pulled onto some other stuff. I will keep working on this and provide an update when I’m able to.
@MrArnoldPalmer I have shared our codebase with a reproducible state via the AWS Support case that I mentioned above (9889364641). Unfortunately I am not able to share it via other means due to NDA restrictions. Would you be able to access it through there? If not I can try anonymising the code but that might take a little while given the size.
I have not been able to pinpoint it to a specific place in the code, and the stack trace is also very generic. I will add it as an attachment. While commenting/uncommenting various stacks in resources, the only trend I noticed was that the total number of resources did seem to matter somehow. For example, when I removed a stack that contained 115 resources, the error went away; if I kept that stack around, I would need to remove 2-3 smaller stacks for the error to go away.
It does appear that some place in our project causes incredibly inefficient resource management, as I was also not able to reproduce it by generating a large number of resources in loops. Of course, I do not have enough insight into CDK/JSII internals to know how much benefits from reuse.
I’ve also added some of the JSII_DEBUG output leading up to the error including the error itself.
Attachments: cdk ls stacktrace.txt, JSII_DEBUG+output.txt
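For anyone wanting to collect the same kind of debug output, here is a minimal sketch that runs a command with the `JSII_DEBUG=1` environment variable set and writes everything it prints to a file. The helper name `run_with_jsii_debug` and the default log file name are assumptions for illustration, not part of the CDK or JSII tooling.

```python
import os
import subprocess


def run_with_jsii_debug(command, log_path="jsii_debug_output.txt"):
    """Run `command` with JSII_DEBUG=1, log stdout and stderr to `log_path`, return the exit code."""
    # Copy the current environment and switch on JSII debug tracing for the child process.
    env = dict(os.environ, JSII_DEBUG="1")
    with open(log_path, "w", encoding="utf-8") as log_file:
        completed = subprocess.run(
            command,
            env=env,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # merge stderr into the same log file
            check=False,  # a crash is the expected outcome here, so do not raise
        )
    return completed.returncode
```

It could be called as `run_with_jsii_debug(["cdk", "list"])` from the application directory. On Windows the CLI is usually installed as `cdk.cmd`, so the command may need to be `["cdk.cmd", "list"]`.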