-
Notifications
You must be signed in to change notification settings - Fork 4.3k
Description
Deployment of one of our CDK projects randomly fails with rate exceeded errors. These errors occur when CDK creates LogRetention resources related to the Lambda functions we have.
Reproduction Steps
The issue occurs when deploying multiple CDK stacks that contain quite some Lamba's with log retention resources.
I created a small test project to reproduce the issue: https://github.com/jaapvanblaaderen/log-retention-rate-limit With this simple setup, I wasn't able to reproduce the issue when deploying a few stacks sequentially (which is what we use in our actual project). The issue can however be observed when deploying the stacks in parallel.
Error Log
128/101 | 9:04:29 AM | CREATE_IN_PROGRESS | Custom::LogRetention | hello_5/LogRetention (hello5LogRetention5D258C6A) Resource creation Initiated
129/101 | 9:04:29 AM | CREATE_FAILED | Custom::LogRetention | hello_5/LogRetention (hello5LogRetention5D258C6A) Failed to create resource. Rate exceeded
new LogRetention (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/log-retention.ts:67:22)
\_ new Function (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/function.ts:537:28)
\_ new LogRetentionRateLimitStack (/repos/logretention-rate-limit/lib/log-retention-rate-limit-stack.ts:17:18)
\_ Object.<anonymous> (/repos/logretention-rate-limit/bin/log-retention-rate-limit.ts:8:3)
\_ Module._compile (internal/modules/cjs/loader.js:1151:30)
\_ Module.m._compile (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:858:23)
\_ Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
\_ Object.require.extensions.<computed> [as .ts] (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:861:12)
\_ Module.load (internal/modules/cjs/loader.js:1000:32)
\_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
\_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
\_ main (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:227:14)
\_ Object.<anonymous> (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:513:3)
\_ Module._compile (internal/modules/cjs/loader.js:1151:30)
\_ Object.Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
\_ Module.load (internal/modules/cjs/loader.js:1000:32)
\_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
\_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
\_ /usr/local/lib/node_modules/npm/node_modules/libnpx/index.js:268:14
Environment
- CLI Version: 1.41.0 (build 9e071d2)
- Framework Version: 1.41.0
- OS: OSX 10.14
Analysis
It fails when creating CloudWatch log groups. The issue could be fixed by relaxing the retry options for the CloudWatch SDK instance, I tested this locally by changing it to:
const cloudwatchlogs = new AWS.CloudWatchLogs({ apiVersion: '2014-03-28', maxRetries: 6, retryDelayOptions: { base: 300 }});
Another solution might be increasing a service limit. Unfortunately, I have no clue which rate limit is being hit here. It's not clear from the documentation:
- https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_CreateLogGroup.html
- https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html
This is 🐛 Bug Report