Support retry strategy on Lambda LogRetention #8257

@jaapvanblaaderen

Description

Deployment of one of our CDK projects intermittently fails with "Rate exceeded" errors. These errors occur when CDK creates the LogRetention resources for our Lambda functions.

Reproduction Steps

The issue occurs when deploying multiple CDK stacks that contain a large number of Lambda functions with log retention resources.

I created a small test project to reproduce the issue: https://github.com/jaapvanblaaderen/log-retention-rate-limit. With this simple setup, I wasn't able to reproduce the issue when deploying a few stacks sequentially (which is what we do in our actual project). The issue can, however, be observed when deploying the stacks in parallel.

Error Log

128/101 | 9:04:29 AM | CREATE_IN_PROGRESS   | Custom::LogRetention        | hello_5/LogRetention (hello5LogRetention5D258C6A) Resource creation Initiated
 129/101 | 9:04:29 AM | CREATE_FAILED        | Custom::LogRetention        | hello_5/LogRetention (hello5LogRetention5D258C6A) Failed to create resource. Rate exceeded
	new LogRetention (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/log-retention.ts:67:22)
	\_ new Function (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/function.ts:537:28)
	\_ new LogRetentionRateLimitStack (/repos/logretention-rate-limit/lib/log-retention-rate-limit-stack.ts:17:18)
	\_ Object.<anonymous> (/repos/logretention-rate-limit/bin/log-retention-rate-limit.ts:8:3)
	\_ Module._compile (internal/modules/cjs/loader.js:1151:30)
	\_ Module.m._compile (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:858:23)
	\_ Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
	\_ Object.require.extensions.<computed> [as .ts] (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:861:12)
	\_ Module.load (internal/modules/cjs/loader.js:1000:32)
	\_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
	\_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
	\_ main (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:227:14)
	\_ Object.<anonymous> (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:513:3)
	\_ Module._compile (internal/modules/cjs/loader.js:1151:30)
	\_ Object.Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
	\_ Module.load (internal/modules/cjs/loader.js:1000:32)
	\_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
	\_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
	\_ /usr/local/lib/node_modules/npm/node_modules/libnpx/index.js:268:14

Environment

  • CLI Version: 1.41.0 (build 9e071d2)
  • Framework Version: 1.41.0
  • OS: OSX 10.14

Analysis

The deployment fails while creating CloudWatch log groups. The issue could be fixed by relaxing the retry options of the CloudWatch Logs SDK client. I tested this locally by changing it to:

const cloudwatchlogs = new AWS.CloudWatchLogs({ apiVersion: '2014-03-28', maxRetries: 6, retryDelayOptions: { base: 300 }});
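To illustrate why these settings help, here is a minimal sketch of exponential backoff with full jitter, which is my understanding of how the AWS SDK for JavaScript v2 spaces out retries by default. The helper names (`maxDelayMs`, `jitteredDelayMs`, `worstCaseTotalMs`) are hypothetical, and the "default" values of 3 retries and a 100 ms base are assumptions about the SDK rather than something verified against the LogRetention handler.

```typescript
// Sketch only: models exponential backoff with full jitter.
// Assumption: SDK v2 delays retry n (0-based) by random() * 2^n * base.
interface RetryOptions {
  maxRetries: number; // retries after the initial attempt
  base: number;       // base delay in milliseconds
}

// Upper bound of the delay before retry number `retryCount` (0-based).
function maxDelayMs(retryCount: number, base: number): number {
  return Math.pow(2, retryCount) * base;
}

// Actual delay: a random value in [0, maxDelayMs).
function jitteredDelayMs(
  retryCount: number,
  base: number,
  random: () => number = Math.random
): number {
  return random() * maxDelayMs(retryCount, base);
}

// Worst-case total time spent waiting across all retries.
function worstCaseTotalMs(opts: RetryOptions): number {
  let total = 0;
  for (let i = 0; i < opts.maxRetries; i++) {
    total += maxDelayMs(i, opts.base);
  }
  return total;
}

// Assumed SDK defaults versus the values tested above.
const assumedDefaults: RetryOptions = { maxRetries: 3, base: 100 };
const proposed: RetryOptions = { maxRetries: 6, base: 300 };

console.log(worstCaseTotalMs(assumedDefaults)); // 700
console.log(worstCaseTotalMs(proposed));        // 18900
```

Under these assumptions, the relaxed settings give each request up to roughly 19 seconds of backoff instead of less than one second, which leaves far more room for throttling to clear when many log groups are created at once.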

Another solution might be to increase a service limit. Unfortunately, I have no clue which rate limit is being hit here. It's not clear from the documentation:


This is 🐛 Bug Report

Metadata

Labels

  • @aws-cdk/aws-lambda: Related to AWS Lambda
  • effort/small: Small work item – less than a day of effort
  • feature-request: A feature should be added or improved.
  • good first issue: Related to contributions. See CONTRIBUTING.md
  • in-progress: This issue is being actively worked on.
