Conversation

@phuhung273
Contributor

@phuhung273 phuhung273 commented Jun 10, 2025

Issue # (if applicable)

Closes #33431

Reason for this change

Several parameters are missing from EmrCreateClusterOptions.

Description of changes

  • EmrCreateClusterOptions now supports:
    • ebsRootVolumeIops
    • ebsRootVolumeThroughput
    • managedScalingPolicy
  • Move validations to private validateProps function
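
As a rough illustration of the "single validation function" idea in the list above, here is a minimal, dependency-free sketch. It is not the code from this PR: the `EbsRootVolumeProps` interface, the `checkRange` helper, and the numeric ranges are assumptions made for the example (the ranges are the usual gp3 limits and should be checked against the EMR API reference).

```typescript
// Hypothetical subset of the task props touched by this PR.
interface EbsRootVolumeProps {
  readonly ebsRootVolumeIops?: number;
  readonly ebsRootVolumeThroughput?: number;
}

interface Range {
  readonly min: number;
  readonly max: number;
}

// Assumed limits for illustration only; verify against the EMR API reference.
const IOPS_RANGE: Range = { min: 3000, max: 16000 };
const THROUGHPUT_RANGE: Range = { min: 125, max: 1000 };

// Returns a list with one error message, or an empty list if the value is acceptable.
function checkRange(name: string, value: number | undefined, range: Range): string[] {
  if (value === undefined) {
    return []; // optional prop that was not set: nothing to validate
  }
  if (!Number.isInteger(value) || value < range.min || value > range.max) {
    return [`${name} must be an integer between ${range.min} and ${range.max}, got ${value}`];
  }
  return [];
}

// Collects every problem in one place instead of scattering checks through a constructor.
function validateProps(props: EbsRootVolumeProps): string[] {
  return [
    ...checkRange('ebsRootVolumeIops', props.ebsRootVolumeIops, IOPS_RANGE),
    ...checkRange('ebsRootVolumeThroughput', props.ebsRootVolumeThroughput, THROUGHPUT_RANGE),
  ];
}
```

Returning a list of messages rather than throwing on the first failure lets a caller report all invalid props at once; the real construct may well throw instead.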

ManagedScalingPolicy.ScalingStrategy is supported by the AWS CLI and API, but not yet by Step Functions, so it is not included in this PR.

Describe any new or updated permissions being added

Adds the iam:CreateServiceLinkedRole permission, scoped to EMR only, so that AWSServiceRoleForEMRCleanup can be created, as instructed by https://docs.aws.amazon.com/emr/latest/ManagementGuide/using-service-linked-roles-cleanup.html#create-service-linked-role

Description of how you validated changes

Unit and integration tests

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@aws-cdk-automation aws-cdk-automation requested a review from a team June 10, 2025 16:39
@github-actions github-actions bot added bug This issue is a bug. effort/medium Medium work item – several days of effort p2 star-contributor [Pilot] contributed between 25-49 PRs to the CDK labels Jun 10, 2025
@aws-cdk-automation aws-cdk-automation added the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Jun 10, 2025
@aws-cdk-automation
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 96e6342
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@kumvprat kumvprat self-assigned this Sep 9, 2025
Contributor

@kumvprat kumvprat left a comment

Thanks for raising the PR.

I have added inline comments where possible.
I have a high-level question on the PR: it seems the EBS root volume was already present as a property in the create EMR cluster SFN task.

Does this change affect any other integration tests that exercise the EMR cluster create task with other EBS properties?

});
start.next(describe);

describe.expect(ExpectedResult.objectLike({
Contributor

Can we have a check here that the EMR cluster is created? And also a check that the cluster configuration matches the configuration provided in the Step Functions step?

Contributor Author

Sure, let me add that. It might take some time.

Contributor Author

Verification added.

In case you wonder why we need two describeExecution calls: I found that we cannot do both of the following in the same call:

  • Validate the execution with .expect(ExpectedResult.objectLike({ status: 'SUCCEEDED', }))
  • Retrieve the cluster ID with .getAttString('output.ClusterId')

When we call .getAttString('output.ClusterId'), the response contains nothing besides ClusterId, so the status: 'SUCCEEDED' assertion fails even though the actual response includes it. That's why we need two describeExecution calls.
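
The behaviour described above can be pictured with a small toy model. This is not the integ-tests library: `objectLike` and `pick` below are hypothetical stand-ins written for this sketch, and the response shape and cluster ID are made up. The only point is that once a response is narrowed to a requested attribute path, an assertion on any other field has nothing to match against.

```typescript
type Json = { [key: string]: unknown };

// Toy stand-in for an objectLike matcher: every expected key must be present and equal.
function objectLike(actual: Json, expected: Json): boolean {
  return Object.keys(expected).every((key) => actual[key] === expected[key]);
}

// Toy stand-in for requesting attribute paths: only the requested paths survive.
function pick(response: Json, paths: string[]): Json {
  const out: Json = {};
  for (const path of paths) {
    let current: unknown = response;
    for (const part of path.split('.')) {
      current = current !== null && typeof current === 'object'
        ? (current as Json)[part]
        : undefined;
    }
    out[path] = current;
  }
  return out;
}

// Made-up response for illustration.
const full: Json = { status: 'SUCCEEDED', output: { ClusterId: 'j-EXAMPLE' } };

// Call 1: no attribute requested, so the status assertion sees the full response.
const statusOk = objectLike(full, { status: 'SUCCEEDED' });

// Call 2: requesting output.ClusterId narrows the response; a status check here would fail.
const narrowed = pick(full, ['output.ClusterId']);
const statusOnNarrowed = objectLike(narrowed, { status: 'SUCCEEDED' });
const clusterId = narrowed['output.ClusterId'];
```

In this model `statusOk` is true while `statusOnNarrowed` is false, which mirrors why the status assertion and the attribute lookup are kept in separate calls.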

Contributor

I see that the timeout is:

totalTimeout: Duration.minutes(5)

Is this enough for the whole EMR cluster to be created?

Contributor Author

Yes, it is enough. It works every time.

Contributor Author

It seems this statement from the docs, "When you launch a cluster, either for the first time or when the AWSServiceRoleForEMRCleanup service-linked role is not present, Amazon EMR creates the AWSServiceRoleForEMRCleanup service-linked role for you.", is not true, at least in this case where Step Functions triggers the launch.

Contributor

So we should add the permissions for creating the service-linked role, in an idempotent manner.

It seems the cluster is not created properly without the role, so cluster creation requires the ability to create the service-linked role.

Contributor

I think these are the required permissions for that, from the same link:

You must have permissions to create a service-linked role. For an example statement that adds this capability to the permissions policy of an IAM entity (such as a user, group, or role):

Add the following statement to the permissions policy for the IAM entity that needs to create the service-linked role.

{
    "Sid": "ElasticMapReduceServiceLinkedRole",
    "Effect": "Allow",
    "Action": "iam:CreateServiceLinkedRole",
    "Resource": "arn:aws:iam::*:role/aws-service-role/elasticmapreduce.amazonaws.com*/AWSServiceRoleForEMRCleanup*",
    "Condition": {
        "StringEquals": {
            "iam:AWSServiceName": [
                "elasticmapreduce.amazonaws.com",
                "elasticmapreduce.amazonaws.com.cn"
            ]
        }
    }
}
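
For readers who want to generate this statement from code, here is a minimal plain-TypeScript sketch that builds the same JSON shape. It does not use the CDK and is not the change made in this PR: the `emrCleanupServiceLinkedRoleStatement` function and its `partition` parameter are hypothetical, and the `*` wildcards in the resource ARN follow my reading of the AWS documentation and should be checked against it.

```typescript
// Shape of the single IAM statement quoted above.
interface ServiceLinkedRoleStatement {
  readonly Sid: string;
  readonly Effect: 'Allow';
  readonly Action: string;
  readonly Resource: string;
  readonly Condition: {
    readonly StringEquals: { readonly 'iam:AWSServiceName': string[] };
  };
}

// Hypothetical helper: builds the statement for a given partition (for example "aws" or "aws-cn").
function emrCleanupServiceLinkedRoleStatement(partition: string = 'aws'): ServiceLinkedRoleStatement {
  return {
    Sid: 'ElasticMapReduceServiceLinkedRole',
    Effect: 'Allow',
    Action: 'iam:CreateServiceLinkedRole',
    // Wildcards are an assumption: any account, and any suffix on the service principal.
    Resource: `arn:${partition}:iam::*:role/aws-service-role/elasticmapreduce.amazonaws.com*/AWSServiceRoleForEMRCleanup*`,
    Condition: {
      StringEquals: {
        // Restricts role creation to the EMR service principals only.
        'iam:AWSServiceName': [
          'elasticmapreduce.amazonaws.com',
          'elasticmapreduce.amazonaws.com.cn',
        ],
      },
    },
  };
}

const statementJson = JSON.stringify(emrCleanupServiceLinkedRoleStatement(), null, 2);
```

The condition key is what keeps the grant narrow: the principal may create a service-linked role only when the role is for EMR, not for any other service.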

Contributor

@kumvprat kumvprat Sep 15, 2025

The changes may not be exactly this, but the policy creation during task creation would need some changes: maybe we should change the policies added here.

A user of this Step Functions task might use the EMR cluster it creates for running jobs, so the cluster should be successfully created by the time the task completes.
(I have not used this feature personally, and it seems you have, so I would like to ask: would adding this set of permissions make this functionality easier to use?)

Contributor Author

Adding these set of permissions would make this functionality easier to use?

You're great, mate. I can confirm that if the Step Functions state machine has iam:CreateServiceLinkedRole, the user doesn't need to pre-create AWSServiceRoleForEMRCleanup. A better user experience, of course.

Idempotent: yes. I executed the job twice and there was no issue when the role already existed.

@phuhung273 phuhung273 force-pushed the createemrclusteroption branch from 2c27e56 to 5d92a2f Compare September 13, 2025 05:42
@phuhung273 phuhung273 force-pushed the createemrclusteroption branch from 5d92a2f to f3f171c Compare September 15, 2025 11:18
@phuhung273 phuhung273 requested a review from kumvprat September 15, 2025 11:22
@phuhung273
Contributor Author

Sorry, I forgot these questions:

I have a high-level question on the PR: it seems the EBS root volume was already present as a property in the create EMR cluster SFN task.

Only the EBS root volume size prop existed, not IOPS or throughput.

Does this change affect any other integration tests that exercise the EMR cluster create task with other EBS properties?

No. The new props are undefined for existing tests and users. We know this because no other snapshots changed.
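
The snapshot argument can be illustrated with a small analogy in plain TypeScript. This is not how the CDK renders task parameters internally; `renderParameters` and the `RootVolumeOptions` interface are hypothetical names for this sketch, and I am assuming the real rendering drops undefined values in the same way `JSON.stringify` does.

```typescript
// Hypothetical option bag: the size prop existed before, the other two are new and optional.
interface RootVolumeOptions {
  readonly ebsRootVolumeSize?: number;
  readonly ebsRootVolumeIops?: number;
  readonly ebsRootVolumeThroughput?: number;
}

// Serializes the options; JSON.stringify omits properties whose value is undefined.
function renderParameters(opts: RootVolumeOptions): string {
  return JSON.stringify({
    EbsRootVolumeSize: opts.ebsRootVolumeSize,
    EbsRootVolumeIops: opts.ebsRootVolumeIops,
    EbsRootVolumeThroughput: opts.ebsRootVolumeThroughput,
  });
}

// Output as it would have looked before the new props existed.
const before = JSON.stringify({ EbsRootVolumeSize: 10 });

// An existing caller that never sets the new props gets byte-identical output.
const after = renderParameters({ ebsRootVolumeSize: 10 });

// A caller that opts in sees the extra keys.
const optedIn = renderParameters({ ebsRootVolumeSize: 10, ebsRootVolumeIops: 3000 });
```

Because unset optional props leave no trace in the rendered output, unchanged snapshots are reasonable evidence that existing users are unaffected.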

@phuhung273 phuhung273 force-pushed the createemrclusteroption branch 2 times, most recently from 55f08c9 to 2813ef0 Compare September 15, 2025 16:30
@kumvprat kumvprat added the needs-security-review Related to feature or issues that needs security review label Sep 15, 2025
@phuhung273
Contributor Author

This still needs to go through the security approval process, but I really appreciate your prompt support so far, @kumvprat 🚀 You're very helpful.

@phuhung273 phuhung273 force-pushed the createemrclusteroption branch from 2813ef0 to 3964c6f Compare September 16, 2025 23:24
Contributor

@kumvprat kumvprat left a comment

LGTM!!

@mergify
Contributor

mergify bot commented Sep 18, 2025

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit b3ad6f9 into aws:main Sep 18, 2025
19 checks passed
@github-actions
Contributor

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 18, 2025
@phuhung273 phuhung273 deleted the createemrclusteroption branch September 18, 2025 10:41

Labels

bug This issue is a bug. effort/medium Medium work item – several days of effort needs-security-review Related to feature or issues that needs security review p2 pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. star-contributor [Pilot] contributed between 25-49 PRs to the CDK

Projects

None yet

Development

Successfully merging this pull request may close these issues.

@aws-cdk/aws-stepfunctions: not all options for CreateEmrCluster are present

3 participants