@aws-cdk/aws-stepfunctions-tasks: EmrCreateClusterProps clusterRole should be instance profile not IRole
It seems that the AWS CDK API for creating an EMR cluster from an AWS Step Functions task is wrong.
With an EMR service object, we can provide a RunJobFlowInput to runJobFlow that specifies JobFlowRole: instanceProfile.ref, with instanceProfile as a profile with a single role:
// Example role, our role is constructed with more permissions
const myRole = new iam.Role(stack, 'my-role', {
  assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
  managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceforEC2Role')],
});
const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
  roles: [myRole.roleName],
});
With stepfunctionsTasks.EmrCreateCluster, however, the required EmrCreateClusterProps rename JobFlowRole to clusterRole and expect clusterRole to be of type iam.IRole instead of a reference to the iam.CfnInstanceProfile above. Attempting to pass the IAM role that the profile was created with (e.g. myRole) results in an Invalid InstanceProfile error when running the Step Function.
This seems to indicate that, although the API asks for an IRole, what it should be requesting is a reference to a CfnInstanceProfile.
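To make the mismatch concrete, here is a minimal, self-contained sketch. It assumes the task forwards the role's name as JobFlowRole, which is what the error suggests but is not confirmed here. The InstanceProfile interface, resolveJobFlowRole, and every name value are hypothetical stand-ins, not CDK or EMR APIs.

```typescript
// Simplified, hypothetical model of how a JobFlowRole value gets validated.
// Nothing here is a CDK or EMR API; it only illustrates the name lookup.

interface InstanceProfile {
  name: string;     // physical name of the instance profile
  roleName: string; // the IAM role the profile wraps
}

// Stand-in for the service-side check: JobFlowRole must name an existing
// instance profile; the name of the underlying role is not sufficient.
function resolveJobFlowRole(jobFlowRole: string, profiles: InstanceProfile[]): InstanceProfile {
  const match = profiles.find((p) => p.name === jobFlowRole);
  if (match === undefined) {
    throw new Error(`Invalid InstanceProfile: ${jobFlowRole}.`);
  }
  return match;
}

// Made-up generated names: the role and its profile end up named differently.
const roleName = 'my-role-ABC123';
const generatedProfile: InstanceProfile = { name: 'emr-ec2-instance-profile-XYZ789', roleName };

// Passing the profile name (the equivalent of instanceProfile.ref) resolves.
const ok = resolveJobFlowRole(generatedProfile.name, [generatedProfile]);

// Passing the role name (assumed to be what an IRole-typed clusterRole yields) fails.
let failure = '';
try {
  resolveJobFlowRole(roleName, [generatedProfile]);
} catch (err) {
  failure = (err as Error).message;
}
```

In this model the lookup fails only because the two names differ, which matches the shape of the error reported below.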
Reproduction Steps
import * as iam from '@aws-cdk/aws-iam';
import * as steps from '@aws-cdk/aws-stepfunctions';
import * as tasks from '@aws-cdk/aws-stepfunctions-tasks';

// `stack` is assumed to be an existing cdk.Stack.
// Example role, our role is constructed with more permissions
const myRole = new iam.Role(stack, 'my-role', {
  assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
  managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceforEC2Role')],
});
const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
  roles: [myRole.roleName],
});
const startClusterTask = new steps.Task(stack, 'Create Cluster', {
  task: new tasks.EmrCreateCluster({
    name: 'myCluster',
    visibleToAllUsers: true,
    releaseLabel: 'emr-6.0.0',
    instances: {
      masterInstanceType: 'm5.xlarge',
      slaveInstanceType: 'm5.xlarge',
      instanceCount: 3,
    },
    clusterRole: myRole, // aka JobFlowRole or instanceProfile.ref
    configurations: [
      {
        classification: 'spark-env',
        properties: {
          PYSPARK_PYTHON: '/usr/bin/python3',
        },
      },
    ],
    applications: [
      {
        name: 'Spark',
      },
    ],
  }),
});
const stopClusterTask = new steps.Task(stack, 'Task', {
  task: new tasks.EmrTerminateCluster({
    clusterId: 'myCluster',
  }),
});
const workflow = steps.Chain.start(startClusterTask).next(stopClusterTask);
const stateMachine = new steps.StateMachine(stack, 'myStateMachine', {
  stateMachineType: steps.StateMachineType.STANDARD,
  definition: workflow,
});
Error Log
Attempting to run the step function will yield:
{
  "error": "EMR.AmazonElasticMapReduceException",
  "cause": "Invalid InstanceProfile: ${profile}. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: ${Request})"
}
Environment
- CLI Version : aws-cli/1.16.227
- Framework Version: 1.37.0
- OS : MacOS Catalina 10.15.4
- Language : Typescript
Other
The type of clusterRole in EmrCreateClusterProps should not be IRole, but a reference to the instance profile, as expected in the EMR specification.
This is 🐛 Bug Report
Issue Analytics
- Created 3 years ago
- Reactions:4
- Comments:13 (6 by maintainers)
Top GitHub Comments
I was able to start a cluster by setting the instance profile’s name to the same name as the role.
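A minimal sketch of that workaround, under stated assumptions: InstanceProfileProps below is a local plain-object mirror of the roles and instanceProfileName properties of iam.CfnInstanceProfile, and profilePropsMatchingRole is a hypothetical helper, not part of the CDK. The role name used is a made-up value.

```typescript
// Local stand-in for the two CfnInstanceProfile properties used in this
// thread (roles, instanceProfileName). It is not the CDK type itself.
interface InstanceProfileProps {
  roles: string[];
  instanceProfileName?: string;
}

// Hypothetical helper: build props so that the instance profile's physical
// name equals the name of the role it wraps.
function profilePropsMatchingRole(roleName: string): InstanceProfileProps {
  return { roles: [roleName], instanceProfileName: roleName };
}

// Made-up role name, for illustration only.
const props = profilePropsMatchingRole('my-emr-ec2-role');

// In the stack from the report, the object would be passed as the third argument:
//   new iam.CfnInstanceProfile(stack, 'emr-ec2-instance-profile',
//     profilePropsMatchingRole(myRole.roleName));
```

With the names aligned, passing myRole as clusterRole would refer to a profile that actually exists, assuming the role's name is what ends up in JobFlowRole.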
I then redeployed my stack and executed my step function. The EMR cluster was created successfully.
Viewing the associated CloudFormation, the Physical ID used for the instance profile is set by instanceProfileName; otherwise one is generated. It appears that, by doing this, the roleName of the IRole that is provided will coincide with the instanceProfileName.
This is neither what one would expect to happen, nor is it obvious from any documentation that the instance profile essentially has to be named the same as the role. It is also not clear to me whether we can provide instance profiles with more than one role listed.
This still seems like a bug in both the CDK API and its implementation.
I am here two years after this issue was opened. This happened back on framework v1 and I see the project is now at v2, but based on the current docs it looks like nothing has changed about the API. I am not sure whether this is still a problem.
If it is, and there is a problem with making this change, as two maintainers have mentioned, I would recommend at least updating the docs to note these necessary constraints before closing this out.