@aws-cdk/aws-stepfunctions-tasks: EmrCreateClusterProps clusterRole should be instance profile not IRole

It seems that the AWS CDK API for creating an EMR cluster from an AWS Step Functions task is wrong.

With an EMR service object, we can provide RunJobFlowInput to runJobFlow that specifies JobFlowRole: instanceProfile.ref, where instanceProfile is a profile with a single role:

// Example role, our role is constructed with more permissions
const myRole = new iam.Role(stack, 'my-role', {
    assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
    managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceforEC2Role')],
  });

const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
    roles: [myRole.roleName],
  });

With stepfunctionsTasks.EmrCreateCluster, however, the required EmrCreateClusterProps renames JobFlowRole to clusterRole and expects clusterRole to be of type iam.IRole instead of a reference to the iam.CfnInstanceProfile above. Attempting to pass the IAM role that the profile was created with (e.g. myRole) results in an Invalid InstanceProfile error when running the Step Function.

This seems to indicate that, although the API asks for an IRole, what it should be requesting is a reference to a CfnInstanceProfile.

Reproduction Steps

// Example role, our role is constructed with more permissions
const myRole = new iam.Role(stack, 'my-role', {
    assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
    managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceforEC2Role')],
  });

const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
    roles: [myRole.roleName],
  });

const startClusterTask = new steps.Task(stack, 'Create Cluster', {
    task: new tasks.EmrCreateCluster({
      name: 'myCluster',
      visibleToAllUsers: true,
      releaseLabel: 'emr-6.0.0',
      instances: {
        masterInstanceType: 'm5.xlarge',
        slaveInstanceType: 'm5.xlarge',
        instanceCount: 3,
      },
      clusterRole: myRole, // aka JobFlowRole or instanceProfile.ref
      configurations: [
        {
          classification: 'spark-env',
          properties: {
            PYSPARK_PYTHON: '/usr/bin/python3',
          },
        },
      ],
      applications: [
        {
          name: 'Spark',
        },
      ],
    }),
  });

const stopClusterTask = new steps.Task(stack, 'Task', {
    task: new tasks.EmrTerminateCluster({
      clusterId: 'myCluster',
    }),
  });

const workflow = steps.Chain.start(startClusterTask).next(stopClusterTask);

const stateMachine = new steps.StateMachine(stack, 'myStateMachine', {
    stateMachineType: steps.StateMachineType.STANDARD,
    definition: workflow,
  });

Error Log

Attempting to run the step function will yield:

{
  "error": "EMR.AmazonElasticMapReduceException",
  "cause": "Invalid InstanceProfile: ${profile}. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: ${Request})"
}

Environment

CLI Version: aws-cli/1.16.227
Framework Version: 1.37.0
OS: macOS Catalina 10.15.4
Language: TypeScript

Other

The type of clusterRole in EmrCreateClusterProps should not be IRole, but a reference to the instance profile, as expected in the EMR specification.

This is 🐛 Bug Report

Issue Analytics

Created 3 years ago
Reactions: 4
Comments: 13 (6 by maintainers)

Top GitHub Comments

2 reactions

kevinfguo commented, May 19, 2020

I was able to start a cluster by setting the instance profile’s name to the same name as the role:

const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
    roles: [myRole.roleName],
    instanceProfileName: myRole.roleName,
  });

I then redeployed my stack and executed my step function. The EMR cluster was created successfully.

Viewing the associated CloudFormation, the Physical ID of the instance profile is set from instanceProfileName; otherwise one is generated. It appears that, by doing this, the IRole’s roleName passed to the task coincides with the instanceProfileName.

This is not what one would expect, and no documentation makes it obvious that the instance profile essentially has to be given the same name as the role. It is also unclear to me whether we can provide instance profiles with more than one role listed.

This still seems like a bug in both the CDK API and its implementation.
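
To make the reasoning in the comment above concrete, here is a minimal toy model of the name coincidence it describes. It is only a sketch: the InstanceProfile type, the resolveJobFlowRole function, and the generated profile name are hypothetical and are not part of the CDK or the EMR SDK. It assumes, as the comment suggests, that the value passed as JobFlowRole is looked up as an instance profile name.

```typescript
// Toy model only: none of these names come from the CDK or the EMR SDK.
interface InstanceProfile {
  name: string;     // physical name: instanceProfileName if set, otherwise generated
  roleName: string; // the role the profile wraps
}

// Assumption: the JobFlowRole value is treated as an instance profile name.
function resolveJobFlowRole(
  jobFlowRole: string,
  profiles: InstanceProfile[],
): InstanceProfile {
  const match = profiles.find((p) => p.name === jobFlowRole);
  if (!match) {
    throw new Error(`Invalid InstanceProfile: ${jobFlowRole}.`);
  }
  return match;
}

const roleName = 'my-role';

// Profile with a generated physical name (hypothetical value):
// looking it up by the role's name finds nothing.
const generated: InstanceProfile = {
  name: 'stack-emr-ec2-instance-profile-ABC123',
  roleName,
};

// Profile explicitly named after the role, as in the workaround:
// looking it up by the role's name succeeds.
const named: InstanceProfile = { name: roleName, roleName };
```

In this model, resolveJobFlowRole(roleName, [generated]) throws, while resolveJobFlowRole(roleName, [named]) returns the profile, which mirrors the behaviour reported before and after setting instanceProfileName.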

1 reaction

kevinfguo commented, Jul 21, 2022

It has been two years since this issue was opened. This happened back on framework v1, and I see the project is now at v2, but based on the current docs it looks like nothing has changed in the API. I'm not sure whether this is still a problem.

If it is, and making this change is as problematic as two maintainers have mentioned, I would recommend at least updating the docs to note these constraints before closing this out.
