@aws-cdk/aws-stepfunctions-tasks: EmrCreateClusterProps clusterRole should be instance profile not IRole

The AWS CDK API for creating an EMR cluster from an AWS Step Functions task seems to be wrong.

With an EMR service object, we can pass a RunJobFlowInput to runJobFlow that specifies JobFlowRole: instanceProfile.ref, where instanceProfile is an instance profile containing a single role:

// Example role, our role is constructed with more permissions
const myRole = new iam.Role(stack, 'my-role', {
    assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
    managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceforEC2Role')],
  });

const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
    roles: [myRole.roleName],
  });

With stepfunctionsTasks.EmrCreateCluster, however, the required EmrCreateClusterProps rename JobFlowRole to clusterRole and expect clusterRole to be of type iam.IRole rather than a reference to the iam.CfnInstanceProfile above. Passing the IAM role that the profile was created with (e.g. myRole) results in an Invalid InstanceProfile error when the Step Function runs.

This seems to indicate that, although the API asks for an IRole, what it should be requesting is a reference to a CfnInstanceProfile.
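
The mismatch can be illustrated with a small model. The sketch below is plain TypeScript, not CDK or EMR code: FakeIam, runJobFlow, and every name in it are hypothetical stand-ins. It assumes, as the error message suggests, that JobFlowRole is looked up as an instance profile name, so a bare role name only works when a profile happens to share that name.

```typescript
// Toy model (hypothetical): roles and instance profiles are separate
// resources, each with its own name.
class FakeIam {
  private roles: string[] = [];
  // Maps instance profile name -> name of the single role it contains.
  private profiles: { [profileName: string]: string } = {};

  addRole(name: string): void {
    this.roles.push(name);
  }

  addInstanceProfile(profileName: string, roleName: string): void {
    if (this.roles.indexOf(roleName) === -1) {
      throw new Error(`Unknown role: ${roleName}`);
    }
    this.profiles[profileName] = roleName;
  }

  hasInstanceProfile(name: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.profiles, name);
  }
}

// Stand-in for the EMR call. Assumption: JobFlowRole is resolved as an
// instance profile name, never as a role name.
function runJobFlow(iam: FakeIam, jobFlowRole: string): string {
  if (!iam.hasInstanceProfile(jobFlowRole)) {
    throw new Error(`Invalid InstanceProfile: ${jobFlowRole}.`);
  }
  return `cluster started with profile ${jobFlowRole}`;
}

const fakeIam = new FakeIam();
fakeIam.addRole('my-role');
fakeIam.addInstanceProfile('emr-ec2-instance-profile', 'my-role');

// Passing the profile name is accepted by the model.
const started = runJobFlow(fakeIam, 'emr-ec2-instance-profile');

// Passing the role name is rejected, mirroring the reported error.
let rejected = '';
try {
  runJobFlow(fakeIam, 'my-role');
} catch (e) {
  rejected = (e as Error).message;
}
```

In this model the role name is rejected only because no profile carries that name, which is consistent with the behaviour reported in this issue.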

Reproduction Steps

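import * as iam from '@aws-cdk/aws-iam';
import * as steps from '@aws-cdk/aws-stepfunctions';
import * as tasks from '@aws-cdk/aws-stepfunctions-tasks';

// `stack` is assumed to be an existing CDK Stack.
// Example role, our role is constructed with more permissions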
const myRole = new iam.Role(stack, 'my-role', {
    assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
    managedPolicies: [iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceforEC2Role')],
  });

const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
    roles: [myRole.roleName],
  });

const startClusterTask = new steps.Task(stack, 'Create Cluster', {
    task: new tasks.EmrCreateCluster({
      name: 'myCluster',
      visibleToAllUsers: true,
      releaseLabel: 'emr-6.0.0',
      instances: {
        masterInstanceType: 'm5.xlarge',
        slaveInstanceType: 'm5.xlarge',
        instanceCount: 3,
      },
      clusterRole: myRole, // aka JobFlowRole or instanceProfile.ref
      configurations: [
        {
          classification: 'spark-env',
          properties: {
            PYSPARK_PYTHON: '/usr/bin/python3',
          },
        },
      ],
      applications: [
        {
          name: 'Spark',
        },
      ],
    }),
  });

const stopClusterTask = new steps.Task(stack, 'Task', {
    task: new tasks.EmrTerminateCluster({
      clusterId: 'myCluster',
    }),
  });

const workflow = steps.Chain.start(startClusterTask).next(stopClusterTask);

const stateMachine = new steps.StateMachine(stack, 'myStateMachine', {
    stateMachineType: steps.StateMachineType.STANDARD,
    definition: workflow,
  });

Error Log

Attempting to run the step function will yield:

{
  "error": "EMR.AmazonElasticMapReduceException",
  "cause": "Invalid InstanceProfile: ${profile}. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: ${Request})"
}
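
When a failed execution is inspected in code (for example in an integration test), this failure can be recognised from the two fields shown above. A minimal sketch, assuming the failure details arrive as an object with error and cause strings; the ExecutionFailure type and the isInvalidInstanceProfile helper are hypothetical names, not part of any AWS SDK:

```typescript
// Shape of the failure details, as shown in the error log (assumed).
interface ExecutionFailure {
  error: string;
  cause: string;
}

// Returns true when the failure looks like the EMR instance profile
// validation error from this issue.
function isInvalidInstanceProfile(failure: ExecutionFailure): boolean {
  return (
    failure.error === 'EMR.AmazonElasticMapReduceException' &&
    failure.cause.indexOf('Invalid InstanceProfile:') === 0
  );
}

// The sample below copies the log verbatim, placeholders included.
const sampleFailure: ExecutionFailure = {
  error: 'EMR.AmazonElasticMapReduceException',
  cause:
    'Invalid InstanceProfile: ${profile}. (Service: AmazonElasticMapReduce; ' +
    'Status Code: 400; Error Code: ValidationException; Request ID: ${Request})',
};

const matches = isInvalidInstanceProfile(sampleFailure);
const unrelated = isInvalidInstanceProfile({ error: 'States.Timeout', cause: '' });
```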

Environment

  • CLI Version: aws-cli/1.16.227
  • Framework Version: 1.37.0
  • OS: macOS Catalina 10.15.4
  • Language: TypeScript

Other

The type of clusterRole in EmrCreateClusterProps should not be IRole, but a reference to the instance profile, as expected in the EMR specification.


This is a 🐛 Bug Report

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 4
  • Comments: 13 (6 by maintainers)

Top GitHub Comments

kevinfguo commented, May 19, 2020 (2 reactions)

I was able to start a cluster by giving the instance profile the same name as the role:

const instanceProfile = new iam.CfnInstanceProfile(stack, `emr-ec2-instance-profile`, {
    roles: [myRole.roleName],
    instanceProfileName: myRole.roleName,
  });

I then redeployed my stack and executed my step function. The EMR cluster was created successfully.

Looking at the associated CloudFormation, the instance profile’s Physical ID is taken from instanceProfileName when that property is set; otherwise one is generated. It appears that, with the name set this way, the IRole’s roleName that we provide coincides with the instanceProfileName.

This is not what one would expect to happen, and nothing in the documentation makes it obvious that the instance profile essentially has to be named after the role. It is also unclear to me whether we can provide instance profiles with more than one role listed.
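
The workaround can be captured in a small helper that derives the instance profile settings from the role name, so the two names cannot drift apart. This is a sketch in plain TypeScript: InstanceProfileSettings and profileSettingsForRole are hypothetical names, and the returned object only mirrors the roles and instanceProfileName properties used in the snippet above. It relies on the observation in this comment that the role name has to match the profile name.

```typescript
// Mirrors the two CfnInstanceProfile properties used in the workaround.
interface InstanceProfileSettings {
  roles: string[];
  instanceProfileName: string;
}

// Builds settings in which the profile is named after its single role.
function profileSettingsForRole(roleName: string): InstanceProfileSettings {
  if (roleName.length === 0) {
    throw new Error('roleName must not be empty');
  }
  return {
    roles: [roleName],
    instanceProfileName: roleName,
  };
}

const profileSettings = profileSettingsForRole('my-role');
```

In a CDK stack the result would be passed as the props of iam.CfnInstanceProfile, with myRole.roleName as the argument. An IAM instance profile can contain only one role, so a single role name is all the helper needs.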

This still seems like a bug in both the CDK API and its implementation.

kevinfguo commented, Jul 21, 2022 (1 reaction)

It has been two years since this issue was opened. This happened back on framework v1, and I see the project is now at v2, but based on the current docs it looks like nothing has changed in the API; I am not sure whether this is still a problem.

If it is, and making this change is as problematic as two maintainers have mentioned, I would recommend at least updating the docs to note these necessary constraints before closing this out.
