Names for expanded tasks

Description

Airflow currently exposes map_index to the user as a way of distinguishing between tasks in an expansion. The index is unlikely to be meaningful to the user. They probably have their own label for this action. I’m requesting that we allow them to add that label.

To see the problem, consider a DAG that sends email to a list of users generated at runtime:

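from textwrap import dedent

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(...) as dag:

    # Returns one env dict per recipient; the list is only known at runtime.
    @dag.task
    def get_account_status():
        return [
            {
                "NAME": "Wintermute",
                "EMAIL": "wintermute@tessier-ashpool.com",
                "STATUS": "active",
            },
            {
                "NAME": "Hojo",
                "EMAIL": "ops@research.shinra.com",
                "STATUS": "delinquent",
            },
        ]

    # One mapped task instance is created per dict returned above.
    BashOperator.partial(
        task_id="send_email",
        bash_command=dedent(
            """
            cat <<- EOF | tee | mailx -s "your account" $EMAIL
            Dear $NAME,
                Your account status is $STATUS.
            EOF
            """
        ),
    ).expand(env=get_account_status())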

Notice that in the grid view, it’s not obvious which task goes with which user:

[Screenshot: Airflow grid view, 2022-04-14]
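
To make the gap concrete, here is a minimal plain-Python sketch (no Airflow involved) of the kind of label I have in mind. The helper `labels_by_position` is hypothetical and not part of any Airflow API. It only shows that the data already contains a readable name for each position in the list.

```python
# Same rows as returned by get_account_status() in the example above.
accounts = [
    {"NAME": "Wintermute", "EMAIL": "wintermute@tessier-ashpool.com", "STATUS": "active"},
    {"NAME": "Hojo", "EMAIL": "ops@research.shinra.com", "STATUS": "delinquent"},
]


def labels_by_position(rows, key="NAME"):
    """Hypothetical helper: map each list position to a readable label."""
    return {position: row[key] for position, row in enumerate(rows)}


labels = labels_by_position(accounts)
for position, name in labels.items():
    print(f"position {position}: {name}")
```

A label like this is what I would want to see next to each expanded task instead of a bare index.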

Use case/motivation

I’d like to be able to explicitly assign a name to each expanded task so that I can find the right one later. I would like this name to be used (when available) anywhere that the user interacts with the expanded task.

In cases where the user provides no names, perhaps we can generate some. For instance, this expansion generates four instances.

BashOperator.partial(task_id="greet").expand(
    bash_command=["echo hello $USER", "echo goodbye $USER"],
    env=[{"USER": "foo"}, {"USER": "bar"}],
)

The friendliest outcome would be to use the requested feature to name each task:

  • hi_foo
  • hi_bar
  • bye_foo
  • bye_bar
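
As a rough sketch of how such names could come from the user, assume a user-supplied function that receives one combination of the expanded arguments and returns a label. The `label` function below is hypothetical, not an existing Airflow API, and the ordering (last argument varies fastest) is an assumption made for illustration.

```python
from itertools import product

bash_commands = ["echo hello $USER", "echo goodbye $USER"]
envs = [{"USER": "foo"}, {"USER": "bar"}]


def label(bash_command, env):
    """Hypothetical user-supplied naming function; not an Airflow API."""
    greeting = "hi" if "hello" in bash_command else "bye"
    return f"{greeting}_{env['USER']}"


# Assumed ordering: the last expanded argument varies fastest.
labels = [label(cmd, env) for cmd, env in product(bash_commands, envs)]
print(labels)
```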

As it is, the user will see:

  • 1
  • 2
  • 3
  • 4

But if the user doesn’t give names, maybe we should generate some names for them:

  • bash_command_1_env_1
  • bash_command_1_env_2
  • bash_command_2_env_1
  • bash_command_2_env_2
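
A minimal sketch of how fallback names like these could be derived, using only the standard library. `generated_names` is a hypothetical helper, the 1-based numbering simply mirrors the list above, and none of this reflects how Airflow actually enumerates expanded tasks.

```python
from itertools import product


def generated_names(**expanded_kwargs):
    """Hypothetical helper: one name per combination of expanded arguments."""
    keys = list(expanded_kwargs)
    # 1-based positions of each argument's values, to match the names above.
    positions = [range(1, len(expanded_kwargs[key]) + 1) for key in keys]
    return [
        "_".join(f"{key}_{index}" for key, index in zip(keys, combo))
        for combo in product(*positions)
    ]


names = generated_names(
    bash_command=["echo hello $USER", "echo goodbye $USER"],
    env=[{"USER": "foo"}, {"USER": "bar"}],
)
print(names)
```

Names built this way stay unique within one expansion, but they say nothing about the values themselves, which is why user-supplied names would still be preferable.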

I don’t know. I’m creating this issue so we have a place to discuss it.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 1
  • Comments: 18 (18 by maintainers)

Top GitHub Comments

uranusjr commented, Aug 4, 2022 (2 reactions)

It just occurred to me that this is essentially a part of #22073. What we (users) actually want is a more customisable way to identify things (in this instance, a mapped task instance), and if we look past the assumption that a mapped task instance is “task_id + map_index”, we simply need a better way for the user to tell “what is this thing” in the Airflow UI. So let’s keep track of that issue instead to make sure whatever solution we come up with for it correctly considers map_index.

potiuk commented, Aug 3, 2022 (2 reactions)

I think you can forget about this.

You’ve just hit reality train (or rather reality train hit you 😃 )

Look there: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-42+Dynamic+Task+Mapping - see this part:

“Rather than overloading the task_id argument to airflow tasks run (i.e. having a task_id of run_after_loop[0]) we will add a new --mapping-id argument to airflow tasks run – this value will be a ~JSON-encoded~ an integer specifying the index/position of the mapping.” (see also comments in the doc).

We have to support MySQL, and the problem with MySQL is that the index key size is limited. VERY limited. Depending on the type of encoding, it might be as little as 760 characters or so. And task_id + dag_id + (string) task_index already exceed the limit by far. There is no way around it, and this was the main reason (I believe) we had to use an integer, even though we originally planned not a name but a JSON-encoded list of parameters, very similar to what you proposed (which was far better for uniqueness, because it was automated).

But this is just what I saw - by observing it being implemented, so I might be wrong on that account - if that was the only or main reason for changing the original decision.
