Utilize tags for metrics sent to SafeDogStatsdLogger

Description

A recent PR enabled DogStatsD support for Airflow metrics: https://github.com/apache/airflow/pull/7376. While this enables the use of DogStatsD, the code sending metrics to SafeDogStatsdLogger doesn’t use tagging. Instead it sends a unique, monolithic metric name that can’t be aggregated across identifiers such as <dag_id>. This doesn’t scale when someone wants to monitor metrics across multiple DAGs, because each metric sent by each DAG is unique: the number of monitors grows with the number of DAGs.

One example is the timer metrics sent by a DagRun, such as dagrun.duration.failed.<dag_id>. When sent by the DagRun object, <dag_id> isn’t a tag but part of the metric name itself: https://github.com/apache/airflow/blob/master/airflow/models/dagrun.py#L412-L420

What is the problem here?

When metrics are sent to Datadog without tags, it becomes impossible to aggregate them across <dag_id>, because each dagrun.duration.failed.<dag_id> sent by a DAG is completely unique to that <dag_id>.

If I have 20 DAGs in production and want to monitor dagrun.duration.failed.<dag_id>, that means I’ll need 20 separate monitors!

But if <dag_id> is sent as a tag, a single monitor could be used and Datadog can group the metric by <dag_id>.
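
To make the difference concrete, here is a minimal sketch of what the two approaches look like in the DogStatsD datagram format (name:value|type|#tags). The helper dogstatsd_timing_line and the DAG ids are made up for illustration; this is not Airflow’s or the datadog client’s code, and the dag_id:<value> tag key is an assumption about how one might name the tag.

```python
from datetime import timedelta


def dogstatsd_timing_line(name, duration, tags=None):
    """Format a timer as a DogStatsD datagram: <name>:<ms>|ms|#<tag>,<tag>."""
    millis = int(duration.total_seconds() * 1000)
    line = f"{name}:{millis}|ms"
    if tags:
        line += "|#" + ",".join(tags)
    return line


duration = timedelta(seconds=90)
dag_ids = ["etl_daily", "ml_training"]  # hypothetical DAG ids

# Current behaviour: one distinct metric name per DAG.
untagged = [
    dogstatsd_timing_line(f"dagrun.duration.failed.{dag_id}", duration)
    for dag_id in dag_ids
]

# Proposed behaviour: one metric name, with the DAG id carried as a tag.
tagged = [
    dogstatsd_timing_line("dagrun.duration.failed", duration, tags=[f"dag_id:{dag_id}"])
    for dag_id in dag_ids
]

print(untagged)
print(tagged)
```

With the tagged form, every DAG reports under the same metric name, so one monitor grouped by the dag_id tag can cover all of them.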

Use case / motivation

The current way metrics are sent to Datadog isn’t scalable, as it prevents a user from aggregating common metrics across unique tags.

Following the DagRun example given above, the information needed to send this metric as a tag is available. Given this line of code: https://github.com/apache/airflow/blob/master/airflow/models/dagrun.py#L418 and the accompanying function definition: https://github.com/apache/airflow/blob/master/airflow/stats.py#L172 we can modify the function call to send <dag_id> as a tag:

Toy example:

duration = (self.end_date - self.start_date)
if self.state == State.SUCCESS:
    if isinstance(Stats, SafeDogStatsdLogger):
        # Tag-aware backend: keep the metric name stable, pass dag_id as a tag.
        Stats.timing('dagrun.duration.success', duration, tags=[self.dag_id])
    else:
        # Other backends: keep embedding dag_id in the metric name.
        Stats.timing('dagrun.duration.success.{}'.format(self.dag_id), duration)

The preference here is probably not to do type checking before submitting the metric. I’m willing to discuss other solutions here or as part of a PR, and to implement the agreed-upon solution.
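
One possible way to avoid the type check, sketched under assumptions: every logger accepts an optional tags argument, and each backend decides for itself whether to emit real tags or fold the values back into the metric name. The classes PlainStatsdLogger and TaggingStatsdLogger and the function record_dagrun_duration are hypothetical stand-ins, not Airflow’s actual classes, and they only record what they would send.

```python
class PlainStatsdLogger:
    """Hypothetical stand-in for a backend without tag support."""

    def __init__(self):
        self.sent = []

    def timing(self, stat, dt, tags=None):
        # No tag support: fold tag values back into the metric name.
        if tags:
            stat = ".".join([stat, *tags.values()])
        self.sent.append((stat, dt, None))


class TaggingStatsdLogger:
    """Hypothetical stand-in for a tag-aware backend such as DogStatsD."""

    def __init__(self):
        self.sent = []

    def timing(self, stat, dt, tags=None):
        # Tag support: keep the name stable and emit key:value tags.
        tag_list = [f"{key}:{value}" for key, value in (tags or {}).items()]
        self.sent.append((stat, dt, tag_list))


def record_dagrun_duration(stats, dag_id, duration):
    # The call site is identical for both backends; no isinstance check.
    stats.timing("dagrun.duration.success", duration, tags={"dag_id": dag_id})


plain = PlainStatsdLogger()
tagging = TaggingStatsdLogger()
for backend in (plain, tagging):
    record_dagrun_duration(backend, "etl_daily", 12.5)

print(plain.sent)
print(tagging.sent)
```

The trade-off in this sketch is that existing metric names stay unchanged for plain StatsD users, while tag-aware backends get a single aggregatable name.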

Related Issues

This is the PR that created the SafeDogStatsdLogger class: https://github.com/apache/airflow/pull/7376

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 5
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
potiuk commented, May 18, 2022
1 reaction
arbass22 commented, May 18, 2022

@williamBartos I stumbled upon this today, which could be a good workaround: https://docs.datadoghq.com/developers/dogstatsd/dogstatsd_mapper/. It also seems like this use case is common enough that Airflow metrics are the actual example that Datadog uses.
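
For readers unfamiliar with the idea, the sketch below illustrates conceptually what such a name-to-tag mapping does: it matches a legacy metric name and splits the trailing <dag_id> off into a tag. This is only an illustration in Python; the real mapper is configured in the Datadog Agent as described in the linked docs, and the pattern, the function split_legacy_metric, and the dag_id tag key here are assumptions.

```python
import re

# Hypothetical rule: everything after dagrun.duration.<state>. is the DAG id.
LEGACY_PATTERN = re.compile(
    r"^(?P<name>dagrun\.duration\.(?:success|failed))\.(?P<dag_id>.+)$"
)


def split_legacy_metric(metric):
    """Return (name, tags); metrics that do not match pass through untagged."""
    match = LEGACY_PATTERN.match(metric)
    if match is None:
        return metric, []
    return match.group("name"), [f"dag_id:{match.group('dag_id')}"]


print(split_legacy_metric("dagrun.duration.failed.etl_daily"))
print(split_legacy_metric("scheduler.heartbeat"))
```

A mapping like this works without any change to Airflow, at the cost of maintaining one rule per metric family.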
