Prometheus is failing to pull metrics from the latest image
Yesterday, one of my CloudWatch exporters was restarted, and because the only available image tag is latest, my cluster pulled the image that was pushed to Docker Hub 6 days ago. The previous image was running without any issues.

Since the container restarted, Prometheus has been complaining about the format of the exporter's output.
Prometheus logs
time="2018-06-25T08:53:16Z" level=error msg="append failed" err="out of bounds"
source="scrape.go:518" target="{__address__="10.101.90.4:9100",
__metrics_path__="/metrics", __scheme__="http", endpoint="0",
instance="10.101.90.4:9100", job="cloudwatch-exporter",
namespace="monitoring", pod="cloudwatch-exporter-2869121175-zvctm",
service="cloudwatch-exporter"}"
Cloudwatch metrics endpoint
# HELP cloudwatch_requests_total API requests made to CloudWatch
# TYPE cloudwatch_requests_total counter
cloudwatch_requests_total 1645.0
# HELP aws_rds_free_storage_space_average CloudWatch metric AWS/RDS FreeStorageSpace Dimensions: [DBInstanceIdentifier] Statistic: Average Unit: Bytes
# TYPE aws_rds_free_storage_space_average gauge
aws_rds_free_storage_space_average{job="aws_rds",instance="",dbinstance_identifier="ooo",} 5.7720643584E10 1529914980000
aws_rds_free_storage_space_average{job="aws_rds",instance="",dbinstance_identifier="bbbb",} 2.6073378816E10 1529914980000
# HELP aws_ebs_burst_balance_average CloudWatch metric AWS/EBS BurstBalance Dimensions: [VolumeId] Statistic: Average Unit: Percent
# TYPE aws_ebs_burst_balance_average gauge
aws_ebs_burst_balance_average{job="aws_ebs",instance="",volume_id="vol-2222",} 100.0 1529914800000
# HELP aws_ec2_status_check_failed_average CloudWatch metric AWS/EC2 StatusCheckFailed Dimensions: [InstanceId] Statistic: Average Unit: Count
# TYPE aws_ec2_status_check_failed_average gauge
aws_ec2_status_check_failed_average{job="aws_ec2",instance="",instance_id="i-222",} 0.0 1529914980000
# HELP aws_ec2_status_check_failed_instance_average CloudWatch metric AWS/EC2 StatusCheckFailed_Instance Dimensions: [InstanceId] Statistic: Average Unit: Count
# TYPE aws_ec2_status_check_failed_instance_average gauge
aws_ec2_status_check_failed_instance_average{job="aws_ec2",instance="",instance_id="i-222",} 0.0 1529914980000
# HELP aws_ec2_status_check_failed_system_average CloudWatch metric AWS/EC2 StatusCheckFailed_System Dimensions: [InstanceId] Statistic: Average Unit: Count
# TYPE aws_ec2_status_check_failed_system_average gauge
aws_ec2_status_check_failed_system_average{job="aws_ec2",instance="",instance_id="i-222",} 0.0 1529914980000
# HELP aws_ses_send_sum CloudWatch metric AWS/SES Send Dimensions: null Statistic: Sum Unit: Count
# TYPE aws_ses_send_sum gauge
aws_ses_send_sum{job="aws_ses",instance="",} 1781.0 1529828640000
# HELP aws_ses_delivery_sum CloudWatch metric AWS/SES Delivery Dimensions: null Statistic: Sum Unit: Count
# TYPE aws_ses_delivery_sum gauge
aws_ses_delivery_sum{job="aws_ses",instance="",} 1762.0 1529828640000
# HELP aws_ses_bounce_sum CloudWatch metric AWS/SES Bounce Dimensions: null Statistic: Sum Unit: Count
# TYPE aws_ses_bounce_sum gauge
aws_ses_bounce_sum{job="aws_ses",instance="",} 15.0 1529828640000
# HELP cloudwatch_exporter_scrape_duration_seconds Time this CloudWatch scrape took, in seconds.
# TYPE cloudwatch_exporter_scrape_duration_seconds gauge
cloudwatch_exporter_scrape_duration_seconds 2.595850326
# HELP cloudwatch_exporter_scrape_error Non-zero if this scrape failed.
# TYPE cloudwatch_exporter_scrape_error gauge
cloudwatch_exporter_scrape_error 0.0
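A note on the error above, as a hedged reading of the data rather than a confirmed diagnosis: every CloudWatch sample in this exposition carries an explicit timestamp, and some of them (the aws_ses_* samples) lag the scrape time by roughly a day. Prometheus returns the "out of bounds" append error when a sample's timestamp falls before the range the current TSDB head will accept, so stale exporter-supplied timestamps would explain the log entry. Prometheus releases from 2.9 onward (not the 2.0.0-alpha.2 build mentioned below) can be told to ignore exporter-supplied timestamps per job; a minimal sketch of such a scrape_config, reusing the job name and address from the error log:
scrape_configs:
  - job_name: cloudwatch-exporter        # job name taken from the error log above
    honor_timestamps: false              # record samples at scrape time instead of trusting exporter timestamps
    static_configs:
      - targets: ['10.101.90.4:9100']    # address taken from the error log above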
Note: I’m using Prometheus 2.0.0-alpha.2.
Note 2: Can this project tag images before pushing them to Docker Hub?
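Until versioned tags are published, one way to keep a restart from silently pulling a newer build is to pin the container to an image digest in the pod spec. A minimal sketch, assuming the image lives at prom/cloudwatch-exporter; the digest value is a placeholder, not the digest of any real build:
containers:
  - name: cloudwatch-exporter
    # Pinning by digest means a restart re-pulls exactly this build,
    # even though the repository only publishes the latest tag.
    # Replace the placeholder with the digest of the known-good image.
    image: prom/cloudwatch-exporter@sha256:<digest-of-known-good-image>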
I can confirm @gianrubio's statement.
Now metrics collection is working again.
I would also suggest creating tags on Docker Hub like you do on GitHub. Otherwise we cannot pin to a specific image and will end up with breaking images in production.
I don’t think so; as I said before, this only happens with the latest image. Rolling back to this commit fixed the issue.