[Question] Pod Monitor doesn't work with Prometheus (rancher-charts) in a different namespace.

Hello,

I’m facing an issue with the new monitoring approach introduced in version 0.20.0. I can successfully get the metrics from every container using curl http://localhost:9404/metrics, but when I deploy the PodMonitor in my Prometheus namespace I cannot see any metrics or even any discovered targets. The monitoring.coreos.com PodMonitor definition is not clear enough to me, and I don’t understand how Prometheus is able to scrape metrics now that the Strimzi operator no longer exposes port 9404.
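
(A note on the mechanism, assuming standard Strimzi 0.20 behaviour: the dedicated metrics Services were removed in that release, and port 9404 is now only declared as a named container port on the pods themselves, which is exactly what a PodMonitor scrapes; no Service is involved. Below is a sketch of the relevant excerpt from a generated Kafka broker pod spec; the port name is an assumption based on the PodMonitor definitions further down:)

ports:
- name: tcp-prometheus   # metrics port targeted by the PodMonitor; no Service needed
  containerPort: 9404
  protocol: TCP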

  • From there I have one more question: what if I want to use an external Prometheus server deployed outside the cluster, for example on a virtual machine? Is there a way to scrape metrics from the LoadBalancer listener?
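
(On the external-Prometheus question: the LoadBalancer listener carries only Kafka protocol traffic, not the metrics endpoint, so an out-of-cluster Prometheus would need port 9404 exposed separately. A minimal sketch, assuming Strimzi's standard broker pod labels; the Service name is hypothetical and not something Strimzi creates:)

apiVersion: v1
kind: Service
metadata:
  name: kafka-metrics-external   # hypothetical name, for illustration only
  namespace: kafka
spec:
  type: LoadBalancer
  selector:
    strimzi.io/cluster: kafka-cluster
    strimzi.io/name: kafka-cluster-kafka   # assumption: standard Strimzi broker pod labels
  ports:
  - name: tcp-prometheus
    port: 9404
    targetPort: 9404

(Note that a single LoadBalancer Service balances across all brokers; per-broker scrape targets from outside the cluster would need one Service per pod.)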

Strimzi:

  • Operator: 0.20
  • Kafka version: 2.5.0
  • Namespace: kafka

Prometheus:

  • Prometheus is able to discover new services via ServiceMonitor definitions from other projects and namespaces.
  • Deployed with rancher using rancher-charts
  • Namespace: cattle-prometheus

The files used to deploy Kafka and PodMonitor:

  • kafka.yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: kafka-cluster
  namespace: kafka
spec:
  kafka:
    # I just use private registry. There are no changes made to the docker file.
    image: registry1.mydomain.com/kafka/kafka:2.5.0
    version: 2.5.0
    replicas: 5
    logging:
      type: inline
      loggers:
        kafka.root.logger.level: "INFO"
    resources:
      requests:
        memory: 16Gi
        cpu: "8"
      limits:
        memory: 24Gi
        cpu: "12"  
    jvmOptions:
      -Xms: "6g"
      -Xmx: "6g"
      -XX:
        UseG1GC: true
        InitiatingHeapOccupancyPercent: 35
        MinMetaspaceFreeRatio: 50
        MaxMetaspaceFreeRatio: 80
        MaxGCPauseMillis: 20
        MetaspaceSize: "96m"
        G1HeapRegionSize: "16M"
    listeners:
      external:
        type: loadbalancer
        tls: false
        overrides:
          bootstrap:
            address: kafka1.mydomain.com 
          brokers:
          - broker: 0
            advertisedHost: kafka1-broker0.mydomain.com
          - broker: 1
            advertisedHost: kafka1-broker1.mydomain.com
          - broker: 2
            advertisedHost: kafka1-broker2.mydomain.com
          - broker: 3
            advertisedHost: kafka1-broker3.mydomain.com
          - broker: 4
            advertisedHost: kafka1-broker4.mydomain.com
      plain: {}
    config:
      auto.create.topics.enable: "true"
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      default.replication.factor: 3
      log.retention.hours: 168
      num.network.threads: 3
      num.io.threads: 8
      transaction.state.log.min.isr: 2
      log.message.format.version: "2.5"
    storage:
      type: jbod
      volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
    metrics: {}
  zookeeper:
    image: registry1.mydomain.com/kafka/kafka:2.5.0
    replicas: 5
    resources:
      requests:
        memory: 4Gi
        cpu: "2"
      limits:
        memory: 8Gi
        cpu: "4"  
    jvmOptions:
      -Xms: "4g"
      -Xmx: "4g"
    storage:
      type: persistent-claim
      size: 20Gi
      deleteClaim: false
    metrics: {}
    jmxOptions: {}
  entityOperator:
    topicOperator: {}
    userOperator: {}
  kafkaExporter:
    image: registry1.mydomain.com/kafka/kafka:2.5.0
    groupRegex: ".*"
    topicRegex: ".*"
    logging: debug
    enableSaramaLogging: true
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    livenessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
  # cruiseControl:
  #   image: registry1.mydomain.com/kafka/kafka:2.5.0
  #   config:
  #     security.protocol: PLAINTEXT
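
(Side note: metrics: {} in the file above enables the JMX exporter with an empty configuration, so MBeans are exported in the exporter's default naming. A sketch of what a trimmed-down rules section in the same place could look like, loosely modelled on the Strimzi metrics examples:)

    metrics:
      lowercaseOutputName: true
      rules:
      # illustrative rule only: map kafka.server MBean values to Prometheus metric names
      - pattern: "kafka.server<type=(.+), name=(.+)><>Value"
        name: kafka_server_$1_$2
        type: GAUGE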
  • Pod Monitor:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cluster-operator-metrics
  namespace: cattle-prometheus
spec:
  selector:
    matchLabels:
      strimzi.io/kind: cluster-operator
  namespaceSelector:
    matchNames:
      - kafka
  podMetricsEndpoints:
  - path: /metrics
    port: http
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: entity-operator-metric
  namespace: cattle-prometheus
  labels:
    app: prometheus
    release: cluster-monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: entity-operator
  namespaceSelector:
    matchNames:
      - kafka
  podMetricsEndpoints:
  - path: /metrics
    port: healthcheck
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: bridge-metrics
  namespace: cattle-prometheus
  labels:
    app: prometheus
    release: cluster-monitoring
spec:
  selector:
    matchLabels:
      strimzi.io/kind: KafkaBridge
  namespaceSelector:
    matchNames:
      - kafka
  podMetricsEndpoints:
  - path: /metrics
    port: rest-api
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-resources-metrics
  namespace: cattle-prometheus
  labels:
    app: prometheus
    release: cluster-monitoring
spec:
  selector:
    matchExpressions:
      - key: "strimzi.io/kind"
        operator: In
        values: ["Kafka", "KafkaConnect", "KafkaConnectS2I", "KafkaMirrorMaker", "KafkaMirrorMaker2"]
  namespaceSelector:
    matchNames:
      - kafka
  podMetricsEndpoints:
  - path: /metrics
    port: tcp-prometheus
    relabelings:
    - separator: ;
      regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
      replacement: $1
      action: labelmap
    - sourceLabels: [__meta_kubernetes_namespace]
      separator: ;
      regex: (.*)
      targetLabel: namespace
      replacement: $1
      action: replace
    - sourceLabels: [__meta_kubernetes_pod_name]
      separator: ;
      regex: (.*)
      targetLabel: kubernetes_pod_name
      replacement: $1
      action: replace
    - sourceLabels: [__meta_kubernetes_pod_node_name]
      separator: ;
      regex: (.*)
      targetLabel: node_name
      replacement: $1
      action: replace
    - sourceLabels: [__meta_kubernetes_pod_host_ip]
      separator: ;
      regex: (.*)
      targetLabel: node_ip
      replacement: $1
      action: replace
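
(One more thing worth checking for cross-namespace scraping, independent of the PodMonitor definitions above: the Prometheus service account needs RBAC permission to discover pods in the kafka namespace. A minimal sketch; the service account name used by the rancher-charts deployment is an assumption here and may differ:)

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-pod-discovery
  namespace: kafka
rules:
- apiGroups: [""]
  resources: ["pods", "services", "endpoints"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-pod-discovery
  namespace: kafka
subjects:
- kind: ServiceAccount
  name: prometheus              # assumption: the actual service account name may differ
  namespace: cattle-prometheus
roleRef:
  kind: Role
  name: prometheus-pod-discovery
  apiGroup: rbac.authorization.k8s.io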

Greetings,

Thank you for this awesome project!

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

2 reactions
jrivers96 commented, Dec 4, 2020

Hello, I think I might have had the same problem and it drove me a bit nuts for a day.

There is a closed issue about only having the pod monitor and not a service monitor: https://github.com/prometheus-operator/prometheus-operator/issues/3164

I had to add serviceMonitorSelector: {} so that my Prometheus would use the pod monitor. I’m on prometheus-operator 0.33.0.

---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  labels:
    app: strimzi
spec:
  replicas: 1
  serviceAccountName: prometheus-server
  serviceMonitorSelector: {}
  podMonitorSelector:
    matchLabels:
      app: strimzi
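
(A related detail for the cross-namespace case: the Prometheus resource also has a podMonitorNamespaceSelector field that controls which namespaces are searched for PodMonitor objects; when it is unset, only the Prometheus resource's own namespace is searched. Since the PodMonitors above live in cattle-prometheus alongside Prometheus itself that default works, but an empty selector makes the intent explicit:)

spec:
  podMonitorNamespaceSelector: {}   # an empty selector matches all namespaces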
1 reaction
scholzj commented, Feb 17, 2021

Thanks for sharing the solution. I guess this can be closed now?
