
recommendationservice fails on local cluster - "Missing GOOGLE_APPLICATION_CREDENTIALS"

See original GitHub issue

I expected this issue to already be resolved by #318, but I am still getting it when deploying to a local cluster (Kubernetes v1.12.7). Deployed using:

kubectl -n hipster-shop apply -f ./release/kubernetes-manifest

Get pods

NAME                                     READY   STATUS             RESTARTS   AGE
adservice-85949f856-d4zwj                1/1     Running            2          47m
cartservice-6f96bb47c9-wg7xr             1/1     Running            1          47m
checkoutservice-77b94cfb54-4fv2f         1/1     Running            0          48m
currencyservice-554dc5fdfb-ljwxb         1/1     Running            2          47m
emailservice-69bd498fdb-qbfb9            1/1     Running            2          47m
frontend-69dbdc79d4-jnljq                1/1     Running            0          48m
loadgenerator-5d66d4b894-6txt6           1/1     Running            2          48m
paymentservice-8699f4c87d-ghzq5          1/1     Running            3          48m
productcatalogservice-8866d5f46-64mv5    1/1     Running            0          48m
recommendationservice-78c4699f8d-pjf6n   0/1     CrashLoopBackOff   19         47m
redis-cart-d999c4589-rk75j               1/1     Running            0          48m
shippingservice-764c557d86-9ff4n         1/1     Running            0          48m

Pod logs

kubectl logs recommendationservice-78c4699f8d-pjf6n -n hipster-shop
{"timestamp": 1587679018.112441, "severity": "INFO", "name": "recommendationservice-server", "message": "initializing recommendationservice"}
{"timestamp": 1587679018.11267, "severity": "INFO", "name": "recommendationservice-server", "message": "Profiler enabled."}
{"timestamp": 1587679021.117626, "severity": "INFO", "name": "recommendationservice-server", "message": "Unable to start Stackdriver Profiler Python agent. Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started"}
{"timestamp": 1587679021.118083, "severity": "INFO", "name": "recommendationservice-server", "message": "Sleeping 10 seconds to retry Stackdriver Profiler agent initialization"}
{"timestamp": 1587679025.122911, "severity": "INFO", "name": "recommendationservice-server", "message": "Unable to start Stackdriver Profiler Python agent. Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started"}
{"timestamp": 1587679025.123342, "severity": "INFO", "name": "recommendationservice-server", "message": "Sleeping 20 seconds to retry Stackdriver Profiler agent initialization"}
{"timestamp": 1587679029.131499, "severity": "INFO", "name": "recommendationservice-server", "message": "Unable to start Stackdriver Profiler Python agent. Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started"}
{"timestamp": 1587679029.132033, "severity": "INFO", "name": "recommendationservice-server", "message": "Sleeping 30 seconds to retry Stackdriver Profiler agent initialization"}
{"timestamp": 1587679030.133451, "severity": "INFO", "name": "recommendationservice-server", "message": "Tracing enabled."}
{"timestamp": 1587679033.137787, "severity": "INFO", "name": "recommendationservice-server", "message": "Tracing disabled."}
{"timestamp": 1587679033.15424, "severity": "INFO", "name": "recommendationservice-server", "message": "Debugger enabled."}

Describe pod

kubectl describe pod recommendationservice-78c4699f8d-pjf6n -n hipster-shop
Name:               recommendationservice-78c4699f8d-pjf6n
Namespace:          hipster-shop
Priority:           0
PriorityClassName:  <none>
Node:               ubs3/10.0.0.159
Start Time:         Thu, 23 Apr 2020 21:23:47 +0000
Labels:             app=recommendationservice
                    pod-template-hash=78c4699f8d
Annotations:        <none>
Status:             Running
IP:                 10.244.2.30
Controlled By:      ReplicaSet/recommendationservice-78c4699f8d
Containers:
  server:
    Container ID:   docker://cc831f195e2d0336219fbc12958bd8562f7725e2e3f7e1fa34e8c416919f2761
    Image:          gcr.io/google-samples/microservices-demo/recommendationservice:v0.1.5
    Image ID:       docker-pullable://gcr.io/google-samples/microservices-demo/recommendationservice@sha256:8ea2331d368499b09d0ac5ed7e59ac3ce86785931ed35340fcba58c5c113280a
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Thu, 23 Apr 2020 22:14:44 +0000
      Finished:     Thu, 23 Apr 2020 22:15:04 +0000
    Ready:          False
    Restart Count:  21
    Limits:
      cpu:     200m
      memory:  450Mi
    Requests:
      cpu:      100m
      memory:   220Mi
    Liveness:   exec [/bin/grpc_health_probe -addr=:8080] delay=0s timeout=1s period=5s #success=1 #failure=3
    Readiness:  exec [/bin/grpc_health_probe -addr=:8080] delay=0s timeout=1s period=5s #success=1 #failure=3
    Environment:
      PORT:                          8080
      PRODUCT_CATALOG_SERVICE_ADDR:  productcatalogservice:3550
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-n2dpf (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-n2dpf:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-n2dpf
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  55m                    default-scheduler  Successfully assigned hipster-shop/recommendationservice-78c4699f8d-pjf6n to ubs3
  Warning  Unhealthy  53m (x8 over 54m)      kubelet, ubs3      Readiness probe failed: timeout: failed to connect service ":8080" within 1s
  Normal   Killing    53m (x2 over 53m)      kubelet, ubs3      Killing container with id docker://server:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Created    53m (x3 over 54m)      kubelet, ubs3      Created container
  Normal   Started    53m (x3 over 54m)      kubelet, ubs3      Started container
  Normal   Pulled     34m (x11 over 54m)     kubelet, ubs3      Container image "gcr.io/google-samples/microservices-demo/recommendationservice:v0.1.5" already present on machine
  Warning  BackOff    9m26s (x177 over 51m)  kubelet, ubs3      Back-off restarting failed container
  Warning  Unhealthy  4m40s (x63 over 54m)   kubelet, ubs3      Liveness probe failed: timeout: failed to connect service ":8080" within 1s

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 9 (2 by maintainers)

Top GitHub Comments

11 reactions
daniel-sanche commented, Apr 24, 2020

@Sandeep10parmar Is that stacktrace from running the file locally (not the container itself)? That error makes it seem like it’s being run on Python 3 instead of 2.7, but that wouldn’t make sense when using the container.

I don’t actually see anything in the logs that would cause a failure, and I can’t reproduce anything on my end. What are you using for your local cluster? Minikube? Kind? Docker for Desktop?


If you just want to get things working, try un-commenting these fields in the recommendationservice deployment:

        # - name: DISABLE_TRACING
        #   value: "1"
        # - name: DISABLE_PROFILER
        #   value: "1"
        # - name: DISABLE_DEBUGGER
        #   value: "1"

You may also want to try removing the liveness probe to see if that helps.
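
For reference, with those lines un-commented the env block of the recommendationservice Deployment would look roughly like this (a sketch based on the commented fields above; the PORT and PRODUCT_CATALOG_SERVICE_ADDR entries are the existing ones shown in the describe output):

        env:
        - name: PORT
          value: "8080"
        - name: PRODUCT_CATALOG_SERVICE_ADDR
          value: "productcatalogservice:3550"
        - name: DISABLE_TRACING
          value: "1"
        - name: DISABLE_PROFILER
          value: "1"
        - name: DISABLE_DEBUGGER
          value: "1"

Then re-apply the manifest with kubectl apply as before.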

2 reactions
Sandeep10parmar commented, Apr 26, 2020

Removing the liveness probe on recommendationservice did help get all pods running.

        # livenessProbe:
        #   periodSeconds: 5
        #   exec:
        #     command: ["/bin/grpc_health_probe", "-addr=:8080"]
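
If you would rather not edit the manifest, patching the live Deployment should achieve the same thing; a rough sketch (untested here, and assuming the server container is the first entry in the pod spec):

kubectl -n hipster-shop patch deployment recommendationservice --type=json \
  -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/livenessProbe"}]'
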
kubectl get pods -n hipster-shop
NAME                                     READY   STATUS    RESTARTS   AGE
adservice-7bddd55c58-knspz               1/1     Running   0          8h
cartservice-7fd6df59f7-f9bsr             1/1     Running   2          8h
checkoutservice-686fb854f6-5f88v         1/1     Running   0          8h
currencyservice-74b598c8-5kmsj           1/1     Running   0          8h
emailservice-869f4fdc96-wc4xj            1/1     Running   0          8h
frontend-5b7d6d8f7d-842dx                1/1     Running   0          8h
loadgenerator-5c5d7585c6-wkj9h           1/1     Running   3          8h
paymentservice-7bbd4cc8cb-96j8p          1/1     Running   0          8h
productcatalogservice-7b7bdb85b4-t6lmw   1/1     Running   0          8h
recommendationservice-546fb5949d-cpqcl   1/1     Running   0          8h
redis-cart-d999c4589-zd6pc               1/1     Running   0          8h
shippingservice-74f6d5dd4d-dbm5q         1/1     Running   0          8h

For some reason, the external IP on the frontend-external LoadBalancer is not being published. This is probably because my environment is not running on GCP.

kubectl get svc frontend-external -n hipster-shop
NAME                TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
frontend-external   LoadBalancer   10.100.0.6   <pending>     80:30704/TCP   8h
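
On a cluster without a cloud load-balancer controller, the EXTERNAL-IP of a LoadBalancer Service stays <pending> indefinitely. Two ways to reach the frontend anyway, sketched from the service output above (the node IP is a placeholder for any node in your cluster):

# Option 1: use the NodePort that frontend-external already exposes (30704 above)
curl http://<node-ip>:30704/

# Option 2: port-forward to the service and browse http://localhost:8080
kubectl -n hipster-shop port-forward svc/frontend-external 8080:80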
