Intermittent timeout with image previews


My actions before raising this issue

  • [x] Read/searched the docs
  • [x] Searched past issues

Expected Behaviour

We have a docker-compose.override file that mounts cvat_data on an NFS mount point. When we load any page that shows a preview, we expect the request to /api/v1/tasks/10/data?type=preview to succeed.

Current Behaviour

When we load a page such as Tasks or Projects, only about 10% of image preview requests succeed. A request can take up to 60,000 ms before timing out, after which the page finally loads without any images and displays the text ‘preview’ in their place.

This data is reachable and is fetched from this mounted point on a NAS.

Note that it will work perfectly for a while, then it suddenly stops working and becomes unusable.

Possible Solution

We’ve changed the way the drives are mounted several times (mounted on the Linux host and accessed by directory, versus mounted within docker-compose), to no avail. The cause may be the network link between the NAS and the Docker container, but it could also be some other internal bug.
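
One way to narrow this down is to time reads directly on the NFS-backed directory, outside CVAT. The following is a minimal Node.js sketch under stated assumptions: `timeRead` and `probeDir` are hypothetical helper names, the 1000 ms threshold is arbitrary, and the script is meant to run on the Docker host (or any machine that mounts the same export), not inside the CVAT container.

```javascript
// Sketch: time small reads on an NFS-backed directory to spot stalls.
// timeRead and probeDir are hypothetical helpers, not part of CVAT.
const fs = require("node:fs/promises");
const path = require("node:path");

// Milliseconds elapsed since a process.hrtime.bigint() start value.
function elapsedMs(start) {
  return Number(process.hrtime.bigint() - start) / 1e6;
}

// Read up to the first 64 KiB of one file and report how long it took.
async function timeRead(filePath) {
  const start = process.hrtime.bigint();
  try {
    const handle = await fs.open(filePath, "r");
    try {
      await handle.read(Buffer.alloc(65536), 0, 65536, 0);
    } finally {
      await handle.close();
    }
    return { filePath, ok: true, ms: elapsedMs(start) };
  } catch (err) {
    return { filePath, ok: false, ms: elapsedMs(start), error: err.code || String(err) };
  }
}

// Probe every regular file directly inside dir, one at a time, and return
// only the reads that failed or took at least slowMs milliseconds.
async function probeDir(dir, slowMs = 1000) {
  const entries = await fs.readdir(dir, { withFileTypes: true });
  const results = [];
  for (const entry of entries) {
    if (entry.isFile()) {
      results.push(await timeRead(path.join(dir, entry.name)));
    }
  }
  return results.filter((r) => !r.ok || r.ms >= slowMs);
}

// Example (the path is an assumption; use wherever the export is mounted):
// probeDir("/mnt/nas/data").then((slow) => console.log(slow));
```

If slow or failed reads show up here at the same moments the previews hang, the NFS link is the more likely culprit. If reads stay fast, the problem is more likely inside the application.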

Steps to Reproduce (for bugs)

  1. Run the docker compose with the following override setup:
version: "3.3"

services:
  cvat_proxy:
    environment:
      CVAT_HOST: localhost
    ports:
      - "80:80"
  cvat:
    environment:
      CVAT_SHARE_URL: "Mounted from /home/gr directory"
    volumes:
      - cvat_share:/home/django/share:ro
volumes:
  cvat_share:
    driver_opts:
      type: "nfs"
      device: nas.local:/volume3/shared
      o: "addr=nas.local,nolock,soft,rw"

  cvat_data:
    driver_opts:
      type: "nfs"
      device: ":/volume1/data"
      o: "addr=nas.local,nolock,soft,rw"
  2. Create a new project.
  3. Create a new task using files from the external cvat_share volume.
  4. Open the task.
  5. Log out.
  6. Close the window and open a new one.
  7. Log in again.
  8. Keep loading various pages with previews while monitoring the network requests.
  9. At some point the page suddenly takes a long time to load, and the GET requests for the project/task image previews do not respond. Note that a cached response is sometimes used to display the preview images.
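
The manual monitoring in the steps above could also be scripted. Below is a rough sketch, assuming Node.js 18+ (for the global `fetch` and `AbortController`). `timedFetch` and `pollPreview` are hypothetical helpers. Authentication is not handled: pass whatever cookies or tokens your CVAT session needs through `headers`.

```javascript
// Sketch: request the preview endpoint repeatedly and record each latency.
// timedFetch, pollPreview, and the result shape are hypothetical.

// Issue one request and abort it after timeoutMs. The timing covers the
// wait for the response headers, which is where the reported stall occurs.
async function timedFetch(url, { timeoutMs = 60000, headers = {}, fetchImpl = fetch } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const start = Date.now();
  try {
    const response = await fetchImpl(url, { headers, signal: controller.signal });
    return { status: response.status, ms: Date.now() - start, timedOut: false };
  } catch (err) {
    // An aborted request rejects with an error named "AbortError".
    return { status: null, ms: Date.now() - start, timedOut: err.name === "AbortError" };
  } finally {
    clearTimeout(timer);
  }
}

// Run the same request `attempts` times, one after another.
async function pollPreview(url, attempts, options = {}) {
  const results = [];
  for (let i = 0; i < attempts; i += 1) {
    results.push(await timedFetch(url, options));
  }
  return results;
}

// Example (host and task id taken from the report above):
// pollPreview("http://localhost/api/v1/tasks/10/data?type=preview", 20)
//   .then((results) => console.table(results));
```

Logging the results with timestamps alongside `docker logs cvat` may help correlate the stalls with server-side events.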

Context

Just trying to get it working consistently

Your Environment

  • Docker version: 20.10.5
  • Operating system and version: Linux Ubuntu 20
  • Other diagnostic information / logs: docker logs cvat
2021-03-25 23:26:15,023 DEBG 'ssh-agent' stderr output:
debug2: fd 4 setting O_NONBLOCK
debug1: process_message: socket 1 (fd=4) type 11

2021-03-25 23:28:52,336 DEBG 'runserver' stderr output:
[Thu Mar 25 23:28:52.336043 2021] [wsgi:error] [pid 365:tid 139864464013056] [remote 172.28.0.9:43938] WARNING - 2021-03-25 23:28:52,335 - environment - Failed to import module 'tf_detection_api_format.converter.py': Can't import tensorflow. Test process exit code: -4. This is likely because your CPU does not support AVX instructions, which are required for tensorflow.

2021-03-25 23:29:46,357 DEBG 'rqworker_low' stderr output:
DEBUG - 2021-03-25 23:29:46,357 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:29:46,454 DEBG 'rqworker_default_1' stderr output:
DEBUG - 2021-03-25 23:29:46,453 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:29:49,665 DEBG 'rqworker_default_0' stderr output:
DEBUG - 2021-03-25 23:29:49,664 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:36:31,392 DEBG 'rqworker_low' stderr output:
DEBUG - 2021-03-25 23:36:31,389 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:36:31,489 DEBG 'rqworker_default_1' stderr output:
DEBUG - 2021-03-25 23:36:31,488 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:36:34,699 DEBG 'rqworker_default_0' stderr output:
DEBUG - 2021-03-25 23:36:34,698 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

Thanks for the help in advance.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

1 reaction
natrad100 commented, Aug 2, 2021

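Edit getPreview() in cvat-core/src/server-proxy.js:

https://github.com/openvinotoolkit/cvat/blob/380f4d81612f45d071f9a4f70d407a5dd824d929/cvat-core/src/server-proxy.js#L782-L797

and have it return an empty string before doing anything else.

            async function getPreview(tid) {
                // Workaround: return early so the preview request is never sent.
                return "";
                // ...rest of the original function body unchanged (now unreachable)...
            }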

That’s it!
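
A less drastic variant of this workaround would keep previews when they load quickly and fall back to an empty string only when they are slow. The sketch below is an assumption, not CVAT code: `withFallback`, `getPreviewOrEmpty`, and the `loadPreview` parameter are hypothetical names, and the 5000 ms default is arbitrary. Note that the underlying request is not cancelled; the caller just stops waiting for it.

```javascript
// Sketch: stop waiting for a slow preview after timeoutMs and return "".
// withFallback, getPreviewOrEmpty, and loadPreview are hypothetical names.

// Resolve with `fallback` if `promise` has not settled within timeoutMs.
function withFallback(promise, timeoutMs, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// loadPreview stands in for whatever function actually fetches the preview.
async function getPreviewOrEmpty(loadPreview, tid, timeoutMs = 5000) {
  try {
    return await withFallback(loadPreview(tid), timeoutMs, "");
  } catch (err) {
    // Treat a failed request the same as a slow one: no preview.
    return "";
  }
}
```

This only hides the symptom in the UI, as the original workaround does. The slow reads behind the preview endpoint would still need to be diagnosed separately.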

0 reactions
kosehy commented, Aug 3, 2021

I will try your solution. Thank you for sharing!
