Communication between cvat and cvat_db times out when a superuser is being created
My actions before raising this issue
- Read/searched the docs
- Searched past issues
I am trying to deploy cvat behind a corporate firewall for internal network use. I pulled the latest develop branch, followed the installation guide, and got stuck when trying to create a superuser. The error seems to be related to a network timeout.
Expected Behaviour
docker exec -it cvat bash -ic 'python3 ~/manage.py createsuperuser'
The superuser is then created successfully.
Current Behaviour
Error after waiting three minutes
> docker exec -it cvat bash -ic 'python3 ~/manage.py createsuperuser'
INFO - 2021-05-11 21:44:49,630 - font_manager - generated new fontManager
System check identified some issues:
WARNINGS:
?: (urls.W005) URL namespace 'v1' isn't unique. You may not be able to reverse all URLs in this namespace
Traceback (most recent call last):
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
self.connect()
File "/opt/venv/lib/python3.8/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/base/base.py", line 200, in connect
self.connection = self.get_new_connection(conn_params)
File "/opt/venv/lib/python3.8/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
connection = Database.connect(**conn_params)
File "/opt/venv/lib/python3.8/site-packages/psycopg2/__init__.py", line 127, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: could not connect to server: Connection timed out
Is the server running on host "cvat_db" (172.28.0.3) and accepting
TCP/IP connections on port 5432?
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/django/manage.py", line 21, in <module>
execute_from_command_line(sys.argv)
File "/opt/venv/lib/python3.8/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
utility.execute()
File "/opt/venv/lib/python3.8/site-packages/django/core/management/__init__.py", line 395, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/opt/venv/lib/python3.8/site-packages/django/core/management/base.py", line 330, in run_from_argv
self.execute(*args, **cmd_options)
File "/opt/venv/lib/python3.8/site-packages/django/contrib/auth/management/commands/createsuperuser.py", line 79, in execute
return super().execute(*args, **options)
File "/opt/venv/lib/python3.8/site-packages/django/core/management/base.py", line 370, in execute
self.check_migrations()
File "/opt/venv/lib/python3.8/site-packages/django/core/management/base.py", line 459, in check_migrations
executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
File "/opt/venv/lib/python3.8/site-packages/django/db/migrations/executor.py", line 18, in __init__
self.loader = MigrationLoader(self.connection)
File "/opt/venv/lib/python3.8/site-packages/django/db/migrations/loader.py", line 53, in __init__
self.build_graph()
File "/opt/venv/lib/python3.8/site-packages/django/db/migrations/loader.py", line 216, in build_graph
self.applied_migrations = recorder.applied_migrations()
File "/opt/venv/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 77, in applied_migrations
if self.has_table():
File "/opt/venv/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 55, in has_table
with self.connection.cursor() as cursor:
File "/opt/venv/lib/python3.8/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/base/base.py", line 259, in cursor
return self._cursor()
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/base/base.py", line 235, in _cursor
self.ensure_connection()
File "/opt/venv/lib/python3.8/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
self.connect()
File "/opt/venv/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
self.connect()
File "/opt/venv/lib/python3.8/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/base/base.py", line 200, in connect
self.connection = self.get_new_connection(conn_params)
File "/opt/venv/lib/python3.8/site-packages/django/utils/asyncio.py", line 26, in inner
return func(*args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
connection = Database.connect(**conn_params)
File "/opt/venv/lib/python3.8/site-packages/psycopg2/__init__.py", line 127, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: could not connect to server: Connection timed out
Is the server running on host "cvat_db" (172.28.0.3) and accepting
TCP/IP connections on port 5432?
Possible Solution
It may be due to the firewall/network settings on the corporate server. I can build and run cvat on my home Linux machine. Additionally, I tried:
- pinging cvat_db (172.28.0.3) from bash, which failed
- pinging cvat (172.28.0.4) from the cvat_db (172.28.0.3) container, which also failed
- creating ~/.docker/config.json and adding a proxy as below, which did not help
{
"proxies": {
"default": {
"httpProxy": "http://xxx:xxx",
"httpsProxy": "https://xxx:xxx",
"noProxy": "172.*"
}
}
}
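A failed ping does not by itself prove that TCP traffic is blocked, since ICMP can be filtered separately. The traceback above reports a TCP timeout on port 5432, so probing that port directly may give a clearer signal. The following is a minimal sketch, assuming Python 3 is available in the container you run it from; the helper name `can_connect` is hypothetical and not part of cvat.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("cvat_db", 5432)` from inside the cvat container, and then again with the IP address 172.28.0.3, would help separate a name-resolution problem from a routing or firewall problem.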
Steps to Reproduce (for bugs)
Context
This issue occurred while I was trying to deploy cvat in a corporate research setting, so I guess other potential users may run into a similar problem.
Your Environment
- Git hash commit (`git log -1`): commit 7a4980619ccb1c8896ba1a0b948225ffd3864195
- Docker version (`docker version`, e.g. Docker 17.0.05): 20.10.2
- Are you using Docker Swarm or Kubernetes? No
- Operating System and version (e.g. Linux, Windows, MacOS): Ubuntu 20.04
- Code example or link to GitHub repo or gist to reproduce problem: N/A
- Other diagnostic information / logs:
Logs from `cvat` container
2021-05-11 21:28:04,765 INFO RPC interface 'supervisor' initialized
2021-05-11 21:28:04,765 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2021-05-11 21:28:04,766 INFO supervisord started with pid 1
2021-05-11 21:28:05,769 INFO spawned: 'ssh-agent' with pid 8
2021-05-11 21:28:05,771 INFO spawned: 'clamav_update' with pid 9
2021-05-11 21:28:05,773 INFO spawned: 'git_status_updater' with pid 10
2021-05-11 21:28:05,775 INFO spawned: 'rqscheduler' with pid 11
2021-05-11 21:28:05,777 INFO spawned: 'rqworker_default_0' with pid 12
2021-05-11 21:28:05,779 INFO spawned: 'rqworker_default_1' with pid 14
2021-05-11 21:28:05,784 INFO spawned: 'rqworker_low' with pid 17
2021-05-11 21:28:05,786 INFO spawned: 'runserver' with pid 19
2021-05-11 21:28:05,787 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 139978297163056 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stdout)>
2021-05-11 21:28:05,787 DEBG fd 14 closed, stopped monitoring <POutputDispatcher at 139978296924432 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stderr)>
2021-05-11 21:28:05,787 INFO exited: clamav_update (exit status 0; not expected)
2021-05-11 21:28:05,789 DEBG received SIGCHLD indicating a child quit
2021-05-11 21:28:05,789 DEBG 'ssh-agent' stdout output: SSH_AUTH_SOCK=/tmp/ssh-agent.sock; export SSH_AUTH_SOCK; echo Agent pid 8;
2021-05-11 21:28:05,790 DEBG 'ssh-agent' stderr output: debug2: fd 3 setting O_NONBLOCK
2021-05-11 21:28:05,792 DEBG 'rqscheduler' stderr output: wait-for-it.sh: waiting for cvat_redis:6379 without a timeout
2021-05-11 21:28:05,794 DEBG 'rqworker_default_0' stderr output: wait-for-it.sh: waiting for cvat_redis:6379 without a timeout
2021-05-11 21:28:05,796 DEBG 'git_status_updater' stderr output: wait-for-it.sh: waiting for cvat_redis:6379 without a timeout
2021-05-11 21:28:05,798 DEBG 'rqworker_default_1' stderr output: wait-for-it.sh: waiting for cvat_redis:6379 without a timeout
2021-05-11 21:28:05,800 DEBG 'rqworker_low' stderr output: wait-for-it.sh: waiting for cvat_redis:6379 without a timeout
2021-05-11 21:28:05,801 DEBG 'runserver' stderr output: wait-for-it.sh: waiting for cvat_db:5432 without a timeout
2021-05-11 21:28:06,802 INFO success: ssh-agent entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-05-11 21:28:06,804 INFO spawned: 'clamav_update' with pid 47
2021-05-11 21:28:06,805 INFO success: git_status_updater entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-05-11 21:28:06,805 INFO success: rqscheduler entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-05-11 21:28:06,805 INFO success: rqworker_default_0 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-05-11 21:28:06,805 INFO success: rqworker_default_1 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-05-11 21:28:06,805 INFO success: rqworker_low entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-05-11 21:28:06,805 INFO success: runserver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-05-11 21:28:06,813 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 139978296924576 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stdout)>
2021-05-11 21:28:06,813 DEBG fd 16 closed, stopped monitoring <POutputDispatcher at 139978296924384 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stderr)>
2021-05-11 21:28:06,813 INFO exited: clamav_update (exit status 0; not expected)
2021-05-11 21:28:06,814 DEBG received SIGCHLD indicating a child quit
2021-05-11 21:28:08,818 INFO spawned: 'clamav_update' with pid 48
2021-05-11 21:28:08,827 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 139978297163056 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stdout)>
2021-05-11 21:28:08,827 DEBG fd 16 closed, stopped monitoring <POutputDispatcher at 139978296924528 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stderr)>
2021-05-11 21:28:08,827 INFO exited: clamav_update (exit status 0; not expected)
2021-05-11 21:28:08,827 DEBG received SIGCHLD indicating a child quit
2021-05-11 21:28:11,834 INFO spawned: 'clamav_update' with pid 49
2021-05-11 21:28:11,845 DEBG fd 10 closed, stopped monitoring <POutputDispatcher at 139978296784544 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stdout)>
2021-05-11 21:28:11,846 DEBG fd 16 closed, stopped monitoring <POutputDispatcher at 139978296924624 for <Subprocess at 139978297162384 with name clamav_update in state STARTING> (stderr)>
2021-05-11 21:28:11,846 INFO exited: clamav_update (exit status 0; not expected)
2021-05-11 21:28:11,846 DEBG received SIGCHLD indicating a child quit
2021-05-11 21:28:12,847 INFO gave up: clamav_update entered FATAL state, too many start retries too quickly
Next steps
You may join our Gitter channel for community support.
Top GitHub Comments
Cheers! @azhavoro thanks for the guidance. The problem was solved after I changed the IP to 172.18.* instead of 172.17.*
At the beginning, I removed the myNetwork left over from our previous experiment. But when I first tried 172.17.*, Docker didn't allow it, probably because I had used it for my nginx experiment.
I then solved the problem by changing 172.17 to 172.18.
Since we got it running, I should close the issue (and surely will). But it is still interesting why 172.28 doesn't work in my case. After cloning the cvat repo to my local drive, I tried to run "docker-compose -f docker-compose.yml -f docker-compose.override.yml up -d" with the following override file:
I don't think it messed up the 172.28 IP route, but judging by the results, it is the only possible cause I can think of now.
@waterfall414 OK, please try changing the network subnet definition to 172.17.0.0/16 (you will probably need to delete some networks from your experiments first) here https://github.com/openvinotoolkit/cvat/blob/develop/docker-compose.yml#L83 and to 172.17.0.1 here https://github.com/openvinotoolkit/cvat/blob/develop/components/serverless/docker-compose.serverless.yml#L17 Hope this helps.
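
One way to test the suspicion that a Docker subnet collides with an address range already in use on the host is to compare the CIDR blocks directly. The sketch below uses Python's standard `ipaddress` module. The helper name `find_overlaps` is hypothetical, and the list of existing ranges is something you would have to collect yourself (for example from the host routing table); nothing in this thread confirms which ranges were actually present on the corporate server.

```python
import ipaddress
from typing import List


def find_overlaps(candidate: str, existing: List[str]) -> List[str]:
    """Return the entries of `existing` whose address range overlaps `candidate`."""
    net = ipaddress.ip_network(candidate)
    # ip_network() raises ValueError on malformed CIDR strings,
    # so bad input fails loudly rather than being skipped.
    return [cidr for cidr in existing if net.overlaps(ipaddress.ip_network(cidr))]
```

For instance, `find_overlaps("172.28.0.0/16", routes)` with `routes` holding the host's known ranges returns any that overlap the subnet from docker-compose.yml. An empty list suggests the subnet itself is not the conflict, at least for the ranges supplied.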