Frequent exception in interval timer function with SSL error
Describe the bug:
We are getting frequent weird SSL errors in our Celery tasks that we cannot track down.
Exception in interval timer function
Failed to submit message: "Unable to reach APM Server: [('SSL routines', 'ssl3_get_record', 'decryption failed or bad record mac')] (url: https://XXXX.apm.eu-central-1.aws.cloud.es.io:443/intake/v2/events)"
Error: [('SSL routines', 'ssl3_get_record', 'decryption failed or bad record mac')]
File "elasticapm/utils/threading.py", line 84, in run
rval = self._function(*self._args, **self._kwargs)
File "elasticapm/conf/__init__.py", line 743, in update_config
new_version, new_config, next_run = self.transport.get_config(self.config_version, keys)
File "elasticapm/transport/http.py", line 141, in get_config
response = self.http.urlopen(
File "urllib3/poolmanager.py", line 324, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "elasticapm/instrumentation/packages/base.py", line 205, in call_if_sampling
return wrapped(*args, **kwargs)
File "urllib3/connectionpool.py", line 597, in urlopen
httplib_response = self._make_request(conn, method, url,
File "urllib3/connectionpool.py", line 384, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "urllib3/connectionpool.py", line 380, in _make_request
httplib_response = conn.getresponse()
File "http/client.py", line 1371, in getresponse
response.begin()
File "http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "http/client.py", line 280, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "socket.py", line 704, in readinto
return self._sock.recv_into(b)
File "urllib3/contrib/pyopenssl.py", line 297, in recv_into
return self.connection.recv_into(*args, **kwargs)
File "OpenSSL/SSL.py", line 1822, in recv_into
self._raise_ssl_error(self._ssl, result)
File "OpenSSL/SSL.py", line 1647, in _raise_ssl_error
_raise_current_error()
File "OpenSSL/_util.py", line 54, in exception_from_error_queue
raise exception_type(errors)
The APM server is reachable, and according to Sentry traces, calls made just seconds before and after the error work fine. Unfortunately, these errors correlate with Celery tasks having transaction problems (waiting for database locks forever). We tried upgrading dependencies and moving the underlying Docker container to Python 3.9, but that didn’t help.
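As a debugging aid only: in general, one commonly reported cause of OpenSSL's "decryption failed or bad record mac" error is a TLS connection that was opened in one process and is then used from a forked child. Nothing in this report confirms that this is what happens here. The sketch below is a hypothetical helper (`ForkGuard` is not part of elasticapm, urllib3, or Celery) that records which process created a resource, so a log line can show whether a fork happened in between.

```python
import os


class ForkGuard:
    """Hypothetical helper: remembers the PID that created a resource."""

    def __init__(self):
        # PID of the process in which the guarded resource was created.
        self._owner_pid = os.getpid()

    def crossed_fork(self):
        # True if we are now running in a different process than the creator.
        return os.getpid() != self._owner_pid

    def describe(self):
        # Short text suitable for a log message next to the SSL error.
        return "created in pid=%d, now in pid=%d" % (self._owner_pid, os.getpid())


guard = ForkGuard()
print(guard.crossed_fork(), guard.describe())
```

Creating such a guard next to the code that sets up the client, and logging `guard.describe()` when the error fires, would show whether the failing calls run in a different process than the one that opened the connection.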
Environment (please complete the following information)
- OS: Docker image python:3.9
- Python version: 3.9.7 (default, Oct 13 2021, 09:00:49) [GCC 10.2.1 20210110]
- Framework and version: Django 3.2.8
- APM Server version: v7.13.2
- Agent version: 6.6.0
Additional context
- Agent config options

ELASTIC_APM = {
    "SERVER_URL": ELASTIC_APM_SERVER_URL,
    "SECRET_TOKEN": ssm_get("elastic-apm-secret-token", STACK_NAME),
    "SERVICE_NAME": env("STACK_NAME"),
    "ENVIRONMENT": env("STACK_NAME"),
    "SERVICE_VERSION": env("GIT_VERSION"),
    "TRANSACTION_SAMPLE_RATE": 0.2,
    "TRANSACTION_NAME_FROM_ROUTE": True,
    "TRANSACTION_MAX_SPANS": 50,
    "API_REQUEST_TIME": "5s",
    "CLOUD_PROVIDER": False,
    "TRANSACTIONS_IGNORE_PATTERNS": [
        "submissions.tasks.check_submissions_periodically_task",
        "base.views.HealthCheckCustomView",
        "base.views.okay_view",
        "health_check.contrib.celery.tasks.add",
    ],
}
NEW_APPS += ["elasticapm.contrib.django"]
- requirements.txt

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
argon2_cffi = "~=20.1.0"
attrs = "~=19.3.0"
bagit = "~=1.7.0"
canonicaljson = "~=1.2.0"
celery = "~=5.1.2"
celery-redbeat = "~=2.0.0"
Django = "~=3.2.8"
django-environ = "~=0.4.5"
django-extensions = "~=3.0.9"
django-model-utils = "~=4.0.0"
django-redis = "~=5.0.0"
django-storages = "~=1.11.1"
djangorestframework = "~=3.12.4"
elastic-apm = "~=6.6.0"
elasticsearch = "<7.0.0"
elasticsearch-dsl = "<7.0.0"
gevent = "~=21.1.2"
kombu = "~=5.1.0"
mock = "~=4.0.2"
model_mommy = "~=2.0.0"
Pillow = "~=8.2.0"
psycopg2 = "~=2.9.1"
pycountry = "~=20.7"
PyYAML = "~=5.4"
redis = "~=3.5.3"
requests = "~=2.26"
sentry-sdk = "~=1.4.3"
ssm-cache = "~=2.5"
uWSGI = "~=2.0.19"
word2number = "~=1.1"

[dev-packages]
autoflake = "~=1.4"
black = "==21.9b0"
coverage = "~=5.2"
ddt = "~=1.4"
django-cprofile-middleware = "*"
django-debug-toolbar = "~=2.2"
django-jenkins = "~=0.110"
django-webtest = "~=1.9"
docformatter = "~=1.0"
httmock = "~=1.2"
hypothesis = "~=5.23"
ipdb = "~=0.13"
isort = "~=5.5"
mypy = "~=0.782"
pep8 = "~=1.7"
pep8-naming = "~=0.11"
pydocstyle = "~=5.0"
tblib = "~=1.7"
yapf = "~=0.30"

[requires]
python_version = "3.9"
Issue Analytics
- State:
- Created 2 years ago
- Comments:9 (4 by maintainers)
Top GitHub Comments
Thanks! Yes, I can run it in a testing environment. I’ll let you know what I find out.
Thanks for tackling this so quickly!