Frequent exception in interval timer function with SSL error

Describe the bug:

We are getting frequent weird SSL errors in our Celery tasks that we cannot track down.

Exception in interval timer function

Failed to submit message: "Unable to reach APM Server: [('SSL routines', 'ssl3_get_record', 'decryption failed or bad record mac')] (url: https://XXXX.apm.eu-central-1.aws.cloud.es.io:443/intake/v2/events)"

Error: [('SSL routines', 'ssl3_get_record', 'decryption failed or bad record mac')]
  File "elasticapm/utils/threading.py", line 84, in run
    rval = self._function(*self._args, **self._kwargs)
  File "elasticapm/conf/__init__.py", line 743, in update_config
    new_version, new_config, next_run = self.transport.get_config(self.config_version, keys)
  File "elasticapm/transport/http.py", line 141, in get_config
    response = self.http.urlopen(
  File "urllib3/poolmanager.py", line 324, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "elasticapm/instrumentation/packages/base.py", line 205, in call_if_sampling
    return wrapped(*args, **kwargs)
  File "urllib3/connectionpool.py", line 597, in urlopen
    httplib_response = self._make_request(conn, method, url,
  File "urllib3/connectionpool.py", line 384, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
    
  File "urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "http/client.py", line 1371, in getresponse
    response.begin()
  File "http/client.py", line 319, in begin
    version, status, reason = self._read_status()
  File "http/client.py", line 280, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "urllib3/contrib/pyopenssl.py", line 297, in recv_into
    return self.connection.recv_into(*args, **kwargs)
  File "OpenSSL/SSL.py", line 1822, in recv_into
    self._raise_ssl_error(self._ssl, result)
  File "OpenSSL/SSL.py", line 1647, in _raise_ssl_error
    _raise_current_error()
  File "OpenSSL/_util.py", line 54, in exception_from_error_queue
    raise exception_type(errors)
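
The error payload above is a list of (library, function, reason) tuples. A minimal sketch of a helper for spotting this specific failure when correlating logs, assuming the exception carries that list as its first argument, as it appears to in the traceback. The name `is_bad_record_mac` is hypothetical and not part of any library:

```python
def is_bad_record_mac(exc: BaseException) -> bool:
    """Return True if exc looks like the 'bad record mac' error shown above.

    Assumption: exc.args[0] is a list of (library, function, reason) tuples.
    """
    if not exc.args or not isinstance(exc.args[0], (list, tuple)):
        return False
    for entry in exc.args[0]:
        # Each entry is expected to be a 3-tuple; the reason is the last item.
        if isinstance(entry, (list, tuple)) and len(entry) == 3:
            if "bad record mac" in str(entry[2]).lower():
                return True
    return False
```

Such a check could be used in a logging filter to count how often this exact error occurs relative to other transport failures.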

The APM server is reachable and, according to Sentry traces, calls made just seconds before and after the failures succeed. Unfortunately, these errors correlate with Celery tasks that have transaction problems (waiting for database locks forever). We tried upgrading our dependencies and moved the underlying Docker container to Python 3.9, but that didn’t help.
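
One commonly reported cause of `decryption failed or bad record mac` is a TLS connection that was opened before a process fork and is then used by both parent and child, which can happen with pre-forking workers. The issue text does not confirm that this is the cause here, so the following is only a sketch of the general mitigation: build the connection-holding object once per process and rebuild it when the process ID changes. `PerProcess` and its `factory` argument are hypothetical names for illustration.

```python
import os
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class PerProcess(Generic[T]):
    """Lazily build one resource per process, rebuilding it after a fork."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._pid: Optional[int] = None
        self._resource: Optional[T] = None

    def get(self) -> T:
        pid = os.getpid()
        if self._resource is None or self._pid != pid:
            # First use, or this is a child that inherited the parent's object:
            # create a fresh resource instead of sharing the parent's sockets.
            self._resource = self._factory()
            self._pid = pid
        return self._resource
```

A caller would wrap whatever owns the sockets, for example a factory that returns a new `urllib3.PoolManager()`, and always go through `get()` rather than holding a module-level instance.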

Environment (please complete the following information)

  • OS: Docker image python:3.9
  • Python version: 3.9.7 (default, Oct 13 2021, 09:00:49) [GCC 10.2.1 20210110]
  • Framework and version: Django 3.2.8
  • APM Server version: v7.13.2
  • Agent version: 6.6.0

Additional context

  • Agent config options

    ELASTIC_APM = {
        "SERVER_URL": ELASTIC_APM_SERVER_URL,
        "SECRET_TOKEN": ssm_get("elastic-apm-secret-token", STACK_NAME),
        "SERVICE_NAME": env("STACK_NAME"),
        "ENVIRONMENT": env("STACK_NAME"),
        "SERVICE_VERSION": env("GIT_VERSION"),
        "TRANSACTION_SAMPLE_RATE": 0.2,
        "TRANSACTION_NAME_FROM_ROUTE": True,
        "TRANSACTION_MAX_SPANS": 50,
        "API_REQUEST_TIME": "5s",
        "CLOUD_PROVIDER": False,
        "TRANSACTIONS_IGNORE_PATTERNS": [
            "submissions.tasks.check_submissions_periodically_task",
            "base.views.HealthCheckCustomView",
            "base.views.okay_view",
            "health_check.contrib.celery.tasks.add",
        ],
    }
    NEW_APPS += ["elasticapm.contrib.django"]
    
  • requirements.txt (provided as a Pipfile):

        [[source]]
        url = "https://pypi.org/simple"
        verify_ssl = true
        name = "pypi"
        
        [packages]
        argon2_cffi = "~=20.1.0"
        attrs = "~=19.3.0"
        bagit = "~=1.7.0"
        canonicaljson = "~=1.2.0"
        celery = "~=5.1.2"
        celery-redbeat = "~=2.0.0"
        Django = "~=3.2.8"
        django-environ = "~=0.4.5"
        django-extensions = "~=3.0.9"
        django-model-utils = "~=4.0.0"
        django-redis = "~=5.0.0"
        django-storages = "~=1.11.1"
        djangorestframework = "~=3.12.4"
        elastic-apm = "~=6.6.0"
        elasticsearch = "<7.0.0"
        elasticsearch-dsl = "<7.0.0"
        gevent = "~=21.1.2"
        kombu = "~=5.1.0"
        mock = "~=4.0.2"
        model_mommy = "~=2.0.0"
        Pillow = "~=8.2.0"
        psycopg2 = "~=2.9.1"
        pycountry = "~=20.7"
        PyYAML = "~=5.4"
        redis = "~=3.5.3"
        requests = "~=2.26"
        sentry-sdk = "~=1.4.3"
        ssm-cache = "~=2.5"
        uWSGI = "~=2.0.19"
        word2number = "~=1.1"
        
        [dev-packages]
        autoflake = "~=1.4"
        black = "==21.9b0"
        coverage = "~=5.2"
        ddt = "~=1.4"
        django-cprofile-middleware = "*"
        django-debug-toolbar = "~=2.2"
        django-jenkins = "~=0.110"
        django-webtest = "~=1.9"
        docformatter = "~=1.0"
        httmock = "~=1.2"
        hypothesis = "~=5.23"
        ipdb = "~=0.13"
        isort = "~=5.5"
        mypy = "~=0.782"
        pep8 = "~=1.7"
        pep8-naming = "~=0.11"
        pydocstyle = "~=5.0"
        tblib = "~=1.7"
        yapf = "~=0.30"
        
        [requires]
        python_version = "3.9"
    
    

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (4 by maintainers)

Top GitHub Comments

webjunkie commented, Nov 1, 2021 (1 reaction)

Thanks! Yes, I can run it in a testing environment. I’ll let you know what I find out.

webjunkie commented, Nov 8, 2021 (0 reactions)

Thanks for tackling this so quickly!
