Questionable extraction of relevant variables by pure_eval

I’ve recently hit an exception in my code: https://github.com/athenianco/athenian-api/blob/master/server/athenian/api/controllers/miners/github/release_load.py#L285

I am using pure_eval to extract the “interesting” vars. This time, it could do better.

[screenshot]

I would expect to see repo, prefix, and settings; however, none of them are shown. Instead, I see completely redundant class types from typing and my own modules.

SDK version: 0.19.4.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 15 (9 by maintainers)

Top GitHub Comments

1 reaction
alexmojaki commented, Mar 10, 2021

I finally got around to this. Your reproduction script worked, excellent instructions there. I see the same thing as in your last screenshot, with just 4 variables.

However, when I poked around in the code, I couldn’t find anything wrong, or any trimming at all. Eventually I went to where the packet is finally sent:

https://github.com/getsentry/sentry-python/blob/b530b6f89ba9c13a9f65a0fa3f151ed42c9befe0/sentry_sdk/transport.py#L209-L219

I added a print(json_dumps(event).decode("utf8")), prettified the result, and found this:

              "vars": {
                "repo": "'src-d/go-git'",
                "prefix": "'github.com/'",
                "settings": {
                  "github.com/src-d/go-git": {
                    "branches": "'{{default}}'",
                    "tags": "'.*'",
                    "match": "<ReleaseMatch.tag_or_branch: 2>"
                  }
                },
                "count": "1",
                "repos": [
                  "'src-d/go-git'"
                ],
                "DoS": "'0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000'",
                "PREFIXES[\"github\"]": "'github.com/'",
                "PREFIXES": {
                  "github": "'github.com/'"
                },
                "ReleaseMatch.branch": "<ReleaseMatch.branch: 0>",
                "ReleaseMatch": "<enum 'ReleaseMatch'>",
                "ReleaseMatch.tag": "<ReleaseMatch.tag: 1>",
                "match_groups": {
                  "ReleaseMatch.tag": {},
                  "ReleaseMatch.branch": {}
                },
                "List": "typing.List",
                "Dict": "typing.Dict",
                "Tuple": "typing.Tuple",
                "ReleaseMatchSetting": "<class 'athenian.api.controllers.settings.ReleaseMatchSetting'>"
              },

All the values are sent, in the correct order specified by pure_eval. But if I sort them alphabetically, the first 4 keys are the ones in the screenshot. So again I think the sentry server is sorting them alphabetically. I don’t know if that’s intentional or just a side effect of the order being lost in JSON.
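To illustrate how the key order can be lost after the client has sent it, here is a minimal Python sketch. It assumes the consumer sorts keys by plain code point, which is only a stand-in: how the Sentry server actually orders them is not confirmed in this thread. The keys are copied from the dump above, and the values are replaced with placeholders because only the order matters here.

```python
import json

# Keys in the order the client sent them (copied from the dump above).
sent_keys = [
    "repo", "prefix", "settings", "count", "repos", "DoS",
    'PREFIXES["github"]', "PREFIXES",
    "ReleaseMatch.branch", "ReleaseMatch", "ReleaseMatch.tag",
    "match_groups", "List", "Dict", "Tuple", "ReleaseMatchSetting",
]
frame_vars = {key: "..." for key in sent_keys}  # placeholder values

# json.dumps keeps dict insertion order by default, so the serialized
# text still carries the order chosen by the integration.
payload = json.dumps(frame_vars)
assert list(json.loads(payload)) == sent_keys

# ASSUMPTION: the consumer sorts keys. sorted() compares by code point,
# so uppercase names come before lowercase ones.
alphabetical = sorted(frame_vars)
first_four = alphabetical[:4]
print(first_four)
```

Under that assumption, the variables nearest the failing line (repo, prefix, settings) end up near the bottom of the list rather than at the top.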

This happened in both versions 0.19.4 and 1.0.0 of sentry_sdk.

You previously wrote:

> I’ve checked the event’s JSON: no, only those 10 are sent.

Clearly that didn’t apply to this reproduction since there were only 4 values. So I can only suggest trying to reproduce that case where the client sent the wrong values and double checking the final event JSON.

Either way, I can’t see any sign that the pure_eval integration is doing anything wrong; it looks more like some other truncation is happening somewhere in Sentry. The integration can only return a dict of values and hope that they get truncated well and preserve their order.

1 reaction
alexmojaki commented, Dec 2, 2020

It looks like pure_eval took the first 10 expressions starting from the top of the function definition and then sentry sorted those alphabetically. Somehow the sorting by closeness didn’t work.

I tried to reproduce it with this script:

import json

import sentry_sdk.serializer
from sentry_sdk.integrations.pure_eval import PureEvalIntegration

from athenian.api.controllers.miners.github.release_load import group_repos_by_release_match

# sentry_sdk.serializer.MAX_DATABAG_BREADTH = 16  # as done in athenian.api

sentry_sdk.init(
    transport=lambda event: print(
        json.dumps(event["exception"]["values"][0]["stacktrace"]["frames"][-1]["vars"], indent=2)
    ),
    integrations=[PureEvalIntegration()],
)

group_repos_by_release_match([2], {}, {})

Output:

{
  "repo": "2",
  "prefix": "'github.com/'",
  "settings": {},
  "count": "1",
  "repos": [
    "2"
  ],
  "PREFIXES[\"github\"]": "'github.com/'",
  "PREFIXES": {
    "github": "'github.com/'"
  },
  "ReleaseMatch.branch": "<ReleaseMatch.branch: 0>",
  "ReleaseMatch": "<enum 'ReleaseMatch'>",
  "ReleaseMatch.tag": "<ReleaseMatch.tag: 1>"
}
Traceback (most recent call last):
  File "/home/alex/.config/JetBrains/PyCharm2020.3/scratches/scratch_1011.py", line 17, in <module>
    group_repos_by_release_match([2], {}, {})
  File "/home/alex/work/athenian-api/server/athenian/api/controllers/miners/github/release_load.py", line 285, in group_repos_by_release_match
    rms = settings[prefix + repo]
TypeError: can only concatenate str (not "int") to str

If I uncomment the MAX_DATABAG_BREADTH line then it also includes the pointless typing stuff but it still has the most important variables at the top.
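The effect of the breadth limit can be sketched as follows, assuming it simply keeps the first N entries in insertion order. The real logic lives in sentry_sdk.serializer and may differ; trim_breadth is a hypothetical helper, and the last five keys are borrowed from the later dump in this thread, so their presence in this particular run is an assumption.

```python
from itertools import islice


def trim_breadth(variables, max_breadth):
    """Hypothetical helper: keep the first max_breadth items in insertion order."""
    return dict(islice(variables.items(), max_breadth))


# The first ten keys match the output above; the last five are assumed.
ordered_vars = {
    key: "..."  # placeholder values, only the order matters
    for key in [
        "repo", "prefix", "settings", "count", "repos",
        'PREFIXES["github"]', "PREFIXES",
        "ReleaseMatch.branch", "ReleaseMatch", "ReleaseMatch.tag",
        "match_groups", "List", "Dict", "Tuple", "ReleaseMatchSetting",
    ]
}

narrow = trim_breadth(ordered_vars, 10)  # ten keys, as in the output above
wide = trim_breadth(ordered_vars, 16)    # limit raised as in athenian.api

print(list(narrow)[-1])
print("List" in narrow, "List" in wide)
```

With either limit the first entries stay repo, prefix, and settings, which is why a client-side breadth limit alone would not explain the important variables going missing.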

I wonder if maybe the client sends everything and then the server does its own trimming and sorting. But that wouldn’t explain why the included variables for you are all the ones that appear at the top of the function.

Please see if you can create a reproducible example, because I don’t know what else I can try to reproduce this.
