'branch' variable can be referenced before assignment

I was essentially replicating the streaming example in this repo except with my own dataset and the code broke with the error shown in the screenshot.

[Screenshot: rrcf bug]

My full code is here.

The bug is at line 460 of rrcf.py: local variable 'branch' referenced before assignment. What’s going wrong is beyond me. Will someone please look into this? I work at LASP, a laboratory in Boulder, CO, and we’re considering using this code in production, but we can’t while this bug exists.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
mdbartos commented, May 23, 2019

Hi @sapols,

I believe I fixed the issue. I’ve pulled the modified code into master. I still need to upload it to PyPI.

Let me know if you are still running into this issue.

A couple notes:

Also, instead of FIFO sampling, I’d probably recommend using something like reservoir sampling: https://en.wikipedia.org/wiki/Reservoir_sampling

Thanks, MDB
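
For readers unfamiliar with the reservoir sampling suggestion above, here is a minimal sketch of the classic "Algorithm R" variant in plain Python. It is not taken from the rrcf code base; the function name `reservoir_sample` and its parameters are hypothetical and only illustrate the idea of keeping a uniform random sample of size k from a stream of unknown length.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of up to k items from an iterable.

    Hypothetical helper for illustration (Algorithm R); not part of rrcf.
    """
    rng = random.Random(seed)
    reservoir = []
    for n, item in enumerate(stream):
        if n < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, n]; the new item is kept with probability k / (n + 1).
            j = rng.randint(0, n)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), 10, seed=0)
```

In a streaming setting, the replacement step would presumably correspond to removing the evicted point from the tree and inserting the new one, rather than always evicting the oldest point as FIFO sampling does.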

1 reaction
mdbartos commented, May 23, 2019

Ok. I believe I’ve found the reason the bug is being triggered.

It’s happening when you try to insert a point that is a near-duplicate, but doesn’t meet the tolerance criteria for a duplicate in at least one dimension. So, the point that it failed on for me was:

array([24.11841722, 24.11841722, 24.11841722, 24.11841722, 24.11841722,
       24.11841722, 24.11841722, 24.11841722, 24.11841722, 24.11841722,
       24.11841722, 24.11841722, 24.11841722, 24.11841722, 24.11841722,
       24.11841722, 24.11841722, 24.09805235])

The difference between this point and the nearest duplicate was:

array([ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        , -0.02036487])

So it wasn’t identified as a duplicate, but was possibly too close to the nearest point for the algorithm to find a cut. I will have to look into why no cut was found. Another issue is that the point was being inserted into a single-point tree (all duplicates), which might be triggering some kind of edge case.
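
The near-duplicate situation described above can be reproduced with NumPy alone. This is a sketch under stated assumptions: it assumes the nearest stored point is 24.11841722 in every dimension (inferred from the difference array), and that duplicates are detected by an element-wise closeness test that must pass in every dimension. The relative tolerance of 1e-5 is a hypothetical value, not one taken from the issue.

```python
import numpy as np

# The point that failed to insert: 17 identical entries, last one slightly off.
point = np.full(18, 24.11841722)
point[-1] = 24.09805235

# Assumption: the nearest stored point is constant across all 18 dimensions.
nearest = np.full(18, 24.11841722)

diff = point - nearest

# Hypothetical tolerance; the point counts as a duplicate only if
# every dimension is within tolerance.
per_dim_close = np.isclose(point, nearest, rtol=1e-5, atol=0.0)
is_duplicate = bool(per_dim_close.all())

print(diff[-1])             # roughly -0.02036487
print(per_dim_close.sum())  # number of dimensions within tolerance
print(is_duplicate)
```

Seventeen of the eighteen dimensions match, so the point fails the all-dimensions duplicate test while leaving only one dimension with any spread in which a cut could be placed.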

Edit: I also just realized that the leaf depth is -1 for all the duplicate leaves, which suggests that this attribute was decremented incorrectly.

Edit 2: The error seems to occur shortly after the first negative depth is encountered, so I believe this is the main problem. Still trying to figure out what causes this to happen. Sometimes it can take 700,000+ insertions for it to occur.
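
Because the failure can take hundreds of thousands of insertions to appear, one way to catch it early is to check for negative leaf depths after each insertion. The sketch below is hypothetical: it assumes the tree exposes a mapping from index to leaf objects, each with an integer depth attribute `d`, and it uses stand-in objects rather than a real tree so that it runs on its own.

```python
from types import SimpleNamespace


def find_negative_depths(leaves):
    """Return the keys of leaves whose depth attribute `d` is negative.

    Hypothetical diagnostic; `leaves` is assumed to map index -> leaf object.
    """
    return [key for key, leaf in leaves.items() if leaf.d < 0]


# Stand-in leaves for illustration only; a real run would pass the tree's leaf mapping.
leaves = {
    0: SimpleNamespace(d=2),
    1: SimpleNamespace(d=-1),  # an incorrectly decremented depth
    2: SimpleNamespace(d=0),
}
bad = find_negative_depths(leaves)
print(bad)
```

Calling such a check inside the insertion loop and stopping at the first non-empty result should narrow the search to the insertion that first corrupts the depth bookkeeping.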
