In-memory caching of instances of user-defined classes does not preserve class identity
Summary
When Streamlit reruns a script that contains a class definition, that class object is created anew in memory. A cached instance of the class, however, remains bound to the old class object. A newly created instance therefore ends up belonging to a different class than a cached instance, which can lead to hard-to-debug errors.
Steps to reproduce
- Run this code with `streamlit run`:

```python
from enum import Enum
import streamlit as st

class A(Enum):
    Var1 = 0

@st.cache
def get_enum_dict():
    return {A.Var1: "Hi"}

look_up_key = A.Var1
cached_value = get_enum_dict()
st.write("class id of look_up_key: {}".format(id(look_up_key.__class__)))
st.write("class id of cached key: {}".format(id(list(cached_value.keys())[0].__class__)))
st.write(cached_value[look_up_key])
```
- Rerun by pressing ‘r’
Expected behavior:
Rerunning should print the same id for the class of look_up_key and for the key in cached_value, and the code should still print "Hi" at the end.
Actual behavior:
On the initial run the code prints the same id twice and the dictionary lookup succeeds.
But on rerun the class ids differ and a KeyError: <A.Var1: 0> is raised.
Is this a regression?
no
Debug info
- Streamlit version: 0.71.0
- Python version: 3.8.3
- Using Conda
- OS version: Mac OS 10.15.7
- Browser version: Firefox 82.0.3 (64-Bit)
Additional information
This bug is not unique to Enums; it happens with any user-defined class that gets re-evaluated on a rerun. I had the same problem with other classes, but this example is the easiest to reduce.
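For illustration only (not Streamlit code), the same mismatch can be reproduced in plain Python by executing a class definition twice, which is roughly what a script rerun does:

```python
# Executing the same class definition twice produces two distinct class
# objects, so instances created from the "old" class no longer match the
# "new" one.
source = """
class A:
    pass
"""

first_run = {}
exec(source, first_run)            # first "run" of the script
old_instance = first_run["A"]()    # instance created during the first run

second_run = {}
exec(source, second_run)           # "rerun" of the script
NewA = second_run["A"]             # same source, but a brand-new class object

print(old_instance.__class__ is NewA)   # False: different class objects
print(isinstance(old_instance, NewA))   # False: the old instance is not an
                                        # instance of the new class
```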
Ideas on how to fix it
Pickling and unpickling the cached object causes the class id to be updated to the new definition.
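This works because pickle stores a class by its module and qualified name rather than by object identity, so unpickling re-binds the value to whatever class is currently defined under that name. A minimal sketch of that behavior (variable names chosen only for illustration):

```python
import pickle
from enum import Enum

class A(Enum):
    Var1 = 0

payload = pickle.dumps(A.Var1)   # stores only "module + qualified name" of A

class A(Enum):                   # simulate the rerun redefining the class
    Var1 = 0

restored = pickle.loads(payload) # the class is looked up again by name
print(restored.__class__ is A)   # True: bound to the *new* definition
print(restored is A.Var1)        # True: the Enum member is re-resolved as well
```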
A very helpful short-term band-aid would be a separate st.cache option that forces pickling and unpickling also for the in-memory cache. That way the user can selectively circumvent the bug for the problematic types.
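Until such an option exists, a user-side approximation of that band-aid is possible. This is only a sketch (get_enum_dict_pickled is a hypothetical variant of the reproduction script above): cache the pickled bytes instead of the live object and unpickle them on every access.

```python
import pickle
from enum import Enum

import streamlit as st

class A(Enum):
    Var1 = 0

@st.cache
def get_enum_dict_pickled():
    # Cache the pickled bytes instead of the live object; the bytes refer to
    # the class by name only, never by identity.
    return pickle.dumps({A.Var1: "Hi"})

# Unpickling on every access re-binds the keys to whatever A is defined in the
# current run, so the lookup keeps working after a rerun.
cached_value = pickle.loads(get_enum_dict_pickled())
st.write(cached_value[A.Var1])
```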
Long term I have two ideas, but I do not know how feasible they are. Walk the object hierarchy of every cached value and:
- apply in-memory pickling selectively, only to classes whose definitions live in files that might be rerun during a session, or
- "hot-patch" the __class__ field upon retrieval from the cache (sketched after this list). I do not know whether that is reliable in Python or whether it has unintended side effects.
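For the second idea, a rough sketch of what such a hot-patch could look like for a single object. The helper repoint_class is hypothetical and not part of Streamlit; a real fix would also have to walk containers and handle types where __class__ assignment is not allowed.

```python
import sys

def repoint_class(obj):
    """Hypothetical sketch: re-bind obj to the class that is *currently*
    defined under the same module and qualified name."""
    cls = obj.__class__
    module = sys.modules.get(cls.__module__)
    # Look up the class name again; nested classes with dotted qualnames
    # would need extra handling.
    current = getattr(module, cls.__qualname__, cls) if module else cls
    if current is not cls:
        try:
            obj.__class__ = current   # only works for compatible layouts
        except TypeError:
            pass                      # e.g. Enum members, __slots__, builtins
    return obj
```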
Top GitHub Comments
Thank you so much for posting this. This was a very aggravating bug to track down. The stack trace would show that `enum`s which are supposed to be identical were not. I was so confused and frustrated. This bug made it difficult for me to use `streamlit` with a mature code base that relied on `enum` hashing for various data operations.

Just want to add another voice to this. I've been bit by this as well, wanting to do branching based on `isinstance`. I also want to be able to use `Enum`s in my code, but have had to give up on that. I want to be able to write library code that is agnostic to the UI I put on top of it. This is the number one issue that stops me from doing that with Streamlit.
This issue is underrated.