
In-memory caching of instances of user-defined classes does not preserve class identity

See original GitHub issue

Summary

When Streamlit reruns a file that contains a class definition, the class object is recreated in memory. A cached instance of that class, however, remains an instance of the old class object. A newly created instance therefore ends up belonging to a different class than the cached one, which can lead to hard-to-debug problems.

Steps to reproduce

  1. Run this code with streamlit run
from enum import Enum
import streamlit as st

class A(Enum):
    Var1 = 0

@st.cache
def get_enum_dict():
    # The cached dict keeps a key created from the class object of the run
    # that first populated the cache.
    return {A.Var1: "Hi"}

look_up_key = A.Var1  # key created from the class object of the current run
cached_value = get_enum_dict()
st.write("class id of look_up_key: {}".format(id(look_up_key.__class__)))
st.write("class id of cached key: {}".format(id(list(cached_value.keys())[0].__class__)))
st.write(cached_value[look_up_key])
  2. Rerun by pressing ‘r’

Expected behavior:

Rerunning should print the same id for the class of look_up_key and for the class of the key in cached_value, and the code should still print “Hi” at the end.

Actual behavior:

On the initial run the code prints the same id twice and the dictionary look-up succeeds. On rerun, however, the class ids differ and a KeyError: <A.Var1: 0> is raised.

Is this a regression?

no

Debug info

  • Streamlit version: 0.71.0
  • Python version: 3.8.3
  • Using Conda
  • OS version: Mac OS 10.15.7
  • Browser version: Firefox 82.0.3 (64-Bit)

Additional information

This bug is not unique to Enums; it happens with any user-defined class that gets re-evaluated on a rerun. I had the same problem with other classes, but the Enum case is the easiest to reduce to a minimal example.
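
For illustration, here is a minimal sketch (the Point class and get_point function are made up for this example) of how the same rerun behaviour can surface with a plain class, where it shows up as a failing isinstance check rather than a KeyError:

import streamlit as st

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# allow_output_mutation=True skips output hashing, so the custom object can be
# cached without configuring hash_funcs.
@st.cache(allow_output_mutation=True)
def get_point():
    return Point(1, 2)

p = get_point()
# True on the first run; after a rerun the cached p still belongs to the old
# Point class object, so the check against the freshly defined Point is False.
st.write(isinstance(p, Point))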

Ideas on how to fix it

Pickling and unpickling the cached object causes the class id to be updated to the new definition.

A very helpful short-term band-aid would be a separate st.cache option that forces pickling and unpickling for the in-memory cache as well. That way the user could circumvent the bug selectively for the problematic types.
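
Until such an option exists, the observation above suggests a user-side workaround: pickle the value inside the cached function and unpickle it on every access. A minimal sketch, reusing the enum from the reproduction (the wrapper names are illustrative, and the fact that the round-trip resolves to the fresh class rests on the observation above):

import pickle
from enum import Enum

import streamlit as st

class A(Enum):
    Var1 = 0

@st.cache
def _get_enum_dict_pickled():
    # Cache only bytes, so no instance of an outdated class object is kept
    # alive inside the cache.
    return pickle.dumps({A.Var1: "Hi"})

def get_enum_dict():
    # Unpickling resolves A by name in the current module, so the key belongs
    # to the class object created by this rerun.
    return pickle.loads(_get_enum_dict_pickled())

look_up_key = A.Var1
st.write(get_enum_dict()[look_up_key])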

Long term I have two ideas but do not know how feasible they are: Walk the object hierarchy of every cached value and

  1. apply in-memory pickling selectively, only to classes whose definitions live in files that might be rerun during a session
  2. “hot-patch” the __class__ field upon retrieval from the cache (a rough sketch of what this could look like follows below). But I do not know whether that is reliable in Python or whether it has unintended side effects.
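
A rough sketch of what such a hot-patch might look like for a simple top-level (non-Enum) object; the repoint_class helper is hypothetical, it only repoints the outermost object, and __class__ reassignment fails for many types with incompatible layouts, which is exactly the reliability concern above:

import sys

def repoint_class(obj):
    # Hypothetical helper: look up the class with the same qualified name in
    # the module the object's (possibly stale) class was defined in.
    stale_cls = type(obj)
    module = sys.modules.get(stale_cls.__module__)
    fresh_cls = getattr(module, stale_cls.__qualname__, None) if module else None
    if fresh_cls is not None and fresh_cls is not stale_cls:
        # Raises TypeError for objects whose memory layout is incompatible
        # with the new class; nested attributes are not touched.
        obj.__class__ = fresh_cls
    return obj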

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 8
  • Comments: 11 (4 by maintainers)

Top GitHub Comments

4 reactions
PetrochukM commented, Mar 29, 2021

Thank you so much for posting this. This was a very aggravating bug to track down. The stack trace would show that enums that were supposed to be identical were not. I was so confused and frustrated.

This bug made it difficult for me to use streamlit with a mature code base that relied on enum hashing for various data operations.

3 reactions
harahu commented, Apr 6, 2022

Just want to add another voice to this. I’ve been bitten by this as well, wanting to do branching based on isinstance. I also want to be able to use Enums in my code, but have had to give up on that.

I want to be able to write library code that is agnostic to the UI I put on top of it. This is the number one issue that stops me from doing that with Streamlit.

This issue is underrated.

Read more comments on GitHub >
