[Feature Request] Providing a cache option for `pd.read_sql()`

Is there any work being done to allow a cache mechanism for pandas SQL queries? I typically connect pandas to a SQL database in a Jupyter notebook. When I re-run cells, it would be nice if the query results were read from a cache file unless I explicitly bust the cache.


import pandas as pd

query = "SELECT * FROM table WHERE %(test)s = 'test';"

# `cache=True` is the option being requested; pandas does not accept it today.
df = pd.read_sql(query, db, params={
    'test': 'name'
}, cache=True)

I would imagine the behavior would be that the query string, the db engine connection string, and the params are hashed, and the result is saved as a file named after that hash in some directory.
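The hashing step described above could look roughly like the sketch below. It is only an illustration of the idea in the request, not anything pandas provides: the function name `cache_path`, the `.pkl` suffix, and the choice of SHA-256 over a JSON payload are all assumptions.

```python
import hashlib
import json
from pathlib import Path


def cache_path(cache_dir, query, conn_str, params=None):
    """Map (query, connection string, params) to a deterministic file path.

    Hypothetical helper: the name and the file layout are assumptions.
    """
    # sort_keys makes the key independent of dict ordering; default=str
    # covers values such as dates that json cannot encode natively.
    payload = json.dumps(
        {"query": query, "conn": conn_str, "params": params},
        sort_keys=True,
        default=str,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return Path(cache_dir) / f"{digest}.pkl"
```

Because only the digest ends up in the file name, the connection string itself is never written to disk in plain text, and changing the query, the connection, or any param yields a different file.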

Issue Analytics

State:
Created 5 years ago
Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction

Ouwen commented, Aug 7, 2018

@gfyoung sounds good to me!

1 reaction

Ouwen commented, Aug 6, 2018

For my own purposes, I am just going to write a wrapper function outside of pandas. Given a query, its params, and the database connection string, it would save the results to files in a directory I provide.

However, I think this problem is common enough among developers and researchers that built-in support in pandas would be convenient.
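A wrapper of the kind described in this comment might be sketched as follows. This is not the commenter's actual code: `cached_query`, its `fetch` callable, the `refresh` flag used for cache busting, and the `.query_cache` default directory are hypothetical names, and `params` is assumed to be a dict with string keys and simple values.

```python
import hashlib
import pickle
from pathlib import Path


def cached_query(fetch, query, conn_str, params=None,
                 cache_dir=".query_cache", refresh=False):
    """Return fetch() for this query, reusing a pickled result when one exists.

    Hypothetical wrapper: every name here is an assumption, not a pandas API.
    """
    # Key the cache on the query, the connection string, and the sorted params.
    key = hashlib.sha256(
        repr((query, conn_str, sorted((params or {}).items()))).encode("utf-8")
    ).hexdigest()
    path = Path(cache_dir) / f"{key}.pkl"

    # Cache hit: load the stored result unless the caller asked for a refresh.
    if path.exists() and not refresh:
        with path.open("rb") as fh:
            return pickle.load(fh)

    # Cache miss (or explicit refresh): run the real query and store the result.
    result = fetch()  # e.g. lambda: pd.read_sql(query, engine, params=params)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

In a notebook, `fetch` would typically be a lambda that calls `pd.read_sql` with the same query and params, so re-running the cell reads the pickle instead of hitting the database. Passing `refresh=True` re-runs the query and overwrites the cached file. Pickle files should only be loaded from a directory you trust.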
