Potential integration with Great Expectations?

Hi, @sbrugman!

I’ve been a fan and admirer of pandas-profiling for a long time! I was happy when you started maintaining the project last year—excited to have a good excuse to introduce myself.

I’m one of the core contributors to Great Expectations, an open source library for data testing and documentation. I don’t know if Great Expectations has been on your radar, but we have a very active community and are gaining adoption quickly. We currently provide some light data profiling as part of the package, and our community has been pushing us to do more.

We’re wondering if it makes sense to do tighter integration with pandas-profiling. Specifically, we’re interested in the possibility that pandas-profiling could generate test suites in the Great Expectations format, in addition to HTML and JSON. It would make both packages more valuable, and avoid re-inventing and maintaining the same profiling logic.

Originally, we didn’t think that such an integration made sense, because pandas-profiling rendered results directly into HTML. Reverse-engineering that HTML would be just as much work as (and far less stable than) profiling the data from scratch.

Then last week, a friend mentioned that pandas-profiling now has a to_json method, and you’ve started to add typing, etc. to pandas-profiling’s intermediate logic. This seems really promising.
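As a rough illustration of the idea, the sketch below turns per-column summary statistics into a dictionary shaped like a Great Expectations suite. The input layout (`profile_summary` and its keys `type`, `min`, `max`, `n_missing`, `is_unique`) and the helper `summary_to_suite` are hypothetical and do not reflect pandas-profiling's actual JSON schema. The output layout (`expectation_suite_name`, `expectation_type`, `kwargs`) is an assumption based on the JSON suite format Great Expectations used around the time of this issue, so check it against the version you run.

```python
import json

# Hypothetical per-column summary, loosely modelled on the kind of statistics
# a profiler reports. These keys are assumptions for illustration only, not
# pandas-profiling's real to_json output.
profile_summary = {
    "age": {"type": "numeric", "min": 18, "max": 90, "n_missing": 0},
    "email": {"type": "categorical", "n_missing": 0, "is_unique": True},
    "nickname": {"type": "categorical", "n_missing": 12, "is_unique": False},
}


def summary_to_suite(summary, suite_name="profiled_suite"):
    """Build a suite-shaped dict from per-column statistics (hypothetical helper)."""
    expectations = []
    for column, stats in summary.items():
        # Every profiled column is expected to be present.
        expectations.append(
            {"expectation_type": "expect_column_to_exist",
             "kwargs": {"column": column}}
        )
        # No missing values observed: expect the column to stay non-null.
        if stats.get("n_missing") == 0:
            expectations.append(
                {"expectation_type": "expect_column_values_to_not_be_null",
                 "kwargs": {"column": column}}
            )
        # All values distinct in the profiled sample: expect uniqueness.
        if stats.get("is_unique"):
            expectations.append(
                {"expectation_type": "expect_column_values_to_be_unique",
                 "kwargs": {"column": column}}
            )
        # Numeric columns: expect values to stay within the observed range.
        if stats.get("type") == "numeric" and "min" in stats and "max" in stats:
            expectations.append(
                {"expectation_type": "expect_column_values_to_be_between",
                 "kwargs": {"column": column,
                            "min_value": stats["min"],
                            "max_value": stats["max"]}}
            )
    return {"expectation_suite_name": suite_name, "expectations": expectations}


suite = summary_to_suite(profile_summary)
print(json.dumps(suite, indent=2))
```

Deriving expectations directly from the observed statistics is the simplest possible mapping; a real integration would likely loosen the bounds or let the user choose which expectations to emit.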

Please let me know if you’re up for a call sometime in the next couple of weeks. If you’re interested in collaborating, we could build something really cool and useful together. Even if you’re not in a place where collaborating makes sense, I’d love to say hi and get to know each other.

You can reach me at abe@superconductive.com. Thanks!

Abe

PS: Apologies for not using the normal feature template. This was really meant to be an email, but I couldn’t find a working address for you.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 16
  • Comments: 7

Top GitHub Comments

6 reactions
sbrugman commented, Apr 11, 2020

Thanks for reaching out. I’ve sent you a message. Let’s keep this issue open so that we can share the outcome.

5 reactions
sbrugman commented, Jun 15, 2020

We’re actually working on this behind the scenes. More details coming.
