
Enhancement suggestion for assertSmallDataFrameEquality

Hi:

When assertSmallDataFrameEquality fails, the error message prints the top 5 rows of each DataSet. This makes it extremely difficult to tell what exactly is different, and therefore it’s hard to know what to correct.

I found some code on Stack Overflow that diffs the DataSets and shows the differences. In my case, I added some code based on it to call show on the differences and was able to determine what they were, although the output was hard to spot among all the Spark job output.
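
A minimal sketch of that kind of diff, assuming both DataFrames have the same schema and that Spark SQL is available on the test classpath. The object `DataFrameDiffSketch` and its `diff` helper are hypothetical names for illustration, not part of spark-fast-tests, and this is not necessarily the Stack Overflow code referred to above. It relies only on `Dataset.except`, which has set semantics, so duplicate rows are collapsed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameDiffSketch {
  // Hypothetical helper, not part of spark-fast-tests.
  // Returns (rows only in actual, rows only in expected).
  def diff(actual: DataFrame, expected: DataFrame): (DataFrame, DataFrame) =
    (actual.except(expected), expected.except(actual))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("diff-sketch")
      .getOrCreate()
    import spark.implicits._

    // Deliberately mismatched data: the row with id 3 differs.
    val actual = Seq((1, "a"), (2, "b"), (3, "x")).toDF("id", "value")
    val expected = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "value")

    val (onlyInActual, onlyInExpected) = diff(actual, expected)
    onlyInActual.show(truncate = false)   // rows present only in actual
    onlyInExpected.show(truncate = false) // rows present only in expected

    spark.stop()
  }
}
```

Showing the two differences separately keeps the output short, which may make it easier to find among the Spark job logs.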

Perhaps some version of this could be integrated, maybe one that highlights the differences in a different color within the pretty-printed DataSet, so it’s much easier to focus on what’s different?
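
One way such highlighting could look, as a rough sketch in plain Scala. It assumes rows are already collected into sequences of values and compared by position. The object `RowHighlightSketch` and its `render` helper are hypothetical and do not describe how spark-fast-tests implements its output. The only library feature used is the ANSI color constants in `scala.Console`.

```scala
object RowHighlightSketch {
  // Hypothetical helper: pairs rows by position and wraps mismatched
  // pairs in ANSI red so they stand out in terminal output.
  def render(actual: Seq[Seq[Any]], expected: Seq[Seq[Any]]): Seq[String] =
    actual.zipAll(expected, Seq.empty[Any], Seq.empty[Any]).map { case (a, e) =>
      val line = s"${a.mkString("[", ",", "]")} | ${e.mkString("[", ",", "]")}"
      if (a == e) line else s"${Console.RED}$line${Console.RESET}"
    }

  def main(args: Array[String]): Unit = {
    val actual = Seq(Seq[Any](1, "a"), Seq[Any](2, "x"))
    val expected = Seq(Seq[Any](1, "a"), Seq[Any](2, "b"))
    // The second pair differs, so a color-capable terminal prints it in red.
    render(actual, expected).foreach(println)
  }
}
```

If one side has fewer rows, `zipAll` pads it with an empty row, so the extra rows on the other side are also marked as mismatches.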

Thanks, Ken

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 2
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
khampson commented, Jun 14, 2018

@MrPowers : jitpack.io as a resolver worked for me. I’ve pulled in the new release and ran some tests with deliberately mismatched data to see them come back as red in the list. This definitely makes it easier to tell which row(s) to look at.

Thanks, Ken

0 reactions
MrPowers commented, Jun 7, 2018

@khampson - Can you try accessing the latest release via JitPack? Here’s the code that should work:

resolvers += "jitpack" at "https://jitpack.io"
libraryDependencies += "com.github.mrpowers" % "spark-fast-tests" % "v2.3.0_0.11.0" % "test"

I need to figure out how to upload this to Maven as well. I am going to drop Spark Packages support because that project has been broken for a long time and doesn’t allow users to specify spark-fast-tests as a test dependency (the % "test" part is important in the code snippet above!).

Let me know if this works for you!
