Make the Dataset equality inequality messages better

Here’s the current content inequality message:

[Screenshot: current content inequality message]

I think it’d be better to align this output. It’d also be better to put “Actual Content | Expected Content” on a newline.

[info] com.github.mrpowers.spark.fast.tests.DatasetContentMismatch:
[info] Actual Content      | Expected Content
[info] [frank,44,us]       | [frank,44,us]
[info] [li,30,china]       | [li,30,china]
[info] [bob,1,uk]          | [bob,1,france]
[info] [camila,5,peru]     | [camila,5,peru]
[info] [maria,19,colombia] | [maria,19,colombia]
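
As a rough illustration of the alignment idea, here is a minimal sketch in plain Scala. `AlignedDiff` and `sideBySide` are hypothetical names and are not part of spark-fast-tests; the sketch assumes each row has already been rendered to a string and simply pads the left column to the width of its longest entry.

```scala
object AlignedDiff {
  // Hypothetical helper, not the library's implementation.
  // Pads the left column so every "|" separator lands in the same position.
  def sideBySide(actual: Seq[String], expected: Seq[String]): String = {
    val header = ("Actual Content", "Expected Content")
    // zipAll keeps pairing rows when one side is longer, filling gaps with "".
    val rows = header +: actual.zipAll(expected, "", "")
    // Width of the widest left-hand cell, header included.
    val width = rows.map(_._1.length).max
    rows
      .map { case (a, e) => a.padTo(width, ' ') + " | " + e }
      .mkString("\n")
  }
}
```

Calling `AlignedDiff.sideBySide(actualRows, expectedRows)` returns one multi-line string, so the caller can still decide what to prepend to it before handing it to the test framework.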

It’d be really nice to suppress the [info] prefix on every line, but I’m not sure if that’s possible with Scalatest.

[info] com.github.mrpowers.spark.fast.tests.DatasetContentMismatch:
Actual Content      | Expected Content
[frank,44,us]       | [frank,44,us]
[li,30,china]       | [li,30,china]
[bob,1,uk]          | [bob,1,france]
[camila,5,peru]     | [camila,5,peru]
[maria,19,colombia] | [maria,19,colombia]

Should we get rid of the square brackets for each row of data too?

[info] com.github.mrpowers.spark.fast.tests.DatasetContentMismatch:
Actual Content    | Expected Content
frank,44,us       | frank,44,us
li,30,china       | li,30,china
bob,1,uk          | bob,1,france
camila,5,peru     | camila,5,peru
maria,19,colombia | maria,19,colombia

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

MrPowers commented, Apr 3, 2020 (1 reaction)

Here’s the current DataFrame comparison message:

[Screenshot: current DataFrame comparison message]

Here’s the new message (added in this PR):

[Screenshot: new DataFrame comparison message]

@carlsverre @gorros @snithish - can you please take a look and let me know if this output looks better / you have any suggestions. Some specific points to note:

  • I changed the colors for the matching rows from Blue to DarkGray. Do you think that’s better? Here’s the list of color options.
  • I needed to prepend "Diffs\n" to get the message to output on a newline in Scalatest. "\n" worked for uTest, but not for Scalatest. I also tried hacking in the null character with "\u0000\n", but Scalatest ignored that too. So looks like we need some sort of real character.

mikenac commented, Jun 2, 2022 (0 reactions)

I would love to see something that shows what column values are different. This is especially important for larger data frames that may have 50 columns.
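
One way this could look, sketched in plain Scala without any Spark dependency: `ColumnDiff` and `mismatches` are hypothetical names, and the sketch assumes each row is available as a sequence of values in the same order as the column names. It reports only the columns whose values differ, which keeps the output short for wide rows.

```scala
object ColumnDiff {
  // Hypothetical helper, not part of spark-fast-tests.
  // Returns (columnName, actualValue, expectedValue) for every column
  // where the actual and expected rows disagree.
  def mismatches(
      columns: Seq[String],
      actual: Seq[Any],
      expected: Seq[Any]
  ): Seq[(String, Any, Any)] =
    columns.zip(actual.zip(expected)).collect {
      case (name, (a, e)) if a != e => (name, a, e)
    }
}
```

For the `bob` row from the issue, with assumed column names `name`, `age`, and `country`, only the `country` column would be reported.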
