Make the Dataset equality inequality messages better
Here’s the current content inequality message:

I think it’d be better to align this output. It’d also be better to put “Actual Content | Expected Content” on a newline.
[info] com.github.mrpowers.spark.fast.tests.DatasetContentMismatch:
[info] Actual Content      | Expected Content
[info] [frank,44,us]       | [frank,44,us]
[info] [li,30,china]       | [li,30,china]
[info] [bob,1,uk]          | [bob,1,france]
[info] [camila,5,peru]     | [camila,5,peru]
[info] [maria,19,colombia] | [maria,19,colombia]
It’d be really nice to suppress all the info warnings, but I’m not sure if that’s possible with Scalatest.
[info] com.github.mrpowers.spark.fast.tests.DatasetContentMismatch:
Actual Content      | Expected Content
[frank,44,us]       | [frank,44,us]
[li,30,china]       | [li,30,china]
[bob,1,uk]          | [bob,1,france]
[camila,5,peru]     | [camila,5,peru]
[maria,19,colombia] | [maria,19,colombia]
Should we get rid of the square brackets for each row of data too?
[info] com.github.mrpowers.spark.fast.tests.DatasetContentMismatch:
Actual Content    | Expected Content
frank,44,us       | frank,44,us
li,30,china       | li,30,china
bob,1,uk          | bob,1,france
camila,5,peru     | camila,5,peru
maria,19,colombia | maria,19,colombia
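To make the alignment idea concrete, here is a minimal sketch of how the two-column message could be padded. It is not the library's implementation: `AlignedMessage` and `format` are hypothetical names, and the rows are assumed to be already rendered as strings rather than taken from a Dataset.

```scala
// Hypothetical helper, not part of spark-fast-tests.
// Builds a two-column mismatch table with the left column padded
// so that every " | " separator lines up.
object AlignedMessage {
  def format(actual: Seq[String], expected: Seq[String]): String = {
    val header = "Actual Content"
    // Width of the left column: the widest entry, header included.
    val width = (header +: actual).map(_.length).max
    val headerLine = header.padTo(width, ' ') + " | Expected Content"
    // zipAll keeps rows that exist on only one side, padding the other with "".
    val rows = actual.zipAll(expected, "", "").map { case (a, e) =>
      a.padTo(width, ' ') + " | " + e
    }
    (headerLine +: rows).mkString("\n")
  }
}
```

Calling `AlignedMessage.format(Seq("bob,1,uk"), Seq("bob,1,france"))` would give a header line and one data row with the separator in the same column on both lines.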
Issue Analytics
- State:
- Created 3 years ago
- Comments:9 (6 by maintainers)
Here’s the current DataFrame comparison message:
Here’s the new message (added in this PR):
@carlsverre @gorros @snithish - can you please take a look and let me know if this output looks better / you have any suggestions. Some specific points to note:
I had to add "Diffs\n" to get the message to output on a newline in Scalatest. "\n" worked for uTest, but not for Scalatest. I also tried hacking in the null character with "\u0000\n", but Scalatest ignored that too. So it looks like we need some sort of real character.

I would love to see something that shows which column values are different. This is especially important for larger DataFrames that may have 50 columns.
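One way the column-level request could be approached is sketched below. This is only an illustration under stated assumptions: `ColumnDiff` and `diffs` are hypothetical names, and rows are modeled as plain `Seq[Any]` values alongside a list of column names instead of real Spark `Row` objects.

```scala
// Hypothetical helper, not part of spark-fast-tests.
// Rows are modeled as Seq[Any]; column names are passed in separately.
object ColumnDiff {
  // Returns (rowIndex, columnName, actualValue, expectedValue)
  // for every cell whose values differ. Rows are compared pairwise,
  // so both inputs are assumed to be in the same order.
  def diffs(
      columns: Seq[String],
      actual: Seq[Seq[Any]],
      expected: Seq[Seq[Any]]
  ): Seq[(Int, String, Any, Any)] =
    for {
      ((a, e), rowIdx) <- actual.zip(expected).zipWithIndex
      ((av, ev), colIdx) <- a.zip(e).zipWithIndex
      if av != ev
    } yield (rowIdx, columns(colIdx), av, ev)
}
```

For the example data in this issue, only the `country` cell of the `bob` row would be reported, which stays readable even when a DataFrame has dozens of columns.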