Unpredictable behaviour with lists in columns

@nikitinas

What was the underlying reasoning behind the current way lists in columns are handled? For example:

columnOf(1, null, listOf(1,2,3))

//    untitled
// 0       [1]
// 1       [ ]
// 2 [1, 2, 3]

This turns the column into a Value Column of type List<Int> and wraps every value in a list (the null is lost, since it becomes an empty list). At the same time:

columnOf(1, null, listOf(1,2,3), mapOf(1 to 2))

//    untitled
// 0         1
// 1      null
// 2 [1, 2, 3]
// 3     {1=2}

This one becomes a Value Column of type Any?.

To me, the first behaviour shouldn’t happen. We cannot change the user’s input data so much that we erase nulls, modify list depth, and change values depending on other values in the input. It also gives me a headache when generating types from OpenAPI and trying to catch these cases (an array of objects on its own becomes a Frame column, but if there is a primitive array too, then suddenly everything becomes a primitive list and the column becomes a value column, unless there’s another collection in there… you see? XD)

So, unless there is an important reason for this behaviour, I would opt to remove it, since to me it acts too unpredictably.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

1 reaction
nikitinas commented, Nov 8, 2022

That’s a bug that is probably rooted in unification of column type inference with pivot + groupBy logic.

If a dataframe has duplicate pairs of values in key columns a and b, pivot { a }.groupBy { b } may produce columns with mixed scalar and list values. These are now silently converted into lists in order to provide a usable column type instead of Any.

But this certainly shouldn’t be done in init operations such as columnOf, dataFrameOf or read. So you are absolutely right, this behaviour should be fixed.
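
To make the pivot case above concrete, here is a minimal plain-Kotlin sketch of how duplicate key pairs can yield a column that mixes scalars and lists, and what wrapping everything into lists does to it. It uses only the standard library. `Row`, `pivotCells` and `listify` are hypothetical names invented for this illustration and are not part of the DataFrame API or its implementation.

```kotlin
// Plain-Kotlin model of pivot cells; NOT the DataFrame library's implementation.
data class Row(val a: String, val b: String, val value: Int)

// Collect the values for every (b, a) pair: a single match stays a scalar,
// several matches become a list, which is how mixed cells can arise.
fun pivotCells(rows: List<Row>): Map<Pair<String, String>, Any> =
    rows.groupBy({ it.b to it.a }, { it.value })
        .mapValues { (_, values) -> if (values.size == 1) values.single() else values }

// Unify a mixed column by wrapping each scalar into a single-element list.
fun listify(cells: Collection<Any>): List<List<Any?>> =
    cells.map { if (it is List<*>) it else listOf(it) }

fun main() {
    val rows = listOf(
        Row("x", "k1", 1),
        Row("x", "k2", 2),
        Row("x", "k2", 3), // duplicate (a, b) pair
    )
    val cells = pivotCells(rows)
    println(cells.values)          // [1, [2, 3]]
    println(listify(cells.values)) // [[1], [2, 3]]
}
```

In this model the conversion is helpful after a pivot, because the column gets one consistent list type. The complaint in the issue is that the same conversion was also applied to data the user typed in directly.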

0 reactions
Jolanrensen commented, Nov 11, 2022

Okay, I think I fixed it 😃. If createColumn gets a suggested type of List<Something>, it will convert the values if necessary, but guessValueType won’t suggest it anymore (unless you specify listifyValues = true). Pivot gives a suggested type, so that still works.
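
The split described in this comment can be modelled roughly as follows. This is a simplified plain-Kotlin sketch, assuming only what the comment states: `Kind`, `guessKind` and `buildColumn` are hypothetical stand-ins, not the real signatures of guessValueType or createColumn, and the real inference rules are likely more detailed.

```kotlin
// Simplified stand-ins for the behaviour described above;
// these are NOT the library's real types or signatures.
enum class Kind { LIST, ANY }

// Hypothetical type guess: mixed scalar/list input is only reported as LIST
// when the caller explicitly opts in via listifyValues.
fun guessKind(values: List<Any?>, listifyValues: Boolean = false): Kind {
    val allLists = values.isNotEmpty() && values.all { it is List<*> }
    val someLists = values.any { it is List<*> }
    return if (allLists || (listifyValues && someLists)) Kind.LIST else Kind.ANY
}

// Hypothetical column builder: values are converted only when a LIST type is suggested.
fun buildColumn(values: List<Any?>, suggested: Kind = guessKind(values)): List<Any?> =
    when (suggested) {
        Kind.LIST -> values.map {
            when (it) {
                null -> emptyList<Any?>() // null becomes an empty list
                is List<*> -> it          // lists are kept as they are
                else -> listOf(it)        // scalars are wrapped
            }
        }
        Kind.ANY -> values                // input is left untouched
    }

fun main() {
    val input = listOf(1, null, listOf(1, 2, 3))
    println(buildColumn(input))            // [1, null, [1, 2, 3]]
    println(buildColumn(input, Kind.LIST)) // [[1], [], [1, 2, 3]]
}
```

The first call stands for an init operation with no suggested type, so the input survives unchanged. The second stands for a caller such as pivot that passes a list type explicitly.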
