[API RFC] Support `IColumn.if_else` for Boolean Column

cond.if_else(a, b) -> IColumn, the output is defined as:

  • out_i = a_i if cond_i is True
  • out_i = b_i otherwise

API Demo

>>> import torcharrow as ta
>>> cond = ta.Column([True, False, True, True])
>>> a = ta.Column([1, 2, 3, 4])
>>> b = ta.Column([10, 20, 30, 40])
>>> cond.if_else(a, b)
0   1
1  20
2   3
3   4
dtype: int64, length: 4, null_count: 0
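The element-wise definition above can be sketched in plain Python. This is only an illustration of the semantics, not torcharrow's implementation: the free function name `if_else` and its list-based signature are assumptions, and null handling (not specified in this RFC) follows the literal reading that anything other than True selects from b.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def if_else(cond: Sequence[Optional[bool]], a: Sequence[T], b: Sequence[T]) -> List[T]:
    """Hypothetical reference semantics: out_i = a_i if cond_i is True, else b_i."""
    if not (len(cond) == len(a) == len(b)):
        raise ValueError("cond, a and b must have the same length")
    # Assumption: a null (None) condition falls into the "otherwise" branch.
    return [x if c is True else y for c, x, y in zip(cond, a, b)]


result = if_else([True, False, True, True], [1, 2, 3, 4], [10, 20, 30, 40])
print(result)  # [1, 20, 3, 4]
```

The printed values match the column shown in the demo.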

Current API

This operation currently exists as ite on the Boolean column: https://github.com/facebookresearch/torcharrow/blob/0dbe14d399b766a2dfd596e35cb7843e7514ad59/torcharrow/icolumn.py#L848-L853

API in other frameworks

PyArrow

pyarrow.compute has if_else. Note that instead of being a member function of BooleanArray, it is a standalone function in the pyarrow.compute package that takes 3 arguments (cond, left, right).

Pandas

Doesn’t seem to have an equivalent method?

NumPy/PyTorch/TensorFlow

  • NumPy provides np.where: https://numpy.org/doc/stable/reference/generated/numpy.where.html
  • PyTorch provides torch.where: https://pytorch.org/docs/stable/generated/torch.where.html
  • TensorFlow provides tf.where: https://www.tensorflow.org/api_docs/python/tf/where
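For comparison, here is a small sketch of how the API demo from this RFC maps onto NumPy's standalone three-argument form. It uses plain NumPy arrays rather than torcharrow columns, so it only illustrates the calling convention.

```python
import numpy as np

cond = np.array([True, False, True, True])
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# np.where(condition, x, y) picks from x where condition is True, else from y.
out = np.where(cond, a, b)
print(out.tolist())  # [1, 20, 3, 4]
```

torch.where and tf.where take their arguments in the same (condition, x, y) order.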

R

ifelse as a standalone function: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/ifelse

But in general, R seems to prefer standalone functions instead of fluent-style API.

Discussions

Shall we make ta.if_else a standalone function in the torcharrow package that takes 3 arguments (rather than a member method of BooleanColumn)? This would be similar to Arrow Compute, PyTorch and TensorFlow.

  • Standalone function (if_else(cond, x, y)):

ta.if_else(a < 1, a, 1)

ta.if_else(
    tf.contains([1, 2, 3], a),
    a, b
)

  • Member method (cond.if_else(x, y)):

(a < 1).if_else(a, 1)

tf.contains([1, 2, 3], a).if_else(a, b)
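The two call styles need not be mutually exclusive: a member method can simply delegate to a standalone function. The sketch below shows this with plain Python lists. `BoolColumn`, `_broadcast` and this `if_else` are hypothetical names made up for illustration, not torcharrow APIs, and the scalar broadcasting is an assumption inferred from the `ta.if_else(a < 1, a, 1)` example.

```python
from typing import Any, List, Sequence


def _broadcast(value: Any, n: int) -> List[Any]:
    """Hypothetical helper: repeat a scalar n times, or validate a sequence's length."""
    if isinstance(value, (list, tuple)):
        if len(value) != n:
            raise ValueError("operand length must match the condition length")
        return list(value)
    return [value] * n


def if_else(cond: Sequence[bool], x: Any, y: Any) -> List[Any]:
    """Standalone 3-argument form: if_else(cond, x, y)."""
    xs, ys = _broadcast(x, len(cond)), _broadcast(y, len(cond))
    return [xv if c is True else yv for c, xv, yv in zip(cond, xs, ys)]


class BoolColumn:
    """Hypothetical boolean column offering the fluent 2-argument form."""

    def __init__(self, values: Sequence[bool]) -> None:
        self.values = list(values)

    def if_else(self, x: Any, y: Any) -> List[Any]:
        # The member method is a thin wrapper over the standalone function.
        return if_else(self.values, x, y)


a = [0, 5, -2]
mask = [v < 1 for v in a]                 # stands in for `a < 1` on a column

standalone = if_else(mask, a, 1)          # like ta.if_else(a < 1, a, 1)
fluent = BoolColumn(mask).if_else(a, 1)   # like (a < 1).if_else(a, 1)
print(standalone, fluent)  # [0, 1, -2] [0, 1, -2]
```

Both forms produce the same result, so the choice discussed here is mainly about readability and consistency with other frameworks.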

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

OswinC commented, Oct 28, 2021 (1 reaction)

I see. Yeah, it looks like readability here is more or less a matter of taste :p I do agree that we should change it to the 3-argument form to follow the convention, but I can share what I see when reading (a < 1).if_else(a, 1) versus ta.if_else:

"explicit parenthesis needs to be used in the first case (a < 1).if_else(a, 1)"

The parentheses actually separate the condition and make it stand out more clearly when I read this expression. (Aren’t parentheses widely used for conditions in many programming languages?)

The way the 2-argument form separates the condition from the candidate values also helps the condition stand out clearly in my eyes (or brain) when the condition expression is long.

The if_else symbol breaks the condition expression and the candidate expressions into 2 parts: <condition>.if_else(<then-value>, <else-value>)

e.g.

tf.contains([1, 2, 3], a).if_else(
    a, b
)
