The `whiten` function is racially insensitive and needs to be renamed.
The particular method I am pointing to: https://github.com/scipy/scipy/blob/4c0fd79391e3b2ec2738bf85bb5dab366dcd12e4/scipy/cluster/vq.py#L84-L137
This method normalizes inputs before they are passed to clustering algorithms. Other methods, such as k-means, suggest applying it first before clustering. However, the `whiten` function should be renamed because its name does not accurately describe what it does. The name is ambiguous and does not tell users that it scales features to unit variance. Given current times, a function name like this is racially insensitive. It implies that ‘whitening’ is ‘good’ or ‘normal’ and promotes racial inequality, even if implicitly or unintentionally. Even if the term has historical academic meaning, it was most likely decided arbitrarily at conception and must go.
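For readers unfamiliar with the function, here is a rough sketch of the scaling it performs: each feature (column) is divided by its standard deviation across all observations, so every feature ends up with unit variance. This is a plain-Python illustration, not SciPy's code; the name `scale_to_unit_variance` and the sample data are made up for the example, and leaving zero-variance columns unscaled is an assumption about how to avoid dividing by zero.

```python
from statistics import pstdev


def scale_to_unit_variance(obs):
    """Divide each column of `obs` by its population standard deviation.

    `obs` is a list of rows (observations); each row is a list of features.
    The columns are only rescaled, not centered on their mean.
    """
    columns = list(zip(*obs))
    # Assumption: a column with zero spread is left unscaled (divide by 1.0)
    # so that we never divide by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[x / s for x, s in zip(row, stds)] for row in obs]


# Hypothetical data: three observations with three features each.
features = [
    [1.9, 2.3, 1.7],
    [1.5, 2.5, 2.2],
    [0.8, 0.6, 1.7],
]
scaled = scale_to_unit_variance(features)
```

After scaling, every column of `scaled` has a standard deviation of 1, which is why the result is a common preprocessing step before k-means.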
Therefore, I ask the community to discuss alternative names and familiarize themselves with common racial microaggressions like this. As a solution, I propose to:

1. rename this function to an alternative decided by the community,
2. deprecate the `whiten` function and redirect it to the new function,
3. state the reason for the deprecation in the deprecation message and promote awareness of racial inequality there,
4. remove all references to the `whiten` function, such as in k-means, which can be seen through this search.
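The rename-and-redirect part of this proposal could look roughly like the sketch below. It is only an illustration of the usual deprecation pattern with Python's `warnings` module: the replacement name `scale_features` is hypothetical (no name has been agreed on), its body is a placeholder, and the warning text is a stand-in for whatever message the community settles on.

```python
import warnings


def scale_features(obs):
    """Hypothetical replacement name; placeholder for the real implementation."""
    return obs


def whiten(obs):
    """Deprecated alias that forwards to the hypothetical `scale_features`."""
    warnings.warn(
        "`whiten` is deprecated; use `scale_features` instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return scale_features(obs)
```

Existing callers keep working during the deprecation period and see the warning, while new code can call the replacement directly.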
Issue Analytics
- State:
- Created 3 years ago
- Reactions:4
- Comments:12 (10 by maintainers)
Thanks for asking about this @juliusfrost. I - and I believe most/all SciPy maintainers and regular contributors - definitely desire to make SciPy a project that feels inclusive and welcoming to everyone.
I think most replies so far correctly point out that the `whiten` function doesn’t fall in the category of problematic usages of white/black where “white” implies “good” / “allowed” and “black” implies “bad” / “disallowed”. That said, there may be cases that do fall in that category, such as “blacklist”. And indeed, I did find one piece of code that uses “blacklist”.

I’ve seen two sets of opinions expressed by Black people on this bigger discussion of racially loaded terminology (paraphrasing):
Both are of course valid feelings/opinions. I didn’t look at these terms in our code base over the past couple of weeks because of (1), but now that the issue has been brought up let’s just audit the code base and remove the terms because of (2). I just did that, and I did find “blacklist” usage that can be removed, and no “whitelist” or other problematic “black”/“white” usage. Nor “slave”, although of course we have “master” as the default branch name just like the vast majority of other projects.
Here are the next steps I’d like to take:
`whiten` in a new PR to close this issue.

If unwilling to change the name, there is significant headroom to clarify this in the function documentation. Please consider putting more effort towards a more inclusive environment.