Why is the default value of dropna True in value_counts() methods/functions?

Problem description

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series([1,2,3, np.nan, 5])
>>> s.value_counts()
5    1
3    1
2    1
1    1

>>> s.value_counts(dropna = False)

 5     1
 3     1
 2     1
 1     1
NaN    1

For beginners in pandas, it can be puzzling and misleading not to see NaN values when trying to understand a DataFrame, a Series, etc. This is especially true when value_counts is used to check that previous operations were performed correctly (e.g. join / merge-type operations).
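As a minimal sketch of the merge-check scenario described above (the orders and customers frames and their column names are hypothetical, invented only for illustration), a left merge with one unmatched key produces a NaN that the default value_counts() call does not report:

```python
import pandas as pd

# Hypothetical data: customer_id 30 has no matching row in `customers`.
orders = pd.DataFrame({"order_id": [1, 2, 3], "customer_id": [10, 20, 30]})
customers = pd.DataFrame({"customer_id": [10, 20], "country": ["FR", "US"]})

# The left merge keeps all three orders; the unmatched one gets country = NaN.
merged = orders.merge(customers, on="customer_id", how="left")

default_counts = merged["country"].value_counts()           # NaN row is dropped
full_counts = merged["country"].value_counts(dropna=False)  # NaN row is kept

print(default_counts.sum())  # 2
print(full_counts.sum())     # 3
```

With the default, the counts add up to 2 although the merged frame has 3 rows, so the unmatched order goes unnoticed unless the totals are compared by hand.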

I can understand that it may seem natural to drop NaNs for various operations in pandas (means, etc.), and that, as a consequence, the general default value for dropna arguments is True (is that really the reason?).
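For comparison, here is a small sketch of the "means" case mentioned above, reusing the Series from the problem description. Reductions such as mean() take a skipna parameter that defaults to True, which is the analogous "ignore NaN unless told otherwise" behavior:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, 5])

mean_default = s.mean()             # skipna=True by default: NaN is ignored
mean_strict = s.mean(skipna=False)  # NaN propagates into the result

print(mean_default)  # 2.75
print(mean_strict)   # nan
```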

I feel uncomfortable with the default behavior of value_counts; it has caused me trouble in the past and still does.

The second aphorism of the Zen of Python states:

Explicit is better than implicit

I do feel that dropping NaN values is done implicitly, and that this implicit behavior is harmful. I find no drawbacks to having False as the default value, with the exception of having an NA in the Series index.
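The one drawback mentioned above, a NaN label in the resulting index, can be illustrated with a short sketch. The variable names are illustrative only, and selecting through a boolean mask is just one possible way to work with the NaN label:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, 5])
counts = s.value_counts(dropna=False)

# With dropna=False, NaN becomes one of the index labels.
has_nan_label = bool(counts.index.isna().any())

# A boolean mask on the index separates the NaN count from the rest.
nan_count = int(counts[counts.index.isna()].sum())
non_null_counts = counts[counts.index.notna()]

print(has_nan_label)               # True
print(nan_count)                   # 1
print(int(non_null_counts.sum()))  # 4
```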

The question:

So why is the default value of the dropna argument of value_counts() True?

PS: I've searched the issues with the filter is:issue value_counts dropna. Apart from https://github.com/pandas-dev/pandas/issues/5569, I didn't find much information.

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 2
  • Comments: 6 (5 by maintainers)

Top GitHub Comments

2 reactions
mqk commented, Jul 6, 2021

👍 to this (ancient) question, and one vote from me for changing the default to dropna=False. I add that keyword pretty much every single time I call value_counts…

0 reactions
nmusolino commented, Jul 6, 2021

@nmusolino, I'm a bit troubled, what do you really mean?

[…]

I'm not sure to fully get the arguments that are in favour of dropping NaNs…

Sorry, I had a typo in my old comment, which I've just corrected. I agree with the original issue description: the default should be to include null values (dropna=False).
