API: Standardize usage of underscores in multi-word kwargs


This ticket grew out of a discussion in pull request #22587.

By my rough count, the read_csv method has nearly 50 keyword arguments.

Of those, 32 arguments are made up of two or more words. Twenty of those multi-word arguments use an underscore to mark the space between words, like skip_blank_lines and parse_dates. Twelve do not, like chunksize and lineterminator.

In my opinion, this is a small flaw in pandas’ API, and the library would benefit from standardizing how spaces are handled. It would make pandas more legible and consistent, and therefore easier to use for people of all experience levels.

I have taught pandas to dozens of newbies across the country, and I can testify from experience that small variations in the naming style of commonly used methods introduce unnecessary frustration and can even reduce user confidence in the quality of the overall product.

As a frequent user of pandas, I can also attest that the inconsistencies require me, someone who uses the library daily, to routinely consult the documentation to ensure I use the proper kwarg naming style.

I am sympathetic to the desire to maintain backwards compatibility. I believe it could be managed with temporary deprecation warnings that are ultimately removed in a future version, much in the way sort_values was introduced.
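To make the deprecation idea concrete, here is a minimal sketch of how an old keyword name could keep working while emitting a warning. It is not the patch from #22587 and not pandas' own implementation. The decorator `rename_kwarg`, the toy function `read_lines`, and the new spelling `skip_rows` are all hypothetical names chosen for illustration.

```python
import functools
import warnings


def rename_kwarg(old_name, new_name):
    """Accept ``old_name`` as a deprecated alias for ``new_name`` (hypothetical helper)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                # Refuse ambiguous calls that pass both spellings.
                if new_name in kwargs:
                    raise TypeError(
                        f"Pass only one of {old_name!r} or {new_name!r}."
                    )
                warnings.warn(
                    f"{old_name!r} is deprecated; use {new_name!r} instead.",
                    FutureWarning,
                    stacklevel=2,
                )
                # Forward the value under the new, underscored name.
                kwargs[new_name] = kwargs.pop(old_name)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rename_kwarg("skiprows", "skip_rows")
def read_lines(text, skip_rows=0):
    """Toy stand-in for a reader: return the lines of ``text`` after skipping some."""
    return text.splitlines()[skip_rows:]
```

With this in place, `read_lines(text, skiprows=1)` still works but raises a `FutureWarning`, while `read_lines(text, skip_rows=1)` is silent. Once the deprecation period ends, the decorator can be removed and only the underscored name remains.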

Since the underscore method of handling word breaks is more common and more legible, I propose it be adopted. All existing multi-word arguments without an underscore would need to be modified. You can find an experimental patch of the skiprows kwarg, and considerable support from other users for pursuing this type of change, in #22587.

If that pull request is ultimately merged, and the maintainers agree with the larger goal I’ve tried to articulate here, I would be pleased to lead an effort to expand whatever design pattern is agreed upon to other keyword arguments across the library.

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 2
  • Comments: 28 (28 by maintainers)

Top GitHub Comments

dhimmel commented, Sep 21, 2018 (1 reaction)

I agree that underscore spacing for multi-word arguments is most pythonic and readable. Hopefully, this issue will help ensure new keyword arguments use underscore spacing.

Upgrading existing arguments will be painful, both in terms of developer time and the deprecation / backwards incompatibility issues that will arise. On the other hand, if these changes are going to be made, then sooner is better. I lean on the side of continually improving the Pandas design, since data science is a quickly-developing field.

It seems like disruptive changes like this would be most natural for a pandas overhaul like Pandas 2.0. However, it’s not clear whether Pandas 2.0 is an active proposal. If not, perhaps it makes sense to bite the bullet now on standardizing existing argument names.

minggli commented, Oct 15, 2018 (0 reactions)

Hi, I added PR #23158 to deprecate delimiter on read_csv, following the previous discussion.
