Keep (un-deprecate) `read_table()` for API stability

Problem description

read_table() was deprecated in favour of read_csv().

Using read_csv() to read tab- or space-delimited files is counter-intuitive. According to the docs and the related issues, both functions share the same code, so it is not clear why one should be preferred over the other, and the change may break existing code.
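To make the comparison concrete, here is a minimal sketch of the two calls side by side. The sample data and variable names are made up for illustration, and it assumes a pandas version in which read_table() is available:

```python
import io

import pandas as pd

# Made-up tab-delimited sample, used only for this illustration.
TAB_DATA = "name\tvalue\nalpha\t1\nbeta\t2\n"

# read_table() uses a tab as its default separator.
via_table = pd.read_table(io.StringIO(TAB_DATA))

# read_csv() needs the separator spelled out to parse the same text.
via_csv = pd.read_csv(io.StringIO(TAB_DATA), sep="\t")

# Both calls should yield the same DataFrame.
same = via_table.equals(via_csv)
print(same)
```

The only visible difference between the two calls is the default separator, which is the point being made above.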

From my point of view, read_table() is the more general function and read_csv() is a special case. Why deprecate (and then remove) the more useful function? It is already annoying to have to use to_csv() to write space- or tab-delimited files. And as far as I can see, keeping it comes down to two lines of code.
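For reference, the to_csv() pattern mentioned above looks roughly like this. It is a small sketch with made-up data: the method name says "csv" even though the output is tab-delimited.

```python
import io

import pandas as pd

# Made-up frame, used only for this illustration.
frame = pd.DataFrame({"name": ["alpha", "beta"], "value": [1, 2]})

# Writing a tab-delimited file still goes through to_csv(), with sep="\t".
buffer = io.StringIO()
frame.to_csv(buffer, sep="\t", index=False)
text = buffer.getvalue()

# Reading it back with read_table(), whose default separator is a tab.
round_trip = pd.read_table(io.StringIO(text))
print(round_trip)
```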

Proposed solution

Keep both functions as they are (un-deprecate read_table()) or rename the function to have a more general name like read_txt() (as in numpy.genfromtxt()) or similar.

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 11
  • Comments: 34 (20 by maintainers)

Top GitHub Comments

11 reactions
nspies commented, Mar 13, 2019

Add my vote to keeping read_table around!

I’ve been a pandas user for 7+ years and have used read_table(...,sep="\t") where necessary. I agree with OP that I far prefer the read_table form as it sounds more general.

I will note that the documentation supports this interpretation: read_table is a function to “Read general delimited file into DataFrame.” while read_csv is to “Read a comma-separated values (csv) file into DataFrame.”

I have to say that in my corner of data science, no one has touched the standard csv module, while everyone uses pandas, so I think pandas should decide for itself what’s right. (A better point of comparison is probably R, which has both a read.csv and a read.table function, where read.csv is an alias for the more general-purpose read.table.)

I generally agree with minimizing aliases, but this is at the core of what pandas does and is a very widespread alias. While “pd.read_csv” is clearly more commonly used – I wouldn’t suggest getting rid of it! – a quick unscientific github search yields 50k+ results for “pd.read_table”.

10 reactions
st-bender commented, Feb 13, 2019

Hi, thank you for your input and for setting the labels, and sorry for getting back to you so late. I am not sure I understood correctly; my point is that #18262 only lists a number of API functions to be deprecated, i.e. to be removed in the future, without further reasoning. In my opinion, any change to the official and documented API should be justified. After all, pandas calls itself stable.

Note that I am not really opposed to keeping both functions, so the title may be a little provocative and could be adjusted.

My points pro keeping read_table():

  • First of all to keep the API stable and to not break existing code.
  • Keep the function with the more general name (IMHO) over the special case, i.e. read_table() to read ASCII files with any delimiter/separator over read_csv() with its connotation of a fixed delimiter, the comma. Renaming that function to read_txt() may be worth considering, although it contradicts the first point above.
  • Two lines of additional code, please correct me if I am mistaken.

Contra (as far as I can see):

  • Two lines of extra code.
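
As a rough illustration of the "two lines" argument, a thin alias over read_csv() could be sketched as below. This is not how pandas implements read_table(); the name read_tab_delimited is hypothetical and exists only for this example.

```python
import functools
import io

import pandas as pd

# Hypothetical helper, not part of pandas: read_csv() with the separator
# pre-set to a tab. Callers can still override sep if they need to.
read_tab_delimited = functools.partial(pd.read_csv, sep="\t")

# Made-up sample input for the illustration.
frame = read_tab_delimited(io.StringIO("a\tb\n1\t2\n"))
print(frame)
```

Code that must run on a pandas version without read_table() could fall back on a wrapper of this kind, at the cost of a non-standard name.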