question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Unable to rename columns in streaming dataset

See original GitHub issue

Describe the bug

Trying to rename column in a streaming datasets, destroys the features object.

Steps to reproduce the bug

The following code illustrates the error:

from datasets import load_dataset
dataset = load_dataset('mc4', 'en', streaming=True, split='train')
dataset.info.features
# {'text': Value(dtype='string', id=None), 'timestamp': Value(dtype='string', id=None), 'url': Value(dtype='string', id=None)}
dataset = dataset.rename_column("text", "content")
dataset.info.features
# This returned object is now None!

Expected behavior

This should just alter the renamed column.

Environment info

datasets 2.6.1

Issue Analytics

  • State:closed
  • Created 10 months ago
  • Comments:7 (5 by maintainers)

github_iconTop GitHub Comments

1reaction
lhoestqcommented, Nov 16, 2022

If we know the features before renaming, then we know the features after renaming, so we can pass the new features to the returned dataset in rename_column indeed ! If anyone is interested in contributing, feel free to open a PR and I’d be happy to help / give some pointers 😃

0reactions
alvarobarttcommented, Nov 23, 2022

#self-assign

Read more comments on GitHub >

github_iconTop Results From Across the Web

Cannot rename columns in PBI Desktop when using Di...
I have a PBI Desktop report connected to a series of PBI dataflows. I know DirectQuery supports the renaming of columns but am...
Read more >
Renaming column names of a DataFrame in Spark Scala
Hi @zero323 When using withColumnRenamed I am getting AnalysisException can't resolve 'CC8. 1' given input columns... It fails even though CC8.1 ...
Read more >
Solved: Rename columns referencing column number rather th...
The biggest issue is that you simply cannot rename column names dynamically (either the old or new name) - you have to "hard...
Read more >
How to Rename Column in R - Spark by {Examples}
Use setnames() function from data.table library to change columns with list. data.table is also a third-party library hence, you need to first ...
Read more >
Automate dynamic mapping and renaming of column names ...
A common challenge ETL and big data developers face is working with data files that don't have proper name header records.
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found