Command tfds.as_dataframe fails to make dataframe

Short description: When I call tfds.as_dataframe, it raises the error below.

Environment information

  • Operating System: Ubuntu 20.04

  • Python version: 3.8

  • tensorflow-datasets/tfds-nightly version: tfds-nightly v3.2.1.dev202009090105

  • tensorflow/tf-nightly version: tensorflow v2.3

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes

Reproduction instructions

import tensorflow.compat.v2 as tf
import tensorflow_datasets as tfds
import pandas as pd

tfds.disable_progress_bar()
tf.enable_v2_behavior()

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

def normalize_img(image, label):
  """Normalizes images: `uint8` -> `float32`."""
  return tf.cast(image, tf.float32) / 255., label

ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_test = ds_test.batch(128)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

df = tfds.as_dataframe(ds_test.take(10), ds_info)

Link to logs

Traceback (most recent call last):
  File "mnist_test.py", line 31, in <module>
    df = tfds.as_dataframe(ds_test.take(10), ds_info)
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 192, in as_dataframe
    columns = _make_columns(ds.element_spec, ds_info=ds_info)
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 148, in _make_columns
    return [
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 149, in <listcomp>
    ColumnInfo.from_spec(path, ds_info)
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 61, in from_spec
    name = '/'.join(path)
TypeError: sequence item 0: expected str instance, int found

Expected behavior: The dataset is converted into a pandas DataFrame.

Issue Analytics

  • State:closed
  • Created 3 years ago
  • Reactions:3
  • Comments:15 (5 by maintainers)

Top GitHub Comments

mainguyenanhvu commented, Mar 26, 2022 (1 reaction)

Open the installed as_dataframe.py (on my machine: C:\Users\elver\PycharmProjects\ds_to_csv\venv\lib\site-packages\tensorflow_datasets\core\as_dataframe.py) and cast each element of path to str: name = '/'.join(map(str, path))
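To illustrate why that cast helps, here is a minimal sketch in plain Python that does not touch TFDS itself. It assumes the path handed to from_spec is a tuple such as (0,) when the dataset yields (image, label) tuples; the function names join_path_strict and join_path_cast are hypothetical and only mirror the failing line from the traceback and the suggested replacement.

```python
# Sketch only: these helpers are hypothetical stand-ins, not TFDS functions.

def join_path_strict(path):
    # Mirrors the failing line from the traceback: str.join accepts only strings.
    return '/'.join(path)


def join_path_cast(path):
    # Mirrors the suggested patch: cast every path element to str first.
    return '/'.join(map(str, path))


# Assumed shape of the path for the first element of an (image, label) tuple.
tuple_path = (0,)

try:
    join_path_strict(tuple_path)
except TypeError as err:
    # Same error class as in the reported traceback.
    print(f"strict join failed: {err}")

print(join_path_cast(tuple_path))        # integer index becomes the column name
print(join_path_cast(("image",)))        # string keys are unaffected by the cast
```

The cast leaves string keys unchanged, so under these assumptions the patch should only affect tuple-structured datasets.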

mervess commented, Oct 19, 2021 (1 reaction)

When as_supervised=True is used, ds_info must be fed to the DataFrame, hence the “str-int” error. If you don’t use the flag as_supervised, it’s up to you to pass ds_info to the DataFrame or not, no error for that. Tested on: TF 2.6.0.
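A rough way to see the link between as_supervised and the error is to compare the paths produced when flattening a dict structure with those from a tuple structure. The sketch below is plain Python; flatten_with_path is a hypothetical helper written for this illustration (not the TFDS internal), and the spec values are placeholder strings rather than real tf.TensorSpec objects.

```python
# Sketch only: a hypothetical flattening helper, not the TFDS implementation.

def flatten_with_path(structure, prefix=()):
    """Yield (path, leaf) pairs for nested dicts and tuples/lists."""
    if isinstance(structure, dict):
        for key in sorted(structure):
            yield from flatten_with_path(structure[key], prefix + (key,))
    elif isinstance(structure, (tuple, list)):
        for index, item in enumerate(structure):
            yield from flatten_with_path(item, prefix + (index,))
    else:
        yield prefix, structure


# Without as_supervised, examples are dicts keyed by feature name.
dict_spec = {"image": "uint8[28,28,1]", "label": "int64"}
# With as_supervised=True, examples are (image, label) tuples.
tuple_spec = ("uint8[28,28,1]", "int64")

dict_paths = [path for path, _ in flatten_with_path(dict_spec)]
tuple_paths = [path for path, _ in flatten_with_path(tuple_spec)]

print(dict_paths)   # paths made of string keys, safe to pass to '/'.join
print(tuple_paths)  # paths made of integer indices, which '/'.join rejects
```

If this model of the internals is right, dict-structured datasets produce string paths that join cleanly, while tuple-structured ones produce integer indices that trigger the TypeError from the traceback.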
