Series(data=df, index=None) seems to be incorrectly instantiated
Hi, I saw it is now possible to create a Series from a DataFrame without any additional index parameter.
However, I cannot print or call sort_index on a Series that has been created in this way.
I guess the Series is incorrectly instantiated and probably other actions would fail too.
>>> df = ks.DataFrame({"a": [1, 2, 3, 4, 5]})
>>> ser = ks.Series(df)
>>> print(ser)
Traceback (most recent call last):
  File "D:\Dev\Utils\Miniconda\envs\koalas-dev\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: None

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Dev\koalas\databricks\koalas\series.py", line 5232, in __repr__
    pser = self._kdf._get_or_create_repr_pandas_cache(max_display_count)[self.name]
  File "D:\Dev\Utils\Miniconda\envs\koalas-dev\lib\site-packages\pandas\core\frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)
  File "D:\Dev\Utils\Miniconda\envs\koalas-dev\lib\site-packages\pandas\core\indexes\base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: None
>>> ser.sort_index()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Dev\koalas\databricks\koalas\series.py", line 2227, in sort_index
    kdf = self._kdf[[self.name]].sort_index(
  File "D:\Dev\koalas\databricks\koalas\frame.py", line 10130, in __getitem__
    return self.loc[:, list(key)]
  File "D:\Dev\koalas\databricks\koalas\indexing.py", line 441, in __getitem__
    cols_sel
  File "D:\Dev\koalas\databricks\koalas\indexing.py", line 313, in _select_cols
    return self._select_cols_by_iterable(cols_sel, missing_keys)
  File "D:\Dev\koalas\databricks\koalas\indexing.py", line 1175, in _select_cols_by_iterable
    raise KeyError("['{}'] not in index".format(name_like_string(key)))
KeyError: "['__none__'] not in index"
Issue Analytics
- Created 3 years ago
- Comments:10 (8 by maintainers)
Top GitHub Comments
Alright, pandas does not support it either. For internal purposes (#1737), I actually found the function first_series to use instead of passing a DataFrame as a Series constructor parameter.
I'll tentatively resolve this ticket since pandas doesn't support it either.
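
For readers who hit the same error, here is a minimal sketch of getting a Series out of a single-column DataFrame without passing the DataFrame to the Series constructor. It uses plain pandas rather than Koalas, on the assumption that explicit column selection behaves the same way in both. The first_series helper mentioned above is not called, since its signature is not shown in this thread.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5]})

# Select the single column by label instead of calling Series(df).
# The resulting Series keeps the column name, so repr and sort_index
# have a valid name to look up.
ser = df["a"]

# Positional alternative for when the column label is not known in advance.
ser_by_position = df.iloc[:, 0]

print(ser.name)
print(ser.sort_index().tolist())
```

Either form produces a named Series, which avoids the None name that the tracebacks above trip over.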