Index.to_frame() does not seem to work properly.

>>> kdf = ks.DataFrame({"Koalas": [1, 2, 3]}, index=pd.Index([1, 2, 3]))
>>> kdf["NEW"] = ks.Series([100, 200, 300])
>>> kdf
   0    NEW
1  1  200.0
3  3    NaN
2  2  300.0

The code above works fine.

However, a DataFrame of the same shape created with Index.to_frame() does not work properly.

>>> kdf = ks.Index([1, 2, 3]).to_frame(name="Koalas")
>>> kdf["NEW"] = ks.Series([100, 200, 300])
>>> kdf
Traceback (most recent call last):
...
pyspark.sql.utils.AnalysisException: Reference 'Koalas' is ambiguous, could be: Koalas, Koalas.;

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments:9 (6 by maintainers)

Top GitHub Comments

1 reaction
ueshin commented, Jul 13, 2020

InternalFrame works not only with spark_frame but with all of its metadata. data_spark_columns contains all the changes made since spark_frame was last created.

Even if spark_frame shows the values [1, 2, 3], data_spark_columns still holds the kser.where(kser < 2) operation.

>>> new_kser._internal.spark_frame.show()
+-----------------+---+-----------------+
|__index_level_0__|  0|__natural_order__|
+-----------------+---+-----------------+
|                0|  1|      25769803776|
|                1|  2|      60129542144|
|                2|  3|      94489280512|
+-----------------+---+-----------------+

>>> new_kser._internal.data_spark_columns
[Column<b'CASE WHEN CASE WHEN ((0 < 2) IS NULL) THEN false ELSE (0 < 2) END AS `0` THEN 0 ELSE NaN END AS `0` AS `0`'>]

We always use both together to show the actual values.

>>> new_kser._internal.spark_frame.select(new_kser._internal.data_spark_columns).show()
+---+
|  0|
+---+
|1.0|
|NaN|
|NaN|
+---+
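
To illustrate the point above, here is a minimal pure-Python sketch of the idea that the stored frame and the column expressions must be combined to get the actual values. It is a toy model only, not Koalas' implementation: ToyInternalFrame, base_rows, show_stored, show_actual, and where_less_than_two are hypothetical names, and plain lists and functions stand in for the Spark DataFrame and Spark Column objects.

```python
import math

# Toy stand-in for spark_frame: the rows as they were last materialized.
# (Hypothetical data structure; a real spark_frame is a Spark DataFrame.)
base_rows = [
    {"__index_level_0__": 0, "0": 1},
    {"__index_level_0__": 1, "0": 2},
    {"__index_level_0__": 2, "0": 3},
]


class ToyInternalFrame:
    """Hypothetical model: stored rows plus column expressions applied on read."""

    def __init__(self, rows, data_columns):
        self.rows = rows                  # plays the role of spark_frame
        self.data_columns = data_columns  # plays the role of data_spark_columns

    def show_stored(self):
        """Return the values as stored, ignoring the column expressions."""
        return [row["0"] for row in self.rows]

    def show_actual(self):
        """Apply every column expression to every stored row."""
        return [[expr(row) for expr in self.data_columns] for row in self.rows]


def where_less_than_two(row):
    """Toy equivalent of kser.where(kser < 2): keep the value or yield NaN."""
    value = row["0"]
    return float(value) if value < 2 else math.nan


frame = ToyInternalFrame(base_rows, [where_less_than_two])
stored = frame.show_stored()
actual = [values[0] for values in frame.show_actual()]
print(stored)
print(actual)
```

In this sketch, show_stored() corresponds to calling spark_frame.show() on its own, while show_actual() corresponds to spark_frame.select(data_spark_columns), which is why the two can disagree.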

0 reactions
itholic commented, Aug 9, 2021

Closing since this is resolved.
