Implement the first batch of Serialization / IO / Conversion functions

See https://pandas.pydata.org/pandas-docs/stable/reference/frame.html#serialization-io-conversion

Looks like we can easily implement almost all of them by calling toPandas().func_name().

One caveat: some of the functions support max_rows. When that argument is specified, we should add a limit call in Spark to avoid moving all the data to the driver.
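A minimal sketch of that idea, not the actual implementation: the helper name `collect_and_call` is hypothetical, and `sdf` is assumed to be a Spark DataFrame (anything exposing `limit(n)` and `toPandas()`). It applies the limit in Spark before collecting, then delegates to the pandas method of the same name.

```python
from typing import Any, Optional


def collect_and_call(
    sdf: Any, func_name: str, max_rows: Optional[int] = None, **kwargs: Any
) -> Any:
    """Hypothetical helper: run pandas' `func_name` on a collected Spark DataFrame.

    Assumes `sdf` exposes `limit(n)` and `toPandas()`, as a Spark DataFrame does.
    """
    if max_rows is not None:
        # Truncate in Spark first so at most `max_rows` rows reach the driver.
        sdf = sdf.limit(max_rows)
        # Forward the argument so the pandas method still sees it.
        kwargs["max_rows"] = max_rows
    pdf = sdf.toPandas()
    # Equivalent to toPandas().func_name(**kwargs) from the issue text.
    return getattr(pdf, func_name)(**kwargs)
```

For example, `collect_and_call(sdf, "to_html", max_rows=20)` would collect at most 20 rows. One design consequence worth noting: pandas on its own would show rows from both the head and the tail of a truncated frame, whereas limiting in Spark keeps only the rows that `limit` returns.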

The functions to add in the first batch are:

  • to_dict (see #169)
  • to_excel (#288)
  • to_html (we already have this, but let’s add a limit when max_rows is set), done in #206
  • to_latex (#297)
  • to_records (#298)
  • to_string (done in #211 and #213)
  • to_clipboard (#257)

Skipping the following because I don’t know how popular they are:

  • to_pickle
  • to_hdf
  • to_stata
  • to_msgpack
  • to_records
  • to_sparse
  • to_dense

The following might require parallelization with Pandas UDFs, rather than collecting all the data to the driver, so leaving them for the future:

  • to_sql
  • to_gbq

I’m also not adding to_json and to_csv here. We need to design those carefully because both Spark and pandas already have their own JSON and CSV writers.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 20 (16 by maintainers)

Top GitHub Comments

HyukjinKwon commented, May 20, 2019 (1 reaction)

Thanks, @shril. This issue is nicely finished.

rxin commented, May 11, 2019 (1 reaction)

Thank you all!
