Suggestion for improvement: use koalas instead of databricks.koalas
The current naming / import of the package feels inconsistent and not what I expect as a Python developer - especially if we’re trying to reduce the barrier of entry from Pandas to Spark. It’s not a big deal, but changing this could reduce friction.
I’d recommend being consistent (and shadowing Pandas) for both the pip install and import statements:
Pandas style:
pip install pandas
import pandas as pd
Current Koalas style:
pip install koalas
import databricks.koalas as ks
Recommended Koalas style:
pip install koalas
import koalas as ks
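Until the import name changes upstream, a user could approximate the recommended style locally by registering the existing module under a second name. The sketch below is only an illustration of that idea, not something the Koalas project provides: the helper name alias_module is hypothetical, and the demo uses a nested standard-library module so it runs without Koalas installed.

```python
import importlib
import sys


def alias_module(existing_name, alias):
    """Register an importable module under a second import name (hypothetical helper)."""
    module = importlib.import_module(existing_name)
    # The import system consults sys.modules first, so a later
    # `import <alias>` resolves to this same module object.
    sys.modules[alias] = module
    return module


# Demo with a nested standard-library module standing in for databricks.koalas.
alias_module("xml.etree.ElementTree", "elementtree_alias")
import elementtree_alias as et

# Assumption: with Koalas installed, the equivalent calls would be
#   alias_module("databricks.koalas", "koalas")
#   import koalas as ks
```

The alias only exists in the running interpreter, so it has to be set up before the first import of the new name; a real rename would need to happen in the package itself.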
Issue Analytics
- State:
- Created 4 years ago
- Reactions: 4
- Comments: 8 (5 by maintainers)
Top GitHub Comments
I am +1 to the package koalas instead of databricks.koalas, mostly because other packages (pyspark, numpy, pandas, etc.) look that way.
Actually, I personally like this way too, but I think it’s not a big deal.