Getting a warning even though exporting non-geodata to Parquet
See original GitHub issue
I use both geodata and normal data in the same Python file.
Context: as I work with point data, I convert the data set to a plain pandas DataFrame by dropping the geometry and keeping the LAT/LON columns.
e.g.
out = out.drop("geometry", axis=1)
out.to_parquet("bla.parquet")
Even though this is not a GeoDataFrame any more, I receive the warning:
C:\Users\rados\Documents\adv\point_linkage\src\data_management\merge_shapefiles.py:249: UserWarning: this is an initial implementation of Parquet/Feather file support and associated metadata. This is tracking version 0.1.0 of the metadata specification at https://github.com/geopandas/geo-arrow-spec
This metadata specification does not yet make stability promises. We do not yet recommend using this in a production setting unless you are able to rewrite your Parquet/Feather files.
Does this have any implications for my case?
Issue Analytics
- State:
- Created: 3 years ago
- Comments: 5 (4 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@raholler Thanks a lot for your report!
I moved your issue to the geopandas repo, as it’s related to the implementation of geopandas.
So what is happening is that the drop() method removes the geometry column, but the result is (incorrectly) still a GeoDataFrame, and thus the to_parquet() implementation of GeoPandas is used instead of the one from pandas. And it seems that our implementation doesn't really handle the case of no geometry column correctly, so we included incorrect metadata in the Parquet file in this case.
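To make this concrete, here is a minimal sketch of the failure mode, assuming the GeoPandas behaviour described in this comment (newer releases may behave differently); the column names, coordinates, and file name are made up for illustration. The geo metadata that ends up in the file can be inspected with pyarrow:

import geopandas as gpd
import pyarrow.parquet as pq
from shapely.geometry import Point

# hypothetical point data, mirroring the LAT/LON set-up from the report above
gdf = gpd.GeoDataFrame(
    {"LAT": [52.5, 48.1], "LON": [13.4, 11.6]},
    geometry=[Point(13.4, 52.5), Point(11.6, 48.1)],
)

out = gdf.drop("geometry", axis=1)
# on the affected GeoPandas version this is still a GeoDataFrame,
# so GeoDataFrame.to_parquet is dispatched instead of pandas' writer
print(type(out))

out.to_parquet("bla.parquet")

# the "geo" key in the file-level schema metadata is what the geo-arrow-spec
# warning refers to; on the affected version it is written even without geometry
meta = pq.read_schema("bla.parquet").metadata or {}
print(b"geo" in meta)

On a plain pandas DataFrame the same check would print False, since pandas' own writer does not add this key.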
@raholler A short-term workaround is to convert the result of drop() explicitly to a DataFrame (with pd.DataFrame(df.drop(...)), sketched below).
Given the changes by which we now keep the result as a GeoDataFrame if there is any geometry column, and thus more explicitly allow a GeoDataFrame without an active geometry column, we should probably still test that case (to ensure we write such a file correctly, and can read the resulting file back).
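A rough sketch of that workaround, reusing the made-up data from above (names and file path are illustrative only):

import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# same hypothetical point data as in the earlier sketch
gdf = gpd.GeoDataFrame(
    {"LAT": [52.5, 48.1], "LON": [13.4, 11.6]},
    geometry=[Point(13.4, 52.5), Point(11.6, 48.1)],
)

# explicitly convert back to a plain pandas DataFrame before writing,
# so pandas.DataFrame.to_parquet is used and no geo metadata is written
plain = pd.DataFrame(gdf.drop("geometry", axis=1))
plain.to_parquet("bla.parquet")

This bypasses GeoPandas' Parquet code path entirely, so the geo-arrow-spec warning is not triggered.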