BUG: `convert_dtypes()` converts GeoDataFrame to DataFrame

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of geopandas.

  • (optional) I have confirmed this bug exists on the master branch of geopandas.


Code Sample, a copy-pastable example

import pandas as pd
import fiona # AttributeError if not importing fiona before gpd
import geopandas as gpd
from geopandas.testing import assert_geodataframe_equal

def df():
    return pd.DataFrame(
        {
            "City": ["Buenos Aires", "Brasilia", "Santiago", "Bogota", "Caracas"],
            "Country": ["Argentina", "Brazil", "Chile", "Colombia", "Venezuela"],
            "Latitude": [-34.58, -15.78, -33.45, 4.60, 10.48],
            "Longitude": [-58.66, -47.91, -70.66, -74.08, -66.86],
        }
    )

def gdf_points_from_xy(df):
    return gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.Longitude, df.Latitude))

def test_convert_dtypes_before_gdf():
    result = df().convert_dtypes().pipe(gdf_points_from_xy)
    assert isinstance(result, gpd.GeoDataFrame)
    # --> no error

def test_convert_dtypes_after_gdf():
    result = df().pipe(gdf_points_from_xy).convert_dtypes()
    assert isinstance(result, gpd.GeoDataFrame)
    # --> AssertionError

def test_convert_dtypes_expectation():
    expected = df().convert_dtypes().pipe(gdf_points_from_xy)

    result = df().pipe(gdf_points_from_xy).convert_dtypes()

    assert_geodataframe_equal(result, expected)
    # --> AssertionError

Problem description

Calling .convert_dtypes() on a GeoDataFrame turns it into a regular DataFrame.

AssertionError: assert isinstance(result, GeoDataFrame)

Expected Output

I expect convert_dtypes() to preserve the frame type, so a GeoDataFrame stays a GeoDataFrame.

The problem may lie in the original pandas method. I suspect the cause is the line return concat(results, axis=1, copy=False), which appears to return a plain DataFrame. As the pandas repo does not know about other frame types, I suspect the fix should live in this repo.

Output of geopandas.show_versions()

SYSTEM INFO

python     : 3.8.4 (default, Jan 11 2021, 16:58:12) [Clang 12.0.0 (clang-1200.0.32.28)]
executable : /Users/adriantofting/Library/Caches/pypoetry/virtualenvs/gpd-test-qc5veAeh-py3.8/bin/python
machine    : macOS-11.1-x86_64-i386-64bit

GEOS, GDAL, PROJ INFO

GEOS          : 3.9.1
GEOS lib      : /usr/local/Cellar/geos/3.9.1/lib/libgeos_c.dylib
GDAL          : 3.2.1
GDAL data dir : None
PROJ          : 7.2.1
PROJ data dir : /usr/local/share/proj

PYTHON DEPENDENCIES

geopandas   : 0.9.0
pandas      : 1.2.3
fiona       : 1.8.18
numpy       : 1.20.1
shapely     : 1.7.1
rtree       : None
pyproj      : 3.0.0.post1
matplotlib  : None
mapclassify : None
geopy       : None
psycopg2    : None
geoalchemy2 : None
pyarrow     : None
pygeos      : None

Environment (poetry)

#pyproject.toml
[tool.poetry]
name = "gpd_test"
version = "0.1.0"
description = ""

[tool.poetry.dependencies]
python = "^3.8"
geopandas = "0.9.0"

[tool.poetry.dev-dependencies]
jupyterlab = "^3.0.9"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

1 reaction
jorisvandenbossche commented, Sep 20, 2021

So if concat returns a DataFrame, DataFrame.__finalize__ will be called (as opposed to GeoDataFrame.__finalize__), and we won’t have any mechanism to turn that into a GeoDataFrame.

Indeed, only a __finalize__ won’t be enough. But if we could somehow have pandas call the finalize of the class of self, that could solve it (to avoid that the return type of convert_dtypes depends on the return type of concat, since the use of concat should be an implementation detail).

Although maybe just passing it to the constructor might be easier: return self._constructor(result).__finalize__(self, method="convert_dtypes") (where result is the output of concat).

I opened an issue for this on the pandas side: https://github.com/pandas-dev/pandas/issues/43668
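
To make the mechanism discussed above concrete, here is a minimal pure-Python sketch. It is a toy model, not pandas or geopandas code: Frame, GeoFrame, concat_columns, convert_naive and convert_fixed are hypothetical names invented for illustration. It only aims to show why a __finalize__ call alone cannot restore the subclass, while re-wrapping through _constructor can.

```python
class Frame:
    """Toy stand-in for pandas.DataFrame (hypothetical, not the pandas API)."""

    def __init__(self, columns):
        # Accept either a dict of columns or another Frame to re-wrap.
        source = columns.columns if isinstance(columns, Frame) else columns
        self.columns = dict(source)
        self.attrs = {}

    @property
    def _constructor(self):
        # Subclasses override this so that results keep their own type.
        return Frame

    def __finalize__(self, other, method=None):
        # Copies metadata only; it cannot change the class of `self`.
        self.attrs = dict(other.attrs)
        return self

    def convert_naive(self):
        # Models the reported behaviour: the helper always builds a base Frame.
        result = concat_columns(self.columns)
        return result.__finalize__(self, method="convert_dtypes")

    def convert_fixed(self):
        # Models the suggested fix: re-wrap through the subclass constructor.
        result = concat_columns(self.columns)
        return self._constructor(result).__finalize__(self, method="convert_dtypes")


class GeoFrame(Frame):
    """Toy stand-in for GeoDataFrame."""

    @property
    def _constructor(self):
        return GeoFrame


def concat_columns(columns):
    # Like the concat call in the issue, this knows nothing about subclasses.
    return Frame(columns)


gdf = GeoFrame({"City": ["Bogota"], "geometry": ["POINT (-74.08 4.6)"]})
gdf.attrs["crs"] = "EPSG:4326"

naive = gdf.convert_naive()
fixed = gdf.convert_fixed()
print(type(naive).__name__, type(fixed).__name__)  # prints "Frame GeoFrame"
```

In this toy, both variants copy the metadata, but only the variant that goes through _constructor returns the subclass, which is the point of the suggestion above.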

1 reaction
jorisvandenbossche commented, Aug 27, 2021

I think this can actually also be considered a bug in pandas itself, as it should use _constructor to recreate the resulting dataframe. Of course, in the short term, we can also fix it in geopandas by overriding the method. PR welcome for that!

@damanad regarding pandas denoting the method as @final, pandas uses that to indicate that the method doesn’t get overridden internally in pandas itself. We have several methods in geopandas that we override from pandas (which might already have this decorator; I didn’t check). So in general I think we can ignore this. If it would give problems with typing (validating type annotations in geopandas), we might need to ask pandas to not use it like that (but I’m no typing expert).
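
As a rough sketch of what overriding the method in a subclass could look like, the example below uses a plain pandas subclass rather than geopandas. TaggedFrame is a hypothetical class standing in for GeoDataFrame, and this is not the actual geopandas fix. It assumes only the documented subclassing hook _constructor and the existing DataFrame.__finalize__ method. Newer pandas releases may already preserve the subclass, in which case the re-wrap branch is simply skipped.

```python
import pandas as pd


class TaggedFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass standing in for GeoDataFrame."""

    @property
    def _constructor(self):
        # Tells pandas which class to use when it builds new frames.
        return TaggedFrame

    def convert_dtypes(self, *args, **kwargs):
        # Delegate the dtype conversion itself to pandas.
        result = super().convert_dtypes(*args, **kwargs)
        # If pandas handed back a plain DataFrame, re-wrap it in the subclass.
        if not isinstance(result, type(self)):
            result = self._constructor(result).__finalize__(
                self, method="convert_dtypes"
            )
        return result


frame = TaggedFrame({"a": [1, 2], "b": ["x", "y"]})
converted = frame.convert_dtypes()
print(type(converted).__name__)
```

A real GeoDataFrame override would also have to deal with geometry-specific state such as the active geometry column, which this sketch does not attempt.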

But when checking that the frames are equal, I get the error KeyError: '[None] not found in axis'. Don’t see any difference in them, though.

Although, when putting the geometry column first, it now returns a GeoDataFrame, it’s apparently still an “invalid” GeoDataFrame, in the sense that the _geometry_column_name has not been set properly. I think this can be seen as a separate bug in pd.concat([geoseries, series, ..], axis=1)
