Subclassing geodataframe

What is the best way to subclass a GeoDataFrame? I tried to follow some instructions as indicated in this link for subclassing pandas data structures but this does not seem to propagate to the from_file classmethod:

import geopandas


class Hectometer(geopandas.GeoSeries):
    # Each instance represents 1 hectometer.
    @property
    def _constructor(self):
        return Hectometer

    @property
    def _constructor_expanddim(self):
        return Hectometers


class Hectometers(geopandas.GeoDataFrame):
    @property
    def _constructor(self):
        return Hectometers

    @property
    def _constructor_sliced(self):
        return Hectometer


def test():
    hms = Hectometers.from_file(r'vector\Hectometerraai_A2.shp')
    print(type(hms))


if __name__ == '__main__':
    test()

The above code outputs <class 'geopandas.geodataframe.GeoDataFrame'>, whereas I would like to get the subclassed object instead. I tried overriding the from_file constructor, but the original GeoDataFrame implementation does not seem to refer to the class it is called on, which means it will always return a GeoDataFrame and not its subclass.

What is an elegant way to subclass GeoDataFrames and GeoSeries such that I retrieve the subclassed object from the from_file method?
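
The behaviour described in the question can be reproduced without geopandas. The sketch below is a pure-Python stand-in, not geopandas code: Frame, PlainSub and WrappingSub are hypothetical names, and Frame.from_file is assumed to hard-code its own class the way the question reports for GeoDataFrame.from_file. It shows why the subclass is lost and one possible workaround, re-wrapping the parent's result.

```python
class Frame:
    """Pure-Python stand-in for a library class (hypothetical, not geopandas)."""

    def __init__(self, data):
        # Accept raw data or another Frame, so a result can be re-wrapped.
        self.data = data.data if isinstance(data, Frame) else data

    @classmethod
    def from_file(cls, path):
        # Hard-codes Frame and ignores cls, so subclasses are lost.
        return Frame({"path": path})


class PlainSub(Frame):
    """Subclass that inherits from_file unchanged."""


class WrappingSub(Frame):
    """Subclass that re-wraps the parent's result in its own type."""

    @classmethod
    def from_file(cls, path):
        return cls(super().from_file(path))


plain = PlainSub.from_file("roads.shp")
wrapped = WrappingSub.from_file("roads.shp")
print(type(plain).__name__)    # Frame
print(type(wrapped).__name__)  # WrappingSub
```

A similar override on a GeoDataFrame subclass would only work if the subclass constructor accepts an existing GeoDataFrame, which is an assumption here. It also only fixes the entry point: results of later operations are still built through _constructor.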

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
m-richards commented, May 21, 2022

@NickLucche I don’t have the same background as Joris, so I can’t comment within the context of the pandas subclassing. I would just reiterate the point about the use case for subclassing, and whether what you’re doing by subclassing a GeoDataFrame could be achieved through composition: a Dataset class containing a GeoDataFrame as an attribute. (But if you’ve been using it for months, it’s quite likely you already have a good reason.)
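
For readers unfamiliar with the composition idea, here is a minimal sketch. The Dataset class, its source attribute and its apply method are hypothetical illustrations, not part of geopandas. The wrapped frame would be a GeoDataFrame in practice; a plain list stands in for it here so the example runs without any dependencies.

```python
class Dataset:
    """Hypothetical wrapper that holds a frame instead of subclassing it."""

    def __init__(self, frame, source=None):
        self.frame = frame    # in practice this would be a GeoDataFrame
        self.source = source  # custom metadata carried alongside the frame

    def apply(self, func):
        """Run func on the wrapped frame and re-wrap whatever it returns."""
        return Dataset(func(self.frame), source=self.source)

    def __len__(self):
        return len(self.frame)


ds = Dataset([3, 1, 2], source="roads.shp")
sorted_ds = ds.apply(sorted)
print(type(sorted_ds).__name__, sorted_ds.frame, sorted_ds.source)
```

Because the wrapper decides when to re-wrap, its type and metadata survive no matter what type the library returns from an operation. The cost is that every method you want on Dataset has to be forwarded explicitly.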

From my perspective, changing the from_* classmethods is an innocuous change, and it would at least help your use case. But properly implementing subclassing would still take a considerable amount of work, because every time _constructor is called internally by pandas, the custom subclass would be lost. (As an aside, if you’ve been developing your class wrapper with geopandas 0.10.2: _constructor has changed on the main branch, and it would be more complicated to feed a class through.)

There would also be a considerable testing burden to cover all cases. Note too that replacing GeoDataFrame / GeoSeries with cls in every spot will not always do the desired thing. For instance, GeoDataFrame.set_geometry is monkey patched onto pd.DataFrame, so if we replace the GeoDataFrame constructor call within it, it will no longer produce GeoDataFrames, but rather DataFrames.
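
The monkey-patching caveat can be illustrated with a small pure-Python sketch. PlainFrame and GeoFrame are hypothetical stand-ins for pd.DataFrame and GeoDataFrame, and the two functions are simplified illustrations rather than the real set_geometry. When a method is patched onto the base class, self may be a base-class instance, so building the result from type(self) returns the base type.

```python
class PlainFrame:
    """Hypothetical stand-in for pd.DataFrame."""

    def __init__(self, data):
        self.data = data


class GeoFrame(PlainFrame):
    """Hypothetical stand-in for GeoDataFrame."""


def set_geometry_hardcoded(self, col):
    # Always builds a GeoFrame, whatever type self is.
    return GeoFrame({**self.data, "geometry": col})


def set_geometry_dynamic(self, col):
    # Builds the result from the caller's own type.
    return type(self)({**self.data, "geometry": col})


# Patch both variants onto the base class, mirroring how the comment
# describes set_geometry being patched onto pd.DataFrame.
PlainFrame.set_geometry_hardcoded = set_geometry_hardcoded
PlainFrame.set_geometry_dynamic = set_geometry_dynamic

plain = PlainFrame({"a": 1})
print(type(plain.set_geometry_hardcoded("geom")).__name__)  # GeoFrame
print(type(plain.set_geometry_dynamic("geom")).__name__)    # PlainFrame
```

The dynamic variant is what a blanket replacement would amount to. It preserves subclasses of GeoFrame, but called on a PlainFrame it no longer upgrades the result to a GeoFrame, which is the regression the comment warns about.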

0 reactions
NickLucche commented, May 25, 2022

Hey @m-richards, thanks a lot for your response and your insights! My implementation is indeed based on 0.10.0 and relies on a custom _constructor to create instances, as suggested (at the time) by the pandas documentation. I would favor keeping this behavior, where a subclass has to provide a custom _constructor. But I see your point about the effort required to test such changes for a feature you didn’t intend to support, so I won’t move forward until I have tested a few implementation options based on your feedback.

Nonetheless, I would personally find it useful to have the tiny changes proposed in PR #2433 merged in the main branch, let me know if you have further feedback on that.
