ENH: Add geom_col param to geopandas.to_postgis()

Is your feature request related to a problem?

Postgres requires that a geom column be present. GeoPandas, on the other hand, must always have a geometry column, which is not mandatory in a Postgres schema. This difference in naming is solved nicely in geopandas.read_postgis() with the geom_col='whatever_you_use_in_the_database' parameter. It would be nice to have something like it in geopandas.to_postgis().

Describe the solution you’d like

geopandas.to_postgis() could have a parameter like geom_col='whatever_you_use_in_the_database'.
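
Until such a parameter exists, one possible workaround is to rename the active geometry column before uploading. The sketch below assumes that to_postgis() names the database column after the GeoDataFrame's active geometry column. It relies on GeoDataFrame.rename_geometry(); the sample data, table name, and engine are placeholders, and the upload call is left commented out because it needs a live database.

```python
import geopandas as gpd
from shapely.geometry import Point

# Placeholder data: two points with the default geometry column name "geometry"
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(0, 0), Point(1, 1)],
    crs="EPSG:4326",
)

# rename_geometry returns a new frame whose active geometry column is "geom";
# the original frame is left untouched
renamed = gdf.rename_geometry("geom")
print(renamed.geometry.name)

# With a live PostGIS connection (engine is an assumption, not defined here):
# renamed.to_postgis("my_table", engine)
```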

API breaking implications

If the parameter defaults to geometry, it shouldn’t introduce any backwards incompatibility.
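
One caveat is worth sketching. A hard default of geometry would only match today's behavior for frames whose active geometry column is already called geometry. Assuming the current implementation names the database column after the active geometry column, a None default that falls back to that name would leave existing calls unchanged. The helper resolve_geom_col below is hypothetical and not part of geopandas.

```python
def resolve_geom_col(active_geometry_name, geom_col=None):
    """Return the column name to use for the geometry in the database table.

    Hypothetical helper: active_geometry_name stands in for gdf.geometry.name.
    """
    # None means "no override": keep the frame's own geometry column name
    if geom_col is None:
        return active_geometry_name
    return geom_col


# Existing calls (no geom_col passed) keep the name they use today
print(resolve_geom_col("geometry"))
print(resolve_geom_col("the_geom"))
# An explicit override wins
print(resolve_geom_col("geometry", geom_col="geom"))
```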

Additional context

Change to_postgis() as follows:

def to_postgis(
        self,
        name,
        con,
        schema=None,
        if_exists="fail",
        index=False,
        index_label=None,
        chunksize=None,
        dtype=None,
        geom_col="geometry",
    ):

        """
        Upload GeoDataFrame into PostGIS database.
        This method requires SQLAlchemy and GeoAlchemy2, and a PostgreSQL
        Python driver (e.g. psycopg2) to be installed.
        Parameters
        ----------
        name : str
            Name of the target table.
        con : sqlalchemy.engine.Engine
            Active connection to the PostGIS database.
        if_exists : {'fail', 'replace', 'append'}, default 'fail'
            How to behave if the table already exists:
            - fail: Raise a ValueError.
            - replace: Drop the table before inserting new values.
            - append: Insert new values to the existing table.
        schema : string, optional
            Specify the schema. If None, use default schema: 'public'.
        index : bool, default False
            Write DataFrame index as a column.
            Uses *index_label* as the column name in the table.
        index_label : string or sequence, default None
            Column label for index column(s).
            If None is given (default) and index is True,
            then the index names are used.
        chunksize : int, optional
            Rows will be written in batches of this size at a time.
            By default, all rows will be written at once.
        dtype : dict of column name to SQL type, default None
            Specifying the datatype for columns.
            The keys should be the column names and the values
            should be the SQLAlchemy types.
        geom_col : str, default 'geometry'
            Name of the geometry column in the target table.
        Examples
        --------
        >>> from sqlalchemy import create_engine
        >>> engine = create_engine("postgresql://myusername:mypassword@myhost:5432\
/mydatabase")  # doctest: +SKIP
        >>> gdf.to_postgis("my_table", engine)  # doctest: +SKIP
        """
        geopandas.io.sql._write_postgis(
            self, name, con, schema, if_exists, index, index_label, chunksize, dtype,
            geom_col,
        )

And then change _write_postgis() as follows:

def _write_postgis(
    gdf,
    name,
    con,
    schema=None,
    if_exists="fail",
    index=False,
    index_label=None,
    chunksize=None,
    dtype=None,
    geom_col="geometry",
):
    """
    Upload GeoDataFrame into PostGIS database.
    This method requires SQLAlchemy and GeoAlchemy2, and a PostgreSQL
    Python driver (e.g. psycopg2) to be installed.
    Parameters
    ----------
    name : str
        Name of the target table.
    con : sqlalchemy.engine.Engine
        Active connection to the PostGIS database.
    if_exists : {'fail', 'replace', 'append'}, default 'fail'
        How to behave if the table already exists:
        - fail: Raise a ValueError.
        - replace: Drop the table before inserting new values.
        - append: Insert new values to the existing table.
    schema : string, optional
        Specify the schema. If None, use default schema: 'public'.
    index : bool, default False
        Write DataFrame index as a column.
        Uses *index_label* as the column name in the table.
    index_label : string or sequence, default None
        Column label for index column(s).
        If None is given (default) and index is True,
        then the index names are used.
    chunksize : int, optional
        Rows will be written in batches of this size at a time.
        By default, all rows will be written at once.
    dtype : dict of column name to SQL type, default None
        Specifying the datatype for columns.
        The keys should be the column names and the values
        should be the SQLAlchemy types.
    geom_col : str, default 'geometry'
        Name of the geometry column in the target table.
    Examples
    --------
    >>> from sqlalchemy import create_engine  # doctest: +SKIP
    >>> engine = create_engine("postgresql://myusername:mypassword@myhost:5432\
/mydatabase")  # doctest: +SKIP
    >>> gdf.to_postgis("my_table", engine)  # doctest: +SKIP
    """
    try:
        from geoalchemy2 import Geometry
    except ImportError:
        raise ImportError("'to_postgis()' requires geoalchemy2 package. ")

    if not compat.SHAPELY_GE_17:
        raise ImportError(
            "'to_postgis()' requires newer version of Shapely "
            "(>= '1.7.0').\nYou can update the library using "
            "'pip install shapely --upgrade' or using "
            "'conda update shapely' if using conda package manager."
        )

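    gdf = gdf.copy()
    # Rename the active geometry column so it is written under geom_col;
    # without this, the lookups by geom_name below fail when the names differ
    if gdf.geometry.name != geom_col:
        gdf = gdf.rename_geometry(geom_col)
    geom_name = geom_col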

    # Get srid
    srid = _get_srid_from_crs(gdf)

    # Get geometry type and info whether data contains LinearRing.
    geometry_type, has_curve = _get_geometry_type(gdf)

    # Build dtype with Geometry
    if dtype is not None:
        dtype[geom_name] = Geometry(geometry_type=geometry_type, srid=srid)
    else:
        dtype = {geom_name: Geometry(geometry_type=geometry_type, srid=srid)}

    # Convert LinearRing geometries to LineString
    if has_curve:
        gdf = _convert_linearring_to_linestring(gdf, geom_name)

    # Convert geometries to EWKB
    gdf = _convert_to_ewkb(gdf, geom_name, srid)

    if if_exists == "append":
        # Check that the geometry srid matches with the current GeoDataFrame
        with con.begin() as connection:
            if schema is not None:
                schema_name = schema
            else:
                schema_name = "public"

            # Only check SRID if table exists
            if connection.run_callable(connection.dialect.has_table, name, schema):
                target_srid = connection.execute(
                    "SELECT Find_SRID('{schema}', '{table}', '{geom_col}');".format(
                        schema=schema_name, table=name, geom_col=geom_name
                    )
                ).fetchone()[0]

                if target_srid != srid:
                    msg = (
                        "The CRS of the target table (EPSG:{epsg_t}) differs from the "
                        "CRS of current GeoDataFrame (EPSG:{epsg_src}).".format(
                            epsg_t=target_srid, epsg_src=srid
                        )
                    )
                    raise ValueError(msg)

    with con.begin() as connection:

        gdf.to_sql(
            name,
            connection,
            schema=schema,
            if_exists=if_exists,
            index=index,
            index_label=index_label,
            chunksize=chunksize,
            dtype=dtype,
            method=_psql_insert_copy,
        )

    return

Issue Analytics

Created 3 years ago
Reactions: 1
Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction

answerquest commented, Aug 5, 2022

Hi, isn’t the title misleading? It should be .to_postgis() instead of .read_postgis().
And this is very much needed!

Quick clarification: does to_postgis() currently expect the geometry column to be named ‘geom’ or ‘geometry’ by default?

0 reactions

MaxDragonheart commented, Jan 5, 2021

For me, gdf.to_postgis(..., geom_col="my_name") would be ideal, because in my case I have two geometry columns:

geom receives a MultiPolygon
geom_centroid receives the centroid of geom

To use to_postgis, I dropped geom_centroid and replaced it with a text column that holds the WKT from str(GeoDataframe.centroid[0])
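
The workaround described in this comment might look roughly like the sketch below. The column names follow the comment; the sample polygon and CRS are made up for illustration. GeoSeries.to_wkt() is used in place of str() on a single centroid so that every row is converted, which is an assumption about the intent rather than the commenter's exact code.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Made-up sample: one MultiPolygon in a projected CRS, stored in a column named "geom"
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
gdf = gpd.GeoDataFrame(
    {"name": ["parcel"]},
    geometry=[MultiPolygon([square])],
    crs="EPSG:3857",
).rename_geometry("geom")

# Store the centroid as WKT text instead of a second geometry column,
# so the frame keeps a single geometry column for to_postgis()
gdf["geom_centroid"] = gdf.geometry.centroid.to_wkt()
print(gdf.dtypes)
```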
