folium crashes Jupyter Notebook when using large geojson file
Note: This issue is mostly copy/pasted from the below link:
Versions (according to my conda environment YAML):
python=3.6
folium=0.5.0
branca=0.2.0
numpy=1.13.3
pandas=0.21.0
geopandas=0.3.0
(Although I don’t think I even use all of the above packages)
I am trying to follow the example of adding a geo_json overlay found here. The geo_json file I am using is this map of the zip codes in Germany, found here. It is large, at 85.8 MB.
Here is my MWE; I downloaded the ‘postleitzahlen.geojson’ file to (pwd being my “present working directory”) ‘pwd/data/postleitzahlen.geojson’. Note that the same problem happens even when I change the extension from the non-standard .geojson to .json.
import folium
# Not sure if I need all of the packages below:
import json
import os
# Center map in middle of Berlin, zoom out enough so all of Germany is visible,
# and use mapbox bright instead of default OpenStreetMap so as to hopefully make
# images easier to render. But it still crashes the notebook anyway.
m = folium.Map(location=[52.5194, 13.4067], tiles='Mapbox Bright', zoom_start=5)
# This is the analog of the example on the Folium website, but I don't really understand it.
# Wouldn't we need to load the file into memory somehow, maybe using geopandas or something?
zipcode_regions = os.path.join('data', 'postleitzahlen.geojson')
# Add the geoJSON layer to map ostensibly:
folium.GeoJson(zipcode_regions, name='geo_json').add_to(m)
# Still crashes regardless of whether I include following line - I think it just adds control in top-right of map on example website, which I don't need.
folium.LayerControl().add_to(m)
m
In Jupyter Notebook (on my computer at least) what happens is that for the output of the cell containing the last line, one just gets a large empty white space. Moreover, at the top it says ‘Autosave Failed!’. And when one tries to click the save button, the notebook freezes momentarily, and then nothing happens. (I.e. there’s no indication that the file was saved.)
I can still run new cells (e.g. performing basic arithmetic), but nothing saves. (EDIT: Nope, doing this too often causes the notebook to crash outright in Chrome.)
This might be a bug in either Jupyter, IPython, or Folium, in which case asking here might not help much, but I figured I would at least try.
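One workaround that sidesteps the notebook output pipeline entirely is to write the map to a standalone HTML file and open it in a browser, instead of displaying it inline. A minimal sketch, reusing the m built above (the filename is arbitrary):
# Write the map to disk instead of rendering it inline; folium.Map.save
# serializes the full HTML document, so the notebook never has to push
# the payload through its messaging layer.
m.save('postleitzahlen_map.html')  # filename is illustrative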
Looking at the documentation for this function (scroll down a lot), should I try (1) setting the overlay parameter to True, since the default is False, or (2) setting smooth_factor to a float greater than the default of 1.0? (I will try both and update this post with any results.)
I read these questions, but did not understand how to use their answers to solve my problem. If someone can explain how to apply those answers here, I would greatly appreciate it. (1)(2)(3)(4)(5)
EDIT: I tried doing what this person did, namely increasing the data rate limit (specifically I ran jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10 to launch Jupyter Notebook), but the same error occurred as before.
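For what it's worth, the same limit can also be set persistently in a Jupyter config file rather than per-launch; a sketch, assuming the config file was generated with jupyter notebook --generate-config:
# In ~/.jupyter/jupyter_notebook_config.py (path assumed; the file is
# created by `jupyter notebook --generate-config`):
c.NotebookApp.iopub_data_rate_limit = 1.0e10  # same value as the CLI flag above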
EDIT (2): The crash still occurs when setting overlay=True, both with and without the data rate limit increase.
EDIT (3): Setting smooth_factor to 10, 100, or 1000 didn’t fix it, although it did make the penultimate cell run faster. So this seems more likely a problem with Jupyter than with folium.
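Note that smooth_factor only tells Leaflet how aggressively to simplify paths when drawing; the full GeoJSON is still embedded in the notebook output. To actually shrink the payload before it reaches the browser, the geometries can be simplified with geopandas first. A hedged sketch (the tolerance value is a guess and trades boundary accuracy for size):
import geopandas as gpd
import folium

# Load the full GeoJSON and simplify each geometry (Douglas-Peucker);
# the tolerance is in degrees here and is purely illustrative.
gdf = gpd.read_file('data/postleitzahlen.geojson')
gdf['geometry'] = gdf.geometry.simplify(tolerance=0.01, preserve_topology=True)

m = folium.Map(location=[52.5194, 13.4067], zoom_start=5)
folium.GeoJson(gdf.to_json(), name='geo_json').add_to(m)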
The terminal output each time for Jupyter Notebook contains multiple errors of the form:
Saving file at /map.ipynb
[I 12:00:00.000 NotebookApp] Malformed HTTP message from ::1: Content-Length too long
Watching the terminal more closely as the notebook runs, it is clear that this error occurs exactly as the notebook tries to load the map. So next I will try the solutions proposed here.
EDIT (4): Still doesn’t work trying the jupyter notebook --NotebookApp.tornado_settings="{'max_body_size': 104857600, 'max_buffer_size': 104857600}" suggested here; it also doesn’t work when adding three additional zeroes to max_body_size and max_buffer_size, nor when adding six additional zeroes to both (i.e. approx. 0.1 petabytes, a million times the defaults).
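For reference, a quick sanity check on those sizes (104857600 bytes is, as far as I know, tornado's 100 MiB default):
default = 104857600                # 100 * 1024**2 bytes = 100 MiB, tornado's default
print(default * 10**3 / 1e9)       # three extra zeroes: ~104.9 GB
print(default * 10**6 / 1e15)      # six extra zeroes: ~0.105 PB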
Since the GeoJSON loads on GitHub (albeit slowly), it seems that it is possible for the GeoJSON to be loaded with a map. And 0.1 petabytes is probably not a reasonable limit for HTTP requests generated by folium to exceed, which is why I am posting this as an issue here, although it might (I don’t know) be an issue with Jupyter Notebook instead. It could conceivably be an issue with Leaflet too.
EDIT: Full code (but not minimal):
import pandas as pd
import numpy as np
import folium
import branca
import geopandas as gpd
import json
import shelve
import os
m = folium.Map(location=[52.5194, 13.4067], tiles='Mapbox Bright', zoom_start=5)
# Test that folium works without any geoJSON layer -- it does, and very quickly too.
m
zipcode_regions = os.path.join('data', 'postleitzahlen.json')
folium.GeoJson(zipcode_regions, name='geo_json', overlay=True, smooth_factor=1000).add_to(m)
# The cell above this isn't immediate, but fairly quick. When trying to run the following cell, the notebook crashes:
m
Top GitHub Comments
Not stupid at all and, with the rise of ipyleaflet, we should probably document that somewhere.

Not really. TL;DR: folium is a hack that builds the HTML for you, using the jinja templates and a CDN for the JS part. Using the folium approach one could use geojson-vt as a plugin (take a look at our plugins for more info). There are two approaches: folium’s stand-alone HTML, and whatever ipyleaflet does, which is more sophisticated (and I am unfamiliar with the details).

@Conengmo nailed it. There is not much we can do there.
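As an aside, the “builds the HTML for you” point is easy to verify: the rendered document literally embeds the whole GeoJSON payload, which is what blows past the notebook’s Content-Length limit. A sketch, using any folium map m with the layer attached:
# branca/folium render the full page as one string; for an 85.8 MB GeoJSON
# the result is at least that large, since the data is inlined verbatim.
html = m.get_root().render()
print(len(html))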
I did research https://github.com/mapbox/geojson-vt in the past but never got to actually try it. I believe it can help you with large datasets.
Closing b/c this is not a folium issue we can act on.