Large rasters, metadata & "bounds" error

See original GitHub issue

Hey all!

First of all, great work on Terracotta; it has proven very useful.

We’re running it on an EC2 instance and trying to use large rasters through Lambda. The process is: a TIFF is added to an S3 bucket -> Lambda picks it up and passes it to the EC2 instance to be processed -> the EC2 instance serves it to the front end. This works for “small” rasters, but whenever a file’s resolution is bigger than 30K * 30K we start running into memory issues: Unable to allocate 5.96 GiB for an array with shape (1, 80000, 80000) and data type uint8 (sometimes data type = bool). The thing is, Lambda can use up to 10 GB, so I’m struggling to see what the problem is.
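As a quick sanity check on the number in that error message, the size of a dense array can be worked out from its shape and element size. The sketch below is plain arithmetic, not Terracotta code; the helper name array_size_gib is made up for illustration, and it assumes 1 byte per element, which is the element size of both uint8 and bool.

```python
def array_size_gib(shape, itemsize_bytes=1):
    """Size in GiB of a dense array with the given shape and bytes per element."""
    n_items = 1
    for dim in shape:
        n_items *= dim
    return n_items * itemsize_bytes / 2**30


# Shape taken from the error message; uint8 and bool both use 1 byte per element.
size = array_size_gib((1, 80000, 80000))
print(f"{size:.2f} GiB")  # prints "5.96 GiB"
```

A single array of this shape already takes up more than half of a 10 GB limit. If a second array of the same shape is held at the same time (the bool variant of the error hints that this may happen), the two together would need roughly 11.9 GiB.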

We also sometimes get this warning: PerformanceWarning: Processing a large raster file, but crick failed to import. Reading whole file into memory instead.

After digging through the code, we decided to use the use_chunks parameter, but we keep getting an error back that is not very descriptive: 'bounds' is all we get. Here’s how we use it: driver.insert(keys, raster_file, metadata={'raster_path': f'{raster_file}', 'use_chunks': True}, override_path=f'{raster_file}')

So here are my questions:

  • Is there any limitation in terms of raster dimensions?
  • Why would crick fail to import?
  • How are we supposed to use use_chunks? I believe this should fix our issue, but I just can’t get it to work.

Thanks a lot in advance!

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6

Top GitHub Comments

2 reactions
dionhaefner commented, May 1, 2021

BTW, you are getting KeyError: bounds from your call because you use the metadata keyword argument incorrectly. It expects a dict of pre-computed raster metadata (e.g. by driver.compute_metadata), and one of the keys is bounds, which is missing in your dict.

The correct call would be

metadata = driver.compute_metadata(raster_file, use_chunks=True)
driver.insert(keys, raster_file, metadata=metadata)

(but no need to do this, the default should work fine in your case)

See also: Terracotta Python API
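To see why 'bounds' was the entire error message: looking up a missing key in a Python dict raises a KeyError whose text is just the quoted key name. The sketch below uses a plain dict shaped like the one passed in the question, not Terracotta's internals, and "example.tif" is a placeholder path.

```python
# A dict shaped like the one passed as `metadata` in the question.
# "example.tif" is a placeholder, not a real file.
metadata = {"raster_path": "example.tif", "use_chunks": True}

try:
    metadata["bounds"]  # the key that pre-computed raster metadata is expected to have
except KeyError as exc:
    message = str(exc)

print(message)  # prints 'bounds' (including the quotes)
```

Any code that logs or returns only str(exc) will therefore show nothing but 'bounds'.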

1 reaction
dionhaefner commented, May 1, 2021

Yes, that is what we call ingestion. In that case the solution should be simple. If crick is not installed, Terracotta has to load the whole raster into memory, which is why you get this error.

You have to make sure that crick is installed inside the Lambdas. How to do that depends on your deployment method. Then you should get away with using much less memory (< 1GB).
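One way to confirm this from inside the Lambda itself is to attempt the import at startup and log the result. This is a minimal sketch using only the standard library; the helper name can_import is made up for illustration, and where you call it (for example at the top of your handler module) depends on your deployment.

```python
import importlib


def can_import(module_name):
    """Return True if the module imports cleanly in the current environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and a package that is present
        # but fails while being imported.
        return False
    return True


if not can_import("crick"):
    print("crick is not importable here; large rasters will be read fully into memory")
```

Actually attempting the import, rather than only checking that the package is present, matches the wording of the warning ("crick failed to import").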
