Large rasters, metadata & "bounds" error
Hey all!
First of all, great work on Terracotta, it’s proven itself very useful.
We’re running it on an EC2 instance and trying to use large rasters through Lambda (the process is: TIFF added to S3 bucket -> Lambda picks it up and passes it to the EC2 instance to be processed -> the EC2 instance serves it to the front end). While it works for “small” rasters, we get errors whenever a file’s resolution is bigger than 30K * 30K, at which point we start running into memory issues:
Unable to allocate 5.96 GiB for an array with shape (1, 80000, 80000) and data type uint8
(sometimes data type = bool).
Thing is, Lambda can use up to 10GB so I’m struggling to see what the problem is.
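For reference, the figure in the error message matches a fully materialized 80000 x 80000 single-band uint8 array. The sketch below only redoes that arithmetic. The idea that a bool mask of the same shape sits in memory at the same time is a guess based on the second variant of the error, not something confirmed from Terracotta's code.

```python
import math

GIB = 2 ** 30  # bytes per gibibyte


def array_gib(shape, itemsize):
    """Return the size in GiB of a dense array with this shape and bytes per element."""
    return math.prod(shape) * itemsize / GIB


shape = (1, 80000, 80000)       # shape reported in the error message
data_gib = array_gib(shape, 1)  # uint8 uses 1 byte per element
mask_gib = array_gib(shape, 1)  # NumPy bool also uses 1 byte per element (assumed mask)

print(f"data: {data_gib:.2f} GiB")
print(f"data + mask: {data_gib + mask_gib:.2f} GiB")
```

If both arrays do exist at once, the total is already above the 10 GB Lambda limit before counting the Python runtime or any temporary copies.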
We also get this error sometimes => PerformanceWarning: Processing a large raster file, but crick failed to import. Reading whole file into memory instead.
After digging in the code, we decided to use the use_chunks parameter, but we keep getting an error back, and it’s not the most descriptive => 'bounds' is all we get back.
Here’s how we use it: driver.insert(keys, raster_file, metadata={'raster_path': f'{raster_file}', 'use_chunks': True}, override_path=f'{raster_file}')
So here are my questions:
- is there any limitation in terms of dimensions?
- why would crick fail to import?
- how are we supposed to use use_chunks? I believe this should fix our issue, but I just can’t get it to work.
Thanks a lot in advance!
Top GitHub Comments
BTW, you are getting KeyError: bounds from your call because you use the metadata keyword argument incorrectly. It expects a dict of pre-computed raster metadata (e.g. by driver.compute_metadata), and one of the keys is bounds, which is missing in your dict.
The correct call would pass the output of driver.compute_metadata as metadata (but no need to do this, the default should work fine in your case).
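A minimal sketch of what such a call might look like. It assumes that compute_metadata accepts use_chunks as a keyword argument and that insert takes the arguments used in the question. ingest_large_raster is a hypothetical helper name, so check the signatures against the Terracotta Python API before relying on it.

```python
def ingest_large_raster(driver, keys, raster_file):
    """Insert a raster using metadata computed in chunks (hypothetical helper)."""
    # Assumption: compute_metadata accepts use_chunks and returns a dict
    # that includes 'bounds' among its keys.
    metadata = driver.compute_metadata(raster_file, use_chunks=True)

    # Pass the pre-computed dict as metadata, rather than a dict of options.
    driver.insert(keys, raster_file, metadata=metadata, override_path=raster_file)
    return metadata
```

Here driver is the same object used in the question's driver.insert call.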
See also: Terracotta Python API
Yes, that is what we call ingestion. In that case the solution should be simple. If crick is not installed, Terracotta has to load the whole raster into memory which is why you get this error.
You have to make sure that crick is installed inside the Lambdas. How to do that depends on your deployment method. Then you should get away with using much less memory (< 1GB).
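One quick way to check this from inside the Lambda is to attempt the import directly. This is only a sketch using the standard library; can_import is a hypothetical helper, not part of Terracotta.

```python
import importlib


def can_import(module_name):
    """Return True if the module imports cleanly in the current environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers a missing package as well as a compiled extension
        # that cannot be loaded on this platform.
        return False
    return True


if not can_import("crick"):
    print("crick is not importable here; expect the whole raster to be read into memory")
```

Attempting a real import, rather than only looking for the package on disk, also catches the case where crick was bundled but built for a different platform than the Lambda runtime.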