Create and support a single-file file format for storing the image tiles

Reading byte ranges from files is supported in many cases in a web browser:

- A byte range of a local file can be read with the File API.
- A byte range of a file stored on a web server can be downloaded if the web server supports the HTTP Range header.
- A byte range of a file stored on AWS S3 can be downloaded through the Amazon S3 REST API.
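
The three cases above can be sketched in browser JavaScript as follows. This is a minimal sketch, not a tested implementation: the helper names `rangeHeader`, `readLocalRange` and `fetchRange` are hypothetical, and it assumes the web server (or S3, which accepts the same `Range` header on GET requests) answers with `206 Partial Content`.

```javascript
// Build an HTTP Range header value for bytes [start, end), i.e. end is exclusive.
// HTTP ranges are inclusive on both ends, hence the "- 1".
function rangeHeader(start, end) {
    return "bytes=" + start + "-" + (end - 1);
}

// Local file: File objects inherit slice() from Blob, so nothing is uploaded.
// Returns a Promise that resolves to an ArrayBuffer with the requested bytes.
function readLocalRange(file, start, end) {
    return file.slice(start, end).arrayBuffer();
}

// Remote file (web server or an S3 object URL): request only the needed bytes.
async function fetchRange(url, start, end) {
    var response = await fetch(url, { headers: { Range: rangeHeader(start, end) } });
    // 206 means the server honoured the range; 200 would mean it sent the whole file.
    if (response.status !== 206) {
        throw new Error("Range request not honoured, status " + response.status);
    }
    return response.arrayBuffer();
}
```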

I think it would be a good idea to create a simple file format that would have a header section in the beginning of the file and then have all the image tiles after it.

The header would store information about:

- image width
- image height
- tile size
- tile overlap
- and maybe image file format (JPEG/PNG)
- and a list of the byte ranges of all the image tiles
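
One possible binary layout for such a header is sketched below. Everything about it is an assumption made for illustration: the field order, the little-endian 32-bit integers, the format codes (0 for JPEG, 1 for PNG) and the function names `encodeHeader` and `decodeHeader` are hypothetical and not part of DZI or OpenSeadragon. Tile byte ranges are stored as N + 1 offsets, so tile i occupies the bytes from `offsets[i]` up to, but not including, `offsets[i + 1]`.

```javascript
// Hypothetical layout: six uint32 fields, then (tile count + 1) uint64 offsets.
var FIXED_FIELDS = 6; // width, height, tileSize, overlap, format, tile count

function encodeHeader(h) {
    var numTiles = h.offsets.length - 1;
    var buffer = new ArrayBuffer(4 * FIXED_FIELDS + 8 * h.offsets.length);
    var view = new DataView(buffer);
    var fields = [h.width, h.height, h.tileSize, h.overlap, h.format, numTiles];
    fields.forEach(function (value, i) {
        view.setUint32(4 * i, value, true); // true = little-endian
    });
    h.offsets.forEach(function (offset, i) {
        // 64-bit offsets so that files larger than 4 GiB remain addressable.
        view.setBigUint64(4 * FIXED_FIELDS + 8 * i, BigInt(offset), true);
    });
    return buffer;
}

function decodeHeader(buffer) {
    var view = new DataView(buffer);
    var numTiles = view.getUint32(20, true);
    var offsets = [];
    for (var i = 0; i <= numTiles; i++) {
        // Number() is exact only up to 2^53 - 1, which is ample for file offsets.
        offsets.push(Number(view.getBigUint64(4 * FIXED_FIELDS + 8 * i, true)));
    }
    return {
        width: view.getUint32(0, true),
        height: view.getUint32(4, true),
        tileSize: view.getUint32(8, true),
        overlap: view.getUint32(12, true),
        format: view.getUint32(16, true),
        offsets: offsets
    };
}
```

With this layout a reader could first request the 24 fixed bytes to learn the tile count, and then request the offset table in a second range read.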

What do you think? What are the advantages and disadvantages of such a single-file format instead of using the multiple-file DZI format?

One advantage would be that system administration is simplified as you only need to handle one file per image instead of possibly thousands of files per image.

To simplify the handling and storing of the image tiles, I would suggest we create a unique ordering of all the image tiles, in other words a

function tile_id(x_coord, y_coord, level, width, height, tile_size)

that returns an integer number between 0 and N - 1, where N is the number of tiles.

The unique ordering makes it easy to design the file header, because the image tile byte range start positions (and maybe end positions) could be stored as just an array of numbers.

Such a function could be implemented something like this:

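// Number of pyramid levels: level 0 is the smallest, the last level is full resolution.
function num_levels(width, height) {
    return Math.ceil(Math.log2(Math.max(width, height))) + 1;
}

// Side length, in full-resolution pixels, of the area covered by one tile at the given level.
function scaled_tile_size(num_levels_, level, tile_size) {
    var count = num_levels_ - level - 1;
    var factor = 1;
    for (var i = 0; i < count; i++) {
        factor = factor * 2;
    }
    return tile_size * factor;
}

// Number of tiles (columns times rows) at one level. Tile overlap is ignored here.
function num_tiles_level(width, height, level, tile_size) {
    var num_l = num_levels(width, height);
    var scaled = scaled_tile_size(num_l, level, tile_size);
    return Math.ceil(width / scaled) * Math.ceil(height / scaled);
}

// Unique id: all tiles of the lower levels come first, then the tiles of this level row by row.
function tile_id(x_coord, y_coord, level, width, height, tile_size) {
    var result = 0;
    for (var i = 0; i < level; i++) {
        result = result + num_tiles_level(width, height, i, tile_size);
    }
    var num_l = num_levels(width, height);
    var scaled_tile_s = scaled_tile_size(num_l, level, tile_size);
    // A row-major index needs the number of tiles per row (the column count), which
    // comes from the width. Using the height here gives colliding ids for non-square images.
    var num_columns = Math.ceil(width / scaled_tile_s);
    result = result + (num_columns * y_coord) + x_coord;
    return result;
}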
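
With such an id, looking up a tile's bytes becomes a plain array access. The sketch below assumes the header has been parsed into an array `offsets` of N + 1 ascending byte positions (a hypothetical layout, not an existing format), where the last entry marks the end of the final tile. The helper name `tile_byte_range` and the sample numbers are made up for illustration.

```javascript
// offsets[i] is the first byte of tile i; offsets[i + 1] is one past its last byte.
function tile_byte_range(offsets, id) {
    if (!Number.isInteger(id) || id < 0 || id >= offsets.length - 1) {
        throw new RangeError("tile id out of range: " + id);
    }
    return { start: offsets[id], end: offsets[id + 1] };
}

var offsets = [48, 1200, 5300, 9100]; // made-up values describing 3 tiles
var range = tile_byte_range(offsets, 1);
// range.start is 1200 and range.end is 5300 (end is exclusive)
```

The returned pair can be passed directly to `Blob.slice(start, end)` for a local file, or turned into a `Range: bytes=start-(end - 1)` header for a remote one.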

Actually I’ve already tried out this approach, i.e. storing all the image tiles in one file and using the File API to open a local file with a slightly modified OpenSeadragon. It worked. Sorry I don’t have that published on the web yet, but I plan to in the coming weeks.

Issue Analytics

Created 7 years ago
Reactions: 2
Comments: 57 (18 by maintainers)

Top GitHub Comments

7 reactions

KempWatson commented, Mar 29, 2020

Re the general question of a single-file format for zoomable images on S3, I hope this isn’t re-inventing the wheel… we have already put tens of terabytes of zoomable images on S3 since 2012 with our single-file ZIF format. The only thing missing is the open-source access libraries, and coincidentally, as this came in, I’m discussing an implementation of exactly this with another developer. Writing a JavaScript extension to OSD for this would be a cakewalk, and I’d be happy to undertake that.

3 reactions

jo-chemla commented, Jan 15, 2021

Hi all, very interesting reads on the possibility of storing and streaming SZI (uncompressed, zipped DZIs) or ZIF in OSD.

Coming from GIS: in recent years the geospatial community has worked out a standard way of storing massive georeferenced images, named Cloud Optimized GeoTIFF (COG or COGeo). This format is just a standard, agreed-upon way to store pyramids within a TIFF file, with each pyramid level written in blocks of a given number of pixels so that adjacent data is stored in contiguous blocks. The file format is meant to be streamed to web clients using range requests (although I am not aware of a direct client-side implementation of this yet; it instead happens through middleware in the form of TiTiler).

Maybe TiTiler can be of help to the OSD team? Would this COGeo standard be easier to implement inside OSD than SZI or ZIF (without any support for geotags, of course)? Subscribing to the feed, as the discussion is really interesting.