Areas of Interest
Area of Interest
Below is a collection of my thoughts regarding the proposed Area of Interest feature. AOIs in general are fairly nebulous, but within the context of Raster Foundry (RF), they can be taken to their logical conclusion in a straightforward way.
The point of this post is to generate discussion and clarify what our first steps should be.
Resources
- http://resources.arcgis.com/en/help/main/10.1/index.html#//003800000030000000
- https://data.cityofnewyork.us/Health/Areas-of-Interest-GIS/mzbd-kucq/data
Preamble
What are AOIs?
Literally “exactly what’s written”. They are areas/locations that may be of interest to someone. NYC OpenData denotes these as points on a map, showing parks or other landmarks. We (Raster Foundry) imagine them as `Polygon`s that represent some area that a user wishes to pay attention to. The area may or may not contain a landmark.
What are typical operations over them?
- Query Filtering
When querying Tiles from a TMS or GeoTrellis backend, restrict the results to contain only those Tiles touched by some given `Polygon`. GeoTrellis exposes the `LayerQuery` class for this.
- Change Detection
When new imagery becomes available, are there perceptible changes in:
  - water levels?
  - surrounding vegetation?
  - coastlines?
  - glaciers / sea ice?
  - urban sprawl?
  - Jupiter’s Great Red Spot?
  - etc.
What is the initial scope of the feature?
- Thresholds
Not every new image within a given AOI should be considered for addition. For example, in an AOI that a user is using for NDVI change detection, a new image that is entirely cloud cover would not be of use to them. By allowing the user to set Thresholds, we can limit the number of false positives they would otherwise be notified of.
We can realize a Threshold as a function, which allows us to compose them nicely together (details in the Implementation section):
/* Used when mapping through a collection of image metadata,
 * potentially yielding a Scene id if the Image passed the
 * threshold. The function may or may not perform arbitrary
 * IO/Future actions in deciding if the Image is worthy.
 */
def threshold[M[_]: Monad]: Metadata => OptionT[M, UUID]
Thresholds can be either metadata-based or value-based. Ideas:
- Metadata: Image source (Landsat, Sentinel, etc.)
- Metadata: Date range
- Metadata: Extent intersection
- Value: The result of a GT operation (e.g. cloud cover, NDVI, see below)
- Value: The result of an `Op` structure Map Algebra operation
In many cases, we can imagine that the threshold will pass/fail based on metadata alone, meaning that the actual image data would never have to be pulled from S3 in the first place. In these cases, thresholding would be very fast.
Note: Threshold implementations are known to the backend - they are prewritten Scala functions that have matching type signatures. How a choice of Threshold on the frontend matches a Scala function in the backend, and how that choice is stored in the database are handled in the Implementation section below.
Value-based Thresholds are left to the “target scope” section below and won’t initially be considered.
- Threshold JSON Representation
A JSON structure that maps a choice of thresholds made on the frontend to a real composition of Scala functions in the backend. Sample:
[
  { "threshold": "overlap", "minimum": 0.75 },
  { "threshold": "image-source", "allowed-sources": [ "landsat", "sentinel" ] }
]
When an `AOI` is defined on the frontend, this JSON is passed to the backend along with the AOI’s `Extent`, etc., in order to create the corresponding database entry.
At least at first, these thresholds have an AND relationship, implying that they all have to pass in order for the image to be accepted. Question: Is there a use for arbitrary combinations of AND and OR thresholds?
- `AOI` type and DB table
An `AOI` can be created at `Project` creation time, and would have its own DB table.
- Notification / Threshold Success
Should an image pass a threshold for a given `AOI`, the `UUID` of the corresponding Scene is added to a queue to await user approval. Where this queue lives is an open question. Suggestions:
  - as a new field in a `Project`, stored in a Postgres `ARRAY` of UUIDs
  - as a many-to-many table connecting Projects and Scenes, with optional extra metadata
- Batch jobs run over new imagery for change detection based on Thresholds
We could do this all at once, or spread it out over the day. “Batch mode” in not-so-pseudo code could look like:
def allAois: Seq[AOI]
def newImagesMetadata: Seq[Metadata] // since last run

for {
  aoi <- allAois
  img <- newImagesMetadata
} {
  /* Unwrap the OptionT and act on the result once it is available */
  aoi.threshold(img).run.foreach {
    case None       => ()
    case Some(uuid) => addToQueue(aoi.project, uuid)
  }
}
- A REST endpoint for adding an AOI and its thresholds
The existing endpoint for creating `Projects` will have to be altered to account for the AOI bbox and its associated Thresholds. The request will also need to carry the JSON structure described above.
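The batch job from the list above can be sketched with only the standard library. This is a simplified illustration, not the proposed implementation: it assumes a synchronous `Metadata => Option[UUID]` threshold instead of `OptionT`, an in-memory map standing in for the approval queue, and a made-up `cloudFree` metadata field. The names `runBatch` and `queue` are hypothetical.

```scala
import java.util.UUID
import scala.collection.mutable

// Hypothetical, minimal metadata; real image metadata would carry more fields.
final case class Metadata(id: UUID, cloudFree: Boolean)

// Simplified AOI: only the parts the batch job needs.
final case class AOI(project: UUID, threshold: Metadata => Option[UUID])

// Stand-in for the approval queue: project id -> Scene ids awaiting review.
val queue = mutable.Map.empty[UUID, Vector[UUID]]

def addToQueue(project: UUID, scene: UUID): Unit =
  queue.update(project, queue.getOrElse(project, Vector.empty) :+ scene)

// Test every new image against every AOI; images that fail are skipped.
def runBatch(aois: Seq[AOI], newImages: Seq[Metadata]): Unit =
  for {
    aoi  <- aois
    img  <- newImages
    uuid <- aoi.threshold(img) // a None result contributes nothing
  } addToQueue(aoi.project, uuid)
```

The cost is one threshold call per (AOI, image) pair, which is why cheap metadata-only thresholds matter if the job runs over all AOIs at once.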
What is the target scope of the feature?
- GT operations for checking thresholds
Some Thresholds may require the backend to perform a GT Collections API operation over an image. Initially we can supply operations that constitute folds, i.e. operations that reduce an Image/Scene into a single value.
  - Cloud cover
  - NDVI change
  - NDWI change
  - ???
Arbitrary map algebra operations can be added later.
- Multiple AOIs
- Editing existing AOIs
- Arbitrary Map Algebra-based Thresholds
- Automatic Scene addition to projects w/ optional MA steps
Backend Implementation
AOI definition
class AOI(
  project: UUID,                      /* The corresponding Project */
  extent: Extent,                     /* Associated bounding box */
  lastRun: TimeStamp,                 /* From Postgres. Why not Java8 type? */
  threshold: Image => Option[UUID]    /* UUID of the `Scene` */
) { ... }
An `AOI` knows what project it is associated with, but not the other way around. This means that the Projects table need not be queried when testing thresholds. The existence of a `lastRun` parameter depends on how and how often we choose to run the “thresholder” process.
The `threshold` value is never stored in the DB as a function. What is stored is its JSON representation, and upon reading an `AOI` from the DB we convert the JSON to composed Scala functions. If an `AOI` is ever updated on the frontend, then we just overwrite the JSON in the database. These are safe processes in general because we define and control the JSON-to-Scala relationship, the production of the JSON by the frontend, and which thresholding functions actually exist to be used.
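One way to picture the JSON-to-Scala relationship is a closed set of threshold descriptions that the backend turns into functions. The sketch below is an assumption-laden illustration: it skips JSON parsing entirely and starts from an already-decoded form, uses a plain `Metadata => Option[UUID]` instead of `OptionT`, and assumes the overlap fraction is already available on the metadata. `ThresholdSpec`, `compile` and `compileAll` are hypothetical names.

```scala
import java.util.UUID

// Hypothetical, simplified metadata; `overlap` is assumed to be precomputed.
final case class Metadata(id: UUID, source: String, overlap: Double)

// Hypothetical decoded form of the entries in the threshold JSON array.
sealed trait ThresholdSpec
final case class Overlap(minimum: Double) extends ThresholdSpec
final case class ImageSource(allowed: Set[String]) extends ThresholdSpec

type Threshold = Metadata => Option[UUID]

// One prewritten function per known spec. The ADT is closed, so an unknown
// threshold name has to be rejected when decoding, before this step.
def compile(spec: ThresholdSpec): Threshold = spec match {
  case Overlap(min)         => m => if (m.overlap >= min) Some(m.id) else None
  case ImageSource(allowed) => m => if (allowed(m.source)) Some(m.id) else None
}

// AND semantics: every threshold must accept the image.
// Note that an empty list of specs accepts every image.
def compileAll(specs: Seq[ThresholdSpec]): Threshold = {
  val fs = specs.map(compile)
  m => if (fs.forall(f => f(m).isDefined)) Some(m.id) else None
}
```

Because only the description is stored, updating an AOI means overwriting the stored specs and recompiling on the next read.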
- Open Questions
  - Should we use the Postgres `TimeStamp` or Java 8 `java.time` types?
  - How do we reconcile the monadic `M` missing here? `class AOI[M: Monad]` doesn’t seem right.
Threshold definition and composition
A Threshold is a function that may perform arbitrary (possibly Futurey) IO to determine if an incoming Image is appropriate for a given AOI. We expect any such function to have the signature:
def threshold[M[_]: Monad]: Metadata => OptionT[M, UUID]
which could also be:
def threshold: Metadata => OptionT[Future, UUID]
if that form is deemed the more valuable overall.
The “prewritten Scala functions” mentioned above can look like `overlap` here:
/* Making this up for demonstration purposes */
case class Metadata(id: UUID, extent: Extent, ...)
/** What percent does a given image's Extent overlap an AOI? */
def howMuchOverlap(aoi: Extent, img: Extent): Float
/** Does an incoming image overlap this AOI enough? */
def overlap[M[_]: Monad](aoi: Extent, min: Float): Metadata => OptionT[M, UUID] = {
  case meta if howMuchOverlap(aoi, meta.extent) >= min => OptionT.some(meta.id)
  case _ => OptionT.none
}
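`howMuchOverlap` is only declared above. A possible body, assuming both extents are axis-aligned bounding boxes and that “overlap” means the fraction of the AOI’s area covered by the image (a value in [0, 1], matching the `0.75` minimum in the JSON sample), might look like this. The `Extent` case class here is a hypothetical stand-in so that the sketch has no dependencies.

```scala
// Hypothetical stand-in for a bounding-box type; only the four corners matter.
final case class Extent(xmin: Double, ymin: Double, xmax: Double, ymax: Double) {
  def area: Double = math.max(0.0, xmax - xmin) * math.max(0.0, ymax - ymin)
}

/** Fraction of the AOI's area that the image covers, in [0, 1]. */
def howMuchOverlap(aoi: Extent, img: Extent): Float = {
  // Width and height of the intersection rectangle; non-positive means disjoint.
  val w = math.min(aoi.xmax, img.xmax) - math.max(aoi.xmin, img.xmin)
  val h = math.min(aoi.ymax, img.ymax) - math.max(aoi.ymin, img.ymin)
  if (w <= 0 || h <= 0 || aoi.area == 0) 0f
  else ((w * h) / aoi.area).toFloat
}
```

Measuring against the AOI’s area rather than the image’s means a large image that fully contains a small AOI scores 1.0, which seems to be the useful reading for change detection.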
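Composing Thresholds with AND semantics means applying each Threshold to the same `Metadata` and continuing only while the previous one succeeds. A plain `andThen` (or the Kleisli operator `>=>`) does not fit the signature above, because each Threshold consumes a `Metadata` rather than the previous Threshold’s output. Reducing with an explicit combinator works:
val foo: Metadata => OptionT[Future, UUID]
val bar: Metadata => OptionT[Future, UUID]
val baz: Metadata => OptionT[Future, UUID]
/* `g` is only evaluated if `f` yielded a UUID */
val threshold: Metadata => OptionT[Future, UUID] =
  Seq(foo, bar, baz).reduce((f, g) => meta => f(meta).flatMap(_ => g(meta)))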
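For comparison, here is a dependency-free sketch of AND-composition over asynchronous thresholds. It is an illustration under assumptions, not the proposed design: `Future[Option[UUID]]` stands in for `OptionT[Future, UUID]`, the `Metadata` fields are made up, and `and` / `all` are hypothetical helper names.

```scala
import java.util.UUID
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical, simplified metadata.
final case class Metadata(id: UUID, source: String, overlap: Double)

type Threshold = Metadata => Future[Option[UUID]]

/** AND-combine two thresholds; `g` only runs if `f` accepted the image. */
def and(f: Threshold, g: Threshold)(implicit ec: ExecutionContext): Threshold =
  meta => f(meta).flatMap {
    case Some(_) => g(meta)
    case None    => Future.successful(None)
  }

/** AND-combine a non-empty sequence of thresholds, left to right. */
def all(ts: Seq[Threshold])(implicit ec: ExecutionContext): Threshold =
  ts.reduce(and(_, _))
```

Ordering cheap metadata checks before expensive value-based ones lets a failing image be rejected without the later thresholds ever running.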
Altering Project
If we decide that the `Project` table should hold the queue of thresholded images, then the `Project` class needs one new field:
class Project(
...,
toInspect: Seq[UUID], /* Scene UUIDs */
...
)
Otherwise, it remains as-is.
Top GitHub Comments
@jmorrison1847 I can imagine ways that your `OR` constructs here could be reworked into `AND`s. For the image sources, it could be defined in JSON as:
and then work into Scala like:
An update after today’s meeting regarding AOI tasks:
- `AOI` table
- `Projects` will have extra columns added to it for AOI handling
- `ScenesToProjects` table will be overloaded to contain both manually added Scenes as well as Scenes to consider for addition which have passed thresholds
- `MultiPolygon`s
The related issues are #1335 #1336 #1381.