@JohnBTasker Yes, but we had the desire to align with OGC API - Features and they don't use GeoJSON for Collections.
If you use data and collections in a sentence, there's likely something wrong. I think what you want is an Item, not a standalone collection. The addition of collection assets was not meant to replace single items, but to allow linking to assets that are common across all items. Also, your standalone collections would not be searchable via STAC API /search. So I'd recommend making an item, with a corresponding collection (optional, but I'd recommend it).
Properties in Collections is indeed from an earlier iteration of STAC. That will likely go away in implementations in the future. Collection level properties go in summaries (for standalone collections) or top-level, if the extension supports it.
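For illustration, the item-plus-collection recommendation could look roughly like the sketch below. It assumes the STAC 1.0.0 field layout. All ids, hrefs, dates, and coordinates are made-up placeholders, and the objects are kept to a minimum rather than being validated examples.

```python
import json

# Hypothetical collection: carries the asset shared by all items.
collection = {
    "type": "Collection",
    "stac_version": "1.0.0",  # assumed version
    "id": "example-aerial-project",  # placeholder id
    "description": "Placeholder description of an aerial survey project.",
    "license": "proprietary",
    "extent": {
        "spatial": {"bbox": [[172.0, -43.0, 173.0, -42.0]]},
        "temporal": {"interval": [["2020-01-01T00:00:00Z", None]]},
    },
    # Collection-level asset: something common across all items.
    "assets": {
        "report": {"href": "./survey-report.pdf", "type": "application/pdf"}
    },
    "links": [{"rel": "item", "href": "./frame-0001.json"}],
}

# Hypothetical item: the object that /search would actually return.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "frame-0001",  # placeholder id
    "collection": collection["id"],
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [172.0, -43.0], [173.0, -43.0], [173.0, -42.0],
            [172.0, -42.0], [172.0, -43.0],
        ]],
    },
    "bbox": [172.0, -43.0, 173.0, -42.0],
    "properties": {"datetime": "2020-01-01T00:00:00Z"},
    "assets": {"data": {"href": "./frame-0001.tif", "type": "image/tiff"}},
    "links": [{"rel": "collection", "href": "./collection.json"}],
}

if __name__ == "__main__":
    print(json.dumps(item, indent=2))
```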
@cholmes The general data model that we have established has four object types:
Only frames are directly related to data files, the remaining objects are semantic groupings of common search/discovery attributes. The main interest for proposing improved geometry details in collections is that more often than not for aerial datasets you will initially search for a project rather than individual data items, particularly over large areas. Once you've narrowed down the project(s), finding the specific data item / format is then the focus. A good example of this type of logic is the LINZ Data Service (https://data.linz.govt.nz/) where data is organised based on projects, with individual tiles subsequently downloadable.
geometry of a STAC Item should be resolved, as due to small turbulences the sensor is constantly shaking and thus the footprint has a wobbly shape. Using only the bounding box would likely cover way too much of an area, but providing all the fine details would probably be far too much nonsensical metadata.
stac4s currently (https://github.com/azavea/stac4s/tree/master/docs) and as we explore it / show uses we'll PR it back to the main STAC spec repo eventually
scale parameter was me trying out generating the nodata mask from overviews instead of the full data file, as I thought it would speed it up (a scale factor of 2 would effectively be using the half-size overview if there was one), but I found the results to not be great.
Thanks @lossyrob and @matthewhanson for the advice, that seems to be a sensible choice. Most probably I'll go the rasterization -> vectorization -> simplification route on my datasets as well.
So for the balancing act, I see that it is neither desirable to make the boundaries too large (users get too many unusable results) nor to make them too small (users get too few results). And generally having as few points as possible is good. If I do simplification, I'll either have to define some tolerance in terms of distance, or a maximum number of points which may be returned, or both. Are there any established good numbers for that? I could imagine rules like (but maybe there are more options):
Each of those may have its problems and maybe there is no good general advice. But as a good choice depends not only on the dataset creator's capabilities but also on the user of the dataset, I have the feeling that a general guideline could be helpful.
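To make the tolerance vs. point-budget trade-off concrete, here is a minimal sketch of the simplification step only. It uses a plain Douglas-Peucker pass and keeps doubling the tolerance until the ring fits a point budget. The function names, the doubling strategy, and the planar-coordinate assumption are all illustrative choices, not an established STAC recommendation. With lon/lat input the tolerance is in degrees.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify an open line; drop points within tolerance of the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


def simplify_ring(ring, tolerance):
    """Simplify a closed ring (first point == last point, >= 4 points)."""
    start = ring[0]
    # Split at the vertex farthest from the start so each half is an open line.
    far = max(range(1, len(ring) - 1), key=lambda i: math.dist(ring[i], start))
    first_half = douglas_peucker(ring[: far + 1], tolerance)
    second_half = douglas_peucker(ring[far:], tolerance)
    return first_half[:-1] + second_half


def simplify_to_budget(ring, start_tolerance, max_points):
    """Double the tolerance until the ring has at most max_points points.

    Stops early rather than collapse the ring below a triangle, so the
    result can still exceed max_points for very small budgets.
    """
    if start_tolerance <= 0:
        raise ValueError("start_tolerance must be positive")
    tolerance = start_tolerance
    result = simplify_ring(ring, tolerance)
    while len(result) > max_points:
        candidate = simplify_ring(ring, tolerance * 2)
        if len(candidate) < 4:  # fewer than 3 distinct corners
            break
        tolerance *= 2
        result = candidate
    return result, tolerance
```

Combining both rules this way means the tolerance acts as a lower bound on detail and the point budget as an upper bound on size. Whether doubling is a sensible step, and what the starting values should be, depends on the dataset.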