RFE: expose delegated metadata to client application
EDIT: The overall issue is described in detail in https://docs.google.com/document/d/1rWHAM2qCUtnjWD4lOrGWE2EIDLoA7eSy4-jB66Wgh0o . The suggestion here is roughly the “Metadata role (file) as search index” solution in the document.
Assume a setup like this (this is what we expect a community artifact repository like PyPI to look like if it uses developer signatures with TUF):
- a specific project/product team controls a delegated metadata file
- TUF clients want to know details of all of the artifacts in this metadata (to e.g. figure out which versions of an artifact are available)
Currently there is no way for the client application to get the whole metadata content from ngclient. We could provide a call much like get_targetinfo() that, instead of the TargetFile, would return the Targets object where the target search ended:

    def get_targets_metadata(target_path: str) -> Targets:
        """Return the Targets object of the metadata where the search for target_path terminated."""
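
To make the idea concrete, here is a minimal sketch on a toy model. It does not use python-tuf: the `Targets` dataclass, the `repo` dict and the extra `repo` parameter are all hypothetical stand-ins for metadata that ngclient would download and verify. It also picks one possible answer to the open questions: it follows the first matching delegation at each level without backtracking, and returns None if the top-level targets role delegates nothing for the path.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Dict, List, Optional, Tuple


@dataclass
class Targets:
    """Toy stand-in for a targets metadata object (not the python-tuf class)."""
    name: str
    # target path -> info dict (hashes, length, custom metadata, ...)
    targets: Dict[str, dict] = field(default_factory=dict)
    # (delegated role name, path patterns) in priority order
    delegations: List[Tuple[str, List[str]]] = field(default_factory=list)


def get_targets_metadata(repo: Dict[str, Targets], target_path: str) -> Optional[Targets]:
    """Return the Targets object where the search for target_path ended.

    Assumption: only the first matching delegation is followed at each level,
    and plain glob matching stands in for TUF path pattern matching.
    """
    current = repo["targets"]
    visited = {current.name}  # guards against delegation cycles
    while True:
        match = next(
            (role for role, patterns in current.delegations
             if any(fnmatchcase(target_path, p) for p in patterns)),
            None,
        )
        if match is None or match in visited or match not in repo:
            break
        visited.add(match)
        current = repo[match]
    # Never leaving the top-level role means nothing was delegated for this path.
    return None if current.name == "targets" else current


repo = {
    "targets": Targets("targets", delegations=[("django", ["django/*"])]),
    "django": Targets("django", targets={
        "django/django-4.1.tar.gz": {"custom": {"version": "4.1"}},
        "django/django-4.2.tar.gz": {"custom": {"version": "4.2"}},
    }),
}

# The client only needs to know some path that is delegated to the project role.
project = get_targets_metadata(repo, "django/any-file")
versions = sorted(info["custom"]["version"] for info in project.targets.values())
```

Here the client learns every listed version from a single metadata object, which is the use case described above.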
This is not applicable to every TUF repo:
- it requires a “contract” between repository and client: the client has to know of a target_path that is delegated to the correct metadata (in the PyPI example it could be e.g. the PyPI project name)
- this is only useful if all “related” target files are listed in the same metadata
But with those assumptions the client can now easily get not just the list of target files it’s interested in but also any custom metadata embedded in the targets metadata.
I’ve not thought through all the cases (what happens if there is no targetpath match? what if there is no terminating delegation?) but I think this is something we could consider implementing.
Top GitHub Comments
After discussing with @kairoaraujo we realised that using hashed bin delegation anywhere in the delegation chain breaks this idea. Because the hashing happens over the complete artifact targetpath (and not some policy object like “project name”), we can’t possibly list all targets related to a project or find out the current version of a product.
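
A small illustration of why hashed bins scatter a project’s files. This is a sketch under assumptions: the bin naming (`bin-<first hex digit>`), the one-character prefix and the file names are made up for the example; only the principle (the bin is derived from a SHA-256 hash of the full target path) is taken from the discussion above.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

PREFIX_LEN = 1  # assumption: 16 bins, keyed by the first hex digit of the hash


def bin_for(target_path: str) -> str:
    """Return the (hypothetical) bin role name responsible for target_path."""
    digest = hashlib.sha256(target_path.encode("utf-8")).hexdigest()
    return f"bin-{digest[:PREFIX_LEN]}"


def group_by_bin(paths: List[str]) -> Dict[str, List[str]]:
    """Group target paths by the bin they hash into."""
    bins: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        bins[bin_for(path)].append(path)
    return dict(bins)


# Fifty made-up releases of one project: the project name plays no part in the
# bin choice, so the files end up spread over many unrelated bin roles.
paths = [f"django/django-4.{minor}.tar.gz" for minor in range(50)]
bins = group_by_bin(paths)
print(f"{len(paths)} files of one project spread over {len(bins)} bins")
```

No single bin metadata lists “all of django”, so a call that returns one Targets object cannot answer project-level questions in this layout.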
This is just a side effect of TUF not really understanding concepts like project, product or version: everything is an independent artifact in TUF. There are multiple questions this architecture (when using hashed bins) can’t solve without additional data:
At least the first one is a question all package repository clients want to answer. Maybe larger repositories just are going to need an additional layer to handle that (and to store the project/product/version mapping in TUF target files to secure that info, just like PEP-458 currently does)…
This leads to another question: if you have to include more structured data about your artifacts in TUF already, why not include the TARGETINFO data (I mean the download URL and hashes) there as well? Why would you list those artifacts separately in TUF metadata and force your clients to do two round trips?
I guess I should update current thinking on this.
I think exposing the metadata to clients as described has security implications that may mean this is not a good idea. The fact that a delegated role’s metadata contains targetpaths does not mean that those targetpaths have been delegated to the role. So exposing the list as is seems wrong, even if this is documented as unsafe.
The only really safe way to do this would be to run the delegation lookup for each targetpath, and only expose it to the client if the targetpath really is delegated to the role in question. This sounds a bit wasteful but in practice might work just fine: in usual cases this would not lead to new metadata downloads and all required metadata would already be loaded in memory.
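
A sketch of that filtering step, again on a toy model rather than python-tuf. The `REPO` dict layout and the `resolve_role` and `delegated_targets` helpers are hypothetical, and terminating delegations, signature checks and metadata downloads are left out. The point is only that each listed targetpath is kept if, and only if, a normal delegation lookup for that path ends in the same role.

```python
from fnmatch import fnmatchcase
from typing import Dict, Optional

# Hypothetical in-memory repository: role name -> listed targets and delegations.
REPO = {
    "targets": {
        "targets": {},
        "delegations": [("django", ["django/*"]), ("flask", ["flask/*"])],
    },
    "django": {
        "targets": {
            "django/django-4.2.tar.gz": {"length": 10},
            # Listed by the django role, but never delegated to it.
            "flask/flask-3.0.tar.gz": {"length": 666},
        },
        "delegations": [],
    },
    "flask": {
        "targets": {"flask/flask-3.0.tar.gz": {"length": 20}},
        "delegations": [],
    },
}


def resolve_role(repo: Dict[str, dict], target_path: str) -> Optional[str]:
    """Preorder delegation walk: name of the role trusted for target_path."""
    to_visit = ["targets"]
    visited = set()
    while to_visit:
        name = to_visit.pop()
        if name in visited or name not in repo:
            continue
        visited.add(name)
        role = repo[name]
        if target_path in role["targets"]:
            return name
        matching = [
            child for child, patterns in role["delegations"]
            if any(fnmatchcase(target_path, p) for p in patterns)
        ]
        # Reverse so the highest-priority delegation is popped first.
        to_visit.extend(reversed(matching))
    return None


def delegated_targets(repo: Dict[str, dict], role_name: str) -> Dict[str, dict]:
    """Only the targets of role_name that a lookup really resolves to role_name."""
    listed = repo[role_name]["targets"]
    return {
        path: info for path, info in listed.items()
        if resolve_role(repo, path) == role_name
    }


safe = delegated_targets(REPO, "django")
```

In this example the django role’s bogus `flask/...` entry is dropped, because the lookup for that path resolves to the flask role instead.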