
allow named sizes for named dimensions


(Thanks for this awesome work!)

I’ve often seen applications in which multiple dimensions of the same tensor must be the same size, where that size is known only at runtime. It is useful to be able to document that those sizes must be the same while acknowledging the differing purposes of these dimensions.

The following example demonstrates the idea, showing three related features that already work in torchtyping as well as a proposed feature:

def func(feats: TensorType["b": ..., "annotator": 3, "word": "words", "feature"],
         predicates: TensorType["b": ..., "annotator": 3, "predicate": "words", "feature"],
         pred_arg_pairs: TensorType["b": ..., "annotator": 3, "predicate": "words", "argument": "words"]):
    # feats has shape (..., 3, words, features)
    # predicates has shape (..., 3, words, features)
    # pred_arg_pairs has shape (..., 3, words, words)
    # the ... b dimensions are checked to be of the same size.

Things that already work (see the runnable sketch after this list):

  • dimensions that share names (like “feature”) are enforced to share the same dimension size
  • named dimensions (like “annotator”) can specify a specific dimension size which is enforced
  • named ellipses (like “b”) can be used to represent a fixed (but only known at runtime) set of dimensions and corresponding sizes [this is very close to what I want, but (1) I would prefer not to have the extra power of matching an unspecified number of dimensions, and (2) as I understand it, each named ellipsis represents a single variable set of dimensions, whereas I want to be able to separately constrain multiple sets of dimensions to share respective sizes]
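
For concreteness, here is a minimal runnable sketch of the first two points, using torchtyping’s documented typeguard integration (the function name and tensor shapes are illustrative, not from the issue):

import torch
from torchtyping import TensorType, patch_typeguard
from typeguard import typechecked

patch_typeguard()  # hook torchtyping's checks into typeguard

@typechecked
def embed(feats: TensorType["annotator": 3, "word", "feature"],
          predicates: TensorType["annotator": 3, "word", "feature"]):
    pass

embed(torch.rand(3, 5, 7), torch.rand(3, 5, 7))  # OK: "word" binds to 5 in both
embed(torch.rand(3, 5, 7), torch.rand(3, 6, 7))  # TypeError: "word" is 5 vs 6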

Proposed:

  • named dimensions (like “word”, “predicate”, and “argument”) should be able to declare a shared-but-unspecified dimension size given by name (“words” in this example)

Additionally, you would probably want to enforce that, if the specified “size name” matches the name of another dimension (like “word” in the following example), then the sizes of those dimensions should be the same:

def func(feats: TensorType["b": ..., "annotator": 3, "word", "feature"],
         predicates: TensorType["b": ..., "annotator": 3, "predicate": "word", "feature"],
         pred_arg_pairs: TensorType["b": ..., "annotator": 3, "predicate": "word", "argument": "word"]):
    # feats has shape (..., 3, words, features)
    # predicates has shape (..., 3, words, features)
    # pred_arg_pairs has shape (..., 3, words, words)
    # the ... b dimensions are checked to be of the same size.
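
Since this feature does not exist in torchtyping, the following hand-rolled sketch only illustrates the intended runtime semantics; check_named_sizes and its (dim_name, size_name) spec format are hypothetical, not proposed API:

import torch

def check_named_sizes(tensor, spec):
    """spec is a sequence of (dim_name, size_name) pairs; every dimension
    sharing a size name must have the same size. Returns the bound sizes."""
    if tensor.dim() != len(spec):
        raise TypeError(f"expected {len(spec)} dimensions, got {tensor.dim()}")
    sizes = {}
    for actual, (dim_name, size_name) in zip(tensor.shape, spec):
        bound = sizes.setdefault(size_name, actual)  # bind on first occurrence
        if bound != actual:
            raise TypeError(f"dimension {dim_name!r} has size {actual}, but size "
                            f"name {size_name!r} is already bound to {bound}")
    return sizes

check_named_sizes(torch.rand(4, 4), [("predicate", "words"), ("argument", "words")])  # OK: {"words": 4}
check_named_sizes(torch.rand(4, 5), [("predicate", "words"), ("argument", "words")])  # raises TypeError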

Thoughts?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
patrick-kidger commented, May 19, 2021

Hmm. Essentially you’re saying that you’d like to have something like

TensorType["same_size", "same_size"]

for checking reasons, but something like

TensorType["meaningful_name", "different_meaningful_name"]

for documentation reasons?

So I’m quite concerned that supporting this via a str: str syntax is a little opaque and opens one up to silent mistakes (e.g. getting the two strings the wrong way around; generally being a confusing hurdle for new users).
Moreover, every annotation that is part of a TensorType is or can be checked in some way; this would introduce a notation that is documentation-only.

At the same time I recognise the value in what you’re suggesting. If you have an alternate syntax I’d be open to ideas?

Alternatively, it would be very straightforward for you to add this syntax yourself, and just use that locally:

from torchtyping import TensorType

class MyTensorType:
    def __class_getitem__(cls, item):
        if not isinstance(item, tuple):
            item = (item,)
        item = list(item)
        for i, item_i in enumerate(item):
            # a "dim_name": "size_name" pair arrives as a slice of two strings;
            # keep only the size name so TensorType checks it as usual
            if isinstance(item_i, slice) and isinstance(item_i.start, str) and isinstance(item_i.stop, str):
                item[i] = item_i.stop
        return TensorType[tuple(item)]

(untested code) This just strips away the first string in each str: str pair and then passes the result on to TensorType as normal.
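
A hypothetical usage, with the dimension names taken from the examples above:

# "annotator": 3 passes through untouched; each str: str pair keeps only its
# size name, so this reduces to TensorType["annotator": 3, "word", "word"]
# and torchtyping still enforces that the two trailing dimensions match.
MyTensorType["annotator": 3, "predicate": "word", "argument": "word"]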

0 reactions
patrick-kidger commented, Jun 4, 2021

I believe so, I’m afraid.
