Standardize array type annotations in docstrings

Here are a few conventions floating around the repo with minor differences in how arrays are documented:

From https://github.com/pystatgen/sgkit/blob/master/sgkit/stats/association.py#L26:

Parameters
----------
XL : (M, N) array-like
    "Loop" covariates for which N separate regressions will be run
XC : (M, P) array-like
    "Core" covariates included in the regressions for each loop
    covariate. All P core covariates are used in each of the N
    loop covariate regressions.
Y : (M, O)
    Continuous outcomes

From https://github.com/pystatgen/sgkit/blob/master/sgkit/stats/regenie.py#L610:

    Parameters
    ----------
    G : (M, N) ArrayLike
        Genotype data array, `M` samples by `N` variants.
    X : (M, C) ArrayLike
        Covariate array, `M` samples by `C` covariates.
    Y : (M, O) ArrayLike
        Outcome array, `M` samples by `O` outcomes.
    contigs : (N,) ArrayLike

From https://github.com/pystatgen/sgkit/blob/1a5885734020cc52847a3fdd327a35f4426aeb73/sgkit/api.py:

Parameters
----------
variant_contig_names : list of str
    The contig names.
variant_contig : array_like, int
    The (index of the) contig for each variant.
variant_position : array_like, int
    The reference position of the variant.

In some situations the array’s data type and shape information are relevant, and in others they are not. I propose we standardize on array_name : [shape_info] array_like[, dtype], where the bracketed parts are optional. As an example, x : (M, N) array_like, int.
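To make the proposed format concrete, here is a minimal sketch of a docstring written in that style. The function `column_sums` and its parameters are hypothetical and not part of sgkit. To stay dependency-free, the body works on nested Python sequences rather than real arrays.

```python
from typing import List, Sequence


def column_sums(x: Sequence[Sequence[int]], w: Sequence[float]) -> List[float]:
    """Compute the weighted sum of each column (hypothetical example).

    Parameters
    ----------
    x : (M, N) array_like, int
        Input values, `M` rows by `N` columns.
    w : (M,) array_like, float
        One weight per row of `x`.

    Returns
    -------
    sums : (N,) array_like, float
        Weighted sum of each column of `x`.
    """
    # zip(*x) iterates over columns; each column is paired with the row weights.
    return [sum(wi * v for wi, v in zip(w, col)) for col in zip(*x)]
```

Each parameter line carries the optional shape first, then array_like, then the optional element type, so a reader can scan all three in the same order.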

When the “array_like” type needs to be more specific, e.g. a Dask array, I think we can still use “array_like” in the docstring and let a more specific type hint indicate the true bounds on inputs.
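As a rough sketch of that split, the docstring below keeps the generic array_like wording while the signature narrows the accepted input to a Dask array. The function `column_means` is hypothetical. The Dask import sits under `TYPE_CHECKING` and the annotations are written as strings, which is an assumption made here so the snippet runs even where Dask is not installed.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed by static type checkers; not imported at runtime.
    import dask.array as da


def column_means(x: "da.Array") -> "da.Array":
    """Compute the mean of each column (hypothetical example).

    Parameters
    ----------
    x : (M, N) array_like, float
        Input values, `M` rows by `N` columns.

    Returns
    -------
    means : (N,) array_like, float
        Mean of each column of `x`.
    """
    # The type hint, not the docstring, states that a Dask array is expected.
    return x.mean(axis=0)
```

A type checker would flag a call that passes something other than a Dask array, while the rendered documentation stays uniform across functions.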

For reference, here is a random Dask docstring snippet:

Parameters
----------
a : (M, M) array_like
    A triangular matrix
b : (M,) or (M, N) array_like

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

tomwhite commented, Sep 10, 2020

A few comments on some of these:

  • variant_contig_names: the “list of str” part is redundant so could be dropped.
  • variant_contig: if the type was ArrayLike, then we would only have to indicate that the element type is int in the description. Perhaps this could just be in parentheses: “(element type: int) The index of the contig for each variant.” Or it could go at the end of the description.
  • variant_id: if the type was Optional[ArrayLike], then this would render better.

aktech commented, Sep 9, 2020

I wonder if it’s possible to separate the type information and the definition on separate lines, like in https://scanpy.readthedocs.io/en/stable/api/scanpy.pp.calculate_qc_metrics.html#scanpy.pp.calculate_qc_metrics?

I think it should be, I’ll have a look.

Also, have you got any suggestions for better type hints for the array-like parameters? Hopefully we can do better than Any.

I think we should be as explicit as we can in the type hints, for example by specifying the exact set of expected types for parameters and return values. I see there is already an ArrayLike type defined here:

https://github.com/pystatgen/sgkit/blob/cd764bfc1f715dd69df6161a5f8043095b1e6586/sgkit/typing.py#L7

We should use these instead of Any and create new types there if required. Shape or data type information can go in the description, as @eric-czech suggested.
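A minimal sketch of what this could look like, combining an explicit alias with the element type in the description, as suggested in the comments above. The `ArrayLike` alias defined here is a stand-in for illustration and may differ from the one in sgkit/typing.py. The function `count_variants` and the wording of the `variant_id` description are also assumptions.

```python
from typing import TYPE_CHECKING, Optional, Union

if TYPE_CHECKING:
    # Only needed by static type checkers; not imported at runtime.
    import dask.array as da
    import numpy as np

# Hypothetical alias; string forward references avoid a runtime import.
ArrayLike = Union["np.ndarray", "da.Array"]


def count_variants(
    variant_contig: ArrayLike, variant_id: Optional[ArrayLike] = None
) -> int:
    """Count variants and check that optional IDs line up (hypothetical example).

    Parameters
    ----------
    variant_contig
        (element type: int) The index of the contig for each variant.
    variant_id
        (element type: str) An identifier for each variant, if available.
    """
    n = len(variant_contig)
    if variant_id is not None and len(variant_id) != n:
        raise ValueError("variant_id must have one entry per variant")
    return n
```

With the types in the signature, the docstring no longer repeats them, and the element type is the only extra detail left in the description.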
