Ensure that functions' docstrings pass numpydoc validation

Background / Objective

Docstrings in Python are string literals that occur as the first statement in a module, function, class, or method definition.

These are some of the characteristics of a docstring:

  • Triple quotes are used to encompass the docstring text.
  • There is no blank line before or after the docstring.
  • The docstring is a phrase ending in a period.
  • more details

numpydoc is one set of criteria to check for consistent documentation structure.
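
As a rough illustration of the characteristics above, here is a small function with a numpydoc-style docstring. The function `scale_values` and its parameters are hypothetical and not part of scikit-learn; only the section layout (summary line, Parameters, Returns, Examples) is meant to reflect the numpydoc format.

```python
def scale_values(values, factor=2.0):
    """Multiply each value by a constant factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float, default=2.0
        Multiplier applied to every element of `values`.

    Returns
    -------
    list of float
        The scaled numbers, in the same order as `values`.

    Examples
    --------
    >>> scale_values([1.0, 2.0], factor=3.0)
    [3.0, 6.0]
    """
    # Hypothetical helper used only to demonstrate the docstring layout.
    return [v * factor for v in values]
```

The summary is a single phrase ending in a period, and each section title is underlined with dashes of the same length.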

Validating docstrings in scikit-learn

To ensure consistent documentation structure in scikit-learn, we are using numpydoc validation. Currently, documentation tests are failing for various functions. As a temporary fix, we have suppressed the error messages in test_docstrings.py. Many functions in scikit-learn need to be updated to comply with numpydoc validation. The steps below explain how contributors can test and update functions.

Note

If you run into “YD01: No Yields section found”, the cause may be the description of the cv parameter. Change the entry “An iterable yielding (train, test) splits as arrays of indices” to:

        - An iterable that generates (train, test) splits as arrays of indices.
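
For context, this is roughly how the reworded entry might sit inside a full parameter description. The function `check_splits` is hypothetical and exists only to carry the docstring. It is an assumption here, not something stated in the issue, that the check is triggered by the word "yield" appearing in the function's source.

```python
def check_splits(cv=None):
    """Return the cross-validation splitting strategy unchanged.

    Parameters
    ----------
    cv : int, cross-validation generator or iterable, default=None
        Determines the cross-validation splitting strategy.
        Possible inputs for cv are:

        - None, to use the default cross-validation,
        - An integer, to specify the number of folds,
        - An iterable that generates (train, test) splits as arrays of indices.

    Returns
    -------
    cv : int, cross-validation generator, iterable or None
        The value that was passed in.
    """
    # Hypothetical pass-through; only the docstring wording matters here.
    return cv
```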

Steps

  1. Make sure you have the development dependencies and documentation dependencies installed.
  2. Pick a function from the list below and leave a comment saying you are going to work on it. This way we can keep track of what everyone is working on.
     2.1 Make sure you’ve created a separate branch from main before editing files for your new contribution. Refer to our contributing guidelines for more information.
  3. Remove the function from the list at: https://github.com/scikit-learn/scikit-learn/blob/670133dbc42ebd9f79552984316bc2fcfd208e2e/sklearn/tests/test_docstrings.py#L14
  4. Let’s say you picked sklearn._config.config_context. Run numpydoc validation as follows:
     pytest sklearn/tests/test_docstrings.py -k sklearn._config.config_context
  5. If you see failing tests, fix them by following the recommendations provided by the failing tests.
  6. If all the tests pass, you do not need to make any additional changes.
  7. Commit your changes.
  8. Open a Pull Request with an opening message Addresses #21350. Note that each item should be submitted in a separate Pull Request.
  9. Include the function name in the title of the pull request. For example: “DOC Ensures that config_context passes numpydoc validation”.
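
Besides the pytest command in the steps above, the docstring of a single function can also be inspected directly from Python. The sketch below assumes numpydoc is installed and uses its `numpydoc.validate.validate` function, which takes an import path and returns a dictionary with an "errors" list of (code, message) pairs. The helper names `format_errors` and `numpydoc_errors` are made up for this example, and the project's own test may ignore some error codes, so the output can differ from what pytest reports.

```python
def format_errors(errors):
    """Format (code, message) pairs as printable lines."""
    return [f"{code}: {message}" for code, message in errors]


def numpydoc_errors(import_path):
    """Return numpydoc validation errors for the object at `import_path`."""
    # Imported lazily so this module loads even when numpydoc is missing.
    from numpydoc.validate import validate

    return validate(import_path)["errors"]


# Example (requires numpydoc and scikit-learn on the import path):
# for line in format_errors(numpydoc_errors("sklearn._config.config_context")):
#     print(line)
```

An empty list from `numpydoc_errors` would mean numpydoc found nothing to report for that function.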

Note: once you have submitted 3 such PRs, feel free to move on to more complex pull requests that involve more thinking, and leave these fixes to first-time contributors so they can learn the GitHub contribution workflow 😃

Functions to Update

  • sklearn._config.config_context #21426
  • sklearn._config.get_config #21656
  • sklearn.base.clone #21557
  • sklearn.cluster._affinity_propagation.affinity_propagation #21778
  • sklearn.cluster._agglomerative.linkage_tree #21424
  • sklearn.cluster._kmeans.k_means #21423
  • sklearn.cluster._kmeans.kmeans_plusplus #22200
  • sklearn.cluster._mean_shift.estimate_bandwidth #21940
  • sklearn.cluster._mean_shift.get_bin_seeds #22018
  • sklearn.cluster._mean_shift.mean_shift #22019
  • sklearn.cluster._optics.cluster_optics_dbscan
  • sklearn.cluster._optics.cluster_optics_xi #22202
  • sklearn.cluster._optics.compute_optics_graph #22024 #22205
  • sklearn.cluster._spectral.spectral_clustering #22025
  • sklearn.compose._column_transformer.make_column_transformer #22183
  • sklearn.covariance._empirical_covariance.empirical_covariance #21439
  • sklearn.covariance._empirical_covariance.log_likelihood #21438
  • sklearn.covariance._graph_lasso.graphical_lasso #22326
  • sklearn.covariance._robust_covariance.fast_mcd #22331
  • sklearn.covariance._shrunk_covariance.ledoit_wolf #22496 #22798 #22748
  • sklearn.covariance._shrunk_covariance.ledoit_wolf_shrinkage #22798 #22748
  • sklearn.covariance._shrunk_covariance.shrunk_covariance #22798 #22260
  • sklearn.datasets._base.get_data_home #22259
  • sklearn.datasets._base.load_boston #22247
  • sklearn.datasets._base.load_breast_cancer #22346
  • sklearn.datasets._base.load_diabetes #21526
  • sklearn.datasets._base.load_digits #22392
  • sklearn.datasets._base.load_files #21727
  • sklearn.datasets._base.load_iris #21760
  • sklearn.datasets._base.load_linnerud #22484
  • sklearn.datasets._base.load_sample_image #22805
  • sklearn.datasets._base.load_wine #22469
  • sklearn.datasets._california_housing.fetch_california_housing #22882
  • sklearn.datasets._covtype.fetch_covtype #22918
  • sklearn.datasets._kddcup99.fetch_kddcup99 #23929
  • sklearn.datasets._lfw.fetch_lfw_pairs #23655
  • sklearn.datasets._lfw.fetch_lfw_people #24161
  • sklearn.datasets._olivetti_faces.fetch_olivetti_faces #22480
  • sklearn.datasets._openml.fetch_openml #22483
  • sklearn.datasets._rcv1.fetch_rcv1 #22225
  • sklearn.datasets._samples_generator.make_biclusters #22790
  • sklearn.datasets._samples_generator.make_blobs #22342
  • sklearn.datasets._samples_generator.make_checkerboard #22390
  • sklearn.datasets._samples_generator.make_classification #22797
  • sklearn.datasets._samples_generator.make_gaussian_quantiles #23996
  • sklearn.datasets._samples_generator.make_hastie_10_2 #22333
  • sklearn.datasets._samples_generator.make_multilabel_classification #22784 #22782
  • sklearn.datasets._samples_generator.make_regression #22784
  • sklearn.datasets._samples_generator.make_sparse_coded_signal #22817
  • sklearn.datasets._samples_generator.make_sparse_spd_matrix #22332
  • sklearn.datasets._samples_generator.make_spd_matrix #23974
  • sklearn.datasets._species_distributions.fetch_species_distributions #24162
  • sklearn.datasets._svmlight_format_io.dump_svmlight_file #23166
  • sklearn.datasets._svmlight_format_io.load_svmlight_file #24163 #24164
  • sklearn.datasets._svmlight_format_io.load_svmlight_files #24164
  • sklearn.datasets._twenty_newsgroups.fetch_20newsgroups #22329
  • sklearn.decomposition._dict_learning.dict_learning #24316 #24289 #22793
  • sklearn.decomposition._dict_learning.dict_learning_online #24289
  • sklearn.decomposition._dict_learning.sparse_encode #22793
  • sklearn.decomposition._fastica.fastica #23094
  • sklearn.decomposition._nmf.non_negative_factorization #24235
  • sklearn.externals._packaging.version.parse #24447 #24567 #24461 #24320 #22817 #22793 #22332
  • sklearn.feature_extraction.image.extract_patches_2d #23926
  • sklearn.feature_extraction.image.grid_to_graph #23052
  • sklearn.feature_extraction.image.img_to_graph #23398
  • sklearn.feature_extraction.text.strip_accents_ascii #23250
  • sklearn.feature_extraction.text.strip_accents_unicode #24232
  • sklearn.feature_extraction.text.strip_tags #23248
  • sklearn.feature_selection._univariate_selection.chi2 #23945 #23943 #23467
  • sklearn.feature_selection._univariate_selection.f_oneway
  • sklearn.feature_selection._univariate_selection.r_regression #22785
  • sklearn.inspection._partial_dependence.partial_dependence #24325 #24174
  • sklearn.inspection._plot.partial_dependence.plot_partial_dependence #24325
  • sklearn.isotonic.isotonic_regression #22475
  • sklearn.linear_model._least_angle.lars_path #24319 #22500
  • sklearn.linear_model._least_angle.lars_path_gram #24319
  • sklearn.linear_model._omp.orthogonal_mp #24329 #22501
  • sklearn.linear_model._omp.orthogonal_mp_gram #24329
  • sklearn.linear_model._ridge.ridge_regression #22788
  • sklearn.manifold._locally_linear.locally_linear_embedding #24330
  • sklearn.manifold._t_sne.trustworthiness #24333
  • sklearn.metrics._classification.accuracy_score #24259 #21478 #21441
  • sklearn.metrics._classification.balanced_accuracy_score #21478
  • sklearn.metrics._classification.brier_score_loss #23914
  • sklearn.metrics._classification.classification_report #22803
  • sklearn.metrics._classification.cohen_kappa_score #23915
  • sklearn.metrics._classification.confusion_matrix #22842 #21496
  • sklearn.metrics._classification.f1_score #22358
  • sklearn.metrics._classification.fbeta_score #23486
  • sklearn.metrics._classification.hamming_loss #21449
  • sklearn.metrics._classification.hinge_loss #23387
  • sklearn.metrics._classification.jaccard_score #23910
  • sklearn.metrics._classification.log_loss #23657
  • sklearn.metrics._classification.precision_recall_fscore_support #22472
  • sklearn.metrics._classification.precision_score #23504 #22712 #21479
  • sklearn.metrics._classification.recall_score #21495
  • sklearn.metrics._classification.zero_one_loss #21450
  • sklearn.metrics._plot.confusion_matrix.plot_confusion_matrix #22842
  • sklearn.metrics._plot.det_curve.plot_det_curve #24334
  • sklearn.metrics._plot.precision_recall_curve.plot_precision_recall_curve #24403
  • sklearn.metrics._plot.roc_curve.plot_roc_curve #21547
  • sklearn.metrics._ranking.auc #23505 #23433
  • sklearn.metrics._ranking.average_precision_score #23504 #22712
  • sklearn.metrics._ranking.coverage_error #24322
  • sklearn.metrics._ranking.dcg_score #24351 #22400
  • sklearn.metrics._ranking.label_ranking_average_precision_score #23504
  • sklearn.metrics._ranking.label_ranking_loss #22781
  • sklearn.metrics._ranking.ndcg_score #22400
  • sklearn.metrics._ranking.precision_recall_curve #24403 #22514
  • sklearn.metrics._ranking.roc_auc_score #23505
  • sklearn.metrics._ranking.roc_curve #24351 #21547
  • sklearn.metrics._ranking.top_k_accuracy_score #24259
  • sklearn.metrics._regression.max_error #21420
  • sklearn.metrics._regression.mean_absolute_error #21714
  • sklearn.metrics._regression.mean_pinball_loss #24336
  • sklearn.metrics._scorer.make_scorer #22367
  • sklearn.metrics.cluster._bicluster.consensus_score #24343
  • sklearn.metrics.cluster._supervised.adjusted_mutual_info_score #24344
  • sklearn.metrics.cluster._supervised.adjusted_rand_score #24345
  • sklearn.metrics.cluster._supervised.completeness_score #23016
  • sklearn.metrics.cluster._supervised.entropy #24352
  • sklearn.metrics.cluster._supervised.fowlkes_mallows_score #24352
  • sklearn.metrics.cluster._supervised.homogeneity_completeness_v_measure #23942
  • sklearn.metrics.cluster._supervised.homogeneity_score #23006
  • sklearn.metrics.cluster._supervised.mutual_info_score #24344 #24093 #24091
  • sklearn.metrics.cluster._supervised.normalized_mutual_info_score #24093
  • sklearn.metrics.cluster._supervised.pair_confusion_matrix #24094
  • sklearn.metrics.cluster._supervised.rand_score #24345 #24096
  • sklearn.metrics.cluster._supervised.v_measure_score #24097
  • sklearn.metrics.cluster._unsupervised.davies_bouldin_score #21850
  • sklearn.metrics.cluster._unsupervised.silhouette_samples #21851
  • sklearn.metrics.cluster._unsupervised.silhouette_score #21852
  • sklearn.metrics.pairwise.additive_chi2_kernel #23943
  • sklearn.metrics.pairwise.check_paired_arrays #23944
  • sklearn.metrics.pairwise.check_pairwise_arrays #23519
  • sklearn.metrics.pairwise.chi2_kernel #23945 #23943
  • sklearn.metrics.pairwise.cosine_distances #23946 #22141
  • sklearn.metrics.pairwise.cosine_similarity #23947
  • sklearn.metrics.pairwise.distance_metrics #23949
  • sklearn.metrics.pairwise.euclidean_distances #22783 #22140 #21429
  • sklearn.metrics.pairwise.haversine_distances #23044
  • sklearn.metrics.pairwise.kernel_metrics #23950
  • sklearn.metrics.pairwise.laplacian_kernel #23005
  • sklearn.metrics.pairwise.linear_kernel #21470
  • sklearn.metrics.pairwise.manhattan_distances #23900 #22139
  • sklearn.metrics.pairwise.nan_euclidean_distances #22140
  • sklearn.metrics.pairwise.paired_cosine_distances #22141
  • sklearn.metrics.pairwise.paired_distances #22380
  • sklearn.metrics.pairwise.paired_euclidean_distances #22783
  • sklearn.metrics.pairwise.paired_manhattan_distances #23900
  • sklearn.metrics.pairwise.pairwise_distances_argmin #23951 #23952
  • sklearn.metrics.pairwise.pairwise_distances_argmin_min #23952
  • sklearn.metrics.pairwise.pairwise_distances_chunked #24527
  • sklearn.metrics.pairwise.pairwise_kernels
  • sklearn.metrics.pairwise.polynomial_kernel #23953
  • sklearn.metrics.pairwise.rbf_kernel #23954
  • sklearn.metrics.pairwise.sigmoid_kernel #23955
  • sklearn.model_selection._split.check_cv #22778
  • sklearn.model_selection._split.train_test_split #21435
  • sklearn.model_selection._validation.cross_val_predict #21433
  • sklearn.model_selection._validation.cross_val_score #21464
  • sklearn.model_selection._validation.cross_validate #23145
  • sklearn.model_selection._validation.learning_curve #23911
  • sklearn.model_selection._validation.permutation_test_score #23912
  • sklearn.model_selection._validation.validation_curve #23913
  • sklearn.neighbors._graph.kneighbors_graph #22459
  • sklearn.neighbors._graph.radius_neighbors_graph #22462
  • sklearn.pipeline.make_union #23909
  • sklearn.preprocessing._data.binarize #24002 #22801
  • sklearn.preprocessing._data.maxabs_scale #24359
  • sklearn.preprocessing._data.normalize #24093 #23188 #22795
  • sklearn.preprocessing._data.power_transform #22802
  • sklearn.preprocessing._data.quantile_transform #22780
  • sklearn.preprocessing._data.robust_scale #23908
  • sklearn.preprocessing._data.scale #24362 #24359 #23908
  • sklearn.preprocessing._label.label_binarize #24002
  • sklearn.random_projection.johnson_lindenstrauss_min_dim #24003
  • sklearn.svm._bounds.l1_min_c #24134
  • sklearn.tree._export.plot_tree
  • sklearn.utils.axis0_safe_slice #24561
  • sklearn.utils.check_pandas_support #21566
  • sklearn.utils.extmath.cartesian #21513
  • sklearn.utils.extmath.density #24516
  • sklearn.utils.extmath.fast_logdet #24605
  • sklearn.utils.extmath.randomized_range_finder #22069
  • sklearn.utils.extmath.randomized_svd #24607
  • sklearn.utils.extmath.safe_sparse_dot #24567
  • sklearn.utils.extmath.squared_norm #24360
  • sklearn.utils.extmath.stable_cumsum #24348
  • sklearn.utils.extmath.svd_flip #24581
  • sklearn.utils.extmath.weighted_mode #24571
  • sklearn.utils.fixes.delayed
  • sklearn.utils.fixes.linspace #24582
  • sklearn.utils.fixes.threadpool_info
  • sklearn.utils.fixes.threadpool_limits
  • sklearn.utils.gen_batches #24609
  • sklearn.utils.gen_even_slices #24608
  • sklearn.utils.get_chunk_n_rows #22539
  • sklearn.utils.graph.graph_shortest_path
  • sklearn.utils.graph.single_source_shortest_path_length #24474
  • sklearn.utils.is_scalar_nan #24562
  • sklearn.utils.metaestimators.available_if #24586
  • sklearn.utils.metaestimators.if_delegate_has_method #24633
  • sklearn.utils.multiclass.check_classification_targets #22793
  • sklearn.utils.multiclass.class_distribution #24452
  • sklearn.utils.multiclass.type_of_target #24463
  • sklearn.utils.multiclass.unique_labels #24476
  • sklearn.utils.resample #23916
  • sklearn.utils.safe_mask #24425
  • sklearn.utils.safe_sqr #24437
  • sklearn.utils.shuffle #24367
  • sklearn.utils.sparsefuncs.count_nonzero #24447
  • sklearn.utils.sparsefuncs.csc_median_axis_0 #24461
  • sklearn.utils.sparsefuncs.incr_mean_variance_axis #24477
  • sklearn.utils.sparsefuncs.inplace_swap_column #23476
  • sklearn.utils.sparsefuncs.inplace_swap_row #24518 #24513 #24178
  • sklearn.utils.sparsefuncs.inplace_swap_row_csc #24513
  • sklearn.utils.sparsefuncs.inplace_swap_row_csr #24518
  • sklearn.utils.sparsefuncs.mean_variance_axis #24477 #24177
  • sklearn.utils.sparsefuncs.min_max_axis #22839
  • sklearn.utils.tosequence #22494
  • sklearn.utils.validation.as_float_array #21502
  • sklearn.utils.validation.assert_all_finite #22470
  • sklearn.utils.validation.check_is_fitted #24454
  • sklearn.utils.validation.check_memory #23039
  • sklearn.utils.validation.check_random_state #23320 #22787
  • sklearn.utils.validation.column_or_1d #21591
  • sklearn.utils.validation.has_fit_parameter #21590
  • sklearn.utils.validation.indexable #21431

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 214 (206 by maintainers)

Top GitHub Comments

5 reactions
jeremiedbb commented, Oct 14, 2022

All functions now pass numpydoc validation. Thanks to everyone who contributed to this long-standing issue! We can close this issue and consider the numpydoc arc over 😃

3 reactions
glemaitre commented, Oct 17, 2022

Nice. Thank you to everyone that contributed.
