Summaries to include in report.md generators
The report generators in `supervised_train.py` and `supervised_param_train.py` are great! They make it much easier to browse the results of the numerical experiments @yuanqing-wang has been doing.
A wishlist of things that would be good to include in future iterations of the report generator:
- A few other quick summaries that may be useful to add are `variance(target)`, `stddev(target)`, and `mean_absolute_error`. For example, to compare with MoleculeNet benchmark results on the QM9 energy regression task, it would be useful to have MAE. To put the RMSE in context, it would be good to know the standard deviation of the target values. (See the summary sketch after this list.)
- In the model summary section we have a lot of important detail about layer sizes, etc. Could we also add a description of how node, edge, etc. features are initialized? (Currently only the input dimension is given.) It would also be good to describe the loss function in more detail here. The description mentions that `loss_fn=mse_loss`, but @yuanqing-wang mentioned on Slack that this loss is measured on a normalized regression target.
- For R^2, could you include the definition used, perhaps in a footnote? The reported values are often negative, and I think it is using the definition 1 - (residual sum of squares) / (total sum of squares), as in `sklearn.metrics.r2_score`, but a reader might reasonably expect one of the other definitions that leads to a non-negative value. (The sketch after this list computes R^2 with this definition.)
- For R^2, the reported value is often rounded to 1.00. We might need to use more digits of precision here.
- Another plot that may be informative is a scatter plot of predictions against targets, so we can see the variance of the target quantity, whether a few outliers are dominating the RMSE summary, etc. (See the plotting sketch after this list.)
- The plots should have their axes labeled. In some cases the x-axis is the number of optimizer steps, and in some cases the number of epochs. In some cases I think the y-axis is in units of kcal/mol, and in some cases it measures error on the regression target normalized to have mean 0 and variance 1.
- In some reports, the final iterate is much worse than the best iterate. For example, in this report, an RMSE of ~5-10 (kcal/mol?) and an R^2 of ~1 are attained after 60 epochs, but then the optimizer moves way uphill and never comes back, and the report includes a table saying the model obtained an RMSE of 150 (kcal/mol?) and an R^2 of 0.25. Since we're using an optimizer that doesn't always step in descent directions, could we also add to the summary a description of the best iterate encountered, in addition to the currently summarized last iterate? (See the tracking sketch after this list.)
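To make the first couple of points concrete, here is a rough sketch of the kind of summary block meant above. It assumes `targets` and `predictions` are plain NumPy arrays on the same scale (e.g. kcal/mol); the function name and dictionary keys are placeholders, not existing code in `supervised_train.py`:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

def summarize(targets, predictions):
    """Collect the extra summary statistics discussed above.

    `targets` and `predictions` are 1-D NumPy arrays on the same scale
    (e.g. kcal/mol); the names and keys here are illustrative only.
    """
    residuals = predictions - targets
    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = mean_absolute_error(targets, predictions)  # for MoleculeNet-style comparison
    target_var = np.var(targets)                     # variance of the target
    target_std = np.std(targets)                     # puts the RMSE in context

    # R^2 with the same definition as sklearn.metrics.r2_score:
    # 1 - (residual sum of squares) / (total sum of squares); can be negative.
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((targets - np.mean(targets)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    assert np.isclose(r2, r2_score(targets, predictions))

    return {
        "rmse": rmse,
        "mae": mae,
        "target_variance": target_var,
        "target_stddev": target_std,
        "r2": r2,
    }
```

Formatting `r2` with more digits (e.g. `f"{r2:.4f}"`) would also address the rounding-to-1.00 issue.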
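For the scatter plot, a minimal matplotlib sketch (the units argument is a guess; the axis labels would depend on whether the regression target is normalized):

```python
import matplotlib.pyplot as plt

def scatter_predictions(targets, predictions, units="kcal/mol"):
    """Scatter of predictions against targets, with a y = x reference line."""
    fig, ax = plt.subplots()
    ax.scatter(targets, predictions, s=10, alpha=0.5)
    lo = min(targets.min(), predictions.min())
    hi = max(targets.max(), predictions.max())
    ax.plot([lo, hi], [lo, hi], linestyle="--", color="gray")  # perfect-prediction line
    ax.set_xlabel(f"target ({units})")
    ax.set_ylabel(f"prediction ({units})")
    return fig
```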
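And one way the best iterate could be tracked alongside the last iterate, sketched with hypothetical helpers (`train_one_epoch`, `evaluate_rmse`) standing in for whatever `supervised_train.py` actually does per epoch; `model.state_dict()` assumes a PyTorch model:

```python
import copy

def train_and_track_best(model, optimizer, train_one_epoch, evaluate_rmse, n_epochs):
    """Run training while remembering the best iterate seen so far."""
    best = {"epoch": None, "rmse": float("inf"), "state_dict": None}
    last_rmse = None

    for epoch in range(n_epochs):
        train_one_epoch(model, optimizer)
        last_rmse = evaluate_rmse(model)

        if last_rmse < best["rmse"]:
            # Keep a copy of the best parameters, since the optimizer is free
            # to move uphill later and never recover.
            best = {
                "epoch": epoch,
                "rmse": last_rmse,
                "state_dict": copy.deepcopy(model.state_dict()),
            }

    # Report both, so the summary table can show best and last side by side.
    return {
        "last_rmse": last_rmse,
        "best_epoch": best["epoch"],
        "best_rmse": best["rmse"],
        "best_state_dict": best["state_dict"],
    }
```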
Top GitHub Comments
Nice! Looks like something in this direction may be an improvement: it would separate the computation of summary statistics from the generation of formatted reports, which are currently intertwined.
A couple of minor comments:
- The functions take `results` dictionaries with a specific structure that depends on the result type, hinting that these may be better off living inside a results class (`results.save_html()`, `multiple_results_object.save_html()`, `multiple_results_object.save_html(grid=True)`, ..., rather than `html(results_dict)`, `html_multiple_train_and_test(results)`, `html_multiple_train_and_test_2d_grid(results)`, ...).
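To illustrate, a minimal sketch of what such a results class might look like (the method names follow the comment above; everything else, including `to_html` and the rendered HTML, is hypothetical):

```python
class Results:
    """Metrics for a single train/test run, plus knowledge of how to render them."""

    def __init__(self, metrics):
        self.metrics = metrics  # e.g. {"rmse": ..., "mae": ..., "r2": ...}

    def to_html(self):
        rows = "".join(
            f"<tr><td>{name}</td><td>{value:.4f}</td></tr>"
            for name, value in self.metrics.items()
        )
        return f"<table>{rows}</table>"

    def save_html(self, path="report.html"):
        with open(path, "w") as f:
            f.write(self.to_html())


class MultipleResults:
    """A collection of Results, e.g. one per dataset split or hyperparameter setting."""

    def __init__(self, results):
        self.results = list(results)

    def save_html(self, path="report.html", grid=False):
        # grid=True could arrange the runs in a 2-D grid (e.g. over two
        # hyperparameters); here they are simply concatenated.
        with open(path, "w") as f:
            f.write("".join(r.to_html() for r in self.results))
```

This keeps the statistics computation in plain dictionaries and puts all the formatting behind `save_html()`, which is the separation mentioned at the top of the comment.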