[BUG] Rossmann notebook not hitting expected RMSPE (again)

Describe the bug

Rossmann convergence and final RMSPEs are again considerably worse than they once were.
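
For reference, RMSPE is the root mean square percentage error, the metric used to score the Rossmann Store Sales Kaggle competition. A minimal sketch, assuming plain Python sequences and that rows with a zero target are skipped (as I understand the competition's scoring to do). The function name `rmspe` is illustrative, and this is not necessarily how the example notebook computes it:

```python
import math


def rmspe(y_true, y_pred):
    """Root mean square percentage error, returned as a fraction (0.10 == 10%).

    Rows whose true value is zero are skipped, since the percentage
    error is undefined there. This skipping rule is an assumption.
    """
    squared_ratios = [
        ((t - p) / t) ** 2
        for t, p in zip(y_true, y_pred)
        if t != 0
    ]
    if not squared_ratios:
        raise ValueError("need at least one non-zero target")
    return math.sqrt(sum(squared_ratios) / len(squared_ratios))
```

With this definition, predictions that are each off by 10% of the true value give an RMSPE of 0.10, which is how the percentages reported below should be read.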

I’m separating this out from https://github.com/NVIDIA/NVTabular/issues/146 because that issue was resolved by @rjzamora's PR (I confirmed this, as described below).

Steps/Code to reproduce bug

First, rewind to @rjzamora's PR, which (for reasons that are currently unknown) fixed https://github.com/NVIDIA/NVTabular/issues/146:

# Align Dask and Single-GPU Writer Logic (#160)
git checkout 7407cfd

Run examples/rossmann-store-sales-preproc.ipynb

Run examples/rossmann-store-sales-example.ipynb

Here we see consistent convergence and good final RMSPEs, confirming the fix.

Note: NVTabular’s Workflow outputs are now saved in examples/data/jp_ross

Now fast-forward to master as of 2020-08-04:

# [REVIEW] Async torch Dataloaders (#127)
git checkout 7935f7e

Note: examples/rossmann-store-sales-preproc.ipynb is completely unchanged from 7407cfd to 7935f7e.

If we run examples/rossmann-store-sales-example.ipynb 3 times, we now (1) see unstable convergence and (2) obtain final RMSPEs of

TensorFlow: 25.0%, 22.3%, 22.3%
fast.ai:    29.9%, 29.1%, 21.5%

The problem seems to lie in Workflow processing.

Note that the newer version of examples/rossmann-store-sales-example.ipynb does not use examples/data/jp_ross for exporting Workflow data, but rather examples/data/ross_pre.

So, we can now run this notebook exactly as is but using 7407cfd's Workflow outputs instead. This was done by inserting

PREPROCESS_DIR = os.path.join(DATA_DIR, 'jp_ross')
PREPROCESS_DIR_TRAIN = os.path.join(PREPROCESS_DIR, 'train')
PREPROCESS_DIR_VALID = os.path.join(PREPROCESS_DIR, 'valid')

right before the Training a Network section.

Now, if we rerun the notebook 3 times, we once again get stable convergence, and the final RMSPEs are

TensorFlow: 18.9%, 17.4%, 17.9%
fast.ai:    19.7%, 19.5%, 21.4%
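
To make the gap between the two sets of runs easier to see, here is a small sketch that summarizes the RMSPEs reported in this issue as a mean and a max-minus-min spread per framework. The numbers are copied from the runs above; the labels `ross_pre` and `jp_ross` refer to the two Workflow output directories, and the helper name `summarize` is my own:

```python
from statistics import mean

# Final RMSPEs (in percent) from the three runs reported in this issue.
runs = {
    ("ross_pre (7935f7e outputs)", "TensorFlow"): [25.0, 22.3, 22.3],
    ("ross_pre (7935f7e outputs)", "fast.ai"): [29.9, 29.1, 21.5],
    ("jp_ross (7407cfd outputs)", "TensorFlow"): [18.9, 17.4, 17.9],
    ("jp_ross (7407cfd outputs)", "fast.ai"): [19.7, 19.5, 21.4],
}


def summarize(values):
    """Return (mean, spread) of a list of RMSPEs, rounded to one decimal."""
    return round(mean(values), 1), round(max(values) - min(values), 1)


summary = {key: summarize(values) for key, values in runs.items()}

for (outputs, framework), (avg, spread) in summary.items():
    print(f"{outputs:28s} {framework:10s} mean={avg:5.1f}%  spread={spread:4.1f}%")
```

With only three runs per configuration this is a rough indication rather than a statistical test, but both frameworks average several points worse on the newer Workflow outputs.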

@benfred @rjzamora @jperez999 for visibility


Top GitHub Comments

rdipietro commented, Aug 6, 2020

👍 reproduced on my end. Now results in stable convergence + good final RMSPEs

benfred commented, Aug 5, 2020

Can you use git bisect to figure out where this started to break?
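
One way to automate such a bisect is `git bisect run`, which marks a commit good when the given command exits with 0, bad when it exits with 1 to 127 (except 125), and skips the commit on 125. The sketch below only covers the pass/fail decision. The function name `bisect_exit_code` and the 20% threshold are assumptions (the threshold was picked because it separates the TensorFlow RMSPEs reported above, 17.4% to 18.9% versus 22.3% to 25.0%), and how the RMSPE is obtained at each commit is left open:

```python
# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_exit_code(rmspe, threshold=0.20):
    """Map a final RMSPE (as a fraction) to a `git bisect run` exit code.

    rmspe:     final validation RMSPE for the commit under test, or None
               if the notebooks could not be run at this commit.
    threshold: assumed cut-off between "good" and "bad" runs.
    """
    if rmspe is None:
        return SKIP  # commit cannot be tested; let bisect skip it
    return GOOD if rmspe <= threshold else BAD
```

A hypothetical wrapper script would rerun both the preprocessing and the training notebook at each commit (the Workflow outputs are the suspected culprit, so they must be regenerated), pass the resulting RMSPE to `bisect_exit_code`, and call `sys.exit` with the result. The bisect itself would start from the two commits named in this issue, `git bisect start 7935f7e 7407cfd`, with 7935f7e as the bad commit and 7407cfd as the good one. Given the run-to-run noise, averaging a few runs per commit may be needed before trusting the verdict.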
