[RMP] Make losses, metrics, masking, and negative sampling configurable from `model.compile()`


During our paper research, we expected that implementing models from papers would give results that beat simple baselines. That turned out not to be true, so we need to figure out why. Some of the experiments we wanted to run to improve performance aren’t supported by the current state of the library, so we’d like to make API changes that support the effort to get model performance past standard baselines. We’ve committed to giving a KDD tutorial in August, where we want to show working models with good performance, so now is the time to make these changes.

Prototype PR for exploring design decisions:

https://github.com/NVIDIA-Merlin/models/pull/368

Smaller pieces to do individually (note: these dev tasks will be moved to separate roadmap items and planned in future releases):

Migrate toward the standard Keras API
- Make it possible to set metrics and losses from `model.compile()` (PR #431)
- Errors introduced by PR #431:
  - https://github.com/NVIDIA-Merlin/models/issues/491 (Fixed by #512)
  - https://github.com/NVIDIA-Merlin/models/issues/495 (Fixed by #512)
  - https://github.com/NVIDIA-Merlin/models/issues/496
Simplify the design of negative sampling
- Make it more explicit what gets called when and what’s happening
- Classes renamed to make the concepts clearer
Make negative sampling work for ranking models
- Allow putting item ids on the queue (instead of item embeddings)
Make it possible to save a full two-tower model (instead of just the individual towers)
- Include the queue state
- Include the negative sampling strategy
Improve the design of masking
- Allow users to specify which features to mask (instead of assuming only the item id)
- Improve how the masking is passed through the model (from the inputs to the targets)
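
To make the queue items above concrete, here is a minimal sketch of a FIFO queue that stores item ids rather than item embeddings. It is only an illustration under stated assumptions: `ItemIdQueue`, `enqueue`, `sample_negatives`, and `state` are hypothetical names, not part of the Merlin Models API, and the real design may differ. The idea it illustrates is that a queue of plain ids can be sampled for negatives and its contents can be exported alongside a saved model.

```python
import random
from collections import deque


class ItemIdQueue:
    """Hypothetical FIFO queue of recently seen item ids (not a Merlin Models class)."""

    def __init__(self, capacity):
        # deque(maxlen=...) evicts the oldest ids automatically once full.
        self._ids = deque(maxlen=capacity)

    def enqueue(self, item_ids):
        """Add the item ids from the latest batch to the queue."""
        self._ids.extend(item_ids)

    def sample_negatives(self, positive_id, num_negatives, rng=None):
        """Sample up to num_negatives queued ids, excluding the positive item."""
        rng = rng or random
        candidates = [i for i in self._ids if i != positive_id]
        k = min(num_negatives, len(candidates))
        return rng.sample(candidates, k)

    def state(self):
        """Return the queue contents as a plain list, e.g. to save with a model."""
        return list(self._ids)


queue = ItemIdQueue(capacity=4)
queue.enqueue([10, 11, 12])
queue.enqueue([13, 14])  # capacity is 4, so the oldest id (10) is evicted
negatives = queue.sample_negatives(
    positive_id=12, num_negatives=2, rng=random.Random(0)
)
```

Storing ids keeps the queue independent of any particular tower's output, which is one plausible reason it would suit ranking models. The `state()` method only hints at what "include the queue state" could mean when saving a full two-tower model.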

Related PRs:


Top GitHub Comments


marcromeyn commented, May 16, 2022

Current progress:

[Screenshot: CleanShot 2022-05-16 at 18 39 06]


viswa-nvidia commented, Jul 5, 2022

Simplify the design of negative sampling
- Make it more explicit what gets called when and what’s happening
- Classes renamed to make the concepts clearer
Make negative sampling work for ranking models
- Allow putting item ids on the queue (instead of item embeddings)
Make it possible to save a full two-tower model (instead of just the individual towers)
- Include the queue state
- Include the negative sampling strategy

The above tasks will move to the negative sampling ticket https://github.com/NVIDIA-Merlin/Merlin/issues/413

The following work requires its own roadmap ticket, which Marc will create. It is a dependency for session-based recommenders, and the rough plan is to visit it after RecSys. Design discussions will happen prior to RecSys.

Improve the design of masking
- Allow users to specify which features to mask (instead of assuming only the item id)
- Improve how the masking is passed through the model (from the inputs to the targets)