[RMP] Make losses, metrics, masking, and negative sampling configurable from `model.compile()`
During our paper research, we expected that implementing models from papers would give results that beat simple baselines. That didn’t turn out to be true, so we need to figure out why. Some of the experiments we wanted to run to improve performance aren’t supported by the current state of the library, so we’d like to make some API changes that support the effort to get model performance past standard baselines. We’re now committed to doing a KDD tutorial in August where we want to show working models with good performance, so now is the time to make these changes.
Prototype PR for exploring design decisions:
Smaller pieces to do individually (note: these dev tasks will be moved to separate roadmap items and planned in future releases):
- Migrate toward the standard Keras API
- Simplify the design of negative sampling
  - Make it more explicit what gets called when and what’s happening
  - Rename classes to make the concepts clearer
  - Make negative sampling work for ranking models
  - Allow putting item ids on the queue (instead of item embeddings)
- Make it possible to save a full two-tower model (instead of just the individual towers)
  - Include the queue state
  - Include the negative sampling strategy
- Improve the design of masking
  - Allow users to specify which feature they want to mask (instead of assuming only the item id)
  - Improve how the masking is passed through the model (from the inputs to the targets)
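To make the two masking items concrete, here is a minimal sketch in plain Python of masking a user-chosen feature and carrying the same mask through to the targets. It is only an illustration under stated assumptions: `mask_feature`, `MASK_TOKEN`, and the dict-of-lists batch layout are hypothetical and are not the library's API.

```python
import random

MASK_TOKEN = 0  # assumed reserved id for masked positions (hypothetical choice)


def mask_feature(batch, feature_name, mask_prob, rng):
    """Mask one named feature in a batch of sequences.

    `batch` maps feature names to lists of sequences. Returns
    (masked_inputs, targets, mask), where targets hold the original value
    at masked positions and None elsewhere.
    """
    # Copy every sequence so the caller's batch is left untouched
    masked_inputs = {name: [list(seq) for seq in seqs] for name, seqs in batch.items()}
    targets, mask = [], []
    for row, seq in enumerate(batch[feature_name]):
        # Draw one independent masking decision per position
        row_mask = [rng.random() < mask_prob for _ in seq]
        for pos, is_masked in enumerate(row_mask):
            if is_masked:
                masked_inputs[feature_name][row][pos] = MASK_TOKEN
        # The same mask defines the targets, so inputs and targets stay aligned
        targets.append([value if m else None for value, m in zip(seq, row_mask)])
        mask.append(row_mask)
    return masked_inputs, targets, mask


batch = {
    "item_id": [[5, 8, 2], [7, 3, 9]],
    "category": [[1, 1, 4], [2, 6, 6]],
}
# Mask "category" rather than assuming the item id is the masked feature
inputs, targets, mask = mask_feature(batch, "category", mask_prob=0.5, rng=random.Random(0))
```

Returning the mask alongside the targets is one way to keep the masked positions explicit between the input side and the loss; the eventual design may pass this information differently.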
Related PRs:
Top GitHub Comments
Current progress:
- Simplify the design of negative sampling
  - Make it more explicit what gets called when and what’s happening
  - Rename classes to make the concepts clearer
  - Make negative sampling work for ranking models
  - Allow putting item ids on the queue (instead of item embeddings)
- Make it possible to save a full two-tower model (instead of just the individual towers)
  - Include the queue state
  - Include the negative sampling strategy
The above tasks will move to the negative sampling ticket https://github.com/NVIDIA-Merlin/Merlin/issues/413
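As a rough illustration of the "item ids on the queue" and "include the queue state" ideas, the sketch below keeps a fixed-size FIFO of item ids, samples negatives from it, and exposes its contents as a plain dict that could be saved next to a model. The class `ItemIdQueue` and all of its methods are hypothetical names for this example, not the library's implementation.

```python
import random
from collections import deque


class ItemIdQueue:
    """FIFO buffer of recently seen item ids, used as a pool of negatives."""

    def __init__(self, capacity):
        self.capacity = capacity
        # A deque with maxlen evicts the oldest ids once capacity is reached
        self._ids = deque(maxlen=capacity)

    def enqueue(self, batch_item_ids):
        # Store ids only; embeddings would be looked up at training time
        self._ids.extend(batch_item_ids)

    def sample_negatives(self, positive_id, num_negatives, rng=random):
        # Exclude the positive so it is never drawn as its own negative
        candidates = [i for i in self._ids if i != positive_id]
        k = min(num_negatives, len(candidates))
        return rng.sample(candidates, k)

    def get_state(self):
        # Plain dict so the queue contents can be serialized with the model
        return {"capacity": self.capacity, "ids": list(self._ids)}

    @classmethod
    def from_state(cls, state):
        queue = cls(state["capacity"])
        queue.enqueue(state["ids"])
        return queue


queue = ItemIdQueue(capacity=4)
queue.enqueue([10, 11, 12])
queue.enqueue([13, 14])  # capacity is 4, so id 10 is evicted
negatives = queue.sample_negatives(positive_id=12, num_negatives=2)
restored = ItemIdQueue.from_state(queue.get_state())
```

Storing ids instead of embeddings means queued entries do not go stale as the item tower's weights change, at the cost of an extra embedding lookup when the negatives are used.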
The following work requires its own roadmap ticket, which Marc will create. This is a dependency for session-based recommenders, and the rough plan is to visit this after RecSys. Design discussions will happen prior to RecSys.
- Improve the design of masking
  - Allow users to specify which feature they want to mask (instead of assuming only the item id)
  - Improve how the masking is passed through the model (from the inputs to the targets)