Matchbox recommender message 7


Hi everyone,

I’ve been looking for where message 7 (m⁎->s, as labelled in the factor graph of the Matchbox paper [1]) is implemented in the Infer.NET code, but I can’t seem to find it. As a study exercise, @glandfried and I have been trying to implement the collaborative filtering version of the Matchbox recommender algorithm, as described in the paper. We have been able to implement almost everything. However, our implementation of the approximate message 7 differs greatly from the exact version of the message, which is bimodal. In contrast, we had no issues with the approximate message 2.

If anyone could help me find the Infer.NET implementation of the message so we can compare, I would appreciate it. So far I could only find approximate message 2 (m⁎->z) at infer/src/Runtime/Factors/Product.cs#ProductAverageLogarithm.

As a reminder, here is the factor graph,

[Image: factorMatchbox]

and the approximation equations (approximate messages 2 and 7, respectively) from the original Matchbox paper,

[Image: factorMatchbox]

which I interpret as

[Image: factorMatchbox]

Here $\mu_t$ and $\sigma_t^2$ denote the mean and variance of the (Gaussian) marginal distribution $p(t)$, and $\mu_{z \to \ast}$ and $\sigma_{z \to \ast}^2$ denote the mean and variance of message 6.

It would also be nice to get a hint to derive the approximations for 2 and 7 on our own (or a reference).

Thanks in advance!

[1] Matchbox Paper: https://www.microsoft.com/en-us/research/publication/matchbox-large-scale-bayesian-recommendations/

Issue Analytics

  • State: open
  • Created: 9 months ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

tminka commented, Dec 30, 2022

Message (7) is implemented in infer/src/Runtime/Factors/Product.cs#AAverageLogarithm. Messages (2) and (7) come from Variational Message Passing section 4.2. A concise form of those messages can be found in Table 1 of Gates: A Graphical Notation for Mixture Models.
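
As a rough illustration of the variational message passing rule referenced above, the sketch below computes the Gaussian message to s for a product factor z = s·t as the exponentiated expected log-factor, and checks the closed form against numerical integration. This is a sketch under assumptions, not a transcription of Infer.NET's code: the function names `vmp_message_to_s` and `expected_log_factor` and all numeric values are hypothetical, and message 6 is assumed to be a Gaussian with mean `mu_z` and variance `var_z`.

```python
import math

def vmp_message_to_s(mu_z, var_z, mu_t, var_t):
    """Closed-form VMP message to s for z = s * t (assumed setup).

    (mu_z, var_z): Gaussian message arriving from z (message 6).
    (mu_t, var_t): Gaussian marginal of t.
    Returns (mean, variance) of the Gaussian message to s.
    """
    second_moment_t = var_t + mu_t ** 2            # E[t^2]
    precision = second_moment_t / var_z
    mean_times_precision = mu_z * mu_t / var_z
    return mean_times_precision / precision, 1.0 / precision

def expected_log_factor(s, mu_z, var_z, mu_t, var_t, n=4001, width=10.0):
    """E_t[log N(s*t; mu_z, var_z)] by the trapezoidal rule over t."""
    sd_t = math.sqrt(var_t)
    lo = mu_t - width * sd_t
    h = 2.0 * width * sd_t / (n - 1)
    total = 0.0
    for i in range(n):
        t = lo + i * h
        p_t = math.exp(-0.5 * (t - mu_t) ** 2 / var_t) / math.sqrt(2 * math.pi * var_t)
        log_f = -0.5 * (s * t - mu_z) ** 2 / var_z - 0.5 * math.log(2 * math.pi * var_z)
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * p_t * log_f * h
    return total

# Hypothetical numbers, chosen only for illustration.
mu_z, var_z, mu_t, var_t = 1.0, 0.5, 0.3, 2.0
mean, var = vmp_message_to_s(mu_z, var_z, mu_t, var_t)

# The expected log-factor is quadratic in s, so three points determine it.
g_m, g_0, g_p = (expected_log_factor(s, mu_z, var_z, mu_t, var_t) for s in (-1.0, 0.0, 1.0))
curvature = g_p - 2.0 * g_0 + g_m      # equals -precision of the message
slope = (g_p - g_m) / 2.0              # equals mean * precision of the message
print("closed form:", mean, var)
print("numerical:  ", slope / -curvature, -1.0 / curvature)
```

The two printed lines should agree closely, which is a quick way to check a from-scratch implementation of the approximate message against the expected-log definition.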

glandfried commented, Aug 17, 2023

Our original goal was to implement the Matchbox model from scratch, as an exercise to learn as much as possible from the methods you created. Matchbox is particularly interesting because it offers an analytical approximation (messages 2 and 7 in the paper) to a common problem: the product of two Gaussian variables. While implementing it, we found that the approximate message 7 proposed in the Matchbox paper does not minimize the reverse KL divergence with respect to the exact message 7. Since we couldn’t find an error in our calculations, we decided to examine your original code. Thanks to your initial response, we were able to verify that our exact message 7 is correct: when we apply the AverageLog to it, we obtain exactly the approximate Gaussian proposed in the Matchbox paper. The question now is: why does the approximate message 7 proposed in the paper not minimize the reverse KL divergence (at least according to our calculations)?

Exact analytic message

The following is the exact analytic message.

[Image: exact_message_7]

Each evaluation of message 7 is an integral of the joint distribution below the isoline defined by s_k. In the following images, we show the joint distribution with four isolines on the left, and the corresponding areas below those isolines on the right.

[Image: individual_messages_7]

Collectively, these integrals form the likelihood that the exact message 7 sends to the latent variable $s_k$, which naturally exhibits two peaks and two very long tails.
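
One way to reproduce such a bimodal, heavy-tailed shape is sketched below. It assumes a single product factor z = s·t with a Gaussian message on z (mean `mu_z`, variance `var_z`) and a Gaussian marginal on t; under that assumption, integrating out t gives N(mu_z; s·mu_t, var_z + s²·var_t) as a function of s. The function names and the numeric values are hypothetical, and the closed form is cross-checked against plain numerical integration rather than taken from the image above.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def exact_message_to_s(s, mu_z, var_z, mu_t, var_t):
    """Closed form of the integral over t of N(s*t; mu_z, var_z) * N(t; mu_t, var_t)."""
    return normal_pdf(mu_z, s * mu_t, var_z + s ** 2 * var_t)

def exact_message_numeric(s, mu_z, var_z, mu_t, var_t, n=20001, width=10.0):
    """The same integral by the trapezoidal rule, as a cross-check."""
    sd_t = math.sqrt(var_t)
    lo = mu_t - width * sd_t
    h = 2.0 * width * sd_t / (n - 1)
    total = 0.0
    for i in range(n):
        t = lo + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * normal_pdf(s * t, mu_z, var_z) * normal_pdf(t, mu_t, var_t) * h
    return total

# Hypothetical numbers, chosen only for illustration.
mu_z, var_z, mu_t, var_t = 2.0, 0.25, 0.2, 1.0
for s in (-20.0, -2.0, 0.0, 2.0, 20.0):
    closed = exact_message_to_s(s, mu_z, var_z, mu_t, var_t)
    numeric = exact_message_numeric(s, mu_z, var_z, mu_t, var_t)
    print(s, closed, numeric)
```

With these made-up parameters the function is larger near s = ±2 than at s = 0, and for large |s| it falls off roughly like 1/|s|, so it cannot be normalized over the whole real line.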

The reverse KL divergence

To evaluate the reverse KL divergence, we implemented a simple numerical integration. For example, when approximating a mixture of Gaussians with a unimodal Gaussian using the reverse KL divergence, we obtained the following result: two local minima, one for each peak, and a single global minimum at the highest peak.

[Images: (a) min-KL-mix-min, (b) min-KL-mix-image]
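
A minimal sketch of this kind of experiment is given below. It is not the authors' code: the mixture weights and component locations, the grid, and the function names are all hypothetical. It evaluates KL(q || p) for a Gaussian q on a grid of means and standard deviations using the trapezoidal rule and reports the grid point with the smallest divergence.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def mixture_pdf(x):
    # Hypothetical two-component mixture; the right-hand peak is the higher one.
    return 0.3 * normal_pdf(x, -3.0, 1.0) + 0.7 * normal_pdf(x, 3.0, 1.0)

def reverse_kl(mu, sd, target_pdf, lo=-20.0, hi=20.0, n=1601):
    """KL(q || p) for q = N(mu, sd^2), by the trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        q = normal_pdf(x, mu, sd)
        if q > 0.0:  # points where q underflows to zero contribute nothing
            w = 0.5 if i in (0, n - 1) else 1.0
            total += w * q * (math.log(q) - math.log(target_pdf(x))) * h
    return total

# Grid over the mean (-6 to 6) and the standard deviation (0.5 to 3).
candidates = [(-6.0 + 0.5 * i, 0.5 + 0.25 * j) for i in range(25) for j in range(11)]
best_kl, best_mu, best_sd = min(
    (reverse_kl(mu, sd, mixture_pdf), mu, sd) for mu, sd in candidates
)
print("grid minimum:", best_mu, best_sd, best_kl)
print("at the lower peak:", reverse_kl(-3.0, 1.0, mixture_pdf))
```

A grid search is crude but makes the local and global minima easy to inspect; a gradient-based optimizer started near each peak would be a natural refinement.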

Reverse KL divergence between the approximate and the exact messages

The exact message 7 has a similar structure. However, unlike the mixture of Gaussians, its tails are extremely wide, which forces the reverse KL minimization to find a very different compromise from the one in the mixture example.

KL divergences for a range of mu and sigma: [Image: min-KL-MB-mins-divergence-image]
Best approximation: [Image: min-KL-MB-mins]
Divergence evaluated point by point: [Image: min-KL-MB-mins-divergence]

It seems that the width of the tails has a more significant impact than the two peaks, which, ultimately, are not that far apart.

[Image: min-KL-MB-proposed]
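
The comparison described in this section can be sketched as follows, under stated assumptions: a single product factor z = s·t, the exact message taken as N(mu_z; s·mu_t, var_z + s²·var_t), and the approximate message taken as the Gaussian with precision E[t²]/var_z and mean mu_z·mu_t/E[t²]. Because the exact message under these assumptions is not normalizable, the objective below is the reverse KL only up to an additive constant and is truncated to a finite interval. All names and numbers are hypothetical.

```python
import math

def log_exact_message(s, mu_z, var_z, mu_t, var_t):
    """Log of N(mu_z; s*mu_t, var_z + s^2*var_t), the assumed exact message."""
    v = var_z + s * s * var_t
    return -0.5 * math.log(2 * math.pi * v) - 0.5 * (mu_z - s * mu_t) ** 2 / v

def objective(mu, sd, log_target, lo=-25.0, hi=25.0, n=1001):
    """Integral of q * (log q - log target) on [lo, hi] for q = N(mu, sd^2)."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        s = lo + i * h
        log_q = -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((s - mu) / sd) ** 2
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * math.exp(log_q) * (log_q - log_target(s)) * h
    return total

# Hypothetical numbers, chosen only for illustration.
mu_z, var_z, mu_t, var_t = 2.0, 0.25, 0.2, 1.0

def log_target(s):
    return log_exact_message(s, mu_z, var_z, mu_t, var_t)

# Gaussian from the expected-log (VMP-style) rule.
second_moment_t = var_t + mu_t ** 2
vmp_mean = mu_z * mu_t / second_moment_t
vmp_sd = math.sqrt(var_z / second_moment_t)
j_vmp = objective(vmp_mean, vmp_sd, log_target)

# Gaussian found by a coarse grid search over mean and standard deviation.
candidates = [(-4.0 + 0.5 * i, 0.5 + 0.5 * j) for i in range(21) for j in range(8)]
best_j, best_mu, best_sd = min(
    (objective(mu, sd, log_target), mu, sd) for mu, sd in candidates
)
print("expected-log Gaussian:", vmp_mean, vmp_sd, j_vmp)
print("grid-search Gaussian: ", best_mu, best_sd, best_j)
```

Comparing the two printed objective values is one way to check, for a given set of parameters, whether the expected-log Gaussian coincides with the minimizer of this truncated reverse KL objective.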

Thanks for everything

We provided this explanation because we are genuinely interested in understanding the details of the valuable methodology that you have developed. We will always be indebted to you. Thank you very much, @tminka.
