Matchbox recommender message 7
Hi everyone,
I’ve been looking for where message 7 ($m_{*\to s}$, as labelled in the factor graph of the Matchbox paper [1]) is implemented in the Infer.NET code, but I can’t seem to find it. As a study exercise, @glandfried and I have been trying to implement the collaborative-filtering version of the Matchbox recommender algorithm, as described in the Matchbox paper. We have been able to implement almost everything. However, our implementation of the approximate message 7 differs greatly from the exact version of the message, which is bimodal. In contrast, we had no issues with the approximate message 2.
If anyone could help me find the Infer.NET implementation of the message so we can compare, I would appreciate it. So far I could only find the approximate message 2 ($m_{*\to z}$), at infer/src/Runtime/Factors/Product.cs#ProductAverageLogarithm.
As a reminder, I copy here the factor graph and the approximation equations (approximate messages 2 and 7, respectively) from the original Matchbox paper. In the notation below, $\mu_t$ and $\sigma_t^2$ denote the mean and variance of the (Gaussian) marginal distribution $p(t)$, with $\mu_s$ and $\sigma_s^2$ defined analogously, and $\mu_{z\to *}$ and $\sigma_{z\to *}^2$ denote the mean and variance of message 6. I interpret the two approximations as follows:
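(I transcribe the equations by hand here in case the images do not render; as far as we can tell these should coincide with the standard VMP messages for a product factor, but please flag any transcription slip.)

$$
m_{*\to z}(z) \;\approx\; \mathcal{N}\!\left(z;\;\; \mu_s\,\mu_t,\;\; \mu_s^2\,\sigma_t^2 + \mu_t^2\,\sigma_s^2 + \sigma_s^2\,\sigma_t^2\right)
$$

$$
m_{*\to s}(s) \;\approx\; \mathcal{N}\!\left(s;\;\; \frac{\mu_t\,\mu_{z\to *}}{\mu_t^2 + \sigma_t^2},\;\; \frac{\sigma_{z\to *}^2}{\mu_t^2 + \sigma_t^2}\right)
$$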
It would also be nice to get a hint on how to derive approximations 2 and 7 ourselves (or a reference).
Thanks in advance!
[1] Stern, D., Herbrich, R., Graepel, T., “Matchbox: Large Scale Online Bayesian Recommendations,” WWW 2009. https://www.microsoft.com/en-us/research/publication/matchbox-large-scale-bayesian-recommendations/
Top GitHub Comments
Message (7) is implemented in infer/src/Runtime/Factors/Product.cs#AAverageLogarithm. Messages (2) and (7) come from Variational Message Passing section 4.2. A concise form of those messages can be found in Table 1 of Gates: A Graphical Notation for Mixture Models.
Our original goal was to implement the Matchbox model from scratch, as an exercise to learn as much as possible from the methods you created. Matchbox is particularly interesting because it offers an analytical approximation (the proposed messages 2 and 7 in the paper) to a common problem: handling the product of two Gaussian variables.

The issue is that during the implementation we found that the approximate message 7 proposed by the Matchbox paper does not minimize the reverse KL divergence with respect to the exact message 7. Since we couldn’t find an error in our calculations, we decided to examine your original code. Thanks to your initial response, we were able to verify that the exact message 7 we calculated was correct: when we apply the average-log (VMP) projection to it, we obtain exactly the approximate Gaussian proposed in the Matchbox paper. The question now is: why does the approximate message 7 proposed by the paper not minimize the reverse KL divergence (at least according to our calculations)?
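For anyone following along, here is the derivation as we understand it (our own sketch, reconstructed from section 4.2 of Variational Message Passing, so take it with a grain of salt). Treating message 6 as a Gaussian pseudo-likelihood on $z = s\,t$, and taking the expectation of its logarithm over $t$ before exponentiating, gives

$$
m_{*\to s}(s) \;\propto\; \exp\Big( \mathbb{E}_{t}\big[ \log \mathcal{N}(\mu_{z\to *};\; s\,t,\; \sigma^2_{z\to *}) \big] \Big) \;\propto\; \exp\Big( \frac{\mu_{z\to *}\,\mu_t}{\sigma^2_{z\to *}}\, s \;-\; \frac{1}{2}\,\frac{\mu_t^2 + \sigma_t^2}{\sigma^2_{z\to *}}\, s^2 \Big),
$$

which is Gaussian in $s$ with precision $(\mu_t^2+\sigma_t^2)/\sigma^2_{z\to *}$ and mean $\mu_t\,\mu_{z\to *}/(\mu_t^2+\sigma_t^2)$, i.e. exactly the approximate message 7 from the paper. By contrast, the exact message takes the expectation of the density itself rather than of its logarithm, so the two need not agree. If this reading is right, the paper’s message is the VMP projection, not the reverse-KL projection of the exact message, which would be consistent with the discrepancy we observe.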
Exact analytic message
The following is the exact analytic message.
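(The image may not reproduce here, so for completeness, this is the closed form we work with. Integrating $t$ out of the product of message 6 and the marginal of $t$, under the constraint $z = s\,t$, gives

$$
m_{*\to s}(s) \;=\; \int \mathcal{N}(\mu_{z\to *};\; s\,t,\; \sigma^2_{z\to *})\; \mathcal{N}(t;\; \mu_t,\; \sigma_t^2)\; dt \;=\; \mathcal{N}\!\big(\mu_{z\to *};\;\; s\,\mu_t,\;\; \sigma^2_{z\to *} + s^2\,\sigma_t^2\big),
$$

read as a function of $s$.)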
Each value of message 7 at a point $s_k$ is an integral of the joint distribution under the isoline defined by $s_k$. In the following images, we show the joint distribution with four isolines on the left, and the corresponding areas under those isolines on the right.
Taken together over all $s_k$, these integrals make up the exact message 7 sent to the latent variable $s_k$: a likelihood which naturally exhibits two peaks and two very long tails.
The reverse KL divergence
To evaluate the reverse KL divergence, we implemented a simple numerical integration. For example, when approximating a mixture of Gaussians with a unimodal Gaussian under reverse KL divergence, we obtained the following result: there are two local minima, one for each peak, and a single global minimum, corresponding to the highest peak.
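For reference, here is a minimal sketch of the kind of numerical check we ran (illustrative only; the mixture weights, means, and grid sizes are arbitrary example values, not the ones from our actual experiment):

```python
import numpy as np

# Grid for simple quadrature
x = np.linspace(-15.0, 15.0, 4001)
dx = x[1] - x[0]

def gauss(x, m, v):
    """Gaussian density N(x; mean m, variance v)."""
    return np.exp(-0.5 * (x - m) ** 2 / v) / np.sqrt(2.0 * np.pi * v)

# Target p: a two-component Gaussian mixture (arbitrary example values)
p = 0.6 * gauss(x, -2.0, 1.0) + 0.4 * gauss(x, 3.0, 1.0)

def reverse_kl(m, v):
    """KL(q || p) for q = N(m, v), by quadrature on the grid."""
    q = gauss(x, m, v)
    mask = q > 1e-300  # avoid 0 * log(0)
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])) * dx)

# Scan candidate means at a fixed variance: reverse KL shows one local
# minimum per mode of p, with the global minimum at the taller mode.
means = np.linspace(-6.0, 6.0, 241)
kls = [reverse_kl(m, 1.0) for m in means]
print("KL-optimal mean:", means[int(np.argmin(kls))])
```

We fix the variance here only to make the one-dimensional picture; the same scan can be done jointly over mean and variance.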
Reverse KL divergence between the approximate and the exact messages
The exact message 7 has a similar structure. However, unlike the mixture of Gaussians, its tails are extremely wide, which forces the reverse-KL-minimizing approximation to strike a very different compromise than in the mixture example.
It seems that the width of the tails has a greater impact than the two peaks, which are ultimately not that far apart.
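(A quick sanity check of the tail claim from the closed form above: for large $|s|$, $\sigma^2_{z\to *} + s^2\sigma_t^2 \approx s^2\sigma_t^2$, so $m_{*\to s}(s) \to \frac{1}{|s|\,\sigma_t\sqrt{2\pi}} \exp\!\big(-\mu_t^2/(2\sigma_t^2)\big)$; that is, the message decays only like $1/|s|$, far more slowly than any Gaussian. Assuming our derivation is right, this plausibly explains why the tail behaviour, rather than the two peaks, dominates the compromise that reverse-KL minimization finds.)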
Thanks for everything
We wrote up this explanation because we are genuinely interested in understanding the details of the valuable methodology you have developed. We will always be indebted to you. Thanks again, @tminka!