
How to Recognize if your data is fit for (this library's) Customer Segmentation and CLV?


I’m trying to use this library with my data, which is real data covering more than 500,000 customers. However, I’m getting weird behavior when fitting the distributions. More specifically, my alpha is way too large (approaching 320 without any penalty). If I apply an L2 penalty of 0.5, I get a much smaller alpha (still very, very big), but a and b become frighteningly small:

parameter	coef	se(coef)	lower 95% bound	upper 95% bound
r	0.628972	0.000931	0.627148	0.630795
alpha	57.421640	0.148712	57.130166	57.713115
a	0.002099	0.000076	0.001950	0.002247
b	0.062845	0.001426	0.060050	0.065641

What is the intuition behind r, alpha, a and b? If you think these questions are not suitable for a discussion on GitHub, feel free to point me to another forum.
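As a rough guide to that question, here is a minimal sketch that assumes the four parameters are those of the BG/NBD model (the post does not name the model, so this is an assumption). Under that model, r and alpha are the shape and rate of a gamma distribution over customers' purchase rates, and a and b are the parameters of a beta distribution over the probability that a customer drops out after a purchase. The variable names below are illustrative only.

```python
# Point estimates copied from the parameter table above.
r, alpha, a, b = 0.628972, 57.421640, 0.002099, 0.062845

# Assumption: BG/NBD parameterisation.
# Purchase rates across customers ~ Gamma(shape=r, rate=alpha),
# so the average purchase rate is r / alpha (per time unit of T).
mean_purchase_rate = r / alpha

# Dropout probability after each purchase ~ Beta(a, b),
# so the average dropout probability is a / (a + b).
mean_dropout_prob = a / (a + b)

print(f"mean purchase rate per time unit: {mean_purchase_rate:.6f}")
print(f"mean dropout probability per purchase: {mean_dropout_prob:.4f}")
```

Read this way, a large alpha relative to r means a low average purchase rate per time unit, and a small a relative to b means a low average chance of dropping out after any given purchase.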

I suspect that this problem comes from wrong assumptions with respect to the distribution of my RF variables (perhaps, too skewed a frequency distribution). However, I haven’t yet understood what those are specifically (are there any really?), since what I’ve found so far in the literature mostly discusses the assumptions of individual behavior (e.g. number of transactions of a customer follows a Poisson distribution). In my case, the histograms for x, t_x, T give me:

[Figure: rfm_lifetimes (histograms of x, t_x and T)]
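One way to put numbers on how skewed a frequency distribution is, sketched here with made-up values rather than the poster's data: tabulate the share of customers at each repeat-purchase count x and compare the variance with the mean. The helper name `frequency_profile` is hypothetical, and the variance-to-mean ratio is only a rough indicator, since customers observed for different lengths of time T are not directly comparable.

```python
from collections import Counter


def frequency_profile(frequencies):
    """Summarise repeat-purchase counts: share per value, mean, variance/mean."""
    n = len(frequencies)
    counts = Counter(frequencies)
    mean = sum(frequencies) / n
    variance = sum((x - mean) ** 2 for x in frequencies) / n
    shares = {x: counts[x] / n for x in sorted(counts)}
    dispersion = variance / mean if mean else float("nan")
    return {"shares": shares, "mean": mean, "dispersion": dispersion}


# Toy values, purely illustrative (NOT the data discussed in this thread).
toy_x = [0, 0, 0, 0, 1, 1, 2, 3, 5, 8]
profile = frequency_profile(toy_x)

for x, share in profile["shares"].items():
    print(f"x={x}: {share:.0%} of customers")
print("mean:", profile["mean"], "variance/mean:", profile["dispersion"])
```

A ratio well above 1 indicates more spread than a single shared Poisson rate would give, which is the kind of heterogeneity a gamma mixing distribution is meant to absorb.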

Needless to say, my model fitting is pretty bad:

[Figure: model_pred 1]

[Figure: model_pred 2]

Btw, I think you forgot to place a PayPal link on your main page so that we can donate money to you 😉


Top GitHub Comments


hmikelee commented, May 10, 2019

Dear Philippe,

I hope my comment below can help.

I tested lifetimes in February with three retail datasets that I found on the internet, before applying it to my company’s data. In two cases the forecasts were quite good, but in the third the forecast performance was similar to what you showed. In that one bad case, I noticed that the frequency graph did not exhibit an exponential-like distribution, so the forecast was not good:

[Figure: qt_fact_check_freq_440]

But it looks like your frequency data is not the same as the case I encountered. Could you share more detail about your use case and the background of your dataset?

In my case, when I used lifetimes with my company’s dataset, the forecast performance was good.

hmikelee


psygo commented, Jun 25, 2019

The changes suggested in the interesting #280 issue didn’t amount to much unfortunately, but thanks for the suggestion.

More specifically, how would you suggest I improve the fit shown in my plots? As far as I’m aware, the only parameter that could make a difference is the model’s L2 penalty, and varying it didn’t change much, especially in the last graph.

I’m starting to believe the issue might actually be related to my assumptions about the dataset. Besides the fact that this is a homologation dataset, these specific clients’ behavior will most likely differ from a strictly non-contractual one, because there is some periodicity to their purchases, since people do go to supermarkets on a regular basis. That seems inherently different from people who, for example, buy DVDs, which is something more sporadic.

That would more or less explain why the fitted model treats the clients as basically always alive. In the end, I might have to compare my results against simply using contractual formulas, which is something I’ve never studied T.T.
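The "always alive" reading can be checked with a short sketch, assuming the BG/NBD model and its published closed-form expression for the probability that a customer is still active. The function below is a standalone illustration written from that formula, not the library's own code, and the customer values (x, t_x, T) as well as the contrast parameters are hypothetical.

```python
def p_alive(x, t_x, T, r, alpha, a, b):
    """BG/NBD probability that a customer is still active.

    x: repeat purchases, t_x: time of last purchase, T: observation length
    (t_x and T in the same time units as alpha).
    """
    if x == 0:
        # With no repeat purchase the model has no evidence of dropout.
        return 1.0
    ratio = (alpha + T) / (alpha + t_x)
    return 1.0 / (1.0 + (a / (b + x - 1)) * ratio ** (r + x))


# Fitted values from the parameter table in the original post.
r, alpha, a, b = 0.628972, 57.421640, 0.002099, 0.062845

# Hypothetical customer: 5 repeat purchases, last one at t=30, observed until T=60.
p = p_alive(5, 30, 60, r, alpha, a, b)

# Hypothetical contrast: same customer, but with much larger a and b.
p_contrast = p_alive(5, 30, 60, r, alpha, a=1.0, b=2.5)

print(f"fitted parameters: {p:.4f}")
print(f"larger a and b:    {p_contrast:.4f}")
```

With a this small, the dropout term in the denominator is close to zero for almost any purchase history, so the model keeps P(alive) near 1 even for customers who have been silent for a while.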
