How to Recognize if your data is fit for (this library's) Customer Segmentation and CLV?


I’m trying to use this library with my data, which is real data covering more than 500,000 customers. However, I’m getting weird behavior when fitting the distributions. More specifically, my alpha is way too large (approaching 320 without any penalty). If I apply an L2 penalty of 0.5, I get a much smaller alpha (still very big), but a and b become frighteningly small:

parameter       coef   se(coef)   lower 95% bound   upper 95% bound
r           0.628972   0.000931          0.627148          0.630795
alpha      57.421640   0.148712         57.130166         57.713115
a           0.002099   0.000076          0.001950          0.002247
b           0.062845   0.001426          0.060050          0.065641

What is the intuition behind r, alpha, a and b? If you think these questions are not suitable for a discussion on GitHub, feel free to point me to another forum.
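For intuition, and assuming the model being fitted is BG/NBD (which is what the parameter names r, alpha, a and b suggest): r and alpha are the shape and rate of the gamma distribution describing how purchase rates vary across customers, and a and b are the parameters of the beta distribution describing how the probability of dropping out after a purchase varies across customers. A rough sketch of what the fitted values in the table imply; the helper name is hypothetical and not part of the library:

```python
def bgnbd_population_summary(r, alpha, a, b):
    """Summarise the heterogeneity distributions implied by BG/NBD parameters.

    Assumption: purchase rate lambda ~ Gamma(shape=r, rate=alpha) and
    dropout probability p ~ Beta(a, b), as in the BG/NBD model.
    """
    return {
        # Average purchases per time unit (the unit of T) while a customer is alive.
        "mean_purchase_rate": r / alpha,
        # Spread of purchase rates across customers.
        "var_purchase_rate": r / alpha ** 2,
        # Average probability of becoming inactive after any given purchase.
        "mean_dropout_prob": a / (a + b),
        # With both a < 1 and b < 1 the beta density is U-shaped: customers
        # cluster near "never drops out" and "drops out at once".
        "dropout_is_u_shaped": a < 1 and b < 1,
    }


# Values copied from the penalised fit reported in this issue.
summary = bgnbd_population_summary(r=0.628972, alpha=57.421640, a=0.002099, b=0.062845)
for name, value in summary.items():
    print(name, value)
```

With these numbers the mean purchase rate is roughly 0.011 purchases per time unit, and the mean dropout probability is about 3%, so the fitted model expects very few customers to ever become inactive.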

I suspect that this problem comes from wrong assumptions about the distribution of my recency/frequency (RF) variables (perhaps the frequency distribution is too skewed). However, I haven’t yet understood what those assumptions are specifically (are there really any?), since what I’ve found so far in the literature mostly discusses assumptions about individual behavior (e.g. the number of transactions of a customer follows a Poisson distribution). In my case, the histograms for x, t_x and T give me:

[Figure: rfm_lifetimes (histograms of x, t_x and T)]

Needless to say, my model fitting is pretty bad:

[Figure: model_pred 1]

[Figure: model_pred 2]

Btw, I think you forgot to put a PayPal link on your main page so that we can donate money to you 😉

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 12

Top GitHub Comments

1 reaction
hmikelee commented, May 10, 2019

Dear Philippe,

I hope my comment below can help.

I tested lifetimes in February with 3 retail-related datasets that I could find on the internet before applying it to my company’s data. In 2 cases the forecast performance was quite good, but in one case the forecast was about as poor as the one you showed. In that case, I noticed that the frequency graph did not exhibit an exponential-like distribution, so the forecast result was not good: [Figure: qt_fact_check_freq_440]

But it looks like your frequency data is not the same as in the case I encountered. Could you share more detail about your use case and the background of your dataset?

In my case, when I used lifetimes with my company’s dataset, the forecasts performed well.

hmikelee

0 reactions
psygo commented, Jun 25, 2019

Unfortunately, the changes suggested in the interesting issue #280 didn’t amount to much, but thanks for the suggestion.

More specifically, how would you suggest I improve the fit shown in my plots? As far as I’m aware, the only parameter that could make a difference is the model’s L2 penalty, and I couldn’t get much change by varying it, especially in the last graph.

I’m starting to believe the issue might actually be related to my assumptions about the dataset. Besides the fact that this is a homologation dataset, these specific clients’ behavior will most likely differ from strictly non-contractual behavior, because there is some periodicity to their purchases: people go to supermarkets on a regular basis. That seems inherently different from people who, for example, buy DVDs, which is something more sporadic.

That would more or less explain why the fitted model treats the clients as basically always alive. In the end, I might have to compare my results with simply using contractual formulas, which is something I’ve never studied T.T.
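One way to probe the periodicity concern: under the Poisson purchasing assumption, a customer's inter-purchase times are exponentially distributed, so their coefficient of variation (standard deviation divided by mean) should be close to 1, while markedly regular buyers show a value well below 1. A minimal sketch in plain Python; the function name is hypothetical and the purchase days are made up for illustration, not taken from the dataset discussed here:

```python
from statistics import mean, pstdev


def interpurchase_cv(purchase_days):
    """Coefficient of variation of the gaps between consecutive purchase days.

    Returns None when there are fewer than two gaps or the mean gap is zero,
    since the ratio is not meaningful in those cases.
    """
    days = sorted(purchase_days)
    gaps = [later - earlier for earlier, later in zip(days, days[1:])]
    if len(gaps) < 2:
        return None
    average_gap = mean(gaps)
    if average_gap == 0:
        return None
    return pstdev(gaps) / average_gap


# Made-up purchase histories (day offsets from the first purchase).
weekly_shopper = [0, 7, 14, 22, 28, 35, 43]    # near-regular visits
sporadic_buyer = [0, 2, 40, 41, 95, 180, 183]  # irregular visits

weekly_cv = interpurchase_cv(weekly_shopper)
sporadic_cv = interpurchase_cv(sporadic_buyer)
print("weekly shopper CV:", weekly_cv)
print("sporadic buyer CV:", sporadic_cv)
```

With only a handful of purchases per customer the estimate is noisy, so it is more informative to look at the distribution of this value across many customers than at any single one.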
