Transform with FAMD - Is it correct?


Hello,

I’d like to:

  1. Build a FAMD model from a dataframe
  2. Project a single record using this model

In the example below, I have a dataframe that I use to fit a model. I then pick a single row from the original dataframe and attempt to project it with the created FAMD model.

Why do I obtain different values between:

  • Projecting the whole dataframe, then selecting the first row
  • Selecting the first row of the dataframe, then projecting with the FAMD model

Is there something I missed?

Thanks!

Running code here: https://repl.it/repls/WearyMajorNature

import prince
import pandas as pd

df = pd.DataFrame({
    'variable_1': [4, 5, 6, 7, 11, 2, 52],
    'variable_2': [10, 20, 30, 40, 10, 74, 10],
    'variable_3': [100, 50, -30, -50, -19, -29, -20],
    'color': ['red', 'blue', 'green', 'blue', 'red', 'red', 'blue'],
})

model = prince.FAMD(
    n_components=df.shape[1],
    copy=True,
    check_input=True,
    engine='auto',
    random_state=1,
).fit(df)

print(model.row_coordinates(df))

# Let's say we want to transform a single row
row = pd.DataFrame(df.iloc[0]).transpose()

print(model.transform(row))

# Why is this transform very different from the projection of this same record in the first dataframe?

print(model.row_coordinates(df).iloc[0])

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
snthibaud commented, Apr 14, 2020

@MaxHalford I think it does not get much more minimal than this:

import pandas as pd
from prince import FAMD

famd = FAMD(n_components=1)
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df["A"] = df["A"].astype("category")
famd.fit(df)
print(famd.transform(df[0:1]))
print(famd.transform(df)[0:1])

Output:

     0
0 -0.5
          0
0 -1.414214

1 reaction
snthibaud commented, Apr 14, 2020

I also noticed that PCA and MCA both work well independently. Looking at the code, I think it might be related to this line -> https://github.com/MaxHalford/prince/blob/ba8a66b6575320832b118186745ecfd85c896bdc/prince/mfa.py#L98

There, the data is normalized using the statistics of whatever is passed to transform, but it should be normalized using the statistics of the fitted data.
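
To illustrate the suspected cause, here is a minimal sketch in plain NumPy. It does not reproduce prince's actual code: `Standardizer` and `renormalize` are hypothetical helpers written only for this example, and the numbers are the first four rows of the numeric columns from the question. It shows that reusing fit-time statistics makes "transform then select" agree with "select then transform", while recomputing the statistics on each batch does not.

```python
import numpy as np


class Standardizer:
    """Hypothetical helper (not part of prince): keeps the fit-time statistics."""

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0)
        return self

    def transform(self, X):
        # Reuse the statistics learned during fit, whatever the size of X.
        return (X - self.mean_) / self.std_


def renormalize(X):
    # Assumed bug pattern: statistics are recomputed from the batch being transformed.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # a single row has zero spread; avoid dividing by zero
    return (X - X.mean(axis=0)) / std


X = np.array([[4.0, 10.0, 100.0],
              [5.0, 20.0, 50.0],
              [6.0, 30.0, -30.0],
              [7.0, 40.0, -50.0]])

scaler = Standardizer().fit(X)

# With fitted statistics, both orders of operation give the same first row.
full_then_select = scaler.transform(X)[:1]
select_then_transform = scaler.transform(X[:1])
consistent = np.allclose(full_then_select, select_then_transform)

# With per-batch statistics, a single row is centered on itself and collapses to zeros.
batch_full_then_select = renormalize(X)[:1]
batch_select_then_transform = renormalize(X[:1])
inconsistent = not np.allclose(batch_full_then_select, batch_select_then_transform)

print(consistent, inconsistent)
```

If the linked line does behave like `renormalize`, this would explain why a single projected row differs from the same row projected as part of the full dataframe.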
