Reported performance on TransE differs significantly (correct hyperparameters used)
Description
Reported performance of TransE differs from the results published at https://docs.ampligraph.org/en/latest/experiments.html, even with the parameters given there, on the FB15K-237 dataset.
Actual Behavior
```
mr_score(ranks0)               232.22932772286916
mrr_score(ranks0)              0.23103557722066143
hits_at_n_score(ranks0, n=1)   0.10348370682062824
hits_at_n_score(ranks0, n=3)   0.29459829728936293
hits_at_n_score(ranks0, n=10)  0.4654320383599178
```
Expected Behavior
The expected results, together with the hyperparameters used, are posted at https://docs.ampligraph.org/en/latest/experiments.html.
Steps to Reproduce
```python
import numpy as np
from sklearn.metrics import brier_score_loss, log_loss
from ampligraph.datasets import load_fb15k_237
from ampligraph.latent_features.models import TransE
from ampligraph.utils import save_model
from ampligraph.evaluation import hits_at_n_score, mr_score, evaluate_performance, mrr_score

X = load_fb15k_237()

model = TransE(batches_count=64, seed=0, epochs=4000, k=400, eta=30,
               optimizer='adam', optimizer_params={'lr': 0.0001},
               loss='multiclass_nll',
               regularizer='LP', regularizer_params={'lambda': 0.0001, 'p': 2})

model.fit(X['train'])
save_model(model, model_name_path='transe_seed_0.pkl')

filter = np.concatenate((X['train'], X['valid'], X['test']))
ranks0 = evaluate_performance(X['test'], model, filter, verbose=False)

mr = mr_score(ranks0)
mrr = mrr_score(ranks0)
hits_1 = hits_at_n_score(ranks0, n=1)
hits_3 = hits_at_n_score(ranks0, n=3)
hits_10 = hits_at_n_score(ranks0, n=10)
```
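As a follow-up usage note: the script above saves the trained model with save_model, so the metrics can be recomputed later without retraining. The sketch below assumes restore_model from ampligraph.utils (the counterpart of save_model in AmpliGraph 1.x) and reuses the file name from the script.

```python
import numpy as np
from ampligraph.datasets import load_fb15k_237
from ampligraph.utils import restore_model  # assumed counterpart of save_model
from ampligraph.evaluation import evaluate_performance, mr_score, mrr_score, hits_at_n_score

# Reload the model saved by the reproduction script and recompute the filtered metrics.
X = load_fb15k_237()
model = restore_model(model_name_path='transe_seed_0.pkl')

filter_triples = np.concatenate((X['train'], X['valid'], X['test']))
ranks = evaluate_performance(X['test'], model, filter_triples=filter_triples, verbose=False)

print('MR:     ', mr_score(ranks))
print('MRR:    ', mrr_score(ranks))
print('Hits@10:', hits_at_n_score(ranks, n=10))
```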
Issue Analytics
- Created 2 years ago
- Comments: 7 (3 by maintainers)
Got it! Thanks so much for helping me out with such a detailed reply, and thanks a ton for the code!
It depends on your use case. If x_test is a set of made-up hypotheses, which may or may not be facts, then x_filter shouldn't contain x_test.
But if x_test is made up of known facts, then we must include it in the filter. This is what is commonly done in the KG community, and is the standard evaluation protocol described in Bordes et al.
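A minimal sketch of the two filtering setups described above, assuming the AmpliGraph 1.x evaluate_performance API and the variable names from the reproduction script:

```python
import numpy as np
from ampligraph.evaluation import evaluate_performance

# Case 1: x_test holds known facts (standard benchmark protocol, Bordes et al.).
# Include the test triples in the filter so that other true triples are not
# penalized during ranking.
filter_with_test = np.concatenate((X['train'], X['valid'], X['test']))
ranks = evaluate_performance(X['test'], model, filter_triples=filter_with_test, verbose=False)

# Case 2: x_test holds made-up hypotheses that may or may not be facts.
# Leave the test triples out of the filter.
filter_without_test = np.concatenate((X['train'], X['valid']))
ranks = evaluate_performance(X['test'], model, filter_triples=filter_without_test, verbose=False)
```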
What you have done above is correct: `'x_valid': X['valid'][::2]`.
You can also set it to X['valid'], but we didn't see much difference in performance. Each early stopping check takes a lot of time, so we reduce the validation set size just for speed.
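A hedged sketch of that early-stopping setup, assuming the early_stopping_params dictionary accepted by fit() in AmpliGraph 1.x (the burn_in/check_interval/stop_interval values below are illustrative, not the ones used for the reported results):

```python
# Early stopping on every other validation triple, for speed, as discussed above.
model.fit(X['train'],
          early_stopping=True,
          early_stopping_params={
              'x_valid': X['valid'][::2],  # reduced validation set; X['valid'] also works
              'criteria': 'mrr',           # metric monitored on the validation split
              'burn_in': 100,              # illustrative: epochs before the first check
              'check_interval': 50,        # illustrative: epochs between checks
              'stop_interval': 4,          # illustrative: non-improving checks before stopping
          })
```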
We include `X['test']` in the filter for the standard datasets, as the `X['test']` triples are known facts.