
reproducing the paper's best results


I’ve tried to replicate the paper. For bert-base-nli-mean-tokens, a model trained from scratch with your code reached 74.71 cosine-similarity on the sts-test set, which is far lower than the score reported in the paper. Any thoughts?
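
For reference, the setup in question looks roughly like this with the current sentence-transformers API. This is only a sketch with stubbed data: the API of the version discussed in this thread may differ, and a real run trains on the full SNLI + MultiNLI pairs and evaluates on the STS benchmark.

# Sketch of the bert-base-nli-mean-tokens recipe with the current
# sentence-transformers API; data loading is stubbed with placeholders.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, InputExample
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

word_embedding_model = models.Transformer('bert-base-uncased')
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Placeholder data: real runs use SNLI + MultiNLI pairs and STS scores normalized to [0, 1]
nli_examples = [InputExample(texts=['a premise', 'a hypothesis'], label=0)]
sts_examples = [InputExample(texts=['a sentence', 'another sentence'], label=0.8)]

train_dataloader = DataLoader(nli_examples, shuffle=True, batch_size=16)
train_loss = losses.SoftmaxLoss(model=model,
                                sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
                                num_labels=3)
evaluator = EmbeddingSimilarityEvaluator.from_input_examples(sts_examples, name='sts-test')

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
print(model.evaluate(evaluator))  # cosine-similarity correlation on the STS examples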

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
nreimers commented, Dec 4, 2019

Hi, the problem appeared when I updated huggingface pytorch-transformers to version 1.x (which came with version 2.0 of sentence-transformers): performance dropped, even though the setup was the same as before.

I did extensive debugging, even copying old code from huggingface, but sadly never found a way to fix it. Interestingly, when loading weights that were trained with the old huggingface code, the same performance was still achieved. So something must have changed in the training procedure of the huggingface code that leads to this inferior performance with version 1 of pytorch-transformers. Maybe the optimizer code is a bit different?
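
One documented difference between the optimizer implementations: the old BertAdam never applied Adam's bias correction, whereas the AdamW that replaced it does by default. Whether that explains the drop here is unconfirmed; the following is only a sketch of how the old behaviour can be approximated with the newer transformers API (the model and schedule lengths are placeholders):

# Sketch only: approximating the old BertAdam behaviour with AdamW.
# The model and step counts below are placeholders, not values from this repository.
import torch
from transformers import AdamW, get_linear_schedule_with_warmup

model = torch.nn.Linear(10, 10)           # placeholder for the BERT model being fine-tuned
warmup_steps, total_steps = 100, 1000     # placeholder schedule lengths

optimizer = AdamW(model.parameters(), lr=2e-5, correct_bias=False)  # BertAdam skipped bias correction
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=warmup_steps,
    num_training_steps=total_steps,
)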

I was not the only person affected by this; several people mentioned in the huggingface repo (see https://github.com/huggingface/transformers/issues/938) that they now achieve slightly worse performance. The reason is unclear.

I will soon be able to update pytorch-transformers to version 2. Maybe the issue is resolved in that version? Who knows.

If you would like to reproduce the old STS experiment scores, I recommend using an older version of this repository, one that uses the 0.x version of pytorch-transformers.

Best regards, Nils Reimers

0 reactions
nreimers commented, Dec 17, 2019

Hi @K-Mike, in the paper I used bert-as-a-service with mean pooling. Here is the code I used:

from __future__ import absolute_import, division, unicode_literals

import sys
import io
import numpy as np
import logging

import os
from bert_serving.client import BertClient

# Set PATHs
PATH_TO_SENTEVAL = '../'
PATH_TO_DATA = '../data'

# import SentEval
sys.path.insert(0, PATH_TO_SENTEVAL)
import senteval

# SentEval prepare and batcher
def prepare(params, samples):
    # No task-specific preparation needed; embeddings come from the BERT server
    pass

# Requires a running bert-serving-start server on localhost
bc = BertClient(ip='localhost', check_length=False)

def batcher(params, batch):
    # bert-as-a-service expects raw sentence strings, so rejoin SentEval's tokenized samples
    sentences = []
    for sample in batch:
        untoken = ' '.join(sample).lower()
        if untoken == '':
            untoken = '-'  # avoid sending an empty string to the server

        sentences.append(untoken)
    return bc.encode(sentences)


# Set params for SentEval
#params_senteval = {'task_path': PATH_TO_DATA, 'usepytorch': True, 'kfold': 5}
#params_senteval['classifier'] = {'nhid': 0, 'optim': 'rmsprop', 'batch_size': 128, 'tenacity': 3, 'epoch_size': 2}

# Parameters suggested by Readme & https://github.com/facebookresearch/SentEval/issues/43
params_senteval = {'task_path': PATH_TO_DATA, 'usepytorch': True, 'kfold': 10}
params_senteval['classifier'] = {'nhid': 0, 'optim': 'adam', 'batch_size': 64, 'tenacity': 5, 'epoch_size': 4}

# Set up logger
logging.basicConfig(format='%(asctime)s : %(message)s', level=logging.DEBUG)

if __name__ == "__main__":
    se = senteval.engine.SE(params_senteval, batcher, prepare)
    transfer_tasks = ['MR', 'CR', 'SUBJ', 'MPQA', 'SICKEntailment', 'SST2', 'TREC', 'MRPC']
    results = se.eval(transfer_tasks)
    print(results)

As I learned later (it was pointed out in one of the issues here), bert-as-a-service interprets mean pooling a bit differently:

In the default strategy REDUCE_MEAN, I take the second-to-last hidden layer of all of the tokens in the sentence and do average pooling.

This might be the cause of the differences. Perhaps taking only the last layer and performing mean pooling is better than the REDUCE_MEAN pooling from bert-as-a-service? It would be interesting to see which one is better.
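
To make the distinction concrete, here is a small sketch (not the code used in the paper) that computes both variants with the huggingface transformers API; the model name and tokenization details are illustrative assumptions.

# Sketch only: mean pooling over the last hidden layer (as in sentence-transformers)
# vs. the second-to-last layer (what bert-as-a-service's REDUCE_MEAN uses).
# Assumes a recent transformers version and bert-base-uncased as a stand-in model.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased', output_hidden_states=True)

def mean_pool(hidden_states, attention_mask):
    # Average the token vectors, ignoring padding positions
    mask = attention_mask.unsqueeze(-1).float()
    return (hidden_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

encoded = tokenizer(['a sentence to embed'], return_tensors='pt', padding=True)
with torch.no_grad():
    output = model(**encoded)

last_layer = output.hidden_states[-1]        # last transformer layer
second_to_last = output.hidden_states[-2]    # layer pooled by REDUCE_MEAN

emb_last = mean_pool(last_layer, encoded['attention_mask'])
emb_second_to_last = mean_pool(second_to_last, encoded['attention_mask'])

Evaluating both embeddings on the STS data would show how much the choice of layer actually matters.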

Best, Nils Reimers
