Using ELMO instead of BERT
Hi; thank you for your great and well-explained work. Do you have an example of how I can use ELMo instead of BERT?
Code:

```python
from elmoformanylangs import Embedder
# imports below are assumed to follow the contextualized_topic_models package layout
from contextualized_topic_models.utils.data_preparation import TextHandler
from contextualized_topic_models.datasets.dataset import CTMDataset
from contextualized_topic_models.models.ctm import CombinedTM

handler = TextHandler(sentences=preprocessed_documents)
handler.prepare()  # create vocabulary and training data

docELMO = [i.split() for i in unpreprocessed_documents]

e = Embedder('/home/nassera/136', batch_size=64)
training_elmo = e.sents2elmo(docELMO, output_layer=0)
print("training ELMO : ", training_elmo[0])

training_dataset = CTMDataset(handler.bow, training_elmo, handler.idx2token)

ctm = CombinedTM(input_size=len(handler.vocab), bert_input_size=768, n_components=50)
ctm.fit(training_dataset)  # run the model
print('topics : ', ctm.get_topics())
```
When I run this code, I get this error:
```
2021-01-16 22:12:51,392 INFO: char embedding size: 3773
2021-01-16 22:12:52,371 INFO: word embedding size: 221272
2021-01-16 22:12:58,469 INFO: Model(
  (token_embedder): ConvTokenEmbedder(
    (word_emb_layer): EmbeddingLayer(
      (embedding): Embedding(221272, 100, padding_idx=3)
    )
    (char_emb_layer): EmbeddingLayer(
      (embedding): Embedding(3773, 50, padding_idx=3770)
    )
    (convolutions): ModuleList(
      (0): Conv1d(50, 32, kernel_size=(1,), stride=(1,))
      (1): Conv1d(50, 32, kernel_size=(2,), stride=(1,))
      (2): Conv1d(50, 64, kernel_size=(3,), stride=(1,))
      (3): Conv1d(50, 128, kernel_size=(4,), stride=(1,))
      (4): Conv1d(50, 256, kernel_size=(5,), stride=(1,))
      (5): Conv1d(50, 512, kernel_size=(6,), stride=(1,))
      (6): Conv1d(50, 1024, kernel_size=(7,), stride=(1,))
    )
    (highways): Highway(
      (_layers): ModuleList(
        (0): Linear(in_features=2048, out_features=4096, bias=True)
        (1): Linear(in_features=2048, out_features=4096, bias=True)
      )
    )
    (projection): Linear(in_features=2148, out_features=512, bias=True)
  )
  (encoder): ElmobiLm(
    (forward_layer_0): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
    (backward_layer_0): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
    (forward_layer_1): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
    (backward_layer_1): LstmCellWithProjection(
      (input_linearity): Linear(in_features=512, out_features=16384, bias=False)
      (state_linearity): Linear(in_features=512, out_features=16384, bias=True)
      (state_projection): Linear(in_features=4096, out_features=512, bias=False)
    )
  )
)
2021-01-16 22:13:11,365 INFO: 2 batches, avg len: 20.9
training ELMO :  [[ 0.06318592 -0.04212857 -0.40941882 ... -0.393932    0.65597   -0.19988859]
 [ 0.0464317  -0.03159406 -0.23152797 ...  0.2573734   0.28932744 -0.21369117]
 [ 0.04215719 -0.27414545 -0.1282109  ... -0.01528776  0.15322109 -0.02998078]
 ...
 [-0.20043871  0.11804245 -0.5754699  ...  0.19337586 -0.06868231  0.11217812]
 [-0.1898424  -0.24078836 -0.1522124  ... -0.08325598 -0.5789431  -0.21831807]
 [ 0.08684797 -0.14746179 -0.2742679  ...  0.06612014  0.15257567 -0.32261848]]

Settings:
  N Components: 50
  Topic Prior Mean: 0.0
  Topic Prior Variance: 0.98
  Model Type: prodLDA
  Hidden Sizes: (100, 100)
  Activation: softplus
  Dropout: 0.2
  Learn Priors: True
  Learning Rate: 0.002
  Momentum: 0.99
  Reduce On Plateau: False
  Save Dir: None

Traceback (most recent call last):
  File "/home/nassera/PycharmProjects/MyProject/TM_FB/Test_CTM_ELMO.py", line 76, in <module>
    ctm.fit(training_dataset)  # run the model
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/contextualized_topic_models/models/ctm.py", line 227, in fit
    sp, train_loss = self._train_epoch(train_loader)
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/contextualized_topic_models/models/ctm.py", line 154, in _train_epoch
    for batch_samples in loader:
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/home/nassera/PycharmProjects/MyProject/venv/lib/python3.8/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [30, 1024] at entry 0 and [21, 1024] at entry 1
```
I get a list of NumPy arrays from ELMo, but the run fails at `ctm.fit(training_dataset)`.
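Checking the arrays returned by `sents2elmo` (variable names as in the snippet above) shows that each document gets a `(num_tokens, 1024)` matrix, and `num_tokens` varies with the document length, which is why `default_collate` cannot stack them:

```python
# each element of training_elmo is one document's per-token ELMo matrix,
# so the first dimension differs from document to document
for doc in training_elmo[:3]:
    print(doc.shape)  # e.g. (30, 1024), (21, 1024), ... as in the error above
```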

You could do mean pooling (averaging over the sequence), but I am not sure how well that works for the embeddings you get from ELMo.
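A minimal sketch of what that could look like, continuing from your snippet and assuming each array from `sents2elmo` has shape `(num_tokens, 1024)` as in your log (the pooling step and the switch to `bert_input_size=1024` are my suggestion, not something the library does for you):

```python
import numpy as np

from contextualized_topic_models.datasets.dataset import CTMDataset
from contextualized_topic_models.models.ctm import CombinedTM

# mean-pool over the token axis so every document becomes a single
# fixed-size 1024-dimensional vector that the DataLoader can stack
pooled_elmo = np.array([doc.mean(axis=0) for doc in training_elmo])

training_dataset = CTMDataset(handler.bow, pooled_elmo, handler.idx2token)

# bert_input_size must match the ELMo vector size (1024), not BERT's 768
ctm = CombinedTM(input_size=len(handler.vocab), bert_input_size=1024, n_components=50)
ctm.fit(training_dataset)
```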
I also saw your open issue. I’ll search a bit more.
Is there any reason why you prefer ELMo to BERT?
Thank you so much. I don’t particularly prefer ELMo; I would like to compare the two contextualized word embeddings.