
Add support for word embedding-like features that are lists of floats

See original GitHub issue

The current API doesn’t support adding features that are lists of floats, e.g. word embeddings. The current workaround is to pass something like {"f0": 1.5, "f1": 1.6, "f2": -1.4} for a 3-dimensional embedding feature, which places an extra burden on the user.

I propose a wrapper feature that lets users pass the word embedding list as the value of the dictionary, e.g. {"f": FloatFeatures([1.5, 1.6, -1.4])}. Internally this would convert the float features into a representation consistent with the CRFSuite ItemSequence, using a consistent naming convention such as "f:0", "f:1", "f:2".
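
As a minimal sketch of the idea (the flatten_embedding helper and the "name:index" key scheme are assumptions taken from the proposal above, not an existing python-crfsuite API), the wrapper could expand an embedding into the dict-of-floats form that python-crfsuite already accepts:

    # Hypothetical helper illustrating the proposed FloatFeatures behaviour:
    # expand a list of floats into the {feature_name: float_weight} dict form
    # that python-crfsuite already understands, using index-based key names
    # ("f:0", "f:1", ...).
    def flatten_embedding(name, vector):
        """Turn a list of floats into {"name:index": value} feature pairs."""
        return {"{}:{}".format(name, i): float(v) for i, v in enumerate(vector)}

    # One token's features: a hand-crafted feature plus a 3-dimensional embedding.
    token_features = {"bias": 1.0}
    token_features.update(flatten_embedding("f", [1.5, 1.6, -1.4]))
    # -> {"bias": 1.0, "f:0": 1.5, "f:1": 1.6, "f:2": -1.4}

A sequence of such per-token dicts can then be passed to pycrfsuite.Trainer.append() or wrapped in pycrfsuite.ItemSequence as usual.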

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Reactions: 13
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

8 reactions
EmilStenstrom commented, Jan 5, 2018

Using word embeddings improves accuracy a lot. Having a supported way to include them in python-crfsuite would be wonderful.

1 reaction
napsternxg commented, Mar 18, 2020

The approach I suggested is used in this tool I have built:

https://github.com/napsternxg/TwitterNER
