Add support for word-embedding-like features which are lists of floats
The current API doesn’t support features that are lists of floats, e.g. word embeddings. The current workaround is to pass each dimension as its own key, e.g. {"f0": 1.5, "f1": 1.6, "f2": -1.4} for a 3-dimensional embedding, which puts an extra burden on the user.
I propose a wrapper feature that lets users pass the word embedding list as a value of the dictionary, e.g. {"f": FloatFeatures([1.5, 1.6, -1.4])}. Internally, this would convert the float features into a representation consistent with the CRFSuite ItemSequence, using a consistent naming convention such as "f:0", "f:1", "f:2".
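To make the proposal concrete, here is a minimal sketch of how such a wrapper could be expanded into flat float-valued keys. FloatFeatures and expand_features are hypothetical names used only for illustration; they are not part of the current python-crfsuite API, and the sketch only assumes that feature dictionaries can map string keys to float weights, as in the workaround above.

```python
from typing import Dict, Sequence, Union


class FloatFeatures:
    """Hypothetical wrapper marking a list of floats as one dense feature."""

    def __init__(self, values: Sequence[float]) -> None:
        self.values = [float(v) for v in values]


FeatureValue = Union[float, FloatFeatures]


def expand_features(features: Dict[str, FeatureValue]) -> Dict[str, float]:
    """Flatten FloatFeatures values into "name:index" keys.

    Other values are passed through unchanged. This is an assumed
    preprocessing step, not the library's own implementation.
    """
    flat: Dict[str, float] = {}
    for name, value in features.items():
        if isinstance(value, FloatFeatures):
            # One key per embedding dimension: "f:0", "f:1", ...
            for index, component in enumerate(value.values):
                flat[f"{name}:{index}"] = component
        else:
            flat[name] = value
    return flat


example = expand_features({"f": FloatFeatures([1.5, 1.6, -1.4]), "bias": 1.0})
```

With this sketch, example holds the keys "f:0", "f:1", "f:2" and "bias", so the result can be used anywhere the manual {"f0": ..., "f1": ...} form is used today.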
Issue Analytics
- State:
- Created 7 years ago
- Reactions:13
- Comments:7 (1 by maintainers)
Top GitHub Comments
Using word embeddings improves accuracy a lot. Having a supported way to include them in python-crfsuite would be wonderful.
The approach I suggested is utilized in this tool I have built.
https://github.com/napsternxg/TwitterNER