Generate embeddings of new points
Hello,
I have two questions:
1- When we use .fit to generate the embeddings, how can we save the model and use it to create embeddings for new entities of the KG?
2- Previously, when I used this library, I did not get any error for label_predicates, but now it raises an unexpected keyword argument error:
kg = KG("dataset.xml", label_predicates=[rdflib.URIRef(x) for x in label_predicates])
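On question 1, a fitted transformer is an ordinary Python object, so one common approach (assuming the fitted object is picklable, which I have not verified for every embedder) is to persist it with the standard library's pickle module. The names below are placeholders, not the pyrdf2vec API:

```python
import pickle

# Stand-in for any fitted model object; in practice this would be
# the transformer you called .fit on (a hypothetical example).
fitted_model = {"entity_a": [0.1, 0.2], "entity_b": [0.3, 0.4]}

# Save the fitted model to disk.
with open("model.pkl", "wb") as f:
    pickle.dump(fitted_model, f)

# Later: load it back and reuse it for new entities.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

assert restored == fitted_model
```

Note that transforming genuinely unseen entities also requires walks for those entities, so whether this is enough depends on the embedder you use.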
Issue Analytics
- State:
- Created 2 years ago
- Comments:17 (1 by maintainers)
Hmm ok fair point, seems like a bug indeed…
I’ve investigated this problem a little more in depth together with @GillesVandewiele, and we concluded the following. Our walk-hashing procedure within pyrdf2vec is as follows:
vertex.name if i == 0 or i % 2 == 1 or self.md5_bytes is None else hash...
More generally, we do not hash the first node within a walk, as this node is in most cases the root node (the one for which you want to generate an embedding).
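The condition above can be sketched with a small standard-library example (the helper name and the 8-byte digest truncation are assumptions for illustration, not the actual pyrdf2vec implementation):

```python
import hashlib

def hash_walk(walk, md5_bytes=8):
    """Mimic the described scheme: keep index 0 (the root) and the
    odd indices (predicates) verbatim; hash the remaining entities.
    Hashing is skipped entirely when md5_bytes is None."""
    hashed = []
    for i, name in enumerate(walk):
        if i == 0 or i % 2 == 1 or md5_bytes is None:
            hashed.append(name)
        else:
            hashed.append(str(hashlib.md5(name.encode()).digest()[:md5_bytes]))
    return hashed

walk = ("root", "p1", "e1", "p2", "e2")
tokens = hash_walk(walk)
# Root and predicates survive unchanged; e1 and e2 are hashed.
assert tokens[0] == "root" and tokens[1] == "p1" and tokens[2] != "e1"
```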
We then check whether the root node is in the wv vocab during the transform function.
But with the with_reverse option set to True, the walks are constructed differently:
Here you can see that the reverse paths are prefixed to the normal paths, so the node at index 0 is no longer the root node. The root node will now be hashed somewhere within the path and cannot be found in the wv vocab in its original form.
As @GillesVandewiele suggested, it might be better to hash all subjects and predicates within pyrdf2vec and check whether the hashed root nodes are available in the wv vocab.
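A toy sketch (hypothetical names, not the library's code) shows why prefixing the reverse path breaks the lookup under the old scheme, and how hashing every entity, root included, restores it:

```python
import hashlib

def md5_name(name, md5_bytes=8):
    # Assumed stand-in for the library's hashing of entity names.
    return str(hashlib.md5(name.encode()).digest()[:md5_bytes])

# Reverse walk prefixed to the forward walk: root is not at index 0.
walk = ("e_prev", "p_rev", "root", "p1", "e1")
root = "root"

# Old scheme: only index 0 is kept verbatim, so the root gets hashed
# and its original name never appears among the training tokens.
old_tokens = [n if i == 0 or i % 2 == 1 else md5_name(n)
              for i, n in enumerate(walk)]
assert root not in old_tokens

# Suggested fix: hash every entity (root included) consistently, then
# look up the *hashed* root in the word-vector vocabulary.
new_tokens = [n if i % 2 == 1 else md5_name(n) for i, n in enumerate(walk)]
vocab = set(new_tokens)
assert md5_name(root) in vocab
```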
This works on my system without errors (pyrdf2vec version 0.2.3; depth is 1 to speed up this test).