
Order of rdf triples embeddings


❓ Question

I have generated embeddings for RDF triple URIs (from DBpedia) using pyRDF2Vec. When I pass list(set(entities)) to transformer.fit_transform(), I am not sure in what order the pyRDF2Vec transformer returns the embeddings. Will this order affect the results when I concatenate these RDF embeddings with sentence context embeddings while training the model?

Code:

def rdftriplestovec(filepath, entities):
    kg = KG(filepath)
    transformer = RDF2VecTransformer(walkers=[RandomWalker(3, None)],
                                     embedder=Word2Vec(size=500))
    entities_names = [entity.name for entity in kg._entities]
    filtered_entities = [e for e in entities if e in entities_names]
    not_found = set(entities) - set(filtered_entities)
    print('entities could not be found in the KG! Removing them')
    entities = list(set(filtered_entities))
    embeddings = transformer.fit_transform(kg, entities)
    print(embeddings)
    return embeddings

Sample of the RDF triples in the .ttl file passed as filepath to rdftriplestovec (the predicate is owl:Ontology):

@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://dbpedia.org/resource/AT&T> owl:Ontology <http://dbpedia.org/resource/Espionage>, <http://dbpedia.org/resource/Police> .

<http://dbpedia.org/resource/Actor> owl:Ontology <http://dbpedia.org/resource/Major>, <http://dbpedia.org/resource/Plea>, <http://dbpedia.org/resource/United_States> .

<http://dbpedia.org/resource/Actor_model> owl:Ontology <http://dbpedia.org/resource/Visibility> .

<http://dbpedia.org/resource/Advertising> owl:Ontology <http://dbpedia.org/resource/Indian_Americans> .

<http://dbpedia.org/resource/Afghan_National_Army> owl:Ontology <http://dbpedia.org/resource/Enemy> .

<http://dbpedia.org/resource/Ago,_Mie> owl:Ontology <http://dbpedia.org/resource/Haunt_(comics)>, <http://dbpedia.org/resource/Human_back>, <http://dbpedia.org/resource/Jesus>


Sample of the URI list I get from the DBpedia API for my dataset (passed as entities to rdftriplestovec):

['http://dbpedia.org/resource/United_States_House_of_Representatives', 'http://dbpedia.org/resource/Australian_Democrats', 'http://dbpedia.org/resource/Aide-de-camp', 'http://dbpedia.org/resource/United_Kingdom', 'http://dbpedia.org/resource/Even_language', 'http://dbpedia.org/resource/James_Comey', 'http://dbpedia.org/resource/Letter_(message)', 'http://dbpedia.org/resource/Jason_Chaffetz', 'http://dbpedia.org/resource/Twitter', 'http://dbpedia.org/resource/Italian_language', 'http://dbpedia.org/resource/Robb_Flynn', 'http://dbpedia.org/resource/Hillary_Clinton', 'http://dbpedia.org/resource/Breitbart_News', 'http://dbpedia.org/resource/Truth', 'http://dbpedia.org/resource/Get_(divorce_document)', 'http://dbpedia.org/resource/Inactivated_vaccine', 'http://dbpedia.org/resource/India', 'http://dbpedia.org/resource/Single_(music)', 'http://dbpedia.org/resource/November_2017_Somalia_airstrike', 'http://dbpedia.org/resource/Identified', 'http://dbpedia.org/resource/Iranian_peoples', 'http://dbpedia.org/resource/Woman', 'http://dbpedia.org/resource/Fiction', 'http://dbpedia.org/resource/Unpublished_Story', 'http://dbpedia.org/resource/Stoning']

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
GillesVandewiele commented, Feb 20, 2022

Yes, it returns two things: embeddings and literals. Both will be numpy arrays.
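
If fit_transform does return an (embeddings, literals) pair as described above, one way to stop depending on list order is to key each vector by its URI right after the call. The sketch below assumes that the i-th embedding belongs to the i-th entity passed in, which is worth checking against your installed pyRDF2Vec version. The helper name embeddings_by_entity and the two-dimensional vectors are made up for illustration and stand in for the real fit_transform output.

```python
from typing import Dict, List, Sequence


def embeddings_by_entity(
    entities: Sequence[str],
    embeddings: Sequence[Sequence[float]],
) -> Dict[str, List[float]]:
    """Pair each entity URI with its vector (hypothetical helper).

    Assumes embeddings[i] was produced for entities[i].
    """
    if len(entities) != len(embeddings):
        raise ValueError("entities and embeddings must have the same length")
    return {entity: list(vector) for entity, vector in zip(entities, embeddings)}


# Stand-in data: in real use, `entities` is the exact list given to
# fit_transform and `fake_embeddings` is the first element it returns.
entities = [
    "http://dbpedia.org/resource/India",
    "http://dbpedia.org/resource/Twitter",
]
fake_embeddings = [[0.1, 0.2], [0.3, 0.4]]

lookup = embeddings_by_entity(entities, fake_embeddings)
```

With such a lookup, the RDF vector for a sentence can be fetched by the URI of the entity it mentions before concatenating it with the sentence embedding, so the order in which the entities were embedded no longer matters.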

1 reaction
GillesVandewiele commented, Feb 20, 2022
from pyrdf2vec import RDF2VecTransformer
from pyrdf2vec.embedders import Word2Vec
from pyrdf2vec.graphs import KG
from pyrdf2vec.walkers import RandomWalker
import pandas as pd

def rdftriplestovec(filepath, entities):
    kg = KG(filepath, fmt='turtle')
    # Sets will change the order, go see this for yourself in a shell or notebook
    entities = list(set([x.name for x in kg._entities]).intersection(entities))
    transformer = RDF2VecTransformer(walkers=[RandomWalker(3, None)], 
                                 embedder=Word2Vec(vector_size=500))
    embeddings = transformer.fit_transform(kg, entities)
    return embeddings

entities = list(pd.read_csv('URI_list.csv')['0'].values)
rdftriplestovec('rdf_triples.ttl', entities)

Works like a charm. But as I warned, if you use set() in Python, the order will change! Try to avoid it (which I am not doing here), or store the list you get from converting to set() so that you can reconstruct the order.
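
A minimal sketch of one way to avoid the reordering, not taken from the thread: deduplicate with dict.fromkeys, which keeps first-seen order on Python 3.7+, and use a set only for membership tests. The function name filter_preserving_order and the sample URIs are illustrative. In real use, kg_entity_names would be built from the KG, for example from kg._entities as in the code above.

```python
from typing import Iterable, List


def filter_preserving_order(
    entities: Iterable[str],
    kg_entity_names: Iterable[str],
) -> List[str]:
    """Drop duplicates and entities missing from the KG, keeping input order."""
    known = set(kg_entity_names)  # set used only for fast membership checks
    # dict.fromkeys removes duplicates while keeping first-seen order
    return [entity for entity in dict.fromkeys(entities) if entity in known]


requested = [
    "http://dbpedia.org/resource/India",
    "http://dbpedia.org/resource/Twitter",
    "http://dbpedia.org/resource/India",    # duplicate
    "http://dbpedia.org/resource/Stoning",  # not present in the toy KG below
]
in_kg = {
    "http://dbpedia.org/resource/Twitter",
    "http://dbpedia.org/resource/India",
}

kept = filter_preserving_order(requested, in_kg)
```

Passing a list like kept to fit_transform and holding on to that same list gives a stable order to line the embeddings up against.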


