Can HGT implementation be optimized?

Hi, I understand that this code is meant to implement the Heterogeneous Graph Transformer (HGT), though not necessarily in the most efficient way.

At first glance, it seems to me that at least K and Q are being calculated more often than necessary: the projections are recomputed for every edge type, even when the same src or dst node type has already been processed. It also seems that some K, Q, V features stay stored after they have been used to calculate the attention and message, consuming memory unnecessarily.

Am I understanding it correctly? Thanks!


def forward(self, G, h):
        with G.local_scope():
            node_dict, edge_dict = self.node_dict, self.edge_dict
            for srctype, etype, dsttype in G.canonical_etypes:
                sub_graph = G[srctype, etype, dsttype]

                # If srctype or dsttype was already seen for another edge type, these projections are recomputed unnecessarily
                k_linear = self.k_linears[node_dict[srctype]]
                v_linear = self.v_linears[node_dict[srctype]]
                q_linear = self.q_linears[node_dict[dsttype]]
                
                k = k_linear(h[srctype]).view(-1, self.n_heads, self.d_k)
                v = v_linear(h[srctype]).view(-1, self.n_heads, self.d_k)
                q = q_linear(h[dsttype]).view(-1, self.n_heads, self.d_k)

                e_id = self.edge_dict[etype]

                relation_att = self.relation_att[e_id]
                relation_pri = self.relation_pri[e_id]
                relation_msg = self.relation_msg[e_id]

                k = torch.einsum("bij,ijk->bik", k, relation_att)
                v = torch.einsum("bij,ijk->bik", v, relation_msg)

                sub_graph.srcdata['k'] = k
                sub_graph.dstdata['q'] = q
                sub_graph.srcdata['v_%d' % e_id] = v
               
                # After the attention and message are calculated, K, V, Q are not deleted and stay in memory, even if they won't be used again during this for loop
                sub_graph.apply_edges(fn.v_dot_u('q', 'k', 't'))
                attn_score = sub_graph.edata.pop('t').sum(-1) * relation_pri / self.sqrt_dk
                attn_score = edge_softmax(sub_graph, attn_score, norm_by='dst')

                sub_graph.edata['t'] = attn_score.unsqueeze(-1)

            G.multi_update_all({etype : (fn.u_mul_e('v_%d' % e_id, 't', 'm'), fn.sum('m', 't')) \
                                for etype, e_id in edge_dict.items()}, cross_reducer = 'mean')
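
The redundancy raised in the question can be illustrated with a minimal sketch that computes each projection once per node type before the edge-type loop. This is not DGL's implementation: the helper name `project_once_per_type` is hypothetical, and the projections are keyed by node type name here (the original indexes `self.k_linears` and friends by `node_dict[...]`). The relation-specific `einsum` transforms would still need to run once per edge type, since `relation_att` and `relation_msg` depend on `e_id`.

```python
from typing import Callable, Dict, Iterable, Tuple, TypeVar

T = TypeVar("T")  # feature container, e.g. a torch.Tensor
CanonicalEtype = Tuple[str, str, str]  # (srctype, etype, dsttype)


def project_once_per_type(
    canonical_etypes: Iterable[CanonicalEtype],
    h: Dict[str, T],
    k_proj: Dict[str, Callable[[T], T]],
    q_proj: Dict[str, Callable[[T], T]],
    v_proj: Dict[str, Callable[[T], T]],
) -> Tuple[Dict[str, T], Dict[str, T], Dict[str, T]]:
    """Hypothetical helper: K/V once per source type, Q once per destination type."""
    etypes = list(canonical_etypes)
    # dict.fromkeys de-duplicates while keeping first-seen order.
    src_types = list(dict.fromkeys(src for src, _, _ in etypes))
    dst_types = list(dict.fromkeys(dst for _, _, dst in etypes))

    # Each projection runs once per node type, not once per edge type.
    k = {t: k_proj[t](h[t]) for t in src_types}
    v = {t: v_proj[t](h[t]) for t in src_types}
    q = {t: q_proj[t](h[t]) for t in dst_types}
    return k, q, v
```

Inside the loop over `G.canonical_etypes`, the body would then look up `k[srctype]`, `v[srctype]` and `q[dsttype]` instead of calling the linear layers again. How much this saves depends on how many edge types share a source or destination node type.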

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

zheng-da commented, Jun 21, 2021 (1 reaction)

The new heterogeneous graph API will be available in our next release. However, the new API may be incomplete. We’ll try our best to complete the new API as soon as possible.

Although the API is designed to speed up HGT computation, it isn’t HGT-specific. You can still choose how to do aggregation, normalization, etc., with the heterogeneous message passing API.

zheng-da commented, Mar 31, 2021 (1 reaction)

We are in the process of optimizing HGT. Please stay tuned.
