How to use Convolution operator as the expert?

Hi, I am trying to train a convolution-backbone network with MoE, and I have run into two difficulties. The first is that the current API does not seem to be directly usable: the FMoE class requires a hidden dimension (d_model), but a convolution layer does not explicitly define one.

Second, I find that the FMoE class cannot accept a tensor with more than 2 dimensions, so I guess I cannot pass an image (with shape N, C, H, W) directly into the layer? My code snippet is:

import torch
from fmoe.layers import FMoE
from fmoe.gates import NaiveGate, SwitchGate

N = 3
num_expert = 2
hidden_size = 5
out_feature = 4

# A linear layer stands in for the expert, only to test the input dimension.
layer = torch.nn.Linear(in_features=hidden_size, out_features=out_feature).to("cuda")
layer.weight = torch.nn.Parameter(torch.ones_like(layer.weight))
my_moe = FMoE(num_expert=num_expert, d_model=hidden_size, top_k=1, expert=layer, gate=SwitchGate).to("cuda")
# 3-D input of shape (N, 1, hidden_size)
inputs = torch.rand((N, 1, hidden_size)).to("cuda")
print(my_moe(inputs))

Here I use a linear layer as the expert just to test the input dimension. The error message is:

Traceback (most recent call last):
  File "/home/zyli/fastmoe/try.py", line 15, in <module>
    print(my_moe(inputs))
  File "/home/zyli/anaconda3/envs/QMoE/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zyli/fastmoe/fmoe/layers.py", line 241, in forward
    experts=self.experts
  File "/home/zyli/fastmoe/fmoe/layers.py", line 78, in _fmoe_general_global_forward
    outp = tree.map_structure(gather_func, x)
  File "/home/zyli/anaconda3/envs/QMoE/lib/python3.7/site-packages/tree/__init__.py", line 430, in map_structure
    [func(*args) for args in zip(*map(flatten, structures))])
  File "/home/zyli/anaconda3/envs/QMoE/lib/python3.7/site-packages/tree/__init__.py", line 430, in <listcomp>
    [func(*args) for args in zip(*map(flatten, structures))])
  File "/home/zyli/fastmoe/fmoe/layers.py", line 75, in gather_func
    world_size,
  File "/home/zyli/fastmoe/fmoe/functions.py", line 171, in forward
    maybe_overlap=False)
  File "/home/zyli/fastmoe/fmoe/functions.py", line 89, in _local_gather
    inp_buf.index_copy_(0, pos, inp)
IndexError: index_copy_(): When source and destination are not scalars, their dimensionality must match. Source dimensionality (3), destination dimensionality (2)
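
The error message says that a 3-D source is being copied into a 2-D buffer, which suggests the layer expects input of shape (tokens, d_model). A minimal sketch of one possible workaround is to flatten all leading dimensions before the call and restore them afterwards. `FlattenTokens` is a hypothetical helper, not part of FastMoE, and `nn.Linear` stands in for the MoE layer so the sketch runs on CPU without `fmoe` installed.

```python
import torch
from torch import nn


class FlattenTokens(nn.Module):
    """Hypothetical helper: feed an (..., d_model) tensor to a module
    that only accepts 2-D input of shape (tokens, d_model)."""

    def __init__(self, inner: nn.Module):
        super().__init__()
        self.inner = inner

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        lead_shape = x.shape[:-1]                # e.g. (N, 1)
        tokens = x.reshape(-1, x.shape[-1])      # (N * 1, d_model)
        out = self.inner(tokens)                 # (N * 1, d_out)
        # Restore the original leading dimensions around the new feature size.
        return out.reshape(*lead_shape, out.shape[-1])


# Assumption: nn.Linear is only a stand-in for a layer that requires 2-D input.
inner = nn.Linear(5, 4)
wrapped = FlattenTokens(inner)

inputs = torch.rand(3, 1, 5)   # same shape as in the snippet above
outputs = wrapped(inputs)
print(outputs.shape)
```

Whether this is appropriate for a real MoE layer depends on how its gate treats tokens; here every row of the flattened tensor is treated as an independent token.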

One possible solution, I think, is to first apply im2col to the input so that the convolution becomes a matrix multiplication, but this incurs obvious overhead. Alternatively, I would need to modify the implementation of the FMoE class. Neither option is elegant, so is there a better way to do this?
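
For reference, the im2col idea can be sketched in plain PyTorch with `torch.nn.functional.unfold`. The sketch below only illustrates that a convolution equals a matrix multiplication over flattened patches, so that each patch becomes a 2-D "token" of size C * kh * kw. The layer sizes are arbitrary assumptions chosen for the example, and nothing here uses or reflects the FastMoE API.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Arbitrary example sizes (assumptions for illustration only).
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
x = torch.rand(2, 3, 6, 6)                       # (N, C, H, W)

# im2col: one column per output position, shape (N, C*kh*kw, L).
cols = F.unfold(x, kernel_size=3, padding=1)
n, patch_dim, num_pos = cols.shape

# Flatten to 2-D tokens: one row per (image, output position) pair.
tokens = cols.transpose(1, 2).reshape(n * num_pos, patch_dim)

# The convolution is now an ordinary linear map applied to each token.
weight = conv.weight.reshape(conv.out_channels, -1)   # (out, C*kh*kw)
out_tokens = tokens @ weight.t() + conv.bias          # (N*L, out)

# Fold the tokens back into an image-shaped tensor (padding=1 keeps H and W at 6).
out = out_tokens.reshape(n, num_pos, -1).transpose(1, 2)
out = out.reshape(n, conv.out_channels, 6, 6)

matches = torch.allclose(out, conv(x), atol=1e-5)
print(tokens.shape, matches)
```

The `tokens` tensor is 9 times larger than the input for a 3x3 kernel, which is the overhead mentioned above.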

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

hobbitlzy commented, Jul 18, 2022 (1 reaction)

You are right. I see your consideration🤔. Thanks again😁.

hobbitlzy commented, Jul 18, 2022 (1 reaction)

Thanks😁! I get your point. BTW, does FMoE not have documentation yet?
