segmentation fault illegal instruction

setup

  • Ubuntu 16.04
  • TVM 0.7.dev1
  • PyTorch 1.4.0
  • transformers 2.11.0
  • everything else as in requirements.txt

issue

I uncommented the line DiagonaledMM._get_function('float32', 'cuda') in diagonaled_mm_tvm.py.

After that, when I run the code, it prints Loading tvm binary from: ./longformer/lib/lib_diagonaled_mm_float32_cuda.so … and then crashes with either segmentation fault (core dumped) or illegal instruction (core dumped).
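
As a quick sanity check, the load step can be isolated from the rest of the model code (a minimal diagnostic sketch, not from the original issue; the .so path is taken from the error message above):

import ctypes
import torch

# Environment details; an architecture mismatch between the machine the
# binary was compiled on/for and the machine running it is a common cause
# of segfaults and illegal instructions at load time.
print('torch:', torch.__version__, 'CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0),
          'compute capability:', torch.cuda.get_device_capability(0))

# If this line alone crashes, the problem is the binary itself,
# not the surrounding Longformer code.
ctypes.CDLL('./longformer/lib/lib_diagonaled_mm_float32_cuda.so')
print('binary loaded without crashing')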

other

I tested TVM, TensorFlow, and PyTorch on their own, and they all run fine. I also followed scripts/cheatsheet.txt to regenerate lib_diagonaled_mm_float32_cuda.so, and it compiles successfully.

Any ideas or suggestions?

The code is below:

import torch
from longformer.longformer import Longformer, LongformerConfig
from longformer.sliding_chunks import pad_to_window_size
from transformers import RobertaTokenizer

config = LongformerConfig.from_pretrained('longformer-base-4096/') 
# choose the attention mode 'n2', 'tvm' or 'sliding_chunks'
# 'n2': for regular n2 attention
# 'tvm': a custom CUDA kernel implementation of our sliding window attention
# 'sliding_chunks': a PyTorch implementation of our sliding window attention
config.attention_mode = 'tvm'

model = Longformer.from_pretrained('longformer-base-4096/', config=config)
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
tokenizer.model_max_length = model.config.max_position_embeddings

SAMPLE_TEXT = ' '.join(['Hello world! '] * 1000)  # long input document

input_ids = torch.tensor(tokenizer.encode(SAMPLE_TEXT)).unsqueeze(0)  # batch of size 1

# TVM code doesn't work on CPU. Uncomment this if `config.attention_mode = 'tvm'`
model = model.cuda(); input_ids = input_ids.cuda()

# Attention mask values -- 0: no attention, 1: local attention, 2: global attention
attention_mask = torch.ones(input_ids.shape, dtype=torch.long, device=input_ids.device) # initialize to local attention
attention_mask[:, [1, 4, 21,]] =  2  # Set global attention based on the task. For example,
                                     # classification: the <s> token
                                     # QA: question tokens

# padding seqlen to the nearest multiple of 512. Needed for the 'sliding_chunks' attention
input_ids, attention_mask = pad_to_window_size(
        input_ids, attention_mask, config.attention_window[0], tokenizer.pad_token_id)

output = model(input_ids, attention_mask=attention_mask)[0]

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 13

Top GitHub Comments

1 reaction
ibeltagy commented, Jun 30, 2020

Very interesting. Maybe a fused self-attention function or something. I will be curious to see how this goes.

Depending on how familiar you are with TVM, you might find the following discussions useful:

https://discuss.tvm.ai/t/optimizing-matrix-multiplication-for-gpu/4212/24
https://discuss.tvm.ai/t/competitive-gemm-matmul-example/5478
https://discuss.tvm.ai/t/developing-a-faster-schedule-for-longformers-kernel/6367
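
For context, those threads revolve around writing and tuning TVM compute/schedule pairs. Here is a minimal, generic sketch of that pattern (a plain matmul against the TVM 0.7 te API, not the Longformer kernel itself):

import tvm
from tvm import te

# Declare a generic matmul: C[i, j] = sum over k of A[i, k] * B[k, j]
n = te.var('n')
A = te.placeholder((n, n), name='A')
B = te.placeholder((n, n), name='B')
k = te.reduce_axis((0, n), name='k')
C = te.compute((n, n), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name='C')

# The schedule is where the optimization work in those threads happens
# (tiling, vectorization, thread binding); this default one is unoptimized.
s = te.create_schedule(C.op)
f = tvm.build(s, [A, B, C], target='llvm')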

1 reaction
ibeltagy commented, Jun 29, 2020

Another suggestion: can you try running it from inside the Docker container that we use to compile the CUDA kernel? Follow the instructions here: https://github.com/allenai/longformer/blob/master/scripts/cheatsheet.txt#L6 to build and run the Docker image, then try to run it there. You don’t need to recompile the binaries; it is enough to load the existing one.

I am curious: what are you using it for, and would the sliding_chunks implementation be enough for your use case?
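
If it would be, the switch is a small change relative to the script in the question (a sketch assuming the same setup; sliding_chunks is the pure-PyTorch path, so the TVM binary is never loaded):

# Same script as in the question, but with the pure-PyTorch attention
# implementation, which avoids lib_diagonaled_mm_float32_cuda.so entirely.
config.attention_mode = 'sliding_chunks'
model = Longformer.from_pretrained('longformer-base-4096/', config=config)
# The rest (tokenizer, attention mask, padding, forward pass) stays the same;
# model.cuda() also becomes optional, since this mode runs on CPU as well.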

Top Results From Across the Web

  • Segmentation Fault and GCC Illegal Instruction - Nebula Graph
    This article shares the troubleshooting steps for two compiling errors: segmentation fault and illegal instruction.
  • What's the difference between Segmentation fault and Bus ...
    Bus Error (also known as SIGBUS, usually signal 10) - you can encounter this signal error when an invalid pointer is...
  • C illegal instruction - Stack Overflow
    When your program has UB it can seg fault, get an illegal instruction error, appear to work, or show any other behaviour. It's not...
  • Program crash messages - Helpful
    Illegal instruction means the CPU got an instruction it did not support. It can happen when executable code becomes corrupted. More commonly, ...
  • Illegal instruction fault in stack based buffer overflow
    Try to step instruction by instruction through gdb with the si (stepi) command. You can turn on core dumps and...
