CUDA/FPGA compilation errors

I’m trying to apply some transformations for GPU/FPGA device but I’m getting some errors:

[two images of the error output, not preserved]

Any ideas how I can fix it? I installed the CUDA toolkit via sudo apt install, and gcc --version reports version 9. But I don't know where I should link it so that dace uses it (I can't find gcc in ~/.dace.conf). Attached: dace.txt

Thanks for help.

Edit: After changing "for index in dace.map[0:size]:" to "@dace.map(_[0:size])" above "def fun(index):", the FPGA version works normally. CUDA still fails.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

definelicht commented, Dec 30, 2021

> Yes, I have noticed that Xilinx creates extra buffers when using bursts. But without minimal local arrays there will be no burst. Maybe add some micro-buffering between memlets, like the streaming transformations you mentioned?

Xilinx detects accesses to adjacent indices in consecutive loop iterations and infers burst accesses. Local buffers are not required. For example:

void Foo(int const *from_dram, hlslib::Stream<int> &s, int n) {
  for (int i = 0; i < n; ++i) {
    #pragma HLS PIPELINE II=1
    s.Push(from_dram[i]);
  }
}

This will infer bursts of size n, even though it’s just being written to a stream.
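
To make the "adjacent indices in consecutive iterations" condition concrete, here is a toy model in plain Python. It is not Xilinx's actual analysis; the helper name group_bursts and the index traces are purely illustrative. It groups a trace of accessed indices into runs of adjacent addresses, which is the access pattern the reply describes as burst-friendly.

```python
def group_bursts(indices):
    """Group a trace of accessed indices into (start, length) runs.

    A run is extended only when the next index is exactly one past the
    end of the current run, i.e. the accesses are adjacent.
    """
    bursts = []
    for idx in indices:
        if bursts and idx == bursts[-1][0] + bursts[-1][1]:
            start, length = bursts[-1]
            bursts[-1] = (start, length + 1)
        else:
            bursts.append((idx, 1))
    return bursts


n = 8
# Pattern of the loop above: from_dram[i] for i in 0..n-1.
sequential = group_bursts(range(n))
# Hypothetical strided pattern: from_dram[2 * i].
strided = group_bursts(range(0, 2 * n, 2))

print(sequential)
print(strided)
```

In this model the sequential trace collapses into a single run of length n, while the strided trace never forms a run longer than one element.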

> I would like to add specific pragmas in a loop, perhaps through extra commands inserted into a @dace.program, like:
>
> @dace.program
> def fun():
>     # some code
>     # This creates a local buffer, not a global one!
>     loc_in_pixels = dace.define_local(shape=(burst_size), dtype=dace.uint8, memtype=dace.local)
>     # Place that "text" near the loc_in_pixels variable definition.
>     dace.hint(loc_in_pixels, "#PRAGMA HLS PARTITION VARIABLE=loc_in_pixels COMPLETE")

You can achieve complete partitioning of a variable by setting its storage type to FPGA_Registers (for example, sdfg.arrays["loc_in_pixels"].storage = dace.StorageType.FPGA_Registers).

> for i in range(10): dace.hint("#PRAGMA UNROLL") # Put the hint right here in the tasklet. Maybe add something specific like

We support map unrolling by simply setting unroll=True on the map object. This might also be supported in the map exposed in the frontend?

definelicht commented, Dec 28, 2021

> I have noticed that with "@dace.map" above "def calc_mask(pix: _[0:size]):" the generated loop isn't unrolled, but why?

I don’t think we currently do any automatic unrolling. You can unroll it manually if you wish! @TizianoDeMatteis @alexnick83 we could think about automatically unrolling loops with constant loop indices that only access local memory.

> loc_in_pixels = dace.define_local(shape=(burst_size), dtype=dace.uint8) is a global array by default (should it be that way?)

Yes, all arrays are global by default, but can easily be changed to be local memories.
