CUDA/FPGA compilation errors
I’m trying to apply some transformations for a GPU/FPGA device, but I’m getting some errors. Any ideas how I can fix this? I have installed the CUDA toolkit via `sudo apt install`, and my `gcc --version` is 9, but I don’t know where I should point DaCe at it (I can’t find a gcc entry in `~/.dace.conf`). The full error log is attached as dace.txt.
Thanks for the help.
Edit: after changing the map from `for index in dace.map[0:size]:` to the decorator form, `@dace.map(_[0:size])` on `def fun(index):` (see the sketch below), FPGA is working normally. CUDA still fails.
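For reference, a minimal sketch of the two spellings of the map; the program, array names, and tasklet bodies are illustrative assumptions, not the reporter’s actual code:

```python
import dace

size = dace.symbol("size")

# Loop-style map syntax (the variant that failed here):
@dace.program
def loop_style(a: dace.float64[size], b: dace.float64[size]):
    for index in dace.map[0:size]:
        b[index] = a[index] * 2

# Decorator-style (explicit dataflow) map syntax, the variant that worked on FPGA:
@dace.program
def decorator_style(a: dace.float64[size], b: dace.float64[size]):
    @dace.map(_[0:size])
    def fun(index):
        inp << a[index]   # read memlet from a[index]
        out >> b[index]   # write memlet to b[index]
        out = inp * 2
```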
Issue Analytics
- Created: 2 years ago
- Comments: 11 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Xilinx detects accesses to adjacent indices in consecutive loop iterations and infers burst accesses. Local buffers are not required. For example:
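(The snippet from the original comment is not reproduced here; the following is a minimal sketch of the access pattern, assuming a symbolic size `n` and a simple elementwise copy.)

```python
import dace

n = dace.symbol("n")

@dace.program
def copy_kernel(a: dace.float32[n], b: dace.float32[n]):
    # Consecutive iterations access adjacent indices of `a` and `b`,
    # which the Xilinx toolchain can coalesce into burst accesses.
    for i in dace.map[0:n]:
        b[i] = a[i]
```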
This will infer bursts of size `n`, even though it’s just being written to a stream.
You can achieve complete partitioning of a variable by setting its storage type to `FPGA_Registers` (for example, `sdfg.arrays["loc_in_pixels"].storage = dace.StorageType.FPGA_Registers`).
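A short sketch of that call in context; the SDFG and the array name `loc_in_pixels` are assumptions for illustration:

```python
import dace

# A minimal SDFG with a transient (local) buffer, for illustration:
sdfg = dace.SDFG("partition_example")
sdfg.add_array("loc_in_pixels", [16], dace.float32, transient=True)

# Completely partition the buffer into registers instead of on-chip RAM:
sdfg.arrays["loc_in_pixels"].storage = dace.StorageType.FPGA_Registers
```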
We support map unrolling by simply setting `unroll=True` on the map object. This might also be supported in the `map` exposed in the frontend?
I don’t think we currently do any automatic unrolling. You can unroll it manually if you wish! @TizianoDeMatteis @alexnick83 we could think about automatically unrolling loops with constant loop indices that only access local memory.
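A sketch of unrolling a map manually on the SDFG, assuming the `unroll` attribute on the Map object mentioned above and reusing the `copy_kernel` sketch from earlier:

```python
import dace
from dace.sdfg import nodes

sdfg = copy_kernel.to_sdfg()

for state in sdfg.nodes():
    for node in state.nodes():
        if isinstance(node, nodes.MapEntry):
            # Request full unrolling of this map in the generated code
            # (unrolling typically requires a compile-time-constant range).
            node.map.unroll = True
```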
Yes, all arrays are global by default, but can easily be changed to be local memories.