dev branch - PTX error 701 and 700 on Irregulars examples

Carried over from https://github.com/beehive-lab/TornadoVM/discussions/120#discussioncomment-3137390

I am running the Irregulars example and, as described in the discussion linked above, the reported error code is 701.

When I change the source code with s/float/double/g and rebuild, the reported error code changes to 700.

The output below is also from a fresh reboot, just to be sure.

WARNING: Using incubator modules: jdk.incubator.vector, jdk.incubator.foreign
Size = 2000
[TornadoVM-PTX-JNI] ERROR : cuModuleLoadData -> Returned: 700
PTX to cubin JIT compilation failed! (700)
PTX JIT compilation failed!
Unable to compile task task XXX__GENERATED_REDUCE0.reduce_seq0 - rAdd
[tornado.drivers.ptx@0.15-dev/uk.ac.manchester.tornado.drivers.ptx.runtime.PTXTornadoDevice.compileTask(PTXTornadoDevice.java:192), tornado.drivers.ptx@0.15-dev/uk.ac.manchester.tornado.drivers.ptx.runtime.PTXTornadoDevice.installCode(PTXTornadoDevice.java:145), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.compileTaskFromBytecodeToBinary(TornadoVM.java:477), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.execute(TornadoVM.java:741), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.execute(TornadoVM.java:221), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.scheduleInner(TornadoTaskSchedule.java:720), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.schedule(TornadoTaskSchedule.java:1049), tornado.api@0.15-dev/uk.ac.manchester.tornado.api.TaskSchedule.execute(TaskSchedule.java:300), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskSchedule.executeExpression(ReduceTaskSchedule.java:592), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskSchedule.scheduleWithReduction(ReduceTaskSchedule.java:577), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.rewriteTaskForReduceSkeleton(TornadoTaskSchedule.java:992), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.reduceAnalysis(TornadoTaskSchedule.java:1002), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.analyzeSkeletonAndRun(TornadoTaskSchedule.java:1012), tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.schedule(TornadoTaskSchedule.java:1042), tornado.api@0.15-dev/uk.ac.manchester.tornado.api.TaskSchedule.execute(TaskSchedule.java:300), org.bereft.greatexpenses.ReductionIrregular.run(ReductionIrregular.java:60), org.bereft.greatexpenses.ReductionIrregular.main(ReductionIrregular.java:81)]
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.compileTaskFromBytecodeToBinary(TornadoVM.java:481)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.execute(TornadoVM.java:741)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.execute(TornadoVM.java:221)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.scheduleInner(TornadoTaskSchedule.java:720)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.schedule(TornadoTaskSchedule.java:1049)
        tornado.api@0.15-dev/uk.ac.manchester.tornado.api.TaskSchedule.execute(TaskSchedule.java:300)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskSchedule.executeExpression(ReduceTaskSchedule.java:592)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskSchedule.scheduleWithReduction(ReduceTaskSchedule.java:577)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.rewriteTaskForReduceSkeleton(TornadoTaskSchedule.java:992)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.reduceAnalysis(TornadoTaskSchedule.java:1002)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.analyzeSkeletonAndRun(TornadoTaskSchedule.java:1012)
        tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.schedule(TornadoTaskSchedule.java:1042)
        tornado.api@0.15-dev/uk.ac.manchester.tornado.api.TaskSchedule.execute(TaskSchedule.java:300)
        org.bereft.greatexpenses.ReductionIrregular.run(ReductionIrregular.java:60)
        org.bereft.greatexpenses.ReductionIrregular.main(ReductionIrregular.java:81)
[TornadoVM-PTX-JNI] ERROR : cuStreamSynchronize -> Returned: 700
Result is not correct - iteration: 0 expected: 1011.7773048769373 but found: 1503.754977668702
Exception in thread "main" uk.ac.manchester.tornado.api.exceptions.TornadoRuntimeException: [ERROR] TornadoVM Bytecode not recognized
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.throwError(TornadoVM.java:650)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.execute(TornadoVM.java:769)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.TornadoVM.execute(TornadoVM.java:221)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.scheduleInner(TornadoTaskSchedule.java:720)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.schedule(TornadoTaskSchedule.java:1049)
        at tornado.api@0.15-dev/uk.ac.manchester.tornado.api.TaskSchedule.execute(TaskSchedule.java:300)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskSchedule.executeExpression(ReduceTaskSchedule.java:592)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.runReduceTaskSchedule(TornadoTaskSchedule.java:987)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.analyzeSkeletonAndRun(TornadoTaskSchedule.java:1014)
        at tornado.runtime@0.15-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskSchedule.schedule(TornadoTaskSchedule.java:1042)
        at tornado.api@0.15-dev/uk.ac.manchester.tornado.api.TaskSchedule.execute(TaskSchedule.java:300)
        at org.bereft.greatexpenses.ReductionIrregular.run(ReductionIrregular.java:60)
        at org.bereft.greatexpenses.ReductionIrregular.main(ReductionIrregular.java:81)
[TornadoVM-PTX-JNI] ERROR : cuStreamDestroy -> Returned: 700
        [JNI] /vol/xfs01/work/TornadoVM/drivers/ptx-jni/target/linux-amd64-release/sources/source/PTXStream.cpp:188 in function: free_staging_area_list result = 700

The script is:

source ~/work/TornadoVM/source.sh
 
tornado --debug -Xmx9G -XX:+PrintFlagsFinal -XX:+UseFMA -XX:+UseNUMA    \
        -XX:-UseZGC -XX:-UseG1GC -XX:+UseParallelGC -XX:-UseShenandoahGC \
        -ea -XX:-UseCompressedOops      \
-cp "$PWD/target/classes:$PWD/target/lib/*" org.bereft.greatexpenses.ReductionIrregular

The source is:

package org.bereft.greatexpenses;

import uk.ac.manchester.tornado.api.TaskSchedule;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.annotations.Reduce;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Random;
import java.util.stream.IntStream;

class ConfigurationReduce {

    public static final int MAX_ITERATIONS = 101;
}

class Stats {

    public static double computeMedian(ArrayList<Long> input) {
        Collections.sort(input);
        double middle = input.get(input.size() / 2); // odd size: the middle element
        if (input.size() % 2 == 0) { // even size: average the two middle elements
            middle = (input.get(input.size() / 2) + input.get(input.size() / 2 - 1)) / 2.0;
        }
        return middle;
    }
}

public class /*package uk.ac.manchester.tornado.examples.reductions;*/ ReductionIrregular {

    private static void reducedoubles(double[] input, @Reduce double[] output) {
        for (@Parallel int i = 0; i < input.length; i++) {
            output[0] += input[i];
        }
    }

    private void run(final int inputSize) {

        double[] input = new double[inputSize];
        double[] result = new double[]{0.0};
        Random r = new Random(101);

        //@formatter:off
        TaskSchedule task = new TaskSchedule("s0")
                .streamIn(input)
                .task("t0", ReductionIrregular::reducedoubles, input, result)
                .streamOut(result);
        //@formatter:on

        ArrayList<Long> timers = new ArrayList<>();
        for (int i = 0; i < ConfigurationReduce.MAX_ITERATIONS; i++) {

            IntStream.range(0, inputSize).parallel().forEach(idx -> {
                input[idx] = r.nextDouble();
            });
            double[] sequential = new double[1];
            reducedoubles(input, sequential);

            long start = System.nanoTime();
            task.execute();
            long end = System.nanoTime();
            timers.add((end - start));

            if (Math.abs(sequential[0] - result[0]) > 0.1f) {
                System.out.println("Result is not correct - iteration: " + i + " expected: " + sequential[0] + " but found: " + result[0]);
            } else {
                System.out.println("Iteration: " + i + " is correct");
            }
        }

        System.out.println("Median TotalTime: " + Stats.computeMedian(timers));

    }

    public static void main(String[] args) {
        int inputSize = 2000;
        if (args.length > 0) {
            inputSize = Integer.parseInt(args[0]);
        }
        System.out.println("Size = " + inputSize);
        new ReductionIrregular().run(inputSize);
    }
}

This might be related to https://forums.developer.nvidia.com/t/cuda-error-in-executeinternal-700-an-illegal-memory-access-was-encountered/191948
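
For readers decoding the log above: the numbers returned by the `cu*` calls are `CUresult` values from the CUDA driver API. The lookup below is a minimal reading aid, not part of TornadoVM; the class `CudaErrorNames` and its `describe` method are hypothetical names introduced here, and only the two codes seen in this report (plus success) are listed.

```java
import java.util.Map;

// Hypothetical helper (not part of TornadoVM): maps the CUresult codes
// that appear in this report to their names in the CUDA driver API.
public class CudaErrorNames {

    // Values as defined by the CUresult enum in cuda.h.
    private static final Map<Integer, String> NAMES = Map.of(
            0, "CUDA_SUCCESS",
            700, "CUDA_ERROR_ILLEGAL_ADDRESS",
            701, "CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES");

    static String describe(int code) {
        return NAMES.getOrDefault(code, "unknown CUresult " + code);
    }

    public static void main(String[] args) {
        for (int code : new int[] {700, 701}) {
            System.out.println(code + " -> " + describe(code));
        }
    }
}
```

Code 701 typically points at a launch configuration the device cannot satisfy (for example too many threads or registers per block), while 700 indicates an illegal memory access. After a 700 the CUDA context is left in an inconsistent state, which would be consistent with the later `cuStreamSynchronize` and `cuStreamDestroy` calls in the log returning 700 as well.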

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

jjfumero commented, Jul 26, 2022 (2 reactions)

The PTX Backend has been fixed to launch the correct parameters with the latest drivers. However, some reductions still report wrong results. We will provide a fix for this.

Meanwhile, the OpenCL backend should work for the same GPUs (30XX) and the latest NVIDIA drivers.

jjfumero commented, Oct 12, 2022 (1 reaction)

I finally got some time to look at the pending issues with the reductions. The thread-block was not set correctly. The following PR solves the issue: https://github.com/beehive-lab/TornadoVM/pull/210 This will be merged soon.

Thanks for all the reports.
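
To illustrate why a wrong thread-block setting can yield a wrong sum rather than a crash, here is a minimal CPU-side sketch of a generic two-stage reduction. It is an assumption-laden illustration, not TornadoVM's generated kernel and not the change made in the PR: the class `BlockReductionSketch`, the block size of 256, and the "too few blocks" failure mode are all hypothetical choices made for the example. The input size of 2000 matches the report and is deliberately not a multiple of the block size.

```java
import java.util.Arrays;

// Hypothetical CPU simulation of a block-wise reduction (not TornadoVM code).
public class BlockReductionSketch {

    // Stage 1: each simulated thread block sums its own slice of the input
    // and writes a single partial result.
    static double[] partialSums(double[] input, int blockSize, int numBlocks) {
        double[] partial = new double[numBlocks];
        for (int block = 0; block < numBlocks; block++) {
            int start = block * blockSize;
            int end = Math.min(start + blockSize, input.length);
            for (int i = start; i < end; i++) {
                partial[block] += input[i];
            }
        }
        return partial;
    }

    // Stage 2: the partial results are combined sequentially.
    static double combine(double[] partial) {
        double sum = 0.0;
        for (double p : partial) {
            sum += p;
        }
        return sum;
    }

    static double reduce(double[] input, int blockSize, int numBlocks) {
        return combine(partialSums(input, blockSize, numBlocks));
    }

    public static void main(String[] args) {
        double[] input = new double[2000];
        Arrays.fill(input, 1.0);
        int blockSize = 256; // assumed block size for the illustration

        int covering = (input.length + blockSize - 1) / blockSize; // 8 blocks, covers the tail
        int truncated = input.length / blockSize;                  // 7 blocks, drops the tail

        System.out.println(reduce(input, blockSize, covering));  // prints 2000.0
        System.out.println(reduce(input, blockSize, truncated)); // prints 1792.0
    }
}
```

With the launch geometry mismatched against the input size, the computation completes but part of the input never contributes to the result, which is the same class of symptom as the "Result is not correct" line in the log.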
