Speed accessing single resource

Hi,

I’m really impressed by classgraph - classloading is tricky, and I’m glad to see a library like this make it easier 😃.

Is accessing a single non-class resource via classgraph an intended use case? It’s clearly possible, but I’m finding that it is about an order of magnitude slower than loading directly from a ClassLoader (across a variety of classpath sizes), and startup speed is an important concern in my current situation. I’ve tried a variety of whitelisting options and disabled different types of scanning, but I’m not seeing the speed I’m hoping to achieve.

  private void processResourceClassloader(ClassLoader loader, String name) throws IOException {
    final Enumeration<URL> urls = loader.getResources(name);
    while (urls.hasMoreElements()) {
      final URL url = urls.nextElement();
      try (final InputStream in = url.openStream()) {
        doFastOperation(in);
      }
    }
  }

  private void processResourceClassgraph(ClassLoader loader, String name) throws IOException {
    try (final ScanResult scan = new ClassGraph()
        .overrideClassLoaders(loader)
        .disableNestedJarScanning()
        .disableModuleScanning()
        .whitelistClasspathElementsContainingResourcePath(name)
        //.whitelistPaths(name)
        .scan()) {
      for (final Resource resource : scan.getResourcesWithPath(name)) {
        try (final InputStream in = resource.open()) {
          doFastOperation(in);
        }
      }
    }
  }

Issue Analytics

Created 4 years ago
Comments: 8 (6 by maintainers)

Top GitHub Comments

devinrsmith commented, Nov 1, 2019

I’ve done a bunch of profiling / JVM optimizations over the years. JMH is a good tool for micro-benchmarking, and probably the best place to start, but it might not be the best for capturing startup speed behavior. Profiling application startup (or cold-path code) is a tougher problem. You can attach a JVM agent (https://github.com/jvm-profiling-tools/perf-map-agent is one example) or use Java Flight Recorder to extract timing data, which can then be analyzed in a number of ways (flame graphs are popular, but I don’t have much experience with that style).
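To illustrate why a warmed-up micro-benchmark and a startup measurement can disagree, here is a minimal sketch that times the first call to an operation separately from later calls. It is only a rough illustration, not a substitute for JMH or a profiler. The class name, the `timeNanos` helper, and the stand-in workload (a plain `ClassLoader.getResource` lookup) are assumptions made for this example.

```java
class ColdVsWarmTimingDemo {

    /** Returns the wall-clock time of a single run of the task, in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        ClassLoader loader = ColdVsWarmTimingDemo.class.getClassLoader();
        // Stand-in workload (an assumption): replace with the code path you care about
        Runnable lookup = () -> loader.getResource("java/lang/Object.class");

        // First call: includes one-off costs such as class loading and cache population
        long cold = timeNanos(lookup);

        // Later calls: closer to what a warmed-up micro-benchmark would report
        int warmRuns = 1000;
        long warmTotal = 0;
        for (int i = 0; i < warmRuns; i++) {
            warmTotal += timeNanos(lookup);
        }

        System.out.printf("first call: %d ns, later calls (mean): %d ns%n",
                cold, warmTotal / warmRuns);
    }
}
```

For startup-sensitive code, the first-call figure is the one that matters, and it needs a fresh JVM per sample to be meaningful.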

I’m happy to help dig in as timing permits. Cheers!

lukehutch commented, Nov 25, 2019

I did some extensive profiling of ClassGraph, and found a number of opportunities for optimization, mostly in how zipfile entries were being parsed (e.g. it was possible to defer parsing of modification date and permission information until the user requests that info for a Resource). ClassGraph is a bit faster now. (Released in 4.8.55.)

However, unfortunately I don’t see any other major avenues for optimization. ClassGraph has to do additional work on startup that a regular classloader does not, for example starting up a thread pool, querying all classloaders using reflection to find classpath entries, and reading any directories, jarfile central directory entries, and modules on the classpath and module path for every scan performed (whereas classloaders can cache this information).

I tried to create an apples-to-apples comparison, by copying a jarfile 500x on disk (to prevent a Java classloader from cheating by caching the jarfile central directory info somewhere, either in the JRE, or the classloader, or the classloader’s parent). ClassGraph was called with overrideClassLoaders(classLoader) for a single target classLoader, which disables scanning of modules and other classloaders.

The results indicate that ClassGraph 4.8.55 is about 4.5x slower than URLClassLoader at loading a single resource (this varies between 3.5-7x for large jars). I suspect the performance gap would shrink as you load more and more resources from the same jars, since the cost of reading the jarfile central directory can be amortized across all the resources read.

Right now the biggest bottleneck in ClassGraph is a method that reads an unsigned short from a byte[] array using ((arr[i + 1] & 0xff) << 8) | (arr[i] & 0xff), in order to read the jarfile central directory header for the jar. This takes almost 30% of the total scan time, and collectively all the central directory parsing code takes about half the total scan time. There’s no way to speed this up other than rewriting in native code, unfortunately. The JRE’s own ZipFile.getEntry() does this work in native code, which is one of the main speed advantages. Unfortunately the JRE method can’t be used, because the ZipFile API uses a mutex around all API calls, and ClassGraph is parallelized.
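For readers unfamiliar with the expression quoted above, this is a small self-contained sketch of the little-endian unsigned 16-bit read, cross-checked against `java.nio.ByteBuffer`. The class and method names are made up for illustration and are not ClassGraph's own.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LittleEndianShortDemo {

    /** Reads a little-endian unsigned 16-bit value starting at arr[i]. */
    static int readUnsignedShort(byte[] arr, int i) {
        // The & 0xff masks undo Java's sign extension of byte to int
        return ((arr[i + 1] & 0xff) << 8) | (arr[i] & 0xff);
    }

    /** The same read via ByteBuffer, used here only as a cross-check. */
    static int readUnsignedShortViaBuffer(byte[] arr, int i) {
        return ByteBuffer.wrap(arr).order(ByteOrder.LITTLE_ENDIAN).getShort(i) & 0xffff;
    }

    public static void main(String[] args) {
        // Zip central directory file header signature (PK 01 02), then a 16-bit field
        byte[] header = { 0x50, 0x4b, 0x01, 0x02, (byte) 0xfe, (byte) 0xff };
        System.out.println(readUnsignedShort(header, 4));          // prints 65534
        System.out.println(readUnsignedShortViaBuffer(header, 4)); // prints 65534
    }
}
```

Zip headers store multi-byte fields in little-endian order, which is why the byte at the higher index is shifted into the high bits.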

So I think this comes down to a question of whether you value ClassGraph’s flexibility (its ability to work with a wide range of different classloaders and classpath specification mechanisms), or the JRE’s own speed advantage, obtained by calling into native code.
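For reference, the JRE-side route for pulling a single entry out of a jar looks roughly like the sketch below, using only `java.util.zip`. It writes a throwaway jar to a temp file so that it runs anywhere. The class name, helper names, and entry name are assumptions for illustration; this is not how ClassGraph reads jars.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class ZipSingleEntryDemo {

    /** Writes a one-entry jar to a temp file and returns its path. */
    static Path writeSampleJar(String entryName, String content) {
        try {
            Path jar = Files.createTempFile("single-entry-demo-", ".jar");
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(jar))) {
                out.putNextEntry(new ZipEntry(entryName));
                out.write(content.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Looks up one entry by name and returns its bytes, or null if it is absent. */
    static byte[] readEntry(Path jar, String entryName) {
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                return null;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return in.readAllBytes();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeSampleJar("META-INF/example.txt", "hello");
        try {
            byte[] bytes = readEntry(jar, "META-INF/example.txt");
            System.out.println(new String(bytes, StandardCharsets.UTF_8)); // prints hello
        } finally {
            Files.deleteIfExists(jar);
        }
    }
}
```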

However, if you are performing more than one scan, you can at least save some startup time in ClassGraph by providing your own ExecutorService to the scan method. (Don’t forget to shut down the ExecutorService when you have finished with it.) More than 1 thread probably won’t help if you’re pulling a single resource from a single jarfile, but if you’re pulling multiple resources from multiple jarfiles, or additionally scanning classfiles, more threads should help. You can see this usage pattern in the benchmark code below.

import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

import io.github.classgraph.ClassGraph;
import io.github.classgraph.ScanResult;
import nonapi.io.github.classgraph.utils.FileUtils;

public class ClassGraphStartupBenchmarkMain {

    private static void createTempFiles(String jarPath, List<String> paths) throws Exception {
        for (String path : paths) {
            final var cmd = "cp -f " + jarPath + " " + path;
            int result = Runtime.getRuntime().exec(cmd).waitFor();
            if (result != 0) {
                throw new IOException("Command returned status code " + result + ": " + cmd);
            }
        }
    }

    private static void deleteTempFiles(List<String> paths) {
        for (String path : paths) {
            new File(path).delete();
        }
    }

    public static void main(String[] args) throws Exception {
        // 32MB, 16870 entries:
        String jarPath = "/home/luke/.m2/repository/org/jetbrains/kotlin/kotlin-compiler/1.3.21/kotlin-compiler-1.3.21.jar";
        // 160kB:
        String resourcePath = "win32/amd64/liblz4-java.so";

        //        // 472kB, 245 entries:
        //        String jarPath = "/home/luke/.m2/repository/io/github/classgraph/classgraph/4.8.54/classgraph-4.8.54.jar";
        //        // 2.8kB:
        //        String resourcePath = "META-INF/MANIFEST.MF";

        String tmpFilePathPrefix = "/home/luke/Downloads/";
        int N = 500;

        List<String> paths = new ArrayList<>();
        for (int i = 0; i < N; i++) {
            final var path = tmpFilePathPrefix + "benchmark-tmp-" + i + ".jar";
            paths.add(path);
        }

        try {
            System.out.println("Creating temp files");
            createTempFiles(jarPath, paths);

            // ClassGraph: -----------------------------------------------------------------------------------------

            System.out.println("Testing ClassGraph speed");
            ExecutorService executorService = null;
            try {
                // Set up ExecutorService so that ClassGraph doesn't have to start a new one for each new scan
                int numWorkerThreads = 1;
                executorService = Executors.newFixedThreadPool(numWorkerThreads, new ThreadFactory() {
                    public Thread newThread(Runnable r) {
                        Thread t = new Thread(r);
                        // Kill worker threads if main thread dies
                        t.setDaemon(true);
                        return t;
                    }
                });

                // Recreate classloaders so caching does not affect timing
                List<URLClassLoader> classLoadersC = new ArrayList<>();
                for (String path : paths) {
                    classLoadersC.add(new URLClassLoader(new URL[] { new URL("file://" + path) }));
                }

                long t2 = System.nanoTime();
                for (URLClassLoader classLoader : classLoadersC) {
                    try (ScanResult scanResult = new ClassGraph().overrideClassLoaders(classLoader)
                            .scan(executorService, numWorkerThreads)) {
                        scanResult.getResourcesWithPath(resourcePath).forEachByteArray((resource, byteArray) -> {
                            // Resource content has been read
                        });
                    }
                }
                long t3 = System.nanoTime();
                
                System.out.printf("ClassGraph took %.3f sec\n", (t3 - t2) * 1.0e-9);

                for (URLClassLoader classLoader : classLoadersC) {
                    classLoader.close();
                }

            } finally {
                if (executorService != null) {
                    executorService.shutdownNow();
                }
            }

            // Recreate temp files so there's no caching advantage -------------------------------------------------
            
            System.out.println("Recreating temp files");
            deleteTempFiles(paths);
            createTempFiles(jarPath, paths);

            // URLClassLoader: -------------------------------------------------------------------------------------

            System.out.println("Testing URLClassLoader speed");
            long t0 = System.nanoTime();
            List<URLClassLoader> classLoadersU = new ArrayList<>();
            for (String path : paths) {
                classLoadersU.add(new URLClassLoader(new URL[] { new URL("file://" + path) }));
            }
            for (URLClassLoader classLoader : classLoadersU) {
                URL resURL = classLoader.findResource(resourcePath);
                try (var is = resURL.openStream()) {
                    FileUtils.readAllBytesAsArray(is, -1);
                }
                classLoader.close();
            }
            long t1 = System.nanoTime();

            System.out.printf("URLClassLoader took %.3f sec\n", (t1 - t0) * 1.0e-9);

        } finally {
            // Delete temp files
            System.out.println("Deleting temp files");
            deleteTempFiles(paths);
        }
        System.out.println("Finished");
    }

    // Times for ClassGraph / URLClassLoader (multiple runs)
    //
    //     N=500, 32MB jar, 16870 entries, reading 160kB resource:
    //
    //         9.942 / 1.642 = 6.05x slower
    //         13.345 / 3.727 = 3.58x slower
    //         11.680 / 1.628 = 7.17x slower 
    //         10.457 / 2.545 = 4.11x slower 
    //         11.146 / 2.497 = 4.46x slower
    //         10.430 / 2.553 = 4.09x slower 
    //
    //     N=500, 472kB jar, 245 entries, reading 2.8kB resource
    //
    //         0.729 / 0.158 = 4.61x slower
    //         0.794 / 0.168 = 4.73x slower
    //         0.776 / 0.158 = 4.91x slower
    //         0.779 / 0.157 = 4.96x slower
}
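As a quick sanity check on the numbers in the trailing comment of the benchmark, the sketch below recomputes each slowdown ratio from the listed timings and reports the median per configuration. The timings are copied from that comment; the class and helper names are made up for this example, and the median is just one possible summary statistic.

```java
import java.util.Arrays;

class SlowdownRatioDemo {

    // { ClassGraph seconds, URLClassLoader seconds } per run, copied from the comment above
    static final double[][] LARGE_JAR_RUNS = {
        { 9.942, 1.642 }, { 13.345, 3.727 }, { 11.680, 1.628 },
        { 10.457, 2.545 }, { 11.146, 2.497 }, { 10.430, 2.553 }
    };
    static final double[][] SMALL_JAR_RUNS = {
        { 0.729, 0.158 }, { 0.794, 0.168 }, { 0.776, 0.158 }, { 0.779, 0.157 }
    };

    /** How many times slower ClassGraph was than URLClassLoader in one run. */
    static double ratio(double classGraphSec, double urlClassLoaderSec) {
        return classGraphSec / urlClassLoaderSec;
    }

    /** Computes the slowdown ratio for every run. */
    static double[] ratios(double[][] runs) {
        double[] result = new double[runs.length];
        for (int i = 0; i < runs.length; i++) {
            result[i] = ratio(runs[i][0], runs[i][1]);
        }
        return result;
    }

    /** Median of the values, leaving the input array unmodified. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.printf("32MB jar:  median %.2fx slower%n", median(ratios(LARGE_JAR_RUNS)));
        System.out.printf("472kB jar: median %.2fx slower%n", median(ratios(SMALL_JAR_RUNS)));
    }
}
```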

Unfortunately I think that’s about all that can be done for now unless native code is written to accelerate zipfile central directory parsing. Sorry about that!

You’re welcome to profile this further, though, if you want to see whether anything else can be sped up for your specific use case, and I’ll reopen this if you find more things that can be optimized.
