Q: Is there a way to control the threads used by pigz?
I’d like to have control over the number of threads used by pigz without modifying the source code. It seems that all cores are used for (de)compression even if I specify `--cores`?
That’s not a very “social” approach when multiple users share a single server … 😉
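For reference, `pigz` itself exposes a `-p`/`--processes` option that caps its thread count. The sketch below (Python, with illustrative file names and thread count, not part of Cutadapt) shows one way to pipe data through a thread-limited `pigz`:

```python
import subprocess

# Compress a file through pigz while capping it at 4 threads (-p 4).
# pigz reads from stdin and writes to stdout when invoked with -c.
with open("reads.fastq", "rb") as src, open("reads.fastq.gz", "wb") as dst:
    subprocess.run(["pigz", "-p", "4", "-c"], stdin=src, stdout=dst, check=True)
```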
Issue Analytics
- Created: 6 years ago
- Comments: 32 (15 by maintainers)
Top GitHub Comments
Perhaps I should clarify that – contrary to what I wrote in one of the earliest comments in this issue – I acknowledge that there is a problem, and that I am working towards solving it.
However, Cutadapt isn’t my main job, so I need to proceed at my own pace. I can fortunately spend some of my working hours on personal (bioinformatics) projects and have used a lot of that time for Cutadapt. I’m motivated to make Cutadapt work for others, even if I personally don’t benefit from it – but it needs to remain fun. As long as I can do the things I want, it’s fine, but when the discussion moves into a territory where I get the impression that demands are made, then the fun stops. To be sure, the tone in this thread has been civil and reasonable, but it is the sheer amount of text which is not helping.
Let me figure this out, one step at a time. Currently, I’m trying to make `--cores=1` use exactly one core by doing all compression and decompression in-process, without calling an external process at all. With a single core, Cutadapt doesn’t spawn any worker or I/O processes anyway, so this should be relatively easy. Perhaps I can get this done this week or next.

@wookietreiber Thanks for your insights. I don’t have the time to reply at the moment, so this may need to wait.
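As a rough illustration of the in-process approach described above (a minimal sketch, not Cutadapt’s actual code), Python’s built-in gzip module can read and write compressed files without spawning any external process:

```python
import gzip

# Decompress the input and recompress the output entirely in-process,
# so no pigz/gzip subprocess (and no extra core) is involved.
with gzip.open("in.fastq.gz", "rt") as infile, \
        gzip.open("out.fastq.gz", "wt", compresslevel=6) as outfile:
    for line in infile:
        outfile.write(line)
```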
One correction: Cutadapt no longer lets `pigz` use all available cores for compression; it has been limited to 4 for a while now. And decompression has recently been limited to one external process.

I’ve just pushed a commit that makes Cutadapt no longer use subprocesses for gzip compression when `--cores=1` is used (or when `--cores` is not specified). Input files are still read through a `pigz` process (using one thread) because its gzip decompression is more efficient. Total CPU usage is exactly 100%, though (it appears that the two processes never run at the same time). I think I may remove this reader subprocess as well, because gzip decompression is just 2.5% of the total time (when reading a `.fastq.gz`, removing one adapter, and writing to a `.fastq.gz`).
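The reader-subprocess pattern mentioned above can be sketched roughly as follows (`open_via_pigz` is a hypothetical helper for illustration, not a Cutadapt function):

```python
import subprocess

def open_via_pigz(path):
    # Return a binary file-like object yielding the decompressed data.
    # The caller is responsible for draining the stream and waiting on the process.
    proc = subprocess.Popen(
        ["pigz", "-cd", path],  # -c: write to stdout, -d: decompress
        stdout=subprocess.PIPE,
    )
    return proc.stdout

reader = open_via_pigz("in.fastq.gz")
first_line = reader.readline()
```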
In case anyone is wondering why this was ever done with subprocesses: gzip compression and decompression using Python’s built-in gzip module used to be very slow, so using an external `gzip` process was a workaround to get good speed (`pigz` came later). Nowadays, they are equivalent, so we can go back to the builtin.

I’ll start looking into the multi-core case, as time permits.
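To check the “equivalent speed” claim on one’s own data, a rough timing sketch like the following could be used (the file name is a placeholder, not from the issue):

```python
import gzip
import subprocess
import time

path = "reads.fastq.gz"  # placeholder path

# In-process decompression with the built-in gzip module.
t0 = time.perf_counter()
with gzip.open(path, "rb") as f:
    for _ in f:
        pass
print("builtin gzip:", time.perf_counter() - t0, "s")

# Decompression through an external pigz process.
t0 = time.perf_counter()
proc = subprocess.Popen(["pigz", "-cd", path], stdout=subprocess.PIPE)
for _ in proc.stdout:
    pass
proc.wait()
print("pigz subprocess:", time.perf_counter() - t0, "s")
```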