Cromwell slows down in large scatters (against SGE)
I have a 1300-shard scatter, and while Cromwell started off quickly, it is now submitting jobs at a rate of about one per minute (!). A jstack shows that only one thread is doing work, with the trace below.
This doesn't seem to happen against GCS.
Also of note: I'm running without a database.
The WDL (self-contained) is here:
task Decapitate {
  File inputFile
  String outputFilename

  command {
    tail -n+2 ${inputFile}
  }
  runtime {
    memory: "4 GB"
  }
  output {
    Array[Array[String]] outputMatrix = read_tsv(stdout())
  }
}
task BreakUpRow {
  String in_SAMPLE_ALIAS
  String in_PROJECT
  String in_DATA_TYPE
  String in_COMPARE_SAMPLE
  String in_COMPARE_PROJECT
  String in_CLEAN_SAMPLE
  String in_CLEAN_COMPARE_SAMPLE
  String in_FILE
  String in_CONTAM

  command {
    # do nothing
  }
  runtime {
    memory: "1 GB"
  }
  output {
    String SAMPLE_ALIAS = "${in_SAMPLE_ALIAS}"
    String PROJECT = "${in_PROJECT}"
    String DATA_TYPE = "${in_DATA_TYPE}"
    String COMPARE_SAMPLE = "${in_COMPARE_SAMPLE}"
    String COMPARE_PROJECT = "${in_COMPARE_PROJECT}"
    String CLEAN_SAMPLE = "${in_CLEAN_SAMPLE}"
    String CLEAN_COMPARE_SAMPLE = "${in_CLEAN_COMPARE_SAMPLE}"
    String FILE = "${in_FILE}"
    String CONTAM = "${in_CONTAM}"
  }
}
task ExtractContaminant {
  String picard_jar
  File input_bam
  String contaminant_sample
  File haplotype_database
  File ref_fasta
  File ref_fasta_idx
  File ref_dict
  String contamination_rate

  command {
    java -jar ${picard_jar} IdentifyContaminant \
      I=${input_bam} \
      O=${contaminant_sample}.vcf \
      H=${haplotype_database} \
      R=${ref_fasta} \
      C=${contamination_rate} \
      SAMPLE_ALIAS=${contaminant_sample}
  }
  runtime {
    memory: "4 GB"
  }
  output {
    File vcf = "${contaminant_sample}.vcf"
    File vcf_idx = "${contaminant_sample}.vcf.idx"
  }
}
task Fingerprint {
  String PICARD
  File input_vcf
  File input_vcf_index
  File haplotype_database_file
  File genotypes
  String sample_name
  String output_name

  command {
    java -Dsamjdk.buffer_size=131072 -XX:GCTimeLimit=50 -XX:GCHeapFreeLimit=10 -Xmx1024m \
      -jar ${PICARD} \
      CheckFingerprint \
      INPUT=${input_vcf} \
      OUTPUT=${output_name} \
      GENOTYPES=${genotypes} \
      HAPLOTYPE_MAP=${haplotype_database_file} \
      SAMPLE_ALIAS="${sample_name}"
  }
  runtime {
    continueOnReturnCode: true
    memory: "4 GB"
  }
  output {
    File summary_metrics = "${output_name}.fingerprinting_summary_metrics"
    File detail_metrics = "${output_name}.fingerprinting_detail_metrics"
  }
}
workflow fingerprintContaminants {
  File SamplesTSV = "/dsde/working/farjoun/AGBT2017/SampleSetWithcontamination.txt"
  String picard_jar = "/seq/software/picard/current/bin/picard-private.jar"
  File haplotype_database = "/seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.haplotype_database.txt"
  File ref_fasta = "/seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.fasta"
  File ref_fasta_idx = "/seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.fasta.fai"
  File ref_dict = "/seq/references/Homo_sapiens_assembly19/v1/Homo_sapiens_assembly19.dict"

  call Decapitate {
    input:
      inputFile = SamplesTSV,
      outputFilename = SamplesTSV + ".headless"
  }

  scatter (row in Decapitate.outputMatrix) {
    call BreakUpRow {
      input:
        in_SAMPLE_ALIAS = row[0],
        in_PROJECT = row[1],
        in_DATA_TYPE = row[2],
        in_COMPARE_SAMPLE = row[3],
        in_COMPARE_PROJECT = row[4],
        in_CLEAN_SAMPLE = row[5],
        in_CLEAN_COMPARE_SAMPLE = row[6],
        in_FILE = row[7],
        in_CONTAM = row[8]
    }
    call ExtractContaminant {
      input:
        picard_jar = picard_jar,
        input_bam = BreakUpRow.FILE,
        contaminant_sample = BreakUpRow.CLEAN_COMPARE_SAMPLE,
        contamination_rate = BreakUpRow.CONTAM,
        haplotype_database = haplotype_database,
        ref_fasta = ref_fasta,
        ref_fasta_idx = ref_fasta_idx,
        ref_dict = ref_dict
    }
    call Fingerprint {
      input:
        PICARD = picard_jar,
        input_vcf = ExtractContaminant.vcf,
        input_vcf_index = ExtractContaminant.vcf_idx,
        haplotype_database_file = haplotype_database,
        genotypes = "/seq/references/reference_genotypes/non-hapmap/" + BreakUpRow.COMPARE_PROJECT + "/Homo_sapiens_assembly19/" + BreakUpRow.CLEAN_COMPARE_SAMPLE + ".vcf",
        sample_name = BreakUpRow.CLEAN_COMPARE_SAMPLE,
        output_name = BreakUpRow.CLEAN_COMPARE_SAMPLE + "." + BreakUpRow.CONTAM + "." + BreakUpRow.DATA_TYPE
    }
  }
}
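For scale: the sample sheet feeding the scatter has roughly 1300 rows, and each shard issues three calls (BreakUpRow, ExtractContaminant, Fingerprint), so the execution store ends up tracking on the order of 4,000 call keys. Here is a rough cost model in plain Scala; the "every completion rescans every key" assumption is mine, inferred from the trace below rather than measured:

object ScatterCostSketch extends App {
  // Assumptions taken from this report: ~1300 rows, 3 calls per scatter shard.
  val shards = 1300
  val callsPerShard = 3
  val totalKeys = shards * callsPerShard

  // If every call completion triggers a scan over every key in the
  // execution store, total work over the run grows quadratically.
  val totalKeyVisits = totalKeys.toLong * totalKeys

  println(s"execution store keys:          $totalKeys")      // 3900
  println(s"key visits over the whole run: $totalKeyVisits") // ~15.2 million
}

Fifteen million visits would be cheap on their own; the trace below suggests each visit also rebuilds a call's fully qualified name, allocating as it goes, which is where the time appears to go.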
The stack trace is here:
java.lang.Thread.State: RUNNABLE
at scala.collection.mutable.AbstractBuffer.<init>(Buffer.scala:49)
at scala.collection.mutable.ListBuffer.<init>(ListBuffer.scala:46)
at scala.collection.immutable.List$.newBuilder(List.scala:453)
at scala.collection.generic.GenericTraversableTemplate$class.genericBuilder(GenericTraversableTemplate.scala:70)
at scala.collection.AbstractTraversable.genericBuilder(Traversable.scala:104)
at scala.collection.generic.GenTraversableFactory$GenericCanBuildFrom.apply(GenTraversableFactory.scala:57)
at scala.collection.generic.GenTraversableFactory$GenericCanBuildFrom.apply(GenTraversableFactory.scala:52)
at scala.collection.SeqLike$class.$colon$plus(SeqLike.scala:556)
at scala.collection.AbstractSeq.$colon$plus(Seq.scala:41)
at wdl4s.Scope$class.fullyQualifiedName(Scope.scala:92)
at wdl4s.Call.fullyQualifiedName(Call.scala:60)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore$$anonfun$cromwell$engine$workflow$lifecycle$execution$ExecutionStore$$isDone$1$1.apply(ExecutionStore.scala:94)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore$$anonfun$cromwell$engine$workflow$lifecycle$execution$ExecutionStore$$isDone$1$1.apply(ExecutionStore.scala:93)
at scala.collection.Iterator$class.exists(Iterator.scala:919)
at scala.collection.AbstractIterator.exists(Iterator.scala:1336)
at scala.collection.IterableLike$class.exists(IterableLike.scala:77)
at scala.collection.AbstractIterable.exists(Iterable.scala:54)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore.cromwell$engine$workflow$lifecycle$execution$ExecutionStore$$isDone$1(ExecutionStore.scala:93)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore$$anonfun$5.apply(ExecutionStore.scala:98)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore$$anonfun$5.apply(ExecutionStore.scala:98)
at scala.collection.Iterator$class.forall(Iterator.scala:905)
at scala.collection.AbstractIterator.forall(Iterator.scala:1336)
at scala.collection.IterableLike$class.forall(IterableLike.scala:75)
at scala.collection.AbstractIterable.forall(Iterable.scala:54)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore.arePrerequisitesDone(ExecutionStore.scala:98)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore.cromwell$engine$workflow$lifecycle$execution$ExecutionStore$$isRunnable(ExecutionStore.scala:43)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore$$anonfun$runnableScopes$1.applyOrElse(ExecutionStore.scala:37)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore$$anonfun$runnableScopes$1.applyOrElse(ExecutionStore.scala:37)
at scala.collection.immutable.List.collect(List.scala:303)
at cromwell.engine.workflow.lifecycle.execution.ExecutionStore.runnableScopes(ExecutionStore.scala:37)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.cromwell$engine$workflow$lifecycle$execution$WorkflowExecutionActor$$startRunnableScopes(WorkflowExecutionActor.scala:347)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.handleExecutionSuccess(WorkflowExecutionActor.scala:326)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.cromwell$engine$workflow$lifecycle$execution$WorkflowExecutionActor$$handleCallSuccessful(WorkflowExecutionActor.scala:304)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$3.applyOrElse(WorkflowExecutionActor.scala:97)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor$$anonfun$3.applyOrElse(WorkflowExecutionActor.scala:82)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at akka.actor.FSM$class.processEvent(FSM.scala:663)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.akka$actor$LoggingFSM$$super$processEvent(WorkflowExecutionActor.scala:33)
at akka.actor.LoggingFSM$class.processEvent(FSM.scala:799)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.processEvent(WorkflowExecutionActor.scala:33)
at akka.actor.FSM$class.akka$actor$FSM$$processMsg(FSM.scala:657)
at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:651)
at akka.actor.Actor$class.aroundReceive(Actor.scala:484)
at cromwell.engine.workflow.lifecycle.execution.WorkflowExecutionActor.aroundReceive(WorkflowExecutionActor.scala:33)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
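The hot frame is wdl4s Scope.fullyQualifiedName, re-derived inside ExecutionStore.isDone / arePrerequisitesDone on every pass of runnableScopes. Below is a minimal sketch of that recompute-every-time pattern and one possible mitigation (caching the name per scope); the types and names are invented for illustration and are not the actual wdl4s or Cromwell code:

object FqnSketch extends App {
  trait Scope {
    def unqualifiedName: String
    def parent: Option[Scope]

    // Walks the ancestry on every call, allocating fresh collections each
    // time -- the shape of work the trace shows per dependency check.
    private def ancestry: List[Scope] = parent match {
      case Some(p) => p :: p.ancestry
      case None    => Nil
    }

    def fullyQualifiedName: String =
      (ancestry.reverse.map(_.unqualifiedName) :+ unqualifiedName).mkString(".")

    // One possible mitigation: compute the name once per scope and reuse it.
    lazy val cachedFullyQualifiedName: String = fullyQualifiedName
  }

  case class SimpleScope(unqualifiedName: String, parent: Option[Scope]) extends Scope

  val workflow = SimpleScope("fingerprintContaminants", None)
  val shard    = SimpleScope("scatter_0", Some(workflow)) // shard name is invented
  val call     = SimpleScope("Fingerprint", Some(shard))

  println(call.fullyQualifiedName)       // fingerprintContaminants.scatter_0.Fingerprint
  println(call.cachedFullyQualifiedName) // same string, built only once
}

Whether the fix that was eventually merged took this exact approach isn't shown here; the sketch only illustrates why a per-check string rebuild hurts once the store holds thousands of call keys.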
Top GitHub Comments
Cool! I merged the PR; this fix is now in develop, so I'll close this!
With some help from @Horneth, it became clear that I was using the wrong jar when I ran with file-path call caching. Everything seems to be working much faster! Thank you!