These are chat archives for nextflow-io/nextflow

7th
Aug 2018
Luca Cozzuto
@lucacozzuto
Aug 07 2018 11:04
Hi all, is there a way to specify a glob pattern in "output" so that the matching files are sent to a channel?
    output:
    file '*.{txt, pdf}' into my_output
Paolo Di Tommaso
@pditommaso
Aug 07 2018 11:06
'*.{txt,pdf}' with no blank after the comma
Luca Cozzuto
@lucacozzuto
Aug 07 2018 11:09
thanks! since they are mutually exclusive I also added
    output:
    file '*.{txt,pdf}' optional true into my_output
Paolo Di Tommaso
@pditommaso
Aug 07 2018 11:10
not needed
it's implicitly an *or* condition
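For illustration only (this is not from the original conversation): a minimal sketch of a complete process using the pattern discussed above, written in the `file ... into` channel syntax used in this chat. The process name `make_report`, the file name `report.txt` and the script body are hypothetical.

```groovy
// Hypothetical process; names and script body are placeholders.
process make_report {

    output:
    // The brace alternatives act as an "or": any file ending in .txt or .pdf is collected.
    // No blank after the comma, otherwise the blank becomes part of the alternative.
    file '*.{txt,pdf}' into my_output

    script:
    """
    echo "some text" > report.txt
    """
}

// Print whatever the pattern collected.
my_output.subscribe { println "Collected: $it" }
```

In this sketch only a `.txt` file is produced and the pattern still matches, which is why `optional true` is not needed for mutually exclusive extensions.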
Luca Cozzuto
@lucacozzuto
Aug 07 2018 11:13
wonderful
Paolo Di Tommaso
@pditommaso
Aug 07 2018 11:13
:smile:
Luca Cozzuto
@lucacozzuto
Aug 07 2018 11:13
I think this was not described in the documentation (patterns are only in publishDir)
;)
Paolo Di Tommaso
@pditommaso
Aug 07 2018 11:14
DIY
:wink:
Luca Cozzuto
@lucacozzuto
Aug 07 2018 11:23
done
WTFM
;)
Paolo Di Tommaso
@pditommaso
Aug 07 2018 11:24
:smile:
Jose Espinosa-Carrasco
@JoseEspinosa
Aug 07 2018 13:55

I am getting this error

ERROR ~ Error executing process > 'fraction_tumoral (3)'

Caused by:
  Process `fraction_tumoral (3)` terminated with an error exit status (1)

Command executed:

  samtools view -h -bs 0.15 G25781.TCGA-50-6597-01A-11D-1855-08.4.bam -o G25781.TCGA-50-6597-01A-11D-1855-08.4.{x}.bam


Command exit status:
  1

Command output:
  (empty)

Command wrapper:
  nxf-scratch-dir fsupeksvr:/tmp/nxf.Ng9sJzR5Hx
  [E::bgzf_flush] File write failed (wrong size)
  tee: .command.err: No space left on device
  samtools view: writing to "G25781.TCGA-50-6597-01A-11D-1855-08.4.{x}.bam" failed: No space left on device
  [E::bgzf_close] File write failed
  samtools view: error closing "G25781.TCGA-50-6597-01A-11D-1855-08.4.{x}.bam": -1

Work dir:
  /g/strcombio/fsupek_cancer3/jespinosa/nxf_work/0c/5ae9b0cf2ceca06ae596f025d25b42

I know there is not much space on /tmp, so I changed NXF_TEMP to a different folder, but from this error it seems it is still complaining about /tmp/nxf.Ng9sJzR5Hx. Any clue?

Paolo Di Tommaso
@pditommaso
Aug 07 2018 13:55
No space left on device
buy a bigger hard disk ! :smile:
Jose Espinosa-Carrasco
@JoseEspinosa
Aug 07 2018 13:56
is there a way to keep temporary files outside /tmp/?
@pditommaso the disk where NXF_WORK and NXF_TEMP are set has enough space; if I understand the error correctly, the problem is with the temporary files
Paolo Di Tommaso
@pditommaso
Aug 07 2018 13:59
don't use scratch, or set a different scratch path with process.scratch = '/some/path/'
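For illustration only (not from the original conversation): a minimal sketch of how the two options could look in a `nextflow.config` file. The path is the same placeholder used above and would need to point at a volume with enough free space.

```groovy
// nextflow.config (illustrative sketch; the path is a placeholder)

// Option 1: disable the node-local scratch directory,
// so tasks run directly in their work directory.
process.scratch = false

// Option 2: keep scratch enabled but point it at a larger volume.
// process.scratch = '/some/path/'
```

Only one of the two settings should be active at a time; the second is commented out here for that reason.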
Jose Espinosa-Carrasco
@JoseEspinosa
Aug 07 2018 13:59
thanks
Tim Dudgeon
@tdudgeon
Aug 07 2018 14:10
I wondered if anyone had any thoughts on the problem I reported here on Aug 3 with the -resume option re-executing things that didn't need re-executing.
Evan Floden
@evanfloden
Aug 07 2018 14:12
@tdudgeon One final thing I was thinking, is it possible the use of the $time variables in bash invalidates the cache?
Paolo Di Tommaso
@pditommaso
Aug 07 2018 14:13
no, it should not
is a test data set included in the repo ?
Tim Dudgeon
@tdudgeon
Aug 07 2018 14:14
No, but I can try to add something.
Paolo Di Tommaso
@pditommaso
Aug 07 2018 14:15
yes, please, so I can give it a try
Tim Dudgeon
@tdudgeon
Aug 07 2018 14:15
OK, will do.
Tim Dudgeon
@tdudgeon
Aug 07 2018 14:30
@pditommaso Try this: https://github.com/InformaticsMatters/dls-fragalysis-stack-openshift/tree/master/s2g-processor/nextflow
Run until some of the cgd tasks have completed, then Ctrl-C and run again with the -resume option.
hang on - the input file is missing :-(
Tim Dudgeon
@tdudgeon
Aug 07 2018 14:37
OK, sorted. Over-zealous .gitignore file!
Paolo Di Tommaso
@pditommaso
Aug 07 2018 15:28
the problem is this publishDir 'results/', mode: 'move'
Note: this is only supposed to be used for a terminating process i.e. a process whose output is not consumed by any other downstream process.
Tim Dudgeon
@tdudgeon
Aug 07 2018 15:41
but this is a terminal process. No process consumes this. It is the final result.
Paolo Di Tommaso
@pditommaso
Aug 07 2018 16:04
oops, you are right
Tim Dudgeon
@tdudgeon
Aug 07 2018 16:11
but you are right, changing move to copy does seem to fix the problem
Paolo Di Tommaso
@pditommaso
Aug 07 2018 16:14
yes, I was right :grimacing:
and makes sense
if you use move, the output file is moved out of the task work dir into the result dir
therefore the resume check fails and the task is re-executed
Tim Dudgeon
@tdudgeon
Aug 07 2018 16:45
Don't understand. I thought the cache (and hash generation) was based on the inputs. It can't use the outputs as normally they haven't been created yet.
So why does it matter if you copy or move the outputs?
Paolo Di Tommaso
@pditommaso
Aug 07 2018 16:53
but it also checks the existence of the declared output files; if they do not exist, the task is re-executed
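For illustration only (not from the original conversation): a minimal sketch of a terminal process that publishes with `mode: 'copy'`, so the declared output stays in the task work directory and can still be found on `-resume`. The process name `final_results`, the channel name `results_ch`, the file name and the script body are hypothetical.

```groovy
// Hypothetical terminal process; names and script body are placeholders.
process final_results {

    // 'copy' leaves the declared output file in the task work directory,
    // whereas 'move' takes it out of there.
    publishDir 'results/', mode: 'copy'

    output:
    file 'summary.txt' into results_ch

    script:
    """
    echo "done" > summary.txt
    """
}
```

The trade-off is disk usage: with 'copy' the file exists both in the work directory and in `results/`.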
Tim Dudgeon
@tdudgeon
Aug 07 2018 16:55
OK. I see.
Evan Floden
@evanfloden
Aug 07 2018 16:58
This reminds me that it would be nice to have a: "how the cache hash is calculated" somewhere
Félix C. Morency
@fmorency
Aug 07 2018 17:10
+1
We still have issues where some processes are re-executed without any apparent reason
Mike Smoot
@mes5k
Aug 07 2018 17:15
The -dump-hashes flag will dump the individual hashes that are combined into the final task hash. I've used this to debug tasks that seemed like they shouldn't be getting re-executed, but the hashes have (to this point) always been properly calculated.
Félix C. Morency
@fmorency
Aug 07 2018 17:15
Mmm let's try that!
Mike Smoot
@mes5k
Aug 07 2018 17:16
bring your grep-fu because making sense of the output can be a challenge...
Félix C. Morency
@fmorency
Aug 07 2018 17:18
Thanks @mes5k. Got my afternoon settled.
Félix C. Morency
@fmorency
Aug 07 2018 18:34

Ok so I resumed my pipeline and two nextflow.util.ArrayBag entries out of three don't have the same hash. However, the FileHolder seems to be the same

Run 1
  991e8af2132b8c45da56b7738a1d69c8 [nextflow.util.ArrayBag] [FileHolder(sourceObj:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/26/7b11d056d5ce1be8c2cdbfc49a2119/Felix__pre_b0_brain.nii.gz, storePath:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/26/7b11d056d5ce1be8c2cdbfc49a2119/Felix__pre_b0_brain.nii.gz, stageName:Felix__pre_b0_brain.nii.gz)] 

Run 2 (-resume)
  19452cd3ea207cb031e53c8d43bc7f74 [nextflow.util.ArrayBag] [FileHolder(sourceObj:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/26/7b11d056d5ce1be8c2cdbfc49a2119/Felix__pre_b0_brain.nii.gz, storePath:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/26/7b11d056d5ce1be8c2cdbfc49a2119/Felix__pre_b0_brain.nii.gz, stageName:Felix__pre_b0_brain.nii.gz)]

Why are the hashes different?

Paolo Di Tommaso
@pditommaso
Aug 07 2018 21:03
maybe some file metadata changed?