These are chat archives for nextflow-io/nextflow

4th Apr 2018
Alexander Peltzer
@apeltzer
Apr 04 2018 11:25
Command error:
  .command.stub: line 38: ps: command not found
  .command.stub: line 27: ps: command not found
  [bam_sort_core] merging from 364 files and 28 in-memory blocks...
(Samtools 1.7 in the container) seems to fail at some point. It's actually a Docker image based on Miniconda3 + the required Bioconda packages and I didn't see that happening before... any idea what might have gone wrong?
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:26
missing/wrong ps tool in the container (?)
nextflow-io/nextflow#499
btw it should not break the user command
Alexander Peltzer
@apeltzer
Apr 04 2018 11:27
.exitcode is empty (nothing in there), .command.err just shows the ps command not found things
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:28
what about the [bam_sort_core] merging from 364 files and 28 in-memory blocks... message?
Alexander Peltzer
@apeltzer
Apr 04 2018 11:28
Yeah, but there is no error message in the directory - I'm quite uncertain what caused the issue...
N E X T F L O W  ~  version 0.28.0 4779
NGI-ExoSeq ANALYSIS WORKFLOW ~ 0.9dev - e08ea0291f
Command Line: nextflow run -profile binac NGI-ExoSeq/PairedSingleSampleWF.nf --notrim --reads *R{1,2}*.fastq.gz --genome GRCh37 --project GIAB --run_id GIAB -resume -with-report ExoSeq_GIAB_HG002_RuntimeReport.html
Project Dir : /beegfs/work/zxmai83/Genome_In_A_Bottle/RAW/Combined/NGI-ExoSeq
Launch Dir  : /beegfs/work/zxmai83/Genome_In_A_Bottle/RAW/Combined
Work Dir    : /beegfs/work/zxmai83/Genome_In_A_Bottle/RAW/Combined/work
Out Dir     : ./results
Genome      : /beegfs/work/zxmai83/Reference/genome/b37/human_g1k_v37.fasta
Completed at: Mon Apr 02 23:04:08 CEST 2018
Duration    : 12h 35m 21s
Success     : false
Exit status : null
Error report: Error executing process > 'sortSam (D1S1)'

Caused by:
  Process `sortSam (D1S1)` terminated for an unknown reason -- Likely it has been terminated by the external system

Command executed:

  samtools sort \
      D1S1_bwa.sam \
      -@ 28\
      -m 1170M \
      -o D1S1_bwa.sam.sorted.bam
  # Print version number to standard out
  echo "Samtools V:"$(samtools 2>&1)

Command exit status:
  -

Command output:
  (empty)

Command error:
  .command.stub: line 38: ps: command not found
  .command.stub: line 27: ps: command not found
  [bam_sort_core] merging from 364 files and 28 in-memory blocks...

Work dir:
  /beegfs/work/zxmai83/Genome_In_A_Bottle/RAW/Combined/work/3f/5c4af3ee2105a9a63cd79000245c71
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:30
make sure there's the ps tool in the container
Alexander Peltzer
@apeltzer
Apr 04 2018 11:30
Okay, I see - time ran out, but that should reschedule
I'll check, something weird there
Maxime Garcia
@MaxUlysse
Apr 04 2018 11:30
Have you tried with another image?
Alexander Peltzer
@apeltzer
Apr 04 2018 11:31
Not yet - will do
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:31
.exitcode is empty
this should not happen
Alexander Peltzer
@apeltzer
Apr 04 2018 11:31
It's pretty much the same style of image we use in nf-core
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:31
what batch scheduler are you using?
Alexander Peltzer
@apeltzer
Apr 04 2018 11:31
(e.g. in NGI-RNAseq/methylseq)
PBS/Torque
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:33
this may happen if the cluster kills the job using the -9 signal (hard kill)
in that case there's no way to handle the error condition on the NF side
however usually schedulers first send a soft kill, and after a few seconds a hard kill
Alexander Peltzer
@apeltzer
Apr 04 2018 11:35
Time was set to 10 hours; it was killed after 12 hr 35 m
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:35
you should check this with your sysadmins and possibly organise a test case
Alexander Peltzer
@apeltzer
Apr 04 2018 11:35
I will
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:36
the job should be killed with one of these signals: TERM, INT, USR1, USR2
to allow NF to handle the termination
Phil Ewels
@ewels
Apr 04 2018 11:38
Is anyone else having problems with the nextflow reports recently? The last few that I've generated have empty plots for cpu, memory and disk (time is fine)
this is with a small test set that only takes a few minutes to run, but still..
The %cpu and %mem in the table are set to - also
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:40
this sounds like it's related to nextflow-io/nextflow#499 again
Phil Ewels
@ewels
Apr 04 2018 11:47
aha.. yup :+1:
/Users/philewels/GitHub/nf-core/RNAseq/tests/work/8d/0d372444e497b2d5c18bc074d437c1/.command.stub: line 38: ps: command not found
Sorry, should check the above messages before posting :laughing:
Paolo Di Tommaso
@pditommaso
Apr 04 2018 11:48
no problem, I need to find a workaround for that :/
Alexander Peltzer
@apeltzer
Apr 04 2018 12:07
Do you also use ps to figure out whether a job runs, e.g. related to the exitcode stuff?
or maybe something else that is present in the procps package, e.g. on Debian?
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:08
nope
Alexander Peltzer
@apeltzer
Apr 04 2018 12:09
Ok
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:09
create a fake job with a trap creating a file
then verify the trap is invoked when the cluster kills that job
Alexander Peltzer
@apeltzer
Apr 04 2018 12:10
Awesome - will do that now!
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:12
use these signals: TERM, INT, USR1, USR2
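For reference, a minimal PBS test job along those lines might look like the sketch below (the walltime, flag-file name, and sleep duration are only assumptions for illustration). If the flag file never appears after the job is killed, the scheduler is hard-killing with -9 and NF cannot record the exit status.

#!/bin/bash
# hypothetical test job: walltime is deliberately short so the scheduler kills it
#PBS -l walltime=00:02:00

# record which signal (if any) the scheduler delivers before the hard kill;
# if the job is killed straight away with -9 (SIGKILL), no flag file appears
for sig in TERM INT USR1 USR2; do
    trap "echo $sig > trap_caught.txt; exit 1" $sig
done

sleep 600   # outlive the walltime on purpose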
Tim Diels
@timdiels
Apr 04 2018 12:27
Does channel1 in channel1.tap(channel2) keep all objects sent into channel1 in memory? Since it's tapped, but never fully consumed as it would with channel1.into(channel2) or channel1.tap(channel2).doSomething()?
Or does it 'know' it has no other output and so throws it away or so?
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:31
well, the result of channel1.tap(channel2) is a new channel that you are not referencing
Tim Diels
@timdiels
Apr 04 2018 12:34
and the new channel would buffer all the input it gets, which never gets read, and so it stays in memory, correct?
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:35
I guess so
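For context, a rough sketch of the pattern under discussion (channel names and values are made up): tap forwards every item to the target channel and also returns a new channel emitting the same items, so a bare channel1.tap(channel2) leaves that returned channel unconsumed, with its buffered items sitting in memory.

// DSL1-era sketch, not the user's actual code
copy = Channel.create()

Channel.from(1, 2, 3)
    .tap(copy)                    // every item is also sent to copy
    .map { it * 2 }               // here the channel returned by tap() is consumed
    .subscribe { println it }

copy.subscribe { println "copy: $it" }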
Tim Diels
@timdiels
Apr 04 2018 12:42
Do I understand this correctly?:
species.into { somethingNew }  // creates the somethingNew channel
Channel.create().doThings().set { somethingNew }  // overwrites the somethingNew variable with a new channel; the original channel is now unused
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:43
yes
Tim Diels
@timdiels
Apr 04 2018 12:43
I.e. instead I should do?:
species.into { species1 }
species1.doThings().set { somethingNew }
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:44
even better
species.set { species1 }
species1.doThings().set { somethingNew }
or
species.doThings().set { somethingNew }
Tim Diels
@timdiels
Apr 04 2018 12:46
true, I oversimplified my example. I was thinking of
species.into { species1 ; species2 }
species1.doComplexThings.set { oneThing }
species2.doOtherComplexThings.set { anotherThing }
Complex being multiple lines of operators
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:46
sounds good
Tim Diels
@timdiels
Apr 04 2018 12:46
Good, thanks
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:47
:+1:
Luca Cozzuto
@lucacozzuto
Apr 04 2018 12:50
There are no processes in this code? Is it only for combining and reshaping the channels?
Paolo Di Tommaso
@pditommaso
Apr 04 2018 12:55
I think it's just an extract
Simone Baffelli
@baffelli
Apr 04 2018 13:08
What does that mean: DEBUG nextflow.Session - Session aborted -- Cause: class nextflow.file.http.XFileSystemProvider (in unnamed module @0x4450d156) cannot access class sun.net.www.protocol.ftp.FtpURLConnection (in module java.base) because module java.base does not export sun.net.www.protocol.ftp to unnamed module @0x4450d15
I'm trying to use a remote file with http. It works fine in nextflow console
Paolo Di Tommaso
@pditommaso
Apr 04 2018 13:09
are you using java 9/10 ?
Paolo Di Tommaso
@pditommaso
Apr 04 2018 13:15
in any case open an issue including the complete stack trace and the Java version you are using
Simone Baffelli
@baffelli
Apr 04 2018 13:23
Java 9
That might be the reason
But it surprises me that it does not happen when using the nextflow console
#646
Félix C. Morency
@fmorency
Apr 04 2018 14:10
Does anyone have an opinion on #562?
I'm about to upgrade our cluster to NF >0.25 and I have the feeling that the feature introduced by #443 will cause more problems for us
Paolo Di Tommaso
@pditommaso
Apr 04 2018 14:21
what's the problem with #443? it makes sense to invalidate the cache when changing the container image
Félix C. Morency
@fmorency
Apr 04 2018 14:25
In theory, yes. However, in our case, it causes problems. Changing one binary inside the container which is called by one and only one NF process should not trigger a complete reprocessing of the whole pipeline. Our pipelines are all long-running and having to reprocess everything every time we change one bit in the container is not practical. It would be nice to at least have the option to skip the container hash check rather than enforcing it.
Paolo Di Tommaso
@pditommaso
Apr 04 2018 14:28
but if you change a tool, how do you instruct NF to invalidate only the process using that tool?
Félix C. Morency
@fmorency
Apr 04 2018 14:29
Delete the related work directories. In that case, this is the operator's responsibility, not NF's.
Another (not so far-fetched) example would be applying a security update to the container OS.
That should not trigger a complete reprocessing, IMO. Or at least we should have an option to skip container hashing
Luca Cozzuto
@lucacozzuto
Apr 04 2018 14:30
you can specify the new container only for the processes you want to rerun.
This is what I normally do when I make a new version of the image
and I don't want to rerun everything
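For reference, the per-process override Luca describes goes in the config file; a rough sketch, with process and image names as placeholders (older Nextflow versions use the $processName config scope instead of the withName selector):

process {
    container = 'myorg/pipeline:1.0'        // image used by the rest of the pipeline

    withName: sortSam {
        container = 'myorg/pipeline:1.1'    // patched image; only this process (and anything downstream) re-runs on -resume
    }
}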
Félix C. Morency
@fmorency
Apr 04 2018 14:32
@lucacozzuto is there a way to do this all from the command-line, or does one need to modify the configuration file?
Luca Cozzuto
@lucacozzuto
Apr 04 2018 14:32
I modify the config file
but because I know that this is only for the current run
then when I have everything stable I use only the latest image
for other executions
Félix C. Morency
@fmorency
Apr 04 2018 14:34
This is not practical in our case. We nextflow pull and run with -with-singularity. We use one container for the whole pipeline. We could try what you're saying if there's a way of doing it all using the command-line
Luca Cozzuto
@lucacozzuto
Apr 04 2018 14:35
do you have different Singularity images?
Félix C. Morency
@fmorency
Apr 04 2018 14:35

then when I have everything stable I use only the latest image

This is also what we're doing, but we sometimes discover bugs down the road

We only have a single Singularity image containing everything required for the pipeline
Luca Cozzuto
@lucacozzuto
Apr 04 2018 14:36
of course when you discover bugs you need to re-run only the affected processes (and the subsequent ones)
so I'll change the image in the config file only for that one
and I'll rerun with -resume
@pditommaso can you pull two different images?
Félix C. Morency
@fmorency
Apr 04 2018 14:38
That would work if we could specify said image for said process from the command-line
Luca Cozzuto
@lucacozzuto
Apr 04 2018 14:39
@fmorency is there any reason for not keeping the image locally?
Félix C. Morency
@fmorency
Apr 04 2018 14:40
Image is on a shared NAS. We pass the absolute path to -with-singularity. Ex. -with-singularity /mnt/path/to/image.img
Luca Cozzuto
@lucacozzuto
Apr 04 2018 14:42
why not specify it in the config file? Does the path change often?
Félix C. Morency
@fmorency
Apr 04 2018 14:43
Yeah, we always produce new image versions. Our tools are always evolving and we would have to change the config file a lot. We also produce a lot of R&D images
Phil Ewels
@ewels
Apr 04 2018 15:19
@pditommaso - the ps error and the missing-plots problem are now resolved with a fix by @apeltzer: https://github.com/nf-core/tools/blob/a29a412bbc015e7e41ea7ce0927b0d74e1ae902f/Dockerfile#L5-L6
I get a different log warning at the end of each task now though, not sure if it's related?
nf-core/RNAseq/tests/work/80/ecac5a0d67ba9ffd8ec1e194cc397e/.command.stub: line 99:    11 Terminated              nxf_trace "$pid" .command.trace
Maybe that's normal and not a problem? Not sure. Will ignore for now anyway :)
Paolo Di Tommaso
@pditommaso
Apr 04 2018 15:20
that's safe
(sorry, in a meeting)
Phil Ewels
@ewels
Apr 04 2018 15:28
:+1:
jncvee
@jncvee
Apr 04 2018 15:36
I am trying to use the cutadapt program in my script. However, whenever I use the channel to get the fastq file, I get a command output saying the first line of the fastq file isn't a fastq record but the titles of the work folders instead?
Paolo Di Tommaso
@pditommaso
Apr 04 2018 15:39
not sure to understand
copy and paste the complete error message
jncvee
@jncvee
Apr 04 2018 15:42
cutadapt: error: Line 1 in FASTQ file is expected to start with '@', but found '\x1f\x8b\x08\x00\xd1\xdd\x17W\x00\x03'
Paolo Di Tommaso
@pditommaso
Apr 04 2018 15:44
mmm, how is the process defined ?
jncvee
@jncvee
Apr 04 2018 15:45
input:
file faster from fastqUntrimmed
file adapter from adapterFile
output:
file 'KOTh1exp1_R1_trimmed.fastq.gz' into aligning

"""
head $adapter
cat  faster | cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -a AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA -m 20 --format fastq -o KOTh1exp1_R1_trimmed.fastq.gz faster
"""
Paolo Di Tommaso
@pditommaso
Apr 04 2018 15:48
it looks like the fastq is corrupted, have you checked that?
jncvee
@jncvee
Apr 04 2018 15:54
the file is not corrupted
Paolo Di Tommaso
@pditommaso
Apr 04 2018 15:55
you have to check that file in the task work dir
jncvee
@jncvee
Apr 04 2018 16:31
cutadapt is corrupting the fastq file as it goes through I think
Paolo Di Tommaso
@pditommaso
Apr 04 2018 16:33
not a user of cutadapt but it seems you are specifying the input twice
cat faster | cutadapt ... faster
is that right ?
Phil Ewels
@ewels
Apr 04 2018 16:35
What is faster?
And what is being supplied to cutadapt via the pipe?
ah, code block formatting was a bit wonky - I see it now sorry
Paolo Di Tommaso
@pditommaso
Apr 04 2018 16:36
I guess is the input fastq
Phil Ewels
@ewels
Apr 04 2018 16:36
It should be $faster with a dollar sign if you want the filename
jncvee
@jncvee
Apr 04 2018 16:37
faster is the file name from the pipeline that gives me the fastq file
Phil Ewels
@ewels
Apr 04 2018 16:37
cat faster will just try to read a file literally named faster, not your fastq, if it doesn't have the $
Any reason to use a cat pipe instead of just supplying the input file as an argument to cutadapt?
jncvee
@jncvee
Apr 04 2018 16:38
no not really. a previous script I made had that so I sort of just kept it
Phil Ewels
@ewels
Apr 04 2018 16:39
input:
file fastq from fastqUntrimmed

output:
file '*_trimmed.fastq.gz' into aligning

script:
"""
cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -a AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA -m 20 --format fastq -o ${fastq.baseName}_trimmed.fastq.gz $fastq
"""
I would write it something like this
Then the output filename is dynamic, based on what the input filename was
the $adapter file wasn't being used, so I removed that
Rohan Shah
@rohanshah
Apr 04 2018 20:52
does anyone know if you can set a working directory per process? or does that question even make sense?
my larger question is: is it possible to run nextflow with some processes executing in AWS Batch and others using the local executor? it seems that AWS Batch needs an S3 working directory but local cannot use an S3 working directory
Paolo Di Tommaso
@pditommaso
Apr 04 2018 20:57
currently it's not possible, see nextflow-io/nextflow#631
what's your use case that requires local execution ?
Rohan Shah
@rohanshah
Apr 04 2018 21:00
there are a few steps in the workflow that are quite simple and fast where the startup time for Batch is too expensive
the simple steps are just some commands that interact with AWS through the CLI
Paolo Di Tommaso
@pditommaso
Apr 04 2018 21:01
I see, you could consider merging that step into a larger process running on Batch
Rohan Shah
@rohanshah
Apr 04 2018 21:03
ya thats another option for sure
but they're kind of completely separate processes
Paolo Di Tommaso
@pditommaso
Apr 04 2018 21:03
that's a common pattern in NF workflows
Rohan Shah
@rohanshah
Apr 04 2018 21:04
it's sort of a one-off task to stage some things for stuff outside nextflow/the workflow, and by putting it in a separate task we were hoping to have it run in parallel with other steps
Paolo Di Tommaso
@pditommaso
Apr 04 2018 21:05
i see
Rohan Shah
@rohanshah
Apr 04 2018 21:09
I can add a description to that feature request if you'd like, and thanks for the quick answer!
Paolo Di Tommaso
@pditommaso
Apr 04 2018 21:13
Yes please
Mike Smoot
@mes5k
Apr 04 2018 21:15
@rohanshah I wonder if you could use exec instead of script and then use Groovy to shell out to do whatever you need to do with the AWS CLI? I believe exec always runs on the node where you run nextflow from. Seems like a horrible hack, but might work...
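A rough sketch of that idea (the process name, command, and paths are made up; it assumes the aws CLI is installed on the launch node):

process stageAssets {

    exec:
    // native Groovy task body, executed inside the Nextflow runtime
    // (per Mike's point, on the node Nextflow was launched from),
    // shelling out to the aws CLI rather than submitting a Batch job
    def cmd = ['aws', 's3', 'cp', 's3://my-bucket/config.json', '/tmp/config.json']
    def proc = cmd.execute()
    proc.waitFor()
    assert proc.exitValue() == 0
}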
Rohan Shah
@rohanshah
Apr 04 2018 21:36
maybe, but wouldn't that still require a non-S3 working directory?
Mike Smoot
@mes5k
Apr 04 2018 21:43
Hmmm, not sure, but could be a problem. I was primarily thinking about running locally vs. running in Batch. exec does behave fairly differently from script, so I'm not totally sure what it would do.
Rohan Shah
@rohanshah
Apr 04 2018 22:23
yeah, the local execution is mostly fine; nextflow just bombs out after the execution because the working directory is on S3 and it looks for the path locally