These are chat archives for nextflow-io/nextflow

7th
Feb 2018
Simone Baffelli
@baffelli
Feb 07 2018 08:41
Good morning :coffee:
Is there a way to reuse parts of the original filename when using multiple input files, as in input: file("something_??")?
Bioninbo
@Bioninbo
Feb 07 2018 09:40
Good morning. Is it possible to publish files that are not outputs of a process?
Paolo Di Tommaso
@pditommaso
Feb 07 2018 11:17
@Bioninbo no
Vladimir Kiselev
@wikiselev
Feb 07 2018 11:26
Does nextflow pull work for everyone as expected? In my case it says it has pulled the newest version, but then when I run it, it says that there is a newer version of the pipeline. I have to remove ~/.nextflow/assets and run again - then everything is ok. This problem appeared on both MacOS and Linux. Or am I doing something wrong?
Paolo Di Tommaso
@pditommaso
Feb 07 2018 11:29
looks weird, let me check
I made this test:
modified a project on github
running it locally, NF warns about a new version
then nextflow pull <name>
nextflow run <name>
no warning is shown .. so it's working as expected
Maxime Borry
@maxibor
Feb 07 2018 11:51
Hello,
Is there a way to run Nextflow without saving the work directory for each process execution? I'm running out of disk space (it's a rather long pipeline).
I tried the -cache false flag to no effect.
(when I call nextflow run mypipeline -cache false in my CLI)
Vladimir Kiselev
@wikiselev
Feb 07 2018 11:54
@pditommaso thanks Paolo! This is what I expected, but somehow in my case it didn’t work… Will try again today and will update. Thanks again for your answer.
Paolo Di Tommaso
@pditommaso
Feb 07 2018 11:55
ok
Bioninbo
@Bioninbo
Feb 07 2018 12:11
@pditommaso Thanks! Then I should create an output in a channel that goes nowhere, I guess.
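A minimal sketch of that workaround, in the DSL1 syntax of the time (process, file, and channel names are hypothetical): declare the extra file as a process output so publishDir picks it up, and simply never consume the resulting channel.

```nextflow
process makeReport {
    publishDir 'results', mode: 'copy'

    output:
    file 'report.txt' into unused_ch   // channel is never used downstream

    """
    echo "side file published but not fed to any later process" > report.txt
    """
}
```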
Maxime Borry
@maxibor
Feb 07 2018 12:21
Does it have to be process-specific, as with https://www.nextflow.io/docs/latest/process.html#cache ?
Paolo Di Tommaso
@pditommaso
Feb 07 2018 12:32
what do you mean?
Maxime Borry
@maxibor
Feb 07 2018 12:36
not saving each process output in a work subdirectory, to save on disk space (the pipeline crashed because the work directory saturated my disk space with 500 GB+ of data)
Paolo Di Tommaso
@pditommaso
Feb 07 2018 12:39
well, you need to save outputs to pass them to the next step, do we agree on that?
if your pipeline is producing a lot of temporary files other than output files, you can set process.scratch = true in the config file
it may help
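That setting goes in nextflow.config; with scratch enabled, each task runs in a node-local temporary directory and only the declared outputs are copied back into work/. A minimal sketch (the SLURM path is just an example):

```nextflow
// nextflow.config
process.scratch = true                    // run tasks in a node-local temp dir
// or point it at a specific location instead of true, e.g.:
// process.scratch = '/lscratch/$SLURM_JOBID'
```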
Simone Baffelli
@baffelli
Feb 07 2018 12:44
Perhaps you could set the work directory to some other place?
I personally find it very nice that everything is cached
Maxime Borry
@maxibor
Feb 07 2018 12:50
Of course @pditommaso, but once outputs are not needed any more?
I also find the caching very nice, especially in the development phase of the pipeline, but in the "production" I'm in right now, the disk space occupied is too much...
I thought the cache false option (whether in the pipeline's code or on the NF CLI) implemented this.
Edgar
@edgano
Feb 07 2018 13:50
@maxibor I never used it... but maybe "clean" is what you are looking for? nextflow clean -h
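A couple of typical invocations, as a sketch (the run name is hypothetical; check nextflow clean -h for the full option list):

```
nextflow clean -n                  # dry run: list what would be removed
nextflow clean -f                  # actually delete work dirs of the last run
nextflow clean -f -but tiny_euler  # delete everything except the run 'tiny_euler'
```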
Maxime Borry
@maxibor
Feb 07 2018 14:28
I'll run my samples one by one, and delete the work directory after each run. Thanks @edgano , @baffelli , @pditommaso !
Paolo Di Tommaso
@pditommaso
Feb 07 2018 14:34
well, the suggested approach is just to delete work/ after each run
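A per-sample loop along those lines might look like this (pipeline and parameter names are hypothetical):

```
for s in sampleA sampleB sampleC; do
    nextflow run mypipeline --sample "$s"
    rm -rf work          # reclaim disk space before the next sample
done
```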
kevbrick
@kevbrick
Feb 07 2018 16:45
Hi there, I have a question. I am debugging a workflow that is difficult to prototype. Therefore, I must run the initial, slow steps to debug the end of the pipeline. The pipe runs on a slurm cluster. My problem is that if my pipeline crashes, I can only resume it from the node from which it was run. If I try to resume the job from another node on the cluster, it restarts the entire pipeline. Any idea if it's possible to get this behavior to work differently?
Paolo Di Tommaso
@pditommaso
Feb 07 2018 16:46
it should not happen
is something depending on the launching environment ?
kevbrick
@kevbrick
Feb 07 2018 16:56
Well, on our cluster each job gets assigned a unique $SLURM_JOBID variable that is, in turn, used as the scratch path (i.e. /lscratch/$SLURM_JOBID). In each process I therefore define scratch '/lscratch/$SLURM_JOBID' at the top. However, since each process executes on SLURM, this should behave the same regardless of which node I run the pipe from.
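A sketch of that per-process setup (the process name and script are hypothetical; the single quotes stop Nextflow from resolving $SLURM_JOBID locally, so it is expanded on the compute node):

```nextflow
process align {
    executor 'slurm'
    scratch '/lscratch/$SLURM_JOBID'   // node-local scratch assigned by SLURM

    output:
    file 'out.bam' into aligned_ch

    """
    some_aligner --in reads.fq --out out.bam
    """
}
```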
Paolo Di Tommaso
@pditommaso
Feb 07 2018 16:58
I would suggest making a test with the basic nextflow hello pipeline to see if it's an infra problem or an issue with your code
kevbrick
@kevbrick
Feb 07 2018 16:59
Good idea ... will do ...
Thanks
Paolo Di Tommaso
@pditommaso
Feb 07 2018 17:00
welcome
kevbrick
@kevbrick
Feb 07 2018 17:00
I just wanted to check if this was "normal" behaviour ...
Paolo Di Tommaso
@pditommaso
Feb 07 2018 17:00
not at all