These are chat archives for nextflow-io/nextflow

20th
Apr 2017
Alessia
@alesssia
Apr 20 2017 08:13
Hello, I am trying to delete some temporary files, but it seems that when all the processes have terminated my files are still there. I mean, in step 1 I create two files that I send to a channel called toread; they are received and used by step 2, which sends them to another channel (called toremove). The final step receives the files from the channel toremove and executes rm -rf $file1 $file2 with no success. What am I doing wrong?
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:15
usually that's not required, when your pipeline is complete just drop the work folder where all intermediate files are located
Alessia
@alesssia
Apr 20 2017 08:15
These are not located in the work folder but in my folder
(the user can decide to keep them, there is a when clause that decides whether they are removed or not)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:18
any process input file is staged as a symlink in the process work dir, thus I guess your rm -rf $file1 $file2 is just removing those links
Alessia
@alesssia
Apr 20 2017 08:19
so how can i remove them for good?
(the symlink in the work directory indeed disappears)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:19
well, you should resolve the symlinks to the real files .. anyhow it smells bad
I would suggest to use an inverse logic, produce only if the user needs/wants them
instead of deleting if not needed
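A minimal sketch of what "resolving the symlinks" could look like, bearing in mind that this is the approach discouraged above. It assumes the toremove channel from the conversation emits the two files as a pair, that GNU readlink (with its -f flag) is available on the execution host, and that params.removeTemp is a hypothetical flag driving the when clause. It uses the process syntax of the time and is only a fragment, since toremove has to be produced by an upstream process.

```groovy
// Hypothetical flag, not part of the original conversation
params.removeTemp = false

process cleanup {
    input:
    // 'toremove' is the channel named in the conversation
    set file(file1), file(file2) from toremove

    when:
    params.removeTemp

    script:
    // Inputs are staged as symlinks in the task work dir, so a plain rm
    // only removes the links. readlink -f resolves each link to its real
    // target; \$ is escaped so the shell, not Nextflow, expands it.
    """
    rm -f "\$(readlink -f $file1)" "\$(readlink -f $file2)"
    """
}
```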
Alessia
@alesssia
Apr 20 2017 08:22
The step2 needs them
and then the user may be interested in having them as well (but they must be produced)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:23
all files needed by a process are (or at least are supposed to be) located in the work folder
if this is not the case you are implementing an anti-pattern
Alessia
@alesssia
Apr 20 2017 08:26
I have process 1 that creates fileA & fileB, and then moves them to a "storeDir", where process 2 picks them up (they are moved so the user can have them there). What you suggest is then to NOT move them to "storeDir" in process 1, but to use the process that is now trying to remove them to move them instead?
(bonus question: can I automatically remove the work folder when the nextflow script finishes?)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:28
use publishDir instead of storeDir to have output files in a specific folder
bonus question: can I automatically remove the work folder when the nextflow script finishes?
you could do that by implementing a completion handler, but it would require some custom code
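A rough sketch of such a completion handler, to be placed in the pipeline script. It relies on workflow.onComplete, workflow.success and workflow.workDir as described in the Nextflow documentation, plus the deleteDir() method Nextflow provides on paths; params.cleanup is a hypothetical flag introduced here, and the snippet has not been checked against the 0.24 release discussed in this chat.

```groovy
// Hypothetical flag, not part of the original conversation
params.cleanup = false

workflow.onComplete {
    // Only clean up after a successful run: once the work dir is gone,
    // cached results are lost and -resume cannot reuse them.
    if (params.cleanup && workflow.success) {
        workflow.workDir.deleteDir()
    }
}
```

Deleting the work folder this way also removes any output that was published as a symlink rather than copied, so it is safer combined with publishDir in copy mode.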
Alessia
@alesssia
Apr 20 2017 08:31
for removing my two files?
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:31
maybe an easier approach is just to write a wrapper script that launches the pipeline execution and then deletes that folder
ah
Alessia
@alesssia
Apr 20 2017 08:32
maybe an easier approach is just to write a wrapper script that launches the pipeline execution and then deletes that folder
I am already doing this, looking for something more elegant
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:33
for two files or all the work folder
?
Alessia
@alesssia
Apr 20 2017 08:36
Ok, I am mixing things up. I have two issues: A) deleting the files that are created by process 1 and used by process 2, and B) the bonus question. I am supposing that using publishDir + completion handler || wrapper (as I am already doing) is the solution to problem B. What about problem A? What you suggest is then to NOT move them to "storeDir" in process 1, but to use the process that is now trying to remove them to move them instead?
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:42
I'm suggesting to neither use storeDir nor move them, but to publish them to the user dir only if needed/required, with publishDir
note that you can make the publishing conditional by writing a custom rule for the saveAs attribute and returning null when you don't want to publish a file
that's elegant .. :)
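A minimal sketch of the conditional publishing described above, in the process syntax of the time. The parameter names (params.outdir, params.keepIntermediate), the file names and the shell commands are hypothetical; only publishDir, its saveAs attribute and the "return null to skip" behaviour come from the conversation.

```groovy
// Hypothetical parameters, not part of the original conversation
params.outdir = 'results'
params.keepIntermediate = false

process step1 {
    // saveAs is called with the name of each output file;
    // returning null means that file is not published.
    publishDir params.outdir, mode: 'copy',
        saveAs: { filename -> params.keepIntermediate ? filename : null }

    output:
    set file('fileA.txt'), file('fileB.txt') into toread

    script:
    """
    echo A > fileA.txt
    echo B > fileB.txt
    """
}

process step2 {
    input:
    // The files are read from the work dir, whether or not they were published
    set file(a), file(b) from toread

    script:
    """
    cat $a $b
    """
}
```

With this layout nothing has to be deleted afterwards: the intermediate files live only in the work folder unless the user asks for them.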
Alessia
@alesssia
Apr 20 2017 08:45
Ok, let's make the problem more complicated then. Process 1 creates, along with files A and B, which may or may not be kept, a file C that must always be saved (that is the reason why I was using storeDir). Can I do it with publishDir?
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:47
storeDir should be used only for long-term output caching, not to choose the output folder
Can I do it with publishDir?
yes
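One way this could look, as a sketch only: the saveAs rule decides per file name, so file C is always published while A and B depend on a flag. The process name, file names, parameters and shell commands are all hypothetical.

```groovy
// Hypothetical parameters, not part of the original conversation
params.outdir = 'results'
params.keepIntermediate = false

process make_files {
    publishDir params.outdir, mode: 'copy',
        saveAs: { filename ->
            // fileC.txt is always published
            if (filename == 'fileC.txt') return filename
            // fileA.txt and fileB.txt only when the user asks for them
            params.keepIntermediate ? filename : null
        }

    output:
    set file('fileA.txt'), file('fileB.txt') into toread
    file 'fileC.txt' into results

    script:
    """
    echo A > fileA.txt
    echo B > fileB.txt
    echo C > fileC.txt
    """
}
```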
Alessia
@alesssia
Apr 20 2017 08:47
ok, let's see what I can do
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:48
:+1:
Alessia
@alesssia
Apr 20 2017 08:48
I am afraid you will hear from me again :)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:49
:D
out of curiosity, what's your org/lab ?
Alessia
@alesssia
Apr 20 2017 08:50
KCL
(too vague?)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:51
nice
are you in contact with @snewhouse ?
Alessia
@alesssia
Apr 20 2017 08:51
nope
Paolo Di Tommaso
@pditommaso
Apr 20 2017 08:52
ok, maybe now you will :)
Alessia
@alesssia
Apr 20 2017 08:52
:D
Alessia
@alesssia
Apr 20 2017 14:02
@pditommaso I knew you were anxious to know: it works :D Thanks for your help!
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:03
LOL
you are welcome
chdem
@chdem
Apr 20 2017 14:15
Hi guys! Is there any way to enable singularity only for some processes?
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:16
and the others?
chdem
@chdem
Apr 20 2017 14:16
the others would run without any containers
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:16
ok, sure
specify the singularity image only for that process eg
process.$foo.container = '/path/singularity.img'
singularity.enabled = true
use the above configuration settings
chdem
@chdem
Apr 20 2017 14:21
ok, I think it is what I've tried but I get a NullPointerException
let me try again ;)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:21
uh
chdem
@chdem
Apr 20 2017 14:21
process {
    $bam_to_fastq {
        container = "$params.containers_folder/bedtools/2.26/chdem_bedtools_2.26-2017-04-19-9c60dc4e29f6.img"
    }
    $hot_count_bam { }
}
singularity {
    enabled = true
    autoMounts = true
}
java.lang.NullPointerException

Caused by:
java.lang.NullPointerException

java.lang.NullPointerException: null
at nextflow.container.SingularityBuilder.normalizeImageName(SingularityBuilder.groovy:161)
at nextflow.container.SingularityBuilder$normalizeImageName$3.call(Unknown Source)

(sorry, I forgot to use markdown syntax)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:24
maybe you have spotted a bug, could you please open an issue on GH
chdem
@chdem
Apr 20 2017 14:24
of course ! ;)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:24
thanks!
chdem
@chdem
Apr 20 2017 14:24
thank you Paolo !
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:24
you are welcome
chdem
@chdem
Apr 20 2017 14:32
I'm sorry Paolo but I have another question
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:32
one at a time, wait a moment :)
chdem
@chdem
Apr 20 2017 14:32
ok, no worries
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:33
I've uploaded a snapshot that should solve the problem
you can try it running this
NXF_VER=0.24.3-SNAPSHOT nextflow run .. etc
next question?
Félix C. Morency
@fmorency
Apr 20 2017 14:34
lol
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:34
what? :)
chdem
@chdem
Apr 20 2017 14:34
this is ok
for the singularity problem
(I do not understand why but this is ok)
the other question is about the .nextflow folder and the .nextflow.log files
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:35
what are you not understanding ?
chdem
@chdem
Apr 20 2017 14:35
is there any way to give nextflow a folder where to write these files/folders
because nextflow always creates these files/folders in the path where it is executed, but I want to centralize my logs
how can I do that ?
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:38
well, the usage pattern is to use a different folder for each experiment
however you can specify the log file path with this command
nextflow -log /your/log/file run .. etc
chdem
@chdem
Apr 20 2017 14:38
Great !
thank you
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:38
the .nextflow folder cannot be relocated, it must be in the launch folder
welcome
@chdem I didn't understand if the new snapshot is fixing the singularity problem. Can you confirm that?
chdem
@chdem
Apr 20 2017 14:46
the new snapshot is fixing the singularity problem, yes !
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:46
thanks
chdem
@chdem
Apr 20 2017 14:48
I was simply surprised by the nextflow header (N E X T F L O W ~ version 0.23.4) even with your environment variable NXF_VER=0.24.3-SNAPSHOT
Paolo Di Tommaso
@pditommaso
Apr 20 2017 14:49
um, that should not happen
you are using the old one
chdem
@chdem
Apr 20 2017 14:50
ok, I'm going to update nextflow, thank you
Mike Smoot
@mes5k
Apr 20 2017 17:51

Hi @pditommaso, I feel like I'm missing something simple here - I'm not sure how to get a file generated in the exec block of a process into the work dir for that process so that the output channel will pick it up:

Channel.from(1,2,3).into{ inchan }

process hello {

    input:
    val(x) from inchan

    output:
    file('hello.txt') into outchan

    exec:
    file('hello.txt').text = "Hello ${x}"
}

Maybe there's a path available to the work dir for the given task?

Paolo Di Tommaso
@pditommaso
Apr 20 2017 17:52
exactly
 exec:
    file("$task.workDir/hello.txt").text = "Hello ${x}"
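Putting the fix back into the original snippet gives roughly the following sketch. The task.workDir variable is taken from the reply above; the final subscribe line is added here only to show the output channel receiving the files, and the whole script is untested against the release discussed in this chat.

```groovy
Channel.from(1, 2, 3).into { inchan }

process hello {

    input:
    val(x) from inchan

    output:
    file('hello.txt') into outchan

    exec:
    // exec blocks run inside the Nextflow JVM rather than in the task
    // directory, so the file is written explicitly into the task work
    // dir, where the output declaration looks for it.
    file("$task.workDir/hello.txt").text = "Hello ${x}"
}

// Print the content of each collected file
outchan.subscribe { println it.text }
```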
Mike Smoot
@mes5k
Apr 20 2017 17:53
Perfect, thanks!
Paolo Di Tommaso
@pditommaso
Apr 20 2017 17:53
:+1:
Mike Smoot
@mes5k
Apr 20 2017 17:55
Is there a place where all task variables are documented?
Paolo Di Tommaso
@pditommaso
Apr 20 2017 17:56
unfortunately no, I need to add that
Félix C. Morency
@fmorency
Apr 20 2017 17:56
+1 :D
Paolo Di Tommaso
@pditommaso
Apr 20 2017 17:56
+2 :D
Mike Smoot
@mes5k
Apr 20 2017 17:57
Would definitely be useful. If I knew which variables were available, I'd help out. :)
Paolo Di Tommaso
@pditommaso
Apr 20 2017 17:57
I will try to add it soon
Evan Floden
@evanfloden
Apr 20 2017 22:00
Good night/morning/evening! I can't see the PR at all for some reason. But the good news is the Supp should be live. I'll check now
Evan Floden
@evanfloden
Apr 20 2017 23:39
Sorry, wrong channel. Could have been worse I guess!