These are chat archives for nextflow-io/nextflow

30th
Mar 2016
Matthieu Foll
@mfoll
Mar 30 2016 12:51
Hi Paolo, I have a question regarding the publishDir directive
We have pipelines producing large intermediate outputs that we want ultimately to trash when we are sure everything worked properly
So just deleting the work directory
Paolo Di Tommaso
@pditommaso
Mar 30 2016 12:52
rm -rf work
:)
Matthieu Foll
@mfoll
Mar 30 2016 12:52
Yes but we use publishDir to copy intermediate outputs that we want to keep
Paolo Di Tommaso
@pditommaso
Mar 30 2016 12:53
by default publishDir creates symlinks
Matthieu Foll
@mfoll
Mar 30 2016 12:53
Ideally we would like to be able to use publishDir with the mode “move” to avoid copying large files for nothing
but it doesn’t work for non-terminal processes (those whose outputs are consumed by later steps)
My question is then: could you make the move mode just the opposite of the symlink mode
Meaning you move the file in the publishDir directory and make a symlink in the work directory
So that the rest of the pipeline can keep running
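The "move, then symlink back" idea described here can be sketched by hand in plain shell. This is only an illustration of the proposal, not something Nextflow is stated to do in this conversation; the `work` and `results` directory names and the file name are hypothetical.

```shell
set -eu
# Hypothetical layout: a scratch "work" dir and a "results" (publish) dir.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/work" "$tmp/results"
echo "large intermediate output" > "$tmp/work/out.txt"

# Move the real file into the publish directory. On the same file system
# this is a rename, so no data is copied.
mv "$tmp/work/out.txt" "$tmp/results/out.txt"

# Leave a symlink behind so later steps that read work/out.txt still find it.
ln -s "$tmp/results/out.txt" "$tmp/work/out.txt"

# Reading through the work path now follows the link to the published file.
cat "$tmp/work/out.txt"
```

With this layout, deleting the work directory afterwards removes only the symlink, and the published file is untouched.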
Paolo Di Tommaso
@pditommaso
Mar 30 2016 12:54
you can either use mode: 'copy' or consolidate your published dir by copying the symlinks to regular files
Matthieu Foll
@mfoll
Mar 30 2016 12:55
yes but in both cases it involves copying large files for nothing
Paolo Di Tommaso
@pditommaso
Mar 30 2016 12:55
um, what you are suggesting is a bit tricky, currently it is not possible
I see
Feel free to open a feature request for that, I will investigate whether it's possible to implement it
Matthieu Foll
@mfoll
Mar 30 2016 12:57
ok I will
thanks
one alternative would be to postpone the actual move of the files until the end of the pipeline
Paolo Di Tommaso
@pditommaso
Mar 30 2016 12:58
could be an idea
Matthieu Foll
@mfoll
Mar 30 2016 12:59
I find the idea of moving and symlinking back more elegant, but if that is not possible, why not
Paolo Di Tommaso
@pditommaso
Mar 30 2016 13:00
for now I would suggest using symlinks and consolidating the results by dereferencing them to regular files using tar or cp
Matthieu Foll
@mfoll
Mar 30 2016 13:01
thanks for the tip, I didn’t know this option in cp
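The exact option is not named in the log. Assuming it is the dereference flag (`-L` for `cp`, `-h` for `tar`), consolidating a directory of symlinks might look like the sketch below. Directory and file names are hypothetical, and note that this still copies the data once.

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/work" "$tmp/results"
echo "large output" > "$tmp/work/out.txt"

# What a symlink-based publish step leaves behind: a link into work/.
ln -s "$tmp/work/out.txt" "$tmp/results/out.txt"

# cp -L follows symlinks and writes regular files into the destination.
cp -rL "$tmp/results" "$tmp/consolidated"

# tar -h does the same while archiving: the archive stores file contents,
# not the links themselves.
tar -chf "$tmp/results.tar" -C "$tmp" results

# The work directory can now be removed without losing the consolidated copy.
rm -rf "$tmp/work"
cat "$tmp/consolidated/out.txt"
```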
Paolo Di Tommaso
@pditommaso
Mar 30 2016 13:08
@mfoll Have you tried mode: 'link' instead?
It creates a hard link instead of a symlink
thus you can delete all the work files without affecting the ones linked in the publish dir
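A small shell experiment shows why a hard link survives deletion of the work directory while a symlink does not. This only demonstrates the file system behaviour with hypothetical paths; it does not reproduce what Nextflow does internally.

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/work" "$tmp/results"
echo "large output" > "$tmp/work/out.txt"

# A hard link is a second name for the same data on disk.
ln "$tmp/work/out.txt" "$tmp/results/hard.txt"
# A symlink only stores the path of the original file.
ln -s "$tmp/work/out.txt" "$tmp/results/soft.txt"

# Remove the work directory, as in "rm -rf work".
rm -rf "$tmp/work"

# The hard-linked name still reaches the data.
cat "$tmp/results/hard.txt"

# The symlink now points at a path that no longer exists.
if [ ! -e "$tmp/results/soft.txt" ]; then
    echo "symlink is now dangling"
fi
```

No data is copied at any point: the file's contents are freed only when the last hard link to them is removed.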
Matthieu Foll
@mfoll
Mar 30 2016 13:16
oh yes I didn’t think about that
it seems to be the perfect solution
thanks!
Paolo Di Tommaso
@pditommaso
Mar 30 2016 13:17
the hidden power of linux !
:)
Matthieu Foll
@mfoll
Mar 30 2016 13:18
indeed, I never use hard links so I didn’t realize it would behave like this
Paolo Di Tommaso
@pditommaso
Mar 30 2016 13:19
um, I've just realised that it won't work over a mounted file system :(
Matthieu Foll
@mfoll
Mar 30 2016 13:29
ouch yes indeed: "Operation not permitted"
we have a BeeGFS file system and it supports hard links only when they are in the same directory: https://groups.google.com/forum/#!msg/fhgfs-user/cTJcqGZceVA/H7JkSH3uhOYJ
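Since hard links cannot span file systems, and some file systems restrict them further as reported here, one possible manual workaround is to try a hard link first and fall back to a copy. The sketch below assumes this strategy; `publish_file` is a hypothetical helper, not a Nextflow feature, and the paths are illustrative.

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/work" "$tmp/results"
echo "large output" > "$tmp/work/out.txt"

# publish_file SRC DST (hypothetical helper):
# try a hard link, and copy only if linking is refused.
publish_file() {
    src=$1
    dst=$2
    if ln "$src" "$dst" 2>/dev/null; then
        echo "hard-linked"
    else
        # ln failed, e.g. across file systems or where hard links are restricted.
        cp "$src" "$dst"
        echo "copied"
    fi
}

publish_file "$tmp/work/out.txt" "$tmp/results/out.txt"
```

Either way the destination ends up as a regular file that does not depend on the work directory. The copy cost is paid only where linking is not possible.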