These are chat archives for nextflow-io/nextflow

8th
Aug 2018
haidianfarmer
@haidianfarmer
Aug 08 2018 04:29
Could you explain the meaning and purpose of the executor in the config file? I also don't understand the difference between the executor in a process and the executor in the config file. Thank you.
Kevin Sayers
@KevinSayers
Aug 08 2018 07:44
@haidianfarmer these are just two ways to specify the executor: it can be set within a process block, or in the config file, either individually for each process or globally for the entire workflow.
Evan Floden
@evanfloden
Aug 08 2018 07:47
@haidianfarmer Adding to this, one of the key points of Nextflow is to separate the execution (local, HPC, cloud, whatever) from the workflow logic, which enables truly portable pipelines.
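To make the two places concrete, here is a minimal config sketch of what the answers above describe. The executor names 'slurm' and 'local' are real Nextflow executors, but the process name align_reads is a made-up placeholder, and the withName selector assumes a reasonably recent Nextflow release.

```groovy
// nextflow.config -- illustrative sketch only; 'align_reads' is a hypothetical process name

process {
    // Default executor for every process in the workflow
    executor = 'slurm'

    // Override for one process, selected by its name
    withName: align_reads {
        executor = 'local'
    }
}

// The in-script equivalent is the 'executor' directive inside the process block:
//
//   process align_reads {
//       executor 'local'
//       ...
//   }
```

Keeping the setting in the config file rather than in the process block is what lets the same pipeline script run unchanged on a laptop, a cluster, or the cloud.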
haidianfarmer
@haidianfarmer
Aug 08 2018 09:37
@KevinSayers @skptic Thank you! I understand now.
Vanessasaurus
@vsoch
Aug 08 2018 12:06
hey nextflow! Has anyone done a pipeline for https://github.com/heathsc/gemBS ?
I'm trying it with Cromwell / wdl and it's a mess, and am wondering if anyone else has hit this issue, or general thoughts on Cromwell vs. nextflow, etc.
Maxime Garcia
@MaxUlysse
Aug 08 2018 12:26
@vsoch Is something like that similar to what you're looking for https://github.com/nf-core/methylseq ?
Vanessasaurus
@vsoch
Aug 08 2018 12:29
I'm a genomic idiot so excuse my questions in advance - does that do the same data processing, just with a different tool?
I'm wondering if others on here have run into the "which workflow thing do I use" and have tried cromwell and could compare to nextflow? I'm trying cromwell now and finding that you can't use it inside docker (without development from the maintainers, in scala, which doesn't seem to happen) and it doesn't have support for singularity either. The only option is cromwell on the host, then it launches docker containers
Maxime Garcia
@MaxUlysse
Aug 08 2018 12:45
I'm a regular idiot :-D no idea about Cromwell vs Nextflow, sorry ;-)
Evan Floden
@evanfloden
Aug 08 2018 12:47
@vsoch Warning, you will get a very biased answer asking here. When I ran through the hello-world examples on Cromwell (and the time it took to spin up!!), I was quite surprised, especially compared to NF and Snakemake. That said, the syntax of WDL is most similar to Nextflow and I find it somewhat readable (without saying anything about the Contrived Workflow Language).
Vanessasaurus
@vsoch
Aug 08 2018 12:48
"quite surprised" how so?
I am skeptical because the WDL is "trying" to be simple, but there are many more dependencies aside from it that are where you hit the big issues
Evan Floden
@evanfloden
Aug 08 2018 12:50
In the examples I saw, it takes over 30 sec to launch hello-world locally with Cromwell.
Vanessasaurus
@vsoch
Aug 08 2018 12:50
question: does nextflow handle slurm, sge, and different cloud environments?
it seems to handle docker and singularity better, but I wonder whether a user running nextflow in a container itself would run into the same issues
Evan Floden
@evanfloden
Aug 08 2018 12:51
Yes, yes, and AWS currently; many more are being built as we speak.
Vanessasaurus
@vsoch
Aug 08 2018 12:51
does anyone use nextflow in a container or is it always a local executable?
Evan Floden
@evanfloden
Aug 08 2018 12:54
I understand the kubernetes executor for Nextflow works similarly to this (NF runs in one pod and launches tasks in other pods).
Vanessasaurus
@vsoch
Aug 08 2018 12:54
ah okay, thanks
Evan Floden
@evanfloden
Aug 08 2018 12:55
For cloud, people have been getting a lot of mileage out of batch.
Vanessasaurus
@vsoch
Aug 08 2018 12:55
it sounds like it would have a lot of similar issues running locally, if trying to run nextflow (in a container) to launch other containers
Kevin Sayers
@KevinSayers
Aug 08 2018 12:55
@pditommaso @skptic I could be wrong, but NF can run in docker and launch sibling docker containers
https://github.com/nextflow-io/nextflow/blob/master/docker/entry.sh my understanding is that it can be set up using this
@vsoch you probably have a better handle on this, but from poking around, is running singularity sibling containers not feasible, or at least not easily accomplished?
Félix C. Morency
@fmorency
Aug 08 2018 13:02
@pditommaso like what? I didn't change anything besides resuming the run. how is this hash calculated?
Vanessasaurus
@vsoch
Aug 08 2018 13:04
With Singularity it could work with sudo, but it needs additional caps
there isn't documentation on it, but I think it's this bit --> https://github.com/singularityware/singularity/blob/master/src/runtime/startup/scontainer.go#L106
here is what you get without it
$ sudo singularity shell --bind /usr/local cromwell.simg 
[sudo] password for vanessa: 
Singularity: Invoking an interactive shell within container...
Singularity cromwell.simg:~> which singularity
/usr/local/bin/singularity
Singularity cromwell.simg:~> whoami
root
Singularity cromwell.simg:~> singularity run shub://vsoch/hello-world
Progress |===================================| 100.0% 
ERROR  : Failed to set processus capabilities
ABORT  : Retval = 255
Kevin Sayers
@KevinSayers
Aug 08 2018 13:16
@vsoch interesting
Paolo Di Tommaso
@pditommaso
Aug 08 2018 13:18
@KevinSayers yes, nextflow -d run .. etc
@fmorency the hash is calculated using full path + size + last-modified time
Félix C. Morency
@fmorency
Aug 08 2018 13:21
@pditommaso could there be issues with network FS and last-modified time? I mean, it's the exact same file. It hasn't been recomputed.
Paolo Di Tommaso
@pditommaso
Aug 08 2018 13:22
A few times I've seen a network FS return different timestamps for the same file. Try to track whether that could be the problem.
Félix C. Morency
@fmorency
Aug 08 2018 13:23
is there a way for me to track this using NF? can I print the fullpath hash, size hash and lastmodified time hash?
Paolo Di Tommaso
@pditommaso
Aug 08 2018 13:26
use the following one liner
println nextflow.util.CacheHelper.hasher(file('<file path>')).hash()
repeat 100 times with an interval of 5 secs (for example)
and see if it changes
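The repeat-and-compare check suggested above could be scripted roughly as follows. This is only a sketch: it reuses the CacheHelper one-liner exactly as quoted, has to be run through Nextflow so that file() and the nextflow.util classes are available, and the script name check_hash.nf is a made-up example.

```groovy
// check_hash.nf -- hypothetical file name; run with: nextflow run check_hash.nf
// Replace '<file path>' with the file whose hash you want to watch.

def seen = [] as Set

100.times { i ->
    // Same call as the one-liner above, converted to its hex string form
    def h = nextflow.util.CacheHelper.hasher(file('<file path>')).hash().toString()

    // Set.add returns true only the first time a given value is stored
    if (seen.add(h)) {
        println "iteration ${i}: new hash ${h}"
    }
    sleep(5000)   // wait 5 seconds between checks
}

println "distinct hashes observed: ${seen.size()}"
```

If more than one distinct hash is reported for a file that nobody touched, the path, size, or last-modified time seen by Nextflow is changing between reads, which would point at the file system.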
Félix C. Morency
@fmorency
Aug 08 2018 13:54
@pditommaso is the hash from this one-liner supposed to be the same one that -dump-hashes provides?
Paolo Di Tommaso
@pditommaso
Aug 08 2018 13:54
one for the file
Félix C. Morency
@fmorency
Aug 08 2018 13:58

Interesting. I ran the little script while my pipeline is running and the hash is not the same. For the file /imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/80/ce017e528006dfac7d4e2ede4cbf71/Felix__pre_b0_brain.nii.gz, the one-liner keeps returning 155e6d17b241a24e2c4bd5f0f69fd566 while -dump-hashes returns

 67862a21f9d9bdfb38b5edcbb37918ff [nextflow.util.ArrayBag] [FileHolder(sourceObj:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/80/ce017e528006dfac7d4e2ede4cbf71/Felix__pre_b0_brain.nii.gz, storePath:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/80/ce017e528006dfac7d4e2ede4cbf71/Felix__pre_b0_brain.nii.gz, stageName:Felix__pre_b0_brain.nii.gz)]

If I stop the pipeline and -resume, the job gets rescheduled with

 6b2ba469532afae4180268e375ceea63 [nextflow.util.ArrayBag] [FileHolder(sourceObj:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/80/ce017e528006dfac7d4e2ede4cbf71/Felix__pre_b0_brain.nii.gz, storePath:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/80/ce017e528006dfac7d4e2ede4cbf71/Felix__pre_b0_brain.nii.gz, stageName:Felix__pre_b0_brain.nii.gz)]

while the one-liner still returns me 155e6d17b241a24e2c4bd5f0f69fd566

If I -resume again, the job gets cached with
6b2ba469532afae4180268e375ceea63 [nextflow.util.ArrayBag] [FileHolder(sourceObj:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/80/ce017e528006dfac7d4e2ede4cbf71/Felix__pre_b0_brain.nii.gz, storePath:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/80/ce017e528006dfac7d4e2ede4cbf71/Felix__pre_b0_brain.nii.gz, stageName:Felix__pre_b0_brain.nii.gz)]
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:03
use this instead
import nextflow.util.* 
def bag = new ArrayBag( [file('<your file>')] )
println CacheHelper.hasher(bag).hash()
Félix C. Morency
@fmorency
Aug 08 2018 14:07

I started the pipeline and get

 cfc9a879488ea6fd7a9ee95e471e813e [nextflow.util.ArrayBag] [FileHolder(sourceObj:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/85/953487da69458aaa69601971416ce3/Felix__pre_b0_brain.nii.gz, storePath:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/85/953487da69458aaa69601971416ce3/Felix__pre_b0_brain.nii.gz, stageName:Felix__pre_b0_brain.nii.gz)]

I started the script while the pipeline is still running and get ee36866e95b471abb2c1fb512a24ed28. I stopped the pipeline and re-started it with -resume and get

ee36866e95b471abb2c1fb512a24ed28 [nextflow.util.ArrayBag] [FileHolder(sourceObj:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/85/953487da69458aaa69601971416ce3/Felix__pre_b0_brain.nii.gz, storePath:/imk/imk-bignas/developers/fmorency/Imeka/20180807-NF-HASH/Results/work/85/953487da69458aaa69601971416ce3/Felix__pre_b0_brain.nii.gz, stageName:Felix__pre_b0_brain.nii.gz)]
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:08
what's the value of the snippet I wrote above ?
Félix C. Morency
@fmorency
Aug 08 2018 14:08
It seems the hash computed and stored the first time a process is executed is not the right one
It's ee36866e95b471abb2c1fb512a24ed28
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:09
can you isolate this behaviour in a test case ?
Félix C. Morency
@fmorency
Aug 08 2018 14:09
I can try
It seems ArrayBag is doing something it shouldn't
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:25
@pditommaso Is this feature documented anywhere
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:25
which feature ?
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:25
(the nextflow -d run ...)
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:25
no
:grin:
more an experiment
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:26
OK good to know
You should try to document a little more, or organize a hackathon ;-)
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:27
I should also try to survive :smile:
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:28
That's true
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:28
and the hackathon is waiting for you!
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:28
We will be making more PRs on the nextflow docs, so you'll have some help ;-)
I know, but I'm afraid I can't join, I've already been to too many conferences this year
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:29
ouch! this is bad for your karma! :satisfied:
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:31
I'll try to help remotely ;-)
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:31
:+1:
Vanessasaurus
@vsoch
Aug 08 2018 14:40
oooh I love that flyer thing
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:40
:smile:
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:40
@vsoch You should go to Barcelona
Paolo Di Tommaso
@pditommaso
Aug 08 2018 14:41
it would be nice to have a Singularity guru :wink:
Maxime Garcia
@MaxUlysse
Aug 08 2018 14:41
And to get some stickers
Vanessasaurus
@vsoch
Aug 08 2018 15:16
omg I would love that, just no avenue to do so
I'm a big percentage Spaniard you know!
my grandfather was Cuban, my grandma Puerto Rican
I'm just the unfortunate generation born in the US that has completely lost touch with the culture because my mom was terrified of me being different (she grew up in Cuba)
and my Dad is some Jewish dude from California, which makes me totally lacking any real cultural identity
Mike Smoot
@mes5k
Aug 08 2018 15:18
Spaniard? Them's fightin' words in Catalonia!
Paolo Di Tommaso
@pditommaso
Aug 08 2018 15:18
aahha, mike! putting gasoline on fire !
Mike Smoot
@mes5k
Aug 08 2018 15:19
I got to join some very exciting protests last time!
Paolo Di Tommaso
@pditommaso
Aug 08 2018 15:19
@vsoch very interesting story!!
Vanessasaurus
@vsoch
Aug 08 2018 15:20
is the air cleaner there?
I'm thinking of moving to Canada at some point, but I have never thought about Europe
Mike Smoot
@mes5k
Aug 08 2018 15:21
@pditommaso can you elaborate on your plans to extend batch configuration in a more general way? Just curious.
Paolo Di Tommaso
@pditommaso
Aug 08 2018 15:21
well, maybe barcelona has some problems, but I guess definitely better than LA or SF
Vanessasaurus
@vsoch
Aug 08 2018 15:21
it might be a problem I don't speak French
Paolo Di Tommaso
@pditommaso
Aug 08 2018 15:22
but in canada they don't have this
and this
@mes5k coming back to batch, the idea is to allow NF to create the batch environment on the fly
Félix C. Morency
@fmorency
Aug 08 2018 15:26
@vsoch hey I live in sherbrooke, quebec, canada :) it's not really a problem if you don't speak french. everyone speaks english.
@pditommaso good news I could reproduce it in a toy use-case. I'm cleaning up and opening an issue today
Mike Smoot
@mes5k
Aug 08 2018 15:27
So do what I now do with terraform dynamically with NF?
Paolo Di Tommaso
@pditommaso
Aug 08 2018 15:28
yes, that's the idea, ideally it should be a zero-config env
Mike Smoot
@mes5k
Aug 08 2018 15:30
Ok, got it. I'm excited to see what you come up with.
Paolo Di Tommaso
@pditommaso
Aug 08 2018 15:31
stay tuned :smile:
Netsanet Gebremedhin
@gnetsanet
Aug 08 2018 15:52
Hello everyone. Is it possible to access and Yaml.load the -params-file from within Nextflow? Of course I can access individual contents of the YAML file, however I am hoping to rewrite/update the YAML config file and maybe use the updated version in subsequent processes
Mike Smoot
@mes5k
Aug 08 2018 15:54
Why not just create a new yaml file with whatever data you want (from params or otherwise)?
Netsanet Gebremedhin
@gnetsanet
Aug 08 2018 16:02
Yes, that should work. I did not want to go that route since it will be a lengthy rewrite when dealing with a larger set of params or params that are not known a priori.
What I was hoping is, regardless of how numerous or how variable/dynamic the params are in that file, I will load the YAML and add new mappings while retaining the previous contents.
I guess the only other way is to load it again independently of NF ?
Paolo Di Tommaso
@pditommaso
Aug 08 2018 16:05
you can use the snakeyaml API: import the org.yaml.snakeyaml.Yaml class and use it as in any Java/Groovy script
Netsanet Gebremedhin
@gnetsanet
Aug 08 2018 16:12
Yes, I am thinking of doing something like:
import org.yaml.snakeyaml.Yaml
x = new Yaml().load(${params-file})
or
import org.yaml.snakeyaml.Yaml
x = new Yaml().load(workflow.params-file)

Instead of plain:

import org.yaml.snakeyaml.Yaml
x = new Yaml().load(new FileReader('/path/to/config.yaml'))

:-)
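For the load, extend, and rewrite idea discussed here, a minimal SnakeYAML sketch might look like the following. The function name mergeYaml and its parameters are hypothetical, and the path of the YAML file is passed in explicitly (for example through a separate, made-up parameter such as params.yaml_path), since nothing in this thread confirms that Nextflow exposes the -params-file path itself.

```groovy
import org.yaml.snakeyaml.Yaml

// Load a YAML file, add or overwrite top-level mappings, and write the result
// to a new file. 'mergeYaml', 'srcPath', 'extra' and 'dstPath' are illustrative names.
def mergeYaml(String srcPath, Map extra, String dstPath) {
    def yaml = new Yaml()

    // Yaml.load returns null for an empty document, so fall back to an empty map
    def loaded = yaml.load(new File(srcPath).text)
    Map data = (loaded instanceof Map) ? new LinkedHashMap(loaded) : [:]

    // Existing keys are kept; keys present in 'extra' are added or replaced
    data.putAll(extra)

    new File(dstPath).text = yaml.dump(data)
    return data
}

// Usage sketch (the path and keys are placeholders):
// def updated = mergeYaml('/path/to/config.yaml', [new_key: 'new_value'], 'updated.yaml')
```

Writing to a new file rather than overwriting the original keeps the input of the run untouched, and the new file can then be handed to later processes like any other input.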

Félix C. Morency
@fmorency
Aug 08 2018 17:21
@pditommaso I opened #828. Let me know if you can reproduce the issue