These are chat archives for nextflow-io/nextflow

5th
May 2016
Paolo Di Tommaso
@pditommaso
May 05 2016 06:01
IOError: [Errno 13] Permission denied: 'file_worm.motion.csv'
Docker requires root permission to write files, but this is not allowed by the NFS file system
To solve it, set process.scratch=true in your nextflow.config file
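A minimal sketch of that setting, assuming the nextflow.config file sits next to the pipeline script and nothing else overrides the option:

```groovy
// nextflow.config
// Run each task in a node-local scratch directory instead of the shared NFS
// work directory; declared outputs are copied back when the task completes.
process.scratch = true
```

The same option can also be set for a single process rather than globally, as happens later in this conversation with process.$alignUPP.scratch.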
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 07:30
ok
thanks
Evan Floden
@evanfloden
May 05 2016 12:25
Maria and I have an issue when splitting a channel fasta into fasta_1 and fasta_2. Only the first process (align) seems to run: https://github.com/skptic/phyvaluate-nf
Paolo Di Tommaso
@pditommaso
May 05 2016 12:26
is there a container image that I can use to test it?
Evan Floden
@evanfloden
May 05 2016 12:26
Okay, give us a few minutes
mariach
@mariach
May 05 2016 12:32
So the problem was us forgetting to put ( ) after a "set file (' ')"
We need to make Nextflow output an error in such cases
Paolo Di Tommaso
@pditommaso
May 05 2016 12:33
it looks weird
so is it fixed?
mariach
@mariach
May 05 2016 12:37
that thing yes, but we are fighting with docker
Paolo Di Tommaso
@pditommaso
May 05 2016 12:38
:)
mariach
@mariach
May 05 2016 12:38
we want to activate it only for one process
Paolo Di Tommaso
@pditommaso
May 05 2016 12:38
so, what's the problem?
mariach
@mariach
May 05 2016 12:38
how do we do that? because what we are doing so far activates it for all processes
Paolo Di Tommaso
@pditommaso
May 05 2016 12:39
in the nextflow.config add
mariach
@mariach
May 05 2016 12:40
?????
Paolo Di Tommaso
@pditommaso
May 05 2016 12:40
process.$the-process-name.container = 'container-name'
mariach
@mariach
May 05 2016 12:40
did that
of course :P
Paolo Di Tommaso
@pditommaso
May 05 2016 12:40
I won't believe it
:grin:
push as it is, I will do
Evan Floden
@evanfloden
May 05 2016 12:43
We keep finding the solution (and new problems) as you answer!
Paolo Di Tommaso
@pditommaso
May 05 2016 12:43
this is good!
constructive dialectic :)
mariach
@mariach
May 05 2016 12:45
ok we didn't fix it. We only thought so! Committing current state in git
Paolo Di Tommaso
@pditommaso
May 05 2016 12:46
finally I can have some fun
mariach
@mariach
May 05 2016 12:46
hahahah :P
Paolo Di Tommaso
@pditommaso
May 05 2016 12:48
$ docker pull cbcrg/pasta_upp
Using default tag: latest
Pulling repository docker.io/cbcrg/pasta_upp
Error: image cbcrg/pasta_upp not found
push that image please
mariach
@mariach
May 05 2016 12:51
it is already pushed
go to russia and check
Paolo Di Tommaso
@pditommaso
May 05 2016 12:53
Do you see it here?
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 12:54
hey, just saw this: to push to the cbcrg docker hub do I need any credentials, and if so, can you send them to me, invite me...?
I would like to push pergola docker
Paolo Di Tommaso
@pditommaso
May 05 2016 12:54
sure, what is your handle?
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 12:55
handle?
Paolo Di Tommaso
@pditommaso
May 05 2016 12:55
em, user name on docker hub
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 12:56
I will never learn all this jargon :worried:
joseespinosa
Paolo Di Tommaso
@pditommaso
May 05 2016 12:57
ok, check if you can push now
and explain to Maria how to do it ..
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 12:57
:+1:
hahaha
:smile:
I just learn, I don't think I am a proper teacher
Paolo Di Tommaso
@pditommaso
May 05 2016 12:59
:)
anyway is the push working now?
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 13:00
just one question before
mariach
@mariach
May 05 2016 13:01
ok so then how is benchfam running??
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 13:01
is the latest tag applied automatically, or is it the user who should tag it this way?
mariach
@mariach
May 05 2016 13:01
because they use the same container and run on cl-7
Paolo Di Tommaso
@pditommaso
May 05 2016 13:01
@JoseEspinosa um, I think you need to update it
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 13:03
ok
I'll try
Paolo Di Tommaso
@pditommaso
May 05 2016 13:05
@mariach no, it's another one
process.$6_Large_scale_MSAs.container = 'cbcrg/benchfam_large_scale'
mariach
@mariach
May 05 2016 13:06
is this the image for UPP and PASTA
?
Paolo Di Tommaso
@pditommaso
May 05 2016 13:07
there's only one image, thus it must be
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 13:07
@pditommaso
docker push cbcrg/pergola
The push refers to a repository [docker.io/cbcrg/pergola]
a0a49196bba3: Preparing 
f1dc0d589ff3: Preparing 
25077f3237c9: Preparing 
1f20b1510a3b: Preparing 
070b93a6a767: Preparing 
abb568b37ae2: Waiting 
6aeada42326d: Waiting 
5f70bf18a086: Waiting 
5cf01afeecd4: Waiting 
unauthorized: authentication required
Paolo Di Tommaso
@pditommaso
May 05 2016 13:07
um, wait
@JoseEspinosa try it again
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 13:08
on it
seems to be working!!!
Paolo Di Tommaso
@pditommaso
May 05 2016 13:08
:+1:
Jose Espinosa-Carrasco
@JoseEspinosa
May 05 2016 13:08
thanks!!! :smile:
Paolo Di Tommaso
@pditommaso
May 05 2016 13:11
@mariach OK, I'm pushing it
Paolo Di Tommaso
@pditommaso
May 05 2016 13:23
@mariach @skptic I guess in that image there's no mega_coffee
I can't run it
mariach
@mariach
May 05 2016 13:23
exactly, because we don't want to use docker for that process
Evan Floden
@evanfloden
May 05 2016 13:24
Okay, so in our process 'align' which contains mega, we do NOT want to use Docker. However, the .command.run contains:
docker run -i --memory 20480m -e "NXF_DEBUG=${NXF_DEBUG:=0}" -e "BASH_ENV=/nfs/users/cn/efloden/projects/megaCoffee/phyvaluate-nf/work/87/c9e124adbc011a92f7a7379aa1badd/.command.env" -v /nfs/users/cn/efloden/projects/megaCoffee/phyvaluate-nf:/nfs/users/cn/efloden/projects/megaCoffee/phyvaluate-nf -v "$PWD":"$PWD" -w "$PWD" --entrypoint /bin/bash $cpuset --name $NXF_BOXID [:] -c "/bin/bash -ue /nfs/users/cn/efloden/projects/megaCoffee/phyvaluate-nf/work/87/c9e124adbc011a92f7a7379aa1badd/.command.sh"
) >"$COUT" 2>"$CERR" &
we only want docker for the alignUPP process
The generalised question is: how do I specify to run with Docker for only one process?
Paolo Di Tommaso
@pditommaso
May 05 2016 13:25
this thing is misplaced
either you put it outside the profiles
or you put it inside a profile, e.g.
profiles {
  myProfile {
       process.$alignUPP.container='cbcrg/pasta_upp'
       process.$alignUPP.queue="cn-el7"
       process.$alignUPP.memory="40G"
       process.$alignUPP.cpus=1
       process.$alignUPP.scratch=true      

       docker { enabled = true }  
  }
}
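For comparison, the first alternative mentioned above (placing the settings outside the profiles) might look like the sketch below. It reuses the container name and the process.$name syntax from this conversation; which settings to move to the top level is an assumption.

```groovy
// nextflow.config: top-level placement, outside any profiles { } block,
// so these settings apply regardless of which -profile is selected.
// Legacy process.$name syntax, exactly as used in this conversation.
process.$alignUPP.container = 'cbcrg/pasta_upp'
process.$alignUPP.scratch = true

docker { enabled = true }
```

With this layout only alignUPP has a container assigned, and cluster-specific options such as the queue can stay inside a site profile.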
mariach
@mariach
May 05 2016 13:28
ok still not working...
Paolo Di Tommaso
@pditommaso
May 05 2016 13:28
I think you are tired today, guys .. :)
mariach
@mariach
May 05 2016 13:29
also it seems that if you do that then you need to choose between CRG or myProfile. We committed again the latest version :P
Paolo Di Tommaso
@pditommaso
May 05 2016 13:31
wait
um, the dataset the pipeline is referencing isn't there ..
Cannot find any input sequence files matching: /home/pditommaso/projects/phyvaluate-nf/tutorial/data/*.tt
Evan Floden
@evanfloden
May 05 2016 13:38
try again
Paolo Di Tommaso
@pditommaso
May 05 2016 13:38
@skptic @mariach uh?
ah
mariach
@mariach
May 05 2016 13:38
:)
Paolo Di Tommaso
@pditommaso
May 05 2016 13:40
better
how long does it take to run?
mariach
@mariach
May 05 2016 13:40
10min more or less
for one thing
Paolo Di Tommaso
@pditommaso
May 05 2016 13:41
can't a small dataset be used just to test it?
mariach
@mariach
May 05 2016 13:41
can you edit the docker file accordingly?! so we can try it
nope
Paolo Di Tommaso
@pditommaso
May 05 2016 13:41
nope
is not a valid answer
mariach
@mariach
May 05 2016 13:42
we don't have a "fake" true reference tree! Blame biology not us :P
Paolo Di Tommaso
@pditommaso
May 05 2016 13:44
why not?
it can be a fake tree, it's not important that it produces a meaningful result
it's only needed to test the pipeline execution logic
I can't help this way guys, please put together a dataset that allows the pipeline to be run in a few seconds
Evan Floden
@evanfloden
May 05 2016 14:00
we just put a test dataset
data/test
Paolo Di Tommaso
@pditommaso
May 05 2016 14:01
great!
upp is not happy with it
Error executing process > 'alignUPP (1)'

Caused by:
  Missing output file(s): 'seatoxin_upp.aln' expected by process: alignUPP (1)

Command executed:

  run_upp.py -s seatoxin.fa -m amino --cpu 1 -d seatoxin -o seatoxin_upp.aln
  cp seatoxin/pasta.fasttree seatoxin_upp.nwk

Command error:
  WARNING: Your kernel does not support swap limit capabilities. Limitation discarded.
  [14:03:52] exhaustive_upp.py (line 59):     INFO: Reading input sequences: <open file 'seatoxin.fa', mode 'r' at 0x1c889c0>
  [14:03:52] exhaustive_upp.py (line 79):     INFO: Backbone size set to: 93
  [14:03:52] exhaustive_upp.py (line 85):     INFO: Writing backbone set. 
  [14:03:52] filemgr.py (line 114):     INFO: Root temp directory built: /tmp/sepp/seatoxin_upp.aln.u51s4C
  [14:03:52] exhaustive_upp.py (line 89):     INFO: Generating pasta backbone alignment and tree. 
  [14:03:52] jobs.py (line 66):     INFO: Starting pasta Job with input: run_pasta.py --num-cpus=1 -i /tmp/sepp/seatoxin_upp.aln.u51s4C/backbone/backboneQ32ZW7.fas --datatype=protein --temporaries=/tmp/nxf.oqlU4p2WlN/seatoxin/pastatmp -j pastajob --output-directory=/tmp/sepp/seatoxin_upp.aln.u51s4C/pastaout/
  [14:04:02] jobs.py (line 111):     INFO: Finished pasta Job with input: run_pasta.py --num-cpus=1 -i /tmp/sepp/seatoxin_upp.aln.u51s4C/backbone/backboneQ32ZW7.fas --datatype=protein --temporaries=/tmp/nxf.oqlU4p2WlN/seatoxin/pastatmp -j pastajob --output-directory=/tmp/sepp/seatoxin_upp.aln.u51s4C/pastaout/ with:
   return code: 0
   output: PASTA INFO: Reading input sequences from '/tmp/sepp/seatoxin_upp.aln.u51s4C/backbone/backboneQ32ZW7. ... (continued: 2345 ) ...
  [14:04:02] exhaustive_upp.py (line 101):     INFO: Backbone alignment written to <open file '/tmp/nxf.oqlU4p2WlN/seatoxin/pasta.fasta', mode 'r' at 0x1c88b70>.
  Backbone tree written to <open file '/tmp/nxf.oqlU4p2WlN/seatoxin/pasta.fasttree', mode 'r' at 0x1c88c00>
  [14:04:02] exhaustive_upp.py (line 104):     INFO: No query sequences to align.  Final alignment saved as /tmp/nxf.oqlU4p2WlN/seatoxin/seatoxin_upp.aln_alignment.fasta
Paolo Di Tommaso
@pditommaso
May 05 2016 14:11
@skptic But do you really need the aln? The upp_alignments is used nowhere
mariach
@mariach
May 05 2016 14:13
try now
there was a problem with one of the input files
and yes we need it. The process "compare_tree" uses it
Paolo Di Tommaso
@pditommaso
May 05 2016 14:15
ok, I managed to reproduce the problem
mariach
@mariach
May 05 2016 14:15
how about a solution to it :P
Paolo Di Tommaso
@pditommaso
May 05 2016 14:15
no, upp_alignments is not used
mariach
@mariach
May 05 2016 14:15
it will be used in the future
we need it
Paolo Di Tommaso
@pditommaso
May 05 2016 14:16
future is uncertain ..
mariach
@mariach
May 05 2016 14:17
yes and "May the force be with us", which clearly is not the case today :P
Paolo Di Tommaso
@pditommaso
May 05 2016 14:35
the config was better before ..
Paolo Di Tommaso
@pditommaso
May 05 2016 14:50
@mariach @skptic Solved
skptic/phyvaluate-nf#1
it was missing the into declaration
Now it stops with Can't locate MOTree.pm, but you can deal with it
Evan Floden
@evanfloden
May 05 2016 15:11

Sorry Paolo,

This should have been said before. I don't think we are getting the same errors. This is what we have been getting:

nextflow run phyvaluate.nf --input='data/test/*.fa' --ref_trees='data/test/*.tt' --output='results' -with-docker -profile crg
N E X T F L O W  ~  version 0.18.0
Launching phyvaluate.nf
c o n c T r e e  - N F  ~  version 0.1
=====================================
name                   : Evaluation of Phylogenetic Trees from Simulated Data
input                  : data/test/*.fa
ref_trees              : data/test/*.tt
output                 : results
aligner                : MEGA-Coffee


[warm up] executor > crg
[89/9a964b] Submitted process > align (1)
[83/e96c08] Submitted process > alignUPP (1)
Error executing process > 'align (1)'

Caused by:
  Process 'align (1)' terminated with an error exit status

Command executed:

  mega_coffee -i seatoxin.fa                     -o "seatoxin_prediction.aln"                     --cluster_size 2                     --cluster_number 5000                     -n 2                     --phylo_out "seatoxin_prediction.nwk"                     -d

Command exit status:
  125

Command output:
  (empty)

Command error:
  docker: Error parsing reference: "[:]" is not a valid repository/tag.
  See 'docker run --help'.

Work dir:
  /nfs/users/cn/efloden/projects/megaCoffee/phyvaluate-nf/work/89/9a964b82afce1a74a81acc86dd6924

Tip: view the complete command output by changing to the process work dir and entering the command: 'cat .command.out'
Paolo Di Tommaso
@pditommaso
May 05 2016 15:12
don't use the -with-docker flag
it is enabled by the config file
Evan Floden
@evanfloden
May 05 2016 15:13
great, lost in translation
Paolo Di Tommaso
@pditommaso
May 05 2016 15:14
:)
Evan Floden
@evanfloden
May 05 2016 15:14
working now!
Paolo Di Tommaso
@pditommaso
May 05 2016 15:14
actually it looks like a weird bug, I need to check
thanks guys
Mike Smoot
@mes5k
May 05 2016 15:36
I'm wondering if there are any tools for visualizing the DAGs inferred from Nextflow pipelines? If not, is there a way of dumping the DAG somehow?
Paolo Di Tommaso
@pditommaso
May 05 2016 15:39
Currently it's only possible to produce a chart like this
but it's not a DAG
in principle it should be possible to infer it, though I think it would not be a trivial task
Mike Smoot
@mes5k
May 05 2016 15:44
Looks nice! Can you explain how the chart was produced?
Paolo Di Tommaso
@pditommaso
May 05 2016 15:45
Using the -timeline option
it produces an HTML report of all tasks executed
Mike Smoot
@mes5k
May 05 2016 15:45
Perfect, will check that out. Thanks!
Paolo Di Tommaso
@pditommaso
May 05 2016 15:45
:+1:
Paolo Di Tommaso
@pditommaso
May 05 2016 21:43
@mes5k Just realised that the option is -with-timeline not -timeline
Mike Smoot
@mes5k
May 05 2016 21:49
yeah, figured that one out :)
Paolo Di Tommaso
@pditommaso
May 05 2016 21:50
great
anyway I think I've found a way to render the DAG as well
Mike Smoot
@mes5k
May 05 2016 22:03
Cool - I've been playing with a TraceObserver that should write a DAG after the pipeline runs.
Still not working, but close
Paolo Di Tommaso
@pditommaso
May 05 2016 22:03
wow, interesting ..
mariach
@mariach
May 05 2016 22:04
cool! :D Please commit it when you manage to make it work!
Mike Smoot
@mes5k
May 05 2016 22:05
will keep you posted!
Paolo Di Tommaso
@pditommaso
May 05 2016 22:05
I still don't understand how you track the relationship between the processes, since that information isn't available at the TraceObserver level
Mike Smoot
@mes5k
May 05 2016 22:06
just started a meeting - will reply properly in a bit
Paolo Di Tommaso
@pditommaso
May 05 2016 22:06
no problem, almost time to sleep here :)
Paolo Di Tommaso
@pditommaso
May 05 2016 22:34
nextflow-dag.png
well, still some work to do ..
Mike Smoot
@mes5k
May 05 2016 23:17
That looks like a great start! My idea was to simply note the names of the processes, inputs, and outputs as they execute and then match inputs to outputs to generate edges. Unfortunately the input and output names that I see (when I see anything) aren't what I expect, so this approach isn't working.