These are chat archives for nextflow-io/nextflow

25th
Jul 2018
micans
@micans
Jul 25 2018 08:33
@pditommaso ok yes will do that's the right place, but I will be a bit slow in responding due to work.
Paolo Di Tommaso
@pditommaso
Jul 25 2018 08:34
No problem
It's experimental stuff
misssoft
@misssoft
Jul 25 2018 08:48
Hello, I am new to Gitter and not sure this is the right place. Here is the issue we are facing with our Nextflow implementation on Ignite.
The problem is that if we run two nextflow run commands in the Ignite cluster, both fail silently... is it possible to run several nextflow commands at once on an Ignite cluster?
Paolo Di Tommaso
@pditommaso
Jul 25 2018 08:53
each workflow run needs to operate in its own cluster instance
Egon Willighagen
@egonw
Jul 25 2018 09:03
@pditommaso, @mes5k, thanks for your feedback yesterday... it kept me motivated to overcome the learning curve and I got something working now
Paolo Di Tommaso
@pditommaso
Jul 25 2018 09:04
you are welcome
Maxime Garcia
@MaxUlysse
Jul 25 2018 09:20
Well done @egonw !!!
Egon Willighagen
@egonw
Jul 25 2018 09:20
I would not say "well"... it was rather painful :)
Egon Willighagen
@egonw
Jul 25 2018 09:49
quick question... when using script:, how can I get the STDOUT output to show up in .nextflow.log (or on the STDOUT of ./nextflow run)?
(for debug purposes...)
Paolo Di Tommaso
@pditommaso
Jul 25 2018 09:51
add process.echo = true in the nextflow.config file
or echo true in a specific process
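A minimal sketch of the two options mentioned above, assuming the Nextflow 0.30.x syntax in use at the time of this chat. The process name `sayHello` and its script body are hypothetical.

```groovy
// nextflow.config
// Forward the stdout of every task to the terminal running `nextflow run`
process.echo = true
```

```groovy
// main.nf
// Alternative: enable it for a single process only
process sayHello {
    echo true

    script:
    """
    echo 'debug message from the task'
    """
}
```

The config setting applies to all processes, while the directive only affects the process that declares it, which keeps the output readable in larger pipelines.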
Denis Volk
@dvolk_gitlab
Jul 25 2018 09:58
@pditommaso if we use different ignite cluster instances (all running on the same vms) will they schedule 'cooperatively'?
Paolo Di Tommaso
@pditommaso
Jul 25 2018 09:59
nope, it's designed to use its own vms (ie. transient cluster)
Denis Volk
@dvolk_gitlab
Jul 25 2018 10:01
i mean if I have ignite cluster 1 running on machines 1,2,3 and ignite cluster 2 also running on machines 1,2,3 will it respect the nextflow resource limits or will it basically be using 2 times more than it should?
Paolo Di Tommaso
@pditommaso
Jul 25 2018 10:02
yes, it will use twice as much, unless you set the max amount of resources to use
Denis Volk
@dvolk_gitlab
Jul 25 2018 10:04
so by default if I have two clusters running on the same machines with 20 CPUs total, will it be running 20 or 40 nextflow tasks at a time?
Paolo Di Tommaso
@pditommaso
Jul 25 2018 10:06
40
Denis Volk
@dvolk_gitlab
Jul 25 2018 10:11
So with ignite it's not really possible to run two arbitrary nextflows at the same time?
If you want to run two, you have to basically edit the config and halve the resources available on both
Paolo Di Tommaso
@pditommaso
Jul 25 2018 10:12
it's designed to spawn its own cluster instance along with the vms
in the cloud this is not supposed to be a problem
if this is a problem, you should consider the AWS Batch executor, which manages the VM scaling for you
Denis Volk
@dvolk_gitlab
Jul 25 2018 10:17
We are on openstack
Paolo Di Tommaso
@pditommaso
Jul 25 2018 10:18
I see, in this case - maybe - Kubernetes could be a better option
Denis Volk
@dvolk_gitlab
Jul 25 2018 10:20
We might just queue them for now
We can reuse the cluster right?
Paolo Di Tommaso
@pditommaso
Jul 25 2018 10:21
yes
Denis Volk
@dvolk_gitlab
Jul 25 2018 10:21
OK thanks
This has been a source of mysterious failures for a while now :)
Paolo Di Tommaso
@pditommaso
Jul 25 2018 10:22
I understand
Vladimir Kiselev
@wikiselev
Jul 25 2018 11:51
has anyone had this problem before on Kubernetes cluster? Nextflow is not allowed to create pods:
Caused by:
  Request POST /api/v1/namespaces/default/pods returned an error code=403

  {
      "kind": "Status",
      "apiVersion": "v1",
      "metadata": {

      },
      "status": "Failure",
      "message": "pods is forbidden: User \"system:serviceaccount:default:default\" cannot create pods in the namespace \"default\"",
      "reason": "Forbidden",
      "details": {
          "kind": "pods"
      },
      "code": 403
  }
Paolo Di Tommaso
@pditommaso
Jul 25 2018 11:52
you may need to use a specific serviceAccount
are you able to create a pod w/o using NF ?
Vladimir Kiselev
@wikiselev
Jul 25 2018 11:56
we were able to start an NF pod using nextflow kuberun login -v testpvc:/mnt/gluster, but then once we were inside of this pod we got the error above.
Paolo Di Tommaso
@pditommaso
Jul 25 2018 11:57
"we were able to start an NF pod"
what has changed? NF version or the K8s config ?
Vladimir Kiselev
@wikiselev
Jul 25 2018 12:00

Hmm, this is on the machine from which we access the cluster:

ubuntu@anton-master:~/kubespray$ nextflow -v
nextflow version 0.30.2.4867

This is inside the NF pod:

high-wozniak:/mnt/gluster/ubuntu# nextflow -v
nextflow version 0.30.2.4867

so the version is the same

and we haven’t changed any config between these two actions
Paolo Di Tommaso
@pditommaso
Jul 25 2018 12:01
what I mean is, was it working in the past?
Vladimir Kiselev
@wikiselev
Jul 25 2018 12:03
oh, yes, it was
we haven’t touched it for a while (> 1 month)
yesterday we destroyed our old k8s cluster and created a new one
Paolo Di Tommaso
@pditommaso
Jul 25 2018 12:04
I see. Frankly I'm not a K8s admin guru, but it looks like a (default) user permission issue
Vladimir Kiselev
@wikiselev
Jul 25 2018 12:05
ok, will investigate further on our side, I just haven’t seen this one before, wanted to check before digging into it
thanks!!
Paolo Di Tommaso
@pditommaso
Jul 25 2018 12:06
welcome
Vladimir Kiselev
@wikiselev
Jul 25 2018 12:09
ok, checked the k8s version - it is the same as before v1.9.2
it’s hardcoded in our installation scripts
Paolo Di Tommaso
@pditommaso
Jul 25 2018 12:11
it may help
Vladimir Kiselev
@wikiselev
Jul 25 2018 12:20
cool, thanks!
Paolo Di Tommaso
@pditommaso
Jul 25 2018 12:20
solved ?
Vladimir Kiselev
@wikiselev
Jul 25 2018 12:39
no, still trying things...
Vladimir Kiselev
@wikiselev
Jul 25 2018 12:58
I think we figured it out
this worked: kubectl create clusterrolebinding nextflow --clusterrole=edit --serviceaccount=default:default -n default
we created a clusterrolebinding for serviceaccount=default:default with role=edit
Paolo Di Tommaso
@pditommaso
Jul 25 2018 12:59
I have also seen people creating specific namespaces and service accounts
it seems you are using default for everything
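A sketch of what the dedicated-namespace setup could look like from the Nextflow side. `k8s.namespace` and `k8s.serviceAccount` are Nextflow configuration settings; the names `nextflow` and `nextflow-runner` are hypothetical. The namespace, the service account, and a role binding that allows it to create pods would have to exist in the cluster first.

```groovy
// nextflow.config
k8s {
    // Hypothetical namespace reserved for pipeline pods
    namespace = 'nextflow'
    // Hypothetical service account that has been granted permission to create pods
    serviceAccount = 'nextflow-runner'
}
```

Compared with binding the `edit` role to `default:default`, this keeps the extra permissions away from every other workload that happens to run under the default service account.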
Vladimir Kiselev
@wikiselev
Jul 25 2018 13:01
yes, we tried the fastest thing
Paolo Di Tommaso
@pditommaso
Jul 25 2018 13:01
:ok_hand:
Vladimir Kiselev
@wikiselev
Jul 25 2018 13:02
Thanks a lot!
Alexander Peltzer
@apeltzer
Jul 25 2018 13:48
splitCsv can be used to split a TSV too right?
Paolo Di Tommaso
@pditommaso
Jul 25 2018 13:48
yep, use sep: '\t' argument
Alexander Peltzer
@apeltzer
Jul 25 2018 13:48
ERROR ~ Cannot cast object '    ' with class 'java.lang.String' to class 'groovy.lang.Closure'

 -- Check script 'main.nf' at line: 176 or see '.nextflow.log' file for more details
getting this, https://github.com/nf-core/ICGC-featureCounts/blob/a25649ce3eac8bd3135ad2d1c753c208e2759232/main.nf#L176
Okay, that's what I do
Paolo Di Tommaso
@pditommaso
Jul 25 2018 13:49
sep='\t' WHY equals !!!!
NF is not GO ! :joy:
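A short sketch of the working form, assuming a hypothetical tab-separated file `samples.tsv` with a header row containing the columns `sample_id` and `bam`. Groovy named arguments use a colon. Writing `sep='\t'` is instead an assignment expression whose value, a plain string, is passed as a positional argument, which is consistent with the cast error above.

```groovy
// 'samples.tsv' and its column names are hypothetical
Channel
    .fromPath('samples.tsv')
    .splitCsv(sep: '\t', header: true)   // named argument: colon, not '='
    .view { row -> "${row.sample_id} -> ${row.bam}" }
```

With `header: true` each row is emitted as a map keyed by column name; without it, each row is a list of values.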
Alexander Peltzer
@apeltzer
Jul 25 2018 13:50
goddamnit
Learning too many languages is not always a good thing
Thanks ;-)
Paolo Di Tommaso
@pditommaso
Jul 25 2018 13:50
welcome
Alexander Peltzer
@apeltzer
Jul 25 2018 14:07
when I specify an S3 work directory, is it normal that the prefix gets removed from the URL?
Jul-25 14:06:20.189 [main] DEBUG nextflow.Session - Work-dir: /XYZ-icgc-testrun/work [null]
Jul-25 14:06:20.491 [main] DEBUG nextflow.Session - Session start invoked
Paolo Di Tommaso
@pditommaso
Jul 25 2018 14:07
yes
aws batch ?
Alexander Peltzer
@apeltzer
Jul 25 2018 14:08
but I specified using the s3:// instance and get a
Jul-25 14:06:20.608 [main] ERROR nextflow.cli.Launcher - @unknown
java.lang.UnsupportedOperationException: null
        at com.upplication.s3fs.S3FileSystemProvider.move(S3FileSystemProvider.java:554)
Just on a regular EC2 instance right now, trying to test something quickly and then run it on AWSBatch yes
Paolo Di Tommaso
@pditommaso
Jul 25 2018 14:09
You can use S3 as work dir, only when using AWS Batch or Ignite executor
Alexander Peltzer
@apeltzer
Jul 25 2018 14:09
Aaah, okay
Good
So I can't really have results/work folders on S3 when running on a regular EC2 instance, right?
Paolo Di Tommaso
@pditommaso
Jul 25 2018 14:11
yes, you can
you can specify an S3 path in the publishDir directive of your processes
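A minimal sketch of that setup, assuming the process syntax in use at the time of this chat. The bucket name, process name, channel name, and placeholder command are all hypothetical. The work directory stays on local disk and only the declared outputs are copied to S3.

```groovy
process countReads {
    // Hypothetical bucket; 'copy' uploads the outputs instead of linking them
    publishDir 's3://my-example-bucket/results', mode: 'copy'

    output:
    file 'counts.txt' into counts_ch

    script:
    """
    echo 'placeholder result' > counts.txt
    """
}
```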
Alexander Peltzer
@apeltzer
Jul 25 2018 14:12
hm, doing that but getting an error message
Jul-25 14:09:57.969 [main] DEBUG nextflow.processor.TaskDispatcher - Dispatcher > start
Jul-25 14:09:57.970 [main] DEBUG nextflow.trace.TraceFileObserver - Flow starting -- trace file: /mybucketXYZ-icgc-testrun/results/pipeline_info/ICGC-FeatureCounts_trace.txt
Jul-25 14:09:58.139 [main] ERROR nextflow.cli.Launcher - @unknown
java.lang.UnsupportedOperationException: null
        at com.upplication.s3fs.S3FileSystemProvider.move(S3FileSystemProvider.java:554)
aws s3 ls s3://.... does resolve however and finds the path
Paolo Di Tommaso
@pditommaso
Jul 25 2018 14:13
what is the complete error trace?
Alexander Peltzer
@apeltzer
Jul 25 2018 14:14
Can send it to you
Paolo Di Tommaso
@pditommaso
Jul 25 2018 14:14
share it via pastebin / gist
Alexander Peltzer
@apeltzer
Jul 25 2018 14:15
ok
Paolo Di Tommaso
@pditommaso
Jul 25 2018 14:19
are you trying to store the trace file on S3? I'm not sure that's possible
Alexander Peltzer
@apeltzer
Jul 25 2018 14:20
Oh okay 👌 I guess we’d have that issue with most nf-core pipelines then
Paolo Di Tommaso
@pditommaso
Jul 25 2018 14:21
tho it would be useful to have it working in the S3 storage, you may want to open an issue for that
Alexander Peltzer
@apeltzer
Jul 25 2018 14:25
Ok i will do that now
Alexander Peltzer
@apeltzer
Jul 25 2018 14:59

Quick question, how do I bind the content of a file (containing just a URL) to a variable that I can then use?

url=$(cat $s3_path)
 wget -O $file_name $url
 featureCounts -a $gtf -g gene_id -o ${bam_featurecounts.baseName}_gene.featureCounts.txt -p -s $featureCounts_direction $file_name

This is unfortunately not working (though the statement above should work in regular bash syntax...)

Kevin Sayers
@KevinSayers
Jul 25 2018 15:18
@apeltzer if it is just a URL perhaps splitText?
input:
val filetext from Channel.fromPath('infile.txt').splitText()
Alexander Peltzer
@apeltzer
Jul 25 2018 18:28
Good point - but I have a tuple coming from the last process so this is not that easy :-(
Alexander Peltzer
@apeltzer
Jul 25 2018 18:33
(also it's a multiline string, as it contains a pre-authenticated AWS URL)
Will figure this out ;-)
Kevin Sayers
@KevinSayers
Jul 25 2018 18:55
ah
I think you just need to escape url, in the wget portion
\$url because it is a bash variable not NF
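A sketch of the escaping described above, assuming the process syntax in use at the time of this chat. The channel names `url_files_ch` and `bam_ch` and the output file name are hypothetical. Inside a double-quoted script block, an unescaped `$name` is interpolated by Nextflow, so anything that Bash should evaluate needs a backslash. That includes the command substitution: Groovy does not accept a bare `$(` inside a double-quoted string.

```groovy
process downloadFromUrlFile {
    input:
    file s3_path from url_files_ch   // hypothetical channel of files that each hold a URL

    output:
    file 'downloaded.bam' into bam_ch

    script:
    // $s3_path is filled in by Nextflow; \$(...) and \$url are left for Bash
    """
    url=\$(cat $s3_path)
    wget -O downloaded.bam "\$url"
    """
}
```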
Jemma Nelson
@fwip
Jul 25 2018 22:27
re: getting the content of a file, try doing file(myPath).text
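A small sketch of that suggestion, assuming a hypothetical file `url.txt` that holds the URL. `file()` returns a path object and `.text` reads its whole content as a string; `trim()` only strips leading and trailing whitespace, so line breaks inside a multiline value are preserved.

```groovy
// 'url.txt' is a hypothetical file holding the URL
def url = file('url.txt').text.trim()
println "Will download from: ${url}"
```

This reads the file in the pipeline script itself, so it only applies when the path is known there; for a file produced by an upstream task, reading it inside the task script is the alternative.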