These are chat archives for nextflow-io/nextflow

12th
Feb 2018
Bioninbo
@Bioninbo
Feb 12 2018 09:43
Hello. Do you know how I can activate and deactivate a conda environment before and after each process?
I tried: beforeScript = singularity exec '../singularity/img_prod' source /opt/anaconda2/bin/activate R342 but it doesn't work. Or how to activate an environment for the whole script?
Paolo Di Tommaso
@pditommaso
Feb 12 2018 09:52
I don't understand the mixed use of singularity and conda here
Bioninbo
@Bioninbo
Feb 12 2018 09:55
Ah right, thanks! However, when I do beforeScript = "source /opt/anaconda2/bin/activate R342" I get the following error: CondaEnvironmentNotFoundError: Could not find environment: R342 . But the environment is present when I open a singularity shell and type conda info --envs.
Paolo Di Tommaso
@pditommaso
Feb 12 2018 09:59
frankly I don't know (not a conda expert)
Bioninbo
@Bioninbo
Feb 12 2018 09:59
or if I type singularity exec '../singularity/img_prod' conda info --envs
Paolo Di Tommaso
@pditommaso
Feb 12 2018 09:59
you may want to debug the failing task by changing into the task work dir and running bash .command.run
it may help you to troubleshoot the problem
that said, the easiest way is likely to activate the conda environment and then run NF
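The debugging step suggested above can be sketched as a small shell helper. This is a minimal sketch, not something from the chat: `debug_task` is a hypothetical function name, and the only thing it relies on is that Nextflow writes a `.command.run` launcher script into each task work directory.

```shell
#!/usr/bin/env bash
# Sketch: re-run a failed Nextflow task by hand from its work directory.
# debug_task is a hypothetical helper name, not part of Nextflow.
debug_task() {
    local task_dir="$1"
    if [ ! -f "$task_dir/.command.run" ]; then
        echo "no .command.run found in $task_dir" >&2
        return 1
    fi
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$task_dir" || exit 1
        # .command.run is the launcher wrapper that Nextflow generates;
        # running it replays the task the same way Nextflow did.
        bash .command.run
    )
}
```

Calling `debug_task` with the work directory path reported for the failing task replays it outside Nextflow, so errors such as the missing conda environment show up directly in the terminal.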
Bioninbo
@Bioninbo
Feb 12 2018 10:01
I see. But I need to switch environments since I want to use different python versions
when I go to the folder and type the command it works
btw I use the singularity options runOptions = '--cleanenv --containall' ; maybe they mess up the nextflow environment?
Paolo Di Tommaso
@pditommaso
Feb 12 2018 10:03
I don't have experience with this configuration, can't help
check the launcher scripts created by NF and figure out what is wrong
Tim Diels
@timdiels
Feb 12 2018 10:04
I've heard that Docker is difficult to manage and too unstable for production, has anyone here had any issues with it? It's not possible to run Nextflow in the cloud without Docker, right?
Bioninbo
@Bioninbo
Feb 12 2018 10:04
Ok thanks for your help! I will try to do that. Otherwise I could try to change my pipeline to use only one conda environment.
Alexander Peltzer
@apeltzer
Feb 12 2018 10:04
Another suggestion: why don't you just create a separate container for each tool you want to call? Then reference these in each process scope without changing anything else
That's also possible @Bioninbo , I have quite good experiences with having individual Singularity/Docker containers for each process scope - or a "big" container for all steps, that's fine too
@timdiels can't confirm that, never really had more issues than with "normal" environments...
e.g. the module stuff on some HPC systems is far less usable IMHO than docker
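One way to picture the per-process container suggestion is a config file that maps each process to its own image. The following is a sketch under stated assumptions: the process names (`align`, `call_variants`) and image paths are hypothetical, `write_container_config` is a made-up helper, and the `withName` selector assumes a Nextflow version that supports it.

```shell
#!/usr/bin/env bash
# Sketch: write a nextflow.config that assigns one container per process.
# Process names and image paths below are hypothetical placeholders.
write_container_config() {
    local out_file="${1:-nextflow.config}"
    cat > "$out_file" <<'EOF'
singularity.enabled = true

process {
    withName: 'align' {
        container = '/path/to/images/samtools.simg'
    }
    withName: 'call_variants' {
        container = '/path/to/images/r342.simg'
    }
}
EOF
    # Print the path of the file that was written.
    echo "$out_file"
}
```

With a config like this, each process runs in its own image, so no environment switching is needed inside the task scripts.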
Bioninbo
@Bioninbo
Feb 12 2018 10:06
I see thanks for the suggestion @apeltzer.
Just wondering, isn't it a bit demanding in terms of disk usage?
Each image occupying quite some space
Alexander Peltzer
@apeltzer
Feb 12 2018 10:10
It depends what you do - I created simple images using Alpine Linux to keep disk usage as low as possible for some tools: Samtools ~150MB, ...
Doesn't work for all tools, but a fair number of tools can be packaged like this and work fine in a minimal Alpine Linux environment
Of course, if you use a large base image for your containers (e.g. a full-blown Ubuntu, and some even ship it with unrequired packages too), you're easily > 1GB
Bioninbo
@Bioninbo
Feb 12 2018 10:13
Ah good point! I was using ubuntu for my images. Thanks for the tips
Paolo Di Tommaso
@pditommaso
Feb 12 2018 10:15
@timdiels Docker in a multitenant cluster is definitely more challenging to set up and maintain than singularity. It requires a modern linux kernel. Also, I've experienced random failures pulling images.
Singularity is the tool of choice for HPC clusters
Bioninbo
@Bioninbo
Feb 12 2018 10:16
One more question that I've been wondering about for some time. Is it possible to use a singularity sandbox image with nextflow? I tried and it corrupted my images each time. So now I convert them to squashfs format before using them, but it takes quite some time for testing each change.
Paolo Di Tommaso
@pditommaso
Feb 12 2018 10:17
Is it possible to use a singularity sandbox image with nextflow?
what do you mean?
Alexander Peltzer
@apeltzer
Feb 12 2018 10:18
I think he means that it is possible to specify a so-called "sandbox" image (i.e. a writable image container) in Singularity and use that in Nextflow
It should be - use the option for the container path and point it to the directory/image etc. As far as I know, Singularity is simply called by nextflow, so if the command is passed directly, it should work....
if you run it like this: nextflow run -with-singularity /path/to/your/sandbox, it should work fine I guess
Bioninbo
@Bioninbo
Feb 12 2018 10:21
Thanks @apeltzer. And indeed that is what I meant.
Paolo Di Tommaso
@pditommaso
Feb 12 2018 10:21
would NF run in the singularity container or outside?
Bioninbo
@Bioninbo
Feb 12 2018 10:21
However, when I tried that it didn't work and then corrupted my sandbox
But maybe I did something wrong
Has anyone succeeded in using a sandbox image with nextflow?
@pditommaso inside I guess, since I use the -with-singularity option
Paolo Di Tommaso
@pditommaso
Feb 12 2018 10:24
when you use -with-singularity, tasks run in the container and NF runs outside
Alexander Peltzer
@apeltzer
Feb 12 2018 10:24
And that makes sense for testing purposes - that way you can quickly test and later "fix", then subsequently use a final working recipe for your container
Paolo Di Tommaso
@pditommaso
Feb 12 2018 10:25
ok, I almost understand
a sandbox is basically an unpacked singularity image in a host directory
never used this feature; once you have created the sandbox, what is the command line to run the container using the sandbox?
Alexander Peltzer
@apeltzer
Feb 12 2018 10:27
Yes. I think they implemented it for testing purposes - in the beginning Singularity required you to specify the image size at container build time, which was sometimes hard to estimate beforehand. That was circumvented by allowing people to use such a directory, which they call a sandbox
Bioninbo
@Bioninbo
Feb 12 2018 10:27
sudo singularity shell --writable img_dev/
if you want to edit your development container
then I convert it to an img_prod container in squashfs format when I am satisfied
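The edit-then-convert cycle described in this message could look roughly like the sketch below. It is an illustration under assumptions, not the exact commands used in the chat: `run` and `sandbox_cycle` are hypothetical helpers, the conversion step assumes the Singularity 2.4-style `singularity build <image> <sandbox>` form, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch of the sandbox-to-production cycle.
# By default the commands are only printed; set DRY_RUN=0 to execute them.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# sandbox_cycle is a hypothetical helper, not a Singularity command.
sandbox_cycle() {
    local sandbox="$1" prod_image="$2"
    # Open a writable shell in the sandbox directory to edit it.
    run sudo singularity shell --writable "$sandbox"
    # Convert the sandbox into a read-only image for production
    # (assumes Singularity 2.4-style "build <image> <sandbox>").
    run sudo singularity build "$prod_image" "$sandbox"
}
```

For example, `sandbox_cycle img_dev/ img_prod.simg` prints the two commands in order, which makes it easy to check them before running anything with sudo. The file name `img_prod.simg` is a placeholder.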
Alexander Peltzer
@apeltzer
Feb 12 2018 10:29
that would give you a shell in the container.... but you could also do something like singularity exec /path/to/folder-sandbox toolname I guess
Bioninbo
@Bioninbo
Feb 12 2018 10:29
yes indeed
Paolo Di Tommaso
@pditommaso
Feb 12 2018 10:30
in principle should work, if not open an issue on GitHub
including the steps to replicate the problem
Bioninbo
@Bioninbo
Feb 12 2018 10:32
Ok I'll try to investigate this.
Tim Diels
@timdiels
Feb 12 2018 10:49
@pditommaso Can Singularity be used in the cloud with autoscaling?
Paolo Di Tommaso
@pditommaso
Feb 12 2018 11:26
yes, though in a cloud environment I would go for docker (you are free to use the latest linux kernel and pulls are much faster, so docker could be a better option)
maybe AWS Batch is even easier
Bioninbo
@Bioninbo
Feb 12 2018 13:05
As a follow-up on our discussion about my issues activating conda environments in nextflow: I ran nextflow from within the singularity container (as @pditommaso suggested) and it seems to have solved most issues. Thanks for the help!
Paolo Di Tommaso
@pditommaso
Feb 12 2018 13:14
:+1:
Tim Diels
@timdiels
Feb 12 2018 14:28
@pditommaso @apeltzer Thanks