These are chat archives for nextflow-io/nextflow

7th Mar 2019
Qi ZHAO
@likelet
Mar 07 00:07
Hi, I want to write the filenames stored in a channel into a file, like NormalCovFilesForfile.collectFile { file -> ['argument.file', "-I "+file.name.flatten() + ' '] } .set { normalPONargmentFile }, but in argument.file the filenames always come out wrapped in brackets: -I [CRC1744.normal.counts.hdf5]. How can I remove them?
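The brackets suggest each channel element is a list of files, so `.name` is being called on a list rather than a single file. A sketch of one likely fix (untested, using the channel names from the question) is to flatten the channel before collectFile so the closure receives one file at a time:

```groovy
NormalCovFilesForfile
    .flatten()
    .collectFile( name: 'argument.file' ) { file -> "-I ${file.name} " }
    .set { normalPONargmentFile }
```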
Daniel E Cook
@danielecook
Mar 07 09:33
Does anyone know a good way to debug caching issues? Having trouble with processes sometimes dropping the cache on rerun.
Paolo Di Tommaso
@pditommaso
Mar 07 09:48
It usually happens with non-deterministic ordering of some inputs
Focus on the first process that isn't caching properly
Maxime Garcia
@MaxUlysse
Mar 07 09:49
Look at the .nextflow.log, and the .command.err or the .command.log in the corresponding work directory
Use the dump operator to check if your channels are properly formed
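For reference, the dump suggestion looks like the sketch below (the channel and tag names are made up); the dumped values only appear when the run is launched with the `-dump-channels` command-line option:

```groovy
my_channel
    .dump(tag: 'reads')      // prints each element tagged 'reads'
    .set { reads_for_alignment }
```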
Chelsea Sawyer
@csawye01
Mar 07 10:22
@micans To be more specific: I have a process that checks a CSV file, and if it's laid out in a certain way I need to reformat it with another process. But if it's fine, I don't need the reformatting process to run and want to skip straight to the next process down the line. I want this handled within the pipeline, so just setting optional parameters wouldn't work. I'm not entirely sure how to go about it, but I thought it might work if I passed the result to a channel.choice, had only the 'failed' channel feed the reformatting process, and used the 'when' directive when the channel isn't empty; I'm obviously running into issues with that, though. If I figure it out I will share here :)
@tobsecret I will try that, thanks!
micans
@micans
Mar 07 10:29
@csawye01 Can't you have the script section output either file type .A (which is fine for immediate processing) or file type .B (which will be sent for reformatting), and have two different channels pick up the different file types? I understand the script section itself cannot do the reformatting.
Chelsea Sawyer
@csawye01
Mar 07 10:31
@micans unfortunately no, as the issue is within the csv file: it's only some rows that I'll have to remove from it, not the entire sheet.
micans
@micans
Mar 07 10:33
That could be done in a script section, unless it is hard to get the context information you need into it. But simplistically, you could grep out what you need and create a derived sheet in the script section? (I'm sure I'm missing the big picture!)
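A minimal sketch of that in-script filtering, as it might appear inside a process script block; `samples.csv`, `derived.csv`, and `bad_row_pattern` are hypothetical stand-ins for the real sheet and the rows to drop:

```shell
# build a toy sheet standing in for the real CSV
printf 'id,value\nA,1\nbad_row_pattern,2\nB,3\n' > samples.csv

# drop the offending rows, keeping everything else
grep -v 'bad_row_pattern' samples.csv > derived.csv

# derived.csv now holds the header plus the clean rows
cat derived.csv
```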
Evan Floden
@evanfloden
Mar 07 15:32
:loudspeaker::loudspeaker: We are looking to redesign the Nextflow website. Any comments or designer recommendations on this issue would be greatly appreciated: nextflow-io/website#14
Paolo Di Tommaso
@pditommaso
Mar 07 15:32
cool!
Tobias "Tobi" Schraink
@tobsecret
Mar 07 15:40

@csawye01
Are these csv files the output of another process? If so, you could add a check at the end of that process, save the result to a variable and bundle them up.

process check_csv {
    input:
    file(some_input) from some_channel

    output:
    set val(csv_intact), file(csv_file) into output_channel

    script:
    csv_file = "${some_input}.csv"
    // run the check first and capture its output, so that the
    // command string below remains the script block's last expression
    checking_script = """
    checking script goes here
    """
    csv_intact = checking_script.execute().text
    """
    regular processing script goes here
    """
}

you can then use a splitting operator to funnel everything from output_channel that got a bad value for csv_intact into your repair/fail_option process and have everything else bypass that process.
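A sketch of that split with the choice operator; the 'OK' marker value and the channel names here are hypothetical, and the tuple index assumes csv_intact is the first element as in the process above:

```groovy
csv_ok  = Channel.create()
csv_bad = Channel.create()

// route each item by the value of csv_intact
output_channel.choice( csv_ok, csv_bad ) { it[0] == 'OK' ? 0 : 1 }

// csv_bad then feeds the reformatting process; csv_ok bypasses it
```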

micans
@micans
Mar 07 15:46
wow, learn something new every day! cool
Tobias "Tobi" Schraink
@tobsecret
Mar 07 15:49
:sweat_smile: I'm glad a youngin like me can also contribute
Olga Botvinnik
@olgabot
Mar 07 16:19
Hello! I posted an issue (nextflow-io/nextflow#1064) and am being doubly annoying by asking here. The AWS batch configuration can't seem to find the aws command even when executor.awscli is specified to a file that exists. Do you know what may be happening?
Evan Floden
@evanfloden
Mar 07 16:23
@olgabot Did you try using an AMI that is known to work? I don't think you need awscli in the batch job docker.
KochTobi
@KochTobi
Mar 07 16:49
@olgabot Did you specify your AMI in the compute environment? The AMI from the cloud directive does not affect the awsbatch executor. There is a checkbox I always forget to tick which allows you to provide the compute environment with an AMI ID.
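For the aws command lookup itself, a config sketch; this assumes the aws.batch.cliPath setting is available in your Nextflow version, and the path shown is only an example. The path must point to an aws binary installed on the host AMI (e.g. a self-contained conda install), not inside the task's Docker image:

```groovy
aws {
    batch {
        // example path: aws tool baked into the custom AMI
        cliPath = '/home/ec2-user/miniconda/bin/aws'
    }
}
```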
micans
@micans
Mar 07 16:52
I'm getting this error with Singularity: container creation failed: unabled to /lustre/scratch106/cellgeni/tic-166/singularity/work/sing+small/4e/3c1fff2bef57ec71e019a7cc0c748d to mount list: destination /lustre/scratch106/cellgeni/tic-166/singularity/work/sing+small/4e/3c1fff2bef57ec71e019a7cc0c748d is already in the mount point list ... Does that ring any bells?
micans
@micans
Mar 07 17:06
we run singularity 2.5; the .command.run has a singularity invocation with basically -B $PWD -B /a/path/that/is/the/same/as/pwd
if we change the second -B to another path (the one where our data is stored, defined in the channel), then it works
micans
@micans
Mar 07 17:34
This is on LSF + singularity
Joe Brown
@brwnj
Mar 07 21:41
Has anyone had to deal with multiple AWS profiles or switching keys within a workflow? Often we have non-public, collaborator-owned data we need to operate on (input data), while we need to write our work dir and publish dir to another AWS account. Also, these data are prohibitively large to simply copy to another S3 bucket. Ideas?
Rad Suchecki
@rsuchecki
Mar 07 23:01
@micans I am struggling to debug this at the moment. Something has changed and some processes with singularity fail due to attempted binding of task.workDir more than once. Not yet sure if that is due to recent changes in cluster's file system, in singularity or even in NF or some combination of these.
Olga Botvinnik
@olgabot
Mar 07 23:14

@olgabot Did you try using an AMI that is known to work? I don't think you need awscli in the batch job docker.

Is there a us-west-2 AMI available?