Alaa Badredine
@AlaaBadredine_twitter
thanks
Stijn van Dongen
@micans
yw, good luck!
Stijn van Dongen
@micans

@pditommaso as an alternative to this pattern: http://nextflow-io.github.io/patterns/index.html#_problem_19
I've made https://github.com/micans/nextflow-idioms/blob/master/ab-abc-until.nf
It uses

ch_skipB.until {  params.doB }.set { ch_AC }
ch_doB.until { !params.doB }.set { ch_AB }

Instead of

(ch_AC, ch_AB) = ( params.doB ? [Channel.empty(), ch_doB] : [ch_skipB, Channel.empty()] )

This made me wonder; first if there is a drawback to using until like this, and second, the example uses into { ch_doB; ch_skipB } just before. I am envisioning into syntax extended like this:

into { .until{  params.doB }.set{ ch_AC }; 
       .until{ !params.doB }.set{ ch_AB }
     }

I'm not sure it's possible or that useful, but wanted to document the thought.

Riccardo Giannico
@giannicorik_twitter
@AlaaBadredine_twitter do you see the file ${toRaw}/SampleSheet.csv being created? I'm wondering if you have write permissions under ${toRaw}.
Does the .nextflow.log file report any error? Does it report whether the process you need has been "submitted" or "cached"?
Alaa Badredine
@AlaaBadredine_twitter
@giannicorik_twitter yes I do, after all, the system we have here is only under root, so we don't have issues with writing/reading permissions
bit of weird tbh but that's how it is
Michael Adkins
@madkinsz
Hi! I have a question about parsing variables from an input channel made with fromPath. I'm interested in parsing project names from a path so that I can run operations on entire projects. Here's kind of a pseudocode pipeline: https://hastebin.com/raw/tutifepuxa
I can't really find any clear documentation about how to parse variables from paths
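(Editor's sketch, not part of the original conversation.) One way to approach the question above is to derive the project name from each file's parent directory and group files by it. The directory layout `/data/<project>/<file>.fastq.gz` is an assumption made for illustration; the linked pseudocode may use a different layout.

```groovy
// Assumed layout (hypothetical): /data/<project>/<file>.fastq.gz
Channel
    .fromPath('/data/*/*.fastq.gz')
    // Pair each file with the name of its parent directory, used here as the project name
    .map { f -> tuple(f.parent.name, f) }
    // Collect all files that share a project name: (project, [file, file, ...])
    .groupTuple()
    .view { project, files -> "${project}: ${files.size()} files" }
```

Other file attributes such as `name`, `baseName` and `extension` can be used in the same way inside the `map` closure.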
danchubb
@danchubb
Hi, I'm enjoying getting the hang of nextflow but I've hit a brick wall with what is hopefully a simple problem. When parsing a csv file using splitCsv() how do you access columns where the header has spaces? e.g. if the col is filename then it is ${row.filename} what if it is "file name" ? backticks? quotes? Thanks a lot - Dan
Stijn van Dongen
@micans
@danchubb I had a quick try, this seems to work:
#!/usr/bin/env nextflow
Channel
    .from( 'alpha,beta,gam ma\n10,20,30\n70,80,90' )
    .splitCsv(header: true)
    .subscribe { row ->
       println "${row.alpha} - ${row.beta} - ${row.'gam ma'}"
    }
(edited to show it's easy to test a small snippet -- this was taken from the splitCsv documentation)
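(Editor's sketch, not part of the original conversation.) With `header: true` each row behaves like a map keyed by column name, so Groovy's subscript syntax should work as an alternative to the quoted property form. This reuses the same toy data as the snippet above.

```groovy
Channel
    .from( 'alpha,beta,gam ma\n10,20,30\n70,80,90' )
    .splitCsv(header: true)
    .subscribe { row ->
       // Subscript access instead of row.'gam ma'
       println row['gam ma']
    }
```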
danchubb
@danchubb
great, thanks a lot for the help.
Michael Adkins
@madkinsz
Does nextflow attempt to ignore duplicating input files as output files? e.g. I'm getting an error: Missing output file(s) *.fastq expected by process merge_nextseq_lanes when calling a script that renames files in place. There are many .fastq files in the working directory but it cannot find any?
Paolo Di Tommaso
@pditommaso
input file names are not captured by globs
Michael Adkins
@madkinsz
Is there a way to make them captured?
Or is calling a script to combine/rename some of the files bad practice? I want to take a collection, rename a small subset or combine some, then pass all the resulting files as a new channel
Paolo Di Tommaso
@pditommaso
not a good idea, a task should produce its own outputs
Michael Adkins
@madkinsz
Okay, I don't understand how preprocessing tasks are supposed to work then. I have a tool that needs to operate on all of the fastq files but some of the fastqs require preprocessing first.
Paolo Di Tommaso
@pditommaso
you can have the pre-proc task getting some of the fastqs, and another task getting all fastqs + out of the pre-proc
makes sense?
Michael Adkins
@madkinsz
That does make sense but I don't know how to exclude the ones that would be preprocessed from the all fastqs.
Since preprocessing requires all the fastqs to be collected so that some can be merged
Paolo Di Tommaso
@pditommaso
glob pattern? csv file? you should have a criteria to express that
Michael Adkins
@madkinsz
You're right. That should be reasonable, I'll look into that. Thank you.
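(Editor's sketch, not part of the original conversation.) The split suggested above could look roughly like this in the DSL1 syntax used elsewhere in this chat. The `data/` directory, the `_L00` naming criterion, and all channel and process names are assumptions made for illustration.

```groovy
// Hypothetical criterion: files that need preprocessing carry "_L00" in their name.
ch_to_prep = Channel.fromPath('data/*.fastq').filter { it.name.contains('_L00') }
ch_ready   = Channel.fromPath('data/*.fastq').filter { !it.name.contains('_L00') }

process preprocess {
  input:
  // collect() hands all files that need preprocessing to a single task
  file lanes from ch_to_prep.collect()

  output:
  // A new file name, so the task produces its own output instead of renaming inputs
  file 'merged.fastq' into ch_prepped

  script:
  """
  cat ${lanes} > merged.fastq
  """
}

// The downstream task reads the untouched files plus the preprocessing output
ch_all = ch_ready.mix(ch_prepped)
```

Because the output is a newly created file rather than a renamed input, it is captured by the output declaration.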
Paolo Di Tommaso
@pditommaso
@micans the alternative may work (haven't tried), the second proposal looks a bit too creative ..
Michael Adkins
@madkinsz
Can you use the channel factory / builder in the input/output parts of a process?
The connection between those two syntactic forms is kind of unclear
Stijn van Dongen
@micans
@pditommaso the alternative works ... (pretty sure, tested it). It's not that creative ... it unleashes huge possibilities :grin: ... I found the need to introduce extra channel names a little bit annoying ... so I was thinking about ways to get an implicit channel into into.
Paolo Di Tommaso
@pditommaso
you can create as many Channel.fromPath('foo*.fastq') as you need
"I found the need to introduce extra channel names a little bit annoying"
I understand, but DSL-2 will no longer require creating channel duplicates
Michael Adkins
@madkinsz
@pditommaso but how do you use that within a process rather than at the head of a .nf file?
Stijn van Dongen
@micans
Cool @pditommaso I'll check it out. I think these extra names may be because of the a(b)c optional b process rather than into duplication, but will check for sure.
Paolo Di Tommaso
@pditommaso
ch1 = Channel.fromPath('*.fasta')
ch2 = Channel.fromPath('*.fasta')

process foo {
  input: 
  file x from ch1
  .. 
}

process bar {
  input: 
  file x from ch2
  .. 
}
@micans check it out! I need your feedback to move it on :D
and now :wave: :smile:
Stijn van Dongen
@micans
@madkinsz We link our fastq files in from a process; it gives a lot of control so you can do whatever you want. In our case the starting point is a sample file with IDs so we have explicit control. We then expect to find the fastq files in a directory. e.g. https://github.com/cellgeni/rnaseq/blob/master/main.nf#L269-L292
:wave: @pditommaso ah I thought there were lots of heavy users providing feedback already! I have a shortage of pipelines to do much abstraction. Anyway, will look nevertheless!
Michael Adkins
@madkinsz
I like what you have @micans. We expect to run this process on all the fastqs in a directory. My problem is this:
Oh geez -- edited to remove unformatted code
Sorry used to markdown.
Stijn van Dongen
@micans
(sorry have to run for dinner, will check later!)
Michael Adkins
@madkinsz
fastq_gz = Channel.fromPath('/cb/boostershot-basic/Data/Intensities/BaseCalls/CBTEST_Project_0091/*.fastq.gz')


process unzip_fastq {
  tag "$fastq_gz"

  input:
    file fastq_gz from fastq_gz

  output:
    file '*.fastq' into fastq_files

  script:
    """
    gunzip -df $fastq_gz
    """
}
Then I want to merge the lanes using a channel such as
fastq_pairs = Channel
    .fromFilePairs('/cb/boostershot-basic/Data/Intensities/BaseCalls/CBTEST_Project_0091/*_L00[1-4]_R[1-2]_001.fastq', size: -1)
but I don't know how to make that channel read from the fastq_files channel
Nor do I understand how I would declare fastq_pairs in a process output directive
Stijn van Dongen
@micans
@madkinsz fromPath and fromFilePairs are channel factory methods; they are channel 'sources', you can't use them as connectors. But you can create file pairs yourself -- the following is not a great example, but it does end up with a process spitting out a pair of files -- https://github.com/cellgeni/rnaseq/blob/master/main.nf#L295 . I would try to make a small example using toy files emulating what you want to achieve.
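(Editor's sketch, not part of the original conversation.) Building the groups yourself, as suggested above, could look like the following. It assumes `fastq_files` is the output channel of `unzip_fastq` shown earlier and that file names follow the pattern from the glob in the question; the channel name `fastq_by_sample` is hypothetical.

```groovy
fastq_files
    // Harmless if each item is already a single file; unpacks lists otherwise
    .flatten()
    // Key each file by its name with the lane/read suffix stripped
    .map { f -> tuple(f.name.replaceAll(/_L00[1-4]_R[12]_001\.fastq$/, ''), f) }
    // Emits (sample, [files...]); without a size option this waits for the upstream channel to complete
    .groupTuple()
    .set { fastq_by_sample }

fastq_by_sample.view { sample, files -> "${sample}: ${files.size()} files" }
```

The resulting channel can be consumed by a process through its `from` clause like any other channel; nothing special needs to be declared in the upstream output block.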
Michael Adkins
@madkinsz
Thanks @micans. I've just ended up pulling from the publishDir, which I don't love but it's good enough for now since I'm just trying to explore nextflow.
rithy8
@rithy8
Hello,