These are chat archives for nextflow-io/nextflow

12th
Mar 2019
Laurence E. Bernstein
@lebernstein
Mar 12 00:13

Your first suggestion does not work. It gives the error:

ERROR ~ No signature of method: groovyx.gpars.dataflow.DataflowQueue.splitText() is applicable for argument types: (_nf_script_17f8a116$_run_closure2$_closure7) values: [_nf_script_17f8a116$_run_closure2$_closure7@1c4ee95c]
Possible solutions: splitText(), splitText(java.util.Map), countText(), splitCsv(), splitFastq(), splitFasta()

However, your second idea works... partially.
When I want to compare projectName in an if statement:

if (projectName.trim() == "XXX")

This now works. So that's great. Thanks.
However, when I want to do something like:

  file newFile from file("${projectName}.txt")

I'm not sure how to get that to work. Neither of these work:

  file newFile from file("${projectName.trim()}.txt")
  file newFile from file("${projectName}.trim().txt")

What is the proper syntax?

Laurence E. Bernstein
@lebernstein
Mar 12 00:40
It appears I can use .trim() without issue inside the script portion but not outside of it, but what I'm trying to do is specify an input file using the name I have found inside my other file. That way the input file can be properly staged.
Rad Suchecki
@rsuchecki
Mar 12 01:03
hmm, can you come up with a toy example which includes the two processes and the intended interaction between them? I am concerned that the problem may not just be about the newline being there (or not). If there is a file being produced by process one and consumed by process two, the file needs to be declared in input/output blocks, and the actual file name, if unknown, is not necessarily needed.
but no, the file from file will not work, you can only have value or file from a channel.
Rad Suchecki
@rsuchecki
Mar 12 01:08
also remember that each process execution happens in a separate dir, and declared input files are symlinked from upstream process dirs, so with the current approach (as I understand it) ${projectName}.txt is absent from the process workDir, and knowing its name will not help because the path to it is not known
Laurence E. Bernstein
@lebernstein
Mar 12 01:28

Here is a simplified version.

baseDir  = file("/home/laurence.e.bernstein/")

process '1A_setup' {
  input:
    // Doesn't matter
  output:
    file "projectName.txt" into project_name_ch
  script:
  {
  """
    #!/usr/bin/python
    import json
    with open("${sampleJson}") as inputJsonFile:
      jsonMap = json.load(inputJsonFile)
    with open("projectName.txt", 'w') as projectNameFile:
      projectNameFile.write(jsonMap["project"])
  """
  }
}

process '2A_dosomething' {
  input:
    val projectName from project_name_ch.splitText()
    val inputFile from file("${baseDir}/XXX.config")
  output:
    // Doesn't matter
  script:
    if (projectName.trim() == "XXX") {
      """
        runsomething ${inputFile}
      """
    } else {
      """
        # Unknown project name
      """
    }
}

As you can see, what I'd like to do is have the name of the input file for the second process decided dynamically. All the XXX.txt files are present in the baseDir on my file system, so if I specify them correctly, Nextflow will be able to create the links in the staging area (the files exist).
I may be able to achieve this by writing the full path name into the projectNameFile, but I haven't tried that yet. The only reason I didn't do that is I have multiple files with the same base name (XXX) with different extensions, so I'd have to create multiple files, one for each file name. I can do that. It's just even more clunky.

Rad Suchecki
@rsuchecki
Mar 12 02:34
you might also stage in an entire dir rather than individual files
Laurence E. Bernstein
@lebernstein
Mar 12 02:38
Hmm.. that's a really interesting idea. I actually didn't even think about doing that.
Paolo Di Tommaso
@pditommaso
Mar 12 07:26
Webinar alert tomorrow :sound: :sound:
Scalable, Sharable and Reproducible Computational Workflows across Clouds and Clusters.
Rad Suchecki
@rsuchecki
Mar 12 12:24
Time zones clarification for the webinar https://twitter.com/bioinforad/status/1105427632303202304
Paolo Di Tommaso
@pditommaso
Mar 12 12:35
well done !
Laurence E. Bernstein
@lebernstein
Mar 12 17:11
@rsuchecki I'm not sure I actually understand how to stage a whole directory at once. If I specify a directory as a channel, won't the files be emitted one at a time and cause my process to execute once per item? As opposed to the desired behavior which would be to have all the files staged for my one process?
And BTW.. Thanks for all the assistance.
Yasset Perez-Riverol
@ypriverol
Mar 12 22:05
Does anyone use IntelliJ to edit Nextflow workflows? Any other editor that recognizes Groovy/Nextflow code?
Rad Suchecki
@rsuchecki
Mar 12 22:25
@lebernstein I believe that Channel.fromPath('/path/to/dir') will just emit `dir`
Rad Suchecki
@rsuchecki
Mar 12 22:32
@lebernstein not sure if I managed to help, perhaps others have better ideas? Most of the experts are based in Europe so you may have to time your questions accordingly :confused: . Hopefully the community around the Pacific will continue to grow
Laurence E. Bernstein
@lebernstein
Mar 12 23:25
@rsuchecki Still working on it, but you are almost right.
You need the 'type' parameter:
Channel.fromPath('/path/to/dir', type: 'dir')
I think I might have made it work. Huzzah.
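
For anyone reading later, a rough sketch of how the pieces discussed above might fit together, using the same channel-based (DSL1) syntax as the simplified example. The directory path, the channel name `config_dir_ch`, and the variable `configDir` are placeholders for illustration, not the actual pipeline; `runsomething` and the `.config` extension are taken from the example above. It assumes `project_name_ch` is the output channel of the first process.

```groovy
// Emit the directory itself as a single item (hypothetical path),
// so the downstream process runs once with the whole directory staged.
config_dir_ch = Channel.fromPath('/path/to/dir', type: 'dir')

process '2A_dosomething' {
  input:
    // Trim the trailing newline in the channel, before the value reaches the process.
    val projectName from project_name_ch.splitText().map { it.trim() }
    // The directory is symlinked into the task work dir under its own name.
    file configDir from config_dir_ch
  script:
    """
    runsomething ${configDir}/${projectName}.config
    """
}
```

Since each channel emits one item, the process should execute once, and files with the same base name but different extensions can all be reached through the staged directory.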