These are chat archives for nextflow-io/nextflow

2nd
Oct 2018
Krittin Phornsiricharoenphant
@sinonkt
Oct 02 2018 03:52
Hi, I love Nextflow, and I've been thinking about how I can expose knowledge of the entire workflow (input params for each process, published outputs, etc.) as an API contract in a generic, context-agnostic manner, to be consumed by other workflow-manager micro-services that handle multiple Nextflow runs efficiently.
The extension implementation might look like how the DAG is created, but with more focus on inputs, outputs, which params belong to each process, param types, default values, and a description for each component in the workflow.
Sven F.
@sven1103
Oct 02 2018 06:11
hi @sinonkt! you might be interested in #866 then?
this focuses more on a standard structure for parameter description
Krittin Phornsiricharoenphant
@sinonkt
Oct 02 2018 06:44
@sven1103 I love that! And I really want to make it happen. I'll live there.
Diogo Silva
@ODiogoSilva
Oct 02 2018 14:26
Hi everyone. I've been checking the new weblog functionality of nextflow and I was wondering if there is more documentation on the matter other than what is provided here https://www.nextflow.io/docs/latest/tracing.html?highlight=weblog#weblog-via-http
I've already checked that there are different JSON POSTs being sent, but it would be nice to have a description of all of them
Sven F.
@sven1103
Oct 02 2018 14:27
@ODiogoSilva probably something I should do then :D
Diogo Silva
@ODiogoSilva
Oct 02 2018 14:29
Also, is it possible to control what is sent in those JSONs at all? For instance, the script file is sent in the process_submitted JSON only (right?), but would it be possible to exclude some fields from the POST?
btw, this looks really nice and I'm hoping to use it soon
Sven F.
@sven1103
Oct 02 2018 14:30
thanks a lot :)

For instance, the script file is sent in the process_submitted JSON only (right?), but would it be possible to exclude some fields from the POST?

Hm, it is dependent on which fields you set in the trace config
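
For reference, the trace fields are selected in the trace scope of nextflow.config; the field list below is just illustrative:

trace {
    enabled = true
    fields  = 'task_id,hash,name,status,exit'
}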

Diogo Silva
@ODiogoSilva
Oct 02 2018 14:32
Yes, that's for the contents of the trace JSON, but the script key, for instance, is outside the scope of the trace
Sven F.
@sven1103
Oct 02 2018 14:33
ah I see
currently, this is not possible with the weblog feature
but it should be possible to implement if I access the workflow object
would you mind opening an issue for this and tagging me? We can put the discussion there and see what @pditommaso thinks of it
Diogo Silva
@ODiogoSilva
Oct 02 2018 14:36
Oh, ok. It would be a nice-to-have feature, because it'll pass down the same script (which can be potentially large) potentially many times over
sure
thanks a lot
Sven F.
@sven1103
Oct 02 2018 14:37
I'll try my best
:D
Diogo Silva
@ODiogoSilva
Oct 02 2018 14:39
Should I open an issue for the documentation as well?
Diogo Silva
@ODiogoSilva
Oct 02 2018 14:49
I've already opened the issue (nextflow-io/nextflow#881)
Sven F.
@sven1103
Oct 02 2018 16:23

Should I open an issue for the documentation as well?

@ODiogoSilva that would be great, so I don't forget it

Tobias "Tobi" Schraink
@tobsecret
Oct 02 2018 16:28

Given an input_channel like this:

input_channel = Channel.from([
    ['ENA_ID1', ['id1_1.fq', 'id1_2.fq']],
    ['ENA_ID2', ['id2_1.fq', 'id2_2.fq']],
    ['ENA_ID3', ['id3_1.fq', 'id3_2.fq']]
])

How would I unpack that tuple ['ENA_ID1', ['id1_1.fq', 'id1_2.fq']] into ena_id, read1 and read2 inside a process input definition?

assume that ena_id is a val and read1 and read2 are of type file
Mike Smoot
@mes5k
Oct 02 2018 16:44
@tobsecret this should work:
input_channel.map{ [it[0], file(it[1][0]), file(it[1][1])] }
Tobias "Tobi" Schraink
@tobsecret
Oct 02 2018 16:58
Aaah, so you would have to transform the channel output. Got it.
Mike Smoot
@mes5k
Oct 02 2018 17:05
I'm sure you could also deal with the list inside the process; it's just a matter of where you want to deal with the indexing. However, calling the file function on file name strings is important: without that, Nextflow won't know to symlink the files into the process working directory.
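
A minimal end-to-end sketch of this approach; the remapped variable and the align_reads process name are made up for illustration:

remapped = input_channel.map { [it[0], file(it[1][0]), file(it[1][1])] }

process align_reads {
    input:
    set val(ena_id), file(read1), file(read2) from remapped

    script:
    """
    echo "sample: $ena_id reads: $read1 $read2"
    """
}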
cwytko
@cwytko
Oct 02 2018 17:40
Would I need to use a process in order to move files that I have in a channel to a new directory?
or is there a pipeline scripting feature that I could use that I'm missing?
Diogo Silva
@ODiogoSilva
Oct 02 2018 18:50

Hi, in the fromFilePairs documentation there is

Channel
    .fromFilePairs('/some/data/*', size: -1) { file -> file.extension }
    .println { ext, files -> "Files with the extension $ext are $files" }

I'm wondering what other attributes of file are available?

The issue actually arose from the fact that if you do something like:

.fromFilePairs('/fastq/SRR1027078_*')

and if:

$ ls fastq
SRR1027078_1.fq.gz  SRR1027078_2.fq.gz

The default fromFilePairs operator does not emit any value

the only way seems to be to use .fromFilePairs('/fastq/SRR1027078_{1,2}*')?
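
A minimal sketch of that workaround, reusing the println idiom from the docs snippet above:

Channel
    .fromFilePairs('/fastq/SRR1027078_{1,2}.fq.gz')
    .println { id, files -> "Pair $id -> $files" }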
Tobias "Tobi" Schraink
@tobsecret
Oct 02 2018 19:17
perfect, thanks @mes5k !
Paolo Di Tommaso
@pditommaso
Oct 02 2018 20:23
@cwytko you can use the idiom channel.subscribe { it.moveTo(path) }
See for example here
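
A minimal sketch of that idiom; the source pattern and target directory are placeholders:

Channel
    .fromPath('results/*.txt')
    .subscribe { it.moveTo('/some/new/dir') }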
Tobias "Tobi" Schraink
@tobsecret
Oct 02 2018 20:29

I was wondering how I can forward an input of a process to the output.
Say I have a process like this:

process download_accessions {
    input:
    val accession from accessions

    output:
    [accession, file('**.fastq.gz')] into downloaded_accessions

    script:
    """
    some download code
    """
}

This throws an error for the line underneath output:, saying that it cannot find accession.

How can I feed the accession through my workflow?
Paolo Di Tommaso
@pditommaso
Oct 02 2018 20:31
it's supposed to be
    output:
    set val(accession), file('**.fastq.gz') into downloaded_accessions
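
For completeness, the process from above with the corrected output line; the download command is still a placeholder:

process download_accessions {
    input:
    val accession from accessions

    output:
    set val(accession), file('**.fastq.gz') into downloaded_accessions

    script:
    """
    some download code
    """
}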
Tobias "Tobi" Schraink
@tobsecret
Oct 02 2018 20:33
ooooh, of course! For some reason I thought set was just for the input field. Thanks!
Paolo Di Tommaso
@pditommaso
Oct 02 2018 20:34
:ok_hand:
Tobias "Tobi" Schraink
@tobsecret
Oct 02 2018 21:29
ok, this was definitely a hump I had some difficulty getting over! Thanks so much Paolo, Evan and Mike! My pipeline appears to work :grinning: