These are chat archives for nextflow-io/nextflow

26th
Oct 2018
Maxime HEBRARD
@mhebrard
Oct 26 2018 02:35

Hello. Is there a way to check if a folder exists? I tried:

Channel.fromPath(params.folder)
    .ifEmpty { exit 1, "folder not found: ${params.folder}" }

But when the folder doesn't exist, the error is not fired.

Maxime HEBRARD
@mhebrard
Oct 26 2018 02:45
Solved:
if (file(params.folder).isEmpty()) {
    exit 1, "directory not found: ${params.folder}"
}
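A note on the check above: as far as I know, Nextflow's isEmpty() also returns true for a directory that exists but contains nothing, so it cannot tell "missing" from "empty". Below is a minimal plain-Groovy sketch of an explicit existence test. checkFolder is a hypothetical helper, not a Nextflow function; inside a pipeline script the equivalent would presumably be a test on file(params.folder).exists() before calling exit.

```groovy
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper: returns an error message, or null when the folder is usable.
String checkFolder(String folder) {
    Path dir = Paths.get(folder)
    if (!Files.exists(dir)) {
        return "directory not found: ${folder}"
    }
    if (!Files.isDirectory(dir)) {
        return "not a directory: ${folder}"
    }
    return null
}

// Usage: print the message only when something is wrong.
def problem = checkFolder(System.getProperty('java.io.tmpdir'))
if (problem != null) {
    println problem
}
```

Separating the two failure cases keeps the error message accurate when the path exists but points at a regular file.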
Maxime HEBRARD
@mhebrard
Oct 26 2018 06:58
Question:
Channel.fromPath(params.samples)
    .splitCsv(header: true, sep: '\t')
    .map {
        // Here each line is one obj {key1: val1, key2: val2}
        // How can I edit val2 on each line and return the object?
        // out: {key1: val1, key2: newVal2}
    }
Actually, I wish to delete or substitute the [space] in val2... can I do that here?
Maxime HEBRARD
@mhebrard
Oct 26 2018 07:39
I found out that splitCsv creates Groovy maps, so if I know the keys I can do:
Channel.fromPath(params.samples)
    .splitCsv(header: ['firstKey', 'secondKey'], skip: 1, sep: '\t')
    .map{ [firstKey: it.firstKey, secondKey: it.secondKey.replaceAll(' ', '_')] }
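If the keys are not all known in advance, one option is to copy each row and override only the field that changes. This is a small plain-Groovy sketch; the rows are made-up stand-ins for what splitCsv(header: true) emits, and the same closure body should work inside .map { row -> ... }.

```groovy
// Hypothetical rows, shaped like the maps produced by splitCsv(header: true)
def rows = [
    [firstKey: 'S1', secondKey: 'liver tissue'],
    [firstKey: 'S2', secondKey: 'whole blood sample']
]

// Map + Map returns a new map; entries on the right override those on the left,
// so every other column is carried over untouched.
def cleaned = rows.collect { row ->
    row + [secondKey: row.secondKey.replaceAll(' ', '_')]
}

cleaned.each { println it }
```

Because plus() builds a new map, the original rows are left unmodified.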
Paolo Di Tommaso
@pditommaso
Oct 26 2018 09:35
:+1:
Luca Cozzuto
@lucacozzuto
Oct 26 2018 12:52
dear @pditommaso I have a question :)
//peptideCSVs
def peptideCSVs = [:]
peptideCSVs["QC01"] = file("${CSV_folder}/knime_peptides_final.csv")
peptideCSVs["QC02"] = file("${CSV_folder}/knime_peptides_final.csv")
peptideCSVs["QC03"] = file("${CSV_folder}/knime_peptides_qc4l.csv") 


process calc_peptide_area {

    input:
    set sample_id, internal_code from shot_featureXMLfiles_for_calc_peptide_area.mix(srm_featureXMLfiles_for_calc_peptide_area)
    file(csvfile) from file(peptideCSVs[internal_code])

    """
        echo "${internal_code} ${csvfile} ${peptideCSVs['QC01']} ${peptideCSVs['QC03']}"
    """
}
I got
echo "QC01 knime_peptides_qc4l.csv /nfs/software/bi/biocore_tools/git/nextflow/Qcloud/csv/knime_peptides_final.csv /nfs/software/bi/biocore_tools/git/nextflow/Qcloud/csv/knime_peptides_qc4l.csv"
I don't understand why it is doing this...
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:01
what is this ?
Luca Cozzuto
@lucacozzuto
Oct 26 2018 13:20
csvfile = knime_peptides_qc4l.csv
while it should be
/nfs/software/bi/biocore_tools/git/nextflow/Qcloud/csv/knime_peptides_final.csv
since
internal code = QC01
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:32
the main problem is that you are not supposed to specify file parameters in that way
input files need to be declared as input
Luca Cozzuto
@lucacozzuto
Oct 26 2018 13:34
what is wrong?
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:35
you are not supposed to use peptideCSVs in the command script in that way
Luca Cozzuto
@lucacozzuto
Oct 26 2018 13:36
It is a simple dictionary (hash or associative array, depending on the language :) )
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:36
as you want
Luca Cozzuto
@lucacozzuto
Oct 26 2018 13:37
I would like to call a file depending on the value of internal_code
internal_code can be QC01, QC02, QC03... and depending on that, use one of the two files
if there is a better way I'll be happy to use it
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:38
you need to resolve that externally, likely using a map operator
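One way to read that suggestion, sketched in plain Groovy: do the lookup before the process, so each item already carries its own CSV. The sample tuples and relative paths below are made-up stand-ins for the channel contents; in the pipeline the same closure would presumably go in a .map { ... } on the mixed channel, wrapping the path in file(...), with the process declaring the CSV as a third element of its input set.

```groovy
// Lookup table mirroring peptideCSVs from the question (paths shortened, hypothetical)
def peptideCSVs = [
    QC01: 'csv/knime_peptides_final.csv',
    QC02: 'csv/knime_peptides_final.csv',
    QC03: 'csv/knime_peptides_qc4l.csv'
]

// Hypothetical (sample_id, internal_code) tuples standing in for the channel items
def samples = [['sampleA', 'QC01'], ['sampleB', 'QC03']]

// Attach the matching CSV to every tuple, so the choice is made per item
def resolved = samples.collect { sample_id, internal_code ->
    [sample_id, internal_code, peptideCSVs[internal_code]]
}

resolved.each { println it }
```

With the file travelling in the same tuple as internal_code, the process no longer has to evaluate the lookup in its input declaration.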
Luca Cozzuto
@lucacozzuto
Oct 26 2018 13:41
ok
I'll do it. Many thanks.
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:42
:+1:
Krittin Phornsiricharoenphant
@sinonkt
Oct 26 2018 13:45
Hi Paolo, can I use a glob pattern over S3, e.g. file('s3://my-bucket/data/*.fa'), and is there any cache awareness behind the scenes for this kind of downloading and uploading?
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:45
you can use it but there's no caching
Krittin Phornsiricharoenphant
@sinonkt
Oct 26 2018 13:47
And for a private S3-compatible store like MinIO instead of AWS, should it work just fine?
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:48
it may work by specifying your endpoint in the config file
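For reference, a sketch of what such a config entry might look like, assuming the endpoint setting in the aws.client scope. The endpoint URL and the credentials are placeholders, and whether a given S3-compatible server works with it is not verified here.

```groovy
// nextflow.config (fragment): every value below is a placeholder
aws {
    accessKey = '<YOUR ACCESS KEY>'
    secretKey = '<YOUR SECRET KEY>'
    client {
        // hypothetical address of the private S3-compatible server
        endpoint = 'https://minio.example.org:9000'
    }
}
```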
Krittin Phornsiricharoenphant
@sinonkt
Oct 26 2018 13:48
I’ll try that. Thanks, you’re always lovely. :))
Paolo Di Tommaso
@pditommaso
Oct 26 2018 13:49
ahah
there are different opinions on this :satisfied:
Krittin Phornsiricharoenphant
@sinonkt
Oct 26 2018 13:51
LOL 😆
Tobias "Tobi" Schraink
@tobsecret
Oct 26 2018 15:03
My cluster admin wants me to download my starting data into a certain directory. I see there is a move option for publishDir, but that breaks re-runs. Is there a way to move the files but leave a link to the files in the work directory for each task? (basically the inverse of the symlink option)
Luca Cozzuto
@lucacozzuto
Oct 26 2018 15:08
storeDir?
Paolo Di Tommaso
@pditommaso
Oct 26 2018 15:27
Use mode 'link', provided the files are on the same storage: https://www.nextflow.io/docs/latest/process.html#publishdir
Tobias "Tobi" Schraink
@tobsecret
Oct 26 2018 19:33
Hmmm, so if that folder is flushed, do the files get deleted? Looks like that might not be the case with mode 'link' since what's in those folders realistically is just a link. Or maybe I am not understanding hardlinks correctly
Tobias "Tobi" Schraink
@tobsecret
Oct 26 2018 19:41
Looks like storeDir is the better alternative in this case. I mean I could also just use mode 'link' and pretend our cluster admins are flushing the directory when they're really just deleting a bunch of links, but they'd probably be pretty mad when they found out.
Thanks for the help, folks!
Paolo Di Tommaso
@pditommaso
Oct 26 2018 21:31
a hard link (or just link) is an additional entry in the file system table for the same file
if you delete it, the file will still be there
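The behaviour described here can be checked with a few lines of plain Groovy using the standard java.nio.file API. This is only an illustration with made-up file names in a temporary directory, and it assumes the file system supports hard links (Files.createLink throws an exception otherwise).

```groovy
import java.nio.file.Files

// Made-up file names inside a throwaway temporary directory
def dir = Files.createTempDirectory('hardlink-demo')
def original = dir.resolve('result.txt')
def link = dir.resolve('published.txt')

Files.write(original, 'payload'.bytes)

// Create a second directory entry pointing at the same file content
Files.createLink(link, original)

// Removing one entry does not remove the data while another entry remains
Files.delete(link)

println "original still exists: ${Files.exists(original)}"
println "content: ${new String(Files.readAllBytes(original))}"
```

The same holds in the other direction: deleting result.txt instead would leave the content reachable through published.txt.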
Tobias "Tobi" Schraink
@tobsecret
Oct 26 2018 21:54
Yes, that was my intuition as well. I had forgotten about the storeDir directive, didn't know that it supported reruns
Krittin Phornsiricharoenphant
@sinonkt
Oct 26 2018 22:20
Hi again :) What does your local development environment look like? I have a hard time on Mac, so I decided to make this Docker image to work with: https://github.com/sinonkt/docker-centos7-singularity-nextflow. It just supports your newly released Nextflow 18.0 and Singularity 3.0. What do you think of it? Have I missed something?
I haven't even tried it yet. 😆