    Bruno Vieira
    @bmpvieira
    This one
    Bruno Vieira
    @bmpvieira
    @/all any progress trying to get this to run?
    Julian Mazzitelli
    @thejmazz
    @tiagofilipe12 "your edits have no effect" probably means you're running with an npm-linked version, or one installed from when I published v0.51
    @bmpvieira can you include the IDs text file in the gist if it's not too big?
    oh is it just sra ids
    maybe dat to share input files even
    Julian Mazzitelli
    @thejmazz
    But yeah that can be overkill lol
    If someone can please paste a few lines if they already have
    Bruno Vieira
    @bmpvieira
    The last gist I posted generates the IDs file, it's just two IDs
    Bruno Vieira
    @bmpvieira
    that gist is the whole pipeline and doesn't require anything else
    Julian Mazzitelli
    @thejmazz
    Ah kk awesome
    Tiago Jesus
    @tiagofilipe12
    Hi guys, have you managed to make it work? I am still AFK but maybe this weekend I will have some time to look over your gist and conversation
    Julian Mazzitelli
    @thejmazz
    I've been busy with work trying to finish things up before heading to SF (I'm gonna stay an extra 3 days too for vacay XD)
    hopefully can look at it this weekend, gonna work on a presentation this weekend too
    Bruno Vieira
    @bmpvieira
    Thanks guys, I haven't figured it out yet so any help would be appreciated
    Julian Mazzitelli
    @thejmazz
    I'll be able to this weekend
    Tiago Jesus
    @tiagofilipe12
    ok managed to replicate Bruno's error
    but found something odd
    there is a \n before every command
    in the last two tasks
    I don't get the undefined stuff that Bruno was getting
    but there are a lot of \n in the command line
    Tiago Jesus
    @tiagofilipe12
    ok, so, this task
    const generateNcbiRefGenomeUrlFromNcbiMetadata = task({
      input: '*.metadata.json',
      output: '*.urls.txt',
      name: 'From accessions in a file, generate ENA download URLs for FASTQ'
    }, ({ input }) => `cat ${input} | \
        jq -r '@text "\\(.ftppath_refseq)/\\(.assemblyaccession)_\\(.assemblyname)_genomic.fna.gz"'
        > ${input.replace(/\.metadata.json/, '.urls.txt')}`
    )
    gets the output properly written I think:
    ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR620/SRR620242
    ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR620/SRR620547
    Am I right?
    but... this task
    const generateEnaFastqUrlsFromRunsInNcbiMetadata = task({
      // NCBI SRAs takes longer to download and extract, so we use ENA
      input: '*.metadata.json',
      output: '*.urls.txt',
      name: 'From Runs accessions in a file, generate ENA download URLs for FASTQ'
    }, ({ input }) => `cat ${input} | jq -r '.runs.Run[] | .acc' | bash -c 'while read acc; do
        if [ \${#acc} == 9 ] ; then
          dir2=""
        else
          dir2=$(printf %03d \${acc:9:3})/
        fi
        echo ftp://ftp.sra.ebi.ac.uk/vol1/fastq/\${acc:0:6}/$dir2$acc
      done' > ${input.replace(/\.metadata.json/, '.urls.txt')}`
    )
    creates an empty output
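To make the intent of the shell loop in that task easier to follow, here is a minimal sketch of the same URL rule in plain JavaScript. The `enaFastqUrl` helper and the 10-character accession are made up for illustration; only the 9-character case is confirmed by the URLs quoted in the chat.

```javascript
// Hypothetical helper mirroring the shell loop in the task above.
function enaFastqUrl (acc) {
  const base = 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq'
  // 9-character accessions sit directly under the 6-character prefix directory.
  // Longer ones get an extra directory: the characters from index 9 onward
  // (at most 3), zero-padded to width 3, like `printf %03d` in the loop.
  const dir2 = acc.length === 9 ? '' : acc.slice(9, 12).padStart(3, '0') + '/'
  return `${base}/${acc.slice(0, 6)}/${dir2}${acc}`
}

console.log(enaFastqUrl('SRR620242'))
console.log(enaFastqUrl('SRR1234567')) // made-up 10-character accession
```

If the task's output file ends up empty, comparing it against what this function returns for the same accessions is one way to check whether the loop logic or the command execution is at fault.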
    Tiago Jesus
    @tiagofilipe12
    also notice this
    "operationString": "cat /home/tiago/bin/bionode-watermill/examples/pipelines/tests/data/ebd8890/solenopsis.metadata.json | jq -r '.runs.Run[] | .acc' | bash -c 'while read acc; do\n    if [ ${#acc} == 9 ] ; then\n      dir2=\"\"\n    else\n      dir2=$(printf %03d ${acc:9:3})/\n    fi\n    echo ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${acc:0:6}/$dir2$acc\n  done' > /home/tiago/bin/bionode-watermill/examples/pipelines/tests/data/ebd8890/solenopsis.urls.txt"
    there are some do\n
    and then\n
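The `\n` sequences in that `operationString` are what a JavaScript template literal normally produces: every literal line break is kept in the string, and JSON shows it as `\n`. A small sketch of that behaviour (the commands are placeholders, not from the pipeline):

```javascript
// A template literal keeps every literal line break as a newline character.
const multiLine = `while read acc; do
  echo $acc
done`

// A backslash directly before the line break is a line continuation:
// both the backslash and the newline are dropped from the string.
const continued = `cat file.txt | \
  wc -l`

// JSON.stringify makes the newlines visible as \n, as in operationString.
console.log(JSON.stringify(multiLine))
console.log(JSON.stringify(continued))
```

Newlines inside a single-quoted shell string are valid shell on their own, so their presence does not by itself explain the empty output.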
    Julian Mazzitelli
    @thejmazz
    maybe should have a \ at end of those?
    all that gets turned into child_process.spawn('cat', [everythingElseSplitBySpace], { shell: true })
    so maybe it's something about that while loop being executed that way, can try ./myscript.sh instead
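A rough way to test that theory outside watermill, assuming a POSIX `sh` is available. It uses `spawnSync` instead of `spawn` only to keep the sketch synchronous, and the command is a made-up stand-in for the real pipeline. With `shell: true`, Node joins the file and arguments back together with spaces before handing them to the shell.

```javascript
const { spawnSync } = require('child_process')

// Hypothetical stand-in for the pipeline command: a multi-line while loop
// inside single quotes, fed two made-up IDs by printf.
const command = `printf 'a\\nbb\\n' | sh -c 'while read acc; do
  echo "len=\${#acc}"
done'`

// Split on spaces, as described in the chat.
const [file, ...args] = command.split(' ')

// With shell: true the pieces are rejoined with spaces and run by the shell,
// so the newlines embedded in the loop reach the shell unchanged.
const result = spawnSync(file, args, { shell: true, encoding: 'utf8' })
console.log(result.stdout)
```

If a loop like this prints one line per ID, the split-and-spawn step is probably not what empties the output file.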
    Tiago Jesus
    @tiagofilipe12
    I would suggest also that
    It's cleaner also
    Julian Mazzitelli
    @thejmazz
    maybe wrapping it all inside a bash -c would work too
    Julian Mazzitelli
    @thejmazz
    hmm but you use ${input} inside your script, so making a script file you'd lose that ability
    maybe can do
    cat ${input} | ./mythingy.sh | ...
    I sketched out a way to run inline scripts: bionode/bionode-watermill#89
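A minimal sketch of the `cat ${input} | ./mythingy.sh` idea: the loop moves into a script file that reads accessions on stdin, while `${input}` stays interpolated on the JavaScript side. `buildCommand` is a hypothetical name, `./mythingy.sh` is assumed to exist and be executable, and this is not the inline-script approach from the linked issue.

```javascript
// Hypothetical command builder: ./mythingy.sh is assumed to read run
// accessions on stdin and print one URL per line.
const buildCommand = ({ input }) =>
  `cat ${input} | jq -r '.runs.Run[] | .acc' | ./mythingy.sh > ${input.replace(/\.metadata\.json$/, '.urls.txt')}`

console.log(buildCommand({ input: 'solenopsis.metadata.json' }))
```

The regular expression here escapes both dots and anchors at the end of the name, which is slightly stricter than the pattern used in the tasks above.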
    Julian Mazzitelli
    @thejmazz

    using my "script" function from that issue and still get same error as bruno:

    69840a4 :   traverse got undefined, returning
    error!:  TypeError: path.split is not a function
    Unhandled rejection (<{"threads":1,"container":null,"resume"...>, no stack trace)

    but the solenopsis.metadata.json is also empty
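One hint in that error text: `path.split is not a function` is what JavaScript reports when `path` is defined but is not a string (for example an array or an object). An undefined `path` produces a different TypeError. A small sketch with a hypothetical `splitPath` function, which says nothing about where in watermill the call actually sits:

```javascript
// Hypothetical function standing in for whatever code calls path.split.
const splitPath = (path) => path.split('/')

const messages = []
for (const value of [['a.txt', 'b.txt'], undefined]) {
  try {
    splitPath(value)
  } catch (err) {
    // An array has no split method; undefined has no properties at all,
    // so the two cases fail with different messages.
    messages.push(err.message)
  }
}

console.log(messages)
```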

    Tiago Jesus
    @tiagofilipe12
    Can you see where the error occurs? I could not find the "error!" in watermill repo. I mean where is that being printed?
    Julian Mazzitelli
    @thejmazz
    idk im gonna go through pipeline commenting out each task to see first why the metadata.json is empty
    Julian Mazzitelli
    @thejmazz
    oh wait nvm derp i dont even have bionode-ncbi on this box thats probably why its empty lol
    Julian Mazzitelli
    @thejmazz
    hmmm ok so separating out the stuff in junction and each works (albeit download can't resolve output for ref cause it's looking for fastq.gz not fna) - but anyways each works on its own
    Julian Mazzitelli
    @thejmazz
    hmmmm, i thought maybe it was because *.urls.txt existed from two tasks and that was tripping it up, so changed one to be *.readsurls.txt - but still that problem