These are chat archives for nextflow-io/nextflow

26th
Jan 2017
Gijs Molenaar
@gijzelaerr
Jan 26 2017 09:29
λ docker --version Docker version 1.13.0, build 49bf474
oops, wrong channel, sorry
Carlos Guzman
@c-guzman
Jan 26 2017 16:16
@pditommaso Hey Paolo, are you here? Not sure how to ping.
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:31
@c-guzman Hi Carlos, are you around?
Carlos Guzman
@c-guzman
Jan 26 2017 16:31
@pditommaso Hey, yes I am
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:31
great
so the mem was full of these
# stringtie Ctrl2.sorted.mapped.bam -v -G gencode.v19.annotation.gtf -A Ctrl2.gene_abundance.txt -C Ctrl2.cov_refs.gtf -e -b Ctrl2_ballgown -p 10
# StringTie version 1.3.0
chr1    HAVANA    transcript    89295    120932    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000466430.1"; ref_gene_name "RP11-34P13.7"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1    HAVANA    exon    89295    91629    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000466430.1"; exon_number "1"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    exon    92091    92240    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000466430.1"; exon_number "2"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    exon    112700    112804    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000466430.1"; exon_number "3"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    exon    120775    120932    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000466430.1"; exon_number "4"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    transcript    89551    91105    .    -    .    gene_id "ENSG00000239945.1"; transcript_id "ENSG00000239945.1"; ref_gene_name "RP11-34P13.8"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1    HAVANA    exon    89551    91105    .    -    .    gene_id "ENSG00000239945.1"; transcript_id "ENSG00000239945.1"; exon_number "1"; ref_gene_name "RP11-34P13.8"; cov "0.0";
chr1    HAVANA    transcript    89551    91105    .    -    .    gene_id "ENSG00000239945.1"; transcript_id "ENST00000495576.1"; ref_gene_name "RP11-34P13.8"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1    HAVANA    exon    89551    90050    .    -    .    gene_id "ENSG00000239945.1"; transcript_id "ENST00000495576.1"; exon_number "1"; ref_gene_name "RP11-34P13.8"; cov "0.0";
chr1    HAVANA    exon    90287    91105    .    -    .    gene_id "ENSG00000239945.1"; transcript_id "ENST00000495576.1"; exon_number "2"; ref_gene_name "RP11-34P13.8"; cov "0.0";
chr1    HAVANA    transcript    92230    129217    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000477740.1"; ref_gene_name "RP11-34P13.7"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1    HAVANA    exon    92230    92240    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000477740.1"; exon_number "1"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    exon    112700    112804    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000477740.1"; exon_number "2"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    exon    120721    120932    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000477740.1"; exon_number "3"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    exon    129055    129217    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000477740.1"; exon_number "4"; ref_gene_name "RP11-34P13.7"; cov "0.0";
chr1    HAVANA    transcript    110953    129173    .    -    .    gene_id "ENSG00000238009.2"; transcript_id "ENST00000471248.1"; ref_gene_name "RP11-34P13.7"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
Carlos Guzman
@c-guzman
Jan 26 2017 16:32
hmm
so the stringtie process is messing the pipeline up?
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:33
I guess you are splitting those annotations somewhere...
send me again the link to your code please
it could be this? -> stdout into stringtie_log
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:35
I guess so
Carlos Guzman
@c-guzman
Jan 26 2017 16:35
Weird. I essentially modelled this after the NGI workflow
and that's what they used
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:36
is the stringtie stdout very big ?
Carlos Guzman
@c-guzman
Jan 26 2017 16:36
honestly not really
let me see if i can find an exact size
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:37
I would suggest a very easy change
stringtie ... > stringtie_log
then you can capture that file instead of the stdout
let's see if it solves the problem
Carlos Guzman
@c-guzman
Jan 26 2017 16:38
is it > or 2>?
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:38
> is stdout
2> is stderr
thus I guess it's the first
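For readers following along, here is a minimal shell sketch of the difference between the two redirections. The function `noisy_tool` is a hypothetical stand-in for a command such as stringtie, and the file names are placeholders, not the ones used in Carlos's pipeline.

```shell
#!/bin/sh
# Hypothetical stand-in for a tool that writes to both streams.
noisy_tool() {
    echo "result line"        # goes to stdout (file descriptor 1)
    echo "warning line" >&2   # goes to stderr (file descriptor 2)
}

workdir=$(mktemp -d)

# '>' redirects stdout only; '2>' redirects stderr only.
noisy_tool > "$workdir/tool.out" 2> "$workdir/tool.err"

# To keep both streams in one log file, redirect stdout first,
# then point stderr at the same place with '2>&1'.
noisy_tool > "$workdir/tool.log" 2>&1

cat "$workdir/tool.out"   # result line
cat "$workdir/tool.err"   # warning line
```

Once the output is in a file, the process can declare that file as its output instead of capturing stdout, which is the change suggested above.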
Carlos Guzman
@c-guzman
Jan 26 2017 16:39
gotcha. sorry. i'm a wet lab biologist, somehow ended up doing all the bioinformatics recently
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:39
no problem
Carlos Guzman
@c-guzman
Jan 26 2017 16:40
okay i'm running the pipeline now after the change, i will ping you with results
Paolo Di Tommaso
@pditommaso
Jan 26 2017 16:40
:+1:
Carlos Guzman
@c-guzman
Jan 26 2017 17:19
@pditommaso That seems to have fixed the problem
Paolo Di Tommaso
@pditommaso
Jan 26 2017 17:20
Fantastic
Carlos Guzman
@c-guzman
Jan 26 2017 17:22
Thanks for all the help!
Paolo Di Tommaso
@pditommaso
Jan 26 2017 17:58
Welcome
Mike Smoot
@mes5k
Jan 26 2017 18:12
Hi @pditommaso are you aware of any size limitations on strings in channels? I'm seeing a situation where a channel basically stops processing on a string that is too large (200MB).
Paolo Di Tommaso
@pditommaso
Jan 26 2017 18:15
Um, chunks of that size I would store into files
Mike Smoot
@mes5k
Jan 26 2017 18:18
That's what I'm trying to do, but I guess I should avoid making strings that size in the first place. I'm trying to split a large fasta file into somewhat evenly sized files while not splitting any records. I'll revisit my logic!
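One way to do this kind of split without ever building a large string is to stream the file line by line and only switch output files at a record header. The sketch below is not Mike's code; it assumes a plain-text FASTA file, and the function name `split_fasta`, the size threshold, and the output naming scheme are all made up for illustration. Chunk sizes are approximate, because a chunk is only closed at the next header once the target has been reached.

```shell
#!/bin/sh
# split_fasta INPUT TARGET_BYTES PREFIX
# Writes PREFIX.001.fa, PREFIX.002.fa, ... (hypothetical naming scheme).
split_fasta() {
    awk -v target="$2" -v prefix="$3" '
        /^>/ {
            # A header line starts a new record: the only place where
            # it is safe to switch to a new output file.
            if (out == "" || size >= target) {
                if (out != "") close(out)
                n++
                out = sprintf("%s.%03d.fa", prefix, n)
                size = 0
            }
        }
        out != "" {
            # Lines before the first header are skipped.
            print > out
            size += length($0) + 1   # +1 for the newline
        }
    ' "$1"
}

# Small demonstration with three records and a 12-byte target.
workdir=$(mktemp -d)
printf '>seq1\nACGT\nACGT\n>seq2\nGG\n>seq3\nTTTT\n' > "$workdir/in.fa"
split_fasta "$workdir/in.fa" 12 "$workdir/chunk"
ls "$workdir"
```

Because only one line is held in memory at a time, the approach does not depend on the JVM heap or on how large any single chunk grows.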
Paolo Di Tommaso
@pditommaso
Jan 26 2017 18:18
Anyhow, does it produce any error messages?
Mike Smoot
@mes5k
Jan 26 2017 18:19
No errors and there's plenty of memory on the machine.
Paolo Di Tommaso
@pditommaso
Jan 26 2017 18:21
Um, JVM heap != machine mem
Mike Smoot
@mes5k
Jan 26 2017 18:23
Good point, I assume there's a JVM_OPT or something I can export?
Paolo Di Tommaso
@pditommaso
Jan 26 2017 18:26
Exactly. I always forget them; Google for Java heap min/max sizes
Mike Smoot
@mes5k
Jan 26 2017 18:29
I remember the command line: -Xmx=4G
Paolo Di Tommaso
@pditommaso
Jan 26 2017 18:30
I guess so
Mike Smoot
@mes5k
Jan 26 2017 18:31
This didn't seem to work: JAVA_OPTS="-Xmx=4G" nextflow run buffer_fasta.nf
Actually it's apparently -Xmx4G although I can't figure out how to tell Nextflow to use that.
Paolo Di Tommaso
@pditommaso
Jan 26 2017 18:37
Use either _JAVA_OPTIONS or NXF_OPTS
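As a rough sketch of what that looks like in practice, the heap flags go into one of those two environment variables before launching. The specific sizes below are assumptions for illustration (the `-Xms1g` initial heap in particular was not mentioned in the conversation). Note that the flag takes no equals sign.

```shell
#!/bin/sh
# Option 1: NXF_OPTS, the Nextflow-specific variable named above.
# The heap sizes are example values, not recommendations.
export NXF_OPTS='-Xms1g -Xmx4g'

# Option 2: _JAVA_OPTIONS is picked up by every HotSpot JVM started
# from this shell, not just Nextflow.
# export _JAVA_OPTIONS='-Xmx4g'

echo "NXF_OPTS=$NXF_OPTS"

# Then launch the pipeline as usual, for example:
# nextflow run buffer_fasta.nf
```

Setting the Nextflow-specific variable keeps the change scoped to Nextflow, whereas `_JAVA_OPTIONS` also affects any other Java tools run from the same shell.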
Mike Smoot
@mes5k
Jan 26 2017 18:38
Yes, just looked at the source. :)
Paolo Di Tommaso
@pditommaso
Jan 26 2017 18:39
Best friend of devs:)
Mike Smoot
@mes5k
Jan 26 2017 18:54
BTW, did you have a chance to read what I wrote for nextflow-io/nextflow#238?
Paolo Di Tommaso
@pditommaso
Jan 26 2017 21:12
I gave it a quick read, but I haven't yet had enough time to organise the work and write a detailed answer
I will comment soon, thanks a lot for your proposal for now.