abhinav
@nellore
so you run rail on 4 samples in local mode on an EC2 instance
and everything is fine
abhinav
@nellore
then you run rail in local mode on the same EC2 instance with different output directories on S3, and everything is not fine?
can you paste the rail-rna commands you're running?
Julia di Iulio
@juliadiiulio_twitter

so the output directories are all on the EC2 instance.
but I open a screen (with > screen -S railrna)
and run, let's say, 2 different sets of 4 samples, each on a different screen and with a different output directory

here is the command line I use :

smpl=smpl41to44
mcd /scratch/output/RNAseq/${smpl}/
deliverables=idx,tsv,bed,bam,bw,jx
rail-rna go local -m /scratch/output/RNAseq/${smpl}.txt \
--bowtie-idx /scratch/output/Genome/Index/Bowtie1/hg38_ERCC92 /scratch/output/Genome/Index/Bowtie2/hg38_ERCC92 \
-d ${deliverables} --verbose --num-processes 8 --scratch ./ --skip-bad-records --sort-memory-cap 8000000

if I then want to change the set of samples, I just change smpl=smpl41to44 to smpl=smpl45to48 and run the same command on another screen.
(mcd is an alias for mkdir -p followed by cd)
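
For readers following along: the definition of mcd was not posted, so the following is only a minimal sketch of an equivalent helper. It is written as a shell function rather than an alias (an assumption), because an alias cannot pass the same argument to both mkdir -p and cd.

```shell
# Hypothetical stand-in for the mcd helper mentioned above:
# create the directory (including parents), then change into it.
mcd() {
    mkdir -p -- "$1" && cd -- "$1"
}
```

With this in place, `mcd /scratch/output/RNAseq/${smpl}/` creates the per-sample-set output directory and makes it the working directory, so `--scratch ./` resolves to a different location for each sample set.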

abhinav
@nellore
@juliadiiulio_twitter ok hm, so does the following thing happen: when you run on sample set A alone, you get no error, but when you run on sample set A at the same time as sample sets B and C on separate screens, the sample set A run returns an error? Or do different sample sets return errors consistently?
trying to figure out whether the errors are reproduced across runtime conditions
Julia di Iulio
@juliadiiulio_twitter
@nellore the first option is correct :
"when you run on sample set A alone, you get no error, but when you run on sample set A at the same time as sample sets B and C on separate screens, the sample set A run returns an error"
abhinav
@nellore
:cry:
that's no good
Julia di Iulio
@juliadiiulio_twitter
hahah I did a little inside :smile:
abhinav
@nellore
okay, we want to find out which line makes rail choke
which input line
this will require adding a line to align_reads.py
Julia di Iulio
@juliadiiulio_twitter
Oh no don't worry, it takes several hours before rail chokes, so I think for now I'll just run the set on different instances :)
abhinav
@nellore
would love to figure out why this happens
but i'd also like to know why you're running rail this way
rather than on all samples at once on EMR
Julia di Iulio
@juliadiiulio_twitter
oh ideally I would definitely run all samples at once on EMR
... it's just that I am running into those permission issues on AWS, and devops hasn't come back to me yet... and I have to find a way to get the project going
abhinav
@nellore
sorry to hear you're having trouble!
how many samples are you analyzing in total?
Julia di Iulio
@juliadiiulio_twitter
but I agree, that would def be my first choice :)
abhinav
@nellore
and how many reads per sample?
Julia di Iulio
@juliadiiulio_twitter
96 samples
abhinav
@nellore
have you tried taking your credentials
Julia di Iulio
@juliadiiulio_twitter
~50 million reads
abhinav
@nellore
and launching the EMR job from your laptop?
that's the use case we were targeting
Julia di Iulio
@juliadiiulio_twitter
ha I did not !
abhinav
@nellore
it's much easier
Julia di Iulio
@juliadiiulio_twitter
I'll try
abhinav
@nellore
your credentials might work
but you then need to be able to create the default emr roles from your laptop
if they're already set up for your account it may work
Julia di Iulio
@juliadiiulio_twitter
hahah from what I learned so far... nothing is set up for my account :yum: but I'll try!
Ben Strober
@BennyStrobes
Hi @nellore. I'm hoping to use rail to generate exon-exon junction counts for ~200 samples (at ~50 million reads). As rail takes a fair amount of time to run, I was planning on running multiple batches and then aggregating the junction data across the batches. I just want to check that doing this would give the same quantification as running all the samples in one batch?
abhinav
@nellore
@BennyStrobes it will if you use first-pass junctions
which are exactly the junctions recorded by snaptron
moreover, you can run only the first part of rail
if you specify that the only deliverable you want is jx
that is
use
-d jx
and then the whole thing takes a bit less than half the time it usually does
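
As a rough illustration of the suggestion above, here is a sketch that prints (but does not run) a junctions-only command for one batch. The manifest and index paths are placeholders reused from the local-mode command posted earlier in this chat, the function name jx_only_cmd is hypothetical, and the other options from that command are left out for brevity. The only change of substance is `-d jx`.

```shell
# Print a junctions-only rail-rna command for one sample set (dry run).
# Paths are placeholders taken from the command posted earlier in the chat;
# jx_only_cmd is a hypothetical helper name, not part of rail-rna.
jx_only_cmd() {
    local smpl=$1
    printf '%s ' rail-rna go local \
        -m "/scratch/output/RNAseq/${smpl}.txt" \
        --bowtie-idx /scratch/output/Genome/Index/Bowtie1/hg38_ERCC92 \
                     /scratch/output/Genome/Index/Bowtie2/hg38_ERCC92 \
        -d jx
    printf '\n'
}

# Show the command for one batch; repeat with another sample-set name per batch.
jx_only_cmd smpl41to44
```

Printing the command first makes it easy to check the manifest path and deliverable list for each batch before committing to a run that takes hours.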
do you understand what i mean by "first-pass junctions?"
Ben Strober
@BennyStrobes
Hey @nellore, thanks for the quick reply! That's perfect, as I want to compare my results to the snaptron data. I don't understand what you mean by "first-pass junctions", could you elaborate a bit?
abhinav
@nellore
junctions detected by the aligner in a given sample on a single pass of alignment
if we share junctions across samples, we may find that some junctions undetected in a given sample after a single pass of alignment are detected there on a second pass of alignment
but if you're comparing with snaptron