These are chat archives for nellore/rail

22nd
Aug 2017
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:22

Hey @nellore, it turns out that if I uncomment my credentials, the IAM role that devops attached to my instance no longer works (I talked to them, and this is why they had previously asked me to comment out my credentials in ~/.aws/credentials), which leads to the following error:

*Errors encountered*
Job flow failed on Monday, Aug 21, 2017 at 10:38:37 PM UTC. Run time was 142.681 seconds.
Traceback (most recent call last):
  File "app_main.py", line 75, in run_toplevel
  File "/usr/local/raildotbio/rail-rna/dooplicity/emr_runner.py", line 285, in <module>
    args.aws_exe, args.profile, args.region)
  File "/usr/local/raildotbio/rail-rna/dooplicity/emr_runner.py", line 230, in run_job_flow
    + (' ensure that IAM roles are '
RuntimeError: (HTTP Error 400: Bad Request); ensure that IAM roles are configured properly. This may require talking to your AWS account admin. ...

Do you think there is a way to hack rail to prevent the previous error (the one I got when I didn't have my credentials in ~/.aws/credentials)? Or does that sound difficult?

abhinav
@nellore
Aug 22 2017 00:24
are you launching an EMR cluster from an EC2 instance or from your laptop?
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:25
from an EC2 instance
abhinav
@nellore
Aug 22 2017 00:27
hm, what happens when you run aws emr create-default-roles?
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:27

when the credentials are commented out, it works and returns

[]

but when I don't comment out the credentials, I get an error

An error occurred (AccessDenied) when calling the CreateRole operation

abhinav
@nellore
Aug 22 2017 00:33
is that the default profile you're using?
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:33
hmm sorry but I am not sure I know, how can I check ?
abhinav
@nellore
Aug 22 2017 00:33
on the EC2 instance, can you enter echo $AWS_ACCESS_KEY_ID
and let me know if it says anything, but not paste it?
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:34
it's blank (but I have my credentials commented again)
abhinav
@nellore
Aug 22 2017 00:34
and in ~/.aws/credentials, just see if the commented-out lines are under [default]
blank, i see
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:34
yes they are
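
The behaviour described above is consistent with the AWS CLI's credential lookup order: keys found in environment variables or in ~/.aws/credentials are used before the role attached to the EC2 instance, so uncommenting the keys under [default] shadows the instance role. Below is a minimal Python sketch for checking which of those two sources is active without printing any secret. The function name `active_credential_sources` is hypothetical, and the sketch only looks at the two places mentioned in this conversation.

```python
import configparser


def active_credential_sources(credentials_text, environ, profile="default"):
    """Return the credential sources that would shadow an instance role.

    credentials_text is the content of ~/.aws/credentials and environ is a
    mapping such as os.environ. Only key names are inspected, never values.
    """
    sources = []
    if environ.get("AWS_ACCESS_KEY_ID"):
        sources.append("environment")
    parser = configparser.ConfigParser(interpolation=None)
    # Lines starting with '#' or ';' are comments, so commented-out keys
    # do not show up as options of the profile section.
    parser.read_string(credentials_text)
    if parser.has_section(profile) and parser.has_option(profile, "aws_access_key_id"):
        sources.append("credentials file [%s]" % profile)
    return sources
```

On the instance, this could be called with the text of the credentials file and `os.environ`; an empty list would mean neither source is set, leaving the instance role as the fallback.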
abhinav
@nellore
Aug 22 2017 00:36
alright
can you do the following
(i'm following instructions here, modifying them slightly so rail might work)
export AWS_ACCESS_KEY_ID=$(curl http://169.254.169.254/latest/meta-data/iam/security-credentials/${instance_profile} | grep AccessKeyId | cut -d':' -f2 | sed 's/[^0-9A-Z]*//g')
grrr
ugh
okay
you see the aws_access_key_id and secret_access_key lines there?
run those, except precede them with export, and make the variable names before the equals signs uppercase
then, keeping those lines in your credentials file commented
try running the rail-rna cluster launch command
there = from the link i sent you
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:42

hmmm from

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/${instance_profile}

all I get is the ec2-role, there is no aws_access_key_id or secret_access_key that I can see

aaaaargh... I'm sorry for so much troubleshooting :(
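
The metadata path ending in security-credentials/ only lists the name of the role attached to the instance; the keys themselves are served one level deeper, at security-credentials/ followed by that role name, as a JSON document. Since ${instance_profile} was never set in the shell, the curl above returned just the listing. Here is a minimal Python sketch of the two-step lookup. It assumes the instance answers plain unauthenticated metadata requests, as in this exchange, and the helper names are made up for illustration. The JSON also carries a Token field, which temporary credentials need exported as AWS_SESSION_TOKEN alongside the two keys.

```python
import json
import shlex
from urllib.request import urlopen

BASE = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def export_lines(credentials_json):
    """Turn the metadata service's JSON document into shell export lines."""
    creds = json.loads(credentials_json)
    return [
        "export AWS_ACCESS_KEY_ID=" + shlex.quote(creds["AccessKeyId"]),
        "export AWS_SECRET_ACCESS_KEY=" + shlex.quote(creds["SecretAccessKey"]),
        "export AWS_SESSION_TOKEN=" + shlex.quote(creds["Token"]),
    ]


def fetch_export_lines(base=BASE, timeout=2):
    """Look up the role name first, then fetch that role's credentials."""
    role = urlopen(base, timeout=timeout).read().decode().split()[0]
    document = urlopen(base + role, timeout=timeout).read().decode()
    return export_lines(document)
```

Run on the instance, `print("\n".join(fetch_export_lines()))` would print three lines that can be pasted into the shell. These credentials expire, so they would need refreshing for long jobs.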
abhinav
@nellore
Aug 22 2017 00:46
no no, i'm learning too
a role was created for you
so you need to grab some temporary security credentials
read under Using Temporary Security Credentials with the AWS CLI
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:46
Okay!
abhinav
@nellore
Aug 22 2017 00:47
run $ aws sts assume-role --role-arn arn:aws:iam::123456789012:role/role-name --role-session-name "RoleSession1" --profile IAM-user-name > assume-role-output.txt
except replace role-name with the name of the role given to you
and IAM-user-name with your IAM user name
see if an access key id and secret access key are stored in assume-role-output
if so, then set the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to them, and _now_ try resubmitting the job flow
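
A minimal Python sketch of that last step, assuming assume-role-output.txt holds the CLI's default JSON output. Temporary credentials from sts assume-role include a SessionToken in addition to the two keys, and the AWS CLI reads it from AWS_SESSION_TOKEN, so the sketch exports all three. The function name is hypothetical.

```python
import json
import shlex


def exports_from_assume_role(output_text):
    """Build shell export lines from `aws sts assume-role` JSON output."""
    creds = json.loads(output_text)["Credentials"]
    mapping = [
        ("AWS_ACCESS_KEY_ID", "AccessKeyId"),
        ("AWS_SECRET_ACCESS_KEY", "SecretAccessKey"),
        ("AWS_SESSION_TOKEN", "SessionToken"),
    ]
    # shlex.quote leaves plain values untouched and quotes anything unusual.
    return ["export %s=%s" % (var, shlex.quote(creds[key])) for var, key in mapping]
```

Reading the file and printing `"\n".join(exports_from_assume_role(text))` gives lines that can be pasted into the shell before resubmitting.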
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 00:49
Okay let me try !
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 01:13

... now it looks like the role devops gave me doesn't allow me to run "assume-role"

An error occurred (AccessDenied) when calling the AssumeRole operation: Not authorized to perform sts:AssumeRole

I will have to check with them tomorrow because they are gone for the day
I'll keep you posted

abhinav
@nellore
Aug 22 2017 01:14
sounds good
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 18:22

Hi @nellore, I'm still waiting to hear back from devops about the elastic mode issue...
meanwhile, regarding the local mode issue that I referred to yesterday:

rail-rna_logs/align_reads/dp.reduce.log/1.0.log
Traceback (most recent call last):
  File "app_main.py", line 75, in run_toplevel
  File "/usr/local/raildotbio/rail-rna/rna/steps/align_reads.py", line 774, in <module>
    no_polyA=args.no_polyA)
  File "/usr/local/raildotbio/rail-rna/rna/steps/align_reads.py", line 465, in go
    for is_reversed, name, qual in xpartition:
ValueError: expected length 3, got 2

When I open the corresponding preprocessed file,

rail-rna_logs/preprocess/push/1.0.gz
all of the reads (column 1 in that file) are at least 35 bp long, and so are the qualities (column 4 in that file). Is there anything else that could cause this issue?

I also get the following error a lot (for most of the run; btw locally, I'm running 4 samples at a time with around 50 million reads per sample, PE 75 bp initially before trimming, and I am using 8 threads per run).

*Errors encountered*
Streaming command "LC_ALL=C sort -T ./ -S 8000000 -k1,1 -t$'\t' -m /scratch/output/RNAseq/rail-rna_logs/align_reads/dp.tasks/4.* | /usr/local/raildotbio/pypy-2.5-linux_x86_64-portable/bin/pypy /usr/local/raildotbio/rail-rna/rna/steps/align_reads.py --bowtie-idx=/scratch/output/Genome/Index/Bowtie1/hg38_ERCC92 --bowtie2-idx=/scratch/output/Genome/Index/Bowtie2/hg38_ERCC92 --bowtie2-exe=/usr/local/raildotbio/bowtie2-2.2.7/bowtie2 --exon-differentials --partition-length=5000 --min-exon-size=9 --search-filter=1 --manifest=/scratch/output/RNAseq/manifest_20170808_local.txt --max-readlet-size=25 --readlet-interval=4 --capping-multiplier=1.1 --gzip-level 3 --index-count 400 --tie-margin 0 --verbose --scratch ./ --output-bam-by-chr --no-polyA -- 2>/scratch/output/RNAseq/rail-rna_logs/align_reads/dp.reduce.log/4.0.log" failed; exit level was 137.
Job flow failed on Tuesday, Aug 22, 2017 at 03:08:10 AM UTC. Run time was 15272.846 seconds.

and the rail-rna_logs/align_reads/dp.reduce.log/4.0.log is actually empty
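
An exit level above 128 conventionally means the process was terminated by a signal, the signal number being the exit level minus 128. For 137 that is signal 9 (SIGKILL), which a process cannot catch or handle, and that would be consistent with the empty log. One common sender of SIGKILL on Linux is the kernel's out-of-memory killer, but nothing quoted here confirms that this is what happened, so treat it as a hypothesis to check. A small Python sketch of the decoding (the function name is illustrative):

```python
import signal


def describe_exit_level(level):
    """Explain a shell-style exit level such as the 137 reported above."""
    if level > 128:
        number = level - 128
        try:
            name = signal.Signals(number).name
        except ValueError:
            name = "unknown signal"
        return "terminated by signal %d (%s)" % (number, name)
    return "exited on its own with status %d" % level


print(describe_exit_level(137))  # on Linux: terminated by signal 9 (SIGKILL)
```

If memory is the suspect, the kernel log on the instance (for example via dmesg) is the usual place to look for out-of-memory kills around the failure time.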

Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 18:49
I just realized something that goes beyond my understanding :smile: : when I run only 1 set of 4 samples at a time locally on an EC2 instance, the run doesn't seem to give any errors, while if I run several sets of 4 samples on the same instance (in different directories), the chances of errors look very high.
The exact same set of 4 samples that gave me the error (ValueError: expected length 3, got 2) doesn't give the error anymore when it is the only set of samples run on the instance. There is also a confounding factor: the instance type is now different. Does any of this make sense to you?
abhinav
@nellore
Aug 22 2017 19:20
everything you just said makes total sense
jk
that's super weird
abhinav
@nellore
Aug 22 2017 19:42
okay, i think i need more details to help
@juliadiiulio_twitter
so you run rail on 4 samples in local mode on an EC2 instance
and everything is fine
abhinav
@nellore
Aug 22 2017 19:48
then you run rail in local mode on the same EC2 instance with different output directories on S3, and everything is not fine?
can you paste the rail-rna commands you're running?
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 20:44

so the output directories are all on the EC2 instance.
but I open a screen session (with screen -S railrna)
and run, let's say, 2 different sets of 4 samples, each in a different screen session and with a different output directory

here is the command line I use :

smpl=smpl41to44
mcd /scratch/output/RNAseq/${smpl}/
deliverables=idx,tsv,bed,bam,bw,jx
rail-rna go local -m /scratch/output/RNAseq/${smpl}.txt \
--bowtie-idx /scratch/output/Genome/Index/Bowtie1/hg38_ERCC92 /scratch/output/Genome/Index/Bowtie2/hg38_ERCC92 \
-d ${deliverables} --verbose --num-processes 8 --scratch ./ --skip-bad-records --sort-memory-cap 8000000

where mcd is an alias for mkdir -p followed by cd.
If I then want to change the set of samples, I just change smpl=smpl41to44 to smpl=smpl45to48 and run the same command in another screen session.
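
The same launch can be sketched in Python so that sample sets run one after another instead of side by side in separate screen sessions. Every flag is copied from the command above; the function names and the choice to run sequentially are illustrative assumptions, not a rail-rna recommendation.

```python
import os
import subprocess

ROOT = "/scratch/output"


def build_command(smpl, deliverables="idx,tsv,bed,bam,bw,jx"):
    """Assemble the rail-rna local-mode command for one sample set."""
    return [
        "rail-rna", "go", "local",
        "-m", "%s/RNAseq/%s.txt" % (ROOT, smpl),
        "--bowtie-idx",
        ROOT + "/Genome/Index/Bowtie1/hg38_ERCC92",
        ROOT + "/Genome/Index/Bowtie2/hg38_ERCC92",
        "-d", deliverables,
        "--verbose", "--num-processes", "8", "--scratch", "./",
        "--skip-bad-records", "--sort-memory-cap", "8000000",
    ]


def run_sets(sample_sets):
    """Run each sample set in its own directory, one at a time."""
    for smpl in sample_sets:
        workdir = "%s/RNAseq/%s" % (ROOT, smpl)
        os.makedirs(workdir, exist_ok=True)  # same effect as the mcd alias
        # check=True stops the loop at the first failing run.
        subprocess.run(build_command(smpl), cwd=workdir, check=True)
```

For example, `run_sets(["smpl41to44", "smpl45to48"])` would process the two sets back to back, so they never compete for memory or scratch space on the same instance.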

abhinav
@nellore
Aug 22 2017 22:42
@juliadiiulio_twitter ok hm, so does the following thing happen: when you run on sample set A alone, you get no error, but when you run on sample set A at the same time as sample sets B and C on separate screens, the sample set A run returns an error? Or do different sample sets return errors consistently?
trying to figure out whether the errors are reproduced across runtime conditions
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:01
@nellore the first option is correct :
"when you run on sample set A alone, you get no error, but when you run on sample set A at the same time as sample sets B and C on separate screens, the sample set A run returns an error"
abhinav
@nellore
Aug 22 2017 23:01
cry
that's no good
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:01
hahah I did a little inside :smile:
abhinav
@nellore
Aug 22 2017 23:12
okay, we want to find out which line makes rail choke
which input line
this will require adding a line to align_reads.py
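
For context, the ValueError comes from the tuple unpacking on line 465 of the traceback, `for is_reversed, name, qual in xpartition:`, which needs every item to have exactly three elements. One way to find the offending input is to wrap the iterable so that a malformed item is reported with its position and content before the unpacking fails. The sketch below is generic Python that also runs under Python 2 interpreters such as the PyPy 2.5 in the log paths; `checked` is a hypothetical helper, and how xpartition is actually built inside align_reads.py is not shown in this conversation.

```python
import sys


def checked(records, expected=3):
    """Yield records unchanged, reporting the first one of the wrong length."""
    for index, record in enumerate(records):
        if len(record) != expected:
            sys.stderr.write("bad record %d: %r\n" % (index, record))
            raise ValueError(
                "record %d has length %d, expected %d"
                % (index, len(record), expected)
            )
        yield record


# Usage mirroring the loop in the traceback, on made-up data:
sample = [(False, "read1", "IIII"), (True, "read2", "IIII")]
for is_reversed, name, qual in checked(sample):
    pass
```

Because the streaming command redirects stderr to the step's dp.reduce.log file, a message written this way would end up next to the existing traceback.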
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:14
Oh no don't worry, it takes several hours before rail chokes, so I think for now I'll just run the set on different instances :)
abhinav
@nellore
Aug 22 2017 23:14
would love to figure out why this happens
but i'd also like to know why you're running rail this way
rather than on all samples at once on EMR
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:15
oh ideally I would definitely run all samples at once on EMR
... it's just that I am running into those permission issues on AWS, and devops hasn't gotten back to me yet.. and I have to find a way to get the project going
abhinav
@nellore
Aug 22 2017 23:16
sorry to hear you're having trouble!
how many samples are you analyzing in total?
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:16
but I agree, that would def be my first choice :)
abhinav
@nellore
Aug 22 2017 23:16
and how many reads per sample?
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:16
96 samples
abhinav
@nellore
Aug 22 2017 23:16
have you tried taking your credentials
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:17
~50 million reads
abhinav
@nellore
Aug 22 2017 23:17
and launching the EMR job from your laptop?
that's the use case we were targeting
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:17
ha I did not !
abhinav
@nellore
Aug 22 2017 23:17
it's much easier
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:17
I'll try
abhinav
@nellore
Aug 22 2017 23:17
your credentials might work
but you then need to be able to create the default emr roles from your laptop
if they're already set up for your account it may work
Julia di Iulio
@juliadiiulio_twitter
Aug 22 2017 23:19
hahah from what I learned so far... nothing is set up for my account :yum: but I'll try!