These are chat archives for nellore/rail

21st
Aug 2017
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 01:07

Hi @nellore! Just started using Rail-RNA because it looks like it rocks :) but have had some issues... I am trying (with help from devOps people) to run it with AWS EMR.
Here is the error I get; any help would be greatly appreciated:

Parameter validation failed:
Missing required parameter in LifecycleConfiguration.Rules[0]: "Prefix"
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/raildotbio/rail-rna/main.py", line 975, in <module>
ec2_slave_security_group_id=args.ec2_slave_security_group_id
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 6435, in __init__
secure_stack_name=secure_stack_name)
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 2217, in __init__
days=intermediate_lifetime)
File "/usr/local/raildotbio/rail-rna/dooplicity/ansibles.py", line 386, in expire_prefix
' '.join(aws_command)
RuntimeError: Error encountered changing lifecycleparameters with command "aws --profile default s3api put-bucket-lifecycle --bucket rail-rna --lifecycle-configuration {"Rules":[{"Status": "Enabled", "ID": "something", "NoncurrentVersionExpiration": {"NoncurrentDays": 365}, "Expiration": {"ExpiredObjectDeleteMarker": true}, "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}}, {"Status": "Enabled", "Prefix": "ElasticMode/20170816.intermediate/", "Expiration": {"Days": -1}}]}".

In the meantime, I also tried running it locally, and I get the following error for some of the samples but not all:

less rail-rna_logs/align_reads/dp.reduce.log/0.0.log
Traceback (most recent call last):
File "app_main.py", line 75, in run_toplevel
File "/usr/local/raildotbio/rail-rna/rna/steps/align_reads.py", line 774, in <module>
no_polyA=args.no_polyA)
File "/usr/local/raildotbio/rail-rna/rna/steps/align_reads.py", line 465, in go
for is_reversed, name, qual in xpartition:
ValueError: expected length 3, got 2

I thought this could be due to bad input fastq, so I used --skip-bad-records, but it doesn't seem to fix it.

Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 01:49
oh and I'm using Rail-RNA v0.2.4a if that helps
and get the same error (in elastic mode) with v0.2.4b
abhinav
@nellore
Aug 21 2017 21:33
hi @juliadiiulio_twitter! thanks for trying rail-rna. the lifecycle error is a known issue in rail-rna v0.2.4b; see nellore/rail#59. (we want to fix a few other issues too before our next release.) the current workaround is for you to use --intermediate-lifetime -1 in your rail-rna command to disable expiring intermediate data; after rail is done with a job flow, you can manually delete these from s3 using either the console or the aws cli
as for your local-mode error, my best guess is this has something to do with malformed input. from some previous experience helping another user, one possibility is you're trimming reads, and for some reads the trimmer eliminates the entire read. this could be way off though, and i'm happy to debug live with you in this chat room
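For the manual cleanup mentioned above, here is a minimal Python 3 sketch that only builds an `aws s3 rm` command and does not run it. The function name `cleanup_command` and the example URL are hypothetical. The intermediate location is assumed to be whatever prefix Rail-RNA reports for the run (in the error above it is a `.intermediate/` directory).

```python
def cleanup_command(intermediate_url, profile="default", dry_run=True):
    """Build (but do not run) an AWS CLI command that deletes a prefix.

    intermediate_url is assumed to be the S3 URL of the intermediate
    directory left behind by a finished job flow.
    """
    if not intermediate_url.startswith("s3://"):
        raise ValueError("expected an s3:// URL, got %r" % intermediate_url)
    command = ["aws", "--profile", profile,
               "s3", "rm", intermediate_url, "--recursive"]
    if dry_run:
        # --dryrun makes the CLI list what it would delete without deleting
        command.append("--dryrun")
    return command


# Hypothetical bucket and prefix, for illustration only
print(" ".join(cleanup_command("s3://my-bucket/output.intermediate/")))
```

Defaulting to a dry run is a deliberate choice here: the listing can be inspected first, and the command is rebuilt with `dry_run=False` only once it shows the expected objects.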
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 21:40
Thanks @nellore for your advice :)
I tried v0.2.4a in elastic mode and get the same error (both with and without --intermediate-lifetime -1). any other ideas? or should I try going back to a version earlier than v0.2.4a?
For the local mode, I will look into it (there is indeed some preprocessing on the fastq files that might have taken off the whole read in some instances)
Thanks a lot!
abhinav
@nellore
Aug 21 2017 21:42
gah
i see intermediate-lifetime = -1 is an issue too!
ok, good to know
can you try creating a new bucket in the AWS console
and adding one dud lifecycle rule?
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 21:45
Let me check with our devops team (but I think that's what they did). I'll come back to you as soon as I hear from them!
abhinav
@nellore
Aug 21 2017 21:45
great!
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 21:47
Okay, they confirmed that they did that already
abhinav
@nellore
Aug 21 2017 21:48
alright, can i just walk you through hacking rail so it doesn't do this?
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 21:48
sure! I'll do my best to follow:)
abhinav
@nellore
Aug 21 2017 21:51
edit the file /usr/local/raildotbio/rail-rna/dooplicity/ansibles.py so L356 reads if 'NoSuchLifecycleConfiguration' not in errors: rather than if 'NoSuchLifecycleConfiguration' in errors:
save and rerun, and hopefully that particular error will be gone
but then you'll still face the local-mode issue, which is potentially overtrimming
one way to handle that is that if a read is 100% trimmed, replace it with a single-character read sequence "N" with quality sequence "#"
on the other hand, you may find you don't really need to trim since rail-rna soft-clips alignments
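The placeholder idea above can be sketched in Python 3 as follows. This is an illustration, not part of Rail-RNA: the function name `pad_empty_reads` is hypothetical, and it assumes plain 4-line FASTQ records in which a fully trimmed read shows up as an empty sequence line (and empty quality line) rather than as missing lines.

```python
def pad_empty_reads(lines):
    """Yield FASTQ lines, replacing fully trimmed reads with N / #.

    Assumes each record is exactly four lines: header, sequence,
    separator, quality.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            if not seq:
                # Read was trimmed to nothing: use a one-base placeholder
                seq, qual = "N", "#"
            yield from (header, seq, plus, qual)
            record = []


fastq = ["@r1\n", "ACGT\n", "+\n", "IIII\n",
         "@r2\n", "\n", "+\n", "\n"]
for out_line in pad_empty_reads(fastq):
    print(out_line)
```

If the trimmer drops the sequence and quality lines entirely instead of leaving them empty, the 4-line assumption breaks and the records would need to be re-parsed by header instead.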
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 21:55

hmm looks like I still get the error

Parameter validation failed:
Missing required parameter in LifecycleConfiguration.Rules[0]: "Prefix"
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/raildotbio/rail-rna/main.py", line 975, in <module>
ec2_slave_security_group_id=args.ec2_slave_security_group_id
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 6435, in __init__
secure_stack_name=secure_stack_name)
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 2217, in __init__
days=intermediate_lifetime)
File "/usr/local/raildotbio/rail-rna/dooplicity/ansibles.py", line 386, in expire_prefix
' '.join(aws_command)
RuntimeError: Error encountered changing lifecycleparameters with command "aws --profile default s3api put-bucket-lifecycle --bucket rail-rna --lifecycle-configuration {"Rules":[{"Status": "Enabled", "ID": "something", "NoncurrentVersionExpiration": {"NoncurrentDays": 365}, "Expiration": {"ExpiredObjectDeleteMarker": true}, "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}}, {"Status": "Enabled", "Prefix": "jdiiulio/storage/analysis/RNAseq/HNX/GNE/ElasticMode/20170816.intermediate/", "Expiration": {"Days": -1}}]}".

abhinav
@nellore
Aug 21 2017 21:55
alright better idea
add a new line at the beginning of the function
right before L315
write return
i think AWS changed its API
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 21:58
just tried and still got the same error :(
abhinav
@nellore
Aug 21 2017 21:59
the error didn't change when you put return right before L315?
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:00

It looks the same to me but maybe I am missing something, here it is

Parameter validation failed:
Missing required parameter in LifecycleConfiguration.Rules[0]: "Prefix"
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/raildotbio/rail-rna/main.py", line 975, in <module>
ec2_slave_security_group_id=args.ec2_slave_security_group_id
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 6435, in __init__
secure_stack_name=secure_stack_name)
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 2217, in __init__
days=intermediate_lifetime)
File "/usr/local/raildotbio/rail-rna/dooplicity/ansibles.py", line 387, in expire_prefix
' '.join(aws_command)
RuntimeError: Error encountered changing lifecycleparameters with command "aws --profile default s3api put-bucket-lifecycle --bucket rail-rna --lifecycle-configuration {"Rules":[{"Status": "Enabled", "ID": "something", "NoncurrentVersionExpiration": {"NoncurrentDays": 365}, "Expiration": {"ExpiredObjectDeleteMarker": true}, "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}}, {"Status": "Enabled", "Prefix": "jdiiulio/storage/analysis/RNAseq/HNX/GNE/ElasticMode/20170816.intermediate/", "Expiration": {"Days": -1}}]}".

abhinav
@nellore
Aug 21 2017 22:01
oh
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:01
oh wait! the "return" was indented... my fault
abhinav
@nellore
Aug 21 2017 22:01
no no
forget return for now
remove --intermediate-lifetime -1
from the command
return is supposed to be indented, it's just that return is supposed to exit the function, so my guess is you didn't save
but don't worry about the return now
keep your last change
the addition of the not
and just remove the --intermediate-lifetime -1 from your rail-rna command
i think the trouble is it's now sending an invalid number of days as expiration
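The two complaints in play here (rule 0 has no "Prefix", and the Rail-RNA rule carries Days of -1) can be spotted before calling AWS with a small Python 3 check. This is a sketch under assumptions: `find_rule_problems` is a hypothetical helper, it only encodes the "Prefix" requirement reported by the validator in the log above plus the guess that expiration days must be positive, and it is not a full model of what S3 accepts.

```python
import json


def find_rule_problems(config_json):
    """Return (rule_index, message) pairs for rules likely to be rejected."""
    problems = []
    for i, rule in enumerate(json.loads(config_json)["Rules"]):
        if "Prefix" not in rule:
            # Matches the "Missing required parameter ... Prefix" message
            problems.append((i, 'missing "Prefix"'))
        days = rule.get("Expiration", {}).get("Days")
        if days is not None and days <= 0:
            # Assumed invalid: an expiration of zero or negative days
            problems.append((i, "non-positive expiration: %d days" % days))
    return problems


# Lifecycle configuration copied from the first error message above
config = (
    '{"Rules":[{"Status": "Enabled", "ID": "something", '
    '"NoncurrentVersionExpiration": {"NoncurrentDays": 365}, '
    '"Expiration": {"ExpiredObjectDeleteMarker": true}, '
    '"AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}}, '
    '{"Status": "Enabled", "Prefix": "ElasticMode/20170816.intermediate/", '
    '"Expiration": {"Days": -1}}]}'
)
for index, message in find_rule_problems(config):
    print("Rules[%d]: %s" % (index, message))
```

Note that rule 0 is the pre-existing rule on the bucket, not one Rail-RNA wrote, which is consistent with the "Prefix" error appearing regardless of the --intermediate-lifetime setting.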
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:06
wait it looks like without the --intermediate-lifetime -1 and keeping the "return" and the "not", it works !!!! youhouhouhouh well I got another error, but at least it started :)
abhinav
@nellore
Aug 21 2017 22:07
yay!
what's the other error?
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:07
NameError: global name 'Url' is not defined
but I might have messed up the manifest file... let me check
abhinav
@nellore
Aug 21 2017 22:07
wow
where are you getting that?
can you paste the full traceback?
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:08
Loading...
Checked all files listed in manifest file.
Copied Rail-RNA and bootstraps to S3.
Traceback (most recent call last):
File "/usr/lib64/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/raildotbio/rail-rna/main.py", line 975, in <module>
ec2_slave_security_group_id=args.ec2_slave_security_group_id
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 6477, in __init__
profile=base.profile))
File "/usr/local/raildotbio/rail-rna/rna/driver/rna_config.py", line 3938, in __init__
if not Url(assembly).is_s3:
NameError: global name 'Url' is not defined
abhinav
@nellore
Aug 21 2017 22:10
oh this is happening because you're not selecting one of hg19, hg38, mm9, mm10, dm3, dm6 as the assembly
can you paste me your rail-rna command
(it's also a bug in the code; i'm just making a commit to fix it now)
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:12
rail-rna go elastic -m s3://rail-rna-pdx/data/RNAseq/manifest_20170808_2samples_railrna.txt -a s3://rail-rna-pdx/Genome/Index/RailRNA/hg38_ERCC92.tgz -o s3://rail-rna-pdx/jdiiulio/storage/analysis/RNAseq/ElasticMode/20170816 --core-instance-type c3.2xlarge --master-instance-type c3.2xlarge -c 8 --region us-west-2 -d idx,tsv,bed,bam,bw,jx --verbose --action-on-failure "TERMINATE_CLUSTER" --master-instance-bid-price 0.15 --core-instance-bid-price 0.15
abhinav
@nellore
Aug 21 2017 22:13
ahhh okay you want to use your own assembly
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:13
ya :)
abhinav
@nellore
Aug 21 2017 22:13
ok /usr/local/raildotbio/rail-rna/rna/driver/rna_config.py, line 3939, change Url(assembly) to ab.Url(assembly)
sorry, there are so many combinations of parameters to test in this software
hard to be comprehensive
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:15
no worries at all
it's been running for 27 seconds!! It's a record :smile:
abhinav
@nellore
Aug 21 2017 22:17
yeah but it's gonna fail at align cuz of your local-mode error ;P
unless this is different data!
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:17
haha well it just failed :)
abhinav
@nellore
Aug 21 2017 22:17
where?
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:17

~.oOo.>

00h:00m:00s || Read job flow from input JSON.
00h:00m:15s || Verified that output directories on S3 are writable.
00h:00m:35s |___| Set up output directories on S3.
*Errors encountered*
Job flow failed on Monday, Aug 21, 2017 at 10:17:01 PM UTC. Run time was 35.845 seconds.
Traceback (most recent call last):
File "app_main.py", line 75, in run_toplevel
File "/usr/local/raildotbio/rail-rna/dooplicity/emr_runner.py", line 285, in <module>
args.aws_exe, args.profile, args.region)
File "/usr/local/raildotbio/rail-rna/dooplicity/emr_runner.py", line 226, in run_job_flow
job_flow_response = aws_ansible.post_request(full_payload)
File "/usr/local/raildotbio/rail-rna/dooplicity/ansibles.py", line 719, in post_request
service=self.service
File "/usr/local/raildotbio/rail-rna/dooplicity/ansibles.py", line 449, in aws_signature
('AWS4' + secret_key).encode('utf-8'),
TypeError: unsupported operand type(s) for +: 'str' and 'NoneType'

abhinav
@nellore
Aug 21 2017 22:18
whoa
ok
well
that's because
well, why is that
hm
can you less ~/.aws/credentials and tell me if your access key ID and secret access key are present there
don't paste them here!
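The same check can be done without displaying the keys at all. Below is a minimal Python 3 sketch, assuming the standard INI layout of ~/.aws/credentials and the profile name `default` seen in the commands above; `profile_has_keys` is a hypothetical helper name. It relies on `configparser` ignoring lines that start with `#` or `;`, so commented-out keys count as absent.

```python
import configparser


def profile_has_keys(credentials_text, profile="default"):
    """Report whether a profile defines both keys, without revealing them."""
    # interpolation=None so a '%' in a value is never treated specially
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return False
    return all(
        parser.get(profile, key, fallback="").strip()
        for key in ("aws_access_key_id", "aws_secret_access_key")
    )


# Placeholder values only; real keys should never be pasted anywhere
commented = (
    "[default]\n"
    "#aws_access_key_id = AKIAEXAMPLE\n"
    "#aws_secret_access_key = placeholder\n"
)
print(profile_has_keys(commented))
```

To run it against the real file, the text could be read with `pathlib.Path.home().joinpath(".aws", "credentials").read_text()` and passed in; only True or False is printed.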
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:22
they are commented out... it had something to do with the IAM role... not sure why, but devops asked me to comment them out. I'll try to uncomment them!
abhinav
@nellore
Aug 21 2017 22:23
cool
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:24
Success!!! looks like it's working !!!
Thanks so much
abhinav
@nellore
Aug 21 2017 22:24
sure, but we'll see :P
another thing
oh, nm
yeah, mention @nellore if something else comes up
Julia di Iulio
@juliadiiulio_twitter
Aug 21 2017 22:27
:thumbsup: