These are chat archives for GMOD/Apollo

5th
Apr 2018
Yating Liu
@Yating-L
Apr 05 2018 15:00
@nathandunn I removed the gene lines from the GFF3 and generated a GFF3Tabix track. But a few CDS features do not show. Is this a known issue, or has it already been solved?
Screen Shot 2018-04-05 at 9.57.24 AM.png
Nathan Dunn
@nathandunn
Apr 05 2018 15:02
@Yating-L I’m not sure. If the CDS parent is the exon or the gene instead of the mRNA, that could be a problem. Mostly it depends on how it’s created. Can you send your samples before and after?
I’m assuming this is just for the HTML version?
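One way to check the parent relationship described above is to scan the GFF3 and report which feature type each CDS names as its Parent. This is a minimal sketch in Python, assuming a standard nine-column GFF3 with `ID` and `Parent` attributes. The function names and the sample records are hypothetical and are not taken from the files discussed here.

```python
from collections import Counter


def parse_attributes(column9):
    """Turn 'ID=x;Parent=y' into a dict (no URL-decoding; sketch only)."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def cds_parent_types(gff3_lines):
    """Count the feature types that CDS records name as their Parent.

    A parent ID that is not defined anywhere in the input is counted
    under the label 'MISSING'.
    """
    records = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip malformed lines rather than guess
        records.append((cols[2], parse_attributes(cols[8])))

    # First pass: map every feature ID to its type.
    type_by_id = {attrs["ID"]: ftype for ftype, attrs in records if "ID" in attrs}

    # Second pass: look up the type of each CDS parent.
    counts = Counter()
    for ftype, attrs in records:
        if ftype != "CDS":
            continue
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[type_by_id.get(parent, "MISSING")] += 1
    return counts


# Hypothetical sample: one CDS parented by an mRNA, one by a gene.
sample = [
    "##gff-version 3",
    "ctg1\tAUGUSTUS\tgene\t1\t900\t.\t+\t.\tID=g1",
    "ctg1\tAUGUSTUS\tmRNA\t1\t900\t.\t+\t.\tID=g1.t1;Parent=g1",
    "ctg1\tAUGUSTUS\tCDS\t1\t300\t.\t+\t0\tID=g1.t1.cds1;Parent=g1.t1",
    "ctg1\tAUGUSTUS\tCDS\t600\t900\t.\t+\t0\tID=g1.t1.cds2;Parent=g1",
]
print(dict(cds_parent_types(sample)))
```

Any count under `gene`, `exon`, or `MISSING` points at CDS records worth inspecting before and after the filtering step.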
Yating Liu
@Yating-L
Apr 05 2018 15:02
HTMLFeatures, yes
@nathandunn here are the gene prediction output files, before and after filtering gene lines
Nathan Dunn
@nathandunn
Apr 05 2018 16:37
@Yating-L there are no exons (which is fine). Are you saying that it is not showing some of the specified CDSs?
Robert Buels
@rbuels
Apr 05 2018 16:38
@nathandunn is this the use case you're working on in your gff3tabix pull req so that it works out of the box?
Nathan Dunn
@nathandunn
Apr 05 2018 16:38
@rbuels I think so
Yating Liu
@Yating-L
Apr 05 2018 16:38
@nathandunn yes, some CDSs are not showing
Nathan Dunn
@nathandunn
Apr 05 2018 16:39
@Yating-L this is with the HTMLFeatures?
I’m guessing it is
so, the answer to your question is yes @rbuels
Yating Liu
@Yating-L
Apr 05 2018 16:45
@nathandunn I think so. I tested on two GFF3. Canvas renders correctly
but HTML is missing some subfeatures
Robert Buels
@rbuels
Apr 05 2018 16:46
which data store were both tests using? GFF3Tabix?
Yating Liu
@Yating-L
Apr 05 2018 16:46
"JBrowse/Store/SeqFeature/GFF3Tabix"
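For reference, a track entry using this store might look like the sketch below. The `storeClass` value is the one quoted above, and the two `type` values are JBrowse's HTMLFeatures and CanvasFeatures track classes. The labels and the `urlTemplate` file name are hypothetical placeholders. The entries are built as Python dicts and dumped to JSON so the two variants can be compared side by side.

```python
import json


def gff3tabix_track(label, url_template, track_type):
    """Build a JBrowse 1 track entry backed by the GFF3Tabix store."""
    return {
        "label": label,
        "storeClass": "JBrowse/Store/SeqFeature/GFF3Tabix",
        # Bgzipped, sorted GFF3; by default the .tbi index is looked up next to it.
        "urlTemplate": url_template,
        "type": track_type,
    }


# Same store and same file; only the rendering track type differs.
html_track = gff3tabix_track(
    "augustus_html",            # hypothetical label
    "augustus.sorted.gff3.gz",  # hypothetical file name
    "JBrowse/View/Track/HTMLFeatures",
)
canvas_track = gff3tabix_track(
    "augustus_canvas",
    "augustus.sorted.gff3.gz",
    "JBrowse/View/Track/CanvasFeatures",
)

print(json.dumps({"tracks": [html_track, canvas_track]}, indent=2))
```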
Nathan Dunn
@nathandunn
Apr 05 2018 16:50
yeah, they both use the same store. The difference is that when Canvas renders, it just renders one feature on top of another and it works. With HTMLFeatures, you can’t do that with divs, so one blocks the other out
Yating Liu
@Yating-L
Apr 05 2018 16:52
is this solvable?
Robert Buels
@rbuels
Apr 05 2018 16:55
is there a reason that you guys can't just use canvasfeatures? i've been hoping since 2013 to deprecate htmlfeatures
is it cause of apollo needing to drag the features around
Nathan Dunn
@nathandunn
Apr 05 2018 16:57
@rbuels Part of it is dragging the features, an important feature of Apollo. The other reason is highlighting edges.
Nathan Dunn
@nathandunn
Apr 05 2018 17:20
I guess the two questions are: @Yating-L can you use the JSON version? and @rbuels how much work would it be to fix the HTMLFeatures?
Robert Buels
@rbuels
Apr 05 2018 17:20
i'm going to work on fixing the htmlfeatures, cause it is already on the 1.14.0 milestone
Nathan Dunn
@nathandunn
Apr 05 2018 17:21
A couple of reasons to fix it as well: (1) the more users we have using GFF3Tabix, the fewer users we’ll have to transition with the newer JBrowse, and (2) handling the JBrowse JSON files has been problematic because of the large number of files versus tabix
@rbuels . . .. looking for proper emojis
:fire: :clap: :sparkles:
:rocket:
@rbuels Do you have Colin’s and my half-working attempts?
not sure if they would be helpful or not
Yating Liu
@Yating-L
Apr 05 2018 17:37
@nathandunn we want to use tabix because we want to upload jbrowse to CyVerse Data Store. JSON version generates too many files, which takes too long to upload
Nathan Dunn
@nathandunn
Apr 05 2018 17:39
thanks @Yating-L . . @rbuels , who is much more capable than I, is going to work on this. Thank you very much @rbuels
Yating Liu
@Yating-L
Apr 05 2018 17:39
And HTMLFeatures is useful for our team because they want the dragging features and highlight edge features
Robert Buels
@rbuels
Apr 05 2018 17:39
@nathandunn do you have everything in your pull requests?
if not, just tell me what other branches there are and I'll take care of weaving them all together
@nathandunn the only one i see so far is GMOD/jbrowse#996
@Yating-L could you send me the reference sequences that go with those augustus gff3s? i will use those as test data to make sure they display well
using htmlfeatures
or link them here
Yating Liu
@Yating-L
Apr 05 2018 17:48
Let me make a folder
@rbuels
Thanks a lot!
Scott Cain
@scottcain
Apr 05 2018 18:01
I get the appeal of having a single tabix file for a track, but it isn’t clear to me why having hundreds of json files is a problem. Just run rsync (or the equiv for the cloud platform of choice) and all of the files go in place. I do this regularly with JBrowse tracks that are run off of Amazon’s S3 service. It takes a few minutes to do the upload (since it’s transferring lots of small files) but once it’s done I never have to think about it again.
Yating Liu
@Yating-L
Apr 05 2018 18:03
do you mean download from S3?
how long does it usually take to run rsync? It took me days… @scottcain
for large files, it just took me a blink, but for hundreds of small files, very long
Nathan Dunn
@nathandunn
Apr 05 2018 18:11
I’ve talked to other groups who’ve had the same problem as @Yating-L. The problem is worse for the sequence files, but backups, data creation, etc. become problematic with lots of data.
@rbuels you can also reproduce the problem using the Volvox GFF3Tabix I added for canvas
Scott Cain
@scottcain
Apr 05 2018 18:15
No, I mean uploading to S3; I run flatfile-to-json.pl on a local machine and then aws s3 cp --recursive to move the files to my S3 bucket. I do this for gene data for several species at once (yeast, worm, fly, zebrafish, mouse, rat and human) and it takes under an hour. To be fair, copy is a lot faster than rsync, and I only do the sequence data once even though I update feature data often. In this context, removing old remote data and copying the new local data up to the cloud makes sense and rsync isn’t really needed. Of course, I also compress all of my tracks :-)
Yating Liu
@Yating-L
Apr 05 2018 19:59
@scottcain Things are different when you are uploading to CyVerse. Besides transferring the files, each file needs to be registered in the database. So more files, more database updates… That’s my understanding. So with numerous files it takes much longer than on a normal platform
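The effect described here can be illustrated with a rough cost model: total time is the raw transfer time plus a fixed per-file cost (for example, one database registration per file). This is a back-of-the-envelope sketch. Every number in it is a made-up assumption, not a measurement of CyVerse or any other platform.

```python
def estimated_upload_seconds(n_files, total_bytes, bytes_per_second, per_file_overhead_s):
    """Transfer time plus a fixed overhead paid once per file."""
    return total_bytes / bytes_per_second + n_files * per_file_overhead_s


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical inputs: same total data volume, very different file counts.
total_bytes = 2 * GIB
bandwidth = 50 * MIB       # assumed sustained throughput in bytes/second
overhead = 0.5             # assumed per-file registration cost in seconds

tabix_style = estimated_upload_seconds(2, total_bytes, bandwidth, overhead)
json_style = estimated_upload_seconds(200_000, total_bytes, bandwidth, overhead)

print(f"2 files:       {tabix_style:.0f} s")
print(f"200,000 files: {json_style / 3600:.1f} h")
```

Under these assumptions the per-file term dominates as soon as the file count is large, which is consistent with large files uploading quickly while many small files take far longer.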
Scott Cain
@scottcain
Apr 05 2018 20:00
@Yating-L yuck. I guess you get what you pay for :-/
scottcain @scottcain congratulates himself for talking about using CyVerse but never getting around to it.
Yating Liu
@Yating-L
Apr 05 2018 20:02
It is free, so great place for long-term storage
I don’t know if there’re better options
Scott Cain
@scottcain
Apr 05 2018 20:04
Probably not for free—though you might look into Google. I know somebody at Google Cloud who is there to promote it for use with genomics data and probably has free stuff to give out.
Nathan Dunn
@nathandunn
Apr 05 2018 20:53
@scottcain I think there are also grant obligations to use the CyVerse platform. This is true for a number of users within GMOD. I’m not sure if this is because the idea is that each object can have an individual ID.
Julie McMurry
@jmcmurry
Apr 05 2018 22:18
I'm going to opt out of this room, but please mention me if you want me to weigh back in.
Nathan Dunn
@nathandunn
Apr 05 2018 23:36
will do @jmcmurry ;)