Luke Czapla
@lukeczapla
it may be that I just have to switch to a different method of extracting sequences for this case, unless there is a package already that can construct the sequence from data missing SEQRES
Luke Czapla
@lukeczapla
it looks like getAtomSequence worked! Ok so I should be good to go
it is great, I will have to update my PR because the AtomSequence and AtomGroups work so much better!
Luke Czapla
@lukeczapla
It all works great. The new TertiaryBasePairParameters class is able to detect an unusual G-A wobble type of RNA base pair, and it gave the same values as the external program 3DNA
Aleix Lafita
@lafita
Great! Yes, in BioJava we use SEQRES for the protein sequence (experimental construct) and we align them to the ATOM records in the structure to handle the missing ones
About the different PDB formats, we try to handle some of the non-standard PDB formats, and throw warnings if there are formatting problems in the file
There is an issue about this with a long discussion
Aleix Lafita
@lafita
But if you observe any problems, you can submit an issue with the file
Luke Czapla
@lukeczapla
Sure, I haven't had a problem at all (and those warnings didn't affect anything). It seems like it can read anything that's mostly formatted correctly with just ATOM records. I have some duplicate chains from building these models that are like 105 base pairs of DNA with 3 identical proteins bound to it (just different x,y,z coordinates), and it is still fine with separating it off into the right # of chains (eight total chains: 2 DNA strands, and 3 copies of proteins that are dimers). The AtomSequence and AtomGroups do a great job.
Luke Czapla
@lukeczapla
I just made a toString() method to conveniently output all the parameters coming out of the BasePairParameters class, and moved the analysis out of the constructor into an analyze() method that returns the same object for convenience. So something like this would work to print out all the data: System.out.println(new BasePairParameters(structure).analyze());
I have a friend in Stockholm who's very interested in trying it out, maybe you guys could review the latest version and let me know what else I should do to it before it can get merged. The TestBasePairParameters.java tests all three implementations
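A minimal sketch of the pattern described above, where analyze() returns the same object so the call can be chained straight into a print. FluentAnalysis and its fields are hypothetical stand-ins for illustration only; this is not the BasePairParameters code from the PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in class: the work happens in analyze(), not in the constructor.
class FluentAnalysis {
    private final List<Double> input;
    private double mean;
    private boolean analyzed;

    FluentAnalysis(List<Double> input) {
        // The constructor only stores a copy of the input; no computation here.
        this.input = new ArrayList<>(input);
    }

    // Runs the computation and returns this, so calls can be chained.
    FluentAnalysis analyze() {
        double sum = 0.0;
        for (double v : input) {
            sum += v;
        }
        mean = input.isEmpty() ? 0.0 : sum / input.size();
        analyzed = true;
        return this;
    }

    // Summarizes the computed values, or reports that analyze() was never called.
    @Override
    public String toString() {
        return analyzed ? "n=" + input.size() + " mean=" + mean : "not analyzed";
    }
}
```

With this shape, System.out.println(new FluentAnalysis(values).analyze()) prints the summary in one statement, and a caller that forgets analyze() gets a visible "not analyzed" string instead of stale or zeroed values.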
Luke Czapla
@lukeczapla
btw I just migrated to log4j2 and it fixed my issues. I couldn't get it to work with logback-classic, but I got rid of it and I put a log4j2.xml in my resource path and it's basically functional again
no changes really needed otherwise since it's all org.slf4j
Aleix Lafita
@lafita
Thank you for all the great work @lukeczapla! The PR looks really nice, I have just made one more change request and I think it can be merged into master, if everyone else agrees with it.
Luke Czapla
@lukeczapla
ok I will take a look, thank you!
Luke Czapla
@lukeczapla
Ok guys, I addressed both reviewers' comments and switched to using List<Pair<Group>> for the pairs. I added the author annotation and the getLength() method with JavaDoc, and had it throw IllegalArgumentException for out-of-bounds values
the tests all pass and the code on my end still works without modification because I didn't call findPairs() directly
Luke Czapla
@lukeczapla
my web server has an additional 4 tests
@sbliven the best fix for Spring was to just adopt Log4j2 like BioJava uses. I'm not sure why it was stuck in the classpath but I asked the author of Spring. I got Log4j2 basically acting exactly the same way as Logback-classic was so it's all good
Luke Czapla
@lukeczapla
hey guys. I tried to update my PR. been really busy because the school I work at wants me to do some SQL work for them, and I'm maintaining a DNA/RNA server at Rutgers. is it so wrong to use a match(char a, char b, boolean RNA) function instead of a NucleotideCompound? because it seemed a lot easier to me and I spent hours writing all of this
the algorithm involves looking letter by letter and the functionality I was pointed to seemed to be more for direct and fuzzy matches of long sequences for genomics research
it's a lot different than simple structural biology characterization of helical regions by geometry
Luke Czapla
@lukeczapla
it's odd because even some DNA sequences have T-T base pairs and weird things like that, or a base pair overhanging off the chain. if there's some version of boolean match(char a, char b, boolean RNA) that biojava has and it's just a matter of switching to that, I'm ok with that.
(base rather than base pair overhanging). that's why my tests included RCSB Ids like 1P71 and 3PHP, they're really unusual. my internal tests are on weird PDB files that come out of some code that discretizes continuum mechanics and finds equilibrium structures of DNA, and from my Monte Carlo simulation code that complements that technique by statistical treatment of fluctuations.
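For readers following the question, here is a hypothetical sketch of what a character-based match(char a, char b, boolean RNA) helper could look like. It is not the code from the PR and not an existing BioJava method. It assumes only canonical Watson-Crick pairs count as a match (A-T and G-C for DNA, A-U and G-C for RNA), so unusual pairs such as the T-T case mentioned above would not match and would need a geometric check instead.

```java
// Hypothetical helper: canonical Watson-Crick matching on one-letter base codes.
class BasePairMatch {

    // Returns true if a and b form a canonical pair; rna selects U instead of T as the partner of A.
    static boolean match(char a, char b, boolean rna) {
        char x = Character.toUpperCase(a);
        char y = Character.toUpperCase(b);
        char partnerOfA = rna ? 'U' : 'T';
        return isPair(x, y, 'G', 'C') || isPair(x, y, 'A', partnerOfA);
    }

    // Order-independent comparison of (x, y) against the pair (p, q).
    private static boolean isPair(char x, char y, char p, char q) {
        return (x == p && y == q) || (x == q && y == p);
    }
}
```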
Luke Czapla
@lukeczapla
I'd port the whole thing to BioJava but I need Nd4j for CUDA support, it's a good package. so what do I still have to do to pass the review for the variations of the BasePairParameters code? @heuermh @lafita @josemduarte
Luke Czapla
@lukeczapla
hi guys. I am having an issue with the -DPDB_CACHE_DIR=/home/czapla/pdb -DPDB_DIR=/home/czapla/pdb options: nothing is being saved on Ubuntu
it seems to ignore the specification of that folder
it works fine on my Mac though, but I deployed it on an Ubuntu machine at Rutgers and I don't see anything being saved in the specified folders
I checked the command line and process list and it's the same Oracle Java 8, jdk1.8.0_144, and launched the same way: mvn spring-boot:run -DPDB_CACHE_DIR=/home/czapla/pdb -DPDB_DIR=/home/czapla/pdb
Luke Czapla
@lukeczapla
ok please ignore me because it is now working :) I don't know what was up before but maybe it was because I had specified ~/pdb instead of the full path
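If the ~/pdb guess is right, one likely reason is that shells typically expand ~ only at the start of a word, so in an argument like -DPDB_DIR=~/pdb the tilde can reach the JVM as a literal character. The sketch below shows one way an application could defend against that by expanding a leading ~ itself. PdbDirResolver and expandHome are hypothetical names, and nothing here says how BioJava itself reads these properties.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolve a directory property that may start with a literal "~".
class PdbDirResolver {

    // Expands a leading "~" or "~/" to the given home directory; other values pass through unchanged.
    static Path expandHome(String raw, String home) {
        if (raw.equals("~")) {
            return Paths.get(home);
        }
        if (raw.startsWith("~/")) {
            return Paths.get(home, raw.substring(2));
        }
        return Paths.get(raw);
    }

    public static void main(String[] args) {
        // PDB_DIR is the property name from the chat; "~/pdb" is only a demo default.
        String raw = System.getProperty("PDB_DIR", "~/pdb");
        Path dir = expandHome(raw, System.getProperty("user.home"));
        System.out.println(dir.toAbsolutePath());
    }
}
```

Passing the full path on the command line, as in the mvn invocation above, avoids the problem without any code change.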
Luke Czapla
@lukeczapla
I posted an issue though about the Structure serialization. it's having some trouble and I am going to try to trace what's going on.
my class serializes all the internal data and I had originally marked the Structure object as transient. So after undoing that, I tried serializing and deserializing, and although it no longer throws an error, it comes back with "no data" as if the Structure object is missing something.
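The two symptoms described here can be reproduced with plain Java serialization, independent of BioJava. In the sketch below all class names are hypothetical stand-ins: a transient field comes back as null after a round trip (the "no data" case), while a non-transient field whose type is not Serializable makes writeObject fail with NotSerializableException. It illustrates the general mechanism only and does not identify which class is responsible in the reported issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Hypothetical payload type that does NOT implement Serializable.
    static class Payload {
        String name = "demo";
    }

    // The transient field is skipped on write, so it is null after reading back.
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        double[] values = {1.0, 2.0};
        transient Payload payload = new Payload();
    }

    // The non-transient, non-serializable field makes writeObject throw.
    static class StrictHolder implements Serializable {
        private static final long serialVersionUID = 1L;
        Payload payload = new Payload();
    }

    // Serializes to memory and reads the object back; wraps checked exceptions.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns false if writing the object hits a NotSerializableException.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A small round-trip check like this, pointed at the real object instead of the stand-ins, is a quick way to tell the two failure modes apart before tracing deeper.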
Luke Czapla
@lukeczapla
it is actually still throwing an error. I forgot to recompile but I went and recloned the original repository, added my folder with the BasePairParameters, and tried a test. it's posted in the issue, I'll put up a gist with the test
the test is just testing the Structure object, nothing with my code because it's separate from my package and I can't serialize Structures :(
Maximilian Greil
@MaxGreil
Hello everyone, my name is Maximilian Greil. I am interested in contributing to biojava and already had a look into the open issues. Maybe someone here can give me a suggestion on which issue may be suitable for beginning?
Jose Manuel Duarte
@josemduarte
Welcome @MaxGreil !
There's plenty of open issues: https://github.com/biojava/biojava/issues
This one would be really nice to resolve, for instance: biojava/biojava#574
Jose Manuel Duarte
@josemduarte
Another project that recently came up is extension of dssp to promotif, I've just created an issue for it: biojava/biojava#764
And related to secondary structure assignment we also have biojava/biojava#454
Jose Manuel Duarte
@josemduarte
If you like alignments or want to learn more about the basics of alignments, this one can be interesting: biojava/biojava#243
Anyway, have a look at all issues with the "help wanted" tag: https://github.com/biojava/biojava/labels/help%20wanted
Maximilian Greil
@MaxGreil
@josemduarte thank you very much for your response and your suggestions. I will have a look at issue #764.
Jose Manuel Duarte
@josemduarte
Thanks @MaxGreil . Good luck with it, please do ask for help if you get stuck
Maximilian Greil
@MaxGreil
On second thought, I think I'll try issue #574 first and then #764, if that is ok.
Ankith
@AnkithO-0
For the protein sequence alignment example provided, the imports are not listed. Since I took the JAR from Maven, I have to look into each subpackage to find where a class might exist. Can any of you suggest an easier solution... maybe a place where I can find out what to import for each type of object?
Ankith
@AnkithO-0
Never mind... I didn't know you guys had a docs page too
Maybe add a link to it on the webpage?
Aleix Lafita
@lafita
You can also check the tutorial github.com/biojava/biojava-tutorial
Jose Manuel Duarte
@josemduarte
If you use an IDE (e.g. Eclipse, IntelliJ) it is quite easy to find the imports by just doing CTRL+space on the class name in code. In any case we'll try to add more imports to the tutorial. Another good place to find fully working examples is the demo packages in each of the biojava modules.