    macca1996-bit
    @macca1996-bit
    @rbharath you're right...as my final goodbye, your point about the random forest is a good one, especially for benchmarking. Is it as simple as loading/featurizing/splitting my data as i usually would, but then passing this featurized training set into a random forest?
    I really appreciate all your help. Had a lot of fun working through the book just trying to apply it to some new datasets
    Bharath Ramsundar
    @rbharath
    Yep, that's right, you should be able to just pass this featurized training set into a random forest
    Also, sorry, didn't mean to imply it's wrong to ask questions here!
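    A minimal sketch of the random forest baseline confirmed above, assuming scikit-learn is installed. The random bit vectors and the label rule are hypothetical stand-ins for real featurized data; with a DeepChem dataset, the feature and label arrays would come from `train_dataset.X` and `train_dataset.y` (selecting a single task column if `y` is two-dimensional).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Stand-in for featurized molecules: 200 rows of 1024 fingerprint bits.
# (Hypothetical data; replace with train_dataset.X / valid_dataset.X.)
X = rng.integers(0, 2, size=(200, 1024))
# Hypothetical binary label that depends on the first 8 bits,
# so the model has some signal to learn.
y = (X[:, :8].sum(axis=1) > 4).astype(int)

# Simple fixed split; a DeepChem splitter would normally do this step.
X_train, X_valid = X[:160], X[160:]
y_train, y_valid = y[:160], y[160:]

# Fit the baseline and score it on the held-out rows.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
valid_probs = model.predict_proba(X_valid)[:, 1]
valid_auc = roc_auc_score(y_valid, valid_probs)
print(f"validation ROC-AUC: {valid_auc:.3f}")
```

    The score from a baseline like this gives a reference point for judging whether a graph convolutional model is actually adding anything on the same split.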
    macca1996-bit
    @macca1996-bit
    Nah you didn't, all good
    Now i'm getting an error which i'd not seen up until this point..not sure why. "Entity <bound method GraphConv.call of <deepchem.models.layers.GraphConv object at 0x7f2cd8030550>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: module 'gast' has no attribute 'Num'"
    I'll make a forum post
    Bharath Ramsundar
    @rbharath
    Nah, fine to ask here
    I'm fairly sure that's just a TensorFlow complaint
    The latest released DeepChem version, 2.3.0, is on TF 1.14
    I've seen similar warnings reported before, and I think they're safe to ignore
    macca1996-bit
    @macca1996-bit
    I've installed gast 0.2.2. This is not the only complaint. There's ~16 of them
    Alright if i can ignore it then I'm happy :)
    Bharath Ramsundar
    @rbharath
    I'm fairly sure these are the TF 1.14 warnings about upgrades needed... If the model runs and gives realistic accuracies you're probably good!
    macca1996-bit
    @macca1996-bit
    cheers
    rfhari
    @rfhari
    Thanks, @rbharath! One more question:
    In tox21_graphcnn.py, when we load tox21_datasets, does this include both Morgan fingerprints and graph attributes, or just one of those? I'm confused because, when I printed the shape of tox21_dataset after using DeepChem's data loader, I found the shape to be (6264, 1024). So are these 1024 features being concatenated with the graph attributes under the hood in tox21_graphcnn.py?
    Bharath Ramsundar
    @rbharath
    To clarify my earlier comment: for code-related bugs, the GitHub issue tracker is a good place to post bug reports: https://github.com/deepchem/deepchem/issues. The forums are good for conceptual questions/issues: https://forum.deepchem.io/. And here's good for quicker one-off things :)
    rfhari
    @rfhari
    Thanks! @rbharath . From next time, I'll post it on the forum. Sorry
    Bharath Ramsundar
    @rbharath
    @rfhari No worries! The file you're looking at is using the adjacency fingerprints dc.feat.AdjacencyFingerprint. This is one of the more obscure featurizers. It's a little confusing since there are many graph convolutional variants.
    The examples/ folder has a lot of old stuff in there. I'm going to do a cleanup pass soon, but in the meantime, I recommend taking a look at https://github.com/deepchem/deepchem/blob/master/examples/tutorials/04_Introduction_to_Graph_Convolutions.ipynb
    This tutorial is cleaned up and maintained, and it's the best-documented graph conv example
    tox21_graphcnn.py uses the PetroskiSuchModel, which is a variant graph convolution and not the best-tested one
    rfhari
    @rfhari
    Thanks! @rbharath
    test-user-18
    @test-user-18
    @rbharath Thank you for your quick response! I appreciate your help. You are awesome!
    Bharath Ramsundar
    @rbharath
    Glad to be helpful :)
    macca1996-bit
    @macca1996-bit
    I made an issue thread, deepchem/deepchem#1788, regarding a problem I'm having with RFC
    Bharath Ramsundar
    @rbharath
    @macca1996-bit Just responded!
    Vishesh Mangla
    @XtremeGood
    Hey, is DeepChem participating in GSoC?
    Bharath Ramsundar
    @rbharath
    Vishesh Mangla
    @XtremeGood
    is there still time to submit a proposal?
    Bharath Ramsundar
    @rbharath
    It's cutting a little close to the deadline (March 31st), but it might still be possible if you have relevant experience in the field! If you ping me on here, I'm glad to help review any proposals
    Vishesh Mangla
    @XtremeGood
    Yep. I'm an MSc Chemistry student and know TensorFlow 2.0 and data structures and algorithms.
    This is the last semester of my MSc Chem degree.
    Bharath Ramsundar
    @rbharath
    https://wiki.openchemistry.org/GSoC_Ideas_2020
    ^ Cool, I'd recommend taking a look at the ideas page
    Vishesh Mangla
    @XtremeGood
    thanks!
    Bharath Ramsundar
    @rbharath
    Our two suggested projects for the year are exploring the use of Jax for molecular machine learning and improving transfer learning support.
    I'd encourage checking out the forums too for ideas https://forum.deepchem.io/
    Vishesh Mangla
    @XtremeGood
    Hi @rbharath. I checked the following project: "Develop a validation and standardization filter". Can I get more info on this?
    Does this project require reading a molecule as a string and counting and labeling the stereocenters? If so, it seems to be more about checking whether the string is palindromic than about data science. Is the objective something else?
    Vishesh Mangla
    @XtremeGood
    Checking whether a string is palindromic is part of Design and Analysis of Algorithms; the relevant technique is known as Manacher's algorithm.
    Vishesh Mangla
    @XtremeGood
    And the two projects you are suggesting are already being worked on by other people.
    mukeshb23
    @mukeshb23
    Dear sir, I have a dataset of 76 cardiovascular drugs with experimental solubility values that I prepared myself. When I load this dataset into the DeepChem modeling solubility tutorial, I find only 8 valid datapoints after validation, and I also get a negative R^2 value. Is it possible to get a positive R^2 value and good results with 77 datapoints? Please help me with this.
    Bharath Ramsundar
    @rbharath
    @XtremeGood This sounds like a project from a different OpenChemistry team. The ones I'm personally mentoring are in the DeepChem project track (the Jax and transfer learning projects). The application process is competitive, so it's normal to have multiple applications for the same project. If you'd like feedback on your ideas for the DeepChem project track, feel free to ping me anytime (you can DM me on here)
    @mukeshb23 76 datapoints is a very small dataset. The machine learning tools we have will likely struggle with a dataset this small. I'd recommend trying to build a simple random forest model to start and going from there, but it might be hard. Maybe see if there's a way to get more datapoints to improve learning
    mukeshb23
    @mukeshb23
    @rbharath Dear sir, thank you so much for clarifying. I have one more doubt regarding this. When I mixed these 77 cardiovascular drugs with another large dataset and reran the modeling solubility tutorial, I got a positive R^2 value and also more valid data after validation. Sir, is it acceptable to do this?
    Bharath Ramsundar
    @rbharath
    It's not unreasonable at all, but the main thing to do is to make sure your validation R^2 is measured on compounds from your original dataset (it's not useful if your numbers are for molecules you don't care about)
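    A small sketch of that check in plain Python, assuming each validation compound carries a flag saying whether it came from the original (target) dataset. The record layout, compound names, and numbers are all hypothetical. R^2 is computed as 1 - SS_res / SS_tot, which also shows how it can go negative: that happens whenever the predictions fit worse than simply predicting the mean.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical validation records:
# (compound id, is_target_drug, measured solubility, predicted solubility)
records = [
    ("cardio_1", True, -2.0, -2.5),
    ("cardio_2", True, -3.0, -3.0),
    ("cardio_3", True, -4.0, -3.5),
    ("other_1", False, -1.0, -1.0),
    ("other_2", False, -6.0, -6.0),
]

# R^2 over every validation compound, including the added dataset.
r2_all = r_squared([r[2] for r in records], [r[3] for r in records])

# R^2 restricted to the compounds from the original dataset.
target = [r for r in records if r[1]]
r2_target = r_squared([r[2] for r in target], [r[3] for r in target])

# Predictions that are worse than the mean give a negative R^2.
r2_bad = r_squared([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])

print(f"all compounds: {r2_all:.3f}")
print(f"target drugs only: {r2_target:.3f}")
print(f"worse than the mean: {r2_bad:.1f}")
```

    In this toy data the pooled score is higher than the target-only score, which is why the number to report is the one restricted to the compounds of interest.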
    Vishesh Mangla
    @XtremeGood
    @rbharath I'm not sure why you are using Jax. I think the new TensorFlow 2.0 is complete in itself, with even more support available if we run into errors. It is a totally different architecture from 1.0, which the TensorFlow community also believes was a problem.
    PyTorch and TensorFlow are both good.
    mukeshb23
    @mukeshb23
    @rbharath Dear sir, I am getting approx. 110 molecules after validation with this dataset. How do I measure the R^2 value only for the target drugs? Is any specific code required to separate out these target drugs?
    macca1996-bit
    @macca1996-bit