    ^ This file shows how the .mat files can be loaded into deepchem datasets
    rfhari
    @rfhari
    Yeah, Thanks a lot, this works!
    Can you please clarify: what exactly does the label "u0_atom" refer to?
    Bharath Ramsundar
    @rbharath
    I don't recall off the top of my head, but it might be atomization energy I think
    rfhari
    @rfhari
    ohh okay. Thanks a lot for the guidance!
    Bharath Ramsundar
    @rbharath
    As a reminder folks, GSoC applications are due tomorrow! Please make sure to submit if you plan on applying
    macca1996-bit
    @macca1996-bit
    Hey guys, hope everyone is staying healthy. Quick question: Is it possible to convert a graph conv vector representation of a molecule back into a SMILES string?
    Bharath Ramsundar
    @rbharath
    @macca1996-bit Good question! By graph conv vector representation, do you mean the extracted neural fingerprint? There's unfortunately not a great way to translate back to the original (the transformation wasn't designed to be invertible)
    What's your application? There might be a workaround possible for what you're looking to do
    rfhari
    @rfhari
    Hi @rbharath, I'm working on the QM7 dataset. I got the SMILES from the following link: https://github.com/deepchem/deepchem/blob/master/deepchem/molnet/load_function/qm7_datasets.py
    But it seems to contain only one label, so how is the "multitask" training mentioned in the MoleculeNet paper performed on it? Can you please explain?
    Bharath Ramsundar
    @rbharath
    @rfhari I believe that the multitask was done by combining data from many different datasets together
    I think there might have been one massive multitask model that combined all (or a large subset) of moleculenet
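One common way to combine several single-label datasets into a multitask table is sketched below. This is an illustration only and not confirmed to be how MoleculeNet did it: the function name `merge_into_multitask`, the task names, and the toy values are all hypothetical. Each molecule gets one row, each dataset one column, and a weight of 0.0 marks labels that a molecule does not have so they can be ignored during training.

```python
def merge_into_multitask(datasets):
    """Merge {task: {molecule_id: label}} into aligned label/weight rows.

    Hypothetical helper for illustration. Returns (ids, tasks, y, w) where
    y[i][j] is the label of molecule ids[i] for task tasks[j], and
    w[i][j] is 1.0 if that label exists and 0.0 if it is missing.
    """
    tasks = sorted(datasets)
    ids = sorted({mol for labels in datasets.values() for mol in labels})
    y, w = [], []
    for mol in ids:
        row_y, row_w = [], []
        for task in tasks:
            if mol in datasets[task]:
                row_y.append(datasets[task][mol])
                row_w.append(1.0)
            else:
                # Placeholder value; the zero weight masks it out.
                row_y.append(0.0)
                row_w.append(0.0)
        y.append(row_y)
        w.append(row_w)
    return ids, tasks, y, w


# Toy data, made up for this example (not real measurements).
toy = {
    "task_a": {"CCO": 1.5, "CC": 0.3},
    "task_b": {"CCO": -2.0, "CCC": 4.1},
}
ids, tasks, y, w = merge_into_multitask(toy)
```

The weight mask is the important design choice here: molecules that appear in only one source dataset still contribute to that task without adding a fake label to the others.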
    rfhari
    @rfhari
    Ohh, but from the above link I could only get a .csv file with a single label. Can you please guide me on getting this big combined dataset?
    Bharath Ramsundar
    @rbharath
    Hmm, I'll have to go back and check the manuscript tbh. I didn't run the models for this part so I don't recall how it was done
    I remember that the multitask models didn't actually work that great
    Are you trying to get a high-performing model, or are you interested in directly replicating MoleculeNet?
    If it's the first, I don't think you need to do multitask
    rfhari
    @rfhari
    I'm trying to replicate and understand all the models. It's mentioned in the paper that, for the QM7 dataset, KRR (CM), i.e. the multitask model, performed the best among all. So that's why I'm a bit confused. Sorry for bothering you so much
    Bharath Ramsundar
    @rbharath
    No worries at all! I'm glad to help
    It's just that MoleculeNet was a large project with lots of contributors and lots of details
    So I don't know off-hand all the details. I think the paper and the code in the repo now are our best resources
    Have you gotten singletask models on QM7 running and giving reasonable numbers?
    rfhari
    @rfhari
    Ohh okay, thanks a lot @rbharath! I'll refer to those. Yeah, I got the single-task model working well.
    iherath
    @iherath
    Hi I am new to deepchem and was going through the tutorial titled "Predicting Ki of Ligands to a Protein", and I have some questions:
    Bharath Ramsundar
    @rbharath
    Welcome @iherath!
    What issues are you running into?
    iherath
    @iherath
    Thank you!
    Yeah how do you load your own protein into the notebook?
    Is there a place where I can find the necessary kind of csv file specified?
    Bharath Ramsundar
    @rbharath
    Ah, so that model is for a dataset of binding measurements for a particular protein
    It's not a structure based model where you can load a separate protein
    iherath
    @iherath
    Oh okay thank you
    Bharath Ramsundar
    @rbharath
    To work on your own protein, you'll need to write a bit of custom code. Take a look at deepchem.molnet.load_pdbbind and see how it featurizes protein/ligands from the pdbbind dataset
    We should definitely have better documentation on how to do this. I'm working on revamping the docs over the next few weeks so I'll try to improve the explanations
    iherath
    @iherath
    Oh okay thanks so much!
    iherath
    @iherath
    Also, what kind of preprocessing of the protein does one need to do before performing these analyses? And how do I account for the different conformations of the protein and the ligands?
    Bharath Ramsundar
    @rbharath
    So you need a "co-crystal" pose. You can get this by docking
    To process different protein/ligand conformations, you basically process them as separate datapoints
    Let's say you have a trained protein-ligand binding energy model (trained on pdbbind)
    And you have a bunch of binding poses of different conformations of protein/ligand
    You could run them through your trained model and perhaps take the max
    To get the best predicted binding free energy
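The "score every pose separately, then take the max" idea above can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_score` is a hypothetical stand-in for a trained protein-ligand model, the pose dictionaries and their values are made up, and it assumes a higher score means stronger predicted binding. If your model predicts a quantity where lower is better, you would take the min instead.

```python
def best_pose_score(poses, score_fn):
    """Score each pose as an independent datapoint and return the best one.

    Returns (best_index, best_score), where "best" means the highest score.
    """
    if not poses:
        raise ValueError("need at least one pose")
    scores = [score_fn(pose) for pose in poses]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]


def toy_score(pose):
    # Hypothetical stand-in for a trained model's prediction on one pose.
    return pose["predicted_affinity"]


# Made-up poses, e.g. as they might come out of a docking run.
poses = [
    {"name": "pose_a", "predicted_affinity": 5.2},
    {"name": "pose_b", "predicted_affinity": 7.9},
    {"name": "pose_c", "predicted_affinity": 6.4},
]
best_index, best_score = best_pose_score(poses, toy_score)
```

In practice `score_fn` would featurize the pose and call the trained model; keeping it as a plain callable lets the selection logic stay independent of the model.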
    iherath
    @iherath
    Okay thank you
    iherath
    @iherath
    By docking, do you mean predicting the Ki of the various conformations?
    Bharath Ramsundar
    @rbharath
    By docking, I mean generating binding poses for the protein-ligand complex. Like with Autodock Vina for example
    iherath
    @iherath
    Oh okay thanks
    iherath
    @iherath
    For some reason, sometimes when I try to run import deepchem as dc I get an error that says "ImportError: cannot import name 'NewCheckpointReader'". How do I fix this?
    iherath
    @iherath
    Also, I still don't quite understand how to use a model such as the one trained in the "Modeling Protein-Ligand Interactions" tutorial to predict binding affinities for a protein-ligand pair of interest.