    kurokawaikki
    @kurokawaikki
    Hi,
    I am new to DeepChem. Currently, I am trying to train a model to calculate the complexation energy of host-guest interactions, so I am trying to find an available dataset on the internet. So far, I have collected only around 500 samples... Is there a dataset in DeepChem? Or do you have any suggestions for where to find datasets? Thank you very much!
    Bharath Ramsundar
    @rbharath
    @ninehfluorene:matrix.org What do you mean by readable data? You can visualize numpy arrays in the console, but I think that's not what you're asking for
    @kurokawaikki Could you clarify what you mean by host-guest interaction? Sorry my knowledge of materials science might be lacking here!
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    [m]
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    [m]
    I'm trying to recreate this paper; they use 15 properties
    https://pubs.acs.org/doi/10.1021/acs.jpca.8b09376
    Bharath Ramsundar
    @rbharath
    You'll want to pick out the same 15 properties as in that paper
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    [m]
    Yes, or just one or two of them as well
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    [m]
    I might be doing something wrong, or maybe my model is just really bad; that's why its output isn't close to the input label data
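For readers following along, here is a minimal sketch of training a multitask regressor on a CSV of molecules with a few property columns; the file name, column names, and layer sizes are placeholders, and the CSVLoader keyword names can differ between DeepChem versions. Note that if a NormalizationTransformer is applied to the labels, it has to be passed back when evaluating so predictions are compared on the original label scale, which is a common reason model output looks far from the raw labels.

```python
import deepchem as dc

# Placeholder property columns and CSV file -- substitute your own data.
tasks = ["property_1", "property_2"]
featurizer = dc.feat.CircularFingerprint(size=1024)

# Newer DeepChem versions use feature_field=; older ones used smiles_field=.
loader = dc.data.CSVLoader(tasks=tasks, feature_field="smiles",
                           featurizer=featurizer)
dataset = loader.create_dataset("my_properties.csv")

# Normalize the labels and keep the transformer so predictions can be
# mapped back to the original units when evaluating.
transformer = dc.trans.NormalizationTransformer(transform_y=True, dataset=dataset)
dataset = transformer.transform(dataset)
train, valid, test = dc.splits.RandomSplitter().train_valid_test_split(dataset)

model = dc.models.MultitaskRegressor(n_tasks=len(tasks), n_features=1024,
                                     layer_sizes=[500])
model.fit(train, nb_epoch=50)

metric = dc.metrics.Metric(dc.metrics.mean_absolute_error)
print(model.evaluate(valid, [metric], [transformer]))  # scores in original units
```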
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    [m]
    Again, thanks for your help @rbharath
    Atreya Majumdar
    @atreyamaj
    Hey everyone! I was wondering if we could maybe add a description to the PyPI release of DeepChem? I would love to contribute and write the description as well, if someone could point me in the right direction as to where I can start writing it!
    alat-rights
    @alat-rights
    I think that would be really helpful! I’m not too sure how that would work. Maybe @rbharath would have a better idea?
    alat-rights
    @alat-rights
    Would it make sense for us to separate the flaky tests from the non-flaky tests in the test-suite so that our “build passing/failing” badge is more useful?
    Bharath Ramsundar
    @rbharath
    @atreyamaj That would be a great idea :). If you'd like to help, DM me your PyPI username and I'll give you access to edit the description. It might also be good to keep the description documented in the main repo and add a note to the release docs about updating the PyPI description when needed
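For reference, the PyPI project description is normally populated from long_description in setup(); the snippet below is a generic setuptools sketch of keeping that text in the repo's README, not a description of DeepChem's actual setup.py.

```python
# setup.py (generic sketch) -- PyPI renders long_description as the project page.
from pathlib import Path
from setuptools import setup, find_packages

readme = Path("README.md").read_text(encoding="utf-8")

setup(
    name="example-package",              # placeholder project name
    version="0.0.1",
    packages=find_packages(),
    long_description=readme,             # becomes the PyPI description
    long_description_content_type="text/markdown",
)
```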
    @alat-rights It's a little tricky, basically. The flaky tests are already separated out, so most of their failures don't affect the CI, but we have hundreds of tests and we run into edge cases that are hard to route around. The only solution right now is to look into each case individually and try to understand the cause of the failure, but maybe there's a better idea
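On the flaky-test question, one generic pattern (a sketch of the technique, not how DeepChem's CI is actually wired) is a pytest marker plus two CI jobs, so only the stable suite drives the build badge:

```python
# test_example.py -- tag known-flaky tests with a custom pytest marker.
# Register the marker in pytest.ini:
#   [pytest]
#   markers = flaky: test that occasionally fails in CI
import pytest


@pytest.mark.flaky
def test_sometimes_times_out():
    ...


# CI can then run two jobs:
#   pytest -m "not flaky"   # stable suite, drives the build badge
#   pytest -m "flaky"       # flaky suite, allowed to fail
```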
    Atreya Majumdar
    @atreyamaj
    Thank you, I have sent you a DM
    meihua
    @MhDang
    Hello! I am new to AI-driven drug discovery and the DeepChem library. I have to say this work is remarkable and very user-friendly for beginners like me. I am now trying to play with the models and reproduce the results at http://moleculenet.ai/latest-results. I found the benchmark script in deepchem/examples/benchmark.py, and I am wondering whether you happen to keep a record of the hyperparameters needed to reproduce the results?
    Bharath Ramsundar
    @rbharath
    @MhDang Welcome to the project! I'd suggest checking out the new moleculenet repo: https://github.com/deepchem/moleculenet
    We have a new leaderboard up with maintained results. The original model benchmarks were run on TF 1.x and the underlying libraries have changed a lot, so it's not easy to directly replicate those results
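As a starting point for reproducing benchmark numbers on current DeepChem (TF 2.x), something like the following works for a single MoleculeNet dataset; the hyperparameters here are illustrative defaults, not the leaderboard settings:

```python
import deepchem as dc

# Load a MoleculeNet dataset, featurized for graph convolutions.
tasks, (train, valid, test), transformers = dc.molnet.load_delaney(
    featurizer="GraphConv", splitter="random")

model = dc.models.GraphConvModel(n_tasks=len(tasks), mode="regression")
model.fit(train, nb_epoch=50)

# Evaluate on the held-out split, un-transforming back to original units.
metric = dc.metrics.Metric(dc.metrics.pearson_r2_score)
print(model.evaluate(test, [metric], transformers))
```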
    meihua
    @MhDang
    @rbharath Thanks a lot for the reference, let me try to reproduce the new results!
    kurokawaikki
    @kurokawaikki
    @rbharath I am sorry for the late response. In my research, the host compounds are cyclic (ring-like) compounds such as valinomycin. I am trying to find a guest compound that can fit into the ring area and to predict the binding energy. Host-guest chemistry is currently being tested in the SAMPL challenge. Since I would like to build a prediction model, I hope to gather as many samples as possible. So, is there such a dataset in DeepChem? Or do you have any suggestions for where to find datasets? Thank you very much!
    Bharath Ramsundar
    @rbharath
    @kurokawaikki Ah, I see! That makes sense. Hmm, unfortunately, I'm not aware of a good dataset in DeepChem for host-guest interactions. The closest would be PDBbind, but that's for more generic protein-ligand interactions and not for the types of host-guest interactions you're envisioning
    Sahar RZ
    @SaharRohaniZ
    Hi DeepChem team - the GraphConvModel is failing with PDBbind data, saying ndarray doesn't have atom_features. Is this a known error? Is there a workaround to make GraphConvModel work with PDBbind data? Thanks for your input in advance.
    Bharath Ramsundar
    @rbharath
    @SaharRohaniZ For PDBbind data, you'd probably want an interaction fingerprint or something else that handles the protein-ligand complexes (check out tutorials 13/14 for examples). You can do graph conv on the ligands only, but you might need to do some custom processing
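For the complex-based route, a rough sketch along the lines of tutorials 13/14 is below; keyword names for load_pdbbind and RdkitGridFeaturizer vary between DeepChem releases, so treat this as a starting point to check against your installed version rather than a guaranteed recipe:

```python
import deepchem as dc

# Featurize the whole protein-ligand complex on a grid (interaction-style
# features) instead of handing raw arrays to a graph model.
featurizer = dc.feat.RdkitGridFeaturizer(
    voxel_width=2.0,
    feature_types=["ecfp", "splif", "hbond", "salt_bridge"],
    flatten=True)

# Keyword names here (featurizer, set_name) may differ across releases.
tasks, (train, valid, test), transformers = dc.molnet.load_pdbbind(
    featurizer=featurizer, set_name="core", splitter="random")

# A simple fully-connected regressor over the flattened grid features.
model = dc.models.MultitaskRegressor(
    n_tasks=len(tasks), n_features=train.X.shape[1], layer_sizes=[1000])
model.fit(train, nb_epoch=20)
```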
    Sahar RZ
    @SaharRohaniZ
    Thanks @rbharath for your reply.
    Vignesh Venkataraman
    @VIGNESHinZONE

    Hi everyone,
    I have been working with generative modelling for molecules (SMILES), and I was exploring the AspuruGuzikAutoEncoder given in seqtoseq.py. The original paper has a Gaussian Process step for exploring the latent space, and I couldn't find its implementation in DeepChem. It would be really helpful if someone could suggest generative-modelling research or frameworks that provide the option of exploring the latent space to find more optimized molecules.

    reference -

    1. Aspuru-Guzik's Mol VAE paper - https://arxiv.org/abs/1610.02415 (the Gaussian Process step is described on page 11, "Optimization of molecules via properties")

    Thanks in advance :)

    Bharath Ramsundar
    @rbharath
    @VIGNESHinZONE Have you checked out the normalizing flows or the new molgan?
    I don't think we have a good out-of-box technique for exploring the latent space but something should work
    *might work :)
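For the latent-space exploration question above, a rough sketch of the general recipe (encode molecules, fit a surrogate on properties, decode promising points) is below, using the SeqToSeq embedding methods together with scikit-learn's Gaussian process. Here `model`, `smiles`, and `props` are assumed to already exist, and this is not an out-of-the-box DeepChem feature:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Assumes `model` is a trained dc.models.AspuruGuzikAutoEncoder and that
# `smiles` / `props` hold your molecules and their measured properties.
embeddings = np.array(model.predict_embeddings(smiles))
embeddings = embeddings.reshape(len(smiles), -1)        # one vector per molecule

# Surrogate model over the latent space, as in the Mol VAE paper linked above.
gp = GaussianProcessRegressor().fit(embeddings, props)

# Crude exploration: perturb the best known point and score the candidates.
best = embeddings[np.argmax(props)]
candidates = best + 0.1 * np.random.randn(50, embeddings.shape[1])
predicted = gp.predict(candidates)

# Decode the top-scoring latent vectors back into SMILES strings.
top = candidates[np.argsort(predicted)[-5:]]
print(model.predict_from_embeddings(top))
```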
    Atreya Majumdar
    @atreyamaj

    I found this repo for the paper you linked above: https://github.com/HIPS/molecule-autoencoder

    It's outside of deepchem, but hope this helps!
    @VIGNESHinZONE

    Vignesh Venkataraman
    @VIGNESHinZONE
    @rbharath I just checked them out and they might be useful. Thank you :)
    @atreyamaj Thanks for the link :) I will definitely check it out
    Gökhan Tahıl
    @gokhantahil
    Hello everyone, I am trying to optimize the hyperparameters of a random forest on DeepChem, but I think there may be a bug.
    Here is the result of the first hyperparameter search:
    _criterionmae_max_depth_8_min_samples_leaf_3_min_samples_split_3_min_weight_fraction_leaf_0_n_estimators_120: 0.6832978488194399
    and the r2 score on the test set: 0.7238619516613416
    When I add another value for "min samples leaf", both the validation score and the test r2 score change.
    The validation scores:
    _criterionmae_max_depth_8_min_samples_leaf_1_min_samples_split_3_min_weight_fraction_leaf_0_n_estimators_120_njobs-1: 0.6514039887382881
    _criterionmae_max_depth_8_min_samples_leaf_3_min_samples_split_3_min_weight_fraction_leaf_0_n_estimators_120_njobs-1: 0.6667454007419835
    and the r2 score on the test set: 0.7514118547850513
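For context, a minimal sketch of the kind of search being described, using GridHyperparamOpt with a scikit-learn random forest; the builder signature has changed between DeepChem versions, and `train_dataset`/`valid_dataset` are assumed to already exist. Fixing random_state in the forest makes repeated searches comparable, since an unseeded random forest gives slightly different scores on every run.

```python
import deepchem as dc
from sklearn.ensemble import RandomForestRegressor

def rf_builder(**params):
    # Some DeepChem versions also pass model_dir to the builder; drop it here.
    params.pop("model_dir", None)
    # Fixed random_state so repeated searches give comparable validation scores.
    return dc.models.SklearnModel(RandomForestRegressor(random_state=0, **params))

params_dict = {
    "criterion": ["mae"],
    "max_depth": [8],
    "min_samples_leaf": [1, 3],
    "min_samples_split": [3],
    "n_estimators": [120],
}

metric = dc.metrics.Metric(dc.metrics.r2_score)
optimizer = dc.hyper.GridHyperparamOpt(rf_builder)
best_model, best_params, results = optimizer.hyperparam_search(
    params_dict, train_dataset, valid_dataset, metric)
```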
    Saurav Maheshkar
    @SauravMaheshkar
    Hello guys, I was working on issue deepchem/deepchem#631 and opened up a draft PR, deepchem/deepchem#2501. I'm quite new to DeepChem and would appreciate any help I can get. It involves the use of the BindingPocketFeaturizer.
    Bharath Ramsundar
    @rbharath
    @SauravMaheshkar Will try to take a look within a day or two :)
    @gokhantahil Sorry, I'm not sure what the bug is here. Would you mind clarifying a bit more?
    simonaxelrod
    @simonaxelrod
    Hi everyone - I have a basic question about the pdbbind data. My understanding is that a pdbbind model can take either the ligand or the protein-ligand complex as input, and produce -ln(kd/ki) as output. Are the proteins all the same or are they different? If they're different, how can a purely ligand-based model be trained to predict -ln(kd/ki)? Wouldn't it also need some information about the protein as input?
    Bharath Ramsundar
    @rbharath
    @simonaxelrod The proteins are different. (I think a few proteins are repeated, but these are the exceptions.) The purely ligand-based models are really learning a measure of "ligand-ness" and are more of a baseline control on the protein-ligand models. The delta between the protein-ligand model and the ligand-only model is a measure of how much information about the protein the model is actually using
    simonaxelrod
    @simonaxelrod
    Thanks @rbharath! That makes sense that the ligand models are really just a baseline control. Though I think some papers would benefit from saying this explicitly - for example, the ChemProp paper just notes that their model outperforms all MoleculeNet models on all tasks other than QM and PDBbind, but it should probably have said that their model shouldn't be expected to work for PDBbind at all, or something would be really wrong
    Bharath Ramsundar
    @rbharath
    Yes agreed! This is a subtle point that comparisons in the literature often miss
    kingscolour
    @kingscolour

    I'm looking into featurizing a set of molecules with the ConvMolFeaturizer. I'm interested in featurizing the chemical environment of the atoms within the molecule so I presume that I'd want to set the per_atom_fragmentation parameter. In the docs it notes:

    This option is typically used in combination with a FlatteningTransformer to split the lists into separate samples.

    I can't find any mention of FlatteningTransformer in the docs, can someone point me somewhere?

    Bharath Ramsundar
    @rbharath
    @kingscolour per_atom_fragmentation is a new feature so this may be a docs error. Check out the new tutorial at https://github.com/deepchem/deepchem/blob/master/examples/tutorials/Training_a_Normalizing_Flow_on_QM9.ipynb
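For anyone else landing here, a short sketch of how per_atom_fragmentation and the FlatteningTransformer are meant to fit together, along the lines of the Atomic Contributions tutorial; the transformer's location (dc.trans.FlatteningTransformer) and the CSVLoader keyword names are worth double-checking against your installed DeepChem version:

```python
import deepchem as dc

# Tiny placeholder CSV -- substitute your own molecule file.
with open("mols.csv", "w") as f:
    f.write("smiles\nCCO\nc1ccccc1\n")

# per_atom_fragmentation=True turns each molecule into a list of fragments,
# each with one atom removed, for atomic-contribution analysis.
featurizer = dc.feat.ConvMolFeaturizer(per_atom_fragmentation=True)
loader = dc.data.CSVLoader(tasks=[], feature_field="smiles", featurizer=featurizer)
frag_dataset = loader.create_dataset("mols.csv")

# FlatteningTransformer splits those per-molecule lists into one sample per
# fragment, so a trained GraphConvModel can predict on each fragment directly.
frag_dataset = dc.trans.FlatteningTransformer(frag_dataset).transform(frag_dataset)
```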
    kingscolour
    @kingscolour
    Thanks! I actually missed that one because I skipped over the files without a number. The Atomic Contributions for Molecules tutorial was also helpful for my understanding. Cheers for your work!
    Bharath Ramsundar
    @rbharath
    Oh my bad! Meant to link the atomic contributions tutorial and not the normalizing flows one lol
    kingscolour
    @kingscolour
    No worries! The Normalizing Flow tutorial seems to be helpful too. I'd like to model my own small-molecule data with DeepChem tools, but it's a bit overwhelming because I've only done basic MLPs and decision trees/random forests so far. I have a basic understanding of GraphConv and Transformers, but I'm still trying to bridge that understanding to implementation. So again, thanks!