    Saurav Maheshkar
    @SauravMaheshkar
    Hello guys, I was working on issue deepchem/deepchem#631 and opened up a draft PR deepchem/deepchem#2501. I'm quite new to deepchem and would appreciate any help I can get. It involved the use of the BindingPocketFeaturizer.
    Bharath Ramsundar
    @rbharath
    @SauravMaheshkar Will try to take a look within a day or two :)
    @gokhantahil Sorry, I'm not sure what the bug is here. Would you mind clarifying a bit more?
    simonaxelrod
    @simonaxelrod
    Hi everyone - I have a basic question about the pdbbind data. My understanding is that a pdbbind model can take either the ligand or the protein-ligand complex as input, and produce -ln(kd/ki) as output. Are the proteins all the same or are they different? If they're different, how can a purely ligand-based model be trained to predict -ln(kd/ki)? Wouldn't it also need some information about the protein as input?
    Bharath Ramsundar
    @rbharath
    @simonaxelrod The proteins are different. (I think a few proteins are repeated, but these are the exceptions.) The purely ligand models are really learning a measure of "ligand-ness" and are more of a baseline control on the protein-ligand models. The delta between the protein-ligand model and the ligand-only model is a measure of how much information about the protein the model is actually using.
    simonaxelrod
    @simonaxelrod
    Thanks @rbharath! That makes sense that the ligand models are really just a baseline control. Though I think some papers would benefit from saying this explicitly: for example, the ChemProp paper just notes that their model outperforms all MoleculeNet models for all tasks other than QM and pdbbind, but it probably should have said that it definitely shouldn't work for pdbbind, or something would be really wrong.
    Bharath Ramsundar
    @rbharath
    Yes agreed! This is a subtle point that comparisons in the literature often miss
    kingscolour
    @kingscolour

    I'm looking into featurizing a set of molecules with the ConvMolFeaturizer. I'm interested in featurizing the chemical environment of the atoms within the molecule so I presume that I'd want to set the per_atom_fragmentation parameter. In the docs it notes:

    This option is typically used in combination with a FlatteningTransformer to split the lists into separate samples.

    I can't find any mention of FlatteningTransformer in the docs. Can someone point me somewhere?

    Bharath Ramsundar
    @rbharath
    @kingscolour per_atom_fragmentation is a new feature so this may be a docs error. Check out the new tutorial at https://github.com/deepchem/deepchem/blob/master/examples/tutorials/Training_a_Normalizing_Flow_on_QM9.ipynb
    kingscolour
    @kingscolour
    Thanks! I actually missed that one because I skipped over the files without a number. The Atomic Contributions for Molecules tutorial was also helpful for my understanding. Cheers for your work!
    Bharath Ramsundar
    @rbharath
    Oh my bad! Meant to link the atomic contributions tutorial and not the normalizing flows one lol
    kingscolour
    @kingscolour
    No worries! The Normalizing Flow tutorial seems to be helpful too. I'd like to model my own small-molecule data with DeepChem tools, but it's a bit overwhelming because I've only done basic MLPs and decision trees/random forests thus far. I have a basic understanding of GraphConv and Transformers, but I'm still trying to bridge that understanding to implementation. So again, thanks!
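    For reference, a minimal sketch of the per_atom_fragmentation / FlatteningTransformer combination discussed above, loosely following the Atomic Contributions for Molecules tutorial. The CSV file name, task column, and SMILES column here are placeholders, and the exact loader arguments may differ between DeepChem versions:

    import deepchem as dc

    # Each molecule is featurized into a list of per-atom fragments
    featurizer = dc.feat.ConvMolFeaturizer(per_atom_fragmentation=True)

    # "molecules.csv" and its column names are hypothetical
    loader = dc.data.CSVLoader(tasks=["solubility"],
                               feature_field="smiles",
                               featurizer=featurizer)
    frag_dataset = loader.create_dataset("molecules.csv")

    # FlatteningTransformer splits the per-molecule fragment lists
    # into separate samples, one per atom
    transformer = dc.trans.FlatteningTransformer(frag_dataset)
    frag_dataset = transformer.transform(frag_dataset)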
    ivy1997
    @ivy1997:matrix.org
    [m]
    Hi! I have a question regarding inconsistent numbers of samples before and after Keras model prediction. I have 1809 data points; however, after I ran data_generator -> fit_generator -> predict_on_generator, the number of samples became 1856. Could you help me find the problem? Thank you.
    The above pictures are my code. n_classes=3. Thank you very much.
    hcy5561
    @hcy5561
    Hi, I am relatively new to this. I have 26 compounds. For these, I have a dataset in CSV format that includes around 5000 descriptors I calculated, plus an IC50 value. I would like to perform QSAR analysis with deep learning artificial neural networks and predict IC50 values. However, I'm not confident with the code. Can anyone help?
    Thanks
    Bharath Ramsundar
    @rbharath
    @ivy1997:matrix.org This looks like a batching issue :). predict_on_generator pads to full batches by default. Just trim off the last few elements to recover the dataset predictions (null datapoints are appended by default, I believe).
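    For reference, a minimal sketch of the trimming described above; model, data_generator, and dataset here stand in for the user's own code (which was shared as screenshots):

    # predict_on_generator can return more rows than the dataset has,
    # because the final batch is padded to a full batch
    preds = model.predict_on_generator(data_generator(dataset, epochs=1))

    # Trim back to the true dataset length to drop the padded rows
    preds = preds[:len(dataset)]
    assert preds.shape[0] == len(dataset)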
    @hcy5561 Check out our tutorials :). Tutorial 4 in particular might help you get started with QSAR analysis
    hcy5561
    @hcy5561
    That doesn't address the question I am asking or what I want. I don't use fingerprints, and I won't be calculating any descriptors for my molecules in DeepChem; I just gave the molecules numbers like 1, 2, 3, 4. The descriptors for all the compounds were already calculated by external programs and put into the CSV file. I want to use this CSV and create my model with deep learning artificial neural networks.
    Bharath Ramsundar
    @rbharath
    You can adapt the example to process your data. Take a look at https://deepchem.readthedocs.io/en/latest/api_reference/featurizers.html#userdefinedfeaturizer. This will require some familiarity with the DeepChem API, though, since we don't have an out-of-the-box example for your use case.
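    A rough sketch of what adapting that example might look like for a CSV of precomputed descriptors. The file name, column names, and layer sizes are made up; check the UserDefinedFeaturizer docs linked above for the exact API in your DeepChem version:

    import deepchem as dc

    # Columns in the CSV holding the precomputed descriptors (hypothetical names)
    descriptor_cols = ["desc_%d" % i for i in range(5000)]

    featurizer = dc.feat.UserDefinedFeaturizer(descriptor_cols)
    loader = dc.data.UserCSVLoader(tasks=["ic50"],
                                   id_field="compound_id",
                                   featurizer=featurizer)
    dataset = loader.create_dataset("descriptors.csv")

    # Simple fully connected regression network on the precomputed descriptors
    model = dc.models.MultitaskRegressor(n_tasks=1,
                                         n_features=len(descriptor_cols),
                                         layer_sizes=[512, 128])
    model.fit(dataset, nb_epoch=50)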
    hcy5561
    @hcy5561
    Thanks.
    I will search for other deep learning programs.
    ivy1997
    @ivy1997:matrix.org
    [m]
    So I originally have 1809 elements. When I used batch_size=64, the number of predicted results became 1856, which is 47 (64 - 17) more than the original dataset. Then I tried batch_size=32, and the number of predicted results became 1824, which is 15 (32 - 17) more than the original dataset.🥲
    Bharath Ramsundar
    @rbharath
    @ivy1997:matrix.org I believe what's happening is that the last batch is getting padded here. It's a little surprising to me that the predictions aren't all null! This may just be undefined behavior in the predictions (I don't recall how precisely the padded elements are generated and it's possible there's some variation there)
    We should really have better documentation on this...
    Bharath Ramsundar
    @rbharath
    @ivy1997:matrix.org I've started up deepchem/deepchem#2513 to document and hopefully fix
    ivy1997
    @ivy1997:matrix.org
    [m]
    OK. Thank you!! Hope it can be solved.🙂
    paulsonak
    @paulsonak
    Hi all, are there any plans (or existing implementations) for exporting deepchem models into a standard format such as ONNX or PMML?
    Bharath Ramsundar
    @rbharath
    @paulsonak There's some interest! We're working towards establishing a modelhub and adopting some common framework like ONNX/PMML for weight storage would be useful. We don't have any infrastructure for this yet though. See the discussion https://forum.deepchem.io/t/a-sketch-of-a-modelhub/445
    Karthik Viswanathan
    @nickinack
    Hey, I am trying to reproduce a paper that requires the following version of deepchem: https://github.com/deepchem/deepchem/tree/july2017. Unfortunately, this link is inactive. How do I download and use this version?
    Bharath Ramsundar
    @rbharath
    That was released in July 2017
    Karthik Viswanathan
    @nickinack
    @rbharath Thank you very much :) I also wanted to ask how the ConvMolFeaturizer() works. Do you have any documentation which explains how the featurisation takes place?
    alat-rights
    @alat-rights
    Yup! That should be in our documentation pages. Have you looked under featurizers? deepchem.readthedocs.io/ @nickinack
    Karthik Viswanathan
    @nickinack
    Yess :) I looked into it and it was very insightful. Thank you for the support :)
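    For reference, a minimal example of what ConvMolFeaturizer produces (the SMILES strings are arbitrary):

    import deepchem as dc

    featurizer = dc.feat.ConvMolFeaturizer()
    mols = featurizer.featurize(["CCO", "c1ccccc1O"])

    # Each element is a ConvMol holding per-atom feature vectors plus the
    # adjacency information that the graph convolution layers consume
    conv_mol = mols[0]
    print(conv_mol.get_atom_features().shape)   # (n_atoms, 75) by default
    print(conv_mol.get_adjacency_list())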
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    [m]
    I have a question: in the future, is DeepChem going to have a progress bar (for example, TensorFlow has a verbose progress bar when it trains)?
    Omid Tarkhaneh
    @OmidTarkhaneh
    I want to design a new, efficient model based on DeepChem and (Keras or PyCharm). Are there any resources that could help me with this aim?
    Atreya Majumdar
    @atreyamaj

    @OmidTarkhaneh There are layers defined within DeepChem built on Keras, etc.; they can be found here: https://deepchem.readthedocs.io/en/latest/api_reference/layers.html

    These can be used much as you would use standard Keras layers, for example as in the sketch below.
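    For reference, a sketch of that pattern, loosely following the graph convolutions tutorial. The layer sizes and batch size are arbitrary, and the call signature assumes ConvMol-style batched inputs:

    import tensorflow as tf
    from tensorflow.keras import layers
    from deepchem.models.layers import GraphConv, GraphPool, GraphGather

    class SketchGraphModel(tf.keras.Model):
        """Mixes DeepChem graph layers with ordinary Keras layers."""

        def __init__(self, batch_size=64):
            super().__init__()
            self.gc1 = GraphConv(64, activation_fn=tf.nn.tanh)       # DeepChem layer
            self.pool1 = GraphPool()                                  # DeepChem layer
            self.dense1 = layers.Dense(128, activation=tf.nn.tanh)    # plain Keras layer
            self.readout = GraphGather(batch_size=batch_size,
                                       activation_fn=tf.nn.tanh)      # DeepChem layer
            self.out = layers.Dense(1)

        def call(self, inputs):
            # inputs is the list produced when batching ConvMol features:
            # [atom_features, degree_slice, membership, *deg_adjs]
            x = self.gc1(inputs)
            x = self.pool1([x] + inputs[1:])
            x = self.dense1(x)
            x = self.readout([x] + inputs[1:])
            return self.out(x)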

    Omid Tarkhaneh
    @OmidTarkhaneh
    @atreyamaj Thanks a lot for your help. Are there any resources that explain how to preprocess chemistry datasets? My datasets are similar to the DUD-E datasets (structural datasets), and I need to convert them into a form suitable for DeepChem.
    Bharath Ramsundar
    @rbharath
    @ninehfluorene:matrix.org Good question. We don't have plans for this at present but it would be a useful feature to add
    @OmidTarkhaneh Check out our tutorial series (click Tutorials on deepchem.io). Some of the tutorials may be relevant for your work.
    Abhik Seal
    @abhik1368
    Where did from deepchem.models.tensorgraph.layers import Label, Weights go? Can anyone help? I used it a year ago, and in the latest version things have changed.
    Bharath Ramsundar
    @rbharath
    @abhik1368 TensorGraph was our old framework for building models (basically a custom version of keras). We've ported all our models to just use Keras directly. You can get the available layers from deepchem.models.layers now
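    For reference, the old import versus where the Keras-based layers now live (the layer names shown are just examples of what the new module exposes; the old Label and Weights placeholders have no direct layer equivalent, since labels and weights are now passed in through fit() and the generator interface):

    # Old TensorGraph-era import (removed in recent releases):
    # from deepchem.models.tensorgraph.layers import Label, Weights

    # Keras-based layers now live in deepchem.models.layers, e.g.:
    from deepchem.models.layers import GraphConv, GraphPool, GraphGather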
    Abhik Seal
    @abhik1368
    I want to run a standard regression task like solubility; can you point me to an example?