    Bharath Ramsundar
    @rbharath
    Yes agreed! This is a subtle point that comparisons in the literature often miss
    kingscolour
    @kingscolour

    I'm looking into featurizing a set of molecules with the ConvMolFeaturizer. I'm interested in featurizing the chemical environment of the atoms within the molecule so I presume that I'd want to set the per_atom_fragmentation parameter. In the docs it notes:

    This option is typically used in combination with a FlatteningTransformer to split the lists into separate samples.

    I can't find any mention of FlatteningTransformer in the docs, can someone point me somewhere?

    Bharath Ramsundar
    @rbharath
    @kingscolour per_atom_fragmentation is a new feature so this may be a docs error. Check out the new tutorial at https://github.com/deepchem/deepchem/blob/master/examples/tutorials/Training_a_Normalizing_Flow_on_QM9.ipynb
    kingscolour
    @kingscolour
    Thanks! I actually missed that one because I skipped over the files without a number. The Atomic Contributions for Molecules tutorial was also helpful for my understanding. Cheers for your work!
    Bharath Ramsundar
    @rbharath
    Oh my bad! Meant to link the atomic contributions tutorial and not the normalizing flows one lol
    kingscolour
    @kingscolour
    No worries! The Normalizing Flow tutorial seems helpful too. I'd like to model my own small-molecule data with DeepChem tools, but it's a bit overwhelming because I've only done basic MLPs and decision trees/random forests so far. I have a basic understanding of GraphConv and Transformers, but I'm still trying to bridge that understanding to implementation. So again, thanks!
    ivy1997
    @ivy1997:matrix.org
    Hi! I have a question regarding inconsistent numbers of samples before and after Keras model prediction. I have 1809 data points; however, after I ran data_generator -> fit_generator -> predict_on_generator, the number of predictions became 1856. Could you help me find the problem? Thank you.
    The pictures above show my code. n_classes=3. Thank you very much.
    hcy5561
    @hcy5561
    Hi, I am relatively new to this. I have 26 compounds. For these, I have a dataset in CSV format that includes around 5000 descriptors I have calculated, plus IC50 values. I would like to perform QSAR analysis with deep learning artificial neural networks and predict IC50 values. However, I am not confident with code. Can anyone help?
    Thanks
    Bharath Ramsundar
    @rbharath
    @ivy1997:matrix.org This looks like a batching issue :). predict_on_generator pads to full batches by default. Just trim off the last few elements to recover the dataset predictions (null datapoints are appended by default, I believe)
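    The trimming fix described above can be sketched in a few lines. A zero array stands in for the real model output here, and the numbers mirror ivy1997's case; only the trimming step at the end is the actual fix:

    ```python
    import numpy as np

    n_samples = 1809      # size of the original dataset
    batch_size = 64
    n_classes = 3

    # predict_on_generator pads the final batch to a full batch, so the
    # returned array covers ceil(n_samples / batch_size) full batches.
    n_batches = -(-n_samples // batch_size)              # ceiling division -> 29
    padded_preds = np.zeros((n_batches * batch_size, n_classes))  # stand-in for model output
    print(padded_preds.shape[0])                         # 1856, matching the reported count

    # Trim the trailing padded rows to recover one prediction per datapoint.
    preds = padded_preds[:n_samples]
    print(preds.shape[0])                                # 1809
    ```
    
    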
    @hcy5561 Check out our tutorials :). Tutorial 4 in particular might help you get started with QSAR analysis
    hcy5561
    @hcy5561
    This does not answer the question I am asking. I do not use fingerprints, and I will not calculate any descriptors for my molecules within DeepChem. I just gave numbers like 1, 2, 3, 4 to the molecules, and I have already put descriptors calculated by some external programs for all complexes into the CSV file. I want to use this CSV and create my model with deep learning artificial neural networks.
    Bharath Ramsundar
    @rbharath
    You can adapt the example to process your data. Take a look at https://deepchem.readthedocs.io/en/latest/api_reference/featurizers.html#userdefinedfeaturizer. This will require some familiarity with the DeepChem API though since we don't have an out-of-box example for your use case
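    Independent of DeepChem's UserDefinedFeaturizer, the general workflow for a table of precomputed descriptors can be sketched with plain numpy. Everything below is hypothetical (file contents, column names), and an ordinary least-squares fit stands in for the neural network:

    ```python
    import io
    import numpy as np

    # Hypothetical CSV: one row per compound, descriptor columns, then an IC50 column.
    csv_text = """compound,desc1,desc2,desc3,ic50
    1,0.1,1.2,3.4,5.0
    2,0.4,0.9,2.1,4.2
    3,0.3,1.5,2.8,4.9
    4,0.7,0.5,1.9,3.8
    """

    data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
    X = data[:, 1:-1]          # descriptor matrix (skip the compound id column)
    y = data[:, -1]            # IC50 values

    # Stand-in model: ordinary least squares with a bias column.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    pred = Xb @ w

    print(X.shape)             # (4, 3): 4 compounds, 3 descriptors
    ```

    With DeepChem, the same X/y arrays would typically be wrapped in a NumericalDataset object and fed to a model instead of the least-squares step.
    
    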
    hcy5561
    @hcy5561
    Thanks.
    I will search for other deep learning programs.
    ivy1997
    @ivy1997:matrix.org
    So I originally have 1809 elements. When I used batch_size=64, the number of predicted results became 1856, which was 47 (= 64 - 17) more than the original dataset. Then I tried batch_size=32, and the number of predicted results became 1824, which was 15 (= 32 - 17) more than the original dataset. 🥲
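    Both counts reported above are consistent with pad-to-full-batch behavior (1809 mod 64 = 1809 mod 32 = 17 real rows in the last batch), which a small helper makes easy to check:

    ```python
    def padded_count(n_samples, batch_size):
        """Number of predictions returned when the last batch is padded to full size."""
        n_batches = -(-n_samples // batch_size)   # ceiling division
        return n_batches * batch_size

    print(padded_count(1809, 64))   # 1856 -> 47 extra rows
    print(padded_count(1809, 32))   # 1824 -> 15 extra rows
    ```
    
    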
    Bharath Ramsundar
    @rbharath
    @ivy1997:matrix.org I believe what's happening is that the last batch is getting padded here. It's a little surprising to me that the predictions aren't all null! This may just be undefined behavior in the predictions (I don't recall how precisely the padded elements are generated and it's possible there's some variation there)
    We should really have better documentation on this...
    Bharath Ramsundar
    @rbharath
    @ivy1997:matrix.org I've started up deepchem/deepchem#2513 to document and hopefully fix
    ivy1997
    @ivy1997:matrix.org
    OK. Thank you!! Hope it can be solved.🙂
    paulsonak
    @paulsonak
    Hi all, are there any plans (or existing implementations) for exporting deepchem models into a standard format such as ONNX or PMML?
    Bharath Ramsundar
    @rbharath
    @paulsonak There's some interest! We're working towards establishing a modelhub and adopting some common framework like ONNX/PMML for weight storage would be useful. We don't have any infrastructure for this yet though. See the discussion https://forum.deepchem.io/t/a-sketch-of-a-modelhub/445
    Karthik Viswanathan
    @nickinack
    Hey, I am trying to reproduce a paper that requires the following version of deepchem: https://github.com/deepchem/deepchem/tree/july2017. Unfortunately, this link is inactive. How do I download and use this version?
    Bharath Ramsundar
    @rbharath
    That was released in July 2017
    Karthik Viswanathan
    @nickinack
    @rbharath Thank you very much :) I also wanted to ask how the ConvMolFeaturizer() works. Do you have any documentation which explains how the featurisation takes place?
    alat-rights
    @alat-rights
    Yup! That should be in our documentation pages. Have you looked under featurizers? deepchem.readthedocs.io/ @nickinack
    Karthik Viswanathan
    @nickinack
    Yess :) I looked into it and it was very insightful. Thank you for the support :)
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    I have a question: in the future, is DeepChem going to have a progress bar (for example, TensorFlow has a verbose progress bar when it trains)?
    Omid Tarkhaneh
    @OmidTarkhaneh
    I want to design a new, efficient model based on DeepChem and Keras (or pycharm). Are there any resources to help me with this aim?
    Atreya Majumdar
    @atreyamaj

    @OmidTarkhaneh There are layers defined within Deepchem from Keras etc, they can be found here: https://deepchem.readthedocs.io/en/latest/api_reference/layers.html

    These can be used much like standard Keras layers

    Omid Tarkhaneh
    @OmidTarkhaneh
    @atreyamaj Thanks a lot for your help. Are there any resources on how to preprocess datasets related to chemistry? My datasets are like the DUDE datasets (structural datasets), and I need to change them into a form suitable for DeepChem
    Bharath Ramsundar
    @rbharath
    @ninehfluorene:matrix.org Good question. We don't have plans for this at present but it would be a useful feature to add
    @OmidTarkhaneh Check out our tutorial series (click tutorials on deepchem.io). Some of the tutorials may be relevant for your work
    Abhik Seal
    @abhik1368
    Where did from deepchem.models.tensorgraph.layers import Label, Weights go? Can anyone help? I used it a year ago, and things have changed in the latest version
    Bharath Ramsundar
    @rbharath
    @abhik1368 TensorGraph was our old framework for building models (basically a custom version of keras). We've ported all our models to just use Keras directly. You can get the available layers from deepchem.models.layers now
    Abhik Seal
    @abhik1368
    I want to run a standard regression, like solubility prediction. Can you point me to an example?
    Abhik Seal
    @abhik1368
    Can you show how to load a dataset and use the latest code to train and test? Any links? I want to use atom_features, degree_slice, and membership as features
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    Or do you want a custom dataset? For example: QM9 has 20 labels but you want just 15 of them. Make separate train/test/valid splits (50:30:20), apply normalization, and build your own model with TensorFlow
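    The label-selection and split recipe above can be sketched with plain numpy. The random 10-row matrix below is just a stand-in for the QM9 label table (the real dataset has ~134k molecules), and the 20-label count comes from the message above:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10
    labels = rng.normal(size=(n, 20))    # stand-in for QM9's 20 regression labels
    labels = labels[:, :15]              # keep only the first 15 labels

    # Shuffle, then split 50:30:20 into train/test/valid.
    idx = rng.permutation(n)
    n_train = int(0.5 * n)
    n_test = int(0.3 * n)
    train_labels = labels[idx[:n_train]]
    test_labels = labels[idx[n_train:n_train + n_test]]
    valid_labels = labels[idx[n_train + n_test:]]
    ```

    In practice DeepChem's splitter classes do this shuffling/splitting for you once the data is in a Dataset object; the sketch just shows the underlying idea.
    
    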
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    This is interesting. I just found a paper working on QM9 prediction, and it turns out that to improve the loss (MAE) I should clean up the data (decrease noise) and add more features (for example, grouping up similar molecules, etc.)
    https://pubs.acs.org/doi/10.1021/acs.jpca.0c05969
    9H-Fluorene (Kasitinard M.)
    @ninehfluorene:matrix.org
    And that begs a question about the future of DeepChem's MolNet: are there going to be improved versions of any of the datasets (for example, qm9_v2 or something like that)? What do you think of this concept?
    AshW360
    @AshW360
    Hi, how do we separate compounds from a .csv file with water solubility data?
    Omid Tarkhaneh
    @OmidTarkhaneh
    I am looking for papers (with code) that use DeepChem for potential energy prediction. If there are any, I'd appreciate it if someone could send them to me. Thanks in advance.