    Bharath Ramsundar
    @rbharath
    So to install the latest version of DeepChem, all you need to do is:
    pip install tensorflow
    pip install deepchem-nightly
    Note that this won't have RDKit support (since RDKit isn't on pip), so you'll have to figure out how to install RDKit in your environment. This should make hacking on DeepChem much nicer!
    Any feedback or comments would be much appreciated :)
    Mansoor Sayed
    @mansoor-s
    Hello friends, I'm trying to get deepchem to work on my local setup, but I'm getting this error when trying to import deepchem: ImportError: cannot import name 'jaccard_similarity_score' from 'sklearn.metrics'
    Digging into it, I see that sklearn.metrics has a jaccard_score, but not a jaccard_similarity_score function. Is this due to a version mismatch?
    I installed using conda, following the instructions on the DeepChem website
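    (For reference: newer scikit-learn releases removed jaccard_similarity_score in favor of jaccard_score. Both compute the Jaccard index; a minimal plain-Python sketch of the binary-label case, with toy inputs, looks like this:)

    ```python
    def jaccard_index(y_true, y_pred):
        """Jaccard index for binary labels: TP / (TP + FP + FN)."""
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        denom = tp + fp + fn
        return tp / denom if denom else 1.0  # empty union -> perfect agreement

    print(jaccard_index([1, 1, 0, 0], [1, 0, 1, 0]))  # 1/3
    ```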
    Mansoor Sayed
    @mansoor-s
    Interesting: the correct function is called on the master branch, so it looks like the conda package needs an update
    Bharath Ramsundar
    @rbharath
    Ah yep, the old conda release is a little behind unfortunately. You can fix it by pinning sklearn==0.22 for deepchem 2.3.0.
    I'd recommend using deepchem-nightly though :)
    I'll update the docs soon, but you can do pip install tensorflow; pip install deepchem-nightly
    Mansoor Sayed
    @mansoor-s
    Ahhh, thank you. I wasn't aware of nightly. I got it working by installing from git master
    Peng Fei
    @cpfpengfei

    @rbharath For a model with uncertainty prediction capabilities, when uncertainty=True and a trained model's predict_uncertainty is used, arrays of pred and std values corresponding to the sample y are returned.

    For these 2 arrays, am I right to say that every value in the pred array is the mean predicted value over 50 forward passes (the standard 50 dropout masks), and that the corresponding value in std can be considered the standard deviation of that prediction (the square root of the summed aleatoric and epistemic variances from dropout)? Thanks a lot! And I am looking forward to the new 2.4.0 version!
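    (As a rough illustration of that decomposition, here is a toy sketch with made-up numbers, not real DeepChem output: the reported prediction is the mean over dropout masks, and the reported std combines the spread over masks with the learned noise variance:)

    ```python
    import math
    from statistics import mean, pvariance

    # Hypothetical per-dropout-mask predictions for one sample
    # (DeepChem's default uses 50 masks; 5 toy values here).
    mask_preds = [0.52, 0.48, 0.55, 0.50, 0.45]
    aleatoric_var = 0.01  # toy learned noise variance

    pred = mean(mask_preds)                         # reported prediction
    epistemic_var = pvariance(mask_preds)           # spread over dropout masks
    std = math.sqrt(aleatoric_var + epistemic_var)  # combined standard deviation
    ```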

    Peng Fei
    @cpfpengfei
    I have another question on normalization transformers. Would it be necessary (in terms of better model performance) to use them on all datasets to transform the y values? My model needs to output prediction uncertainties, and having transformers causes some misalignment in the dataset scales.
    Bharath Ramsundar
    @rbharath
    Ah yes, the uncertainty prediction doesn't necessarily work well with the transformers, since the scale of the standard deviation doesn't untransform cleanly. I think you're fine not normalizing if your performance is still in the range you need it to be!
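    (For the special case of a purely linear y-normalization, y_norm = (y - mean) / std_y, the prediction and its standard deviation can in principle be mapped back by hand. A toy sketch with made-up numbers, not DeepChem API calls:)

    ```python
    # Hypothetical training-set statistics used by the normalization transform
    y_mean, y_std = 5.0, 2.0

    # Hypothetical model outputs in the normalized space
    pred_norm, std_norm = 0.25, 0.1

    # Undo the linear transform: shift-and-scale for the mean,
    # scale only for the standard deviation (valid for linear transforms only)
    pred_raw = pred_norm * y_std + y_mean
    std_raw = std_norm * y_std
    ```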
    Gökhan Tahıl
    @gokhantahil
    Hello, I'm trying to use "MultitaskClassifier" but it gives me the error "index -16 is out of bounds for axis 1 with size 2".
    Could you please help me?
    Bharath Ramsundar
    @rbharath
    @gokhantahil Could you give a failing code snippet? I don't think I've seen this error before
    Christian Stemmle
    @chstem
    @rbharath I'm having trouble testing the latest deepchem-nightly, because importing tensorflow-probability fails. There was a PR merged a couple of days ago to make this a soft dependency. Can deepchem-nightly be updated?
    Mansoor Sayed
    @mansoor-s
    Any recommendations on a clean way to represent amino-acid chains (from FASTA) in SMILES format?
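    (One possible route, assuming RDKit is available in the environment: RDKit can parse a FASTA sequence directly into a molecule and write it back out as canonical SMILES. A sketch, not tied to any particular DeepChem version:)

    ```python
    from rdkit import Chem

    # Parse a toy tripeptide (Gly-Ala-Val) from its FASTA one-letter codes
    mol = Chem.MolFromFASTA("GAV")

    # Emit the peptide as canonical SMILES
    smiles = Chem.MolToSmiles(mol)
    print(smiles)
    ```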
    Bharath Ramsundar
    @rbharath
    @chstem Ah I thought I fixed this! Let me take a look. I might have missed one tfp import. I'll report back shortly...
    Bharath Ramsundar
    @rbharath
    I'm able to confirm the import issue. I missed a tensorflow_probability import on my PR. deepchem-nightly seems to be working fine thankfully though. I'll put up a fix PR shortly
    Christian Stemmle
    @chstem

    When I run

    for i in range(10):
        loss = model.fit(dataset, nb_epoch=1)

    I sometimes get a loss of 0 (like 1 in 5 loops). What is going on here?

    Bharath Ramsundar
    @rbharath
    @chstem Hmm, that's unfortunate. We've gotten a few bug reports of this. I've actually got a branch open right now on my local repo to figure out what's happening with the loss reporting
    Let me dig into this a bit and get back to you shortly
    Christian Stemmle
    @chstem
    great, thanks
    Bharath Ramsundar
    @rbharath
    Could you give me a full code snippet for your case? It would be very useful for debugging :)
    Christian Stemmle
    @chstem
    Not without all my data ... what exactly are you looking for?
    Bharath Ramsundar
    @rbharath
    I'm currently trying to get a simple snippet that reproduces the loss going to 0 every so often. A few people have reported this, but I haven't been able to locally reproduce yet
    Christian Stemmle
    @chstem
    Not even with one of the benchmark cases?
    Bharath Ramsundar
    @rbharath
    Taking a look now :). I haven't seriously looked until today so I'm hopeful of turning up something soon
    Christian Stemmle
    @chstem
    I am running GraphConvModel, in case this somehow depends on the model
    Bharath Ramsundar
    @rbharath
    @chstem I'm able to replicate it now. See discussion in this issue deepchem/deepchem#1944. I'm seeing it on the 100th epoch on my test, but I'm not sure what's causing this issue
    Is your dataset large by any chance? Maybe it has something to do with the number of training steps.
    GreenTail
    @GreenTail
    @rbharath, is there a paper that explains the algorithm behind the GraphConv layer from deepchem.models.tensorgraph.layers?
    I am doing the "Graph Convolutions For Tox21" tutorial and trying to understand the math (by reading the source code).
    GreenTail
    @GreenTail
    I posted a couple of questions about GraphConv on the DeepChem forum (since the questions are not very specific, the forum seems like a more appropriate place for them than Gitter). I would appreciate it if someone could have a look and answer them.
    Bharath Ramsundar
    @rbharath
    @GreenTail The algorithm comes from https://arxiv.org/abs/1509.09292
    It's a well written paper, so it will hopefully be clear, but I'm glad to answer questions here or on the forums :)
    GreenTail
    @GreenTail
    @rbharath, thank you for the link. I will have a read.
    Ohyeahmanolito
    @Ohyeahmanolito
    Any research papers you can recommend on few-shot learning for drug discovery? :D
    Bharath Ramsundar
    @rbharath
    @Ohyeahmanolito You can check out my old paper https://pubs.acs.org/doi/10.1021/acscentsci.6b00367 :)
    It's on my TODO list to get this back into DeepChem. The sample code for this only worked with an old version of DeepChem unfortunately
    Ohyeahmanolito
    @Ohyeahmanolito
    @rbharath, that is the paper I am currently reading. :D The application of DL to drug discovery is very interesting (and new) for me. I will explore the code later. :D Thank you.
    Christian Stemmle
    @chstem
    DiskDataset.create_dataset(shard_generator, data_dir) will generate and store all the data on disk. But is there any function to reload it from data_dir without having to run shard_generator again? There is dc.utils.save.save_dataset_to_disk(), but that is for separate train, val, test data ...
    Peng Fei
    @cpfpengfei
    @rbharath With GraphConvModel, is it possible to use the graphconv layers as a sort of encoder that maps molecules into a latent space representation, particularly for data the model wasn't trained on, and then plug the representation back into the dense layers for predictions? Thanks a lot!
    Also, it seems like GraphConvModel works mainly for smaller structures and not large ones; is that true? And is the current implementation of GraphConvModel based on the neural fingerprint (NFP) paper?
    Bharath Ramsundar
    @rbharath
    This will let you extract the "neural fingerprint" from a trained graph conv model
    GraphConvModel primarily has been tested on smaller structures, but it should also work fine on larger molecules I believe. The implementation is based on the NeuralFingerprint paper