    Bharath Ramsundar
    @rbharath
    @ignaczgerg I'm currently working on this! I was offline most of the last week but just starting to come back online and get to work on the migration. I'll post more information soon
    @MasunNabhanHoms_twitter Can you report more details about the error that you're seeing? I'm not sure what the issue is
    @yuanjames I think GPU utilization should be pretty good for most models, from 50-90%. Which models are you seeing poor utilization with?
    Arthur Funnell
    @elemets
    @rbharath Thanks, I guess I'm using an inefficient method then? It's taken over 8 hours to featurize 600k SMILES. They are peptides, so they're quite long, if that makes a difference.
    The code I am using to featurize is:
    import numpy as np

    def featurizing_smis(smi):
        # featurizer is a DeepChem featurizer instance defined elsewhere
        featurized = featurizer(smi)
        return featurized

    vectorized_featurizing = np.vectorize(featurizing_smis)
    array_of_feats = vectorized_featurizing(array_of_smis)

    Should I be using a pandas loop instead?
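    As a possible alternative, a minimal sketch of featurizing in one batch call instead of np.vectorize, assuming featurizer is a standard DeepChem featurizer (ConvMolFeaturizer is used as a stand-in) and array_of_smis is the array of SMILES strings:

    import deepchem as dc

    # Stand-ins for the names used above; in practice array_of_smis holds the 600k peptide SMILES.
    featurizer = dc.feat.ConvMolFeaturizer()
    array_of_smis = ["CCO", "CC(=O)O", "c1ccccc1"]

    # Featurizer.featurize accepts an iterable of SMILES and loops internally,
    # avoiding the per-call overhead of np.vectorize.
    array_of_feats = featurizer.featurize(array_of_smis)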
    ignaczgerg
    @ignaczgerg

    @ignaczgerg I'm currently working on this! I was offline most of the last week but just starting to come back online and get to work on the migration. I'll post more information soon

    @rbharath That is awesome, thank you! If I can help with anything regarding this, I'm more than happy to.

    Bharath Ramsundar
    @rbharath
    @elemets Ah, being peptides would explain the slowdown! The numbers I mentioned above are for small molecules. Featurization time is either linear or quadratic in molecular length (depending on the featurization), so there will definitely be a slowdown. Weave is quadratic, but convmol is linear, so it might be useful to experiment with different featurizations?
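    As an illustration, a minimal sketch for timing the two featurizations mentioned above, assuming DeepChem's ConvMolFeaturizer and WeaveFeaturizer classes and a stand-in SMILES list:

    import time
    import deepchem as dc

    smiles = ["CC(C)Cc1ccc(cc1)C(C)C(=O)O"]  # stand-in for the peptide SMILES list

    for featurizer in (dc.feat.ConvMolFeaturizer(), dc.feat.WeaveFeaturizer()):
        start = time.time()
        feats = featurizer.featurize(smiles)
        print(type(featurizer).__name__, "took", time.time() - start, "s")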
    @ignaczgerg Will do! I'll be posting issues and updates at deepchem/deepchem#2512 as I work on the upgrades
    James Y
    @yuanjames
    @rbharath Thanks. I'm using GCNModel (from deepchem.models import GCNModel) on an RTX 3090 and it only gives me about 2% utilization; deepchem.models.GraphConvModel is around 10%, but an MPNN in PyTorch can reach 70%, all on the Delaney dataset.
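    For reference, a minimal sketch for checking whether a larger batch size changes GPU utilization, assuming the MolNet Delaney loader and the DGL-based GCNModel with DeepChem 2.5-style parameters (batch_size=256 and nb_epoch=10 are arbitrary choices, and MolGraphConvFeaturizer is assumed to be the matching featurizer):

    import deepchem as dc

    # Delaney featurized for GCNModel.
    tasks, (train, valid, test), transformers = dc.molnet.load_delaney(
        featurizer=dc.feat.MolGraphConvFeaturizer())

    # Larger batches tend to keep the GPU busier.
    model = dc.models.GCNModel(n_tasks=len(tasks), mode='regression', batch_size=256)
    model.fit(train, nb_epoch=10)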
    Akhil Pandey
    @akhilpandey95
    @rbharath What is a good place to start for implementing a Bayesian GCN using DeepChem?
    Masun Nabhan Homsi
    @MasunNabhanHoms_twitter
    @MasunNabhanHoms_twitter Can you report more details about the error that you're seeing? I'm not sure what the issue is
    @rbharath
    When it crashes, it says "Session crashes with no reason". The log messages are:
    WARNING:root:kernel dab80b04-c3b5-45ae-827c-33975f71d502 restarted
    KernelRestarter: restarting kernel (1/5), keep random ports
    2021-06-04 08:15:54.625105: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
    Akhil Pandey
    @akhilpandey95
    Can anyone help me with this?
    :point_up:
    su-chao
    @su-chao
    NotImplementedError: Cannot convert a symbolic Tensor (gradient_tape/private__graph_conv_keras_model/graph_gather/sub:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported
    How do I deal with this error?
    ignaczgerg
    @ignaczgerg

    NotImplementedError: Cannot convert a symbolic Tensor (gradient_tape/private__graph_conv_keras_model/graph_gather/sub:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

    Hi, which Python version are you using?
    I got this error when I accidentally used python=3.9 with DeepChem. Try downgrading to python=3.7 and you should be fine.

    su-chao
    @su-chao
    My versions: Python 3.7.10; deepchem 2.5; tensorflow-gpu 2.4.1 @ignaczgerg
    su-chao
    @su-chao
    Where can I see the changelog for each DeepChem version? I found that some "import *" code doesn't work when running the example files.
    kingscolour
    @kingscolour
    Anyone know if dc=2.5.0 is incompatible with tf=2.5.0? On Linux, dc would not import with tf-gpu=2.5.0: AttributeError: module 'deepchem' has no attribute 'data'
    I had to uninstall tf-gpu=2.5.0 and install tf=2.4.0 to resolve the error.
    Bharath Ramsundar
    @rbharath
    Sorry for the lack of responses, folks
    I've been travelling and am just starting to catch up
    @kingscolour Hmm, we don't support tf 2.5.0 right now (I hadn't realized it was out already!)
    We'll have to bump our versions, but for now tf 2.4.0 is the way to start
    Bharath Ramsundar
    @rbharath
    @su-chao Your error is basically due to numpy 1.20. Make sure you have numpy 1.19.5 installed. Tensorflow doesn't yet support numpy 1.20.*
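    For anyone hitting this, a minimal sketch for checking the installed versions (the pip pin in the last comment is an assumption about how the environment is managed):

    import numpy as np
    import tensorflow as tf
    import deepchem as dc

    # TF 2.4.x raises the "Cannot convert a symbolic Tensor" error with numpy >= 1.20,
    # so this should report a 1.19.x numpy release.
    print("numpy", np.__version__, "| tensorflow", tf.__version__, "| deepchem", dc.__version__)

    # If numpy is too new, downgrade it, e.g.:  pip install "numpy==1.19.5"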
    su-chao
    @su-chao
    Thanks! :)
    Masun Nabhan Homsi
    @MasunNabhanHoms_twitter
    @rbharath: please, is there any solution for the Colab problem (session crash)? Thank you
    Bharath Ramsundar
    @rbharath
    @MasunNabhanHoms_twitter Unfortunately I don't have a good idea for you right now! I'll keep an eye out and see if I come across this myself
    Tonylac77
    @Tonylac77
    Hi everyone, I am trying to do a k-fold split on my dataset, then convert all the folds to pandas dataframes. I've tried a for loop, iterating over a list of the folds, but I get the following error: 'DiskDataset' object is not iterable. Any ideas?
    alat-rights
    @alat-rights

    @Tonylac77 dataset.itersamples() might be what you’re looking for.

    For more info: https://deepchem.readthedocs.io/en/latest/api_reference/data.html#deepchem.data.DiskDataset
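
    A minimal sketch of both options, using a small stand-in DiskDataset (the names here are hypothetical, not from the thread):

    import numpy as np
    import deepchem as dc

    # Stand-in dataset; in practice this would be one of the k-fold DiskDatasets.
    dataset = dc.data.DiskDataset.from_numpy(
        X=np.random.rand(4, 8), y=np.random.rand(4, 1), ids=["a", "b", "c", "d"])

    # Option 1: iterate sample by sample.
    for X, y, w, sample_id in dataset.itersamples():
        print(sample_id, y)

    # Option 2: convert the whole dataset to a pandas DataFrame.
    df = dataset.to_dataframe()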

    Bharath Ramsundar
    @rbharath
    @Tonylac77 Can you share your code? I've definitely done something similar to what you're describing so I'm not sure what the difference is
    su-chao
    @su-chao
    @Tonylac77 In my view, if you want to use a deepchem model, you should use a load function to load your dataset. The deepchem.molnet.load_function.molnet_loader module has a good explanation and example.
    davidRFB
    @davidRFB
    Hi, I am trying to run some local tests for the inclusion of SwissProt into MolNet. However, I am getting this error in the pytest execution:
    [attached screenshot: image.png]
    I already added the function load_swissprot in swissprot_datasets.py
    and added the line from deepchem.molnet.load_function.swissprot_datasets import load_swissprot
    to the __init__.py file of molnet.
    I don't know if anyone can help me
    davidRFB
    @davidRFB
    Thank you very much
    Bharath Ramsundar
    @rbharath
    Oh this is weird
    Can you run other test files normally?
    su-chao
    @su-chao
    [attached screenshot: image.png]
    Can someone help with the problem that some small molecules are too large? For example:
    CC(C)c1ccc(cc1)NC(=O)O[C@@H]1CO[C@H]2[C@H](CO[C@@H]12)NC(=O)Nc1cccc(C(F)(F)F)c1
    My featurizer: dc.feat.CircularFingerprint()
    su-chao
    @su-chao
    Hi, can anyone tell me why the output predict.shape is always (#, #, 2), which is 3D,
    and why predict[:, :, 0] and predict[:, :, 1] are different values? Which is the correct predicted value?
    su-chao
    @su-chao
    It's very confusing
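    For context, a minimal sketch assuming a binary classification model: DeepChem classifiers typically return per-class probabilities with shape (n_samples, n_tasks, n_classes), so the last axis holds the probabilities of class 0 and class 1:

    import numpy as np

    # Hypothetical output of model.predict(dataset) for one task and two samples.
    preds = np.array([[[0.9, 0.1]],
                      [[0.2, 0.8]]])   # shape (n_samples, n_tasks, 2)

    prob_positive = preds[:, :, 1]        # probability of the positive class per task
    hard_labels = np.argmax(preds, -1)    # 0/1 predictions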
    davidRFB
    @davidRFB
    Hi @rbharath, thank you for the answer. The same error appears with other tests.
    [attached screenshot: image.png]
    I am executing with deepchem 2.6.0dev inside the deepchem directory that I forked from GitHub. Maybe an environment reset could work?
    Tonylac77
    @Tonylac77

    Dear all, sorry for the late answer. It seems we have fixed the iteration over DiskDatasets. In this case we use a 'for' loop to unpack the tuples generated by the k-fold split function, and then output them as two CSV files per fold (we want to do this for use in other machine learning software).

    k = [k1, k2, k3, k4, k5, k6, k7, k8, k9, k10]  # where k1-k10 are the (train, cv) fold tuples from the k-fold split
    a = 1

    for x in k:
        # each fold is a (train, cv) pair of DiskDatasets
        train = x[0].to_dataframe()
        cv = x[1].to_dataframe()
        # keep only the molecule IDs
        train = train['ids']
        cv = cv['ids']
        train.to_csv("k" + str(a) + "_train.csv")
        cv.to_csv("k" + str(a) + "_cv.csv")
        a = a + 1

    This works fine when loading our dataset from CSV (with the CSVLoader function) without an ID field. However, if we try to use a dataset with ChEMBL IDs (in this case) we get the following RDKit error when performing the k-fold split (see below). Would love any input on this!

    ArgumentError                             Traceback (most recent call last)
    <ipython-input-8-604eaa868421> in <module>
          1 # split dataset
    ----> 2 k = splitter.k_fold_split(dataset=dataset, k=10)
    
    C:\Anaconda3\envs\deepchem38\lib\site-packages\deepchem\splits\splitters.py in k_fold_split(self, dataset, k, directories, **kwargs)
         84       frac_fold = 1. / (k - fold)
         85       train_dir, cv_dir = directories[2 * fold], directories[2 * fold + 1]
    ---> 86       fold_inds, rem_inds, _ = self.split(
         87           rem_dataset,
         88           frac_train=frac_fold,
    
    C:\Anaconda3\envs\deepchem38\lib\site-packages\deepchem\splits\splitters.py in split(self, dataset, frac_train, frac_valid, frac_test, seed, log_every_n)
       1107     for ind, smiles in enumerate(dataset.ids):
       1108       mols.append(Chem.MolFromSmiles(smiles))
    -> 1109     fps = [AllChem.GetMorganFingerprintAsBitVect(x, 2, 1024) for x in mols]
       1110 
       1111     # calcaulate scaffold sets
    
    C:\Anaconda3\envs\deepchem38\lib\site-packages\deepchem\splits\splitters.py in <listcomp>(.0)
       1107     for ind, smiles in enumerate(dataset.ids):
       1108       mols.append(Chem.MolFromSmiles(smiles))
    -> 1109     fps = [AllChem.GetMorganFingerprintAsBitVect(x, 2, 1024) for x in mols]
       1110 
       1111     # calcaulate scaffold sets
    
    ArgumentError: Python argument types in
        rdkit.Chem.rdMolDescriptors.GetMorganFingerprintAsBitVect(NoneType, int, int)
    did not match C++ signature:
        GetMorganFingerprintAsBitVect(class RDKit::ROMol mol, unsigned int radius, unsigned int nBits=2048, class boost::python::api::object invariants=[], class boost::python::api::object fromAtoms=[], bool useChirality=False, bool useBondTypes=True, bool useFeatures=False, class boost::python::api::object bitInfo=None, bool includeRedundantEnvironments=False)
    Vignesh Ram Somnath
    @vsomnath
    This feels like a molecule / SMILES is invalid.
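    A minimal sketch for locating the offending entries, based on the traceback above where the splitter parses dataset.ids as SMILES (so ChEMBL IDs in that field make Chem.MolFromSmiles return None); the ids list here is a hypothetical stand-in:

    from rdkit import Chem

    # Stand-in for dataset.ids; in the failing run these held ChEMBL IDs rather than SMILES.
    ids = ["CCO", "CHEMBL25", "c1ccccc1"]

    bad = [s for s in ids if Chem.MolFromSmiles(s) is None]
    print("entries the splitter cannot parse as SMILES:", bad)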