    Hello everybody. I am new to pytorch-geometric and am looking for a resource that helps me learn how to create different graph neural network architectures. I could not follow the pytorch-geometric site well; I need a resource with sample GNN architectures. Many thanks.

    For GNNs, I have found the CS224W Machine Learning with Graphs lectures useful. You might want to start from lecture 7; they are available on YouTube: https://www.youtube.com/watch?v=JAB_plj2rbA&list=PLoROMvodv4rPLKxIpqhjhPgdQy7imNkDn You might also be interested in the course materials, which you can find here: http://web.stanford.edu/class/cs224w/

    I was wondering why we need to install miniconda and then install deepchem. Why can't we directly install deepchem in colab?
    Vignesh Venkataraman
    We had been using miniconda for the RDKit install. But recently, RDKit made a pip release: https://pypi.org/project/rdkit-pypi/ . I don't think we need the conda install anymore.
    Thanks, that helps a lot!
    Omid Tarkhaneh
    @arunppsg Thanks a lot for your help
    Shaipranesh S
    Hello devs, I have a question: should we use python2 to work optimally with deepchem? I ask because the tutorials use pip instead of pip3.
    Hi, everybody! Are there examples/tutorials of deepchem in a pytorch env.? The github documents are mainly tensorflow-related deepchem guides. I ran the examples on pytorch and got errors. My python env.: windows 10, python 3.7.6, deepchem 2.6 dev, pytorch 1.6 (cpu only)
    @Caped-Crusader624 deepchem supports python 3.7 to 3.8. Regarding pip: in a conda environment with python 3.x, or when you only have python 3.x installed, you need not write pip3 explicitly. pip works fine.
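To make the pip point concrete: `pip` installs into whichever interpreter it belongs to, so it can help to check the interpreter version and to call pip through that interpreter. The following is a minimal sketch, assuming the 3.7 to 3.8 range quoted above (it may differ for other releases); `is_supported` and `pip_command` are hypothetical helpers, not part of DeepChem.

```python
import sys

# Supported range as quoted in the chat (assumption: other DeepChem
# releases may support a different range).
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 8)


def is_supported(version_info):
    """Return True if the (major, minor) pair lies in the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


def pip_command(package):
    """Build a pip command bound to the running interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "supported:", is_supported(sys.version_info))
    print("Install with:", " ".join(pip_command("deepchem")))
```

Running pip as `python -m pip` sidesteps the pip/pip3 ambiguity, because the package always lands in the interpreter that executed the command.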
    Hi everyone, please could you take a look at this google colab notebook? https://colab.research.google.com/drive/1mFqqgoOElV_XKAXzPufdQdcWWMY8yzp3?usp=sharing I have some questions related to the use of the MolGraphConvFeaturizer and Keras functional API. Thank you for your time in advance.
    Bharath Ramsundar
    @rjd55 I took a quick look but it looks like the access settings are set to private
    Sorry about the incorrect access settings, here is the notebook with my code and questions (which are above the model) https://colab.research.google.com/drive/1mFqqgoOElV_XKAXzPufdQdcWWMY8yzp3?usp=sharing @rbharath
    Hey everyone, I want to make a model that estimates drug-target interaction. I want a GraphConv branch for the drugs and a CNN branch for the targets. Can I use deepchem to write a model like this:
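    import deepchem as dc
    import tensorflow as tf
    from tensorflow.keras.layers import Input, Conv1D, GlobalMaxPooling1D, Dense, Dropout
    from tensorflow.keras.models import Model
    from deepchem.models.layers import GraphConv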
    protein_input = Input(shape=(train_protein.shape[1:]))
    compound_input = Input(shape=(featurized_smiles.shape[1:]))
    #protein layers
    x = Conv1D(filters=32, padding="valid", activation="relu", strides=1, kernel_size=4)(protein_input)
    x = Conv1D(filters=64, padding="valid", activation="relu", strides=1, kernel_size=8)(x)
    x = Conv1D(filters=96, padding="valid", activation="relu", strides=1, kernel_size=12)(x)
    final_protein = GlobalMaxPooling1D()(x)
    #compound layers
    #How do I add in a DeepChem layer like this?
    y = GraphConv(128)(compound_input)
    final_compound = GlobalMaxPooling1D()(y)
    join = tf.keras.layers.concatenate([final_protein, final_compound], axis=-1)
    x = Dense(1024, activation="relu")(join)
    x = Dropout(0.1)(x)
    x = Dense(1024, activation='relu')(x)
    x = Dropout(0.1)(x)
    x = Dense(512, activation='relu')(x)
    predictions = Dense(1,kernel_initializer='normal')(x)
    keras_model = Model(inputs=[protein_input, compound_input], outputs=[predictions])
    model = dc.models.KerasModel(keras_model, dc.models.losses.L2Loss())  # L2Loss for regression; the base Loss class is abstract
    Bharath Ramsundar
    @rjd55 I think in theory yes, but I don't think we've had someone build a model like this before
    Give it a try and let us know if it works! This is a usecase we'd like to support :)
    Bharath Ramsundar
    @/all If your organization has used DeepChem, can you please chime in on this thread? https://forum.deepchem.io/t/organizations-using-deepchem/567 We're trying to gather a list of companies using DeepChem for the new website and for grant applications
    Bharath Ramsundar
    ^ Check out our new deepchem model wishlist internships
    Rahul Ramesh
    I was trying to reproduce the AUC_PRC numbers for the MUV dataset using the Multi-task Classifier. I was using https://github.com/deepchem/deepchem/blob/085a3e53fb69b3108de0527afa1f144e1b861108/examples/muv/muv_tf.py but haven't had any success. I get an AUC-PRC close to 0.11 (while the MoleculeNet paper reports 0.17)
    Hello, I want to create a torch dataset from dc.data.NumpyDataset and I tried the following code:
    import numpy as np
    import deepchem as dc
    X = np.random.rand(5, 2)
    y = np.random.rand(5)
    dataset = dc.data.NumpyDataset(X, y)
    torch_dataset = dataset.make_pytorch_dataset()
    I expected torch.utils.data.Dataset but got deepchem.data.pytorch_datasets._TorchNumpyDataset. Any hints on why it is not getting converted?
    Bharath Ramsundar
    @arunppsg Hmmm, could you raise a github issue? Let's track on there to see what's happening
    @rahul13ramesh You might need to do some hyperparameter tuning. The underlying numerical code in TensorFlow/etc has changed a lot since the original paper (it's been over 3 years) so some retuning might be needed
    Hi! A deepchem beginner here 🙂 I have used the RDKitDescriptors class to featurize my dataset. As expected, it returned an array of descriptors for each molecule. But how can I tell which descriptor is in each column?
    Hi! I was wondering if someone could help me with autoencoder implementation in the contrib folder. From what I see that is deprecated. I tried to run the test code but can't successfully load the zinc_model.h5 file provided from http://karlleswing.com/misc/keras-molecule/model.h5. Is seqtoseq the replacement? I know I can use that to train a new model, but I just wanted to see if there was an already trained model available.
    Bharath Ramsundar
    @jxsoares The easiest way is likely to check the source for that class. There's a list of descriptors in there I think
    @agoliaei The Seq2seq model can be used as an autoencoder. If you're looking for a generative model, check out molgan or normalizing flows
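On the descriptor-column question above, one way to inspect the names is to read them from RDKit itself and pair them with a feature row. This is only a sketch under assumptions: it uses RDKit's `Descriptors._descList` (a list of name/function pairs), and the column order DeepChem actually produces depends on the DeepChem version and featurizer options, so verify against the `RDKitDescriptors` source you have installed. `rdkit_descriptor_names` and `label_row` are hypothetical helper names.

```python
def rdkit_descriptor_names(include_fragments=True):
    """List RDKit descriptor names in the order RDKit registers them.

    Requires rdkit; it is imported lazily so label_row works without it.
    """
    from rdkit.Chem import Descriptors

    names = [name for name, _ in Descriptors._descList]
    if not include_fragments:
        # Fragment-count descriptors all start with "fr_".
        names = [n for n in names if not n.startswith("fr_")]
    return names


def label_row(names, row):
    """Pair each descriptor name with the value in the matching column."""
    if len(names) != len(row):
        raise ValueError("expected %d columns, got %d" % (len(names), len(row)))
    return dict(zip(names, row))
```

If the lengths disagree, `label_row` raises instead of silently mislabeling columns, which is a useful signal that the name list and the featurizer output do not correspond.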
    @rbharath Thanks so much! Yes, I am looking for a GAN. I will take a look at those.
    Bharath Ramsundar
    Our first online DeepChem conference is now scheduled! October 6th from 9am PST - 11am PST. Please RSVP: https://www.meetup.com/DeepChem-User-Group/events/280694218/
    Hello, I find that none of the tutorial notebooks work well in google colaboratory. They all return errors when the code loads datasets. I am not sure what else to provide other than this fact.
    Bharath Ramsundar
    @marissa270 Can you copy/paste the error here? I don't think I've seen this before

    ImportError Traceback (most recent call last)

    <ipython-input-5-9bfb66006519> in <module>()
    ----> 1 tasks, datasets, transformers = dc.molnet.load_delaney(featurizer='GraphConv')
    2 train_dataset, valid_dataset, test_dataset = datasets

    10 frames
    /usr/local/lib/python3.7/dist-packages/deepchem/molnet/load_function/delaney_datasets.py in load_delaney(featurizer, splitter, transformers, reload, data_dir, save_dir, **kwargs)
    78 loader = _DelaneyLoader(featurizer, splitter, transformers, DELANEY_TASKS,
    79 data_dir, save_dir, **kwargs)
    ---> 80 return loader.load_dataset('delaney', reload)

    /usr/local/lib/python3.7/dist-packages/deepchem/molnet/load_function/molnet_loader.py in load_dataset(self, name, reload)
    176 logger.info("About to featurize %s dataset." % name)
    --> 177 dataset = self.create_dataset()
    179 # Split and transform the dataset.

    /usr/local/lib/python3.7/dist-packages/deepchem/molnet/load_function/delaney_datasets.py in create_dataset(self)
    20 loader = dc.data.CSVLoader(
    21 tasks=self.tasks, feature_field="smiles", featurizer=self.featurizer)
    ---> 22 return loader.create_dataset(dataset_file, shard_size=8192)

    /usr/local/lib/python3.7/dist-packages/deepchem/data/data_loader.py in create_dataset(self, inputs, data_dir, shard_size)
    225 yield X, y, w, ids
    --> 227 return DiskDataset.create_dataset(shard_generator(), data_dir, self.tasks)
    229 def _get_shards(self, inputs: List, shard_size: Optional[int]) -> Iterator:

    /usr/local/lib/python3.7/dist-packages/deepchem/data/datasets.py in create_dataset(shard_generator, data_dir, tasks)
    1202 metadata_rows = []
    1203 time1 = time.time()
    -> 1204 for shard_num, (X, y, w, ids) in enumerate(shard_generator):
    1205 basename = "shard-%d" % shard_num
    1206 metadata_rows.append(

    /usr/local/lib/python3.7/dist-packages/deepchem/data/data_loader.py in shard_generator()
    205 for shard_num, shard in enumerate(self._get_shards(inputs, shard_size)):
    206 time1 = time.time()
    --> 207 X, valid_inds = self._featurize_shard(shard)
    208 ids = shard[self.id_field].values
    209 ids = ids[valid_inds]

    /usr/local/lib/python3.7/dist-packages/deepchem/data/data_loader.py in _featurize_shard(self, shard)
    394 raise ValueError(
    395 "featurizer must be specified in constructor to featurizer data/")
    --> 396 features = [elt for elt in self.featurizer(shard[self.feature_field])]
    397 valid_inds = np.array(
    398 [1 if np.array(elt).size > 0 else 0 for elt in features], dtype=bool)

    /usr/local/lib/python3.7/dist-packages/deepchem/feat/base_classes.py in __call__(self, datapoints, **kwargs)
    68 Any blob of data you like. Subclasss should instantiate this.
    69 """
    ---> 70 return self.featurize(datapoints, **kwargs)
    72 def _featurize(self, datapoint: Any, **kwargs):

    /usr/local/lib/python3.7/dist-packages/deepchem/feat/graph_features.py in featurize(self, datapoints, log_every_n, **kwargs)
    785 features = super(ConvMolFeaturizer, self).featurize(
    --> 786 datapoints, log_every_n=1000)
    787 if self.per_atom_fragmentation:
    788 # create temporary valid ids serving to filter out failed featurizations from every sublist

    /usr/local/lib/python3.7/dist-packages/deepchem/feat/base_classes.py in featurize(self, datapoints, log_every_n, **kwargs)
    260 """
    261 try:
    --> 262 from rdkit import Chem
    263 from rdkit.Chem import rdmolfiles
    264 from rdkit.Chem import rdmolops

    /root/miniconda/lib/python3.7/site-packages/rdkit/Chem/__init__.py in <module>()
    21 _HasSubstructMatchStr = rdchem._HasSubstructMatchStr
    22 from rdkit.Chem.rdchem import *
    ---> 23 from rdkit.Chem.rdmolfiles import *
    24 from rdkit.Chem.rdmolops import *
    25 from rdkit.Chem.rdCIPLabeler import *

    ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /root/miniconda/lib/python3.7/site-packages/rdkit/Chem/../../../../libboost_regex.so.1.74.0)

    NOTE: If your import is failing due to a missing package, you can
    manually install dependencies using either !pip or !apt.

    To view examples of installing some common dependencies, click the

    "Open Examples" button below.

    I do not know python enough to figure out what to do with this. And I definitely waited around 5 minutes for conda to be installed by running the preceding code blocks.
    Bharath Ramsundar
    Hmm, ok try this:
    `pip install rdkit-pypi`
    I think the issue is that the conda rdkit is broken on colab
    So you might want to install it from pip
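The ImportError above says the system `libstdc++.so.6` does not advertise the `GLIBCXX_3.4.26` symbol version that the conda-installed Boost library needs. A rough way to confirm this is to scan the library file for its version tags, similar to running `strings` on it and filtering for GLIBCXX. This is a diagnostic sketch only; `glibcxx_versions` and `provides` are hypothetical helpers, and the path is copied from the error message, so it may differ on other systems.

```python
import re


def glibcxx_versions(blob):
    """Extract the GLIBCXX_x.y[.z] version tags found in raw library bytes."""
    found = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", blob))
    # Sort numerically so that 3.4.25 comes after 3.4.9.
    return sorted(
        (v.decode("ascii") for v in found),
        key=lambda v: tuple(int(part) for part in v.split(".")),
    )


def provides(blob, required):
    """Return True if the library bytes advertise the required version tag."""
    return required in glibcxx_versions(blob)


if __name__ == "__main__":
    # Path taken from the error message above.
    path = "/usr/lib/x86_64-linux-gnu/libstdc++.so.6"
    try:
        with open(path, "rb") as handle:
            data = handle.read()
    except OSError:
        print("could not read", path)
    else:
        print("GLIBCXX_3.4.26 available:", provides(data, "3.4.26"))
```

If the tag is missing, the conda build was compiled against a newer C++ runtime than the one the system ships, which is consistent with the suggestion to switch to the pip build of RDKit.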
    is there any way to save model to disk? model.save() is not working and pickle is not working?
    Hi, Is there a way to convert the variable length SMILES encodings into fixed length numerical feature vectors?
    Atreya Majumdar

    is there any way to save model to disk? model.save() is not working and pickle is not working?

    You can try reloading model checkpoints if model.save is not working for you.

    Hi, Is there a way to convert the variable length SMILES encodings into fixed length numerical feature vectors?

    Padding the vectors to the size of the largest vector might help.
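The padding idea can be sketched in plain Python. This is a minimal illustration, not a DeepChem API: it tokenizes SMILES character by character (a simplification, since two-letter atoms such as Cl or Br are split), maps each character to an integer, and right-pads every row with a reserved index. `build_vocab` and `encode_padded` are hypothetical helper names.

```python
PAD = 0  # index reserved for padding


def build_vocab(smiles_list):
    """Map every character seen in the SMILES strings to an index >= 1."""
    chars = sorted(set("".join(smiles_list)))
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode_padded(smiles_list, vocab, max_len=None):
    """Encode SMILES as integer rows, right-padded with PAD to equal length."""
    if max_len is None:
        max_len = max(len(s) for s in smiles_list)
    rows = []
    for smi in smiles_list:
        # Truncate to max_len, then pad; unknown characters raise KeyError.
        ids = [vocab[ch] for ch in smi[:max_len]]
        rows.append(ids + [PAD] * (max_len - len(ids)))
    return rows


smiles = ["CCC", "CCO", "c1ccccc1"]
vocab = build_vocab(smiles)
X = encode_padded(smiles, vocab)
print(len(X), len(X[0]))  # 3 8
```

Passing an explicit `max_len` keeps the vector length stable between training and inference; strings longer than that are truncated in this sketch.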

    Bharath Ramsundar
    @seungwoo-shin Models are saved automatically when you call fit(). Just make sure to specify model_dir in the constructor
    @r-b-1-5 Check out our fingerprints

    Hi, I'm working on encoding SMILES as feature vectors, and this particular module does not give the expected result: https://deepchem.readthedocs.io/en/latest/api_reference/featurizers.html#mol2vecfingerprint

    import deepchem as dc

    from rdkit import Chem
    smiles = ['CCC']
    featurizer = dc.feat.Mol2VecFingerprint()
    features = featurizer.featurize(smiles)

    <class 'numpy.ndarray'>

    This is the demo code that is present in the examples subsection of the module, and the feature vectors yield a shape (0,) when I run it after installing the mol2vec package from github, as instructed. Why is this so?

    Hi, I'm working through a tutorial for Graph Convolutions here (https://github.com/deepchem/deepchem/blob/master/examples/tutorials/Introduction_to_Graph_Convolutions.ipynb) and get the error below:
    miniconda3/envs/my-rdkit-env/lib/python3.7/site-packages/deepchem/models/keras_model.py:474 apply_gradient_for_batch  *
        grads = tape.gradient(batch_loss, vars)
    .local/lib/python3.7/site-packages/tensorflow_core/python/eager/backprop.py:1014 gradient
    .local/lib/python3.7/site-packages/tensorflow_core/python/eager/imperative_grad.py:76 imperative_grad
    .local/lib/python3.7/site-packages/tensorflow_core/python/eager/backprop.py:138 _gradient_function
        return grad_fn(mock_op, *out_grads)
    .local/lib/python3.7/site-packages/tensorflow_core/python/ops/math_grad.py:455 _UnsortedSegmentMaxGrad
        return _UnsortedSegmentMinOrMaxGrad(op, grad)
    .local/lib/python3.7/site-packages/tensorflow_core/python/ops/math_grad.py:432 _UnsortedSegmentMinOrMaxGrad
        _GatherDropNegatives(op.outputs[0], op.inputs[1])
    TypeError: 'NoneType' object is not subscriptable

    This is when running the code verbatim:

    import deepchem as dc

    tasks, datasets, transformers = dc.molnet.load_tox21(featurizer='GraphConv')
    train_dataset, valid_dataset, test_dataset = datasets

    n_tasks = len(tasks)
    model = dc.models.GraphConvModel(n_tasks, mode='classification')
    model.fit(train_dataset, nb_epoch=50)

    Has anyone seen this error or know where it originates? I installed tensorflow as suggested in the installation notes
    @ng478 it seems to work fine in colab. Can you share your tensorflow and numpy version?
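A quick way to collect the versions asked for here is to import each package and read its `__version__` attribute. This is a small sketch; `installed_versions` is a hypothetical helper, and it assumes the packages expose `__version__`, which deepchem, tensorflow, and numpy do as far as I know.

```python
import importlib


def installed_versions(packages):
    """Return {package: version string, or None if it cannot be imported}."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Fall back to "unknown" for packages without a __version__ attribute.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["deepchem", "tensorflow", "numpy"]).items():
        print(name, version or "not installed")
```

Pasting this output alongside a traceback makes it easier to tell a version mismatch from a genuine bug.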