
Note that this won't have RDKit support (since RDKit isn't on pip), so you'll have to figure out how to install RDKit in your environment. This should make hacking on DeepChem much nicer!

Any feedback or comments would be much appreciated :)

Digging into it, I see that sklearn.metrics has a `jaccard_score`, but not a `jaccard_similarity_score` function. Is this due to a version mismatch?
I installed using conda, following the instructions on the DeepChem website

I'd recommend using deepchem-nightly though :)

I'll update the docs soon, but you can do

`pip install tensorflow; pip install deepchem-nightly`

@rbharath For a model with uncertainty prediction capabilities, when `uncertainty=True` is set and a trained model is used to `predict_uncertainty`, arrays of `pred` and `std` values corresponding to the sample y are returned. For these 2 arrays, am I right to say that every value in the `pred` array is the **mean** predicted value over 50 predictions (the standard 50 dropout masks), and that the corresponding value in `std` can be considered the **standard deviation** of that prediction (the root of the summed squared aleatoric and epistemic uncertainties estimated by dropout)? Thanks a lot! And I'm looking forward to the new 2.4.0 version!
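For what it's worth, here is a minimal numpy sketch of the aggregation step as I understand it: mean over the stochastic forward passes, with the total std combining aleatoric and epistemic variances in quadrature. The array names and the per-sample aleatoric variance are my own illustration, not DeepChem's internals:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are 50 stochastic forward passes (a fresh dropout mask
# each time) for 10 samples; shape (n_passes, n_samples).
preds = rng.normal(loc=3.0, scale=0.2, size=(50, 10))

# Pretend the model also predicts a per-sample aleatoric variance.
aleatoric_var = np.full(10, 0.05)

pred = preds.mean(axis=0)            # mean over the 50 dropout predictions
epistemic_var = preds.var(axis=0)    # spread across masks = epistemic term

# Total uncertainty: square root of the summed variance components.
std = np.sqrt(aleatoric_var + epistemic_var)

print(pred.shape, std.shape)  # → (10,) (10,)
```

The total std can never drop below the aleatoric floor, since the epistemic variance is non-negative.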


I have another question on normalisation transformers. Would it be necessary (in terms of better model performance) to use them on all datasets to transform the y values? My model is required to output prediction uncertainties, and having transformers causes some misalignment in the dataset scales.
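In case it helps, the scale misalignment can usually be undone by inverting the z-score transform after prediction: shift-and-scale the predicted means, but scale-only the predicted standard deviations. A numpy sketch of that idea (variable names are mine, not DeepChem's API):

```python
import numpy as np

y = np.array([10.0, 12.0, 14.0, 20.0])   # original-scale targets
y_mean, y_std = y.mean(), y.std()

# z-score transform applied before training (what a normalisation
# transformer on y effectively does).
y_norm = (y - y_mean) / y_std

# Suppose the model predicts in the normalised space:
pred_norm = y_norm.copy()
std_norm = np.full_like(y, 0.1)

# Undo the transform: the mean gets both scale and shift back,
# the std only gets the scale (a shift does not change spread).
pred = pred_norm * y_std + y_mean
std = std_norm * y_std

print(np.allclose(pred, y))  # → True
```

Because the shift cancels when measuring spread, only multiplying `std` by `y_std` is needed to return uncertainties to the original units.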

Could you please help me ?

@rbharath Is this supposed to be a `-1`? https://github.com/deepchem/deepchem/blob/master/deepchem/models/graph_models.py#L638

Let me dig into this a bit and get back to you shortly

@chstem I'm able to replicate it now. See discussion in this issue deepchem/deepchem#1944. I'm seeing it on the 100th epoch on my test, but I'm not sure what's causing this issue

Is your dataset large by any chance? Maybe it has something to do with the number of training steps.

I am doing the tutorial "Graph Convolutions For Tox21" and trying to understand the math (by reading the source code).

I posted a couple of questions about GraphConv to the DeepChem forum (since the questions are not very specific, the forum seems like a more appropriate place to post them than here in gitter). I would appreciate it if someone could have a look and answer them.

It's a well written paper so it will hopefully be clear, but I'm glad to answer questions here or on the forums :)

@Ohyeahmanolito You can check out my old paper https://pubs.acs.org/doi/10.1021/acscentsci.6b00367 :)

It's on my TODO list to get this back into DeepChem. The sample code for this only worked with an old version of DeepChem unfortunately

`DiskDataset.create_dataset(shard_generator, data_dir)` will generate and store all the data on disk. But is there any function to reload it from `data_dir`, without having to run `shard_generator` again? There is `dc.utils.save.save_dataset_to_disk()`, but this is for separate train, val, test data ...

Also, it seems like GraphConvModel works mainly for smaller structures and not large structures, is that true? And the current implementation of GraphConvModel is based on the neural fingerprint (NFP) paper?

@cpfpengfei Check out the predict_embedding method: https://deepchem.readthedocs.io/en/latest/models.html#deepchem.models.GraphConvModel.predict_embedding

This will let you extract the "neural fingerprint" from a trained graph conv model

GraphConvModel has primarily been tested on smaller structures, but I believe it should also work fine on larger molecules. The implementation is based on the NeuralFingerprint paper