

@axi672 Thanks for reporting this. I have updated the notebook now to work with the latest version of the codebase. If you still get any errors, let me know along with your code, so that I can reproduce it.

Hi @ankurankan, I just tested reading back the file written with ProbModelXMLWriter and got the following error:

```
/usr/local/lib/python3.6/dist-packages/pgmpy/readwrite/ProbModelXML.py in add_edge(self, edge)
981 >>> reader.add_edge(edge)
982 """
--> 983 var1 = edge.findall("Variable")[0].attrib["name"]
984 var2 = edge.findall("Variable")[1].attrib["name"]
985 self.probnet["edges"][(var1, var2)] = {}
IndexError: list index out of range
```

Code I’m running:

```
from pgmpy.readwrite import ProbModelXMLReader, ProbModelXMLWriter, get_probmodel_data

model_data = get_probmodel_data(model)
writer = ProbModelXMLWriter(model_data=model_data)
writer.write_file('test.pgmx')
reader_string = ProbModelXMLReader('test.pgmx')
```

The model is the one from the notebook https://github.com/pgmpy/pgmpy_notebook/blob/master/notebooks/8.%20Reading%20and%20Writing%20from%20pgmpy%20file%20formats.ipynb

XMLBeliefNetwork also gives the following error:

```
/usr/local/lib/python3.6/dist-packages/pgmpy/readwrite/XMLBeliefNetwork.py in get_static_properties(self)
97 return {
98 tags.tag: tags.get("VALUE")
---> 99 for tags in self.bnmodel.find("STATICPROPERTIES")
100 }
101
TypeError: 'NoneType' object is not iterable
```

The numpy array will be an n-dimensional array (where n is the number of variables), and each axis represents the values of the corresponding variable.
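To illustrate (a toy sketch using numpy only; the joint distribution below is made up):

```python
import numpy as np

# Hypothetical joint distribution over 3 binary variables (A, B, C):
# axis 0 indexes A's states, axis 1 indexes B's, axis 2 indexes C's.
joint = np.array([[[0.10, 0.05],
                   [0.15, 0.10]],
                  [[0.20, 0.10],
                   [0.20, 0.10]]])

print(joint.shape)             # one axis per variable: (2, 2, 2)
print(joint.sum(axis=(1, 2)))  # marginal P(A), summing out B and C
```

Marginalizing a variable is just a sum over its axis, which is what makes the n-dim layout convenient.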

@ankurankan

```
from pgmpy.sampling import BayesianModelSampling

inference = BayesianModelSampling(model)
inference.likelihood_weighted_sample(evidence=evidence, size=2)
```

The profiler suggests that the function `pre_compute_reduce` is taking a significant amount of time. Is there any way I could do the inference faster, given some evidence?

Further analysis shows that it was the `copy` that was slowing everything down. Here is my code:

```
q = infer.query(variables=["HAS_DISEASE"], evidence = )
```

@benthestudent_twitter Evidence basically tells the inference method the values of variables that are already known. For example, let's say you are trying to compute the probability of a given ball being a basketball or a football based on its weight and diameter. If you already know the weight and/or diameter of the ball, you can pass it as the evidence, and the inference method will increase or decrease the probability of it being a basketball or a football accordingly. In general machine learning terminology, say we are trying to predict y using X as the features: when making a prediction for a new datapoint, we want to know the value of y given X, so the values of X become the evidence and y becomes the variable in the query.

@ankurankan Sorry, I have an issue with prediction in pgmpy. I create the model through BayesianModel and would like to do prediction. I followed the example source code of the predict method in the library (https://pgmpy.org/models.html#module-pgmpy.models.BayesianModel), but I am curious why they use the 'value' variable, which holds the whole dataset, to fit the model. Why don't they use the training data? When I follow them and fit the model with the whole dataset, the RMSE and R-squared that I use to check accuracy are 1.71 and -6 respectively, which is very weird. However, I get a KeyError when I change to fitting the model with the training data.

Since the column to be predicted needs to be dropped before doing the prediction, why is there still an error?

@KittyNatty @ankurankan I also encountered this KeyError exception when doing prediction with a Bayesian model. Here is the screenshot:

@MatheusCL8 Could you please elaborate on what exactly you are trying to do with Dynamic BNs, because pgmpy has limited functionality for them. Things like learning them from data are not very straightforward.

@ankurankan thanks for the reply. I uploaded the code and dataset to https://c-t.work/s/7b9c3f200fc84a (password: 2020325).

@axi672 Yes, have a look at https://github.com/pgmpy/pgmpy/blob/dev/pgmpy/inference/CausalInference.py. You can compute adjustment sets, biasing paths, etc., and then based on those apply any statistical model to compute the effects. For linear estimation, you can simply use `CausalInference.estimate_ate`, but for other estimates you will need to use other packages like `statsmodels`.
Hi @ankurankan, I'm new to PGMs and found the package interesting. I'm trying to build a Bayesian network for continuous random variables. I would like to build a network, infer the dependencies between these variables, and estimate the population covariance parameters, mean, and standard deviation. By going through the tutorials I realized discretizing the variables is the way to go. I was using pymc3 so far to do inference, but it does not have support for graphical models.

- Is using the directed Bayesian network the way to go for this kind of problem? (Looking for a simple fit model, not linear models)
- After discretizing the continuous random variable, how can I build CPTs?

It would be great if you could give me any lead. With pymc3 I built the model using LKJ priors: https://docs.pymc.io/notebooks/LKJ.html

If you want to work with discrete variables, you can build the model in pgmpy and then call the `fit` method on it to learn the CPTs from the data. Have a look at the `BayesianModel.fit` method.
@ankurankan Thank you very much for the reply. I will look at the `BayesianModel.fit` method. I'm pretty new to this field, so let me explain with an example.

Let's say I am monitoring an industrial process where my target variable of interest is the "fineness of coffee powder". The variables that can affect the fineness of a coffee powder are the vibration of the engine, voltage, amount of coffee beans, room temperature, humidity etc. I also assume that voltage and vibration are covarying. In such a situation, if I want to build a graphical model, what is the right approach?

- As you suggested I can build a linear model using this data. But can I still infer the covariance between voltage and vibration or voltage and coffee powder fineness?
- Can I ask conditional probability questions such as P(powder_fineness | voltage, humidity)?

I would really appreciate any help. Is there any toy problem similar to this?

Is there a way to know/calculate, in a Bayesian network, the respective impact of other nodes on one node? I am doing prediction with pgmpy and would like to find the nodes that influence the prediction (the label node), along with their importance ranking based on their contribution to the prediction.

@ankurankan


@vvrahul11 You should be able to answer both questions with both discrete and linear models. For the covariance question, you would want to look into causal inference, as you are trying to deduce direct and indirect covariance between two variables from the sample covariance. You would basically need to find the adjustment set for the covariance parameter, such that conditioning on those variables results in a single path between the variables. For the conditional probabilities, I think you will have to make a distributional assumption, say linear Gaussian, and then you can compute a conditional Gaussian distribution.

@bushyttail I think for such questions you can look into causal inference. With that you will be able to quantify the strength of the direct relationships between variables, which will give you a sense of how much they affect each other.

I am trying to implement a dynamic network on a real dataset with various types of variables spanning several years, and to make a 10-year forecast over this dataset. I saw that the library is quite limited for doing this with a very large set of variables. I'm trying to adapt it to what I want, but I have doubts about time_slice, and also about the application of DBNInference and its output. About time_slice: does it not accept more than one value at once? Do I have to add several variables with different time slices? And about DBNInference: could you clarify the parameters it asks for? I am new to the library and had difficulty understanding how it expects the parameters and what the output looks like. Thank you very much. Although the library does not have very good documentation, it is a great library.
