

I am sorry, here is the link for the cache implementation: https://github.com/pgmpy/pgmpy/pull/944/commits/54c92c0b8c3d7fbb0df39258b8d8942865d64dd0

```
+-------------------------------------+-----------+
| project_type(Fast Track Onboarding) | 0.0299222 |
+-------------------------------------+-----------+
| project_type(Innovation)            | 0.0113704 |
+-------------------------------------+-----------+
| project_type(governance)            | 0.0388989 |
+-------------------------------------+-----------+
| project_type(innovation)            | 0.032316  |
+-------------------------------------+-----------+
| project_type(other)                 | 0.831837  |
+-------------------------------------+-----------+
| project_type(performance)           | 0.0359066 |
+-------------------------------------+-----------+
| project_type(productivity)          | 0.0197487 |
+-------------------------------------+-----------+
```

In this example I want to get the state names, i.e. `Fast Track Onboarding, Innovation...`
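One way to pull those state names out programmatically, assuming the query result is a DiscreteFactor-like object exposing `state_names` and `values`. A small stand-in class is used below so the sketch runs on its own; on a real pgmpy result the same attribute access should apply:

```python
# Minimal sketch: pairing each state name with its probability.
# `FakeFactor` is a hypothetical stand-in for the factor returned by a
# pgmpy query; it only mimics the attributes the sketch needs.

class FakeFactor:
    def __init__(self):
        self.variables = ['project_type']
        self.state_names = {
            'project_type': ['Fast Track Onboarding', 'Innovation',
                             'governance', 'innovation', 'other',
                             'performance', 'productivity']
        }
        self.values = [0.0299222, 0.0113704, 0.0388989, 0.032316,
                       0.831837, 0.0359066, 0.0197487]

phi = FakeFactor()

# Build a name -> probability mapping instead of parsing the printed table.
probs = dict(zip(phi.state_names['project_type'], phi.values))
print(probs['Fast Track Onboarding'])  # 0.0299222
```

The same `dict(zip(...))` pattern gives direct access to any state by name rather than by position in the printed table.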

Thanks @ankurankan, I managed to find it last night. I'm just working on pgmpy/pgmpy#913; I might create a PR for my implementation at some point once I clean the code up.

@mrkmcknz I am currently working on a Data class which will implement different conditional independence testing algorithms (for both continuous and hybrid data). And with minor changes in the current structure learning algorithms, they should be able to learn the structure from continuous and hybrid datasets. But I don't think I will have the bandwidth to work on parameter learning or inference on continuous models anytime soon.

Hey everyone, I am just getting started with pgmpy and have a question regarding the performance of inference via BeliefPropagation. I have set up a moderate Bayesian network (about 40 nodes, 80 edges) and want to get the state probabilities for a central node (without providing any evidence). With the recent dev branch, this operation takes about 4-5 minutes on my developer machine (i7, 8GB RAM ...). I would have thought it would be faster. Am I hitting any limits here (exponential runtime growth?), or should this indeed be faster and I am doing something wrong? Any help much appreciated!
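For context on why this can be slow: junction-tree methods such as belief propagation run in time exponential in the size of the largest clique of the triangulated graph, not in the number of nodes, so a 40-node network can already be expensive if it is densely connected. A quick back-of-envelope in plain Python (no pgmpy required):

```python
# The number of entries in a clique potential is the product of the
# cardinalities of its variables; for k binary variables that is 2**k.
# A clique of ~20 binary variables already holds about a million entries.

def clique_table_size(cardinalities):
    """Entries in a factor over variables with the given cardinalities."""
    size = 1
    for c in cardinalities:
        size *= c
    return size

print(clique_table_size([2] * 10))  # 1024
print(clique_table_size([2] * 20))  # 1048576
print(clique_table_size([4] * 20))  # ~1.1e12 -- intractable
```

So whether 4-5 minutes is "normal" depends less on the node count than on how large the cliques become after triangulation.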

```
FAIL: test_sampling (pgmpy.tests.test_sampling.test_continuous_sampling.TestNUTSInference)

Traceback (most recent call last):
  File "/Users/jonvaljohn/Code/pgmpy/pgmpy/pgmpy/tests/test_sampling/test_continuous_sampling.py", line 208, in test_sampling
    np.linalg.norm(sample_covariance - self.test_model.covariance) < 0.4
AssertionError: False is not true

Ran 676 tests in 146.484s

FAILED (SKIP=7, failures=1)
```

How do I find the dependency packages' versions?
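One way to check installed package versions from Python itself, using only the standard library (`importlib.metadata` is available from Python 3.8 onward). The package names below are illustrative; substitute pgmpy's actual dependencies (e.g. `networkx`, `numpy`, `scipy`):

```python
from importlib.metadata import version, PackageNotFoundError

# Query the installed version of each package; handle missing ones gracefully.
for pkg in ['pip', 'some-package-that-is-not-installed']:
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, 'not installed')
```

From a shell, `pip show pgmpy` or `pip freeze` gives the same information for everything in the environment.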

Thanks for all your help.

Has anyone used dynamic Bayesian networks in pgmpy? The provided code implements a 2-TBN, but I don't see a way to iterate over time. What is the best practice for providing new values for the CPDs of the "0" parameters? I assume that the "1" parameters should be updated automatically.

```
from pgmpy.factors.discrete import TabularCPD
from pgmpy.models import DynamicBayesianNetwork as DBN
from pgmpy.inference import DBNInference

dbn = DBN()
# dbn.add_edges_from([(('M', 0), ('S', 0)), (('S', 0), ('S', 1)), (('S', 0), ('J', 0))])
dbn.add_edges_from([(('M', 1), ('S', 1)), (('S', 1), ('J', 1)), (('S', 0), ('S', 1))])

# S_cpd = TabularCPD(('S', 0), 2, [[0.5, 0.5]])  # Prior: the state should go 50/50
M_cpd = TabularCPD(('M', 1), 2, [[0.5, 0.5]])  # Prior: the model should go 50/50
M_S_S_cpd = TabularCPD(variable=('S', 1), variable_card=2,
                       values=[[0.95, 0.3, 0.1, 0.05],
                               [0.05, 0.7, 0.9, 0.95]],
                       evidence=[('M', 1), ('S', 0)],
                       evidence_card=[2, 2])
S_J_cpd = TabularCPD(('J', 1), 2, [[0.9, 0.1],
                                   [0.1, 0.9]],
                     evidence=[('S', 1)],
                     evidence_card=[2])
dbn.add_cpds(M_cpd, M_S_S_cpd, S_J_cpd)
dbn_inf = DBNInference(dbn)
dbn_inf.forward_inference([('J', 2)], {('M', 1): 0, ('M', 2): 0, ('S', 0): 0})
```

It is still failing... if the variable that carries state from slice to slice is also the parent within the individual time slice, the code works; otherwise it fails.

Now this gives a `KeyError`.

Because this for loop in `pre_compute_reduce()` in `Sampling.py` seems to use `(var_name, var_state_no)` as input:

```
for state_combination in itertools.product(
    *[range(self.cardinality[var]) for var in variable_evid]
):
    states = list(zip(variable_evid, state_combination))
    cached_values[state_combination] = variable_cpd.reduce(
        states, inplace=False
    ).values
```
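The caching pattern that loop implements can be seen in isolation with plain Python. The names `cardinality` and `variable_evid` below are simplified stand-ins for pgmpy's internals, and the stored value replaces the real `variable_cpd.reduce(...)` call:

```python
import itertools

# Stand-ins for pgmpy internals: two evidence variables with cardinalities 2 and 3.
cardinality = {'A': 2, 'B': 3}
variable_evid = ['A', 'B']

cached_values = {}
# Enumerate every combination of evidence states, e.g. (0, 2) -> A=0, B=2.
for state_combination in itertools.product(
    *[range(cardinality[var]) for var in variable_evid]
):
    states = list(zip(variable_evid, state_combination))
    # The real code caches the reduced CPD values here; we cache the key pairs.
    cached_values[state_combination] = states

print(sorted(cached_values))  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```

This shows why the cache keys are tuples of state *indices*: lookups elsewhere must use `(var_state_no, ...)` tuples, not state names, or they raise a `KeyError`.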

Hey folks!

I'm using pgmpy in a project, and needed to fit a `LinearGaussianBayesianNetwork` to a dataset. Since the `.fit()` method isn't yet implemented, I wrote my own using sklearn's LinearRegression. Would the maintainers be interested in a PR with this implementation? Note that it would introduce a dependency on sklearn. Or would you prefer an implementation with scipy/statsmodels?

@sandeep-n Hey, that would be great to have in pgmpy. But ideally I wanted the implementation to use statsmodels, mainly because of its implementations of different fit metrics; with sklearn we would have to write our own methods to compute these. Do you think it would be possible for you to use statsmodels instead of sklearn? Otherwise, if you open a PR with your current implementation, I can work on it to use statsmodels.
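Independent of the sklearn-vs-statsmodels choice, the core computation when fitting a linear Gaussian CPD is an ordinary least-squares regression of a node on its parents. A bare-bones numpy sketch (variable names here are illustrative, not pgmpy's API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: child y depends linearly on parent x plus Gaussian noise.
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

# Design matrix with an intercept column; solve OLS via least squares.
X = np.column_stack([np.ones(n), x])
beta, residuals, _, _ = np.linalg.lstsq(X, y, rcond=None)
sigma2 = residuals[0] / n  # ML estimate of the noise variance

print(beta)    # approximately [1.0, 2.0] (intercept, slope)
print(sigma2)  # approximately 0.25 (= 0.5**2)
```

The intercept, slope, and noise variance are exactly the parameters a linear Gaussian CPD needs, which is why either library (or plain numpy) can serve as the backend; statsmodels additionally reports fit metrics such as AIC/BIC out of the box.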

Hi! I am new to pgmpy. I have created a model, generated the CPDs, and made a prediction, which is what I needed, and this was cool. However, there are two things I am struggling with: 1. `predict_probability` doesn't work; it gives me an index error (IndexError: index 11 is out of bounds for axis 0 with size 9) and I can't figure out why. Standard `predict` works fine. 2. I couldn't find any reference for using a latent variable. How can I include a latent variable, and how do I establish its CPD? Suppose my latent variable is "C", which is the outcome of "A" and "B", and the outcome of "C" is "D". How do I combine everything? Thanks

Hi, I want to import data from the standard datasets in the R library bnlearn, http://www.bnlearn.com/bnrepository/. Has anyone done this before?

Hello,

Is there any way to update an already defined CPD for a given model with new data points that have missing values? Here's a code example (so you can see more clearly what I'd like to do):

```
import numpy as np
import pandas as pd
from pgmpy.models import BayesianModel
from pgmpy.estimators import MaximumLikelihoodEstimator, BayesianEstimator
from pgmpy.factors.discrete import TabularCPD

# define links between nodes
model = BayesianModel([('A', 'B'), ('C', 'B')])

# define some initial CPDs
cpd_a = TabularCPD(variable='A', variable_card=2,
                   values=[[0.9], [0.1]])
cpd_c = TabularCPD(variable='C', variable_card=2,
                   values=[[0.3], [0.7]])
cpd_b = TabularCPD(variable='B', variable_card=3,
                   values=[[0.2, 0.1, 0.25, 0.5],
                           [0.3, 0.5, 0.25, 0.3],
                           [0.5, 0.4, 0.5, 0.2]],
                   evidence=['A', 'C'],
                   evidence_card=[2, 2])

# Associating the parameters with the model structure.
model.add_cpds(cpd_a, cpd_b, cpd_c)
# Checking if the cpds are valid for the model.
model.check_model()

# generate some data, with C as missing value
raw_a = np.random.randint(low=0, high=2, size=100)
raw_b = np.random.randint(low=0, high=3, size=100)
raw_c = np.empty(100)
raw_c[:] = np.NaN
data = pd.DataFrame({"A": raw_a, "B": raw_b, "C": raw_c})

# define pseudo counts according to initial cpds and variable cardinality
pseudo_counts = {'A': [[300], [700]],
                 'B': [[500, 100, 100, 300],
                       [100, 500, 300, 400],
                       [400, 500, 100, 200]],
                 'C': [[200], [100]]}

# fit model with new data
model.fit(data, complete_samples_only=False, estimator=BayesianEstimator,
          prior_type='dirichlet', pseudo_counts=pseudo_counts)

# print updated cpds
for cpd in model.get_cpds():
    print("CPD of {variable}:".format(variable=cpd.variable))
    print(cpd)
```

When I try this, I get an error message saying:

"ValueError: The shape of pseudo_counts must be: (3, 0)"
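For reference, what a Dirichlet update computes: the posterior CPD is the column-wise normalized sum of the observed counts and the pseudo counts, so the pseudo-count array for each variable must have shape (variable_card, number_of_parent_state_combinations). The `0` in the error suggests pgmpy inferred zero parent-state columns for `B`, possibly because column `C` is entirely NaN. A standalone numpy sketch of the update itself (not pgmpy code; the observed counts are made up for illustration):

```python
import numpy as np

# Variable B has 3 states and parents A, C with 2 states each
# -> 4 parent-state combinations, so pseudo counts must be shaped (3, 4).
pseudo_counts = np.array([[500, 100, 100, 300],
                          [100, 500, 300, 400],
                          [400, 500, 100, 200]])

# Hypothetical observed counts of B's states per parent combination.
data_counts = np.array([[10, 0, 5, 2],
                        [ 3, 7, 1, 0],
                        [ 2, 2, 9, 8]])

# Dirichlet update: add counts, then normalize each parent-combination column.
posterior = pseudo_counts + data_counts
cpd = posterior / posterior.sum(axis=0, keepdims=True)

print(cpd.shape)        # (3, 4)
print(cpd.sum(axis=0))  # each column sums to 1
```

Checking that each pseudo-count array matches this (variable_card, parent combinations) shape before calling `fit` is a quick way to localize the ValueError.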