FAIL: test_sampling (pgmpy.tests.test_sampling.test_continuous_sampling.TestNUTSInference)
Traceback (most recent call last):
File "/Users/jonvaljohn/Code/pgmpy/pgmpy/pgmpy/tests/test_sampling/test_continuous_sampling.py", line 208, in test_sampling
np.linalg.norm(sample_covariance - self.test_model.covariance) < 0.4
AssertionError: False is not true
Ran 676 tests in 146.484s
FAILED (SKIP=7, failures=1)
from pgmpy.factors.discrete import TabularCPD
from pgmpy.models import DynamicBayesianNetwork as DBN
from pgmpy.inference import DBNInference
dbn = DBN()
#dbn.add_edges_from([(('M',0), ('S', 0)), (('S',0), ('S', 1)), (('S', 0), ('J', 0)) ])
dbn.add_edges_from([(('M',1), ('S', 1)), (('S', 1), ('J', 1)), (('S',0), ('S', 1)) ])
#S_cpd = TabularCPD(('S', 0), 2, [[0.5, 0.5]])  # Prior: the state should go 50/50
M_cpd = TabularCPD(('M', 1), 2, [[0.5, 0.5]])  # Prior: the model should go 50/50
M_S_S_cpd = TabularCPD(variable=('S', 1), variable_card=2,
                       values=[[0.95, 0.3, 0.1, 0.05],
                               [0.05, 0.7, 0.9, 0.95]],
                       evidence=[('M', 1), ('S', 0)],
                       evidence_card=[2, 2])
S_J_cpd = TabularCPD(('J', 1), 2, [[0.9, 0.1],
                                   [0.1, 0.9]],
                     evidence=[('S', 1)],
                     evidence_card=[2])
dbn.add_cpds(M_cpd, M_S_S_cpd, S_J_cpd)
dbn_inf = DBNInference(dbn)
dbn_inf.forward_inference([('J', 2)], { ('M', 1):0, ('M', 2):0, ('S', 0): 0})
It is still failing. If the variable that carries over from one time slice to the next is a parent within the individual time slice, the code works; otherwise it fails.
Now this gives a KeyError:
for state_combination in itertools.product(
    *[range(self.cardinality[var]) for var in variable_evid]
):
    states = list(zip(variable_evid, state_combination))
    cached_values[state_combination] = variable_cpd.reduce(
        states, inplace=False
    ).values
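For anyone reading along, here is a standalone sketch of the enumeration pattern in that loop, showing one way it can raise a KeyError: a variable listed as evidence has no entry in the cardinality mapping. The names `cardinality`, `variable_evid` and `enumerate_parent_states` are hypothetical stand-ins, not pgmpy internals, and this is only a guess at the cause, not a confirmed diagnosis.

```python
import itertools

# Hypothetical stand-ins for self.cardinality and variable_evid in the loop above.
cardinality = {("M", 1): 2, ("S", 1): 2}
variable_evid = [("M", 1), ("S", 0)]  # ("S", 0) has no cardinality entry


def enumerate_parent_states(cardinality, variable_evid):
    """List every (variable, state) assignment over the evidence variables."""
    ranges = [range(cardinality[var]) for var in variable_evid]
    return [
        list(zip(variable_evid, combo))
        for combo in itertools.product(*ranges)
    ]


try:
    enumerate_parent_states(cardinality, variable_evid)
    missing = None
except KeyError as err:
    # The lookup cardinality[var] fails for the variable without an entry.
    missing = err.args[0]

print(missing)
```

If every evidence variable has a cardinality entry, the same function simply returns one assignment list per state combination.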
LinearGaussianBayesianNetwork to a dataset. Since the .fit() method isn't yet implemented, I wrote my own using sklearn's LinearRegression.
Hello,
Is there any way to update the already-defined CPDs of a given model using new data points that contain missing values? Here's a code example to show more clearly what I'd like to do:
import numpy as np
import pandas as pd
from pgmpy.models import BayesianModel
from pgmpy.estimators import MaximumLikelihoodEstimator, BayesianEstimator
from pgmpy.factors.discrete import TabularCPD
# define links between nodes
model = BayesianModel([('A', 'B'), ('C', 'B')])
# define some initial CPDs
cpd_a = TabularCPD(variable='A', variable_card=2,
values=[[0.9], [0.1]])
cpd_c = TabularCPD(variable='C', variable_card=2,
values=[[0.3], [0.7]])
cpd_b = TabularCPD(variable='B', variable_card=3,
values=[[0.2, 0.1, 0.25, 0.5],
[0.3, 0.5, 0.25, 0.3],
[0.5, 0.4, 0.5, 0.2]],
evidence=['A', 'C'],
evidence_card=[2, 2])
# Associating the parameters with the model structure.
model.add_cpds(cpd_a, cpd_b, cpd_c)
# Checking if the cpds are valid for the model.
model.check_model()
# generate some data, with C entirely missing
raw_a = np.random.randint(low=0, high=2, size=100)
raw_b = np.random.randint(low=0, high=3, size=100)
raw_c = np.empty(100)
raw_c[:] = np.nan
data = pd.DataFrame({"A" : raw_a, "B" : raw_b, "C" : raw_c})
# define pseudo counts according to initial cpds and variable cardinality
pseudo_counts = {'A': [[300], [700]], 'B': [[500,100,100,300], [100,500,300,400], [400,500,100,200]], 'C': [[200], [100]]}
# fit model with new data
model.fit(data, complete_samples_only=False, estimator=BayesianEstimator, prior_type='dirichlet', pseudo_counts=pseudo_counts)
# print updated cpds
for cpd in model.get_cpds():
    print("CPD of {variable}:".format(variable=cpd.variable))
    print(cpd)
When I try this, I get an error message saying:
"ValueError: The shape of pseudo_counts must be: (3, 0)"
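One possible reading of the "(3, 0)" in that message, assuming the estimator infers each variable's states from the data passed to fit(): a column that is entirely NaN contributes zero observed states, so the number of parent configurations for B becomes 2 * 0 = 0. The sketch below reproduces only that arithmetic in plain Python. The helper names `observed_states` and `expected_pseudo_count_shape` are hypothetical and not part of pgmpy.

```python
import math
from math import prod


def observed_states(column):
    """Distinct non-missing values in a column (NaN counts as missing)."""
    return sorted(
        {v for v in column if not (isinstance(v, float) and math.isnan(v))}
    )


def expected_pseudo_count_shape(data, node, parents):
    """Shape (node states, parent configurations), with states taken from
    the data. Inferring states from the data is the assumption made here."""
    node_card = len(observed_states(data[node]))
    parent_configs = prod(len(observed_states(data[p])) for p in parents)
    return (node_card, parent_configs)


# Small stand-in for the DataFrame above: C contains only NaN.
data = {
    "A": [0, 1, 0, 1, 1, 0],
    "B": [0, 1, 2, 2, 1, 0],
    "C": [math.nan] * 6,
}

shape_b = expected_pseudo_count_shape(data, "B", ["A", "C"])
print(shape_b)  # (3, 0)
```

Under that assumption, the (3, 4) pseudo counts supplied for B cannot match, because no state of C is ever observed in the data.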