Ankur Ankan
@ankurankan
@matsonair_twitter Inference for NoisyOR models is not implemented yet.
Ankur Ankan
@ankurankan
@StefanRiess It depends on what kind of final structure you are looking for. For general structure learning, I don't think there is any way to specify a root node with the PC or HillClimb algorithms. With HillClimb, however, you can specify a whitelist and a blacklist of edges, which lets you force the edges to always point outwards from a given root node. If you are looking to learn a tree structure, you can easily specify a root node: http://pgmpy.org/estimators.html#tree-search
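The blacklist idea above can be sketched in plain Python: to keep a chosen root free of parents, forbid every edge that points into it. The variable names are made up for illustration, and passing the result as the `black_list` argument of `HillClimbSearch.estimate` is an assumption about the installed pgmpy version, so check the signature before relying on it.

```python
def edges_into(root, variables):
    """Return every directed edge (v, root) that would give `root` a parent."""
    return [(v, root) for v in variables if v != root]


# Hypothetical column names, not taken from any real dataset.
variables = ["season", "rain", "sprinkler", "wet_grass"]
black_list = edges_into("season", variables)
print(black_list)
```

With these edges forbidden, any edge that touches `season` in the learned graph has to leave it rather than enter it.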
Ankur Ankan
@ankurankan
@pedrodgn You might be able to use the BayesianEstimator and specify a prior for the CPDs. This would initialize the CPDs to the priors and then use the data that you provide to update these probabilities.
KittyNatty
@KittyNatty
@ankurankan Hi, I would like to ask for help with Bayes' theorem; sorry that it may not relate directly to pgmpy, but I hope to implement the network with pgmpy later. Is it possible to calculate the posterior probability over an entire factor (random variable) instead of a single event? For example, there are 2 related factors, A and B, which are continuous. How can I calculate P(A|B)? I found a way using the normal distribution, computed from μ and σ^2. But the result is the probability of μ and σ^2 given the data, P(μ, σ^2|data), which is the posterior for only one variable. Please give me any suggestions, thank you in advance.
Ankur Ankan
@ankurankan
@KittyNatty I am sorry but I don't understand the question. Are you looking to compute the posterior of both the variables? In that case, to compute the joint posterior P(A, B | data), you can simply start with priors for 4 parameters ($\mu_A$, $\mu_B$, $\sigma_A$, $\sigma_B$) instead of two, and then compute the posterior for all these parameters. You should also be able to encode your relational assumptions in the likelihood function. If you are just looking for the conditional distribution P(A|B), you will get a single $\mu$ and $\sigma$, but they will be functions of B (if you assume some kind of relation between them). Have a look at this: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Conditional_distributions
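As a small illustration of the conditional-distribution case, here is a sketch assuming A and B are jointly normal with correlation `rho` (a parameter the discussion above does not name). It uses the standard bivariate result: the conditional mean shifts linearly with the observed value of B, while the conditional spread does not depend on it.

```python
import math


def conditional_normal(mu_a, mu_b, sigma_a, sigma_b, rho, b):
    """Mean and standard deviation of A given B = b for a bivariate normal."""
    mean = mu_a + rho * (sigma_a / sigma_b) * (b - mu_b)
    std = sigma_a * math.sqrt(1.0 - rho ** 2)
    return mean, std


# Illustrative parameter values only.
mean, std = conditional_normal(
    mu_a=1.0, mu_b=2.0, sigma_a=2.0, sigma_b=4.0, rho=0.5, b=6.0
)
print(mean, std)
```

Setting `rho=0` returns the marginal mean and standard deviation of A, which is the independent case.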
arghyakusumdas6163
@arghyakusumdas6163
Hi all, I am new to pgmpy. I have installed it using conda install -c ankurankan pgmpy. I was wondering if there is any example code that I can use for Bayesian structure learning? Any help will be appreciated.
I am using hill climbing with BIC scoring
Ankur Ankan
@ankurankan
arghyakusumdas6163
@arghyakusumdas6163
Great.... Thanks for the prompt response
Let me look at it
arghyakusumdas6163
@arghyakusumdas6163
Ended up with an error here: from pgmpy.utils import get_example_model
ImportError: cannot import name 'get_example_model'
Ankur Ankan
@ankurankan
Umm, could you install the dev branch from GitHub? The conda package is from the last release, which doesn't have the get_example_model function.
arghyakusumdas6163
@arghyakusumdas6163
Let me check
arghyakusumdas6163
@arghyakusumdas6163
It is not installing... I am in a POWER environment. Possibly that is why it is failing
I am not sure
Is there any way I can create a model quickly?
Ankur Ankan
@ankurankan
The get_example_model function fetches the model from https://www.bnlearn.com/bnrepository/ and uses readwrite.BIFReader to create the model. So, you can try to do it manually. Or you can also look at the example notebooks for manually creating models
arghyakusumdas6163
@arghyakusumdas6163
Is it going to be this type of format? G = BayesianModel([('diff', 'grade'), ('intel', 'grade'),
Sorry for asking so many questions. I am actually in the process of benchmarking many Bayesian structure learning tools
So, I just needed to know the most efficient way to do it here
Ankur Ankan
@ankurankan
I don't think you will need to define a model if you are just looking to do structure learning. You can directly give the data to HillClimb. If you are looking to simulate data as well, you will also need to add a parameterization to the model structure. Here's a complete example: https://github.com/pgmpy/pgmpy/blob/dev/examples/Creating%20a%20Discrete%20Bayesian%20Network.ipynb
arghyakusumdas6163
@arghyakusumdas6163
I do not want to simulate the data. I am working with a dataset with 29 nodes in it
Assuming everything has a binary outcome
arghyakusumdas6163
@arghyakusumdas6163
By the way, Do you support GPU for Bayesian structure learning?
Ankur Ankan
@ankurankan
No, I don't think structure learning can benefit much from GPUs. It's computationally expensive mostly because of the sequential nature of the algorithms.
arghyakusumdas6163
@arghyakusumdas6163
I understand.....
makes sense.....
Jeremy Zucker
@djinnome
Quick question: does pgmpy support chain graphs? Is there an algorithm to perform c-separation on them?
Ankur Ankan
@ankurankan
@djinnome No, chain graphs aren't supported.
arghyakusumdas6163
@arghyakusumdas6163
K2Score and BDeuScore are taking a very long time with hill climbing... Any idea what is going wrong?
Ankur Ankan
@ankurankan
@arghyakusumdas6163 You could try increasing epsilon, or setting a reasonable max_indegree. You can also provide a starting structure if you already know some relations between variables.
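A rough way to see why capping the in-degree helps: for discrete variables, the conditional table of a node has one column per joint parent configuration, so its size grows exponentially with the number of parents. The sketch assumes every variable is binary, as stated for the 29-node dataset earlier in this thread. It is a back-of-the-envelope count, not a description of pgmpy's internals.

```python
def cpd_free_parameters(n_parents, cardinality=2):
    """Free parameters in a table CPD when every variable has `cardinality` states."""
    parent_configs = cardinality ** n_parents
    return parent_configs * (cardinality - 1)


# 28 is the largest possible parent count for one node in a 29-node network.
for k in (1, 2, 5, 10, 28):
    print(k, cpd_free_parameters(k))
```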
arghyakusumdas6163
@arghyakusumdas6163
Thanks Ankur. The algorithm finished correctly without any change in epsilon.
But max_indegree needed to be set to 1
I have another question: is there any function that I can pass a graph structure to and get its overall score?
arghyakusumdas6163
@arghyakusumdas6163
Let us say I want the BIC score of the following model,
model = BayesianModel([('A', 'C'), ('B', 'C')])
data = pd.DataFrame(data={'A': ['a1', 'a1', 'a2'],
'B': ['b1', 'b2', 'b1'],
'C': ['c1', 'c1', 'c2']})
Copied the code from your example
Ankur Ankan
@ankurankan
@arghyakusumdas6163 Yes, you can use the BicScore.score method. Have a look at this example: https://github.com/pgmpy/pgmpy/blob/dev/pgmpy/estimators/StructureScore.py#L59
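To make the score concrete, here is a hand-rolled sketch of the textbook BIC for the three-row dataset and the A → C ← B structure quoted above. It uses only the standard library and natural logarithms. It is not pgmpy's implementation, and details such as how unobserved parent configurations are counted may differ from what `BicScore.score` returns. The helper names are hypothetical.

```python
import math
from collections import Counter


def local_bic(rows, node, parents, states):
    """BIC contribution of `node` given `parents` for a list of dict rows."""
    joint = Counter((tuple(r[p] for p in parents), r[node]) for r in rows)
    parent_counts = Counter(tuple(r[p] for p in parents) for r in rows)
    # Maximum-likelihood log-likelihood of the node's conditional table.
    log_lik = sum(n * math.log(n / parent_counts[cfg]) for (cfg, _), n in joint.items())
    # Penalty: one free parameter set per parent configuration.
    n_parent_configs = math.prod(len(states[p]) for p in parents)
    n_params = n_parent_configs * (len(states[node]) - 1)
    return log_lik - 0.5 * math.log(len(rows)) * n_params


def bic(rows, structure, states):
    """Sum of local scores; `structure` maps each node to its parent list."""
    return sum(local_bic(rows, node, parents, states) for node, parents in structure.items())


rows = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a1", "B": "b2", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c2"},
]
states = {"A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1", "c2"]}
v_structure = {"A": [], "B": [], "C": ["A", "B"]}
print(bic(rows, v_structure, states))
```

Because the score decomposes into one term per node, two candidate structures can be compared by rescoring only the nodes whose parent sets differ.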
arghyakusumdas6163
@arghyakusumdas6163
Thanks
Tomislav Kovačević
@tomkovna
Is there any (I guess constraint-based) structure learning algorithm that allows you to specify a minimum/maximum number of parents?
Ankur Ankan
@ankurankan
@tomkovna You can specify the maximum number of allowed parents using the max_indegree parameter of HillClimbSearch. For PC, you can control the sparsity of the model by specifying max_cond_vars; a higher value should give a sparser structure.
felixleopoldo
@felixleopoldo
Hi, I wonder if it is possible to estimate the CPDs using, e.g., a Dirichlet prior only? It seems to me that it is only possible to get estimates from the posterior and not from the prior alone.
cpd_a = estimator.estimate_cpd('A', prior_type="dirichlet", pseudo_counts=[[1], [3]])
This only seems to work if I have actually observed two values for A in the data. Could someone please help me work this out?
Ankur Ankan
@ankurankan
@felixleopoldo If I am understanding this correctly, you have a dataset which doesn't have both the states for A in it? In that case, you can pass a state_names argument while initializing the BayesianEstimator and it will fall back to computing CPD sizes using that. The state_names should be a dict of type {var1: [state1, state2, ..], var2: [...] }. Here's an example:
In [21]: df = pd.DataFrame(np.ones((5, 2), dtype='int'), columns=['A', 'B'])

In [22]: model = BayesianModel([('A', 'B')])

In [23]: est = BayesianEstimator(model, data=df, state_names={'A': [0, 1]})

In [24]: est.estimate_cpd('A', prior_type='dirichlet', pseudo_counts=[[1], [3]])
Out[24]: <TabularCPD representing P(A:2) at 0x7f51955d92b0>

In [25]: print(est)
<pgmpy.estimators.BayesianEstimator.BayesianEstimator object at 0x7f5194f34b50>

In [26]: print(__)
+------+----------+
| A(0) | 0.111111 |
+------+----------+
| A(1) | 0.888889 |
+------+----------+
felixleopoldo
@felixleopoldo
wohoo
thanks! thats what I needed
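The 0.111111 and 0.888889 in the transcript above follow from simple arithmetic: with a Dirichlet prior, the estimate for each state is its observed count plus its pseudo count, divided by the total of both. The sketch below reproduces that by hand with the standard library; the function name is made up for illustration and is not part of pgmpy.

```python
from fractions import Fraction


def posterior_mean(counts, pseudo_counts):
    """Posterior mean of a categorical distribution under a Dirichlet prior."""
    total = sum(counts) + sum(pseudo_counts)
    return [Fraction(c + a, total) for c, a in zip(counts, pseudo_counts)]


# Five rows, all with A = 1, so state 0 is never observed.
observed = [0, 5]
prior = [1, 3]
probs = posterior_mean(observed, prior)
print([float(p) for p in probs])
```

With no observations at all, `posterior_mean([0, 0], [1, 3])` reduces to the normalized pseudo counts, which is the prior-only estimate.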