Rashmeet Nayyar
@Rashmeet09
Or is there a way to specify evidence when using the ContinuousFactor (for creating a continuous random variable node)?
James Matthew Miraflor
@miraflor
Hello pgmpy community! I'm looking forward to contributing modules soon.
Chester
@llsjdkn
Hi everyone, there are some bugs in models/DynamicBayesianNetwork.py's get_cpds(). Have they been fixed? Thanks for any help.
karnatapu
@karnatapu
hello
I see a lot of fixes made to the original code. Does anyone know how to pull the latest code with the bug fixes? git clone gives me a version without the fixes. Any clue?
Ankur Ankan
@ankurankan
@karnatapu You can use pip: pip install pgmpy --upgrade. It should update to the latest version.
kbe206
@kbe206
df = pd.DataFrame(data)
est = HillClimbSearch(df, scoring_method=BicScore(df))
best_model = est.estimate()
edges = best_model.edges()
TypeError: unsupported operand type(s) for +: 'OutEdgeView' and 'list'
Has anyone else run into this problem?
Ankur Ankan
@ankurankan
@kbe206 Try downgrading networkx to 1.11
karnatapu
@karnatapu
@ankurankan Thank you for the response. I did the pip update, but when I checked the Python files afterwards they don't show the fixes; for example, Bic, Base, and K2score don't have the LRU implementation. By the way, when I updated I got version 0.1.7 as the latest (Successfully installed pgmpy-0.1.7). Is this the version with all the updates?
Ankur Ankan
@ankurankan
@karnatapu Yes 0.1.7 is the latest. I am not sure about the cache implementation, it should have been there but doesn't seem to be. I will check and get back to you.
Ankur Ankan
@ankurankan
@karnatapu Thanks for pointing out the PR. I had totally forgotten about it. I will check if that can be merged.
Mark McKenzie
@mrkmcknz
Evening everyone. I was wondering what the most logical method is to get a list of value labels for a node. Kind of like what you see in the get_cpds() output.
Mark McKenzie
@mrkmcknz
I was half expecting it to be .variables
Ankur Ankan
@ankurankan
@mrkmcknz I am not sure what you exactly mean by value labels. Could you please elaborate?
Mark McKenzie
@mrkmcknz
+-------------------------------------+-----------+
| project_type(Fast Track Onboarding) | 0.0299222 |
+-------------------------------------+-----------+
| project_type(Innovation)            | 0.0113704 |
+-------------------------------------+-----------+
| project_type(governance)            | 0.0388989 |
+-------------------------------------+-----------+
| project_type(innovation)            | 0.032316  |
+-------------------------------------+-----------+
| project_type(other)                 | 0.831837  |
+-------------------------------------+-----------+
| project_type(performance)           | 0.0359066 |
+-------------------------------------+-----------+
| project_type(productivity)          | 0.0197487 |
+-------------------------------------+-----------+
In this example I want to get Fast Track Onboarding, Innovation...
Ankur Ankan
@ankurankan
@mrkmcknz You can use the .state_names attribute. It should return a dict of state names of all the variables.
Mark McKenzie
@mrkmcknz
Thanks @ankurankan, I managed to find it last night. I'm working on pgmpy/pgmpy#913 and might create a PR for my implementation at some point once I clean the code up.
Mark McKenzie
@mrkmcknz
I see a lot of comments in various places about continuous/discrete hybrid models being in the pipeline. I was wondering what progress has been made towards this.
Ankur Ankan
@ankurankan
@mrkmcknz I am currently working on a Data class which will implement different conditional independence testing algorithms (for both continuous and hybrid). And with minor changes in the current structure learning algorithms, they should be able to learn the structure from continuous and hybrid datasets. But I don't think I will have the bandwidth to work on parameter learning or inference on continuous models soon.
5991dream
@5991dream
Hello, I want to find introductory material on PGMs. Do you have any recommendations?
5991dream
@5991dream
I am a newbie
pengjunli
@pengjunli
Hi guys, I am a beginner. I want to use the DBN in pgmpy, but I have not found documentation on parameter learning, structure learning, or inference for DBNs. Is there any demo or documentation about DBNs? Thanks for the help.
Clemens Harten
@clemensharten_twitter
Hey everyone, I am just getting started with pgmpy and have a question regarding the performance of inference via BeliefPropagation. I have set up a moderately sized Bayesian network (about 40 nodes, 80 edges) and want to get the state probabilities for a central node (without providing any evidence). With the recent dev branch, this operation takes about 4-5 minutes on my developer machine (i7, 8GB RAM ...). I would have thought it would be faster. Am I hitting any limits here (exponential runtime growth?), or should this indeed be faster and I am doing something wrong? Any help much appreciated!
Clemens Harten
@clemensharten_twitter
... so, just for the record: BeliefPropagation is implemented as an exact algorithm. For my use case, I need an approximation, and BayesianModelSampling gives me exactly what I need :).
Ankur Ankan
@ankurankan
@clemensharten_twitter Yes, it's slow because it finds the exact solution. VariableElimination should be faster for exact solutions.
Yujian Liu
@yujianll
Hi everyone, I wonder what's the correct way to build a Bayesian Network from an undirected graph (I have a list of undirected edges, and I just want to add directionality to those edges).
Yujian Liu
@yujianll
Does anyone know if there is a function that returns the separating sets for a given undirected graph, or how I can compute them manually?
jonvaljohn
@jonvaljohn
Hi, I just installed pgmpy on my Mac using the latest code from the dev branch. When I run "nosetests -v", one test fails. Is that expected?
Ankur Ankan
@ankurankan
@jonvaljohn Not really. Is it TestIVEstimator by any chance?
jonvaljohn
@jonvaljohn

FAIL: test_sampling (pgmpy.tests.test_sampling.test_continuous_sampling.TestNUTSInference)

Traceback (most recent call last):
  File "/Users/jonvaljohn/Code/pgmpy/pgmpy/pgmpy/tests/test_sampling/test_continuous_sampling.py", line 208, in test_sampling
    np.linalg.norm(sample_covariance - self.test_model.covariance) < 0.4
AssertionError: False is not true

Ran 676 tests in 146.484s

FAILED (SKIP=7, failures=1)

jonvaljohn
@jonvaljohn
@ankurankan, this is the error I am getting.
Ankur Ankan
@ankurankan
@jonvaljohn Hmm, it's working fine on my machine. Could you tell me your python version and dependency packages' versions?
jonvaljohn
@jonvaljohn
Python 3.7
How do I find the dependency packages' versions?
Thanks for all your help.
Ankur Ankan
@ankurankan
@jonvaljohn You can run: pip freeze | grep -E '(numpy|scipy|pandas|networkx)=='
jonvaljohn
@jonvaljohn
networkx==2.2
numpy==1.16.2
pandas==0.24.2
scipy==1.2.1
jonvaljohn
@jonvaljohn
Has anyone used dynamic Bayesian networks in pgmpy? The provided code implements a 2-TBN, but I don't see a way to iterate over time. What is the best practice for providing new values for the CPDs of the "0" (first time slice) parameters? I assume that the "1" parameters should be updated automatically.
Ankur Ankan
@ankurankan
@jonvaljohn You can't use Variable Elimination for DBN with the current implementation. Try using pgmpy.inference.DBNInference.
jonvaljohn
@jonvaljohn
Thanks so much. Let me try... if I can get all this to work, I could contribute a DBN usage notebook so others can leverage this.
jonvaljohn
@jonvaljohn
from pgmpy.factors.discrete import TabularCPD
from pgmpy.models import DynamicBayesianNetwork as DBN
from pgmpy.inference import DBNInference

dbn = DBN()
# dbn.add_edges_from([(('M', 0), ('S', 0)), (('S', 0), ('S', 1)), (('S', 0), ('J', 0))])
dbn.add_edges_from([(('M', 1), ('S', 1)), (('S', 1), ('J', 1)), (('S', 0), ('S', 1))])

# S_cpd = TabularCPD(('S', 0), 2, [[0.5, 0.5]])  # Prior: the state should go 50/50
M_cpd = TabularCPD(('M', 1), 2, [[0.5, 0.5]])  # Prior: the model should go 50/50

M_S_S_cpd = TabularCPD(variable=('S', 1), variable_card=2,
                       values=[[0.95, 0.3, 0.1, 0.05],
                               [0.05, 0.7, 0.9, 0.95]],
                       evidence=[('M', 1), ('S', 0)],
                       evidence_card=[2, 2])

S_J_cpd = TabularCPD(('J', 1), 2, [[0.9, 0.1],
                                   [0.1, 0.9]],
                     evidence=[('S', 1)],
                     evidence_card=[2])

dbn.add_cpds(M_cpd, M_S_S_cpd, S_J_cpd)
dbn_inf = DBNInference(dbn)
dbn_inf.forward_inference([('J', 2)], {('M', 1): 0, ('M', 2): 0, ('S', 0): 0})

It is still failing... If the variable that carries state from one slice to the next is also the parent within the individual time slice, the code works; otherwise it fails. Now it gives a KeyError.

Ankur Ankan
@ankurankan
@jonvaljohn Sorry for the late reply. Yes, the current code makes the assumption that the model structure remains the same in each time slice. Are you trying to have a different model structure?
Yujian Liu
@yujianll
Hi, it looks like the reduce() function in TabularCPD and DiscreteFactor takes (var_name, var_state_name) as input. I wonder if there is a way to pass (var_name, var_state_no) instead, because this for loop in pre_compute_reduce() in Sampling.py seems to use (var_name, var_state_no):
for state_combination in itertools.product(
    *[range(self.cardinality[var]) for var in variable_evid]
):
    states = list(zip(variable_evid, state_combination))
    cached_values[state_combination] = variable_cpd.reduce(
        states, inplace=False
    ).values
Sandeep Narayanaswami
@sandeep-n
Hey folks!
I'm using pgmpy in a project, and needed to fit a LinearGaussianBayesianNetwork to a dataset. Since the .fit() method isn't yet implemented, I wrote my own using sklearn's LinearRegression.
Would the maintainers be interested in a PR with this implementation? Note that it would introduce a dependency on sklearn. Or would you prefer an implementation using scipy/statsmodels?
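For context, the per-node fit is ordinary least squares (which is all LinearRegression computes in the unregularized case), so a dependency-free sketch is possible; the function name and data layout below are invented for illustration and are not pgmpy API:

```python
import numpy as np

def fit_linear_gaussian_cpd(data, node, parents):
    """Fit node = beta0 + sum_i beta_i * parent_i + N(0, sigma^2) by ordinary
    least squares. `data` maps column names to 1-D arrays; returns (beta, sigma)
    with beta[0] the intercept. (Hypothetical helper, not pgmpy API.)"""
    y = np.asarray(data[node], dtype=float)
    # Design matrix: an intercept column followed by one column per parent.
    X = np.column_stack(
        [np.ones_like(y)] + [np.asarray(data[p], dtype=float) for p in parents]
    )
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    sigma = float(np.sqrt(np.mean(residuals ** 2)))  # MLE of the noise std
    return beta, sigma

# Noise-free toy check: B = 1 + 2*A, so beta ~ [1, 2] and sigma ~ 0.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
beta, sigma = fit_linear_gaussian_cpd({"A": a, "B": 1 + 2 * a}, "B", ["A"])
print(beta, sigma)
```

Repeating this for every node against its parents gives the full set of linear Gaussian CPDs; sklearn or statsmodels would mainly add conveniences such as regularization and diagnostics on top.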