Hi Lars, thanks for the kind words :-) I am trying to reproduce your bug.
For now, the code
    import pyAgrum as gum
    gum.about()

    bn=gum.fastBN("A->B<-C->D->A")
    bn.saveBIF("one.bif")
    bn2=gum.loadBN("one.bif")
    bn2.saveDSL("two.dsl")
    bn3=gum.loadBN("two.dsl")
    bn3.saveNET("three.net")
    bn4=gum.loadBN("three.net")

    if bn==bn2: print("ok2")
    if bn==bn3: print("ok3")
    if bn==bn4: print("ok4")

    bn5=gum.fastBN("A->B->C->D")
    if bn==bn5: print("not ok5")
    else: print("ok5")
works on Linux and on Anaconda macOS 64 ... I am updating Anaconda on a Windows computer before trying it there.
This is a move we had planned a long time ago. The LGPL does not change anything for current users. We are doing it now because several contacts consider the LGPL an advantage when choosing aGrUM/pyAgrum (more precisely: they see the GPL as a drawback).
Hi Francisco, thanks for the kind words! We do not deal with continuous variables in aGrUM for now... mainly because we do not feel very confident in the existing models (such as CLG, etc.) for continuous graphical models. We are working on different extensions and collaborations with other libraries in order to provide a good way to deal with continuous variables in graphical models.
For now, you can of course learn quasi-continuous BNs from continuous data (see above).
First, before analyzing and using your BN, I suggest that you either wait for the new tag "0.15.3" (hopefully during this week) or (if you use pip to download pyAgrum) install pyAgrum-nightly: I found a nasty bug in the parameter-learning algorithm that occurs when at least one configuration of the parents of a node is not found in the database. It is fixed in the master of our GitLab site but not yet deployed, so the results you have for now may be erroneous (in terms of learned parameters). If you prefer, a quick workaround is to add a small prior (Laplace adjustment) with:

learner.useAprioriSmoothing(1e-5)

which guarantees that no parent configuration has a zero count.
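To see why an unseen parent configuration is a problem, here is a toy sketch in plain Python (not pyAgrum's internals; the dataset and function names are made up for illustration) of estimating one CPT column with and without a Laplace prior:

```python
from collections import Counter

# Toy dataset of (parent_value, child_value) pairs.
# Note that the configuration parent=2 never occurs.
data = [(0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1)]
child_states = 2

counts = Counter(data)

def cpt_column(parent, prior=0.0):
    """P(child | parent) estimated from counts, with an optional Laplace prior."""
    raw = [counts[(parent, c)] + prior for c in range(child_states)]
    total = sum(raw)
    if total == 0:
        # This is the unseen-configuration case: no data and no prior.
        raise ZeroDivisionError(f"no data for parent={parent} and no prior")
    return [x / total for x in raw]

print(cpt_column(0))              # maximum likelihood: [1/3, 2/3]
print(cpt_column(2, prior=1e-5))  # well defined thanks to the prior: [0.5, 0.5]
```

With `prior=0.0`, `cpt_column(2)` has nothing to normalize, which is exactly the situation the bug (and the smoothing workaround) is about.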
Second, concerning your choice of discretization: this is exactly the reason why we do not do it ourselves :-) the correct discretization depends directly on the database and on the user.
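As an illustration of one common choice, here is a quantile-based discretization sketched with the Python standard library (this is just one option among many, not something pyAgrum imposes; names and data are illustrative):

```python
import statistics

def quantile_discretize(values, n_bins):
    """Map each continuous value to a bin index, using quantile cut points
    so that the bins are roughly equally populated."""
    cuts = statistics.quantiles(values, n=n_bins)  # n_bins - 1 cut points
    def to_bin(x):
        return sum(x > c for c in cuts)  # index of the bin containing x
    return [to_bin(x) for x in values]

values = [0.1, 0.4, 0.35, 0.8, 0.9, 0.05, 0.6, 0.2]
labels = quantile_discretize(values, n_bins=4)
print(labels)  # each of the 4 bins receives 2 of the 8 samples
```

Equal-width bins, domain-driven cut points, or supervised discretization would give different labels; which one is "correct" really depends on the data and the question asked.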
Concerning the impact of the granularity, I do not know of any theory on that point, but it is quite well known that independence tests (such as chi2) and scores are not very robust w.r.t. this granularity.
You may try the learning algorithm based on mutual information instead of scores or chi2 in aGrUM: MIIC (learner.useMIIC()), which should be a bit more robust.
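For intuition, the quantity this family of algorithms relies on is the empirical mutual information between variables. A toy computation in plain Python (a sketch only, not pyAgrum's or MIIC's actual implementation):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in nats, from paired samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))  # joint counts
    px = Counter(xs)            # marginal counts for X
    py = Counter(ys)            # marginal counts for Y
    mi = 0.0
    for (x, y), c in pxy.items():
        # c/n is the joint probability; c*n/(px*py) is its ratio
        # to the product of the marginals.
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi

xs = [0, 0, 1, 1] * 25
print(mutual_information(xs, xs))           # fully dependent: log 2
print(mutual_information(xs, [0, 1] * 50))  # independent: 0
```

With a fixed sample size, increasing the number of states per variable spreads the same data over many more cells of the joint counts, which is one way to see why any such estimate degrades with granularity.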
However, it is also important to note that a high granularity (many values for the discrete variable) will require much more data for the learning phase (both structure and parameters).
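A back-of-the-envelope computation makes this concrete (the function is mine, just for illustration): a node with k states whose parents each also have k states has a CPT with one distribution per parent configuration, i.e. k^p · (k−1) free parameters for p parents.

```python
def free_parameters(k, n_parents):
    """Free parameters in the CPT of a node with k states whose
    n_parents parents each also have k states: k^p parent
    configurations, each needing k-1 free probabilities."""
    return (k ** n_parents) * (k - 1)

for k in (2, 5, 10, 20):
    print(k, free_parameters(k, n_parents=2))
# 2 ->    4
# 5 ->  100
# 10 -> 900
# 20 -> 7600  parameters for a single CPT with only 2 parents
```

Each of those parent configurations also needs to be observed often enough in the database, which is precisely what a fine discretization makes hard.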
learner.useAprioriSmoothing(1e-5) with some results 'apparently better'. Thanks again!!
csl.causalImpact(). Every time I run it, my kernel dies (as shown in the pic below). I am running it on a Google Cloud VM with 10 cores and 50 GB of RAM, using Ubuntu 18.04 as the OS. The network is rather small (11 nodes), where each node has a Labelized variable with 20 different possible categories, and I can perform inference with no problem at all. Could it be a bug in the causal methods, or maybe something wrong with my machine configuration?
Hi Francisco, this is weird. But two things before the bug:
1- to apply causal calculus, you have to be certain that your BN is a causal network.
2- let's assume that this is the case and that you have a causal network.
pyAgrum.causal is a module to perform such causal calculus, particularly in the presence of latent variables (do-calculus). If your causal network does not involve any latent variable, you can compute the causal impact of $X$ just by building a BN that is a copy of your causal network in which you remove the arcs pointing to $X$ (so $X$ has no parent). In this BN, classical inference is sufficient to explore the causal impact of $X$.
Now, I do not think that your configuration has anything to do with this error. It is weird and should be the result of either a bug in pyAgrum.causal or something weird in your BN (probably in its CPTs). If you want me to investigate a bit more, maybe you can give me the exact causal impact you want to compute (using the node ids and not the names, which I cannot interpret) or an anonymised BIF version of your BN?
This example was built during an internship in association with a company, and I am not sure how much of it we can share. However, the idea of this example was not to learn a BN dedicated to troubleshooting, but rather to show how to use a BN and mutual information to design an algorithm for troubleshooting.
Do not hesitate to contact and question us if needed !
It may not be the perfect reference, but it is a simple entry point: https://arxiv.org/pdf/1301.3880.pdf (in particular, section 5 proposes a general structure of a BN for troubleshooting).