Julien Schueller
@jschueller
Why the change of license?
Pierre-Henri Wuillemin
@phwuill_gitlab

Hi Julien.

This is a move we planned a long time ago. LGPL does not change anything for current users. We are doing it now because several contacts consider LGPL an advantage when choosing aGrUM/pyAgrum (more precisely: GPL is seen as a drawback).

Francisco J. Camacho
@pachocamacho1990
Hi everyone! I've been looking for a solid library with pre-defined methods for building Bayesian networks; thank God I found you guys. I was wondering if you have something for performing parameter learning for continuous (quasi-continuous) variables? The examples I have found so far in the notebooks concern only categorical variables.
Pierre-Henri Wuillemin
@phwuill_gitlab
[image attached]

Hi Francisco, thanks for the kind words! We do not deal with continuous variables for now in aGrUM... mainly because we do not feel very confident in the existing models (like CLG, etc.) for continuous graphical models. We are working on different extensions and collaborations with other libraries to provide a good way to deal with continuous variables in graphical models.

For now, you can of course learn a quasi-continuous BN from continuous data (see above).

But be aware that the choice of the discretization may have a large impact on the structure of the final BN.
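
For reference, a minimal sketch (not from the chat; file and column names are hypothetical) of this discretize-then-learn workflow, using pandas for the binning and pyAgrum's BNLearner:

import numpy as np
import pandas as pd
import pyAgrum as gum

df = pd.read_csv("continuous_data.csv")  # hypothetical continuous dataset
for col in df.columns:
    # 10 homogeneous bins per variable, as with linspace()
    edges = np.linspace(df[col].min(), df[col].max(), 11)
    df[col] = pd.cut(df[col], bins=edges, labels=False, include_lowest=True)
df.to_csv("discretized_data.csv", index=False)

learner = gum.BNLearner("discretized_data.csv")
bn = learner.learnBN()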
Francisco J. Camacho
@pachocamacho1990
Thank you for the kind and quick reply; I tried as you said. Instead of discretizing the variable using a homogeneous spacing with linspace() or arange(), I used a customized non-homogeneous spacing, because my variables have a range of values on a logarithmic scale! I finally managed to train a BN with such a discretization scheme. I also noticed that the granularity of the discretization scheme has a strong impact on the resulting BN structure; is there any theory or study regarding this connection?
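
A minimal sketch of such a non-homogeneous, log-scale binning (with toy data standing in for Francisco's variables):

import numpy as np
import pandas as pd

x = pd.Series(np.random.lognormal(mean=0.0, sigma=2.0, size=1000))  # toy log-scale data
edges = np.geomspace(x.min(), x.max(), num=11)                      # 10 log-spaced bins instead of linspace()
codes = pd.cut(x, bins=edges, labels=False, include_lowest=True)    # non-homogeneous discretization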
Pierre-Henri Wuillemin
@phwuill_gitlab

Hi Francisco,

First, before analyzing and using your BN, I suggest you either wait for the new tag "0.15.3" (hopefully during this week) or (if you use pip to download pyAgrum) install pyAgrum-nightly: I found a nasty bug in the parameter learning algorithm when at least one configuration of the parents of a node is not found in the database. It is fixed in the master of our GitLab repository but not deployed yet, so the results you have for now may be erroneous (in terms of learned parameters). If you prefer, a quick workaround is to add a small prior (Laplace adjustment) with learner.useAprioriSmoothing(1e-5), which guarantees that no parent configuration is unknown.

Second, for your choice of discretization, this is exactly the reason why we do not do it ourselves :-) The correct discretization is directly related to the database and the user.
Concerning the impact of the granularity, I do not know of any theory on this, but it is quite well known that independence tests (such as chi2) and scores are not very robust w.r.t. this granularity.
You may try the learning algorithm based on mutual information instead of scores or chi2 in aGrUM: MIIC (learner.useMIIC()), which should be a bit more robust.

However, it is also important to note that a high granularity (with many values for the discrete variable) will need much more data for the learning phase (structure and parameters).
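
Put together, the workaround and the MIIC suggestion look roughly like this (a sketch only; "data.csv" is a hypothetical discretized dataset):

import pyAgrum as gum

learner = gum.BNLearner("data.csv")
learner.useAprioriSmoothing(1e-5)  # tiny Laplace prior: no parent configuration stays unseen
learner.useMIIC()                  # mutual-information-based structure learning
bn = learner.learnBN()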

Francisco J. Camacho
@pachocamacho1990
Hi Pierre,
Yeah, I had noticed that the inference results were weird in most cases. I tried learner.useAprioriSmoothing(1e-5), with 'apparently better' results. Thanks again!!
Pierre-Henri Wuillemin
@phwuill_gitlab
Nice to hear... The next tag should come out around next Monday.
With such a weak prior, the results are very close to the ML estimation of the probabilities.
(You can make it even smaller! :-) )
Francisco J. Camacho
@pachocamacho1990
Hi! It's me again. I'm now at the stage of causal analysis over my BN, but I am unable to run the methods from pyAgrum.causal such as csl.causalImpact(). Every time I run one, my kernel dies (as shown in the pic below). I am running it on a Google Cloud VM with 10 cores and 50 GB of RAM, using Ubuntu 18.04 as OS. The network is rather small (11 nodes), where each node has a Labelized variable with 20 different possible categories, and I can perform inference with no problem at all. Could it be a bug in the causal methods, or maybe something is wrong with my machine configuration?
[screenshots attached]
Francisco J. Camacho
@pachocamacho1990
[image attached]
Francisco J. Camacho
@pachocamacho1990
[image attached]
I also tried using the IPython UI directly, but ...
Pierre-Henri Wuillemin
@phwuill_gitlab

Hi Francisco, this is weird, but two things before the bug:

1- To apply causal calculus, you have to be certain that your BN is a causal network.

2- Let's assume that this is the case and that you have a causal network. pyAgrum.causal is a module to perform such causal calculus, particularly in the presence of latent variables (do-calculus). If your causal network does not involve any latent variables, you can compute the causal impact of $X$ just by building a BN that is a copy of your causal network with the arcs into $X$ removed (so $X$ has no parents). In this BN, classical inference is sufficient to explore the causal impact of $X$.

Now, I do not think that your configuration has anything to do with this error. It is weird and should be the result of either a bug in pyAgrum.causal or something weird in your BN (probably in its CPTs). If you want me to investigate a bit more, maybe you can give me the exact causal impact you want to compute (using the node ids and not the names, which I cannot interpret) or an anonymised BIF version of your BN?

[image attached]
For now, I have tried to create the same BN (with random CPTs) and I am able to compute causal impacts in it...
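As an aside, here is a minimal sketch of the mutilation idea from point 2 above (toy network and names, not Francisco's BN): with no latent variables, the causal impact of $X$ reduces to plain inference in a copy of the BN where every arc into $X$ has been removed.

import pyAgrum as gum

bn = gum.fastBN("A->X->Y;A->Y")   # toy causal network
mutilated = gum.BayesNet(bn)      # working copy
x = mutilated.idFromName("X")
for parent in list(mutilated.parents(x)):
    mutilated.eraseArc(parent, x)  # remove every arc into X: X has no parents anymore

ie = gum.LazyPropagation(mutilated)
ie.setEvidence({"X": 1})           # do(X=1) becomes ordinary conditioning
ie.makeInference()
print(ie.posterior("Y"))           # P(Y | do(X=1))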
Pierre-Henri Wuillemin
@phwuill_gitlab
Using bn=gum.fastBN("10->3->8<-4->9<-1->7<-3->0->6->5<-2<-3->1->4;2<-0->1;4<-3->5",20) to get the same size (20 categories) as in your BN does not change anything (i.e. it still works on my laptop).
Pierre-Henri Wuillemin
@phwuill_gitlab
Hello, aGrUM/pyAgrum 0.16.0 is out! See https://gitlab.com/agrumery/aGrUM/blob/master/CHANGELOG.md
@pachocamacho1990, the bug on learning CPTs is fixed in this update.
Pierre-Henri Wuillemin
@phwuill_gitlab
Hello, aGrUM/pyAgrum 0.16.2 is out! See https://agrum.gitlab.io/articles/agrumpyagrum-0162-released.html
Enzo
@smartinsightsfromdata_gitlab
Hi. I discovered pyAgrum a while back and I really like it. I saw the slides of a recent presentation by Prof. Wuillemin with an example around a faulty car ("voiture en panne"?). Would it be possible to have a copy of the notebook?
Disclosure: my interest is ultimately troubleshooting within the wide context of telecom networks. I would like to learn from simple, real-life examples how to structure data for learning, etc. This could range from troubleshooting home internet to more complex cases. Thanks in advance.
Pierre-Henri Wuillemin
@phwuill_gitlab

Hi, Enzo.

This example was built during an internship in association with a company, and I am not sure how much of it we can share. However, the idea of this example was not to learn a BN dedicated to troubleshooting but rather to show how to use a BN and mutual information to design an algorithm for troubleshooting.

Do not hesitate to contact us with questions if needed!

Enzo
@smartinsightsfromdata_gitlab
Hi. An update. I actually managed to find the faulty car BN here: http://www.cs.uu.nl/docs/vakken/prob/practicum.html (as "car diagnosis network") and implemented it in pyAgrum (it seems to be a sort of "hello world" for troubleshooting BNs!).
This has helped me understand a bit better:
  1. how I could structure the data to build a troubleshooting BN (at least I have an initial idea - the Titanic example helped a lot - and pyAgrum in a notebook is just perfect for self-learning!).
  2. how to build a network from data in an unsupervised manner. I know it is a very active research field, and I have some practical questions, but maybe I'll ask another time.
Where I would like to spend some time now is on reflecting on how to structure a BN of this type for decision making (which I thought was the most interesting part of Prof. Wuillemin's presentation at the IBM event): e.g. how to create a sort of hierarchy of node types (diagnosis nodes, fault nodes, etc.) to support potentially unattended (e.g. via chatbot) troubleshooting.
Is there any article, research paper, or book that would help me with this (application of BNs for decision making, especially focused on troubleshooting)?
Thanks in advance. Enzo
Pierre-Henri Wuillemin
@phwuill_gitlab

Hi Enzo,

It may not be the perfect reference, but it is a simple entry point: https://arxiv.org/pdf/1301.3880.pdf (in particular, section 5 proposes a general structure of a BN for troubleshooting).

Pierre-Henri Wuillemin
@phwuill_gitlab
Hello, aGrUM/pyAgrum 0.16.4 is out! See https://agrum.gitlab.io/articles/agrumpyagrum-0164-released.html
(in particular: packages for Python 3.8!)
Pierre-Henri Wuillemin
@phwuill_gitlab
Hello, aGrUM/pyAgrum 0.17.0 is out! See https://agrum.gitlab.io/articles/agrumpyagrum-0170-released.html
rhim marwa
@marwarh_gitlab
Hello, I'm trying to discover pyAgrum. I hope you can help me find the source code of the "setEvidence" function; I need to know how it works. Thanks in advance.
Pierre-Henri Wuillemin
@phwuill_gitlab

Hi Rhim Marwa,

As you may know, pyAgrum is just a wrapper for the C++ library aGrUM.

setEvidence is a method designed only for this very wrapper. You can find its code in wrappers/pyAgrum/swigsrc/inference.i, where you will see that setEvidence is a call to clearAllEvidence followed by a loop calling addEvidence for every key:value pair found in its argument.

So setEvidence({"A":1,"B":2}) is just a shortcut for the code:

clearAllEvidence()
addEvidence("A",1)
addEvidence("B",2)
Pierre-Henri Wuillemin
@phwuill_gitlab
Hello, aGrUM/pyAgrum 0.17.1 is out! See https://agrum.gitlab.io/articles/agrumpyagrum-0171-released.html
Julien Schueller
@jschueller
Hello @phwuill_gitlab, why did you change the default value of the cmake variable BUILD_PYTHON to OFF?
Pierre-Henri Wuillemin
@phwuill_gitlab
Hi Julien.
By default, the CMakeLists.txt builds aGrUM; building pyAgrum is an option. I know that may not be the case for you, but for us aGrUM remains the main target (at least for compilation, development, tests, and prototypes).
It should not be difficult to add the proper "-D" flag (e.g. -DBUILD_PYTHON=ON) when calling cmake.
Julien Schueller
@jschueller
ok
Julien Schueller
@jschueller

Hi agrumers,
I hope everybody is well.

I was surprised that the aGrUM API is non-deterministic even though I use gum::initRandom.
If I print a potential many times, I can see that the variable order can change: it starts with either A or C.

#include <iostream>

#include <agrum/BN/BayesNet.h>
#include <agrum/BN/inference/lazyPropagation.h>
#include <agrum/tools/multidim/potential.h>
#include <agrum/tools/variables/discretizedVariable.h>
#include <agrum/tools/variables/labelizedVariable.h>
#include <agrum/tools/variables/rangeVariable.h>

int main(int /*argc*/, char ** /*argv*/)
{
  for (int i = 0; i < 100; ++i)
  {
    gum::initRandom(10);
    gum::BayesNet<double> bn;
    bn.add(gum::DiscretizedVariable<double>("A", "A", {1, 1.5, 2, 2.5, 3, 4}));
    bn.add(gum::LabelizedVariable("B", "B", {"chaud", "tiede", "froid"}));
    bn.add(gum::RangeVariable("C", "C", 1, 4));
    bn.addArc("A", "C");
    bn.addArc("C", "B");
    bn.cpt("A").fillWith({1, 2, 3, 4, 5}).normalize();
    bn.cpt("C").fillWith({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}).normalizeAsCPT();
    bn.cpt("B").fillWith({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}).normalizeAsCPT();

    gum::LazyPropagation<double> ie(&bn);
    ie.addJointTarget({0, 2});
    ie.addTarget(1);
    ie.makeInference();

    std::cout << ie.jointPosterior({0, 2}) << std::endl;
  }
  return 0;
}

Fortunately, I saw there is a putFirst method that allows one to rearrange the view, but still.
Is it because you use pointers in your hash tables?
What is initRandom for, then?

Pierre-Henri Wuillemin
@phwuill_gitlab

Hi Julien,
There is a big difference between our multidimensional containers (gum::Potential) and those from OpenTURNS or numpy, for instance: we did not want to be forced to use an explicit and difficult-to-maintain order when dealing with tensor algebra. For instance, P(A|B,C) * P(B,C) will always be correctly computed for any order of A,B,C in the first Potential and any order of B,C in the second.
The 'true but uninteresting' order is indeed given by the hashed values of the pointers to the variables. (You may note that our Potential operators (+, *, margSumIn, etc.) can even reorder the potentials in order to optimize the computations.)
If you really want to force an order, you indeed have the two Potential methods Potential::putFirst (for a Potential containing a CPT: the first variable is the conditioned one) and Potential::reorganize.

gum::initRandom just chooses the seed for our random generator.

By the way, your code can be really compacted using fastPrototype (in aGrUM), renamed fastBN in pyAgrum:

auto bn=gum::BayesNet<double>::fastPrototype("A[1, 1.5, 2, 2.5, 3, 4]->B{chaud|tiede|froid}<-C[1,4]");
bn.cpt("A").fillWith({1, 2, 3, 4, 5}).normalize();
bn.cpt("C").fillWith({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}).normalizeAsCPT();
bn.cpt("B").fillWith({1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}).normalizeAsCPT();
Julien Schueller
@jschueller

Ok, is this non-determinism mentioned in the doc?

Weird, with your code it throws a SizeError at the line "bn.cpt("C").fillWith".

Pierre-Henri Wuillemin
@phwuill_gitlab

For the doc: not sure :-) We should add a warning in the multidim section somewhere.

For my code, I just messed up the structure :-) A->B<-C instead of A->C->B:

bn=gum.fastBN("A[1, 1.5, 2, 2.5, 3, 4]->C[1,4]->B{chaud|tiede|froid}");
bn.cpt("A").fillWith([1, 2, 3, 4, 5]).normalize();
bn.cpt("C").fillWith([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]).normalizeAsCPT();
bn.cpt("B").fillWith([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]).normalizeAsCPT();

(oops, in Python, sorry... just take the right string from the first line)

Julien Schueller
@jschueller
It should definitely be added; it's so unusual to run into random APIs.
Well spotted, thanks.
Pierre-Henri Wuillemin
@phwuill_gitlab
Hi, Julien. I have to disagree with you on that point: it is not so unusual for a container indexed by keys not to guarantee an order when a sequence of contained elements is returned. A hashtable is a good example, of course... and if the keys are pointers to dynamically allocated objects, the API is "random" in the sense that the result may differ for each run (of course it is not really random).
Pierre-Henri Wuillemin
@phwuill_gitlab
Hi @jschueller, just to conclude this discussion. Here is a (slightly modified) extract of our tests:

auto a = gum::LabelizedVariable("a", "afoo", 3);
auto b = gum::LabelizedVariable("b", "bfoo", 3);
auto c = gum::LabelizedVariable("c", "cfoo", 3);
auto d = gum::LabelizedVariable("d", "dfoo", 3);

gum::Potential<double> p;
p.add(a).add(b).fillWith({1, 2, 3, 4, 5, 6, 7, 8, 9});

gum::Potential<double> q;
q.add(c).add(d).fillWith({4, 5, 6, 3, 2, 1, 4, 3, 2});

// What would be the "correct" common order for joint1 and joint2?
auto joint1 = p * q;
auto joint2 = q * p;
TS_ASSERT_DIFFERS(joint1.toString(), joint2.toString()); // the orderings are not the same
auto joint3 = (q * p).reorganize({&c, &d, &a, &b});
TS_ASSERT_EQUALS(joint1.toString(), joint3.toString()); // joint3 has been reordered just like joint1

// Using an Instantiation that fixes the order, you can iterate over joint1, joint2 and joint3 with the same sequence.
gum::Instantiation inst;
inst << a << c << b << d;
for (inst.setFirst(); !inst.end(); ++inst) {
    TS_ASSERT_EQUALS(joint1.get(inst), joint2.get(inst));
    TS_ASSERT_EQUALS(joint1.get(inst), joint3.get(inst));
}
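For completeness, a minimal pyAgrum sketch (not from the chat) of forcing a display order with putFirst, as mentioned earlier in the discussion:

import pyAgrum as gum

bn = gum.fastBN("A->C->B")         # toy BN
joint = bn.cpt("C") * bn.cpt("A")  # P(C|A) * P(A) = P(A,C); the internal order is unspecified
print(joint.putFirst("C"))         # same values, displayed with C as the first variable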
Pierre-Henri Wuillemin
@phwuill_gitlab
Hello, aGrUM/pyAgrum 0.17.3 is out! See https://agrum.gitlab.io/articles/agrumpyagrum-0173-released.html