    Jakedismo
    @Jakedismo
    now I got a broadcast error about shape, progress
    Henri Vuollekoski
    @vuolleko
    Ok, well that’s good, I guess. :D I hope solving that will solve the memory issue as well
    Jakedismo
    @Jakedismo
    "could not broadcast input array from shape (36924,18462) into shape (2,18462)" that'll probably solve the memory issue once I figure this one out
    but thx for the help I think this is enough for a working day
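For reference, NumPy raises this kind of error when an array is assigned into a slot whose shape it cannot be broadcast to. A minimal sketch with small made-up shapes; the `out` and `values` arrays are hypothetical and not taken from the model discussed above:

```python
import numpy as np

# Hypothetical stand-ins: a preallocated slot of shape (2, 3) and a
# result whose leading dimension is larger than expected.
out = np.empty((2, 3))
values = np.ones((4, 3))

try:
    out[:] = values  # shapes (4, 3) and (2, 3) cannot be broadcast together
except ValueError as err:
    message = str(err)

print(message)
```

Printing the `.shape` of the array right before the failing assignment is usually the quickest way to see which dimension is off.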
    Henri Vuollekoski
    @vuolleko
    No problem. I hope you get it fixed
    Jakedismo
    @Jakedismo
    Hello, I have a question about parallelization. I can't get it working under Jupyter Notebook. I set the client to multiprocessing as in the examples, but when I start sampling I get these errors in the console: AttributeError: Can't get attribute 'legacy_v2' on <module '__main__' (built-in)>
    Process SpawnPoolWorker-2:
    Traceback (most recent call last):
    File "C:\Anaconda\envs\DataScienceEnv\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
    File "C:\Anaconda\envs\DataScienceEnv\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
    File "C:\Anaconda\envs\DataScienceEnv\lib\multiprocessing\pool.py", line 108, in worker
    task = get()
    File "C:\Anaconda\envs\DataScienceEnv\lib\multiprocessing\queues.py", line 337, in get
    return _ForkingPickler.loads(res)
    AttributeError: Can't get attribute 'legacy_v2' on <module '__main__' (built-in)>
    basically the worker nodes can't get the model function?
    Henri Vuollekoski
    @vuolleko
    The worker nodes can’t see definitions from interactive sessions. There are some hints in http://elfi.readthedocs.io/en/latest/usage/parallelization.html#working-interactively-with-ipyparallel . The problem is the same with the multiprocessing module.
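The AttributeError in the traceback above follows from how Python's pickle handles functions: it sends only the module name and the function name, and the worker has to import that module to find the function. A minimal sketch of the usual workaround, which is to move the definition into an importable file. The module name `my_simulators` and the function body are placeholders; only the name `legacy_v2` comes from the traceback.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Write the (placeholder) model function into a real module file so that
# it can be imported by name instead of living only in the notebook.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "my_simulators.py").write_text(
    "def legacy_v2(theta, batch_size=1, random_state=None):\n"
    "    return theta\n"
)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
my_simulators = importlib.import_module("my_simulators")

# Pickle stores a reference (module name + function name), not the code.
payload = pickle.dumps(my_simulators.legacy_v2)
print(b"my_simulators" in payload, b"legacy_v2" in payload)

# Unpickling imports the module and looks the function up again.
restored = pickle.loads(payload)
print(restored is my_simulators.legacy_v2)
```

In practice the file would sit next to the notebook (or in an installed package) rather than in a temporary directory; the temporary directory only keeps the sketch self-contained.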
    Maria L. Montoya
    @marilomf
    Hi, I am using BOLFI and trying to run it with parallelization, but I am getting the following error: ipyparallel.error.RemoteError: AttributeError('Matern32' object has no attribute '_name'). I already tried to include this with ipyclient[:].sync_imports(): import GPy, but it does not work. Any clue how to solve this? Thank you
    Henri Vuollekoski
    @vuolleko
    Hi, answered in GitHub #263
    Henri Vuollekoski
    @vuolleko
    ELFI 0.7.1 released, make sure you update. :) pip install -U elfi
    Bart Pelssers
    @pelssers
    Hi ELFI developers, I've been successfully using BOLFI for a project with an additional constraint on my model. The model I use has 2 inputs 'px' and 'py'. The constraint I have is that px ** 2 + py ** 2 < R ** 2 with R a constant. When I run BOLFI.fit() the GP also proposes some points outside the allowed region which do not have a physical interpretation in my model.
    I've solved the problem by changing the minimize function in elfi.methods.bo.utils. When calling scipy.optimize.minimize I give method='SLSQP' and the constraints argument. Secondly I needed to subclass LCBSC to call this modified minimize function.
    My question is if you are interested in implementing this in BOLFI. It would require adding minimizer_kwargs to the minimize function so one can call the minimizer with different methods/constraints. Secondly LCBSC (and possibly other acquisition methods) should also take these kwargs to pass them on to the minimize call.
    Let me know what you think, I'd be happy to submit a pull request.
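A minimal sketch of the constrained call described above, using `scipy.optimize.minimize` with `method='SLSQP'` and a `constraints` argument. The `acquisition` function and the value of `R` are toy placeholders, not ELFI's acquisition code. SLSQP treats inequality constraints as non-strict, so the strict `<` becomes `<=` here.

```python
import numpy as np
from scipy.optimize import minimize

R = 1.0  # radius of the allowed region (placeholder value)


def acquisition(p):
    """Toy stand-in for an acquisition function; its free minimum (2, 1) lies outside the disc."""
    px, py = p
    return (px - 2.0) ** 2 + (py - 1.0) ** 2


# SLSQP expects inequality constraints in the form fun(p) >= 0.
constraints = [{"type": "ineq", "fun": lambda p: R ** 2 - p[0] ** 2 - p[1] ** 2}]

result = minimize(
    acquisition,
    x0=np.array([0.1, 0.1]),
    method="SLSQP",
    constraints=constraints,
)
print(result.x, result.x[0] ** 2 + result.x[1] ** 2)
```

With the constraint in place the optimizer stops on the boundary of the disc instead of proposing the unconstrained minimum.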
    Henri Vuollekoski
    @vuolleko
    Hi Bart. Yes, of course we’re interested. This has actually been on our TODO list. Feel free to open a pull request on GitHub :)
    Bart Pelssers
    @pelssers
    Hi Henri. Thanks, I'll do that.
    Tatiana
    @ZePoLiTaT_twitter
    Hi ELFI-Dev team,
    First of all, thanks for the great framework! I've been following the BOLFI tutorial and I was wondering if there is a way to update the prior every k iterations. For example, let's say I start with a prior elfi.Prior(scipy.stats.uniform, 0, 2) and after k iterations I have information to shrink it to elfi.Prior(scipy.stats.uniform, 0, 1), then perform k more iterations, update the prior again, etc.
    Henri Vuollekoski
    @vuolleko
    Hi Tatiana. You can set the update_interval keyword argument to the BOLFI constructor, see http://elfi.readthedocs.io/en/latest/usage/BOLFI.html This will set the period of automatically optimizing the GP hyperparameters.
    If you want to change the ELFI priors, I suppose you could write an external loop that changes the priors using their become method and then set up a new BOLFI instance using the previous points as initial_evidence. (However, note that BOLFI’s fit method doesn’t use ELFI priors with the default acquisition function.) I’m not sure why you would want to do that though.
    Henri Vuollekoski
    @vuolleko
    ELFI 0.7.2 released pip install -U elfi
    Henri Pesonen
    @hpesonen
    ELFI 0.7.3 released pip install -U elfi
    Umberto Picchini
    @uPicchini_twitter
    Hi. I run post = bolfi.fit(n_evidence=200). I would like to extract the values of the parameters that correspond to the best discrepancy. Is there a way to do that? To expand a little: while it would be nice to follow up bolfi.fit with NUTS sampling (via bolfi.sample), the NUTS chains actually seem to end up quite far away from the very nice results returned by bolfi.fit. Using bolfi.plot_discrepancy() I can tell that the smaller discrepancies are indeed around the true parameter values (I am doing a simulation study). This is great, and I would like to extract the parameter values that give the best discrepancies and feed them to some other software, running something other than NUTS.
    Henri Pesonen
    @hpesonen
    Hi Umberto. You can use e.g. bolfi.extract_result().x_min to access the parameters.
    Umberto Picchini
    @uPicchini_twitter
    awesome! Thanks.
    Umberto Picchini
    @uPicchini_twitter
    In ABC it is often useful to "weight" the summary statistics so that they all contribute about equally to the discrepancy, or at least so that the discrepancy is not dominated by a single summary (e.g. Prangle 2018 "Adapting the ABC Distance Function"). If we don't do that, the ABC threshold will have an effect only on the most variable of these summaries, while the others are neglected. Is there a mechanism like this implemented within BOLFI? I see that one parameter is strangely badly estimated for a g-and-k case study I took from your list of examples, while the other parameters are OK.
    Henri Pesonen
    @hpesonen
    Currently the scaling of the summary statistics has to be done by hand. The weighting is one of the features on the TODO-list!
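A minimal sketch of what scaling "by hand" could look like, assuming a pilot set of simulated summaries is available: each summary is divided by its median absolute deviation (MAD) before the Euclidean distance is taken. The pilot data, the helper names `mad` and `scaled_distance`, and the choice of MAD as the scale are illustrative assumptions, not an ELFI feature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pilot run: 500 simulations, two summaries on very different scales.
pilot = np.column_stack([rng.normal(0.0, 1.0, 500), rng.normal(0.0, 100.0, 500)])
observed = np.array([0.0, 0.0])


def mad(values):
    """Median absolute deviation of each column."""
    center = np.median(values, axis=0)
    return np.median(np.abs(values - center), axis=0)


def scaled_distance(simulated, observed, scales):
    """Euclidean distance after dividing every summary by its scale."""
    diff = (np.atleast_2d(simulated) - np.atleast_2d(observed)) / scales
    return np.sqrt(np.sum(diff ** 2, axis=1))


scales = mad(pilot)
scaled = scaled_distance(pilot, observed, scales)

# Share of the mean squared distance contributed by each summary.
raw_share = np.mean((pilot - observed) ** 2, axis=0)
scaled_share = np.mean(((pilot - observed) / scales) ** 2, axis=0)
print(raw_share / raw_share.sum())
print(scaled_share / scaled_share.sum())
```

Without scaling, the second summary accounts for almost all of the distance; after scaling, the two summaries contribute roughly equally.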
    Henri Pesonen
    @hpesonen
    ELFI 0.7.4 released - pip install -U elfi
    NedaMarvasti
    @NedaMarvasti
    Hi, I have a problem with parallelization. Following the ELFI documentation, when I add elfi.set_client('multiprocessing') before my rej.sample line, the process takes more time compared to not using it. Any help?
    Henri Pesonen
    @hpesonen
    Whether you get a speed-up from using multiprocessing is often problem dependent.
    ELFI 0.7.5 released - pip install -U elfi
    Josh Jacobson
    @joshhjacobson
    Hi dev team, thanks for all your great work on this framework! I have a question about the Summary node: currently, my simulator generates a random sample, then my summary statistics are a set of distribution quantiles. However, given a set of parameters, I could also produce the theoretical quantiles directly and avoid simulation and computation of empirical quantiles. If I want to pass a set of quantiles directly from the Simulator to the Distance node, do I just pass the exact vector through the Summary node, or is there a better way? Thanks!
    Henri Pesonen
    @hpesonen
    Hi Josh! You don't need to pass it through Summary, you can just pass the Simulator directly to the Distance.
    Josh Jacobson
    @joshhjacobson
    @hpesonen thanks for the quick response! I was able to pass the simulator node directly to the distance node and run the rejection sampler. However, when I try to apply a regression adjustment using elfi.adjust_posterior(model=mod, sample=res, summary_names=[sim.name]), I get the following error: The node _sim_observed is not in the digraph. Observations were passed in the simulator node, named sim. Is this a bug I should report?
    Henri Pesonen
    @hpesonen
    @joshhjacobson Currently it seems that you have to explicitly have the summary statistics in the ELFI graph for the regression (it's mentioned in the docstring). For now the quickest workaround would be to define dummy summary statistic nodes that pick out the dimensions of your simulator output as separate "summary statistics" (regression adjustment currently works for a set of 1D sumstats, but I'll fix that soon). Let me know if I can help out with the workaround.
    Josh Jacobson
    @joshhjacobson
    Hi @hpesonen, thanks for clearing that up! I was able to get things working with the following setup (I'm not sure why the first condition in the summary function is required, but it works):
    def match_quantile(q, idx): 
        if len(q.shape) != 2:
            return np.expand_dims(np.array(q[idx]), axis=0)
        else:
            return q[:, idx]
    
    summary_stats = list()
    for idx in range(len(observed_data)):
        summary_stats.append(elfi.Summary(match_quantile, model, idx, name=f"S{idx}"))
    Henri Pesonen
    @hpesonen
    @joshhjacobson Great! The requirement for the first condition is probably due to elfi performing the inference in batches with different batches being in rows. numpy.atleast_2d() also usually works wonders!
    Josh Jacobson
    @joshhjacobson
    return numpy.atleast_2d(q)[:, idx] does the trick and is much cleaner, thanks! Looks like it's required due to the first pass through the summary stats in which a 1d array is passed to the function, but after that things come through in batches as 2d arrays.
    Josh Jacobson
    @joshhjacobson
    Upon further inspection, it looks like the 1d array is the data vector and the 2d array contains the batches of simulation output. This makes complete sense considering both are needed for the distance calculation.
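A minimal sketch of the `numpy.atleast_2d` variant discussed above, applied to both kinds of input: a 1d observed vector and a 2d batch of simulated output. The example arrays are made up for illustration.

```python
import numpy as np


def match_quantile(q, idx):
    """Pick quantile idx from a 1d observed vector or a 2d batch (rows = simulations)."""
    return np.atleast_2d(q)[:, idx]


# Made-up data: one observed vector and a batch of two simulations.
observed = np.array([0.1, 0.5, 0.9])
batch = np.array([[0.2, 0.4, 0.8],
                  [0.0, 0.6, 1.0]])

obs_out = match_quantile(observed, 1)   # 1d input is promoted to shape (1, 3)
batch_out = match_quantile(batch, 1)    # 2d input is used as is

print(obs_out.shape, batch_out.shape)
```

Both calls return one value per row, so the observed data and the simulated batches go through the same code path.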
    Daniel Ward
    @danielward27

    Hi. Thanks for the great software! I have managed to create a simulator node y and a summary statistic node s. If I run their respective generate methods, it produces the expected output with the correct shapes:

    test_y = y.generate(3)
    test_s = s.generate(3)
    print(f"test_y {test_y.shape}   test_s {test_s.shape}")
    >> test_y (3, 1)   test_s (3, 102)

    However, when I try to run rejection sampling, I get an error:
    “TypeError: In executing node '_s_observed': iteration over a 0-d array.”
    For some reason when my summary statistic function is called, it gets passed a 0 dimensional array from the simulator. Any idea why this could be? By the speed the error is thrown, it doesn't seem like the simulator is even run. Thanks a lot for any help it is much appreciated!

    Daniel Ward
    @danielward27
    Ah, never mind. I think the issue was similar to the one above, in that my observed data was 1d, not 2d (which is also why generate worked, as it doesn't rely on the observed data).
    Henri Pesonen
    @hpesonen
    Release 0.7.7 is out. pip install -U elfi
    kycn0
    @kycn0
    Hello, I am really a newbie and I am trying to use BOLFI. Can I ask some questions in this channel as a user? Since this channel is called 'dev', I was not sure if it is okay.
    Henri Pesonen
    @hpesonen
    Hi @kycn0! Of course, please do.
    kycn0
    @kycn0
    Thanks. One of my questions is: how can I use the 'iterative importance sampling approach' from the reference article by Gutmann, rather than MCMC, which is the sampling method BOLFI uses?
    kycn0
    @kycn0
    I also want to generate two parameters, a parent and its child, both connected to the simulator. I have defined the priors of the parameters with custom distributions that inherit from elfi.Distribution. A problem arises once I run .iterate() on my bolfi object. The parent parameter gives an integer, and the child parameter creates a list whose length is the integer provided by the parent. You can think of it as a number of vectors and their amplitudes: the parent decides the number of vectors, so the simulation needs that many amplitudes, which are contained in a list. I am trying to infer the number of vectors and their amplitudes, and I have written the generative function accordingly. But once I run .iterate or .fit on my bolfi object, an error is raised saying that the dimensions do not match between the parameters, which is expected, since the parent provides an integer according to which the child generates a list. I have not been able to overcome this issue for a few days, so am I doing something fundamentally wrong? The generative function accepts the outputs of the parent and child correctly, but I still cannot start the simulation. Thank you.
    kycn0
    @kycn0
    As an addition to the last question, is it possible to define a discrete custom prior distribution of the scipy.stats type? Custom prior distributions must have rvs, logpdf and pdf, whereas for a discrete distribution the last two would be logpmf and pmf. However, the class description of Prior suggests following rv_continuous when defining a custom prior. So can't we define an rv_discrete type distribution using logpdf and pdf?
    btw thank you all the dev team for this product !!! :)
    kycn0
    @kycn0
    Hello, I want to create the evidence dict for BOLFI. The description says, "... dict containing parameter and discrepancy values". Assume I have two parameters and 50 initial evidence points. What should this dict look like? I think the keys must match the prior names, but how should I add the parameters and the corresponding discrepancies?
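One plausible reading of the quoted description is a dict with one array per parameter plus one array for the discrepancy, each keyed by node name and each holding one entry per evidence point. The sketch below builds such a dict with plain NumPy. The layout is an assumption that has not been checked against ELFI here, the names `t1`, `t2` and `d` are hypothetical, and the random discrepancies are placeholders for values that would really come from running the simulator.

```python
import numpy as np

rng = np.random.default_rng(1)
n_evidence = 50

# Hypothetical node names: two parameters "t1", "t2" and a discrepancy node "d".
# Entry i of every array belongs to the same evidence point.
initial_evidence = {
    "t1": rng.uniform(0.0, 2.0, n_evidence),
    "t2": rng.uniform(0.0, 2.0, n_evidence),
    "d": rng.uniform(0.0, 5.0, n_evidence),  # placeholder discrepancies
}

for name, values in initial_evidence.items():
    print(name, values.shape)
```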
    Henri Pesonen
    @hpesonen
    Release 0.8.0 is out. pip install -U elfi
    kycn0
    @kycn0
    Hello again, I have a question regarding the parents of an Operation object. I have nodes whose outputs I want to sum, so I need to give those nodes to the *parents argument of Operation. But I have plenty of nodes, so I want to do this in a loop. In that case I get an error: the second time the loop iterates, it says the operation node already exists. To clarify, I have written this small reproducible script that explains my problem. If you have time, could I get your point of view on this issue?
    or I can just copy the script as well
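Since *parents is ordinary Python variadic-argument syntax, one option that avoids creating a new Operation on every loop iteration is to collect the nodes in a list and unpack the list into a single call. The sketch below shows only the summing function, applied to plain arrays. Passing it on as `elfi.Operation(sum_outputs, *nodes)` is an untested assumption, and `sum_outputs` and `nodes` are hypothetical names.

```python
import numpy as np


def sum_outputs(*outputs):
    """Sum any number of equally shaped batch outputs element-wise."""
    return np.sum(np.stack(outputs), axis=0)


# Stand-ins for the batch outputs of four nodes (batch size 3).
outputs = [np.full(3, float(i)) for i in range(4)]

# Unpacking the list passes every element as a separate positional argument.
total = sum_outputs(*outputs)
print(total)
```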