Blaise Delaney
@BlaiseDelaney
Hello devs, I have a quick question on something that I suspect may have been sort of asked already; apologies, if that's the case. I wondered if there was a way to perform several fits while looping over arrays within the scope of a .py file or whether one must write a wrapper of some sort. My attempts so far within a jupyter notebook seem to not allow it, as I need to restart the kernel to re-initialise parameters at each iteration. Thanks a lot in advance for your help!
Jonas Eschle
@jonas-eschle
Hey, what exactly do you mean by "looping over arrays"? Is the array the data or the parameter values?
If it is the latter (e.g. to create a likelihood profile), you don't have to re-initialize the parameters but can use the set_value method to set them to a new value.
Yanina Biondi
@YaniBion
Hi, newbie question: is there a way to model a pdf with a normalized histogram, like rv_histogram.pdf() from scipy.stats, when one does not know the parametric form?
I'm guessing I can use an object of a custom pdf class; I'd like to know whether this is stable and whether someone has tried it before, or whether there is a better method.
Jonas Eschle
@jonas-eschle
Hi, this has indeed been done already multiple times, here is a snippet on how to do it as an example:
import zfit
from zfit import z
import numpy as np
import tensorflow as tf

zfit.run.set_autograd_mode(False)


class BinnedEfficiencyPDF(zfit.pdf.BasePDF):

    def __init__(self, efficiency, eff_bins, obs, name='BinnedEfficiencyPDF'):
        self.efficiency = efficiency
        self.eff_bins = eff_bins
        super().__init__(obs=obs, name=name)

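    def _bin_content(self, x):
        # np.digitize returns 0 for x below the first edge and len(eff_bins)
        # for x at or above the last one, so with this indexing `efficiency`
        # needs len(eff_bins) + 1 entries (including under- and overflow)
        eff_bin = np.digitize(x, self.eff_bins)
        return self.efficiency[eff_bin]

    def _unnormalized_pdf(self, x):  # or even try with PDF
        x = z.unstack_x(x)
        # wrap the NumPy lookup so that it can be called from TensorFlow
        probs = z.py_function(func=self._bin_content, inp=[x], Tout=tf.float64)
        probs.set_shape(x.shape)
        return probs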
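For readers who want to see the bin lookup on its own, here is a minimal, zfit-independent sketch of a normalized piecewise-constant histogram density, similar in spirit to rv_histogram.pdf() from scipy.stats. The names HistogramPDF, counts and edges are hypothetical and not part of zfit; only the standard library is used.

```python
from bisect import bisect_right


class HistogramPDF:
    """Piecewise-constant density built from bin counts and bin edges."""

    def __init__(self, counts, edges):
        if len(edges) != len(counts) + 1:
            raise ValueError("need exactly one more edge than counts")
        self.edges = list(edges)
        total = float(sum(counts))
        # density = count / (total * bin width), so the integral is 1
        self.density = [
            c / (total * (hi - lo))
            for c, lo, hi in zip(counts, self.edges[:-1], self.edges[1:])
        ]

    def pdf(self, x):
        # outside the histogram range the density is zero
        if x < self.edges[0] or x >= self.edges[-1]:
            return 0.0
        # bisect_right gives i with edges[i-1] <= x < edges[i]
        return self.density[bisect_right(self.edges, x) - 1]


hist = HistogramPDF(counts=[1, 3, 4], edges=[0.0, 1.0, 2.0, 4.0])

# sum of density * bin width over all bins
integral = sum(
    d * (hi - lo)
    for d, lo, hi in zip(hist.density, hist.edges[:-1], hist.edges[1:])
)
```

Unlike the snippet above, this sketch returns zero outside the binning instead of relying on under- and overflow entries; which behaviour is wanted depends on the use case.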
Yanina Biondi
@YaniBion
Thank you! I couldn't find it
Yanina Biondi
@YaniBion
when I try to make it extended, it fails though :/
Jonas Eschle
@jonas-eschle
Instead of using create_extended, you can also make a PDF extended in-place by using pdf.set_yield(...).
It fails because the copy function has not been properly implemented to take into account the things you've added in your custom class. But set_yield(...) works.
This will turn the pdf into an extended one.
Yanina Biondi
@YaniBion
thanks a lot
Ryunosuke O'Neil
@ryuwd
Hello, I wondered how to correctly use the convolution tools (e.g. FFTConvPDFV1) in zfit. How might I use FFTConvPDFV1, to create a PDF equivalent to the Voigtian in RooFit? I tried creating a new instance with the kernel set to a Gaussian and func to the RBW from the zfit_physics / bw branch, but it seems I cannot use this straightaway. Is there something I'm missing?
Jonas Eschle
@jonas-eschle
Hi, this should in principle work straight out of the box. If not, it is probably best to open an issue and post any error there. Alternatively, the Voigtian (or erfcx) function was just implemented in TensorFlow-Probability (https://www.tensorflow.org/probability/api_docs/python/tfp/math/erfcx); there is also an exponential convolved with a Gaussian.
If you are going to use one of these, you can contact us directly (private chat here, or mail), and you could even contribute it with some hints, as it is on the list of PDFs to be added anyway.
Rizwaan Mohammed
@Rizwaan96_twitter
Hi, sorry if this is a silly question, but when doing an UnbinnedNLL fit, is it a problem if the final FCN value is positive? The fit looks good otherwise
Jonas Eschle
@jonas-eschle
Hi, this is no problem at all. In fact, the absolute value of the likelihood is meaningless and should not be relied on (!); it can even be beneficial for the minimization to subtract a constant from the likelihood. Only the difference matters (between the same likelihood evaluated at different parameter values).
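A small standard-library sketch of this point, assuming a plain Gaussian model: the sign of the negative log-likelihood changes when a constant is subtracted, while the difference between two parameter values stays the same. The function gauss_nll and the numbers are made up for illustration and are not zfit code.

```python
import math


def gauss_nll(data, mu, sigma, offset=0.0):
    """Negative log-likelihood of a Gaussian, minus an arbitrary constant."""
    nll = 0.0
    for x in data:
        nll += 0.5 * ((x - mu) / sigma) ** 2
        nll += math.log(sigma * math.sqrt(2 * math.pi))
    return nll - offset


data = [1.2, -0.7, 3.1, 0.4, -2.3]
offset = 100.0  # arbitrary constant subtracted from the NLL

# difference between two parameter points, without and with the offset
delta_plain = gauss_nll(data, 0.5, 2.0) - gauss_nll(data, 0.0, 2.0)
delta_shifted = (gauss_nll(data, 0.5, 2.0, offset)
                 - gauss_nll(data, 0.0, 2.0, offset))
```

Here the unshifted value is positive and the shifted one negative, yet both deltas agree up to floating-point rounding.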
Rizwaan Mohammed
@Rizwaan96_twitter
Ah that's great, thanks a lot!
Jonas Eschle
@jonas-eschle

We've released the 0.6 series of zfit! The major addition is a lot of new minimizers that all support uncertainty estimation in the same way as is used now.

They can now be invoked completely independently of zfit models and used with pure Python functions.

The main changes (full changelog here):

  • upgraded to TensorFlow 2.4
  • Added many new minimizers. A full list can be found in the minimizer user API documentation (minimize_user_api).

    • IpyoptV1 that wraps the powerful Ipopt large scale minimization library
    • Scipy minimizers now have their own, dedicated wrapper for each instance such as
      ScipyLBFGSBV1, or ScipySLSQPV1
    • NLopt library wrapper that contains many algorithms for local searches such as
      NLoptLBFGSV1, NLoptTruncNewtonV1 or
      NLoptMMAV1 but also includes more global minimizers such as
      NLoptMLSLV1 and NLoptESCHV1.
  • Completely new and overhauled minimizers design, including:

    • minimizers can now be used with arbitrary Python functions and an initial array independent of zfit
    • a minimization can be 'continued' by passing init to minimize
    • more streamlined arguments for minimizers, harmonized names and behavior.
    • Adding a flexible criterion (currently EDM, the same that iminuit uses) that will terminate the minimization.
    • Making the minimizer fully stateless.
    • Moving the loss evaluation and strategy into a LossEval that simplifies the handling of printing and NaNs.
    • Callbacks are added to the strategy.
  • Major overhaul of the FitResult, including:

    • improved zfit_error (equivalent of MINOS)
    • minuit_hesse and minuit_minos are now available with all minimizers as well thanks to a great
      improvement in iminuit.
    • Added an approx hesse that returns the approximate Hessian (if available, otherwise empty)
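As background for the EDM criterion mentioned in the list above: the estimated distance to minimum is commonly defined as 0.5 * g^T H^-1 g, with gradient g and Hessian H. The following is a minimal pure-Python sketch for a two-parameter quadratic function, where the EDM equals the actual distance of the function value to its minimum. The helper names are hypothetical; this is not how zfit or iminuit implement it internally.

```python
def edm(grad, hess):
    """Estimated distance to minimum, 0.5 * g^T H^-1 g, for a 2x2 Hessian."""
    (a, b), (c, d) = hess
    det = a * d - b * c
    # explicit inverse of the 2x2 matrix
    inv = [[d / det, -b / det], [-c / det, a / det]]
    g0, g1 = grad
    return 0.5 * (g0 * (inv[0][0] * g0 + inv[0][1] * g1)
                  + g1 * (inv[1][0] * g0 + inv[1][1] * g1))


def f(x, y):
    # quadratic test function with minimum f = 0 at (0, 0)
    return 2 * x**2 + x * y + y**2


def grad_f(x, y):
    return (4 * x + y, x + 2 * y)


HESSIAN = [[4.0, 1.0], [1.0, 2.0]]  # constant for a quadratic function

point = (0.3, -0.2)
distance = edm(grad_f(*point), HESSIAN)
```

A minimizer using such a criterion would stop once this value drops below a chosen tolerance.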
Aman Goel
@amangoel185

Hey @mayou36! :)

I wrote to you regarding GSoC 2021 (via aman.goel185@gmail.com) , and have a doubt regarding the same in the evaluation task.

Can I contact you over private chat?

Anil Panta
@panta-123
Is there an example showing how to plot the pulls of a fit (ExtendedUnbinnedNLL fit)? Or could anyone provide an example script here?
Jonas Eschle
@jonas-eschle

We released multiple small releases up to 0.6.3 with a few minor improvements and bugfixes. Make sure to upgrade to the latest version using

pip install -U zfit

Thanks to everyone who found the bugs. We appreciate any kind of (informal) feedback, ideas or bug reports; feel free to reach out to us anytime with anything.

anthony-correia
@anthony-correia
Hello @mayou36, we would like to try a template fit to some 3D binned data. I've been told that a binned fit is possible with zfit, but that it is still experimental and not yet documented. I guess it lives in the "binned_new" branch of the zfit GitHub repository. Am I right so far?
Is there anything I need to know before trying to use the code there?
Henrikas Svidras
@henrikas-svidras

Hi. I am using zfit intensively in notebooks, and I have been running into the well-known NameAlreadyTakenErrors. I have found workarounds that work for me, but I just wanted to say that the example presented here does not seem to work: if you try to use it, you get an error when minimising:

~/.local/lib/python3.6/site-packages/zfit/minimizers/minimizer_minuit.py in <listcomp>(.0)
     76         errors = tuple(param.step_size for param in params)
     77         start_values = [p.numpy() for p in params]
---> 78         limits = [(low.numpy(), up.numpy()) for low, up in limits]
     79         errors = [err.numpy() for err in errors]
     80 

AttributeError: 'int' object has no attribute 'numpy'

I guess the problem is that the limits are supposed to be tf.Tensor, but if we simply assign a float or int via param.lower or param.upper that breaks the code later?

Couldn't there be some sort of a method, such as set_limit_lower(value) ? Or am I misusing zfit somehow?

zleba
@zleba
Hello, do you know the best way to deal with discrete observables? One dimension of my pdf is the charge, with values 1 or -1. The normalisations of pdf(q=1) and pdf(q=-1) are in general different (as predicted by the pdf parametrization), and only the sum is normalised: pdf(q=1) + pdf(q=-1) = 1.
Kevin Wang
@LeavesWang

Hi, to overload the parameters in Jupyter, does the following approach work?

iCell = get_ipython().execution_count  # get the current cell number

par1 = zfit.Parameter("par1_"+str(iCell), 8., 0., 20.)
par2 = zfit.Parameter("par2_"+str(iCell), -20., -50., 50.)
par3 = zfit.Parameter("par3_"+str(iCell), 10., 0., 20.)

Henrikas Svidras
@henrikas-svidras
Hi experts. What kind of seed does zfit use when you call the pdf.sample() method? I found zfit.settings.set_seed() (which I assume also affects the sample method?). Does it work the same way as in numpy, where a random seed is used if you don't specify one explicitly? Thanks a lot!
Jonas Eschle
@jonas-eschle
Hi, indeed. The seed is just a global seed; setting it simply sets the seed in the backends, meaning numpy and TensorFlow.
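The general behaviour of a global seed can be illustrated with Python's own random module as a stand-in for the numpy and TensorFlow backends (an analogy only; toy_sample is a hypothetical helper and zfit is not involved). With a fixed seed the draws repeat; reseeding without an argument takes fresh entropy from the operating system when available, so the draws differ.

```python
import random


def toy_sample(n):
    """Draw n Gaussian toy values from the global generator."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]


random.seed(1234)      # fixed global seed
first = toy_sample(3)

random.seed(1234)      # same seed again: identical sequence
second = toy_sample(3)

random.seed()          # no argument: reseed from OS entropy (or the time)
third = toy_sample(3)
```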
Henrikas Svidras
@henrikas-svidras
Alright! I guess this means that it should be fine not to set it explicitly when doing many toy fits in parallel :) I was thinking that it could have some clock/date dependence, so I just had to make sure that wasn't the case. Thanks a lot.
Henrikas Svidras
@henrikas-svidras

Hi again :) I was wondering if there is a way to have a parameter whose limit depends on another parameter. As a naive illustration, imagine you are fitting a parabola a*x**2 + b*x + c, and you want the peak of the parabola to be between 0 and 5. That would mean 0 < -b/(2a) < 5.

I naively tried something similar to this:

a = zfit.Parameter("a", 5, floating=True)
b = zfit.Parameter("b", 0, lower = -10*a, floating=True)

However, it seems that all this does is set the lower limit to -10 * the initial value of a. Is there a way to make the limit follow the value of a as it changes?

Jonas Eschle
@jonas-eschle
Hey, while there are possibilities (you can create a composed parameter, which is effectively just a function of zero or more free parameters, and let it return the minimum of a or b, leaving the lower limit on b and setting a high enough upper limit), it is not advisable. Limits on parameters are in general a bad thing: when minimizing a likelihood, most minimizers look for local minima. This means we have to make sure that we start close enough to the minimum; under normal circumstances, limits should not influence the minimization and should be chosen wide enough.
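To make the idea of a parameter that is a function of other parameters concrete, here is a plain-Python sketch of one possible reparameterization (an illustration, not something proposed in the thread): fit the peak position directly with constant limits and derive b from it. The names b_from, parabola and PEAK_LIMITS are hypothetical; in zfit the derived quantity would presumably be expressed as a composed parameter, which is not shown here.

```python
def b_from(a, peak):
    """Derived ('composed') parameter: b follows from the free a and peak."""
    return -2.0 * a * peak


def parabola(x, a, peak, c):
    """Parabola a*x**2 + b*x + c with b derived from the peak position."""
    b = b_from(a, peak)
    return a * x**2 + b * x + c


PEAK_LIMITS = (0.0, 5.0)  # constant limits on the free parameter `peak`

# whatever value a takes, the vertex -b/(2a) stays at `peak`
a, peak = 3.0, 2.5
vertex = -b_from(a, peak) / (2 * a)
```

With this choice the constraint 0 < -b/(2a) < 5 becomes a fixed range on peak, so no limit has to move during the minimization.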
Henrikas Svidras
@henrikas-svidras
Thanks. I understand the caveats, and I guess this makes sense. So, in principle, it would only be available through ComposedParameters. That's what I thought, but I hoped there might be some kind of a secret zfit trick :) In any case, thanks a lot!
Jonas Eschle
@jonas-eschle
Thanks for bringing it up! Indeed, it is however a one-liner with composed parameters. The limits are intentionally kept simple
Jonas Eschle
@jonas-eschle

A new zfit version, the 0.8.x series, is available, with bugfixes, improved numerical integration and different Kernel Density Estimations (also for large sample sizes).

all changes listed

The tutorials have also improved in style and now have their own site. They can be run interactively, downloaded or viewed.

Henrikas Svidras
@henrikas-svidras

Hi experts.

I had a question regarding errors. I have made a quick example to highlight my question, hopefully it makes sense.

I have noticed that if I fit my pdf, and then refit that same pdf again and again in a loop, the error I get out is not the same but keeps changing. Also the errors calculated using different methods are not consistent, at least not always, even after the initial fit (e.g. 0th iteration).

I don't quite understand this. I am not an expert in fitting, so maybe this is expected, but I find it very weird.

I'll try to illustrate this with an example:

import numpy as np
import zfit


obs = zfit.Space("x", limits=(-5, 5))
minimizer = zfit.minimize.Minuit(use_minuit_grad=True)

mu = zfit.Parameter("muu", 0, step_size=0.01)
sigma = zfit.Parameter("sigma", 1,step_size=0.01)
gauss = zfit.pdf.Gauss(mu=mu, sigma=sigma, obs=obs)
gauss_yield = zfit.Parameter("g_yield", 100, step_size=0.1)
gauss_ext = gauss.create_extended(gauss_yield)

lam = zfit.Parameter("lam", -1,step_size=0.01)
expo  = zfit.pdf.Exponential(lam=lam, obs=obs)
expo_yield = zfit.Parameter("e_yield", 500, step_size=0.1)
expo_ext = expo.create_extended(expo_yield)

gauss_expo = zfit.pdf.SumPDF([expo_ext, gauss_ext])

random_gauss = np.random.normal(size=500)+1
random_exp = np.random.exponential(scale = 5, size=1000)-5
random_data = np.append(random_gauss, random_exp)

# then for each different error method I run this:
lam.set_value(-1)
mu.set_value(1)
sigma.set_value(1)
# frac.set_value(0.5)  # `frac` is not defined anywhere in this snippet
expo_yield.set_value(100)
gauss_yield.set_value(100)

data = zfit.Data.from_numpy(obs=obs, array=random_data)
nll = zfit.loss.UnbinnedNLL(model=gauss_expo, data=data)

iterations = np.arange(0,20)
yield_error = []

for it in iterations:
    result = minimizer.minimize(nll)
    result.errors() # here I also try result.hesse() with #method='hesse_np', 'approx', 'minuit_hesse'


    yield_error.append(result.params[gauss_yield]['minuit_minos']['upper'])

Each of these iterations produces a different error, particularly when using minuit_hesse and hesse_np:

#minuit_minos upper (lower is almost the same with a negative sign)
array([118.3930117 , 118.13616212, 118.17518686, 118.1589878 ,
       118.17934101, 118.17029047, 118.18437411, 118.17827166,
       118.18786236, 118.1837249 , 118.19027599, 118.1874818 ,
       118.19182178, 118.1899428 , 118.19290011, 118.19164637,
       118.19360303, 118.19276492, 118.19412068, 118.19355127])

#minuit_hesse
array([264.56868112, 263.99099257, 264.03559365, 263.99005815,
       264.0375983 , 264.01347448, 590.9170118 , 264.02200987,
       590.84657398, 264.03923864, 592.87290404, 264.05010254,
       589.78390794, 426.94055222, 588.76391966, 585.73316667,
       591.81967819, 587.7401319 , 591.81833332, 592.84125751])

#hesse_np
array([205.6559601 , 207.41405669,          nan,          nan,
                nan, 737.84227104,          nan,  70.71791776,
       314.54393221,          nan, 110.97088789, 237.15737011,
                nan,          nan,          nan,          nan,
                nan, 149.10569392, 300.7499339 ,          nan])

#approx
array([118.39301118, 118.1361616 , 118.17518634, 118.15898728,
       118.17934049, 118.17028995, 118.18437359, 118.17827114,
       118.18786184, 118.18372437, 118.19027547, 118.18748128,
       118.19182126, 118.18994228, 118.19289958, 118.19164585,
       118.19360251, 118.1927644 , 118.19412016, 118.19355074])

I understand that I am generally speaking not supposed to loop an already converged fit again, but what is puzzling me that even if I only look at the very first element in each of these lists they are not at all consistent. I noticed this in a more complicated fit that I am doing in an analysis and I am a bit puzzled. I prepared this simple mock example to make it easier to reproduce.

Is this expected? Am I doing something crazy here?

Sorry for the long question, and thanks a lot.

Jonas Eschle
@jonas-eschle

Hi, first of all, thanks a lot for bringing it up and making such a good reproducible example. You are also welcome to open an issue. The problem is that you create a plain UnbinnedNLL (=> create an ExtendedUnbinnedNLL instead, this works for me), so you are not constraining the sum of the yields to be the (Poisson-distributed) number of events. There should be a warning displayed like

AdvancedFeatureWarning: Either you're using an advanced feature OR causing unwanted behavior. To turn this warning off, use `zfit.settings.advanced_warnings['extended_in_UnbinnedNLL']` = False`  or 'all' (use with care) with `zfit.settings.advanced_warnings['all'] = False
Extended PDFs are given to a normal UnbinnedNLL. This won't take the yield into account and simply treat the PDFs as non-extended PDFs. To create an extended NLL, use the `ExtendedUnbinnedNLL`.
  warn_advanced_feature("Extended PDFs are given to a normal UnbinnedNLL. This won't take the yield "

So the fit you are doing is equivalent to defining a sum of two pdfs using two free parameters: we end up with one degree of freedom too many. This is what causes the error to vary each time (at least I suspect so).
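The extra degree of freedom can be seen in a small standard-library sketch (hypothetical helper functions and made-up data, not zfit code): in the non-extended likelihood the two yields only enter through their ratio, so scaling both leaves the value unchanged, whereas the Poisson term of the extended likelihood fixes the total. The shape parameters are held fixed and the truncation of the Gaussian to the fit range is ignored for brevity.

```python
import math


def gauss(x, mu, sigma):
    return (math.exp(-0.5 * ((x - mu) / sigma) ** 2)
            / (sigma * math.sqrt(2 * math.pi)))


def expo(x, lam, low, high):
    # exp(lam * x), normalized on the interval [low, high]
    norm = (math.exp(lam * high) - math.exp(lam * low)) / lam
    return math.exp(lam * x) / norm


def unbinned_nll(data, n_sig, n_bkg):
    """Non-extended NLL: the yields only enter through the fraction."""
    frac = n_sig / (n_sig + n_bkg)
    return -sum(
        math.log(frac * gauss(x, 1.0, 1.0)
                 + (1 - frac) * expo(x, -0.2, -5.0, 5.0))
        for x in data
    )


def extended_nll(data, n_sig, n_bkg):
    """Extended NLL: adds the Poisson term for the total yield
    (terms that do not depend on the parameters are dropped)."""
    total = n_sig + n_bkg
    return unbinned_nll(data, n_sig, n_bkg) + total - len(data) * math.log(total)


data = [-4.1, -2.5, -0.3, 0.4, 0.9, 1.1, 1.6, 2.2, 3.7]

# doubling both yields: a flat direction for the non-extended NLL
flat = unbinned_nll(data, 200.0, 1000.0) - unbinned_nll(data, 100.0, 500.0)
# the extended NLL does change along that direction
lifted = extended_nll(data, 200.0, 1000.0) - extended_nll(data, 100.0, 500.0)
```

A flat direction like this means the curvature matrix is singular, which fits the unstable Hesse errors and NaNs reported above.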

To explain the errors: 'minuit_minos' is the built-in Minuit error (from iminuit, the MINOS method). minuit_hesse is the Hesse algorithm of iminuit. approx is the minimizer's approximation of the Hessian (it may not be available or may be completely off; it's just a "better than nothing", but often accurate enough for some use cases such as getting the order of magnitude). hesse_np is zfit's implementation of Hesse, and the NaNs are probably pretty accurate: it can't determine the Hessian, for the good reason that the problem is underconstrained.
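As a rough one-parameter illustration of what a Hesse-type error is (a sketch with made-up data, not zfit's or iminuit's implementation): the error is the inverse square root of the second derivative of the negative log-likelihood at the minimum. For the mean of a Gaussian with known width this reproduces sigma / sqrt(n), and a flat likelihood gives no valid error.

```python
import math


def nll_mean(mu, data, sigma):
    """Gaussian NLL as a function of the mean only (constants dropped)."""
    return sum(0.5 * ((x - mu) / sigma) ** 2 for x in data)


def hesse_error(nll, best, step=1e-3):
    """Symmetric error from the curvature: 1 / sqrt(d2 NLL / d mu2)."""
    # central finite difference for the second derivative
    second = (nll(best + step) - 2 * nll(best) + nll(best - step)) / step**2
    if second <= 0:
        return float("nan")  # flat or inverted curvature: no valid error
    return 1.0 / math.sqrt(second)


data = [0.8, 1.3, 0.9, 1.1, 1.4, 0.7, 1.0, 1.2]
sigma = 0.25
best_mu = sum(data) / len(data)  # analytic minimum of the NLL

error = hesse_error(lambda mu: nll_mean(mu, data, sigma), best_mu)
flat_error = hesse_error(lambda mu: 0.0, 0.0)  # underconstrained case
```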

Just to mention, one method you didn't try is zfit_error, zfit's own implementation of "minos". In my test it gives a comparable error (42 vs 39 from minos) using the ExtendedUnbinnedNLL.

Henrikas Svidras
@henrikas-svidras
Hi, many thanks for the in-depth answer. Yes, in this example UnbinnedNLL seemed to be the culprit. I still need to investigate why the fit where I initially spotted this was misbehaving, as there I was using the correct ExtendedUnbinnedNLL. But your answer gives some hints, so I will try them. It also brings a bit more clarity about zfit overall, thanks :)
greennerve
@greennerve
hi all, I tried to limit the number of CPUs used by zfit with zfit.run.set_n_cpu(8), but it doesn't work: zfit still uses all available cores/threads.
What's the proper way to limit the number of cores/threads used by zfit?