Jonas Eschle
@mayou36

Hi, if you want your data to be correlated, the easiest way seems to me to be a custom PDF; there you can define everything you want, using a functional approach.
The longer-term idea is to collect every PDF shape as a simple function in a separate module (if you want, you can also directly open an issue; we will then focus more on the implementation of this).

For now, I would recommend implementing a simple 2D Gaussian yourself (just the shape; no normalization constant is needed) and using whatever you want as mu (you are completely free to decide the dimensionality and parameters of your PDF). It should be one line of code.
Let me know if there are any problems, or also if it succeeds (you can PM me directly)
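A minimal sketch of such an unnormalized 2D Gaussian shape (plain NumPy here, with an illustrative correlation parameter rho; none of these names are zfit API, and inside a custom zfit PDF the calls would be TensorFlow ones):

```python
import numpy as np

def gauss2d_shape(x, y, mux, muy, sigx, sigy, rho):
    """Unnormalized correlated 2D Gaussian; no normalization constant is
    needed, since zfit normalizes a custom _unnormalized_pdf itself."""
    dx = (x - mux) / sigx
    dy = (y - muy) / sigy
    return np.exp(-0.5 / (1.0 - rho ** 2) * (dx ** 2 - 2.0 * rho * dx * dy + dy ** 2))
```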

Colm Harold Murphy
@chm-ipmu
@mayou36 Ok, this is what I had done before but I thought somehow there was an easier way. The problem is that this can become a little difficult for more complex shapes. At some stage I will also need to implement a 2D correlated crystal ball, and reinventing the wheel in this way just to have a single correlated parameter seems unnecessary. In RooFit this is straightforward to do, but I think this is because they do not make the explicit distinction between a Space and a Parameter as is done in zfit.
Jonas Eschle
@mayou36

Yes, I agree. It's a little trickier to allow for arbitrary things as in RooFit, since the distinction actually exists there too, just behind the scenes, namely when it comes to the normalization. Implementing this similarly in zfit is technically not a problem (it's on the roadmap, but rather low priority). In the meantime, providing functions in a shape module will make things easier to implement, so you would only need to do very simple work yourself.

It's a design principle to have for every pdf a simple function that returns you the values and the pdf is a mere wrapper around it.

So a 2D CB will actually look like this (preview, not yet there):

def _unnormalized_pdf(...):
    ...
    return zfit.shape.cb(...) * zfit.shape.cb(...)

Would you need this soon?

Colm Harold Murphy
@chm-ipmu
Ok, I think I understand. So you would create a custom PDF with _PARAMS being two sets of crystal ball parameters, add in the correlations with the second observable as you wish, and then evaluate using the functions in zfit.shape.X?
This was my attempt at a double Gaussian with a correlation to a second fit dimension; perhaps you can tell me if I've got the right end of the stick. It seems to work as expected, as long as I take the projection when plotting, of course:
import math

import tensorflow as tf
import zfit


class CorrelatedDoubleGaussian(zfit.pdf.ZPDF):
    """Double Gaussian which allows for correlations with a second observable."""

    _PARAMS = "mu1 sigma1 mu_slope sigma_slope mu2 sigma2 frac".split()
    _N_OBS = 2

    @staticmethod
    def calc_gaus(mu, sigma, x):
        """Calculate the value of a Gaussian in a TF-compatible way"""
        sqrt2pi = tf.constant(math.sqrt(2 * math.pi), dtype=tf.float64)
        expo = -0.5 * ((x - mu) / sigma) ** 2
        gaus = tf.exp(expo) / (sqrt2pi * sigma)
        return gaus

    def _unnormalized_pdf(self, x):
        x, y = zfit.ztf.unstack_x(x)
        mu1 = self.params["mu1"]
        sigma1 = self.params["sigma1"]
        mu_slope = self.params["mu_slope"]
        sigma_slope = self.params["sigma_slope"]
        mu2 = self.params["mu2"]
        sigma2 = self.params["sigma2"]
        frac = self.params["frac"]

        # Add correlation terms to the normal observable parameters
        mu1 = mu1 + mu_slope * y
        sigma1 = sigma1 + sigma_slope * y
        mu2 = mu1 + mu2           # mu2 acts as an offset from the (shifted) mu1
        sigma2 = sigma1 * sigma2  # sigma2 acts as a scale factor on sigma1

        # Now just return the standard evaluation of the Gaussians
        # return mu + sigma * zfit.ztf.py_function(func=norm, inp=[x], Tout=tf.float64)   # Doesn't seem to work, even with numerical grad.
        g1 = self.calc_gaus(mu=mu1, sigma=sigma1, x=x)
        g2 = self.calc_gaus(mu=mu2, sigma=sigma2, x=x)
        return frac * g1 + (1 - frac) * g2
Yes, I will need to implement this correlated CB shape imminently as part of the Belle II flavour-tagger validation work. It's being used to model BBbar background, where the shape in the Mbc distribution is correlated with the corresponding deltaE distribution.
Jonas Eschle
@mayou36

Yes exactly, that's the way to go. While it is great to have a simple version for everything oc, this seems still like a reasonable effort and allows to have arbitrary things there.

(I'll put the shapes out ASAP then, and fix an odd behavior currently in the CB.) We can continue via PM; if anyone is interested in the status of this, let us know.

spflueger
@spflueger
Hi, I'm a postdoc at the Helmholtz Institute Mainz and currently the main developer of a partial wave analysis framework, ComPWA/pycompwa. Since TensorFlow beat our core implementation in terms of convenience and performance, we are thinking of dropping further development in that direction and starting to use TensorFlow. We stumbled across this and zfit. We also heard that this TensorFlow PWA analysis work is being integrated into zfit. Is that true? Since we don't want to reinvent the wheel here, I wanted to ask what your plans and ideas are. Maybe we can even integrate some of our code and "knowledge" into your repos. Thanks, Stefan
Jonas Eschle
@mayou36
Hi Stefan, thanks a lot for reaching out! In short: we plan to extend in this direction and, in general, to bring together all the fitting packages, mainly the ones based on TensorFlow. We're also working closely with TensorFlow Analysis and have people working on both.
It would be great to integrate ComPWA into the ecosystem as well and figure out what we should do to avoid duplicating things; there are in fact plans and some code in zfit for PWAs. I'd propose having a more detailed discussion in PMs; in case anyone else here is interested in the topic, please let us know to be kept in the loop.
abhijitm08
@abhijitm08
@mayou36 sorry for the silence, could you add me to the offline discussions with @spflueger. Thanks.
Stefano Roberto Soleti
@soleti
Hi everyone, I've just started using zfit and I have a question: how do I build an equivalent of RooHistPdf? Is it possible? Thanks!
Jonas Eschle
@mayou36
Hi Stefano, unfortunately this is currently not yet implemented, but we're working on it. Do you have many data points? If not, you could try a KDE, which is currently in zfit_physics.unstable.GaussianKDE (install zfit-physics first: pip install git+https://github.com/zfit/zfit-physics)
But for many data points this will of course be slow
Matthieu Marinangeli
@marinang
It should be straightforward to create a custom zfit PDF from a histogram using rv_histogram from scipy, I think.
Jonas Eschle
@mayou36
Yes, I agree. You can simply wrap it with z.py_function (same API as tf.py_function) and use the numerical gradient (zfit.settings.options['numerical_grad'] = True)
Or, if your input data is constant, you wouldn't even need to wrap or change anything and it should work out of the box
(Sorry, correction: you do need to wrap it with z.py_function. But this is straightforward; let me know (PM) if you have any issues with it or questions)
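As a rough sketch of the histogram part (pure SciPy/NumPy; the data and binning here are made up for illustration), rv_histogram turns a histogram into a frozen distribution whose pdf method is the plain function one would then hand to z.py_function inside a custom zfit PDF:

```python
import numpy as np
from scipy import stats

# hypothetical data and binning, just for illustration
data = np.random.default_rng(0).normal(size=10_000)
counts, edges = np.histogram(data, bins=50)

# a frozen scipy distribution backed by the histogram
rv = stats.rv_histogram((counts, edges))

# rv.pdf is a plain NumPy function; inside a zfit custom PDF it would be
# wrapped with z.py_function(func=rv.pdf, inp=[x], Tout=tf.float64)
density_at_zero = rv.pdf(0.0)
```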
Stefano Roberto Soleti
@soleti
Thank you for the suggestions! I managed to achieve what I wanted to do.
However, right now I am trying to do a convolution of two PDFs but when I try to resample it’s incredibly slow, even for one event
This is what I am using:
ce_convoluted = zphys.unstable.pdf.NumConvPDFUnbinnedV1(obs=obs_spectrum, func=ce, kernel=dscb, limits=obs)
where dscb is a double Crystal Ball and ce is a custom PDF
sampling from dscb and ce is not particularly slow, but the convolution takes forever
I wrote my custom PDF in this way, maybe there is a better one:
import tensorflow as tf
import zfit
from zfit import ztf


class CESpectrumPDF(zfit.pdf.ZPDF):
    _PARAMS = []  # no free parameters

    def _unnormalized_pdf(self, x):
        x = ztf.unstack_x(x)

        # electron, muon, bound_energy, recoil_energy, alpha and pi are
        # physics constants defined elsewhere in the analysis code
        me = ztf.constant(electron.mass)
        eMax = ztf.constant(muon.mass - bound_energy - recoil_energy)
        alpha_c = ztf.constant(alpha)
        pi_c = ztf.constant(pi)

        E = ztf.sqrt(x * x + me * me)
        result = (1. / eMax) * (alpha_c / (2 * pi_c)) * (ztf.log(4 * E * E / me / me) - 2.) * ((E * E + eMax * eMax) / eMax / (eMax - E))

        # clip negative values of the spectrum to zero
        return tf.maximum(result, tf.zeros_like(result))
Jonas Eschle
@mayou36
Yes, I have an idea what this is about: a safety cap that samples too many. But they fixed things upstream, so we can reduce that again.
Stefano Roberto Soleti
@soleti
I just do
sampler = ce_convoluted.create_sampler(n=1)
sampler.resample()
but it's still running after 5 minutes...
Jonas Eschle
@mayou36
Is the pdf(...) reasonably fast? Sure, the convolution takes time, but other than that?
Stefano Roberto Soleti
@soleti
you mean the ce_convoluted.pdf(…)?
Jonas Eschle
@mayou36
yes
Stefano Roberto Soleti
@soleti
it’s quite slow yes
%time ce_convoluted.pdf(105)
CPU times: user 48.3 s, sys: 4.91 s, total: 53.2 s
Wall time: 8.03 s
Stefano Roberto Soleti
@soleti

I am doing my minimization like this

nll = zfit.loss.ExtendedUnbinnedNLL(model=spectrum, data=data_mom_zfit)  # loss
minimizer = zfit.minimize.Minuit(verbosity=7, use_minuit_grad=True)
minimum = minimizer.minimize(loss=nll)
params = minimum.params

Is there a reason why the minuit_hesse error is not available?

I have only the value
params[yieldCE]
{'value': 63.750158844710356}
Jonas Eschle
@mayou36
Yes. You have to call it explicitly first; no error estimation is calculated by default, since this can be quite costly:
minimum.hesse()
Ilya Komarov
@mozgit
Dear *,
Is there any way to fit with simulated shapes in zfit? (An analogue of RooHistPdf, maybe?)
Thanks in advance
Matthieu Marinangeli
@marinang
It was discussed in this thread a while ago
but you can construct a custom pdf
using scipy rv_histogram
and wrap it with z.py_function if I am correct
Ilya Komarov
@mozgit
Thanks @marinang !
Jonas Eschle
@mayou36
Here is an example of roughly how it could look:
import scipy.stats
import tensorflow as tf
import zfit
from zfit import z


class HistPDF(zfit.pdf.BasePDF):

    def __init__(self, hist_args, hist_bins, obs, name='HistPDF'):
        self.rv_hist = scipy.stats.rv_histogram([hist_args, hist_bins])  # or something, unsure
        super().__init__(obs=obs, name=name)

    def _unnormalized_pdf(self, x):
        x = z.unstack_x(x)
        probs = z.py_function(func=self.rv_hist.pdf, inp=[x], Tout=tf.float64)
        probs.set_shape(x.shape)
        return probs
If you encounter any issues, you can also write me directly in a PM.
Ilya Komarov
@mozgit
Thanks!
Do I get it right that it's for 1D only?
(from limitations of rv_hist)
Jonas Eschle
@mayou36

Do I get it right that it's for 1D only?
(from limitations of rv_hist)

Seemingly yes, but you could also write your own function using e.g. np.histogramdd, and wrap it in z.py_function as above, basically.
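A rough sketch of that multi-dimensional variant (pure NumPy; the function and variable names are made up for illustration, and the returned lookup function is what would be wrapped with z.py_function inside a custom zfit PDF):

```python
import numpy as np

def make_histdd_pdf(samples, bins):
    """Build an N-D histogram density lookup, a stand-in for the 1D
    scipy rv_histogram used above."""
    hist, edges = np.histogramdd(samples, bins=bins, density=True)

    def pdf(points):
        # find the bin index along each axis; points outside the range get 0
        idx = []
        inside = np.ones(len(points), dtype=bool)
        for dim, edge in enumerate(edges):
            i = np.searchsorted(edge, points[:, dim], side='right') - 1
            inside &= (i >= 0) & (i < len(edge) - 1)
            idx.append(np.clip(i, 0, len(edge) - 2))
        return np.where(inside, hist[tuple(idx)], 0.0)

    return pdf
```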

Ilya Komarov
@mozgit
Thanks!
greennerve
@greennerve
Hi all, I'm new to zfit. I'd like to know: what's the equivalent of RooKeysPdf from RooFit? Thanks.
Jonas Eschle
@mayou36
Hey, we're still polishing a good implementation of it. For the moment there are two options. Either use the zfit-physics package, where you can access a Gaussian KDE as zfit_physics.unstable.pdf.GaussianKDE, if you want to use kernel density estimation (this would be the preferred one). Or, alternatively, the following self-made implementation of a histogram PDF; with it you could also use e.g. the scikit-learn KDE instead:
import scipy.stats
import tensorflow as tf
import zfit
from zfit import z


class HistPDF(zfit.pdf.BasePDF):

    def __init__(self, hist_args, hist_bins, obs, name='HistPDF'):
        self.rv_hist = scipy.stats.rv_histogram([hist_args, hist_bins])  # or something, unsure
        super().__init__(obs=obs, name=name)

    def _unnormalized_pdf(self, x):
        x = z.unstack_x(x)
        probs = z.py_function(func=self.rv_hist.pdf, inp=[x], Tout=tf.float64)
        probs.set_shape(x.shape)
        return probs
greennerve
@greennerve
thank you
Jonas Eschle
@mayou36
Just write me directly in case you have any troubles with it