Guillaume Lemaitre
@glemaitre
Actually you could do get_scorer("balanced_accuracy")(clf, X, y), but I don't think that heads toward readable code :)
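For reference, a minimal self-contained sketch of that pattern (the dataset and LogisticRegression here are just illustrative stand-ins):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer

# toy data and model, purely for illustration
X, y = make_classification(random_state=0)
clf = LogisticRegression().fit(X, y)

# get_scorer returns a callable scorer with signature scorer(estimator, X, y)
scorer = get_scorer("balanced_accuracy")
print(scorer(clf, X, y))
```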
lesshaste
@lesshaste
@glemaitre I like it. Thank you
I have a different more general question. I am doing binary classification. I would like to maximize the number of items in the positive class that get a probability higher than any probability from the negative class. Does this correspond to a known loss function?
let me edit it to get rid of the word score...
Guillaume Lemaitre
@glemaitre
You might want something like this, maybe
Basically, this is tuning the threshold used by predict on top of predict_proba
Otherwise, sample_weight or class_weight will allow you to play with the inner loss
while training
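A minimal sketch of that threshold idea (the classifier and the 0.3 cutoff are hypothetical; in practice the cutoff would be tuned on validation data):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)
clf = LogisticRegression().fit(X, y)

# probabilities of the positive class
proba = clf.predict_proba(X)[:, 1]

# replace the implicit 0.5 cutoff used by predict with a tuned one
threshold = 0.3  # hypothetical tuned value
y_pred = (proba >= threshold).astype(int)
```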
lesshaste
@lesshaste
@glemaitre thank you. I haven't fully understood how to use your suggestions for my problem but I will have a think
maybe it could go on scikit-learn discussions as well :)
Guillaume Lemaitre
@glemaitre
I think so
I might have misunderstood the use-case (a small example with specific numbers might help :))
lesshaste
@lesshaste
I can give one in about 90 minutes
Loïc Estève
@lesteve

maybe it could go on scikit-learn discussions as well :)

+1. As mentioned in https://github.com/scikit-learn/scikit-learn/discussions/19220#discussioncomment-298015 my feeling (and probably others' feeling) is that gitter is not the best place for Q&A. I guess a reasonable approach is to create a discussion and then ping on gitter if you feel you have not received an answer after some time.

lesshaste
@lesshaste
@lesteve thanks. I do like the interactive nature of gitter to a) improve the question and/or b) realise I shouldn't have asked it in the first place :)
Loïc Estève
@lesteve
Yeah I agree the threshold for "what is OK to ask on gitter" is not very clear. I would favour the approach I mention above: discussion + ping on gitter after some time. It is not as interactive, but it is a better investment of answerers' time since the question + answer will be findable by googling (contrary to gitter).
lesshaste
@lesshaste
makes sense
Guillaume Lemaitre
@glemaitre
Gitter should come with a feature where you cannot scroll up in your discussion feed
because that is a bit what happens in reality :)
lesshaste
@lesshaste
:)
This isn't a full example but hopefully it will help clarify. Say my positive class items get 0.1, 0.3, 0.7, 0.9 from predict_proba and my negative class items get 0.01, 0.2, 0.2, 0.5. Then two of the positive class items get a prob (0.7, 0.9) larger than the largest prob (0.5) from the negative class.
@glemaitre does that make it any clearer?
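In code form, the count from that toy example (numbers copied from the message above):

```python
import numpy as np

pos = np.array([0.1, 0.3, 0.7, 0.9])   # predict_proba for positive-class items
neg = np.array([0.01, 0.2, 0.2, 0.5])  # predict_proba for negative-class items

# positives scoring above every negative: 0.7 and 0.9, so the count is 2
print(int((pos > neg.max()).sum()))  # 2
```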
Guillaume Lemaitre
@glemaitre
So the cutoff classifier intends to change the threshold from 0.5 to another value
such that you can, for instance, maximize the number of predictions of the positive label
lesshaste
@lesshaste
@glemaitre yes. But the cutoff is a function of the probs that the negative class items are given
my example of 0.5 above wasn't a great choice :)
Guillaume Lemaitre
@glemaitre
Oh you want to reinforce your learning step
I see
In some way, I could think about a boosting strategy such as AdaBoost, but instead of learning new learners favoring misclassified samples, you want to favor specific samples from the positive class.
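As a rough illustration of that weighting idea (the 5.0 factor and the dataset are hypothetical):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(random_state=0)

# upweight the positive-class samples to push the booster toward them
sample_weight = np.where(y == 1, 5.0, 1.0)
clf = AdaBoostClassifier(random_state=0).fit(X, y, sample_weight=sample_weight)
```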
Guillaume Lemaitre
@glemaitre
I don't know if there is something in active learning allowing such stuff
But I don't know much about this area
lesshaste
@lesshaste
thanks. I was going to post on discussions but I can't think of a suitable title :)
Guillaume Lemaitre
@glemaitre
"Reinforce sample weight for online learning"
lesshaste
@lesshaste
posted
Loïc Estève
@lesteve
with the link there are even better chances that someone answers :wink: https://github.com/scikit-learn/scikit-learn/discussions/19239
lesshaste
@lesshaste
@lesteve thanks :)
lesshaste
@lesshaste
argh... I hate how easy it is to be confusing.
@lesteve do you think my post is clear now?
lesshaste
@lesshaste
I guess it's equivalent to maximizing true positives when you have 0 false positives...?
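A sketch of that reading in terms of the ROC curve, reusing the toy scores from above (the labels and scores are the assumed inputs):

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true  = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_score = np.array([0.1, 0.3, 0.7, 0.9, 0.01, 0.2, 0.2, 0.5])

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# best true positive rate achievable with zero false positives: 2/4
print(tpr[fpr == 0].max())  # 0.5
```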
lesshaste
@lesshaste
now I am tempted to try one of the options mentioned in the discussions. Not really sure which one though
lesshaste
@lesshaste
Can any of the classifiers in scikit-learn directly optimize AUC as the loss function?
Fabio
@gatto
Hello all! I'm not a library developer, I'm a student developer and user of scikit-learn. Is there any work going on for scikit-learn to use Apple's ML Compute framework so that ML calculations can be accelerated by the 16-core neural engine in the recent Apple Silicon Macs?
Joseph Redfern
@JosephRedfern

Hi All. Had a question about a possible discrepancy between the user guide and the autogenerated docs for the Lasso linear model. The auto-gen docs (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html) say that

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

is minimised. However, the user guide seems to imply that it's ||Xw - y|| rather than ||y - Xw|| (https://scikit-learn.org/stable/modules/linear_model.html#lasso). y - Xw makes more sense to me. Am I reading something incorrectly, or is the user guide wrong?

Jérémie du Boisberranger
@jeremiedbb
@JosephRedfern both are the same. ||-x|| = ||x||
Guillaume Lemaitre
@glemaitre
I think the common way would be y - y_hat = y - Xw
but they lead to the same result
as @jeremiedbb just mentioned :)
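A quick numerical check of that identity (shapes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X, w, y = rng.normal(size=(5, 3)), rng.normal(size=3), rng.normal(size=5)

# the residual norms are identical: ||y - Xw|| == ||Xw - y||
assert np.isclose(np.linalg.norm(y - X @ w), np.linalg.norm(X @ w - y))
```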
Joseph Redfern
@JosephRedfern
oh boy, what a brain fart! apologies, I should have thought before posting.