Guillaume Lemaitre
@glemaitre
I think that I have 2 quick examples showing a bit how things can be connected:
lesshaste
@lesshaste
I suppose even in 1D, random forests etc. are invariant to permutations of the input array
@glemaitre thanks
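A small illustration of the point above (my own sketch, not from the chat): tabular models such as random forests treat the feature columns as an unordered set, so shuffling the columns, as long as the same shuffle is applied at fit and predict time, leaves the learnable relationship intact.

```python
# Sketch: permuting feature columns consistently does not change the
# information available to a random forest, so accuracy stays the same
# (up to the randomness of the fitted trees).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.RandomState(0)
perm = rng.permutation(X.shape[1])  # one fixed column shuffle

acc_orig = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)
acc_perm = RandomForestClassifier(random_state=0).fit(
    X_tr[:, perm], y_tr).score(X_te[:, perm], y_te)

print(acc_orig, acc_perm)  # essentially the same accuracy
```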
Olivier Grisel
@ogrisel
yes, you have to do feature engineering first. You can consider the 2D conv layers before the final flatten / global average pooling as a feature extractor and the last fully connected layers as a standard classifier. It's just that both the feature extraction and the classifier are trained end-to-end together
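As an analogy in scikit-learn terms (my own sketch, not from the chat): a `Pipeline` makes the "feature extractor then classifier" split explicit, with PCA standing in for the extractor. The key difference from a CNN is that here the extractor is fitted separately and unsupervised, whereas the conv layers and the final classifier are optimized jointly by backpropagation.

```python
# Sketch: explicit feature-extraction + classifier stages, the split
# that a CNN learns end-to-end.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = make_pipeline(
    PCA(n_components=32),               # unsupervised "feature extractor"
    LogisticRegression(max_iter=1000),  # standard classifier on top
)
clf.fit(X_tr, y_tr)
score = clf.score(X_te, y_te)
print(score)
```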
lesshaste
@lesshaste
it's only really the convolutions that take advantage of the neighborhood of pixels I suppose
@ogrisel right.
Olivier Grisel
@ogrisel
but nowadays, (convolutional) neural networks are almost always the right solution for image classification, unless you have very specific prior knowledge about the images you want to classify.
lesshaste
@lesshaste
I wonder if random forests could be changed to take arrays of pairs, say, as inputs
@ogrisel that's true but I am also thinking of time series data
where it makes a big difference if two values are from successive times or not
Olivier Grisel
@ogrisel

it's only really the convolutions that take advantage of the neighborhood of pixels I suppose

No: if you have deep conv layers with downsampling (strides or max pooling for instance) the conv layers can capture large high level complex patterns that span a large receptive field.
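The receptive-field growth mentioned above can be made concrete with a small calculation (illustrative, not from the chat): each layer widens the field by `(kernel - 1) * jump`, and every stride multiplies the jump, so downsampling layers make the field grow much faster than stacking same-resolution convs.

```python
# Illustrative receptive-field arithmetic for stacked conv/pool layers.
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, input-to-output order."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump  # each layer widens the field
        jump *= s             # strides compound the step between units
    return rf

# three 3x3 convs, no downsampling: the field grows linearly
print(receptive_field([(3, 1)] * 3))  # -> 7
# interleave stride-2 pooling and the field grows much faster
print(receptive_field([(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]))  # -> 18
```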

lesshaste
@lesshaste
@ogrisel you said No but I read your answer as yes :)
Olivier Grisel
@ogrisel
we need an example of some standard feature engineering you can do on time windows for time series forecasting / classification.
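One standard recipe such an example could cover (a hypothetical sketch; the chat only notes the example is needed) is turning a univariate series into a supervised problem by using a sliding window of past values as features for the next value:

```python
# Sketch: lag-window feature engineering for time series, with a
# chronological train/test split (never train on the future).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
t = np.arange(300)
series = np.sin(t / 10.0) + 0.1 * rng.randn(300)  # toy seasonal series

window = 5
# row i holds the `window` values preceding series[i + window]
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]

X_tr, X_te = X[:200], X[200:]
y_tr, y_te = y[:200], y[200:]

model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
r2 = model.score(X_te, y_te)  # R^2 on the held-out tail
print(r2)
```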
lesshaste
@lesshaste
@ogrisel that would be good to see
Olivier Grisel
@ogrisel
I misread the original quote, then yes.
But what I meant is that deep conv nets can model non-local patterns
lesshaste
@lesshaste
@ogrisel yes. What I meant is that without any convolutions you don't get to see local patterns
on an NN topic, is there software to give you a good guess at a reasonable architecture for a classification task? I saw autokeras but it's pretty heavy.
Olivier Grisel
@ogrisel
if you really want to use decision trees for image classification you might be interested in https://arxiv.org/abs/1905.10073 but this is not (and will not be) implemented in scikit-learn ;)
lesshaste
@lesshaste
@ogrisel thanks! Why won't it be implemented? Because it doesn't work or coding resources?
Olivier Grisel
@ogrisel
I don't know what the practical state of the art is for architecture search for image classification
lesshaste
@lesshaste
really I am secretly interested in time series
Olivier Grisel
@ogrisel
because it is not a standard, established method.
lesshaste
@lesshaste
@ogrisel got you
image classification was just interesting because the data is in 2d
lesshaste
@lesshaste
but even in 1d it seems unclear to me what the right thing to do is
@ogrisel I have read those guidelines! They seem very sensible to me
lesshaste
@lesshaste
I greatly admire how scikit-learn is run in general
@ogrisel I read that second link too :)
Joseph Daudi
@josedaudi
hi
Christopher Chavez
@chrstphrchvz
What version of OpenMP does scikit-learn require? Is 2.5 sufficient?
Olivier Grisel
@ogrisel
Probably, we use OpenMP via the prange construct of Cython.
Dhwani Shah
@dhwanishah
Hey guys, so I'm new to scikit-learn, so please bear with me. I have a pandas dataframe that looks like this: [email, businessId, manager, app1, app2, app3, ..., app170]. So essentially one row defines one user, with either a 1 or NaN in each of the appX columns specifying whether they have that app or not.
What I want is a classifier that, given email, businessId, and manager, would return a list of apps the user should have
I've got the data in the format I specified; which models do you think would be good for creating this type of classifier? And how would I go about this in general?
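The question above is a multilabel classification problem: one binary target column per app, with the NaN/1 app columns becoming a 0/1 indicator matrix. A hedged sketch with made-up data (real email/businessId/manager columns would need encoding first, e.g. with `OneHotEncoder`):

```python
# Sketch: multilabel "which apps does this user have" with one binary
# target per app. All data here is synthetic for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.RandomState(0)
n_users, n_apps = 200, 5
X = rng.randint(0, 10, size=(n_users, 3))          # encoded user features
Y = (rng.rand(n_users, n_apps) < 0.3).astype(int)  # 1 = has the app (NaN -> 0)

clf = MultiOutputClassifier(RandomForestClassifier(random_state=0))
clf.fit(X, Y)

pred = clf.predict(X[:1])       # 0/1 app-indicator row for one user
apps = np.flatnonzero(pred[0])  # indices of the predicted apps
print(pred.shape, apps)
```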
Ghost
@ghost~5bc98094d73408ce4fabf741
How do I get the individual components from classification_report?
but output_dict=True doesn't seem to work; I am getting an error stating this parameter does not exist on the classification_report function. I also don't trust precision_recall_fscore_support, plus it misses accuracy
Thomas J. Fan
@thomasjpfan
@piotr-mamenas Please check the version of sklearn you are using. I believe output_dict was added in 0.20.
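A short sketch of what this looks like once a recent enough scikit-learn is installed: with `output_dict=True`, `classification_report` returns a nested dict from which individual per-class metrics, and accuracy, can be pulled directly.

```python
# Sketch: extracting individual components from classification_report
# (requires scikit-learn >= 0.20 for output_dict).
from sklearn.metrics import classification_report

y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

report = classification_report(y_true, y_pred, output_dict=True)
print(report["1"]["precision"])  # per-class component
print(report["accuracy"])        # overall accuracy, also in the dict
```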
Ghost
@ghost~5bc98094d73408ce4fabf741
@thomasjpfan yup, I figured it out yesterday and after some fight with tensorflow dependencies I got it running
freepancakes
@OudarjyaS_twitter
hi everyone
Dillon Niederhut
@deniederhut
Hello from the SciPy sprints!
Meghann Agarwal
@mepa
Hi All, also from the SciPy sprints :)
Andreas Mueller
@amueller
Welcome everybody :)
Andreas Mueller
@amueller
@thomasjpfan wanna look at scikit-learn/scikit-learn#14326 ?