Python module to perform under sampling and over sampling with various techniques.
Hello team. I have a question about Borderline SMOTE:
Variant 2 is supposed to interpolate between a minority sample in danger and its neighbors from the minority class, and also between that sample and some neighbors from the majority class.
At https://github.com/scikit-learn-contrib/imbalanced-learn/blob/4162d2d/imblearn/over_sampling/_smote.py#L352
we fit a KNN on the minority class only and derive the neighbors nns from it, which are then used for the interpolation.
The same nns is then used to obtain the neighbors from the majority class in the second part of the borderline-2 code (https://github.com/scikit-learn-contrib/imbalanced-learn/blob/4162d2d/imblearn/over_sampling/_smote.py#L397). But wouldn't nns contain only neighbors from the minority class, since it is derived from a KNN trained only on the minority class?
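To make the concern concrete, here is a minimal sketch with a brute-force neighbor search standing in for the KNN. It is not the library's code: the toy points, the labels, and the helper k_nearest are all hypothetical. It only shows that a search fitted on the minority subset returns positions inside that subset.

```python
import math

# Toy data (hypothetical, not taken from the thread).
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.3), (1.0, 1.0), (1.1, 1.0), (0.2, 0.1)]
y = [1, 1, 1, 0, 0, 0]  # 1 = minority, 0 = majority

minority_rows = [i for i, label in enumerate(y) if label == 1]
X_min = [X[i] for i in minority_rows]


def k_nearest(query, fitted, k):
    """Brute-force k-NN: return positions in `fitted`, closest first."""
    order = sorted(range(len(fitted)), key=lambda j: math.dist(query, fitted[j]))
    return order[:k]


# Neighbors of the first minority point, searched in the minority subset only.
# The closest hit is the query point itself, so it is dropped.
nns = k_nearest(X_min[0], X_min, k=3)[1:]

# Every index is a position in X_min, so mapping back to y gives minority labels only.
# The majority point (0.2, 0.1) is closer to the query than (0.0, 0.3), yet it can
# never appear in nns because it was not part of the fitted subset.
neighbor_labels = [y[minority_rows[j]] for j in nns]
print(nns, neighbor_labels)
```

Reusing such positions to index a different array (for example, the majority rows) would select rows by position rather than by proximity.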
0.13
in order to generate (and add) a single minority instance. So the new ratio will be 90:11.
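The arithmetic can be checked with a short sketch. It assumes starting counts of 90 majority and 10 minority samples (implied by the 90:11 result but not stated here), that the float is the desired minority/majority ratio, and that the number of new samples is truncated to an integer. The helper samples_to_generate is hypothetical, not a library function.

```python
def samples_to_generate(n_majority, n_minority, ratio):
    """Minority samples to add so that minority/majority approaches `ratio`.

    Assumption: the fractional part is truncated, as int() does.
    """
    return max(int(n_majority * ratio - n_minority), 0)


n_majority, n_minority = 90, 10  # assumed starting counts
n_new = samples_to_generate(n_majority, n_minority, 0.13)
print(f"{n_majority}:{n_minority + n_new}")
```

Under the same assumptions, a ratio of 0.12 would truncate to zero new samples, which is why 0.13 is the smallest two-decimal value that adds one.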
BalancedBaggingClassifier (that uses a RandomUnderSampler) with a strong learner such as HistGradientBoosting
TypeError: All intermediate steps of the chain should be estimators that implement fit and transform or fit_resample. 'Pipeline(steps=[('smote', SMOTE(n_jobs=-1, random_state=42)), ('under', RandomUnderSampler(random_state=42))])' implements both)
Also, can someone explain what this error means? The Pipeline exposes only the fit and fit_resample methods. Since transform is not implemented, the first condition is not met, while the second one, about fit_resample, is. So shouldn't this work? Thank you.
smote_pipeline implements fit_resample and transform as well
fit_resample or fit/transform
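One way to read the message is that an intermediate step must be either a sampler or a transformer, not both. The sketch below is a simplified illustration of such a check, assuming it is attribute-based. The function check_intermediate_step and the three toy classes are hypothetical and do not reproduce the library's code.

```python
def check_intermediate_step(step):
    """Simplified stand-in for the validation behind the error message."""
    is_transformer = hasattr(step, "fit") and hasattr(step, "transform")
    is_sampler = hasattr(step, "fit_resample")
    if not (is_transformer or is_sampler):
        raise TypeError(f"{type(step).__name__} implements neither interface")
    if is_transformer and is_sampler:
        # A step exposing both interfaces is ambiguous, so it is rejected.
        raise TypeError(f"{type(step).__name__} implements both")
    return "sampler" if is_sampler else "transformer"


class ToySampler:
    def fit_resample(self, X, y):
        return X, y


class ToyScaler:
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class ToyNestedPipeline(ToyScaler, ToySampler):
    """Exposes fit, transform and fit_resample, like a nested pipeline would."""


print(check_intermediate_step(ToySampler()))
print(check_intermediate_step(ToyScaler()))
try:
    check_intermediate_step(ToyNestedPipeline())
except TypeError as exc:
    print("rejected:", exc)
```

Under this reading, a nested pipeline is rejected because it exposes both interfaces, whereas listing the samplers as direct steps of a single pipeline avoids the ambiguity.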
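from imblearn.pipeline import Pipeline as imb_Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# FeatureTransformer and pearson_feature_vector are defined elsewhere (not shown here).
# The samplers are direct steps of a single pipeline instead of a nested pipeline.
Main_Pipeline = imb_Pipeline([
    ('feature_handler', FeatureTransformer(list(pearson_feature_vector.index))),
    ('smote', SMOTE()),
    ('random_under_sampler', RandomUnderSampler()),
    ('scaler', StandardScaler()),
    ('pca', PCA(n_components=0.99)),
    ('model', LogisticRegression(max_iter=1750)),
])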