Python module to perform under sampling and over sampling with various techniques.
0.13
in order to generate (and add) a single minority instance. So the new ratio will be 90:11.
BalancedBaggingClassifier (that uses a RandomUnderSampler) with a strong learner such as a HistGradientBoosting
TypeError: All intermediate steps of the chain should be estimators that implement fit and transform or fit_resample. 'Pipeline(steps=[('smote', SMOTE(n_jobs=-1, random_state=42)), ('under', RandomUnderSampler(random_state=42))])' implements both)
Also, can someone explain what this error means? The Pipeline only exposes the fit and fit_resample methods. Since transform is not implemented, the first condition is not met, while the second one, about fit_resample, is met. So shouldn't this work? Thank you.
smote_pipeline implements fit_resample and transform as well, whereas an intermediate step should implement either fit_resample or fit/transform.
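To make the "implements both" wording of the error concrete, here is a minimal sketch of the kind of check the message describes. It assumes the validation looks only at which method names a step exposes; the classes and the check_intermediate_step helper are hypothetical stand-ins written for illustration, not imbalanced-learn code.

```python
# Hypothetical stand-ins: none of these classes come from imbalanced-learn.
class ToyTransformer:
    """Exposes the transformer interface only (fit/transform)."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class ToySampler:
    """Exposes the sampler interface only (fit_resample)."""

    def fit_resample(self, X, y):
        return X, y


class ToyNestedPipeline:
    """Mimics a pipeline object that exposes both families of methods."""

    def fit(self, X, y=None):
        return self

    def fit_transform(self, X, y=None):
        return X

    def fit_resample(self, X, y):
        return X, y


def check_intermediate_step(step):
    """Assumed rule: a step is a transformer or a sampler, never both."""
    is_transformer = hasattr(step, "transform") or hasattr(step, "fit_transform")
    is_sampler = hasattr(step, "fit_resample")
    if is_transformer and is_sampler:
        raise TypeError(f"{type(step).__name__} implements both")
    if not (is_transformer or is_sampler):
        raise TypeError(f"{type(step).__name__} implements neither")
    return "sampler" if is_sampler else "transformer"
```

Under this assumption, an object can be rejected merely for exposing a transformer-style method name next to fit_resample, even if calling that method would fail later.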
Main_Pipeline = imb_Pipeline([
    ('feature_handler', FeatureTransformer(list(pearson_feature_vector.index))),
    ('smote', SMOTE()),
    ('random_under_sampler', RandomUnderSampler()),
    ('scaler', StandardScaler()),
    ('pca', PCA(n_components=0.99)),
    ('model', LogisticRegression(max_iter=1750)),
])
Please correct me if my understanding is lacking.
So, when I call fit on the Main_Pipeline, since smote_pipeline has a fit method, it is assumed that transform is also present. In fact it is not: I tried to call transform and got an error:
AttributeError: 'RandomUnderSampler' object has no attribute 'transform'
Pipeline code:
Smote_Under_pipeline = imb_Pipeline([
    ('smote', SMOTE(random_state=rnd_state, n_jobs=-1)),
    ('under', RandomUnderSampler(random_state=rnd_state)),
])
Accordingly, because of that assumption, fit/transform and fit_resample both appear to be available. Does this cause the ambiguity that makes the code blow up?