For reference I am using the SMOTE method for oversampling:
smoter = SMOTE(random_state=42, n_jobs=-1, sampling_strategy='not majority')
X_train_smote, y_train_smote = smoter.fit_resample(X_train, y_train)
To be more specific, I am wondering whether it is possible to find out which rows of X_train_smote correspond to the original X_train samples, i.e. the indices of X_train within X_train_smote.
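One way to approach this, as a rough sketch: as far as I can tell, imbalanced-learn's SMOTE stacks the untouched input rows first and appends the synthetic rows after them, but that is an implementation detail rather than a documented guarantee. The hypothetical helper below (not part of imblearn) checks that assumption instead of trusting it, and the small arrays stand in for the real output of smoter.fit_resample.

```python
import numpy as np

def split_original_and_synthetic(X_train, X_resampled):
    """Return (original_idx, synthetic_idx) as positions in X_resampled.

    Assumes the resampler keeps the input rows first, in order, and
    appends generated rows after them. Raises if that does not hold.
    """
    X_train = np.asarray(X_train)
    X_resampled = np.asarray(X_resampled)
    n = X_train.shape[0]
    if X_resampled.shape[0] < n or not np.array_equal(X_resampled[:n], X_train):
        raise ValueError("resampled data does not start with the original rows")
    return np.arange(n), np.arange(n, X_resampled.shape[0])

# Stand-in data: three original rows plus one made-up synthetic row
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
synthetic = np.array([[0.5, 0.5]])
X_train_smote = np.vstack([X_train, synthetic])

orig_idx, synth_idx = split_original_and_synthetic(X_train, X_train_smote)
print(orig_idx, synth_idx)
```

If the check passes on the real data, X_train_smote[:len(X_train)] are the original samples and everything after that is synthetic.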
pipeline[:-1].transform(np.array(X_train)). However, I then get the error "AttributeError: 'SMOTE' object has no attribute 'transform'". I don't know how to proceed.
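The error makes sense if samplers such as SMOTE only expose fit_resample and no transform, so a sliced pipeline that still contains one cannot be transformed as a whole. A possible workaround, sketched here under assumptions, is to walk the (name, estimator) pairs yourself and apply only the steps that do have transform. The helper and the two toy classes are hypothetical stand-ins so the snippet runs without a fitted pipeline.

```python
import numpy as np

def transform_skipping_samplers(steps, X):
    """Apply every step that has .transform, in order; skip the others."""
    for _name, step in steps:
        if step is None or isinstance(step, str):  # e.g. "passthrough"
            continue
        if hasattr(step, "transform"):
            X = step.transform(X)
    return X

class ToyCenterer:
    """Hypothetical fitted transformer: subtracts a stored column mean."""
    def __init__(self, mean):
        self.mean_ = np.asarray(mean)

    def transform(self, X):
        return np.asarray(X) - self.mean_

class ToySampler:
    """Hypothetical stand-in for SMOTE: has fit_resample but no transform."""
    def fit_resample(self, X, y):
        return X, y

steps = [("center", ToyCenterer([1.0, 2.0])), ("smote", ToySampler())]
X_train = np.array([[1.0, 2.0], [3.0, 6.0]])
X_out = transform_skipping_samplers(steps, X_train)
print(X_out)
```

With a real, already fitted pipeline, the same helper could be called as transform_skipping_samplers(pipeline[:-1].steps, np.array(X_train)).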
import numpy as np
from imblearn.over_sampling import RandomOverSampler, SMOTE

# Two string-valued feature columns, 100 rows
x = np.array([['aaa'] * 100, ['bbb'] * 100]).T
# The label values were lost from the original message; 0/1 with a 10:90 split is assumed
y = np.array([0] * 10 + [1] * 90)

ros = RandomOverSampler()
x_res, y_res = ros.fit_resample(x, y)

smote = SMOTE()
x_res, y_res = smote.fit_resample(x, y)
Hello team. I have a question about Borderline SMOTE:
Variant 2 is supposed to interpolate both between a minority sample in danger and its other minority-class neighbours, and between that sample and some of its majority-class neighbours.
In https://github.com/scikit-learn-contrib/imbalanced-learn/blob/4162d2d/imblearn/over_sampling/_smote.py#L352 we fit a KNN on the minority class only and then derive the neighbours nns from it, which we use for the interpolation.
Then, in the second part of the borderline-2 code (https://github.com/scikit-learn-contrib/imbalanced-learn/blob/4162d2d/imblearn/over_sampling/_smote.py#L397), we use that same nns to obtain the neighbours from the majority class. But wouldn't nns contain only neighbours from the minority class, since it is derived from a KNN fitted only on the minority class?
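To make the concern concrete, here is a small NumPy-only illustration. It does not reproduce imbalanced-learn's code; knn_indices is a hypothetical brute-force helper and the arrays are made up. It shows that indices returned by a KNN fitted on the minority samples are positions in the minority array, so reusing them to index the majority array picks rows that need not be nearest neighbours at all.

```python
import numpy as np

def knn_indices(X_fit, X_query, k):
    """Brute-force k nearest neighbours; indices are positions in X_fit."""
    dist = np.linalg.norm(X_query[:, None, :] - X_fit[None, :, :], axis=2)
    return np.argsort(dist, axis=1)[:, :k]

X_min = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0]])     # minority samples
X_maj = np.array([[0.2, 0.2], [10.0, 10.0], [9.0, 9.0]])   # majority samples
danger = X_min[:1]                                          # one "in danger" sample

# KNN fitted on the minority only; drop column 0 (the sample itself)
nns_min = knn_indices(X_min, danger, k=2)[:, 1:]
# KNN fitted on the majority gives the actual nearest majority sample
nns_maj = knn_indices(X_maj, danger, k=1)

print("minority neighbour:", X_min[nns_min[0, 0]])
print("majority row picked by reusing nns_min:", X_maj[nns_min[0, 0]])
print("true nearest majority row:", X_maj[nns_maj[0, 0]])
```

In this toy data the reused index selects the majority point at (10, 10), while the true nearest majority point is (0.2, 0.2).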
0.13 in order to generate (and add) a single minority instance. So the new ratio will be 90:11.
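The arithmetic behind that figure can be sketched as follows, assuming a float sampling_strategy is read as the desired minority-to-majority ratio and that the number of samples to generate is truncated to an integer. The function name is hypothetical and the 90:10 starting counts are inferred from the resulting 90:11 ratio.

```python
def n_samples_to_generate(n_majority, n_minority, sampling_strategy):
    """Synthetic minority samples needed to reach the requested ratio.

    Assumption: the target is ratio * n_majority, truncated towards zero.
    """
    return int(n_majority * sampling_strategy - n_minority)

n_majority, n_minority = 90, 10

for ratio in (0.12, 0.13):
    extra = n_samples_to_generate(n_majority, n_minority, ratio)
    print(ratio, "->", extra, "new sample(s), ratio",
          f"{n_majority}:{n_minority + extra}")
```

Under that assumption 0.12 is still too small to add anything (90 * 0.12 = 10.8, truncated to 10), while 0.13 gives 11.7, truncated to 11, hence one new instance.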
BalancedBaggingClassifier (that uses a RandomUnderSampler with a strong learner as a