Oftentimes, a categorical target variable is severely imbalanced, which makes a mess of modeling.
With an imbalanced dataset, simply predicting the majority class yields a high accuracy but fails to capture the minority classes at all.
There are a number of different ways to get around this.
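Before turning to the fixes, the accuracy trap described above can be made concrete with a small sketch. The 95/5 class split and the variable names here are made up for illustration; the "model" is just a rule that always predicts the most common label.

```python
from collections import Counter

# Hypothetical labels: 95 majority examples (0) and 5 minority examples (1).
y_true = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the most common class.
majority_class = Counter(y_true).most_common(1)[0][0]
y_pred = [majority_class] * len(y_true)

# Accuracy: share of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Minority recall: share of true minority examples that were predicted as minority.
minority_hits = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
minority_recall = minority_hits / y_true.count(1)

print(accuracy)         # 0.95
print(minority_recall)  # 0.0
```

The accuracy looks solid, yet not a single minority example is found, which is why accuracy alone is a poor guide on imbalanced data.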
Up-Sample the Minority Class
Randomly duplicate records from the minority class. This can be done by sampling with replacement.
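To show what sampling with replacement means mechanically, here is a minimal sketch using only Python's standard library. The toy rows, the 8-to-2 split, and the fixed seed are assumptions for illustration; this is not the scikit-learn approach, just the same idea written out by hand.

```python
import random

# Hypothetical toy dataset of (feature, label) pairs: 8 majority rows, 2 minority rows.
rows = [(i, 0) for i in range(8)] + [(i, 1) for i in range(8, 10)]

majority = [r for r in rows if r[1] == 0]
minority = [r for r in rows if r[1] == 1]

# Seeded generator so the draw is reproducible.
rng = random.Random(42)

# Draw minority rows with replacement until they match the majority count.
# The same row can (and will) be picked more than once.
minority_upsampled = rng.choices(minority, k=len(majority))

# Combine and shuffle so the duplicated rows are not clustered at the end.
balanced = majority + minority_upsampled
rng.shuffle(balanced)

print(len(majority), len(minority_upsampled))
```

Both classes now have the same number of rows. Because the minority rows are exact copies, up-sampling is usually applied to the training split only, so that duplicates of the same record do not end up in both training and evaluation data.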
from sklearn.utils import resample