calour.training.SortedStratifiedKFold.split

SortedStratifiedKFold.split(X, y, groups=None)

Generate indices to split data into training and test sets.

Parameters:
  • X (array-like, shape (n_samples, n_features)) –

    Training data, where n_samples is the number of samples and n_features is the number of features.

    Because y alone is sufficient to generate the splits, np.zeros(n_samples) may be passed as a placeholder for X instead of actual training data.

  • y (array-like, shape (n_samples,)) – The target variable for supervised learning problems. Stratification is done based on the y labels.
  • groups (object) – Always ignored; exists only for compatibility.
Returns:

  • train (ndarray) – The training set indices for that split.
  • test (ndarray) – The testing set indices for that split.

Notes

Randomized CV splitters may return different results for each call of split. You can make the results identical by setting random_state to an integer.
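
Example

The calling pattern described above can be illustrated with a minimal, standard-library-only sketch. It is not calour's implementation: stratified_split_indices, its n_splits and seed parameters, and the round-robin fold assignment are all hypothetical stand-ins, chosen only to show that the split is driven by y, that X may be a placeholder, that groups is ignored, and that a fixed integer seed makes a shuffled split repeatable.

```python
import random
from collections import defaultdict


def stratified_split_indices(X, y, groups=None, n_splits=3, seed=None):
    """Hypothetical stand-in for a stratified split (not calour's code).

    Yields (train, test) index lists. Only len(X) is read from X, and
    groups is accepted but ignored, mirroring the documented parameters.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of samples")

    # Collect the sample indices that belong to each label in y.
    by_label = defaultdict(list)
    for index, label in enumerate(y):
        by_label[label].append(index)

    # Assumption: an integer seed plays the role of random_state.
    rng = random.Random(seed) if seed is not None else None

    # Deal each label's indices round-robin into the folds so that every
    # fold keeps roughly the same label proportions as y.
    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_label):
        members = by_label[label]
        if rng is not None:
            rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % n_splits].append(index)

    # Each fold serves once as the test set; the rest is the training set.
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        yield train, sorted(test)


y = ["a", "a", "a", "b", "b", "b"]
X = [0] * len(y)  # placeholder: the feature values are never inspected

splits = list(stratified_split_indices(X, y, n_splits=3))
for train, test in splits:
    print("train:", train, "test:", test)

# Two calls with the same integer seed produce identical splits.
first = list(stratified_split_indices(X, y, n_splits=3, seed=0))
second = list(stratified_split_indices(X, y, n_splits=3, seed=0))
```

With the real splitter, the documented train and test values are ndarrays of indices, so they can be used to index array-like X and y directly.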