Why Does Calling The KFold Generator With Shuffle Give The Same Indices?
With sklearn, when you create a new KFold object and shuffle is true, it produces a different, newly randomized set of fold indices. However, every generator from a given KFold object g
Solution 1:
A new iteration with the same KFold object will not reshuffle the indices; that only happens during instantiation of the object. KFold() (here the sklearn.cross_validation version, whose constructor takes the number of samples) never sees the data but knows how many samples there are, so it uses that count to build and shuffle the indices. From the code during instantiation of KFold:
    if shuffle:
        rng = check_random_state(self.random_state)
        rng.shuffle(self.idxs)
Each time a generator is created to iterate through the indices of each fold, it uses the same shuffled indices and divides them the same way.
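To make the behavior concrete, here is a minimal sketch of the same pattern using only the standard library. ShuffleOnceFolds, its seed parameter, and its attributes are hypothetical names invented for illustration; this is not sklearn's code, only a toy that shuffles in __init__ and slices the stored indices on every iteration.

```python
import random


class ShuffleOnceFolds:
    """Toy stand-in (not sklearn code): shuffles indices once, in __init__."""

    def __init__(self, n, n_folds=3, shuffle=False, seed=None):
        self.n = n
        self.n_folds = n_folds
        self.idxs = list(range(n))
        if shuffle:
            # The only place a shuffle happens: at construction time.
            random.Random(seed).shuffle(self.idxs)

    def __iter__(self):
        # Every new iterator slices the same, already-shuffled self.idxs.
        base, extra = divmod(self.n, self.n_folds)
        start = 0
        for i in range(self.n_folds):
            stop = start + base + (1 if i < extra else 0)
            yield self.idxs[start:stop]
            start = stop


folds = ShuffleOnceFolds(10, n_folds=3, shuffle=True)
first_pass = list(folds)
second_pass = list(folds)
print(first_pass == second_pass)  # True: no reshuffle between iterations

# A fresh object shuffles again, so its folds will usually differ.
other = ShuffleOnceFolds(10, n_folds=3, shuffle=True)
print(list(other))
```

Under this design, the way to get a different randomization is to construct a new object rather than to iterate the existing one again.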
Take a look at the code for the base class of KFold, _PartitionIterator(with_metaclass(ABCMeta)), where __iter__ is defined. The __iter__ method in the base class calls _iter_test_indices in KFold to divide and yield the train and test indices for each fold.
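The division of labor between the base class and KFold can be sketched roughly as follows. PartitionIteratorSketch and KFoldSketch are hypothetical, simplified stand-ins written for this illustration; the real classes work with NumPy arrays and differ in detail, so treat this only as an outline of the delegation.

```python
from abc import ABC, abstractmethod


class PartitionIteratorSketch(ABC):
    """Hypothetical, simplified stand-in for the base class."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # The base class owns iteration: it asks the subclass for each fold's
        # test indices and derives the training indices as everything else.
        for test in self._iter_test_indices():
            held_out = set(test)
            train = [i for i in range(self.n) if i not in held_out]
            yield train, list(test)

    @abstractmethod
    def _iter_test_indices(self):
        """Yield the test indices of each fold."""


class KFoldSketch(PartitionIteratorSketch):
    """Hypothetical subclass: only decides how to cut the stored indices."""

    def __init__(self, n, n_folds=3):
        super().__init__(n)
        self.n_folds = n_folds
        self.idxs = list(range(n))  # a shuffle, if requested, would go here

    def _iter_test_indices(self):
        base, extra = divmod(self.n, self.n_folds)
        start = 0
        for i in range(self.n_folds):
            stop = start + base + (1 if i < extra else 0)
            yield self.idxs[start:stop]
            start = stop


for train, test in KFoldSketch(6, n_folds=3):
    print(train, test)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Because _iter_test_indices only reads self.idxs and never modifies it, each call to __iter__ in this sketch yields the same folds in the same order.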