The Hold-out method is the most basic of the cross-validation (CV) techniques.
But why do we need it?
Suppose you train a model on a given dataset using some specific algorithm. You then measure the accuracy of the trained model on the same training data and find it to be 95%, or maybe even 100%. What does this mean? Is your model ready for prediction?
The answer is no. Why?
Because your model has trained itself on the given data, i.e. it knows that data very well, but it has not necessarily generalized beyond it. When you try to predict on a new set of data, it is likely to give you very poor accuracy, because it has never seen that data before and fails to generalize to it. This is the problem of overfitting.

To tackle this problem, the Hold-out method comes into the picture. The Hold-out method is a resampling technique whose basic idea is to divide the available dataset into two parts: train and test. On one part (train) you train the model, and on the second part (test), i.e. data that is unseen by the model, you make predictions and check how well the model performs. If the model achieves good accuracy on the test data, it has not overfitted the training data and can be trusted with predictions; if it performs poorly, the model is not to be trusted and we need to change our algorithm or its settings.
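To make the overfitting problem concrete, here is a minimal sketch. The data is synthetic, and the "model" is a 1-nearest-neighbour classifier written from scratch purely for illustration; both are placeholders, not part of the original example.

```python
import random

random.seed(0)

# Synthetic data: x is a point in [0, 1]; the "true" label is 'A' below 0.5
# and 'B' above, but 20% of the labels are flipped (label noise).
def make_point():
    x = random.random()
    label = 'A' if x < 0.5 else 'B'
    if random.random() < 0.2:  # label noise
        label = 'B' if label == 'A' else 'A'
    return x, label

train = [make_point() for _ in range(100)]
test = [make_point() for _ in range(100)]

def predict_1nn(data, x):
    # 1-nearest neighbour: copy the label of the closest training point
    return min(data, key=lambda p: abs(p[0] - x))[1]

def accuracy(model_data, points):
    hits = sum(1 for x, y in points if predict_1nn(model_data, x) == y)
    return hits / len(points)

train_acc = accuracy(train, train)  # evaluated on its own training data
test_acc = accuracy(train, test)    # evaluated on unseen data

print(train_acc)  # 1.0 -- each training point is its own nearest neighbour
print(test_acc)   # noticeably lower: the noise was memorised, not learned
```

The perfect training accuracy says nothing about real performance; only the held-out test score reveals that the model memorised the noise.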
So this is how we proceed with the Hold-out method:
1. In the first step, we randomly divide our available data into two subsets: a training set and a test set. Setting test data aside is our work-around for the imperfections of a non-ideal world, such as limited data and resources and the inability to collect more data from the generating distribution. The test set represents new, unseen data for our learning algorithm; it is important that we only touch the test set once, so that we don't introduce any bias when we estimate the generalization accuracy. Typically, we assign 2/3 of the data to the training set and 1/3 to the test set. Other common training/test splits are 60/40, 70/30, 80/20, or even 90/10.
2. After setting our test samples aside, we pick a learning algorithm that we think could be appropriate for the given problem. Now, what about the hyperparameter values depicted in the figure above? As a quick reminder, hyperparameters are the parameters of our learning algorithm, or meta-parameters if you will. We have to specify these hyperparameter values manually; in contrast to the actual model parameters, the learning algorithm does not learn them from the training data.
Since hyperparameters are not learned during model fitting, we need some sort of "extra procedure" or "external loop" to optimize them separately, and this basic holdout approach is ill-suited for that task. So, for now, we have to go with some fixed hyperparameter values; we could use our intuition, or the default parameters of an off-the-shelf algorithm if we are using an existing machine learning library.
3. Our learning algorithm fitted a model in the previous step. The next question is: how "good" is the model that it came up with? That's where our test set comes into play. Since our learning algorithm hasn't "seen" this test set before, it should give us a pretty unbiased estimate of the model's performance on new, unseen data. So, what we do is take this test set and use the model to predict its class labels. Then we compare the predicted class labels to the "ground truth", the correct class labels, to estimate the model's generalization accuracy.
4. Finally, we have an estimate of how well our model performs on unseen data.
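The four steps above can be sketched in a few lines of plain Python. The synthetic dataset and the trivial majority-class "learner" below are stand-ins for your own data and algorithm, not part of the original method description:

```python
import random

random.seed(42)

# Step 0: a labelled dataset (here: synthetic (feature, label) pairs).
data = [(random.random(), 'A' if random.random() < 0.7 else 'B')
        for _ in range(30)]

# Step 1: shuffle, then split 70/30 into train and test.
random.shuffle(data)
split = int(len(data) * 0.7)
train, test = data[:split], data[split:]

# Step 2: "fit" a model with fixed hyperparameters. As a stand-in for a
# real learning algorithm, predict the majority class of the training set.
labels = [y for _, y in train]
majority = max(set(labels), key=labels.count)

# Step 3: predict on the held-out test set, which the model has never seen.
predictions = [majority for _ in test]

# Step 4: compare predictions with the ground-truth labels.
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(f"hold-out accuracy: {accuracy:.2f}")
```

In practice you would swap the majority-class stand-in for a real learner, but the shuffle/split/fit/evaluate skeleton stays the same.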
The Hold-out method is not well suited for an imbalanced dataset, i.e. a dataset in which the classes are not equally distributed. For example, consider the following dataset:
Now, for the Hold-out method, if we do the 70/30 split as shown, you can see that the train set contains mostly records of class A and only a single record of class B. So, if we train our algorithm on that train set, do you think it will learn well enough to predict class B? When evaluating this model on the test set, we will get wrong predictions for class B and the error will increase.
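How likely is such a bad split? We can count the possibilities exactly. The class counts below (8 records of class A, 2 of class B, 70/30 split) are illustrative assumptions, since the exact figures from the original dataset are not reproduced here:

```python
from math import comb

n_a, n_b = 8, 2  # 8 records of class A, 2 of class B (assumed counts)
n_test = 3       # a 70/30 split of 10 records puts 3 in the test set

# Probability that a random test set of size 3 contains NO class-B record:
# all 3 test records must be drawn from the 8 A's.
p_no_b_in_test = comb(n_a, n_test) / comb(n_a + n_b, n_test)

# Probability that BOTH B records land in the test set, leaving the
# training set with no B at all: the remaining test slot is one of 8 A's.
p_no_b_in_train = comb(n_a, n_test - n_b) / comb(n_a + n_b, n_test)

print(f"P(test set never sees class B)  = {p_no_b_in_test:.3f}")   # 0.467
print(f"P(train set never sees class B) = {p_no_b_in_train:.3f}")  # 0.067
```

So with this class ratio, almost half of all random splits never evaluate the model on class B at all, and in about 1 split in 15 the model never even trains on class B.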
In this situation, Random Sub-sampling is a better approach than the Hold-out method.
Random Sub-sampling Method
Let's understand how Random Sub-sampling works:
Pros - It is a better approach than the Hold-out method for an imbalanced dataset.
Cons - The same record may be selected for the test set across several iterations.
As shown above, suppose we select k = 4 iterations.
Now, about the cons: the same record can be selected again and again for the test sets. Here you can see that in iterations 2, 3, and 4 the test set consists mostly of class B records, so we face the same problem again: our model will not be able to learn class B and will fail in validation.
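The procedure sketched above amounts to repeating the hold-out split k times and averaging the scores. This is a minimal pure-Python sketch; the synthetic data and the majority-class stand-in model are assumptions for illustration:

```python
import random

random.seed(7)

# Synthetic labelled dataset, a stand-in for a real one.
data = [(random.random(), 'A' if random.random() < 0.7 else 'B')
        for _ in range(30)]

def evaluate_split(train, test):
    # Stand-in "model": predict the training set's majority class.
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return sum(majority == y for _, y in test) / len(test)

k = 4          # number of random sub-sampling iterations
scores = []
for _ in range(k):
    shuffled = data[:]        # fresh random 70/30 split each iteration;
    random.shuffle(shuffled)  # the same record may recur in several test sets
    split = int(len(shuffled) * 0.7)
    scores.append(evaluate_split(shuffled[:split], shuffled[split:]))

print(f"per-iteration accuracy: {scores}")
print(f"mean accuracy over {k} splits: {sum(scores) / k:.2f}")
```

Averaging over several random splits gives a more stable estimate than a single hold-out split, but because the splits are independent, nothing prevents a record from appearing in multiple test sets or in none.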
To solve this problem, there is another good method called K-fold Cross-Validation.