Cross validation
To assess how a model will perform in practice, we need to validate it against an independent data set. This is commonly done by splitting the data into a training set and a test set: the model is trained on the training set and evaluated on the test set. However, a single split leaves a question open: what if the model only performed well on this particular test set? To address this, we repeat the procedure several times with differently sampled test sets, which lets us estimate the mean and variance of the model's performance across the splits. This procedure is known as cross-validation, and it gives a better picture of how a model trained on all the data will perform.
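The repeated split-train-evaluate loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a reference implementation: it uses k-fold splitting (one common way to obtain differently sampled test sets), a one-feature least-squares line as a stand-in model, mean squared error as the performance measure, and synthetic data. The helper names k_fold_indices, fit_line, mean_squared_error and cross_validate are hypothetical and exist only for this example.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b (stand-in model, assumed for illustration)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def mean_squared_error(model, xs, ys):
    """Average squared prediction error of the fitted line on (xs, ys)."""
    a, b = model
    return statistics.fmean([(a * x + b - y) ** 2 for x, y in zip(xs, ys)])


def cross_validate(xs, ys, k=5, seed=0):
    """Train on k-1 folds, evaluate on the held-out fold, and return one score per fold."""
    folds = k_fold_indices(len(xs), k, seed)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        test_x = [xs[i] for i in test_idx]
        test_y = [ys[i] for i in test_idx]
        scores.append(mean_squared_error(model, test_x, test_y))
    return scores


# Synthetic data, made up for this example: y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

scores = cross_validate(xs, ys, k=5)
print("per-fold MSE:", [round(s, 3) for s in scores])
print("mean MSE:", round(statistics.fmean(scores), 3))
print("variance of MSE:", round(statistics.variance(scores), 3))
```

The mean of the per-fold scores estimates the expected performance, and their variance indicates how sensitive that estimate is to the choice of test set. After cross-validation, the final model would typically be refit on all the data.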