The Gini coefficient is a well-established measure of the inequality among the values of a frequency distribution, and it can also be used to measure the quality of a binary classifier. Gini takes values between 0 and 1. A Gini index of 0 expresses perfect...
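As an illustration of how Gini can score a binary classifier, here is a minimal sketch. It assumes the common convention Gini = 2 · AUC − 1, which the text above does not state explicitly. The function names `auc` and `gini` and the sample data are made up for the example.

```python
def auc(labels, scores):
    """Estimate the probability that a random positive example is scored
    above a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    # Assumed convention: Gini = 2 * AUC - 1, so a random classifier
    # (AUC = 0.5) scores 0 and a perfect ranking (AUC = 1) scores 1.
    return 2.0 * auc(labels, scores) - 1.0


# Hypothetical labels (1 = positive class) and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
gini_value = gini(labels, scores)
print(gini_value)
```

The pairwise comparison is quadratic in the number of examples, which is fine for a small illustration but too slow for large data sets.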
A receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives...
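To make the idea of varying the discrimination threshold concrete, the following sketch computes the points of a ROC curve by hand. It assumes the usual axes (false positive rate on x, true positive rate on y) and that an example is predicted positive when its score is at or above the threshold. The name `roc_points` and the data are illustrative only.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per
    distinct threshold, starting from the point (0, 0)."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; each step can only add predictions.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical labels (1 = positive class) and classifier scores.
roc = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
for fpr, tpr in roc:
    print(fpr, tpr)
```

Plotting these pairs and connecting them gives the curve; a classifier that ranks every positive above every negative passes through the top-left corner (0, 1).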
It’s the process of balancing a data set by discarding examples of the overrepresented class so that each class has the same number of examples.
A balanced data set allows a model to learn equal amounts of characteristics from each one of the classes...
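One possible way to balance a data set in this manner is sketched below. It assumes the discarded examples are chosen at random, which the text does not specify. The function `undersample`, its `label_of` parameter, and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def undersample(examples, label_of, seed=0):
    """Randomly discard examples until every class is as small as the
    rarest class."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    by_class = defaultdict(list)
    for example in examples:
        by_class[label_of(example)].append(example)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)  # avoid leaving the classes in contiguous blocks
    return balanced


# Toy data: eight examples of class 0 and two of class 1.
data = [("a", 0)] * 8 + [("b", 1)] * 2
balanced = undersample(data, label_of=lambda example: example[1])
print(balanced)
```

The cost of this approach is that information in the discarded examples is lost, so it is most reasonable when the overrepresented class is large.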
A hyperparameter is a parameter whose value controls the learning process of a specific AI model. It has to be fixed before the training process starts.
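The difference between a hyperparameter and a learned parameter can be shown with a small sketch. The example assumes plain gradient descent on a one-parameter linear model; `fit_slope`, `learning_rate`, and `n_steps` are illustrative names, not taken from any particular library.

```python
def fit_slope(xs, ys, learning_rate=0.01, n_steps=500):
    """Fit y = w * x by gradient descent on the mean squared error.

    learning_rate and n_steps are hyperparameters: they are fixed before
    training and steer it. w is a model parameter: it is learned.
    """
    w = 0.0
    n = len(xs)
    for _ in range(n_steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated with a true slope of 2
w_default = fit_slope(xs, ys)           # enough steps to converge
w_short = fit_slope(xs, ys, n_steps=1)  # too few steps: far from 2
print(w_default, w_short)
```

Changing only the hyperparameters changes the result of training on identical data, which is why they are usually tuned on a separate validation set.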
To assess how a model will perform in practice we need to validate it against an independent data set. This is commonly done by splitting the data into a training set and a test set. A model is then trained on the training set and validated on the test set....
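The split into a training set and a test set can be sketched as follows. The example assumes a random shuffle and a 25% test share. Both are arbitrary choices for illustration, and `train_test_split` here is a hand-written helper, not a library function.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as the test set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train_rows, test_rows = train_test_split(range(20))
print(len(train_rows), len(test_rows))
```

Because no row appears in both sets, performance on the test rows is measured on data the model has not seen during training.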
An out-of-time test is the method of training a model on data from the earlier part of the time interval and testing it against the later part, the so-called out-of-time test set. The purpose is to create a testing scenario where the model and test set simulate...
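An out-of-time split differs from a random split only in how rows are assigned, as the sketch below shows. It assumes each row carries a date and that a single cutoff date separates the two parts. The names `out_of_time_split`, `date_of`, and `cutoff`, and the sample rows, are hypothetical.

```python
from datetime import date


def out_of_time_split(rows, date_of, cutoff):
    """Train on rows dated before the cutoff, test on rows from the
    cutoff onwards."""
    train = [row for row in rows if date_of(row) < cutoff]
    test = [row for row in rows if date_of(row) >= cutoff]
    return train, test


# Hypothetical observations spread over one year.
rows = [
    {"day": date(2023, 1, 10), "label": 0},
    {"day": date(2023, 3, 5), "label": 1},
    {"day": date(2023, 6, 20), "label": 0},
    {"day": date(2023, 9, 1), "label": 1},
    {"day": date(2023, 11, 15), "label": 0},
]
oot_train, oot_test = out_of_time_split(
    rows, date_of=lambda row: row["day"], cutoff=date(2023, 7, 1)
)
print(len(oot_train), len(oot_test))
```

No shuffling is involved: every training row is strictly older than every test row, so the model never sees information from the period it is tested on.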