Combine machine learning and human intelligence to build accurate credit decisions.
Machine learning is a powerful tool for credit decisions. It can increase accuracy significantly compared with traditional logistic regression and scorecard development. Although many machine learning models could improve predictive power, credit analysts working in regulated industries face a dilemma: choose higher predictive accuracy or honor their regulatory responsibilities.
One way to combine the power of machine learning with know-how and industry expertise is to train machine learning models with monotonic constraints.
What are monotonic constraints?
According to the Oxford dictionary, monotonic describes a function or quantity that varies in such a way that it either never decreases or never increases. A positive constraint means that as a variable’s value increases, its contribution to the final score never decreases. A negative constraint means the opposite: the contribution never increases. As a concrete example, consider a credit decision model. If everything else is equal, a person with a higher salary is expected to have a lower probability of default than a person with a lower salary. However, the data may contain some applications with the reversed relationship, where a higher salary is associated with more defaults, which is counter-intuitive. This kind of situation often happens when data is sparse or noisy.
To illustrate, let’s create some simulated data with two features and a response according to the following scheme:
The response generally increases with respect to the x1 feature, but a sinusoidal variation has been superimposed, resulting in the true effect being non-monotonic. For the x2 feature the response generally decreases, again with a sinusoidal variation superimposed.
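The scheme itself is not reproduced in this text. As a minimal sketch, the snippet below generates data that matches the description, assuming both features are drawn uniformly from [0, 1]. The formula, its coefficients, the sample size and the noise level are illustrative assumptions, not values taken from the original.

```python
import numpy as np


def simulate(n_samples=2000, seed=0):
    """Simulate two features and a response with trend plus sinusoid.

    Assumed formula (not from the article):
        y = 5*x1 + sin(10*pi*x1) - 5*x2 - cos(10*pi*x2) + noise
    """
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(0.0, 1.0, n_samples)
    x2 = rng.uniform(0.0, 1.0, n_samples)
    noise = rng.normal(0.0, 0.1, n_samples)  # assumed noise level
    y = (
        5.0 * x1 + np.sin(10.0 * np.pi * x1)    # increasing trend, oscillating
        - 5.0 * x2 - np.cos(10.0 * np.pi * x2)  # decreasing trend, oscillating
        + noise
    )
    X = np.column_stack([x1, x2])
    return X, y


X, y = simulate()
```

With this formula the response rises with x1 and falls with x2 on average, while the sinusoidal terms make both true effects locally non-monotonic.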
Let’s fit a model to this data without imposing any monotonic constraints:
The black curve shows the trend inferred from the model for each feature. To make these plots the distinguished feature xi is fed to the model over a one-dimensional grid of values, while all the other features (in this case only one other feature) are set to their average values. We see that the model does a good job of capturing the general trend with the oscillatory wave superimposed.
Here is the same model, but fit with monotonicity constraints:
When to use it?
If you or your team have experience and know-how in the field you operate in, setting constraints on variables can often result in better model performance on test data, meaning that the constrained model may generalize better than an unconstrained one.
Another reason to use monotonic constraints is that the model becomes easier to trust and understand. Take the salary variable in the credit decision model again: if everything else is equal, a person with a higher salary is expected to have a lower probability of default than a person with a lower salary. If you want this behaviour to always hold, regardless of what the data say, a monotonic constraint must be set.
How to use it in the Evispot ML platform
A public UCI ML dataset from a Taiwanese bank is used here to illustrate how to set and verify monotonic constraints using the Evispot ML platform.
The dataset contains demographic data such as age, education and gender, a history of past payments, and a target variable: default_payment_next_month.
After the dataset has been uploaded to the Evispot ML platform, the preprocessing view is shown. This is where the first data exploration takes place, and it is also where monotonic constraints can be set.
To train and analyse a transparent monotonic ML model, constraints must be supplied. You can either set the constraint manually for each variable or set the constraints automatically on all variables, based on each variable’s Pearson coefficient. For this example, I will use the automatic approach by clicking on “Enable for all”, which can be found on the left side under the section Set monotonic constraint.
The Pearson coefficient measures the linear correlation between a variable and the target. It returns a value between -1 and 1, where 1 indicates a strong positive correlation, -1 a strong negative correlation and 0 no correlation. Hence, a positive Pearson coefficient results in a positive constraint and a negative Pearson coefficient results in a negative constraint.
After the monotonic constraints have been added, I navigate to the Modelling page and start the automatic model training.
In the modelling view, ML models can be created either by running the automatic model optimization, which lets the Evispot ML platform train and optimize different models automatically, or by using expert settings to customize an ML model.
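As a rough sketch of the rule just described, the snippet below computes the Pearson coefficient for a variable and maps its sign to a constraint direction. This is not the platform's actual implementation. The function names, the toy data and the choice to return 0 (no constraint) for a zero or undefined coefficient are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Undefined for a constant column; treated as "no signal" (assumption).
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def constraint_from_pearson(r):
    """Map the sign of the coefficient to +1, -1 or 0 (no constraint)."""
    if r > 0:
        return 1
    if r < 0:
        return -1
    return 0


# Invented toy data: salary in thousands and whether the customer defaulted.
salary = [20, 35, 50, 65, 80]
defaulted = [1, 1, 0, 1, 0]

r = pearson(salary, defaulted)
salary_constraint = constraint_from_pearson(r)  # negative: higher salary, fewer defaults
```

Because only the sign is used, a weak correlation produces the same constraint as a strong one, so automatically assigned constraints are worth reviewing against domain knowledge.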
For this purpose I will use the default settings, meaning I only need to click on the start button. When a model has been trained, it is automatically shown in a list under the model settings.
Above you can see that four models have been trained. Two extra models without monotonic constraints have been created to showcase the differences. To navigate to the evaluation, click on the blue button to the left named “Evaluate”, or use the menu bar at the top of the page and click on “Evaluation”.
In the evaluation view, the Evispot ML platform employs a host of different techniques and methodologies for interpreting the created model. A common tool for evaluating monotonic constraints is the partial dependence plot, which can be found by navigating to the submenu traits and expanding a variable using the arrow to the right in the table.
The above plot shows the final Evispot score on the y-axis and the value of the variable PAY_0 on the x-axis. If a constraint had been added, the line would have to move in one direction only. However, the line breaks its generally decreasing direction between the values -1 and 0 and between 4 and 5, so no constraint has been added. The plot below (same variable) is constrained, since the Evispot score never increases when PAY_0 has a higher value.
In summary, when using a monotonic constraint for the variable PAY_0, you know that if everything else is equal (i.e. all other variables are unchanged), the Evispot score can only decrease (or stay the same) if PAY_0 has a higher value, and only increase (or stay the same) if it has a lower value.
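The visual check on a partial dependence curve can also be done programmatically. The sketch below is a generic helper, not part of the Evispot ML platform. The score values are invented purely to mirror the shapes described above, and the PAY_0 grid from -2 to 8 is an assumption.

```python
def violations(grid, scores, direction):
    """Return the grid intervals where the curve moves against `direction`.

    direction = -1 means the score may never increase,
    direction = +1 means the score may never decrease.
    """
    if direction not in (-1, 1):
        raise ValueError("direction must be -1 or 1")
    broken = []
    for i in range(len(scores) - 1):
        step = scores[i + 1] - scores[i]
        if step * direction < 0:  # step goes the wrong way
            broken.append((grid[i], grid[i + 1]))
    return broken


# Assumed grid of PAY_0 values and invented scores (not platform output).
pay_0 = [-2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8]
unconstrained = [0.80, 0.78, 0.82, 0.60, 0.40, 0.35, 0.30, 0.33, 0.28, 0.25, 0.25]
constrained = [0.80, 0.79, 0.79, 0.60, 0.40, 0.35, 0.30, 0.30, 0.28, 0.25, 0.25]

unconstrained_breaks = violations(pay_0, unconstrained, direction=-1)
constrained_breaks = violations(pay_0, constrained, direction=-1)
```

An empty list means the curve respects the constraint. Flat segments are allowed, which matches the "decrease or be the same value" wording above.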
If you would like to know more about the Evispot ML platform, don’t hesitate to contact us.