An interpretable model versus black-box algorithms for complex decision making
Making a decision today comes at the cost of possibly missing a better opportunity tomorrow. When looking at job candidates, at homes — or even for love — it’s hard to know when to stop looking and make a choice. The good news is that there’s an algorithm for that.
There are actually several algorithms for solving this type of problem, known as an optimal stopping problem. These algorithms attempt to time a decision to lead to the best possible outcome. The bad news is that it’s not always clear why an algorithm recommends a particular time to make the decision.
When there are several variables leading to a decision, some algorithms can become a black box. Decisions with many variables create a multidimensional problem. For example, an algorithm for making personal loans may be trained on data from past applicants’ credit scores, incomes, debts, assets and defaulted loans. The algorithm may also use combinations, correlations and distributions of these variables. All of these factors are weighted during the algorithm’s training process, producing a complicated mathematical equation that outputs whether to approve or deny a loan to a new applicant.
The Problem with Black Boxes
If a bank using the algorithm is ever accused of bias in its loan decisions, it will be difficult to prove that nothing shady is going on. To defend such an algorithm, it’ll be a challenge to justify and explain the weightings for the variables. Many industries face this problem as they become reliant on machine learning and deep learning models to inform decision making. For this reason, there’s a push for models that are interpretable, meaning transparent, rather than black boxes. This is especially true in health care and public policy.
A paper published in Management Science proposes an approach to solving complex optimal stopping problems using an interpretable model. The authors of the paper, INSEAD’s Dragos Florin Ciocan and UCLA Anderson’s Velibor Mišić, believe their method can outperform state-of-the-art black-box solutions for complex optimal stopping problems.
Not every machine learning model is a black box. Regression models and decision trees are the best-known interpretable models. Decision trees are an especially powerful tool, forming the basis of many well-performing machine learning models. A binary decision tree is easy to interpret, as seen in the illustration below.
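The kind of tree described here can also be written out as a few nested conditions. The sketch below is purely illustrative: the credit score split at 720 is the one the article refers to, while the second split on a debt-to-income ratio and its 0.30 cutoff are assumptions made up for this example, not taken from any real lending model.

```python
def loan_decision(credit_score: int, debt_to_income: float) -> str:
    """Hypothetical two-level binary decision tree for a personal loan."""
    # Root split: the "Credit Score is greater than 720" condition from the article.
    if credit_score > 720:
        return "Lend"
    # Second split: an assumed rule, included only to show a deeper branch.
    if debt_to_income < 0.30:
        return "Lend"
    return "Deny"


print(loan_decision(750, 0.45))
print(loan_decision(680, 0.20))
print(loan_decision(680, 0.40))
```

Every decision can be traced to at most two readable conditions, which is what makes a tree like this easy to explain and defend.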
To employ decision trees in solving an optimal stopping problem, and thus benefit from the interpretability of that model, Ciocan and Mišić needed to develop a new approach for building trees. The researchers’ method tries to optimize each branch of the tree directly as it builds an interpretable policy for the solution to an optimal stopping problem. That is quite different from a black box that, through its many unseen machinations, arrives at a sought-after value.
A challenge Ciocan and Mišić faced in directly optimizing the tree, though, is that solutions to optimal stopping problems have to consider the progression of time. As a result, the tree’s levels and their labels (Lend/Deny in the previous example) cannot be considered independently, as they can be in the personal loan example. A change in one label will affect the other labels.
Their algorithm builds the tree from the top down, using data to determine each conditional split point. (These are the points in the tree with a conditional statement, such as “Credit Score is greater than 720” in the previous example.) It keeps expanding the tree, developing new split points, until adding another split no longer improves the outcome.
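The top-down idea can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the authors’ implementation: the tree encoding, the function names, the candidate thresholds, the simulated price paths and the payoff (price minus 100, floored at zero) are all hypothetical. The sketch starts from a single “go” leaf, tries every candidate split at every leaf, keeps the one that most improves the average reward over the sample paths, and halts when no split helps.

```python
import random

# A tree is either a leaf label ("stop" or "go") or a tuple
# (feature_index, threshold, left_subtree, right_subtree).

def decide(tree, state):
    """Walk the tree: go left when state[feature] <= threshold."""
    while not isinstance(tree, str):
        feature, threshold, left, right = tree
        tree = left if state[feature] <= threshold else right
    return tree

def avg_reward(tree, paths):
    """Average payoff at the first 'stop'; a path that never stops earns 0."""
    total = 0.0
    for path in paths:
        for state, payoff in path:
            if decide(tree, state) == "stop":
                total += payoff
                break
    return total / len(paths)

def leaf_positions(tree, prefix=()):
    """Yield the position of every leaf as a tuple of child indices (2 or 3)."""
    if isinstance(tree, str):
        yield prefix
    else:
        yield from leaf_positions(tree[2], prefix + (2,))
        yield from leaf_positions(tree[3], prefix + (3,))

def replace_leaf(tree, position, subtree):
    """Return a copy of the tree with the leaf at `position` replaced."""
    if not position:
        return subtree
    node = list(tree)
    node[position[0]] = replace_leaf(tree[position[0]], position[1:], subtree)
    return tuple(node)

def grow_tree(paths, thresholds, tol=1e-6, max_splits=5):
    """Greedily add the single best split until the reward stops improving."""
    tree, best = "go", avg_reward("go", paths)
    for _ in range(max_splits):
        candidates = [
            replace_leaf(tree, pos, (feature, thr, left, right))
            for pos in leaf_positions(tree)
            for feature, grid in enumerate(thresholds)
            for thr in grid
            for left, right in (("stop", "go"), ("go", "stop"))
        ]
        # Each candidate is scored as a whole policy, because changing one
        # leaf changes which paths are still running when they reach the others.
        scored = [(avg_reward(c, paths), c) for c in candidates]
        top_value, top = max(scored, key=lambda pair: pair[0])
        if top_value - best <= tol:
            break
        tree, best = top, top_value
    return tree, best

# Hypothetical sample data: state = (time step, price), payoff = max(price - 100, 0).
rng = random.Random(0)
paths = []
for _ in range(200):
    price, path = 100.0, []
    for t in range(5):
        price += rng.gauss(0, 5)
        path.append(((t, price), max(price - 100.0, 0.0)))
    paths.append(path)

tree, best = grow_tree(paths, thresholds=[[0, 1, 2, 3], [95.0, 100.0, 105.0, 110.0]])
print(tree, round(best, 3))
```

Scoring the entire policy after each tentative split, rather than scoring each leaf in isolation, is the sketch’s way of respecting the time dependence described above.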
Simplicity and Interpretability Triumph
Ciocan and Mišić apply their method to the problem of pricing a Bermudan option. A Bermudan option is like an American option on a security, which can be exercised at any time up to expiration, but the Bermudan option can only be exercised on particular days up to the expiration date. Having to decide at which point among the set dates to exercise a Bermudan option makes it an optimal stopping problem. And the many factors that must be considered, such as underlying stock price, time to expiration and volatility, make pricing such an option a complex problem.
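To make the stopping decision concrete, here is a minimal Monte Carlo sketch of a Bermudan put evaluated under one simple, readable rule: exercise on the first permitted date at which the price is at or below a threshold, and otherwise take whatever the option is worth at expiration. Everything numeric here is an assumption chosen for illustration (a single stock following geometric Brownian motion, strike 100, 5% rate, 20% volatility, four quarterly exercise dates), and the function name is hypothetical. The result is the value of that one policy, which is a lower bound on the option’s price, not the price itself.

```python
import math
import random

def bermudan_put_value(threshold, s0=100.0, strike=100.0, rate=0.05, vol=0.2,
                       exercise_times=(0.25, 0.5, 0.75, 1.0),
                       n_paths=20000, seed=0):
    """Estimate the discounted payoff of a threshold exercise rule."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        price, prev_t = s0, 0.0
        for i, t in enumerate(exercise_times):
            # Advance the simulated price to the next permitted exercise date.
            dt = t - prev_t
            z = rng.gauss(0.0, 1.0)
            price *= math.exp((rate - 0.5 * vol ** 2) * dt + vol * math.sqrt(dt) * z)
            prev_t = t
            is_last_date = i == len(exercise_times) - 1
            # Stop if the rule fires, or if this is the final chance to exercise.
            if price <= threshold or is_last_date:
                total += math.exp(-rate * t) * max(strike - price, 0.0)
                break
    return total / n_paths

# Compare an early-exercise threshold with never exercising early (threshold 0).
print(bermudan_put_value(90.0, n_paths=5000))
print(bermudan_put_value(0.0, n_paths=5000))
```

With a threshold of 0 the rule never fires early, so the second call reduces to holding the option until expiration, which gives a baseline for judging other thresholds.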
It’s reasonable to expect that the researchers’ method will give up some degree of performance in return for being interpretable. But they assert that’s not the case — that their method outperforms two state-of-the-art approaches in pricing such options “while simultaneously producing policies that are significantly simpler and more transparent,” they write.
They performed further testing of their method against a simple, one-dimensional optimal stopping problem where the exact optimal solution could be calculated for comparison. The interpretable method’s solution was either optimal or very close to the optimal result during this testing.
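A toy problem shows what this kind of comparison looks like. The example below is not the problem studied in the paper; it is a stand-in chosen because the exact answer is easy to compute. A die may be rolled up to three times, and the player may stop after any roll and keep the face value. Backward induction gives the exact optimal value, which can then be compared with a one-split rule, “stop if the face is at least 5.” The function names are hypothetical.

```python
from fractions import Fraction

FACES = range(1, 7)

def optimal_value(rolls_left):
    """Exact optimal expected payoff, computed by backward induction."""
    value = Fraction(0)  # with no rolls left there is nothing to collect
    for _ in range(rolls_left):
        # Stop on a face only when it beats the value of continuing.
        value = sum(max(Fraction(f), value) for f in FACES) / 6
    return value

def threshold_value(rolls_left, threshold):
    """Exact expected payoff of 'stop if face >= threshold', forced on the last roll."""
    value = Fraction(0)
    for k in range(rolls_left):
        if k == 0:
            # Final roll: whatever comes up must be accepted.
            value = Fraction(sum(FACES), 6)
        else:
            value = sum(Fraction(f) if f >= threshold else value for f in FACES) / 6
    return value

print(optimal_value(3))       # 14/3
print(threshold_value(3, 5))  # 83/18
```

The simple rule earns 83/18 (about 4.61) against an optimum of 14/3 (about 4.67), a gap of 1/18, which illustrates how a very small tree can land close to the exact solution in a one-dimensional problem.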
These results provide optimism that more black-box algorithms can be replaced with interpretable models in the future: “We believe that this methodology represents an exciting starting point for future research at the intersection of stochastic control and interpretable machine learning,” they write.