A preprint uploaded to arXiv yesterday, courtesy of the Technical University of Denmark and UC Berkeley, introduces a novel method for penalising bias in deep neural networks.
Explainability has been a recent focus of many deep learning research teams, and much progress has been made in this area (invertible networks, for example). There has been little, however, in the way of techniques for using the insights gained to engineer better, more robust and less biased networks.
The preprint’s authors suggest a method that they call Contextual Decomposition Explanation Penalisation (CDEP), which allows data scientists to penalise spurious correlations between features at training time.
The method involves adding explanation error terms to the model’s loss function. These terms take the L1 norm of the difference between the beta terms from the Contextual Decomposition (CD) algorithm (which is used to calculate feature importance) and a target value (usually zero) for those features and feature interactions the scientist wants the model to ignore. In this way, the loss is minimised only when the model is accurate and its explanations match the scientist’s assumptions.
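To make the shape of that loss concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the weighting factor `lam` and the example numbers are all assumptions made for illustration, and the CD importance scores are taken as given rather than computed from a network.

```python
import math


def cross_entropy(probs, label):
    """Ordinary prediction loss: negative log-probability of the true class."""
    return -math.log(probs[label])


def explanation_penalty(cd_scores, targets):
    """L1 distance between importance scores and the values we want them to take."""
    return sum(abs(score - target) for score, target in zip(cd_scores, targets))


def penalised_loss(probs, label, cd_scores, targets, lam=1.0):
    """Prediction loss plus a weighted explanation error term.

    `lam` is a hypothetical weight trading accuracy against explanation error.
    """
    return cross_entropy(probs, label) + lam * explanation_penalty(cd_scores, targets)


# Made-up values: two features the scientist believes should be irrelevant,
# so their target importance is zero.
probs = [0.5, 0.5]
label = 1
cd_scores = [0.5, -0.25]
targets = [0.0, 0.0]

loss_without = penalised_loss(probs, label, cd_scores, targets, lam=0.0)
loss_with = penalised_loss(probs, label, cd_scores, targets, lam=2.0)
```

With `lam` set to zero the loss is the usual cross-entropy. Any positive value makes the model pay for leaning on the flagged features, so gradient descent is pushed towards solutions that are accurate without relying on them.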
As use of the CD algorithm only adds a small constant term to the training time of a DNN, the authors of the paper have found a much more computationally efficient way of explaining and updating these models than the current state of the art.
Several datasets are tested in the preprint (including the Stanford Sentiment Treebank dataset, the International Skin Imaging Collaboration dataset and ColorMNIST) and the methodology is shown to improve accuracy across the board.
However, this seeming breakthrough in neural network explainability and bias-removal does have one caveat.
Neural networks have become popular (and successful) precisely because they remove the need for feature engineering. This means that engineers and scientists without domain expertise can utilise them to learn arbitrary decision functions. It is not necessary, for example, to know anything about edge detection or image segmentation to train an object detection model.
Systems like CDEP, however, do require their user to have some understanding of not only the data (which could run to many millions of observations) but also the domain from which it’s drawn. Both pieces are needed to correctly calibrate a model and remove its spurious correlations.
As more people develop deep learning skills (with the aid of online and distance learning) and as more money is invested in startups using deep learning in every imaginable vertical, it is likely that we’ll see a shortage of the kind of knowledge needed to counteract the effects of model bias.
The models whose latent bias provokes outrage are rarely manifesting the bias of their engineers. Rather, they are the result of the engineers’ ignorance of the bias in their datasets.
CDEP doesn’t resolve this problem, but it at least allows those with awareness to efficiently and effectively remove bias from their models.
Before we can use deep learning as the magic bullet it is often claimed to be, we need a culture shift around who should be engineering and programming these models. Techniques like CDEP emphasise the importance of specialist domain knowledge in a world where the latest and greatest techniques are available to everyone with an AWS account.