Developing the sensitivity of LIME for better machine learning explanation
Abstract
Machine learning systems can produce outstanding results, but their black-box nature makes it difficult to understand how a conclusion has been reached. Understanding how results are determined is especially important in military and security contexts, where the decisions informed by those results carry significant consequences. In this work, the reliability of LIME (Local Interpretable Model-Agnostic Explanations), an interpretability method, was analyzed and developed. A simple Convolutional Neural Network (CNN) model was trained on two classes of images, "gun-wielder" and "non-wielder". The sensitivity of LIME improved when multiple output weights for individual images were averaged and visualized. The resulting averaged images were compared to the individual images to analyze the variability and reliability of the two LIME methods. Without techniques such as those explored in this paper, LIME appears unstable because of its simple binary coloring and the ease with which colored regions flip when different analyses are compared. Closer inspection reveals that the significantly weighted regions are consistent, while the lower-weighted regions flip states due to the inherent randomness of the method. This suggests that improving the weighting methods for explanation techniques, which can then be used in the visualization of the results, is important for improving perceived stability and thereby better enabling human interpretation and trust.
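As a minimal sketch of the averaging idea described above (not the authors' exact code), the following assumes the open-source lime package, an RGB image supplied as a NumPy array, and a hypothetical classifier function predict_fn that takes a batch of images and returns class probabilities, as expected by LIME's image explainer. The helper averaged_lime_weights (an illustrative name, not from the paper) fixes the superpixel segmentation so that only LIME's random perturbation sampling varies between runs, then averages the per-superpixel weights across runs and projects them back onto the image as a continuous heatmap rather than LIME's binary region mask.

```python
import numpy as np
from lime import lime_image
from lime.wrappers.scikit_image import SegmentationAlgorithm


def averaged_lime_weights(image, predict_fn, label, n_runs=10, num_samples=1000):
    """Run LIME n_runs times on one image and average the per-superpixel
    weights for the given class label (0 or 1 for a binary classifier)."""
    explainer = lime_image.LimeImageExplainer()

    # Fix the segmentation so the superpixels are identical across runs;
    # only the random perturbation sampling then differs between runs.
    segmenter = SegmentationAlgorithm('quickshift', kernel_size=4,
                                      max_dist=200, ratio=0.2, random_seed=42)
    segments = segmenter(image)
    n_segments = segments.max() + 1
    weight_sums = np.zeros(n_segments)

    for _ in range(n_runs):
        explanation = explainer.explain_instance(
            image, predict_fn, top_labels=2, hide_color=0,
            num_samples=num_samples, segmentation_fn=segmenter)
        # local_exp maps each label to (superpixel_id, weight) pairs.
        for segment_id, weight in explanation.local_exp[label]:
            weight_sums[segment_id] += weight

    avg_weights = weight_sums / n_runs
    # Project the averaged superpixel weights back onto the image grid,
    # giving a graded heatmap that can be visualized instead of a binary mask.
    heatmap = avg_weights[segments]
    return avg_weights, heatmap
```

Because the averaged weights retain their magnitudes, weakly weighted superpixels no longer flip between "important" and "unimportant" across runs in the visualization, which is the stabilizing effect discussed in the abstract.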