Introduction & Motivation
Deep learning (DL) has achieved great success in recent years. Its superior performance, however, comes at the cost of model opacity: DL models offer no clear interpretation of, or guidelines on, how their answers are derived. Concerns about the black-box nature of DL have hampered its adoption in real-world scenarios, especially in mission-critical applications such as finance and healthcare and in time-critical applications such as self-driving. As AI becomes more entrenched in our society and brings benefits to smart nation initiatives, the next generation of AI should be more transparent and more interpretable.
Interpretable machine learning, at the forefront of AI research, holds promise for helping humans understand the working mechanism of black-box models and how a particular decision is made. Existing work falls mainly into two categories: the intrinsic (by-design) approach, which designs self-explanatory models, and the post-hoc approach, which constructs a second model to interpret the target model. Both approaches have major limitations: the by-design approach typically trades accuracy for interpretability, while the post-hoc approach lacks guarantees about explanation quality. Furthermore, while making models more transparent, existing approaches give little in-depth consideration to privacy (e.g., avoiding leaks of information about the data), fairness (e.g., ensuring fairness during the decision-making process), and robustness (e.g., enhancing robustness against adversarial attacks) in model interpretation.
In this research, we aim to exploit interpretable knowledge, such as prior domain knowledge, information-theoretic principles, and adversarial algorithms, to elucidate deep black-box models and thereby broaden the spectrum of explainable AI solutions. In particular, we focus on a three-level interpretable knowledge framework that covers working mechanisms, model components, and individual decisions. Furthermore, we give in-depth consideration to privacy, fairness, and robustness issues in model interpretation.
We exploit knowledge proxies to create self-explanatory models whose working mechanisms are inherently interpretable. To this end, we first extract interpretable knowledge from raw data and organize it in the form of decision trees, inference rules, or symbolic reasoning. We then incorporate this knowledge into deep model structures, so that the resulting models combine the interpretability of the proxy with the strong representation ability of neural networks. For example, we proposed TEM [2] and KPRN [3], which learn decision rules and knowledge-aware paths, respectively, as the proxy, and we used these models [2,3,5] to explain why a recommendation is made.
Moreover, we explore network dissection to interpret model components, with the aim of understanding internal representations through interpretable knowledge structures. To that end, we developed KTUP [4], which identifies semantic relations from a knowledge graph and couples the hidden neurons of recommenders with known relations. As such, KTUP can select the top relations (e.g., director, actors, genres) to explain why a user prefers certain items (e.g., movies).
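To make the idea of selecting top relations concrete, the following is a minimal sketch and not KTUP's actual formulation. It assumes a simplified translation-style score in which a relation explains a user-item pair well when the user embedding plus the relation embedding lands close to the item embedding. The function names, the embeddings, and the scoring rule are all hypothetical and chosen only for illustration.

```python
def translation_score(user, relation, item):
    """Hypothetical score: negative squared distance between (user + relation) and item.

    Higher (closer to zero) means the relation better 'translates' the user to the item.
    """
    return -sum((u + r - i) ** 2 for u, r, i in zip(user, relation, item))


def top_relations(user, item, relations, k=2):
    """Return the names of the k relations with the highest translation score."""
    ranked = sorted(
        relations,
        key=lambda name: translation_score(user, relations[name], item),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-dimensional embeddings, invented for this example.
user = [0.2, 0.1]
movie = [0.9, 0.3]
relations = {
    "director": [0.7, 0.2],
    "actor": [0.1, 0.9],
    "genre": [0.5, 0.0],
}

explanation = top_relations(user, movie, relations, k=2)
print(explanation)
```

In this toy setting the relations returned first are the ones that would be surfaced to the user as the explanation for the recommended movie.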
We also give in-depth consideration to the robustness of a predictive model, which is expected to maintain a high level of performance even under adversarial attacks. For example, we developed APR [1], which adds adversarial perturbations to model parameters during training in order to make the predictions robust against such perturbations.
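The training objective behind this kind of adversarial regularization can be sketched for a single (user, positive item, negative item) triple as follows. This is an illustrative sketch under stated assumptions rather than the implementation from [1]: it assumes a matrix-factorization scorer with a pairwise BPR loss, builds the perturbation as the loss gradient rescaled to a fixed norm, and normalizes jointly over the three embeddings involved. The names `epsilon` and `reg_adv` and all numeric values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bpr_loss(p_u, q_i, q_j):
    """Pairwise ranking loss: user u should score item i above item j."""
    return -math.log(sigmoid(dot(p_u, q_i) - dot(p_u, q_j)))


def bpr_gradients(p_u, q_i, q_j):
    """Gradients of the BPR loss with respect to p_u, q_i, and q_j."""
    x = dot(p_u, q_i) - dot(p_u, q_j)
    c = -(1.0 - sigmoid(x))  # derivative of -log(sigmoid(x)) w.r.t. x
    g_u = [c * (a - b) for a, b in zip(q_i, q_j)]
    g_i = [c * a for a in p_u]
    g_j = [-c * a for a in p_u]
    return g_u, g_i, g_j


def adversarial_perturbation(grads, epsilon):
    """Rescale the gradient to norm epsilon: the direction that most increases the loss."""
    norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if norm == 0.0:
        return [[0.0] * len(vec) for vec in grads]
    return [[epsilon * g / norm for g in vec] for vec in grads]


def adversarial_training_loss(p_u, q_i, q_j, epsilon=0.5, reg_adv=1.0):
    """Clean loss plus reg_adv times the loss under perturbed parameters."""
    clean = bpr_loss(p_u, q_i, q_j)
    deltas = adversarial_perturbation(bpr_gradients(p_u, q_i, q_j), epsilon)
    p_adv, qi_adv, qj_adv = (
        [a + d for a, d in zip(vec, delta)]
        for vec, delta in zip((p_u, q_i, q_j), deltas)
    )
    adv = bpr_loss(p_adv, qi_adv, qj_adv)
    return clean + reg_adv * adv, clean, adv


# Toy embeddings, invented for this example.
p_u, q_i, q_j = [0.5, -0.2], [0.3, 0.1], [-0.1, 0.4]
total, clean, adv = adversarial_training_loss(p_u, q_i, q_j)
print(total, clean, adv)
```

Minimizing the combined loss encourages parameters whose ranking remains correct even after the worst-case perturbation of the given size, which is the sense in which the model becomes more robust.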
Plans for Future Research
First, to complete the generic three-level interpretable knowledge framework, we will explore feature attribution to interpret individual decisions of arbitrary black-box models. Second, we will pay more attention to the robustness, privacy, and fairness of explainable models. Third, we will design human-based evaluations of the explanations, focusing in particular on the quality and utility of recommendations. Lastly, we will apply our framework to real-world application domains, namely recommender systems, Fintech, and healthcare.
References
[1] Xiangnan He, Zhankui He, Xiaoyu Du, Tat-Seng Chua: Adversarial Personalized Ranking for Recommendation. SIGIR 2018: 355-364
[2] Xiang Wang, Xiangnan He, Fuli Feng, Liqiang Nie, Tat-Seng Chua: TEM: Tree-enhanced Embedding Model for Explainable Recommendation. WWW 2018: 1543-1552
[3] Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, Tat-Seng Chua: Explainable Reasoning over Knowledge Graphs for Recommendation. AAAI 2019
[4] Yixin Cao, Xiang Wang, Xiangnan He, Zikun Hu, Tat-Seng Chua: Unifying Knowledge Graph Learning and Recommendation: Towards a Better Understanding of User Preferences. WWW 2019: 151-161
[5] Xun Yang, Xiangnan He, Xiang Wang, Yunshan Ma, Fuli Feng, Meng Wang, Tat-Seng Chua: Interpretable Fashion Matching with Rich Attributes. SIGIR 2019