PD Dr. Ullrich Köthe, Prof. Carsten Rother, SS 2018
Thursdays, 14:00-16:00, HCI, Mathematikon B (Berliner Str. 43), 3rd floor, SR 128
Today's machine learning algorithms, and in particular neural networks, mostly act as black boxes: they make very good predictions, but we don't really understand why. This is problematic for various reasons: Why should users (e.g. physicians) trust these algorithms? Will black-box methods contribute to the advancement of science when they produce numbers, not insight? How can one legally challenge an objectionable machine decision? Explainable machine learning attempts to solve these problems by opening the black box. In the seminar, we will look at many different ideas on how this can be done and what kind of insight is gained.
Since 29 students are registered for the seminar, we will have two talks every week, either on two related topics or on a single more complex topic. Please send me an email with your favourite topics, especially if you want to present at the beginning of the semester.
Schedule
| Date | Speaker | Topic | Slides | Report |
|---|---|---|---|---|
| 19 April | Felix Feldmann | Experimental Investigation of Explainability | Slides | Report |
| 19 April | Conrad Sachweh | The EU "right to explanation"; An explainable machine learning challenge | Slides | Report |
| 26 April | Mohammad Gharaee | Grad-CAM | Slides | Report |
| 26 April | Philipp Wimmer | Interpreting and understanding deep neural networks | Slides | Report |
| 3 May | Philip Grassal | Why Should I Trust You? | Slides | Report |
| 3 May | Christoph Schaller | What is Relevant in a Text Document? | Slides | Report |
| 17 May | Carsten Lüth | Dynamic routing between capsules | Slides | Report |
| 17 May | Michael Dorkenwald | Matrix capsules with EM routing | Slides | Report |
| 24 May | Philipp Reiser | Metric learning with adaptive density discrimination | Slides | Report |
| 24 May | Benedikt Kersjes | Learning Deep Nearest Neighbor Representations Using Differentiable Boundary Trees | Slides | Report |
| 24 May | Jens Beyermann | Deep Unsupervised Similarity Learning using Partially Ordered Sets | Slides | Report |
| 7 June | Sebastian Gruber | What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? | Slides | Report |
| 7 June | Florian Kleinicke | Learning how to explain neural networks: PatternNet and PatternAttribution | Slides | Report |
| 14 June | Philipp de Sombre | Understanding Black-box Predictions via Influence Functions | Slides | Report |
| 14 June | Thorsten Wünsche | MARGIN: Uncovering Deep Neural Networks using Graph Signal Analysis | Slides | Report |
| 21 June | Pingchuan Ma | Network Dissection: Quantifying Interpretability of Deep Visual Representations | Slides | Report |
| 21 June | Johannes Vogt | Feature Visualization | Slides | Report |
| 28 June | Aliya Amirzhanova | Deep feature interpolation for image content changes | Slides | Report |
| 28 June | Julian Rodriquez | Distilling a Neural Network Into a Soft Decision Tree | Slides | Report |
| 5 July | Daniela Schacherer | Interpreting Deep Classifier by Visual Distillation of Dark Knowledge | Slides | Report |
| 5 July | Michael Aichmüller | Generating Visual Explanations | Slides | Report |
| 5 July | Frank Gabel | Generative Adversarial Text to Image Synthesis | Slides | Report |
| 12 July | Peter Huegel | InfoGAN | Slides | Report |
| 12 July | Hannes Perrot | Inferring Programs for Visual Reasoning | Slides | Report |
| 19 July | Nasim Rahaman | Discovering Causal Signals in Images | Slides | Report |
| 19 July | Stefan Radev | CausalGAN | Slides | Report |
Topics to Choose From:
Topic 1: Black-box Attention Indicators
Attention mechanisms explain an algorithm's decision by pointing out the crucial evidence in the data. For example, when the algorithm recognizes a cat in an image, an attention mechanism will highlight the image region presumably containing the cat. If the highlighted region is off, something is wrong and the algorithm should not be trusted. Black-box methods achieve this without looking inside the algorithm itself and thus work for any machine learning method.
- [Speaker: Philip Grassal, 3.5.] Ribeiro et al. "Why Should I Trust You?": Explaining the Predictions of Any Classifier (2016)
- Lundberg and Lee A unified approach to interpreting model predictions (2017)
- Fong and Vedaldi Interpretable Explanations of Black Boxes by Meaningful Perturbation (2017)
- Dabkowski and Gal Real Time Image Saliency for Black Box Classifiers (2017)
- Elenberg et al. Streaming Weak Submodularity: Interpreting Neural Networks on the Fly (2017)
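To make the perturbation idea above concrete, here is a minimal occlusion-sensitivity sketch: a patch is slid over the image and the resulting drop in the class score is recorded as evidence. This is not any of the listed methods themselves, just the simplest member of the family; the classifier is only queried, never inspected, and `predict_proba` is a hypothetical stand-in for any model that maps a batch of images to class probabilities.

```python
import numpy as np

def occlusion_saliency(image, predict_proba, target_class, patch=8, stride=8, fill=0.0):
    """Black-box saliency map: occlude image regions and measure how much the
    probability of `target_class` drops. Large drops mark important evidence.
    `predict_proba` maps a batch (N, H, W, C) to class probabilities (N, K)."""
    h, w = image.shape[:2]
    base = predict_proba(image[None])[0, target_class]
    heatmap = np.zeros((h, w))
    counts = np.zeros((h, w))
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill  # grey out one patch
            drop = base - predict_proba(occluded[None])[0, target_class]
            heatmap[y:y + patch, x:x + patch] += drop
            counts[y:y + patch, x:x + patch] += 1
    return heatmap / np.maximum(counts, 1)
```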
Topic 2: White-box Attention Indicators
The goal is the same (pointing out the evidence), but the machine learning algorithm is opened up and extended to facilitate the search. Often, a modified version of back-propagation is used to trace high neuron activations in the network's output layer (indicating strong evidence for a class) all the way back through the network to the input.
- Bach et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation (2015)
- Goyal et al. Towards Transparent AI Systems: Interpreting Visual Question Answering Models (2016)
- [Speaker: Christoph Schaller, 3.5.] Arras et al. "What is Relevant in a Text Document?": An Interpretable Machine Learning Approach (2016)
- [Speaker: Mohammad Gharaee, 26.4.] Selvaraju et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization (2017)
- [Speaker: Philipp Wimmer, 26.4.] Montavon et al. Methods for interpreting and understanding deep neural networks (2017)
- [Speaker: Florian Kleinicke] Kindermans et al. Learning how to explain neural networks: PatternNet and PatternAttribution (2018)
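As a minimal white-box counterpart, the sketch below computes a plain gradient saliency map: the class score is back-propagated to the input pixels, so large gradient magnitudes mark pixels the decision is sensitive to. The listed papers (LRP, Grad-CAM, PatternAttribution) refine exactly this backward pass with modified propagation rules; here, `model` is assumed to be a generic differentiable PyTorch classifier.

```python
import torch

def gradient_saliency(model, image, target_class):
    """White-box saliency: back-propagate the score of `target_class` through
    the network to the input pixels; the gradient magnitude indicates which
    pixels the prediction is most sensitive to.
    `model`: differentiable classifier mapping (N, C, H, W) -> (N, K) scores.
    `image`: a single input of shape (C, H, W)."""
    model.eval()
    x = image.unsqueeze(0).detach().clone()
    x.requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    # collapse the colour channels to obtain one (H, W) relevance map
    return x.grad.abs().max(dim=1)[0].squeeze(0)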
Topic 3: Feature Visualization
Feature visualization tries to make visible what a neural network has learned: What patterns cause high activations in the network's interior layers? What forms or objects do individual neurons specialize in? Is there a "grandmother neuron"?
- [Speaker: Johannes Vogt] Olah et al. Feature Visualization (2017)
- [Speaker: Pingchuan Ma] Bau et al. Network Dissection: Quantifying Interpretability of Deep Visual Representations (2017)
- [Speaker: Daniela Schacherer] Xu et al. Interpreting Deep Classifier by Visual Distillation of Dark Knowledge (2018)
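The core operation behind these papers is activation maximization: start from noise and run gradient ascent on the input until a chosen unit fires strongly. The bare-bones sketch below omits the regularizers (jitter, transformations, frequency priors) that make the visualizations of Olah et al. look natural; `model` and `layer` are assumed to be a PyTorch network and one of its modules.

```python
import torch

def activation_maximization(model, layer, unit, shape=(1, 3, 224, 224),
                            steps=200, lr=0.05):
    """Feature visualization by gradient ascent: optimize an input image so
    that `unit` in `layer` (a module of `model`) activates strongly.
    A forward hook captures the layer's activation on each pass."""
    acts = {}
    handle = layer.register_forward_hook(lambda module, inp, out: acts.update(out=out))
    x = torch.randn(shape, requires_grad=True)      # start from noise
    optimizer = torch.optim.Adam([x], lr=lr)
    model.eval()
    for _ in range(steps):
        optimizer.zero_grad()
        model(x)
        # maximize the mean activation of the chosen unit / feature map
        loss = -acts["out"][0, unit].mean()
        loss.backward()
        optimizer.step()
    handle.remove()
    return x.detach()
```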
Topic 4: Influential Training Examples
A training example is influential for a particular prediction if that prediction would change significantly had the example been missing from the training set. The result of this analysis reveals typical prior instances for the present situation, as well as typical counter-examples, thus providing an "explanation by analogy".
- [Speaker: Philipp de Sombre] Koh and Liang Understanding Black-box Predictions via Influence Functions (2017)
- [Speaker: Thorsten Wünsche] Anirudh et al. MARGIN: Uncovering Deep Neural Networks using Graph Signal Analysis (2018)
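The quantity of interest can be written down directly as brute-force leave-one-out retraining; Koh and Liang's influence functions approximate it with a Hessian-vector product so that no retraining is needed. The sketch below computes the exact (expensive) version for a small scikit-learn model purely to fix the definition; the choice of logistic regression and the class-label convention are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def loo_influence(X_train, y_train, x_test, y_test):
    """Brute-force leave-one-out influence: by how much does the loss on one
    test point change when training example i is removed and the model is
    retrained?  Influence functions estimate this quantity without retraining.
    Assumes integer class labels 0..K-1 so the predict_proba columns line up."""
    def test_loss(model):
        p = model.predict_proba(x_test.reshape(1, -1))[0, y_test]
        return -np.log(max(p, 1e-12))               # negative log-likelihood

    base = test_loss(LogisticRegression(max_iter=1000).fit(X_train, y_train))
    scores = []
    for i in range(len(X_train)):
        keep = np.arange(len(X_train)) != i
        model_i = LogisticRegression(max_iter=1000).fit(X_train[keep], y_train[keep])
        scores.append(test_loss(model_i) - base)    # > 0: example i was helpful
    return np.array(scores)
```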
Topic 5: Confidence Estimation
When a system is able to provide reliable self-diagnosis and to point out cases where the results should be ignored, it becomes much more trustworthy, even when it cannot explain its reasoning.
- Guo et al. On Calibration of Modern Neural Networks (2017)
- [Speaker: Sebastian Gruber] Kendall and Gal What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? (2017)
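A standard reliability check in this spirit is the expected calibration error discussed by Guo et al.: predictions are binned by confidence and, within each bin, the average confidence is compared with the empirical accuracy. A minimal NumPy version, with bin layout and weighting chosen for simplicity:

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Expected calibration error: bin predictions by their confidence and
    compare mean confidence with empirical accuracy inside each bin.
    probs: (N, K) predicted class probabilities, labels: (N,) true class indices."""
    confidence = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(confidence[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap              # weight by bin population
    return ece
```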
Topic 6: Reduction of Complex Models to Simpler Ones
Powerful methods like neural networks are largely incomprehensible to humans. However, one can use complex models to train simpler ones for special cases (e.g. in the neighborhood of an instance of interest, or for an important subproblem), which can then be understood.
- Meinshausen Node harvest (2010)
- Hinton et al. Distilling the knowledge in a neural network (2015)
- [Speaker: Julian Rodriquez] Frosst and Hinton Distilling a Neural Network Into a Soft Decision Tree (2017)
- Krishnan and Wu PALM: Machine Learning Explanations For Iterative Debugging (2017)
- Zhang et al. Interpreting CNNs via decision trees (2018)
- Gross et al. Hard Mixtures of Experts for Large Scale Weakly Supervised Vision (2017)
- Shazeer et al. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer (2017)
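A toy illustration of the reduction idea: fit a small, human-readable decision tree to the soft predictions of a complex teacher model rather than to the hard labels. This is only the generic distillation recipe with an off-the-shelf CART tree, not the soft decision tree of Frosst and Hinton; `teacher_predict_proba` is a hypothetical wrapper around the trained teacher.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def distill_to_tree(teacher_predict_proba, X, max_depth=4):
    """Model reduction by distillation: fit a small, inspectable decision tree
    to the *soft* class probabilities of a complex teacher, so the tree mimics
    the teacher's behaviour rather than the raw labels.
    `teacher_predict_proba` maps inputs (N, d) to class probabilities (N, K)."""
    soft_targets = teacher_predict_proba(X)
    student = DecisionTreeRegressor(max_depth=max_depth)
    student.fit(X, soft_targets)                    # multi-output regression
    return student
```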
Topic 7: Text/Caption/Rule/Program Generation
Textual descriptions and rule sets are easy to understand, so it makes sense to extract them from the implicit knowledge of a complex method. In an advanced form, neural networks perform program induction: they generate simple programs that can then be run to answer queries of interest.
- Lei et al. Rationalizing neural predictions (2016)
- [Speaker: Michael Aichmüller] Hendricks et al. Generating Visual Explanations (2016)
- Park et al. Attentive explanations: Justifying decisions and pointing to the evidence (2017)
- Vedantam et al. Context-aware Captions from Context-agnostic Supervision (2017)
- Ribeiro et al. Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance (2016)
- [Speaker: Frank Gabel] Reed et al. Generative Adversarial Text to Image Synthesis (2016)
- Yang et al. Scalable Bayesian Rule Lists (2017)
- [Speaker: Hannes Perrot] Johnson et al. Inferring and Executing Programs for Visual Reasoning (2017)
- Devlin et al. RobustFill: Neural Program Learning under Noisy I/O (2017)
Topic 8: Similarity Learning
Similarity is one of the most fundamental modes of explanation. However, it is very difficult to define the similarity between complex situations or entities in a way that conforms to human intuition. Simple criteria such as Euclidean distance don't work well, and more suitable metrics are best obtained by learning.
- [Speaker: Philipp Reiser] Rippel et al. Metric learning with adaptive density discrimination (2015)
- [Speaker: Jens Beyermann] Bautista et al. Deep Unsupervised Similarity Learning using Partially Ordered Sets (2017)
- [Speaker: Benedikt Kersjes] Zoran et al. Learning Deep Nearest Neighbor Representations Using Differentiable Boundary Trees (2017)
- Haeusser et al. Learning by Association - A versatile semi-supervised training method for neural networks (2017)
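The simplest learned-similarity objective is the triplet loss sketched below: embeddings of same-class pairs are pulled together while different-class pairs are pushed apart by a margin. The listed papers use more refined objectives (adaptive density discrimination, partial orders, boundary trees), but they all operate on the same kind of learned embedding space.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Learned similarity: embeddings of the anchor and a same-class example
    ("positive") are pulled together, while the anchor and a different-class
    example ("negative") are pushed apart by at least `margin`.
    All arguments are (N, d) batches of embeddings produced by the network."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```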
Topic 9: Learning Disentangled Representations
Many learning methods transform the raw data into an internal representation where the task of interest is easier to solve. When these hidden variables carry semantic meaning, their activations can serve as categories for explanation. Suitable modifications of the network architecture and/or training procedure encourage these variables to disentangle into meaningful concepts.
- [Speaker: Aliya Amirzhanova] Upchurch et al. Deep feature interpolation for image content changes (2016)
- [Speaker: Peter Huegel] Chen et al. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets (2016)
- Siddharth et al. Learning Disentangled Representations with Semi-Supervised Deep Generative Models (2017)
- Thomas et al. Independently Controllable Factors (2017)
- [Speaker: Carsten Lüth, 17.5.] Sabour et al. Dynamic routing between capsules (2017)
- [Speaker: Michael Dorkenwald, 17.5.] Hinton et al. Matrix capsules with EM routing (2018)
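Once such a representation is trained, disentanglement is usually inspected by latent traversal: vary one latent coordinate while keeping the rest fixed and decode; if the factor is disentangled, only a single semantic attribute (e.g. rotation or stroke thickness) should change. A small PyTorch sketch, assuming a hypothetical trained `generator` that maps latent codes to images:

```python
import torch

def latent_traversal(generator, z, dim, values):
    """Inspect a (hopefully) disentangled representation: vary one latent
    coordinate while keeping all others fixed and decode each code.
    `generator` maps latent codes (N, z_dim) to images; `z` is a single code
    of shape (z_dim,), `values` the settings to try for coordinate `dim`."""
    codes = z.detach().repeat(len(values), 1)       # (len(values), z_dim)
    codes[:, dim] = torch.tensor(values, dtype=codes.dtype)
    with torch.no_grad():
        return generator(codes)
```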
Topic 10: Learning with Higher Level Concepts and/or Causality
Even more meaningful descriptions can be achieved when learned latent attributes are connected to high-level abstractions such as object hierarchies or causal graphs. Recent work has shown very promising results in this direction.
- Al-Shedivat et al. Contextual Explanation Networks (2017)
- [Speaker: Nasim Rahaman] Lopez-Paz et al. Discovering Causal Signals in Images (2017)
- [Speaker: Stefan Radev] Kocaoglu et al. CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training (2017)
- Higgins et al. SCAN: learning abstract hierarchical compositional visual concepts (2017)
Topic 11: Application Perspectives
- [Speaker: Conrad Sachweh, 19.4.] Goodman and Flaxman European Union regulations on algorithmic decision-making and a "right to explanation" (2016)
- [Speaker: Conrad Sachweh, 19.4.] Escalante et al. Design of an explainable machine learning challenge for video interviews (2017)
- Goyal et al. Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering (2017)
- Miller Explanation in artificial intelligence: Insights from the social sciences (2017)
- Holzinger et al. What do we need to build explainable AI systems for the medical domain? (2017)
- [Speaker: Felix Feldmann, 19.4.] Narayanan et al. How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation (2018)
- [Speaker: Felix Feldmann, 19.4.] Poursabzi-Sangdeh et al. Measuring model interpretability (2018)