Your Deep Learning Model Can Be Absolutely Certain and Really Wrong

Quick Summary

Generative models are super interesting because they can learn a data distribution and generate new samples from what they learn. Researchers from a number of institutions have used generative models to create brand-new images like the ones below (none of these people are real). In the case of Nalisnick et al., the authors build classifiers to distinguish objects using generative models.

This example of generated faces comes from here.

Nalisnick et al. show that deep generative models trained on a range of image datasets return strange results when applied to unrelated datasets.

Key Idea

The question of what a neural network knows is not a new one. We ask it all the time when we attempt to transfer a model into the wild. In the case of transfer learning, we know we are applying a model not necessarily as intended, but it’s easier than starting all over (especially if data is a blocker). We hope our neural network has a warm start in the right direction and that we can reuse it with a minimal amount of tweaking. We tend to focus on use cases which share a reasonable amount of similarity.

Nalisnick et al. explore the question of what exactly generative models are doing when they classify data they were never meant to see. In a separate publication, Louizos & Welling provide a nice example of this effect: they show that rotating an MNIST digit leads a neural network to predict the wrong class with high probability. So essentially, neural networks, like humans, can suffer from the Dunning-Kruger effect: the network does not know what it doesn’t know.
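Part of why this happens is mechanical: a softmax layer always outputs a valid probability distribution over the classes, so the network reports *some* confident answer no matter what you feed it. Here is a minimal sketch of that point, using a toy linear "classifier" with arbitrary random weights (a hypothetical stand-in for any trained network's final layer, not a model from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax: shift by the max before exponentiating.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# A toy linear "classifier" with random weights -- a hypothetical
# stand-in for a trained network's final layer, for illustration only.
W = rng.normal(size=(10, 784))

# Pure noise: an input no classifier was ever meant to see.
noise_image = rng.normal(size=784)

probs = softmax(W @ noise_image)

# Softmax always produces a valid distribution over the 10 classes,
# so the model still names a "most likely" class for meaningless input.
print(probs.argmax(), probs.max())
```

The point is not that the predicted class is wrong (it is noise, so every class is wrong), but that the output gives no signal that the input is garbage.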

Image taken from here.

The Findings (The Numbers)

Nalisnick et al. trained generative models on one dataset and then applied each model to an orthogonal dataset. The key dataset comparisons:

  • FashionMNIST (images of clothing) vs MNIST (images of numbers)
  • CelebA (images of celebrities) vs SVHN (street view house numbers)
  • ImageNet (20,000 types of images) vs CIFAR-10 / CIFAR-100 / SVHN

The authors “expect the models to assign a lower probability to this data because they were not trained on it”.

Instead, when the stranger dataset is introduced to the model, it produces a probability distribution with a higher mean likelihood.

Credit: Nalisnick et al.
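A tiny toy model can reproduce the flavor of this result. Below is a sketch (with hypothetical one-dimensional data, not the paper's models or datasets): we fit a maximum-likelihood Gaussian to a broad "training" distribution, then evaluate it on a narrower distribution centered in the same place. The never-seen data scores a *higher* average likelihood, simply because it sits closer to the density's peak:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical 1-D stand-ins: the "training" set is broad, the "OOD"
# set is narrower but centered in the same place -- a rough analogue
# of the variance relationship between datasets like CIFAR-10 and SVHN.
train = rng.normal(loc=0.0, scale=2.0, size=10_000)
ood = rng.normal(loc=0.0, scale=0.5, size=10_000)

# "Train" a maximum-likelihood Gaussian density model on the training set.
mu, sigma = train.mean(), train.std()

def mean_log_likelihood(x, mu, sigma):
    # Average Gaussian log-density of the samples in x.
    return np.mean(-0.5 * np.log(2 * np.pi * sigma**2)
                   - (x - mu) ** 2 / (2 * sigma**2))

ll_train = mean_log_likelihood(train, mu, sigma)
ll_ood = mean_log_likelihood(ood, mu, sigma)

print(ll_train, ll_ood)  # the never-seen OOD set scores higher
```

Deep generative models are far richer than a single Gaussian, but this illustrates how "high likelihood" and "looks like the training data" can come apart.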

Why Does It Matter?

This result goes against intuition. We would like to believe that as we transfer a model to a new and different dataset, the classification probabilities remain meaningful. This publication shows that not only are they a little bit off, the probability distributions are shifted enough to be likely an artifact. The ImageNet comparison above is especially troubling: I would have expected ImageNet and the CIFAR sets to be more relevant to each other, yet the shifted probability distribution is still very striking.

Note: the authors examined a number of model families (Glow, RealNVP, PixelCNN, and VAEs) and observed similar behavior for all of them.

What Does This Mean For My Life?

In the authors’ words:

We have shown that comparing the likelihoods of deep generative models alone cannot identify the training set or inputs like it. Therefore, we urge caution when using these models with out-of-training-distribution inputs or in unprotected user-facing systems.

When we move a model into the wild, we need to test it with samples unlike the original training set as much as possible. We should take these probability distributions with a grain of salt and think carefully about how they fit into our use case. Given the importance of these types of models, researchers in the coming years will be examining how to fix these flaws.
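As a crude illustration of that advice, here is one simple sanity check (a hypothetical heuristic, not the paper's method): given per-sample log-likelihoods from some trained model on its own training set, flag any new input whose log-likelihood falls outside the typical training range in *either* direction, since an unusually high score is as suspicious as an unusually low one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample log-likelihoods from some trained model,
# simulated here for illustration.
train_ll = rng.normal(loc=-2.1, scale=0.3, size=5_000)

# Typical range of training log-likelihoods (central 99%).
lo, hi = np.quantile(train_ll, [0.005, 0.995])

def looks_out_of_distribution(ll, lo=lo, hi=hi):
    # Flag scores far from typical training values in either direction.
    return not (lo <= ll <= hi)

print(looks_out_of_distribution(-2.0))  # within the typical range
print(looks_out_of_distribution(-0.2))  # unusually *high* likelihood
```

This kind of two-sided check is deliberately simple; the paper's takeaway is that a one-sided "low likelihood means OOD" rule is exactly what fails.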