Your Deep Learning Model Can Be Absolutely Certain and Really Wrong

Do deep generative models know what they don’t know?


Generative models are super exciting because they can learn a dataset's distribution and generate new samples from what they learn. Researchers from several institutions have used generative models to create brand-new images like the ones below (none of these people are real). In the case of Nalisnick et al., the authors build classifiers to distinguish objects using generative models.

This example of generated faces comes from here.

Nalisnick et al. show that deep generative models trained on a range of image datasets return strange results when applied to separate data sets.


The question of what a neural network knows is not a new one. We ask it all the time when we attempt to transfer a model into the wild. In the case of transfer learning, we know we are applying a model not necessarily as intended, but it's easier than starting all over (especially if data is a blocker). We hope our neural network has a warm start in the right direction, and we can reuse it with a minimal amount of tweaking. We tend to focus on use cases that share a reasonable amount of similarity.

Nalisnick et al. explore the question of what generative models are doing when they classify things they should not have seen. In a separate publication, Louizos & Welling provide an excellent example of this effect. They show that rotating an MNIST digit leads to the neural network predicting the wrong class with high probability. In effect, neural networks, like humans, can suffer from the Dunning-Kruger effect: the neural network does not know what it doesn't know.
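This kind of confident wrongness is easy to reproduce even in a toy setting. The sketch below is my own illustration, not the setup from either paper: a simple logistic-regression classifier trained on two well-separated 2-D clusters will assign near-certain probability to a point far from anything it has ever seen, because nothing in its training objective penalizes confidence away from the data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Two well-separated 2-D classes (toy stand-in for a trained classifier)
X = np.vstack([rng.normal([-2, 0], 0.5, (200, 2)),
               rng.normal([2, 0], 0.5, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
clf = LogisticRegression().fit(X, y)

# A point far away from BOTH training clusters
far_point = np.array([[40.0, 50.0]])
prob = clf.predict_proba(far_point).max()

# The model is nearly certain about a point it has never seen anything like
print(f"max class probability on far-away point: {prob:.4f}")
```

The linear decision function grows without bound as you move away from the boundary, so distance from the training data translates into *more* confidence, not less.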

Image taken from here.


Nalisnick et al. trained generative models on one dataset and then applied each model to an entirely different dataset. The key dataset comparisons:

  • FashionMNIST (images of clothing) vs. MNIST (images of numbers)
  • CelebA (images of celebrities) vs. SVHN (street view house numbers)
  • ImageNet (20,000 types of images) vs. CIFAR-10 / CIFAR-100 / SVHN

The authors “expect the models to assign a lower probability to this data because the models were not trained on it.”

Instead, introducing the stranger dataset to the model leads to a probability distribution with a higher mean likelihood.

Credit: Nalisnick et al.


This result goes against intuition. We would like to believe that as we transfer a model to a new and different dataset, the classification probabilities would remain meaningful. This publication shows that they are not just a little bit off: the probability distributions are substantially shifted, and the likelihoods are likely an artifact. The ImageNet vs. CIFAR comparison from above is especially troubling because I would have expected those datasets to be more relevant to each other. However, the shifted probability distribution is still very striking.
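A toy sketch of how this can happen (my own illustration, not the paper's deep models): a density model assigns its highest likelihoods near its mode, so an "unseen" dataset that happens to concentrate there, for example one with lower variance than the training data, can receive a *higher* average likelihood than the training distribution itself.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
d = 10

# "Training" data: broad distribution (a stand-in for, say, FashionMNIST)
train = rng.normal(0, 1.0, size=(5000, d))

# Fit the simplest possible density model: a single Gaussian
model = multivariate_normal(mean=train.mean(axis=0), cov=np.cov(train.T))

# Held-out in-distribution data vs. lower-variance "OOD" data near the mode
in_dist = rng.normal(0, 1.0, size=(1000, d))
ood = rng.normal(0, 0.3, size=(1000, d))

ll_in = model.logpdf(in_dist).mean()
ll_ood = model.logpdf(ood).mean()

# The OOD set gets the HIGHER average log-likelihood
print(f"in-dist mean log-likelihood: {ll_in:.2f}")
print(f"OOD     mean log-likelihood: {ll_ood:.2f}")
```

In high dimensions, typical samples from a distribution do not sit at its mode, so "high likelihood" and "looks like the training data" are not the same thing; this is one reason raw likelihoods make a poor out-of-distribution detector.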

Note that the authors examined several model families (Glow, RNVP, PixelCNNs, and VAEs) and observed similar behavior for all of them.


In the authors’ words:

We have shown that comparing the likelihoods of deep generative models alone cannot identify the training set or inputs like it. Therefore, we urge caution when using these models with out-of-training-distribution data or in unprotected user-facing systems.

When we move a model into the wild, we need to test our models with samples as unlike the original training set as possible. We should take these probability distributions with a grain of salt and think carefully about how they fit into our use case. Given the importance of these types of models, researchers will be examining how to fix these flaws in the coming years.

Written by

Angela Wilkins

I like science, machine learning, start-ups, venture capital, and technology.