Recurrent Model of Visual Attention Demo

Description

This demo trains a recurrent neural network model that extracts information from an image by adaptively selecting a sequence of regions or locations and processing only the selected regions at high resolution. We train the network on the MNIST (Modified National Institute of Standards and Technology) handwritten digits dataset and show the training process in your browser.

More precisely, we use mlpack's neural network architecture to show the training process of the so-called Glimpse layer.
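To make the idea of a glimpse concrete, here is a minimal sketch in plain Python of how a multi-resolution glimpse could be cut out of an image. It is not mlpack's implementation. The function name extract_glimpse, its parameters, the normalised coordinate convention, the zero padding, and the average pooling are all assumptions chosen for illustration.

```python
def extract_glimpse(image, center, size=8, scales=2):
    """Return `scales` square patches of side `size` centred on `center`.

    image  : list of rows (lists of numbers), e.g. a 28x28 MNIST digit
    center : (row, col) in normalised coordinates, where (-1, -1) is the
             top-left corner and (1, 1) the bottom-right corner (assumed)
    Patch k covers size * 2**k pixels per side and is average-pooled back
    to size x size, so wider patches are coarser.
    """
    height, width = len(image), len(image[0])
    # Map normalised coordinates to pixel indices.
    cy = int(round((center[0] + 1) / 2 * (height - 1)))
    cx = int(round((center[1] + 1) / 2 * (width - 1)))

    def pixel(r, c):
        # Pixels outside the image are treated as zero (assumed padding).
        if 0 <= r < height and 0 <= c < width:
            return image[r][c]
        return 0.0

    patches = []
    for k in range(scales):
        factor = 2 ** k           # pooling factor for this scale
        span = size * factor      # side length of the raw crop
        top, left = cy - span // 2, cx - span // 2
        patch = []
        for i in range(size):
            row = []
            for j in range(size):
                # Average the factor x factor block that maps to cell (i, j).
                total = 0.0
                for di in range(factor):
                    for dj in range(factor):
                        total += pixel(top + i * factor + di,
                                       left + j * factor + dj)
                row.append(total / (factor * factor))
            patch.append(row)
        patches.append(patch)
    return patches


# Example: two 8x8 patches around the centre of a synthetic 28x28 image.
image = [[float(r * 28 + c) for c in range(28)] for r in range(28)]
patches = extract_glimpse(image, center=(0.0, 0.0), size=8, scales=2)
```

The first patch keeps full resolution over a small window, while the second covers four times the area at half the resolution. In this sketch that is how the selected region stays sharp while the surrounding context stays cheap to process.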

Glimpse Layer
Glimpse Layer Output
Confusion matrix (every cell reads 0 in this snapshot): rows are the actual digits (act 0 to act 9) and columns are the predicted digits (pred 0 to pred 9). The final column reports the true positive rate (TPR) for each actual digit, the final row reports the positive predictive value (PPV) for each predicted digit, and the remaining corner cell reports the overall accuracy (ACC).
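The TPR, PPV and ACC entries of a confusion matrix follow directly from its counts. The sketch below shows one way to compute them. The function name confusion_metrics is hypothetical, and reporting 0.0 for an empty row or column is an assumption made here to avoid dividing by zero, not something the demo documents.

```python
def confusion_metrics(matrix):
    """Compute per-class TPR, per-class PPV and overall accuracy.

    matrix[a][p] counts the samples whose actual class is a and whose
    predicted class is p.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    def ratio(num, den):
        # Assumed convention: an empty row or column yields 0.0.
        return num / den if den else 0.0

    # TPR: correct predictions divided by the row sum (all actual class i).
    tpr = [ratio(matrix[i][i], sum(matrix[i])) for i in range(n)]
    # PPV: correct predictions divided by the column sum (all predicted class i).
    ppv = [ratio(matrix[i][i], sum(row[i] for row in matrix)) for i in range(n)]
    # ACC: diagonal sum divided by the total number of samples.
    acc = ratio(correct, total)
    return tpr, ppv, acc


# Example with two classes: 8 + 9 correct out of 20 samples.
tpr, ppv, acc = confusion_metrics([[8, 2], [1, 9]])
```

For the two-class example, tpr is [0.8, 0.9] and acc is 0.85. An all-zero matrix yields zeros everywhere under the convention assumed above.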
Confusion Samples