Review: STN — Spatial Transformer Network (Image Classification) | by Sik-Ho Tsang | Towards Data Science

Review: STN — Spatial Transformer Network (Image Classification)

In this story, Spatial Transformer Network (STN), by Google DeepMind, is briefly reviewed. STN helps to crop out and scale-normalize the appropriate region, which can simplify the subsequent classification task and lead to better classification performance, as shown below:

(a) Input Image with Random Translation, Scale, Rotation, and Clutter, (b) STN Applied to Input Image, (c) Output of STN, (d) Classification Prediction

It was published in 2015 NIPS and has more than 1300 citations. Spatial transformations such as affine transformation and homography registration have been studied for decades. In this paper, however, the spatial transformation is handled by a neural network. With learning-based spatial transformation, the transformation applied is conditioned on the input or feature map. It is also closely related to another paper, “Deformable Convolutional Networks” (2017 ICCV), so I decided to read this one first. (Sik-Ho Tsang @ Medium)
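To make "crop out and scale-normalize" a bit more concrete, here is a minimal pure-Python sketch of resampling an image through a 2×3 affine matrix with bilinear interpolation. It is only an illustration, not the paper's implementation. The function names `bilinear_sample` and `affine_sample` are hypothetical, pixels outside the image are assumed to be zero, and `theta` is fixed by hand here, whereas in STN it would be predicted by a network from the input or feature map.

```python
import math


def bilinear_sample(image, px, py):
    """Read `image` (list of rows) at real-valued pixel position (px, py).

    Pixels outside the image are treated as zero (an assumption of this sketch).
    """
    in_h, in_w = len(image), len(image[0])
    x0, y0 = math.floor(px), math.floor(py)
    value = 0.0
    # Blend the four neighbouring pixels, weighted by their distance.
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            weight = max(0.0, 1 - abs(px - xx)) * max(0.0, 1 - abs(py - yy))
            if weight > 0 and 0 <= xx < in_w and 0 <= yy < in_h:
                value += weight * image[yy][xx]
    return value


def affine_sample(image, theta, out_h, out_w):
    """Resample `image` through the 2x3 affine matrix `theta`.

    Output coordinates are normalized to [-1, 1], mapped by `theta` to
    source coordinates (also in [-1, 1]), then read bilinearly.
    """
    in_h, in_w = len(image), len(image[0])
    output = []
    for i in range(out_h):
        y_t = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            x_t = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            # Affine map from output (target) to input (source) coordinates.
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            # Convert normalized source coordinates back to pixel indices.
            px = (x_s + 1) * (in_w - 1) / 2
            py = (y_s + 1) * (in_h - 1) / 2
            row.append(bilinear_sample(image, px, py))
        output.append(row)
    return output


# A 5x5 test image whose pixel value encodes its position (row * 5 + column).
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]

# Hand-picked theta: scale by 0.5 and shift by 0.5, i.e. zoom into the
# bottom-right region of the image.
crop_theta = [[0.5, 0.0, 0.5],
              [0.0, 0.5, 0.5]]
cropped = affine_sample(image, crop_theta, 3, 3)
```

With the identity matrix the output grid coincides with the input grid, so the image comes back unchanged. With `crop_theta`, the 3×3 output covers only the bottom-right part of the 5×5 input, which is the kind of crop-and-rescale that STN is meant to learn on its own.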