Objectives
To download the slides of the presentation, please click here.
Abstract: Deep neural networks are known to require large amounts of labelled data for training, and for many tasks this labelled data is expensive to obtain. In this talk I will address learning from unlabelled data in two scenarios. Firstly, a large class of problems in computer vision can be framed as image-to-image translation (e.g. image to semantic segmentation). I will show how such translation models can be trained for modalities (e.g. depth) for which no training labels are available. In addition, I will show how they can be used to generate synthetic data for domains with little data (FIR tracking). In the second part of the talk I will go into recent work showing that, for some applications, ranked training examples can be generated automatically. These rankings can be used as a self-supervised task that is added to the original network. This greatly enlarges the training set and therefore yields better feature representations. I will show the application of this idea to image quality assessment, where a quality measure needs to be assigned to an image, and to the crowd counting problem, where the goal is to estimate the number of people in an image.
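To make the ranking idea in the second part more concrete, here is a minimal Python sketch for the crowd counting case. It is an illustration under stated assumptions, not the speaker's implementation. The abstract does not say how the rankings are generated; the sketch assumes one plausible source, namely that a crop lying inside a larger crop cannot contain more people than the larger one. The function names `nested_crops`, `ranking_hinge_loss` and `ranking_loss_over_crops`, the `shrink` factor and the `margin` are all hypothetical choices made for the example.

```python
import random


def nested_crops(width, height, n_levels=3, shrink=0.75, rng=None):
    """Return boxes (x0, y0, x1, y1), largest first, each inside the previous one.

    Assumption for this sketch: a smaller box nested in a larger one can only
    contain the same number of people or fewer, which gives a ranking for free.
    """
    if rng is None:
        rng = random.Random(0)
    boxes = [(0, 0, width, height)]
    for _ in range(n_levels - 1):
        x0, y0, x1, y1 = boxes[-1]
        w = max(1, int((x1 - x0) * shrink))
        h = max(1, int((y1 - y0) * shrink))
        # Place the smaller box at a random position inside the current one.
        nx0 = rng.randint(x0, x1 - w)
        ny0 = rng.randint(y0, y1 - h)
        boxes.append((nx0, ny0, nx0 + w, ny0 + h))
    return boxes


def ranking_hinge_loss(score_high, score_low, margin=0.0):
    """Zero when the higher-ranked example scores at least `margin` above the other."""
    return max(0.0, score_low - score_high + margin)


def ranking_loss_over_crops(scores, margin=0.0):
    """Average pairwise hinge loss; scores[i] should rank above scores[j] for i < j."""
    pairs = [(i, j) for i in range(len(scores)) for j in range(i + 1, len(scores))]
    if not pairs:
        return 0.0
    total = sum(ranking_hinge_loss(scores[i], scores[j], margin) for i, j in pairs)
    return total / len(pairs)
```

In a training loop, `scores` would be the network's predicted counts for the crops returned by `nested_crops`, ordered from largest to smallest crop, and this ranking loss would be added to the usual supervised loss on the labelled images. No count labels are needed to compute it, which is what allows unlabelled images to be used.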

