Neural networks for data-streams

Hoeffding Trees are an established method for classification; at the same time, gradient descent methods are becoming increasingly popular, owing in part to the successes of deep learning. Because GPUs scale well, we are investigating their benefits for data-stream learning.

Summary

In this project we are studying new ways to adapt Neural Networks for real-time analysis. The main challenges we are facing are:

Gradient descent methods have strong convergence guarantees, but convergence is slow. They require a large number of samples to achieve good accuracy and, as a consequence, training takes a long time.
Neural Networks are highly sensitive to their hyper-parameter configuration, so each network is usually tuned and trained to solve one specific task.
Moving data efficiently to and from the GPU is a key factor in meeting real-time analysis constraints.
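
One common way to reduce per-example transfer overhead is to buffer the stream into fixed-size mini-batches, so that each host-to-device copy carries many examples. This is offered only as an illustrative assumption, not as the project's method. The sketch below uses plain Python; `mini_batches` and `batch_size` are hypothetical names, and the actual device transfer is deliberately left out.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def mini_batches(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a (possibly unbounded) stream into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(stream)
    while True:
        # Pull the next batch_size items without materialising the whole stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # Assumption: in a real pipeline, one batch would be copied to the GPU here.
        yield batch


# Usage: ten examples grouped into batches of four (the last batch is shorter).
batches = list(mini_batches(range(10), batch_size=4))
print(batches)
```

The batch size is a trade-off: larger batches amortise the transfer cost over more examples, but they also delay the moment at which the model sees each example, which matters under a real-time constraint.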

Objectives

We are currently studying how to prune NN hyper-parameters so that Neural Networks can become an effective ‘off-the-shelf’ solution for data streams. The ideal target we pursue is to:

Be able to make a classification at any time.
Deal with a potentially infinite number of examples.
Access each example in the stream just once.
- Within a bounded, pre-defined time.
- With a limited amount of memory.
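
The requirements above can be illustrated with a minimal test-then-train (prequential) loop. The sketch below is an assumption-laden stand-in, not the project's model: it uses a tiny logistic regression updated by one stochastic gradient step per example, and every name in it (`OnlineLogisticRegression`, `learn_one`, `synthetic_stream`, `prequential_accuracy`) is hypothetical. Each example is seen once and then discarded, the model state has a fixed size, and a prediction can be requested at any point.

```python
import math
import random
from typing import Iterable, Iterator, List, Tuple

Example = Tuple[List[float], int]


class OnlineLogisticRegression:
    """Binary classifier updated by one SGD step per example (fixed-size state)."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x: List[float]) -> float:
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x: List[float]) -> int:
        # Available at any time, even before the first update.
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def learn_one(self, x: List[float], y: int) -> None:
        # Gradient of the log loss with respect to the pre-activation z.
        error = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.weights[i] -= self.learning_rate * error * xi
        self.bias -= self.learning_rate * error


def synthetic_stream(n_examples: int, seed: int = 0) -> Iterator[Example]:
    """Hypothetical stream: the label is 1 when the two features sum to > 0."""
    rng = random.Random(seed)
    for _ in range(n_examples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        yield x, 1 if x[0] + x[1] > 0 else 0


def prequential_accuracy(model: OnlineLogisticRegression,
                         stream: Iterable[Example]) -> float:
    correct = 0
    total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)  # test on the example first
        model.learn_one(x, y)                  # then train on it exactly once
        total += 1
    return correct / total if total else 0.0


model = OnlineLogisticRegression(n_features=2)
accuracy = prequential_accuracy(model, synthetic_stream(5000))
print(f"prequential accuracy: {accuracy:.3f}")
```

Because the stream is consumed through a generator, nothing here depends on its length, and the cost of each step is proportional to the number of features rather than to the number of examples seen so far.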