Hoeffding Trees are an established method for classification; at the same time, gradient descent methods are becoming increasingly popular, owing in part to the successes of deep learning. We are investigating the benefits of using GPUs for data-stream learning due to their high scalability.
Summary
In this project we are studying new ways to adapt Neural Networks for real-time analysis. The main challenges we are facing are:
- Gradient descent methods have strong convergence guarantees but this convergence is slow. This method requires a large amount of samples in order to achieve good accuracy, and as a consequence, the training requires large amount of time.
- Neural Networks are trained to solve an specific task due to their high sensitivity to hyper-parameters configurations.
- Moving data efficiently to/from the GPU is a the key factor to meet the real-time analysis constrains.
Objectives
We are currently studying how prune NN hyper-parameters so they can become an effective ‘off-the-shelf’ data-streams solution. The ideal target we pursue is:
- Be able to make a classification at any time.
- Deal with a potentially infinite number of examples.
- Access each example in the stream just once.
- In a bounded pre-defined time.
- With limited amount of memory.