FOSDEM 2020

Efficient Model Selection for Deep Neural Networks on Massively Parallel Processing Databases

Track: HPC, Big Data, and Data Science devroom
Room: UB5.132
Day: Sunday
Start: 11:30
End: 11:55

In this session we will present an efficient way to train many deep learning model configurations at the same time with Greenplum, a free and open source massively parallel database based on PostgreSQL. The implementation involves distributing data to the workers that have GPUs available and hopping model state between those workers, without sacrificing reproducibility or accuracy. Then we apply optimization algorithms to generate and prune the set of model configurations to try.
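The hopping idea can be illustrated with a small scheduling sketch. This is not the MADlib implementation; the function name `hop_schedule` and the round-robin rotation are assumptions made for illustration, and the sketch further assumes there are no more model configurations than workers. Each worker keeps its data partition in place, and only the model state moves between workers from one sub-epoch to the next.

```python
def hop_schedule(num_models, num_workers):
    """Build a round-robin hopping schedule (illustrative assumption).

    schedule[s][w] is the index of the model trained on worker w during
    sub-epoch s, or None if that worker is idle. Data never moves; only
    model state hops from worker to worker.
    """
    if num_models > num_workers:
        raise ValueError("this sketch assumes num_models <= num_workers")

    schedule = []
    for sub_epoch in range(num_workers):
        row = [None] * num_workers
        for model in range(num_models):
            # Model m visits worker (m + s) mod W in sub-epoch s.
            row[(model + sub_epoch) % num_workers] = model
        schedule.append(row)
    return schedule


if __name__ == "__main__":
    for s, row in enumerate(hop_schedule(num_models=3, num_workers=4)):
        print(f"sub-epoch {s}: {row}")
```

After as many sub-epochs as there are workers, every model has trained on every data partition exactly once, and because the visiting order is fixed, a rerun follows the same order. That is one plausible reading of how hopping can avoid sacrificing reproducibility.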

Deep neural networks are revolutionizing many machine learning applications, but hundreds of trials may be needed to generate a good model architecture and associated hyperparameters. This is the challenge of model selection. It is time consuming and expensive, especially if you are only training one model at a time.

Massively parallel processing databases can have hundreds of workers. Can this parallel compute architecture be used to make model selection for deep nets faster and cheaper?

It’s possible!

We will demonstrate results from this project using a version of Hyperband, a well-known hyperparameter optimization algorithm, and the deep learning frameworks Keras and TensorFlow, all running on Greenplum database using Apache MADlib. Other topics will include architecture, scalability results, and opportunities for future work.
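For readers unfamiliar with Hyperband, the sketch below computes its bracket schedule using the sizing formulas from the original algorithm description. The talk uses "a version of" Hyperband, so this is only the textbook form and may differ from what was presented. The function name `hyperband_brackets` and the parameter names `max_resource` (for example, the maximum epochs per configuration) and `eta` (the pruning factor) are chosen for illustration.

```python
def hyperband_brackets(max_resource, eta=3):
    """Return the textbook Hyperband schedule as a dict.

    Maps bracket index s to a list of (num_configs, resource_per_config)
    rounds. Within a bracket, each round keeps roughly the best 1/eta of
    the configurations and gives each survivor eta times more resource.
    """
    # Largest s such that eta**s <= max_resource (integer-only, avoids
    # floating-point log rounding).
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = {}
    for s in range(s_max, -1, -1):
        # Initial number of configurations: ceil((s_max + 1) * eta**s / (s + 1)).
        n = -(-(s_max + 1) * eta ** s // (s + 1))
        # Initial resource per configuration.
        r = max_resource / eta ** s
        rounds = []
        for i in range(s + 1):
            n_i = n // eta ** i   # configurations still alive in round i
            r_i = r * eta ** i    # resource given to each of them
            rounds.append((n_i, r_i))
        brackets[s] = rounds
    return brackets


if __name__ == "__main__":
    for s, rounds in hyperband_brackets(max_resource=81, eta=3).items():
        print(f"bracket {s}: {rounds}")
```

The many configurations in the early, cheap rounds of each bracket are the kind of workload that can be trained side by side on a parallel database, which is the connection to the approach described in this abstract.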

We look forward to presenting this topic at FOSDEM’20!

Speakers

Frank McQuillan

Attachments

Efficient Model Selection for Deep Neural Networks on Massively Parallel Processing Databases (slides)

Brussels / 1 & 2 February 2020
