Analysis of Efficient Neural Architecture Search via Parameter Sharing

Prabhant Singh
Deep learning based approaches have improved state-of-the-art performance in various tasks such as language modeling, computer vision, object recognition, and image segmentation. Every task in deep learning requires a custom architecture tailored specifically for that task. This has resulted in a high demand for deep learning domain experts who can craft novel architectures. With the cost of domain experts rising and computational expenses falling, automating neural architecture design is considered an alternative.
The concept of neural architecture search has been introduced to tackle this problem. Neural architecture search can be considered a subset of the automated machine learning (AutoML) domain.
In this thesis, we have looked at a state-of-the-art neural architecture search technique, "Efficient Neural Architecture Search via Parameter Sharing" (ENAS). ENAS was introduced by Google Brain and was a major improvement over its predecessor, "Neural Architecture Search with Reinforcement Learning" (NAS). ENAS uses a controller to sample architectures from a search space, which are later selected based on the measure defined by the ENAS performance estimation strategy. Due to the impressive performance of ENAS, there has been research into applying ENAS and similar parameter sharing techniques in critical areas like medicine and diagnostics. The motivation behind this thesis is to speed up ENAS and analyze its learning behavior.
In this work, we have analyzed the learning process of ENAS, evaluated the ENAS performance estimation strategy, and applied transfer learning to the ENAS controller. Through various experiments, we found that the sampled architectures do not improve as the ENAS controller is trained. We conclude that training the ENAS controller is not necessary and discuss the limitations of the ENAS performance estimation strategy.
Graduation Thesis language
Graduation Thesis type
Master - Computer Science
Tobias Jacobs, Meelis Kull
Defence year