Impact of Input Dataset Size and Fine-tuning on Faster R-CNN with Transfer Learning

Name
Wei Zheng
Abstract
Deep learning models are widely used for machine learning tasks such as object detection. A lack of training data is a common hindrance in many industrial applications, where the cost of data gathering and annotation, together with insufficient computational resources, often limits the financial feasibility of deep learning solutions. Transfer learning is one answer to this challenge: it exploits what a model has learned from data in a domain different from that of the target dataset, and in two-stage object detection pipelines it has typically been applied to the backbone network. In this work, we investigate the relationship between input dataset size and the proportion of trainable layers in the backbone. In particular, we present findings for Faster R-CNN ResNet-50 FPN, a state-of-the-art object detection model, evaluated on MS COCO, a benchmark dataset. Our experiments indicate that, although a model generally performs better when more layers are fine-tuned on the training data, this advantage diminishes as the input dataset becomes smaller, and unfreezing too many layers can even lead to severe overfitting. Choosing the right number of layers to freeze when applying transfer learning therefore not only allows the model to reach its best possible performance but also saves computational resources and training time. Additionally, we explore how the effect of learning rate decay depends on input dataset size and discuss the advantage of using pre-trained weights over training a network from scratch.
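The setup described in the abstract can be illustrated with torchvision's Faster R-CNN ResNet-50 FPN, which exposes a trainable_backbone_layers argument for choosing how much of the backbone to fine-tune. The snippet below is a minimal sketch under that assumption, not the thesis' actual experimental code; the concrete hyperparameter values (number of trainable layers, learning rate, decay schedule) are illustrative placeholders.

```python
# Minimal sketch: pre-trained Faster R-CNN ResNet-50 FPN with a configurable
# number of trainable backbone layers, plus step learning-rate decay.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# trainable_backbone_layers ranges from 0 (backbone fully frozen) to 5
# (backbone fully fine-tuned); the thesis varies this proportion.
model = fasterrcnn_resnet50_fpn(
    weights="DEFAULT",             # start from pre-trained weights (torchvision >= 0.13)
    trainable_backbone_layers=3,   # example value; tuned per dataset size in practice
)

# Only parameters left unfrozen by the setting above require gradients,
# so frozen layers add no optimizer state or gradient computation.
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005, momentum=0.9, weight_decay=0.0005)

# Learning rate decay, whose benefit the thesis relates to input dataset size.
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
```

With trainable_backbone_layers=0 only the FPN and detection heads are updated, which is the cheapest configuration; increasing the value fine-tunes progressively deeper ResNet stages at a higher computational cost.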
Graduation Thesis language
English
Graduation Thesis type
Master - Computer Science
Supervisor(s)
Victor Henrique Cabral Pinheiro, Tomas Björklund
Defence year
2023
 