NeurIPS 2020

Look-ahead Meta Learning for Continual Learning

Gunshi Gupta, Karmesh Yadav, Liam Paull

74 citations

Abstract

The Continual Learning (CL) problem involves performing well on a sequence of tasks under limited compute. Current algorithms in the domain are either slow, offline, or sensitive to hyper-parameters. La-MAML, an optimization-based meta-learning algorithm, claims to outperform other replay-based, prior-based, and meta-learning-based approaches.

Scope of Reproducibility

According to the MER paper [2], the metrics used to measure performance in continual learning are Retained Accuracy (RA) and Backward Transfer-Interference (BTI). La-MAML [1] claims to perform better, be more robust, and run faster than the existing baselines. These are the main claims of the paper.

Methodology

We used the authors' code, which was recent and built on up-to-date packages. Most of the experiments were run on free Kaggle Notebooks (Tesla P100 GPU). We ran the code with the hyperparameters given in the original paper and found that the results were very similar to the ones reported there.

Results

We reproduced the Retained Accuracy on real-world datasets to within 6% of the reported value, which supports the paper's conclusion that La-MAML outperforms the baselines.

What was easy

Running the code was easy. The packages used by the official implementation were the latest versions. It was also easy to incorporate Weights and Biases into the implementation.

What was difficult

For some of the experiments, the computational requirements were too high. For example, the MNIST Many Permutations dataset requires more than 12GB of RAM to load into the data loader. Further, some other experiments exceeded 12 hours of running time, so we had to run them on less powerful GPUs.

Communication with original authors

For most of the experiments concerning the main claims of the paper, the code in the official repository provided by the authors on GitHub was sufficient.
However, reproducing some of the figures and tables involving the Gradient Alignment and Catastrophic Forgetting visualizations proved difficult because those parts of the code had not been published. We contacted the authors and received help with those experiments.
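
To make the two metrics named under Scope of Reproducibility concrete, the following is a minimal sketch of how RA and BTI could be computed from a matrix of per-task accuracies. It assumes the definitions usually attributed to the MER paper [2]: RA is the average accuracy over all tasks at the end of training, and BTI is the average change in a task's accuracy between the moment it was learned and the end of training. The function names, the layout of `acc`, and the toy numbers are illustrative assumptions and are not taken from the La-MAML code base.

```python
from typing import Sequence

# Assumed layout: acc[i][j] is the test accuracy on task j, measured
# right after training on task i (tasks are seen in order 0, 1, ..., T-1).


def retained_accuracy(acc: Sequence[Sequence[float]]) -> float:
    """Average accuracy over all tasks after training on the last task."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def backward_transfer_interference(acc: Sequence[Sequence[float]]) -> float:
    """Average change in each task's accuracy from just after it was
    learned (acc[j][j]) to the end of training (acc[-1][j]).
    Negative values indicate forgetting."""
    num_tasks = len(acc)
    deltas = [acc[-1][j] - acc[j][j] for j in range(num_tasks)]
    return sum(deltas) / num_tasks


# Toy accuracy matrix for three tasks (made-up numbers, for illustration only).
toy_acc = [
    [0.90, 0.10, 0.10],
    [0.80, 0.85, 0.10],
    [0.75, 0.80, 0.95],
]

if __name__ == "__main__":
    print(f"RA  = {retained_accuracy(toy_acc):.4f}")
    print(f"BTI = {backward_transfer_interference(toy_acc):.4f}")
```

Under these assumed definitions, a BTI close to zero means that earlier tasks were retained almost fully, while a strongly negative BTI signals catastrophic forgetting.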