ICLR 2025
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
Pengxiang Li, Lu Yin, Shiwei Liu
Abstract
and the ELLIS-Max Planck Campus Tuebingen (Germany). My research focuses on understanding and leveraging low-dimensionality in machine learning, with particular interests in efficient training and inference for foundation models, reasoning and robustness in neural networks, and hardware-friendly learning algorithms.