ICLR 2025

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

Pengxiang Li, Lu Yin, Shiwei Liu

Abstract

and the ELLIS-Max Planck Campus Tuebingen (Germany). My research focuses on understanding and leveraging low-dimensionality in machine learning, with particular interests in efficient training and inference for foundation models, reasoning and robustness in neural networks, and hardware-friendly learning algorithms.