ICML 2024

Interaction-based Retrieval-augmented Diffusion Models for Protein-specific 3D Molecule Generation

Zhilin Huang, Ling Yang, Xiangxin Zhou, Chujun Qin, Yijie Yu, Xiawu Zheng, Zikun Zhou, Wentao Zhang, Yu Wang, Wenming Yang

Cited 18 times

Abstract

Generating ligand molecules that bind to specific protein targets via generative models holds substantial promise for advancing structure-based drug design. Existing methods generate molecules from scratch without reference or template ligands, which poses challenges in model optimization and may yield suboptimal outcomes. To address this problem, we propose an interaction-based retrieval-augmented 3D molecular diffusion model named IRDIFF to facilitate target-aware molecule generation. IRDIFF leverages a curated set of ligand references, i.e., those with desired properties such as high binding affinity, to steer the diffusion model towards synthesizing ligands that satisfy design criteria. Specifically, we design a geometric protein-molecule interaction network (PMINet) and pretrain it with binding affinity signals to: (i) retrieve target-aware ligand molecules with high binding affinity to serve as references, and (ii) incorporate essential protein-ligand binding structures for steering molecular diffusion generation with two effective augmentation mechanisms, i.e., retrieval augmentation and self augmentation. Empirical studies on the CrossDocked2020 dataset show that IRDIFF can generate molecules with more realistic 3D structures and achieve state-of-the-art binding affinities towards the protein targets, while maintaining proper molecular properties. The code and models are available at https://github.com/YangLing0818/IRDiff.
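The retrieval step (i) described in the abstract can be illustrated with a minimal sketch: score every candidate ligand against the target and keep the top-k as references. This is not the authors' implementation. The names `retrieve_references` and `toy_score` are hypothetical, and `toy_score` is a placeholder standing in for a learned affinity predictor such as PMINet, whose actual interface is not given in the abstract.

```python
from typing import Callable, List, Sequence, TypeVar

Ligand = TypeVar("Ligand")


def retrieve_references(
    target: str,
    candidates: Sequence[Ligand],
    score_fn: Callable[[str, Ligand], float],
    k: int = 3,
) -> List[Ligand]:
    """Return the k candidates with the highest predicted affinity to target.

    score_fn is an assumption: any callable mapping (target, ligand) to a
    score where higher means stronger predicted binding.
    """
    if k <= 0:
        return []
    # sorted() is stable, so ties keep their original candidate order.
    ranked = sorted(candidates, key=lambda lig: score_fn(target, lig), reverse=True)
    return list(ranked[:k])


def toy_score(target: str, ligand: str) -> float:
    """Placeholder score: number of distinct characters shared with the target.

    Purely illustrative; it has no chemical meaning.
    """
    return float(len(set(target) & set(ligand)))


if __name__ == "__main__":
    pool = ["AB", "XYZ", "ABC", "D"]
    print(retrieve_references("ABCD", pool, toy_score, k=2))
```

In a real pipeline the strings would be replaced by 3D protein pocket and ligand structures, and the retrieved references would then condition the diffusion model; how that conditioning is done is outside the scope of this sketch.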