Efficient biomolecule modeling and drug discovery with large language models
ID: 69
Access: attendees only
Updated: 2025-03-25 14:42:39
Oral presentation
Abstract
Large language models can integrate and process large amounts of biomedical data, and therefore have great potential for modeling complex diseases and discovering functional biomolecules for potential therapeutics. In this talk, we first introduce models built on protein language models that efficiently discover remote homologs and functional biomolecules from nature, such as signal peptides. With these models, we can identify remote homologs 22 times faster than PSI-BLAST and discover diverse functional peptides with less than 20% sequence similarity to known ones. We then present an RNA language model that captures the relationship between RNA sequence and structure, enabling effective RNA structure prediction and inverse design. Within two months, we designed and experimentally validated 19 RNA aptamers that are structurally similar, yet dissimilar in sequence, to known light-up aptamers. More importantly, 10 of the designed aptamers show higher fluorescence than the native Mango-I. These projects demonstrate the great potential of large language models in advancing fundamental computational biology research and transformational development.