Abstract
Drug discovery is a complex and costly process, often taking over a decade from target identification to FDA approval, with many candidates failing along the way. AI foundation models, applied to vast datasets of small molecules, proteins, and transcriptomic (or more broadly, omic) data, are transforming biomedical research by accelerating target identification, drug design, and testing. A promising and ambitious goal is to leverage these models to construct a virtual cell capable of simulating health and disease.
Two key challenges must be addressed to achieve this goal:
- Comprehensive molecular representation – While molecular graphs, images, and text are all essential for accurate modeling, previous work has typically focused on a single representation.
- Integration of diverse data modalities – Predicting complex biological interactions (e.g., antibody-protein binding) requires combining RNA, protein, and small molecule data.
This talk presents two complementary approaches to address these challenges:
- Multi-view Molecular Embedding with Late Fusion (MMELON) – Pre-trained on datasets of up to 200M molecules; embeddings from multiple molecular views are aggregated into a combined representation [1].
- Molecular Aligned Multi-Modal Architecture and Language (MAMMAL) – Trained on over 2B data points, integrating small molecules, proteins, and single-cell RNA-seq data [2].
Both approaches achieve state-of-the-art results in multi-modal drug discovery.
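As a rough illustration of the late-fusion idea behind MMELON, per-view embeddings (e.g. from graph, image, and text encoders) can be combined by a normalized weighted sum. This is a minimal, generic sketch, not the actual MMELON implementation; the function name, the fixed shared embedding dimension, and the toy one-hot embeddings are all assumptions for illustration.

```python
import numpy as np

def late_fusion(view_embeddings, weights=None):
    """Generic late fusion: combine per-view embeddings into one vector.

    view_embeddings: list of 1-D arrays, one per view (e.g. graph, image,
    text), assumed already projected to a shared dimension.
    weights: optional per-view importance scores; uniform if omitted.
    """
    E = np.stack(view_embeddings)            # (n_views, dim)
    if weights is None:
        weights = np.ones(len(view_embeddings))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize to sum to 1
    return w @ E                             # weighted sum -> (dim,)

# Toy example: three hypothetical view embeddings of dimension 4
graph_emb = np.array([1.0, 0.0, 0.0, 0.0])
image_emb = np.array([0.0, 1.0, 0.0, 0.0])
text_emb  = np.array([0.0, 0.0, 1.0, 0.0])
fused = late_fusion([graph_emb, image_emb, text_emb])
```

In practice the fusion weights would be learned (e.g. attention over views) rather than fixed, but the "encode each view separately, then aggregate" structure is the defining trait of late fusion.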
[1] Suryanarayanan, Parthasarathy, et al. "Multi-view biomedical foundation models for molecule-target and property prediction." arXiv preprint arXiv:2410.19704 (2024).
[2] Shoshan, Yoel, et al. "MAMMAL: Molecular Aligned Multi-Modal Architecture and Language." arXiv preprint arXiv:2410.22367 (2024).