The Patient as a Multimodal Story: A Project Building Multimodal AI for Medicine
Benedikt Peterson, Charité - Universitätsmedizin Berlin, Center of Digital Health - BIH@Charité
Contents
Patients generate multimodal clinical data (imaging, signals, reports, genomics), but jointly leveraging these sources to support clinical decisions and improve patient outcomes has long been limited by computing and architecture constraints. Recent advances in multimodal AI/LLMs and GPU computing make it feasible to build models that synthesize heterogeneous evidence. Together we will explore the current limitations of multimodal models and develop solutions that go beyond them. This research tutorial centers on large-scale contrastive learning in the spirit of SigLIP but beyond two modalities, as in Symile (https://arxiv.org/abs/2411.01053), GRAM (https://arxiv.org/abs/2412.11959), and TRIANGLE (https://arxiv.org/abs/2509.24734); a toy sketch of such an objective follows below.

Target group: advanced BSc/MSc students in CS or related fields.
Prerequisites: strong Python and deep learning basics (PyTorch preferred); familiarity with representation/contrastive learning is a plus.
Outcome: a publishable paper.
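
To make the centerpiece concrete, here is a minimal, illustrative PyTorch sketch of a contrastive objective over three modalities, loosely following Symile's multilinear inner product. The function name, the batching scheme, and the in-batch negative sampling are simplifying assumptions for illustration, not the tutorial's or the cited papers' reference implementations.

import torch
import torch.nn.functional as F

def three_way_contrastive_loss(za, zb, zc, temperature=0.07):
    # za, zb, zc: (N, D) L2-normalized embeddings of N paired samples
    # from three modalities (e.g., image, report, signal).
    # A triple is scored with the multilinear inner product
    # sum_d a_d * b_d * c_d (as in Symile); the aligned triple (i, i, i)
    # is the positive, and swapping one modality within the batch
    # yields negatives (a simplification of the full O(N^3) objective).
    N = za.shape[0]
    # Entry (i, j) of logits_c is the score of the triple (za_i, zb_i, zc_j);
    # analogously for logits_b and logits_a.
    logits_c = (za * zb) @ zc.T / temperature
    logits_b = (za * zc) @ zb.T / temperature
    logits_a = (zb * zc) @ za.T / temperature
    labels = torch.arange(N, device=za.device)
    return (F.cross_entropy(logits_a, labels)
            + F.cross_entropy(logits_b, labels)
            + F.cross_entropy(logits_c, labels)) / 3

# Toy usage with random embeddings standing in for encoder outputs:
N, D = 32, 256
za, zb, zc = (F.normalize(torch.randn(N, D), dim=-1) for _ in range(3))
loss = three_way_contrastive_loss(za, zb, zc)

Note that SigLIP itself replaces the softmax cross-entropy above with a pairwise sigmoid loss; how best to carry that idea, or volume-based scores like GRAM's, beyond two modalities is exactly the kind of design question this tutorial will examine.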
Contact
benedikt.peterson@fu-berlin.de
