Data Analysis

Computational and mathematical methods are indispensable for integrating the large data sets obtained in SMART-CARE and for deriving predictive models of the risk of tumor recurrence. In tumor genomics, readily generated data, including genomes/exomes, transcriptomes and methylomes, are already being used in clinical practice for patient stratification and personalized treatment. By comparison, proteomes, metabolomes and secretomes are expected to yield a more direct and faithful picture of functional cancer cell phenotypes for risk prediction, making computational analyses for identifying informative markers and molecular signatures a pivotal task. Reasonably mature workflows exist for the primary analysis of proteomics and (to a lesser extent) metabolomics data. However, integrating the various molecular data types with each other and with clinical data presents a fundamentally new challenge. To address this challenge, we propose a concerted computational biology work program that links sophisticated statistical analyses (Huber) with data integration through large-scale network analyses (Saez-Rodriguez) and the development of mechanistically based inference tools for metabolic and signaling pathway activities in tumor samples (Höfer).

Team Huber

EMBL

The Huber group develops and applies the statistical, computational and bioinformatic methods needed to analyze novel proteomic and metabolomic data types, to integrate them with genetic, transcriptomic and clinical data, and to gain new insights into tumor biology, vulnerabilities, resistance mechanisms, and recurrence. A particular focus lies on the development of innovative quality assessment (QA) and quality control (QC) methods that scale to the required data volumes and heterogeneity and that automate and objectify QA/QC as far as possible, a pivotal requirement for the use of proteomics and metabolomics in clinical research. The group integrates single-cell resolution multi-omics data (in particular CITE-Seq) with mass spectrometric proteomic and metabolomic data and with spatially resolved protein-level data from immunohistochemistry to better understand intra-tumor heterogeneity and its impact on variable drug responses in blood cancers. Together with the other computational and data science oriented groups (Saez-Rodriguez and Höfer), the Huber group develops procedures and techniques to integrate the different data modalities within the SMART-CARE project.

Team Saez-Rodriguez

University Hospital Heidelberg

Our team designs new approaches to exploit single- and multi-omic data. We mainly focus on finding new ways to systematically integrate prior knowledge from various sources with omic data to generate mechanistic hypotheses (see Kinact (Wirbel et al., 2018), OmniPath (Türei et al., 2016), COSMOS (Dugourd et al., 2021), ocEAn (Sciacovelli et al., 2022)). Kinact estimates kinase activity from phosphoproteomic data, OmniPath is a one-stop resource for protein interactions and annotations, and COSMOS generates mechanistic connections between phosphoproteomic and metabolomic data. Finally, ocEAn integrates metabolomic data with metabolic enzyme signatures. Such approaches allow us to find the context-dependent biological deregulation that drives complex diseases, guiding us toward novel and more accurate treatments.
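To make the idea of estimating kinase activity from phosphoproteomic data more concrete, the following is a minimal sketch of a generic substrate-based z-score: the mean log2 fold change of a kinase's known substrate sites is compared with the mean over all measured sites. It is not the Kinact implementation or its API. The function name, the site identifiers, the kinase-substrate mapping and the numbers are all hypothetical and chosen for illustration only; in practice the mapping would come from a prior-knowledge resource.

```python
from math import sqrt
from statistics import mean, pstdev


def kinase_activity_scores(site_log2fc, kinase_substrates, min_sites=2):
    """Score each kinase by a z-score of its substrates' fold changes.

    site_log2fc: dict mapping phosphosite ID -> log2 fold change.
    kinase_substrates: dict mapping kinase -> list of substrate site IDs
        (hypothetical prior knowledge, supplied by the caller).
    min_sites: kinases with fewer measured substrate sites are skipped.
    """
    all_values = list(site_log2fc.values())
    background_mean = mean(all_values)
    background_sd = pstdev(all_values)  # population SD over all sites

    scores = {}
    if background_sd == 0:
        return scores  # no variation in the data, nothing to score

    for kinase, sites in kinase_substrates.items():
        # Keep only substrate sites that were actually measured.
        measured = [site_log2fc[s] for s in sites if s in site_log2fc]
        if len(measured) < min_sites:
            continue
        scores[kinase] = (
            (mean(measured) - background_mean)
            * sqrt(len(measured))
            / background_sd
        )
    return scores


# Toy input (made-up values, not real measurements).
example_sites = {"A_S1": 2.0, "A_S2": 1.0, "B_T3": -1.0, "C_Y4": -2.0}
example_network = {
    "K1": ["A_S1", "A_S2"],
    "K2": ["B_T3", "C_Y4"],
    "K3": ["A_S1"],  # only one measured site, so it is skipped
}
example_scores = kinase_activity_scores(example_sites, example_network)
```

A positive score suggests that a kinase's substrates are more phosphorylated than the background, a negative score that they are less so. Published methods add steps this sketch omits, such as significance testing and weighting of substrate evidence.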

Team Höfer

DKFZ Heidelberg

We are developing mechanistically based mathematical models of signal transduction networks and metabolic pathways in cancer cells. To this end, we will integrate mass spectrometry data on the expression levels of proteins (signaling molecules such as kinases and phosphatases, or metabolic enzymes) with data on the outcome of their action (phosphorylation levels or metabolite concentrations). This work is expected to inform the search for functional molecular signatures that better predict the course of malignant disease and help choose appropriate therapeutic options. We collaborate closely with other computational biologists, in particular on the structural network analyses performed in the Saez-Rodriguez group.

Young Investigator Group Junyan Lu

University Hospital Heidelberg

Dr. Junyan Lu studied computational biology and drug design at the Shanghai Institute of Materia Medica, Chinese Academy of Sciences. He then joined Wolfgang Huber’s group at EMBL as a postdoctoral fellow and later became a staff scientist, focusing on advancing precision oncology of blood cancers through multi-omics data integration. In December 2021, Dr. Lu joined University Hospital Heidelberg and the SMART-CARE consortium as a junior research group leader. As a member of the SMART-CARE team, his group focuses on providing innovative and robust computational solutions for mining and integrating mass spectrometry data in order to identify biomarkers for cancer progression and recurrence. As a computational team embedded in a clinical setting, Dr. Lu’s group also works closely with physicians and clinician scientists to translate research outputs into better patient care. More information about the Lu group can be found on GitHub.

Yueyang Xie, Qianwu Liao, Caroline Lohoff, Shubham Agrawal, Junyan Lu, Shuo Wang (from left to right)