Data Analysis

Computational and mathematical methods are indispensable for integrating the large data sets obtained in SMART-CARE and for deriving predictive models of the risk of tumor recurrence. In tumor genomics, readily generated data, including genomes/exomes, transcriptomes and methylomes, are already being used in clinical practice for patient stratification and personalized treatment. By comparison, proteomes, metabolomes and secretomes are expected to yield a more direct and faithful picture of functional cancer cell phenotypes for risk prediction, which makes the computational identification of informative markers and molecular signatures a pivotal task. Reasonably mature workflows exist for the primary analysis of proteomics and (to a lesser extent) metabolomics data. However, integrating the various molecular data types with each other and with clinical data presents a fundamentally new challenge. To address this challenge, we propose a concerted computational biology work program that links sophisticated statistical analyses (Huber) with data integration through large-scale network analyses (Saez-Rodriguez) and the development of mechanistically based inference tools for metabolic and signaling pathway activities in tumor samples (Höfer).

Team Huber

EMBL

The Huber group develops and applies the statistical, computational and bioinformatic methods needed to analyze novel proteomic and metabolomic data types, to integrate them with genetic, transcriptomic and clinical data, and to gain new insights into tumor biology, vulnerabilities, resistance mechanisms, and recurrence. A particular focus lies on developing innovative quality assessment (QA) and quality control (QC) methods that scale to the required data volumes and heterogeneity and that make QA/QC as automated and objective as possible, a pivotal requirement for the use of proteomics and metabolomics in clinical research. The group integrates single-cell-resolution multi-omics data (in particular CITE-Seq) with mass spectrometric proteomic and metabolomic data and with spatially resolved protein-level data from immunohistochemistry to better understand intra-tumor heterogeneity and its impact on variable drug responses in blood cancers. Together with the other computational and data science groups (Saez-Rodriguez and Höfer), the Huber group develops procedures and techniques to integrate the different modalities within the SMART-CARE project.

Wolfgang Huber


Thomas Naake


Donnacha Fitzgerald


Team Saez-Rodriguez

University Hospital Heidelberg

Our team designs new approaches to exploit single- and multi-omic data. We mainly focus on finding new ways to systematically integrate prior knowledge from various sources with omic data to generate mechanistic hypotheses (see Kinact (Wirbel et al, 2018), OmniPath (Türei et al, 2016), COSMOS (Dugourd et al, 2021), ocEAn (TBA)). Kinact estimates kinase activity from phosphoproteomic data, OmniPath is a one-stop resource for protein interactions and annotations, and COSMOS generates mechanistic connections between phosphoproteomic and metabolomic data. Finally, ocEAn can integrate metabolomic data with metabolic enzyme signatures. Such approaches allow us to find the context-dependent biological deregulation that drives complex diseases, guiding us toward novel and more accurate treatments.

Team Höfer

DKFZ Heidelberg

We are developing mechanistically based mathematical models of signal transduction networks and metabolic pathways in cancer cells. To this end, we will integrate mass spectrometry data on the expression levels of proteins (signaling molecules such as kinases and phosphatases, or metabolic enzymes) with data on the outcome of their action (phosphorylation levels or metabolite concentrations). This work is expected to inform the search for functional molecular signatures that better predict the course of malignant disease and help choose appropriate therapeutic options. We collaborate closely with other computational biologists, in particular with the Saez-Rodriguez group on its structural network analyses.