dGTEx U01: Statistical methods for characterizing molecular mechanisms of human tissue development and disease

Status: Active
Start: 24/09/2024
End: 31/08/2027

Description

This project aims to uncover molecular processes in postnatal development, as well as genetic and nongenetic sources of inter-individual molecular variation, with the ultimate goal of better understanding human biology and disease. We will achieve these goals by developing novel analysis methods and applying them to multimodal Developmental GTEx (dGTEx) data types from both bulk and single-cell tissue analyses.

We have chosen to focus on the human dGTEx data, but a selected subset of the approaches proposed here will be applied to non-human primates as well. We have a strong track record in improving the gold standard for handling large and complex transcriptome and other molecular data sets, producing informative and widely used data resources, and developing statistical methods that benefit the wider community. As long-term collaborators and as leaders in the adult GTEx project as well as other consortia, we are exceptionally well positioned to contribute to the success of the broader dGTEx project through productive interactions within the consortium and with the broader research community.

The combination of the novel data types and biological questions of the dGTEx project and the next generation of analytical approaches that we propose offers exciting opportunities. The methodological innovations include novel approaches to transcriptome annotation and variation; methods for informative interpretation of multimodal data, including epigenomic and spatial data; methods that leverage the pseudo-chronological nature of dGTEx data; and genetic analysis with new predictive models and transfer learning. We will integrate dGTEx data with data from adult GTEx, ENCODE, GENCODE and ENTex, Human Cell Atlas, and rare and common disease associations. This project emphasizes integration of methods development with data analysis, which in our experience is the most fruitful approach to extracting biological insights from complex, emerging data types. While it is not the main focus of this project, we are well positioned to work closely with the dGTEx LDACCs on data processing and dissemination pipelines that are crucial to the success of our work. In this project, methods and analyses relevant to disease are distributed across the aims and subaims.