TS-EUROTRAIN - Projects

Developing algorithmic prediction models for TS and related disorders (WP2)

Objectives: 1. Develop and apply algorithms that scan genome-wide datasets comprising hundreds of thousands of genetic markers and thousands of individuals, in order to identify subsets of loci that carry the signature of the disease under study. 2. Develop tools for estimating genetic risk for neurodevelopmental disorders. 3. Explore the biological significance of the selected SNPs, and the pathways associated with them, using gene ontology tools.

Fellow ESR7
Host institution: Democritus University of Thrace
Duration: 36 months    
Supervisor: P. Paschou, DUTH; Co-supervisor: P. Drineas, RPI (Associated partner); Secondment: RUNMC, SENSA

Tasks and methodology: To achieve these goals, we will leverage Linear Discriminant Analysis (LDA), a powerful dimensionality reduction technique, and design statistical tests that analyze all the SNP genotypes from a GWAS simultaneously, thereby identifying the subset of SNPs most strongly associated with, or predictive of, disease risk. Such techniques promise improved performance over single-SNP tests but, to date, have seemed infeasible because of their enormous computational requirements. Our statistical test will be applied both to simulated data, in order to explore its potential, and to the real GWAS data available through the resources of this ITN (e.g. the EMTICS GWAS dataset). We will also explore the biological significance of the selected SNPs, and the pathways associated with them, using gene ontology tools.
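To make the idea of analyzing all markers jointly more concrete, the following is a minimal sketch of two-class Fisher LDA on a small simulated genotype matrix. It is an illustration under stated assumptions, not the project's software: the function names, the toy allele frequencies, and in particular the ridge term added to the within-class scatter matrix (needed here so the matrix stays invertible) are all hypothetical choices that do not come from the project description.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def pooled_scatter(groups, ridge):
    """Pooled within-class scatter matrix plus ridge * I (assumed regulariser)."""
    p = len(groups[0][0])
    s = [[ridge if i == j else 0.0 for j in range(p)] for i in range(p)]
    for rows in groups:
        mu = mean_vector(rows)
        for row in rows:
            d = [x - m for x, m in zip(row, mu)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
    return s

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def lda_axis(cases, controls, ridge=1.0):
    """Discriminant axis w = (S_w + ridge * I)^-1 (mean_cases - mean_controls)."""
    s_w = pooled_scatter([cases, controls], ridge)
    diff = [a - b for a, b in zip(mean_vector(cases), mean_vector(controls))]
    return solve(s_w, diff)

def project(rows, w):
    """Project each individual's genotype vector onto the axis w."""
    return [sum(x * wi for x, wi in zip(row, w)) for row in rows]

def simulate(n, allele_freqs, rng):
    """Draw n genotype vectors coded 0/1/2 from per-marker allele frequencies."""
    return [[(rng.random() < f) + (rng.random() < f) for f in allele_freqs]
            for _ in range(n)]

# Toy data (hypothetical): 20 markers, of which markers 0 and 1 differ between groups.
rng = random.Random(0)
case_freqs = [0.6 if j in (0, 1) else 0.3 for j in range(20)]
control_freqs = [0.3] * 20
cases = simulate(40, case_freqs, rng)
controls = simulate(40, control_freqs, rng)

w = lda_axis(cases, controls)
case_scores = project(cases, w)
control_scores = project(controls, w)
```

With a positive ridge term the regularised scatter matrix is positive definite, so the mean projected score of the cases always lies above that of the controls whenever the two group means differ. At genome-wide scale the dense matrix operations used here would not be practical, which is the computational obstacle the paragraph above refers to.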

  • Task 1: Develop and make publicly available software implementing LDA.
  • Task 2: Develop and make publicly available software that extracts genomic markers (SNPs) that are highly correlated with the discriminative axes returned by LDA.
  • Task 3: Test the LDA and SNP selection software on the simulated data described above. A preliminary evaluation of our methodology on simulated data indicated that LDA returns one axis that perfectly separates cases and controls when 250 cases and 250 controls are genotyped for 50,000 markers (data not shown).
  • Task 4: Test the LDA and SNP selection software on the real GWAS datasets in order to extract a subset of SNPs that are associated with, and potentially predictive of, the disease.
  • Task 5: Ascertain the biological significance of the associated SNPs, using gene ontology tools (such as PANTHER).
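The SNP selection step in Task 2 can be illustrated with a short sketch. It assumes that "highly correlated" is measured by the absolute Pearson correlation between each marker's genotype column and the individuals' scores along one discriminative axis; that choice, the function names, and the toy data below are hypothetical and are not taken from the project description.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant axis: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)

def select_snps(genotypes, scores, top_k):
    """Return indices of the top_k markers ranked by |correlation| with the axis scores."""
    columns = list(zip(*genotypes))  # one tuple of genotypes per marker
    ranked = sorted(range(len(columns)),
                    key=lambda j: abs(pearson(columns[j], scores)),
                    reverse=True)
    return ranked[:top_k]

# Toy example (hypothetical): 6 individuals, 4 markers coded 0/1/2.
genotypes = [
    [0, 1, 0, 2],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [2, 1, 1, 2],
    [1, 1, 2, 0],
    [0, 1, 2, 1],
]
# Scores of the same 6 individuals along a discriminative axis (made-up values).
scores = [-1.0, -1.0, 0.0, 0.0, 1.0, 1.0]

selected = select_snps(genotypes, scores, top_k=2)
```

In this toy matrix the third marker (index 2) tracks the scores exactly and is therefore ranked first, while the constant second marker (index 1) carries no information and gets a correlation of zero by convention.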

 

