Friday 5 October 2018

Statistical Inference for Analysis of Massive Health Data: Challenges and Opportunities

Speaker: Professor Xihong Lin - Chair, Department of Biostatistics, Harvard T.H. Chan School of Public Health

Massive data from the genome, exposome, and phenome are becoming available at an increasing rate with no apparent end in sight. Examples include whole genome sequencing data, large-scale remote-sensing satellite air pollution data, digital phenotyping, and Electronic Medical Records (EMRs). The emerging field of health data science presents statisticians with many exciting research and training opportunities and challenges. Success in health data science requires strong statistical inference integrated with computer science, information science, and domain science. Key areas include signal detection, network analysis, integrative analysis of different types and sources of data, and incorporation of domain knowledge in the development of health data science methods. In this talk, I discuss some of these challenges and opportunities, and illustrate them using high-dimensional testing of dense and sparse signals for whole genome sequencing analysis, integrative analysis of different types and sources of data using causal mediation analysis, and analysis of multiple phenotypes (pleiotropy) using biobanks and EMRs.