Statistical Methods for Network-Based Analysis of Genomic Data
Topic: Statistical Methods for Network-Based Analysis of Genomic Data
Speaker: Prof. Hongzhe LI (University of Pennsylvania School of Medicine)
Time: 2007-07-03, 13:00 - 14:00
Venue: Lecture Hall, 1st Floor, East Annex, Old Chemistry Building
A central problem in genomic research is the identification of genes and pathways involved in diseases and other biological processes. Many methods have been developed for identifying genes in a regression framework. The identified genes are then often linked to known biological pathways through gene set enrichment analysis in order to identify the pathways involved. However, most procedures for identifying biologically relevant genes do not use the known pathway information. In this talk, I present two network-based approaches for genomic data analysis: a pathway-based regression analysis that uses a group gradient descent boosting procedure to identify pathways, and a Markov random field (MRF)-based method for identifying genes and subnetworks that are related to diseases. Simulation studies indicate that the MRF-based method is effective in identifying disease-related genes and subnetworks, with higher sensitivity and lower false discovery rates than commonly used procedures that do not use the pathway structure information. Applications to two breast cancer microarray gene expression datasets identified several subnetworks within several KEGG transcriptional pathways that are related to breast cancer recurrence or breast cancer survival. Extensions to the analysis of time course gene expression data will also be discussed.