Computer Science Colloquium

Date/Time: Thursday, 10 Apr 2014 at 3:40 pm
Location: B29 Atanasoff Hall
Cost: Free
Phone: 515-294-4377
Channel: College of Liberal Arts and Sciences
Categories: Lectures
"Mining High-Throughput Biological Data," Xiang Zhang, Case Western Reserve, Cleveland. A reception precedes the talk.

Bio
Xiang Zhang is the T&D Schroeder Assistant Professor in the Electrical Engineering and Computer Science Department at Case Western Reserve University. He received his Ph.D. in Computer Science from the University of North Carolina at Chapel Hill in 2011. His research interests include data mining, bioinformatics, and databases. He received a best research paper award at SIGKDD'08 and a best student paper award at ICDE'08, and his work was selected as one of the best papers at SDM'12. His dissertation received an honorable mention for the SIGKDD Dissertation Award in 2012.

Abstract
Advanced biotechnologies have made high-throughput data collection feasible in humans and in model organisms. The availability of such data holds promise for dissecting complex biological processes, but making sense of this flood of biological data poses great computational challenges.

In this talk, I will discuss the problem of finding gene-gene interactions in high-throughput genetic data. Finding genetic interactions is an important biological problem because many common diseases are caused by the joint effects of multiple genes. Previously, finding genetic interactions at the whole-genome scale was considered intractable because of the enormous search space, and the problem was commonly addressed with heuristics that do not guarantee an optimal solution. I will show that by using an upper bound on the test statistic and indexing the data effectively, we can dramatically prune the search space and reduce the computational burden. Moreover, our algorithms are guaranteed to find the optimal solution. Beyond handling specific statistical tests, our algorithms apply to a wide range of study types by exploiting convexity, a property shared by many commonly used statistics.
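To make the pruning idea concrete, here is a minimal Python sketch of an exact pair search that skips candidates using an upper bound. It is not the speaker's algorithm: the toy statistic (the absolute difference between the number of cases and the number of controls that carry both variants), the per-variant bound, and all function names are assumptions chosen only for illustration. For a variant i, no pair containing i can score higher than the larger of "cases carrying i" and "controls carrying i", so variants are visited in decreasing order of that bound and the search stops once the bound can no longer beat the best score found.

```python
import random


def pair_score(y, xi, xj):
    """Toy statistic (assumed): |#cases - #controls| among samples carrying both variants.

    y holds +1 for a case and -1 for a control; xi and xj are 0/1 carrier vectors.
    """
    return abs(sum(yk for yk, a, b in zip(y, xi, xj) if a and b))


def upper_bound(y, xi):
    """Upper bound on pair_score(y, xi, xj) over every possible partner xj."""
    cases = sum(1 for yk, a in zip(y, xi) if a and yk > 0)
    controls = sum(1 for yk, a in zip(y, xi) if a and yk < 0)
    return max(cases, controls)


def best_pair_pruned(y, snps):
    """Return (best score, best pair, number of pairs actually evaluated)."""
    bounds = [upper_bound(y, x) for x in snps]
    # Visit variants from the largest bound to the smallest.
    order = sorted(range(len(snps)), key=lambda i: bounds[i], reverse=True)
    best_score, best_pair, evaluated = -1, None, 0
    for rank, i in enumerate(order):
        if bounds[i] <= best_score:
            break  # every remaining pair is bounded by bounds[i]; none can win
        for j in order[rank + 1:]:
            if bounds[j] <= best_score:
                break  # later partners have even smaller bounds
            evaluated += 1
            score = pair_score(y, snps[i], snps[j])
            if score > best_score:
                best_score, best_pair = score, (min(i, j), max(i, j))
    return best_score, best_pair, evaluated


if __name__ == "__main__":
    random.seed(1)
    n_samples, n_snps = 60, 200
    y = [random.choice([1, -1]) for _ in range(n_samples)]
    snps = [[int(random.random() < 0.2) for _ in range(n_samples)]
            for _ in range(n_snps)]
    score, pair, evaluated = best_pair_pruned(y, snps)
    total = n_snps * (n_snps - 1) // 2
    print(f"best pair {pair} with score {score}; evaluated {evaluated} of {total} pairs")
```

Because a pair is skipped only when its bound cannot exceed the current best, the result matches an exhaustive search on the same statistic. How much work is saved depends on how tight the bound is for the data at hand.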

I will also briefly survey my recent work on large-scale expression quantitative trait loci (eQTL) mapping and on integrating and analyzing heterogeneous biological networks.