New experimental technologies in
molecular biology (particularly oligonucleotide arrays and
microarrays) now make it possible to quickly obtain vast amounts of time-series
data on gene expression in a particular organism under various conditions.
We have developed a new computational methodology for making sense
of the large, multiple time-series data sets arising in expression
analysis, built a prototype implementation of it, and applied that
prototype to the analysis of gene expression in Saccharomyces
cerevisiae. We propose to build on this work to:
- Build a powerful software system for identifying interesting
features in large multiple time-series data sets. Our work will
emphasize the analysis of gene expression data, but our system
will be flexible enough to work on other biological data sets as
well.
- Develop and implement new combinatorial algorithms essential
to expression analysis, particularly in (1) signal processing
for the multiple short, noisy time-series data sets obtained from
gene-regulation experiments, (2) integrating database information
on gene function and location into our analysis, and (3) identifying
optimal candidate networks under a variety of selection
criteria.
- Perform a modest set of gene knockout and/or over-expression experiments
in yeast to evaluate the most promising regulatory elements identified
by our software.