UCSD MATHEMATICS DEPARTMENT: APPLICATIONS SEMINAR
Noon, Monday, October 2, 2006,
AP&M 6402
Dr. M. Vidyasagar,
Executive Vice President
Tata Consultancy Services Limited
Hyderabad, India
Stochastic Modelling Methods for Gene Finding
In this talk, the problem of finding genes in the genome (DNA sequence)
is formulated as a problem in stochastic modelling and classification.
No prior knowledge of biology is assumed, and the talk will be
completely self-contained in this respect. A new classification
algorithm, called the Mixed Memory Markov Model (4M) algorithm, is
presented, and its significance (the probability of generating an
incorrect classification) is analyzed using sound statistical
principles. It is also shown that, on nearly 70 bacterial genomes, the
4M algorithm performs as well as or better than the currently most
popular algorithm, known as Glimmer-2. (The emphasis of the talk,
however, is on the statistical aspects.)