##### Department of Mathematics,

University of California San Diego

****************************

### Statistics Colloquium

## Hal Stern

#### UC Irvine

## Estimating the number of unseen species in a population

##### Abstract:

The problem of estimating the number of unseen species in a population based on the results of a single sample of animals is a familiar one in the statistical literature. In a related problem associated with genome sequencing, the goal is to design a sampling strategy for finding a specified proportion of the total number of species. A generalized multinomial model is applied to estimate the number of unseen species; the model also forms the basis for a Monte Carlo simulation approach to determining the sample size required to guarantee that a specified proportion of the total species is collected. The methods are demonstrated on simulated data and data from a DNA sequencing application.

Host: Dimitris Politis

### June 2, 2003

### 3:00 PM

### AP&M 5829

****************************