Department of Mathematics,
University of California San Diego

****************************

Math 288 - Probability and Statistics Seminar

Mihoko Minami

The Institute of Statistical Mathematics, Japan

Statistical Challenges for Modeling Data with Many Zeros: A New Feature Extraction Method for Very Non-Normal Data

Abstract:

Data that we encounter in practice often have many zero-valued observations. Analyzing such data without considering how the zeros arose can lead to misleading results. In this talk, we propose a new feature extraction method for very non-normal data. Our method extends principal component analysis (PCA) in the same manner as the generalized linear model extends the ordinary linear regression model. As an example, we analyze multivariate species-size data from a purse-seine fishery in the eastern Pacific Ocean. The data contain many zero-valued observations for each variable (each combination of species and size). Thus, as an error distribution we use the Tweedie distribution, which has a probability mass at zero, and apply the Tweedie-generalized PCA (GPCA) method to the data.
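
To illustrate the "probability mass at zero" property mentioned in the abstract, here is a minimal Python sketch. It is not the speaker's GPCA method. It assumes the standard compound Poisson-gamma representation of the Tweedie distribution with power parameter 1 < p < 2, and the function names and the values of mu, phi, and p are hypothetical choices made only for this illustration.

```python
import math
import random


def tweedie_params(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power 1 < p < 2)
    to compound Poisson-gamma parameters."""
    lam = mu ** (2 - p) / (phi * (2 - p))   # Poisson rate
    alpha = (2 - p) / (p - 1)               # gamma shape
    scale = phi * (p - 1) * mu ** (p - 1)   # gamma scale
    return lam, alpha, scale


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count by counting exponential
    inter-arrival times that fall inside a unit interval."""
    n, t = 0, rng.expovariate(lam)
    while t < 1.0:
        n += 1
        t += rng.expovariate(lam)
    return n


def sample_tweedie(mu, phi, p, rng):
    """One draw: a sum of N gamma variables, N ~ Poisson.
    When N == 0 the sum is empty, giving an exact zero."""
    lam, alpha, scale = tweedie_params(mu, phi, p)
    n = sample_poisson(lam, rng)
    return sum(rng.gammavariate(alpha, scale) for _ in range(n))


# Illustrative (assumed) parameter values, not taken from the talk.
rng = random.Random(0)
mu, phi, p = 1.0, 2.0, 1.5
draws = [sample_tweedie(mu, phi, p, rng) for _ in range(20000)]

zero_fraction = sum(d == 0 for d in draws) / len(draws)
expected_zero = math.exp(-tweedie_params(mu, phi, p)[0])  # P(Y = 0) = exp(-lam)
sample_mean = sum(draws) / len(draws)

print(f"observed zero fraction: {zero_fraction:.3f}")
print(f"theoretical P(Y = 0):   {expected_zero:.3f}")
```

Under these assumptions, the simulated fraction of exact zeros should sit close to exp(-lam), while the nonzero draws are continuous and positive, which is the mixed behaviour that makes such a distribution a candidate for zero-heavy catch data.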

Host: Ronghui 'Lily' Xu

February 6, 2009

1:00 PM

AP&M 6402

****************************