\documentclass[11pt]{article}
\setlength{\oddsidemargin}{0.0truein}
\setlength{\evensidemargin}{0.0truein}
\setlength{\textwidth}{6.5truein}
\setlength{\topmargin}{0.0truein}
\setlength{\textheight}{9.0truein}
\setlength{\headsep}{0.0truein}
\setlength{\headheight}{0.0truein}
\setlength{\topskip}{10.0pt}
\setlength{\parskip}{5mm}
\usepackage{url}

\begin{document}

\begin{center}
\textbf{\textsc{STANFORD UNIVERSITY}}\\[5pt]
\textbf{\textsc{DEPARTMENT OF STATISTICS}}\\[5pt]
{\Large\textbf{\textsc{DEPARTMENTAL SEMINAR}}}
\end{center}

\begin{center}
4:15 p.m., Tuesday, January 17, 2006\\
Sequoia Hall Room 200\\
(Cookies at 3:45 in 1st Floor Lounge)
\end{center}

\begin{center}
\textsl{Ramani S. Pilla} \\
Department of Statistics\\
Department of Biology\\
Case Western Reserve University \\
\end{center}

\begin{center}
\textbf{Inference in Perturbation Models, Mixtures and Spatial Scan Process}
\end{center}

\noindent In this talk, a general class of models referred to as {\em perturbation models} is introduced.
These models are described by an underlying ``null'' model that accounts for most of the structure in the data, while a perturbation accounts for possible small, localized departures.
For instance, in the context of finite mixture models, the null model represents a mixture with $m$ components and the perturbation model represents additional components.
In the spatial scan process context, the null density accounts for the background or noise, whereas the perturbation searches for an unusual region such as tumorous tissue in mammography or a target in an image recognition problem.
We derive a new test statistic for detecting the presence of a perturbation and show that the asymptotic distribution of the test statistic is equivalent to the supremum of a Gaussian process over a high-dimensional manifold (e.g., a curve or a surface) with boundaries and singularities.
A technique for approximating the quantiles of the test statistic via the Hotelling-Weyl {\em volume-of-tube formula} is presented.

\noindent Fitting mixture models and performing statistical inference on the results is an important but very challenging problem.
A long-standing fundamental question is: {\em how many mixture components}?
The asymptotic null distribution of the likelihood ratio test statistic is highly complex and very difficult to simulate from in practice.
Building on the perturbation theory, inferential methods are developed to address the problem of testing for an arbitrary number of components from smooth families of distributions, including multivariate mixtures.
The resulting theory has broad applications, including astronomy, astrophysics, particle physics, bioinformatics and genetics.
We illustrate the theory in the context of a model problem from high-energy particle physics, wherein the goal is to distinguish a signal from random fluctuation in the data with high probability.
More information on the particle physics and other application problems is available at \url{http://stat.case.edu/~pillar/PRL/PRL.htm}.

\end{document}