\magnification=1200
\baselineskip=20pt
\nopagenumbers
\font\big=cmr12 scaled \magstep2
\centerline{\bf STANFORD UNIVERSITY}
\centerline{\bf DEPARTMENT OF STATISTICS}
\centerline{\big DEPARTMENTAL SEMINAR}
\bigskip
\baselineskip=12pt
\centerline{4:15 p.m., Tuesday, March 9, 2004}
\centerline{Sequoia Hall Room 200}
\centerline{(Cookies at 3:45 in 1st Floor Lounge)}
\bigskip
\baselineskip=15pt
\centerline{\sl Alexander G. Gray}
\centerline{\sl Carnegie Mellon}
\bigskip
\centerline{\bf Generalized N-Body Problems: A Story of Statistics, Computation, and Science}
\bigskip
Abstract: In this talk I'll review the fundamental connections between
statistics, computation, and science, and the modern trends that have
amplified their importance. I'll illustrate these connections and trends
through several projects I've been working on, in which progress toward
answering fundamental questions in cosmology has been held back by the
inability to apply well-motivated multivariate statistics/machine learning
methods to very large modern datasets. I will present a collection of simple
yet powerful ideas that constitute a theory and toolkit for efficient
solutions to a large class of algorithmic problems I call `generalized N-body
problems'. The results so far include, among others, the fastest practical
multivariate algorithms to date for kernel density estimation, the
all-nearest-neighbors problem, smoothed particle hydrodynamics, and the
n-point correlation functions. In the last year alone, the early scientific
consequences of these developments include an order-of-magnitude increase in
the size of the best available quasar catalog; the largest-scale effort, by
over three orders of magnitude, to statistically validate the Standard Model;
and the strongest experimental evidence to date for the existence of dark
energy, the 2003 Science Magazine \#1 Breakthrough of the Year. Looking
ahead, I'll touch on further outstanding fundamental problems in statistics
(e.g.\ high-dimensional estimation), computation (e.g.\ high-dimensional
integration), and science (e.g.\ some problems in proteomics), and on some
perspectives that I believe may lead to new results on these problems in the
next few years.
\bye