\magnification=1200
\baselineskip=20pt
\nopagenumbers
\font\big=cmr12 scaled \magstep2
\centerline{\bf STANFORD UNIVERSITY}
\centerline{\bf DEPARTMENT OF STATISTICS}
\centerline{\big DEPARTMENTAL SEMINAR}
\bigskip
\baselineskip=12pt
\centerline{4:15 p.m., Tuesday, October 2, 2001}
\centerline{Sequoia Hall Rm. 200}
\centerline{(Cookies at 3:45 in 1st Floor Lounge)}
\bigskip
\baselineskip=15pt
\centerline{\sl Jerome H. Friedman}
\centerline{\sl Department of Statistics}
\centerline{\sl Stanford University}
\centerline{\sl Stanford, CA 94305}
\bigskip
\centerline{\bf Weighted Harmonic Distance Clustering}
\bigskip
A dissimilarity measure for value-attribute data is proposed for use in
cluster analysis. It assigns small dissimilarities to observation pairs
that have close values on any subset of the attribute variables,
regardless of their values on the complement set of variables. Using
this measure in conjunction with dissimilarity-based clustering
algorithms encourages the detection of subgroups of observations that
preferentially cluster on subsets of the variables. The relevant
variable subsets for each individual cluster can be different and can
partially (or completely) overlap with those of other clusters.
Enhancements for increasing sensitivity in detecting especially
low-cardinality groups that cluster on a small subset of variables are
discussed. Applications in several different domains, including gene
expression arrays, are presented.
Joint work with Jacqueline Meulman, Leiden University.
\bye