Graphical Comparison and Outlier Detection for High-Dimensional Distributions

Fri, 12 March, 2021 11:00am

Speaker: Reza Modarres, The George Washington University

Abstract: I consider groups of observations in R^d and present a simultaneous plot of the empirical cumulative distribution functions of the within-group and between-group interpoint distances to visualize and examine the equality of the underlying distribution functions. I provide several examples to illustrate how such plots can be used to explore the relationship between the distributions under changes in location, scale, dependence, and shape. I suggest new statistics for testing the equality of k distributions using the interpoint distances. Based on a new dissimilarity measure and the ordered values of the total dissimilarity of each observation from all others, I present a nonparametric method for detecting high-dimensional outliers. Bootstrap algorithms for obtaining the distribution of the test statistic are provided. I compare the interpoint distance outlier test with five competing methods under different distributions.
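
As a rough illustration of the graphical idea in the abstract, the sketch below computes within-group and between-group interpoint distances for two simulated groups and evaluates their empirical CDFs on a common grid. It is not the speaker's implementation: the Euclidean distance, the simulated data, the sample sizes, and the helper names (within_distances, between_distances, ecdf) are all assumptions made for this example.

```python
import numpy as np

def within_distances(x):
    """Euclidean distances between all distinct pairs of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(x), k=1)  # each unordered pair once
    return dist[i, j]

def between_distances(x, y):
    """Euclidean distances between every row of x and every row of y."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).ravel()

def ecdf(sample, grid):
    """Fraction of sample values less than or equal to each grid point."""
    ordered = np.sort(sample)
    return np.searchsorted(ordered, grid, side="right") / ordered.size

# Hypothetical data: two groups in R^20, the second shifted in location.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 20))
y = rng.normal(loc=0.5, size=(60, 20))

d_xx = within_distances(x)
d_yy = within_distances(y)
d_xy = between_distances(x, y)

# Evaluate the three ECDFs on one shared grid so they can be overlaid.
grid = np.linspace(0.0, max(d_xx.max(), d_yy.max(), d_xy.max()), 200)
curves = {
    "within X": ecdf(d_xx, grid),
    "within Y": ecdf(d_yy, grid),
    "between": ecdf(d_xy, grid),
}

# One crude summary of how far apart two of the curves are.
max_gap = np.max(np.abs(curves["between"] - curves["within X"]))
```

Plotting the three arrays in curves against grid gives the kind of simultaneous display the abstract describes. When the groups share a distribution, the curves would be expected to lie close together, and visible separation suggests a difference.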

