[Compstats] Seminar Wed @ 2:30pm: Wayne Oldford

Pascal Poupart ppoupart at cs.uwaterloo.ca
Mon Mar 29 17:32:09 EDT 2010


Hi everyone,

Wayne Oldford will give the compstats seminar this Wednesday at 2:30pm 
(room TBA).
cheers, Pascal
==============================

Visualizing High Dimensional Data:
Applications of graph theory to statistical graphics

Wayne Oldford
Department of Statistics & Actuarial Science
University of Waterloo

Date: Wednesday March 31
Time: 2:30pm
Location: TBA


In statistical data analysis, we are often looking for structure in  
high dimensional data.  In classification problems, we are interested  
in how different known classes separate from (and relate to) one  
another in the data space of measured values.  In clustering, we are  
hoping to discover distinct groups of points in this space.  In model  
building, we are often interested in which data points agree/disagree  
with the conjectured model and whether important structure has been  
missed.  And we hope to do all of this without prejudging the  
nature of the structure itself, even going so far as to discover the  
unanticipated!

In three or fewer dimensions, our visual system is an important  
asset, as much (even unanticipated) structure can be recognized  
effortlessly when points can be plotted in so few dimensions.   
Unfortunately, even after formal dimension reduction methods have  
been applied, we are often faced with many more dimensions than three.

In this talk, I will explore some visualization methods for high  
dimensional data. I will review and illustrate methods based on  
radial, parallel, and orthogonal coordinates.   These three axis  
systems have different strengths and weaknesses.  In all cases,  
however, improvements may be had by casting the axis arrangement in a  
graph theoretic framework.  I will explore the relevant graph  
theoretic representations and illustrate their use on real data  
sets.  I will pay particular attention to the orthogonal axis system  
and show how graph traversal can be used to meaningfully navigate  
through high dimensional space.

All software used is (or shortly will be) available as a package in  
the open-source statistical system R.

________

This is based on joint work with Catherine Hurley of the National  
University of Ireland, Maynooth and Adrian Waddell of the University  
of Waterloo.




