Multi-objective genetic programming for data visualization and classification

Item

Title
Multi-objective genetic programming for data visualization and classification
Identifier
d_2009_2013:ffacc430e17a:11036
Identifier
11375
Creator
Icke, Ilknur
Contributor
Andrew Rosenberg
Date
2011
Language
English
Publisher
City University of New York
Subject
Computer science | Information science | Genetics | Data Classification | Data Visualization | Dimensionality Reduction | Exploratory Data Analysis | Genetic Programming
Abstract
The process of knowledge discovery lies on a continuum ranging from human-driven (manual exploration) approaches to fully automatic data mining methods. As a hybrid approach, the emerging field of visual analytics aims to facilitate human-machine collaborative decision making by providing automated analysis of data via interactive visualizations. One area of interest in visual analytics is the development of data transformation methods that support visualization and analysis. In this thesis, we develop an evolutionary-computing-based multi-objective dimensionality reduction method for visual data classification. The algorithm, called Genetic Programming Projection Pursuit (G3P), uses genetic programming to automatically create visualizations of higher-dimensional labeled datasets, which are assessed in terms of discriminative power and interpretability. We consider two forms of interpretability of the visualizations: clearly separated and compact class structures, and easily interpretable data transformation expressions relating the original data attributes to the visualization axes. The G3P algorithm incorporates a number of automated measures of interpretability that were found to be in alignment with human judgement through a user study we conducted. On a number of data mining problems, we show that G3P generates a large number of data transformations that are better than those generated by dimensionality reduction methods such as principal component analysis (PCA), multiple discriminant analysis (MDA) and targeted projection pursuit (TPP) in terms of discriminative power and interpretability.
Type
dissertation
Source
2009_2013.csv
Degree
Ph.D.
Program
Computer Science