The doctoral dissertations of the former Helsinki University of Technology (TKK) and Aalto University Schools of Technology (CHEM, ELEC, ENG, SCI) published in electronic format are available in the electronic publications archive of Aalto University - Aaltodoc.
Aalto

Studies on Dimension Reduction and Feature Spaces

Eli Parviainen

Doctoral dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the School of Science for public examination and debate in Auditorium F239a at the Aalto University School of Science (Espoo, Finland) on the 11th of November 2011 at 12 noon.

Dissertation in PDF format (ISBN 978-952-60-4312-8)   [1233 KB]
Dissertation is also available in print (ISBN 978-952-60-4311-1)

Abstract

Today's world produces and stores huge amounts of data, which calls for methods that can tackle both growing sizes and growing dimensionalities of data sets. Dimension reduction aims at answering the challenges posed by the latter.

Many dimension reduction methods consist of a metric transformation part followed by optimization of a cost function. Several classes of cost functions have been developed and studied, while metrics have received less attention. We promote the view that metrics should be lifted to a more independent role in dimension reduction research. The subject of this work is the interaction of metrics with dimension reduction. The work is built on a series of studies on current topics in dimension reduction and neural network research. Neural networks are used both as a tool and as a target for dimension reduction.

When the results of modeling or clustering are represented as a metric, they can be studied using dimension reduction, or they can be used to introduce new properties into a dimension reduction method. We give two examples of such use: visualizing results of hierarchical clustering, and creating supervised variants of existing dimension reduction methods by using a metric that is built on the feature space of a neural network. Combining clustering with dimension reduction results in a novel way for creating space-efficient visualizations, that tell both about hierarchical structure and about distances of clusters.

We study feature spaces used in a recently developed neural network architecture called extreme learning machine. We give a novel interpretation for such neural networks, and recognize the need to parameterize extreme learning machines with the variance of network weights. This has practical implications for use of extreme learning machines, since the current practice emphasizes the role of hidden units and ignores the variance.

A current trend in the research of deep neural networks is to use cost functions from dimension reduction methods to train the network for supervised dimension reduction. We show that equally good results can be obtained by training a bottlenecked neural network for classification or regression, which is faster than using a dimension reduction cost.

We demonstrate that, contrary to the current belief, using sparse distance matrices for creating fast dimension reduction methods is feasible, if a proper balance between short-distance and long-distance entries in the sparse matrix is maintained. This observation opens up a promising research direction, with possibility to use modern dimension reduction methods on much larger data sets than which are manageable today.

Keywords: dimension reduction, neural network, metric, visualization

This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.

© 2011 Aalto University


Last update 2011-11-02