
Discriminative Learning with Application to Interactive Facial Image Retrieval

Zhirong Yang

Dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the Faculty of Information and Natural Sciences for public examination and debate in Auditorium TU2 at Helsinki University of Technology (Espoo, Finland) on the 14th of November, 2008, at 12 noon.

Overview in PDF format (ISBN 978-951-22-9626-2) [1023 KB]
The dissertation is also available in print (ISBN 978-951-22-9625-5)

Abstract

The number of digital images is growing rapidly, and advanced tools for searching large image collections are therefore urgently needed. Content-based image retrieval is well suited to this task because features are extracted and indexed automatically, without the human labor and subjectivity of manual image annotation. The semantic gap between high-level semantics and low-level visual features can be reduced by the relevance feedback technique. However, most existing interactive content-based image retrieval (ICBIR) systems require a substantial amount of human evaluation labor, which leads to the evaluation fatigue problem that heavily restricts the application of ICBIR.

In this thesis, a solution based on discriminative learning is presented. It extends an existing ICBIR system, PicSOM, towards practical applications. The enhanced ICBIR system allows users to input partial relevance, which includes not only the extent of relevance but also the reason for it. A multi-phase retrieval with partial relevance can adapt to the user's search intention in a coarse-to-fine manner.

The retrieval performance can be improved by employing supervised learning as a preprocessing step before unsupervised content-based indexing. In this work, Parzen Discriminant Analysis (PDA) is proposed to extract discriminative components from images. PDA regularizes the Informative Discriminant Analysis (IDA) objective with a greatly accelerated optimization algorithm. Moreover, discriminative Self-Organizing Maps trained with the resulting features can easily handle fuzzy categorizations.

The proposed techniques have been applied to interactive facial image retrieval. Both a query example and a benchmark simulation study are presented, which indicate that the first image depicting the target subject can be retrieved in a small number of rounds.

This thesis consists of an overview and the following 8 publications:

  1. Zhirong Yang and Jorma Laaksonen. 2005. Interactive retrieval in facial image database using Self-Organizing Maps. In: Proceedings of the 9th IAPR Conference on Machine Vision Applications (MVA 2005). Tsukuba Science City, Japan. 16-18 May 2005, pages 112-115. © 2005 MVA Conference Committee. By permission.
  2. Zhirong Yang and Jorma Laaksonen. 2005. Approximated classification in interactive facial image retrieval. In: Heikki Kälviäinen, Jussi Parkkinen, and Arto Kaarna (editors). Proceedings of the 14th Scandinavian Conference on Image Analysis (SCIA 2005). Joensuu, Finland. 19-22 June 2005. Springer. Lecture Notes in Computer Science, volume 3540, pages 770-779.
  3. Zhirong Yang and Jorma Laaksonen. 2005. Partial relevance in interactive facial image retrieval. In: Sameer Singh, Maneesha Singh, Chid Apte, and Petra Perner (editors). Proceedings of the Third International Conference on Advances in Pattern Recognition (ICAPR 2005). Part II. Bath, UK. 22-25 August 2005. Springer. Lecture Notes in Computer Science, volume 3687, pages 216-225.
  4. Zhirong Yang and Jorma Laaksonen. 2007. Regularized neighborhood component analysis. In: Bjarne Kjær Ersbøll and Kim Steenstrup Pedersen (editors). Proceedings of the 15th Scandinavian Conference on Image Analysis (SCIA 2007). Aalborg, Denmark. 10-14 June 2007. Springer. Lecture Notes in Computer Science, volume 4522, pages 253-262.
  5. Zhirong Yang and Jorma Laaksonen. 2007. Face recognition using Parzenfaces. In: Joaquim Marques de Sá, Luís A. Alexandre, Włodzisław Duch, and Danilo Mandic (editors). Proceedings of the 17th International Conference on Artificial Neural Networks (ICANN 2007). Part II. Porto, Portugal. 9-13 September 2007. Springer. Lecture Notes in Computer Science, volume 4669, pages 200-209.
  6. Zhirong Yang and Jorma Laaksonen. 2007. Multiplicative updates for non-negative projections. Neurocomputing, volume 71, numbers 1-3, pages 363-373. © 2007 Elsevier Science. By permission.
  7. Zhirong Yang, Zhijian Yuan, and Jorma Laaksonen. 2007. Projective non-negative matrix factorization with applications to facial image processing. International Journal of Pattern Recognition and Artificial Intelligence, volume 21, number 8, pages 1353-1362. © 2007 World Scientific Publishing Company. By permission.
  8. Zhirong Yang and Jorma Laaksonen. 2008. Principal whitened gradient for information geometry. Neural Networks, volume 21, numbers 2-3, pages 232-240. © 2008 Elsevier Science. By permission.

Keywords: content-based image retrieval (CBIR), relevance feedback, Self-Organizing Map, discriminant analysis, facial image

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

© 2008 Helsinki University of Technology


Last update 2011-05-26