
Model-Based Assessment of Factors Influencing Categorical Audiovisual Perception

Tobias S. Andersen

Dissertation for the degree of Doctor of Philosophy to be presented with due permission of the Department of Electrical and Communications Engineering for public examination and debate in Auditorium S1 at Helsinki University of Technology (Espoo, Finland) on the 10th of March, 2005, at 12 o'clock noon.

Overview in PDF format (ISBN 951-22-7548-1) [279 KB]
The dissertation is also available in print (ISBN 951-22-7547-3).

Abstract

The sensory modalities do not process information in isolation; they interact strongly. The exact nature of this interaction is not known and might differ for different multisensory phenomena. Here, we investigate two cases of categorical audiovisual perception: speech perception and the perception of rapid flashes and beeps.

It is known that multisensory interactions in general depend on physical factors, such as information reliability and modality appropriateness, but it is not known how the effects occur. Here we parameterize the effect of information reliability for both our model phenomena. We also describe the effect of modality appropriateness as that of a factor that interacts with the effect of information reliability for counting rapid flashes and beeps.

Less explored is whether multisensory perception depends on cognitive factors such as attention. Here we show that visual spatial attention and attentional set influence audiovisual speech perception. Whereas visual spatial attention affected unimodal perception prior to audiovisual integration, attentional set influenced the audiovisual integration stage. We also show a strong effect of intermodal attention on counting rapid flashes and beeps.

Finally, we introduce a quantitative model, early maximum likelihood integration (MLI), of the interaction between counted flashes and counted beeps. We compare early MLI to the Fuzzy Logical Model of Perception (FLMP), which is an MLI model based on categorical percepts, and show that early MLI fits the data better using fewer parameters. Early MLI is also able to incorporate the effects of information reliability and intermodal attention more efficiently than the FLMP.
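
The two kinds of integration rule contrasted above can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the abstract does not give the exact form of early MLI, so the second function below assumes the textbook reliability-weighted combination of two Gaussian estimates, applied before any categorization. The first function uses the standard FLMP rule, in which the unimodal support values for each response category are multiplied and then normalized. The function names `flmp` and `gaussian_mli` and all numeric values are hypothetical.

```python
def flmp(auditory, visual):
    """FLMP rule: multiply per-category unimodal support, then normalize.

    `auditory` and `visual` are sequences of support values (for example,
    unimodal response probabilities), one per response category.
    """
    products = [a * v for a, v in zip(auditory, visual)]
    total = sum(products)
    return [p / total for p in products]


def gaussian_mli(mean_a, var_a, mean_v, var_v):
    """Maximum likelihood combination of two Gaussian estimates.

    Assumption: each modality yields a continuous estimate with a known
    variance. Each estimate is weighted by its reliability (1 / variance).
    Returns the combined mean and variance.
    """
    w_a = 1.0 / var_a
    w_v = 1.0 / var_v
    mean = (w_a * mean_a + w_v * mean_v) / (w_a + w_v)
    var = 1.0 / (w_a + w_v)
    return mean, var


# Hypothetical example: two response categories, uninformative visual input.
audiovisual_probs = flmp([0.8, 0.2], [0.5, 0.5])

# Hypothetical example: auditory count estimate 1.0 (variance 1.0) and
# visual count estimate 3.0 (variance 4.0); the more reliable auditory
# estimate dominates the combined result.
combined_mean, combined_var = gaussian_mli(1.0, 1.0, 3.0, 4.0)
```

In the Gaussian sketch, lowering the variance of one modality pulls the combined estimate toward that modality, which is one simple way an effect of information reliability could enter such a model.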

This thesis consists of an overview and the following seven publications:

  1. Andersen, T. S., Tiippana, K., and Sams, M., 2004. Factors influencing audiovisual fission and fusion illusions. Cognitive Brain Research 21, pages 301-308.
  2. Tiippana, K., Andersen, T. S., and Sams, M., 2004. Visual attention modulates audiovisual speech perception. European Journal of Cognitive Psychology 16, pages 457-472.
  3. Andersen, T. S., Tiippana, K., and Sams, M., 2005. Visual spatial attention in audiovisual speech perception. Helsinki University of Technology, Laboratory of Computational Engineering publications, Technical Report B48.
  4. Tuomainen, J., Andersen, T. S., Tiippana, K., and Sams, M., 2005. Audio-visual speech perception is special. Cognition, in press.
  5. Andersen, T. S., Tiippana, K., and Sams, M., 2005. Maximum likelihood integration of rapid flashes and beeps. Neuroscience Letters, in press.
  6. Andersen, T. S., Tiippana, K., Lampinen, J., and Sams, M., 2001. Modeling of audiovisual speech perception in noise. Proceedings of the International Conference on Auditory-Visual Speech Processing (AVSP 2001), pages 172-176.
  7. Andersen, T. S., Tiippana, K., and Sams, M., 2002. Using the fuzzy logical model of perception in measuring integration of audiovisual speech in humans. Proceedings of NeuroFuzzy2002.

Keywords: categorical audiovisual perception, speech perception, rapid flashes, beeps, mathematical modeling

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

© 2005 Helsinki University of Technology


Last update 2011-05-26