The doctoral dissertations of the former Helsinki University of Technology (TKK) and Aalto University Schools of Technology (CHEM, ELEC, ENG, SCI) published in electronic format are available in the electronic publications archive of Aalto University - Aaltodoc.

Tools and Experiments in Multimodal Interaction

Tommi Ilmonen

Dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the Department of Computer Science and Engineering for public examination and debate in Auditorium T2 at Helsinki University of Technology (Espoo, Finland) on the 14th of December, 2006, at 12 noon.

Overview in PDF format (ISBN 951-22-8551-7) [1078 KB]
Dissertation is also available in print (ISBN 951-22-8550-9)


The goal of this study is to explore different strategies for multimodal human-computer interaction. Where traditional human-computer interaction uses a few common user interface metaphors and devices, multimodal interaction seeks new application areas with novel interaction devices and metaphors. Exploration of these new areas involves creation of new application concepts and their implementation. In some cases the interaction mimics human-human interaction while in other cases the interaction model is only loosely tied to the physical world.

In the virtual orchestra concept, a conductor can conduct a band of virtual musicians. Both the motion and the sound of the musicians are synthesized with a computer. A critical task in this interaction is the analysis of the conductor's motion and the control of the sound synthesis. A system that performs these tasks is presented. The system is also capable of extracting emotional content from the conductor's motion. While the conductor follower system was originally developed using a commercial motion tracker, an alternative low-cost motion tracking system was also made. The new system used accelerometers with application-specific signal processing for motion capture.

One of the basic tasks of the conductor follower and other gesture-based interaction systems is to refine raw user input data into information that is easy to use in the application. For this purpose a new approach was developed: FLexible User Input Design (FLUID). This is a toolkit that simplifies the management of novel interaction devices and offers general-purpose data conversion and analysis algorithms.

FLUID was used in the virtual reality drawing applications AnimaLand and Helma. New particle system models and a graphics distribution system were also developed for these applications. The traditional particle systems were enhanced by adding moving force fields that interact with each other. The interacting force fields make the animations more lively and credible.
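The idea of force fields that move and act on each other can be illustrated with a minimal sketch. This is not the model from the dissertation or from publication 7: the softened inverse-square attraction, the explicit Euler integration, and all names (`Body`, `field_force`, `step`) are assumptions chosen only to keep the example short.

```python
import math
from dataclasses import dataclass


@dataclass
class Body:
    """A point with position and velocity; used for particles and force sources."""
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def field_force(src, px, py, strength=1.0, softening=0.1):
    """Attraction toward `src` felt at (px, py).

    Assumed falloff: softened inverse-square, so the force stays finite
    when a point sits on top of a source.
    """
    dx, dy = src.x - px, src.y - py
    d2 = dx * dx + dy * dy + softening
    inv = strength / (d2 * math.sqrt(d2))
    return dx * inv, dy * inv


def step(particles, sources, dt=0.01):
    """Advance particles and force sources by one explicit Euler step."""
    # Each force source is pushed by every other source, so the fields move.
    src_acc = []
    for i, s in enumerate(sources):
        ax = ay = 0.0
        for j, other in enumerate(sources):
            if i != j:
                fx, fy = field_force(other, s.x, s.y)
                ax += fx
                ay += fy
        src_acc.append((ax, ay))

    # Ordinary particles are pushed by all sources but push nothing back.
    part_acc = []
    for p in particles:
        ax = ay = 0.0
        for s in sources:
            fx, fy = field_force(s, p.x, p.y)
            ax += fx
            ay += fy
        part_acc.append((ax, ay))

    # Integrate after all accelerations are known, so the update order
    # does not influence the result.
    for body, (ax, ay) in zip(sources + particles, src_acc + part_acc):
        body.vx += ax * dt
        body.vy += ay * dt
        body.x += body.vx * dt
        body.y += body.vy * dt
```

Calling `step` repeatedly on a few sources and many particles produces motion in which the fields themselves drift, which is the qualitative effect the paragraph above describes.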

Graphics distribution becomes an issue if one wants to render 3D graphics with a cost-effective PC cluster. A graphics distribution method based on network broadcast was created to minimize the amount of data traffic, thus increasing performance.
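The traffic argument can be made concrete with a back-of-the-envelope model. The sketch below is only an assumption-laden illustration: it ignores protocol overhead, acknowledgements, and retransmissions, and the figures in the usage note are hypothetical, not measurements from the dissertation.

```python
def unicast_mbit_per_s(frame_bytes, fps, n_renderers):
    """Outgoing traffic when the same command stream is sent once per renderer."""
    return frame_bytes * fps * n_renderers * 8 / 1e6


def broadcast_mbit_per_s(frame_bytes, fps, n_renderers):
    """Outgoing traffic when one broadcast copy reaches every renderer.

    `n_renderers` is accepted only to mirror the unicast signature;
    in this idealized model it does not affect the result.
    """
    return frame_bytes * fps * 8 / 1e6
```

With a hypothetical 50 000 bytes of graphics commands per frame at 60 frames per second and 8 rendering machines, the unicast model gives 192 Mbit/s of outgoing traffic while the broadcast model gives 24 Mbit/s, independent of the number of renderers.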

Many multimodal applications also need a sound synthesis and processing engine. To meet these needs, the Mustajuuri toolkit was developed. Mustajuuri is a flexible and efficient sound signal processing framework with support for sound processing in virtual environments.

This thesis consists of an overview and of the following 8 publications:

  1. Ilmonen, Tommi and Kontkanen, Janne. Software Architecture for Multimodal User Input – FLUID. In Universal Access. Theoretical Perspectives, Practice, and Experience, 7th ERCIM International Workshop on User Interfaces for All, Lecture Notes in Computer Science 2615, pages 319-338, Springer Berlin / Heidelberg, 2003. © 2003 Springer Science+Business Media. By permission.
  2. Ilmonen, Tommi and Takala, Tapio. Conductor Following With Artificial Neural Networks. In Proceedings of the International Computer Music Conference, pages 367-370, Beijing, China, 1999. © 1999 by authors.
  3. Ilmonen, Tommi and Jalkanen, Janne. Accelerometer-Based Motion Tracking for Orchestra Conductor Following. In Proceedings of the 6th Eurographics Workshop on Virtual Environments, Amsterdam, Netherlands, 2000. © 2000 Eurographics Association. By permission.
  4. Ilmonen, Tommi and Takala, Tapio. Detecting Emotional Content from the Motion of an Orchestra Conductor. In Gesture in Human-Computer Interaction and Simulation, 6th International Gesture Workshop, Lecture Notes in Artificial Intelligence 3881, pages 292-295, Springer Berlin / Heidelberg, 2006. © 2006 Springer Science+Business Media. By permission.
  5. Ilmonen, Tommi and Reunanen, Markku. Virtual Pockets in Virtual Reality. In Virtual Environments 2005, Eurographics/ACM SIGGRAPH Symposium Proceedings, pages 163-170, 2005. © 2005 Eurographics Association. By permission.
  6. Ilmonen, Tommi. Mustajuuri – An Application and Toolkit for Interactive Audio Processing. In Proceedings of the 7th International Conference on Auditory Display, pages 284-285, Helsinki, Finland, 2001. © 2001 by author.
  7. Ilmonen, Tommi and Kontkanen, Janne. The Second Order Particle System. Journal of WSCG, 11 (2): 240-247, 2003. © 2003 UNION Agency - Science Press. By permission.
  8. Ilmonen, Tommi, Reunanen, Markku, and Kontio, Petteri. Broadcast GL: An Alternative Method for Distributing OpenGL API Calls to Multiple Rendering Slaves. Journal of WSCG, 13 (2): 65-72, 2005. © 2005 UNION Agency - Science Press. By permission.

Errata of publication 3

Keywords: gestural interaction, conductor following, virtual reality, digital art, graphics clusters, particle systems, 3D sound, digital signal processing

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

© 2006 Helsinki University of Technology

Last update 2011-05-26