The doctoral dissertations of the former Helsinki University of Technology (TKK) and Aalto University Schools of Technology (CHEM, ELEC, ENG, SCI) published in electronic format are available in the electronic publications archive of Aalto University - Aaltodoc.

Sonic Gestures and Rhythmic Interaction Between the Human and the Computer

Antti Jylhä

Doctoral dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the School of Electrical Engineering for public examination and debate in Auditorium S1 at the Aalto University School of Electrical Engineering (Espoo, Finland) on the 20th of April 2012 at 12 noon.

Overview in PDF format (ISBN 978-952-60-4554-2)   [508 KB]
Dissertation is also available in print (ISBN 978-952-60-4553-5)


This thesis addresses the use of sonic gestures as input in human-computer interaction, with a particular focus on their applicability to rhythmic interactive systems and on the design and evaluation of such systems. Sonic gestures are defined as human-generated sounding actions that convey information to a computational system. Examples include impulsive sounding actions such as hand claps and finger snaps, sustained actions such as humming and blowing, and iterative actions such as tapping a table to the beat of music.

Using sonic gestures at the interface requires analysis algorithms capable of extracting the desired information from an audio stream containing the human-generated sounds. In interactive systems, these algorithms must also run in real time. This thesis focuses on percussive sonic gestures, which can be regarded as analogous to the sounds of percussive instruments. It is therefore reasonable to assume that the tools used for retrieving information from drums and percussion in music can also be applied to sonic gesture analysis. This work presents algorithms for classifying percussive sounds, such as different types of hand claps.
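
As a rough illustration of the kind of processing percussive gesture analysis involves, the sketch below detects impulsive events in an audio signal by comparing short-term frame energy with a running average. It is not the method of the thesis, which relies on the algorithms described in the publications listed below. The function name, parameters, and default values are all assumptions chosen for this example.

```python
import math


def detect_onsets(samples, sample_rate, frame_size=256,
                  threshold_ratio=4.0, min_gap_s=0.1, floor=1e-6):
    """Return approximate onset times (in seconds) of impulsive sounds.

    Hypothetical energy-based detector: a frame counts as an onset when its
    mean energy exceeds both an absolute floor and `threshold_ratio` times a
    running average of past frame energies, and at least `min_gap_s` seconds
    have passed since the previous onset.
    """
    onsets = []
    avg_energy = 0.0
    last_onset = -math.inf
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energy = sum(x * x for x in frame) / frame_size
        t = start / sample_rate  # time of the frame start
        loud_enough = energy > max(floor, threshold_ratio * avg_energy)
        if loud_enough and t - last_onset >= min_gap_s:
            onsets.append(t)
            last_onset = t
        # Exponential moving average tracks the recent energy level.
        avg_energy = 0.9 * avg_energy + 0.1 * energy
    return onsets
```

Because the detector processes one frame at a time and keeps only a small amount of state, the same loop could in principle be run on incoming audio buffers. Its time resolution is limited to one frame, here 32 ms at an 8 kHz sample rate.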

To demonstrate the use of sonic gestures, a hand clap interface has been developed that recognizes different hand clap types and extracts continuous information, such as the tempo, from a clapping sequence. This interface has been used to develop various rhythmic prototype applications, most importantly iPalmas, an interactive Flamenco rhythm tutor. The iPalmas system can produce realistic-sounding synthetic Flamenco hand clapping patterns for the user, listen to the user's clapping, and give audiovisual feedback on learning and performance.
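
To make the idea of extracting tempo from a clapping sequence concrete, the following minimal sketch estimates beats per minute from a list of clap onset times using the median inter-onset interval. It is an illustrative assumption, not the tempo tracker used in the hand clap interface or in iPalmas, and the function name is hypothetical.

```python
from statistics import median


def estimate_tempo_bpm(onset_times):
    """Estimate tempo in beats per minute from clap onset times in seconds.

    Hypothetical helper: assumes one clap per beat and uses the median
    inter-onset interval, so a single missed or extra clap does not
    distort the estimate much. Returns None if fewer than two usable
    onsets are given.
    """
    intervals = [b - a for a, b in zip(onset_times, onset_times[1:]) if b > a]
    if not intervals:
        return None
    return 60.0 / median(intervals)
```

For example, claps spaced half a second apart correspond to 120 beats per minute. The median is used instead of the mean so that one skipped clap, which doubles a single interval, leaves the estimate unchanged.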

The iPalmas system was evaluated in a subjective experiment, resulting in qualitative and quantitative findings related to the system design, human capabilities, and the interaction. In conjunction with this evaluation, a structured framework for evaluating systems of this kind has been proposed. Based on the evaluation results, the audiovisual feedback elements of the system have been developed iteratively.

The main outcomes of the thesis are a novel definition of sonic gestures in human-computer interaction, a taxonomy of the information they can convey to computational systems, and the interactive iPalmas system. Together, these have yielded several findings that can be generalized to the design and evaluation of rhythmic interactive systems.

This thesis consists of an overview and of the following 9 publications:

  1. Antti Jylhä. Sonic Gestures as Input in Human-Computer Interaction: Towards a Systematic Approach. In Proceedings of Sound and Music Computing Conference, Padova, Italy, pp. 1-7, July 2011.
  2. Antti Jylhä and Cumhur Erkut. Inferring the Hand Configuration from Hand Clapping Sounds. In Proceedings of the 11th International Conference on Digital Audio Effects (DAFx), Espoo, Finland, pp. 301-304, September 2008.
  3. Umut Şimşekli, Antti Jylhä, Cumhur Erkut, and A. Taylan Cemgil. Real-Time Recognition of Percussive Sounds by a Model-Based Method. EURASIP Journal on Advances in Signal Processing, pp. 1-14, January 2011.
  4. Antti Jylhä, Cumhur Erkut, Umut Şimşekli, and A. Taylan Cemgil. Sonic Handprints: Person Identification with Hand Clapping Sounds by a Model-Based Method. In Proceedings of the 45th Conference of the Audio Engineering Society, Espoo, Finland, pp. 1-6, March 2012.
  5. Antti Jylhä and Cumhur Erkut. A Hand Clap Interface for Sonic Interaction with the Computer. In Proceedings of the 27th International Conference Extended Abstracts on Human Factors in Computing Systems, Boston, MA, USA, pp. 3175-3180, April 2009.
  6. Antti Jylhä, Inger Ekman, Cumhur Erkut, and Koray Tahiroğlu. Design and Evaluation of Human-Computer Rhythmic Interaction in a Tutoring System. Computer Music Journal, vol. 35, no. 2, pp. 36-48, May 2011.
  7. Antti Jylhä, Cumhur Erkut, Matti Pesonen, and Inger Ekman. Simulation of Rhythmic Learning - A Case Study. In Proceedings of the 5th Audio Mostly Conference, Glasgow, UK, pp. 146-149, September 2010.
  8. Cumhur Erkut, Antti Jylhä, and Reha Dişçioğlu. A Structured Design and Evaluation Model with Application to Rhythmic Interaction Displays. In Proceedings of International Conference on New Interfaces for Musical Expression (NIME), Oslo, Norway, pp. 477-480, May 2011.
  9. Antti Jylhä and Cumhur Erkut. Auditory Feedback in an Interactive Rhythmic Tutoring System. In Proceedings of the 6th Audio Mostly Conference, Coimbra, Portugal, pp. 109-115, September 2011.

Errata of publication 2

Keywords: sonic interaction design, rhythmic interaction, audio input

This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.

© 2012 Aalto University

Last update 2012-10-31