How is your laugh today?
Answering the question above automatically is a very challenging task, one currently being addressed by researchers in the ILHAIRE project. Laughter is highly multimodal and, in social contexts, this multimodality can be exploited to detect it. Detecting laughter from the voice or from facial expressions can be difficult in some conditions: for example, distinguishing and analyzing users’ voices in multi-party interaction is an open challenge, and facial activity cannot be tracked in ecological contexts. Consequently, we present a method for real-time automated detection of laughter and its intensity from another modality: body movement.
Body Laughter Features
The body and its movements are important indicators of laughter that have been widely neglected in the past. Some researchers observed that laughter is often accompanied by one or more of the following body behaviors, possibly occurring at the same time: “rhythmic patterns”, “rock violently sideways, or more often back and forth”, “nervous tremor ... over the body”, “twitch or tremble convulsively”. Others observed that laughing users “moved their heads backward to the left and lifted their arms resembling an open-hand gesture”. Laughter has also been analyzed in professional (virtual) meetings: the user laughs “accompanying the joke’s escalation in an embodied manner, moving her torso and laughing with her mouth wide open” and “even throwing her head back”.
The position of the two green balls, which mark the user's shoulders, is detected by looking for the green parts of the image captured by the webcam. The Kinect camera captures the user's silhouette, that is, the shape of the user's body, and the position of the green balls makes it possible to determine which part of the silhouette corresponds to the user's head (the part above the green balls) and which part is the trunk (the part below the green balls). Figure 1 shows an example of this process.
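The segmentation step described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used in the study: it assumes that the webcam image and the Kinect silhouette share the same pixel grid, that the image is a list of rows of (R, G, B) tuples, and that a simple colour-dominance threshold is enough to find the markers. The names is_green, marker_row and split_silhouette, and the margin value, are hypothetical.

```python
def is_green(pixel, margin=60):
    """True when green dominates red and blue by `margin` (an assumed threshold)."""
    r, g, b = pixel
    return g - r > margin and g - b > margin


def marker_row(image):
    """Mean row index of all green pixels, or None if no marker is visible."""
    rows = [y for y, line in enumerate(image) for px in line if is_green(px)]
    return round(sum(rows) / len(rows)) if rows else None


def split_silhouette(silhouette, row):
    """Label silhouette pixels above `row` as 'H' (head) and the others as 'T' (trunk).

    Pixels outside the silhouette are labelled '.'.
    """
    return [
        [("H" if y < row else "T") if inside else "." for inside in line]
        for y, line in enumerate(silhouette)
    ]
```

Here the row containing the markers is assigned to the trunk; that choice is arbitrary and could just as well go the other way.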
Analysis of head movement starts from the head’s silhouette, that is, the region labeled H (see Figure 1). The Center of Gravity (CoG) of the region is detected and its 2D coordinates are extracted; the horizontal and vertical speeds of the CoG are then computed.
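As a rough illustration of this computation, the sketch below takes the head region as a grid of booleans, computes its CoG as the mean position of the pixels inside the region, and estimates speed as the frame-to-frame difference of the CoG multiplied by the frame rate. The function names and the default of 30 frames per second are assumptions made for the example, not values taken from the study.

```python
def center_of_gravity(mask):
    """Return the (x, y) centroid of the True cells of a 2D mask, or None if empty."""
    xs, ys = [], []
    for y, line in enumerate(mask):
        for x, inside in enumerate(line):
            if inside:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return sum(xs) / len(xs), sum(ys) / len(ys)


def cog_speeds(cogs, fps=30.0):
    """Horizontal and vertical speed (pixels per second) between consecutive CoGs."""
    return [
        ((x1 - x0) * fps, (y1 - y0) * fps)
        for (x0, y0), (x1, y1) in zip(cogs, cogs[1:])
    ]
```

In practice the CoG would be computed once per frame on the H region and the resulting list passed to cog_speeds.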
The Kinect camera can measure the distance of the people in its field of view. Analysis of trunk movement starts from a comparison between the distances of the user's head and trunk from the Kinect camera. If the head is farther from the camera than the trunk, we conclude that the user is leaning backward; if it is closer, that the user is leaning forward. Further mathematical computation is performed to determine whether the user is leaning backward and forward in a repetitive way, because according to the literature a periodic trunk movement could be an indicator of laughter. We also check whether the user's trunk moves impulsively.
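The comparison and the repetitiveness check could look roughly like the sketch below. The text does not say which periodicity measure is used, so the normalized autocorrelation here is a stand-in chosen for illustration; lean_direction, periodicity, tol and min_lag are hypothetical names and parameters, and the impulsiveness check is not covered.

```python
def lean_direction(head_depth, trunk_depth, tol=0.0):
    """Classify leaning from the head and trunk distances to the camera."""
    diff = head_depth - trunk_depth
    if diff > tol:
        return "backward"  # head farther from the camera than the trunk
    if diff < -tol:
        return "forward"   # head closer to the camera than the trunk
    return "upright"


def periodicity(signal, min_lag=2):
    """Largest normalized autocorrelation over lags in [min_lag, n // 2).

    Values near 1 suggest a repetitive signal; values near 0 suggest none.
    """
    n = len(signal)
    if n // 2 <= min_lag:
        return 0.0  # too short to say anything
    mean = sum(signal) / n
    x = [v - mean for v in signal]
    energy = sum(v * v for v in x)
    if energy == 0:
        return 0.0  # constant signal, no movement at all
    return max(
        sum(x[t] * x[t + lag] for t in range(n - lag)) / energy
        for lag in range(min_lag, n // 2)
    )
```

The input to periodicity would be the per-frame difference between head and trunk distance; a regular back-and-forth rocking should score much higher than a single isolated jerk.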
Shoulder trembling is a quick and repetitive movement often displayed by laughing people. Three features based on the shoulders’ vertical coordinates are extracted by looking at the position of the two green balls: the energy of the shoulders' movement, their correlation (how much the two shoulders move at the same time) and the repetitivity of their movement.
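To make the three features more concrete, here is one possible way to compute them from the vertical coordinates of the two markers. The exact definitions are not given in this text, so everything below is an assumption: energy is taken as the mean squared frame-to-frame displacement, correlation as the Pearson coefficient between the two displacement series, and repetitivity as the fraction of frames in which the average movement changes direction. All function names are hypothetical.

```python
import math


def diffs(series):
    """Frame-to-frame displacement of a coordinate series."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0


def direction_change_rate(velocity):
    """Fraction of consecutive non-zero displacements that flip sign."""
    signs = [1 if v > 0 else -1 for v in velocity if v != 0]
    if len(signs) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return changes / (len(signs) - 1)


def shoulder_features(left_y, right_y):
    """Return (energy, correlation, repetitivity) for two vertical coordinate series."""
    if len(left_y) < 2 or len(left_y) != len(right_y):
        raise ValueError("need two series of equal length with at least 2 samples")
    vl, vr = diffs(left_y), diffs(right_y)
    energy = sum(a * a + b * b for a, b in zip(vl, vr)) / len(vl)
    correlation = pearson(vl, vr)
    repetitivity = direction_change_rate([(a + b) / 2 for a, b in zip(vl, vr)])
    return energy, correlation, repetitivity
```

With these definitions, two shoulders bouncing up and down together give high energy, a correlation close to 1 and a high repetitivity, while motionless shoulders give zero on all three.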
To demonstrate that it is possible to automatically determine whether and how much a person is laughing just by looking at her body movements, we conducted an experiment in which we made people laugh and recorded them. They took part in two different tasks: an individual one, watching video clips alone, and a social one, playing a game called yes/no. The rules of the game are the following: the experimenter can ask the participant any question, and she must answer without using the words “yes” or “no”.
We then used a machine learning model called Kohonen’s self-organising map (SOM), which was trained with laughter and non-laughter data taken from the recordings. We also trained the map with the intensity of laughter shown by the user. In total we provided the map with 125 segments showing laughter vs. non-laughter behaviors and 425 segments showing different intensities of laughter. We then checked whether the map was able to recognize laughter and its intensity by providing some test segments. It turned out that the map recognizes laughter vs. non-laughter in more than 75% of cases and laughter intensity in 50% of cases. These results are good because the chance levels, that is, the probabilities that the map provides a correct result just by chance, are 50% and 25% respectively.
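For readers unfamiliar with self-organising maps, the following toy sketch shows the general idea: a small grid of units is pulled towards the training samples, each unit is then labelled with the class of the samples it wins most often, and a new sample receives the label of its closest unit. This is a generic textbook-style SOM written for illustration, not the model used in the study; the grid size, learning rate, neighbourhood width and all function names are assumptions. The chance_level helper only covers the case of equally likely classes, under which 50% corresponds to two classes and 25% to four.

```python
import math
import random


def nearest_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (best-matching unit)."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)),
    )


def train_som(data, labels, rows=3, cols=3, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols map, then label each unit by majority vote."""
    rng = random.Random(seed)
    coords = [(i, j) for i in range(rows) for j in range(cols)]
    weights = [list(rng.choice(data)) for _ in coords]  # start from random samples
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total
            lr = lr0 * (1 - frac)                # learning rate decays linearly
            sigma = sigma0 * (1 - frac) + 0.1    # neighbourhood shrinks over time
            bi, bj = coords[nearest_unit(weights, x)]
            for k, (i, j) in enumerate(coords):
                h = math.exp(-((i - bi) ** 2 + (j - bj) ** 2) / (2 * sigma ** 2))
                weights[k] = [w + lr * h * (v - w) for w, v in zip(weights[k], x)]
            step += 1
    votes = [{} for _ in coords]
    for x, y in zip(data, labels):
        unit = votes[nearest_unit(weights, x)]
        unit[y] = unit.get(y, 0) + 1
    # Units that win no training sample stay unlabelled (None).
    unit_labels = [max(v, key=v.get) if v else None for v in votes]
    return weights, unit_labels


def predict(weights, unit_labels, x):
    """Label of the unit closest to x."""
    return unit_labels[nearest_unit(weights, x)]


def chance_level(n_classes):
    """Probability of a correct guess when all classes are equally likely."""
    return 1 / n_classes
```

Each training sample would be a vector of body features (for instance head speed, trunk periodicity and the shoulder features) computed on one segment, with the label telling whether, or how intensely, the person was laughing.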
For more details see: Mancini, M., Varni, G., Niewiadomski, R., Volpe, G., Camurri, A., How is your laugh today? In CHI 2014, Toronto, Canada.
Maurizio Mancini, Giovanna Varni, Radoslaw Niewiadomski, Gualtiero Volpe, Antonio Camurri
Casa Paganini - InfoMus
University of Genoa
viale Causa 13
16145, Genoa, Italy