Laughter movements are highly rhythmic and show saccadic patterns. To capture these motion characteristics, we developed a statistical approach that reproduces high-frequency movements such as shaking and trembling. We used coupled HMMs, which are designed to model multiple interdependent streams of observations. The coupled model can capture the relationship between modalities, in particular between head and torso motions. Our statistical model takes as input pseudo-phoneme sequences and acoustic features of the laughter sound, and outputs the head and torso animations of the virtual agent as well as its facial expressions. During training, the model captures not only the relation between input and output features but also the relation between head and torso movements.
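To make the coupling idea concrete, here is a minimal sketch of the coupled-HMM dependency structure, assuming two binary-state chains (head and torso); the state labels and all transition probabilities are illustrative, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two coupled chains, head (h) and torso (t), each with two hidden states
# (e.g. "still" = 0 and "shaking" = 1). In a coupled HMM the next state of
# each chain conditions on the previous states of BOTH chains:
#   P(h_n | h_{n-1}, t_{n-1})  and  P(t_n | h_{n-1}, t_{n-1}).
# The numbers below are illustrative assumptions only.
A_head = np.array([[[0.9, 0.1], [0.6, 0.4]],   # indexed [h_prev][t_prev] -> dist over h_next
                   [[0.3, 0.7], [0.1, 0.9]]])
A_torso = np.array([[[0.8, 0.2], [0.5, 0.5]],
                    [[0.4, 0.6], [0.2, 0.8]]])

def sample_coupled(n_steps, h=0, t=0):
    """Sample the two coupled hidden-state sequences for n_steps frames."""
    seq = []
    for _ in range(n_steps):
        h, t = (rng.choice(2, p=A_head[h, t]),   # head transition sees both chains
                rng.choice(2, p=A_torso[h, t]))  # torso transition sees both chains
        seq.append((h, t))
    return seq

states = sample_coupled(20)
```

Because each chain's transition reads the other chain's previous state, a torso that starts shaking raises the probability that the head starts shaking too, which is exactly the head-torso interdependence the coupled model exploits.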
The animation synthesis model was built from human data of two subjects, comprising 205 laugh sequences and 25,625 frames in total. Human data from another subject (54 laugh sequences, 6,750 frames) was used for validation through subjective and objective evaluation studies. The objective evaluation tested whether the coupled HMM was able to capture the relationship between multiple modalities, and measured the similarity between synthesized and real motions. Similarity was computed over three features: the main frequency of the signal, its amplitude, and its energy. Two perceptive studies were also conducted: the first compared motions resulting from the two statistical models (HMM vs. coupled HMM); the second, mirroring the objective evaluation, compared synthesized and real motions. Both evaluations showed that our model captures the dynamism of laughter movement, although it does not outperform animation driven directly by human data.
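The three similarity features can be computed straightforwardly from a 1-D motion signal; the sketch below is our minimal implementation (dominant-FFT-peak frequency, half peak-to-peak amplitude, and mean-square energy), not necessarily the exact formulas used in the evaluation:

```python
import numpy as np

def motion_features(signal, fs):
    """Main frequency (Hz), amplitude, and energy of a 1-D motion signal
    sampled at fs Hz. Definitions are our assumptions."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                                # remove offset before spectral analysis
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    main_freq = freqs[np.argmax(spectrum[1:]) + 1]  # dominant peak, skipping the DC bin
    amplitude = (x.max() - x.min()) / 2.0           # half the peak-to-peak range
    energy = float(np.sum(x ** 2)) / len(x)         # mean-square energy
    return main_freq, amplitude, energy

# Example: a 4 Hz oscillation of amplitude 0.5, sampled at 100 Hz for 2 s
fs = 100
t = np.arange(0, 2, 1.0 / fs)
f, a, e = motion_features(0.5 * np.sin(2 * np.pi * 4 * t), fs)
```

Comparing these three numbers between a synthesized and a real motion segment gives a simple per-feature similarity score.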
Two studies presented results on inter-individual differences in engaging with avatar laughter. The first evaluated an e-learning setting in which the virtual tutor's feedback was either humorous or serious. In the second, individuals with or without a fear of being laughed at created avatar laughter that was non-threatening to them by modifying the avatars' face and voice. Another two posters reported a study on laughter-eliciting emotions. Here, emotions were first elicited while participants believed they were unobserved; later, participants posed expressions of the same emotions. This allows us to compare facial responses expressed spontaneously (when people feel unobserved) with displays posed in a social situation (portraying what people think they would show when feeling a certain emotion). We focused on laughter-eliciting emotions in order to enable the avatars of the future to also portray these displays.
We had the opportunity to get feedback on our work and to talk to other researchers interested in virtual agents and laughter. Last but not least, one of our students won third place in the poster awards in the "Master" category. All in all, it was a successful conference for the team of Professor Ruch and ILHAIRE.
The task above is very challenging and is currently being addressed by researchers in the ILHAIRE project. It is worth noting that laughter is highly multimodal and that, in a social context, this multimodality can be exploited to detect it. Detecting laughter from the voice or from facial expressions can be difficult in some conditions: distinguishing and analyzing users' voices in multi-party interaction is an open challenge, and facial activity cannot be tracked reliably in ecological contexts. Consequently, we present a method for real-time automated detection of laughter and its intensity from another modality: body movement.
Body Laughter Features
The body and its movements are important indicators of laughter that have been widely neglected in the past. Some researchers observed that laughter is often accompanied by one or more (i.e., co-occurring) of the following body behaviors: "rhythmic patterns", "rock violently sideways, or more often back and forth", "nervous tremor ... over the body", "twitch or tremble convulsively". Others observed that laughing users "moved their heads backward to the left and lifted their arms resembling an open-hand gesture". Laughter has also been analyzed in professional (virtual) meetings: the user laughs "accompanying the joke's escalation in an embodied manner, moving her torso and laughing with her mouth wide open" and "even throwing her head back".
The position of the green balls is detected by looking for the green parts of the image captured by the webcam. The Kinect camera captures the user's silhouette, that is, the shape of the user's body, and the green balls' position allows us to determine which part of the silhouette corresponds to the user's head (the region above the green balls) and which to the trunk (the region below them). Figure 1 shows an example of this process.
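The split can be sketched with a simple colour threshold; the RGB thresholds below are illustrative assumptions, and `split_silhouette` is a hypothetical helper name:

```python
import numpy as np

def split_silhouette(silhouette, frame_rgb):
    """Locate the green markers in the webcam frame and split the boolean
    silhouette mask into head (above the markers) and trunk (below).
    Colour thresholds are illustrative."""
    r = frame_rgb[..., 0].astype(int)
    g = frame_rgb[..., 1].astype(int)
    b = frame_rgb[..., 2].astype(int)
    green = (g > 150) & (r < 100) & (b < 100)   # simple "bright green" test
    rows = np.where(green.any(axis=1))[0]
    marker_row = int(rows.mean())               # vertical position of the balls
    head = silhouette.copy()
    head[marker_row:] = False                   # keep only pixels above the markers
    trunk = silhouette.copy()
    trunk[:marker_row] = False                  # keep only pixels below the markers
    return head, trunk, marker_row

# Example: a 10x10 frame with two green "ball" pixels on row 5
frame = np.zeros((10, 10, 3), dtype=np.uint8)
frame[5, 4:6, 1] = 200
sil = np.zeros((10, 10), bool)
sil[2:10, 3:7] = True                           # synthetic body silhouette
head, trunk, row = split_silhouette(sil, frame)
```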
Analysis of head movement starts from the head’s silhouette, that is, the region labeled H (see Figure 1). The Center of Gravity (CoG) of the region is detected and its 2D coordinates are extracted; CoG horizontal and vertical speed are computed.
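The CoG extraction and speed computation described above can be sketched in a few lines (the function names are ours):

```python
import numpy as np

def cog(mask):
    """2-D centre of gravity (x, y) of a boolean region mask, e.g. region H."""
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean()

def cog_speed(masks, fps):
    """Horizontal and vertical CoG speed (pixels per second) across frames."""
    pts = np.array([cog(m) for m in masks])
    v = np.diff(pts, axis=0) * fps              # finite differences scaled by frame rate
    return v[:, 0], v[:, 1]                     # vx, vy

# Example: a one-pixel head region shifting 2 px to the right between frames at 30 fps
m1 = np.zeros((5, 5), bool); m1[2, 1] = True
m2 = np.zeros((5, 5), bool); m2[2, 3] = True
vx, vy = cog_speed([m1, m2], fps=30)
```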
The Kinect camera can measure the distance of the people in its field of view. Analysis of trunk movement starts from a comparison between the distances of the user's head and trunk from the Kinect camera: if the former is greater than the latter, we conclude that the user is leaning backward, and vice versa. Further mathematical computation is performed to determine whether the user is leaning backward and forward in a repetitive way, since the literature suggests that a periodic trunk movement can be an indicator of laughter. We also check whether the user's trunk moves impulsively.
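The article does not give the exact computation, but one plausible periodicity test is to autocorrelate the head-minus-trunk depth difference and look for a strong peak at a lag inside a plausible laughter band; the sketch below makes that assumption explicit:

```python
import numpy as np

def is_periodic(lean, fps, min_freq=1.0, max_freq=8.0, threshold=0.5):
    """Flag repetitive back-and-forth rocking: a normalised autocorrelation
    peak above `threshold` inside [min_freq, max_freq] Hz. Band limits and
    threshold are illustrative assumptions."""
    x = np.asarray(lean, float)
    x = x - x.mean()
    ac = np.correlate(x, x, 'full')[len(x) - 1:]
    ac = ac / ac[0]                                 # normalise so ac[0] == 1
    lo, hi = int(fps / max_freq), int(fps / min_freq)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))        # strongest peak in the band
    return bool(ac[lag] > threshold), fps / lag     # (periodic?, estimated Hz)

# Example: trunk rocking back and forth at 3 Hz, sampled at 30 fps for 3 s
t = np.arange(90) / 30.0
periodic, freq = is_periodic(np.sin(2 * np.pi * 3 * t), fps=30)
```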
Shoulder trembling is a quick, repetitive movement often displayed by laughing people. Three features based on the shoulders' vertical coordinates are extracted from the positions of the two green balls: the energy of the shoulders' movement, their correlation (how much the two shoulders move at the same time), and the shoulders' repetitiveness.
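A minimal sketch of the three shoulder features follows; the exact formulas (velocity energy, Pearson correlation of the two velocity streams, and the strongest non-zero autocorrelation peak as repetitiveness) are our assumptions:

```python
import numpy as np

def shoulder_features(left_y, right_y, fps):
    """Energy, left/right correlation, and repetitiveness of shoulder motion,
    from the vertical coordinates of the two green balls. Formulas are a sketch."""
    vl = np.diff(np.asarray(left_y, float)) * fps    # left-shoulder vertical velocity
    vr = np.diff(np.asarray(right_y, float)) * fps   # right-shoulder vertical velocity
    energy = float(np.mean(vl ** 2 + vr ** 2) / 2)   # average kinetic-like energy
    corr = float(np.corrcoef(vl, vr)[0, 1])          # simultaneity of the two shoulders
    m = (vl + vr) / 2
    m = m - m.mean()
    ac = np.correlate(m, m, 'full')[len(m) - 1:]
    rep = float(ac[1:].max() / ac[0]) if ac[0] > 0 else 0.0  # repetitiveness in [~0, 1]
    return energy, corr, rep

# Example: both shoulders oscillating together at 5 Hz, sampled at 30 fps
t = np.arange(60) / 30.0
y = np.sin(2 * np.pi * 5 * t)
energy, corr, rep = shoulder_features(y, y, fps=30)
```

Perfectly synchronised shoulders give a correlation near 1, and the sustained oscillation gives a high repetitiveness score.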
To demonstrate that it is possible to automatically determine whether, and how much, a person is laughing just by looking at her body movements, we conducted an experiment in which we made people laugh and recorded them. Participants performed two tasks: an individual one, watching video clips alone, and a social one, playing the yes/no game. The rules of the game are simple: the experimenter may ask the participant any question, and the participant must answer without using the words "yes" or "no".
We then used a machine learning model called Kohonen's self-organising map (SOM), which we trained with laughter and non-laughter data taken from the recordings; we also trained it on the intensity of the laughter shown by the user. In total we provided the map with 125 segments of laughter vs. non-laughter behaviors and 425 segments showing different intensities of laughter. We then checked whether the map could recognize laughter and its intensity on test segments. It turned out that the map recognizes laughter vs. non-laughter in more than 75% of cases and laughter intensity in 50% of cases. This result is good because the chance levels, that is, the probabilities of the map giving the correct result purely by chance, are 50% and 25%, respectively.
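The SOM-based classification can be sketched as follows; the grid size, learning schedule, toy feature clusters, and the node-labelling scheme are illustrative assumptions, not the configuration used in the study:

```python
import numpy as np

rng = np.random.default_rng(1)

class SOM:
    """Minimal Kohonen self-organising map (square grid, Gaussian neighbourhood)."""
    def __init__(self, grid=8, dim=3):
        self.grid = grid
        self.w = rng.random((grid, grid, dim))          # node weight vectors
        gy, gx = np.mgrid[0:grid, 0:grid]
        self.pos = np.stack([gy, gx], axis=-1).astype(float)

    def bmu(self, x):
        """Best-matching unit: grid coordinates of the node closest to x."""
        d = ((self.w - x) ** 2).sum(axis=-1)
        return np.unravel_index(d.argmin(), d.shape)

    def train(self, data, epochs=20, lr0=0.5, sigma0=3.0):
        for e in range(epochs):
            lr = lr0 * (1 - e / epochs)                 # decaying learning rate
            sigma = max(sigma0 * (1 - e / epochs), 0.5) # shrinking neighbourhood
            for x in rng.permutation(data):
                by, bx = self.bmu(x)
                dist2 = ((self.pos - (by, bx)) ** 2).sum(axis=-1)
                h = np.exp(-dist2 / (2 * sigma ** 2))   # neighbourhood function
                self.w += lr * h[..., None] * (x - self.w)

# Toy stand-in for the laughter vs non-laughter segments: two feature clusters
laugh = rng.normal(0.8, 0.1, (60, 3))
other = rng.normal(0.2, 0.1, (60, 3))
X = np.vstack([laugh, other])
y = np.array([1] * 60 + [0] * 60)

som = SOM()
som.train(X)

# Label each map node with the class of its nearest training sample, then
# classify new segments by the label of their best-matching unit.
flat = som.w.reshape(-1, X.shape[1])
node_label = y[((flat[:, None, :] - X[None]) ** 2).sum(-1).argmin(1)].reshape(som.grid, som.grid)

test = np.vstack([rng.normal(0.8, 0.1, (20, 3)), rng.normal(0.2, 0.1, (20, 3))])
truth = np.array([1] * 20 + [0] * 20)
pred = np.array([node_label[som.bmu(x)] for x in test])
accuracy = (pred == truth).mean()
```

On these cleanly separated toy clusters the map classifies test segments well above the 50% chance level, which is the same criterion used to judge the real results.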
For more details see: Mancini, M., Varni, G., Niewiadomski, R., Volpe, G., Camurri, A.: How is your laugh today? In: CHI 2014, Toronto, Canada.
Maurizio Mancini, Giovanna Varni, Radoslaw Niewiadomski, Gualtiero Volpe, Antonio Camurri
Casa Paganini – InfoMus
University of Genoa
viale Causa 13
16145, Genoa, Italy