Laughter data collection in Peru

Posted on Mon 01 Sep 2014
One of the goals of Work Package 1 in the ILHAIRE project was to create a multicultural and multimodal database of laughter. The multimodal part of the database was achieved through the collaboration of the Queen’s University Belfast and University of Augsburg teams. Together we set up a multimodal recording installation that was capable of recording four people in conversation at a time. This used four Kinect systems to capture movement data, four HD webcams for the visual components, and four high-quality microphones for the auditory streams of data. Handling this data required nine computers and a Network Attached Storage device capable of storing fifteen terabytes of data. All the various streams of data were synchronized using the Social Signal Interpretation (SSI) software developed by the University of Augsburg team.

We used this setup to record two parts of the ILHAIRE laughter database: the Belfast Storytelling Database, which sought to capture hilarious laughter, and the Belfast Conversational Dyads, which sought to capture conversational laughter. In the end both contained large quantities of laughter that could be categorised as hilarious and conversational. In the Belfast Storytelling Database we addressed some of our multicultural goals by using two linguistic groups: English speakers and Spanish speakers. However, the multicultural goal of the work package had been to test in a place that was not European and had relatively little interaction with WEIRD (Western, Educated, Industrialized, Rich, and Democratic) people. We chose to do some testing in Peru, as the Queen’s University team had previously collected recordings of emotional material in Peru as part of the Belfast Induced Natural Emotion Database.

Adding this truly multicultural component to the database presented many challenges, not least of which was converting a fairly static hardware installation into something more mobile. However, we also wanted to keep the setup as similar as possible to that used in the Belfast data collection sessions so that the data would be comparable. We decided to test dyads and groups of three using our conversational task, as it was the most natural of our tasks.

In creating our mobile strategy we decided to bring the sensor equipment with us and rent computers while we were out there. We came to an agreement with a local internet café to rent their computers for the duration of the project. Storage presented a problem, so we took three 2-terabyte portable drives with us. We packed up the equipment in a flight box and headed to a town called Chincha Alta in coastal Peru.

Despite being adventurous and exciting, cross-cultural data collection is always fraught with unforeseen issues. When we arrived we had the nasty surprise of being forced to pay an import duty on all the equipment. After haggling the price of the duty down to about 50% of the original cost, we paid it and proceeded to Chincha. There we met more problems. Despite assurances to the contrary, the internet café had underpowered computers, so only one served to collect the auditory data and we had to search for additional computers that were powerful enough to run the equipment. We managed to hire enough reasonably powerful laptops for a sufficient setup, but unfortunately we could not get enough fast computers to gather the Kinect data streams. So we settled on gathering audio streams and video streams from dyads and groups. We hired a house outside of town to minimise the noise disturbances that were an issue in the Belfast Induced Natural Emotion Database. We set up the lab and data collection started. One major issue was getting sufficient lighting, as the rooms were dark and had only one window. We used natural daylight with translucent gauze on the windows to diffuse the light, but for short periods of the day the sun shone directly through, causing problems. We also had several artificial uplighters to create a more even spread of lighting. However, these were popular with many local insects that took an interest in our work, and occasionally the number of dead insects would build up and catch fire in the halogen bulbs in the uplighters. Insects also caused problems with the felt backdrops that we used to provide an even background, and regular inspections had to take place to stop them eating large holes in the felt that might show up on the videos.

In the Peru data collection the participants sat opposite each other with the HD webcam placed on a table between them, replicating as closely as possible the setup used in Belfast.

We had many other issues that interfered with the data collection. Dust is a big issue in the coastal area of Peru and keeping it out of the equipment required effort. Power cuts were common but thankfully rare at the times of the day when we were testing, and the house had a backup generator that minimised disruption. However, at one stage thieves cut into and stole the power lines that delivered electricity to the area, which meant there was no reliable electricity for two days.

Recruitment of participants was an issue. Men leave to go to work during the day and women are often looking after children, and both of these factors need to be considered; looking after children while participants were in conversation became part of the workload. As many men work during the week, the weekends become an important time to gather male and mixed-sex dyads, and planning is required to maximise opportunities for gathering data that doesn’t fit with the normal rhythms of local life. Illiteracy can be a problem, and most participants had never filled in questionnaires before, so many things that can be taken for granted with “WEIRD” participants take much longer with those who have never been involved in anything like this before. Punctuality is also often an issue: a neat and tidy testing timetable can require constant revision to adapt to local chronological norms. Adaptability is the key, and it is also very useful if the topic of your scientific investigation is laughter!



One of the major limiting factors on the speed of recording was the size of the data files. Our setup captured raw information streams from the HD webcams, creating files of up to 150 GB for each participant in each 1-hour session. These had to be copied from the hard drives to the storage devices between sessions, which took a considerable amount of time. We collected data until there were about five terabytes of material on the storage devices, giving us 20 dyad sessions and 4 group sessions. Unfortunately, one of the storage devices took a knock in transit on the way back, and its data was lost and cannot be recovered. Given the file sizes and the time taken to transfer the files between devices, it was not possible to create backups of these devices, although some compressed versions had been made. The damaged device was the one with the least data on it, so only a small number of sessions were lost, but the loss has affected the group session data most severely.

So despite the trials and tribulations of data collection in Peru and the best efforts of the local fauna to upset our goals, we managed to make it back with a substantial number of synchronised, high-quality recordings of multimodal natural conversations between Peruvian participants. These are a further addition to the ILHAIRE laughter database, providing a set of data comparable to those recorded during the Belfast sessions.