Article-Journal

Data Collection in Multimodal Language and Communication Research: A Flexible Decision Framework
This publication is available only in English.

Abstract

Contemporary research on language and communication has expanded beyond its traditional focus on spoken and written forms to encompass signing, gestures, facial expressions, and other bodily actions. This shift has been accompanied by methodological advancements that extend beyond classical tools, such as tape recorders or video cameras, and include motion-tracking systems, depth cameras, and multimodal data-fusion techniques. Although these tools enable richer empirical insights, they also introduce significant conceptual and practical challenges, particularly for researchers new to multimodal data collection. In this article, we present a structured, decision-oriented workflow for multimodal data collection in language and communication research. We introduce a flexible framework that guides researchers through key methodological choices, including the alignment of research questions with data streams; study-design and acquisition strategies; synchronization and technical requirements; ethical governance; and data management, dissemination, and reuse. The framework is illustrated with case studies spanning controlled laboratory experiments, large-scale annotated sign-language corpora, and field-based research, including research on nonhuman primates. Rather than advocating a one-size-fits-all approach, we emphasize key decision points, trade-offs, and real-world examples in our discussion to help researchers navigate the complexities of multimodal data collection. By integrating perspectives from different disciplines, our flexible decision-making framework is intended as a practical tool for researchers seeking to design and implement multimodal studies and to address common conceptual and methodological challenges in the rapidly developing area of multimodal data collection.

Phonetic differences between affirmative and feedback head nods in German Sign Language (DGS): A pose estimation study
This publication is available only in English.

Abstract

This study investigates head nods in natural dyadic German Sign Language (DGS) interaction, with the aim of determining whether head nods serving different functions vary in their phonetic characteristics. Earlier research on spoken and sign language interaction has revealed that head nods vary in the form of the movement. However, most claims about the phonetic properties of head nods have been based on manual annotation without reference to naturalistic text types, and the head nods produced by the addressee have been largely ignored. There is a lack of detailed information about the phonetic properties of the addressee’s head nods and their interaction with manual cues in DGS as well as in other sign languages, and the existence of a form-function relationship of head nods remains uncertain. We hypothesize that head nods functioning in the context of affirmation differ from those signaling feedback in their form and in their co-occurrence with manual items. To test the hypothesis, we apply OpenPose, a computer vision toolkit, to extract head nod measurements from video recordings and examine head nods in terms of their duration, amplitude and velocity. We describe the basic phonetic properties of head nods in DGS and their interaction with manual items in naturalistic corpus data. Our results show that the phonetic properties of affirmative nods differ from those of feedback nods. Feedback nods appear to be on average slower in production and smaller in amplitude than affirmative nods, and they are commonly produced without a co-occurring manual element. We attribute the variations in phonetic properties to the distinct roles these cues fulfill in the turn-taking system. This research underlines the importance of non-manual cues in shaping the turn-taking system of sign languages, establishing links between such research fields as sign language linguistics, conversation analysis, quantitative linguistics and computer vision.
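The three measures named in the abstract (duration, amplitude and velocity) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that the vertical position of one head keypoint (for example, the nose) has already been extracted per video frame for a single annotated nod, and that positions are in pixels. The names `NodKinematics` and `nod_kinematics`, and the choice of peak and mean velocity as summary values, are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NodKinematics:
    duration_s: float          # time between first and last frame of the nod
    amplitude_px: float        # vertical range covered by the keypoint
    peak_velocity_px_s: float  # largest frame-to-frame speed
    mean_velocity_px_s: float  # total vertical path length divided by duration


def nod_kinematics(y: Sequence[float], fps: float) -> NodKinematics:
    """Summarize one nod from per-frame vertical keypoint positions.

    Assumption: `y` holds the vertical coordinate (in pixels) of a single
    head keypoint for consecutive frames of one nod, recorded at `fps`
    frames per second.
    """
    if len(y) < 2:
        raise ValueError("at least two frames are needed")
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Absolute vertical displacement between consecutive frames.
    steps = [abs(b - a) for a, b in zip(y, y[1:])]

    duration = (len(y) - 1) / fps
    amplitude = max(y) - min(y)
    peak_velocity = max(steps) * fps
    mean_velocity = sum(steps) / duration

    return NodKinematics(duration, amplitude, peak_velocity, mean_velocity)


# Toy trajectory (invented values, not data from the study).
example = nod_kinematics([100.0, 104.0, 110.0, 104.0, 100.0], fps=25.0)
```

Pixel-based amplitudes depend on camera distance and resolution, so a real analysis would probably normalize them (for instance, by a body-relative reference length) before comparing nods across recordings.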

Determining sentiment views of verbal multiword expressions using linguistic features

Published online on 15 May 2023

This publication is available only in English.

Abstract

We examine the binary classification of sentiment views for verbal multiword expressions (MWEs). Sentiment views denote the perspective of the holder of some opinion. We distinguish between MWEs conveying the view of the speaker of the utterance (e.g., in “The company reinvented the wheel” the holder is the implicit speaker, who criticizes the company for creating something that already exists) and MWEs conveying the view of explicit entities participating in an opinion event (e.g., in “Peter threw in the towel” the holder is Peter, who has given up on something). The task has so far been examined only on unigram opinion words. Since many features found effective for unigrams are not usable for MWEs, we propose novel ones that take into account the internal structure of MWEs, a unigram sentiment-view lexicon and various types of information from Wiktionary. We also examine distributional methods and show that the corpus on which a representation is induced has a notable impact on the classification. We perform an extrinsic evaluation on the task of opinion holder extraction and show that the learnt knowledge also improves a state-of-the-art classifier trained on BERT. Sentiment-view classification is typically framed as a task in which only a small amount of labeled training data is available. As in the case of unigrams, we show that for MWEs a feature-based approach beats state-of-the-art generic methods.