Resource- and data-intensive methods for robust fine-grained sentiment analysis

1 August 2018

In this research project, we address shortcomings in expression-level sentiment analysis. This fine-grained level has received little attention in previous work, even though it is essential for practical applications such as opinion question answering or summarization.

For the predominant type of expressions in this task, i.e. polar expressions such as nice or terrible that convey positive or negative sentiment, we will focus on the problem of unknown words. We plan to investigate the use of morphological analysis for both decomposing and synthesizing words. Moreover, we will address the issue of polar intensity: we plan to systematically compare different automatic ordering methods with each other and with human ratings.
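As a rough illustration of how morphological decomposition could help with unknown words, the sketch below backs off from an unseen word to a known base form by stripping a negating affix. The toy lexicon, the affix lists and the function name `guess_polarity` are assumptions made for this example and are not resources of the project.

```python
from typing import Optional

# Toy polarity lexicon (hypothetical entries, for illustration only).
POLARITY = {"helpful": 1, "pleasant": 1, "harm": -1, "pain": -1}

# Affixes assumed to invert the polarity of their base (illustrative subset).
NEGATING_PREFIXES = ("un", "dis")
NEGATING_SUFFIXES = ("less",)


def guess_polarity(word: str) -> Optional[int]:
    """Return +1 or -1 for a word, decomposing it if it is not in the lexicon."""
    if word in POLARITY:
        return POLARITY[word]
    # Try to strip a negating prefix, e.g. "un" + "helpful".
    for prefix in NEGATING_PREFIXES:
        base = word[len(prefix):]
        if word.startswith(prefix) and base in POLARITY:
            return -POLARITY[base]
    # Try to strip a negating suffix, e.g. "pain" + "less".
    for suffix in NEGATING_SUFFIXES:
        base = word[: -len(suffix)]
        if word.endswith(suffix) and base in POLARITY:
            return -POLARITY[base]
    return None  # still unknown after decomposition
```

In this sketch, `guess_polarity("unhelpful")` falls back to the entry for "helpful" and inverts it. A real morphological analyser would also have to handle stem changes and affixes that do not invert polarity.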

We will also create lexicons that contain different types of valence shifters. Shifters are essential for contextual classification, as they modify or even fully switch the polarity conveyed by polar expressions. Since valence shifting has so far mostly been reduced to handling common negation, this task requires a more thorough investigation of the nature of shifting.
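A minimal sketch of how such a shifter lexicon might be used in contextual classification: the polarity of a polar expression is inverted when a negation word or a content-word shifter precedes it. The lexicon entries and the simplification that a shifter affects the next polar word in the token sequence are assumptions for illustration; a real system would determine the scope of a shifter syntactically.

```python
from typing import List, Optional

# Toy lexicons (hypothetical entries, for illustration only).
POLAR = {"hope": 1, "pain": -1, "nice": 1, "terrible": -1}
NEGATION_WORDS = {"not", "no", "without"}
CONTENT_SHIFTERS = {"abandon", "alleviate"}


def phrase_polarity(tokens: List[str]) -> Optional[int]:
    """Return the polarity of the first polar word, inverted once per preceding shifter."""
    shift_pending = False
    for token in tokens:
        token = token.lower()
        if token in NEGATION_WORDS or token in CONTENT_SHIFTERS:
            # Each shifter toggles the pending inversion.
            shift_pending = not shift_pending
        elif token in POLAR:
            polarity = POLAR[token]
            return -polarity if shift_pending else polarity
    return None  # no polar expression found
```

For example, `phrase_polarity(["abandon", "all", "hope"])` yields a negative polarity and `phrase_polarity(["alleviate", "pain"])` a positive one, whereas a negation-only system would miss both shifts.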

With regard to the entity extraction tasks in expression-level sentiment analysis, i.e. opinion holder and opinion target extraction, we aim to create novel lexicons that can serve as the backbone of rule-based extraction systems. Such systems are usually fairly domain-independent and easy to create in the absence of labeled textual data.

In order to tackle the aforementioned tasks, we will employ both resource-intensive methods, i.e. rule-based methods that make use of very deep semantic representations, and data-intensive methods, i.e. corpus-based methods which may also employ standard NLP tools.

We will examine these tasks for two languages, English and German. Since the majority of previous research in natural language processing focussed on the former language, there are already sophisticated resources available which allow investigations of deep(er) linguistic methods. By contrast, for the latter these resources are not available. Accordingly, shallower methods, typically data-intensive ones, need to be applied. One additional contribution of this project is that, in particular for German, new resources, such as lexical resources and processing tools for sentiment analysis, will be created.

In connection with the comparison of resource-intensive and data-intensive methods, we also want to answer the question of which type of representation is best suited for the different classification and extraction tasks in fine-grained sentiment analysis. In this context, we will also critically assess the suitability of traditional lemma-based representations and contrast them with other potential levels, such as the sense level.

Finally, we plan to review established evaluation methods and examine whether they make sufficiently transparent which kinds of phenomena an analysis system handles well and which it does not.

Polarity Shifters Sentiment Analysis

Marc Schulder

Research Associate in Computational Linguistics

My research interests include sign languages, computational linguistics and open science.

Publications

Determining sentiment views of verbal multiword expressions using linguistic features

Published online on 15 May 2023

This publication is only available in English.

Abstract

We examine the binary classification of sentiment views for verbal multiword expressions (MWEs). Sentiment views denote the perspective of the holder of some opinion. We distinguish between MWEs conveying the view of the speaker of the utterance (e.g., in “The company reinvented the wheel” the holder is the implicit speaker who criticizes the company for creating something already existing) and MWEs conveying the view of explicit entities participating in an opinion event (e.g., in “Peter threw in the towel” the holder is Peter having given up something). The task has so far been examined on unigram opinion words. Since many features found effective for unigrams are not usable for MWEs, we propose novel ones taking into account the internal structure of MWEs, a unigram sentiment-view lexicon and various information from Wiktionary. We also examine distributional methods and show that the corpus on which a representation is induced has a notable impact on the classification. We perform an extrinsic evaluation in the task of opinion holder extraction and show that the learnt knowledge also improves a state-of-the-art classifier trained on BERT. Sentiment-view classification is typically framed as a task in which only little labeled training data are available. As in the case of unigrams, we show that for MWEs a feature-based approach beats state-of-the-art generic methods.

Michael Wiegand, Marc Schulder, Josef Ruppenhofer

Automatic Generation of Lexica for Sentiment Polarity Shifters

Published online on 9 July 2020

This publication is only available in English.

Abstract

Alleviating pain is good and abandoning hope is bad. We instinctively understand how words like alleviate and abandon affect the polarity of a phrase, inverting or weakening it. When these words are content words, such as verbs, nouns and adjectives, we refer to them as polarity shifters. Shifters are a frequent occurrence in human language and an important part of successfully modeling negation in sentiment analysis; yet research on negation modeling has focused almost exclusively on a small handful of closed class negation words, such as not, no and without. A major reason for this is that shifters are far more lexically diverse than negation words, but no resources exist to help identify them.

Marc Schulder, Michael Wiegand, Josef Ruppenhofer

Enhancing a Lexicon of Polarity Shifters through the Supervised Classification of Shifting Directions

This publication is only available in English.

Abstract

The sentiment polarity of an expression (whether it is perceived as positive, negative or neutral) can be influenced by a number of phenomena, foremost among them negation. Apart from closed-class negation words like no, not or without, negation can also be caused by so-called polarity shifters. These are content words, such as verbs, nouns or adjectives, that shift polarities in their opposite direction, e. g. abandoned in “abandoned hope” or alleviate in “alleviate pain”. Many polarity shifters can affect both positive and negative polar expressions, shifting them towards the opposing polarity. However, other shifters are restricted to a single shifting direction. Recoup shifts negative to positive in “recoup your losses”, but does not affect the positive polarity of fortune in “recoup a fortune”. Existing polarity shifter lexica only specify whether a word can, in general, cause shifting, but they do not specify when this is limited to one shifting direction. To address this issue we introduce a supervised classifier that determines the shifting direction of shifters. This classifier uses both resource-driven features, such as WordNet relations, and data-driven features like in-context polarity conflicts. Using this classifier we enhance the largest available polarity shifter lexicon.

Marc Schulder, Michael Wiegand, Josef Ruppenhofer
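The distinction drawn in the abstract above, between shifters that affect both polarities and shifters restricted to one shifting direction, can be sketched as a small lookup. This is only an illustration of the concept, not the classifier from the publication. The entry for recoup follows the example in the abstract; the entry for abandon and all names in the code are assumptions.

```python
from enum import Enum


class Direction(Enum):
    BOTH = "both"
    NEG_TO_POS = "negative_to_positive"
    POS_TO_NEG = "positive_to_negative"


# Toy lexicon. "recoup" follows the abstract's example;
# treating "abandon" as shifting in both directions is an assumption.
SHIFTER_DIRECTIONS = {
    "recoup": Direction.NEG_TO_POS,
    "abandon": Direction.BOTH,
}


def apply_shifter(shifter: str, polarity: int) -> int:
    """Invert polarity (+1 or -1) only if the shifter covers that direction."""
    direction = SHIFTER_DIRECTIONS.get(shifter)
    if direction is None:
        return polarity  # not a known shifter
    if direction is Direction.BOTH:
        return -polarity
    if direction is Direction.NEG_TO_POS and polarity < 0:
        return -polarity
    if direction is Direction.POS_TO_NEG and polarity > 0:
        return -polarity
    return polarity  # shifter does not apply to this polarity
```

With this lookup, `apply_shifter("recoup", -1)` inverts the negative polarity of "losses", while `apply_shifter("recoup", 1)` leaves the positive polarity of "fortune" unchanged.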

Sentiment Polarity Shifters: Creating Lexical Resources through Manual Annotation and Bootstrapped Machine Learning

This publication is only available in English.

Abstract

Alleviating pain is good and abandoning hope is bad. We instinctively understand how words like alleviate and abandon affect the polarity of a phrase, inverting or weakening it. When these words are content words, such as verbs, nouns and adjectives, we refer to them as polarity shifters. Shifters are a frequent occurrence in human language and an important part of successfully modeling negation in sentiment analysis; yet research on negation modeling has focussed almost exclusively on a small handful of closed class negation words, such as not, no and without. A major reason for this is that shifters are far more lexically diverse than negation words, but no resources exist to help identify them.

Marc Schulder

Automatically Creating a Lexicon of Verbal Polarity Shifters: Mono- and Cross-lingual Methods for German

With the exception of the abstract, this publication is only available in English.

Abstract

In this work, we investigate methods for creating a German-language lexicon of polarity-shifting verbs. These verbs, often also called polarity shifters, are content words that shift the polarity of a phrase to its opposite value, such as the verb “aufgeben” (to give up) in the verb phrase “alle Hoffnung aufgeben” (to give up all hope). The behaviour of polarity shifters thus resembles that of negation words such as “nicht” (not). Robust sentiment analysis requires both negation words and polarity shifters. However, while lists of negation words are available in many languages, a polarity shifter lexicon of sufficient size exists only for English.

Marc Schulder, Michael Wiegand, Josef Ruppenhofer

Introducing a Lexicon of Verbal Polarity Shifters for English

This publication is only available in English.

Abstract

The sentiment polarity of a phrase does not only depend on the polarities of its words, but also on how these are affected by their context. Negation words (e.g. not, no, never) can change the polarity of a phrase. Similarly, verbs and other content words can also act as polarity shifters (e.g. fail, deny, alleviate). While individually more sparse, they are far more numerous. Among verbs alone, there are more than 1200 shifters. However, sentiment analysis systems barely consider polarity shifters other than negation words. A major reason for this is the scarcity of lexicons and corpora that provide information on them. We introduce a lexicon of verbal polarity shifters that covers the entirety of verbs found in WordNet. We provide a fine-grained annotation of individual word senses, as well as information for each verbal shifter on the syntactic scopes that it can affect.

Marc Schulder, Michael Wiegand, Josef Ruppenhofer, Stephanie Köser

Towards Bootstrapping a Polarity Shifter Lexicon using Linguistic Features

This publication is only available in English.

Abstract

We present a major step towards the creation of the first high-coverage lexicon of polarity shifters. In this work, we bootstrap a lexicon of verbs by exploiting various linguistic features. Polarity shifters, such as abandon, are similar to negations (e.g. not) in that they move the polarity of a phrase towards its inverse, as in abandon all hope. While there exist lists of negation words, creating comprehensive lists of polarity shifters is far more challenging due to their sheer number. On a sample of manually annotated verbs we examine a variety of linguistic features for this task. Then we build a supervised classifier to increase coverage. We show that this approach drastically reduces the annotation effort while ensuring a high-precision lexicon. We also show that our acquired knowledge of verbal polarity shifters improves phrase-level sentiment analysis.

Marc Schulder, Michael Wiegand, Josef Ruppenhofer, Benjamin Roth

Separating Actor-View from Speaker-View Opinion Expressions using Linguistic Features

This publication is only available in English.

Abstract

We examine different features and classifiers for the categorization of opinion words into actor and speaker view. To our knowledge, this is the first comprehensive work to address sentiment views on the word level taking into consideration opinion verbs, nouns and adjectives. We consider many high-level features requiring only few labeled training data. A detailed feature analysis produces linguistic insights into the nature of sentiment views. We also examine how far global constraints between different opinion words help to increase classification performance. Finally, we show that our (prior) word-level annotation correlates with contextual sentiment views.

Michael Wiegand, Marc Schulder, Josef Ruppenhofer

Opinion Holder and Target Extraction for Verb-based Opinion Predicates – The Problem is Not Solved

This publication is only available in English.

Abstract

We offer a critical review of the current state of opinion role extraction involving opinion verbs. We argue that neither the currently available lexical resources nor the manually annotated text corpora are sufficient to appropriately study this task. We introduce a new corpus focusing on opinion roles of opinion verbs from the Subjectivity Lexicon and show potential benefits of this corpus. We also demonstrate that state-of-the-art classifiers perform rather poorly on this new dataset compared to the standard dataset for the task showing that there still remains significant research to be done.

Michael Wiegand, Marc Schulder, Josef Ruppenhofer