Collection: Polarity Shifter Resources

repository: github DOI licence: CC BY 4.0

This dataset is a collection of polarity shifter resources that were created as part of my doctoral studies. It was compiled to accompany my doctoral thesis “Sentiment Polarity Shifters: Creating Lexical Resources through Manual Annotation and Bootstrapped Machine Learning”.

Publications & Attribution

The polarity shifter resources contained in this collection are also connected to a number of peer-reviewed publications:

If you use the data in your research or work, please cite the relevant publication(s).

Data

The repository contains the following resources:

  1. A general lexicon of English polarity shifters, covering verbs, adjectives and nouns. Provides lemma-level labels for whether a word is a shifter and which polarities it can affect.
  2. A lexicon of English verbal shifters. Provides word-sense-level labels for shifters and their shifting scopes.
  3. A lexicon of German verbal shifters. Provides lemma-level labels for shifters.
  4. A set of verb phrases annotated for shifting polarities.

1. English Shifter Lexicon (Lemma)

A lexicon of 9145 English words, annotated for whether they are polarity shifters and which polarities they affect. The lexicon is based on the vocabulary of WordNet v3.1 (Miller et al., 1990). It contains 2631 shifters and 6514 non-shifters.

2. English Verbal Shifter Lexicon (Word Sense)

A lexicon of word senses of English verbs, annotated for whether they are polarity shifters and for their shifting scope. The lexicon covers all verbs of WordNet v3.1 (Miller et al., 1990) that are single-word or particle verbs. Polarity shifter and scope labels are given for each lemma-synset pair (i.e. each word sense of a lemma).

The data is presented in the following forms:

  1. A complete lexicon of all verbal shifters and their shifting scopes.
  2. Two auxiliary lists containing simplified information:
    1. A list of all lemmas with shifter labels
    2. A list of all word senses with shifter labels

All files are in CSV (comma-separated values) format.

2.1. Complete Lexicon

The main lexicon lists all verbal shifters and their shifting scopes. Verbal shifters are modeled as lemma-sense pairs with one or more shifting scopes.

Only lemma-sense pairs that are verbal shifters are listed; any lemma-sense pair not listed is not a verbal shifter. When a lemma-sense pair has more than one possible scope, it receives a separate entry for each scope.
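The layout described above (one entry per scope, unlisted pairs are non-shifters) can be read into a lookup table roughly as sketched below. The column names `lemma`, `synset` and `scope`, the header row, and the sample entries and label values are all assumptions made for illustration; they are not taken from the actual files, so check the real CSV layout before reusing this.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in an ASSUMED column layout; the real file may differ.
SAMPLE = """lemma,synset,scope
abandon,abandon.v.01,dobj
fail,fail.v.02,subj
fail,fail.v.02,comp
"""

def load_scopes(handle):
    """Map each (lemma, synset) pair to the set of its shifting scopes."""
    scopes = defaultdict(set)
    for row in csv.DictReader(handle):
        # A pair with several scopes appears once per scope, so collect them.
        scopes[(row["lemma"], row["synset"])].add(row["scope"])
    return dict(scopes)

def is_shifter(scopes, lemma, synset):
    """Pairs that are not listed in the lexicon are not verbal shifters."""
    return (lemma, synset) in scopes

lexicon = load_scopes(io.StringIO(SAMPLE))
```

For the real data, `io.StringIO(SAMPLE)` would be replaced by an opened file handle.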

2.2. List of Lemmas

List of all verb lemmas and whether they are shifters in at least one of their word senses. Does not provide shifter scope information.

Many verbal shifter lemmas only cause shifting in some of their word senses. This list is therefore considerably more coarse-grained than the main lexicon. It is intended as a convenience measure for quick experimentation.
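The coarsening described here can be sketched as follows: a lemma counts as a shifter as soon as any one of its word senses is a shifter. The input mapping and the example entries are hypothetical and only illustrate the rule; they do not reproduce the format of the distributed list.

```python
def lemma_labels(sense_labels):
    """Collapse {(lemma, sense): bool} to {lemma: bool}.

    A lemma is labeled a shifter if at least one of its senses is one.
    """
    labels = {}
    for (lemma, _sense), shifts in sense_labels.items():
        labels[lemma] = labels.get(lemma, False) or shifts
    return labels

# Hypothetical sense-level labels, invented for illustration.
senses = {
    ("fail", "sense-1"): False,
    ("fail", "sense-2"): True,
    ("walk", "sense-1"): False,
}
coarse = lemma_labels(senses)
```

The sense-level distinction for `fail` is lost in `coarse`, which is why the lemma list suits quick experiments better than fine-grained analysis.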

2.3. List of Synsets

List of all synsets and whether their lemmas are shifters in this specific word sense. Does not provide shifter scope information.

Shifting is usually shared among all lemmas of the same word sense. This list therefore provides (almost) the same granularity for the shifter label as the main lexicon. In a few exceptional cases, however, a synset contains words with subtly different senses that do not all cause shifting. Such synsets are labeled as shifters in this list, analogous to the generalization made in the list of lemmas.

3. German Verbal Shifter Lexicon (Lemma)

A lexicon of 2595 German verbs, annotated for whether they are polarity shifters and which polarities they affect. The lexicon is based on the vocabulary of GermaNet (Hamp and Feldweg, 1997). It contains 677 shifters and 1918 non-shifters.

4. Sentiment Verb Phrases

A set of verb phrases, annotated for the polarity of the verb phrase and the polarity of a polar noun that it contains. Can be used to evaluate whether a polarity classifier correctly recognizes polarity shifting. The file starts with 400 phrases containing shifter verbs, followed by 2231 phrases containing non-shifter verbs.

  • Every item consists of:
    • The sentence from which the VP and the polar noun were extracted.
    • The VP, polar noun and the verb heading the VP.
    • Constituency parse for the VP.
    • Gold labels for the VP and the polar noun, assigned by a human annotator.
    • Predicted labels for the VP and the polar noun from the RNTN tagger (Socher et al., 2013) and the LEX_gold approach.
  • Items are separated by a line of asterisks (*).
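A minimal sketch of reading such a file is given below. It assumes that a separator line consists only of asterisks and that blank lines carry no information. It does not interpret the fields inside an item, since their exact layout is not specified here. The sample text is invented. The split into shifter and non-shifter items relies on the ordering stated above (400 shifter phrases first).

```python
def split_items(text):
    """Split file content into items at lines made up only of asterisks."""
    items, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) == {"*"}:
            # Separator line: close the current item if it has content.
            if current:
                items.append(current)
            current = []
        elif stripped:
            # Keep non-blank content lines; blank lines are dropped.
            current.append(line)
    if current:
        items.append(current)
    return items

def partition(items, n_shifter=400):
    """Separate the leading shifter items from the non-shifter items."""
    return items[:n_shifter], items[n_shifter:]

# Made-up miniature file with two items and placeholder lines.
SAMPLE = "line a1\nline a2\n*****\nline b1\n*****\n"
items = split_items(SAMPLE)
shifter_items, non_shifter_items = partition(items, n_shifter=1)
```

With the real file, `partition(items)` with the default of 400 would yield the two groups, so that a classifier's accuracy can be reported separately for phrases with and without shifter verbs.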