Word Embedding of Product Reviews

Licence: CC BY 4.0

A word embedding of the Amazon Product Review Corpus (Jindal and Liu, 2008).

Created using Word2Vec in CBOW mode, with 500 dimensions and a window size of 5.
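
To illustrate what the window size means in CBOW mode, here is a minimal sketch in plain Python of how a target word is paired with its surrounding context words. The function name `cbow_pairs` and the sample review are hypothetical and not part of the dataset or of the Word2Vec tool. The sketch always uses the full window, whereas actual Word2Vec implementations may shrink it randomly per target word.

```python
def cbow_pairs(tokens, window=5):
    """Yield (context, target) pairs for CBOW-style training.

    The context holds up to `window` tokens on each side of the target.
    Illustrative only: this is not the training code used for the dataset.
    """
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield left + right, target


# Hypothetical lemmatised review, using a small window for readability.
review = ["the", "battery", "life", "be", "great"]
pairs = list(cbow_pairs(review, window=2))

for context, target in pairs:
    print(target, "<-", context)
```

In CBOW the model predicts the target from its context, so with a window size of 5 each word is predicted from at most ten neighbouring tokens.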

Words have been lemmatised and particle verbs have been merged into a single token (e.g. calm_down).
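
When looking up words in the embedding, queries need to follow the same token format. The sketch below shows one possible way to merge particle verbs in an already lemmatised token list. The `PARTICLE_VERBS` set and the `merge_particle_verbs` function are assumptions made for illustration. The source does not state how particle verbs were detected, and this simple lookup only handles a verb that is directly followed by its particle.

```python
# Hypothetical inventory of particle verbs as (verb lemma, particle) pairs.
PARTICLE_VERBS = {("calm", "down"), ("give", "up")}


def merge_particle_verbs(lemmas, particle_verbs=PARTICLE_VERBS):
    """Join adjacent verb + particle lemmas into one token, e.g. calm_down."""
    merged = []
    i = 0
    while i < len(lemmas):
        pair = tuple(lemmas[i:i + 2])
        if pair in particle_verbs:
            merged.append("_".join(pair))
            i += 2  # skip the particle, it is now part of the merged token
        else:
            merged.append(lemmas[i])
            i += 1
    return merged


tokens = merge_particle_verbs(["please", "calm", "down", "now"])
print(tokens)
```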

Attribution

This dataset was created as part of our IJCNLP 2017 paper “Towards Bootstrapping a Polarity Shifter Lexicon using Linguistic Features”. If you use the dataset in your research or work, please cite the publication.

Marc Schulder
Research Associate in Computational Linguistics

My research interests include sign languages, natural language processing, and open science.