Most computational approaches to metaphor detection leverage either conceptual metaphor mappings or selectional preferences. Both require extensive knowledge of the mappings or preferences in question, as well as sufficient data for every conceptual domain involved. Creating these resources is expensive and often limits the scope of such systems.
We propose a statistical approach to metaphor detection that exploits the rarity of novel metaphors: words that do not match a text's typical vocabulary are marked as metaphor candidates. No knowledge of semantic concepts or of the metaphor's source domain is required.
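The core idea can be sketched in a few lines: score each word by its probability under a unigram model of the text's typical vocabulary and flag low-probability words as candidates. This is an illustrative simplification, not the paper's actual model; the function name, smoothing scheme, and threshold are all assumptions made for the sketch.

```python
from collections import Counter

def metaphor_candidates(tokens, background_counts, total_bg, threshold=0.01):
    """Illustrative sketch (not the paper's model): flag tokens that are
    rare under a background unigram model of the text's typical vocabulary.

    tokens            -- word tokens of the text under analysis
    background_counts -- Counter of word frequencies in the background corpus
    total_bg          -- total token count of the background corpus
    threshold         -- probability below which a word is a candidate (assumed value)
    """
    vocab_size = len(background_counts)
    candidates = set()
    for word in set(tokens):
        # add-one smoothing so unseen words get a small nonzero probability
        p = (background_counts.get(word, 0) + 1) / (total_bg + vocab_size)
        if p < threshold:
            candidates.add(word)
    return candidates
```

For example, given a background corpus of mundane sentences, a word like "devoured" in "the cat devoured the sun" would fall below the threshold and be flagged, while frequent words like "the" and "cat" would not.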
We analyze the performance of this approach as a stand-alone classifier and as a feature in a machine learning model, reporting improvements in F$_1$ measure over a random baseline of 58% and 68%, respectively. We also observe that, as a feature, it appears to be particularly useful when training data is sparse, while its effect diminishes as the amount of training data increases.