Overview of Datasets for the Sign Languages of Europe

Maria Kopf, Marc Schulder, Thomas Hanke

July 2021

Type

Report

Publication

EASIER Deliverable D6.1

Versions

Latest version:
Version 1:

Abstract

This document identifies linguistic corpora that can be explored as high-quality training data for automatic translation within EASIER (as opposed to loosely aligned broadcast data). For each dataset, it lists which parts of the data are available and under what access conditions. It also lists the elicitation formats used in several corpora in order to identify those parts of the available corpora that could be explored to build multilingual resources.

To support the construction of an interlingual index across European sign languages, the document also lists the available lexical resources (lexical databases and dictionaries) and their characteristics.

Marc Schulder

Research Associate in Computational Linguistics

My research interests include sign languages, natural language processing, and open science.