This document identifies linguistic corpora that can be explored as high-quality training data for automatic translation within EASIER (as opposed to loosely aligned broadcast data).
For each data set, the document lists which parts of the data are available and under which access conditions.
It also lists the elicitation formats used in several corpora in order to identify those parts of the available corpora that could be explored to build multilingual resources.
This project note gives an overview of how pose information was created for the Public DGS Corpus using OpenPose. Pose information is machine-readable data that describes where people are located in an image by providing the coordinates of various points on each body, such as joints, eyes or ears. The data we generate consists of body, face and hand models for the informants in every camera perspective of all published transcripts.
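To make the notion of pose information more concrete, the following minimal sketch reads one frame in the per-frame JSON layout that OpenPose writes by default, where each detected person carries flat lists of x, y and confidence values under the keys `pose_keypoints_2d`, `face_keypoints_2d`, `hand_left_keypoints_2d` and `hand_right_keypoints_2d`. The sample frame is invented and shortened to three body points for illustration, and the helper names `to_triples` and `parse_frame` are hypothetical. The files distributed with the Public DGS Corpus may be packaged differently, so this should be read as an illustration of the data type rather than of the corpus format.

```python
import json

# Invented, shortened sample in OpenPose's per-frame JSON layout.
# Real output has many more points per model; values here are made up.
SAMPLE_FRAME = """
{
  "version": 1.3,
  "people": [
    {
      "pose_keypoints_2d": [640.0, 210.5, 0.93, 642.1, 330.2, 0.88, 0, 0, 0],
      "face_keypoints_2d": [],
      "hand_left_keypoints_2d": [],
      "hand_right_keypoints_2d": []
    }
  ]
}
"""

# Maps a readable model name to the corresponding OpenPose JSON key.
MODEL_KEYS = {
    "body": "pose_keypoints_2d",
    "face": "face_keypoints_2d",
    "hand_left": "hand_left_keypoints_2d",
    "hand_right": "hand_right_keypoints_2d",
}


def to_triples(flat):
    """Group a flat [x0, y0, c0, x1, y1, c1, ...] list into (x, y, confidence) tuples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoint list length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def parse_frame(text):
    """Return one dict per detected person, with a list of triples per model."""
    frame = json.loads(text)
    return [
        {name: to_triples(person.get(key, [])) for name, key in MODEL_KEYS.items()}
        for person in frame.get("people", [])
    ]


people = parse_frame(SAMPLE_FRAME)
for index, person in enumerate(people):
    for x, y, confidence in person["body"]:
        # OpenPose reports undetected points as zeros, so skip them here.
        if confidence > 0:
            print(f"person {index}: body point at ({x}, {y}), confidence {confidence}")
```

Grouping the flat lists into triples keeps each point's confidence next to its coordinates, which makes it straightforward to filter out points that were not detected before any further processing.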