We introduce the Sign Language Interchange Format, a new format for representing annotations and lexical inventories of sign language datasets. The format is designed as an intermediate step in data preparation for language technologies, unifying the annotation conventions of different corpora for further use. Complex gloss notations and implicit relations between tiers are made explicit through a hierarchy of machine-readable container structures. Sample implementations for converting to and from the new format are provided.
M. Schulder, S. Bigeard, T. Hanke and M. Kopf, “The Sign Language Interchange Format: Harmonising Sign Language Datasets For Computational Processing”, 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops: Sign Language Translation and Avatar Technology, Rhodes Island, Greece, 2023, doi: 10.1109/ICASSPW59220.2023.10193022. URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10193022&isnumber=10192577