Publications

Corpus

MIAM

The project recently released the Multilingual dIalogAct benchMark (MIAM), a collection of resources for training, evaluating, and analyzing natural language understanding systems designed specifically for spoken language. Its datasets cover English, French, German, Italian, and Spanish. The benchmark is available in the Hugging Face Datasets repository.
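As a rough illustration of how one of the MIAM corpora might be pulled from Hugging Face Datasets, the sketch below maps each language to a configuration name and loads it with `load_dataset`. The dataset identifier `"miam"`, the configuration names, and the language mapping are assumptions that should be checked against the dataset card; `load_miam` is a hypothetical helper, not part of any library.

```python
# Assumed mapping from language to MIAM configuration name.
# Verify these names against the dataset card before relying on them.
MIAM_CONFIGS = {
    "english": "maptask",
    "french": "loria",
    "german": "vm2",
    "italian": "ilisten",
    "spanish": "dihana",
}


def load_miam(language, split="train", dataset_id="miam"):
    """Load one MIAM corpus by language (hypothetical helper).

    `dataset_id` is an assumption; the identifier on the Hub may differ.
    """
    # Imported lazily so the mapping can be inspected without the library.
    from datasets import load_dataset

    config = MIAM_CONFIGS[language.lower()]
    return load_dataset(dataset_id, config, split=split)
```

Under these assumptions, `load_miam("french")` would return the training split of the French corpus. Depending on the installed version of the `datasets` library, the identifier or additional arguments may need adjusting.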

SILICONE

The project has also released the Sequence labellIng evaLuatIon benChmark fOr spoken laNguagE (SILICONE), a collection of resources for training, evaluating, and analyzing natural language understanding systems designed for spoken language. The benchmark is available in the Hugging Face Datasets repository.
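SILICONE is split into several task-specific configurations, so a small wrapper that validates the configuration name before downloading can be convenient. The sketch below assumes the dataset identifier `"silicone"` and the listed configuration names; both should be confirmed on the dataset card, and `load_silicone` is a hypothetical helper.

```python
# Assumed SILICONE configuration names; confirm against the dataset card.
SILICONE_CONFIGS = (
    "dyda_da", "dyda_e", "iemocap", "maptask", "meld_e",
    "meld_s", "mrda", "oasis", "sem", "swda",
)


def load_silicone(config, split="validation", dataset_id="silicone"):
    """Load one SILICONE configuration (hypothetical helper).

    `dataset_id` is an assumption; the identifier on the Hub may differ.
    """
    # Fail early on an unknown name, before any download is attempted.
    if config not in SILICONE_CONFIGS:
        raise ValueError(f"unknown SILICONE configuration: {config!r}")

    # Imported lazily so validation works without the library installed.
    from datasets import load_dataset

    return load_dataset(dataset_id, config, split=split)
```

Checking the name first means a typo such as `"swdaa"` raises a clear error immediately rather than after a network round trip.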
