Data for the system described in the paper submitted to Interspeech 2021

On this page you can download the features used in the experiments described in the paper accepted at Interspeech 2021 and in the NeurIPS submission, including the wav2vec 2.0 models fine-tuned on the MEDIA corpus and the features extracted with those models. For instructions on using the system described in the paper, see our git repository for Interspeech 2021.

Wav2Vec 2.0 models fine-tuned on MEDIA

Model description	Link
Self-supervised fine-tuned models
W2V2-Fr-3K-large	Download
W2V2-Fr-7K-large	Download
XLSR53-large	Download
Supervised fine-tuned models (for ASR)
W2V2-Fr-3K-large	Download
W2V2-Fr-7K-large	Download
XLSR53-large	Download

Features

Pass the features to the system with the option --serialized-corpus data-prefix, where data-prefix is the common prefix of all the filenames (train, dev, test and dict). For example, to use the spectrogram features of the MEDIA corpus (the only one currently provided here), the option is --serialized-corpus MEDIA.user+machine.spectro-Fr-Normalized.data.
All splits plus the dictionary must be downloaded for the system to work.
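
Since the system needs every file that shares the data prefix, it can help to check a download before launching a run. The following is a minimal sketch, not part of the released system: the function name missing_parts is hypothetical, and it assumes that each downloaded file name starts with the data prefix and contains the name of its part (train, dev, test or dict) somewhere after it. Adjust the check if the actual file names differ.

```python
from pathlib import Path

# Parts listed in the text above. That each part name appears in the file
# name after the prefix is an assumption, not something the page states.
REQUIRED_PARTS = ("train", "dev", "test", "dict")


def missing_parts(data_prefix, parts=REQUIRED_PARTS):
    """Return the parts for which no file starting with data_prefix was found."""
    prefix = Path(data_prefix)
    directory = prefix.parent
    if not directory.is_dir():
        return list(parts)
    # Keep only what follows the prefix in each matching file name.
    tails = [
        p.name[len(prefix.name):]
        for p in directory.iterdir()
        if p.is_file() and p.name.startswith(prefix.name)
    ]
    return [part for part in parts if not any(part in tail for tail in tails)]


if __name__ == "__main__":
    # Prefix taken from the example above, looked up in the current directory.
    missing = missing_parts("MEDIA.user+machine.spectro-Fr-Normalized.data")
    if missing:
        print("Missing files for:", ", ".join(missing))
    else:
        print("All parts found.")
```

An empty list means that a file was found for every part, so the same prefix can be given to --serialized-corpus.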

Feature description
Type	Train	Dev	Test	Dict	SLU Model
Spectrogram	Download	Download	Download	Download	Download
W2V2-En-base	Download	Download	Download	Download	Download
W2V2-En-large	Download	Download	Download	Download	Download
W2V2-Fr-1K-base	Download	Download	Download	Download	Download
W2V2-Fr-1K-large	Download	Download	Download	Download	Download
W2V2-Fr-2.6K-base	Download	Download	Download	Download	Download
W2V2-Fr-3K-base	Download	Download	Download	Download	Download
W2V2-Fr-3K-large	Download	Download	Download	Download	Download
W2V2-Fr-7K-base	Download	Download	Download	Download	Download
W2V2-Fr-7K-large	Download	Download	Download	Download	Download
XLSR53	Download	Download	Download	Download	Download
Features from self-supervised fine-tuned models
W2V2-Fr-3K-large	Download	Download	Download	Download	Download
W2V2-Fr-7K-large	Download	Download	Download	Download	Download
XLSR53	Download	Download	Download	Download	Download
Features from supervised fine-tuned models (for ASR)
W2V2-Fr-3K-large	Download	Download	Download	Download	Download
W2V2-Fr-7K-large	Download	Download	Download	Download	Download
XLSR53	Download	Download	Download	Download	Download

Results

The following table reports the results obtained on the MEDIA corpus with the system described in the Interspeech 2021 paper and in the git repository.

Token decoding (Word Error Rate)
Model	Input Features	DEV ER	TEST ER
Comparison to our previous work
ICASSP 2020 Seq	Spectrogram	29.42	28.71
Interspeech 2021
Kheops+Basic	Spectrogram	36.25	37.16

Kheops+Basic	W2V2-En-base	19.80	21.78
Kheops+Basic	W2V2-En-large	24.44	26.96

Kheops+Basic	W2V2-Fr-S-base	23.11	25.22
Kheops+Basic	W2V2-Fr-S-large	18.48	19.92
Kheops+Basic	W2V2-Fr-M-base	14.97	16.37
Kheops+Basic	W2V2-Fr-M-large	11.77	12.85

Kheops+Basic	XLSR53-large	14.98	15.74
Concept decoding (Concept Error Rate)
Model	Input Features	DEV ER	TEST ER
Comparison to our previous work
ICASSP 2020 Seq	Spectrogram	28.11	27.52
ICASSP 2020 XT	Spectrogram	23.39	24.02
Interspeech 2021
Kheops+Basic	Spectrogram	39.66	40.76
Kheops+Basic +token	Spectrogram	34.38	34.74
Kheops+LSTM +SLU	Spectrogram	33.63	34.76

Kheops+Basic +token	W2V2-En-base	26.79	26.57
Kheops+LSTM +SLU	W2V2-En-base	26.31	26.11
Kheops+Basic +token	W2V2-En-large	29.31	30.39
Kheops+LSTM +SLU	W2V2-En-large	28.38	28.57

Kheops+Basic +token	W2V2-Fr-S-base	27.18	28.27
Kheops+LSTM +SLU	W2V2-Fr-S-base	26.16	26.69
Kheops+Basic +token	W2V2-Fr-S-large	23.34	23.75
Kheops+LSTM +SLU	W2V2-Fr-S-large	22.53	23.03
Kheops+Basic +token	W2V2-Fr-M-base	22.11	21.30
Kheops+LSTM +SLU	W2V2-Fr-M-base	22.56	22.24
Kheops+Basic +token	W2V2-Fr-M-large	21.72	21.35
Kheops+LSTM +SLU	W2V2-Fr-M-large	18.54	18.62

Kheops+Basic +token	XLSR53-large	21.00	20.67
Kheops+LSTM +SLU	XLSR53-large	20.34	19.73
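
The error rates above are percentages. As a reminder of how such a rate is commonly defined, here is a minimal sketch of the standard computation: the number of substitutions, deletions and insertions needed to turn the hypothesis into the reference, divided by the number of reference items. This page does not say which scoring tool produced the figures, so the code below is an illustration of the usual definition and not the evaluation script used for the paper; the function names are hypothetical. The same function applies to words and to concept labels.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of items."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses):
    """Error rate in percent over a list of (reference, hypothesis) pairs."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return 100.0 * errors / total


if __name__ == "__main__":
    refs = [["a", "b", "c", "d"]]
    hyps = [["a", "x", "c"]]
    # One substitution (b -> x) and one deletion (d) over 4 reference items.
    print(error_rate(refs, hyps))  # 50.0
```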