Scientific publications
Neural Networks for NLP
R. Whetten, T. Parcollet, Marco Dinarelli, Y. Estève
An Analysis of Linear Complexity Attention Substitutes with BEST-RQ
International Spoken Language Technology Workshop, Macao, 2024.
V. Segonne, A. Mannion, A. Audibert, X. Liu, C. Macaire, A. Pupier, Y. Zhou, M. Aguiar, M. Norré, F. Herron, I. Eshkol-Taravella, T. François, L. Goeuriot, J. Goulian, M. Lafourcade, B. Lecouteux, F. Portet, F. Ringeval, M. Coavoux, V. Vandeghinste, Marco Dinarelli, L. Alonzo-Canul, M. R. Amini, P. Bouillon, D. Schwab, E. Esperança-Rodier
Jargon : Une suite de modèles de langues et de référentiels d'évaluation pour les domaines spécialisés du français
Journées d'Études sur la Parole, Toulouse, 2024.
R. Whetten, T. Parcollet, Marco Dinarelli, Y. Estève
Implémentation ouverte et étude de BEST-RQ pour le traitement de la parole
Journées d'Études sur la Parole, Toulouse, 2024.
Marco Dinarelli, D. Niaouri, F. Lopez, G. Gonzalez-Saez, M. Nakhle, E. Esperança-Rodier, C. Rossi, D. Schwab, N. Ballier
Context-Aware Neural Machine Translation Models Analysis And Evaluation Through Attention
TAL Journal, special issue on Explicabilité des modèles de TAL, volume 64-3, 2024.
V. Segonne, A. Mannion, L. C. Alonzo Canul, A. Audibert, X. Liu, C. Macaire, A. Pupier, Y. Zhou, M. Aguiar, F. Herron, M. Norré, M. R. Amini, P. Bouillon, I. Eshkol-Taravella, E. Esperança-Rodier, T. François, L. Goeuriot, J. Goulian, M. Lafourcade, B. Lecouteux, F. Portet, F. Ringeval, V. Vandeghinste, M. Coavoux, Marco Dinarelli, D. Schwab
Jargon: A Suite of Language Models and Evaluation Tasks for French Specialized Domains
LREC-COLING International Conference, Turin, 2024.
T. Parcollet, H. Nguyen, S. Evain, H. Le, M. Zanon Boito, S. Mdhaffar, S. Alisamir, Z. Tong, N. Tomashenko, Marco Dinarelli, A. Allauzen, Y. Estève, B. Lecouteux, F. Portet, S. Rossato, F. Ringeval, D. Schwab, L. Besacier
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Journal on Computer Speech and Language (CSL), Volume 86, Elsevier, 2024.
R. Whetten, T. Parcollet, Marco Dinarelli, Y. Estève
Open Implementation and Study of BEST-RQ for Speech Processing
ICASSP Workshop on Self-Supervision in Audio, Speech and Beyond (SASB), Seoul, 2024.
E. Gugliotta, Marco Dinarelli
An Empirical Analysis of Task Relations in the Multi-Task Annotation of an Arabizi Corpus
Language, Data and Knowledge Conference (LDK), Vienna, Austria, 2023.
E. Gugliotta, Marco Dinarelli, O. Kraif
Multi-Task Sequence Prediction For Tunisian Arabizi Multi-Level Annotation
The Fifth Arabic Natural Language Processing Workshop (WANLP), Barcelona, Spain, 2020.
Marco Dinarelli, L. Grobol
Hybrid Neural Models For Sequence Modelling: The Best Of Three Worlds
arXiv Technical Report, 2019.
English translation of the 2019 TALN paper
Marco Dinarelli, L. Grobol
Modèles neuronaux hybrides pour la modélisation de séquences : le meilleur de trois mondes
Traitement Automatique des Langues Naturelles, Toulouse, France, 2019.
Marco Dinarelli, L. Grobol
Seq2Biseq: Bidirectional Output-wise Recurrent Neural Networks for Sequence Modelling
International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), La Rochelle, France, 2019.
Marco Dinarelli, L. Grobol
Modélisation d'un contexte global d'étiquettes pour l'étiquetage de séquences dans les réseaux neuronaux récurrents
Journée TAL&IA, Nancy, France, 2018.
Marco Dinarelli, Y. Dupont, I. Tellier
Effective Spoken Language Labeling with Deep Recurrent Neural Networks
arXiv Technical Report, 2017.
Y. Dupont, Marco Dinarelli, I. Tellier
Réseaux neuronaux profonds pour l’étiquetage de séquences
Short paper at Traitement Automatique des Langues Naturelles (TALN), Orléans, France, 2017.
Marco Dinarelli, Y. Dupont
Modélisation de dépendances entre étiquettes dans les réseaux neuronaux récurrents
TAL Journal (Traitement Automatique des Langues), volume 58, issue 1, France, 2017.
Y. Dupont, Marco Dinarelli, I. Tellier
Label-Dependencies Aware Recurrent Neural Networks
International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Budapest, Hungary, 2017.
Published in Lecture Notes in Computer Science (LNCS), Springer
Best Verifiability, Reproducibility, and Working Description award
Marco Dinarelli, I. Tellier
Improving Recurrent Neural Networks for Sequence Labelling
arXiv Technical Report, 2016.
Spoken Language Understanding
Marco Dinarelli, M. Naguib, F. Portet
Toward Low-Cost End-to-End Spoken Language Understanding
Interspeech, Incheon, Korea, 2022.
M. Naguib, F. Portet, Marco Dinarelli
Vers la compréhension automatique de la parole bout-en-bout à moindre effort
Traitement Automatique des Langues Naturelles, Avignon, France, 2022.
S. Evain, H. Nguyen, H. Le, M. Zanon Boito, S. Mdhaffar, S. Alisamir, Z. Tong, N. Tomashenko, Marco Dinarelli, T. Parcollet, A. Allauzen, Y. Estève, B. Lecouteux, F. Portet, S. Rossato, F. Ringeval, D. Schwab, L. Besacier
LeBenchmark, un référentiel d'évaluation pour le français oral
Journées d'Études sur la Parole, Île de Noirmoutier, France, 2022.
S. Evain, H. Nguyen, H. Le, M. Zanon Boito, S. Mdhaffar, S. Alisamir, Z. Tong, N. Tomashenko, Marco Dinarelli, T. Parcollet, A. Allauzen, Y. Estève, B. Lecouteux, F. Portet, S. Rossato, F. Ringeval, D. Schwab, L. Besacier
Modèles neuronaux pré-appris par auto-supervision sur des enregistrements de parole en français
Journées d'Études sur la Parole, Île de Noirmoutier, France, 2022.
S. Evain, H. Nguyen, H. Le, M. Zanon Boito, S. Mdhaffar, S. Alisamir, Z. Tong, N. Tomashenko, Marco Dinarelli, T. Parcollet, A. Allauzen, Y. Estève, B. Lecouteux, F. Portet, S. Rossato, F. Ringeval, D. Schwab, L. Besacier
Task Agnostic and Task Specific Self-Supervised Learning from Speech with LeBenchmark
In Proceedings of NeurIPS, Datasets and Benchmarks Track, 2021.
S. Evain, H. Nguyen, H. Le, M. Zanon Boito, S. Mdhaffar, S. Alisamir, Z. Tong, N. Tomashenko, Marco Dinarelli, T. Parcollet, A. Allauzen, Y. Estève, B. Lecouteux, F. Portet, S. Rossato, F. Ringeval, D. Schwab, L. Besacier
LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
In Proceedings of Interspeech, Brno, Czech Republic, 2021.
Marco Dinarelli, N. Kapoor, B. Jabaian, L. Besacier
A Data-Efficient End-to-End Spoken Language Understanding Architecture
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, 2020.
Marco Dinarelli, V. Vukotic, C. Raymond
Label-dependency coding in Simple Recurrent Networks for Spoken Language Understanding
In Proceedings of Interspeech, Stockholm, Sweden, 2017.
Marco Dinarelli, S. Rosset
Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding
In Proceedings of Empirical Methods in Natural Language Processing (EMNLP), Edinburgh, U.K., 2011.
Marco Dinarelli, A. Moschitti, G. Riccardi
Discriminative Reranking for Spoken Language Understanding
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), volume 20, issue 2, pages 526-539, 2012.
S. Hahn, Marco Dinarelli, C. Raymond, F. Lefèvre, P. Lehnen, R. De Mori, A. Moschitti, H. Ney, G. Riccardi
Comparing Stochastic Approaches to Spoken Language Understanding in Multiple Languages
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), volume 19, issue 6, pages 1569-1583, 2010.
Marco Dinarelli, A. Moschitti, G. Riccardi
Hypotheses Selection For Re-ranking Semantic Annotations
IEEE Workshop on Spoken Language Technology (SLT), Berkeley, U.S.A., 2010.
Marco Dinarelli
Spoken Language Understanding: from Spoken Utterances to Semantic Structures
Ph.D. Dissertation, University of Trento
Department of Computer Science and Information Engineering (DISI), Italy, 2010.
Marco Dinarelli, A. Moschitti, G. Riccardi
Reranking Models Based On Small Training Data For Spoken Language Understanding
In Proceedings of Empirical Methods in Natural Language Processing (EMNLP), Singapore, 2009.
Marco Dinarelli, A. Moschitti, G. Riccardi
Concept Segmentation And Labeling For Conversational Speech
In Proceedings of Interspeech, Brighton, U.K., 2009.
Machine Translation
G. Gonzalez-Saez, F. Lopez, M. Nakhlé, Marco Dinarelli, E. Esperança-Rodier, S. He, C. Rossi, D. Schwab, J. Yang, J.R. Turner, N. Ballier
The MAKE-NMTViz Project: Meaningful, Accurate and Knowledge-limited Explanations of NMT Systems for Translators
25th Annual Conference of the European Association for Machine Translation, Sheffield, 2024.
G. Gonzalez-Saez, M. Nakhlé, J.R. Turner, F. Lopez, N. Ballier, Marco Dinarelli, E. Esperança-Rodier, S. He, R. Qader, C. Rossi, D. Schwab, J. Yang
Exploring NMT Explainability for Translators Using NMT Visualising Tools
25th Annual Conference of the European Association for Machine Translation, Sheffield, 2024.
F. Lopez, G. González-Sáez, D. Hansen, M. Nakhlé, B. Namdarzadeh, Marco Dinarelli, E. Esperança-Rodier, S. He, S. Mohseni, C. Rossi, D. Schwab, J. Yang, J.-B. Yunès, L. Zhu, N. Ballier
The MAKE-NMTViz System Description for the WMT23 Literary Task
WMT Conference, Singapore, 2023.
L. Lupo, Marco Dinarelli, L. Besacier
Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation
Workshop on Insights from Negative Results in NLP, Dubrovnik, Croatia, 2023.
L. Lupo, Marco Dinarelli, L. Besacier
Focused Concatenation for Context-Aware Neural Machine Translation
Seventh Conference on Machine Translation, Abu Dhabi, 2022.
L. Lupo, Marco Dinarelli, L. Besacier
Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models
Association for Computational Linguistics, 2022.
Pre-print of an earlier version available on arXiv
Named Entity Recognition
Y. Dupont, Marco Dinarelli, I. Tellier, C. Lautier
Structured Named Entity Recognition by Cascading CRFs
International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Budapest, Hungary, 2017.
Published in Lecture Notes in Computer Science (LNCS), Springer
T. Tian, Marco Dinarelli, I. Tellier, P. Dias Cardoso
Domain Adaptation for Named Entity Recognition Using CRFs
Language Resources and Evaluation Conference (LREC), Portorož, Slovenia, 2016.
Y. Dupont, I. Tellier, C. Lautier, Marco Dinarelli
Extraction automatique d'affixes pour la reconnaissance d'entités nommées chimiques
Extraction et Gestion des Connaissances, Reims, France, 2016.
Accepted for publication
T. Tian, Marco Dinarelli, I. Tellier
Data Adaptation for Named Entity Recognition on Tweets with Features-Rich CRF
ACL Workshop on Noisy User-generated Text: Twitter Lexical Normalization and Named Entity Recognition, Beijing, China, 2015.
T. Tian, Marco Dinarelli, I. Tellier, P. Cardoso
Etiquetage morpho-syntaxique de tweets avec des CRF
Traitement Automatique des Langues Naturelles, Caen, France, 2015.
Marco Dinarelli, S. Rosset
Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results
In Proceedings of the Language Resources and Evaluation Conference (LREC), Istanbul, Turkey, 2012.
Text Categorization
A. Garcia-Fernandez, A.L. Ligozat, Marco Dinarelli, D. Bernhard
Méthodes pour l'archéologie linguistique: datation par combinaison d'indices temporels
Expérimentations et évaluations en fouille de textes : un panorama des campagnes DEFT, edited by Cyril Grouin and Dominic Forest, Hermes Lavoisier, 2012.
Coreference Resolution
L. Grobol, I. Tellier, E. de La Clergerie, Marco Dinarelli, F. Landragin
Apports des analyses syntaxiques pour la détection automatique de mentions dans un corpus de français oral
Short paper at Traitement Automatique des Langues Naturelles (TALN), Orléans, France, 2017.
A. Désoyer, F. Landragin, I. Tellier, A. Lefevre, J.-Y. Antoine, Marco Dinarelli
Coreference Resolution for French Oral Data: Machine Learning Experiments with ANCOR
International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Konya, Turkey, 2016.
Published in Lecture Notes in Computer Science (LNCS), Springer
Resource Collection
E. Gugliotta, Marco Dinarelli
TArC: Tunisian Arabish Corpus, First complete release
Language Resources and Evaluation Conference (LREC), Marseille, France, 2022.
E. Gugliotta, Marco Dinarelli
TArC Un corpus d'Arabish tunisien
Traitement Automatique des Langues Naturelles, Nancy, France, 2020.
E. Gugliotta, Marco Dinarelli
TArC: Incrementally and Semi-Automatically Collecting a Tunisian Arabish Corpus
Language Resources and Evaluation Conference (LREC), Marseille, France, 2020.
L. Grobol, I. Tellier, E. De La Clergerie, Marco Dinarelli, F. Landragin
ANCOR-AS: Enriching the ANCOR Corpus with Syntactic Annotations
Language Resources and Evaluation Conference (LREC), Miyazaki, Japan, 2018.
Ontology-Based Semantic Analysis
Information Extraction
T. Tian, I. Tellier, Marco Dinarelli, P. Dias Cardoso
Understanding Social Media Texts with Minimum Human Effort on #Twitter
Language and the new (instant) media (PLIN), Louvain-la-Neuve, Belgium, 2016.
Accepted for publication