@inproceedings{segonne-etal-2024-jargon,
    title = "Jargon: A Suite of Language Models and Evaluation Tasks for {F}rench Specialized Domains",
    author = "Segonne, Vincent and Mannion, Aidan and Alonzo Canul, Laura Cristina and Audibert, Alexandre Daniel and Liu, Xingyu and Macaire, C{\'e}cile and Pupier, Adrien and Zhou, Yongxin and Aguiar, Mathilde and Herron, Felix E. and Norr{\'e}, Magali and Amini, Massih R and Bouillon, Pierrette and Eshkol-Taravella, Iris and Esperan{\c{c}}a-Rodier, Emmanuelle and Fran{\c{c}}ois, Thomas and Goeuriot, Lorraine and Goulian, J{\'e}r{\^o}me and Lafourcade, Mathieu and Lecouteux, Benjamin and Portet, Fran{\c{c}}ois and Ringeval, Fabien and Vandeghinste, Vincent and Coavoux, Maximin and Dinarelli, Marco and Schwab, Didier",
    editor = "Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.827",
    pages = "9463--9476",
    abstract = "Pretrained Language Models (PLMs) are the de facto backbone of most state-of-the-art NLP systems. In this paper, we introduce a family of domain-specific PLMs for French, focusing on three important domains: transcribed speech, medicine, and law. We use a transformer architecture based on efficient methods (LinFormer) to maximise their utility, since these domains often involve processing long documents. We evaluate and compare our models to state-of-the-art models on a diverse set of tasks and datasets, some of which are introduced in this paper. We gather the datasets into a new French-language evaluation benchmark for these three domains. We also compare various training configurations: continued pretraining, pretraining from scratch, and single- versus multi-domain pretraining. Extensive domain-specific experiments show that it is possible to attain competitive downstream performance even when pretraining with the approximate LinFormer attention mechanism. For full reproducibility, we release the models and pretraining data, as well as the contributed datasets.",
}
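
The abstract attributes the models' long-document efficiency to LinFormer-style approximate attention. The snippet below is a minimal, self-contained sketch of that mechanism (keys and values are projected from sequence length n down to a fixed k, reducing attention cost from O(n^2) to O(n*k)); it is an illustration of the general technique, not code from the paper, and all dimensions are made up for the example.

import torch
import torch.nn.functional as F

n, d, k = 1024, 64, 256            # sequence length, head dim, projection dim (illustrative)
q   = torch.randn(1, n, d)         # queries
key = torch.randn(1, n, d)         # keys
v   = torch.randn(1, n, d)         # values

# In a real model E and Fp are learned; here they are random stand-ins.
E  = torch.randn(k, n) / n**0.5    # compresses keys along the sequence axis
Fp = torch.randn(k, n) / n**0.5    # compresses values along the sequence axis

k_proj = E @ key                   # (1, k, d): n keys summarized into k
v_proj = Fp @ v                    # (1, k, d): n values summarized into k
attn = F.softmax(q @ k_proj.transpose(-2, -1) / d**0.5, dim=-1)  # (1, n, k)
out = attn @ v_proj                # (1, n, d): same output shape as full attention
print(out.shape)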
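
The entry also states that the pretrained models are publicly released. A plausible way to load one is via the Hugging Face transformers library, as sketched below; the model identifier is hypothetical (check the paper's release page for the actual checkpoint names), and trust_remote_code=True is an assumption, on the grounds that a LinFormer-based architecture would likely ship as custom modeling code rather than a built-in class.

from transformers import AutoModel, AutoTokenizer

# Hypothetical checkpoint name, for illustration only.
model_id = "PantagrueLLM/jargon-general-base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

# Encode a short French medical sentence and inspect the contextual embeddings.
inputs = tokenizer("Le patient présente une dyspnée d'effort.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)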