Publications

2025

  1. Metronome: tracing variation in poetic meters via local sequence alignment
    Ben Nagy, Artjoms Šeļa, Mirella De Sisto, and Petr Plecháč
    Computational Humanities Research, vol. 1, pp. e1, 2025
  2. An Interdisciplinary Approach to Human-Centered Machine Translation
    Marine Carpuat, Omri Asscher, Kalika Bali, Luisa Bentivogli, Frédéric Blain, Lynne Bowker, Monojit Choudhury, Hal Daumé III, Kevin Duh, Ge Gao, Alvin Grissom II, Marzena Karpinska, Elaine C. Khoong, William D. Lewis, André F.T. Martins, Mary Nurminen, Douglas W. Oard, Maja Popovic, Michel Simard, and François Yvon
    arXiv preprint arXiv:2506.13468, 2025
  3. Are We Paying Attention to Her? Investigating Gender Disambiguation and Attention in Machine Translation
    In Proceedings of the 3rd Workshop on Gender-Inclusive Translation Technologies (GITT 2025), pp. 1–16, Jun, 2025
  4. Findings of the WMT25 Shared Task on Automated Translation Evaluation Systems: Linguistic Diversity is Challenging and References Still Help
    Alon Lavie, Greg Hanneman, Sweta Agrawal, Diptesh Kanojia, Chi-kiu Lo, Vilém Zouhar, Frederic Blain, Chrysoula Zerva, Eleftherios Avramidis, Sourabh Dattatray Deoghare, Archchana Sindhujan, Jiayi Wang, David Ifeoluwa Adelani, Brian Thompson, Tom Kocmi, Markus Freitag, and Daniel Deutsch
    In Proceedings of the Tenth Conference on Machine Translation (WMT 2025), pp. 414–461, Nov, 2025

2024

  1. XSL-HoReCo and GoSt-ParC-Sign: Two New Signed Language - Written Language Parallel Corpora
    Mirella De Sisto, Vincent Vandeghinste, Caro Brosens, Myriam Vermeerbergen, and Dimitar Shterionov
    Selected papers from the CLARIN Annual Conference 2023, 2024
  2. PoeTree: Poetry Treebanks in Czech, English, French, German, Hungarian, Italian, Portuguese, Russian, Slovenian and Spanish
    Petr Plecháč, Silvie Cinková, Robert Kolár, Artjoms Šeļa, Mirella De Sisto, Lara Nugues, Thomas Haider, and Neža Kočnik
    Research Data Journal for the Humanities and Social Sciences, pp. 1–17, 2024
  3. What do Large Language Models Need for Machine Translation Evaluation?
    Shenbin Qian, Archchana Sindhujan, Minnie Kabra, Diptesh Kanojia, Constantin Orăsan, Tharindu Ranasinghe, and Frédéric Blain
    2024
  4. DORE: A Dataset For Portuguese Definition Generation
    Anna Beatriz Dimas Furtado, Tharindu Ranasinghe, Frédéric Blain, and Ruslan Mitkov
    2024
  5. Understanding poetry using natural language processing tools: a survey
    Mirella De Sisto, Laura Hernández-Lorenzo, Javier De la Rosa, Salvador Ros, and Elena González-Blanco
    Digital Scholarship in the Humanities, pp. fqae001, Feb, 2024
  6. Microvariation in the second form of the infinitive in Campania: the case of Valle Caudina
    Kim Groothuis and Mirella De Sisto
    Isogloss, vol. 10, Mar, 2024
  7. Forged-GAN-BERT: Authorship Attribution for LLM-Generated Forged Novels
    Kanishka Silva, Ingo Frommholz, Burcu Can, Frédéric Blain, Raheem Sarwar, and Laura Ugolini
    In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pp. 325–337, Mar, 2024
  8. Guiding In-Context Learning of LLMs through Quality Estimation for Machine Translation
    In Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track), pp. 88–101, Sep, 2024
  9. Are LLMs Breaking MT Metrics? Results of the WMT24 Metrics Shared Task
    Markus Freitag, Nitika Mathur, Daniel Deutsch, Chi-Kiu Lo, Eleftherios Avramidis, Ricardo Rei, Brian Thompson, Frederic Blain, Tom Kocmi, Jiayi Wang, David Ifeoluwa Adelani, Marianna Buchicchio, Chrysoula Zerva, and Alon Lavie
    In Proceedings of the Ninth Conference on Machine Translation, pp. 47–81, Nov, 2024
  10. Findings of the Quality Estimation Shared Task at WMT 2024: Are LLMs Closing the Gap in QE?
    Chrysoula Zerva, Frédéric Blain, José G. C. De Souza, Diptesh Kanojia, Sourabh Deoghare, Nuno M. Guerreiro, Giuseppe Attanasio, Ricardo Rei, Constantin Orasan, Matteo Negri, Marco Turchi, Rajen Chatterjee, Pushpak Bhattacharyya, Markus Freitag, and André Martins
    In Proceedings of the Ninth Conference on Machine Translation, pp. 82–109, Nov, 2024

2023

  1. Evaluating the Effectiveness of Pre-trained Language Models in Predicting the Helpfulness of Online Product Reviews
    Ali Boluki, Javad Pourmostafa Roshan Sharami, and Dimitar Shterionov
    2023
  2. SignON: Sign Language Translation. Progress and challenges
    Vincent Vandeghinste, Dimitar Shterionov, Mirella De Sisto, Aoife Brady, Mathieu De Coster, Lorraine Leeson, Josep Blat, Frankie Picron, Marcello Paolo Scipioni, Aditya Parikh, Louis Bosch, John O’Flaherty, Joni Dambre, Jorn Rijckaert, Bram Vanroy, Victor Ubieto Nogales, Santiago Egea Gomez, Ineke Schuurman, Gorka Labaka, Adrian Nunez-Marcos, Irene Murtagh, Euan McGill, and Horacio Saggion
    In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 2023
  3. GoSt-ParC-Sign: Gold Standard Parallel Corpus of Sign and spoken language
    Mirella De Sisto, Vincent Vandeghinste, Lien Soetemans, Caro Brosens, and Dimitar Shterionov
    In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, pp. 503–504, 2023
  4. Report on Europe’s Sign Languages. Deliverable European Language Equality II project
    Vincent Vandeghinste, Mirella De Sisto, Maria Kopf, Marc Schulder, Caro Brosens, Lien Soetemans, Rehana Omardeen, Frankie Picron, Davy Van Landuyt, Irene Murtagh, Eleftherios Avramidis, and Mathieu De Coster
    2023
  5. NGT-HoReCo and GoSt-ParC-Sign: Two new Sign Language - Spoken Language parallel corpora
    Mirella De Sisto, Vincent Vandeghinste, Dimitar Shterionov, Lien Soetemans, and Caro Brosens
    In CLARIN Annual Conference Proceedings, 2023, pp. 6–9, 2023
  6. The development of a poetic tradition. A study on a Dutch Renaissance Poetry Corpus
    Mirella De Sisto
    Studia Metrica et Poetica, vol. 10, pp. 36–68, 2023
  7. Tailoring Domain Adaptation for Machine Translation Quality Estimation
    In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, pp. 9–20, Jun, 2023
  8. A New English-Dutch-NGT Corpus for the Hospitality Domain
    Mirella De Sisto, Vincent Vandeghinste, and Dimitar Shterionov
    In Proceedings of the Second International Workshop on Automatic Translation for Signed and Spoken Languages, pp. 34–37, Jun, 2023
  9. Authorship Attribution of Late 19th Century Novels using GAN-BERT
    Kanishka Silva, Burcu Can, Frédéric Blain, Raheem Sarwar, Laura Ugolini, and Ruslan Mitkov
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pp. 310–320, Jul, 2023
  10. Text Data Augmentation Using Generative Adversarial Networks – A Systematic Review
    Kanishka Silva, Burcu Can, Raheem Sarwar, Frédéric Blain, and Ruslan Mitkov
    Journal of Computational and Applied Linguistics, vol. 1, pp. 6–38, Jul, 2023
  11. Results of WMT23 Metrics Shared Task: Metrics Might Be Guilty but References Are Not Innocent
    Markus Freitag, Nitika Mathur, Chi-kiu Lo, Eleftherios Avramidis, Ricardo Rei, Brian Thompson, Tom Kocmi, Frédéric Blain, Daniel Deutsch, Craig Stewart, Chrysoula Zerva, Sheila Castilho, Alon Lavie, and George Foster
    In Proceedings of the Eighth Conference on Machine Translation, pp. 578–628, Dec, 2023
  12. Findings of the WMT 2023 Shared Task on Quality Estimation
    Frédéric Blain, Chrysoula Zerva, Ricardo Ribeiro, Nuno M. Guerreiro, Diptesh Kanojia, José G. Souza, Beatriz Silva, Tânia Vaz, Yan Jingxuan, Fatemeh Azadi, Constantin Orasan, and André Martins
    In Proceedings of the Eighth Conference on Machine Translation, pp. 629–653, Dec, 2023
  13. Quality Estimation-Assisted Automatic Post-Editing
    Sourabh Deoghare, Diptesh Kanojia, Frédéric Blain, Tharindu Ranasinghe, and Pushpak Bhattacharyya
    In Findings of the Association for Computational Linguistics: EMNLP 2023, pp. 1686–1698, Dec, 2023

2022

  1. The Ecological Footprint of Neural Machine Translation Systems
    2022
  2. Modelli di demarcazione della metà verso nel metro rinascimentale romanzo
    Mirella De Sisto
    ILLA - Nuove Ricerche Umanistiche, vol. Interruzioni e cesure. Fenomeni e pratiche della discontinuità in linguistica, letteratura e arti performative, pp. 192–200, 2022
  3. Quality Estimation for the Translation Industry – Data Challenges
    Javad Pourmostafa Roshan Sharami, Elena Murgolo, and Dimitar Shterionov
    In The 32nd Meeting of Computational Linguistics in The Netherlands, CLIN, Jun, 2022
  4. Challenges with Sign Language Datasets for Sign Language Recognition and Translation
    Mirella De Sisto, Vincent Vandeghinste, Santiago Egea Gómez, Mathieu De Coster, Dimitar Shterionov, and Horacio Saggion
    In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 2478–2487, Jun, 2022
  5. A Quality Estimation and Quality Evaluation Tool for the Translation Industry
    In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pp. 307–308, Jun, 2022
  6. Final Devoicing in Dutch Medieval and Renaissance Texts: A Preliminary Study on Orthographic Variation
    Mirella De Sisto
    Filologia Germanica - Germanic Philology, pp. 99–117, Dec, 2022

2021

  1. Automatic quantitative metrical analysis of Spanish Poetry with Rantanplan: a first approach
    Laura Hernández Lorenzo, Mirella De Sisto, Pérez, Javier De la Rosa, Salvador Ros, and Elena González-Blanco
    Tackling the Toolkit. Plotting Poetry through Computational Literary Studies, 2021
  2. Generating Gender Augmented Data for NLP
    Nishtha Jain, Maja Popović, Declan Groves, and Eva Vanmassenhove
    In Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing, pp. 93–102, Aug, 2021
  3. gENder-IT: An Annotated English-Italian Parallel Challenge Set for Cross-Linguistic Natural Gender Phenomena
    Eva Vanmassenhove and Johanna Monti
    In Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing, pp. 1–7, Aug, 2021
  4. Defining meaningful units. Challenges in sign segmentation and segment-meaning mapping (short paper)
    Mirella De Sisto, Dimitar Shterionov, Irene Murtagh, Myriam Vermeerbergen, and Lorraine Leeson
    In Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL), pp. 98–103, Aug, 2021
  5. Early-stage development of the SignON application and open framework – challenges and opportunities
    Dimitar Shterionov, John J O’Flaherty, Edward Keane, Connor O’Reilly, Marcello Paolo Scipioni, Marco Giovanelli, and Matteo Villa
    In Proceedings of Machine Translation Summit XVIII: Users and Providers Track, pp. 277–290, Aug, 2021
  6. NeuTral Rewriter: A Rule-Based and Neural Approach to Automatic Rewriting into Gender Neutral Alternatives
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 8940–8948, Nov, 2021
  7. Transformers analyzing poetry. Multilingual metrical pattern prediction with transfomer-based language models
    Javier De la Rosa, Pérez, Mirella De Sisto, Laura Hernández Lorenzo, Aitor Diaz, Salvador Ros, and Elena González-Blanco
    Neural Computing and Applications, Nov, 2021
  8. Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts
    Javad Pourmostafa Roshan Sharami, Dimitar Shterionov, and Pieter Spronck
    Computational Linguistics in the Netherlands Journal, vol. 11, pp. 213–230, Dec, 2021