Continual Learning with Multilingual Foundation Model

Barathi Ganesh HB, Michal Ptaszynski, Rene Melendez, Juuso Eronen

Published May 14, 2026

Featured #5

In the daily list May 15, 2026

Daily score: 70.8

Editorial review: 7.5

Relevance: 0.456

Freshness: 0.722

Why It Matters

What makes this one worth your time

This work improves understanding of how LGBTQ+-related slurs are used on social media, in particular the distinction between reclamatory and non-reclamatory use. That distinction matters both for academic research on language dynamics and for practical moderation tools.

A novel framework for detecting reclaimed slurs in multilingual social media discourse.

Summary

The paper introduces a multi-stage framework for detecting reclaimed LGBTQ+-related slurs in English, Spanish, and Italian tweets. It addresses data scarcity and class imbalance with back-translation augmentation, dynamic epoch-level undersampling, masked language modeling pre-training, and language-specific decision thresholds.

Key contributions

Development of a multi-stage framework for slur detection that incorporates data-driven model selection and semantic-preserving augmentation.
Evaluation of eight multilingual embedding models, with XLM-RoBERTa selected as the foundation model based on macro-averaged F1 score.
Implementation of a reproducible methodology with publicly available code and experimental setup.
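
As a rough illustration of data-driven model selection, the sketch below computes macro-averaged F1 per cross-validation fold and picks the candidate with the highest mean. It is a minimal sketch, not the authors' code: the function names, the candidate names `model_a` and `model_b`, and the toy predictions are all hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def select_model(fold_predictions):
    """fold_predictions maps a model name to a list of (y_true, y_pred)
    pairs, one pair per cross-validation fold."""
    mean_scores = {
        name: sum(macro_f1(t, p) for t, p in folds) / len(folds)
        for name, folds in fold_predictions.items()
    }
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


# Hypothetical two-fold predictions for two made-up candidates.
folds = {
    "model_a": [([1, 1, 0, 0], [1, 1, 0, 0]), ([1, 0, 0, 0], [1, 0, 0, 0])],
    "model_b": [([1, 1, 0, 0], [1, 0, 0, 0]), ([1, 0, 0, 0], [0, 0, 0, 0])],
}
best_name, scores = select_model(folds)
```

Macro averaging weights every class equally, so a model that ignores the rare class is penalized even when its overall accuracy is high.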

Notable insights

The integration of dynamic epoch-level undersampling with inductive transfer learning is a clever approach to mitigate class imbalance in multilingual datasets.
The use of language-specific decision thresholds optimized via ROC analysis highlights the importance of linguistic variation in sentiment expression.
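
The dynamic epoch-level undersampling mentioned above can be sketched roughly as follows: every epoch keeps all examples of the smallest class and draws a fresh random subset of each larger class. This is a sketch under assumptions, not the paper's implementation. The 1:1 target ratio, the function name `undersample_epoch`, and the toy data are hypothetical, since the abstract does not state the sampling ratio.

```python
import random


def undersample_epoch(examples, labels, rng):
    """Return a balanced list of (example, label) pairs: every example of
    the smallest class plus an equally sized random draw from each larger
    class. Assumes a 1:1 target ratio, which the source does not specify."""
    by_class = {}
    for example, label in zip(examples, labels):
        by_class.setdefault(label, []).append(example)
    n_min = min(len(items) for items in by_class.values())
    subset = []
    for label, items in by_class.items():
        chosen = items if len(items) == n_min else rng.sample(items, n_min)
        subset.extend((example, label) for example in chosen)
    rng.shuffle(subset)
    return subset


rng = random.Random(0)
examples = [f"tweet_{i}" for i in range(10)]
labels = [1, 1] + [0] * 8  # toy imbalance with made-up proportions
for epoch in range(3):
    epoch_data = undersample_epoch(examples, labels, rng)
    # a training pass over `epoch_data` would go here
```

Because the majority-class subset is redrawn each epoch, the model eventually sees most majority examples while every individual epoch stays balanced.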

Possible limitations

Not stated in the abstract.

Abstract

arXiv:2605.13415v1

This paper presents a multi-stage framework for detecting reclaimed slurs in multilingual social media discourse. It addresses the challenge of identifying reclamatory versus non-reclamatory usage of LGBTQ+-related slurs across English, Spanish, and Italian tweets. The framework handles three intertwined methodological challenges: data scarcity, class imbalance, and cross-linguistic variation in sentiment expression. It integrates data-driven model selection via cross-validation, semantic-preserving augmentation through back-translation, inductive transfer learning with dynamic epoch-level undersampling, and domain-specific knowledge injection via masked language modeling. Eight multilingual embedding models were evaluated systematically, with XLM-RoBERTa selected as the foundation model based on macro-averaged F1 score. Data augmentation via GPT-4o-mini back-translation to alternate languages effectively tripled the training corpus while preserving semantic content and class distribution ratios. The framework produces four final runs for evaluation: RUN 1 is inductive transfer learning with augmentation and undersampling, RUN 2 incorporates masked language modeling pre-training, and RUN 3 and RUN 4 are the previous predictions refined via language-specific decision thresholds optimized through ROC analysis. Language-specific threshold refinement reveals that optimal decision boundaries vary significantly across languages. This reflects distributional differences in model confidence scores and linguistic variation in reclamatory language usage. The threshold-based optimization yields 2-5% absolute F1 improvement without requiring model retraining. The methodology is fully reproducible, with all code and experimental setup available at https://github.com/rbg-research/MultiPRIDE-Evalita-2026.
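
To make the threshold refinement step concrete, here is a minimal sketch of choosing one decision threshold per language from validation scores. The abstract says only that thresholds are optimized via ROC analysis. The criterion used here (Youden's J, the point maximizing TPR minus FPR) is an assumption, and the function names and toy records are hypothetical.

```python
def best_threshold(scores, labels):
    """Scan the observed scores as candidate thresholds and return the one
    maximizing Youden's J = TPR - FPR (an assumed criterion).
    Requires at least one positive and one negative label."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / pos - fp / neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t


def thresholds_by_language(records):
    """records: iterable of (language, score, label) validation triples."""
    grouped = {}
    for lang, score, label in records:
        scores, labels = grouped.setdefault(lang, ([], []))
        scores.append(score)
        labels.append(label)
    return {lang: best_threshold(s, y) for lang, (s, y) in grouped.items()}


# Made-up validation scores: the "es" scores sit lower than the "en" scores.
records = [
    ("en", 0.9, 1), ("en", 0.6, 1), ("en", 0.4, 0), ("en", 0.2, 0),
    ("es", 0.5, 1), ("es", 0.3, 1), ("es", 0.2, 0), ("es", 0.1, 0),
]
print(thresholds_by_language(records))  # {'en': 0.6, 'es': 0.3}
```

Since the thresholds are applied to scores the model has already produced, this step needs no retraining, which matches the abstract's description of the refinement as a post-hoc adjustment.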