
Evaluating Gender Bias in German Machine Translation

Michelle Kappl

Every day, millions of people rely on machine translation (MT) models to overcome language barriers, but can we trust these models to be fair, or do they perpetuate harmful stereotypes? Our project tackles these questions by introducing a method for assessing gender bias in German MT, with a focus on occupational stereotyping and underrepresentation. Central to our research is a newly created gender bias evaluation test set called DEval-MT. The dataset comprises around 10,000 sentences, annotated using German labor statistics to reflect realistic occupational titles and distributions. We include not only binary gendered forms (female and male) but also gender-neutral and non-binary language, including neopronouns. This covers a broader spectrum of gender identities that current MT evaluations often ignore. We also introduce intersectional cases that combine gender with race and ethnicity, enabling a more nuanced assessment of how bias manifests in translation systems.

In addition to the dataset, we propose an evaluation pipeline that uses state-of-the-art alignment techniques and automatic morphological analysis. This allows for cost-effective, easy-to-use, large-scale analysis of gender bias in MT from German into seven languages. The pipeline is designed to be flexible and can be adapted to other datasets and languages in the future. To demonstrate the practical application of our method, we will evaluate several widely used MT systems, including DeepL, Google Translate, and ChatGPT.

This project brings together perspectives from computer science, linguistics, and the social sciences, and contributes to creating fairer, more inclusive language technologies. By providing an evaluation test set and methodology, we want to support researchers and developers in identifying and addressing bias in MT systems. To that end, DEval-MT will be made publicly available.
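As a rough illustration of the kind of score such an evaluation pipeline could report, the sketch below computes per-gender accuracy and a male-female accuracy gap. It assumes that an earlier step (alignment plus morphological analysis) has already produced, for each test sentence, the gender annotated in the test set and the gender detected in the translation. The record format, the label names, and both function names are hypothetical and are not taken from DEval-MT; the data is made up for the example.

```python
from collections import defaultdict

def accuracy_by_gender(records):
    """Return, per gold gender, the share of translations that kept that gender.

    `records` is an iterable of (gold_gender, predicted_gender) pairs.
    This input format is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for gold, predicted in records:
        totals[gold] += 1
        if predicted == gold:
            correct[gold] += 1
    return {gender: correct[gender] / totals[gender] for gender in totals}

def accuracy_gap(accuracies, a="male", b="female"):
    """Difference in accuracy between two gender groups (positive favors `a`)."""
    return accuracies[a] - accuracies[b]

# Toy data, invented for illustration only.
records = [
    ("female", "female"),
    ("female", "male"),
    ("male", "male"),
    ("male", "male"),
    ("neutral", "male"),
    ("neutral", "neutral"),
]

acc = accuracy_by_gender(records)
gap = accuracy_gap(acc)
print(acc)
print(gap)
```

A gap near zero would suggest that the system preserves female and male forms about equally often, while a large positive gap would point to a tendency to default to male forms. Reporting accuracy separately for gender-neutral forms keeps them from being hidden inside an overall average.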