SocialStigmaQA Spanish and Japanese - Towards Multicultural Adaptation of Social Bias Benchmarks

Clara Higuera Cabañes; Ryo Iwaki; Beñat San Sebastian; Rosario Uceda-Sosa; Manish Nagireddy; Hiroshi Kanayama; Mikio Takeuchi; Gakuto Kurata; Karthikeyan Natesan Ramamurthy

NeurIPS 2024

Workshop paper

10 Dec 2024

Abstract

Many existing benchmarks for evaluating social bias in large language models are written in English. Because comparable datasets are hard to find natively in other languages, or to create from scratch, one solution is to adapt these English-based benchmarks to other languages. However, such conversions are non-trivial, given both the linguistic and the cultural aspects of social bias. In this work, we present ongoing efforts to port an existing dataset, SocialStigmaQA, to Spanish and Japanese. We describe the effort required to adapt this dataset faithfully with respect to the societal and cultural norms associated with each language. We hope our work offers useful guidance on adapting existing English-based bias benchmarks to other languages, and we outline further steps that can be taken toward that goal.
