Hebrew offensive language taxonomy and dataset

Lodz Papers in Pragmatics 19 (2):325-351 (2023)

Abstract

This paper introduces a streamlined taxonomy for categorizing offensive language in Hebrew, addressing a gap in a literature that has so far focused largely on Indo-European languages. The taxonomy divides offensive language into seven levels: six explicit and one implicit. It builds on the simplified offensive language (SOL) taxonomy introduced by Lewandowska-Tomaszczyk et al. (2021a), which we adapt to Hebrew with the aim of reflecting the language's particular linguistic and cultural nuances. The study goes beyond Natural Language Processing (NLP): we employed manual linguistic and cultural analysis in order to capture the complexity and specificity of offensive expressions in Hebrew beyond what automated NLP methods alone can provide. We describe in detail an accompanying dataset, gathered on Twitter and manually curated by human annotators, which was constructed both to validate the taxonomy and to serve as a foundation for future research on offensive language detection and analysis in Hebrew. Preliminary analysis of the dataset reveals patterns and distributions that underscore this complexity and specificity. Our findings highlight the importance of considering linguistic and cultural variation when researching and addressing abusive language online. We believe that the streamlined taxonomy and its associated dataset will be crucial in improving research in Hebrew sociocultural studies, natural language processing, and offensive language detection. The study also makes a substantial contribution to the study of low-resource languages and can serve as a model for future research on other languages.

Added to PP: 2023-12-13
