How Large Language Models are Detecting Racism in the Digital Arena

7 Mar 2025

By Caitlyn Schiffer

Racism has long been a pervasive issue, but the advent of social media has amplified its presence, particularly in the context of football. While platforms like Twitter (now X) provide spaces for users to share opinions and engage in conversations, they have also become hotbeds for racial abuse, especially targeting athletes. The global nature of football has made it a focal point for both celebration and hatred, with racist comments often going unchecked. As freedom of expression becomes a hotly debated issue online, the challenge is finding the balance between allowing open discourse and protecting individuals from harmful abuse.

The problem of racism in football discourse is far-reaching. In 2020, the Professional Footballers' Association conducted a six-week study of 825,515 tweets directed at players, of which 3,000 were explicitly abusive messages. Using machine learning systems, researchers found that 56% of all the discriminatory abuse identified during the study was racist. This growing trend reflects the increasing difficulty of managing online platforms where racist sentiments thrive.

Example:

A stark illustration of this issue occurred during the Euro 2020 Men’s Soccer Final when three Black English players—Marcus Rashford, Jadon Sancho, and Bukayo Saka—were subjected to a barrage of racist online abuse after missing penalties in a shootout against Italy. The UK Football Policing Unit received 600 reports of racist comments, with 207 deemed criminal. This incident brought international attention to the problem and demonstrated the real-life harm inflicted on athletes through social media.

What calls to action are being made by the UAOA?

The issue of online abuse targeting athletes is becoming critical, as highlighted in the July 2024 United Against Online Abuse (UAOA) coalition forum. Speakers, including Mohammed Ben Sulayem, President of the FIA and founder of UAOA, emphasized the heartbreaking reality: 75% of athletes who train their entire lives to achieve their dreams endure hate and abuse online.

Reflecting findings from Dublin City University, which showed a steep rise in homophobic language during UEFA football championships, the 50-member forum proposed the establishment of an international Trusted Flagger entity under the EU Digital Services Act. This entity would help fast-track complaints and remove abusive comments across online platforms. The Trusted Flagger, incentivized through rewards, would need to meet strict criteria, encouraging responsible moderation and helping to limit the prevalence of online abuse.

What is Being Done to Combat Racism Online?

Combating racism online falls broadly into four types of solutions: automated and human content moderation, legislation (e.g., the new EU Digital Services Act), education, and awareness. For example, charities like Kick It Out, which runs educational programs to challenge discrimination and promote positive change, strive to combat and end online racism in sports. The United Against Online Abuse coalition actively develops and strengthens all of these approaches. Even so, detecting racism on platforms like Twitter remains complex: with over 6,000 tweets posted every second, distinguishing racist content from non-offensive communication requires nuanced analysis. After a two-year study analyzing 449,209 posts in 16 different languages and additional dialects, World Athletics has developed safeguards for its athletes, including reporting and tracking detected abuse on platforms, educating athletes and stakeholders on protective measures, and researching further into this pervasive issue.

Why is defining hate speech difficult? 

Many obstacles arise when determining whether something is hate speech. Platforms such as Twitter utilize lexicon-based models, deep learning models (CNNs and RNNs), and transformers like BERT. Despite these models' abilities to detect hate speech, they all struggle for several reasons. Cultural and legal differences across countries make defining hate speech even more difficult. The speed and volume of content on social media create a technical challenge for both manual and automated moderation. The complexity of language, cultural nuances, and the lack of consistent global standards around freedom of speech make detecting racism on online platforms a complex task. Automated systems, while improving, still struggle to capture the subtleties of context, slang, and intent, all of which are crucial when determining whether a comment constitutes hate speech. Language change, spelling variations and mistakes, and a wide range of other linguistic issues further complicate the problem.

For example, on July 1, 2024, UEFA reported that it had identified 4,656 posts across social media platforms as potential hate speech, but only 308 (7%) were eligible to be reported directly to the platforms for further action, and only 219 were ultimately acted upon. In other words, over 93% of the flagged posts did not qualify as reportable hate speech: according to the UEFA report, most flagged content did not meet the platforms' hate speech criteria. This points to the need for a more sophisticated system that can better distinguish genuine abuse from other forms of commentary.
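The weakness of the lexicon-based models mentioned above can be shown with a minimal sketch. The blocklist and example posts below are hypothetical placeholders, not a real moderation lexicon: the point is that keyword matching ignores context and misses obfuscated spellings.

```python
# Minimal sketch of a lexicon-based filter. The word list and examples
# are hypothetical placeholders, not a real moderation lexicon.

BLOCKLIST = {"monkey", "ape"}  # stand-in terms; real lexicons are far larger

def lexicon_flag(text: str) -> bool:
    """Flag a post if any blocklisted token appears, ignoring context."""
    tokens = text.lower().split()
    return any(tok.strip(".,!?") in BLOCKLIST for tok in tokens)

# An abusive taunt and an innocent zoo comment are treated identically:
lexicon_flag("You played like a monkey tonight")     # -> True
lexicon_flag("Saw a monkey at the zoo with my kids") # -> True (false positive)
# Obfuscated spellings slip through entirely:
lexicon_flag("You m0nkey")                           # -> False (missed abuse)
```

This context-blindness is exactly why purely lexical approaches both over-flag harmless posts and under-flag disguised abuse.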

What can counter this issue?

Implementing a layered AI model could help improve accuracy. This system would incorporate both machine learning and human review to verify context. The detection system can become more precise by training AI on broader datasets—including nuanced examples of hate speech across different cultural contexts. Additionally, having human moderators review flagged posts could further refine the process and reduce misclassifications, ensuring that only genuinely harmful content is flagged and removed.

A relatively new type of AI, known as Large Language Models (LLMs), is key to improving the accuracy of detecting racism in online content. LLMs are advanced AI systems trained on vast amounts of text data, enabling them to understand language in a more nuanced way. Because these models grasp the meaning of words in context, they are better at distinguishing a racist statement from casual language that may be offensive but does not reach the threshold of hate speech. They also learn to recognise slang, regional differences, and even misspellings, making them well suited to identifying hate speech that might otherwise go unnoticed. For LLMs to be truly effective, however, they need to be fine-tuned on context-specific data. This is especially true in environments like football, which has its own unique language, jargon, and fan culture. And football isn't the only sport where fans have created a distinct dialogue: in rugby, tennis, and basketball, fans invent their own terminology too, and racism can flourish in these communities precisely because it is harder to detect, so LLMs need additional training to uncover the nuances in their language. Without training on the specific ways people communicate in these contexts, LLMs might miss key cues or misinterpret intent. In sum, we need:

  • More labeled datasets specifically focusing on racist (and non-racist) posts in various contexts;
  • New techniques to further fine-tune and optimize LLM performance across different sports environments and languages.
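To make the first point concrete, a labeled dataset for this task is essentially a collection of posts annotated as racist or not, split so that both classes appear in the training and evaluation sets. The field names, example rows, and split logic below are hypothetical illustrations, not drawn from the UAOA datasets.

```python
# Illustrative sketch of a labeled dataset for racist-speech classification
# and a simple stratified split. All examples and field names are
# hypothetical, not drawn from any real dataset.
import random

dataset = [
    {"text": "What a finish in the final!", "label": 0},       # non-racist
    {"text": "<racist slur aimed at a player>", "label": 1},   # racist (placeholder)
    {"text": "Terrible penalty, awful game", "label": 0},
    {"text": "<coded racist insult>", "label": 1},
]

def stratified_split(rows, test_frac=0.25, seed=42):
    """Split per label so both classes appear in train and test sets."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row["label"], []).append(row)
    train, test = [], []
    for rows_for_label in by_label.values():
        rng.shuffle(rows_for_label)
        k = max(1, int(len(rows_for_label) * test_frac))  # at least one per class
        test.extend(rows_for_label[:k])
        train.extend(rows_for_label[k:])
    return train, test

train_rows, test_rows = stratified_split(dataset)
```

Stratifying by label matters here because racist posts are a small minority of all traffic; a naive random split could leave the evaluation set with no positive examples at all.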

And that’s exactly what we are doing as part of the United Against Online Abuse research programme. The UAOA is funding researchers at Dublin City University in Ireland to build the necessary datasets for training, testing and fine-tuning AI to classify racist tweets. And we are getting promising results, achieving a remarkable 96.18% accuracy in identifying racist speech in tweets related to the UEFA European Championships, both men’s and women’s, from 2008 to 2020.


If you’d like more information on the technical work relating to this blog, you can find it here.
