
Where Are All the Women? Exploring large language models’ biases… | by Yennie Jun | Jul, 2023

Exploring large language models’ biases in historical knowledge

A few of the top historical figures mentioned most often by GPT-4 and Claude. Individual images sourced from Wikipedia. Collage created by the author.

Large language models (LLMs) such as ChatGPT are being increasingly used in educational and professional settings. It is important to understand and study the many biases present in such models before integrating them into existing applications and our daily lives.

One of the biases I studied in my previous article concerned historical events. I probed LLMs to understand what historical knowledge they encoded in the form of major historical events. I found that they encoded a serious Western bias in their understanding of major historical events.

In a similar vein, in this article I probe language models about their understanding of important historical figures. I asked two LLMs who the most important people in history were. I repeated this process 10 times in each of 10 different languages. Some names, like Gandhi and Jesus, appeared extremely frequently. Other names, like Marie Curie or Cleopatra, appeared far less frequently. Compared to the number of male names generated by the models, there were extremely few female names.
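A minimal sketch of this probing setup might look like the following. It assumes the current openai and anthropic Python clients with API keys set as environment variables; the prompt wording, model identifiers, and language list are illustrative placeholders rather than the exact code used for the experiments.

```python
# Hypothetical reconstruction of the probing loop, not the original experiment code.
from openai import OpenAI
from anthropic import Anthropic

openai_client = OpenAI()        # reads OPENAI_API_KEY from the environment
anthropic_client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder prompts: the same question translated into each target language.
PROMPTS = {
    "English": "List the 10 most important historical figures in history.",
    "Korean": "역사상 가장 중요한 인물 10명을 나열하세요.",
    # ... the remaining languages, translated the same way
}

N_RUNS = 10  # repeat each language to average over sampling variation


def probe_gpt4(prompt: str) -> str:
    """One GPT-4 response for a single prompt."""
    resp = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def probe_claude(prompt: str) -> str:
    """One Claude response for a single prompt."""
    resp = anthropic_client.messages.create(
        model="claude-2.1",  # assumed model identifier
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text


# Collect N_RUNS generations per model per language for later name extraction.
results = {
    language: {
        "gpt4": [probe_gpt4(prompt) for _ in range(N_RUNS)],
        "claude": [probe_claude(prompt) for _ in range(N_RUNS)],
    }
    for language, prompt in PROMPTS.items()
}
```

From the collected generations, the names can then be extracted and tallied by gender and region to produce the kinds of aggregate figures discussed below.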

The biggest question I had was: where were all the women?

Continuing the theme of evaluating historical biases encoded by language models, I probed OpenAI’s GPT-4 and Anthropic’s Claude about major historical figures. In this article, I show how both models exhibit:

  • Gender bias: Both models disproportionately predict male historical figures. GPT-4 generated the names of female historical figures 5.4% of the time and Claude did so 1.8% of the time. This pattern held across all 10 languages.
  • Geographic bias: Regardless of the language the model was prompted in, there was a bias toward predicting Western historical figures. GPT-4 generated historical figures from Europe 60% of the time and Claude did so 52% of the time.
  • Language bias: Certain languages suffered more from gender or geographic biases. For example, when prompted in Russian, both GPT-4 and Claude generated zero women across all of my experiments. Additionally, language quality was lower for some languages. For example, when prompted in Arabic, the models were more likely to respond incorrectly by generating…


