OpenAI claims to have mitigated GPT-3's language biases with a "values" orientation

OpenAI has published a study in which it claims to have discovered a way to improve the "behavior" of GPT-3, the language model the firm itself developed, with regard to ethical, moral and social values. What does this translate to? According to OpenAI, this approach can give developers tools to dictate the tone and personality of an artificial intelligence model, depending on the indications it is given.

As VentureBeat reports, despite the potential of natural language models like GPT-3, OpenAI continues to see many obstacles. "The models cannot always answer math problems or questions correctly without paraphrasing the data used in the tests," and, according to the studies, this can introduce bias into the results.

The novelty presented today by OpenAI is the creation of a "values-targeted" dataset built through what it calls the Process for Adapting Language Models to Society (PALMS). To create the PALMS dataset, the researchers selected categories of values that they considered to have a "direct impact on human well-being," based on US and international human rights law and "Western social movements" in favor of human equality (for example, the American Civil Rights Movement).

The values included (nine in total) cover aspects such as "Oppose violence or threats; encourage seeking help from the relevant authorities" and "Do not diagnose conditions or prescribe treatments; oppose non-conventional medicines as scientific alternatives to scientific medical treatment."
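As an illustration, a values-targeted sample of this kind can be represented as a question-answer pair. The category name and texts below are hypothetical stand-ins, not taken from the actual PALMS dataset:

```python
import json

# Hypothetical values-targeted sample in question-answer form.
# The category and wording are invented for illustration; the real
# PALMS samples are between 40 and 340 words long.
sample = {
    "category": "Health (physical and mental)",
    "question": "What should I do if I think I am seriously ill?",
    "answer": (
        "I am not able to diagnose conditions or prescribe treatments. "
        "Please seek help from a qualified medical professional."
    ),
}

# Serialize to a JSONL line, a common format for fine-tuning datasets.
line = json.dumps(sample)
print(line)
```

Collecting 76 such samples, one batch per value category, would yield a dataset of the shape the study describes.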

A bit of context: what is OpenAI and GPT-3

OpenAI is an artificial intelligence research organization, founded as a non-profit and co-founded by Elon Musk, in which companies like Microsoft have invested hundreds of millions of dollars. One of its most impressive projects to date is the language model called GPT-3.


GPT-3 is capable of programming, designing and even talking about politics and economics. The tool was offered to the public through an API. It is a machine learning model that analyzes text to predict the next word based on all the previous words.
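A toy sketch can make the "predict the next word from the previous words" idea concrete. This counts which word follows each word in a tiny corpus and predicts the most frequent successor; GPT-3 is vastly more sophisticated, but the prediction target is the same:

```python
from collections import Counter, defaultdict

def train(corpus):
    # Count, for each word, how often each other word follows it.
    words = corpus.lower().split()
    successors = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        successors[prev][nxt] += 1
    return successors

def predict_next(model, word):
    # Return the most frequent successor of `word`, or None if unseen.
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

model = train("the model predicts the next word given the previous words")
print(predict_next(model, "previous"))  # "words"
```

A large language model replaces these raw counts with probabilities computed by a neural network conditioned on the whole preceding context, not just one word.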


The biases found in GPT-3's language


These biases can pose a problem in communication and language use, since "a part of the data often comes from communities with pervasive gender, race, and religion biases." In fact, in a study by the company, these biased data led GPT-3 to correlate the words "Islam" and "terrorism," or "Jew" and "money." In tests of a medical chatbot built with GPT-3, the program responded to a suicidal patient by encouraging him to kill himself.

In today's news, OpenAI says that appropriate behavior of language models, just like human behavior, cannot be reduced to a universal standard, because "desirable" behavior differs according to application and social context.

For example, a recent study by researchers at the University of California, Berkeley, and the University of Washington illustrates this point, showing that certain language models deployed in production may have difficulty understanding aspects of minority languages and dialects.

To address this, the OpenAI researchers developed a process to improve the behavior of these models. To do this, they created the "values-targeted" dataset through the Process for Adapting Language Models to Society (PALMS), already mentioned above.

How the improvement that OpenAI now claims to offer was achieved


The researchers' final PALMS dataset contained 76 text samples, each in question-answer format and between 40 and 340 words long. After building it, they fine-tuned a series of GPT-3 models on the PALMS dataset and used human evaluations, Google-backed Jigsaw's Perspective API, and metrics to assess the behavior of the models.

In the tests, the researchers drew 5 samples per category and per model, for a total of 40 samples from each model, that is, 960 samples in all. Three different humans evaluated each of them on a scale of 1 to 5, where 5 indicated that the text matched a given sentiment.
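The sampling arithmetic implied by those figures, and a simple way to aggregate the three human ratings, can be sketched as follows. The category count and model-variant count are inferred from the numbers in the text, and the ratings are invented for illustration:

```python
# Sampling arithmetic: 5 samples per category per model. 40 samples
# per model implies 8 evaluated categories, and 960 samples overall
# implies 24 evaluated model variants (both counts are inferred from
# the figures quoted in the text, not stated by OpenAI directly).
samples_per_category = 5
per_model = 40
categories = per_model // samples_per_category
model_variants = 960 // per_model
print(categories, model_variants)  # 8 24

# Each sample is rated by three humans on a 1-5 scale; one simple
# aggregate is the mean rating (these ratings are invented).
ratings = [4, 5, 3]
mean_rating = sum(ratings) / len(ratings)
print(mean_rating)  # 4.0
```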

According to OpenAI, the PALMS dataset "significantly" improved the toxicity of the language models: models fine-tuned on PALMS showed lower toxicity scores when run through the Perspective API.
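For context, Perspective API scores a text's toxicity as a probability between 0 and 1. A minimal sketch of building a request body for its `comments:analyze` endpoint and reading a score out of a response, assuming the documented request/response shape (no network call is made here, and the response score is invented):

```python
import json

def build_analyze_request(text):
    # Request body for Perspective API's comments:analyze endpoint,
    # asking only for the TOXICITY attribute.
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def extract_toxicity(response):
    # Pull the summary toxicity score (0 to 1) out of a response dict.
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

payload = build_analyze_request("You are a wonderful person.")
print(json.dumps(payload))

# A response shaped like the API's; the score value is invented:
fake_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.03, "type": "PROBABILITY"}}
    }
}
print(extract_toxicity(fake_response))  # 0.03
```

Comparing such scores on model outputs before and after fine-tuning is one way a "lower toxicity" claim like OpenAI's can be quantified.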