The Reversal Curse: A Quirk in Language Models

A recent study, titled “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A’,” explores an interesting phenomenon in language models, particularly large language models (LLMs) such as GPT-3 and GPT-4. The researchers found that these models struggle with a basic type of generalization, which they term the “Reversal Curse.”

The Reversal Curse refers to the failure of these models to generalize a fact when the order of its elements is reversed. For instance, a model trained on the statement “Tom Cruise’s mother is Mary Lee Pfeiffer” can typically answer “Who is Tom Cruise’s mother?” but often fails at the reversed question “Who is Mary Lee Pfeiffer’s son?”, despite having seen the necessary information. Intuitively, this is what one might expect from next-token prediction: training on “A is B” teaches the model to produce B after A, but nothing in the objective pushes it to produce A after B.

The researchers conducted two experiments to explore this phenomenon. In the first experiment, they fine-tuned LLMs on documents of the form “[Name] is [description]”. The names and descriptions belonged to fictitious celebrities, such as “Daphne Barrington is the director of ‘A Journey Through Time’”. The models were then tested on questions where the order of the information was reversed, like “Who is the director of ‘A Journey Through Time’?” The models performed well when the order of the information matched the training data but failed to generalize when the order was reversed.
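To make the setup concrete, here is a minimal sketch of how such a fine-tuning set and its reversed test questions might be constructed. This is not the authors’ actual pipeline; the helper function and variable names are assumptions, though the two fictitious celebrities come from the paper itself.

```python
# Illustrative sketch (not the paper's code): build "A is B" fine-tuning
# documents for fictitious celebrities, plus reversed "B is A" test questions.

fictitious_facts = [
    ("Daphne Barrington", "the director of 'A Journey Through Time'"),
    ("Uriah Hawthorne", "the composer of 'Abyssal Melodies'"),
]

def build_examples(facts):
    train_docs, reversed_tests = [], []
    for name, description in facts:
        # Training document: the name comes first ("A is B").
        train_docs.append(f"{name} is {description}.")
        # Reversed test: the description comes first ("B is A").
        reversed_tests.append(
            {"question": f"Who is {description}?", "answer": name}
        )
    return train_docs, reversed_tests

train_docs, reversed_tests = build_examples(fictitious_facts)
print(train_docs[0])      # the document the model is fine-tuned on
print(reversed_tests[0])  # the reversed question it tends to fail
```

A model fine-tuned on `train_docs` will usually complete “Daphne Barrington is…” correctly, yet score near zero on the `reversed_tests`, which is exactly the asymmetry the paper reports.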

In the second experiment, the researchers tested LLMs on real facts about celebrities, without any fine-tuning. They asked questions like “Who is Tom Cruise’s mother?” and the reverse, “Who is Mary Lee Pfeiffer’s son?”

The results showed that the models were much better at identifying the parent given the child (answering “Mary Lee Pfeiffer”) than the child given the parent (answering “Tom Cruise”), even though the information required to answer both questions is the same. For example, the paper reports that GPT-4 answered the forward question correctly 79% of the time but the reversed question only 33% of the time. This further demonstrated the Reversal Curse, with the models struggling whenever the order of the information was reversed.
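A hedged sketch of how this forward-versus-reverse comparison could be scored is shown below. `query_model` is a hypothetical stand-in for a real chat-completion API call, and the single pair listed is just the example from the text.

```python
# Illustrative sketch: score forward (child -> parent) versus reverse
# (parent -> child) accuracy. `query_model` is a hypothetical placeholder.

parent_child_pairs = [
    {"child": "Tom Cruise", "parent": "Mary Lee Pfeiffer"},
]

def query_model(prompt: str) -> str:
    raise NotImplementedError("replace with a real LLM API call")

def accuracy(pairs, direction: str) -> float:
    correct = 0
    for pair in pairs:
        if direction == "forward":
            prompt = f"Who is {pair['child']}'s mother?"
            expected = pair["parent"]
        else:
            prompt = f"Who is {pair['parent']}'s son?"
            expected = pair["child"]
        # Count the answer as correct if the expected name appears in it.
        if expected.lower() in query_model(prompt).lower():
            correct += 1
    return correct / len(pairs)

# Usage (once query_model is wired to a real model):
# print(accuracy(parent_child_pairs, "forward"))  # high in the paper
# print(accuracy(parent_child_pairs, "reverse"))  # much lower
```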

The study also explored various methods to help the models generalize better, such as varying the prompt templates and including paraphrases of each fact in the fine-tuning data. However, none of these methods significantly improved the models’ performance on reversed questions.
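As a rough illustration of the paraphrase idea, each fact can be rendered under several templates in the hope that the extra variety encourages generalization. The templates below are invented for illustration, not taken from the paper.

```python
# Illustrative sketch of paraphrase augmentation. The templates are
# made-up examples, not the ones used in the paper.

TEMPLATES = [
    "{name} is {description}.",
    "It is widely known that {name} is {description}.",
    "{name} holds the distinction of being {description}.",
]

def augment(name: str, description: str) -> list[str]:
    """Render one fact under every template, always name-first."""
    return [t.format(name=name, description=description) for t in TEMPLATES]

for doc in augment("Daphne Barrington",
                   "the director of 'A Journey Through Time'"):
    print(doc)
```

Note that every template still presents the name before the description, which is one intuition for why paraphrasing alone fails to teach the reverse direction.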

This research highlights an interesting limitation of current language models and underscores the need for further research to improve their ability to generalize from the information they are trained on. It also raises interesting questions about how these models learn and process information, and it suggests that basic logical generalization, such as inferring “B is A” from “A is B,” has not yet fully emerged in LLMs.

The original research paper can be found at: [2309.12288] The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A" (arxiv.org/abs/2309.12288)
