The ability to update language models is essential to incorporate new information and correct undesirable behaviors. Previous methods are unwieldy and often fail as the amount of new data increases. New work offers a workaround.

What’s New: Eric Mitchell and colleagues at Stanford and École Polytechnique Fédérale de Lausanne proposed Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model (SERAC), an add-on system that can adapt trained models with an abundance of new information.

Key insight: Say you’ve trained a language model to produce output based on the current Prime Minister of the United Kingdom. You’ll need to retrain the model when the Prime Minister changes. Alternatively you can update the model either by fine-tuning or training a secondary model, known as a model editor, that estimates and applies the change in weights necessary to respond to queries about the Prime Minister accurately without affecting responses to other queries. However, both approaches have problems. Fine-tuning every time information changes is impractical, and both approaches fail beyond around 10 new pieces of data (as the authors demonstrate without proposing an explanation why). Instead of changing model weights, a separate system can store new data and learn to provide output to queries that are relevant to that data. Such a system would handle any amount of new data and work with any model without retraining.

How it works: The authors’ system is designed to complement a base model. It consists of three parts. The edit memory stored facts in the form of input-output pairs. The scope classifier determined whether a new input is relevant to facts stored in the edit memory. The counterfactual model generated output for relevant inputs. The base model continued to handle all other queries.

  • The edit memory was a list of new input-output pairs (for example “Who is the UK Prime Minister?” “Boris Johnson”). The scope classifier was a pretrained DistilBERT fine-tuned to estimate the probability that an input was relevant to a given pair in the edit memory. The counterfactual model was a pretrained T5 language model that the authors fine-tuned to generate text based on the current input and an input-output pair.
  • The fine-tuning examples, which took the form of input-output pairs, depended on the task at hand, such as question answering. Fine-tuning examples were labeled either relevant or irrelevant to pairs stored in the edit memory. For instance, given the pair “Who is the UK Prime Minister?” “Boris Johnson,” the query “Where is Boris Johnson the PM?” was relevant, while “Where did Boris Johnson attend university?” was not.
  • At inference, given a new input, the scope classifier determined whether it was relevant to a pair in the edit memory. If so, it passed the most relevant pair, along with the input, to the counterfactual model to generate output.

Results: The authors used two metrics, edit success and drawdown, to evaluate SERAC’s ability to update responses from a pretrained T5-large. Edit success measured the correctness of the T5’s responses to inputs relevant to the contents of the edit memory; higher is better (1 being perfect). Drawdown measured the correctness of responses to inputs not relevant to data in edit memory; lower is better (0 being perfect). SERAC outperformed model editors such as Model Editor Networks with Gradient Decomposition (MEND). On question-answering, SERAC achieved 0.986 edit success compared to MEND’s 0.823, and 0.009 drawdown compared to MEND’s 0.187. The authors applied the SERAC system they’d trained on T5-large to other sizes. Its performance barely budged. Moreover, SERAC continued to outperform as the number of new input-output pairs increased. The authors increased the number of simultaneous pairs to 75. Measuring performance as the difference between edit success and drawdown (the worst possible being -1, best being 1), SERAC’s fell only from 0.98 to around 0.90, while MEND’s degraded from 0.64 to around -0.95.

Why it matters: This work opens the door to keeping trained language models up to date even as information changes at a rapid clip. Presumably businesses could use it to update information about, say, their products, leadership, numbers of employees, locations, and so on. Developers of conversational models could keep their chatbots abreast of changes in politics, law, and scientific discovery.

We’re thinking: A single system that can update any language model opens the tantalizing possibility of a product, updated regularly, that can adapt previously trained models to new information.


Subscribe to The Batch

Stay updated with weekly AI News and Insights delivered to your inbox