Publication
NeurIPS 2024
Workshop paper
MemReasoner: A Memory-augmented LLM Architecture for Multi-hop Reasoning
Abstract
Recent benchmarks suggest that there remains significant room to improve large language models' ability to robustly reason over facts distributed across extremely long documents. In this work, we propose MemReasoner, a new memory-augmented LLM architecture that is trained to perform temporal reasoning, along with multiple computational steps, over the context stored in its memory. Experiments show that MemReasoner, trained on the core reasoning facts, generalizes better than off-the-shelf large language models and existing recurrent models on a test distribution where the required facts are scattered across long natural text of up to 128k tokens. Further, MemReasoner demonstrates robust reasoning performance relative to the baselines when the answer distribution in test samples differs from that in the training set.
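The multi-step reading over a stored context that the abstract describes is in the spirit of classic memory-network-style iterative attention: the context is encoded into a memory, and the query is refined over several read "hops" before producing an answer. The sketch below is a minimal, hypothetical illustration of that general mechanism (the function names, hop count, and numpy-based attention are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_read(memory, query, num_hops=2):
    """Iteratively attend over a memory of encoded context facts.

    memory: (n, d) array, one row per encoded context line.
    query:  (d,) array, the encoded question.
    Each hop retrieves a weighted mix of memory rows and folds it
    back into the query, letting later hops depend on earlier reads.
    """
    q = query
    for _ in range(num_hops):
        attn = softmax(memory @ q)   # relevance of each stored fact
        q = q + attn @ memory        # update query with retrieved content
    return q

# Toy usage: three orthogonal "facts"; the query matches fact 0,
# so repeated hops should keep reinforcing that direction.
memory = np.eye(3)
query = np.array([1.0, 0.0, 0.0])
out = multi_hop_read(memory, query)
```

In this toy run the output vector stays dominated by the component of the matching fact, which is the behavior multi-hop reading is meant to produce: each hop sharpens the retrieval rather than starting from scratch.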