MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems. Yannis Katsis, Sara Rosenthal, et al. 2025. ACL 2025.
InspectorRAGet: An Introspection Platform for RAG Evaluation. Benjamin Sznajder, Kshitij Fadnis, et al. 2025. NAACL 2025.
Creating Conversational Datasets for Retrieval-Augmented Generation Applications is Hard: Challenges & Research Opportunities. Maeda Hanafi, Kshitij Fadnis, et al. 2025. CHI 2025.
PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development. Avi Sil, Jaydeep Sen, et al. 2023. ACL 2023.
Tell Me More? Can AI Enhance User Experience for List Answers? Odellia Boni, Sara Rosenthal, et al. 2023. IUI 2023.
GAAMA 2.0: An Integrated System that Answers Boolean and Extractive Questions. Scott McCarley, Mihaela Bornea, et al. 2023. AAAI 2023.
PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development. Avi Sil, Jaydeep Sen, et al. 2023. arXiv.
Task Transfer and Domain Adaptation for Zero-Shot Question Answering. Xiang Pan, Alex Sheng, et al. 2022. DeepLo 2022.
SemEval-2021 Task 9: Fact Verification and Evidence Finding for Tabular Data in Scientific Documents (SEM-TAB-FACTS). Nancy X.R. Wang, Diwakar Mahajan, et al. 2021. SemEval 2021.