Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings. Jenny Yijian Huang, Yunyi Shen, et al. 2025. ICML 2025.
ST-WEBAGENTBENCH: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents. Ido Levy, Ben Wiesel, et al. 2025. ICML 2025.
In-Context Bias Propagation in LLM-Based Tabular Data Generation. Pol Garcia Recasens, Alberto Gutierrez-torre, et al. 2025. ICML 2025.
Multiscale Byte Language Models -- A Hierarchical Architecture for Causal Million-Length Sequence Modeling. Eric Egli, Matteo Manica, et al. 2025. ICML 2025.
Workshop on Collaborative and Federated Agentic Workflows (CFAgentic @ ICML'25). Alexander Erben, Gauri Joshi, et al. 2025. ICML 2025.