Data Efficient Neural Scaling Law via Model Reusing
Peihao Wang, Rameswar Panda, et al.
ICML 2023
Recent work showed that active-site information, rather than full-protein-sequence information, improves predictive performance in kinase-ligand binding affinity prediction. To refine the notion of an "active site", we propose and compare multiple definitions. We report significant evidence that our novel definition is superior to previous definitions and better models ATP-noncompetitive inhibitors. Moreover, we leverage the discontiguity of the active-site sequence to motivate novel protein-sequence augmentation strategies and find that combining them further improves performance.
Oscar Sainz, Iker García-Ferrero, et al.
ACL 2024
Michael Feffer, Martin Hirzel, et al.
ICML 2022
Pierre Dognin, Inkit Padhi, et al.
EMNLP 2021