Michael Hersche, Mustafa Zeqiri, et al.
NeSy 2023
In this paper, we present several algorithms for performing all-to-many personalized communication on distributed-memory parallel machines. We assume that each processor sends a different message (of potentially different size) to a subset of all the processors involved in the collective communication. The algorithms are based on decomposing the communication matrix into a set of partial permutations. We study the effectiveness of our algorithms from the perspectives of both static scheduling and runtime scheduling. © 1995 Academic Press, Inc.
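To make the idea of decomposing a communication matrix into partial permutations concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it uses a simple greedy heuristic chosen for illustration, and the function name `decompose_into_partial_permutations`, the largest-message-first ordering, and the choice to skip diagonal entries (treated as local copies) are all assumptions. Each phase it produces is a partial permutation, meaning every processor sends at most one message and receives at most one message in that phase.

```python
from typing import List, Tuple

Phase = List[Tuple[int, int]]  # list of (sender, receiver) pairs


def decompose_into_partial_permutations(comm: List[List[int]]) -> List[Phase]:
    """Greedily split a communication matrix into contention-free phases.

    comm[i][j] > 0 means processor i sends a message of that size to
    processor j. Diagonal entries are ignored (assumed to be local copies).
    """
    n = len(comm)
    pending = [
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and comm[i][j] > 0
    ]
    # Illustrative heuristic: consider larger messages first, so messages of
    # similar size tend to share a phase.
    pending.sort(key=lambda pair: -comm[pair[0]][pair[1]])

    phases: List[Phase] = []
    while pending:
        busy_senders, busy_receivers = set(), set()
        phase: Phase = []
        deferred: Phase = []
        for i, j in pending:
            if i not in busy_senders and j not in busy_receivers:
                # No conflict: i is not yet sending and j is not yet receiving.
                phase.append((i, j))
                busy_senders.add(i)
                busy_receivers.add(j)
            else:
                # Conflict: postpone this message to a later phase.
                deferred.append((i, j))
        phases.append(phase)
        pending = deferred
    return phases


if __name__ == "__main__":
    # Hypothetical 3-processor example; entries are message sizes.
    example = [
        [0, 3, 0],
        [2, 0, 5],
        [4, 0, 0],
    ]
    for step, phase in enumerate(decompose_into_partial_permutations(example)):
        print(step, phase)
```

In a static-scheduling setting the phases could be computed once before communication starts, whereas a runtime scheduler would have to build them on the fly; the sketch above only illustrates the decomposition itself and says nothing about which setting performs better.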
Rangachari Anand, Kishan Mehrotra, et al.
IEEE Transactions on Neural Networks
Chen-chia Chang, Wan-hsuan Lin, et al.
ICML 2025
R. Sebastian, M. Weise, et al.
ECPPM 2022