Hang-Yip Liu, Steffen Schulze, et al.
Proceedings of SPIE - The International Society for Optical Engineering
Serving deep learning models from relational databases brings significant benefits. First, features extracted from databases do not need to be transferred to any decoupled deep learning systems for inferences, and thus the system management overhead can be significantly reduced. Second, in a relational database, data management along the storage hierarchy is fully integrated with query processing, and thus it can continue model serving even if the working set size exceeds the available memory. Applying model deduplication can greatly reduce the storage space, memory footprint, cache misses, and inference latency. However, existing data deduplication techniques are not applicable to the deep learning model serving applications in relational databases. They do not consider the impacts on model inference accuracy as well as the inconsistency between tensor blocks and database pages. This work proposed synergistic storage optimization techniques for duplication detection, page packing, and caching, to enhance database systems for model serving. Evaluation results show that our proposed techniques significantly improved the storage efficiency and the model inference latency, and outperformed existing deep learning frameworks in targeting scenarios.
Hang-Yip Liu, Steffen Schulze, et al.
Proceedings of SPIE - The International Society for Optical Engineering
Indranil R. Bardhan, Sugato Bagchi, et al.
JMIS
Michael Ray, Yves C. Martin
Proceedings of SPIE - The International Society for Optical Engineering
Khaled A.S. Abdel-Ghaffar
IEEE Trans. Inf. Theory