Efficient Privacy-Preserving Viral Strain Classification via k-mer Signatures and FHE
Abstract
With the development of sequencing technologies, viral strain classification – which is critical for many applications, including disease monitoring and control – has become widely deployed. Typically, a lab (client) holds a viral sequence and requests classification services from a centralized repository of labeled viral sequences (server). However, such “classification as a service” raises privacy and IP-protection concerns. We propose a privacy-preserving viral strain classification protocol that allows the client to obtain classification services from the server while maintaining complete privacy of the client’s viral strains. We implemented our protocol and performed extensive benchmarks, showing that it obtains almost perfect accuracy (99.8%–100%) and microAUC (0.999), and high efficiency (amortized per-sequence client and server runtimes of 4.95 ms and 0.53 ms, respectively, and 0.21 MB of communication). Along the way, we obtain two results that may be of independent interest. First, we provide a tighter bound on the inverse approximation proposed in the work of Cheon et al. (ASIACRYPT’17). Second, we develop an enhanced packing technique in which two reals are packed in a single complex number, with support for homomorphic inner products of vectors of ciphertexts. We note that while similar packing techniques were used before, they supported only additions and multiplications by constants.
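To make the packing idea concrete, the following plaintext sketch packs two reals into one complex number and recovers a real inner product from the packed vectors. It is illustrative only and is not necessarily the construction used in this work: it assumes the inner product is obtained by multiplying one packed vector by the complex conjugate of the other and taking the real part, and the names `pack_pairs` and `packed_inner_product` are hypothetical. No encryption is performed. In an FHE setting each complex value would sit in a ciphertext slot, and the conjugation and real-part extraction would have to be carried out homomorphically.

```python
from typing import List, Sequence


def pack_pairs(values: Sequence[float]) -> List[complex]:
    """Pack consecutive pairs of reals (a, b) into one complex number a + ib.

    Hypothetical helper: this halves the number of slots needed for a real vector.
    """
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of reals")
    return [complex(values[i], values[i + 1]) for i in range(0, len(values), 2)]


def packed_inner_product(x_packed: Sequence[complex],
                         y_packed: Sequence[complex]) -> float:
    """Recover the real inner product <x, y> from two packed vectors.

    Uses the identity (a + ib) * conj(c + id) = (ac + bd) + i(bc - ad):
    the real part of each product is the contribution of one packed pair.
    This is an assumed approach, shown here on plaintext values only.
    """
    if len(x_packed) != len(y_packed):
        raise ValueError("packed vectors must have the same length")
    total = sum(x * y.conjugate() for x, y in zip(x_packed, y_packed))
    return total.real


# Toy vectors standing in for real-valued feature vectors.
x = [1.0, 2.0, 3.0, 4.0]
y = [0.5, -1.0, 2.0, 0.25]

expected = sum(a * b for a, b in zip(x, y))
result = packed_inner_product(pack_pairs(x), pack_pairs(y))
print(expected, result)
```

Under this assumed scheme the imaginary part of the sum carries cross terms that are simply discarded, so the packed computation uses half as many slots as the unpacked one at the cost of one conjugation per product.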