Advancements in Traffic Processing Using Programmable Hardware Flow Offload
- Luca Deri
- Alfredo Cardigliano
- et al.
- 2024
- HPSR 2024
I am a Senior Scientist at IBM Research AI in Zurich and I lead the activities around Efficient NLP for industry-grade applications. We build model architectures optimized for low latency inference, such as the pNLP-Mixer.
My passion is providing efficient and cost-effective solutions for complex problems via innovative algorithmic design and highly optimized implementations. I enjoy squeezing instructions into CPU cycles, designing algorithms tailored for specific hardware, and being part of hardware-software co-design processes.
My experience ranges from low-level network programming in the Linux kernel, to high-performance math programming using SIMD instructions, up to designing novel machine learning models. I wrote software to monitor very large-scale distributed systems (20k+ nodes) and implemented distributed systems from scratch to train very large embedding matrices.
I contributed to several IBM products, such as IBM Sonas and Tivoli TNPFA, and to many open-source projects (e.g., the Linux kernel, Open vSwitch, Wireshark).
I filed ~20 patents and published 20+ papers at top systems and AI conferences.