- US 8019158
Tal Drory
Bio
Tal Drory is the Senior Manager of the AI Multimedia department at the IBM Research – Haifa lab and lead of the Document Understanding and Enrichment sub-theme in Research. Tal's department works on some of the most exciting areas of AI Multimedia, Computer Vision, and Speech Technologies. His team builds multimodal foundation models to understand documents and other visual and language content, helps computers synthesize high-quality emotive and expressive speech from text, and conducts state-of-the-art research in multiple other related areas.
One of the projects led by Tal and his teams is the Granite Vision foundation model, released on Hugging Face. Granite Vision is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more. See the model here: https://huggingface.co/ibm-granite/granite-vision-3.3-2b and the paper here: Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence.
Another notable project is the Granite Vision Embedding model, also released on Hugging Face. It is a multimodal embedding model based on Granite Vision, specifically designed for multimodal document retrieval, enabling queries on documents with tables, charts, infographics, and complex layouts. See here: ibm-granite/granite-vision-3.3-2b-embedding on Hugging Face.
Finally, the team has also released Granite Speech, a compact and efficient speech-language model specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST). See here: https://huggingface.co/ibm-granite/granite-speech-3.3-8b and here: https://huggingface.co/ibm-granite/granite-speech-3.3-2b
Tal received his B.Sc. and M.Sc. degrees in Computer Science from the Technion – Israel Institute of Technology. Prior to joining IBM, Tal held a research position at the HP Haifa Research labs, specializing in database systems.
Publications
There aren’t any IBM publications to show for Tal Drory. For a complete publication history, visit Google Scholar.
Patents
- US 7899831
- US 7895277
Blog posts
- IBM Granite now has eyes (Research; Kim Martineau)
- Introducing KVP10k: A comprehensive dataset for key-value pair extraction in business documents (Technical note; Tal Drory, Udi Barzelay, and Oshri Naparstek)
- Deep Document Understanding: IBM’s AI extracts data from complex documents (Research; Tal Drory, Doug Burdick, Yunyao Li, Udi Barzelay, Peter Staar, Christoph Auer, Michele Dolfi, Mustafa Canim, Ashish Verma, Christophe Guittet, and Anthony P Stevens; 5 minute read)