Sai Zeng, Angran Xiao, et al.
CAD Computer Aided Design
The Pen Technologies group at IBM Research has recently been investigating methods for retrieving handwritten documents based on user queries. This paper examines the use of typed and handwritten queries to retrieve relevant handwritten documents. The IBM handwriting recognition engine was used to generate N-best lists for the words in each of 108 short documents. These N-best lists are concise statistical representations of the handwritten words. They make the retrieval methods robust to machine transcription errors, so that documents a traditional transcription-based retrieval system would miss can still be retrieved. Our experimental results demonstrate that significant improvements in retrieval performance can be achieved compared to standard keyword text searching of machine-transcribed documents. We have developed a software architecture for a multimedia document retrieval framework into which machine learning algorithms for feature extraction and matching may be easily integrated. The framework provides a "plug-and-play" mechanism for the integration of new media types, new feature extraction methods, and new document types.
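The idea of retrieving against N-best lists rather than a single transcription can be illustrated with a minimal sketch. This is not the scoring method used in the paper, which the abstract does not specify. The function names (`word_match_prob`, `score_document`, `top1_keyword_score`), the score normalization, and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: each handwritten word is stored as an N-best list of
# (candidate transcription, recognizer score) pairs. Scores are hypothetical.
NBest = List[Tuple[str, float]]
Document = List[NBest]


def word_match_prob(query_word: str, nbest: NBest) -> float:
    """Fraction of the list's total score assigned to query_word."""
    total = sum(score for _, score in nbest)
    if total <= 0:
        return 0.0
    hit = sum(score for cand, score in nbest
              if cand.lower() == query_word.lower())
    return hit / total


def score_document(query: List[str], doc: Document) -> float:
    """Sum, over query words, of the best match fraction found in the document."""
    return sum(
        max((word_match_prob(q, nb) for nb in doc), default=0.0)
        for q in query
    )


def top1_keyword_score(query: List[str], doc: Document) -> float:
    """Baseline: count query words found in the top-ranked transcription only."""
    transcript = {max(nb, key=lambda c: c[1])[0].lower() for nb in doc if nb}
    return float(sum(1 for q in query if q.lower() in transcript))


# A document whose second word was written as "meeting" but whose
# top-ranked transcription is "melting".
doc: Document = [
    [("budget", 9.0), ("budged", 1.0)],
    [("melting", 5.0), ("meeting", 4.0), ("mating", 1.0)],
]
query = ["meeting"]

nbest_score = score_document(query, doc)         # nonzero: "meeting" is in the list
baseline_score = top1_keyword_score(query, doc)  # zero: top-1 transcript says "melting"
```

In this toy case the keyword baseline misses the document because the top-ranked transcription is wrong, while the N-best score still gives it partial credit, which is the kind of robustness the abstract describes.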
B.K. Boguraev, Mary S. Neff
HICSS 2000
A. Gupta, R. Gross, et al.
SPIE Advances in Semiconductors and Superconductors 1990
Michael Ray, Yves C. Martin
Proceedings of SPIE - The International Society for Optical Engineering