Modeling patent quality: A system for large-scale patentability analysis using text mining
Abstract
Current patent systems face a serious problem of declining patent quality, because the growing number of applications makes it difficult for patent officers to spend enough time evaluating each one. Building a better patent system requires a public consensus on the quality of patent applications that is expressed quantitatively. In this article, we tackle the problem of assessing the quality of patent applications using machine learning and text mining techniques. For each patent application, our tool automatically computes a score called patentability, which indicates how likely it is that the application will be approved by the patent office. We employ a new statistical prediction model to estimate examination results (approval or rejection), trained on a large data set of 0.3 million patent applications. The model computes the patentability score from a set of feature variables that includes the text content of the specification documents. Experimental results show that our model outperforms a conventional method that uses only the structural properties of the documents. Because users can access the estimated results through a Web-browser-based GUI, the system allows both patent examiners and applicants to quickly detect weak applications and to find their specific flaws. © 2012 Information Processing Society of Japan.
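
As a rough illustration of the scoring idea described above, the following minimal sketch trains a classifier on labeled applications and returns an approval probability as the patentability score. The abstract does not specify the form of the statistical model, so plain logistic regression over bag-of-words counts is used here as a stand-in. The function names, the single structural feature (document length), and the four toy training examples are all hypothetical and are not taken from the article or its data set.

```python
import math
from collections import Counter


def featurize(text):
    """Bag-of-words counts plus one structural feature (scaled length).

    Assumption: the real feature set is not given in the abstract.
    """
    tokens = text.lower().split()
    feats = Counter("w:" + t for t in tokens)
    feats["len"] = len(tokens) / 100.0
    return feats


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(weights, bias, feats):
    """Estimated probability of approval for one feature vector."""
    z = bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return sigmoid(z)


def train(examples, epochs=300, lr=0.1):
    """Fit logistic regression by stochastic gradient ascent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            feats = featurize(text)
            err = label - score(weights, bias, feats)
            bias += lr * err
            for k, v in feats.items():
                weights[k] = weights.get(k, 0.0) + lr * err * v
    return weights, bias


# Hypothetical toy data: 1 = approved, 0 = rejected.
TOY = [
    ("a novel rotor assembly with specific measured tolerances", 1),
    ("a detailed embodiment describing a novel catalyst composition", 1),
    ("a general method for doing business better", 0),
    ("a general idea for improving things", 0),
]

WEIGHTS, BIAS = train(TOY)


def patentability(text):
    """Patentability score in (0, 1) under the toy model above."""
    return score(WEIGHTS, BIAS, featurize(text))


if __name__ == "__main__":
    for text, label in TOY:
        print(label, round(patentability(text), 3), text)
```

In this sketch the learned per-word weights also hint at how specific flaws might be surfaced: words with strongly negative weights pull the score down. Whether the described system explains its scores this way is not stated in the abstract.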