COA: Finding novel patents through text analysis
Abstract
In recent years, the number of patents filed by the business enterprises in the technology industry are growing rapidly, thus providing unprecedented opportunities for knowledge discovery in patent data. One important task in this regard is to employ data mining techniques to rank patents in terms of their potential to earn money through licensing. Availability of such ranking can substantially reduce enterprise IP (Intellectual Property) management costs. Unfortunately, the existing software systems in the IP domain do not address this task directly. Through our research, we build a patent ranking software, named COA (Claim Originality Analysis) that rates a patent based on its value by measuring the recency and the impact of the important phrases that appear in the "claims" section of a patent. Experiments show that COA produces meaningful ranking when comparing it with other indirect patent evaluation metrics-citation count, patent status, and attorney's rating. In reallife settings, this tool was used by beta-testers in the IBM IP department. Lawyers found it very useful in patent rating, specifically, in highlighting potentially valuable patents in a patent cluster. In this article, we describe the ranking techniques and system architecture of COA. We also present the results that validate its effectiveness. Copyright 2009 ACM.