Everyone has heard that patent assignees are their own lexicographers. One of the challenges of patent searching and analysis is discovering the keywords that applicants use to describe their inventions, so that searches can be comprehensive. For many years, organizations have worked on methods for standardizing the keywords in patents into concepts that could be used to enhance retrieval. This post profiles Relecura, a relatively new system from INDUS TechInnovations that addresses this issue.
One of the key features of Relecura is its application of text mining and semantic analysis techniques to the realm of patent search and analysis. The company describes its approach by saying:
Elements of text analysis and machine learning are incorporated into the platform, to extract key concepts from the patent documents and relate them to each other. These concepts are provided as an additional facet that can be used to search for relevant patents.
Searching and navigating patent portfolios using concepts helps the user to understand the core invention idea described by the patent, and quickly determine the related concepts and other patents addressing similar inventions (even if they were filed in a completely different application area and described in language utilizing different terminology – read keywords).
In practice, this means that a user can input a set of claims, or portions of an invention disclosure, into the system and the key concepts associated with the text will be searched and perhaps more importantly, exposed to the user for refinement. Previously, there have been systems available for searching based on raw text, but traditionally they would only return a collection of documents and wouldn’t provide an explanation of what concepts were found in the query.
Alternatively, users can start with a tried and true method that searchers have used for many years: search for the key topic in the title, then reverse engineer the results to build a more comprehensive strategy using additional keywords and classification codes discovered during the analysis. Ordinarily, this is a time-consuming process, since records are looked at individually, keywords and codes have to be jotted down, and the classification codes need to be looked up. After a new strategy is developed, the new set of results has to be evaluated for relevance, and unless the changes to the strategy are made one at a time, it can be difficult to see the impact that individual changes had on the set.
Relecura makes this process more efficient and less time-consuming by allowing the user to “OR” in new keywords on the fly, building a larger collection, and by providing mechanisms to study the impact of the changes on the types of documents being retrieved. Once a sufficiently large collection is created, it can be narrowed or prioritized by “AND”ing in concepts that bring the most likely relevant answers to the user’s immediate attention. The screen shot below shows an example of this process using the Search Trail function, which collects all of the steps a user performed during their search.
In this case, starting with tennis racket in the title, searching all available patent collections provided 1,389 equivalents and 1,693 total documents. Since more than one patent issuing authority is being searched, the system allows sorting by equivalents (direct shared priority), by INPADOC family, or by application number. On the left-hand side of the screen, the system uses guided navigation to provide the user with additional keywords that may be of interest, based on an analysis of the collection.
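To make the difference between "equivalents" and "total documents" concrete, here is a minimal sketch that collapses publications into groups sharing the same priority claims. It assumes that "direct shared priority" means an identical set of priority applications; the post does not document Relecura's exact rule, and the publication and priority numbers below are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: publication number -> priority application numbers.
documents = {
    "US-1": {"JP-2001-111"},
    "EP-1": {"JP-2001-111"},
    "JP-1": {"JP-2001-111"},
    "US-2": {"US-2005-222"},
    "US-3": {"US-2005-222", "US-2006-333"},
}

def group_equivalents(docs):
    """Group publications whose priority claims are exactly the same.

    Assumption: this is what "direct shared priority" means here.
    """
    groups = defaultdict(list)
    for pub, priorities in docs.items():
        groups[frozenset(priorities)].append(pub)
    return list(groups.values())

groups = group_equivalents(documents)
print(f"{len(groups)} equivalents, {len(documents)} total documents")
```

Under this assumption US-2 and US-3 stay separate because their priority sets differ, whereas a broader family grouping, such as an INPADOC family, would typically place them together because they share a priority.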
Three additional keywords were added, bringing the collection to 5,829 equivalents (8,083 total documents), and finally, adding two more newly suggested keywords brought it to 5,879 equivalents (8,164 total documents). Since we appeared to be approaching a point of diminishing returns, the “OR”ing in of additional keywords was stopped at this point.
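The diminishing returns can be checked with a little arithmetic on the equivalents counts reported above. The following is only an illustrative calculation; the `trail` list and the `marginal_gains` helper are hypothetical names, not part of Relecura.

```python
# Equivalents counts reported after each round of "OR"ing in keywords.
trail = [
    ("tennis racket in title", 1389),
    ("+ 3 suggested keywords", 5829),
    ("+ 2 more suggested keywords", 5879),
]

def marginal_gains(steps):
    """Return (label, added, percent_growth) for each step after the first."""
    gains = []
    for (_, prev), (label, curr) in zip(steps, steps[1:]):
        added = curr - prev
        gains.append((label, added, round(100 * added / prev, 1)))
    return gains

gains = marginal_gains(trail)
for label, added, pct in gains:
    print(f"{label}: +{added} equivalents ({pct}% growth)")
```

The first expansion adds 4,440 equivalents, more than tripling the set, while the second adds only 50, which is under 1% growth and a reasonable signal to stop broadening.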
Sorting the collection by publication date, as opposed to relevance, and looking at some of the more recent records shows that a number of documents in the set have strayed from the topic of tennis rackets. At this point, the concepts were used to start prioritizing the collection to focus on the items the user would most likely be interested in. Classification codes could have been used as well, and the system provides definitions of the codes via tooltips when the user passes their cursor over any of them.
Starting with three tennis and racket related concepts the set was narrowed to 3,106 equivalents (4,024 total documents). Looking at the most relevant records, an additional concept was “OR”ed in bringing the set to 3,121 equivalents (4,040 total documents). Relecura provides immediate feedback on the impact of new keywords and concepts on the size, and potential relevance of a collection, which can reduce the time needed to generate a reasonable set for analysis.
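The broaden-then-narrow pattern described here maps onto ordinary set operations: “OR” is a union and “AND” is an intersection. The sketch below assumes the concepts are “OR”ed together as a group and that group is then “AND”ed with the keyword collection, which is one reading of the steps above. The indexes, terms, and document IDs are hypothetical toy data.

```python
# Hypothetical toy index: keyword or concept -> set of document IDs.
keyword_index = {
    "tennis racket": {1, 2, 3},
    "racquet": {3, 4, 5},
    "string": {5, 6, 7},  # broad term that also pulls in off-topic documents
}
concept_index = {
    "tennis": {1, 2, 3, 4},
    "racket sports": {2, 4, 5},
}

def or_terms(index, terms):
    """Union of the document sets for the given terms (broadens the set)."""
    result = set()
    for term in terms:
        result |= index.get(term, set())
    return result

# Broaden with keywords, then keep only documents matching any chosen concept.
broad = or_terms(keyword_index, ["tennis racket", "racquet", "string"])
focused = broad & or_terms(concept_index, ["tennis", "racket sports"])
print(len(broad), len(focused))
```

In this toy data the keyword step collects seven documents and the concept step keeps five, dropping the two that matched only the broad keyword. “OR”ing another concept into the group can only grow the focused set, which mirrors the small increase from 3,106 to 3,121 equivalents.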
The collection can be analyzed at this point to survey the type of concepts that are prevalent within it. Using the Topic Map function, it can be seen that there are a few concepts, such as stringed instruments and golf equipment, that have continued to find their way into the collection.
The size of each bubble shows the relative number of documents in the collection that contain the concept, while the distance between bubbles provides a measure of how close the concepts are to each other. In our tennis racket example, it can be seen that the golf and stringed instrument bubbles are reasonably small compared with the tennis bubbles, and that both are a reasonable distance from the tennis cluster. These observations can assist an analyst in deciding whether documents containing these concepts should be removed from the set, and if so, they can be removed using functionality provided on the map page. This provides a nice means of gauging the likely relevance of the documents in the collection and of increasing the focus of the set, if desired. Since the concepts and all of the relationships between them are pre-generated, the system is very fast and responsive to exploration by the user.
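The two quantities behind such a map can be sketched as follows. The post does not say how Relecura measures closeness between concepts, so this example swaps in a simple stand-in: Jaccard distance over the sets of documents in which each concept appears. The documents and concepts are hypothetical toy data.

```python
from itertools import combinations

# Hypothetical toy data: document ID -> concepts extracted from it.
doc_concepts = {
    "D1": {"tennis racket", "racket frame"},
    "D2": {"tennis racket", "racket frame"},
    "D3": {"tennis racket", "racket string"},
    "D4": {"golf club"},
    "D5": {"golf club", "racket frame"},
}

def concept_docs(docs):
    """Invert the mapping: concept -> set of documents mentioning it."""
    inverted = {}
    for doc_id, concepts in docs.items():
        for concept in concepts:
            inverted.setdefault(concept, set()).add(doc_id)
    return inverted

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|: 0 for identical document sets, 1 for disjoint ones."""
    return 1 - len(a & b) / len(a | b)

by_concept = concept_docs(doc_concepts)
# Bubble size: number of documents containing each concept.
sizes = {concept: len(docs) for concept, docs in by_concept.items()}
# Bubble distance (assumed measure): Jaccard distance between document sets.
distances = {
    (a, b): jaccard_distance(by_concept[a], by_concept[b])
    for a, b in combinations(sorted(by_concept), 2)
}
print(sizes)
print(distances[("golf club", "tennis racket")])
```

Here "golf club" and "tennis racket" never share a document, so they sit at the maximum distance of 1.0, while "racket frame" and "tennis racket" overlap in half of their combined documents. A small, distant bubble is the kind of outlier an analyst might choose to remove from the set.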
There has been a lot of discussion on forward citations on this blog and Relecura adds a unique twist on this type of analysis using the Browse function. Using this feature a data set can be analyzed by assignee and browsed by which organizations have been citing the patents associated with that organization.
In this image, it can be seen that Sumitomo has the highest number of documents in the tennis racket collection, along with the organizations that cite it most frequently. An analyst can browse this collection by unselecting Sumitomo and selecting Dunlap to see how the list of citing assignees changes.
The Browse function can also be used in combination with the Assignment search to allow analysts to see which organizations in a collection are potentially selling patents and to whom they are selling them. To filter the current collection of documents to those that have undergone a re-assignment of some type, the Assignment button is pushed at the top of the results page.
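Conceptually, this kind of browsing amounts to tallying citing organizations for whichever assignee is selected. The sketch below is an assumption about the underlying idea, not Relecura's implementation; the citation records and the "Assignee A/B/C" names are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical citation records: (cited assignee, citing assignee).
citations = [
    ("Sumitomo", "Assignee A"),
    ("Sumitomo", "Assignee B"),
    ("Sumitomo", "Assignee A"),
    ("Dunlap", "Assignee B"),
    ("Dunlap", "Assignee C"),
]

def citing_assignees(records, selected):
    """Count how often each organization cites the selected assignee."""
    return Counter(citing for cited, citing in records if cited == selected)

# Switching the selection changes the ranked list of citing organizations.
print(citing_assignees(citations, "Sumitomo").most_common())
print(citing_assignees(citations, "Dunlap").most_common())
```

Changing the `selected` argument plays the role of unselecting one assignee and selecting another in the Browse view.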
It can be quickly seen, from this view, that SRI Sports has likely sold a significant part, or their entire portfolio of patents, in this area, to Sumitomo. Using the Browse function an analyst can work their way through the list investigating the other assignees in the collection, and whom they might have worked with.
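An observation like this one boils down to two numbers: how many documents moved from one owner to another, and what share of the original owner's documents in the collection that represents. The sketch below illustrates the idea on hypothetical records; the field names and the "Assignee X" entry are invented for the example and do not reflect Relecura's data model.

```python
from collections import Counter

# Hypothetical assignment records for documents in the collection.
records = [
    {"doc": "P1", "original": "SRI Sports", "current": "Sumitomo"},
    {"doc": "P2", "original": "SRI Sports", "current": "Sumitomo"},
    {"doc": "P3", "original": "SRI Sports", "current": "SRI Sports"},
    {"doc": "P4", "original": "Assignee X", "current": "Assignee X"},
]

def reassignment_flows(rows):
    """Count (from, to) pairs for documents whose owner changed."""
    return Counter(
        (r["original"], r["current"]) for r in rows if r["original"] != r["current"]
    )

def reassigned_share(rows, assignor):
    """Fraction of an assignor's documents now held by someone else."""
    owned = [r for r in rows if r["original"] == assignor]
    moved = [r for r in owned if r["current"] != assignor]
    return len(moved) / len(owned) if owned else 0.0

flows = reassignment_flows(records)
share = reassigned_share(records, "SRI Sports")
print(flows.most_common(), f"{share:.0%}")
```

A share close to 100% would support reading the transfers as the sale of an entire portfolio rather than a handful of individual patents.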
Relecura also contains functions for Graphs (various charts and graphs describing the data set), Explore (word clouds of keywords, concepts and classification codes), Citations (forward and backward citing documents), and Cluster (organizing the set based on keyword clustering) that provide valuable insight and details on features of specified collections. The company has also provided a list of additional key features not covered in this brief look at the system:
While a reasonably new entrant to the patent search and analysis field, Relecura has incorporated a number of unique and interesting features into a system that allows users to quickly explore patent data and the relationships associated with the concepts contained within it. Filtering collections with the Topic Map, Explore, and Browse functions is just one of the developments the system provides.