Vive la Difference – How a Range of Search Techniques can Find Relevant Patents that Might Otherwise be Missed

Patent searching is often done using a combination of keyword and classification code searching, but is this enough to find all of the relevant patents a searcher may be looking for? This post explores that question by considering what additional patents can be found when citation-based searching is used alongside traditional methods.

Nest Thermostats have been in the news recently, after Google paid $3.2 billion for the company. They have also been in the news after being sued for patent infringement by Allure Energy.

A very common response to being sued by a patent owner is to argue that the asserted patents are invalid, often on the grounds that the patent is not novel or inventive. Among the patents owned by Allure Energy is US8024073, for an Energy Management System. This ‘073 patent has just one independent claim, a method for the remote management of a wireless home energy network.

Patents are commonly searched using both keyword and class code searching. In addition, searchers are increasingly using the backward citations of the patents they are trying to invalidate.

Australian company Ambercite was founded to further develop patent citation searching. The company has developed a series of algorithms to rank both patents and patent citations based on connections in the broader citation network.

As part of this, Ambercite has launched an interactive web application called AmberScope to allow users to navigate and explore the networks associated with these citations. As an example, the AmberScope map for US8024073 (‘073) is shown below:

AmberScope image of US8024073

To see a live version of the map click here.

More recently, Ambercite has developed Automated Patent Search processes to identify relevant patents that are not directly connected to the patents of interest, such as patents a user might be trying to invalidate or license. These Automated Search Reports can also rank the direct backward citations based on their similarity to the patent being searched.

But how well do these different types of searches work? Can a searcher afford to use one method of searching and exclude the others?

To explore these questions, the search results from an Automated Patent Search were compared with what a professional patent searcher could find for the same patent. For this example, a professional patent searcher was asked to do the following:

  • Prepare a reasonably tight keyword search for the ‘073 patent, and provide a personal assessment of the likely relevance of the results found. A three-point scale was used: the most relevant (similar) patents were rated ‘1’, potentially relevant patents ‘2’, and the least similar patents ‘3’
  • Prepare a reasonable class code search based on the ‘073 patent
  • Look up all of the backward citations for ‘073, and provide the same personal assessment of their likely relevance

For the comparison, Ambercite ran an Automated Search Report for ‘073, and the professional searcher ranked the patents found using the same criteria used previously.

Results

The different searches are shown below, along with a summary of what was found, and the relevance of the patents as assigned by the professional patent searcher.

Each entry lists the patent search, the basis of search or search query used, and the results (patents found).

A) Keyword
  • Basis of search: two keyword queries
    S1 (153 patent families):
    TAB=((Energy and (manag* or optim* or minim* )) AND (Network* and (home or domestic or house or residen* or site) AND (appliance or equipment or machine or device or apparatus)) AND ((Wire ADJ less or wireless) NEAR5 (server or remote or communicat* or user interface))) AND (PRY<(2010) or PY<(2012))
    S2 (455 patent families):
    TAB=(remote* near5 (computer or server or network) and (communicat* or transmi* or receiv*) NEAR10 (wireless) and (control*) NEAR5 (energy or temperature or power or data) AND (home or domestic or house or residen* or site or locat*)) AND (PRY<(2010) or PY<(2012));
  • Results: 595 Derwent patent families
  • Relevant: 47 (8%); Potentially relevant: 128 (21%); Not relevant: 418 (70%)

B) Classification code
  • Basis of search: G05B 11/01 (Adaptive control systems); G05D 23/19 (Control of temperature, by use of electric means); G06F 1/26 (regulation of power supply)
  • Results: 10,652 Derwent patent families
  • Not classified for relevance

C) Conventional backward citation search
  • Basis of search: All listed backward citations as listed by Thompson Innovation
  • Results: 18 patents
  • Relevant: 4 (22%); Potentially relevant: 8 (46%); Not relevant: 6 (33%)

D) 10 most similar backward citation patents
  • Basis of search: Most similar patents, as ranked by the similarity filter in AmberScope or Ambercite Automated Patent Search reports
  • Results: 10 patents
  • Relevant: 3 (30%); Potentially relevant: 5 (50%); Not relevant: 3 (30%)

E) Automated patent search for the most similar indirectly connected patents
  • Basis of search: A search for indirectly connected patents predicted to be similar to the ‘073 patent using the Automated Patent Search system available from Ambercite. These patents were ranked in terms of predicted relevance
  • Results: 59 patents
  • Relevant: 6 (10%); Potentially relevant: 15 (25%); Not relevant: 38 (65%)
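One way to read the counts above is to ask how many patents a reviewer has to look at, on average, for each patent rated relevant. The sketch below is illustrative only: the counts are copied from the results table, while the `review_load` metric and its name are assumptions introduced here and are not part of the original study. The class code search is left out because it was not classified for relevance.

```python
# Counts copied from the results table: (patents found, patents rated 'relevant').
# "review_load" is a hypothetical metric introduced here for illustration only.
RESULTS = {
    "A) Keyword": (595, 47),
    "C) All backward citations": (18, 4),
    "D) 10 most similar backward citations": (10, 3),
    "E) Indirectly connected (automated)": (59, 6),
}


def review_load(found: int, relevant: int) -> float:
    """Average number of patents reviewed per patent rated relevant."""
    if relevant <= 0:
        raise ValueError("at least one relevant patent is required")
    return found / relevant


if __name__ == "__main__":
    for name, (found, relevant) in RESULTS.items():
        load = review_load(found, relevant)
        print(f"{name}: {load:.1f} patents reviewed per relevant patent")
```

A lower value means less reading per relevant result, but it says nothing about whether a method finds patents the other methods miss, which is the question the overlap comparison addresses.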

In a study like this, reviewing the patents found takes time, so a smaller data set is preferred where the situation allows. In this case, although relatively specific IPC codes were used, they generated 10,652 patent families, which is clearly too many to review manually. Ordinarily, IPC codes like these would be combined with keywords or additional IPC codes to create a smaller collection, but a comparison can still be made between the keyword results and the citation-based results.

It is important to mention at this point that different search objectives can dictate how many references need to be reviewed, regardless of how they are generated. If the results can be sorted by potential relevance, then it might be possible to review only the most relevant documents instead of looking at all of them. In certain cases, some analysts focus only on the backward citations provided by the examiner, but relying on these alone can lead to highly relevant patents being missed. A combination of approaches provides the best opportunity for identifying the most relevant references.

Did the different search results overlap? 

Another very important question is how much the search results overlap. The overlap between the collections generated in this study is shown in the table below:

Type of search | # of patents or patent families | Also found in A) Keyword
A) Keyword | 595 patent families | (not applicable)
C) All listed backward citations | 19 patents | 7
D) 10 closest patents as predicted by Ambercite | 10 patents | 5
E) Indirectly connected patents found by Ambercite | 59 patents | 4

The results verified what most professional searchers intuitively know: different search methods generate additional relevant results, and searchers should never rely on a single source or method when trying to conduct a comprehensive search. Specifically, the following observations were made:

  • Seven of the 595 patent families found in the keyword search were listed as backward citations for the ‘073 patent
  • Only four of the 59 patents found in the automated search for indirectly connected patents were also found in the keyword search
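An overlap comparison of this kind can be sketched with ordinary set operations. This is a minimal illustration, not the process used in the study: the function name `compare_result_sets` and the patent identifiers are placeholders invented for the example.

```python
def compare_result_sets(first: set[str], second: set[str]) -> dict[str, set[str]]:
    """Split two search result sets into shared and method-specific patents."""
    return {
        "both": first & second,          # found by both searches
        "only_first": first - second,    # found only by the first search
        "only_second": second - first,   # found only by the second search
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real search results.
    keyword_hits = {"P-001", "P-002", "P-003", "P-004"}
    citation_hits = {"P-003", "P-004", "P-005"}

    report = compare_result_sets(keyword_hits, citation_hits)
    for label, patents in report.items():
        print(label, sorted(patents))
```

Applied to the figures reported here, four of the 59 indirectly connected patents also appeared in the keyword results, which leaves 55 that the keyword search did not return.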

Take home message – Different search approaches can give very different results

This simple project has shown that what users find when they search for patents depends heavily on how they search. Different techniques can give very different results, both in the number of patents that need to be reviewed and in their likely relevance. For this reason, Ambercite has developed its automated search tools, so that searchers and their clients have the option of finding sets of patents that are:

  • Relevant to their patent of interest
  • Not found by more conventional search processes
  • Ranked by relevance in order to save time during the review process

What tools should you use?

Although this comparison has contrasted citation searching with keyword and class code searching, they are in fact complementary processes, and as mentioned previously, it is recommended that searchers use all available techniques for important searches.

For additional information, Ambercite has provided a list of case studies and would be interested in hearing from users who would like a sample report or trial access.

Acknowledgements

The support of Mike Lloyd, George Mokdsi and Sandy Robb from Griffith Hack is gratefully acknowledged.

 


 
 

discuss this post

  • Frazer McLennan

    Hi Tony
    Nice analysis. This stuff has always interested me. While I am sure these tools have their uses, I prefer the traditional methods, and in the end you always need a human to check the machine’s results anyway. The necktop computer is far more capable of distinguishing between what is and isn’t relevant than a machine will ever be just because of the vast subtleties of the English language.
    I would have liked to see an analysis of one of the semantic search tools using claim 1 of US8024073 as the subject, to see how it compared with both the human and Ambercite’s results. Would it compare more favourably with a fellow machine, or would it lean towards replicating the human?
    Cheers
    Frazer

    • Anthony Trippe

      Hello Frazer,

      Thank you for the kind words. I think professional searchers know they will get different results from different methods, but it is good to make some comparisons from time to time.

      As for semantic search tools, what you suggest would indeed be interesting, and I think the results will depend on whether a supervised, or unsupervised method was used. Supervised methods are more likely to return highly ranked results based on input from the person training it. So, while I agree completely that a “necktop” computer is required, I also believe that using machine learning can make the process more efficient.

      Thanks again,
      Tony

 
 
