
Using Patent Statistics in your Daily Work - A Report from EPO PIC 2014

On Tuesday, November 4th, I had the distinct honor of giving one of the keynote addresses at the 2014 European Patent Office Patent Information Conference in Warsaw, Poland. It was my first time attending the conference, and I was really impressed by the event. With over 400 participants from more than 25 countries, the meeting was well attended, and the content provided was truly outstanding. I have often heard that this is one of the premier patent information conferences in the world, and after participating I can honestly say that the reputation is absolutely deserved.

While there were many highlights of the week in Warsaw, the one that stands out the most was the set of discussion rounds that took place before the official launch of the conference on Tuesday morning. I was pleased and honored to have been asked to chair the session on patent statistics. I was joined in this endeavor by Martin Kracker and Christian Soltmann, the designated EPO experts on this topic, who provided immeasurable expertise and assistance in running the round.

The individuals who signed up for the discussion round were asked before the meeting to answer a questionnaire on their experiences with patent statistics. A summary of the results can be found below:

Martin, Christian and I also produced a summary of the discussion round itself, providing the highlights of the session. Those materials are also provided:

Finally, the team generated full written notes from the round, which I am providing in their entirety in this post:

Report: Discussion Round 7
Using Patent Statistics in your Daily Work

4 November 2014 / 10.45 – 12.15 hrs

Chair: Anthony Trippe, Patinformatics, LLC

EPO experts: Christian Soltmann, Martin Kracker

Patent statistics have long been used in the field of economics. However, they are also increasingly used by IP professionals in companies, patent law offices and academic institutions to cope with the ever-increasing number of patent documents in their fields of activity, and by organisations looking to achieve a competitive advantage.

The 25 patent professionals participating in this discussion round were from corporations, commercial providers, academia and national offices. The practical and lively discussion focused on common problems, shared good practices and effective workflows:

Patent family reduction

An earlier study by the chairman showed that 95% of patent landscape reports use the INPADOC family. However, most participants agreed that in most cases DWPI, FamPat, the DOCDB simple family or individually defined family concepts (e.g. one which takes the US continuation-in-part issue into account) are more appropriate.
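Whatever family definition is chosen, the reduction step itself is usually the same: group the publications by family identifier and keep one representative per group. The following is only a minimal sketch of that idea. The records, the `family_id` values and the rule of keeping the earliest publication are all illustrative assumptions, not something prescribed in the discussion round.

```python
from collections import defaultdict

# Hypothetical sample records: (publication number, family ID, publication date).
# The family ID would come from whichever family definition you adopt
# (INPADOC, DOCDB simple family, a custom definition, ...).
records = [
    ("EP1000001A1", "F1", "2012-03-01"),
    ("US2012000001A1", "F1", "2012-01-15"),
    ("WO2011000001A1", "F1", "2011-07-20"),
    ("CN100000001A", "F2", "2013-05-05"),
]


def reduce_to_families(records):
    """Return one representative publication per family.

    Assumption: the representative is the earliest publication;
    ISO-formatted dates sort correctly as plain strings.
    """
    by_family = defaultdict(list)
    for publication, family_id, date in records:
        by_family[family_id].append((date, publication))
    return {fam: min(members)[1] for fam, members in by_family.items()}


representatives = reduce_to_families(records)
print(representatives)
```

Because the family identifier drives the grouping, switching from one family definition to another changes the number of groups, and therefore every count derived from them.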

Patent name harmonisation

There was general agreement that there is no single correct way to harmonise names or to structure company trees, which may also change over time. As a long-term solution, it was suggested that patent offices require and publish a globally unique ID for applicants and inventors.

Of course, company information is already integrated in many commercial IP databases, but it is usually not sufficient on its own, so additional measures typically have to be applied:

  • Create and maintain your own company database (or just an Excel sheet)
  • Do manual data harmonisation, e.g. with tools like OpenRefine
  • Use the outcome of name harmonisation projects provided for free from KU Leuven or OECD
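As an illustration of the "own company database" and manual harmonisation options above, here is a minimal sketch of a two-step approach: rule-based cleaning followed by a hand-maintained lookup. The suffix list, the mapping table and the function names are hypothetical and far from complete; this is not how OpenRefine or the KU Leuven and OECD projects work internally.

```python
import re

# Hypothetical list of legal-form suffixes; a real list would be much longer.
LEGAL_FORMS = {"inc", "corp", "corporation", "co", "ltd", "llc", "gmbh", "ag", "sa", "bv"}

# Hypothetical hand-maintained mapping from cleaned names to a preferred name,
# standing in for "your own company database (or just an Excel sheet)".
MANUAL_MAP = {
    "intl business machines": "IBM",
    "ibm": "IBM",
}


def clean_name(raw):
    """Lower-case, strip punctuation and drop trailing legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    while tokens and tokens[-1] in LEGAL_FORMS:
        tokens.pop()
    return " ".join(tokens)


def harmonise(raw):
    """Return the preferred name if one is mapped, else the cleaned name."""
    key = clean_name(raw)
    return MANUAL_MAP.get(key, key)


for name in ["IBM Corp.", "Intl. Business Machines Corporation", "Acme Widgets Co., Ltd."]:
    print(name, "->", harmonise(name))
```

Rules like these only catch spelling variants. Company trees, mergers and acquisitions still need the manually curated mapping, which is why the participants saw no single correct solution.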

Multi-valued data

A patent document may have more than one applicant, inventor, classification symbol, etc. Commercial tool providers wanted to know whether only one or all values should be provided in their reports. Participants thought that this trade-off between completeness and comprehensibility / simplicity could not be decided a priori: it depends on the information needed.

Another option would be to define one “primary” value for a patent, while all other values are regarded as “secondary”. If precision was required, “fractional counting”, in which each of a document's n values is credited with 1/n of the document, was the safest way to go.

The merits of one representative data element per document vs. including all elements found within the documents were discussed. It was generally accepted that it could sometimes be difficult to explain the number of elements present when multiple values were allowed.
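The difference between counting every value in full and fractional counting can be made concrete with a small sketch. The documents and applicant names below are invented for illustration. With whole counting the applicant totals add up to more than the number of documents, which is the effect that can be hard to explain to a reader, whereas fractional counts always sum to the number of documents.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical documents, each with one or more applicants.
docs = {
    "DOC1": ["Alpha", "Beta"],
    "DOC2": ["Alpha"],
    "DOC3": ["Alpha", "Beta", "Gamma"],
}


def whole_counts(docs):
    """Credit every applicant with one full document."""
    counts = Counter()
    for applicants in docs.values():
        counts.update(set(applicants))
    return counts


def fractional_counts(docs):
    """Credit each of a document's n applicants with 1/n of the document."""
    counts = Counter()
    for applicants in docs.values():
        unique = set(applicants)
        for applicant in unique:
            counts[applicant] += Fraction(1, len(unique))
    return counts


print(whole_counts(docs))       # totals sum to 6 for 3 documents
print(fractional_counts(docs))  # totals sum to exactly 3
```

`Fraction` is used instead of floating point so that the totals add up exactly; for reporting, the values can be converted with `float()`.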

Place of invention origin

There seem to be two best practices for identifying the place of origin of an invention:

  • Take the country of the first filing
  • Take the country / address of the inventor

Both of these methods are suboptimal and can reflect something other than the place where the invention was actually made, such as the location of the central patent prosecution facility. However, no reliable additional methods were suggested in the discussion round to overcome these issues.
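Since neither heuristic is reliable on its own, one pragmatic option is to compute both and flag the documents where they disagree for manual review. The sketch below assumes hypothetical record fields (`first_filing_country`, `inventor_countries`) and was not proposed in the discussion round; it does not solve the underlying problem, it only makes the uncertain cases visible.

```python
# Hypothetical records; the field names are assumptions for this sketch.
records = [
    {"id": "DOC1", "first_filing_country": "US", "inventor_countries": ["DE", "DE"]},
    {"id": "DOC2", "first_filing_country": "JP", "inventor_countries": ["JP"]},
    {"id": "DOC3", "first_filing_country": "EP", "inventor_countries": ["FR", "IT"]},
]


def origin_candidates(record):
    """Return the origin suggested by each of the two heuristics."""
    return {
        "by_first_filing": record["first_filing_country"],
        "by_inventor": sorted(set(record["inventor_countries"])),
    }


def heuristics_disagree(record):
    """True unless all inventors reside in the country of first filing."""
    candidates = origin_candidates(record)
    return candidates["by_inventor"] != [candidates["by_first_filing"]]


flagged = [r["id"] for r in records if heuristics_disagree(r)]
print(flagged)
```

Note that a regional first filing (the "EP" record here) will always be flagged, because it cannot match a single inventor country.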

How to handle the large amount of CN data

There is a high number of Chinese patents (and of patents from other countries such as JP and KR), but their value for patent analysis is sometimes not clear. If these patents are not sufficiently representative for the sample to be analysed, they could be ignored. Alternatively, some of the analyses could be done with and without the country data in question.

It was also suggested that Chinese patent documents that were also filed internationally be identified, and that only these documents should be added to the final analysis collection. The thought was that single Chinese country filings might not be particularly relevant in the type of studies typically performed when doing patent analytics.
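The suggestion of keeping only Chinese documents that were also filed internationally can be expressed as a simple filter on the set of authorities in each family. The family data and function names below are hypothetical. Returning both the kept and the dropped families also makes it easy to run an analysis with and without the domestic-only filings.

```python
# Hypothetical data: family ID -> set of publication authorities in the family.
families = {
    "F1": {"CN"},
    "F2": {"CN", "US", "EP"},
    "F3": {"DE", "FR"},
    "F4": {"CN", "WO"},
    "F5": {"CN"},
}


def is_domestic_only(authorities, country="CN"):
    """True if the family was published in the given country and nowhere else."""
    return authorities == {country}


def split_collection(families, country="CN"):
    """Split family IDs into (kept, dropped) by the domestic-only rule."""
    kept = {fam for fam, auths in families.items() if not is_domestic_only(auths, country)}
    dropped = set(families) - kept
    return kept, dropped


kept, dropped = split_collection(families)
print(sorted(kept), sorted(dropped))
```

The `country` parameter allows the same rule to be applied to JP or KR filings, which were mentioned as raising similar questions.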

Understanding the client’s need

The patent professional must know which business decision his analysis has to support, e.g. he must understand its financial and strategic implications.

In short: the patent professional has to understand “the need behind the need”, even if his client might not readily disclose this initially.

Without understanding the context in which the intelligence is going to be used by the decision maker, it is highly likely that the data provided will not be aligned with what the business needs to make its decisions more efficiently.

Communication of methodology to client

For every analysis, certain assumptions have to be made, e.g. the selection of the date to be used: first filing date, application date or publication date?

Many participants will not communicate their assumptions as long as they apply their methodology consistently, because typically the client is focusing on the result only. Other participants will describe the assumptions in small print in the report.

For the purposes of reproducing the work later, some acknowledgement of the methods used should be recorded, but it is up to the analyst to use the methods that are most likely to produce the most meaningful results, and to understand what the implications of their decisions are. In order to avoid confusing clients with the details of dealing with patent information, not all of the minutiae involved in coming to these conclusions will necessarily be shared.

A number of wishes were expressed in the meeting and from the discussion round questionnaire. These included:

  • Offices should assign a unique ID for applicants and inventors
  • The clients should explain the (business) need behind the (information) need when commissioning patent statistics
  • Examples of good practice should be available
  • Appropriate training should be provided
  • An ongoing Community of Practice should be established to provide for the previous bullet points on examples and training

Anthony Trippe, Patinformatics, LLC

Chairman, Discussion Round 7

Besides the discussion on patent statistics, there were six additional discussion rounds held on Tuesday and a super workshop for power users held on Thursday. Notes and summaries of all of these sessions, along with copies of most of the conference presentations, can be found at the meeting program page –

We had a fantastic discussion on Tuesday, highlighted by the participation of some of the world’s experts in patent analytics from all corners of the patent information world, including industry practitioners, government policy makers, academic scholars and patent information providers. All of us involved in producing the session hope that the larger patent analytics community will also be able to take advantage of the notes from this meeting.
