Reporting from II-SDV 2013 – Day 1
I am in Nice, France for a portion of the week attending the International Information Conference on Search, Data Mining and Visualization (II-SDV). The meeting provides an international forum for those in the field of advanced search applications, data and text mining, and visualization technology. The primary focus is on tools for intelligence and the meeting examines the requirements of specialists in scientific, patent and technical information. The theme is Keep in Touch with the Best Tools for Intelligence.
The meeting is organized by Harry Collier and Anne Girard in conjunction with Christoph Haxel. The attendees primarily come from Europe, but I learned about it through my association with the ICIC, which Harry started about 25 years ago. With the focus being on tools for intelligence, and in particular patent analytics, I thought the readers of this blog would be interested in what was presented during the meeting.
I have also been live-tweeting the event, and you can follow the tweets in real time by searching for the #iisdv13 hashtag on Twitter. I will be accumulating all of the tweets from the meeting and posting them separately as well. The following are a few highlights from each of the presentations from Day 1.
Copies of the presentations can be found here.
Roger Bradford – Agilex Technologies Inc, USA
The Analytics Challenges Posed by Big Data
Although his talk was not a keynote per se, it was up to Roger to kick off the meeting. He did so by providing some background on Big Data, along with some context on where he thought the field was headed and why it matters to attendees from the scientific and patent information analysis communities. Roger suggested that text analytics, Twitter mining and machine learning are areas where rapid improvements are being made, and told us that the use of graphics processing units for document clustering will be an area of continued development.
Patrick Beaucamp – Vanilla Project, Bpm-Conseil, France
Open Source Platforms to deploy Search and Maps Visualization on top of a Big Data Database
Patrick essentially heralded the end of SQL and other commercial databases as we know them today, and suggested that most analytics projects are quickly moving to Hadoop and other open source databases and analytical tools for their infrastructure.
Anton Heijs – Treparel, The Netherlands
Large scale Application of Text Mining and Visualization in the EU Fusepool Project
Fusepool is an EU-sponsored project aimed at providing resources for the Small to Medium Enterprise community, enabling them to understand the patent literature, funding sources and partnership opportunities available to them in Europe. Treparel is providing the machine learning expertise to help deal with user requirements for large text collections.
Steve Kearns – Basis Technology, USA
Big Data Triage with Text Analytics
Steve provided a general overview of text analytics processes and how they can be used effectively. He also shared with the attendees the steps for Big Data processing: Collect, Analyze and Index the data involved in the project.
Renaud Garat, Laurent Hill – Questel, France
Customizing Statistics for Sharper Analysis
An issue with out-of-the-box analytics programs can be a lack of customizability. This can diminish the value of these tools and make them difficult to deploy to end-users, who don’t recognize the output when it is not expressed in the jargon and business context they work in. Questel provided examples from their Orbit.com IP Business Intelligence module showing how the output from this tool can be adapted to the specific business needs and language of an individual corporate analysis project.
Manuel Dietrich, Markus Bundschus – Roche Diagnostics, Germany
Katrin Tomanek, Philipp Daumke – Averbis, Germany
Large-Scale Patent Landscaping@Roche Diagnostics: Experiences and Lessons Learned
The group at Roche Diagnostics took on the large task of identifying relevant patent portfolios for 15 of their most significant competitors, and designed a machine learning solution to classify these patents into 80 categories that were determined by the decision makers within the business. They claim 90% classification accuracy with their method.
David Hawking – Funnelback, Australia
Searching within the Enterprise: Making the Best of Poor User Queries
David provided an overview of the problems and potential solutions associated with poor or highly ambiguous user queries. In particular, he championed the use of blending: making logical assumptions about additional data that might be of interest or value to the user, and incorporating those results into the final result set to produce a more satisfying user experience.
Eduard Rozenberg – Dolcera, India
Automating Web Research through Customized Search Tools
Eduard described the methods Dolcera has used to identify Non-Practicing Entities from a collection of 10 days’ worth of re-assignment data from the USPTO. In addition to using lists of known organizations in this category, they also used address matching and a variety of formulas based on public information about the individual companies that can be found on the web.
Nathalie Gautier-Hamel – Lafarge, France
The Challenge of Finding and Using Appropriate Tools for Competitive Intelligence in the Field of Construction and Materials
Nathalie described the unique challenges associated with discovering intelligence in the field of construction materials. This is an area that is not associated with a great deal of patenting or publication in the traditional scientific literature, so her group turned to LinkedIn, YouTube and other web sources to try to identify new methods being developed in the area.
Solmaz Gabery – Novo Nordisk, Denmark
Challenges in Building a Future Search Centre: our Observations and Choices
The information function at Novo Nordisk is in the process of re-inventing itself, and Solmaz described the efforts being made to provide more focused and actionable intelligence to the decision makers within the organization, as opposed to providing collections of search results. What she described was a nice example of an information function re-inventing itself to move up the value chain within the company and ensure that the group’s input continues to be highly sought after.