Information provided by patents.
Innovation is today one of the best ways to improve competitiveness and to create jobs. In this context, the knowledge and competencies developed in academic laboratories and research centers must be transformed into products and services. To achieve this goal, patent information is one of the best ways to understand the application areas related to the research, to find partners, and often to shift academic subjects toward more relevant domains. This chapter concerns patent information retrieval and automatic patent analysis (APA). It shows how the results are useful for research valorization and transfer, especially to SMEs.
- automatic patent analysis
- public and private partnership
In two articles published in 2016 in “Vie et Science de l’Entreprise,” Battini and Dou underlined the need to promote SMEs and innovation in order to develop employment on a large scale. In this chapter, we develop these concepts by outlining, on the one hand, the main step of innovation and, on the other hand, how this step can be catalyzed by patent information analysis.
2. The main step in innovation
Many people confuse innovation with invention. The two are different; what matters here is the main step in the innovation process, as described in the work financed by the European Union and developed by Vinnova. According to this study, innovation develops in two steps:
The first step is the financial support of the state to the universities and research centers to create knowledge and competencies.
The second step consists of transforming these competencies and knowledge into “money” and this is the real main step of the innovation process.
This second step induces various consequences:
An increasing move toward the development of links between academic research and industry (in our case SMEs).
The development of public and private partnerships (PPP).
The valorization of academic research.
Under these conditions, before turning to the catalytic role of patent information analysis, it is important to discuss the brakes and levers that often hinder or help the development of industrial relationships between academic research and SMEs. In most countries, researchers, as well as their universities or research centers, are evaluated according to the ranking of top papers published in journals that are seldom read by industrialists. As a result, working more closely with industry, and more specifically with SMEs, acts as a brake on an academic researcher’s career, since part of that work will not be devoted (as the expert evaluators will say) to fundamental aspects of research. Various voices have risen to describe what Gingras called the “bibliometrics fever” and its negative role in the development of multidisciplinary R&D.
Even though we criticize this state of affairs, we are not going to change it here. Doing so would take a long time and a real change in mindset. What we propose instead is a way, while remaining within the “canons” of fundamental research, to move from this situation to one closer to what the man in the street may expect of national research output. The objective is not to change the way research is conducted, but to give researchers a way to get closer to industrial applications, and industrialists a way to open forward-looking and fruitful discussions with academics. The stakes are clear: if nothing is done, the part of fundamental research that can be transferred to industry will remain small, but if a slight move is made toward possible applications, more transfers will occur. This virtuous spiral between research funding and application is now named social research responsibility (SRR).
A few years ago, Zerhouni, at that time the director of the NIH (National Institutes of Health of the United States), made the following statement: “The success of American scientific research depends on the existing implicit partnership between academic research, the government and industry. The research institutions have the responsibility to develop the scientific capital. The Government finances the best teams by a transparent system of selection. Industry holds the critical role to develop robust products intended for the public. This strategy is the key of American competitiveness and must be maintained.” This statement echoes the work of Vinnova in the Interreg III program cited above.
Another problem, when we speak of close relationships between academia and industry, is the multidisciplinary approach. Today most industrial problems and developments require a multidisciplinary context. But again, because of the criteria used to evaluate researchers, multidisciplinarity is absent most of the time. In a recent issue of Nature, a group of authors pointed out that solving the problems facing the world (pollution, climate change, population, health, starvation, water supply, etc.) requires close cooperation between various disciplines. “But research that transcends conventional academic boundaries is harder to fund, do, review and publish — and those who attempt it struggle for recognition and advancement. This special issue examines what governments, funders, journals, universities and academics must do to make interdisciplinary work a joy rather than a curse.” Industrial cooperation is a good way to promote multidisciplinary research, and patent analysis is one of the best ways to show people why this is necessary and how it can be developed.
2.1. How patent information may change the vision of research
Most of the time, patents are seen as a tool to protect a product or an application and thus to grant a monopoly of exploitation for 20 years. Many considerations on the role of patents in this area have been published, but the goal of this chapter is not to look into this aspect but rather into the information that patents provide. One of the largest patent databases available is the world patent database from the EPO (European Patent Office), which provides more than 90 million patent notices from more than 90 countries. Other databases such as PATENTSCOPE (World Intellectual Property Organization, WIPO), the European Patent Database (EPO), or national patent databases (US, Japan, European countries, etc.) are available via the Internet free of charge. This tremendous amount of information forms a living technical encyclopedia, always up to date, whose fields are described in detail in the Glossary of Patent Terminology [9, 10]. The fields available and useful in patent analysis are indicated in Table 1.
|Type of data||Description||Usefulness|
|Title TI||Title words||Briefly describes the purpose of the patent|
|Applicant(s) AP||Patent owner(s)||Knowledge of new entrants, main leaders, and old-timers. Useful to establish contacts or to examine the company site|
|Inventors IN||Inventors||Can be the same as applicants. Useful to establish contact, or to trace the people on Internet or social networks|
|Abstracts AB||Describe the patent purpose||Useful for a quick view of the patent coverage|
|International Patent Classification IPC||Describe the technologies or applications. From 1 to 8 digits according to the precision||Technology mapping, technology trend, application areas, and so on|
|Application date||The patent application date is the date on which the patent office received the patent application||The first two characters of the associated application number represent the country. Searches can be done to find a specific patent or the patents from the same country, for example, FR* (* = truncation)|
|Priority date PR||The filing date of the first application is considered the “priority date”||Date from which the protection starts. Delay of extension of the patent to other countries, 12 months from the PR|
|Priority country||Country where the patent is first filed before being (possibly) extended to other countries||Allows searching by country using the first two characters and truncation|
|Claims||These define the invention that the applicant wishes to protect||Help to understand the scope of the patent|
|Description||Details on the patent||Describes the patent in detail|
|Drawing(s)||Complement of the patent description||Help the expert to understand the patent description|
|Citations||Examiner(s) of patent scientific and technical novelty may cite other patents or non-patent literature relevant to the patent examined||Help to gather patents related to the same invention or to detect among these patents the ones which are the most important (the most cited)|
|Span of a patent||Generally 20 years||After 20 years, the patent is in the public domain|
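As a minimal sketch, the fields of Table 1 can be held in a small record type before any analysis. The class name `PatentNotice` and its attribute names are assumptions made for illustration, not the schema of any patent database; the sample values come from the Merck record discussed in this chapter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PatentNotice:
    """Illustrative container for the bibliographic fields of Table 1."""
    number: str                                      # publication number
    title: str                                       # TI
    applicants: List[str]                            # AP
    inventors: List[str]                             # IN
    abstract: str = ""                               # AB
    ipc: List[str] = field(default_factory=list)     # IPC codes
    application_date: str = ""                       # ISO date string
    priority_date: str = ""                          # PR, ISO date string
    cited: List[str] = field(default_factory=list)   # cited patents

    def country(self) -> str:
        """Two-letter office code at the start of the publication number."""
        return self.number[:2]


notice = PatentNotice(
    number="US4284784A",
    title="Preparation of 4-methyl thiazole",
    applicants=["Merck & Co Inc"],
    inventors=["Ho Sa V"],
    application_date="1980-05-22",
    priority_date="1980-05-22",
)
print(notice.country())
```

Keeping applicants, inventors, and IPC codes as lists matters for the later analyses, since a single patent can carry several of each.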
The information provided by patents is very important since it bridges the gap between fundamental research and markets. Patents are therefore a perfect tool to begin answering one of the first questions that a researcher should ask: How useful is my research? Generally, researchers know the fundamental purpose of their work (even if it is the continuation of a specialty that was useful at a certain time but is obsolete today), but because research is vertical, they do not grasp all the ins and outs of their subject, especially from the application point of view. One of the best illustrations of this situation appears when one examines the references in a scientific paper: most of the time, no patent is cited. This highlights the gap that exists between fundamental research and industry. The situation is not the same on the patent side, since in US, European, and World patents the examiners often cite the so-called “non-patent literature,” which consists mostly of scientific papers.
Patent information is thus a good way to overcome the “technical illiteracy” of some people, and it helps to “open the window” through which researchers can see the facets related to their research. Today, there are no barriers to using these facilities, since most patent databases are available on the Internet. It is not a question of access; it is a question of goodwill and of developing indicators that evaluate the researcher’s output in a better way. The new role of the university is not only to teach and to do research, but also to help its environment become more prosperous.
2.2. The positioning of a research topic
Most of the time, when a researcher joins a research laboratory to prepare a PhD or as a full-time researcher, he or she continues research closely linked to the laboratory’s specialties. This is the general rule since the laboratory wants to consolidate its position. But as a result, various aspects of the subject will not be clearly understood, because of the vertical specialization and because some subjects may need a multidisciplinary approach. This sort of “technical illiteracy” has been analyzed, and the role of patent information as a way to remedy it has been underlined. One example is the content of the bibliographies of scientific papers, where almost no patent citations can be found, which underlines the gap between academic research and the industrial world. In World (PCT), European, and US patents, by contrast, the examiners often cite the non-patent literature, pinpointing the links between research and application. The same situation occurs in chemistry when people look for a synthesis protocol: they go to the classical academic literature, forgetting that very detailed synthesis protocols may be found in the patent literature.
Example: A laboratory specialized in heterocyclic synthesis is interested in the various processes available to produce 4-methyl thiazole and 4-methyl thiazole derivatives. It can find some answers to these two questions through a rapid search of the world patent database. Syntheses are already described in classical scientific papers and, if not patented, they are in the public domain and mainly concern research. What we are looking for here is the trend in industrial chemistry, that is, the syntheses patented by companies in that domain.
Query: “4-methyl thiazole” AND (preparation OR synthesis) in the patent titles (to be more precise) from 1930 to date (November 2016). The quotation marks indicate a string search for this exact expression.
Note that the formulation of the question will also give the 4-methyl thiazole derivatives.
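The Boolean logic of this query can be sketched locally on a list of titles. This is only an illustration of how the exact phrase combines with the OR group; it does not reproduce the search engine of the patent database, and the last two titles are hypothetical.

```python
def matches(title: str) -> bool:
    """Exact phrase AND (preparation OR synthesis), case-insensitive."""
    text = title.lower()
    has_phrase = "4-methyl thiazole" in text
    has_method = "preparation" in text or "synthesis" in text
    return has_phrase and has_method


titles = [
    "Preparation of 4-methyl thiazole",
    "Synthesis of 4-methyl thiazole",
    "Preparation of 2-isopropyl-4-methyl thiazole",  # hypothetical derivative title
    "Thiazole dyes for textiles",                    # hypothetical, should not match
]

hits = [t for t in titles if matches(t)]
for t in hits:
    print(t)
```

The third title matches because the phrase is found inside the name of the derivative, which is why the query also returns 4-methyl thiazole derivatives.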
The search selects 12 patent families covering 20 patents, from which two patents are highlighted here:
Patent number: US4284784A Preparation of 4-methyl thiazole
Legal status: Unknown
Publication date: 1981-08-18
Applicant(s): Merck & Co Inc
Inventor(s): Ho Sa V
Priority number: US15228280A 19800522
Application number: US15228280A 19800522
English abstract: An improved process for the preparation of 4-methyl thiazole is disclosed. The process utilizes a substituted imine and sulfur dioxide heated in the presence of a catalyst. The thiazoles are important chemical intermediates.
Non-patent literature: Adams, Journal of Catalysis II. 1968;pp. 96-112
Colebourne et al., Journal of Chemistry. 1968;pp. 685-688
English title: Synthesis of 4-methyl thiazole
French title: Synthese De 4-methylthiazol
Patent Number: CA2053428A1
Legal Status: Unknown
Publication Date: 1992-04-16
Applicant(s): Merck & Co Inc
Inventor(s): Gortsema Frank P, Sharkey John J, Wildman George T, Beshty Bahjat S
Priority Number: US59763990A 19901015, US76703091A 19911001
Application Number: CA2053428A 19911015
CPC: C07D277/22, B01J29/40, B01J2229/26, B01J2229/42
IPC: B01J29/035, B01J29/40, B01J29/70, B01J29/80, C07B61/00, C07D277/22
English abstract: Isopropylidene methylamine is reacted with SO2 to form 4-methyl thiazole in the presence of a modified zeolite catalyst that has been ion-exchanged with an ammonium salt and porefilled with an alkali metal salt.
Further information about the synthesis protocol can be found in the patent description, which is also available online. Note also the presence of the non-patent literature.
The citing documents (available with the same search) are the patents published after 1981 that cite this patent. They not only provide further information but also indicate that the cited patent is important. Since this patent dates from 1981, it is now in the public domain and can be used freely.
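The public-domain reasoning can be checked with simple date arithmetic. The sketch below assumes the nominal 20-year span given in Table 1, counted from the application date; actual terms depend on the patent office, the filing era, and the payment of fees, so the legal status should always be verified. The function name `nominal_expiry` is hypothetical.

```python
from datetime import date

NOMINAL_TERM_YEARS = 20  # "generally 20 years" (Table 1); an assumption, not legal advice


def nominal_expiry(filing: date, term_years: int = NOMINAL_TERM_YEARS) -> date:
    """Filing date shifted by the nominal term."""
    try:
        return filing.replace(year=filing.year + term_years)
    except ValueError:
        # 29 February shifted into a non-leap year falls back to 28 February
        return filing.replace(year=filing.year + term_years, day=28)


filing = date(1980, 5, 22)     # application date of US4284784A
query_day = date(2016, 11, 29)  # date of the drone query reported later in this chapter
expiry = nominal_expiry(filing)
print(expiry, "expired" if expiry < query_day else "in force")
```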
Access to 4-methyl thiazole derivatives may also be selected from the results:
AU3280178A - 2-isopropyl-4-methyl thiazole for fruity flavor, etc.
2.3. The methodology
In the example above, the number of patents selected is small because the subject is very restrictive. But because the number of patents increases every year, many patent searches give a much larger set of answers. This is also due to the methodology, which aims to capture all the ins and outs of a subject, in contrast to “classical documentation searches” where the goal is to obtain a very precise answer with the least possible noise. In our case, a broader query ensures that all the “contours” of the subject fall “inside the results” of the search. The question then becomes how to deal with this large number of patent notices, which often reaches 1000 or more. This problem can only be solved by coupling the query system with a system of automatic analysis. This is what is called automatic patent analysis (APA). The following presentations and analyses were done with the Patent Pulse software.
The principle of the APA is the following:
Make a query on the remote patent database, download the results, and at the same time format the patent notices so that the various correlations (charts, matrices, and networks) can be computed to answer the classical questions: who is doing what, how, with whom, when, why, and so on. Figure 1 indicates the two ways to deal with the problem:
Install permanent software on your computer, create a downloaded database on your local computer, and perform the analysis locally.
Use your computer as a terminal, do the same things on a remote computer, and access and store the results if necessary.
In the external mode, cooperative work is favored, with total privacy between the users who share their data. When a firm gets a license, this license may cover one or more users. These users may exchange results and data stored in one or several folders that they create, upon acceptance of the “exchange request.” These exchanges, even if the users belong to the same company, are visible only to the users engaged in the exchange. When different firms hold licenses, each covering one or several users, the same process may occur between users of different companies. If a user wants to contact somebody without a license, an exchange is still possible: it does not go through the platform, but a patent notice (with or without your own comment) can be sent by email (this option is native to the platform). The two types of systems are presented in Figure 1.
Returning to the search on 4-methyl thiazole: at the same time as the results (patent families) appear on the screen, the right part of the screen shows boxes with histograms of dates, inventors, applicants, IPCs, and so on. The left part of the screen deals with your own results and the results shared with other people. At the top right of the screen are the notifications for alerts on various subjects, pending or accepted connections with other Patent Pulse users, help, tutorials, and account settings. The results of the search are listed in the middle of the screen. Clicking on a patent opens a window containing the bibliographic data, the abstract, the patent family, the cited patent(s), the non-patent literature, the patent status, the claims and the description, the INPADOC status, and a link to the full text of the document. The main screen of the Patent Pulse system is presented in Figure 2.
The correlations that show who is doing what (between applicants and IPC), or the network of inventors or of applicants and inventors, etc., can be obtained by selecting the corresponding options (top right of the screen). Figure 2 shows the full screen of the Patent Pulse application after the query. Note that since access to the patent database is free, the only charge is the subscription to the APA software (local or external). With the external option (Patent Pulse), there is no need to install software on your computer (as is the case for Matheo Patent), which avoids any issue with computer administration.
The bibliometric treatment of patent information has been the subject of various scientific publications, which present in detail the methodology and results in various areas of science. See, for instance, the book Risks Diagonal and Innovation, as well as the use of APA in developing countries and more specific applications such as avian influenza or natural resources such as Moringa oleifera. In the following part of this chapter, we present an example that underlines the potential of the method and uses the facilities that may improve cooperation between academics and industry.
We take a real example concerning a very hot topic: drones and agriculture. Drones are now used in many fields, but one of the most promising is agriculture. In this area, it is interesting to know the main actors (inventors and applicants), the new entrants, the trends in technology development, etc. Patents are well suited to answering these questions since patented inventions represent a “state of the art” close to applications. Many laboratories work on the development of more or less sophisticated drones; the patent search (here in the area of agriculture, but it can be extended to many other fields) is a real companion for researchers.
2.4.1. Materials and methods
The database used is the world patent database available from the EPO. The software used to perform the query and the analysis is the software Patent Pulse. The query was done on 29 November 2016.
Query: drone* in titles and B6* as IPC (the International Patent Classification classes dealing with transporting), from 1970 to 2016.
* stands for truncation; B6* therefore covers the IPC classes B60 to B68, which form the transporting part of section B.
Care is needed when using the term drone in the agricultural field, since a drone is also a male bee. Restricting the search to the B6* IPC avoids this interference.
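The effect of combining the truncated title term with the truncated IPC can be sketched with the standard `fnmatch` module, whose `*` wildcard plays the role of the truncation sign. The four records are hypothetical and only show how the beekeeping sense of "drone" is filtered out; this is not the query syntax of the patent database.

```python
import re
from fnmatch import fnmatchcase


def title_matches(title: str, pattern: str) -> bool:
    """True if any word of the title matches the truncated pattern."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return any(fnmatchcase(word, pattern) for word in words)


def ipc_matches(codes, pattern: str) -> bool:
    """True if any IPC code of the patent matches the truncated pattern."""
    return any(fnmatchcase(code.upper(), pattern) for code in codes)


# Hypothetical records for illustration only
records = [
    {"title": "Drone for crop spraying", "ipc": ["B64C39/02", "A01M7/00"]},
    {"title": "Drones with foldable arms", "ipc": ["B64C27/08"]},
    {"title": "Queen and drone rearing in beehives", "ipc": ["A01K47/00"]},
    {"title": "Agricultural tractor", "ipc": ["B62D49/00"]},
]

hits = [
    r["title"]
    for r in records
    if title_matches(r["title"], "drone*") and ipc_matches(r["ipc"], "B6*")
]
print(hits)
```

The beehive record matches the title term but not the IPC filter, and the tractor record matches the IPC filter but not the title term, so both are excluded.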
We selected 404 families covering 560 patents.
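The distinction between families and patents can be illustrated by grouping publications that share the same set of priority numbers. This is a simplified notion of a patent family used here as an assumption; databases may apply broader family definitions. The second publication number is hypothetical, while the other two records are the ones quoted earlier in this chapter.

```python
from collections import defaultdict

# (publication number, priority numbers)
publications = [
    ("CA2053428A1", ("US59763990A", "US76703091A")),
    ("XX0000001A1", ("US59763990A", "US76703091A")),  # hypothetical family member
    ("US4284784A", ("US15228280A",)),
]

# Publications with an identical set of priorities fall into the same family
families = defaultdict(list)
for number, priorities in publications:
    families[frozenset(priorities)].append(number)

print(f"{len(families)} families covering {len(publications)} patents")
```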
Figure 3 shows the first page of Patent Pulse with all the results (short bibliographic data). Note the presence of the drawings, which are automatically extracted; clicking on a drawing enlarges it. This is a great help for technical experts. Clicking on the title of a patent shows the expanded bibliography.
The automatic analysis of the corpus is necessary since the number of patents is too large to process manually. In the analysis, we answer the classical questions: who is doing what, when, where, for which purpose, and so on.
Trend in drone and transporting: In Figure 4, we select the main publication frequencies per year, and the chart is built automatically.
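A publication trend of this kind reduces to counting notices per year. The following sketch uses hypothetical publication dates and a text bar chart; it illustrates the counting step only and is not the charting code of Patent Pulse.

```python
from collections import Counter

# Hypothetical publication dates in ISO format
publication_dates = [
    "2014-03-11", "2015-07-02", "2015-11-19",
    "2016-01-05", "2016-06-30", "2016-09-14",
]

# The first four characters of an ISO date are the year
per_year = Counter(d[:4] for d in publication_dates)

for year in sorted(per_year):
    print(year, "#" * per_year[year])
```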
The main applicants: As above, the main applicants are selected and presented in Figure 5.
In this field, it is interesting to see whether some universities are working in this area and whether there are partnerships between applicants. This is done by drawing the network of applicants, as indicated in Figure 6. All parts of the network can be enlarged if necessary. Interactions appear on the screen as linked points, while an isolated dot represents an applicant with no link.
We can magnify the various links to see the applicants engaged in the interactions as shown in Figure 7.
In Figure 7, the links between applicants are magnified; we can note, for instance, that two patents have been published jointly by “Puy du Fou International and Act Light Design,” and so on.
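An applicant network of this kind can be sketched as weighted pairs of co-applicants: every patent filed jointly adds one to the weight of the link between its applicants. The two joint patents follow the example in the text; the single-applicant record is hypothetical.

```python
from collections import Counter
from itertools import combinations

# One list of applicants per patent
patents = [
    ["Puy du Fou International", "Act Light Design"],
    ["Puy du Fou International", "Act Light Design"],
    ["Applicant X"],  # hypothetical applicant filing alone
]

edges = Counter()
nodes = set()
for applicants in patents:
    nodes.update(applicants)
    # Sorting gives each pair a single canonical orientation
    for a, b in combinations(sorted(set(applicants)), 2):
        edges[(a, b)] += 1

linked = {name for pair in edges for name in pair}
isolated = nodes - linked  # the dots with no link in the network drawing

print(dict(edges))
print(isolated)
```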
The different fields of application: The fields of application can be detected in two ways: by using the list of IPCs present in each patent, or by extracting the most significant words from the titles and abstracts and building a group from the relevant patents. Figure 8 presents the main IPCs. The IPC at four characters is selected because it gives the best compromise. For a more precise analysis, the full IPC can be used.
The meaning of the IPC codes can easily be found on the EPO site, as shown in Figure 9. Work on title or abstract words can be done by exporting the list of patent notices to more sophisticated software (for instance Matheo Analyzer), or by exporting the title words or abstracts to an Excel file, and so on. In the following example, we selected the term “underwater.” This selects 10 patent families covering 17 patents.
The main companies involved: Diehl Gmbh & Co, Thales Sa, Dynamit Nobel Ag, Honeywell Elac Nautik Gmbh, Wardle Patrick (individual inventor), Heinscher Ingo (individual inventor), DCNS, Howaldtswerke Deutsche Werft, and US Naval; see Figure 10.
The difference between the fields of research and application: For each applicant, the difference can be seen via an applicant(s)/IPC matrix. This is presented in Table 2.
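Working at a given IPC level amounts to truncating each full code before counting. The sketch below assumes codes written without spaces, such as B64C39/02; the sample codes are illustrative and the helper name `ipc_level` is hypothetical.

```python
from collections import Counter


def ipc_level(code: str, chars: int = 4) -> str:
    """Truncate a full IPC code to its first `chars` characters."""
    return code.replace(" ", "")[:chars]


# Illustrative IPC codes attached to a small set of patents
codes = ["B64C39/02", "B64C27/08", "B64D1/18", "B63G8/00"]

by_ipc4 = Counter(ipc_level(c) for c in codes)
print(by_ipc4.most_common())
```

Calling `ipc_level(code, 3)` instead groups the same patents at the coarser class level, for example B64.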
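The word-based approach can be sketched as a frequency count of title words followed by a selection on one term. The titles and the stop-word list are hypothetical; a real analysis would also use the abstracts and a fuller stop-word list.

```python
import re
from collections import Counter

STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "with"}

# Hypothetical titles for illustration
titles = [
    "Underwater drone for mine detection",
    "Drone with underwater sonar",
    "Drone for crop spraying",
]


def words(text: str):
    """Lower-case words of a title, without stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


freq = Counter(w for t in titles for w in words(t))
selected = [t for t in titles if "underwater" in words(t)]

print(freq.most_common(3))
print(selected)
```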
|HONEYWELL ELAC NAUTIK||1|
|DIEHL GMBH & CO||1||1||1||3||1|
|DYNAMIT NOBEL AG||1||1||1|
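An applicant/IPC matrix of this kind is a cross-count of (applicant, IPC) pairs. The sketch below uses applicant names from the text, but the IPC columns and the counts are hypothetical, since the column headers of Table 2 are not available here.

```python
from collections import Counter

# (applicant, IPC at four characters); the pairs are hypothetical
records = [
    ("DIEHL GMBH & CO", "B63G"),
    ("DIEHL GMBH & CO", "B63G"),
    ("DIEHL GMBH & CO", "F42B"),
    ("DYNAMIT NOBEL AG", "B63G"),
    ("HONEYWELL ELAC NAUTIK", "B63C"),
]

matrix = Counter(records)
applicants = sorted({a for a, _ in records})
ipcs = sorted({i for _, i in records})

# Header row, then one row per applicant; empty cells stand for zero
print(f"{'':<24}" + "".join(f"{i:>6}" for i in ipcs))
for a in applicants:
    print(f"{a:<24}" + "".join(f"{matrix[(a, i)] or '':>6}" for i in ipcs))
```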
The network between applicants and inventors partly represents the R&D potential involved as well as the cooperation between applicants. This is presented in Figure 11. Note the link between the two German companies via common inventors. These key inventors are important if you want to select people with the best R&D knowledge.
Inventors’ competencies: Instead of using title or abstract words or the IPC, it is possible to differentiate the inventors’ competencies by using the WIPO fields. These are keywords that describe the various fields of application of patents. Figure 12 presents the results of the analysis.
Further patent information: Another type of information can be obtained from the World, US, and European patents by using the patent citations. These patents contain a field called cited patents (patents cited in the patent examined) or citing patents (patents where the patent concerned is cited). All the patent citations are made by the examiners of the patent office. For instance, the patent EP1798145A3 is shown in Figure 13.
Now, if you want to know which patents are linked by citations to the US patent US7631611B1, you can click on that patent (presented in Figure 13) and expand the network, as shown in Figure 14.
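Inventors who link two applicants can be found by collecting, for each inventor, the set of applicants on whose patents he or she appears. The company and inventor names below are hypothetical placeholders.

```python
from collections import defaultdict

# (applicants, inventors) per patent; all names are hypothetical
records = [
    (["Company A"], ["Inventor X", "Inventor Y"]),
    (["Company B"], ["Inventor X"]),
    (["Company C"], ["Inventor Z"]),
]

applicants_of = defaultdict(set)
for applicants, inventors in records:
    for inventor in inventors:
        applicants_of[inventor].update(applicants)

# An inventor attached to more than one applicant bridges those applicants
bridges = {
    inventor: sorted(apps)
    for inventor, apps in applicants_of.items()
    if len(apps) > 1
}
print(bridges)
```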
In this way, it is possible to build up a cluster of patents that are related to one important patent  and then to know all the different R&D orientations of this invention. In Figure 14 the patent EP1798145 is present in two different levels of examination: A2 and A3, the meaning of this being available in reference . For the US patents the meaning of the various levels of examination called “kind codes” is available in reference .
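Expanding a citation network step by step corresponds to a breadth-first traversal that follows both cited and citing links. In the sketch below, the direction of the link between EP1798145A3 and US7631611B1 is an assumption made from the description in the text, and the HYP identifiers are hypothetical patents.

```python
from collections import deque

# patent -> patents it cites
citations = {
    "EP1798145A3": ["US7631611B1"],           # direction assumed for illustration
    "US7631611B1": ["HYP-0001", "HYP-0002"],  # hypothetical cited patents
    "HYP-0003": ["US7631611B1"],              # hypothetical citing patent
}


def neighbours(patent: str):
    """Patents linked to `patent` by a citation in either direction."""
    cited = citations.get(patent, [])
    citing = [p for p, refs in citations.items() if patent in refs]
    return cited + citing


def expand(start: str, depth: int):
    """Map every patent reachable within `depth` citation steps to its distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == depth:
            continue  # do not expand beyond the requested depth
        for nxt in neighbours(current):
            if nxt not in seen:
                seen[nxt] = seen[current] + 1
                queue.append(nxt)
    return seen


print(expand("EP1798145A3", 1))
print(expand("EP1798145A3", 2))
```

Each extra unit of depth plays the role of one more click on a patent in the network view.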
Even though not all the applications of automatic patent analysis (APA) can be developed here for lack of space, the usefulness of patent information has been demonstrated. Knowledge of the leaders and the new entrants in a field, the trends over time and in technology development, the links between the actors, the types of patents, and their country coverage are significant points that help to form a clear view of what is going on in a given area. Patent information is also of interest to developing countries, which may find in patents that are in the public domain technological information enabling them to develop useful products and services. For academics, patents are a natural link between science, technology, and markets. They are one of the powerful catalysts for moving research subjects closer to applications. Because access to the patent notices is free and the cost of APA software is moderate, the approach is affordable for academics, SMEs, and even individuals. The facilities available in APA platforms enable users to share various topics and to discuss and comment on the most important analyses and key patents. This will help to move toward a multidisciplinary approach to research and development.
Another point that has not been discussed here is the role of APA in preclustering. When clusters (or poles of competitiveness in France) are to be developed in certain domains, it is necessary to show the stakeholders what the R&D contours of the future cluster could be. APA is fundamental for this because it shows clearly what the “other or possible competitors” do and thus what can be developed by linking all the stakeholders’ knowledge and facilities together. Various examples of this strategy have been developed in scientific papers, especially for developing countries [24, 25].
We thank the company Matheo Software for providing access to the patent databases as well as the analytical software necessary to perform APA (automatic patent analysis).