“Using AI tools to help assess inventive step”: A response to the CIPA journal article

Sector: Patent law

16th July 2024

This post is based on a previous article on IPKat.

The cover article of the May 2024 edition of the CIPA Journal proposed a new test for inventive step using AI. The article was inspired by the EPO’s AI assisted search tool, AI-PreSearch. The CIPA journal article proposes to use an AI derived measurement of semantic similarity between the claims and the prior art as a new test for inventive step. However, using the amount of “similarity” between the claims and the prior art as a test for inventive step would constitute a vast oversimplification of patent law, lacking any correspondence with the established legal concepts of novelty and inventive step. The proposal presented in the CIPA Journal fails to recognise that, whilst AI search-tools such as AI-PreSearch may be excellent at searching the prior art, they possess no functionality for applying complex legal tests.

EPO AI assisted search: Language models and vector search

Last year the EPO announced the introduction of a new tool to assist Examiners in patent search. According to an article by the EPO Head of Data Science Alexander Klenner-Bajaja, the AI assisted pre-search has a relatively simple architecture using machine learning language model assisted vector search. Details of an earlier version of the model are described in Vowinckel et al. 2023.

Vector search is a standard machine learning method whereby inputs (e.g. features, images, text) are represented as vectors and compared. In language model assisted search, the language model produces a vectorial representation of the input text which includes its contextual semantic information. The vectors can then be compared to each other to find semantically similar texts in an embedded space. The vector space may have many thousands of dimensions. Language model assisted vector search is a widely used technique to find and recommend personalised images, music, podcasts and even AirBnBs to users.
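To make the idea of comparing vectors concrete, the following is a minimal sketch of similarity ranking. It is not the EPO's implementation: the three-dimensional vectors, the document labels D1 to D3 and the function names are all hypothetical, and cosine similarity is assumed here as the comparison measure simply because it is a common choice.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, corpus):
    """Return (doc_id, score) pairs, most similar document first."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings; a real language model would produce
# vectors with hundreds or thousands of dimensions.
prior_art = {
    "D1": [0.9, 0.1, 0.0],
    "D2": [0.1, 0.9, 0.2],
    "D3": [0.7, 0.3, 0.1],
}
claim_vector = [1.0, 0.2, 0.0]

ranking = rank_by_similarity(claim_vector, prior_art)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

The output is nothing more than an ordered list of documents with a score attached to each. The score reflects how close the texts sit in the vector space and carries no information about why they are close.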

AI-PreSearch uses a language model (EP-RoBERTa) that has been trained on patent documents. In AI-PreSearch, EP-RoBERTa produces a vector representation of the claims to be searched. The vector representation is then mapped to the 250,000-dimension patent subject area classification (CPC) space. The application can then be searched against all the prior art stored and embedded in a vector database. The closer the application vector is to a prior art vector, the more textually and semantically similar the prior art is to the application. The model could be used to search the whole application or parts of it, such as the claims.

“Similarity” is not a test for inventive step

The CIPA Journal article proposes that AI-PreSearch could be used in a new inventive step test. The article proposes:

“Suppose a new patent application is received and converted into an embedding space using a large language model. The idea for a new test for inventive step is ‘the new application is inventive if the embedding space around the embedding vector of the new application within a radius of x, is empty and there is a technical effect […]’. Values of x could be found from historical data about granted patents and the state of the art. The historical values could then be used to determine a value for x to use now.”
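Reduced to code, the geometric part of the quoted proposal might look something like the sketch below. This is only an illustration under stated assumptions: Euclidean distance is assumed as the measure of proximity, the two-dimensional vectors and the labels A and B are hypothetical, and the "technical effect" limb of the quoted test is not modelled at all.

```python
import math

def neighbours_within_radius(application, prior_art, radius):
    """Return the ids of prior art vectors lying within `radius`
    (Euclidean distance) of the application vector."""
    return [doc_id for doc_id, vec in prior_art.items()
            if math.dist(application, vec) <= radius]

def passes_radius_test(application, prior_art, radius):
    """True if the space around the application is empty within `radius`.
    The 'technical effect' requirement of the proposal is NOT checked."""
    return not neighbours_within_radius(application, prior_art, radius)

# Hypothetical toy embeddings in two dimensions.
prior_art = {
    "A": [0.3, 0.4],   # distance 0.5 from the application
    "B": [3.0, 4.0],   # distance 5.0 from the application
}
application = [0.0, 0.0]

print(passes_radius_test(application, prior_art, radius=1.0))  # False
print(passes_radius_test(application, prior_art, radius=0.4))  # True
```

In this sketch the same application passes or fails depending solely on the value chosen for x and on where the embedding happens to place the text.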

However, the similarity between a claimed invention and the prior art, as determined by their relative positions in the embedded space, has nothing whatsoever to do with the current legal tests for novelty and inventive step. AI-PreSearch is simply a search tool for identifying documents semantically similar to the claims. The degree of “semantic similarity” between the claims and prior art does not overlap with any of the pre-existing tests for inventiveness, whether this is the Windsurfer/Pozzoli test of the UKIPO, the problem-solution test of the EPO or the non-obviousness test of the USPTO.

In the problem-solution approach, for example, the first step is to identify the closest prior art. Superficially, it may seem that semantic “similarity” may help identify the closest prior art in the problem-solution approach. However, the closest prior art is “that which in one single reference discloses the combination of features which constitutes the most promising starting point for a development leading to the invention […] In practice, the closest prior art is generally that which corresponds to a similar use and requires the minimum of structural and functional modifications to arrive at the claimed invention” (EPO Guidelines for Examination, G-VII-5.1).

AI-PreSearch assists in identifying contextually and semantically similar documents to the claimed invention. However, the simple vector search of AI-PreSearch does not and cannot identify a) which disclosure constitutes the most promising starting point for a development leading to the invention, b) which disclosure corresponds to a similar use to the claimed invention or c) which disclosure requires the minimum of structural and functional modifications to arrive at the claimed invention. None of these tests correspond to “similarity” in vector space. Similarly, there is no overlap of a test of similarity in vector space with any of the steps in the Windsurfer/Pozzoli test.

The CIPA journal article admits that there is currently no legal basis for replacing the current tests for inventive step with a “similarity” test. However, this legal basis is not only absent from the case law, it is also absent from the legal texts themselves. The European Patent Convention (EPC) states that “an invention shall be considered as involving an inventive step if, having regard to the state of the art, it is not obvious to a person skilled in the art” (Article 56 EPC). The amount of semantic similarity between a disclosure and the claimed invention cannot be equated, according to any stretched definition of the term, with “non-obviousness” to a skilled person.

Final thoughts

The use of a simple measure of “semantic similarity” between the claims and prior art as a test for inventive step would constitute an absurd reduction of the complex legal notion of inventiveness. Readers may recall the infamous exchange (infamous at least to patent attorneys) in Episode 16, Series 6 of the US legal drama Suits:

Donna: Benjamin applied for a patent and it turns out our technology overlaps with someone else’s
Louis: How much overlap?
Donna: 32.5%
Louis: That’s over the threshold. Unless Benjamin can get you below 30…

Every patent attorney knows that this exchange is legal nonsense. The degree of overlap (or similarity) has nothing to do with the legal concepts of novelty and inventive step that govern patentability (even in the US). Furthermore, as the quote from Suits illustrates, such a system would be wide open to abuse through the judicious manipulation of the legally meaningless measure of “similarity/overlap” in patent drafts.

The proposal presented in the CIPA Journal ultimately fails to recognise the limited functionality of AI-PreSearch. AI-PreSearch, according to the EPO, is very good at searching. However, it has no ability to learn or apply legal tests. Importantly, AI-PreSearch’s language model EP-RoBERTa is not in the Generative Pre-trained (GPT) family of large language models made famous by OpenAI. EP-RoBERTa is based on BERT, an earlier type of large language model from Google, and the first to use transformers to represent contextual information in language. As such, unlike ChatGPT, EP-RoBERTa has no ability to answer questions, generate text or to learn to apply tests grounded in verbal reasoning. AI-PreSearch simply uses vector search to identify and rank the similarities of prior art documents to the claims of a patent application. Whilst AI-PreSearch may be great at searching, it has no hope of providing an alternative to inventive step assessment.

GPT large language models (LLMs) such as ChatGPT, by contrast, have far greater functionality than simple AI assisted search tools. LLMs trained on patent prosecution data and legal texts can generate legal reasoning regarding the inventiveness or otherwise of a claimed invention. LLMs may also be combined with a vector search for prior art, to perform the full functionality of search and examination. Implementation of such a process would not constitute a new test for inventive step. Instead, it would be automating the legal tests currently applied by patent examiners. However, we are not yet at the point where AI can replace a patent examiner. Specifically, the verbal reasoning produced by LLMs is currently fairly generic and superficial. Nonetheless, as the functionality of these tools continues to grow, a future place for AI in patent examination seems likely. However, it is probably safe to assume that the role of AI in patents will not be as a new similarity test for inventive step.

Further reading

Use of large language models in the patent industry: A risk to patent quality? (Oct 2023)
