Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 161 tok/s
Gemini 2.5 Pro 47 tok/s Pro
GPT-5 Medium 25 tok/s Pro
GPT-5 High 38 tok/s Pro
GPT-4o 79 tok/s Pro
Kimi K2 197 tok/s Pro
GPT OSS 120B 441 tok/s Pro
Claude Sonnet 4.5 36 tok/s Pro
2000 character limit reached

A Novel Patent Similarity Measurement Methodology: Semantic Distance and Technological Distance (2303.16767v2)

Published 23 Mar 2023 in cs.IR, cs.AI, and cs.SI

Abstract: Patent similarity analysis plays a crucial role in evaluating the risk of patent infringement. Nonetheless, this analysis is predominantly conducted manually by legal experts, often resulting in a time-consuming process. Recent advances in natural language processing technology offer a promising avenue for automating this process. However, methods for measuring similarity between patents still rely on experts manually classifying patents. Due to the recent development of artificial intelligence technology, a lot of research is being conducted focusing on the semantic similarity of patents using natural language processing technology. However, it is difficult to accurately analyze patent data, which are legal documents representing complex technologies, using existing natural language processing technologies. To address these limitations, we propose a hybrid methodology that takes into account bibliographic similarity, measures the similarity between patents by considering the semantic similarity of patents, the technical similarity between patents, and the bibliographic information of patents. Using natural language processing techniques, we measure semantic similarity based on patent text and calculate technical similarity through the degree of coexistence of International patent classification (IPC) codes. The similarity of bibliographic information of a patent is calculated using the special characteristics of the patent: citation information, inventor information, and assignee information. We propose a model that assigns reasonable weights to each similarity method considered. With the help of experts, we performed manual similarity evaluations on 420 pairs and evaluated the performance of our model based on this data. We have empirically shown that our method outperforms recent natural language processing techniques.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.