Tracing the Flow of Knowledge From Science to Technology Using Deep Learning
By: Michael E. Rose, Mainak Ghosh, Sebastian Erhardt and more
We develop a language similarity model suitable for working with patents and scientific publications at the same time. In a horse-race-style evaluation, we task eight language (similarity) models with predicting credible patent-paper citations. We find that our Pat-SPECTER model, which is the SPECTER2 model fine-tuned on patents, performs best. We demonstrate the capabilities of Pat-SPECTER in two real-world scenarios: separating patent-paper pairs and predicting patent-paper pairs. Finally, we test the hypothesis that US patents cite papers that are semantically less similar to them than is the case for patents in other large jurisdictions, which we posit is because of the duty of candor. The model is open to the academic community and practitioners alike.
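As a rough illustration of how a similarity model can be used to rank candidate papers for a patent, the sketch below scores toy vectors with cosine similarity. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the function names, paper identifiers, and three-dimensional vectors are hypothetical stand-ins for real model embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_papers(patent_vec, paper_vecs):
    """Return (paper_id, similarity) pairs, most similar first."""
    scored = [
        (paper_id, cosine_similarity(patent_vec, vec))
        for paper_id, vec in paper_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings (hypothetical values, not model output).
patent = [0.9, 0.1, 0.0]
papers = {
    "paper_a": [0.8, 0.2, 0.1],
    "paper_b": [0.0, 0.1, 0.9],
}

ranking = rank_papers(patent, papers)
for paper_id, score in ranking:
    print(f"{paper_id}: {score:.3f}")
```

In practice the vectors would come from encoding patent and paper text with the same model, so that both document types share one embedding space.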
Similar Papers
Datasets for machine learning and for assessing the intelligence level of automatic patent search systems
Information Retrieval
Finds old inventions to help new ones.
From scratch to silver: Creating trustworthy training data for patent-SDG classification using Large Language Models
Computation and Language
Helps find inventions that solve world problems.
Extracting Information About Publication Venues Using Citation-Informed Transformers
Digital Libraries
Shows how computer science topics are changing.